Compute Family Signatures
p3-signature-families --gs1=FileOfGenomeIds --gs2=FileOfGenomeIds [--min=MinGs1Frac] [--max=MaxGs2Frac] > family.signatures

This script produces a file in which the last field in each line is a family signature. The first field is the number of hits against Gs1, and the second is the number of hits against Gs2.
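As a rough illustration of the output layout described above, the following sketch splits one output line into its two hit counts and its signature. It assumes the fields are tab-separated, which the text does not state explicitly; the function name `parse_signature_line` and the sample family ID are hypothetical.

```python
def parse_signature_line(line):
    """Return (gs1_hits, gs2_hits, signature) for one output line.

    Assumption: fields are tab-separated. Only the first, second,
    and last fields are interpreted; anything in between is ignored.
    """
    fields = line.rstrip("\r\n").split("\t")
    return int(fields[0]), int(fields[1]), fields[-1]


# Hypothetical sample line; the family ID is made up for illustration.
sample = "9\t1\tFAM_0001\n"
gs1_hits, gs2_hits, signature = parse_signature_line(sample)
print(gs1_hits, gs2_hits, signature)
```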
Specifies the (1-based) column index or name of the genome ID column in the two genome input files. The default is 0, indicating the last column.
A tab-delimited file of genomes forming the gs1 set. These are treated as the genomes that have a given property (e.g. they belong to a certain species or are resistant to a particular antibiotic). If omitted, the standard input is used. The genome IDs must be in the last column.
A tab-delimited file of genomes forming the gs2 set. These are the genomes that do not have the given property. If omitted, the standard input is used. The genome IDs must be in the last column. Any genomes also present in the gs1 set are automatically removed from this list.
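The way the two genome sets are built can be sketched as follows: take the last tab-separated field of each line as the genome ID, then drop from gs2 any genome that is already in gs1. This is a minimal sketch and not the script's own code. The helper `read_genome_ids`, the `has_header` flag (which assumes the input files start with a header line), and the sample genome IDs are all hypothetical.

```python
import io


def read_genome_ids(handle, has_header=True):
    """Collect the last tab-separated field of each non-empty line.

    Assumption: the first line is a header unless has_header is False.
    """
    ids = []
    for index, line in enumerate(handle):
        if has_header and index == 0:
            continue  # skip the assumed header line
        line = line.rstrip("\r\n")
        if line:
            ids.append(line.split("\t")[-1])
    return ids


# In-memory stand-ins for the two genome files; the IDs are made up.
gs1_file = io.StringIO("name\tgenome_id\nA\t100.1\nB\t100.2\n")
gs2_file = io.StringIO("name\tgenome_id\nB\t100.2\nC\t200.1\n")

gs1 = set(read_genome_ids(gs1_file))
# Genomes already in gs1 are removed from gs2, as described above.
gs2 = set(read_genome_ids(gs2_file)) - gs1
print(sorted(gs1), sorted(gs2))
```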
Minimum fraction of the genomes in Gs1 that must contain a family for it to count as a signature family (default 0.8).
Maximum fraction of the genomes in Gs2 that may contain a signature family (default 0.2).
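The two thresholds can be read as a single filter on each family. The sketch below shows that reading under stated assumptions: both bounds are treated as inclusive, both genome sets are non-empty, and a family is represented simply as the set of genome IDs in which it occurs. The function name `is_signature` and the sample data are hypothetical and not taken from the script.

```python
def is_signature(family_genomes, gs1, gs2, min_frac=0.8, max_frac=0.2):
    """Return True if the family passes both fraction thresholds.

    Assumptions: bounds are inclusive and gs1/gs2 are non-empty sets.
    """
    gs1_hits = len(family_genomes & gs1)  # genomes in Gs1 with the family
    gs2_hits = len(family_genomes & gs2)  # genomes in Gs2 with the family
    return (gs1_hits / len(gs1) >= min_frac
            and gs2_hits / len(gs2) <= max_frac)


# Made-up genome IDs: five genomes with the property, five without.
gs1 = {"a1", "a2", "a3", "a4", "a5"}
gs2 = {"b1", "b2", "b3", "b4", "b5"}

common_in_gs1 = {"a1", "a2", "a3", "a4", "a5"}      # 5/5 in Gs1, 0/5 in Gs2
too_rare = {"a1", "a2", "a3"}                       # 3/5 in Gs1 is below 0.8
too_widespread = {"a1", "a2", "a3", "a4", "a5",
                  "b1", "b2", "b3"}                 # 3/5 in Gs2 is above 0.2

print(is_signature(common_in_gs1, gs1, gs2))
print(is_signature(too_rare, gs1, gs2))
print(is_signature(too_widespread, gs1, gs2))
```

Raising `min_frac` or lowering `max_frac` makes the filter stricter, so fewer families qualify as signatures.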
Write progress messages to STDERR.