Identify Clusters in Genomes
p3-identify-clusters.pl [options] clusterFile <features.tbl
Given an input file of features and locations, this script finds occurrences of functional clusters.
The cluster file should contain in its last column a list of the clustered identifiers (usually roles or protein family IDs), separated by a double-colon delimiter (::). The first column should contain a cluster ID of some sort. This is the default for the output of
The input file must include columns for the genome ID, the identifier used for clustering (again, usually roles or protein family IDs), the sequence ID, and the location. Features that are close together on the chromosome and belong to the same cluster will be output along with the sequence ID, start and end locations, and a cluster ID number.
The positional parameter is the name of a tab-delimited file containing the clusters. The clusters must be in the first column and consist of multiple clustered identifiers (roles or family IDs) separated by item delimiters (::).
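As an illustration of the cluster file layout, the following minimal Python sketch reads a tab-delimited file and maps each clustered identifier to the clusters that contain it. It is not the script's own code. It assumes the cluster ID is in the first column and the `::`-separated identifier list is in the last column, and that the file begins with a header row. The function name `read_clusters`, the `skip_header` parameter, and the sample data are hypothetical.

```python
import csv
import io

def read_clusters(handle, delimiter="::", skip_header=True):
    """Map each clustered identifier to the set of cluster IDs containing it."""
    membership = {}
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if skip_header:
        next(reader, None)  # assumption: the file starts with a header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or single-column lines
        # Assumption: cluster ID first, identifier list last.
        cluster_id, members = row[0], row[-1]
        for item in members.split(delimiter):
            item = item.strip()
            if item:
                membership.setdefault(item, set()).add(cluster_id)
    return membership

# Made-up sample data, not real output from any tool.
sample = (
    "cluster_id\tmembers\n"
    "CL1\tRoleA::RoleB::RoleC\n"
    "CL2\tRoleC::RoleD\n"
)
membership = read_clusters(io.StringIO(sample))
```

Keying the lookup by identifier rather than by cluster makes it cheap to ask, for each incoming feature, which clusters it could belong to.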
The standard input can be overridden using the options in Input Options. The standard input must be a tab-delimited file
containing features. By default, the feature ID should be in a column named
patric_id, the location in a column named
the sequence ID in a column named
sequence_id, and the clustered identifier (role or family) should be in the last column.
The clustered identifier is considered the key column.
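The input layout can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script's implementation: a column may be given by header name or by 1-based index, the last column is taken as the key, and the location is kept as an opaque string because its format is not described here. The helper names `find_column` and `read_features`, the header name `loc`, and the sample rows are hypothetical.

```python
import csv
import io

def find_column(header, spec):
    """Resolve a column spec (1-based index or header name) to a 0-based position."""
    text = str(spec)
    if text.isdigit():
        position = int(text) - 1
        if not 0 <= position < len(header):
            raise ValueError(f"column index {spec} is out of range")
        return position
    return header.index(text)  # raises ValueError if the name is absent

def read_features(handle, loc_col, id_col="patric_id", seq_col="sequence_id"):
    """Yield one dict per feature row; the key is always the last column."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    id_pos = find_column(header, id_col)
    seq_pos = find_column(header, seq_col)
    loc_pos = find_column(header, loc_col)
    for row in reader:
        if len(row) < len(header):
            continue  # skip blank or truncated lines
        yield {
            "feature_id": row[id_pos],
            "sequence_id": row[seq_pos],
            "location": row[loc_pos],  # opaque string; format not assumed
            "key": row[-1],
        }

# Made-up sample data; "loc" and the LOC values are placeholders.
sample = (
    "patric_id\tsequence_id\tloc\trole\n"
    "fig|0000.0.peg.1\tseqA\tLOC1\tRoleA\n"
    "fig|0000.0.peg.2\tseqA\tLOC2\tRoleB\n"
)
rows = list(read_features(io.StringIO(sample), loc_col=3))
```

Here the location column is selected by its 1-based index (3), while the feature ID and sequence ID columns are found by name.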
Index (1-based) or name of the location column in the input. The default is
Index (1-based) or name of the feature ID column in the input. The default is
Index (1-based) or name of the sequence ID column in the input. The default is
The maximum gap between features for them to be considered part of a cluster. The default is
The minimum number of features for a group to be considered a cluster. The default is
If specified, the roles found in the cluster will be displayed in a column of the output.
If specified, the features found in the cluster will be displayed in a column of the output.
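The grouping behavior described by the gap and minimum-size options can be sketched in Python as below. This is a simplified illustration, not the script's actual algorithm. It assumes that locations have already been reduced to integer start and end coordinates, that the gap is measured from the end of the current run to the start of the next feature, and that features are grouped per sequence ID. The names `find_clusters`, `max_gap`, and `min_items`, along with the values 500 and 3, are hypothetical and chosen only for the example.

```python
from collections import defaultdict

def _emit(hits, seq, cluster_id, run, min_items):
    """Record a run of features if it is large enough to count as a cluster."""
    if len(run) >= min_items:
        hits.append({
            "sequence_id": seq,
            "start": min(m[0] for m in run),
            "end": max(m[1] for m in run),
            "cluster_id": cluster_id,
            "roles": sorted({m[3] for m in run}),
            "features": [m[2] for m in run],
        })

def find_clusters(features, membership, max_gap, min_items):
    """features: iterable of (feature_id, sequence_id, start, end, key)."""
    by_group = defaultdict(list)
    for fid, seq, start, end, key in features:
        for cluster_id in membership.get(key, ()):
            by_group[(seq, cluster_id)].append((start, end, fid, key))
    hits = []
    for (seq, cluster_id), members in sorted(by_group.items()):
        members.sort()  # order by start coordinate
        run = [members[0]]
        for item in members[1:]:
            run_end = max(m[1] for m in run)
            if item[0] - run_end <= max_gap:
                run.append(item)  # close enough: extend the current run
            else:
                _emit(hits, seq, cluster_id, run, min_items)
                run = [item]      # too far: start a new run
        _emit(hits, seq, cluster_id, run, min_items)
    return hits

# Made-up sample data for illustration only.
membership = {"RoleA": {"CL1"}, "RoleB": {"CL1"}, "RoleC": {"CL1"}}
features = [
    ("f1", "seq1", 100, 400, "RoleA"),
    ("f2", "seq1", 450, 900, "RoleB"),
    ("f3", "seq1", 950, 1300, "RoleC"),
    ("f4", "seq1", 90000, 90500, "RoleA"),
    ("f5", "seq2", 100, 400, "RoleB"),
]
hits = find_clusters(features, membership, max_gap=500, min_items=3)
```

With this sample, the three neighboring features on `seq1` form one reported group, while the distant feature `f4` and the lone feature on `seq2` fall below the minimum size. The `roles` and `features` entries correspond to the two optional output columns.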