Find Roles That Occur Close Together

p3-generate-close-roles.pl [options] <roles.tbl >pairs.tbl

This script is part of a pipeline to compute functionally-coupled roles. It takes a file of locations and roles, then outputs a file of pairs of roles with the number of times features containing those two roles occur close together on the chromosome. Such roles typically have related functions in a genome.

The input file must contain the following four fields.


genome ID


contig (sequence) ID


location in the sequence


functional role

By default, the script assumes the four columns appear in that order. The column assignments can be overridden with command-line options.

The input file must be sorted by genome ID and then by sequence ID within genome ID. Otherwise, the results will be incorrect. Use p3-sort to sort the file.
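The ordering requirement can be checked before running the script. The following is a minimal sketch, not part of the P3 tools. It assumes that what matters is grouping: every genome ID forms one contiguous run of lines, and within a genome every sequence ID forms one contiguous run. The function name find_grouping_errors and the (genome, sequence) tuple input are hypothetical.

```python
def find_grouping_errors(rows):
    """Report lines where a genome or sequence ID reappears after its run ended.

    rows: iterable of (genome_id, sequence_id) pairs in file order.
    Returns a list of (line_number, kind, value) tuples; empty means grouped.
    Assumption: contiguous grouping is the property the script relies on.
    """
    seen_genomes = set()
    seen_keys = set()
    prev_genome = None
    prev_key = None
    errors = []
    for line_no, (genome, sequence) in enumerate(rows, start=1):
        key = (genome, sequence)
        if genome != prev_genome and genome in seen_genomes:
            # A genome started a second run further down the file.
            errors.append((line_no, "genome", genome))
        elif key != prev_key and key in seen_keys:
            # Within one genome, a sequence started a second run.
            errors.append((line_no, "sequence", sequence))
        seen_genomes.add(genome)
        seen_keys.add(key)
        prev_genome, prev_key = genome, key
    return errors
```

An empty result means the file is grouped as assumed; any reported line indicates the file should be passed through p3-sort first.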

The location is a PATRIC location string, either of the form start..end or complement(left..right). Given a set of genome IDs in the file genomes.tbl, you can generate the proper file using the following pipe.

p3-get-genome-features --attr sequence_id --attr location --attr product <genomes.tbl | p3-function-to-role

(If PATRIC does not yet have roles defined, you will need to use an additional command-line option on p3-function-to-role.)
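The two location forms described above can be parsed with a pair of regular expressions. This is an illustrative sketch only and is not taken from the script itself. The function name parse_location and the (left, right, strand) return shape are assumptions, and any string that matches neither form is rejected rather than guessed at.

```python
import re

# The two forms named in the text: start..end and complement(left..right).
_FORWARD = re.compile(r"^(\d+)\.\.(\d+)$")
_REVERSE = re.compile(r"^complement\((\d+)\.\.(\d+)\)$")


def parse_location(text):
    """Return (left, right, strand) for a location string.

    strand is "+" for start..end and "-" for complement(left..right).
    Raises ValueError for anything else (assumption: no other forms occur).
    """
    text = text.strip()
    match = _FORWARD.match(text)
    strand = "+"
    if match is None:
        match = _REVERSE.match(text)
        strand = "-"
    if match is None:
        raise ValueError(f"unrecognized location string: {text!r}")
    a, b = int(match.group(1)), int(match.group(2))
    # Normalize so that left <= right regardless of how the pair was written.
    return min(a, b), max(a, b), strand
```

For example, parse_location("complement(100..500)") yields the left edge 100, the right edge 500, and the strand "-".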


There are no positional parameters.

The standard input can be overridden using the options in Input Options.

Additional command-line options are as follows.


The index (1-based) or name of the column containing the genome ID. The default is 1.


The index (1-based) or name of the column containing the sequence ID. The default is 2.


The index (1-based) or name of the column containing the location string. The default is 3.


The index (1-based) or name of the column containing the role description. The default is 4.


The maximum space between two features considered close. The default is 2000.


The minimum number of occurrences for a pair to be considered significant. The default is 4.
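How the two thresholds above interact can be sketched as follows. This is a rough illustration under several stated assumptions, not the script's actual algorithm: the gap is taken as the left edge of the later feature minus the right edge of the earlier one, every close occurrence is counted (not one per genome), two features with the same role are skipped, and a pair is kept when its count is at least the minimum. The function name count_close_role_pairs and the tuple layout are hypothetical.

```python
from collections import Counter
from itertools import groupby


def count_close_role_pairs(features, max_gap=2000, min_occurrences=4):
    """Count role pairs whose features lie close together on a sequence.

    features: iterable of (genome, sequence, left, right, role) tuples,
    already grouped by genome and then by sequence, as the input must be.
    Returns {(role_a, role_b): count} with role_a < role_b.
    """
    counts = Counter()
    for _, group in groupby(features, key=lambda f: (f[0], f[1])):
        # Within one sequence, order features by their left edge.
        on_sequence = sorted(group, key=lambda f: f[2])
        for i, (_, _, _, right_i, role_i) in enumerate(on_sequence):
            for _, _, left_j, _, role_j in on_sequence[i + 1:]:
                if left_j - right_i > max_gap:
                    # Later features start even further away, so stop here.
                    break
                if role_i != role_j:
                    counts[tuple(sorted((role_i, role_j)))] += 1
    # Assumption: "minimum number of occurrences" means count >= threshold.
    return {pair: n for pair, n in counts.items() if n >= min_occurrences}
```

Because itertools.groupby only merges adjacent rows, unsorted input would split one sequence into several groups and undercount pairs, which mirrors the warning about sorting the input file.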