PATRIC Launches Initial Website for All-Bacterial Bioinformatics Resource Center

Published on 2009-12-09

In September 2009, the National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), awarded a five-year contract to Virginia Tech’s CyberInfrastructure Group (CIG) to support the biomedical research community’s work on infectious diseases. The funding is being used to develop a multi-faceted, web-based bioinformatics resource that provides rich data and analysis tools for all bacterial species, with an emphasis on the bacterial orders that include NIAID Category A-C priority pathogens.

Data Enhancements

December 9, 2009, marks our initial release of the PATRIC Website. By design, this release is data-centric: it provides the first compiled set of all bacterial genomes, along with their annotations, drawn from RefSeq and from GFF3 files provided by the incumbent Bioinformatics Resource Centers (BRCs) funded under the previous BRC awards. The following table summarizes the data loaded into the PATRIC database and available through this release. At this time, data from a small number of GFF3 files are missing; we are working to make these data available by December 31, 2009.

                              From incumbent BRCs    From RefSeq
Number of genomes                             394          2,317
Number of genomic features              2,157,973     14,766,475

Website Enhancements

This December 9, 2009 PATRIC website release provides the following functionality:

  • Basic Website Navigation: Including Taxonomy Browser, Taxon Overview page, Genome Overview page, Genome/Sequence List, Genomic Feature Table, and Feature Overview page.

  • Searches and Tools: Genome Finder and Feature Search Tools allow users to quickly find genomes or features of interest. BLAST Search allows users to quickly search genomic sequences and protein coding genes based on sequence similarity.

  • Feature Cart: Allows users to collect the features of interest from multiple pages across the PATRIC website. Once collected, these features can be exported as FASTA DNA or FASTA Protein sequences, or as a Feature Table.

  • PubMed Integration: A simple but effective real-time literature retrieval system that quickly identifies publications relevant to a taxon, genome, or gene of interest. It queries PubMed through the Entrez Programming Utilities (eUtils) from NCBI, using search terms derived from genome metadata and/or the functional annotation of a gene or protein. Users can filter results by area of interest (i.e., Countermeasures, Diagnosis, Disease, Epidemiology, or Gene Expression).

  • Google Search: Provides an automated list of related web resources as determined by the Google search engine; resources are grouped by content category, including Google Web, News, Images, Books, Patents, and Video. Google search results are summarized on the Taxon, Genome, and Feature Overview pages, and direct links are provided to the detailed result pages.

  • File Download: Allows users to download genome sequences and annotations as GenBank or GFF3 files.
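
For readers curious how the PubMed Integration item above might work in practice, the following is a minimal sketch of building an eUtils ESearch request from metadata-derived search terms. It is not PATRIC's actual implementation. The function names, the way terms are combined with AND, and the example organism are assumptions made for illustration; only the ESearch endpoint and its db, term, retmax, and retmode parameters come from NCBI's public interface. The sketch builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(organism, gene=None, area=None):
    """Combine search terms into one PubMed query string.

    Hypothetical helper: the source does not say how PATRIC derives or
    combines its terms, so quoted phrases joined with AND are an assumption.
    """
    terms = [f'"{organism}"']
    if gene:
        terms.append(f'"{gene}"')
    if area:
        terms.append(f'"{area}"')
    return " AND ".join(terms)


def build_esearch_url(query, retmax=20):
    """Return an ESearch URL that would list matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Illustrative values only: an organism name filtered by one area of interest.
query = build_pubmed_query("Brucella melitensis", area="Epidemiology")
url = build_esearch_url(query)
print(url)
```

Fetching the resulting URL would return a list of PubMed identifiers; a real-time system of the kind described would then retrieve summaries for those identifiers and display them on the relevant overview page.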