MicrobesOnline Comparative Genomics Database

The MicrobesOnline genome database contains over 1000 prokaryotic genomes. Genomes were last updated in late 2011 and no further database updates are planned.

All genomes are analyzed through the VIMSS genome pipeline. We use publicly available sequence analysis tools and databases to search for homologs (NCBI BLAST, UCSC Blat, SwissProt, COG) and protein domains (HMMer, InterPro), to assign Gene Ontology terms (Gene Ontology Consortium) and EC numbers, and to map metabolic pathways (KEGG). We then determine orthology relationships between genes and predict operon structures.

Most genome data is downloaded from RefSeq. When we download an incomplete genome directly from a sequencing center, we submit its sequence to RAST for automated annotation. For all genomes, we also search for CRISPR regions using PILER-CR and CRT.

All of the information in the VIMSS genome database is freely available on our website.

Currently we use these versions of external databases:

  • RefSeq: Release 44, November 2010 (complete microbial genomes plus selected WGS genomes)
  • COG: CDD 2.21 April 2010 (from NCBI CDD)
  • PDB: June 2010
  • KEGG: April 2009
    • To save space, we store only the top 100 KEGG hits for each gene in our database. EC assignments, however, are based on all KEGG hits, not just the stored ones.
  • UniProt/SwissProt: UniProt 15.0, SwissProt 57.0, March 2009
  • InterPro: release 4.6, February 2010
    • All except TIGRFam: data 26.0, March 2010.
    • TIGRFam: 10.1.
  • Gene Ontology: 200711
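
The KEGG note above (store only the top 100 hits per gene) can be illustrated with a minimal sketch. This is not the VIMSS pipeline code: the function name, the (gene_id, target_id, score) record layout, and ranking hits by score are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def top_hits_per_gene(hits, limit=100):
    """Keep only the `limit` best-scoring hits for each gene.

    `hits` is an iterable of (gene_id, target_id, score) tuples. This
    record layout, and ranking by score, are illustrative assumptions.
    """
    by_gene = defaultdict(list)
    for gene_id, target_id, score in hits:
        # Store score first so tuples compare by score.
        by_gene[gene_id].append((score, target_id))

    # nlargest returns the best `limit` entries in descending score order.
    return {
        gene_id: [(target_id, score)
                  for score, target_id in heapq.nlargest(limit, scored)]
        for gene_id, scored in by_gene.items()
    }
```

A step that needs every hit, such as the EC assignment mentioned above, would read the full hit list before this truncation is applied.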
