Modeling the Microcosmos
The Environmental Genome Encyclopedia (EngCyc) Portal is a compendium of microbial community metabolic models supported by high-performance software tools implemented on grids and clouds.
Reconstructing Microbial Community Metabolism
Over the past decade, high-throughput sequencing and mass spectrometry platforms have generated multi-omic information (DNA, RNA, protein, and metabolite data) about the function and identity of microbial life. This information has transformed our perception of the microcosmos, illuminating microbial dark matter and conceptually linking microorganisms at the individual, population, and community levels to a wide range of ecosystem functions and services. Despite the power and promise of this new perception, a persistent shortage of scalable software tools to mine, monitor, and interact with environmental sequence information limits knowledge creation and translation. This is especially vexing in a time of climate change, when microbial community metabolism offers practical wisdom to rebuild our global future in more sustainable ways.
In response to this challenge, we are developing the Environmental Genome Encyclopedia (EngCyc), a compendium of microbial community metabolic models supported by high-performance software tools implemented on grids and clouds. EngCyc will provide direct and comparative access to these models through a web portal. It will also support automated, scalable construction of user-defined models, enabling gene and pathway discovery. Downstream analysis modules will provide intuitive and beautiful data exploration options to power knowledge creation and translation.
Current workflows operate on assembled metagenomes and include MetaPathways for pathway prediction and TreeSAPP for gene-centric phylogenetic analysis. In addition to the current tool set, the final set of modules will include sequence read quality assessment and control, metagenome assembly, population genome binning, taxonomic assignment, completeness and contamination estimation for metagenome-assembled genomes (MAGs) and single-cell amplified genomes (SAGs), and further metabolic pathway prediction algorithms.
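The module list above can be summarized as a small data structure. The following is only an illustrative sketch: the `Module` class, the `MODULES` list, the `by_status` helper, and the ordering of the planned modules are all assumptions made for this example and do not reflect the portal's actual implementation or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Module:
    """Hypothetical record for one EngCyc analysis module."""
    task: str
    status: str                 # "current" or "planned"
    tool: Optional[str] = None  # named only where the text names a tool


# Ordering is an assumption; the text lists the modules but not a run order.
MODULES: List[Module] = [
    Module("sequence read quality assessment and control", "planned"),
    Module("metagenome assembly", "planned"),
    Module("population genome binning", "planned"),
    Module("taxonomic assignment", "planned"),
    Module("MAG/SAG completeness and contamination estimation", "planned"),
    Module("pathway prediction", "current", "MetaPathways"),
    Module("gene-centric phylogenetic analysis", "current", "TreeSAPP"),
    Module("additional metabolic pathway prediction algorithms", "planned"),
]


def by_status(modules: List[Module], status: str) -> List[Module]:
    """Return the modules whose status matches, preserving list order."""
    return [m for m in modules if m.status == status]


if __name__ == "__main__":
    for m in by_status(MODULES, "current"):
        print(f"{m.tool}: {m.task}")
```

Keeping the status on each record makes it easy to list what is available today versus what is still planned, without implying anything about how the modules are chained together.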
The EngCyc portal will expand access to scalable and intuitive high-performance computing resources and enable the global research community to more effectively explore and harness the hidden metabolic powers of the microcosmos. Primary publications related to portal development include PMID: 27183115, PMID: 25048541, PMID: 26076725, PMID: 23800136, and PMID: 28398290.