Start time: 8:30 a.m.
Room Location: 201B – Long Beach Convention Center (LBCC)
Janusz Dutkowski—From Networks to Ontologies of Gene Function
Abstract: Ontologies are of key importance to many domains of biological research. The Gene Ontology (GO), in particular, has been instrumental in unifying knowledge about biological processes, cellular components, and molecular functions through a hierarchy of concepts and their interrelationships. However, given only partial biological knowledge and inconsistency in how this knowledge is curated, it has been difficult to construct, extend and validate GO in an unbiased manner. We show that the existing collection of high-throughput network maps for Saccharomyces cerevisiae can be analyzed to automatically assemble an ontology of gene function that is comparable to manually curated efforts. Our systematic computational approach combines evidence from physical, genetic and transcriptional networks to produce an ontology comprising 4,123 biological concepts and 5,766 hierarchical concept relations. Using a new ontology alignment procedure, we find that the network-based ontology captures the majority of known cellular components in budding yeast and identifies approximately 600 new cellular components and component relations, many of which we were able to validate either experimentally or bioinformatically. By working closely with the GO curators, we were also able to incorporate selected new components and relations into the Gene Ontology. We demonstrate the many aspects of the multi-scale analysis performed using our framework, including automatically identifying, annotating and visualizing the complete hierarchical structure of biological networks. We also illustrate how the framework provides a powerful tool to uncover new biological knowledge and errors in manual curation. Finally, based on our results, we suggest a new role for ontologies in bioinformatics: rather than merely being used as a gold standard for performing functional enrichment, ontologies should serve as evolvable models that are validated, revised, and expanded based on new genomic data.
Robert Hoehndorf—My ontology is better than yours! Building and evaluating ontologies for integrative research
Abstract: Ontologies are now pervasive in all areas of biology, and research on biomedical ontologies applies theories and methods from diverse disciplines such as information management, knowledge representation, cognitive science, linguistics and philosophy. With the increasing use of ontologies, it becomes crucial to establish a methodology by which the utility of ontologies for biomedical research can be evaluated, compared and improved. For ontologies that are intended to facilitate or improve scientific analyses, such a methodology must be based on the scientific results obtained using a particular ontology. I will demonstrate how multiple ontologies can be combined to integrate pharmacogenomics databases and knowledge, and used to reveal significant associations between drugs, pathways and diseases. In this application, different ontology design decisions lead to different results, making a sound evaluation based on real biomedical data a necessity. Ultimately, developing a research methodology based on ontology evaluation with respect to biomedical applications and analyses will improve the utility of ontologies in integrative biology and translational research.
Sean D. Mooney—Using ontologies for hypothesis generation and prediction in translational informatics
Abstract: Diverse ontologies have been developed for the annotation of disease, genome and proteome function, physiology, pharmacology and phenotype. We have been using automated concept recognition tools to explore ontological concepts present in curated text that describes genes and proteins. In this talk, we will review how to automatically annotate genes and proteins with a diverse set of ontological terms. We compare these annotations to gold-standard curated annotations from Gene Ontology (GO) and human disease ontologies, and describe the scope of annotations we observe. I will then describe the use of these annotations to perform enrichment analysis of these concepts and to generate hypotheses from the results of high-throughput experimentation, typically lists of genes and/or proteins. We show that it is possible to map these annotations to an integrated functional network derived from high-throughput datasets, such as protein interactions, gene coexpression, and shared sequence features. Using this map, we can evaluate which of these terms can be predicted in humans and other model organisms using semi-supervised classification and cross-validation methods. Finally, we will also discuss species-specific gene and protein annotation similarity networks derived from curated and/or automatically annotated terms and their utility in describing gene and concept similarity.