Ontology Driven Annotation, Integration and Analysis of Gene Expression Data




Archiving, integrating, and adding semantic value to high-throughput gene-expression data present major challenges in bioinformatics. The data are diverse, covering many species, diseases, and tissue types, and the annotations need to be detailed enough to capture the nuances of both experimental process and biology. The main challenges in annotating such data lie in identifying suitable ontologies, applying those ontologies consistently to the data within annotation tools, providing context-specific views of existing ontologies for querying, and developing automated annotation pipelines to support curators. In this talk we describe our progress in addressing these challenges, specifically: the development of tools to help build ontologies for applications, the development of a linked data repository of annotation knowledge, and the publication of our gene expression database as RDF linked data, connecting our ontology annotations to the data. Finally, we outline our future work in developing an analysis package for use with this semantically rich expression data, enabling the generation and analysis of biologically relevant gene lists. We highlight how we utilise NCBO tools and the drivers we have identified as a result of this work.



Dr. Malone is a Bioinformatician in the Functional Genomics Production Team at the European Bioinformatics Institute, based in Cambridge, UK. He leads efforts to develop application ontologies and supporting ontology development tools for use in the annotation of biomedical data. He previously worked at the AIAI at the University of Edinburgh on various joint commercial-academic projects in areas including biomedical intelligent system applications. He holds a PhD in Machine Learning in Bioinformatics, an MSc in Bioinformatics, and a BSc in Computer Science.