NCBO User Profile: Simon Twigger, PhD, Medical College of Wisconsin
Data Management and Integration in the Biological Sciences
Need for NCBO Services:
One approach to the challenges of both information retrieval and data analysis is to use ontologies to annotate, and then aggregate, data found in online data repositories. Annotation takes time, so for this to be tractable at the scale required, the process has to be as automated as possible. For the annotations to be useful in a broader context, they should come from a core set of popular, widely used ontologies so that they can be integrated with complementary datasets produced by others. The NCBO Annotator addresses both aspects of this problem by providing a high-throughput way to annotate text using virtually all of the popular bio-ontologies in use by the community.
Specific Uses of NCBO Services:
The rat has powerful genetic tools and hence is a popular model for studying the role of the genome in health and disease. This popularity has led to a large amount of rat gene expression data being amassed in online repositories such as NCBI's Gene Expression Omnibus (GEO) database. Using the NCBO tools as a foundation, we have built the GMiner tool as a platform to annotate and curate this data and make it available to the broader community. The figure opposite shows the workflow being used. GEO data is queued for analysis within GMiner (1); the specified datasets are then retrieved from GEO and distributed to the NCBO annotation tools via a message queue architecture, and the resulting annotations are saved to the local database (2). The GMiner web tool is then used to review and curate these automated annotations (3), correcting any errors or omissions. The final dataset can then be searched online (4) and/or exported to other tools for integration with other genomic datasets. Through this work we have enhanced our ability to use rat gene expression data from specific rat strains, derived from particular anatomical regions, or collected following treatment with specific drugs.
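
The four workflow steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not GMiner's actual code: every name in it (Record, TOY_ONTOLOGY, EXAMPLE_REPOSITORY, run_pipeline, curate, search) is hypothetical, the dataset identifiers and term identifiers are invented placeholders, an in-memory dictionary stands in for GEO, a simple substring matcher stands in for the NCBO Annotator, and a standard-library queue stands in for the message queue architecture.

```python
import queue
from dataclasses import dataclass, field

# Hypothetical label -> term-ID table standing in for the NCBO Annotator.
TOY_ONTOLOGY = {
    "kidney": "ANATOMY:kidney",
    "liver": "ANATOMY:liver",
    "captopril": "DRUG:captopril",
    "dahl salt-sensitive": "STRAIN:SS",
}

# Hypothetical dataset descriptions standing in for GEO (IDs are placeholders).
EXAMPLE_REPOSITORY = {
    "DATASET_A": "Kidney expression in Dahl salt-sensitive rats",
    "DATASET_B": "Liver response to captopril treatment",
}


@dataclass
class Record:
    dataset_id: str
    description: str
    annotations: set = field(default_factory=set)
    curated: bool = False


def annotate(text):
    """Stub annotator: case-insensitive substring match against TOY_ONTOLOGY."""
    lowered = text.lower()
    return {term for label, term in TOY_ONTOLOGY.items() if label in lowered}


def run_pipeline(dataset_ids, repository):
    """Steps 1 and 2: queue datasets, retrieve and annotate them, store results."""
    jobs = queue.Queue()
    for dataset_id in dataset_ids:          # (1) queue datasets for analysis
        jobs.put(dataset_id)

    database = {}                           # stands in for the local database
    while not jobs.empty():
        dataset_id = jobs.get()
        record = Record(dataset_id, repository[dataset_id])  # (2) retrieve
        record.annotations = annotate(record.description)    # (2) annotate
        database[dataset_id] = record                        # (2) save
    return database


def curate(database, dataset_id, add=(), remove=()):
    """Step 3: a curator adds missed terms, removes wrong ones, and signs off."""
    record = database[dataset_id]
    record.annotations = (record.annotations | set(add)) - set(remove)
    record.curated = True


def search(database, term):
    """Step 4: return IDs of curated datasets annotated with the given term."""
    return sorted(
        r.dataset_id
        for r in database.values()
        if r.curated and term in r.annotations
    )
```

In this sketch only curated records are searchable, which mirrors the order of steps (3) and (4); whether GMiner enforces that ordering is an assumption. A production version would replace the stub annotator with calls to the NCBO annotation service and the in-process queue with a real message broker.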



