OBD:Querying

From NCBO Wiki


This page is a little out of date. See OBD:UI



Introduction: Querying OBD

This page is the main home for demos of the Open Bio-Ontologies Database (OBD:Main_Page). The software and data on this page should be considered alpha or pre-alpha, and URLs are subject to change.

OBD will be accessible via the BioPortal web portal, and also programmatically via one or more endpoints. The current focus of our work is on the endpoints, but we have some demos that use existing client software.

Datasets

OBD nodes will house a variety of annotation data spanning multiple ontologies. The current focus of OBD is on data for the three NCBO Driving Biological Projects (DBPs); much of this data is in a preliminary state. We also have some demos that include other datasets here.

Phenotype Annotations

See PATO Annotations. The eventual dataset will consist of detailed annotations of 200 orthologous genes across Fly, Zebrafish and Human. Currently only a few genes (EYA, ACHE) are annotated across all three species, although the different genotypes already yield over 700 annotations.

Mouse and other phenotypes will also be added; this will be extremely useful once species-specific phenotype ontologies are defined according to the methodology set out in PATO:Pre_vs_post-coordinating.

Clinical Trial Data

This data has been imported directly from the TrialBank database (via an Ocelot2OWL conversion)

Fly Gene Expression Image Data

This dataset is not part of the requirements for any of the existing DBPs, but it is a useful illustration of how annotated images can be handled within the OBD framework.

Gene Ontology Annotations

The Gene Ontology project is distinct from NCBO; however, it is useful to illustrate how the GO annotation model is consistent with the representation of annotation in OBD.

BIRN Data

Forthcoming

Web Apps

These UIs are preliminary, have many rough edges, and are for demo purposes only. They will be superseded by BioPortal. Note that OBD will expose as much API and query capability as possible, allowing developers to write their own OBD client components and software.

OBD-Phenotype

  • URL: http://spade.lbl.gov:8100
  • Data Sources: FlyBase ZFIN OMIM
  • Ontologies: PATO, FMA, fly_anatomy, zebrafish_anatomy, Cell, GO-BP, GO-CC, GO-MF

The easiest way to search for phenotypes is to leave the search box blank, choose phenotype data, and click Go. You should see a list of 700 or so phenotypes. You can use the filters at the top to see how the phenotype annotations are divided among the different bearer ontologies.

Click on any of the bearer entity terms to see other phenotypes annotated to that term. You will see a term info page, which currently lists the annotations. Note that phenotypes are currently propagated only up the is_a hierarchy.
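The propagation rule mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not the OBD implementation: the term names, the annotation ids and the `IS_A` and `DIRECT` tables are all made up for the example. An annotation made to a term is also shown on every term reachable by following is_a links; other relations are ignored.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: term -> direct is_a parents.
IS_A = {
    "compound eye": ["eye"],
    "eye": ["sense organ"],
    "sense organ": ["organ"],
    "organ": [],
}

# Hypothetical direct annotations: bearer term -> annotation ids.
DIRECT = {
    "compound eye": ["annot1", "annot2"],
    "eye": ["annot3"],
}


def ancestors(term, is_a):
    """Return the term itself plus every term reachable via is_a links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(is_a.get(current, []))
    return seen


def propagate(direct, is_a):
    """Map each term to the annotations made to it or to an is_a descendant."""
    propagated = defaultdict(set)
    for term, annotations in direct.items():
        for target in ancestors(term, is_a):
            propagated[target].update(annotations)
    return propagated


result = propagate(DIRECT, IS_A)
for term in IS_A:
    print(term, sorted(result[term]))
```

In this toy data the term info page for "eye" would list all three annotations, while the page for "compound eye" would list only the two made directly to it.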

You can also use the tree explorer to select the terms whose annotations you want to view.

OBD-CT

An absolutely bare-bones, no-frills generic interface that directly exposes the TrialBank classes and instances.

Endpoints

OBD will be accessible programmatically from a variety of endpoints. Clients other than BioPortal will be able to access OBD through advanced queries

OBD Basic Annotation API Endpoints

This is the REST/API layer that wraps any OBD installation. Documentation to follow.

OBD-SQL Endpoints

Documentation to follow

Download area for OBD-SQL dumps to follow

Sesame/SeRQL Endpoints

Download area for OBD-RDF/OWL dumps to follow


Server details:

This version only allows SeRQL queries. We will put up a Sesame2 endpoint soon, which will give the option of SPARQL queries, among other things.

Clients:

Phenotype annotation Sesame Server

Clinical trials Sesame Server

SPARQL Endpoints

The W3C SPARQL Protocol for RDF defines how clients send queries to these endpoints.

SPARQL has certain known limitations (e.g. lack of aggregate queries). In addition, certain issues regarding OWL semantics and entailment need to be clarified. Nevertheless, SPARQL endpoints are the de facto standard means of exposing large databases on the semantic web, so OBD is committed to being compliant in this respect.
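Until aggregates are available in the query language, a client can fetch the plain rows and do the counting itself. The sketch below shows this for a result in the W3C SPARQL Query Results JSON format; the variable names and the example.org URIs in `SAMPLE` are invented for the example, and whether a given OBD endpoint returns JSON is an assumption.

```python
import json
from collections import Counter

# Hypothetical result document in the SPARQL Query Results JSON format.
SAMPLE = """
{
  "head": {"vars": ["gene", "term"]},
  "results": {"bindings": [
    {"gene": {"type": "uri", "value": "http://example.org/gene/A"},
     "term": {"type": "uri", "value": "http://example.org/term/T1"}},
    {"gene": {"type": "uri", "value": "http://example.org/gene/B"},
     "term": {"type": "uri", "value": "http://example.org/term/T1"}},
    {"gene": {"type": "uri", "value": "http://example.org/gene/A"},
     "term": {"type": "uri", "value": "http://example.org/term/T2"}}
  ]}
}
"""


def count_by(results, variable):
    """Count result rows per distinct value bound to the given variable."""
    counts = Counter()
    for row in results["results"]["bindings"]:
        # A variable may be unbound in some rows; skip those rows.
        if variable in row:
            counts[row[variable]["value"]] += 1
    return counts


results = json.loads(SAMPLE)
term_counts = count_by(results, "term")
for value, n in term_counts.most_common():
    print(n, value)
```

The obvious cost is that every row crosses the network before it is counted, which is part of why the lack of server-side aggregates matters for large databases.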

As mentioned above, the version of Sesame being used does not provide a SPARQL endpoint. However, we have taken advantage of the relational schema mapping capabilities of D2RQ and D2R-Server (http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/) to expose two relational databases of relevance to existing NCBO DBPs.

None of these endpoints are self-describing; we are watching for recommendations (ie here). Descriptions intended for humans follow.

These SPARQL endpoints can be queried directly (e.g. via the snorql interface that comes with each endpoint), or by using a semantic web browser. Examples of the latter include:

  1. Disco : http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/disco/
  2. Tabulator : http://dig.csail.mit.edu/2005/ajar/release/tabulator/0.7/tab.html
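Besides snorql and the browsers listed above, a script can query an endpoint directly: under the SPARQL Protocol, a query is sent as the `query` parameter of an HTTP GET request. The following is a minimal sketch using only the Python standard library. The `ENDPOINT` URL is a placeholder rather than a real OBD address, the query is a generic one, and the assumption that the server honours a JSON `Accept` header should be checked against the actual endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address; substitute the URL of a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"

# Generic query: fetch any ten triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request carrying the query text."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for JSON results; an endpoint may ignore this and return XML.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)

# To run the query against a live endpoint:
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       body = response.read().decode("utf-8")
```

The request is only built here, not sent, so the sketch runs without network access.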

Fly insitu expression image annotation OBD SPARQL Endpoint

See OBD:SPARQL-InSitu for full details. This endpoint wraps the BDGP InSitu Database. Note that the wrapped schema is an extension of the GO Database schema (below).

GO Annotation OBD SPARQL Endpoint

See OBD:SPARQL-GO for full details. This endpoint wraps the GO Database.
