SPARQL BioPortal

From NCBO Wiki
Revision as of 21:26, 5 September 2012 by Whetzel

NCBO is releasing a free and open SPARQL endpoint to query ontologies hosted in the BioPortal ontology repository. This SPARQL service is in BETA status but is stable enough for testing by our community of users. If you encounter any errors or unexpected behavior, please report them to us at support@bioontology.org.

Before using the BioPortal SPARQL service, please read our SPARQL Release Notes And Usage Policy.

Web Interface and Query Examples

There is a Web interface to test SPARQL queries at http://sparql.bioontology.org/

Interactive examples are available at http://sparql.bioontology.org/examples

Submitting SPARQL queries programmatically

A GitHub project contains examples that query our SPARQL service programmatically:

https://github.com/ncbo/sparql-code-examples

A tarball with these examples is available for download here:

https://github.com/ncbo/sparql-code-examples/tarball/master

This project contains examples in Java, Python, JavaScript and Perl. Some of the examples use only built-in language capabilities; others need third-party libraries such as Jena, Sesame or SPARQLWrapper. The GitHub project and the tarball are self-contained, so there is no need to download and install extra libraries.

To run these examples, or any other SPARQL query, programmatically, an API key from BioPortal is required. If you do not have a BioPortal account, go to [New Account] and create one. Once you have an account, log in to BioPortal and go to your account details. Your API key is shown as part of your account profile.
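As a rough illustration of what a programmatic request might look like, the sketch below builds an HTTP GET request using only the Python standard library. The endpoint path (`/sparql/`) and the `apikey` parameter name are assumptions made for this sketch; check the examples in the GitHub project for the exact values the service expects. `build_request` and `run_query` are hypothetical helper names.

```python
import urllib.parse
import urllib.request

# Assumption: the query endpoint lives under /sparql/ on the service host.
SPARQL_ENDPOINT = "http://sparql.bioontology.org/sparql/"


def build_request(query, api_key, endpoint=SPARQL_ENDPOINT):
    """Return a GET request that carries the query and the API key."""
    # Assumption: the API key is passed as an "apikey" URL parameter.
    params = urllib.parse.urlencode({"query": query, "apikey": api_key})
    request = urllib.request.Request(endpoint + "?" + params)
    # Ask for the standard SPARQL JSON results format.
    request.add_header("Accept", "application/sparql-results+json")
    return request


def run_query(query, api_key):
    """Send the query and return the raw response body as text."""
    with urllib.request.urlopen(build_request(query, api_key)) as response:
        return response.read().decode("utf-8")
```

Calling `run_query("SELECT * WHERE { ?s ?p ?o } LIMIT 1", "YOUR-API-KEY")` would send the request; replace the placeholder with the key from your account profile.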

Database Named Graph Structure

Each ontology is asserted into a single graph. The graph is named with an acronym-based URI. For example, the graph:

http://bioportal.bioontology.org/ontologies/HP

contains the Human Phenotype Ontology, and the graph:

http://bioportal.bioontology.org/ontologies/SNOMEDCT

contains the SNOMEDCT ontology.
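The naming scheme shown in the two examples above can be captured in a pair of small helpers. This is a minimal sketch that assumes every graph URI is the common base followed by the ontology acronym, as in the HP and SNOMEDCT examples; `graph_uri` and `acronym_of` are hypothetical names.

```python
GRAPH_BASE = "http://bioportal.bioontology.org/ontologies/"


def graph_uri(acronym):
    """Return the named-graph URI for an ontology acronym."""
    return GRAPH_BASE + acronym


def acronym_of(graph):
    """Return the acronym encoded in a named-graph URI."""
    if not graph.startswith(GRAPH_BASE):
        raise ValueError("not an ontology graph URI: %s" % graph)
    return graph[len(GRAPH_BASE):]
```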

The following query returns every ontology version ID together with the ID of the graph where that version is stored:

PREFIX meta: <http://bioportal.bioontology.org/metadata/def/> 

SELECT DISTINCT ?version ?graph
WHERE { 
    ?version meta:hasDataGraph ?graph
}
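If the response to the query above is requested in the standard SPARQL JSON results format, the version and graph pairs can be extracted as sketched below. The sample response is invented for illustration (the version URI in particular is a placeholder, not a real BioPortal identifier), and `bindings_to_rows` is a hypothetical helper name.

```python
import json

# Invented sample in the SPARQL 1.1 JSON results layout; not real data.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["version", "graph"]},
  "results": {"bindings": [
    {"version": {"type": "uri", "value": "http://example.org/placeholder-version/1"},
     "graph": {"type": "uri", "value": "http://bioportal.bioontology.org/ontologies/HP"}}
  ]}
}
"""


def bindings_to_rows(results_json, variables):
    """Turn SPARQL JSON results into a list of tuples, one per solution."""
    data = json.loads(results_json)
    rows = []
    for binding in data["results"]["bindings"]:
        # A variable may be unbound in a solution; use None in that case.
        rows.append(tuple(binding.get(v, {}).get("value") for v in variables))
    return rows


for version, graph in bindings_to_rows(SAMPLE_RESPONSE, ["version", "graph"]):
    print(version, graph)
```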

BioPortal Preferred Label

Some ontologies define labels in problematic ways. To provide a consistent mechanism for querying by label across different ontologies, we generate labels in the following cases. These labels are attached to terms using the predicate http://bioportal.bioontology.org/metadata/def/prefLabel (bp:prefLabel).

  • Missing labels: for every owl:Class that is missing a label we generate a label based on the last fragment of its URI.
  • Terms that use rdfs:label as preferred name: BioPortal uses skos:prefLabel and skos:altLabel for preferred names and synonyms respectively. Both skos:prefLabel and skos:altLabel are subproperties of rdfs:label in the SKOS ontology. In the SKOS context, recording a preferred name with rdfs:label therefore only says that the name could be either a preferred name or a synonym. To avoid this ambiguity we generate a bp:prefLabel for every rdfs:label used as a preferred name.
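To make the "missing labels" case concrete, the sketch below derives a label from the last fragment of a class URI. It assumes the fragment is whatever follows the last `#` or `/`; the exact rule BioPortal applies is not described here, so treat `label_from_uri` as a hypothetical approximation.

```python
def label_from_uri(uri):
    """Return the text after the last '#' or '/' of a URI."""
    # Ignore trailing separators so "http://example.org/a/" yields "a".
    trimmed = uri.rstrip("/#")
    # rfind returns -1 when the separator is absent, so cut + 1 is 0
    # and the whole string is returned in that case.
    cut = max(trimmed.rfind("#"), trimmed.rfind("/"))
    return trimmed[cut + 1:]
```

For example, `label_from_uri("http://www.w3.org/2002/07/owl#Thing")` gives `"Thing"`.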

Preferred Label, Synonyms and other common predicates

When ontologies are submitted to BioPortal the user can select which predicates that ontology uses for:

  • Preferred Names.
  • Synonyms or alternative names.
  • Author.
  • Description.

The BioPortal SPARQL endpoint supports rdfs:subPropertyOf reasoning to enable cross querying across all these configurable predicates. In the triple store, the following URI:

http://bioportal.bioontology.org/ontologies/globals

is used as the identifier for the named graph that contains all the rdfs:subPropertyOf statements configured by users when uploading their ontologies. The root properties that trigger the reasoning are the following:

  • skos:prefLabel for Preferred name.
  • skos:altLabel for Synonyms or alternative names.
  • dc:author for Author.
  • rdfs:comment for Description.

If you query specific named graphs and want this reasoning to apply, you must also include the globals graph, which contains the subproperty statements. For example:

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?termURI ?prefLabel
FROM <http://bioportal.bioontology.org/ontologies/EHDA>
FROM <http://bioportal.bioontology.org/ontologies/globals>
WHERE {
    ?termURI a owl:Class ;
             skos:prefLabel ?prefLabel .
}

Otherwise the query processor will not take the subproperty statements into account, and only terms that use skos:prefLabel directly will match.
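When queries are generated from code, it is easy to forget the globals graph. The sketch below builds the preferred-label query for any ontology acronym and adds the globals graph unless told otherwise. `pref_label_query` and its `with_reasoning` flag are hypothetical names introduced for this illustration.

```python
GRAPH_BASE = "http://bioportal.bioontology.org/ontologies/"
GLOBALS_GRAPH = GRAPH_BASE + "globals"


def pref_label_query(acronym, with_reasoning=True):
    """Build a preferred-label query restricted to one ontology graph."""
    graphs = [GRAPH_BASE + acronym]
    if with_reasoning:
        # The globals graph holds the rdfs:subPropertyOf statements.
        graphs.append(GLOBALS_GRAPH)
    lines = [
        "PREFIX owl:  <http://www.w3.org/2002/07/owl#>",
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
        "SELECT DISTINCT ?termURI ?prefLabel",
    ]
    lines += ["FROM <%s>" % graph for graph in graphs]
    lines += [
        "WHERE {",
        "    ?termURI a owl:Class ;",
        "             skos:prefLabel ?prefLabel .",
        "}",
    ]
    return "\n".join(lines)


print(pref_label_query("EHDA"))
```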

Partial or Incomplete Results

sparql.bioontology.org uses 4store's internal soft-limit mechanism to cap the resources consumed by expensive queries. Our setup is configured to bind 8K elements per triple pattern. If a query hits this limit, a warning message is appended to the query response. The message reads something like: "hit complexity limit 8 times". If you see this warning, the results are incomplete, and there is probably a more efficient way to write the query.

Contact our support mailing list if you need help rewriting your query more efficiently to avoid incomplete results.
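A client can check for this warning before trusting a result set. The sketch below scans the raw response text for the phrase quoted above. It assumes the wording matches that example; where the warning appears in the response may vary with the result format, so the check is applied to the whole body. `complexity_limit_hits` is a hypothetical helper name.

```python
import re

# Pattern based on the example message quoted in this section.
LIMIT_WARNING = re.compile(r"hit complexity limit (\d+) times")


def complexity_limit_hits(response_text):
    """Return how many times the soft limit was hit, or 0 if no warning."""
    match = LIMIT_WARNING.search(response_text)
    return int(match.group(1)) if match else 0
```

A non-zero return value means the results should be treated as incomplete.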
