Difference between revisions of "SPARQL BioPortal"

From NCBO Wiki
Revision as of 19:18, 20 March 2012

NCBO is releasing a free and open SPARQL endpoint to query ontologies hosted in the BioPortal ontology repository. This SPARQL service, which is in beta status, is stable enough for testing by our community of users. If you encounter any errors or unexpected behavior, please report them to us at support@bioontology.org.

Before using the BioPortal SPARQL service, please read our SPARQL Release Notes And Usage Policy.

Web Interface and Query Examples

There is a Web interface to test SPARQL queries at http://sparql.bioontology.org/

Interactive examples can also be tested at http://sparql.bioontology.org/examples

Submitting SPARQL queries programmatically

A GitHub project contains examples that query our SPARQL service programmatically:

https://github.com/ncbo/sparql-code-examples

A tarball with these examples is available for download here:

https://github.com/ncbo/sparql-code-examples/tarball/master

This project contains examples in Java, Python, JavaScript and Perl. Some of the examples use only the language's built-in capabilities, while others need third-party libraries such as Jena, Sesame or SPARQLWrapper. The GitHub project and the tarball are self-contained, so there is no need to download and install extra libraries.

To run these examples, or any other SPARQL queries, programmatically, an API key from BioPortal is required. If you do not have a BioPortal account, go to [New Account] and create one. Once you have a BioPortal account, log in to BioPortal and go to your account details. Your API key is shown as part of your account profile.
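As a rough illustration of the paragraph above, the following Python sketch builds a request URL that carries a query together with an API key, using only the standard library. The endpoint path `/sparql/` and the `apikey` parameter name are assumptions that this page does not state; check the GitHub examples linked above for the exact values. `YOUR-API-KEY` is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint path; this page only gives http://sparql.bioontology.org/
ENDPOINT = "http://sparql.bioontology.org/sparql/"


def build_request_url(query, api_key, endpoint=ENDPOINT):
    """Return a GET URL carrying the query and the API key as parameters."""
    # 'query' is the standard SPARQL protocol parameter;
    # 'apikey' is an assumed name for the BioPortal key parameter.
    params = urlencode({"query": query, "apikey": api_key})
    return f"{endpoint}?{params}"


def run_query(query, api_key, endpoint=ENDPOINT):
    """Send the query and return the raw response body as text."""
    request = Request(
        build_request_url(query, api_key, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")


# Building the URL needs no network access; run_query() does.
print(build_request_url("SELECT * WHERE { ?s ?p ?o } LIMIT 1", "YOUR-API-KEY"))
```

Keeping URL construction separate from the network call makes it possible to inspect the request before sending it.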

Database Named Graph Structure

Each ontology is asserted into a single graph. The graph is named with an acronym-based URI. For example, the graph:

http://bioportal.bioontology.org/ontologies/HP

contains the Human Phenotype Ontology. And the graph:

http://bioportal.bioontology.org/ontologies/SNOMEDCT

contains the SNOMEDCT ontology.

The following query returns every ontology version ID together with the ID of the graph that holds its data:

PREFIX meta: <http://bioportal.bioontology.org/metadata/def/> 

SELECT DISTINCT ?version ?graph
WHERE { 
    ?version meta:hasDataGraph ?graph
}
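If the endpoint returns results in the standard SPARQL JSON results format, the version and graph pairs can be pulled out as in the sketch below. The sample document is made up for illustration, and its version URI is hypothetical; only the JSON layout (`results`, `bindings`, `value`) follows the W3C format.

```python
import json


def version_graph_pairs(results_json):
    """Extract (version, graph) pairs from a SPARQL JSON results document."""
    document = json.loads(results_json)
    pairs = []
    for binding in document["results"]["bindings"]:
        # Each bound variable maps to an object with "type" and "value" keys.
        pairs.append((binding["version"]["value"], binding["graph"]["value"]))
    return pairs


# Made-up sample shaped like a SPARQL JSON response (not real endpoint output).
SAMPLE = json.dumps({
    "head": {"vars": ["version", "graph"]},
    "results": {"bindings": [{
        "version": {"type": "uri", "value": "http://example.org/version/1"},
        "graph": {"type": "uri",
                  "value": "http://bioportal.bioontology.org/ontologies/HP"},
    }]},
})

for version, graph in version_graph_pairs(SAMPLE):
    print(version, "->", graph)
```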

BioPortal Preferred Label

Some ontologies define labels in ways that make them hard to query consistently. To provide a consistent mechanism for querying by label across different ontologies, we generate labels for the following cases. These labels are attached to terms using the predicate http://bioportal.bioontology.org/metadata/def/prefLabel (bp:prefLabel).

  • Missing labels: for every owl:Class that is missing a label, we generate a label based on the last fragment of its URI.
  • Terms that use rdfs:label as preferred name: BioPortal uses skos:prefLabel and skos:altLabel for preferred names and synonyms respectively. Both skos:prefLabel and skos:altLabel are subproperties of rdfs:label in the SKOS ontology. In the SKOS context, a name recorded with rdfs:label could therefore be either a preferred name or a synonym. To avoid this ambiguity, we generate a bp:prefLabel for every rdfs:label used as a preferred name.
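To make the "last fragment of the URI" idea concrete, here is a minimal sketch of one way such a label could be derived. It is an illustration only, not BioPortal's actual label-generation code, and the `example.org` URIs are hypothetical.

```python
def label_from_uri(uri):
    """Return the text after the last '#' or '/' in the URI."""
    # Drop trailing separators so "http://example.org/onto/" still yields "onto".
    trimmed = uri.rstrip("/#")
    # Cut at whichever separator occurs last in the URI.
    cut = max(trimmed.rfind("#"), trimmed.rfind("/"))
    return trimmed[cut + 1:]


print(label_from_uri("http://example.org/onto#Heart"))
print(label_from_uri("http://example.org/onto/Heart"))
```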

Preferred Label, Synonyms and other common predicates

When an ontology is submitted to BioPortal, the user can select which predicates the ontology uses for:

  • Preferred Names.
  • Synonyms or alternative names.
  • Author.
  • Description.

The BioPortal SPARQL endpoint supports rdfs:subPropertyOf reasoning to enable cross querying across all these configurable predicates. In the triple store, the following URI:

http://bioportal.bioontology.org/ontologies/globals

is used as the identifier for the named graph that contains all the rdfs:subPropertyOf statements configured by users when uploading their ontologies. The root properties that trigger the reasoning are the following:

  • skos:prefLabel for Preferred name.
  • skos:altLabel for Synonyms or alternative names.
  • dc:author for Author.
  • rdfs:comment for Description.
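The effect of this reasoning can be pictured with a small in-memory sketch: a lookup on a root property also matches triples recorded with a predicate declared as its subproperty. This is a simplified illustration, not how the triple store implements it. The predicate `http://example.org/onto#preferredName` and the sample triples are hypothetical, and each property is assumed to have at most one parent.

```python
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
CUSTOM_NAME = "http://example.org/onto#preferredName"  # hypothetical predicate

# Child -> parent, standing in for a subPropertyOf statement in the globals graph.
SUB_PROPERTY_OF = {CUSTOM_NAME: SKOS_PREF_LABEL}

# Hypothetical ontology data that uses the custom predicate for preferred names.
TRIPLES = [
    ("http://example.org/onto#Heart", CUSTOM_NAME, "heart"),
    ("http://example.org/onto#Heart", "http://example.org/onto#code", "H1"),
]


def matches_root(predicate, root, sub_property_of):
    """True if predicate is root or reaches it through subPropertyOf links."""
    seen = set()
    while predicate is not None and predicate not in seen:
        if predicate == root:
            return True
        seen.add(predicate)  # guard against cycles in the declarations
        predicate = sub_property_of.get(predicate)
    return False


def values_for(subject, root, triples, sub_property_of):
    """Collect objects whose predicate is root or one of its subproperties."""
    return [o for s, p, o in triples
            if s == subject and matches_root(p, root, sub_property_of)]


print(values_for("http://example.org/onto#Heart", SKOS_PREF_LABEL,
                 TRIPLES, SUB_PROPERTY_OF))
```

Without the subproperty map (an empty dictionary), the same lookup finds nothing, which mirrors what happens when the globals graph is left out of a query.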

If you use named graphs and want this reasoning to apply, include the globals graph that contains the subproperty statements. For example:

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?termURI ?prefLabel
 FROM <http://bioportal.bioontology.org/ontologies/EHDA>
 FROM <http://bioportal.bioontology.org/ontologies/globals> 
WHERE {
      ?termURI a owl:Class;
      skos:prefLabel ?prefLabel .
} 

Otherwise, the query processor will not take the subproperty statements into account.
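When queries are generated from code, it is easy to forget the globals graph. The helper below is a small sketch that always adds it next to the ontology graph. The function name `pref_label_query` is invented for this example, and the acronym is inserted as-is without validation.

```python
ONTOLOGY_GRAPH_BASE = "http://bioportal.bioontology.org/ontologies/"
GLOBALS_GRAPH = ONTOLOGY_GRAPH_BASE + "globals"


def pref_label_query(acronym):
    """Build a preferred-label query over one ontology plus the globals graph."""
    return "\n".join([
        "PREFIX owl:  <http://www.w3.org/2002/07/owl#>",
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
        "SELECT DISTINCT ?termURI ?prefLabel",
        f"  FROM <{ONTOLOGY_GRAPH_BASE}{acronym}>",
        # The globals graph supplies the subproperty statements.
        f"  FROM <{GLOBALS_GRAPH}>",
        "WHERE {",
        "  ?termURI a owl:Class ;",
        "           skos:prefLabel ?prefLabel .",
        "}",
    ])


print(pref_label_query("EHDA"))
```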