RDF Triple Stores

From NCBO Wiki

Introduction

A Powerpoint presentation (.ppt) provides an introduction to triple stores and their underlying technology.

The following are generally regarded as the prominent open-source triple stores in use as of May 2009:

  • Jena - "a Java framework for building Semantic Web applications. It provides a programmatic environment for RDF, RDFS and OWL, SPARQL and includes a rule-based inference engine."
  • Mulgara - "a scalable RDF database written entirely in Java."
  • OpenLink Virtuoso - "an Object-Relational Database for SQL, XML, RDF, and Free Text that includes Java and .NET runtime hosting; an RDF store and SPARQL engine; a WebDAV Server; a Web Services Platform for SOA; a Web Application Server for Drupal, WordPress, MediaWiki, and phpBB"
  • Sesame - "an open source Java framework for storage and querying of RDF data"

An exhaustive list, including proprietary triple stores, is available elsewhere.

Evaluations

Evaluating or benchmarking triple stores is a contentious issue. Variable factors such as hardware, loading procedures and, more importantly, detailed knowledge of the store (or simply reading and acting on its documentation) can change the results significantly.

The ESW wiki, which collects RDF benchmarking results, provides references to a number of reports and blog posts.

In February 2009, NCBO evaluated triple stores specifically to serve as a backend for BioPortal. The report provides some details on the set-up, load times, and query times. The triple stores were used out of the box, with the default installation parameters left unchanged, so any tuning advice in each store's documentation was apparently not applied. The evaluations were performed on a machine with 3 GB RAM and at least one Intel Xeon 3.2 GHz processor.
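
As an illustration of how load and query times of this kind can be measured, the following is a minimal sketch of a wall-clock timing helper. It is not the harness used in the NCBO evaluation; the class name StopwatchSketch and its methods are hypothetical, and the timed task is only a placeholder for a real load or query call.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical timing helper; not part of the NCBO evaluation code. */
class StopwatchSketch {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Runs the task several times and returns the mean elapsed time in milliseconds. */
    static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            total += timeMillis(task);
        }
        return (double) total / runs;
    }

    /** Placeholder workload: sleeps instead of loading or querying a store. */
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Replace the placeholder with a real load or query call.
        Runnable placeholder = () -> pause(50);
        System.out.println("single run (ms): " + timeMillis(placeholder));
        System.out.println("mean of 5 runs (ms): " + averageMillis(placeholder, 5));
    }
}
```

Averaging over several runs smooths out some of the variation from caching and background load, which is one reason results from different reports are hard to compare.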

Code

The code and a README explaining how to run it are available here. The code can be accessed from the GForge repository.

The steps to run the code are:

  • From Eclipse:
    • Check out the entire project from SVN on GForge.
    • Resolve the classpath by pointing it to the location of the 'Libraries' folder (checked out from SVN) on the local disk.
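    • Set 'sdbName' in the SDB config file (sdb-mysql.ttl), as in the command-line steps below.
    • Create a MySQL database with that name, as in the command-line steps below.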
    • To load an RDF/XML serialization file, enter the following as the program arguments in the Run dialog box:
     -dataload "true" -directory "C:\\Triple Store\\Data" -configFile "C:\Triple Store\Config\sdb-mysql.ttl"


  • From the command line:
    • Open a command prompt in the folder where you checked out the code.
    • Set the classpath to include the jars in the 'Libraries' folder, using the command set CLASSPATH=..... (don't forget to include the current directory as well).
    • In the SDB config file (sdb-mysql.ttl), checked out from the 'Config' folder of the repository, set 'sdbName' to a name of your choice.
    • Create a MySQL database with the same name as the sdbName set in the previous step.
    • In the folder testRDFStore, run the following commands:
      javac *.java
      java  -dataload "true" -directory "C:\\Triple Store\\Data" -configFile "C:\Triple Store\Config\sdb-mysql.ttl"
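
Both sets of steps pass the same three options: -dataload, -directory and -configFile. The sketch below shows one way such "-name value" pairs could be read in Java. It is an assumption for illustration only, not the parsing code in the repository; the class name LoaderArgs and its parse method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that collects "-name value" pairs from a command line. */
class LoaderArgs {

    /** Returns the option names (without the leading dash) mapped to their values. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String name = args[i];
            if (!name.startsWith("-")) {
                throw new IllegalArgumentException("Expected an option name, got: " + name);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for option: " + name);
            }
            // Consume the next argument as this option's value.
            options.put(name.substring(1), args[++i]);
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        // The quoted "true" on the command line arrives here as the plain string true.
        boolean load = Boolean.parseBoolean(options.getOrDefault("dataload", "false"));
        System.out.println("dataload   = " + load);
        System.out.println("directory  = " + options.get("directory"));
        System.out.println("configFile = " + options.get("configFile"));
    }
}
```

The quotes around each value matter because the example paths contain spaces; without them the shell would split "C:\Triple Store\..." into two separate arguments.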