NCBO User Profile: Line Pouchard, Oak Ridge National Laboratory
Semantic Technologies in the Earth and Climate Systems Sciences
“The NCBO Virtual Machine installed at ORNL has been invaluable to demonstrate the potential of ontologies and semantics. Scientists apply them for interoperability, and developers have gained a better appreciation for the complexity of interdisciplinary research.”
Line Pouchard (lead)1, Robert Cook1, Jim Green1, Michael Huhns2, Natasha Noy3, Giri Palanisamy1
(1: Oak Ridge National Laboratory, 2: University of South Carolina, 3: Stanford University)
Research Interests: Earth and climate systems scientists produce and consume large amounts of diverse, multi-scale data that are kept in numerous institutional data centers, each with its own mission and policies. The data centers provide access, curation, preservation, and specialized tools to ensure data discovery, re-use, and provenance. Our team provides semantic services to improve data discovery at two such centers. The Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) is a NASA-sponsored effort holding observational and experimental datasets on biogeochemical dynamics, ecology, and environmental processes. The NSF-sponsored Data Observation Network for Earth (DataONE) is the foundation of innovative environmental science through a distributed framework and sustainable cyberinfrastructure.
Use of NCBO technology: Using our own image of NCBO-BioPortal, our team added semantic functionality to the existing metadata repository and faceted search engine (Mercury) used by both the ORNL DAAC and DataONE. The primary motivations for deploying our own image were ease of programmatic access behind a firewall and customization of content for the domain sciences. Our VM currently hosts the Semantic Web for Earth and Environmental Terminology (SWEET) and other ontologies of interest in Earth and climate sciences. We enabled search and programmatic access to the ontology terms through the NCBO REST services, and we show ontology search results in our faceted display. The displayed ontology terms provide context for a search term and supply additional search terms for the Solr/Lucene index search in Mercury. We analyzed the coverage of the SWEET ontologies for the ORNL DAAC using frequency counts. Results appear in the figure. Of the top 100 search terms in the ORNL DAAC, 79 are covered in SWEET: 36 appear once, and two terms (water and carbon) appear 38 and 28 times, respectively.
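To illustrate how ontology terms can supply additional search terms for a Solr/Lucene index, the following is a minimal sketch, not the Mercury implementation. It assumes the related ontology labels have already been retrieved (for example through the REST services) and only shows how they might be combined with the user's term into a Lucene-syntax query string. The function names `quote_phrase` and `expand_query`, the `boost` parameter, and the sample labels are all hypothetical.

```python
def quote_phrase(phrase):
    """Wrap a phrase in double quotes, escaping backslashes and quotes."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def expand_query(term, related_labels, boost=2.0):
    """Build an OR query from the user's term plus related ontology labels.

    The original term is boosted so that direct matches rank above
    matches on the added terms. Case-insensitive duplicates are skipped.
    """
    seen = {term.lower()}
    clauses = [f"{quote_phrase(term)}^{boost}"]
    for label in related_labels:
        key = label.lower()
        if key not in seen:
            seen.add(key)
            clauses.append(quote_phrase(label))
    return " OR ".join(clauses)


# Hypothetical labels standing in for an ontology search result.
query = expand_query("carbon", ["carbon dioxide", "Carbon", "carbon cycle"])
print(query)
```

Boosting the original term is one possible design choice; an unboosted OR query would treat the user's term and the ontology-derived terms equally.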
Future work includes 1) further evaluation of ontological coverage for DataONE, and 2) deployment of the NCBO-VM in the cloud for the ESIP foundation.
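A coverage analysis based on frequency counts, like the one described above for SWEET and the ORNL DAAC, could be sketched as follows. This is an illustration under stated assumptions rather than the procedure our team used: it assumes ontology labels are available as plain strings and counts a search term as occurring in a label when it matches as a whole word or phrase, ignoring case. The function names and the toy data are hypothetical.

```python
import re
from collections import Counter


def term_frequencies(search_terms, labels):
    """Count, for each search term, the ontology labels that contain it."""
    freq = Counter()
    for term in search_terms:
        # Whole-word (or whole-phrase) match, case-insensitive.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        freq[term] = sum(1 for label in labels if pattern.search(label))
    return freq


def coverage_summary(freq):
    """Summarize how many terms are covered and how many appear exactly once."""
    return {
        "total": len(freq),
        "covered": sum(1 for n in freq.values() if n > 0),
        "appearing_once": sum(1 for n in freq.values() if n == 1),
    }


# Toy data only; not taken from SWEET or from the ORNL DAAC search logs.
labels = ["water", "water vapor", "surface water",
          "carbon", "carbon dioxide", "soil moisture"]
search_terms = ["water", "carbon", "soil moisture", "biomass"]

freq = term_frequencies(search_terms, labels)
summary = coverage_summary(freq)
print(freq.most_common())
print(summary)
```

The same functions could be run against a different list of search terms, such as one drawn from DataONE, to compare coverage across data centers.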