Difference between revisions of "Processing OBR Resources"

From NCBO Wiki
Line 1: Line 1:
== Resources Access Tools ==
+
This wiki page describes the status of the OBR index.
  
This page is for keeping track of what resources we process and what is done with each ResourceAccessTool. There are three main activities: 1) Reprocessing resources and writing access tools for new resources, 2) Executing the annotation workflow, and 3) Migrating to a production architecture.
+
= Resources Access Tools (RATs) development and update =
  
== How to write Resources Access Tools ==
+
This page is for keeping track of Resources Access Tools (RATs) development.
  
 +
=== How to write Resources Access Tools ===
 
How to write a resource access tool is described in this [http://www.stanford.edu/~coulet/obr_tutorial_to_add_new_resources.pdf tutorial] by Adrien Coulet.
 
How to write a resource access tool is described in this [http://www.stanford.edu/~coulet/obr_tutorial_to_add_new_resources.pdf tutorial] by Adrien Coulet.
  
==Reprocessing:  Simple re-run of an existing ResourceAccessTool==
+
=== Available resources ===
  
 +
Fully functional resources in the OBR index are listed here: [http://ncbolabs-dev2.stanford.edu:8080/OBS_v1/obr/resources /obr/resources]
  
1. Resource: CDD
+
=== Resources in development ===
*Person responsible: Adrien (on the way to be managed by Optra)
+
* PubMed (PM)
*Status: annotations to process
+
* Stanford Microarray Database (SMD)
*Last processed: 3/9/2009
+
* Pathway Commons
 +
* CaNanoLab
  
2. Resource: OMIM
+
Others in queue
*Person responsible: Adrien (on the way to be managed by Optra)
 
*Status:  annotations to process
 
*Last processed: 3/9/2009
 
 
 
3. Resource: PharmGKB
 
*Person responsible: Adrien (on the way to be managed by Optra)
 
*Status: annotations to process
 
*Last processed: 3/9/2009
 
 
 
4. Resource: Reactome
 
*Person responsible: Adrien (on the way to be managed by Optra)
 
*Status: annotations to process
 
*Last processed: 3/9/2009
 
 
 
5. Resource: ResearchCrossroads
 
*Person responsible: Adrien (on the way to be managed by Optra)
 
*Status: annotations to process
 
*Last processed: 3/4/2009
 
 
 
6. Resource: UniProt
 
*Person responsible: Adrien (on the way to be managed by Optra)
 
*Status: annotations to process
 
*Last processed: 3/9/2009
 
  
 +
* ChemSpider
 +
* Human Gene Mutation Database
  
=Reprocessing:  Modification of an existing ResourceAccessTool =
+
=== Development history ===
Examples include GEO, ClinicalTrials, Pubmed.
 
  
 
1. Resource: Clinicaltrials.gov  
 
1. Resource: Clinicaltrials.gov  
Line 51: Line 32:
 
         1. Decrease request delay from 1000 ms to 800ms.
 
         1. Decrease request delay from 1000 ms to 800ms.
 
         2. Implement log4j logger mechanism.
 
         2. Implement log4j logger mechanism.
 
*Last processed: 3/20/2009.
 
  
 
2. Resource: GEO  
 
2. Resource: GEO  
Line 59: Line 38:
 
*Status:   
 
*Status:   
 
         Changes done as per suggestions :
 
         Changes done as per suggestions :
         1. Implement log4j logger mechanism.        
+
         1. Implement log4j logger mechanism.  
   
 
*Last processed: 3/20/2009.
 
  
 
3. Resource: Pubmed  
 
3. Resource: Pubmed  
Line 73: Line 50:
 
       4. Code changes pushed in SVN.
 
       4. Code changes pushed in SVN.
  
*Last processed: 3/23/2009.
+
= Execution of the OBR workflow & maintenance of the OBR index =
 
 
=Writing a new ResourceAccessTool=
 
Examples include CaNanoLab.
 
 
 
Others in queue
 
 
 
* PathwayCommons (http://www.pathwaycommons.org/pc/): a web service is available, so we should focus on retrieving all pathways through it.
 
* Stanford Microarray Database
 
* ChemSpider
 
* Human Gene Mutation Database
 
 
 
= Executing the Annotation workflow =
 
 
 
== Population of the OBR index ==  
 
  
 +
== Technical documentation ==
 
Optra & Clement are working on an instruction document to be posted here soon.
 
Optra & Clement are working on an instruction document to be posted here soon.
 
Image Resource: [[Image:PopulationOBRtables.png|thumb|Population of the OBR tables]]
 
Image Resource: [[Image:PopulationOBRtables.png|thumb|Population of the OBR tables]]
  
== Population of OBS tables ==  
+
== Execution history ==
  
See [[Populating_OBS_database]]
+
* June 8, 2009 [Clement]: End of execution of the OBR workflow & switch of the OBR API to the obsdb1.obs_stage schema
 +
* May 22, 2009 [Clement]: OBR workflow run (from scratch) on the new schema obsdb1.obs_stage for all resources
 +
* May 2009 [Adrien]: Execution of the OBR index on RXRD independently (on obsdb1.obs_stage schema)
 +
* March 2009 [Clement]: DB schema currently in use: ncbodb2.obs
  
=Migrating to new Architecture=
+
= Migration to a production architecture & framework =
  
 
The task of writing a skeleton RAT conforming to Cherie's architecture.
 
The task of writing a skeleton RAT conforming to Cherie's architecture.
  
=Ongoing Challenges=
+
= Ongoing Challenges =

Revision as of 08:03, 8 June 2009

This wiki page describes the status of the OBR index.

Resources Access Tools (RATs) development and update

This page is for keeping track of Resources Access Tools (RATs) development.

How to write Resources Access Tools

How to write a resource access tool is described in this tutorial by Adrien Coulet.

Available resources

Fully functional resources in the OBR index are listed here: /obr/resources

Resources in development

  • PubMed (PM)
  • Stanford Microarray Database (SMD)
  • Pathway Commons
  • CaNanoLab

Others in queue

  • ChemSpider
  • Human Gene Mutation Database

Development history

1. Resource: Clinicaltrials.gov

  • Person responsible: Kuladip Yadav (Optra), Sanjay Jadhav (Optra).
  • Notes: Fixed an authentication issue and other XML-related issues.
  • Status:
       Changes made as suggested:
       1. Decrease request delay from 1000 ms to 800 ms.
       2. Implement log4j logger mechanism.
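
As an illustration of the two changes listed for Clinicaltrials.gov, the following is a minimal sketch of how a resource access tool could enforce a fixed delay between requests and log each wait. It is not taken from the OBR code base: the class name RequestThrottle and the method awaitTurn are hypothetical, and java.util.logging stands in for log4j so that the example compiles without external libraries.

```java
import java.util.logging.Logger;

// Hypothetical helper (not from the OBR code base): enforces a minimum
// delay between successive requests to a remote resource.
final class RequestThrottle {
    // java.util.logging is used here only as a dependency-free stand-in for log4j.
    private static final Logger LOG = Logger.getLogger(RequestThrottle.class.getName());

    private final long delayMillis;
    private long lastRequestNanos;
    private boolean firstCall = true;

    RequestThrottle(long delayMillis) {
        if (delayMillis < 0) {
            throw new IllegalArgumentException("delayMillis must be >= 0");
        }
        this.delayMillis = delayMillis;
    }

    // Blocks until delayMillis have passed since the previous call returned.
    // Returns the number of milliseconds passed to Thread.sleep (0 if none).
    synchronized long awaitTurn() {
        long toSleep = 0;
        if (!firstCall) {
            long elapsedMillis = (System.nanoTime() - lastRequestNanos) / 1_000_000L;
            toSleep = Math.max(0, delayMillis - elapsedMillis);
        }
        if (toSleep > 0) {
            try {
                Thread.sleep(toSleep);
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller.
                Thread.currentThread().interrupt();
            }
        }
        firstCall = false;
        lastRequestNanos = System.nanoTime();
        LOG.fine("Waited " + toSleep + " ms before request");
        return toSleep;
    }
}
```

Under these assumptions, a tool would create one new RequestThrottle(800) and call awaitTurn() immediately before each request, so that lowering the delay from 1000 ms to 800 ms becomes a single constructor argument.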

2. Resource: GEO

  • Person responsible: Kuladip Yadav (Optra), Sanjay Jadhav (Optra).
  • Notes: Modified the GEO resource access tool to get data from both the GSE and GDS databases.
  • Status:
       Changes made as suggested:
       1. Implement log4j logger mechanism.

3. Resource: Pubmed

  • Person responsible: Kuladip Yadav (Optra), Sanjay Jadhav (Optra).
  • Notes: Built a new resource access tool from the existing PubMedAccessTool to populate data from eutils and PubMed XML files.
  • Status:
      Changes made as suggested:
      1. Remove the direct database call from the PubMed RAT.
      2. Implement the mapStringsToLocalConceptIDs method in ObsOntologiesAccessTool and TermTable.
      3. Implement log4j logger in related classes.
      4. Code changes pushed to SVN.

Execution of the OBR workflow & maintenance of the OBR index

Technical documentation

Optra & Clement are working on an instruction document to be posted here soon.

Image Resource: [Figure: Population of the OBR tables]

Execution history

  • June 8, 2009 [Clement]: End of execution of the OBR workflow & switch of the OBR API to the obsdb1.obs_stage schema
  • May 22, 2009 [Clement]: OBR workflow run (from scratch) on the new schema obsdb1.obs_stage for all resources
  • May 2009 [Adrien]: Execution of the OBR index on RXRD independently (on obsdb1.obs_stage schema)
  • March 2009 [Clement]: DB schema currently in use: ncbodb2.obs

Migration to a production architecture & framework

The task of writing a skeleton RAT conforming to Cherie's architecture.

Ongoing Challenges