MetaMap Installation

From NCBO Wiki

Revision as of 03:35, 22 January 2010

Introduction

MetaMap is a tool that maps biomedical text to the UMLS Metathesaurus or to a custom Metathesaurus; equivalently, it discovers the Metathesaurus concepts referred to in text. This article describes how to use the Data File Builder utility inside MetaMap to build a concept recognizer for text containing NCBO terms.

Installation of MetaMap involves 3 main stages.

1. Preparation of data

2. Setting up the workspace

3. Installation of MetaMap

Each step is explained below, followed by instructions for running the newly installed custom MetaMap.

Preparation of data

There are primarily 4 files to be generated for the installation.

1. MRCON: This file is generated primarily from the table OBS_TT. The mapping below shows which OBS_TT column fills each MRCON column; a dash marks an MRCON column with no corresponding OBS_TT column.

MRCON                       OBS_TT
CUI (Concept Unique ID)     conceptID
Language                    -
Status                      isPreferred
LUI (Lexical Unique ID)     termID
String type                 -
SUI (String Unique ID)      termID
String                      termName
LRL                         -

The SQL command fills the corresponding values into the MRCON table in the database.


Example entry row:

C0027051|ENG|P|L0027051|PF|S0064638|Myocardial Infarction|0|


2. MRSO: This file is generated primarily from the tables OBS_TT, OBS_OT and OBS_CT. The mapping below shows which column of the OBS tables fills each MRSO column; a dash marks an MRSO column with no corresponding OBS column.

MRSO                        OBS tables
CUI (Concept Unique ID)     OBS_TT.conceptID
LUI (Lexical Unique ID)     OBS_TT.termID
SUI (String Unique ID)      OBS_TT.termID
SAB (Source Abbrev)         OBS_OT.localOntologyID
TermType                    OBS_TT.isPreferred
SourceID                    OBS_OT.ontologyID
Restriction level           -

The SQL command fills the corresponding values into the MRSO table in the database.

Example entry row:

C0027051|L0027051|S0064638|MEDLINEPLUS|ET|T5|0|


3. MRSTY: This file is generated primarily from the table OBS_STT. The mapping below shows which OBS_STT column fills each MRSTY column.

MRSTY                       OBS_STT
CUI (Concept Unique ID)     OBS_STT.conceptID
TUI (term Unique ID)        OBS_STT.localSemanticTypeID
STY                         OBS_STT.semanticTypeName

The SQL command fills the corresponding values into the MRSTY table in the database.

Example entry row:

C0027051|T047|Disease or Syndrome|


4. st.raw: This file is generated primarily from the column semanticTypeName in the table OBS_STT. It contains one row per semantic type, giving the semantic type name and a short name for it.

Example row: semantic Type Name|semantictypename
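
The file formats above can be sanity-checked with a few lines of shell. The sketch below is illustrative only: the helper names count_fields and st_raw_row are hypothetical and not part of MetaMap, and the short-name rule (lowercase the semantic type name and drop its spaces) is an assumption inferred from the example row, not a documented convention.

```shell
#!/bin/sh
# Hypothetical helpers for checking the generated files; not part of MetaMap.

# Count the pipe-terminated fields in one row, e.g. "a|b|c|" has 3.
count_fields() {
    printf '%s\n' "$1" | awk -F'|' '{ print NF - 1 }'
}

# Build an st.raw row: the semantic type name, then a short name.
# ASSUMPTION: the short name is the name lowercased with spaces removed.
st_raw_row() {
    short=$(printf '%s' "$1" | tr -d ' ' | tr '[:upper:]' '[:lower:]')
    printf '%s|%s\n' "$1" "$short"
}

# Field counts of the example rows shown above.
count_fields 'C0027051|ENG|P|L0027051|PF|S0064638|Myocardial Infarction|0|'  # MRCON
count_fields 'C0027051|L0027051|S0064638|MEDLINEPLUS|ET|T5|0|'                # MRSO
count_fields 'C0027051|T047|Disease or Syndrome|'                             # MRSTY

# An st.raw row for one semantic type.
st_raw_row 'Disease or Syndrome'
```

Running count_fields over every line of a generated file is a cheap way to catch a stray or missing separator before the Data File Builder scripts see it.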


Setting up the workspace

1. Install the Lexical Variant Generator (LVG) before running MetaMap's install program. LVG is part of the Lexical Tools distribution and is available from the Lexical Systems Group (http://lexsrv3.nlm.nih.gov/SPECIALIST/Projects/lvg/current/index.html).

2. Before using MetaMap's install program to install the Data File Builder, you also need to add LVG's bin directory, <LVG_DIR>/bin, to your program path:

export PATH=$PATH:<LVG_DIR>/bin

3. Install MetaMap following the instructions on the MetaMap site.

4. Unzip the Data File Builder (DFB) installation files in the same directory as MetaMap.

5. Change to the new directory created by extracting the distribution and invoke the install program:

cd <distribution directory>

./bin/install.sh

A sample run of the installation script follows:

Enter basedir of installation [/nfsvol/nlsaux15/public_mm]

Basedir is set to /nfsvol/nlsaux15/public_mm.

The WSD Server requires Sun's Java Runtime Environment (JRE) Sun's Java Developer Kit (JDK) will work as well. if the command: "which" java returns /usr/local/jre1.4.2/bin/java, then the JRE resides in /usr/local/jre1.4.2/.

Where does your distribution of Sun's JRE reside?
Enter home path of JRE (JDK) [/usr]: /nfsvol/nls/tools/Linux-i686/java1.4.2
Using /nfsvol/nls/tools/Linux-i686/java1.4.2 for JAVA_HOME.

/nfsvol/nlsaux15/public_mm/WSD_Server/config/disambServer.cfg generated
/nfsvol/nlsaux15/public_mm/WSD_Server/config/log4j.properties generated
/nfsvol/nlsaux15/public_mm/bin/SKRrun generated.
/nfsvol/nlsaux15/public_mm/bin/metamap07 generated.
/nfsvol/nlsaux15/public_mm/bin/wsdserverctl generated.
/nfsvol/nlsaux15/public_mm/bin/skrmedpostctl generated.
Install complete.
Would like to use a custom data set with MetaMap (use data file builder)? [yN]:

running Data File Builder Install...
Is LVG installed? [yN] <The user types y and return>

running Data File Builder Install...
Enter home path of LVG [/nfsvol/nls/tools/Linux-i686/lvg2009]:
Using /nfsvol/nls/tools/Linux-i686/lvg2009 for LVG_DIR.

/nfsvol/nlsaux15/public_mm/scripts/dfbuilder/mm_variants/0doit.lvglab generated.
/nfsvol/nlsaux15/public_mm/scripts/dfbuilder/mm_variants/0doit.xwords generated.
Datafile Builder Setup is complete.


6. Make sure that the SKR/MedPOST tagger is running. To start the tagger, move to the public_mm directory inside the working directory and invoke:

./bin/skrmedpostctl start

In case there is an error due to a port number, change the port number in the following file and try running the tagger again:

public_mm/bin/skrmedpostctl

If you change the port number that skrmedpostctl uses, you also need to change public_mm/bin/SKRrun (or change SKRrun.in and re-run install.sh) to match the port you set in skrmedpostctl. The environment variables that should be changed in SKRrun are:

TAGGER_SERVER_DEFAULT_TCP_PORT

TAGGER_SERVER_TCP_PORT_0

TAGGER_SERVER_TCP_PORT_1


7. Inside the directory public_mm/sourceData, create a directory for the workspace:

mkdir sourceData/09_custom

Finally, create a directory to store the knowledge sources:

mkdir sourceData/09_custom/umls
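
Some of the workspace steps above can be scripted so that they are safe to repeat. The following is a minimal sketch under stated assumptions: the function names add_to_path, make_workspace and show_tagger_ports are hypothetical, and the demo runs against a throwaway directory standing in for public_mm and <LVG_DIR> rather than a real installation.

```shell
#!/bin/sh
# Hypothetical helpers for the workspace setup; not part of MetaMap or LVG.

# Append a directory to PATH only if it is not already listed (step 2).
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
    export PATH
}

# Create the workspace and knowledge-source directories (step 7).
# $1 = path to public_mm, $2 = workspace name (09_custom in the text).
make_workspace() {
    mkdir -p "$1/sourceData/$2/umls"
}

# Print the port settings named in step 6, with line numbers, so they can
# be compared with the port set in skrmedpostctl. $1 = path to SKRrun.
show_tagger_ports() {
    grep -n -E 'TAGGER_SERVER_(DEFAULT_TCP_PORT|TCP_PORT_[01])' "$1"
}

# Demo against a temporary directory, not a real installation.
demo_root=$(mktemp -d)
add_to_path "$demo_root/lvg/bin"    # stand-in for <LVG_DIR>/bin
add_to_path "$demo_root/lvg/bin"    # second call leaves PATH unchanged
make_workspace "$demo_root/public_mm" 09_custom
ls -d "$demo_root/public_mm/sourceData/09_custom/umls"
```

In a real installation the calls would take the actual LVG and public_mm paths, and show_tagger_ports would be pointed at public_mm/bin/SKRrun.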

Installation of MetaMap

1. First, run the BuildDataFiles program as follows:

$fullpath/public_mm/bin/BuildDataFiles

2. The file st.raw, located in $fullpath/public_mm/data/dfbuilder/2009, needs to be modified: append the st.raw file generated in point 4 of the Preparation of data section to it.

3. Move to $fullpath/public_mm/sourceData/09_custom/01metawordindex and execute the following scripts in order:

./01CreateWorkFiles

./02Suppress

./03FilterPrep

./04FilterStrict

./05GenerateMWIFiles

4. Move to the 02treecodes directory and run 01GenerateTreecodes:

cd ../02treecodes

./01GenerateTreecodes

5. Move to the 03Variants directory and run 01GenerateVariants:

cd ../03Variants

./01GenerateVariants

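6. Move to the 04synonyms directory and run 01GenerateSynonyms:

cd ../04synonyms

./01GenerateSynonyms

7. Move to the 05abbrAcronyms directory and run 01GenerateAbbrAcronyms:

cd ../05abbrAcronyms

./01GenerateAbbrAcronyms

8. Move to $fullpath/public_mm and run: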

./bin/LoadDataFiles
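
Steps 3 through 7 above run nine scripts in a fixed order, so a small driver that stops at the first failure can help. This is a sketch, not part of the MetaMap distribution: the function name run_dfb_steps is hypothetical, and it assumes each script signals failure through a non-zero exit status. BuildDataFiles (step 1), the st.raw edit (step 2) and LoadDataFiles (step 8) are left as manual steps.

```shell
#!/bin/sh
# Hypothetical driver for the Data File Builder scripts of steps 3 to 7.
# $1 = workspace directory, e.g. $fullpath/public_mm/sourceData/09_custom
run_dfb_steps() {
    workspace=$1
    # The step list is read on file descriptor 3 so that the scripts
    # themselves keep the terminal as their standard input.
    while read -r dir script <&3; do
        echo "== $dir/$script"
        ( cd "$workspace/$dir" && "./$script" ) || {
            echo "FAILED: $dir/$script" >&2
            return 1
        }
    done 3<<'EOF'
01metawordindex 01CreateWorkFiles
01metawordindex 02Suppress
01metawordindex 03FilterPrep
01metawordindex 04FilterStrict
01metawordindex 05GenerateMWIFiles
02treecodes 01GenerateTreecodes
03Variants 01GenerateVariants
04synonyms 01GenerateSynonyms
05abbrAcronyms 01GenerateAbbrAcronyms
EOF
}

# Usage (not executed here):
#   run_dfb_steps "$fullpath/public_mm/sourceData/09_custom"
```

Each script runs in a subshell, so the caller's working directory is unchanged whether the run succeeds or stops early.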


Running MetaMap

To run the newly installed MetaMap with the custom data set built by the Data File Builder, move to the main workspace folder (public_mm) and run the command below:

bin/SKRrun -L 2009 -M /DATA/XDR -B /BDB4 -w ./lexicon ./bin/metamap09.BINARY -Z 09_custom ./resources/input -I