Annotation Summarizer
| RANSUM is no longer maintained. The production implementation is called Statistical Tracking of Ontological Phrases (STOP), available at http://www.mooneygroup.org/stop |
|---|
This utility (RANSUM) provides a mechanism for interpreting ontological annotation data. There are two ways to use the software: a web interface and a REST interface. Both expose the same two functions:
- Processing annotations - in the form of (elementID, conceptID) pairs
- Processing raw text - in the form of (elementID, "associated text") pairs
The web interface is available at http://ransum.stanford.edu
Note: processing files with more than several hundred concepts will take some time (for files of more than a thousand lines, expect several hundred seconds). Processing elementID - text pairs is slower still, because of the additional overhead of annotating those strings.
Web (Browser) Interface
Process Annotations: Generate Summary for elementID - conceptID pairs
Parameters:
- vid: the virtual ontology ID. A dropdown list of possible choices is available.
- output: output format. Options are 'TagCloud' and 'XML'. XML output optionally includes background information from MEDLINE and/or Google.
- file: '\n'-separated file containing '\t'-delimited (elementID, conceptID) pairs.
- useMEDLINE: True or False. If True, the output for each concept will include information on concept frequency in MEDLINE (i.e., the number of times text was annotated with that concept in the MEDLINE corpus).
- useGoogle: True or False. If True, the output for each concept will include the number of results for a naive search for the given concept's name, as well as a very rough estimate of the total size of Google's English-language index.
An example file is available that associates genes with concepts in the Mammalian Phenotype Ontology.
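The input format described above (one pair per line, fields separated by a tab) can be produced with a few lines of Python. This is only a minimal sketch: the function name `format_pairs` and the identifiers `gene1`, `concept1`, and so on are hypothetical placeholders, not values taken from a real ontology, and the rejection of embedded tabs and newlines is an assumption made to keep the file unambiguous.

```python
def format_pairs(pairs):
    """Return newline-separated, tab-delimited text for (elementID, conceptID) pairs."""
    lines = []
    for element_id, concept_id in pairs:
        # A tab or newline inside a field would corrupt the two-column layout,
        # so this sketch refuses such input (an assumption, not documented behavior).
        for field in (element_id, concept_id):
            if "\t" in field or "\n" in field:
                raise ValueError(f"field contains a tab or newline: {field!r}")
        lines.append(f"{element_id}\t{concept_id}")
    return "\n".join(lines)


# Hypothetical identifiers, for illustration only.
example = format_pairs([
    ("gene1", "concept1"),
    ("gene1", "concept2"),
    ("gene2", "concept1"),
])
```

The resulting string can be written to a file and uploaded through the `file` parameter.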
Process Text: Generate Summary for elementID - text pairs
Parameters:
- vid: the virtual ontology ID to use in annotation.
- output: output format. Options are 'TagCloud' and 'XML'.
- types: ','-delimited list of semantic types to use in annotation. If not specified, all will be used. For a list of available types, see http://ransum.stanford.edu/types/ or the NIH's Semantic Network documentation.
- file: '\n'-separated file containing '\t'-delimited (elementID, text) pairs.
- useMEDLINE: True or False. If True, the output for each concept will include information on concept frequency in MEDLINE (i.e., the number of times text was annotated with that concept in the MEDLINE corpus).
- useGoogle: True or False. If True, the output for each concept will include the number of results for a naive search for the given concept's name, as well as a very rough estimate of the total size of Google's English-language index.
An example file that can be annotated is also available (SNOMEDCT or MeSH is the suggested ontology).
REST Interface
Note: all REST URLs are relative to http://ransum.stanford.edu/rest/
Process Annotations: Generate Summary for elementID - conceptID pairs
POST to /process/pairs/. Parameters:
- vid: the virtual ontology ID.
- output: output format. Options are 'TagCloud' and 'XML'.
- lines: '\n'-separated list of '\t'-delimited (elementID, conceptID) pairs.
- useMEDLINE: True or False. If True, the output for each concept will include information on concept frequency in MEDLINE (i.e., the number of times text was annotated with that concept in the MEDLINE corpus).
- useGoogle: True or False. If True, the output for each concept will include the number of results for a naive search for the given concept's name, as well as a very rough estimate of the total size of Google's English-language index.
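As a rough illustration of the call described above, the following Python sketch builds (but does not send) a POST request for /process/pairs/ using only the standard library. Several things here are assumptions rather than documented behavior: that the parameters are sent as a form-encoded body, that the booleans are passed as the strings "True" and "False", and the helper name `build_pairs_request`. The `vid` value and the identifiers are hypothetical placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL taken from the note at the top of the REST section.
PAIRS_URL = "http://ransum.stanford.edu/rest/process/pairs/"


def build_pairs_request(vid, pairs, output="XML",
                        use_medline=False, use_google=False):
    """Build a POST request carrying (elementID, conceptID) pairs."""
    params = {
        "vid": vid,
        "output": output,  # 'TagCloud' or 'XML'
        # One pair per line, fields separated by a tab.
        "lines": "\n".join(f"{e}\t{c}" for e, c in pairs),
        # Assumption: booleans are sent as the literal strings "True"/"False".
        "useMEDLINE": str(use_medline),
        "useGoogle": str(use_google),
    }
    # Assumption: the service accepts a standard form-encoded body.
    body = urlencode(params).encode("utf-8")
    return Request(PAIRS_URL, data=body, method="POST")


# "VID" is a placeholder, not a real virtual ontology ID.
req = build_pairs_request("VID", [("gene1", "concept1"), ("gene2", "concept1")],
                          use_medline=True)
```

Passing the request to `urllib.request.urlopen` would send it, provided the service is reachable.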
Process Text: Generate Summary for elementID - text pairs
POST to /process/text/. Parameters:
- vid: the virtual ontology ID to use in annotation.
- types: ','-delimited list of semantic types to use in annotation. If not specified, all will be used. For a list of available types, see http://ransum.stanford.edu/types/ or the NIH's Semantic Network documentation.
- output: output format. Options are 'TagCloud' and 'XML'.
- lines: '\n'-separated list of '\t'-delimited (elementID, text) pairs.
- useMEDLINE: True or False. If True, the output for each concept will include information on concept frequency in MEDLINE (i.e., the number of times text was annotated with that concept in the MEDLINE corpus).
- useGoogle: True or False. If True, the output for each concept will include the number of results for a naive search for the given concept's name, as well as a very rough estimate of the total size of Google's English-language index.
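Free text often contains tabs and line breaks of its own, which would collide with the '\t' and '\n' delimiters of the `lines` parameter. The Python sketch below shows one way to prepare a request for /process/text/: it collapses all whitespace inside each text to single spaces and omits `types` when no list is given, so that all types are used. The helper name `build_text_request`, the form-encoded body, the whitespace handling, and the type names `typeA` and `typeB` are all assumptions or placeholders, not part of the documented interface.

```python
from urllib.parse import urlencode
from urllib.request import Request

TEXT_URL = "http://ransum.stanford.edu/rest/process/text/"


def build_text_request(vid, texts, types=None, output="TagCloud"):
    """Build a POST request carrying (elementID, text) pairs."""
    rows = []
    for element_id, text in texts:
        # Collapse tabs, newlines, and repeated spaces so the text
        # cannot be mistaken for a field or line delimiter.
        flat = " ".join(text.split())
        rows.append(f"{element_id}\t{flat}")
    params = {"vid": vid, "output": output, "lines": "\n".join(rows)}
    if types:
        # ','-delimited list; left out entirely when no types are given.
        params["types"] = ",".join(types)
    # Assumption: the service accepts a standard form-encoded body.
    return Request(TEXT_URL, data=urlencode(params).encode("utf-8"),
                   method="POST")


# Placeholder vid and semantic type names, for illustration only.
text_req = build_text_request(
    "VID",
    [("doc1", "fever\tand\ncough"), ("doc2", "chest   pain")],
    types=["typeA", "typeB"],
)
```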