PATO:Pre vs Post Coordinating
From NCBO Wiki
Background
See also
http://wiki.geneontology.org/index.php/Category:Cross_Products
There is also a draft of a paper in progress - email cjm to view
Reconciling pre and post coordinated phenotype descriptions
PRELIMINARY DOCUMENT! IN PROGRESS!
There are two paradigms for representing phenotypes and phenotype-like entities: (1) use ontologies of pre-coordinated phenotypes, such as MP or plant_trait (PT); or (2) post-coordinate the phenotype description using terms from an ontology of qualities (e.g. PATO) and terms from ontologies of quality-bearers (e.g. GO, AOs, CL, ...). The latter is the EQ-annotation methodology.
Both approaches have their advantages, and they are in fact entirely compatible, provided you follow the SOP laid out here. It is even possible to mix and match these approaches.
This methodology is applicable not just to "phenotypes" or "traits", however we choose to define these terms, but to any kind of Dependent Entity, such as diseases, syndromes, disorders and even roles.
The methodology hinges on the provision of Aristotelian Definitions of pre-coordinated phenotype terms in a computable format.
Below we provide some examples, taken from MP and PT.
Defining Specific Phenotypes
Here are some examples of (somewhat trivial) Aristotelian (genus-differentia) definitions of mammalian phenotypes:
If we take the MP term "big ears" (MP:0000017), this can be defined as the genus "large"/"largeness" (PATO:0000586) inhering in an "ear" (MA:0000236).
In OBO-1.2 syntax this can be represented as:
[Term]
id: MP:0000017
name: big ears
namespace: MPheno.ontology
def: "outer ears of a greater than normal size" []
is_a: MP:0002177 ! abnormal outer ear morphology
intersection_of: PATO:0000586 ! large size
intersection_of: inheres_in MA:0000236 ! ear
Of course, an ontology editor will not manipulate the syntax directly. In OBO-Edit, the definition is visible in the "cross products" box.
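For readers who want to inspect such stanzas outside an ontology editor, the genus and differentia can be read off the intersection_of tags with a few lines of Python. This is only a minimal sketch: the function name parse_logical_definition is hypothetical, it assumes one tag per line, and it handles only the tags used in the example above rather than the full OBO-1.2 grammar.

```python
def parse_logical_definition(stanza):
    """Return (term_id, genus, differentia) for one OBO [Term] stanza.

    Assumes one tag per line; differentia is a list of (relation, filler).
    """
    term_id, genus, differentia = None, None, []
    for line in stanza.splitlines():
        line = line.strip()
        if line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("intersection_of:"):
            # Drop the trailing "! comment", then split into tokens.
            value = line[len("intersection_of:"):].split("!", 1)[0].split()
            if len(value) == 1:      # bare class ID: the genus
                genus = value[0]
            elif len(value) == 2:    # relation + class ID: a differentia
                differentia.append((value[0], value[1]))
    return term_id, genus, differentia


# Abbreviated copy of the "big ears" stanza from this page.
BIG_EARS = """\
[Term]
id: MP:0000017
name: big ears
intersection_of: PATO:0000586 ! large size
intersection_of: inheres_in MA:0000236 ! ear
"""

print(parse_logical_definition(BIG_EARS))
```

A tag with a single class ID is treated as the genus and a tag with a relation plus a class ID as a differentia, which mirrors how the two intersection_of lines are read in the text.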
Note that our definition here is different from the text definition, which has the genus as "ear" and the differentia as "having a greater than normal size". We could write this in OBO syntax as:
[Term]
id: MP:0000017
name: big ears
namespace: MPheno.ontology
def: "outer ears of a greater than normal size" []
is_a: MP:0002177 ! abnormal outer ear morphology
intersection_of: MA:0000236 ! ear
intersection_of: has_quality PATO:0000586 ! large size
The first form is preferred, despite the fact that the English phrasing is more contorted. The reason is that a phenotype ontology should define phenotypes, not the entities in which phenotypes occur. The genus term is the primary is_a parent, and it makes sense for the phenotype ontology to be arranged on the quality axis rather than the anatomical axis.
If we go the second route, consider the problems that will be encountered defining terms like "sensitivity to chlorine".
Another way to record the same information is pheno-syntax, a convenient shorthand for combinatorial descriptions of phenotypes:
E= MA:0000236 Q= PATO:0000586
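The correspondence between pheno-syntax and the preferred OBO form can be sketched in Python as follows. The function name pheno_to_intersections is hypothetical, and the sketch assumes only the E= and Q= slots shown above; any other slots that pheno-syntax may support are not handled.

```python
import re


def pheno_to_intersections(pheno):
    """Convert 'E= <id> Q= <id>' into OBO intersection_of lines."""
    # Collect each "TAG= value" pair into a dict, e.g. {"E": ..., "Q": ...}.
    slots = dict(re.findall(r"(\w+)=\s*(\S+)", pheno))
    return [
        "intersection_of: " + slots["Q"],             # genus: the quality
        "intersection_of: inheres_in " + slots["E"],  # differentia: the bearer
    ]


for line in pheno_to_intersections("E= MA:0000236 Q= PATO:0000586"):
    print(line)
```

The quality becomes the genus and the entity becomes the inheres_in differentia, following the preference for the quality axis stated above.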
Results: plant_trait
So far we have analyzed the plant_trait ontology, referred to here as the PT ontology.
Our focus is on phenotypes that are applicable to human health and disease; however, the PT ontology is easier to use to illustrate patterns that can also be applied to MP (Mammalian Phenotype).
Obol was used to generate prospective definitions for all PT terms - these were examined and fixed manually by a non-domain expert (CJM).
Downloading
Logical definitions file available at:
OBO-Edit
If you are using OBO-Edit, then you can paste the following URL directly into the OBO-Edit load box:
(you may want to use the 'advanced' option and allow dangling references)
The above .obo file contains just import statements, which pull in the relevant obo files as well as the main xp file. This means you will need an internet connection, even if you download the file for later use.
OWL Editors
Not yet fully tested. You may need to allocate additional heap, as this imports a lot of ontologies.
Summary
The results are summarized as follows:
Standard EQ Terms
The standard PT term is a pre-coordinated EQ term; this can easily be defined as:
A <Q> *which* inheres_in an <E>
i.e. a quality carried by a bearer entity.
This can easily be represented as a logical definition; below is an example in OBO format:
[Term]
id: TO:0000227
name: root length
namespace: plant_trait_ontology
def: "Average maximum length of the root of a plant in a study." []
synonym: "MRD" []
synonym: "MRL" []
synonym: "RTLG" []
synonym: "maximum root depth" []
synonym: "maximum root length" []
is_a: TO:0000043 ! root anatomy and morphology trait
intersection_of: PATO:0000122 ! length
intersection_of: inheres_in PO:0009005 ! root
Providing these definitions in OBO format has the following advantages:
- PATO definitions of qualities can be shared and reused
- The OBO-Edit reasoner can keep PT in sync with PO and PATO, and perform automatic DAG placement of terms
- PATO terms can be used to query PT-annotated phenotypes
- PO terms can be used to query PT-annotated phenotypes (eg "halogen sensitivity" can be used as a query, even though this term is not in PT)
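The querying advantage listed above can be illustrated with a small Python sketch. The logical definitions are the ones given on this page; the annotation records (accession-A, accession-B) and the function name annotations_matching are hypothetical. The sketch matches IDs exactly, so a real query tool would additionally need to follow is_a links in PATO and PO.

```python
# Logical definitions from this page: term -> (genus, [(relation, filler), ...])
DEFINITIONS = {
    "TO:0000227": ("PATO:0000122", [("inheres_in", "PO:0009005")]),  # root length
    "TO:0000029": ("PATO:0000085", [("towards", "CHEBI:23116")]),    # chlorine sensitivity
}

# Hypothetical PT annotations: (annotated subject, PT term).
ANNOTATIONS = [
    ("accession-A", "TO:0000227"),
    ("accession-B", "TO:0000029"),
]


def annotations_matching(query_id, definitions, annotations):
    """Return annotations whose PT term uses query_id in its logical definition."""
    hits = []
    for subject, term in annotations:
        genus, differentia = definitions.get(term, (None, []))
        fillers = {filler for _relation, filler in differentia}
        if query_id == genus or query_id in fillers:
            hits.append((subject, term))
    return hits


print(annotations_matching("PO:0009005", DEFINITIONS, ANNOTATIONS))    # query by PO term
print(annotations_matching("PATO:0000085", DEFINITIONS, ANNOTATIONS))  # query by PATO term
```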
Sensitivity Terms
This is the other major class of terms. There are lots of sensitivity terms. These are represented using the definition pattern:
sensitivity which is *towards* <chemical>
For example (again, in obo format):
[Term]
id: TO:0000029
name: chlorine sensitivity
namespace: plant_trait_ontology
def: "Sensitivity to the chlorine content in the growth medium. Chloride helps regulate the correct pH (acid/alkaline) balance. This is a major electrolyte in the living cell besides sodium and potassium. It is available to the cell mainly in the form of NaCl and KCl salts." []
synonym: "CHLORSN" []
is_a: TO:0000080 ! micronutrient sensitivity
intersection_of: PATO:0000085 ! sensitivity
intersection_of: towards CHEBI:23116 ! chlorine
Providing these definitions in OBO format has the following advantages:
- CHEBI definitions of chemical entities can be shared and reused
- The OBO-Edit reasoner can keep PT in sync with CHEBI, and perform automatic DAG placement of terms when/if CHEBI changes
- CHEBI terms can be used to query PT-annotated phenotypes
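The automatic DAG placement mentioned above rests on a subsumption test between genus-differentia definitions. The Python sketch below shows the idea on a toy example; it is not the OBO-Edit reasoner. The ID CHEBI:halogen is a placeholder, the is_a link from chlorine to halogen is an assumption made for illustration, and the function names are hypothetical.

```python
def ancestors(term, is_a):
    """Return term plus everything reachable by following is_a links upward."""
    seen, stack = {term}, [term]
    while stack:
        for parent in is_a.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific, is_a):
    """True if every condition of `general` is implied by `specific`."""
    g_genus, g_diffs = general
    s_genus, s_diffs = specific
    if g_genus not in ancestors(s_genus, is_a):
        return False
    # Each differentia of the general term must be matched by a differentia
    # of the specific term with the same relation and an equal or narrower filler.
    return all(
        any(rel == s_rel and filler in ancestors(s_filler, is_a)
            for s_rel, s_filler in s_diffs)
        for rel, filler in g_diffs
    )


# Toy CHEBI fragment (assumed): chlorine is_a halogen.
IS_A = {"CHEBI:23116": ["CHEBI:halogen"]}

CHLORINE_SENSITIVITY = ("PATO:0000085", [("towards", "CHEBI:23116")])
HALOGEN_SENSITIVITY = ("PATO:0000085", [("towards", "CHEBI:halogen")])

print(subsumes(HALOGEN_SENSITIVITY, CHLORINE_SENSITIVITY, IS_A))
print(subsumes(CHLORINE_SENSITIVITY, HALOGEN_SENSITIVITY, IS_A))
```

Under these assumptions a query for "sensitivity towards halogen" would retrieve chlorine sensitivity annotations even though no halogen sensitivity term exists in PT.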
Nutrients
See this tracker item:
Environment ontology
There is also a plant-centric environment ontology (EO). This could be generalised for other uses.
Some of the sensitivity terms referred to environments or parts of the environment that are not at the chemical/molecular level. EO was used to define some of these, for example:
[Term]
id: TO:0000188
name: drought sensitivity
namespace: plant_trait_ontology
def: "Drought sensitivity is highly interactive with crop phenology, plant growth prior to stress, and timing, duration, and intensity of drought stress. For many soils, it takes at least 2 rainless weeks to cause marked differences in drought sensitivity during the vegetative stage and at least 7 rainless days during the reproductive stage to cause severe drought injury. Leaf rolling precedes leaf drying during drought. Repeated ratings are recommended through progress of the drought." []
synonym: "DRS" []
synonym: "DRSN" []
synonym: "drought susceptibility" []
is_a: TO:0000394 ! drought related trait
intersection_of: PATO:0000085 ! sensitivity
intersection_of: towards EO:0007404 ! drought environment
These will need closer examination. Some of the EO terms could themselves be decomposed, e.g. soil alkalinity.
Ratios, proportions and compositions
Many PT terms include ratios:
[Term]
id: TO:0000278
name: root to shoot ratio
namespace: plant_trait_ontology
synonym: "RTSHRO" []
synonym: "r/s" []
synonym: "root/shoot ratio" []
is_a: TO:0000043 ! root anatomy and morphology trait
Course of action: we will add a term "proportionality" to PATO (see http://sourceforge.net/tracker/index.php?func=detail&aid=1606404&group_id=76834&atid=595654). The logical definition would then be:
intersection_of: PATO:<<new-term-proportionality>>
intersection_of: towards PO:0009005 ! root
intersection_of: relative_to PO:0009006 ! shoot
Some of the PT ratios can be expressed using the relational quality PATO:composition, which can take an additional 2 arguments:
[Term]
id: TO:0000372
name: amylose to amylopectin ratio
namespace: plant_trait_ontology
def: "Ratio of amount of amylose to amylopectin content." []
synonym: "AMYAMYPCTRO" []
is_a: TO:0000097 ! amylopectin content
is_a: TO:0000196 ! amylose content
intersection_of: PATO:0000025 ! composition
intersection_of: towards CHEBI:28102 ! amylose
intersection_of: relative_to CHEBI:28057 ! amylopectin
Note the is_a links to amylopectin content and amylose content. Is this correct? In fact there is a reversal of magnitude here: as the magnitude of the amylopectin content parent increases, the magnitude of the child (the ratio) decreases.
Synonyms
PT uses lots of synonyms, but does not assign a scope (exact/broad/narrow). This makes it harder to glean the exact meaning of a term from the synonym.
Relative
PT uses terms like relative growth rate. It is not clear how this is different from growth rate. This seems to be a difference in how the term is used.
Disease resistance phenotypes
PT has many disease resistance terms; eg:
[Term]
id: TO:0000323
name: stem rot disease resistance
namespace: plant_trait_ontology
def: "Causal agent: Magnaporthe salvinii (Nakataea sigmoidea, Sclerotium oryzae), and Helminthosporium sigmoideum var. irregulare. Symptoms: dark lesions develop on the stems near the water line. Small, dark bodies (sclerotia) develop, weaken the stem and cause lodging." []
synonym: "CULMROTRS" []
synonym: "SR" []
is_a: TO:0000439 ! fungal disease resistance
These would be easy to define if we had an orthogonal ontology of plant diseases and infectious agents. The definition pattern would be:
A <TO:resistance> which is *towards* an <InfectiousAgent>
This will be easier to do for mammalian phenotypes since these orthogonal ontologies are being developed. It is recommended the POC develops such a separate ontology for plants.
Assay-specific terms
Example: root dry weight
It is unclear how to proceed with these, so no action was taken.
Conjunctive terms
Example: lemma and palea related traits
Example: lemma and palea pubescence
I think these should be defined as:
a pubescence which inheres_in the lemma and inheres_in the palea
i.e. the necessary and sufficient conditions are that both the lemma and palea are pubescent.
Open question: can the same quality instance inhere in two entities?
Inconsistency of Term name syntax
Compare: "shrunken endosperm" with "leaf length"
In the first case the word naming the quality precedes the noun naming the bearer entity; in the second it follows it.
It is recommended that all preferred terms follow the same lexical pattern (unless community usage or plain English trumps this).
Missing anatomy terms
Example: canopy temperature
"canopy" should be the name of a type in the plant_anatomy ontology
Use of "other"
Example: other miscellaneous trait
Example: other nutrient sensitivity
OBO Foundry principles dictate avoidance of "other".
Use of word "quality"
PATO and PT do not use the word "quality" univocally. PATO uses the term to mean "property", whereas PT uses it to mean a more nebulous, subjective(?) quality. Perhaps within the plant trait community the meaning of this term is more fixed; if so, it should be defined somewhere. Does it simply mean healthiness? If so, it would be good to use the same PATO term, at least as a synonym.
[Term]
id: TO:0000587
name: endosperm quality
alt_id: TO:0000150
related_synonym: "End" []
related_synonym: "endosperm quality (sensu Poaceae)" []
is_a: TO:0000162 ! seed quality
Redundancy
There is some redundancy with GO Molecular Function activity terms.
For example:
[Term]
id: TO:0000284
name: ADP glucose pyrophosphorylase activity
namespace: plant_trait_ontology
def: "Catalysis of the reaction: ATP + alpha-D-glucose 1-phosphate = diphosphate + ADP-glucose." []
A phenotype ontology should be orthogonal to an ontology of molecular function. It is not clear how the above term would be used in phenotype annotation.
There is a case for including malfunctions, loss of function and gain of function in a phenotype ontology, but this should be made explicit.
Complex Phenotypes
It's too hard to even attempt to automatically generate definitions for these; logical definitions should be provided manually. For example:
[Term]
id: TO:0000169
name: chinsurah boro CMS
namespace: plant_trait_ontology
def: "Abortion of microspore development at trinucleate stage" []
This could be defined in terms of "microspore development" (GO:0009555) and "trinucleate stage" (a term required in the plant developmental stage ontology).
Other difficult cases
Example: photoperiod sensitivity
Errors and inconsistencies detected
The following were detected as a direct result of this approach:
In PO, panicle is a synonym for inflorescence; in TO, panicle color is_a inflorescence color:
[Term]
id: TO:0000201
name: panicle color
namespace: plant_trait_ontology
def: "Variation in color of the panicle inflorescence." []
synonym: "PNCL" []
is_a: TO:0000581 ! inflorescence color
The same applies to panicle/inflorescence length, weight and shape.
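This kind of inconsistency can be found mechanically once terms carry logical definitions: two distinct terms that resolve to the same quality and bearer are candidates for merging or review. The Python sketch below illustrates this. The (quality, bearer) pairs are assumptions read off the term names rather than curated definitions, bearers are given by name instead of by PO ID for readability, and the function name is hypothetical.

```python
from collections import defaultdict


def terms_with_same_definition(definitions, synonyms):
    """Group terms whose (quality, bearer) pair is identical after synonym resolution."""
    groups = defaultdict(list)
    for term, (quality, bearer) in definitions.items():
        canonical = synonyms.get(bearer, bearer)  # map a synonym to its primary name
        groups[(quality, canonical)].append(term)
    return [sorted(terms) for terms in groups.values() if len(terms) > 1]


# From the text: in PO, panicle is a synonym for inflorescence.
PO_SYNONYMS = {"panicle": "inflorescence"}

# Assumed (quality, bearer) readings of three TO terms from this page.
DEFINITIONS = {
    "TO:0000201": ("color", "panicle"),        # panicle color
    "TO:0000581": ("color", "inflorescence"),  # inflorescence color
    "TO:0000227": ("length", "root"),          # root length
}

print(terms_with_same_definition(DEFINITIONS, PO_SYNONYMS))
```

Here panicle color and inflorescence color fall into the same group, which conflicts with the asserted is_a link between them.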
Results: Mammalian Phenotype (MP)
The MP ontology proved more difficult to analyze automatically - changes to Obol and/or manual editing of logical definitions may be required.
Downloading
Logical definitions file available at:
OBO-Edit
If you are using OBO-Edit, then you can paste the following URL directly into the OBO-Edit load box:
(you may want to use the 'advanced' option and allow dangling references)
The above .obo file contains just import statements, which pull in the relevant obo files as well as the main xp file. This means you will need an internet connection, even if you download the file for later use.
OWL Editors
Not yet fully tested. You may need to allocate additional heap, as this imports a lot of ontologies.
Visualising Results
As well as OBO-Edit and OWL editors, the cross-products can be viewed in the experimental Obol branch of AmiGO. For example, to see what terms are defined using the PATO term hypoplastic, see:
Here we can see the following terms defined:
- nasal bone hypoplasia
- maxilla hypoplasia
- mandible hypoplasia
- liver hypoplasia
- adrenal gland hypoplasia
- bulbourethral gland hypoplasia
- spleen hypoplasia
- forebrain hypoplasia
- telencephalon hypoplasia
- cerebellum hypoplasia
- pulmonary hypoplasia
- skin hypoplasia
Following the link to pulmonary hypoplasia shows you the term in the original MP context (a DAG) and also a "decomposed view" showing separate trees for the genus and differentia:
This page is just meant for illustration - for display to a biologist end-user this would be compacted; the upper levels of PATO would not be shown.
Term Syntax
The syntax of the term names in MP differs from PT; a common lexical pattern in MP is:
- VALUE - ENTITY - ATTRIBUTE
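A regular lexical pattern like this lends itself to simple automated splitting, which is the kind of first pass a tool such as Obol can build on. The following Python sketch is not Obol itself: the list of VALUE words is an assumption, and the function name split_term_name is hypothetical. It is applied to the MP term name "increased brown fat cell amount".

```python
import re

# Assumed list of VALUE words; MP may use others.
VALUES = ("increased", "decreased", "abnormal", "absent")

# VALUE, then ENTITY (one or more words), then a single-word ATTRIBUTE.
PATTERN = re.compile(
    r"^(?P<value>%s)\s+(?P<entity>.+)\s+(?P<attribute>\S+)$" % "|".join(VALUES)
)


def split_term_name(name):
    """Split a term name into value, entity and attribute, or return None."""
    match = PATTERN.match(name)
    return match.groupdict() if match else None


print(split_term_name("increased brown fat cell amount"))
print(split_term_name("big ears"))  # does not follow the pattern
```

The split is purely lexical; mapping the three parts to PATO and anatomy IDs, and checking that the result is biologically sensible, still requires manual review.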
Example: increased brown fat cell amount
One example of an MP term following the above pattern is:
- "<increased> <brown fat cell> <amount>"
Which has the text definition: "increased amount of thermogenic tissue in the body that is composed of cells containing multiple small fat droplets"
As a first attempt at a logical definition:
- genus: PATO:0000420 ("increased number")
- differentia: inheres_in MA:0000057 ("brown fat")
Represented in obo format as:
intersection_of: PATO:0000420 ! increased number
intersection_of: inheres_in MA:0000057 ! brown fat tissue
However, this is not quite right, as it is not an "increased number" of "brown fat" tissue, it is a larger "portion size".
At the time of writing this PATO term is not defined so we're not sure if this is applicable.
(TODO)
There are many advantages to explicitly defining MP terms using PATO and MA in this fashion. Reasoners can be used to keep the ontologies in sync (either the OBO-Edit reasoner or any of a number of OWL reasoners). PATO definitions and MP definitions can be shared and reused.
Complex phenotypes in mammals
Many phenotypes such as "ovary hypoplasia" or "degenerate molars" are a single quality inhering in a single entity (or collection of entities - eg the collection of all molars in an individual organism).
For others such as "holoprosencephaly" or "osteoporosis", it may be an interconnected collection of qualities inhering in the same or different parts. We can still define these as conjunctions of EQs, but this is where we start crossing the grey line between the dependent continuants known as 'phenotypes' into the dependent continuants known as 'disorders'...
It is also important to distinguish the definition, which captures the necessary and sufficient conditions, from the gloss or additional notes; e.g. MP defines holoprosencephaly as:
presence of a single forebrain hemisphere or lobe; often accompanied by a deficit in median facial development
Thus the logical definition should only refer to the part before the semicolon (genus: PATO:having_a_single_part, differentia: towards MA:forebrain_hemisphere). The part after the semicolon ("often accompanied by...") is useful, but it is statistical and is neither necessary nor sufficient, so it should be captured by other means.
Issue: species-specificity
Whilst the MP represents mammalian phenotypes, we do not have a common mammalian anatomical ontology. It will take some time before CARO reaches the level of granularity required.
We recommend that, since the focus of MP is biased towards the mouse and mouse models of human disease [CHECK THIS STMT], we treat it as a mouse phenotype ontology, with a view to migrating the definitions towards some kind of FMA/MA synthesis when such an ontology becomes available.
Absence-oriented terms
The absence case is different from normal EQ phenotypes - the absence does not inhere in the entity that is absent, since this entity does not exist!
We treat these the same as "sensitivity to..." terms. The absence inheres in the organism as a whole (or the anatomical entity which is missing the part)
[Term]
id: MP:0000766
name: absent tongue squamous epithelium
namespace: MPheno.ontology
def: "missing the scaly epithelial layer of the tongue" []
synonym: "absence of tongue squamous epithelium" []
synonym: "loss of tongue squamous epithelium" []
is_a: MP:0000765 ! abnormal tongue squamous epithelium morphology
Could be defined as:
A <lacking_of_a_part> which inheres_in <organism> and towards <tongue squamous epithelium>
The "inheres_in <organism>" part of the definition can be omitted.
In OBO format:
intersection_of: PATO:lacks_part
intersection_of: towards MA:tongue_squamous_epithelium
See PATO:Absent for more discussion on this.
"Other" categories
MP contains some terms such as "other metabolic defect"; the use of "other" is to be avoided.
Root term
The root of the MP is the term "phenotype ontology". There are relations such as "cellular phenotype" is_a "phenotype ontology". The root term should be changed to "phenotype".
Inconsistencies detected
A number of inconsistencies were detected.
Example of inconsistency between MP and OBO-Cell:
Here the oboedit reasoner has found an unstated implied link (blue squiggly arrow) between 'abnormal amacrine cell morphology' and 'abnormal interneuron morphology'. In the split pane display we see the reason for this inference: OBO-Cell states that 'amacrine cell' is_a 'interneuron'.
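The check described here amounts to comparing the links a reasoner can infer from the logical definitions against the links asserted in MP. The Python sketch below is a toy version of that comparison, not the OBO-Edit reasoner. It assumes both MP terms are defined as "morphology" inhering in the named cell type, takes the 'amacrine cell' is_a 'interneuron' link from the text, and uses term names instead of IDs; the function names are hypothetical.

```python
def bearer_ancestors(bearer, is_a):
    """Return all strict ancestors of bearer under is_a."""
    found, frontier = set(), [bearer]
    while frontier:
        for parent in is_a.get(frontier.pop(), ()):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def unstated_links(definitions, is_a, asserted):
    """Return inferred (child, parent) links that are missing from asserted."""
    missing = []
    for child, (c_quality, c_bearer) in definitions.items():
        for parent, (p_quality, p_bearer) in definitions.items():
            implied = (c_quality == p_quality
                       and p_bearer in bearer_ancestors(c_bearer, is_a))
            if implied and (child, parent) not in asserted:
                missing.append((child, parent))
    return missing


# From the text: OBO-Cell states that 'amacrine cell' is_a 'interneuron'.
CELL_IS_A = {"amacrine cell": ["interneuron"]}

# Assumed (quality, bearer) definitions of the two MP terms.
MP_DEFINITIONS = {
    "abnormal amacrine cell morphology": ("morphology", "amacrine cell"),
    "abnormal interneuron morphology": ("morphology", "interneuron"),
}

ASSERTED = set()  # MP does not state the link between the two terms

print(unstated_links(MP_DEFINITIONS, CELL_IS_A, ASSERTED))
```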
The logical definitions will also be used for comparing phenotypes across organisms once OBD is populated.
Results: Worm Phenotype (WP)
Defining xps for the WP is still in the preliminary stages.
Downloading
Logical definitions file available at:
Conclusions
Providing formal computable definitions of pre-coordinated phenotype terms in terms of basic qualities (PATO) and bearer entities (e.g. AOs) will make both styles of phenotype annotation commensurable and promote sharing of core ontologies and reusable ontology "building blocks".
More work is required to curate these logical definitions in phenotype ontologies. This preliminary analysis suggests some of this can be automated.


