PATO:About

From NCBO Wiki

PATO - An ontology of Phenotypic Qualities

PATO is an ontology of *phenotypic qualities*, intended for use in a number of applications, primarily phenotype annotation.

The new PATO differs from the old in that the system of attributes and values has been abandoned in favour of a single hierarchy of qualities. For a comparison and translation guide, see the section on differences compared to pre-2006 PATO below.

PATO is designed to be used in conjunction with ontologies of "quality-bearing entities". An example of such an entity is an insect eye (taken from the fly_anatomy ontology), which could be the bearer of the quality 'red' (PATO:0000322). This combination is the "red eye" phenotype. We say that the phenotype term is 'post-coordinated', as it is formed by coordinating two terms together. This is in contrast to ontologies of pre-coordinated phenotypes, such as the Mammalian Phenotype (MP) ontology. See below for a more formal treatment.

Other ontologies of quality-bearing entities include the OBO Cell ontology (CL), the GO biological process or cellular component ontologies, and ontologies of anatomical entities.

PATO is independent of any exchange format or database schema. One way of expressing phenotype annotation using PATO is pheno-syntax [1]; another is pheno-xml [2]. We will also post recommendations for representing phenotypes using OWL. All representations share the same basic formal underpinning: a combination of a quality-bearing entity and a quality (the EQ model).

Pheno-syntax is the de facto standard for human-readable annotations, for example in the body of email messages on this mailing list. The red eye phenotype would be written as:

E= FBbt:eye Q= PATO:red

(we allow ID-space prefixed names in place of the authentic numeric IDs as a convention to enhance readability, although these must be converted to authentic IDs to be valid pheno-syntax).
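As an illustration only, the simple one-line form shown above can be split into its tagged parts with a few lines of Python. This is a minimal sketch, not a pheno-syntax parser: the function name parse_pheno_line is hypothetical, and the assumption that every value is a single whitespace-free token holds for this example but may not hold for the full syntax described in [1].

```python
import re

# A tag (letters/digits), then "=", optional spaces, then one whitespace-free value.
# Assumption: values never contain spaces; the real pheno-syntax may allow more.
_PAIR = re.compile(r"([A-Za-z][A-Za-z0-9]*)=\s*(\S+)")


def parse_pheno_line(line):
    """Parse a simple 'TAG= value' annotation line into a dict of tag -> value."""
    pairs = _PAIR.findall(line)
    if not pairs:
        raise ValueError(f"no TAG= value pairs found in: {line!r}")
    result = {}
    for tag, value in pairs:
        if tag in result:
            raise ValueError(f"duplicate tag {tag!r} in: {line!r}")
        result[tag] = value
    return result


annotation = parse_pheno_line("E= FBbt:eye Q= PATO:red")
print(annotation)  # {'E': 'FBbt:eye', 'Q': 'PATO:red'}
```

Note that the readable names in this example would still need to be converted to numeric IDs before the line counts as valid pheno-syntax.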

Many phenotypes are more complex than "red eyes", and even the "red eye" annotation has subtleties (what if the two eyes are different colours?). Some of these are discussed in the pheno-syntax document. Other examples will be discussed on this list and on the wiki, in order to reach consensus on best practices for annotating complex phenotypes. Considerable progress has been made already with FlyBase and ZFIN, under the auspices of the NCBO (http://www.bioontology.org), and with the CToL (Cypriniformes Tree of Life) project [3].

PATO originally stood for "Phenotype and Trait Ontology", but the name has never been truly reflective of the contents of the ontology: phenotypic qualities. However, we've decided to retain the name and the PATO identifier space since it has stuck over the years.

Differences compared to pre-2006 PATO

Previously PATO was arranged as two hierarchies, one of attributes and one of values. Annotations followed the so-called EAV (entity-attribute-value) model. The red eye phenotype would be annotated as:

E= FBbt:eye A= PATO:color V= PATO:red

Under the new system this is now collapsed to:

E= FBbt:eye Q= PATO:red

In the ontology, "red" stands in an is_a relationship to "color", so there is no information loss.
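To see why nothing is lost, consider a toy fragment of the hierarchy in which the old attribute can be recovered by walking up the is_a links from the quality. This is a sketch under stated assumptions: only the red is_a color link comes from the text above; the root term name, the use of readable names instead of numeric IDs, and the single-parent simplification (the real ontology is a DAG) are all illustrative.

```python
# Toy is_a fragment: child -> parent.
# Assumptions: one parent per term, and "PATO:quality" as the root name.
IS_A = {
    "PATO:red": "PATO:color",
    "PATO:color": "PATO:quality",
}


def ancestors(term, is_a=IS_A):
    """Return the is_a ancestors of term, nearest first."""
    chain = []
    seen = {term}
    while term in is_a:
        term = is_a[term]
        if term in seen:  # guard against accidental cycles in the toy data
            raise ValueError("cycle in is_a hierarchy")
        seen.add(term)
        chain.append(term)
    return chain


# The former attribute is still reachable from the quality alone.
print("PATO:color" in ancestors("PATO:red"))  # True
```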

The decision to move from a two-hierarchy to a single-hierarchy system (and thus from a tripartite to a bipartite annotation system) was a lengthy process and was not undertaken lightly. We have reached consensus, and this decision is final. We will make the full justification and arguments available separately (they are available now, but you'd have to mine various meeting minutes, PowerPoint presentations and email threads).

Note that this change is largely backward compatible with pre-existing EAV annotations - the A can simply be removed as it is redundant. In the cases where this does not hold, the ID that fills the "V" field will have been obsoleted, and this will require manual transference (as is the case for obsoletion of OBO terms in general).
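The conversion just described can be sketched in Python as follows. The function name eav_to_eq, the dict representation of an annotation, and the set of obsolete IDs are all assumptions made for illustration; a real pipeline would take the obsolete IDs from the ontology file itself.

```python
def eav_to_eq(annotation, obsolete_ids=frozenset()):
    """Convert an EAV annotation dict to EQ form by dropping the A field.

    Raises LookupError when the value term has been obsoleted, since that
    case needs manual re-annotation rather than automatic conversion.
    """
    value = annotation["V"]
    if value in obsolete_ids:
        raise LookupError(f"{value} is obsolete; manual re-annotation needed")
    # The attribute is redundant: it is implied by the quality's is_a parents.
    return {"E": annotation["E"], "Q": value}


old = {"E": "FBbt:eye", "A": "PATO:color", "V": "PATO:red"}
print(eav_to_eq(old))  # {'E': 'FBbt:eye', 'Q': 'PATO:red'}
```

Raising an error for obsoleted values, rather than silently passing them through, keeps the cases that need a curator's attention visible.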

(A brief note to computer scientists in order to pre-empt certain questions: the EQ model can still be regarded as "EAV" or "subject-predicate-object" in the sense normally used by database people, since there is an implicit "inherence" or "has_quality" relation between the E and the Q. Please join the obo-relations list [*] for discussions on this relation.)

Stability

(as of 2006/08/25)

Whilst we lack consistent metrics to gauge the stability of an ontology, we believe PATO is ready for production annotation use (many of you have already been using it for a while). IDs will remain stable, and normal obsoletion rules apply (i.e. terms will be marked obsolete rather than simply disappearing from the record).

Eventually we would like to reach a stage whereby a discussion is initiated prior to the obsoletion of any term, giving those in charge of phenotype annotation a chance to comment and avoiding unnecessary obsoletion churn. This is the policy with GO; however, it took GO a number of years to reach this level. From here on we will do our best to keep the community in the loop by means of the obo-phenotype list, but we are still in a phase where we may want to make obsoletions without first achieving community consensus.

The structure of the ontology is subject to change. This will not affect annotations, but it may affect MOD decisions as to how and whether to present the PATO DAG to their communities in the short term. We are also considering generating ontology views (aka subsets, slims) for this purpose.

There are many terms still lacking from PATO - everyone is encouraged to use the PATO request tracker. Terms with definitions are preferred, especially definitions that conform to the PATO definition guidelines.

Current users and annotations

See PATO:Annotations

Commensurability of pre- and post-coordinated approaches

There are two paradigms for representing phenotypes and phenotype-like entities: (1) use ontologies of pre-coordinated phenotypes, such as MP or plant_trait (PT); or (2) post-coordinate the phenotype description using terms from an ontology of qualities (e.g. PATO) and terms from ontologies of quality-bearers (e.g. GO, AOs, CL, ...). The second is the EQ annotation methodology.

For example, one could annotate using the MP term "big ears" (MP:0000017).

Or one could post-coordinate the same description as

 E=MA:0000236 Q=PATO:0000586

Both approaches have their advantages, and they are in fact entirely compatible, provided you follow the SOP laid out below. It is even possible to mix and match these approaches.

See the document PATO:Pre vs Post Coordinating

Definition Guidelines

These are the best practices for supplying definitions. They do not have to be adhered to when making new term requests; a dictionary definition will often suffice.

Definitions will be specified in accord with OBO Foundry principles. Each term definition should refine its parent (genus) term by providing differentiating characteristics that are both necessary and sufficient to discriminate instances of this term from instances of its sibling terms. These definitions are of the form "An X is a G which D", where G is the genus and D the differentiating characteristics.

Phenotype Data Exchange Formats, Formalisms and database schemas

See

Pheno-syntax

  • Pheno Syntax: a human-readable and computationally parseable way of exchanging phenotype annotations.

Pheno-XML


OWL

Phenotype annotation database schemas

OBD

OBD is being developed by the National Center for Biomedical Ontology. It is a generic annotation database, and the first release will include phenotype data from ZFIN, FlyBase and other sources.

OBD (temporary page)

Chado

Chado has a module for representing phenotypic data (currently under revision).

See the GMOD/Chado page

Related ontologies

PATO is intended to be used with post-coordinated annotations. There are various ontologies containing pre-coordinated phenotype terms. See for example the plant_trait ontology or the Mammalian Phenotype (MP) ontology (MP:Main_Page).

In order to make PATO-style EQ descriptions interoperable with these ontologies, it will be necessary to provide formal computable definitions for pre-coordinated terms such as those in MP. To give a trivial example:

MP:small_ears can be defined as a PATO:small which inheres_in MA:ear

This can be done as a genus-differentia (cross-product) definition in OBO-Edit. Obol could be used to make a first-pass seeding of the definitions.
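The idea behind such a computable definition can be sketched with a small lookup table that expands a pre-coordinated term into its EQ equivalent. This is only an illustration: the table XP_DEFS and the functions to_eq and same_phenotype are hypothetical, the readable names stand in for numeric IDs, and the comparison uses exact equality, whereas real interoperability would also need reasoning over is_a and part_of.

```python
# Hypothetical cross-product definitions:
# pre-coordinated term -> (quality, entity the quality inheres_in).
XP_DEFS = {
    "MP:small_ears": ("PATO:small", "MA:ear"),
}


def to_eq(term, xp_defs=XP_DEFS):
    """Expand a pre-coordinated term into an EQ annotation dict."""
    quality, bearer = xp_defs[term]
    return {"E": bearer, "Q": quality}


def same_phenotype(pre_term, eq_annotation, xp_defs=XP_DEFS):
    """True when the pre-coordinated term expands to exactly this EQ pair."""
    return to_eq(pre_term, xp_defs) == eq_annotation


print(to_eq("MP:small_ears"))  # {'E': 'MA:ear', 'Q': 'PATO:small'}
print(same_phenotype("MP:small_ears", {"E": "MA:ear", "Q": "PATO:small"}))  # True
```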

Phenotypes can also be linked to diseases in DO, although the details have not been worked out yet.


Questions?

Please ask any questions on the obo-phenotype list. We realise this ontology is more abstract than an ontology like GO, because of the "building block" (post-coordination) approach. The switch from EAV (pre-2006) to EQ (2006 onwards) may also cause confusion. We will be setting up an FAQ and annotation best practices on the wiki, as well as definition guidelines for suggesting new terms. In the meantime, please use the https://lists.sourceforge.net/lists/listinfo/obo-phenotype list to ask questions and share experiences.

Ongoing issues

  • PATO:Absent - annotating phenotypes involving some missing part


Links

[1] http://www.fruitfly.org/~cjm/obd/pheno-syntax.html
[2] http://www.fruitfly.org/~cjm/obd/formats.html#xml-format
[3] http://bio.slu.edu/mayden/cypriniformes/home.html
[4] http://www.obofoundry.org

See also PATO:Main_Page

