UBERON:Main Page

From NCBO Wiki

Uberon is a multi-species anatomy ontology created to facilitate comparison of phenotypes across multiple species. Uberon is generated semi-automatically from the union of existing species-centric anatomy ontologies. As such, it undoubtedly contains many errors and biological falsehoods. The guiding principles deliberately err on the side of generating false positives in query results.

Status

Uberon should be considered pre-alpha.

Availability

Uberon is deliberately not yet listed on the OBO Foundry site. However, it is available in SourceForge CVS:

anatmomy/anatomy_xp/uberon.obo

Use with caution!

Maintenance workflow

After automatic generation, Uberon is occasionally edited by hand to remove some of the most egregious errors. However, there are no curation resources for continued development. Uberon will thus probably remain in a largely static, slightly unsatisfactory state.

Future Plans

We hope that there will eventually be resources for a biologically sound multi-species anatomy ontology. When this arrives, Uberon will have served its purpose and can disappear into the night.

In the meantime, Uberon can serve as a strawman ontology and as a useful source of empirical results illustrating the need for a properly curated multi-species anatomy ontology with highly specific classes.

Relationship to other ontologies

MIAA

MIAA is undoubtedly better than Uberon in that it is manually curated. However, Uberon has more specific and granular classes than MIAA (Uberon has 2000 classes, MIAA 400). Uberon also attempts to employ is_a, part_of and developmental relations in the same manner as species-specific ontologies (sometimes wrongly); that is, it attempts (not entirely successfully) to be an ontology. MIAA is more of a terminology.

Uberon subsumes MIAA (MIAA was one of the inputs), and includes xrefs to MIAA IDs.

CARO

CARO has very general upper-level classes.

Uberon should have is_a links to CARO where appropriate?

Species-centric anatomy ontologies

Uberon links to these via xrefs. In addition, there is a separate mapping file that provides is_a links between scAO IDs and Uberon IDs. This is to facilitate subsumption-based reasoning (e.g. queries for uberon:lower_jaw should return ma:lower_jaw).
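A minimal sketch of such a subsumption query, assuming the mapping file has already been parsed into (child, parent) is_a pairs. The IDs zfa:lower_jaw and uberon:jaw, the IS_A list and the descendants helper are hypothetical and only illustrate the idea; they are not taken from the actual mapping file.

```python
from collections import defaultdict

# Hypothetical is_a links as (child, parent) pairs.
IS_A = [
    ("ma:lower_jaw", "uberon:lower_jaw"),
    ("zfa:lower_jaw", "uberon:lower_jaw"),
    ("uberon:lower_jaw", "uberon:jaw"),
]


def descendants(term, is_a_links):
    """Return the term itself plus every class below it via is_a links."""
    children = defaultdict(set)
    for child, parent in is_a_links:
        children[parent].add(child)

    seen, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# A query for the Uberon class also retrieves the species-centric classes.
print(sorted(descendants("uberon:lower_jaw", IS_A)))
```

Annotations attached to any ID in the returned set would then be reported for the Uberon-level query.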

Homology

Uberon was constructed around analogy rather than homology, purely as a matter of expediency. In fact, Uberon may even contain classes that represent groupings that are not even analogous. This is because construction is largely automated, with an unhealthy dependence on text-based methods.

As noted above, Uberon is a temporary ontology intended to serve an immediate need.

Uberon has utility despite not having a formal phylogenetic representation. Here is why:

What Uberon does that is really important is that it allows biologists to search for analogy too. A lot of things that are considered analogous are really homologous, just in ways or at levels of granularity that aren't fully understood. Later on we should separate homology from analogy, but I think there should always be some kind of uber-ontology that allows searching for analogy. So while parts of Uberon may get a formal homology treatment, there are parts that will remain useful purely as analogous groupings. In fact, I see this as a process of gradual reclassification in terms of homology as we learn.

If nothing else, Uberon can help people consider where we want to go from here.

Use Cases

Currently used in OBD. See http://www.berkeleybop.org/obd

See for example: lower jaw annotations

A query for "lower jaw" in Uberon returns mouse genes, zebrafish genes and human genes that are somehow implicated in phenotypes of the lower jaw. This query also uses MP-XP.

Uberon may also be used to make GO xps, again to facilitate analysis in OBD. See biological_process_xp_anatomy.

Methods

We subdivide ontologies into those that are species-centric (scAOs) and those that are generic (gAOs). An example of an scAO is the Foundational Model of Anatomy (FMA), which is human-centric. Some scAOs may be applicable further up the taxonomic hierarchy, above the species level; one example is the adult mouse anatomy (MA) [TODO: check with MGI exactly how far up it is applicable and valid]. The Gene Ontology cellular component ontology is an example of a gAO at the subcellular level, applicable to prokaryotes and eukaryotes. The OBO Cell Ontology (CL) is a gAO that represents different kinds of cells across a variety of phyla. Of course, gAOs may contain classes that are only applicable to certain taxonomic subsets. Some ontologies are 'in between': for example, the TAO is an anatomy ontology applicable to teleosts (bony fish), which is more general than the zebrafish anatomy ontology (ZFA) yet more specific than a true gAO.

We tend to find that scAOs sit at the gross anatomical level (presumably at least in part due to the higher cross-species diversity at this level). Some scAOs also delve into cellular and subcellular territory (e.g. ZFA and FBbt). CARO is a very general gAO that defines a set of high-level classes to be used across AOs. MIAA is a gAO that defines a minimal set of anatomy terms to be used in microarray annotation.

Our approach is homology-neutral. We seek to group classes from multiple anatomical ontologies regardless of whether the relation between the anatomical entities (AEs) is one of homology or analogy (e.g. convergent evolution). This approach is driven by pragmatism rather than biology. We hope that future efforts will extend it with a more phylogenetically valid approach, as outlined in [REF:CARO] and described at the multi-species anatomy ontology meeting [REF].

We first sought to collate all sources of putative homology and analogy between the AEs in different species-centric anatomy ontologies (scAOs). Sometimes these came from the ontologies themselves, in the form of xref tags in the underlying obo file. The xref tag has no fixed semantics, but is by convention used to link scAOs to gAOs (see, for example, OBO Mappings). For the mouse-human mappings we relied on [REF:Bodenreider et al]. In other cases we had to create our own mappings by running the Obol S3 algorithm (simple synonym and stemming), which performs basic text-based matching.
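To give a feel for what basic text-based matching on synonyms and stems looks like, here is a toy sketch. It is not the Obol S3 algorithm: the normalize and match_terms functions, the crude plural stripping, and the a:/b: IDs are all assumptions made for illustration.

```python
import re


def normalize(label):
    """Lowercase, split on non-alphanumerics, and crudely strip a plural 's'."""
    tokens = re.findall(r"[a-z0-9]+", label.lower())
    stemmed = [
        t[:-1] if len(t) > 3 and t.endswith("s") and not t.endswith("ss") else t
        for t in tokens
    ]
    return " ".join(stemmed)


def match_terms(onto_a, onto_b):
    """Pair IDs whose normalized names or synonyms coincide.

    Each ontology is a dict mapping an ID to a list of names and synonyms.
    """
    index = {}
    for term_id, names in onto_b.items():
        for name in names:
            index.setdefault(normalize(name), set()).add(term_id)

    pairs = set()
    for term_id, names in onto_a.items():
        for name in names:
            for other in index.get(normalize(name), ()):
                pairs.add((term_id, other))
    return pairs


# Hypothetical input ontologies.
ONTO_A = {"a:1": ["lower jaw", "mandible"], "a:2": ["eyes"]}
ONTO_B = {"b:1": ["Lower Jaw"], "b:2": ["eye"], "b:3": ["fin"]}

print(sorted(match_terms(ONTO_A, ONTO_B)))
```

Matching of this kind knows nothing about biology, which is one reason the resulting mappings need review.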

We then sought to eliminate as many pairwise mappings as possible that were not isomorphic (1-1). This was done on an ad hoc basis by a non-expert, with occasional consultation with experts. This aspect could use further curation.
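The clean-up itself was manual, but candidate mappings that are not 1-1 can be flagged automatically for review. The following is a sketch under the assumption that mappings between two ontologies are held as (left ID, right ID) pairs; the split_one_to_one helper and the example IDs are hypothetical.

```python
from collections import Counter


def split_one_to_one(pairs):
    """Separate pairs whose two IDs each occur exactly once from the rest."""
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    clean = [(a, b) for a, b in pairs if left[a] == 1 and right[b] == 1]
    ambiguous = [(a, b) for a, b in pairs if left[a] > 1 or right[b] > 1]
    return clean, ambiguous


# Hypothetical pairwise mappings between two ontologies.
PAIRS = [("ma:1", "fma:1"), ("ma:2", "fma:2"), ("ma:2", "fma:3")]

clean, ambiguous = split_one_to_one(PAIRS)
print("1-1:", clean)
print("needs review:", ambiguous)
```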

We collected grouping classes based on the mappings. We used an extremely promiscuous grouping algorithm, finding the maximally self-connected sets. This undoubtedly results in some meaningless classes. Each grouping class was given an ID and placed in an ontology called UBERON. This ontology includes is_a links incoming from the external AOs, and the superset of all synonyms and definitions from the external AOs.
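One way to read "maximally self-connected sets" is as the connected components of the mapping graph. That reading is an assumption, as are the group_classes helper, the example IDs and the DEMO ID prefix in this union-find sketch.

```python
def group_classes(mappings):
    """Group IDs into connected components of the undirected mapping graph."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in mappings:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    components = {}
    for term in list(parent):
        components.setdefault(find(term), set()).add(term)

    # Assign a made-up ID to each grouping class, in a stable order.
    ordered = sorted(sorted(members) for members in components.values())
    return {f"DEMO:{i:07d}": members for i, members in enumerate(ordered, start=1)}


# Hypothetical mappings; note that ma:lower_jaw and fma:mandible are never
# mapped to each other directly, yet end up in the same group.
MAPPINGS = [
    ("ma:lower_jaw", "zfa:lower_jaw"),
    ("zfa:lower_jaw", "fma:mandible"),
    ("ma:eye", "zfa:eye"),
]

for group_id, members in group_classes(MAPPINGS).items():
    print(group_id, members)
```

The transitive merging is what makes the grouping promiscuous: a single bad mapping is enough to fuse two otherwise unrelated groups.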

Finally, we added links to the ontology based on links in the external AOs. Again, we were extremely liberal in what we accepted: if an external AO contained a link (X,R,Y) (asserted or logically entailed), we also created a link (X',R,Y'), where X<->X' and Y<->Y' are mappings. This ultra-liberal policy undoubtedly creates invalid links. The goal is to create the maximal set for now, and to curate it in future. Three cyclic links were generated; these were manually removed.
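The link-lifting rule and a check for cyclic links can be sketched as follows. This handles asserted links only (no entailment), ignores the relation type when looking for cycles, and uses hypothetical names throughout (propagate_links, cyclic_links, the U: and ma:/zfa: IDs); it is an illustration, not the procedure actually used.

```python
def propagate_links(external_links, to_group):
    """Lift each (X, R, Y) to (X', R, Y') when both ends have a mapping."""
    lifted = set()
    for x, rel, y in external_links:
        if x in to_group and y in to_group:
            lifted.add((to_group[x], rel, to_group[y]))
    return lifted


def reachable(start, target, adjacency):
    """Depth-first search: can target be reached from start?"""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def cyclic_links(links):
    """Return the links (X, R, Y) for which X can be reached back from Y."""
    adjacency = {}
    for x, _, y in links:
        adjacency.setdefault(x, set()).add(y)
    return {(x, r, y) for x, r, y in links if reachable(y, x, adjacency)}


# Hypothetical mappings from external IDs to grouping classes.
TO_GROUP = {"ma:a": "U:1", "ma:b": "U:2", "zfa:a": "U:1", "zfa:b": "U:2", "zfa:c": "U:3"}

# Hypothetical external links; the two part_of links disagree on direction.
EXTERNAL = [
    ("ma:a", "part_of", "ma:b"),
    ("zfa:b", "part_of", "zfa:a"),
    ("zfa:b", "is_a", "zfa:c"),
    ("ma:x", "is_a", "ma:a"),  # ma:x has no mapping, so this link is dropped
]

LIFTED = propagate_links(EXTERNAL, TO_GROUP)
print(sorted(cyclic_links(LIFTED)))
```

Links flagged this way would still need a human to decide which direction, if any, is correct.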