Version 65 (modified by nata_courby, 2 years ago)
ontoCAT R package
We provide two versions of the ontoCAT R package:
- The lightweight version is available in Bioconductor starting from release 2.7. It includes all single-ontology functionality, but omits the methods for working with multiple ontologies and for searching OLS and BioPortal.
- The full version includes the batch methods and, due to package-size limitations, is available only from the ontoCAT project SourceForge page.
The ontoCAT R package was created to support basic operations on ontologies (traversal and search), to give uniform access to ontologies in OWL and OBO formats, and to provide R access to the major ontology repositories OLS and BioPortal.
Several hundred public ontologies and numerous private ontologies for describing biological data exist today. Using ontologies in R is difficult due to the lack of uniform package support. At the same time, numerous Java-based ontology projects are available. ontoCAT takes advantage of a standard Java library of the same name, "ontoCAT", to implement its functionality. The Java sources used for the R package are available here: Java sources.
The ontoCAT package:
- gives unified, format-independent access to ontology terms and the ontology hierarchy represented in OWL and OBO formats;
- provides basic methods for ontology traversal, such as searching for terms, listing a specific term's relations, showing paths to the term from the root element of the ontology, showing flattened-tree representations of the ontology hierarchy;
- supports working with groups of ontologies and with major public ontology repositories: searching for terms across ontologies, listing available ontologies and loading ontologies for further analysis as necessary.
No other package with similar functionality exists at the moment in the R environment.
The integration of the above functionality into R allows combining and automating ontology-related tasks.
Examples of ontology-related tasks that can be accomplished with the ontoCAT package are given on the examples page of the ontoCAT website: a gene enrichment test with grouping of results, and search and re-annotation of free text to ontology terms.
ontoCAT has been included in Bioconductor, the main open source R project in bioinformatics.
Reasoning over ontologies and extracting relationships is supported through the HermiT reasoner. OBO ontologies are translated by the OWL API into valid OWL that can be reasoned over.
What is reasoning needed for?
Reasoning is fundamental to exploring hierarchies in the more expressive ontologies. We present here a simple example of how a reasoner can be employed; considerably more complicated scenarios are in use in the OWL community.
Consider the following hypothetical ontology (annotated with Description Logic syntax below) describing the anatomical relations among the parts of the heart. We start by defining four classes (Heart, HeartComponent, LeftHeart, and MitralValve) and two object properties that describe the partonomy: partOf and hasPart. The two primitive classes LeftHeart and MitralValve have additional necessary conditions, partOf some Heart and partOf some LeftHeart respectively, which state that the mitral valve is a component of the left heart, which in turn is a part of the whole heart. We additionally create a defined class HeartComponent as a convenience class that captures all the different parts of the heart through subsumption. This is our primary query: find all parts of the heart.
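The Description Logic annotation itself is not reproduced on this page. As a reconstruction from the prose above only (an assumption, not the authors' original notation), the three axioms could be written as:

```latex
\begin{align*}
\mathit{LeftHeart}      &\sqsubseteq \exists\,\mathit{partOf}.\mathit{Heart} \\
\mathit{MitralValve}    &\sqsubseteq \exists\,\mathit{partOf}.\mathit{LeftHeart} \\
\mathit{HeartComponent} &\equiv \exists\,\mathit{partOf}.\mathit{Heart}
\end{align*}
```

The first two lines are the necessary conditions on the primitive classes; the third is the definition of the convenience class used for the query.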
Simply parsing the ontology to find which classes have the statement partOf some Heart among their restrictions would only return LeftHeart, but miss MitralValve completely. However, if we additionally specify that partOf is transitive, the reasoner would be able to infer that MitralValve is also a HeartComponent.
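The difference between reading asserted restrictions and following a transitive partOf can be illustrated with a small base-R sketch. It is a plain graph walk over a hand-written edge table, not a Description Logic reasoner and not how HermiT works internally; the names part_of, direct_parts and all_parts are made up for this illustration.

```r
# Asserted partOf restrictions from the heart example (part -> whole).
part_of <- data.frame(
  part  = c("LeftHeart", "MitralValve"),
  whole = c("Heart", "LeftHeart"),
  stringsAsFactors = FALSE
)

# Parts found by reading only the asserted restrictions.
direct_parts <- function(edges, whole) {
  edges$part[edges$whole == whole]
}

# Parts found when partOf is treated as transitive:
# keep following edges until no new parts appear.
all_parts <- function(edges, whole) {
  found <- character(0)
  frontier <- whole
  while (length(frontier) > 0) {
    nxt <- unique(edges$part[edges$whole %in% frontier])
    frontier <- setdiff(nxt, found)  # only parts not seen before
    found <- c(found, frontier)
  }
  found
}

direct_parts(part_of, "Heart")  # "LeftHeart"
all_parts(part_of, "Heart")     # "LeftHeart" "MitralValve"
```

The direct lookup misses MitralValve, while the transitive walk reaches it through LeftHeart.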
Conversely, while it is possible to query for all classes fulfilling the existential restriction partOf some Heart to retrieve all parts of the heart, the opposite query for hasPart some LeftHeart will not return Heart due to OWL's open world assumption, unless a closure axiom hasPart some Heart AND hasPart only Heart is additionally specified on the LeftHeart class. This limitation is not directly addressed in the package, as it would require precomputing a complete graph of the different relation axes, which is computationally intensive and heavily dependent on the size and expressiveness of the ontology in question.
Furthermore, while the reasoner does return LeftHeart and MitralValve when queried for subclasses of the 'anonymous class' partOf some Heart (equivalent to the aforementioned 'defined class' HeartComponent), we additionally transform all the properties into their inverses at this stage, so that partOf becomes hasPart and the resulting output from the package is Heart hasPart LeftHeart and MitralValve. As this depends on the existence of inverse restrictions on the object properties within the ontology, these are automatically added where necessary, based on the Relation Ontology (Smith B, Ceusters W, Klagges B, Kohler J, Kumar A, Lomax J, Mungall CJ, Neuhaus F, Rector A, Rosse C. Relations in biomedical ontologies. Genome Biology, 2005, 6:R46).
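The inversion step can be sketched in base R as follows. The triple table and the inverse_of lookup are assumptions made for illustration: the package derives inverses from the Relation Ontology, and its internal data structures are not described here.

```r
# Assumed reasoner output for the query "partOf some Heart".
inferred <- data.frame(
  subject  = c("LeftHeart", "MitralValve"),
  relation = "partOf",
  object   = "Heart",
  stringsAsFactors = FALSE
)

# Hypothetical lookup of inverse properties.
inverse_of <- c(partOf = "hasPart", hasPart = "partOf")

# Swap subject and object and replace each relation by its inverse.
invert <- function(triples, inverses) {
  data.frame(
    subject  = triples$object,
    relation = unname(inverses[triples$relation]),
    object   = triples$subject,
    stringsAsFactors = FALSE
  )
}

inverted <- invert(inferred, inverse_of)
```

Each row of inverted then reads as Heart hasPart LeftHeart and Heart hasPart MitralValve, matching the output form described above.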
In ontoCAT, the subsumption relation "subclass/superclass" is supported in the user-friendly form of a "child -- parent" relationship.
For instance, the ontology term "myocardium" is a parent of the term "atrial myocardium", since "atrial myocardium" is a subclass of "myocardium". No distinction is made between universals (classes) and particulars (instances), as both are treated as ontology terms.
The advantage of using a reasoner in ontoCAT is the ability to work with different relationships in addition to subsumption.
An example of operations on relationships is available here: R Example 3.
Single Ontology Traversal Methods
- To load an ontology, use the getOntology() method of the Ontology class. It takes a single argument specifying the local filesystem path, the full URI of the ontology file, or its OLS/BioPortal accession number.
- getAllTermChildren(Ontology, term) returns list of all children of the term
- getAllTermChildrenById(Ontology, 'EFO_0000343') returns list of all children of the term
- getAllTermIds(Ontology) returns list of all term accessions
- getAllTermParents(Ontology, term) returns list of all parents of the term
- getAllTermParentsById(Ontology, 'EFO_0000343') returns list of all parents of the term
- getAllTerms(Ontology) returns list of all terms
- getEFOBranchRootIds(Ontology) returns set of branch root accessions. Method specific for EFO ontology
- getOntologyAccession(Ontology) returns parsed ontology accession
- getOntologyDescription(Ontology) returns parsed ontology description
- getRootIds(Ontology) returns list of root term accessions, if there are any
- getRoots(Ontology) returns list of root terms, if there are any
- getTermAndAllChildren(Ontology, term) returns list of accessions of term itself and all its children recursively
- getTermAndAllChildrenById(Ontology, 'EFO_0000343') returns list of accessions of term itself and all its children recursively
- getTermById(Ontology, 'EFO_0000343') fetches term by accession; returns external term representation if found in ontology, null otherwise
- getTermChildren(Ontology, term) returns list of term's direct children
- getTermChildrenById(Ontology, 'EFO_0000343') returns list of term's direct children
- getTermDefinitions(Ontology, term) returns set of term's definitions if there are some
- getTermNameById(Ontology, 'EFO_0000343') returns term's label by accession
- getTermParents(Ontology, term) returns list of term's direct parents
- getTermParentsById(Ontology, 'EFO_0000343') returns list of term's direct parents
- getTermSynonyms(Ontology, term) returns set of term's synonyms if there are some
- hasTerm(Ontology, 'EFO_0000343') checks whether a term with the specified accession exists in ontology
- isEFOBranchRoot(Ontology, term) returns true if term is branch root of EFO. Method specific for EFO ontology
- isEFOBranchRootById(Ontology, 'EFO_0000343') returns true if term is branch root of EFO. Method specific for EFO ontology
- isRoot(Ontology, term) returns true if term is root of ontology
- isRootById(Ontology, 'EFO_0000343') returns true if term is root of ontology
- searchTerm(Ontology, 'thymus') searches for term in ontology by name
- searchTermPrefix(Ontology, 'thym') searches for prefix in ontology
- showHierarchyDownToTerm(Ontology, term) returns set of terms that represent ontology "opened" down to specified term, hence displaying all its parents first and then a tree level, containing specified term
- showHierarchyDownToTermById(Ontology, 'EFO_0000343') returns set of terms that represent ontology "opened" down to specified term, hence displaying all its parents first and then a tree level, containing specified term
- showPathsToTerm(Ontology, term) returns paths to the specified term from ontology's root term
- showPathsToTermById(Ontology, 'EFO_0000343') returns paths to the specified term from ontology's root term
- getOntologyRelationNames(Ontology) returns list of relations used in ontology
- getTermRelationNames(Ontology, term) returns list of relations that term has
- getTermRelationNamesById(Ontology, 'EFO_0000343') returns list of relations that the term with the given accession has
- getTermRelations(Ontology, 'EFO_0000343', 'has_part') returns list of terms that are in the specified relation with the term of interest
- getTermRelations(Ontology, term, 'has_part') returns list of terms that are in the specified relation with the term of interest
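As a minimal sketch of how several of the methods above might be combined, the helper below summarises one term by its accession. It calls only functions named in the list and assumes hasTerm() yields an R logical; describe_term and the file path in the usage comment are hypothetical, and the code has not been checked against a particular ontoCAT release.

```r
# Attach ontoCAT only if it is installed (it also needs a working Java runtime).
if (requireNamespace("ontoCAT", quietly = TRUE)) library(ontoCAT)

# Hypothetical helper: summarise one term using functions from the list above.
describe_term <- function(ontology, accession) {
  # Assumption: hasTerm() returns TRUE/FALSE as an R logical.
  if (!isTRUE(hasTerm(ontology, accession))) {
    return(NULL)  # accession not present in this ontology
  }
  list(
    accession = accession,
    label     = getTermNameById(ontology, accession),
    parents   = getTermParentsById(ontology, accession),
    children  = getTermChildrenById(ontology, accession)
  )
}

# Usage (the path is a placeholder, not a real location):
# efo  <- getOntology("path/to/efo.owl")
# info <- describe_term(efo, "EFO_0000343")
```

Checking hasTerm() first keeps the helper from calling the lookup methods on an accession that the ontology does not contain.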
Operations on Multiple Ontologies
- To create a local batch of ontologies the getOntologyFromBatch() method of the batch class is provided, taking a single argument: the path to the local directory containing ontology files. By default, a call to getOntologyFromBatch() without any arguments will load the EFO ontology.
- Ontologies can be added to an existing batch as needed via the addOntology() method.
- searchTerm(batch, 'thymus') searches for term in all ontologies in the batch.
- searchTermInOLS('thymus') searches for term in OLS.
- searchTermInBioPortal('thymus') searches for term in BioPortal.
- searchTermInAll(batch, 'thymus') searches for term in all ontologies in the batch as well as in the OLS and BioPortal repositories.
- listLoadedOntologies(batch) returns a list of ontologies from the batch.
- listOLSOntologies(batch) returns a list of ontologies available from OLS.
- listBioportalOntologies(batch) returns a list of ontologies from BioPortal.
- Once the sought terms are found and term-specific operations (parent/child retrieval, etc.) are needed, getOntologyFromBatch(batch, accession) returns the ontology parser for the specific ontology, with all the single-ontology methods described above.
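The search-then-drill-down workflow from the last item might be wrapped as in the sketch below. It assumes the full ontoCAT version from SourceForge (the Bioconductor version has no batch methods) and an already created batch object; search_then_open is a hypothetical name, and the code has not been run against the package.

```r
# Attach ontoCAT only if it is installed (the batch methods need the full version).
if (requireNamespace("ontoCAT", quietly = TRUE)) library(ontoCAT)

# Hypothetical helper: search every ontology in the batch, then open one
# ontology by accession so the single-ontology methods can be used on it.
search_then_open <- function(batch, query, accession) {
  hits     <- searchTerm(batch, query)
  ontology <- getOntologyFromBatch(batch, accession)
  list(hits = hits, ontology = ontology)
}
```

The returned ontology element can then be passed to methods such as getTermParentsById() or showPathsToTermById().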