The compositional structure of Gene Ontology terms

by P V Ogren, K B Cohen, G K Acquaah-Mensah, J Eberlein, L Hunter

Abstract

An analysis of the term names in the Gene Ontology reveals the prevalence of substring relations between terms: 65.3% of all GO terms contain another GO term as a proper substring. This substring relation often coincides with a derivational relationship between the terms. For example, the term "regulation of cell proliferation" (GO:0042127) is derived from the term "cell proliferation" (GO:0008283) by addition of the phrase "regulation of". Further, we note that particular substrings which are not themselves GO terms (e.g. "regulation of" in the preceding example) recur frequently and in consistent subtrees of the ontology, and that these frequently occurring substrings often indicate interesting semantic relationships between the related terms. We describe the extent of these phenomena (substring relations between terms, and the recurrence of derivational phrases such as "regulation of") and propose that these phenomena can be exploited in various ways to make the information in GO more computationally accessible, to construct a conceptually richer representation of the data encoded in the ontology, and to assist in the analysis of natural language texts.
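The substring analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`proper_substring_fraction`, `residual_phrases`) and the five-term toy list are hypothetical, and matching is done on raw characters, whereas the paper may well match on word tokens. The first function measures how many terms contain another term as a proper substring; the second counts the leftover phrases, such as "regulation of".

```python
from collections import Counter


def proper_substring_fraction(terms):
    """Fraction of term names that contain another term name as a proper substring."""
    names = set(terms)
    if not names:
        return 0.0
    hits = sum(
        1 for t in names
        if any(o != t and o in t for o in names)
    )
    return hits / len(names)


def residual_phrases(terms):
    """Count the text left over when a contained term is cut out of its container."""
    names = set(terms)
    counts = Counter()
    for t in names:
        for o in names:
            if o != t and o in t:
                # Remove the first occurrence of the contained term, normalise spaces.
                residue = " ".join(t.replace(o, " ", 1).split())
                if residue:
                    counts[residue] += 1
    return counts


# Toy data for illustration only; real input would be the full list of GO term names.
toy_terms = [
    "cell proliferation",
    "regulation of cell proliferation",
    "cell growth",
    "regulation of cell growth",
    "mitochondrion",
]

fraction = proper_substring_fraction(toy_terms)
phrases = residual_phrases(toy_terms)
print(fraction)
print(phrases.most_common(3))
```

The pairwise scan is quadratic in the number of terms, which is acceptable for a sketch; for the full ontology, an index over word n-grams would likely be a better choice.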

Type  article-journal
Stage   published
Year   2004
Language   en
PubMed  14992505
PMC  PMC2490823
Container Metadata
Not in DOAJ
Not in Keepers Registry
ISSN-L:  2335-6928