The compositional structure of Gene Ontology terms
release_qulzmfbqezdzdar5yezgdh4k7y
by
P V Ogren, K B Cohen, G K Acquaah-Mensah, J Eberlein, L Hunter
2004 p214-25
Abstract
An analysis of the term names in the Gene Ontology reveals the prevalence of substring relations between terms: 65.3% of all GO terms contain another GO term as a proper substring. This substring relation often coincides with a derivational relationship between the terms. For example, the term "regulation of cell proliferation" (GO:0042127) is derived from the term "cell proliferation" (GO:0008283) by addition of the phrase "regulation of". Further, we note that particular substrings that are not themselves GO terms (e.g., "regulation of" in the preceding example) recur frequently and in consistent subtrees of the ontology, and that these frequently occurring substrings often indicate interesting semantic relationships between the related terms. We describe the extent of these phenomena (substring relations between terms, and the recurrence of derivational phrases such as "regulation of") and propose that these phenomena can be exploited in various ways to make the information in GO more computationally accessible, to construct a conceptually richer representation of the data encoded in the ontology, and to assist in the analysis of natural language texts.
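The substring relation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes plain character-level substring matching, and the term list is a toy sample in which only the first two names are taken from the abstract. The function names (`contained_terms`, `substring_fraction`, `derivational_phrases`) are hypothetical and chosen for illustration.

```python
from collections import Counter

# Toy vocabulary. Only the first two names appear in the abstract;
# the others are illustrative assumptions, not data from the paper.
TERMS = [
    "cell proliferation",
    "regulation of cell proliferation",
    "cell growth",
    "regulation of cell growth",
    "transport",
]


def contained_terms(term, vocabulary):
    """Return the other vocabulary terms that occur as a proper substring of `term`."""
    return [t for t in vocabulary if t != term and t in term]


def substring_fraction(vocabulary):
    """Fraction of terms that contain at least one other term as a proper substring."""
    hits = sum(1 for term in vocabulary if contained_terms(term, vocabulary))
    return hits / len(vocabulary)


def derivational_phrases(vocabulary):
    """Count the text left over once a contained term is removed from its container."""
    counts = Counter()
    for term in vocabulary:
        for inner in contained_terms(term, vocabulary):
            start = term.index(inner)
            prefix = term[:start].strip()
            suffix = term[start + len(inner):].strip()
            if prefix:
                counts[prefix] += 1
            if suffix:
                counts[suffix] += 1
    return counts


if __name__ == "__main__":
    print(substring_fraction(TERMS))
    print(derivational_phrases(TERMS).most_common(3))
```

On this toy sample, two of the five terms contain another term, and the leftover phrase "regulation of" is counted twice. Applied to the full set of GO term names, the same kind of count would correspond to the 65.3% figure reported above, although the paper's exact matching rules (for example, whether matches must fall on word boundaries) are not stated in the abstract.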