The car pet in the carpet. On the interaction of computer-linguistic methodology and manual refinement in researching noun compounds
by
Elisabeth Huber
2019 p78
Abstract
Why does football combine productively with further nouns to form more complex expressions like football game, whereas seemingly comparable compounds like keyword only rarely expand into more complex sequences? This project explores why some two-noun compounds are more readily available for forming triconstituent constructions than others. I hypothesize that the productivity of a two-noun compound in forming triconstituent sequences depends on its degree of entrenchment: only compounds that are entrenched to a certain degree are productive in forming more complex constructions. To test this hypothesis, a list of three-noun compounds in English had to be compiled. The obvious approach is to search for sequences of three nouns in POS-tagged corpora. However, such automated searches neither retrieve all relevant instances (limited recall) nor return only relevant ones (limited precision), so substantial manual screening is required. Furthermore, to operationalize the concepts of entrenchment and productivity, the usage frequencies of noun constructions had to be counted. Here, too, automatic data extraction had to be complemented by manual selection in order to obtain correct usage frequencies. Both the automatic and the manual work processes involved in collecting the data are presented in detail to give an impression of the extent of such a project.
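The automatic step described in the abstract, searching a POS-tagged corpus for three-noun sequences and counting how often a two-noun sequence occurs, can be illustrated with a minimal sketch. This is not the author's pipeline. The tag set (Penn Treebank-style `NN`/`NNS`), the function names, and the toy sentence data are all assumptions made for illustration. The output is only a candidate list and would still need the manual screening the abstract calls for.

```python
from collections import Counter

# Assumption: Penn Treebank-style tags for common nouns.
NOUN_TAGS = {"NN", "NNS"}

def noun_runs(tagged):
    """Yield maximal runs of consecutive noun tokens as tuples of lowercased words."""
    run = []
    for word, tag in tagged:
        if tag in NOUN_TAGS:
            run.append(word.lower())
        else:
            if run:
                yield tuple(run)
            run = []
    if run:
        yield tuple(run)

def three_noun_candidates(tagged):
    """Count noun runs of exactly three tokens (candidates only, not verified compounds)."""
    return Counter(r for r in noun_runs(tagged) if len(r) == 3)

def two_noun_frequency(tagged, pair):
    """Count occurrences of an adjacent two-noun sequence inside any noun run."""
    pair = tuple(w.lower() for w in pair)
    return sum(
        1
        for r in noun_runs(tagged)
        for i in range(len(r) - 1)
        if r[i:i + 2] == pair
    )

# Hypothetical toy data, invented for this sketch.
tagged = [
    ("The", "DT"), ("football", "NN"), ("game", "NN"), ("ticket", "NN"),
    ("was", "VBD"), ("cheap", "JJ"), (".", "."),
    ("A", "DT"), ("football", "NN"), ("game", "NN"), ("started", "VBD"), (".", "."),
]

candidates = three_noun_candidates(tagged)
pair_count = two_noun_frequency(tagged, ("football", "game"))
```

On this toy input, `candidates` holds the single run `("football", "game", "ticket")`, and `pair_count` counts both occurrences of `football game`, including the one embedded in the longer run. A purely tag-based search like this inherits every tagging error and cannot tell a genuine compound from an accidental adjacency of nouns, which is why the counts are treated as raw material for manual selection.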
Archived Files and Locations
application/pdf, 1.2 MB: polipapers.upv.es (publisher), web.archive.org (webarchive)
article-journal
Stage: published
Date 2019-07-16
Open Access Publication
Not in DOAJ
In ISSN ROAD
Not in Keepers Registry
ISSN-L: 2530-9455