Different Issues in the Design of a Lemmatizer/Tagger for Basque
by
I. Aduriz, I. Alegria, J. M. Arriola, X. Artola, A. Diaz de Illarraza,
N. Ezeiza, K. Gojenola, M. Maritxalar
1995
Abstract
This paper presents the main issues considered in the design of a
general-purpose lemmatizer/tagger for Basque (EUSLEM). The lemmatizer/tagger
is conceived as a basic tool needed by other linguistic applications. It uses
the lexical database and the morphological analyzer previously developed and
implemented. Due to the characteristics of the language, the tagset proposed
here is structured in four levels, so that each level refines the previous
one by adding more detailed information. We focus on the problems found in
designing this tagset and on the strategies that will be used for
morphological disambiguation.
cmp-lg/9503020v1