Survey on English Entity Linking on Wikidata
by
Cedric Möller, Jens Lehmann, Ricardo Usbeck
2021
Abstract
Wikidata is a frequently updated, community-driven, and multilingual
knowledge graph. Hence, Wikidata is an attractive basis for Entity Linking,
as is evident from the recent increase in published papers. This survey
focuses on four subjects: (1) Which Wikidata Entity Linking datasets exist, how
widely used are they and how are they constructed? (2) Do the characteristics
of Wikidata matter for the design of Entity Linking datasets and if so, how?
(3) How do current Entity Linking approaches exploit the specific
characteristics of Wikidata? (4) Which Wikidata characteristics are unexploited
by existing Entity Linking approaches? This survey reveals that current
Wikidata-specific Entity Linking datasets do not differ in their annotation
scheme from schemes for other knowledge graphs like DBpedia. Thus, the
potential for multilingual and time-dependent datasets, naturally suited for
Wikidata, remains untapped. Furthermore, we show that most Entity Linking
approaches use Wikidata in the same way as any other knowledge graph, missing
the chance to leverage Wikidata-specific characteristics to increase quality.
Almost all approaches employ specific properties like labels and sometimes
descriptions but ignore characteristics such as the hyper-relational structure.
Hence, there is still room for improvement, for example, by including
hyper-relational graph embeddings or type information. Many approaches also
include information from Wikipedia, which is easily combined with Wikidata
and provides valuable textual information that Wikidata lacks.
arXiv:2112.01989v1