CoPhIR: a Test Collection for Content-Based Image Retrieval
by
Paolo Bolettieri, Andrea Esuli, Fabrizio Falchi, Claudio Lucchese,
Raffaele Perego, Tommaso Piccioli, Fausto Rabitti
2009
Abstract
The scalability and effectiveness of the different Content-based Image
Retrieval (CBIR) approaches proposed in the literature are today an important
research issue. Given the wealth of images on the Web, CBIR systems must in
fact leap towards Web-scale datasets. In this paper, we report on our
experience in building a test collection of 100 million images, with the
corresponding descriptive features, to be used for experimenting with new
scalable techniques for similarity searching and for comparing their results. In the
context of the SAPIR (Search on Audio-visual content using Peer-to-peer
Information Retrieval) European project, we had to test our distributed
similarity searching technology on a realistic data set. Therefore, since no
large-scale collection was available for research purposes, we had to tackle
the non-trivial process of image crawling and descriptive feature extraction
(we used five MPEG-7 features) using the European EGEE computer GRID. The
result of this effort is CoPhIR, the first CBIR test collection of such scale.
CoPhIR is now open to the research community for experiments and comparisons,
and access to the collection has already been granted to more than 50 research
groups worldwide.
Archived file: application/pdf, 2.1 MB, archive.org (archive), 0905.4627v2