CoPhIR: a Test Collection for Content-Based Image Retrieval

by Paolo Bolettieri, Andrea Esuli, Fabrizio Falchi, Claudio Lucchese, Raffaele Perego, Tommaso Piccioli, Fausto Rabitti

Released as an article.

2009  

Abstract

The scalability, as well as the effectiveness, of the different Content-based Image Retrieval (CBIR) approaches proposed in the literature is today an important research issue. Given the wealth of images on the Web, CBIR systems must in fact leap towards Web-scale datasets. In this paper, we report on our experience in building a test collection of 100 million images, with the corresponding descriptive features, to be used for experimenting with new scalable techniques for similarity searching and for comparing their results. In the context of the SAPIR (Search on Audio-visual content using Peer-to-peer Information Retrieval) European project, we had to test our distributed similarity searching technology on a realistic data set. Therefore, since no large-scale collection was available for research purposes, we had to tackle the non-trivial process of image crawling and descriptive feature extraction (we used five MPEG-7 features) using the European EGEE computer GRID. The result of this effort is CoPhIR, the first CBIR test collection of such scale. CoPhIR is now open to the research community for experiments and comparisons, and access to the collection has already been granted to more than 50 research groups worldwide.

Archived Files and Locations

application/pdf  2.1 MB
file_xu2avz3fxbf4pkml6soq54rseu
archive.org (archive)
Type  article
Stage   submitted
Date   2009-06-01
Version   v2
Language   en
arXiv  0905.4627v2