Max-Sum Diversification, Monotone Submodular Functions and Dynamic Updates

by Allan Borodin, Aadhar Jain, Hyun Chul Lee, Yuli Ye

Released as an article.

2012  

Abstract

Result diversification is an important aspect of web-based search, document summarization, facility location, portfolio management and other applications. Given a set of ranked results for a set of objects (e.g. web documents, facilities, etc.) with a distance between any pair, the goal is to select a subset S satisfying the following three criteria: (a) the subset S satisfies some constraint (e.g. bounded cardinality); (b) the subset contains results of high "quality"; and (c) the subset contains results that are "diverse" relative to the distance measure. The goal of result diversification is to produce a diversified subset while keeping quality as high as possible. We study a broad class of problems where the distances form a metric, the constraint is given by independence in a matroid, quality is determined by a monotone submodular function, and diversity is defined as the sum of distances between objects in S. Our problem is a generalization of the max-sum diversification problem studied in GoSh09, which in turn is a generalization of the max-sum p-dispersion problem studied extensively in location theory. It is NP-hard even when the distances satisfy the triangle inequality. We propose two simple and natural algorithms: a greedy algorithm for a cardinality constraint and a local search algorithm for an arbitrary matroid constraint. We prove that both algorithms achieve constant approximation ratios.
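As a rough illustration of the setting the abstract describes, the following Python sketch scores a subset S by a quality function plus the sum of pairwise distances, and selects items under a cardinality constraint with a plain marginal-gain greedy. The trade-off weight `lam`, the function names, and the exact greedy rule are assumptions made for illustration; the abstract does not specify them, and this sketch is not claimed to be the paper's algorithm or to carry its approximation guarantee.

```python
from itertools import combinations


def diversification_objective(S, quality, dist, lam=1.0):
    """Quality of S plus lam times the sum of pairwise distances in S.

    `lam` is a hypothetical trade-off weight, not taken from the abstract.
    """
    spread = sum(dist(u, v) for u, v in combinations(S, 2))
    return quality(S) + lam * spread


def greedy_diversify(items, p, quality, dist, lam=1.0):
    """Select at most p items, each time adding the item whose inclusion
    increases the combined objective the most (a generic greedy rule)."""
    selected = []
    remaining = list(items)
    while remaining and len(selected) < p:
        current = diversification_objective(selected, quality, dist, lam)
        best = max(
            remaining,
            key=lambda u: diversification_objective(
                selected + [u], quality, dist, lam
            ) - current,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy instance: points on a line with additive (hence monotone submodular)
# quality weights and absolute difference as the metric.
weights = {0: 1.0, 1: 3.0, 5: 1.0, 10: 1.0}
quality = lambda S: sum(weights[u] for u in S)
dist = lambda u, v: abs(u - v)

chosen = greedy_diversify(list(weights), 2, quality, dist)
score = diversification_objective(chosen, quality, dist)
```

In this toy instance the first pick is driven purely by quality (no distances exist yet), and the second pick is dominated by distance to the item already chosen, which shows how the two criteria interact.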

Archived Files and Locations

application/pdf  245.1 kB
file_am5w7rndzzcxzkygnropxbj644
archive.org (archive)
Type  article
Stage   submitted
Date   2012-03-28
Version   v1
Language   en
arXiv  1203.6397v1