Amin Mantrach, Jean-Michel Renders
International Conference on Knowledge Discovery and Information Retrieval, Paris, France, October 26-29, 2011.
The growing importance of social media and heterogeneous relational data emphasizes the fundamental
problem of combining different sources of evidence (or modes) efficiently. In this work, we consider the
problem of people retrieval, where the requested information consists of persons rather than documents. Indeed,
the processed queries generally contain both textual keywords and social links, while the target collection
consists of a set of documents with social metadata. Traditional approaches tackle this problem by early or late
fusion where, typically, a person is represented by two sets of features: a word profile and a contact/link profile.
Inspired by cross-modal similarity measures initially designed to combine image and text, we propose in this
paper a new way of combining social and content aspects for retrieving people from a collection of documents
with social metadata. To this end, we define a set of multimodal similarity measures between socially-labelled
documents and queries, which can then be aggregated at the person level to provide a final relevance score for
the general people retrieval problem. Then, we examine particular instances of this problem: author retrieval,
recipient recommendation and alias detection. For this purpose, experiments have been conducted on the
ENRON email collection, showing the benefits of our proposed approach with respect to more standard fusion and aggregation methods.
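
To make the baseline setting concrete, the following is a minimal sketch of a standard late-fusion approach with person-level aggregation, the kind of method the abstract compares against. It is not the cross-modal measure proposed in the paper. The document layout (dicts with "words", "people" and "author" keys), the weight alpha, the use of cosine similarity and the sum aggregation are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def late_fusion_score(query, doc, alpha=0.5):
    """Score one document: weighted sum of textual and social similarity.

    alpha is a hypothetical mixing weight (assumption, not from the paper).
    """
    text_sim = cosine(query["words"], doc["words"])
    social_sim = cosine(query["people"], doc["people"])
    return alpha * text_sim + (1.0 - alpha) * social_sim


def rank_people(query, docs, alpha=0.5):
    """Aggregate document scores per author (sum, an assumed choice) and rank."""
    scores = defaultdict(float)
    for doc in docs:
        scores[doc["author"]] += late_fusion_score(query, doc, alpha)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: two emails with a word profile and a contact profile each.
docs = [
    {"author": "alice", "words": {"budget": 1.0, "meeting": 1.0}, "people": {"carol": 1.0}},
    {"author": "bob", "words": {"lunch": 1.0}, "people": {"dave": 1.0}},
]
query = {"words": {"budget": 1.0}, "people": {"carol": 1.0}}
ranking = rank_people(query, docs)
```

With this toy data the first entry of ranking is the author whose email shares both a keyword and a contact with the query. Other aggregation choices (for example, taking the maximum or the mean over an author's documents) would fit the same structure.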