Matthijs Hovelynch, Boris Chidlovskii
19th International World Wide Web Conference, Raleigh, North Carolina, USA, 26-30 April 2010
We propose a method for improving classification performance in a one-class setting by combining classifiers of different modalities. We apply the method to the problem of distinguishing responsive documents in a corpus of e-mails, such as the Enron Corpus. We extract the social network of actors that is implicit in a large body of electronic communication and turn it into valuable features for classifying the exchanged documents. Working in a one-class setting, we follow a semi-supervised approach based on the Mapping Convergence framework. We propose an alternative interpretation that allows for broader applicability when positive and negative items are not naturally separable. We propose an extension to the one-class evaluation framework for truly one-class cases, in which only some positive training examples are available. We extend the one-class setting with the co-training principle, which enables us to take advantage of multiple views on the data. We report evaluation results of this extension on three different corpora, including the Enron Corpus.
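As a rough illustration of turning an implicit social network into document features, the following minimal Python sketch builds a weighted sender-to-recipient graph from e-mail headers and derives a few simple per-message features. The actor names, the function names, and the choice of degree-based features are all assumptions made for this example; the abstract does not specify which network features the method uses.

```python
from collections import defaultdict

def build_network(emails):
    """Count messages per directed (sender, recipient) pair.

    `emails` is an iterable of (sender, [recipients]) tuples (hypothetical format).
    """
    edges = defaultdict(int)
    for sender, recipients in emails:
        for recipient in recipients:
            if recipient != sender:  # ignore self-addressed copies
                edges[(sender, recipient)] += 1
    return dict(edges)

def actor_degrees(edges):
    """Weighted out-degree and in-degree of every actor in the network."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for (sender, recipient), weight in edges.items():
        out_deg[sender] += weight
        in_deg[recipient] += weight
    return dict(out_deg), dict(in_deg)

def document_features(email, out_deg, in_deg):
    """Illustrative feature vector for one message (an assumed feature set):
    sender out-degree, sender in-degree, number of recipients,
    and mean in-degree of the recipients.
    """
    sender, recipients = email
    n = len(recipients)
    mean_in = sum(in_deg.get(r, 0) for r in recipients) / n if n else 0.0
    return [out_deg.get(sender, 0), in_deg.get(sender, 0), n, mean_in]

# Toy data with made-up actors, for illustration only.
emails = [
    ("alice", ["bob", "carol"]),
    ("bob", ["alice"]),
    ("alice", ["bob"]),
]
edges = build_network(emails)
out_deg, in_deg = actor_degrees(edges)
features = [document_features(e, out_deg, in_deg) for e in emails]
```

Feature vectors of this kind could form one view of the data, alongside a text-based view, when combining classifiers of different modalities.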