2015/081 - K-Nearest Neighbor Search and Outlier Detection via Minimax Distances
Morteza Haghir Chehreghani
SIAM,International Conference on Datamining, Miami, Florida, USA, May 5-7, 2016.
We study Minimax distance measures for K-nearest neighbor search and outlier detection. Recently, the use of this distance measure is shown to improve the K-nearest neighbor classification results. We consider the computational aspects of this problem and propose an efficient and general-purpose algorithm for computing Minimax distances for this task which requires a significantly lower runtime and is applicable with any arbitrary distance measure. We study its connection to minimum spanning trees, particularly the Prim's algorithm, and then, generalize our analysis to computing one-to-all Minimax distances. In the following, we investigate in detail the dges selected by Minimax distances and thereby propose an efficient and effective method to predict outliers while performing Minimax K-nearest neighbor search. We evaluate the performance of our methods on variety of real-world datasets, e.g. text documents and images.