Searching for the furthest neighbor in a given dataset is a linear time complexity problem. This complexity rises to be quadratic when we need to find the furthest neighbor for each point (example) in a dataset. That is, the brute force algorithm takes O(n 2 ) to find the furthest neighbor for all points. Such an algorithm is computationally expensive, particularly when the number of samples n in a dataset is large. In this paper, we introduce an approximate tree-based searching method mainly to reduce the time complexity of the search. The proposed method recursively utilizes the k-means approach in order to split the data into sub-groups and then arranges them as a tree structure. Using such a structure, the searching process consumes O(log(n)) to find the approximated furthest neighbor from a given example; and O(nlog(n)) to find it for all examples in the dataset. Our experiments show that the proposed method is reliable and efficient in approximating the furthest neighbor, therefore, can be used in practice, particularly for big data.
K-Means Tree for Fast Furthest Neighbor Approximation
- Details
- Written by Ahmad S Tarawneh, Ahmad B Hassanat, Issam Elkhadiri, Dmitry Chetverikov, Malek Alrashidi
- Category: Computer Science
- Hits: 5