Original Research Article
Clustering using Boosted Constrained K-means Algorithm
- ¹ Prefectural University of Hiroshima, Japan
- ² National Institute of Informatics, Japan
- ³ Graduate University for Advanced Studies (SOKENDAI), Japan
Constrained k-means clustering, which uses constraints as background knowledge, is easy to implement and fast, but it performs worse than metric-learning-based methods. Because it simply adds a constraint-violation check to the data assignment step of the k-means algorithm, it often exploits only a small number of the available constraints. Metric-learning-based methods, which use constraints to learn a new metric for data similarity, have shown promising results, although the methods proposed so far are often slow, depending on the amount of data or the number of feature dimensions. We present a method that combines the advantages of the constrained k-means and metric learning approaches. It incorporates into a constrained k-means algorithm a mechanism for accepting constraint priorities and a metric learning framework based on the boosting principle. In this framework, a metric is learned in the form of a kernel matrix that integrates weak cluster hypotheses produced by the constrained k-means algorithm, which acts as the weak learner under the boosting principle. Experimental results for 12 datasets from 3 data sources demonstrated that our method achieves performance comparable to that of state-of-the-art constrained clustering methods for most datasets while taking much less computation time. The evaluation also demonstrated that controlling the constraint priorities through the boosting principle is effective and that our constrained k-means algorithm functions correctly as a weak learner for boosting.
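The general idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the AdaBoost-style weight update, the 0/1 co-membership matrix used as a weak hypothesis, and the rule that points touched by heavier constraints are assigned first are all assumptions made for illustration. Constraints are triples `(i, j, s)`, with `s = +1` for must-link and `s = -1` for cannot-link.

```python
import numpy as np

def violates(p, c, labels, constraints):
    """True if putting point p in cluster c breaks a constraint whose
    other endpoint has already been assigned."""
    for i, j, s in constraints:
        if p == i:
            q = j
        elif p == j:
            q = i
        else:
            continue
        if labels[q] == -1:
            continue  # other endpoint not assigned yet
        if s == 1 and labels[q] != c:
            return True
        if s == -1 and labels[q] == c:
            return True
    return False

def weak_clustering(X, k, constraints, weights, rng, n_iter=10):
    """One weak hypothesis: k-means whose assignment step checks constraints.
    Points touched by higher-weight constraints are assigned first
    (an assumed, simplified form of constraint priority)."""
    n = len(X)
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    prio = np.zeros(n)
    for (i, j, _), w in zip(constraints, weights):
        prio[i] = max(prio[i], w)
        prio[j] = max(prio[j], w)
    order = np.argsort(-prio, kind="stable")
    labels = np.full(n, -1)
    for _ in range(n_iter):
        labels = np.full(n, -1)
        for p in order:
            dists = np.linalg.norm(centers - X[p], axis=1)
            for c in np.argsort(dists):
                if not violates(p, c, labels, constraints):
                    labels[p] = c
                    break
            else:
                labels[p] = np.argmin(dists)  # no feasible cluster: accept a violation
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def boosted_kernel(X, k, constraints, rounds=5, seed=0):
    """Combine weak clusterings into a kernel matrix, AdaBoost-style."""
    rng = np.random.default_rng(seed)
    m = len(constraints)
    w = np.full(m, 1.0 / m)               # one weight (priority) per constraint
    K = np.zeros((len(X), len(X)))
    for _ in range(rounds):
        labels = weak_clustering(X, k, constraints, w, rng)
        H = (labels[:, None] == labels[None, :]).astype(float)  # co-membership
        # margin is +1 for a satisfied constraint, -1 for a violated one
        margin = np.array([s * (2.0 * H[i, j] - 1.0) for i, j, s in constraints])
        err = np.clip(w[margin < 0].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        if alpha <= 0:
            break                          # weak learner no better than chance
        K += alpha * H                     # nonnegative sum of PSD matrices
        w *= np.exp(-alpha * margin)       # raise weights of violated constraints
        w /= w.sum()
    return K
```

Because each co-membership matrix is positive semidefinite and only positive coefficients are accumulated, the resulting `K` is a valid kernel matrix that could be passed to a kernel-based clustering method such as kernel k-means.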
Keywords: constrained clustering, metric learning, boosting, constrained k-means algorithm, kernel matrix learning
Received: 22 Sep 2017;
Accepted: 06 Feb 2018.
Edited by: Thomas Nowotny, University of Sussex, United Kingdom
Reviewed by: Andre Gruning, University of Surrey, United Kingdom
Hussein Abbass, University of New South Wales Australia, Australia
Copyright: © 2018 Okabe and Yamada. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Dr. Masayuki Okabe, Prefectural University of Hiroshima, Hiroshima, Japan, email@example.com