Improving Clustering with Sparse Coding


Reference 1

Y. Yang, Z. Wang, J. Yang, and T. Huang, “Data Clustering by Laplacian Regularized L1-Graph”, In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI), 2014.


  • ℓ1-Graph has proven effective for data clustering: it partitions the data space by using the sparse representation of the data as the similarity measure. However, the sparse representation is computed for each datum separately, without taking the geometric structure of the data into account. Motivated by ℓ1-Graph and manifold learning, we propose the Laplacian Regularized ℓ1-Graph (LRℓ1-Graph) for data clustering. The sparse representations of LRℓ1-Graph are regularized by the geometric information of the data so that they vary smoothly along the geodesics of the data manifold.
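As a rough illustration of the underlying ℓ1-graph idea, here is a simplified sketch (not the paper's implementation; the function name `l1_graph_clustering`, the Lasso penalty `alpha`, and the jitter term are our own choices): each sample is sparsely coded over the remaining samples, the code magnitudes become graph weights, and spectral clustering partitions the resulting affinity.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def l1_graph_clustering(X, n_clusters, alpha=0.01, seed=0):
    """Cluster the rows of X via a sparse-representation (l1) graph.

    Simplified sketch: no Laplacian regularization, and the l1 penalty
    is a hand-picked constant rather than a tuned parameter.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]   # dictionary = all other samples
        lasso = Lasso(alpha=alpha, max_iter=10000)
        lasso.fit(X[idx].T, X[i])               # sparse code of x_i over X \ {x_i}
        W[i, idx] = np.abs(lasso.coef_)         # code magnitudes as edge weights
    W = 0.5 * (W + W.T)                         # symmetrize the affinity
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=seed)
    return sc.fit_predict(W + 1e-8)             # small jitter keeps the graph connected
```

The key design choice is that no pairwise distance is ever computed: the sparse reconstruction coefficients themselves define the graph, which is what the Laplacian regularization in LRℓ1-Graph then smooths.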


Reference 2

Y. Yang, Z. Wang, J. Yang, J. Han, and T. Huang, “Regularized L1-Graph for Data Clustering”, In Proceedings of the British Machine Vision Conference (BMVC), 2014.


  • Building on the first work, rather than using a pre-defined graph, we jointly optimize the graph construction and the sparse coding. Our regularization term uses a graph Laplacian constructed from the (intermediate) sparse codes, which yields superior clustering performance compared to the previous pre-fixed Laplacian regularization.
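In schematic form (the notation below is ours, not verbatim from either paper; X is the data matrix, A the matrix of sparse codes, L a graph Laplacian), the fixed-Laplacian objective and its jointly optimized variant can be contrasted as:

```latex
% LR\ell^1-Graph: sparse coding regularized by a pre-fixed Laplacian L.
\min_{A:\;\operatorname{diag}(A)=0}
  \|X - XA\|_F^2 \;+\; \lambda\,\|A\|_1 \;+\; \gamma\,\operatorname{tr}\!\bigl(A\,L\,A^{\top}\bigr)

% Joint variant (sketch): the Laplacian is built from the intermediate
% codes themselves, L = L(A) with affinity W = (|A| + |A|^{\top})/2,
% and the algorithm alternates between updating A and rebuilding L(A).
```

The alternation is what "jointly optimize the graph construction and the sparse coding" amounts to: the graph used for regularization is no longer an external input but a function of the codes being learned.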


Reference 3

Z. Wang, Y. Yang, S. Chang, J. Li, S. Fong, and T. Huang, “A Joint Optimization Framework of Sparse Coding and Discriminative Clustering”, In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015.


  • Many clustering methods depend heavily on the extracted features. We propose a framework that jointly optimizes feature extraction and discriminative clustering: graph-regularized sparse codes serve as the features, and sparse coding is formulated as a constraint on the clustering. Two cost functions, based on the entropy-minimization and maximum-margin clustering principles respectively, serve as the objectives to be minimized. Solving this bi-level optimization mutually reinforces the sparse coding and clustering steps. Experiments on several benchmark datasets verify the substantial performance improvement brought by the proposed joint optimization.
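The bi-level structure can be sketched as follows (again in our own notation, not the paper's: D is a dictionary, w the clustering parameters, and ℓ a clustering loss such as the entropy or max-margin cost):

```latex
% Upper level: fit the clustering model on the learned features.
\min_{D,\,w}\;\; \sum_{i} \ell\bigl(a_i(D),\, w\bigr)
% Lower level: the features are themselves sparse codes of the data.
\quad\text{s.t.}\quad
a_i(D) \;=\; \arg\min_{a}\; \|x_i - D\,a\|_2^2 + \lambda\,\|a\|_1
```

Because the lower-level codes a_i(D) depend on the dictionary D, gradients of the clustering loss flow back into the feature extractor, which is why the two steps reinforce each other rather than being tuned independently.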



Zhangyang (Atlas) Wang