PHH-ICPR14

Deep Embedding Network for Clustering

The proposed deep embedding network

People

Peihao Huang
Yan Huang
Wei Wang
Liang Wang

Overview

Clustering is a fundamental technique widely used for exploring the inherent structure of data in pattern recognition and machine learning. Most existing methods, such as k-means and spectral clustering, focus on modeling the similarity/dissimilarity relationships among instances and neglect to extract more effective representations for clustering. In this paper, we propose a deep embedding network for representation learning that is better suited to clustering because it imposes two constraints on the learned representations. We first use a deep autoencoder to learn reduced representations from the raw data. To make the learned representations suitable for clustering, we impose a locality-preserving constraint on them, which aims to embed the original data into its underlying manifold space. Then, unlike spectral clustering, which extracts representations from the block-diagonal similarity matrix, we apply a group sparsity constraint to the learned representations, aiming to learn block-diagonal representations in which the nonzero groups correspond to the clusters. After obtaining the learned representations, we use k-means to cluster them. To evaluate the proposed deep embedding network, we compare its performance with k-means and spectral clustering on three commonly used datasets. The experiments demonstrate that the proposed method achieves promising performance.
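The two constraints can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a common form for each penalty: a similarity-weighted squared distance between the representations of neighboring samples for the locality-preserving term, and a sum of per-group L2 norms for the group sparsity term. The function names, the equal-sized consecutive groups, and the neighbor and weight lists are hypothetical choices made for illustration only.

```python
import math


def locality_penalty(reps, neighbors, weights):
    """Similarity-weighted squared distance between each sample's
    representation and those of its neighbors (assumed form).

    reps:      list of representation vectors, one per sample
    neighbors: neighbors[i] lists the indices of sample i's neighbors
    weights:   weights[i][k] is the similarity of sample i to neighbors[i][k]
    """
    total = 0.0
    for i, (nbrs, ws) in enumerate(zip(neighbors, weights)):
        for j, w in zip(nbrs, ws):
            total += w * sum((a - b) ** 2 for a, b in zip(reps[i], reps[j]))
    return total


def group_sparsity_penalty(h, group_size):
    """Sum of L2 norms over consecutive, equal-sized groups of units in h
    (assumed form). The penalty is small when only a few groups are active."""
    if len(h) % group_size != 0:
        raise ValueError("len(h) must be a multiple of group_size")
    total = 0.0
    for start in range(0, len(h), group_size):
        group = h[start:start + group_size]
        total += math.sqrt(sum(v * v for v in group))
    return total


# Toy data: three 2-D representations; samples 1 and 2 are neighbors of sample 0.
reps = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
neighbors = [[1], [0], [0]]
weights = [[1.0], [1.0], [0.5]]
locality = locality_penalty(reps, neighbors, weights)

# Same activations, concentrated in one group versus spread over two groups.
concentrated = group_sparsity_penalty([3.0, 4.0, 0.0, 0.0], group_size=2)
spread = group_sparsity_penalty([3.0, 0.0, 4.0, 0.0], group_size=2)
```

In this toy case the concentrated activation pattern receives a lower group sparsity penalty than the spread one, which is the behavior that pushes each sample toward a single active group. In a full training setup, terms like these would be added to the autoencoder's reconstruction loss with weighting coefficients, and k-means would then be run on the resulting representations.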

Paper

Deep Embedding Network for Clustering

Peihao Huang, Yan Huang, Wei Wang, Liang Wang

International Conference on Pattern Recognition (ICPR), 2014. Best Student Paper.

[PDF] [Poster]

Experimental Results

Representations learned from 5,850 samples of Yale-B. The horizontal axis represents the samples, and the vertical axis represents the groups in the learned representations. Blue lines separate groups of samples, and red lines separate groups in the learned representations.

Acknowledgments

This work is jointly supported by the National Natural Science Foundation of China (61175003, 61135002, 61202328), the Hundred Talents Program of CAS, and the National Basic Research Program of China (2012CB316300).