Original Research Article
HVGH: Unsupervised Segmentation for High-dimensional Time Series Using Deep Neural Compression and Statistical Generative Model
- 1 University of Electro-Communications, Japan
- 2 Osaka University, Japan
- 3 Institute of Statistical Mathematics (ISM), Japan
- 4 Ochanomizu University, Japan
Humans perceive continuous high-dimensional information by dividing it into meaningful segments, such as words and units of motion.
We believe that such unsupervised segmentation is also important for robots to learn topics such as language and motion.
To this end, we previously proposed a hierarchical Dirichlet process–Gaussian process–hidden semi-Markov model (HDP-GP-HSMM).
However, an important drawback of this model is that it cannot segment high-dimensional time-series data directly.
Instead, low-dimensional features must be extracted in advance.
Segmentation performance then depends largely on the design of these features, and effective features are difficult to design, especially for high-dimensional data.
To overcome this problem, this study proposes a hierarchical Dirichlet process–variational autoencoder–Gaussian process–hidden semi-Markov model (HVGH).
The parameters of the proposed HVGH are estimated through a mutual learning loop of the variational autoencoder and our previously proposed HDP-GP-HSMM.
Hence, HVGH can extract features from high-dimensional time-series data while simultaneously dividing it into segments in an unsupervised manner.
In an experiment, we used various motion-capture data to demonstrate that our proposed model estimates the correct number of classes and produces more accurate segments than baseline methods.
Moreover, we show that the proposed method can learn a latent space suitable for segmentation.
Keywords: motion segmentation, Gaussian process, hidden semi-Markov model, motion capture data, high-dimensional time-series data, variational autoencoder
Received: 31 Mar 2019;
Accepted: 22 Oct 2019.
Copyright: © 2019 Nagano, Nakamura, Nagai, Mochihashi, Kobayashi and Takano. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Mr. Masatoshi Nagano, University of Electro-Communications, Chofu, Japan, email@example.com