An Open-Source Library of 2D-GMM-HMM Based on Kaldi Toolkit and Its Application to Handwritten Chinese Character Recognition

2D-GMM-HMM System

Abstract

Kaldi, an open-source toolkit built on the 1D-HMM framework, is widely used in many signal processing tasks. However, for data with complex spatial structure, e.g. in image-related tasks, a 2D-HMM is more suitable because it allows free transitions between hidden states in both the horizontal and vertical directions. Although the 2D-HMM framework was proposed years ago, its complexity has left further research without an efficient open-source toolkit. In this paper, we present a highly efficient code library for 2D-GMM-HMM based on the Kaldi toolkit, together with implementation details. To demonstrate its effectiveness, we apply 2D-GMM-HMM to the handwritten Chinese character recognition (HCCR) task. Experiments on a 50-class HCCR task show that the 2D-GMM-HMM system has clear advantages over the 1D-GMM-HMM system in terms of recognition accuracy and modeling precision. Moreover, visual analysis shows that 2D-GMM-HMM can segment Chinese characters well into basic components such as radicals via the hidden states in both the horizontal and vertical directions, whereas 1D-GMM-HMM can only perform the segmentation in the horizontal direction. The project code of the 2D-GMM-HMM library and its HCCR recipe are publicly available at https://github.com/jfmaUSTC/2DHMM.

Publication
In International Conference on Image and Graphics 2021