HRDoc: Dataset and Baseline Method Toward Hierarchical Reconstruction of Document Structures

We built a large-scale dataset named HRDoc, which consists of 2,500 multi-page documents with nearly 2 million semantic units. Moreover, we proposed an encoder-decoder-based hierarchical document structure parsing system (DSPS) to tackle the document structure reconstruction task. Code and dataset are available at https://github.com/jfma-USTC/HRDoc.

Jiefeng Ma, Jun Du, Pengfei Hu, Zhenrong Zhang, et al.

Query-driven Generative Network for Document Information Extraction in the Wild

In contrast to existing studies, which are mainly tailored to documents with known templates and predefined layouts and keys, we aim to build a more practical document information extraction (DIE) paradigm for real-world scenarios, covering the zero-shot setting and cases with problematic OCR results.

Haoyu Cao, Jiefeng Ma, et al.

GMN: Generative Multi-modal Network for Practical Document Information Extraction

This paper proposes the Generative Multi-modal Network (GMN), a robust multi-modal generation method for real-world scenarios that requires no predefined label categories. GMN can handle complex documents that are hard to serialize into sequential order, and it tolerates errors in OCR results, which means it requires no character-level annotation.

Haoyu Cao, Jiefeng Ma, et al.

An Open-Source Library of 2D-GMM-HMM Based on Kaldi Toolkit and Its Application to Handwritten Chinese Character Recognition

We present a highly efficient code library of 2D-GMM-HMM based on the Kaldi toolkit and apply it to the Handwritten Chinese Character Recognition (HCCR) task. The visual analysis shows that 2D-GMM-HMM can segment Chinese characters well into basic components such as radicals via the hidden states in both the horizontal and vertical directions.

Jiefeng Ma, Zirui Wang, Jun Du
