Speech2Stroke: Generate Chinese Character Strokes Directly from Speech

Yinhui Zhang; Wei Xi; Zhao Yang; Sitao Men; Rui Jiang; Yuxin Yang; Jizhong Zhao

Collaborative Computing: Networking, Applications and Worksharing. 16th EAI International Conference, CollaborateCom 2020, Shanghai, China, October 16–18, 2020, Proceedings, Part I

Research Article


Cite (BibTeX):

@INPROCEEDINGS{10.1007/978-3-030-67537-0_6,
    author={Yinhui Zhang and Wei Xi and Zhao Yang and Sitao Men and Rui Jiang and Yuxin Yang and Jizhong Zhao},
    title={Speech2Stroke: Generate Chinese Character Strokes Directly from Speech},
    booktitle={Collaborative Computing: Networking, Applications and Worksharing. 16th EAI International Conference, CollaborateCom 2020, Shanghai, China, October 16--18, 2020, Proceedings, Part I},
    proceedings_a={COLLABORATECOM},
    year={2021},
    month={1},
    keywords={Deep learning, Stroke of Chinese character, Pictographic word},
    doi={10.1007/978-3-030-67537-0_6}
}

Yinhui Zhang, Wei Xi, Zhao Yang, Sitao Men, Rui Jiang, Yuxin Yang, Jizhong Zhao. Year: 2021. Speech2Stroke: Generate Chinese Character Strokes Directly from Speech. COLLABORATECOM. Springer. DOI: 10.1007/978-3-030-67537-0_6

Yinhui Zhang¹,*, Wei Xi¹, Zhao Yang¹, Sitao Men¹, Rui Jiang¹, Yuxin Yang¹, Jizhong Zhao¹

1: School of Computer Science and Technology

*Contact email: manli0826@gmail.com

Abstract

A Chinese character is composed of a spatial arrangement of strokes. Some of these strokes combine to form the phonetic component, which provides a clue to the pronunciation of the entire character; the others combine to form the semantic component, which carries semantic-level information about the speech context. How close is the connection between the internal strokes of Chinese characters and speech? In this paper, we propose Speech2Stroke, an end-to-end model that exploits the phonetic and morphological information of pictographic words. Specifically, Speech2Stroke generates strokes directly from speech. We evaluate its performance with the stroke error rate (SER); the best model achieves an SER of 20.61%. Through experiments and analysis, we show that the model can capture the alignment between audio and the internal structure of pictographic characters.
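The abstract does not define how SER is computed. As a rough illustration only, the sketch below assumes SER is analogous to the character error rate used in speech recognition: the Levenshtein (edit) distance between the predicted and reference stroke sequences, divided by the total number of reference strokes. The function names and the stroke labels (`heng`, `shu`, `pie`, `na`) are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two stroke sequences."""
    # prev[j] holds the distance between the first i-1 reference strokes
    # and the first j hypothesis strokes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference stroke missing from hypothesis
                curr[j - 1] + 1,     # extra stroke in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def stroke_error_rate(refs: Sequence[Sequence[str]],
                      hyps: Sequence[Sequence[str]]) -> float:
    """Assumed SER: total edit distance divided by total reference strokes."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of sequences")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("references must contain at least one stroke")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Hypothetical example: one reference stroke is missing from the prediction.
refs = [["heng", "shu", "pie", "na"]]
hyps = [["heng", "shu", "na"]]
ser = stroke_error_rate(refs, hyps)
```

Under this assumed definition, the reported 20.61% would correspond to roughly one stroke-level edit for every five reference strokes; the paper's own definition may differ.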

Keywords: Deep learning, Stroke of Chinese character, Pictographic word

Published: 2021-01-22
Appears in: SpringerLink

http://dx.doi.org/10.1007/978-3-030-67537-0_6
