Research Article
Research on the Dependency Distance of Quantifiers in Modern Chinese Based on Big Data Text Taking the Dot Quantifier “粒(particle)” as an Example
@INPROCEEDINGS{10.4108/eai.6-1-2023.2330293,
  author={Linlin Zou},
  title={Research on the Dependency Distance of Quantifiers in Modern Chinese Based on Big Data Text Taking the Dot Quantifier “粒(particle)” as an Example},
  proceedings={Proceedings of the 2nd International Conference on Big Data Economy and Digital Management, BDEDM 2023, January 6-8, 2023, Changsha, China},
  publisher={EAI},
  proceedings_a={BDEDM},
  year={2023},
  month={6},
  keywords={chinese dependency distance quantifiers 粒(particle) big data},
  doi={10.4108/eai.6-1-2023.2330293}
}
- Linlin Zou
Year: 2023
BDEDM
EAI
DOI: 10.4108/eai.6-1-2023.2330293
Abstract
In the era of information explosion, quantifiers were often overlooked in earlier natural language processing (NLP) work. However, as NLP systems have matured, the importance of quantifiers in NLP, information retrieval, Chinese teaching, and corpus construction has become increasingly prominent. Analyzing the semantic categories of the nominal structures that occur with “粒(particle)” shows that category b (specific affairs) has the highest proportion, accounting for 74.98%; subclass b4 (materials) accounts for 26.38% of category b. Category f (social activities) is the rarest, appearing only twice, so its proportion is negligible. The material subclass of the specific-affairs class is therefore often used in combination with “粒”. Liu reports that the mean dependency distance of Chinese is 3.662, so the average number of words that Chinese users need to hold in memory when processing a sentence is about 3. Calculating over the entries in the corpus, we find that the distance between “粒(particle)” and the subject or topic of the sentence is about 3.217.
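To make the measure concrete, the following is a minimal sketch of how a mean dependency distance can be computed, assuming the common definition in which the distance of a dependency link is the absolute difference between the linear positions of the dependent and its head, averaged over all non-root links. The function name, the example sentence, and its parse are hypothetical illustrations and are not taken from the paper's corpus.

```python
from typing import List, Tuple


def mean_dependency_distance(links: List[Tuple[int, int]]) -> float:
    """Average |dependent position - head position| over dependency links.

    `links` holds (dependent_position, head_position) pairs with 1-based
    word positions; the root word has no head and is left out.
    """
    if not links:
        raise ValueError("at least one dependency link is required")
    return sum(abs(dep - head) for dep, head in links) / len(links)


# Hypothetical parse (an assumption for illustration only):
# 他(1) 吃(2) 了(3) 三(4) 粒(5) 药(6), with 吃 as the root.
example_links = [
    (1, 2),  # 他 -> 吃
    (3, 2),  # 了 -> 吃
    (4, 5),  # 三 -> 粒
    (5, 6),  # 粒 -> 药
    (6, 2),  # 药 -> 吃
]

mdd = mean_dependency_distance(example_links)
print(f"mean dependency distance: {mdd:.3f}")
```

In this toy sentence the quantifier “粒” sits next to both its numeral and its noun, so the only long link is the one from the noun back to the verb. A corpus-level figure such as those quoted in the abstract would be obtained by averaging over many parsed sentences rather than one.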