Chinese Notional Word Discrimination Based on RoBERTa-ND

doi:10.15888/j.cnki.csa.009099

Volume 32, Issue 5, 2023, pp. 157-163

Authors:
SUN Chen-Yu, WANG Zhen-Qi, ZHANG Bao-Yu, ZHANG Wei-Shan, HOU Zhao-Xiang, CHEN Tao

Affiliation:
College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China

Abstract:

Chinese notional words are combinatorial and metaphorical in nature, and datasets for Chinese notional word discrimination are lacking. As a result, traditional methods remain limited in their ability to understand and discriminate Chinese notional words in machine reading comprehension tasks. To address this, a large-scale (600k) Chinese notional word discrimination cloze dataset (CND) is constructed. In each sample, a notional word in a sentence is replaced with a blank placeholder, and the correct answer must be selected from two candidate notional words. A baseline model, the RoBERTa-based notional word discrimination model (RoBERTa-ND), is designed to select among the candidates. The model first extracts semantic information from the context using a pre-trained language model. It then fuses the semantics of the candidate notional words and computes a score for each candidate through a classification task. Finally, the model's ability to discriminate Chinese notional words is further improved by strengthening its perception of position and direction information. Experiments show that the model achieves an accuracy of 90.21% on CND, outperforming mainstream cloze test models such as DUMA (87.59%) and GNN-QA (84.23%). This work fills a gap in research on Chinese metaphorical semantic understanding and has practical value for improving the cognitive ability of Chinese question-answering bots. The code for CND and RoBERTa-ND is open source: https://github.com/2572926348/CND-Large-scale-Chinese-National-word-discrimination-dataset.
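The cloze format described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code or the actual CND file format: the `[BLANK]` marker, the `ClozeSample` fields, the helper names, and the example sentence are all assumptions made for illustration. The sketch only shows how a two-candidate sample could be built and how accuracy over such samples could be computed for any scoring function.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

BLANK = "[BLANK]"  # assumed placeholder; the dataset's real marker may differ


@dataclass(frozen=True)
class ClozeSample:
    """Hypothetical in-memory form of one two-candidate cloze item."""
    text: str                    # sentence with one notional word blanked out
    candidates: Tuple[str, str]  # the two candidate notional words
    label: int                   # index (0 or 1) of the correct candidate


def make_sample(sentence: str, target: str, distractor: str,
                target_first: bool = True) -> ClozeSample:
    """Replace the single occurrence of `target` with the blank placeholder."""
    if sentence.count(target) != 1:
        raise ValueError("target must occur exactly once in the sentence")
    text = sentence.replace(target, BLANK)
    if target_first:
        return ClozeSample(text, (target, distractor), 0)
    return ClozeSample(text, (distractor, target), 1)


# A scorer maps (blanked sentence, candidate word) to a number; higher is better.
Scorer = Callable[[str, str], float]


def predict(sample: ClozeSample, score: Scorer) -> int:
    """Return the index of the higher-scoring candidate (ties go to index 0)."""
    first, second = (score(sample.text, c) for c in sample.candidates)
    return 0 if first >= second else 1


def accuracy(samples: Sequence[ClozeSample], score: Scorer) -> float:
    """Fraction of samples for which the predicted index equals the label."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(predict(s, score) == s.label for s in samples) / len(samples)


if __name__ == "__main__":
    # Illustrative sentence only; it is not taken from CND.
    sample = make_sample("他的态度十分坚决。", "坚决", "坚强")
    print(sample.text, sample.candidates, sample.label)

    # Stand-in scorer that happens to know the answer; a real system would
    # replace this with scores produced by a trained model.
    def oracle(text: str, candidate: str) -> float:
        return 1.0 if candidate == "坚决" else 0.0

    print(accuracy([sample], oracle))
```

In this framing, a model such as the one described above would play the role of the `score` function, and the accuracy figures reported in the abstract would correspond to `accuracy` evaluated over a held-out portion of the dataset.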

Key words: metaphorical semantic understanding; Chinese notional word discrimination; machine reading comprehension

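Citation:

SUN Chen-Yu, WANG Zhen-Qi, ZHANG Bao-Yu, ZHANG Wei-Shan, HOU Zhao-Xiang, CHEN Tao. Chinese Notional Word Discrimination Based on RoBERTa-ND. Computer Systems & Applications, 2023, 32(5): 157-163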

History

Received: November 3, 2022
Revised: December 10, 2022
Online: March 17, 2023
