Design of Multilingual Transliteration Software Based on Rules (基于规则的多语种音译软件设计)

Authors: Liang Jianfei, Wang Junyuan, Liu Min
    Abstract:

    To obtain transliteration results quickly, this study develops rule-based multilingual transliteration software modeled on the manual transliteration process. Because the algorithm and the rules are designed independently of each other, the software can serve the transliteration needs of multiple languages. The complete process consists of four steps: word preprocessing, letter recognition and segmentation, letter recombination and positioning, and rule table lookup. In the letter recombination step, the study proposes a method for determining the best syllable division, which effectively reduces syllable division errors and thereby safeguards the quality of the final transliteration results. In grouped experiments on English, Roman, and Russian, manually checked transliteration accuracy exceeded 95%.
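
The abstract names the four steps but does not give the rule tables or the syllable-division criterion. The following is therefore only a minimal sketch under stated assumptions: `RULES` is a hypothetical toy table (not the paper's rule table), letter segmentation and recombination are collapsed into one function, and "best division" is approximated as the split with the fewest table syllables, which is a stand-in rather than the authors' method. Keeping the table as plain data, separate from the functions, mirrors the stated idea of designing algorithm and rules independently.

```python
import re
from typing import List, Optional

# Hypothetical toy rule table (syllable -> Chinese characters).
# Illustrative only; these are NOT the rules used in the paper.
RULES = {"ma": "马", "ri": "里", "na": "娜", "m": "姆", "a": "阿", "n": "恩"}


def preprocess(word: str) -> str:
    """Step 1: lowercase and drop everything except basic Latin letters."""
    return re.sub(r"[^a-z]", "", word.lower())


def best_division(word: str) -> Optional[List[str]]:
    """Steps 2-3 (collapsed): split the word into syllables found in RULES.

    Assumed criterion: among all valid splits, keep the one with the
    fewest syllables. Returns None if no split is covered by the table.
    """
    n = len(word)
    # best[i] holds the best split of word[:i], or None if none exists.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prev = best[start]
            piece = word[start:end]
            if prev is None or piece not in RULES:
                continue
            candidate = prev + [piece]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[n]


def transliterate(word: str) -> Optional[str]:
    """Step 4: look up each syllable in the rule table and join the results."""
    syllables = best_division(preprocess(word))
    if syllables is None:
        return None
    return "".join(RULES[s] for s in syllables)


if __name__ == "__main__":
    print(best_division("marina"))
    print(transliterate("Marina"))
```

With this toy table, "marina" can be split as ma-ri-na or as m-a-ri-na; the fewest-syllables criterion selects the former. Supporting another language in this sketch would mean supplying a different table while leaving the functions unchanged.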

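Cite this article

Liang Jianfei, Wang Junyuan, Liu Min. Design of Multilingual Transliteration Software Based on Rules. 计算机系统应用 (Computer Systems & Applications), 2018, 27(9): 268-272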

History
  • Received: 2018-01-31
  • Last revised: 2018-02-27
  • Published online: 2018-08-17