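Cascade Model for Semantic Similarity of Concepts Based on Petroleum Ontology
Authors: Zhao Guoliang, Gong Faming
Funding:

Special Project on Innovation Methods of the Ministry of Science and Technology (2015IM010300); Open Project of a Beijing Key Laboratory (BKBD-2017KF07)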
    Abstract:

    This paper proposes a cascade model for computing the similarity between concepts in an ontology. In the first stage, a distance-based (path-based) semantic similarity method computes a path score for each concept pair in the ontology. In the second stage, an Information Content (IC) based algorithm computes a more precise similarity score for the concept pair; the algorithm is extended to take into account the common parent set and the common descendant set of the two concepts. In the third stage, a feature integration strategy combines all similarity scores into a feature vector that describes each concept pair, with weights balancing the first-stage and second-stage scores. Finally, a BP (back-propagation) neural network determines the similarity of the two concepts. The model is evaluated on the task of semantic similarity estimation and compared with existing methods. Experimental results show that the proposed method improves the accuracy and rigor of similarity computation.
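The abstract does not give the formulas used in each stage, so the following is only a minimal sketch of how the first three stages might fit together. It assumes a tree-shaped toy taxonomy with hypothetical concept names, and it substitutes well-known stand-ins for the paper's own measures: an inverse edge-count score for the path stage and a Lin-style ratio over an intrinsic (descendant-count) IC for the IC stage. The `weight` parameter and all function names are illustrative assumptions, and the final BP neural network, which would be trained on such feature vectors, is omitted.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "hydrocarbon": "entity",
    "equipment": "entity",
    "crude_oil": "hydrocarbon",
    "natural_gas": "hydrocarbon",
    "drill_bit": "equipment",
}


def ancestors(concept):
    """Return the chain from the concept itself up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def lowest_common_ancestor(a, b):
    """First concept on a's chain that also lies on b's chain."""
    b_chain = set(ancestors(b))
    return next(c for c in ancestors(a) if c in b_chain)


def path_similarity(a, b):
    """Stage 1 stand-in: 1 / (1 + number of edges between a and b)."""
    lca = lowest_common_ancestor(a, b)
    edges = ancestors(a).index(lca) + ancestors(b).index(lca)
    return 1.0 / (1.0 + edges)


def intrinsic_ic(concept):
    """Intrinsic IC from descendant counts: 1 - log(desc + 1) / log(N)."""
    descendants = sum(
        1 for c in PARENT if c != concept and concept in ancestors(c)
    )
    return 1.0 - math.log(descendants + 1) / math.log(len(PARENT))


def ic_similarity(a, b):
    """Stage 2 stand-in: Lin-style ratio 2 * IC(lca) / (IC(a) + IC(b))."""
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    if denom == 0.0:
        # Only the root has IC 0 here, so this covers root-vs-root.
        return 1.0 if a == b else 0.0
    return 2.0 * intrinsic_ic(lowest_common_ancestor(a, b)) / denom


def feature_vector(a, b, weight=0.5):
    """Stage 3 stand-in: weighted stage scores for one concept pair."""
    return [weight * path_similarity(a, b), (1.0 - weight) * ic_similarity(a, b)]
```

Calling `feature_vector("crude_oil", "natural_gas")` yields a two-element vector in which sibling concepts score higher on both components than a pair such as `"crude_oil"` and `"drill_bit"`, whose only common ancestor is the root. In a full cascade, vectors like this would serve as the inputs to the learned final stage.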

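Cite this article

Zhao GL, Gong FM. Cascade model for semantic similarity of concept based on petroleum ontology. Computer Systems & Applications (计算机系统应用), 2018, 27(7): 182-187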

History
  • Received: 2017-11-30
  • Last revised: 2017-12-21
  • Published online: 2018-06-27