Application of Multi-task Learning in Hate-speech and Individual Characteristics Detection
Authors: Xiao Bojian, Cao Zhanmao, Xu Lifen
    Abstract:

    Multi-task learning is widely used in natural language processing, but multi-task models tend to be sensitive to how closely their tasks are related. If the tasks are only weakly related, or if information is transferred between them inappropriately, task performance can degrade severely. This study proposes a new multi-task learning model with a shared-private structure, the BERT-BiLSTM multi-task learning model (BB-MTL), and, drawing on ideas from meta-learning, designs a dedicated parameter optimization scheme for it, meta-learning-like train methods (MLL-TM). It further introduces a new information fusion gate, the Softmax weighted linear gate (SoWLG), which selectively fuses the shared and private features of each task. Because users' online behavior is closely related to their individual characteristics, the proposed method is validated through a series of experiments that combine hate-speech detection, personality detection, and emotion detection. The results show that BB-MTL effectively learns feature information from related tasks, reaching accuracies of 81.56%, 77.09%, and 70.82% on the three tasks, respectively.
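
The abstract names a gate that selectively fuses shared and private features but does not give its equations. The sketch below is only one plausible reading, not the authors' SoWLG: it assumes that two scalar scores are computed from the concatenated shared and private feature vectors by a linear map, and that a softmax over those two scores yields the mixing weights. The class name `SoftmaxFusionGate`, its methods, and the random initialization are hypothetical and chosen purely for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SoftmaxFusionGate:
    """Hypothetical gate that mixes a shared and a private feature vector.

    Assumption (not taken from the paper): one scalar score per source is
    computed from the concatenated features, and the softmax of the two
    scores gives the mixing weights.
    """

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # One weight row per source (shared, private), each of length 2 * dim.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(2 * dim)] for _ in range(2)
        ]
        self.bias = [0.0, 0.0]

    def mixing_weights(self, shared, private):
        """Return [weight_shared, weight_private]; the two sum to 1."""
        if len(shared) != self.dim or len(private) != self.dim:
            raise ValueError("both feature vectors must have length dim")
        joint = list(shared) + list(private)
        scores = [
            sum(w * x for w, x in zip(row, joint)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return softmax(scores)

    def fuse(self, shared, private):
        """Convex combination of the two feature vectors, element by element."""
        w_shared, w_private = self.mixing_weights(shared, private)
        return [w_shared * s + w_private * p for s, p in zip(shared, private)]


gate = SoftmaxFusionGate(dim=3)
fused = gate.fuse([1.0, 0.0, 2.0], [0.0, 1.0, 2.0])
print(fused)
```

Because the two weights are positive and sum to 1, every fused element lies between the corresponding shared and private values. In a trained model the weight rows would be learned per task, so each task could lean more on shared or on private features; that behavior is part of the assumption made here, not something the abstract specifies.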

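Cite this article

Xiao Bojian, Cao Zhanmao, Xu Lifen. Application of multi-task learning in hate-speech and individual characteristics detection. Computer Systems & Applications, 2024, 33(7): 74-83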

History
  • Received: 2024-01-08
  • Last revised: 2024-02-04
  • Published online: 2024-05-31