Multi-attribute Controllable Text Summary Model Based on Pointer Generator Network and Extended Transformer
    Abstract:

    Controllable text summarization models can generate summaries that conform to user preferences. Previous models focus on controlling a single attribute in isolation rather than a combination of attributes. When multiple control attributes must be satisfied simultaneously, the traditional Seq2Seq multi-attribute controllable summarization model cannot integrate all control attributes, accurately reproduce key information from the source text, or handle words outside its vocabulary. This study therefore proposes a model based on an extended Transformer and a pointer-generator network (PGN). The extended Transformer expands the single-encoder/single-decoder Transformer into a dual-encoder form that extracts semantic information from two text streams, paired with a single decoder that fuses guidance-signal features. The PGN component then either copies words directly from the source text or generates new words from the vocabulary, alleviating the out-of-vocabulary (OOV) problem common in summarization tasks. Additionally, to encode position information efficiently, the model adopts relative position representations in the attention layers to introduce the sequence order of the text. The model can control several important summary attributes, including length, topic, and specificity. Experiments on the public MACSum dataset show that, compared with previous methods, the proposed model better preserves summary quality while conforming more closely to the attribute requirements given by users.
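    The abstract does not spell out how the single decoder fuses the two encoder outputs, but a dual-encoder/single-decoder layout is commonly realized by letting each decoder layer cross-attend to the source-text memory and the guidance-signal memory in sequence (in the spirit of GSum-style guided summarization). The sketch below illustrates that idea only; all function names are hypothetical, and layer norm and feed-forward sublayers are omitted:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(queries, memory):
    """Plain single-head scaled dot-product cross-attention.
    For brevity the memory serves as both keys and values."""
    d_k = queries.shape[-1]
    return softmax(queries @ memory.T / np.sqrt(d_k)) @ memory

def dual_encoder_decoder_layer(dec_states, src_memory, guide_memory):
    """One decoder layer fusing two encoder memories: first attend to the
    source-text encoder output, then to the guidance-signal encoder output,
    with residual connections (layer norm / feed-forward omitted)."""
    x = dec_states + cross_attention(dec_states, src_memory)  # source fusion
    x = x + cross_attention(x, guide_memory)                  # guidance fusion
    return x

# Toy shapes: 3 decoder positions, 6 source tokens, 4 guidance tokens, d_k = 8.
rng = np.random.default_rng(1)
dec = rng.standard_normal((3, 8))
src = rng.standard_normal((6, 8))
guide = rng.standard_normal((4, 8))
out = dual_encoder_decoder_layer(dec, src, guide)
```

Attending to the guidance memory after the source memory lets the guidance signal re-weight an already source-grounded representation, which is one plausible reading of "fusing guidance-signal features" in the abstract.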
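    The PGN copy/generate mixture described above can be sketched in a few lines of NumPy. This is an illustrative sketch of the standard pointer-generator mechanism (See et al.), not the paper's implementation; `pgn_final_distribution` and its arguments are made-up names. At each decoding step the final distribution is p_gen times the vocabulary distribution plus (1 − p_gen) times the attention-based copy distribution, over an extended vocabulary that gives each source OOV word its own id:

```python
import numpy as np

def pgn_final_distribution(p_vocab, attention, source_ids, p_gen, vocab_size):
    """Mix a generator's vocabulary distribution with a copy distribution
    over source positions, as in pointer-generator networks.

    p_vocab    : (vocab_size,) softmax over the fixed vocabulary
    attention  : (src_len,)    attention weights over source positions
    source_ids : (src_len,)    extended-vocabulary id of each source token;
                               OOV source words get ids >= vocab_size
    p_gen      : float in [0, 1], probability of generating vs. copying
    """
    extended_size = max(vocab_size, max(source_ids) + 1)
    final = np.zeros(extended_size)
    final[:vocab_size] = p_gen * p_vocab            # generate from vocabulary
    for pos, tok_id in enumerate(source_ids):       # copy from the source text
        final[tok_id] += (1.0 - p_gen) * attention[pos]
    return final

# Toy step: a 5-word vocabulary; the source holds one in-vocab token (id 2)
# and one OOV token assigned extended id 5.
p_vocab = np.full(5, 0.2)
final = pgn_final_distribution(p_vocab, np.array([0.5, 0.5]), [2, 5], 0.6, 5)
```

Because the OOV source word receives its own slot (id 5) with probability mass (1 − p_gen) · attention, the model can emit it verbatim instead of producing an unknown-word token, which is how the PGN sidesteps the OOV problem the abstract mentions.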
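    The relative position representations mentioned above (Shaw et al., 2018) add a learned embedding of the clipped offset j − i to each attention score, so the model sees token order without absolute position embeddings. A minimal single-head sketch, with illustrative names and a deliberately naive double loop; setting the relative embeddings to zero recovers plain scaled dot-product attention:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def relative_attention(Q, K, V, rel_k, max_dist):
    """Single-head attention with relative position representations:
    the score for pair (i, j) gets an extra term Q[i] . rel_k[offset],
    where offset is j - i clipped to [-max_dist, max_dist].

    Q, K, V : (n, d_k) query/key/value matrices
    rel_k   : (2 * max_dist + 1, d_k) embeddings of the clipped offsets
    """
    n, d_k = Q.shape
    scores = Q @ K.T
    for i in range(n):
        for j in range(n):
            offset = int(np.clip(j - i, -max_dist, max_dist)) + max_dist
            scores[i, j] += Q[i] @ rel_k[offset]
    return softmax(scores / np.sqrt(d_k)) @ V

# Sanity check: all-zero relative embeddings reduce to vanilla attention.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = relative_attention(Q, K, V, np.zeros((5, 8)), max_dist=2)
plain = softmax(Q @ K.T / np.sqrt(8)) @ V
```

Clipping offsets to a fixed window keeps the embedding table small and lets the model generalize to sequence lengths not seen in training, which is why the abstract calls this an efficient way to encode position information.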

Get Citation

Xian GM, Li FL, Zheng ZM. Multi-attribute controllable text summary model based on pointer generator network and extended Transformer. Computer Systems & Applications, 2024, 33(4): 246–253 (in Chinese).

History
  • Received: September 28, 2023
  • Revised: November 09, 2023
  • Online: January 30, 2024