Fine-grained User Intention Understanding for Mobile Applications Based on Multi-modality Fusion
Authors: Zhang Yihan, Hong Geng, Yang Zhemin
Funding: Special Project of the Ministry of Industry and Information Technology (TC220H079)

    Abstract:

    As mobile applications grow more complex, existing privacy leak detection methods based on user intent face greater challenges. On the one hand, traditional privacy leak detection relies on app-level user intent and only checks whether an application's privacy collection behavior matches its core functional requirements. This approach does not suit today's mobile applications, which offer broad functionality and serve diverse user intents, so a finer-grained classification of user intent is needed. On the other hand, current research mostly evaluates whether the privacy collection behaviors triggered by interface widgets, such as icons, are consistent with user intent. However, improperly designed and misused icons are very common, which limits the effectiveness of privacy risk assessments that rely solely on widget-level user intent; the intent of the user interface as a whole still needs to be understood. To address these issues, this study first extracts and summarizes, from Chinese privacy policies, a list of common fine-grained user intents suitable for privacy compliance judgments. Then, drawing on the design characteristics of mobile application interfaces, it designs and implements a multi-class classification model with multi-modal feature fusion to identify the user intent reflected by an entire mobile interface. Evaluation results show that the privacy policy intent extraction tool achieves 83% in both precision and recall, and the user intent recognition tool achieves 80% precision and 83% recall, demonstrating good detection effectiveness and practical usability.
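
The abstract names two technical ingredients: multi-modal feature fusion feeding a multi-class classifier, and evaluation by precision and recall. The sketch below illustrates both in a minimal form. It is not the authors' model: concatenation as the fusion strategy, the linear softmax classifier, macro averaging of the metrics, and every function name, weight, and label are assumptions made for illustration only.

```python
import math

def fuse(text_vec, image_vec):
    """Feature-level fusion by concatenation (an assumed strategy, not the paper's)."""
    return list(text_vec) + list(image_vec)

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(fused, weights, biases):
    """Hypothetical linear layer plus softmax; returns the index of the top class."""
    scores = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

def macro_precision_recall(y_true, y_pred, labels):
    """Per-class precision and recall, averaged over classes (macro averaging is assumed)."""
    precisions, recalls = [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n
```

In use, `fuse` would take one embedding per modality (for example, on-screen text and a screenshot), `classify` would map the fused vector to one intent class, and `macro_precision_recall` would compare predicted intents with labeled ones across a test set.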

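Cite this article

Zhang Yihan, Hong Geng, Yang Zhemin. Fine-grained User Intention Understanding for Mobile Applications Based on Multi-modality Fusion. Computer Systems & Applications (计算机系统应用), 2024, 33(11): 209-223
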
History
  • Received: 2024-04-07
  • Last revised: 2024-05-06
  • Published online: 2024-09-24