Computer Systems & Applications, 2024, 33(2): 62-71
Optimized Global Symbol Relocations in SW26010Pro Processors
(1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China; 2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 3. Zhejiang Lab, Hangzhou 311121, China; 4. National Supercomputing Center in Wuxi, Wuxi 214000, China)
Received: July 16, 2023    Revised: August 21, 2023
Abstract: On Shenwei heterogeneous many-core processors, main-memory accesses from the computing cores have very high latency, so code running on the computing cores should avoid them wherever possible. The global offset table (GOT) stores the addresses of a program's global variables and functions. It is a poor fit for the scarce local storage of the computing cores, and its access pattern is usually scattered, which also makes it a poor candidate for cache prefetching. The main-memory accesses introduced by GOT lookups therefore have a considerable impact on program performance. For both the static-linking and dynamic-linking scenarios of heterogeneous many-core programs, this paper analyzes the limitations of linker relaxation optimization and implements a global symbol relocation optimization based on "gp base address + extended offset" that avoids these main-memory accesses. Experimental results show that, at the cost of a small increase in code size, the method effectively avoids the main-memory accesses introduced by GOT lookups when computing-core code calls functions and accesses global variables, improving the performance of many-core programs.
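The abstract does not give the instruction sequence behind "gp base address + extended offset", so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical encoding in which the offset from gp is split into two signed 16-bit immediates (a high part and a low part), so that a symbol address can be formed with register arithmetic instead of a load from the GOT. All function names and the 16-bit immediate width are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def split_gp_offset(offset: int) -> Tuple[int, int]:
    """Split a gp-relative offset into (hi, lo) signed 16-bit parts.

    The parts satisfy (hi << 16) + lo == offset. The 16-bit width is an
    assumption for this sketch, not a documented SW26010Pro property.
    """
    # Sign-extend the low 16 bits of the offset.
    lo = ((offset & 0xFFFF) ^ 0x8000) - 0x8000
    # The high part absorbs the borrow caused by a negative low part.
    hi = (offset - lo) >> 16
    if not -0x8000 <= hi <= 0x7FFF:
        raise ValueError("offset does not fit in a hi/lo 16-bit pair")
    return hi, lo


def address_via_got(got: Dict[str, int], loads: List[str], symbol: str) -> int:
    """Conventional path: the address is read from the GOT (a memory load)."""
    loads.append(symbol)  # record the memory access this path requires
    return got[symbol]


def address_via_gp(gp: int, offset: int) -> int:
    """Sketched optimized path: gp plus a two-part offset, with no GOT load."""
    hi, lo = split_gp_offset(offset)
    return gp + (hi << 16) + lo


# Hypothetical layout: one global symbol placed at a fixed distance from gp.
gp = 0x40000000
got = {"global_var": gp + 0x12348000}
loads: List[str] = []

via_got = address_via_got(got, loads, "global_var")
via_gp = address_via_gp(gp, got["global_var"] - gp)
```

Both paths produce the same address in this model, but only the GOT path records a memory access. In a real toolchain the offset would be filled in by the linker through relocations, and symbols too far from gp would presumably have to fall back to the GOT.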
Funding: National Key Research and Development Program of China (2020YFB0204602)
Citation:
QIAN Hong, WANG Fei, LIU Sha, ZHENG Tian-Yu, SONG Jia-Wei, AN Hong. Optimized Global Symbol Relocations in SW26010Pro Processors. Computer Systems & Applications, 2024, 33(2): 62-71