Abstract: The RISC-V software ecosystem is developing rapidly. The international open-source community contributes actively, focusing on adaptation and optimization for RISC-V and driving its software ecosystem forward. PyTorch, an open-source Python machine learning library, offers significant advantages in performance, its open-source ecosystem, and research use. It provides strong support for platforms such as x86, ARM, PowerPC, and CUDA. On RISC-V, however, software porting has so far focused mainly on adapting to the standard instruction set and has not yet fully exploited the RISC-V extended instruction sets for optimization, which leaves a significant gap between the RISC-V software ecosystem and mature ecosystems such as ARM and x86. In particular, PyTorch lacks support for the RISC-V V extension (RVV), which results in a considerable gap in inference performance between RISC-V platforms and ARM platforms of similar specifications. To address this issue, this study proposes an efficient development scheme for PyTorch on RVV 1.0 and optimizes the deep convolution operators in PyTorch using the RVV extended instruction set. A comparative analysis on the K230 development board shows that the RVV-optimized deep convolution operators achieve a performance improvement of approximately 1.35 to 3.8 times over the scalar implementations.