Abstract: Owing to their strong parallel processing capability and flexibility, Field Programmable Gate Arrays (FPGAs) have been widely applied to hardware-accelerated computation, especially for Convolutional Neural Networks (CNNs). However, traditional FPGA implementations of image convolution offer limited modularity and incur a large storage overhead. This study builds a general experimental platform for hardware-accelerated image convolution. Its modular design greatly improves flexibility, allowing the platform to accommodate different convolution kernels. In addition, an image batch-processing scheme exploits data repetition to share memory, reducing the storage space required. Experimental results show that the proposed platform provides a more reconfigurable architecture through its modular design. Moreover, the complexity of BRAM usage grows only linearly with the degree of parallelism, which helps reduce power consumption.