Abstract:
Deep learning neural networks have revolutionized the fields of computer vision, robotics, and artificial intelligence. However, these state-of-the-art algorithms come at a high computational cost, with large memory requirements and heavy hardware resource utilization, which makes them impractical for smaller devices. This project aims to bridge this gap. The goal is to design a compressed, fast object (pedestrian) detection CNN model that is efficient in terms of resource utilization and memory allocation without sacrificing final accuracy in real-time operation. In this report, I propose a hardware architecture and a tool for porting any convolutional neural network to Zynq family-based FPGAs, for both classification and detection tasks. It has been tested on several networks, including VGG16, AlexNet, LeNet, and Tiny YOLO. Several optimization techniques are used for efficient resource management and better performance.