Architectural improvements and FPGA implementation of a multimodel neuroprocessor
Since neural networks (NNs) require an enormous amount of learning time, various kinds of dedicated parallel computers have been developed. This paper presents a 2-D systolic array (SA) of dedicated processing elements (PEs), also called systolic cells (SCs), as the heart of a multimodel neural-network accelerator. The instruction set of the SA allows the implementation of several neural algorithms, including error backpropagation and a self-organizing feature map algorithm. Several special architectural facilities are presented to improve the performance of the 2-D SA. A swapping mechanism for the weight matrix allows the implementation of NNs larger than the 2-D SA. A systo…