0000000000807197
AUTHOR
Franck Mamalet
Fast and Robust Face Detection on a Parallel Optimized Architecture implemented on FPGA
In this paper, we present a parallel architecture for fast and robust face detection implemented on FPGA hardware. We propose the first implementation that meets both real-time requirements in an embedded context and face detection robustness within complex backgrounds. The chosen face detection method is the Convolutional Face Finder (CFF) algorithm, which consists of a pipeline of convolution and subsampling operations, followed by a multilayer perceptron. We present the design methodology of our face detection processor element (PE). This methodology was followed in order to optimize our implementation in terms of memory usage and parallelization efficiency. We then built a parallel arch…
Design of a Real-time face detection parallel architecture using High-Level Synthesis
We describe a High-Level Synthesis implementation of a parallel architecture for face detection. The chosen face detection method is the well-known Convolutional Face Finder (CFF) algorithm, which consists of a pipeline of convolution operations. We rely on dataflow modelling of the algorithm and we use a high-level synthesis tool in order to specify the local dataflows of our Processing Element (PE), by describing in C language inter-PE communication, fine scheduling of the successive convolutions, and memory distribution and bandwidth. Using this approach, we explore several implementation alternatives in order to find a compromise between processing speed and area of the PE. We …
A Parallel Face Detection System Implemented on FPGA
In this paper, we introduce a methodology for designing a face detection system and its implementation on FPGA. The chosen face detection method is the well-known Convolutional Face Finder (CFF) algorithm, which consists of a pipeline of convolution and subsampling operations. Our goal is to define a parallel architecture able to process this algorithm efficiently. We present a dataflow-based architecture algorithm adequation (AAA) methodology implemented using the SynDEx software, in order to find the best compromise between the processing power and functionality requirements of each processor element (PE) and the efficiency of algorithm parallelization. We describe a first implementa…