Search results for "Array"
showing 10 items of 1264 documents
Moving Learning Machine Towards Fast Real-Time Applications: A High-Speed FPGA-based Implementation of the OS-ELM Training Algorithm
2018
Currently, there are some emerging online learning applications handling data streams in real time. The On-line Sequential Extreme Learning Machine (OS-ELM) has been successfully used in real-time condition prediction applications because of its good generalization performance at an extreme learning speed, but the number of trainings per second (training frequency) achieved in these continuous learning applications has to be further reduced. This paper proposes a performance-optimized implementation of the OS-ELM training algorithm when it is applied to real-time applications. In this case, the natural way of feeding the training of the neural network is one-by-one, i.e., training the neur…
An optimized mass storage FFT for vector computers
1995
The performance of a segmented FFT algorithm which allows the out-of-core computation of the Fourier transform of a very large mass storage data array is presented. The code is particularly optimized for vector computers. Tests performed mainly on a CONVEX C210 vector computer showed that, for very long transforms, tuning of the main parameters involved leads to computation speed and global efficiency better than for FFTs performed in-core. The use of tunable parameters allows optimization of the algorithm on machines with different configurations.
An efficient hardware implementation of MQ decoder of the JPEG2000
2014
JPEG2000 is an international standard for still images intended to overcome the shortcomings of the existing JPEG standard. Compared to JPEG image compression techniques, the JPEG2000 standard not only has better compression ratios, but it also offers some exciting features. As it’s hard to meet the real-time requirement of image compression systems by software, it is necessary to implement the compression system in hardware. The MQ decoder of the JPEG2000 standard is an important bottleneck for real-time applications. In order to meet the real-time requirement we propose in this paper a novel architecture for an MQ decoder with high throughput which is comparable to tha…
Multiprocessor SoC Implementation of Neural Network Training on FPGA
2008
Software implementations of artificial neural networks (ANNs) and their training on a sequential processor are inefficient because they do not take advantage of parallelism. ASIC and FPGA implementations employ specific hardware structures to exploit parallelism in order to improve processing speed; however, optimizing resource usage requires the use of fixed-point arithmetic, thereby losing precision, and the final system is restricted to a particular network topology. This paper presents a mixed approach based on a multiprocessor system-on-chip (SoC) on an FPGA. The use of software-driven embedded microprocessors with custom floating-point extensions for ANN-related functions allows for gr…
SVM approximation for real-time image segmentation by using an improved hyperrectangles-based method
2003
A real-time implementation of an approximation of the support vector machine (SVM) decision rule is proposed. This method is based on an improvement of a supervised classification method using hyperrectangles, which is useful for real-time image segmentation. The final decision combines the accuracy of the SVM learning algorithm and the speed of a hyperrectangles-based method. We review the principles of the classification methods and we evaluate the hardware implementation cost of each method. We present the combination algorithm, which consists of rejecting ambiguities in the learning set using SVM decision, before using the learning step of the hyperrectangles-based method. We present re…
Extension of luminance component based demosaicking algorithm to 4- and 5-band multispectral images
2021
Multispectral imaging systems are currently expanding with a variety of multispectral demosaicking algorithms. But these algorithms have limitations due to the remarkable presence of artifacts in the reconstructed image. In this paper, we propose a powerful multispectral image demosaicking method that focuses on the G band and luminance component. We first identify a relevant 4- and 5-band multispectral filter array (MSFA) with the dominant G band and then propose an algorithm that consistently estimates the missing G values and other missing components using a convolution operator and a weighted bilinear interpolation algorithm based on the luminance component. Using the cons…
Continuous-flow tristimulus colorimetry: a new approach for gradient scanning techniques
1991
A flow-injection gradient scanning technique for colour evaluation by means of tristimulus colorimetry is described. Equipment and data acquisition requirements are discussed. The program CHROMA.FIA for data treatment and comparative chromatic analysis is presented. The chemical and flow conditions were optimized. Comparative studies using metallochromic indicators with both the flow-injection and the conventional batch procedures were made. The continuous-flow procedure provides good results and is more than fifteen times faster than the manual titrimetric procedure.
Modeling RISC-V Processor in IP-XACT
2018
IP-XACT is the most used standard in IP (Intellectual Property) integration. It is intended as a language neutral golden reference, from which RTL and HW dependent SW is automatically generated. Despite its wide popularity in the industry, there are practically no public and open design examples for any part of the design flow from IP-XACT to synthesis. One reason is the difficulty of creating IP-XACT models for existing RTL projects. In this paper, we address the issues by modeling the PULPino RISC-V microprocessor, which is written in SystemVerilog (SV) and whose project is distributed over several repositories. We propose how to solve the mismatching concepts between the SV project and IP-XACT, and…
Efficient FPGA Implementation of an Adaptive Noise Canceller
2006
A hardware implementation of an adaptive noise canceller (ANC) is presented. It has been synthesized within an FPGA, using a modified version of the least mean square (LMS) error algorithm. The results obtained so far show a significant decrease of the required gate count when compared with a standard LMS implementation, while increasing the ANC bandwidth and signal to noise (S/N) ratio. This novel adaptive noise canceller is thus useful for enhancing the S/N ratio of data collected from sensors (or sensor arrays) working in a noisy environment, or dealing with potentially weak signals.
Parallelizing Epistasis Detection in GWAS on FPGA and GPU-Accelerated Computing Systems
2015
This is a post-peer-review, pre-copyedit version of an article published in IEEE/ACM Transactions on Computational Biology and Bioinformatics. The final authenticated version is available online at: http://dx.doi.org/10.1109/TCBB.2015.2389958 [Abstract] High-throughput genotyping technologies (such as SNP-arrays) allow the rapid collection of up to a few million genetic markers of an individual. Detecting epistasis (based on 2-SNP interactions) in Genome-Wide Association Studies is an important but time-consuming operation since statistical computations have to be performed for each pair of measured markers. Computational methods to detect epistasis therefore suffer from prohibitively lon…