0000000001058922

AUTHOR

Pedro Zuccarello

showing 16 related works from this author

32×32 winner-take-all matrix with single winner selection

2010

A 32 × 32 winner-take-all (WTA) matrix with single winner selection is introduced. A high-resolution gain-boosted regulated-cascode WTA circuit is used in a first competition stage. Because of the large number of competing cells, a multiple-winner situation can arise. A single winner is obtained by means of a digital inhibitory circuit following each WTA analogue amplifier. Simulations show that this mixed analogue-digital circuit achieves its objective with a current resolution of approximately 10 nA (0.8% of the maximum input current in the simulated case). A time response in the μs range can be achieved.
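The two-stage idea described above can be illustrated with a small behavioural sketch. This is not the published circuit: the function names, the example currents, and the "first flagged cell in scan order wins" inhibition rule are assumptions made for illustration only. The arithmetic at the top simply restates the abstract's figures (10 nA being 0.8% of the maximum input current).

```python
def analogue_wta(currents, resolution):
    """Flag every cell whose current lies within `resolution` of the maximum.

    Models a finite-resolution competition stage: cells that are too close
    to the peak cannot be told apart, so several flags may be raised.
    """
    peak = max(currents)
    return [peak - c < resolution for c in currents]


def single_winner(flags):
    """Hypothetical digital inhibition: keep the first flagged cell in scan order."""
    for index, flagged in enumerate(flags):
        if flagged:
            return index
    return None


# From the abstract: 10 nA is 0.8% of the maximum input current,
# which implies a maximum input of roughly 1250 nA.
resolution_na = 10.0
max_input_na = resolution_na / 0.008

# Illustrative currents in nA (assumed values, not simulation data).
currents_na = [400.0, 1195.0, 1200.0, 900.0]
flags = analogue_wta(currents_na, resolution_na)   # two cells are flagged
winner = single_winner(flags)                      # exactly one survives
```

In this toy model the two cells at 1195 nA and 1200 nA differ by less than the resolution, so both are flagged, and the inhibition step reduces them to a single index.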

Digital electronics; Engineering; Artificial neural network; business.industry; Amplifier; High resolution; Winner-take-all; Matrix (mathematics); Time response; Electronic engineering; Electrical and Electronic Engineering; business; Algorithm; Selection (genetic algorithm); Electronics Letters
researchProduct

Selective Change Driven Imaging: A Biomimetic Visual Sensing Strategy

2011

Selective Change Driven (SCD) Vision is a biologically inspired strategy for acquiring, transmitting and processing images that significantly speeds up image sensing. SCD vision is based on a new CMOS image sensor which delivers, ordered by the absolute magnitude of its change, the pixels that have changed after the last time they were read out. Moreover, the traditional full frame processing hardware and programming methodology has to be changed, as a part of this biomimetic approach, to a new processing paradigm based on pixel processing in a data flow manner, instead of full frame image processing.
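The readout order described above (changed pixels delivered by decreasing absolute change since their last readout) can be sketched in software. This is a minimal simulation under stated assumptions, not the sensor's actual interface: `scd_readout`, the `max_pixels` budget, and the nested-list frame layout are hypothetical names and choices.

```python
def scd_readout(current, last_read, max_pixels):
    """Return up to `max_pixels` events as (row, col, value), largest change first.

    `last_read` holds the value each pixel had when it was last read out
    and is updated in place for the pixels that are delivered.
    """
    changes = []
    for row, (cur_row, ref_row) in enumerate(zip(current, last_read)):
        for col, (cur, ref) in enumerate(zip(cur_row, ref_row)):
            delta = abs(cur - ref)
            if delta > 0:  # unchanged pixels are never delivered
                changes.append((delta, row, col, cur))

    # Order by absolute magnitude of change, largest first.
    changes.sort(key=lambda item: item[0], reverse=True)
    events = changes[:max_pixels]

    # A delivered pixel's reference becomes its current value.
    for _, row, col, value in events:
        last_read[row][col] = value
    return [(row, col, value) for _, row, col, value in events]


reference = [[10, 10], [10, 10]]
frame = [[10, 40], [15, 10]]
first = scd_readout(frame, reference, max_pixels=1)
second = scd_readout(frame, reference, max_pixels=1)
third = scd_readout(frame, reference, max_pixels=1)
```

With a budget of one pixel per call, the pixel that changed most is delivered first, the smaller change follows on the next call, and the third call returns nothing because no pixel differs from its last readout.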

Motion analysis; Computer science; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image processing; lcsh:Chemical technology; Biochemistry; Article; Analytical Chemistry; Motion; Artificial Intelligence; Digital image processing; Image Processing Computer-Assisted; Computer Simulation; Computer vision; lcsh:TP1-1185; biomimetics; Electrical and Electronic Engineering; Image sensor; Instrumentation; Pixel; business.industry; motion analysis; Frame (networking); Atomic and Molecular Physics and Optics; CMOS image sensor; Artificial intelligence; business; Algorithms; event-based vision; Sensors
researchProduct

A Comparative Analysis of Residual Block Alternatives for End-to-End Audio Classification

2020

Residual learning is known for being a learning framework that facilitates the training of very deep neural networks. Residual blocks or units are made up of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or shortcut connections. However, multiple implementation alternatives arise with respect to where such skip connections are applied within the set of stacked layers making up a residual block. While residual networks for image classification using convolutional neural networks (CNNs) have been widely discussed in the literature, their a…
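The basic residual unit described above, where the input is added back to the output of the stacked layers, can be sketched as follows. This shows only the plain form with the skip applied around the whole stack; it is not one of the specific alternatives the paper compares, and the helper names `scale` and `relu` are illustrative stand-ins for real layers.

```python
def residual_block(x, stacked_layers):
    """Apply the stacked layers to x, then add x back through the skip connection."""
    out = x
    for layer in stacked_layers:
        out = layer(out)
    # Skip (shortcut) connection: element-wise sum of input and branch output.
    return [xi + oi for xi, oi in zip(x, out)]


def scale(factor):
    """Toy stand-in for a learned layer: multiply every element by `factor`."""
    return lambda v: [factor * vi for vi in v]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, vi) for vi in v]


x = [1.0, -2.0]
identity_out = residual_block(x, [scale(0.0)])        # branch outputs zeros
mixed_out = residual_block(x, [scale(2.0), relu])     # branch outputs [2.0, 0.0]
```

When the stacked layers output zeros, the block reduces to an identity mapping, which is the property the skip connection is meant to make easy to learn.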

Normalization (statistics); General Computer Science; Computer science; Feature extraction; ESC; 02 engineering and technology; computer.software_genre; Residual; Convolutional neural network; convolutional neural networks; 0202 electrical engineering electronic engineering information engineering; General Materials Science; urbansound8k; Audio signal processing; Block (data storage); Contextual image classification; General Engineering; Audio classification; 020206 networking & telecommunications; 113 Computer and information sciences; 020201 artificial intelligence & image processing; lcsh:Electrical engineering. Electronics. Nuclear engineering; Data mining; lcsh:TK1-9971; computer; residual learning; IEEE Access
researchProduct

Applying logistic regression to relevance feedback in image retrieval systems

2007

This paper deals with the problem of image retrieval from large image databases. A particularly interesting problem is the retrieval of all images which are similar to one in the user's mind, taking into account his/her feedback which is expressed as positive or negative preferences for the images that the system progressively shows during the search. Here we present a novel algorithm for the incorporation of user preferences in an image retrieval system based exclusively on the visual content of the image, which is stored as a vector of low-level features. The algorithm considers the probability of an image belonging to the set of those sought by the user, and models the logit of this prob…

business.industry; Iterative method; Linear model; Relevance feedback; Pattern recognition; computer.software_genre; Image (mathematics); Set (abstract data type); Artificial Intelligence; Signal Processing; Relevance (information retrieval); Computer Vision and Pattern Recognition; Artificial intelligence; Data mining; business; Cluster analysis; Image retrieval; computer; Software; Mathematics; Pattern Recognition
researchProduct

An Open-set Recognition and Few-Shot Learning Dataset for Audio Event Classification in Domestic Environments

2020

The problem of training with a small set of positive samples is known as few-shot learning (FSL). It is widely known that traditional deep learning (DL) algorithms usually show very good performance when trained with large datasets. However, in many applications, it is not possible to obtain such a high number of samples. In the image domain, typical FSL applications include those related to face recognition. In the audio domain, music fraud detection or speaker recognition can clearly benefit from FSL methods. This paper deals with the application of FSL to the detection of specific and intentional acoustic events given by different types of sound alarms, such as doorbells or fire alarms, usin…

FOS: Computer and information sciences; Computer Science - Machine Learning; Sound (cs.SD); sound processing; audio dataset; machine listening; UNESCO::CIENCIAS TECNOLÓGICAS; Computer Science - Sound; Machine Learning (cs.LG); classification; Artificial Intelligence; Audio and Speech Processing (eess.AS); Signal Processing; FOS: Electrical engineering electronic engineering information engineering; few-shot learning; open-set recognition; Computer Vision and Pattern Recognition; Software; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

On the Advantages of Asynchronous Pixel Reading and Processing for High-Speed Motion Estimation

2008

Biological visual systems are becoming an interesting source for the improvement of artificial visual systems. A biologically inspired read-out and pixel processing strategy is presented. This read-out mechanism is based on Selective pixel Change-Driven (SCD) processing. Pixels are individually read out and processed, in contrast to the classical approach where read-out and processing are based on complete frames. Changing pixels are read out and processed at short time intervals. The simulated experiments show that the response delay using this strategy is several orders of magnitude lower than that of current cameras while still keeping the same, or even tighter, bandwidth requirements.

Pixel; Orders of magnitude (time); Asynchronous communication; business.industry; Computer science; Reading (computer); Bandwidth (signal processing); Computer vision; Artificial intelligence; business; Response delay; Speed (motion)
researchProduct

Taking Advantage of Selective Change Driven Processing for 3D Scanning

2013

This article deals with the application of the principles of SCD (Selective Change Driven) vision to 3D laser scanning. Two experimental sets have been implemented: one with a classical CMOS (Complementary Metal-Oxide Semiconductor) sensor, and the other one with a recently developed CMOS SCD sensor for comparative purposes, both using the technique known as Active Triangulation. An SCD sensor only delivers the pixels that have changed most, ordered by the magnitude of their change since their last readout. The 3D scanning method is based on the systematic search through the entire image to detect pixels that exceed a certain threshold, showing the SCD approach to be ideal for this applicat…

Event-based vision; Laser scanning; Computer science; Transducers; 3d scanning; lcsh:Chemical technology; Sensitivity and Specificity; Biochemistry; Article; Analytical Chemistry; law.invention; Photometry; Photometry (optics); Imaging Three-Dimensional; law; Informàtica; Nyquist–Shannon sampling theorem; Computer vision; lcsh:TP1-1185; 3D scanning; Electrical and Electronic Engineering; high-speed visual acquisition; Instrumentation; Pixel; business.industry; Lasers; 3D reconstruction; Reproducibility of Results; Signal Processing Computer-Assisted; Equipment Design; Image Enhancement; Laser; Atomic and Molecular Physics and Optics; Equipment Failure Analysis; Transducer; Semiconductors; CMOS; Artificial intelligence; business; High-speed visual acquisition; event-based vision; Sensors
researchProduct

Open Set Audio Classification Using Autoencoders Trained on Few Data.

2020

Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a known class (seen during training) while rejecting any unknown or unwanted samples (those belonging to unseen classes). Another problem arising in practical scenarios is few-shot learning (FSL), which appears when there is no availability of a large number of positive samples for training a recognition system. Taking these two limitations into account, a new dataset for OSR and FSL for audio data was recently released to promote research on solution…
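The open-set decision rule summarized above (accept instances of known classes, reject everything else) can be illustrated with a simple score threshold. This is a generic sketch, not the autoencoder-based method of the paper; the function name, the class labels, the scores, and the threshold value are all hypothetical.

```python
def open_set_predict(scores, threshold, unknown_label="unknown"):
    """Return the best-scoring known class, or `unknown_label` if no score is high enough.

    `scores` maps each known class name to a confidence value.
    """
    label, best = max(scores.items(), key=lambda item: item[1])
    return label if best >= threshold else unknown_label


# Illustrative scores for two known classes (assumed values).
known_case = open_set_predict({"class_a": 0.91, "class_b": 0.05}, threshold=0.5)
unknown_case = open_set_predict({"class_a": 0.30, "class_b": 0.35}, threshold=0.5)
```

The threshold trades off the two error types: raising it rejects more unknown samples but also rejects more genuine instances of the known classes.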

Computer science; Open set; 02 engineering and technology; lcsh:Chemical technology; Machine learning; computer.software_genre; Biochemistry; Article; Analytical Chemistry; Set (abstract data type); open set recognition; 020204 information systems; audio classification; autoencoders; 0202 electrical engineering electronic engineering information engineering; Feature (machine learning); lcsh:TP1-1185; few-shot learning; Electrical and Electronic Engineering; Representation (mathematics); Instrumentation; business.industry; open set classification; Perceptron; Class (biology); Autoencoder; Atomic and Molecular Physics and Optics; Embedding; 020201 artificial intelligence & image processing; Artificial intelligence; Transfer of learning; business; computer; Sensors (Basel, Switzerland)
researchProduct

A novel Bayesian framework for relevance feedback in image content-based retrieval systems

2006

This paper presents a new algorithm for image retrieval in content-based image retrieval systems. The objective of these systems is to get the images which are as similar as possible to a user query from those contained in the global image database without using textual annotations attached to the images. The main problem in obtaining a robust and effective retrieval is the gap between the low level descriptors that can be automatically extracted from the images and the user intention. The algorithm proposed here to address this problem is based on the modeling of user preferences as a probability distribution on the image space. Following a Bayesian methodology, this distribution is the pr…

Computer science; business.industry; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Relevance feedback; Pattern recognition; computer.software_genre; Automatic image annotation; Artificial Intelligence; Computer Science::Computer Vision and Pattern Recognition; Signal Processing; Probability distribution; Computer Vision and Pattern Recognition; Visual Word; Artificial intelligence; Data mining; business; Precision and recall; Image retrieval; computer; Software; Pattern Recognition
researchProduct

Acoustic Scene Classification with Squeeze-Excitation Residual Networks

2020

Acoustic scene classification (ASC) is a problem related to the field of machine listening whose objective is to classify/tag an audio clip with a predefined label describing a scene location (e.g. park, airport, etc.). Many state-of-the-art solutions to ASC incorporate data augmentation techniques and model ensembles. However, considerable improvements can also be achieved only by modifying the architecture of convolutional neural networks (CNNs). In this work we propose two novel squeeze-excitation blocks to improve the accuracy of a CNN-based ASC framework based on residual learning. The main idea of squeeze-excitation blocks is to learn spatial and channel-wise feature maps independently…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; General Computer Science; Calibration (statistics); Computer science; Residual; Convolutional neural network; Field (computer science); Computer Science - Sound; Machine Learning (cs.LG); 030507 speech-language pathology & audiology; 03 medical and health sciences; Audio and Speech Processing (eess.AS); Acoustic scene classification; Feature (machine learning); FOS: Electrical engineering electronic engineering information engineering; General Materials Science; Block (data storage); Artificial neural network; business.industry; pattern recognition; General Engineering; deep learning; Pattern recognition; machine listening; squeeze-excitation; Artificial intelligence; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0305 other medical science; business; lcsh:TK1-9971; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

IOWA Operators and Its Application to Image Retrieval

2014

This paper presents a relevance feedback procedure based on logistic regression analysis. Since the dimension of the feature vector associated with each image is typically larger than the number of images evaluated by the user, different logistic regression models have to be fitted separately. Each fitted model provides us with a relevance probability and a confidence interval for that probability. In order to aggregate this set of probabilities and confidence intervals we use an IOWA operator. The results show the success of our algorithm and that OWA operators are an efficient and natural way of dealing with this kind of fusion problem.
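The aggregation step can be sketched with the standard definition of an induced ordered weighted averaging (IOWA) operator: arguments are reordered by decreasing value of an inducing variable and then combined with a fixed weight vector that sums to one. The choice of inducing variable here (the negated confidence-interval width, so that narrower intervals come first), the weights, and the example numbers are assumptions for illustration; the abstract does not specify them.

```python
def iowa(pairs, weights):
    """Induced OWA: order arguments by decreasing inducing value, then weight-sum.

    `pairs` is a list of (inducing_value, argument) tuples.
    `weights` must be non-negative and sum to 1.
    """
    if len(pairs) != len(weights):
        raise ValueError("need exactly one weight per argument")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(pairs, key=lambda pair: pair[0], reverse=True)
    return sum(w * argument for w, (_, argument) in zip(weights, ordered))


# Hypothetical output of three fitted models: (relevance probability, CI width).
model_outputs = [(0.9, 0.4), (0.6, 0.1), (0.7, 0.2)]

# Assumed inducing variable: negated CI width, so tighter intervals rank first.
pairs = [(-width, probability) for probability, width in model_outputs]
relevance = iowa(pairs, weights=[0.5, 0.3, 0.2])
```

With these assumed inputs the most precise model (width 0.1) receives the largest weight, so the fused relevance leans towards its probability rather than towards the highest raw probability.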

Operator (computer programming); Feature vector; Relevance feedback; Relevance (information retrieval); Data mining; Logistic regression; computer.software_genre; Content-based image retrieval; computer; Image retrieval; Confidence interval; Mathematics
researchProduct

Selective Change-Driven Image Processing: A Speeding-Up Strategy

2009

Biologically inspired schemes are a source for the improvement of visual systems. Real-time implementation of image processing algorithms is constrained by the large amount of data to be processed. Full image processing is often unnecessary, since many pixels change only slightly or do not change at all. A strategy based on delivering and processing pixels, instead of processing the complete frame, is presented. The pixels that have undergone the largest changes in each frame, ordered by the absolute value of their change, are read out and processed. Two examples are shown: a morphological motion detection algorithm and the Horn and Schunck optical flow algorithm. Result…

Pixel; business.industry; Computer science; Frame (networking); Digital image processing; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Optical flow; Motion detection; Computer vision; Image processing; Absolute value; Artificial intelligence; business
researchProduct

On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification

2019

Residual learning is a recently proposed learning framework to facilitate the training of very deep neural networks. Residual blocks or units are made of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or residual connections. However, multiple implementation alternatives arise with respect to where such skip connections are applied within the set of stacked layers that make up a residual block. While ResNet architectures for image classification using convolutional neural networks (CNNs) have been widely discussed in the literature, few w…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

Sound Event Localization and Detection using Squeeze-Excitation Residual CNNs

2020

Sound Event Localization and Detection (SELD) is a problem related to the field of machine listening whose objective is to recognize individual sound events, detect their temporal activity, and estimate their spatial location. Thanks to the emergence of more hard-labeled audio datasets, deep learning techniques have become state-of-the-art solutions. The most common ones are those that implement a convolutional recurrent network (CRNN), having previously transformed the audio signal into a multichannel 2D representation. The squeeze-excitation technique can be considered as a convolution enhancement that aims to learn spatial and channel feature maps independently rather than together as stand…

FOS: Computer and information sciences; Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation

2020

Anomalous sound detection (ASD) is currently one of the topical subjects in the machine listening discipline. Unsupervised detection is attracting a lot of interest due to its immediate applicability in many fields. For example, in industrial processes, the early detection of malfunctions or damage in machines can mean great savings and improved efficiency. An unsupervised ASD solution suits this problem, since machines do not have to be damaged merely to provide anomalous audio data for the training stage. This paper proposes a novel framework based on convolutional autoencoders (both unsupervised and semi-supervised) and a Gammatone-base…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

CNN depth analysis with different channel inputs for Acoustic Scene Classification

2019

Acoustic scene classification (ASC) has been approached in recent years using deep learning techniques such as convolutional neural networks or recurrent neural networks. Many state-of-the-art solutions are based on image classification frameworks and, as such, a 2D representation of the audio signal is considered for training these networks. Finding the most suitable audio representation is still a research area of interest. In this paper, different log-Mel representations and combinations are analyzed. Experiments show that the best results are obtained using the harmonic and percussive components plus the difference between left and right stereo channels (L-R). On the other hand, it i…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct