Search results for "Speech recognition"

showing 10 items of 357 documents

Semantic structures of timbre emerging from social and acoustic descriptions of music

2011

The perceptual attributes of timbre have inspired a considerable amount of multidisciplinary research, but because of the complexity of the phenomena, the approach has traditionally been confined to laboratory conditions, much to the detriment of its ecological validity. In this study, we present a purely bottom-up approach for mapping the concepts that emerge from sound qualities. A social media service (http://www.last.fm) is used to obtain a wide sample of verbal descriptions of music (in the form of tags) that go beyond the commonly studied concept of genre, and from this the underlying semantic structure of this sample is extracted. The structure that is thereby obtained is then evaluated th…

Acoustics and Ultrasonics; Computer science; Ecological validity; Music information retrieval; Timbre; Speech recognition; Music; Social media; computer.software_genre; Similarity (psychology); Electrical and Electronic Engineering; Set (psychology); Structure (mathematical logic); Music psychology; business.industry; Natural language processing; Vector-based semantic analysis; Degree (music); Acoustic features; Artificial intelligence; business; computer; EURASIP Journal on Audio, Speech, and Music Processing
researchProduct

Combining gestures and vocalizations to imitate sounds

2015

Communicating about sounds is a difficult task without a technical language, and naïve speakers often rely on different kinds of non-linguistic vocalizations and body gestures (Lemaitre et al. 2014). Previous work has independently studied how effectively people describe sounds with gestures or vocalizations (Caramiaux, 2014; Lemaitre and Rocchesso, 2014). However, speech communication studies suggest a more intimate link between the two processes (Kendon, 2004). Our study thus focused on the combination of manual gestures and non-speech vocalizations in the communication of sounds. We first collected a large database of vocal and gestural imitations of a variety of …

Acoustics and Ultrasonics; Computer science; InformationSystems_INFORMATIONINTERFACESANDPRESENTATION(e.g.HCI); Speech recognition; 02 engineering and technology; Representation (arts); [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; Loudness; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [SCCO] Cognitive science; 0202 electrical engineering electronic engineering information engineering; 050107 human factors; ComputingMilieux_MISCELLANEOUS; Sound (medical instrument); 05 social sciences; [SHS.ANTHRO-SE] Humanities and Social Sciences/Social Anthropology and ethnology; [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; [SCCO.COMP] Cognitive science/Computer science; [SCCO.PSYC] Cognitive science/Psychology; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; [SCCO.NEUR] Cognitive science/Neuroscience; [INFO.EIAH] Computer Science [cs]/Technology for Human Learning; Gesture; [SHS.MUSIQ] Humanities and Social Sciences/Musicology and performing arts; Acoustics; Arts and Humanities (miscellaneous); [INFO.INFO-HC] Computer Science [cs]/Human-Computer Interaction [cs.HC]; 0501 psychology and cognitive sciences; Set (psychology); [SPI.ACOU] Engineering Sciences [physics]/Acoustics [physics.class-ph]; [INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; 020207 software engineering; Variety (linguistics); Noise (video)
researchProduct

Comparing identification of vocal imitations and computational sketches of everyday sounds

2016

Sounds are notably difficult to describe. It is thus not surprising that human speakers often use many imitative vocalizations to communicate about sounds. In practice, vocal imitations of non-speech everyday sounds (e.g. the sound of a car passing by) are very effective: listeners identify sounds better with vocal imitations than with verbal descriptions, despite the fact that vocal imitations are often inaccurate, constrained by the human vocal apparatus. The present study investigated the semantic representations evoked by vocal imitations by experimentally quantifying how well listeners could match sounds to category labels. It compared two different types of sounds…

Acoustics and Ultrasonics; Computer science; [SHS.MUSIQ] Humanities and Social Sciences/Musicology and performing arts; Speech recognition; Acoustics; [SCCO.COMP] Cognitive science/Computer science; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [SPI] Engineering Sciences [physics]; [SCCO] Cognitive science; Arts and Humanities (miscellaneous); [INFO.INFO-HC] Computer Science [cs]/Human-Computer Interaction [cs.HC]; ComputingMilieux_MISCELLANEOUS; Sound (medical instrument); [INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; [SCCO.NEUR] Cognitive science/Neuroscience; [SHS.ANTHRO-SE] Humanities and Social Sciences/Social Anthropology and ethnology; Identification (information); [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD]; [INFO.EIAH] Computer Science [cs]/Technology for Human Learning
researchProduct

2016

The flexible access to information in working memory is crucial for adaptive behavior. It is assumed that this is realized by switching the focus of attention within working memory. Switching of attention is mirrored in the P3a component of the human event-related brain potential (ERP) and it has been argued that the processes reflected by the P3a are also relevant for selecting information within working memory. The aim of the present study was to further evaluate whether the P3a mirrors genuine switching of attention within working memory by applying an object switching task: Participants updated a memory list of four digits either by replacing one item with another digit or by processing…

Adaptive behavior; Working memory; business.industry; Speech recognition; 05 social sciences; Memory rehearsal; Process (computing); Object (computer science); 050105 experimental psychology; Task (project management); 03 medical and health sciences; Behavioral Neuroscience; Psychiatry and Mental health; P3a; 0302 clinical medicine; Neuropsychology and Physiological Psychology; Neurology; Memory span; 0501 psychology and cognitive sciences; Artificial intelligence; Psychology; business; 030217 neurology & neurosurgery; Biological Psychiatry; Frontiers in Human Neuroscience
researchProduct

2013

Distraction of goal-oriented performance by a sudden change in the auditory environment is an everyday life experience. Different types of changes can be distracting, including a sudden onset of a transient sound and a slight deviation of otherwise regular auditory background stimulation. With regard to deviance detection, it is assumed that slight changes in a continuous sequence of auditory stimuli are detected by a predictive coding mechanism, and it has been demonstrated that this mechanism is capable of distracting ongoing task performance. In contrast, it remains open whether transient detection, which does not rely on predictive coding mechanisms, can trigger behavioral distraction, too…

Adaptive behavior; medicine.medical_specialty; Mechanism (biology); Speech recognition; Mismatch negativity; Sensory system; Audiology; behavioral disciplines and activities; Task (project management); Behavioral Neuroscience; Psychiatry and Mental health; P3a; Neuropsychology and Physiological Psychology; Neurology; Distraction; medicine; sense organs; Psychology; psychological phenomena and processes; Biological Psychiatry; Change detection; Frontiers in Human Neuroscience
researchProduct

SINGLE-TRIAL BASED INDEPENDENT COMPONENT ANALYSIS ON MISMATCH NEGATIVITY IN CHILDREN

2010

Independent component analysis (ICA) does not follow the superposition rule. This motivates us to study a negative event-related potential, mismatch negativity (MMN), estimated by single-trial based ICA (sICA) and averaged-trace based ICA (aICA), respectively. For sICA, an optimal digital filter (ODF) was used to remove low-frequency noise. As a result, this study demonstrates that the performance of sICA+ODF and aICA could be different. Moreover, MMN under sICA+ODF fits better with the theoretical expectation, i.e., a larger deviant elicits a larger MMN peak amplitude.

Adolescent; Learning Disabilities; Computer Networks and Communications; Speech recognition; Mismatch negativity; Electroencephalography; General Medicine; Independent component analysis; Noise; Acoustic Stimulation; Attention Deficit Disorder with Hyperactivity; Evoked Potentials Auditory; Humans; Single trial; Child; Evoked Potentials; Digital filter; Algorithms; Mathematics; International Journal of Neural Systems
researchProduct

ERP correlates of transposed-letter priming effects: The role of vowels versus consonants

2008

One key issue for any computational model of visual-word recognition is the choice of an input coding scheme for assigning letter position. Recent research has shown that pseudowords created by transposing two letters are very effective at activating the lexical representation of their base words (e.g., relovution activates REVOLUTION). We report a masked priming lexical decision experiment in which the pseudoword primes were created by transposing/replacing two consonants or two vowels while event-related potentials were recorded. The results showed a modulation of the amplitude at an early window (150-250 ms) and at the N400 component for vowels but not for consonant transpositions. In ad…

Adult; Consonant; Adolescent; Cognitive Neuroscience; Speech recognition; Experimental and Cognitive Psychology; Psycholinguistics; Young Adult; Developmental Neuroscience; Lexical decision task; Humans; Biological Psychiatry; Visual word recognition; Communication; Endocrine and Autonomic Systems; business.industry; General Neuroscience; Electroencephalography; Recognition Psychology; Lexical representation; N400; Electrophysiology; Pseudoword; Neuropsychology and Physiological Psychology; Reading; Neurology; Female; Cues; business; Psychology; Photic Stimulation; Psychomotor Performance; Coding (social sciences); Psychophysiology
researchProduct

Can colours be used to segment words when reading?

2015

Rayner, Fischer, and Pollatsek (1998, Vision Research) demonstrated that reading unspaced text in Indo-European languages produces a substantial reading cost in word identification (as deduced from an increased word-frequency effect on target words embedded in the unspaced vs. spaced sentences) and in eye movement guidance (as deduced from landing sites closer to the beginning of the words in unspaced sentences). However, the addition of spaces between words comes with a cost: nearby words may fall outside high-acuity central vision, thus reducing the potential benefits of parafoveal processing. In the present experiment, we introduced a salient visual cue intended to facilitate the process…

Adult; Eye Movements; Computer science; media_common.quotation_subject; Speech recognition; Experimental and Cognitive Psychology; Young Adult; Arts and Humanities (miscellaneous); Reading (process); Developmental and Educational Psychology; Humans; media_common; Communication; business.industry; Text segmentation; Eye movement; General Medicine; Word lists by frequency; Pattern Recognition Visual; Reading; Salient; Word recognition; Central vision; business; Color Perception; Word (group theory); Acta Psychologica
researchProduct

The external frame function in the control of pitch, register, and singing mode: Radiographic observations of a female singer

1999

This study investigates movements of the laryngo-pharyngeal structures related to pitch control, register, and singing mode by radiographic methods. One trained female singer served as the subject. The results show that singing voice production involves complex movements in the laryngeal structures. The pitch-related increase in the thyro-arytenoid distance (vocal fold length) is nonlinear, slowing down as pitch rises. Similar observations have been made earlier. At the highest pitches, a shortening of the distance can be seen, suggesting the use of alternative pitch control mechanisms. The various observations made support the existence of three registers in this trained female singing vo…

Adult; Larynx; Voice Quality; Speech recognition; Speech and Hearing; Mode (music); Phonation; Pitch control; Phonetics; otorhinolaryngologic diseases; medicine; Humans; Control (linguistics); Hyoid Bone; Function (mathematics); LPN and LVN; humanities; Radiography; medicine.anatomical_structure; Otorhinolaryngology; Register (music); Thyroid Cartilage; Pharynx; Female; Singing; Psychology; psychological phenomena and processes; Arytenoid Cartilage; Relative pitch; Journal of Voice
researchProduct

Dissociating spatial and letter-based word length effects observed in readers’ eye movement patterns

2011

In previous eye movement research on word length effects, spatial width has been confounded with the number of letters. McDonald (2006) unconfounded these factors by rendering all words in sentences in constant spatial width. In the present study, the Arial font with proportional letter spacing was used for varying the number of letters while equating for spatial width, while the Courier font with monospaced letter spacing was used to measure the contribution of spatial width to the observed word length effect. Number of letters in words affected single fixation duration on target words, whereas words’ spatial width determined fixation locations in words and the probability of skipping a wo…

Adult; Letter processing; Speech recognition; Spatial width; Fixation Ocular; Reading; Eye movements; Young Adult; Number of letters; Font; Saccades; Humans; Word length; Mathematics; Communication; business.industry; Eye movement; Crowding; Sensory Systems; Form Perception; Ophthalmology; Pattern Recognition Visual; Space Perception; Fixation (visual); business; Vision Research
researchProduct