Search results for "artificial intelligence"

showing 10 items of 6122 documents

BuscaPalabras: a program for deriving orthographic and phonological neighborhood statistics and other psycholinguistic indices in Spanish.

2005

This article describes a Windows program that enables users to obtain a broad range of statistics concerning the properties of word and nonword stimuli in Spanish, including word frequency, syllable frequency, bigram and biphone frequency, orthographic similarity, orthographic and phonological structure, concreteness, familiarity, imageability, valence, arousal, and age-of-acquisition measures. It is designed for use by researchers in psycholinguistics, particularly those concerned with recognition of isolated words. The program computes measures of orthographic similarity online, with respect to either a default vocabulary of 31,491 Spanish words or a vocabulary specified by the user. In a…
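The orthographic-similarity and bigram statistics mentioned in this abstract can be illustrated with a small sketch. It is not BuscaPalabras's implementation: it assumes the common one-letter-substitution definition of an orthographic neighbor and counts bigrams by word type rather than weighting by token frequency. The function names and the toy vocabulary are hypothetical.

```python
from collections import defaultdict


def orthographic_neighbors(word, vocabulary):
    """Return vocabulary words of the same length as `word` that differ
    from it in exactly one letter position (assumed neighbor definition)."""
    return sorted(
        w for w in set(vocabulary)
        if len(w) == len(word)
        and sum(a != b for a, b in zip(w, word)) == 1
    )


def bigram_counts(vocabulary):
    """Count how many times each letter bigram occurs across the word types."""
    counts = defaultdict(int)
    for w in vocabulary:
        for i in range(len(w) - 1):
            counts[w[i:i + 2]] += 1
    return dict(counts)


# Hypothetical toy vocabulary, standing in for the default or user-specified one.
toy_vocab = ["casa", "cama", "cara", "cosa", "pasa", "mesa", "sol"]
neighbors = orthographic_neighbors("casa", toy_vocab)
bigrams = bigram_counts(toy_vocab)
```

Here `neighbors` lists the four words one substitution away from "casa", and `len(neighbors)` gives the neighborhood size under this assumed definition.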

Vocabulary; Bigram; Speech recognition; media_common.quotation_subject; Experimental and Cognitive Psychology; computer.software_genre; Concreteness; Vocabulary; Psycholinguistics; Arts and Humanities (miscellaneous); Phonetics; Statistics; Developmental and Educational Psychology; Humans; General Psychology; media_common; Language; Psycholinguistics; business.industry; Orthographic projection; Cognition; Phonetics; Word lists by frequency; Spain; Exploratory Behavior; Psychology (miscellaneous); Artificial intelligence; business; Psychology; computer; Natural language processing; Behavior research methods

researchProduct

GreekLex 2: A comprehensive lexical database with part-of-speech, syllabic, phonological, and stress information

2017

Databases containing lexical properties on any given orthography are crucial for psycholinguistic research. In the last ten years, a number of lexical databases have been developed for Greek. However, these lack important part-of-speech information. Furthermore, the need for alternative procedures for calculating syllabic measurements and stress information, as well as combination of several metrics to investigate linguistic properties of the Greek language are highlighted. To address these issues, we present a new extensive lexical database of Modern Greek (GreekLex 2) with part-of-speech information for each word and accurate syllabification and orthographic information predictive of stre…

Vocabulary; Databases Factual; Computer science; Social Sciences; lcsh:Medicine; computer.software_genre; Lexical database; Vocabulary; 0302 clinical medicine; Psychology; lcsh:Science; Language; media_common; Psycholinguistics; Multidisciplinary; Greece; Syllabification; 05 social sciences; Modern Greek; Syllables; Phonetics; Greek language; Physical Sciences; Syllabic verse; Syllable; Natural language processing; Research Article; Statistical Distributions; media_common.quotation_subject; DNA transcription; Grammatical category; Phonology; 050105 experimental psychology; 03 medical and health sciences; Phonetics; Genetics; Humans; Speech; 0501 psychology and cognitive sciences; Vowels; business.industry; lcsh:R; Phonetic transcription; Cognitive Psychology; Biology and Life Sciences; Linguistics; Probability Theory; Part of speech; Cognitive Science; lcsh:Q; Gene expression; Artificial intelligence; business; computer; Mathematics; 030217 neurology & neurosurgery; Orthography; Neuroscience; PLOS ONE

researchProduct

Improving Classification of Tweets Using Linguistic Information from a Large External Corpus

2016

The bag-of-words representation of documents is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. Improvements might be achieved by expanding the vocabulary with other relevant words, such as synonyms.
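A minimal sketch of the idea in this abstract: two texts that share no literal terms have zero bag-of-words overlap, and adding synonyms at a reduced weight can recover a relationship. The synonym table, the 0.5 weight, and the dot-product overlap are illustrative assumptions, not the paper's method or corpus.

```python
from collections import Counter

# Hypothetical synonym table; the paper derives its information from a
# large external corpus, which is not reproduced here.
SYNONYMS = {"car": ["automobile"], "film": ["movie"]}


def bag_of_words(text):
    """Lower-case, whitespace-tokenize, and count terms."""
    return Counter(text.lower().split())


def expand(bag, synonyms, weight=0.5):
    """Return a copy of `bag` with each term's synonyms added at a
    reduced weight (the weight value is an assumption)."""
    expanded = dict(bag)
    for term, count in bag.items():
        for syn in synonyms.get(term, []):
            expanded[syn] = expanded.get(syn, 0) + weight * count
    return expanded


def overlap(a, b):
    """Dot product of two term-weight mappings over their shared terms."""
    return sum(a[t] * b[t] for t in a.keys() & b.keys())


doc_a = bag_of_words("new car review")
doc_b = bag_of_words("automobile test drive")

plain_overlap = overlap(doc_a, doc_b)
expanded_overlap = overlap(expand(doc_a, SYNONYMS), doc_b)
```

With the plain bags the two toy texts have no overlap; after expansion the synonym "automobile" links them.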

Vocabulary; Information retrieval; business.industry; Computer science; media_common.quotation_subject; Representation (systemics); computer.software_genre; Rule-based machine translation; Bag-of-words model; Artificial intelligence; business; computer; Natural language processing; Word (computer architecture); media_common

researchProduct

Part-of-Speech Induction by Singular Value Decomposition and Hierarchical Clustering

2006

Part-of-speech induction involves the automatic discovery of word classes and the assignment of each word of a vocabulary to one or several of these classes. The approach proposed here is based on the analysis of word distributions in a large collection of German newspaper texts. Its main advantage over other attempts is that it combines the hierarchical clustering of context vectors with a previous step of dimensionality reduction that minimizes the effects of sampling errors.

Vocabulary; K-SVD; Computer science; business.industry; media_common.quotation_subject; Dimensionality reduction; Correlation clustering; Pattern recognition; Context (language use); Hierarchical clustering; Singular value decomposition; Artificial intelligence; business; Word (computer architecture); media_common

researchProduct

Conceptual graph operations for formal visual reasoning in the medical domain

2014

Objective - Conceptual graphs (CGs) are used to represent clinical guidelines because they support visual reasoning with a logical background, making them a potentially valuable representation for guidelines. Materials and methods - Conceptual graph formalism has an essential and basic component: a formal vocabulary that drives all of the other mechanisms, notably specialization and projection. The graph's theoretical operations, such as projection, rules, derivation, constraints, probabilities and uncertainty, support diagrammatic reasoning. Results - A conceptual graph's graphical user interface includes a multilingual vocabulary management, some query and decision-m…

Vocabulary; Knowledge representation and reasoning; Computer science; media_common.quotation_subject; Biomedical Engineering; Biophysics; Heart failure; computer.software_genre; Visual reasoning; [INFO.INFO-IM]Computer Science [cs]/Medical Imaging; Clinical guidelines and protocols; Graphical user interface; media_common; Imagerie médicale; business.industry; Visual reasoning; Formal semantics; Diagrammatic reasoning; Conceptual graphs; Knowledge representation; Conceptual graph; Graph (abstract data type); Artificial intelligence; User interface; business; computer; Natural language processing

researchProduct

Impact of textual data augmentation on linguistic pattern extraction to improve the idiomaticity of extractive summaries

2021

The present work aims to develop a text summarisation system for financial texts with a focus on the fluidity of the target language. Linguistic analysis shows that the process of writing summaries should take into account not only terminological and collocational extraction, but also a range of linguistic material referred to here as the "support lexicon", that plays an important role in the cognitive organisation of the field. On this basis, this paper highlights the relevance of pre-training the CamemBERT model on a French financial dataset to extend its domain-specific vocabulary and fine-tuning it on extractive summarisation. We then evaluate the impact of textua…

Vocabulary; Process (engineering); Computer science; media_common.quotation_subject; Linguistic Patterns; Deep learning; 02 engineering and technology; Lexicon; Terminology; [SHS.LANGUE] Humanities and Social Sciences/Linguistics; Linguistics; Field (computer science); Focus (linguistics); Terminology; Text summarisation; Corpus linguistics; 0202 electrical engineering electronic engineering information engineering; Corpus Linguistics; 020201 artificial intelligence & image processing; Relevance (information retrieval); [SHS.LANGUE]Humanities and Social Sciences/Linguistics; media_common; Natural Language Processing

researchProduct

Adaptive Vocabulary Learning Environment for Late Talkers

2016

The main aim of this research is to provide children who have an early language delay with an adaptive way to train their vocabulary taking into account individuality of the learner. The suggested system is a mobile game-based learning environment which provides simple tasks where the learner chooses a picture that corresponds to a played back sound from multiple pictures presented on the screen. Our basic assumption is that the more similar the concepts (in our case, words) are, the harder the recognition task is. The system chooses the pictures to be presented on the screen by calculating the distances between the concepts in different dimensions. The distances are considered to consist o…

Vocabulary; SLI; Computer science; Process (engineering); media_common.quotation_subject; Speech recognition; computer.software_genre; 050105 experimental psychology; Task (project management); self adaptive learning; 03 medical and health sciences; 0302 clinical medicine; Simple (abstract algebra); Factor (programming language); Similarity (psychology); ta516; 0501 psychology and cognitive sciences; late talkers; media_common; computer.programming_language; ta113; business.industry; Learning environment; 05 social sciences; adaptive learning; vocabulary learning; game-based learning; Facilitation; Artificial intelligence; business; computer; 030217 neurology & neurosurgery; Natural language processing; Proceedings of the 8th International Conference on Computer Supported Education

researchProduct

Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts

1997

We perform a numerical study of the statistical properties of natural texts written in English and of two types of artificial texts. As statistical tools we use the conventional Zipf analysis of the distribution of words and the inverse Zipf analysis of the distribution of frequencies of words, the analysis of vocabulary growth, the Shannon entropy and a quantity which is a nonlinear function of frequencies of words, the frequency "entropy". Our numerical results, obtained by investigation of eight complete books and sixteen related artificial texts, suggest that, among these analyses, the analysis of vocabulary growth shows the most striking difference between natural and artificial texts…
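Several of the statistical tools named in this abstract have standard definitions that a short sketch can make concrete: the rank-frequency (Zipf) table, the inverse Zipf distribution (how many distinct words occur with each frequency), the vocabulary growth curve, and the Shannon entropy of the word distribution. The paper's frequency "entropy" is not defined in the visible text, so it is left out. The function names and the toy sentence are illustrative assumptions.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Zipf table: (rank, word, frequency), most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]


def inverse_zipf(tokens):
    """Map each frequency to the number of distinct words having it."""
    counts = Counter(tokens)
    return dict(sorted(Counter(counts.values()).items()))


def vocabulary_growth(tokens):
    """Number of distinct words seen after each successive token."""
    seen, growth = set(), []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical word distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy text; the study itself uses complete books and artificial texts.
tokens = "the cat saw the dog and the dog saw the cat".split()
zipf_table = rank_frequency(tokens)
freq_of_freqs = inverse_zipf(tokens)
growth_curve = vocabulary_growth(tokens)
entropy_bits = shannon_entropy(tokens)
```

Plotting `growth_curve` against token position for a natural and an artificial text would be one way to reproduce the kind of comparison the abstract describes.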

Vocabulary; Zipf's law; business.industry; Applied Mathematics; media_common.quotation_subject; Numerical analysis; Inverse; computer.software_genre; Word lists by frequency; Modeling and Simulation; Entropy (information theory); Geometry and Topology; Artificial intelligence; business; computer; Natural language processing; Natural language; Mathematics; media_common; Fractals

researchProduct

2014

A codebook is an effective image representation method. Built by clustering local image descriptors, a codebook has been shown to be a distinctive image feature and is widely applied in object classification. In almost all existing works on codebooks, the building of the visual vocabulary follows a basic routine, that is, extracting local image descriptors and clustering with a user-designated number of clusters. The problem with this routine is that building a codebook for each single dataset is not efficient. In order to deal with this problem, we investigate the influence of vocabulary sizes on classification performance and vocabulary universality with the kNN classifier. Experimental results in…
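The basic routine described in this abstract (cluster local descriptors into a user-designated number of visual words, then represent an image by its histogram over those words) can be sketched as follows. The abstract only says "clustering"; k-means with a naive first-k initialization is an assumption made here, and the 2-D points are hypothetical stand-ins for real image descriptors.

```python
def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )


def build_codebook(descriptors, k, iterations=10):
    """Cluster descriptors into k visual words with plain k-means.
    Initialization uses the first k descriptors (a simplifying assumption)."""
    centers = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode(descriptors, centers):
    """Histogram of visual-word assignments for one image's descriptors."""
    histogram = [0] * len(centers)
    for d in descriptors:
        histogram[nearest(d, centers)] += 1
    return histogram


# Hypothetical 2-D "descriptors" forming two well-separated groups.
training = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
codebook = build_codebook(training, k=2)
image_histogram = encode([(0, 0), (1, 1), (9, 9)], codebook)
```

The resulting histograms are the fixed-length vectors a classifier such as kNN would compare; `k` plays the role of the vocabulary size whose influence the paper investigates.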

Vocabulary; business.industry; Applied Mathematics; media_common.quotation_subject; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Visual descriptors; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Codebook; Pattern recognition; Knn classifier; Universality (dynamical systems); ComputingMethodologies_PATTERNRECOGNITION; Image representation; Artificial intelligence; Cluster analysis; business; Analysis; Mathematics; media_common; Abstract and Applied Analysis

researchProduct

A practical solution to the problem of automatic part-of-speech induction from text

2005

The problem of part-of-speech induction from text involves two aspects: Firstly, a set of word classes is to be derived automatically. Secondly, each word of a vocabulary is to be assigned to one or several of these word classes. In this paper we present a method that solves both problems with good accuracy. Our approach adopts a mixture of statistical methods that have been successfully applied in word sense induction. Its main advantage over previous attempts is that it reduces the syntactic space to only the most important dimensions, thereby almost eliminating the otherwise omnipresent problem of data sparseness.

Vocabulary; business.industry; Computer science; media_common.quotation_subject; Speech recognition; Space (commercial competition); Part of speech; computer.software_genre; Syntax; Set (abstract data type); Word-sense induction; Artificial intelligence; business; computer; Natural language processing; Word (computer architecture); media_common; Proceedings of the ACL 2005 on Interactive poster and demonstration sessions - ACL '05

researchProduct