0000000000223570

AUTHOR

Carmelo Spiccia

showing 3 related works from this author

A word prediction methodology for automatic sentence completion

2015

Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence, and we show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non neural network lang…
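The pipeline outlined in this abstract (a word-word frequency matrix, a low-rank LSA projection, and a score for each candidate word) can be illustrated with a small sketch. This is not the authors' implementation: the windowed right-context counting, the function names (build_cooccurrence, reduce_rank, score_candidate) and the mean-cosine score are all assumptions made for illustration, since the abstract does not specify the actual scoring function.

```python
import numpy as np

def build_cooccurrence(sentences, window=2):
    """Count how often word j occurs within `window` positions AFTER word i.

    Counting only the right-hand context makes the matrix asymmetric
    (an assumed construction, not necessarily the one used in the paper).
    """
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for pos, w in enumerate(s):
            for nxt in s[pos + 1 : pos + 1 + window]:
                counts[index[w], index[nxt]] += 1.0
    return counts, index

def reduce_rank(counts, k):
    """Truncated SVD: keep the k largest singular directions as word vectors."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]

def score_candidate(candidate, context, vectors, index):
    """Mean cosine similarity between a candidate and the known context words.

    Hypothetical stand-in for the scoring function mentioned in the abstract.
    """
    if not context:
        return 0.0
    c = vectors[index[candidate]]
    sims = []
    for w in context:
        v = vectors[index[w]]
        denom = np.linalg.norm(c) * np.linalg.norm(v)
        sims.append(float(c @ v / denom) if denom > 0 else 0.0)
    return sum(sims) / len(sims)

# Toy corpus, for illustration only.
corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "cat", "chases", "the", "dog"],
]
counts, index = build_cooccurrence(corpus, window=2)
vectors = reduce_rank(counts, k=3)
# Rank candidates for the gap in "the cat drinks ___".
ranking = sorted(
    ["milk", "water", "dog"],
    key=lambda w: score_candidate(w, ["the", "cat", "drinks"], vectors, index),
    reverse=True,
)
```

A square word-word matrix grows with the vocabulary rather than with the number of training documents, which is one plausible reading of the scalability claim in the abstract.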

business.industry; Latent semantic analysis; Computer science; Sentence completion; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Statistical semantics; Machine learning; computer.software_genre; Semantics; SemEval; Sentence completion tests; word space model; LSA; Scalability; language model; latent semantic analysis; Artificial intelligence; business; computer; Computer Science::Formal Languages and Automata Theory; Natural language processing; Sentence; Word (computer architecture); word prediction
Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)
researchProduct

Semantic Word Error Rate for Sentence Similarity

2016

Sentence similarity measures have applications in several tasks, including Machine Translation, Paraphrase Identification, Speech Recognition, Question Answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.

Machine translation; Computer science; Speech recognition; Word error rate; 02 engineering and technology; computer.software_genre; Paraphrase; 030507 speech-language pathology & audiology; 03 medical and health sciences; Semantic similarity; Artificial Intelligence; LSA; Word Error Rate; 0202 electrical engineering electronic engineering information engineering; sentence resemblance; Equivalence (formal languages); Latent Semantic Analysis; Semantic Word Error Rate; sentence similarity measure; SWER; business.industry; Latent semantic analysis; Sentence Similarity; Semantic Computing; Cognition; Automatic summarization; Computer Networks and Communications; word relatedness; 020201 artificial intelligence & image processing; Artificial intelligence; 0305 other medical science; business; computer; Natural language processing; WER; Information Systems
2016 IEEE Tenth International Conference on Semantic Computing (ICSC)
researchProduct

An Innovative Similarity Measure for Sentence Plagiarism Detection

2016

We propose and experimentally assess Semantic Word Error Rate (SWER), an innovative similarity measure for sentence plagiarism detection. SWER introduces a complex approach based on latent semantic analysis, which is capable of outperforming competing methods in plagiarism detection accuracy. We describe the principles and functionalities of SWER, and we complement our analytical contribution with a significant preliminary experimental analysis. The results are promising and confirm the soundness of our proposal.
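For orientation, standard Word Error Rate is the word-level edit distance between a hypothesis and a reference, divided by the reference length. The sketch below shows one way such a measure might be made "semantic": the substitution cost is reduced when two words are related. This is an illustrative assumption only and should not be read as the SWER formulation from the paper, which the abstract does not give; the names soft_wer, exact_match and toy_relatedness are hypothetical, and in practice the relatedness function could come from an LSA word space.

```python
def soft_wer(reference, hypothesis, relatedness):
    """Word-level edit distance divided by len(reference).

    Substituting word a with word b costs 1 - relatedness(a, b), where
    relatedness returns a value in [0, 1]. Insertions and deletions cost 1.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must not be empty")
    # dist[i][j] = cheapest alignment of reference[:i] with hypothesis[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = float(i)  # delete every reference word
    for j in range(1, m + 1):
        dist[0][j] = float(j)  # insert every hypothesis word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 1.0 - relatedness(reference[i - 1], hypothesis[j - 1])
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,      # deletion
                dist[i][j - 1] + 1.0,      # insertion
                dist[i - 1][j - 1] + sub,  # (soft) substitution or match
            )
    return dist[n][m] / n

def exact_match(a, b):
    """With this relatedness, soft_wer reduces to the standard WER."""
    return 1.0 if a == b else 0.0

def toy_relatedness(a, b):
    """Hand-made values for illustration; not derived from any real model."""
    if a == b:
        return 1.0
    return 0.5 if {a, b} == {"cat", "dog"} else 0.0

ref = ["the", "cat", "sat"]
hyp = ["the", "dog", "sat"]
plain = soft_wer(ref, hyp, exact_match)     # one full substitution
soft = soft_wer(ref, hyp, toy_relatedness)  # the same substitution, half cost
```

Passing exact_match recovers plain WER, so the two scores can be compared directly on the same sentence pair.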

business.industry; Computer science; Latent semantic analysis; Plagiarism Detection; Computer Science (all); Sentence similarity measure; Word error rate; 02 engineering and technology; Similarity measure; computer.software_genre; Complement (complexity); Theoretical Computer Science; Plagiarism detection; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; Sentence Similarity Measure; business; computer; Natural language processing; Sentence
researchProduct