
AUTHOR

Aleksander Stensby

showing 2 related works from this author

THE USE OF WEAK ESTIMATORS TO ACHIEVE LANGUAGE DETECTION AND TRACKING IN MULTILINGUAL DOCUMENTS

2013

This paper deals with the problems of language detection and tracking in multilingual online short word-of-mouth (WoM) discussions. This problem is particularly unusual and difficult from a pattern recognition perspective because, in these discussions, the participants and content involve the opinions of users from all over the world. The nature of these discussions, consisting of multiple topics in different languages, presents us with a problem of finding training and classification strategies when the class-conditional distributions are nonstationary. The difficulties in solving the problem are many-fold. First of all, the analyst has no knowledge of when one language stops and when the…

Language identification; business.industry; Computer science; Perspective (graphical); Estimator; computer.software_genre; Artificial Intelligence; Pattern recognition (psychology); Computer Vision and Pattern Recognition; Tracking (education); Artificial intelligence; business; computer; Software; Natural language processing

International Journal of Pattern Recognition and Artificial Intelligence

Language Detection and Tracking in Multilingual Documents Using Weak Estimators

2010

Published version of an article from the book: Structural, Syntactic, and Statistical Pattern Recognition. The original publication is available at SpringerLink. http://dx.doi.org/10.1007/978-3-642-14980-1_59 This paper deals with the extremely complicated problem of language detection and tracking in real-life electronic (for example, in Word-of-Mouth (WoM)) applications, where various segments of the text are written in different languages. The difficulties in solving the problem are many-fold. First of all, the analyst has no knowledge of when one language stops and when the next starts. Further, the features which one uses for any one language (for example, the n-grams) will not be…

Language identification; Computer science; business.industry; 05 social sciences; Estimator; 02 engineering and technology; Variety (linguistics); computer.software_genre; 0502 economics and business; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; Tracking (education); business; Estimation methods; computer; 050203 business & management; Natural language processing