RESEARCH PRODUCT
Automatic Dictionary Creation by Sub-symbolic Encoding of Words
Salvatore Gaglio, Giovanni Pilato, Filippo Vella, Ignazio Motisi
subject
Text corpus, Correctness, Probabilistic latent semantic analysis, Computer science, Latent semantic analysis, business.industry, Context (language use), Translation (geometry), computer.software_genre, Feature (linguistics), Artificial intelligence, business, Representation (mathematics), computer, Natural language processing
description
This paper describes a technique for the automatic creation of dictionaries using a sub-symbolic representation of words in a cross-language context. Semantic relationships among the words of two languages are extracted from aligned bilingual text corpora. This is done by applying Latent Semantic Analysis to the matrices that represent term co-occurrences in aligned text fragments. The technique finds the “best translation” of a word according to a suitably defined geometric distance in an automatically created semantic space. Experiments show a correctness of 95% in the best case.
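The pipeline outlined in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English/Italian corpus, the function names (`build_matrix`, `term_vectors`, `best_translation`), the raw-count weighting, and the use of cosine similarity as the geometric measure are all assumptions made for illustration. The value `k=3` equals the rank of this tiny matrix, so nothing is lost here; on a real corpus `k` would be far smaller than the number of fragments.

```python
import numpy as np

# Hypothetical toy corpus: each pair is one aligned (English, Italian) fragment.
ALIGNED = [
    ("the dog eats", "il cane mangia"),
    ("the cat eats", "il gatto mangia"),
    ("the dog sleeps", "il cane dorme"),
    ("the cat sleeps", "il gatto dorme"),
]


def build_matrix(pairs):
    """Return (terms, A), where A[i, j] counts term i in aligned fragment j.

    Terms of both languages share the same columns, so words that tend to
    occur in the same aligned fragments get similar rows.
    """
    terms = sorted(
        {("en", w) for en, _ in pairs for w in en.split()}
        | {("it", w) for _, it in pairs for w in it.split()}
    )
    index = {t: i for i, t in enumerate(terms)}
    A = np.zeros((len(terms), len(pairs)))
    for j, (en, it) in enumerate(pairs):
        for w in en.split():
            A[index[("en", w)], j] += 1
        for w in it.split():
            A[index[("it", w)], j] += 1
    return terms, A


def term_vectors(A, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k]  # one row per term


def best_translation(word, terms, vectors, source="en", target="it"):
    """Return the target-language term with the highest cosine similarity."""
    v = vectors[terms.index((source, word))]
    best_word, best_sim = None, -np.inf
    for (lang, w), u in zip(terms, vectors):
        if lang != target:
            continue
        sim = float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))
        if sim > best_sim:
            best_word, best_sim = w, sim
    return best_word


terms, A = build_matrix(ALIGNED)
vectors = term_vectors(A, k=3)
for en_word in ["dog", "cat", "eats", "sleeps", "the"]:
    print(en_word, "->", best_translation(en_word, terms, vectors))
```

In this toy setting each English word has exactly the same occurrence pattern as its Italian counterpart, so the pair coincides in the latent space. Real aligned corpora are noisier, which is where the choice of `k`, the term weighting, and the distance measure start to matter.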
year | journal | country | edition | language |
---|---|---|---|---|
2006-01-01 | | | | |