RESEARCH PRODUCT
Discovering the Senses of an Ambiguous Word by Clustering its Local Contexts
Reinhard Rapp
subject
Text corpus; business.industry; Computer science; Context (language use); computer.software_genre; Word sense; Word-sense induction; Artificial intelligence; business; Cluster analysis; computer; Natural language processing; Word (computer architecture); Strengths and weaknesses
description
As has been shown recently, it is possible to automatically discover the senses of an ambiguous word by statistically analyzing its contextual behavior in a large text corpus. However, this kind of research is still at an early stage. The results need to be improved and there is considerable disagreement on methodological issues. For example, although most researchers use clustering approaches for word sense induction, it is not clear what statistical features the clustering should be based on. Whereas so far most researchers cluster global co-occurrence vectors that reflect the overall behavior of a word in a corpus, in this paper we argue that it is more appropriate to use local context vectors. We support our view by comparing both approaches and by discussing their strengths and weaknesses.
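To make the distinction concrete, the following is a minimal Python sketch of the local-context idea, not the method used in the paper. The toy corpus, the stop-word list, the window size, the similarity threshold, and the greedy single-pass clustering are all assumptions chosen for illustration. Each occurrence of the ambiguous word gets its own context vector, and the occurrences are then grouped by cosine similarity. For contrast, a single global vector is built by summing all local vectors, which merges the contexts of every sense into one representation.

```python
from collections import Counter
import math

# Hypothetical toy corpus and stop-word list, for illustration only.
CORPUS = [
    "the palm tree grew near the beach",
    "he held the coin in the palm of his hand",
    "a tall palm tree swayed on the beach",
    "she read the lines on the palm of her hand",
]
STOPWORDS = {"the", "a", "in", "of", "on", "his", "her", "he", "she"}


def local_context_vectors(sentences, target, window=3):
    """Return one bag-of-words Counter per occurrence of `target`."""
    vectors = []
    for sentence in sentences:
        tokens = [t for t in sentence.lower().split() if t not in STOPWORDS]
        for i, tok in enumerate(tokens):
            if tok == target:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                vectors.append(Counter(left + right))
    return vectors


def global_vector(vectors):
    """Sum all local vectors into one vector for the word as a whole."""
    return sum(vectors, Counter())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, threshold=0.2):
    """Greedy single-pass clustering (an assumed stand-in algorithm).

    Each vector joins the most similar existing centroid if the
    similarity reaches `threshold`; otherwise it starts a new cluster.
    """
    centroids, labels = [], []
    for v in vectors:
        sims = [cosine(v, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best].update(v)  # add counts to the centroid
            labels.append(best)
        else:
            centroids.append(Counter(v))
            labels.append(len(centroids) - 1)
    return labels


if __name__ == "__main__":
    vectors = local_context_vectors(CORPUS, "palm")
    print(cluster(vectors))        # one label per occurrence of "palm"
    print(global_vector(vectors))  # a single vector mixing both senses
```

On this toy corpus the tree-related and hand-related occurrences end up in separate clusters, whereas the global vector contains words such as "tree" and "hand" side by side and offers no per-occurrence information to cluster on. A realistic setup would need a large corpus, association weighting, and a more robust clustering algorithm.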
year | journal | country | edition | language |
---|---|---|---|---|
2005-10-17 | | | | |