AUTHOR

M.d. López De Luise

A Metric for Automatic Word Categorization

This paper presents a metric to be used by the working prototype WIH (Web Intelligent Handler). This metric (referred to here as po) is designed to reflect main topic words and to discriminate certain text profiles through word weightings. The current version is designed only for Spanish web texts. Statistical analyses show that it is possible to differentiate text profiles based on the behavior of po. A poll is also presented, showing that po is a good discriminator of main words. This paper is posted here as a new algorithm useful for Spanish text processing.

research product

Ambiguity and Contradiction From a Morpho-Syntactic Prototype Perspective

In this paper, the problems of contradiction and ambiguity in Natural Language Processing are briefly introduced. We also present the morpho-syntactic WIH (Web Intelligent Handler) prototype and the overall approach it takes to process any Spanish text. Finally, we analyze how it processes Spanish sentences containing contradictions or ambiguities from its own perspective, regardless of deeper linguistic considerations.

research product

Improved Induction Tree Training for Automatic Lexical Categorization

This paper studies a tuned version of an induction tree used to detect the lexical category of words automatically. The database used to train the tree has several fields that describe Spanish words morpho-syntactically. All processing is performed using only the information in the word and the sentence in which it occurs. We show that this kind of induction is good enough to perform the linguistic categorization.

research product

Non-Technological Aspects on Web Searching Success

This paper studies the influence of the social, cultural and emotional background of typical Web users on the Web searching process. Several variables describing these aspects are represented and statistically analyzed with well-known clustering and classification algorithms, such as COBWEB, J48, Bayes classification, and correspondence analysis. Results indicate that the efficiency of the complete Information Retrieval process will not be fully understood without considering subjectivity and personality factors.

research product

Automatically Modeling Linguistic Categories in Spanish

This paper presents an approach to processing Spanish linguistic categories automatically. The approach is based on a module of a prototype named WIH (Word Intelligent Handler), which is a project to develop a conversational bot. It essentially learns the sequence in which categories are used in a sentence. It extracts a weighting metric to discriminate the most common structures in real dialogs. Such a metric is important for defining the preferred organization to be used by the robot to build an answer.

research product