Search results for "Natural language"
showing 10 items of 650 documents
On Mathematical Language: Characteristics, Semiosis and Indispensability
2021
Mathematicians and others often discuss mathematics as a universal language, and say that mathematics holds a special status among sciences. In particular, it is the language of science. In some way, it is the basis of the physical world, but globally it is beyond any other science, and it is not a mere servant of sciences. Apparently, mathematical language is simple, with little grammar and a limited vocabulary, but very different from others. Unlike natural languages, it is a rigorously defined and unambiguous one. This characteristic constitutes its greatest advantage: its complete lack of ambiguity. Although it is limited in the range of things that it can express, it can be adapted to t…
Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts
1997
We perform a numerical study of the statistical properties of natural texts written in English and of two types of artificial texts. As statistical tools we use the conventional Zipf analysis of the distribution of words and the inverse Zipf analysis of the distribution of frequencies of words, the analysis of vocabulary growth, the Shannon entropy and a quantity which is a nonlinear function of frequencies of words, the frequency "entropy". Our numerical results, obtained by investigation of eight complete books and sixteen related artificial texts, suggest that, among these analyses, the analysis of vocabulary growth shows the most striking difference between natural and artificial texts…
A practical solution to the problem of automatic part-of-speech induction from text
2005
The problem of part-of-speech induction from text involves two aspects: Firstly, a set of word classes is to be derived automatically. Secondly, each word of a vocabulary is to be assigned to one or several of these word classes. In this paper we present a method that solves both problems with good accuracy. Our approach adopts a mixture of statistical methods that have been successfully applied in word sense induction. Its main advantage over previous attempts is that it reduces the syntactic space to only the most important dimensions, thereby almost eliminating the otherwise omnipresent problem of data sparseness.
Ontology languages for the semantic web: A never completely updated review
2006
This paper gives a never completely updated account of approaches that have been used by the research community for representing knowledge. After underlining the importance of a layered approach and the use of standards, it starts with early efforts used by artificial intelligence researchers. Then recent approaches, aimed mainly at the semantic web, are described. Coding examples from the literature are presented in both sections. Finally, the semantic web ontology creation process, as we envision it, is introduced.
Natural Language Processing Agents and Document Clustering in Knowledge Management
2008
While HTML provides the Web with a standard format for information presentation, XML has been made a standard for information structuring on the Web. The mission of the Semantic Web now is to provide meaning to the Web. Apart from building on the existing Web technologies, we need other tools from other areas of science to do that. This chapter shows how natural language processing methods and technologies, together with ontologies and a neural algorithm, can be used to help in the task of adding meaning to the Web, thus making the Web a better platform for knowledge management in general.
Within and between variations of texts elicited from nine wine experts
2006
Nine wine experts tasted in replicate six Chardonnay wines that had been aged in oak barrels from different forests and/or species. They freely gave their descriptions in writing; the only instruction given was to underline three words or expressions that best characterized each tasted wine. The texts were submitted to an objective lexical analysis that quantified the important variation among the experts. In addition a matching task was performed by 117 assessors in which each assessor received from each expert six white cards and six yellow cards representing the descriptions of the six white wines and six red wines. The assessors were incapable of matching the descriptions for the same e…
An Extension of the VSM Documents Representation using Word Embedding
2017
In this paper, we present experiments that try to integrate the power of Word Embedding representation into real problems of document classification. Word Embedding is a recent trend in the natural language processing domain that tries to represent each word from the document in a vector format. This representation embeds the semantic context in which the word occurs most frequently. We include this new representation in a classical VSM document representation and evaluate it using a learning algorithm based on the Support Vector Machine. This newly added information makes the classification more difficult because it increases the learning time and the memory neede…
Interpretability in Word Sense Disambiguation using Tsetlin Machine
2021
SisHiTra : A Hybrid Machine Translation System from Spanish to Catalan
2004
In the current European scenario, characterized by the coexistence of communities writing and speaking a great variety of languages, machine translation has become a technology of capital importance. In areas of Spain and of other countries, the co-official status of several languages implies producing several versions of public information. Machine translation between all the languages of the Iberian Peninsula and from them into English will allow for a better integration of Iberian linguistic communities with one another and within Europe. The purpose of this paper is to show a machine translation system from Spanish to Catalan that deals with text input. In our approach, both deductive (linguistic) and…
Experiments in Non-Coherent Post-editing
2017
Market pressure on translation productivity joined with technological innovation is likely to fragment and decontextualise translation jobs even more than is currently the case. Many different translators increasingly work on one document at different places, collaboratively working in the cloud. This paper investigates the effect of decontextualised source texts on behaviour by comparing post-editing of sequentially ordered sentences with shuffled sentences from two different texts. The findings suggest that there is little or no effect of the decontextualised source texts on behaviour.