
AUTHOR

Alexander Mehler

BIOfid dataset: publishing a German gold standard for named entity recognition in historical biodiversity literature

The Specialized Information Service Biodiversity Research (BIOfid) has been launched to mobilize valuable biological data from printed literature that has been hidden in German libraries for the past 250 years. In this project, we annotate German texts converted by OCR from historical scientific literature on the biodiversity of plants, birds, moths and butterflies. Our work enables the automatic extraction of biological information previously buried in the mass of papers and volumes. For this purpose, we generated training data for the tasks of Named Entity Recognition (NER) and Taxa Recognition (TR) in biological documents. We use this data to train a number of leading machine learning tools and c…

research product

Integrating Computational Linguistic Analysis of Multilingual Learning Data and Educational Measurement Approaches to Explore Learning in Higher Education

This chapter develops a computational linguistic model for analyzing and comparing multilingual data and applies it to a large body of standardized assessment data from higher education. The approach employs both automatic and manual annotation of the data on several linguistic layers (including parts of speech, text structure and content). The chapter explores quantitative features of the textual data that relate to both the students’ test results (domain-specific knowledge) and their level of academic experience. The respective analysis involves statistics of distance correlation, text categorization with respect to text types (questions and response options) as well as lang…

research product

Computational linguistic assessment of textbook and online learning media by means of threshold concepts in business education

Threshold concepts are key terms in domain-based knowledge acquisition. They are regarded as building blocks of the conceptual development of domain knowledge within particular learners. From a linguistic perspective, however, threshold concepts are instances of specialized vocabularies, exhibiting particular linguistic features. Threshold concepts are typically used in specialized texts such as textbooks, that is, within a formal learning environment. However, they also occur in informal learning environments like newspapers. In this article, a first approach is taken to combine both lines into an overarching research program, that is, to provide a computational linguistic assessment of…

research product

Positive Learning in the Internet Age: Developments and Perspectives in the PLATO Program

The Internet has become the main informational entity, i.e., a public source of information. The Internet offers many new benefits and opportunities for human learning, teaching, and research. However, by providing a vast amount of information from innumerable sources, it also enables the manipulation of information; there are countless examples of disseminated misinformation and false data in mass and social media. Much of the information presented online is conflicting, preselected, or algorithmically obscure, often colliding with fundamental humanistic values and posing moral or ethical problems.

research product