AUTHOR

Sajawel Ahmed

BIOfid dataset: publishing a German gold standard for named entity recognition in historical biodiversity literature

2019

The Specialized Information Service Biodiversity Research (BIOfid) has been launched to mobilize valuable biological data from printed literature that has been hidden in German libraries over the past 250 years. In this project, we annotate German texts converted by OCR from historical scientific literature on the biodiversity of plants, birds, moths and butterflies. Our work enables the automatic extraction of biological information previously buried in the mass of papers and volumes. For this purpose, we generated training data for the tasks of Named Entity Recognition (NER) and Taxa Recognition (TR) in biological documents. We use this data to train a number of leading machine learning tools and c…

Biological data; Service (systems architecture); Information retrieval; business.industry; Computer science; 02 engineering and technology; Scientific literature; 010501 environmental sciences; computer.software_genre; 01 natural sciences; language.human_language; Field (computer science); German; Information extraction; Named-entity recognition; Publishing; ddc:020; ddc:570; 0202 electrical engineering electronic engineering information engineering; language; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; 0105 earth and related environmental sciences