Search results for "annotointi"
showing 9 items of 9 documents
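Oppijankieliaineistojen annotointi – esimerkkinä ICLFI:n annotoinnin prosessit, ongelmat ja ratkaisut [Annotating learner language data: the processes, problems and solutions of ICLFI annotation as an example]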
2014
This article illustrates the grammatical and error annotation of learner language with the help of the International Corpus of Learner Finnish (ICLFI). In particular, we focus on issues that arise when a morphologically rich language is handled with at least semi-automatic methods. What makes this corpus special compared to, for example, English-language material is the frequent variation in different forms and the related errors, both due to the rich morphology of the target language. This article begins with a description of the design and implementation process of both the grammatical and error annotation, followed by a brief introduction to the material for which the annotations were desig…
Creating Corpora of Finland’s Sign Languages
2016
This paper discusses the process of creating corpora of the sign languages used in Finland, Finnish Sign Language (FinSL) and Finland-Swedish Sign Language (FinSSL). It describes the process of getting informants and data, editing and storing the data, the general principles of annotation, and the creation of a web-based lexical database, the FinSL Signbank, developed on the basis of the NGT Signbank, which is a branch of the Auslan Signbank. The corpus project of Finland’s Sign Languages (CFINSL) started in 2014 at the Sign Language Centre of the University of Jyväskylä. Its aim is to collect conversations and narrations from 80 FinSL users and 20 FinSSL users who are living in different p…
The Usability of the Annotation
2016
Several corpus projects for sign languages have tried to establish conventions and standards for the annotation of signed data. When discussing corpora, it is necessary to develop a way of considering and evaluating holistically the features and problems of annotation. This paper aims to develop a conceptual framework for the evaluation of the usability of annotations. The purpose of the framework is not to give conventions for annotating but to offer tools for the evaluation of the usability of the annotation, in order to make annotations more usable and make it possible to justify and explain decisions about annotation conventions. Based on our experience of annotation in the corpus proje…
BRIMA : Low-Overhead Browser-Only Image Annotation Tool
2021
Image annotation and large annotated datasets are crucial parts of the Computer Vision and Artificial Intelligence fields. At the same time, it is well known and acknowledged by the research community that the image annotation process is challenging, time-consuming and hard to scale. Therefore, researchers and practitioners are always seeking ways to perform annotations more easily, faster, and at higher quality. Even though several widely used tools exist and the tool landscape has evolved considerably, most of the tools still require intricate technical setups and high levels of technical savviness from their operators and crowdsource contributors. In order to address such challenges, w…
Music mood annotation using semantic computing and machine learning
2015
Suomen viittomakielten korpusta rakentamassa [Building the Corpus of Finland's Sign Languages]
2019
The building of sign language corpora has increased significantly in the 2000s: the first corpus projects were launched in the early 2000s in Australia and the Netherlands, after which large, machine-readable data collections began to be built in several European countries in the 2010s. This article examines the creation of the corpus of Finland's sign languages, Finnish Sign Language and Finland-Swedish Sign Language. The article presents the stages of building the corpus: the collection, processing, annotation, long-term preservation and publication of the data, including the related data protection issues. In addition, the article describes how the corpus data has been used and can be utilized in sign language research and teach…
Annotated Video Corpus of FinSL with Kinect and Computer-Vision Data
2016
This paper presents an annotated video corpus of Finnish Sign Language (FinSL) to which has been appended Kinect and computer-vision data. The video material consists of signed retellings of the stories Snowman and Frog, where are you?, elicited from 12 native FinSL signers in a dialogue setting. The recordings were carried out with 6 cameras directed toward the signers from different angles, and 6 signers were also recorded with one Kinect motion and depth sensing input device. All the material has been annotated in ELAN for signs, translations, grammar and prosody. To further facilitate research into FinSL prosody, computer-vision data describing the head movements and the aperture change…
Semantic annotation and big data techniques for patent information processing
2017
This thesis analyzes approaches to generating semantic annotations on patent records, as well as on other structured data, by relying on the structure and semantic representation of documents. Information in patent records reflects how real-world technologies evolve, and the approximately 3 million annual new patent applications capture the global inventive frontier. The volume of this information is too large to be analyzed effectively by human effort alone, necessitating Big data approaches that analyze it with computer-aided tools and techniques. Big data is a term that describes a volume of structured, semi-structured and unstructured data so massive that it is d…
Building the Corpus of Finland-Swedish Sign Language : Acknowledging the Language History and Future Revitalization
2022
This paper presents the first steps in the process of creating a multimedia corpus for the severely endangered Finland-Swedish Sign Language (FinSSL). In the paper, we will first outline the history and current situation of FinSSL and then move on to describe some of the foundational choices which we have made both in the earlier data collection and at the start of the currently ongoing annotation work. Finally, we will bring up challenges related to the corpus data processing and discuss the future uses of the corpus, especially from the point of view of the FinSSL revitalization process.