Search results for "annotointi"

showing 9 items of 9 documents

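Oppijankieliaineistojen annotointi – esimerkkinä ICLFI:n annotoinnin prosessit, ongelmat ja ratkaisut [Annotating learner language data: the processes, problems and solutions of ICLFI annotation as an example]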

2014

This article illustrates the grammatical and error annotation of learner language with the help of the International Corpus of Learner Finnish (ICLFI). In particular, we focus on issues that arise when a morphologically rich language is handled with at least semi-automatic methods. What makes this corpus special compared to, for example, English-language material is the frequent variation in different forms and the related errors, both due to the rich morphology of the target language. The article begins with a description of the design and implementation process of both the grammatical and the error annotation, followed by a brief introduction to the material for which the annotations were desig…

learner language corpora, annotation, annotointi, Artikkelit, corpus study, error annotation
researchProduct

Creating Corpora of Finland’s Sign Languages

2016

This paper discusses the process of creating corpora of the sign languages used in Finland, Finnish Sign Language (FinSL) and Finland-Swedish Sign Language (FinSSL). It describes the process of getting informants and data, editing and storing the data, the general principles of annotation, and the creation of a web-based lexical database, the FinSL Signbank, developed on the basis of the NGT Signbank, which is a branch of the Auslan Signbank. The corpus project of Finland’s Sign Languages (CFINSL) started in 2014 at the Sign Language Centre of the University of Jyväskylä. Its aim is to collect conversations and narrations from 80 FinSL users and 20 FinSSL users who are living in different p…

Signbank, suomenruotsalainen viittomakieli, metadata, suomalainen viittomakieli, annotointi, sign language corpus
researchProduct

The Usability of the Annotation

2016

Several corpus projects for sign languages have tried to establish conventions and standards for the annotation of signed data. When discussing corpora, it is necessary to develop a way of considering and evaluating holistically the features and problems of annotation. This paper aims to develop a conceptual framework for the evaluation of the usability of annotations. The purpose of the framework is not to give conventions for annotating but to offer tools for the evaluation of the usability of the annotation, in order to make annotations more usable and make it possible to justify and explain decisions about annotation conventions. Based on our experience of annotation in the corpus proje…

signed language, käytettävyys, framework, annotointi, corpus, arviointi
researchProduct

BRIMA : Low-Overhead Browser-Only Image Annotation Tool

2021

Image annotation and large annotated datasets are crucial in the fields of Computer Vision and Artificial Intelligence. At the same time, the research community widely acknowledges that the image annotation process is challenging, time-consuming and hard to scale. Researchers and practitioners are therefore always seeking ways to annotate more easily, faster, and at higher quality. Even though several widely used tools exist and the tool landscape has evolved considerably, most tools still require intricate technical setups and a high level of technical savviness from their operators and crowdsource contributors. In order to address such challenges, w…

COCO, hahmontunnistus (tietotekniikka), selaimet, joukkoistaminen, image dataset generation, crowdsource annotation, annotointi, annotation tool, konenäkö, image annotation, kuvat
researchProduct

Music mood annotation using semantic computing and machine learning

2015

mallintaminen, tägit, music emotion recognition, verkkoyhteisöt, musiikki, annotointi, sosiaalinen media, music mood annotation, feature selection, koneoppiminen, tunteet, editorial tags, semantic computing, audio feature extraction, digitaalinen musiikki, genre-adaptive, social tags, laskentamenetelmät, circumplex model
researchProduct

Suomen viittomakielten korpusta rakentamassa [Building the corpus of Finland's sign languages]

2019

The construction of sign language corpora has increased significantly in the 2000s: the first corpus projects were launched in the early 2000s in Australia and the Netherlands, after which large, machine-readable data collections began to be built in several European countries in the 2010s. This article examines the creation of the corpus of Finland's sign languages, Finnish Sign Language and Finland-Swedish Sign Language. The article presents the stages of building the corpus: the collection, processing, annotation, long-term preservation and publication of the data, including the related data protection issues. In addition, the article describes how the corpus data has been used and can be utilized in sign language research and teaching…

kuvatallenteet, tietosuoja, viittomakieli, suomenruotsalainen viittomakieli, metadata, annotointi, suomalainen viittomakieli, korpukset, tutkimusaineisto
researchProduct

Annotated Video Corpus of FinSL with Kinect and Computer-Vision Data

2016

This paper presents an annotated video corpus of Finnish Sign Language (FinSL) to which has been appended Kinect and computer-vision data. The video material consists of signed retellings of the stories Snowman and Frog, where are you?, elicited from 12 native FinSL signers in a dialogue setting. The recordings were carried out with 6 cameras directed toward the signers from different angles, and 6 signers were also recorded with one Kinect motion and depth sensing input device. All the material has been annotated in ELAN for signs, translations, grammar and prosody. To further facilitate research into FinSL prosody, computer-vision data describing the head movements and the aperture change…

kielioppi, suomalainen viittomakieli, annotointi, corpus, Kinect [prosody], computer-vision
researchProduct

Semantic annotation and big data techniques for patent information processing

2017

This thesis analyzes approaches to generating semantic annotations on patent records, as well as on other structured data, by relying on the structure and semantic representation of documents. Information in patent records reflects how real-world technologies evolve, and the approximately 3 million new patent applications filed annually capture the global inventive frontier. The volume of this information is too large to be analyzed effectively by human effort alone, necessitating big data approaches that use computer-aided tools and techniques. Big data is a term that describes a massive volume of structured, semi-structured and unstructured data that is so large that it is d…

Semantic annotation, Patent information, big data, semanttinen annotointi, annotointi, Data Mining, patentit, tiedonlouhinta
researchProduct

Building the Corpus of Finland-Swedish Sign Language : Acknowledging the Language History and Future Revitalization

2022

This paper presents the first steps in the process of creating a multimedia corpus for the severely endangered Finland-Swedish Sign Language (FinSSL). In the paper, we will first outline the history and current situation of FinSSL and then move on to describe some of the foundational choices which we have made both in the earlier data collection and at the start of the currently ongoing annotation work. Finally, we will bring up challenges related to the corpus data processing and discuss the future uses of the corpus, especially from the point of view of the FinSSL revitalization process. peerReviewed

multimedia, viittomakieli, suomenruotsalainen viittomakieli, annotointi, kielen elvytys, korpukset, kielihistoria
researchProduct