Search results for "louhinta"
showing 10 items of 93 documents
Comparison of Internal Clustering Validation Indices for Prototype-Based Clustering
2017
Clustering is an unsupervised machine learning and pattern recognition method. In general, in addition to revealing hidden groups of similar observations and clusters, their number needs to be determined. Internal clustering validation indices estimate this number without any external information. The purpose of this article is to evaluate, empirically, characteristics of a representative set of internal clustering validation indices with many datasets. The prototype-based clustering framework includes multiple, classical and robust, statistical estimates of cluster location so that the overall setting of the paper is novel. General observations on the quality of validation indices and on t…
Data Analytics in Healthcare: A Tertiary Study
2022
The field of healthcare has seen a rapid increase in the applications of data analytics during the last decades. By utilizing different data analytic solutions, healthcare areas such as medical image analysis, disease recognition, outbreak monitoring, and clinical decision support have been automated to various degrees. Consequently, the intersection of healthcare and data analytics has received scientific attention to the point of numerous secondary studies. We analyze studies on healthcare data analytics, and provide a wide overview of the subject. This is a tertiary study, i.e., a systematic review of systematic reviews. We identified 45 systematic secondary studies on data analy…
Talent identification in soccer using a one-class support vector machine
2019
Identifying potential future elite athletes is important in many sporting events. The successful identification of potential future elite athletes at an early age would help to provide high-quality coaching and training environments in which to optimize their development. However, a large variety of different skills and qualities are needed to succeed in elite sports, making talent identification generally a complex and multifaceted problem. Due to the rarity of elite athletes, datasets are inherently imbalanced, making classical statistical inference difficult. Therefore, we approach talent identification as an anomaly detection problem. We trained a nonlinear one-class support ve…
Reconsidering authorship in the Ciceronian corpus through computational authorship attribution
2019
In recent years, methods of computational authorship attribution have offered promising results for the reattribution of classical texts. We use and further develop these methods to verify the authorship of several texts belonging or related to the Ciceronian corpus: Rhetorica ad C. Herennium, De inventione, De optimo genere oratorum, and Commentariolum petitionis. We use two classifiers, Support Vector Machine and Convolutional Neural Network, of which the latter is more accurate except in regard to certain aspects of vocabulary. The most important of our results is that Commentariolum petitionis seems to be authored by Marcus Cicero, not by his brother Quintus. …
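Haavoittuvuuden kudelmat : digitaalinen subjekti ja haavoittuvuus datavetoista yhteiskuntaa käsittelevässä tutkimuskirjallisuudessa [The weaves of vulnerability: the digital subject and vulnerability in research literature on the data-driven society]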
2021
The article examines what meanings have been given to vulnerability in research literature on the data-driven society and the digital subject. It is based on a literature review of scholarly publications from 2015–2020 that address vulnerability in the context of datafication. The literature searches were carried out in the key databases and digital libraries of the social sciences. Based on the searches, the research literature was organized into four thematic groups: 1) vulnerabilities produced by data surveillance, 2) data as a way of knowing and as participation, 3) the categorization of digital subjects and the regulation of visibility, and…
The poor man’s goldmine? : Career paths in Swedish and Finnish merchant shipping, c. 1840–1950
2017
This article analyses the career paths of Swedish and Finnish sailors from the mid-19th to the mid-20th century. The article shows that, for most of the men, the seaman’s occupation was just a passing phase before taking up a job on shore, but many of them also created a long-lasting and advancing career by going to sea. There was not necessarily, however, a clear distinction between job opportunities at sea and those on shore in those days: men worked both at sea and on shore. We therefore argue that an individual’s advancement in a maritime career was a context-specific socio-economic phenomenon. In Scandinavia, work on board ships was dependent on features that characterized the divis…
Evolving Conceptualisations of Internationalism in the UK Parliament : Collocation Analyses from the League to Brexit
2020
This chapter explores a historical distant reading strategy of British Parliamentary discourse. It uses historical collocation analyses of ‘internationalism’ and the ‘international’ in the British Hansard Corpus and a selection of Commons and Lords debates concerning British membership in international organisations as it relates to the League of Nations, United Nations, Council of Europe, EEC and Brexit. The collocates that were deemed to be politically significant are grouped in 13 loose semantic fields. This macro-level analysis of long-term trends of discourse is supplemented with an analysis of the said key debates in their historical contexts, including comparisons between the two Hou…
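Maailma muuttui - muuttuiko ekologinen tutkimus? : ekologia-aiheiset artikkelit Nature- ja Science-lehdissä 1981-2000 [The world changed; did ecological research? Ecology-themed articles in Nature and Science, 1981-2000]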
2008
Kansainvälisten koulutusarvioiden vertailu koulutuksellisen tiedonlouhinnan keinoin [Comparison of international educational assessments by means of educational data mining]
2016
Research results that measure education and the academic performance of children of different ages are of interest to practitioners and researchers in many fields. Today many organizations, such as the OECD (Organisation for Economic Co-operation and Development) and the IEA (International Association for the Evaluation of Educational Achievement), conduct international assessments at regular intervals that measure the academic abilities of children of a given age and ask about their lives at school and at home. The databases resulting from these assessments are large and offer varied information on education and on the factors that affect children's learning. In addition to all this, these databases…
Semantic annotation and big data techniques for patent information processing
2017
This thesis analyzes approaches to generate semantic annotations on patent records, as well as on other structured data, by relying on the structure and semantic representation of documents. Information in patent records reflects how real-world technologies evolve, and the approximately 3 million annual new patent applications capture the global inventive frontier. The volume of this information is too big to be effectively analyzed purely with human effort, necessitating Big data approaches to analyze it with computer-aided tools and techniques. Big data is a term that describes a massive volume of structured, semi-structured and unstructured data that is so large that it is d…