RESEARCH PRODUCT

Intelligent solutions for real-life data-driven applications

Elena Ivannikova

subject

spectral clustering, regression trees, anomaly detection, regression analysis, quality control, machine learning, paper machine, big data, graph segmentation, community detection, network security, cluster analysis, data mining, information security, mutual information, clustering, variable selection

description

This thesis falls within machine learning, specifically the development of advanced methods for regression analysis, clustering, and anomaly detection. Industry constantly seeks improved production practices and lower production time and costs. To this end, several industrial case studies are presented in which mathematical models for predicting paper quality are proposed. The most important variables for the prediction models are selected using information-theoretic measures and a regression-tree approach. The remaining original papers are devoted to unsupervised machine learning, with the main focus on developing advanced spectral clustering techniques for community detection and anomaly detection. As part of this work, several enhancements to the dependence clustering algorithm are proposed: regularization to control cluster size, an ensemble version to improve model stability, support for overlapping clusters, and adaptations for anomaly detection and for big datasets. Another focus of the thesis is anomaly detection for network security data, for which a probabilistic transition-based approach is proposed for detecting application-layer distributed denial-of-service attacks. The developed approaches are tested on real datasets and solve the given tasks efficiently and with high accuracy. They are shown to be applicable to variable selection, graph segmentation, and anomaly detection tasks in different applications.

http://urn.fi/URN:ISBN:978-951-39-7279-0