0000000000186021

AUTHOR

Antti Juvonen

Growing Hierarchical Self-organizing Maps and Statistical Distribution Models for Online Detection of Web Attacks

In modern networks, HTTP clients communicate with web servers using request messages. By manipulating these messages attackers can collect confidential information from servers or even corrupt them. In this study, the approach based on anomaly detection is considered to find such attacks. For HTTP queries, feature matrices are obtained by applying an n-gram model, and, by learning on the basis of these matrices, growing hierarchical self-organizing maps are constructed. For HTTP headers, we employ statistical distribution models based on the lengths of header values and relative frequency of symbols. New requests received by the web-server are classified by using the maps and models obtaine…

research product

Intrusion detection applications using knowledge discovery and data mining

research product

Online anomaly detection using dimensionality reduction techniques for HTTP log analysis

Modern web services face an increasing number of new threats. Logs are collected from almost all web servers, and for this reason analyzing them is beneficial when trying to prevent intrusions. Intrusive behavior often differs from the normal web traffic. This paper proposes a framework to find abnormal behavior from these logs. We compare random projection, principal component analysis and diffusion map for anomaly detection. In addition, the framework has online capabilities. The first two methods have intuitive extensions while diffusion map uses the Nyström extension. This fast out-of-sample extension enables real-time analysis of web server traffic. The framework is demonstrated using …

research product

An Efficient Network Log Anomaly Detection System Using Random Projection Dimensionality Reduction

Network traffic is increasing all the time and network services are becoming more complex and vulnerable. To protect these networks, intrusion detection systems are used. Signature-based intrusion detection cannot find previously unknown attacks, which is why anomaly detection is needed. However, many new systems are slow and complicated. We propose a log anomaly detection framework which aims to facilitate quick anomaly detection and also provide visualizations of the network traffic structure. The system preprocesses network logs into a numerical data matrix, reduces the dimensionality of this matrix using random projection and uses Mahalanobis distance to find outliers and calculate an a…

research product

Combining conjunctive rule extraction with diffusion maps for network intrusion detection

Network security and intrusion detection are important in the modern world where communication happens via information networks. Traditional signature-based intrusion detection methods cannot find previously unknown attacks. On the other hand, algorithms used for anomaly detection often have black box qualities that are difficult to understand for people who are not algorithm experts. Rule extraction methods create interpretable rule sets that act as classifiers. They have mostly been combined with already labeled data sets. This paper aims to combine unsupervised anomaly detection with rule extraction techniques to create an online anomaly detection framework. Unsupervised anomaly detectio…

research product

Anomaly Detection Framework Using Rule Extraction for Efficient Intrusion Detection

Huge datasets in cyber security, such as network traffic logs, can be analyzed using machine learning and data mining methods. However, the amount of collected data is increasing, which makes analysis more difficult. Many machine learning methods have not been designed for big datasets, and consequently are slow and difficult to understand. We address the issue of efficient network traffic classification by creating an intrusion detection framework that applies dimensionality reduction and conjunctive rule extraction. The system can perform unsupervised anomaly detection and use this information to create conjunctive rules that classify huge amounts of traffic in real time. We test the impl…

research product

Poikkeavuuksien havaitseminen WWW-palvelinlokidatasta

Nykyajan web-palvelut ovat dynaamisia ja avoimia. Tämä antaa suurelle joukolle käyttäjiä mahdollisuuden päästä käsiksi palveluun ja sen sisältämään tietoon. Samalla avautuu uusia mahdollisuuksia toteuttaa hyökkäys. Tietoturvan pitäminen riittävällä tasolla on kilpailua aikaa vastaan. Poikkeavuuksien havaitsemisjärjestelmillä pystytään kuitenkin havaitsemaan ennestään tuntemattomat hyökkäykset ja muu epänormaali toiminta ja siten pitämään tietoturva hyvällä tasolla. Tutkimuksessa sovellettiin n-grammianalyysia, tukivektorikonetta ja diffuusiokarttoja esikäsitellyn verkkodatan analysointiin. Kaikilla menetelmillä saatiin lupaavia tuloksia, mutta reaaliaikainen järjestelmä vaatii vielä jatkoke…

research product

Adaptive framework for network traffic classification using dimensionality reduction and clustering

Information security has become a very important topic especially during the last years. Web services are becoming more complex and dynamic. This offers new possibilities for attackers to exploit vulnerabilities by inputting malicious queries or code. However, these attack attempts are often recorded in server logs. Analyzing these logs could be a way to detect intrusions either periodically or in real time. We propose a framework that preprocesses and analyzes these log files. HTTP queries are transformed to numerical matrices using n-gram analysis. The dimensionality of these matrices is reduced using principal component analysis and diffusion map methodology. Abnormal log lines can then …

research product

Anomaly Detection from Network Logs Using Diffusion Maps

The goal of this study is to detect anomalous queries from network logs using a dimensionality reduction framework. The fequencies of 2-grams in queries are extracted to a feature matrix. Dimensionality reduction is done by applying diffusion maps. The method is adaptive and thus does not need training before analysis. We tested the method with data that includes normal and intrusive traffic to a web server. This approach finds all intrusions in the dataset. peerReviewed

research product

...Johnnyllakin on univormu, heimovaatteet ja -kampaus.. : musiikillisen erityisorientaation polku musiikkiminän, maailmankuvan ja musiikkimaun heijastamina

research product