Search results for "Mining"

showing 10 items of 1730 documents

Anomaly detection approach to keystroke dynamics based user authentication

2017

Keystroke dynamics is one of the authentication mechanisms which uses natural typing pattern of a user for identification. In this work, we introduced Dependence Clustering based approach to user authentication using keystroke dynamics. In addition, we applied a k-NN-based approach that demonstrated strong results. Most of the existing approaches use only genuine users data for training and validation. We designed a cross validation procedure with artificially generated impostor samples that improves the learning process yet allows fair comparison to previous works. We evaluated the methods using the CMU keystroke dynamics benchmark dataset. Both proposed approaches outperformed the previou…

ta113AuthenticationpääsynvalvontaComputer scienceaccess control02 engineering and technologycomputer.software_genreKeystroke dynamicstodentaminen020204 information systems0202 electrical engineering electronic engineering information engineeringBenchmark (computing)Unsupervised learningauthentication020201 artificial intelligence & image processingAnomaly detectionData miningtietoturvadata securitycomputer
researchProduct

Adaptive framework for network traffic classification using dimensionality reduction and clustering

2012

Information security has become a very important topic especially during the last years. Web services are becoming more complex and dynamic. This offers new possibilities for attackers to exploit vulnerabilities by inputting malicious queries or code. However, these attack attempts are often recorded in server logs. Analyzing these logs could be a way to detect intrusions either periodically or in real time. We propose a framework that preprocesses and analyzes these log files. HTTP queries are transformed to numerical matrices using n-gram analysis. The dimensionality of these matrices is reduced using principal component analysis and diffusion map methodology. Abnormal log lines can then …

ta113Computer scienceNetwork securitybusiness.industryDimensionality reductionintrusion detectionk-meansdiffusion mapServer logcomputer.software_genreanomaly detectionTraffic classificationkoneoppiminenWeb log analysis softwareAnomaly detectionData miningWeb servicetiedonlouhintaCluster analysisbusinesscomputern-grams
researchProduct

A novel heuristic memetic clustering algorithm

2013

In this paper we introduce a novel clustering algorithm based on the Memetic Algorithm meta-heuristic wherein clusters are iteratively evolved using a novel single operator employing a combination of heuristics. Several heuristics are described and employed for the three types of selections used in the operator. The algorithm was exhaustively tested on three benchmark problems and compared to a classical clustering algorithm (k-Medoids) using the same performance metrics. The results show that our clustering algorithm consistently provides better clustering solutions with less computational effort.

ta113Determining the number of clusters in a data setBiclusteringClustering high-dimensional dataDBSCANComputingMethodologies_PATTERNRECOGNITIONTheoretical computer scienceCURE data clustering algorithmCorrelation clusteringCanopy clustering algorithmCluster analysisAlgorithmMathematics2013 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
researchProduct

Gear classification and fault detection using a diffusion map framework

2015

This article proposes a system health monitoring approach that detects abnormal behavior of machines. Diffusion map is used to reduce the dimensionality of training data, which facilitates the classification of newly arriving measurements. The new measurements are handled with Nyström extension. The method is trained and tested with real gear monitoring data from several windmill parks. A machine health index is proposed, showing that data recordings can be classified as working or failing using dimensionality reduction and warning levels in the low dimensional space. The proposed approach can be used with any system that produces high-dimensional measurement data. peerReviewed

ta113Diffusion (acoustics)Training setta214Computer scienceDimensionality reductiondiffusion mapExtension (predicate logic)computer.software_genreFault detection and isolationfault detectionsystem health monitoringArtificial IntelligenceSignal ProcessingComputer Vision and Pattern RecognitionData miningCluster analysiscomputerSoftwareCurse of dimensionalityclustering
researchProduct

Quantile index for gradual and abrupt change detection from CFB boiler sensor data in online settings

2012

In this paper we consider the problem of online detection of gradual and abrupt changes in sensor data having high levels of noise and outliers. We propose a simple heuristic method based on the Quantile Index (QI) and study how robust this method is for detecting both gradual and abrupt changes with such data. We evaluate the performance of our method on the artificially generated and real datasets that represent different operational settings of a pilot circulating fluidized bed (CFB) reactor and CFB cold model. Our experiments suggest that QI can be used for designing very simple yet effective methods for gradual change detection in the noisy sensor data. It can be also used for detectin…

ta113Engineeringbusiness.industryOutlierBoiler (power generation)Data miningbusinesscomputer.software_genrecomputerChange detectionQuantile
researchProduct

Anomaly Detection Algorithms for the Sleeping Cell Detection in LTE Networks

2015

The Sleeping Cell problem is a particular type of cell degradation in Long-Term Evolution (LTE) networks. In practice such cell outage leads to the lack of network service and sometimes it can be revealed only after multiple user complains by an operator. In this study a cell becomes sleeping because of a Random Access Channel (RACH) failure, which may happen due to software or hardware problems. For the detection of malfunctioning cells, we introduce a data mining based framework. In its core is the analysis of event sequences reported by a User Equipment (UE) to a serving Base Station (BS). The crucial element of the developed framework is an anomaly detection algorithm. We compare perfor…

ta113Engineeringta213business.industryEvent (computing)Real-time computingProbabilistic logicdata miningSONanomaly detectionself-organizing networksLTEBase stationcell outageSoftwareRandom-access channelUser equipmentNetwork serviceAnomaly detectionmobile cellular networkstiedonlouhintabusiness
researchProduct

A modelling framework for social media monitoring

2013

This paper describes a hierarchical, three-level modelling framework for monitoring social media. Immediate social reality is modelled through the first level of the models. They represent various virtual communities at social media sites and adhere to the social world models of the sites, i.e., the "site ontologies". The second-level model is a temporal multirelational graph that captures the static and dynamic properties of the first-level models from the perspective of the monitoring site. The third-level model consists of a temporal relational database scheme that models the temporal multirelational graph within the database. The models are specified and instantiated at the monitoring s…

ta113Graph databaseComputer Networks and Communicationsbusiness.industryComputer scienceRelational databaseSocial realitySchematiccomputer.software_genreTemporal databaseHardware and ArchitectureGraph (abstract data type)The InternetSocial mediaData miningbusinesscomputerInformation SystemsInternational Journal of Web Engineering and Technology
researchProduct

Twister Tries

2015

Many commonly used data-mining techniques utilized across research fields perform poorly when used for large data sets. Sequential agglomerative hierarchical non-overlapping clustering is one technique for which the algorithms’ scaling properties prohibit clustering of a large amount of items. Besides the unfavorable time complexity of O(n 2 ), these algorithms have a space complexity of O(n 2 ), which can be reduced to O(n) if the time complexity is allowed to rise to O(n 2 log2 n). In this paper, we propose the use of locality-sensitive hashing combined with a novel data structure called twister tries to provide an approximate clustering for average linkage. Our approach requires only lin…

ta113Hierarchical agglomerative clusteringta112Fuzzy clusteringBrown clusteringComputer scienceSingle-linkage clusteringcomputer.software_genreHierarchical clusteringLocality-sensitive hashingData setCURE data clustering algorithmlocality-sensitive hashingaverage linkageData miningHierarchical clustering of networkslinear complexityCluster analysishierarchical clusteringAlgorithmcomputerTime complexityProceedings of the 2015 ACM SIGMOD International Conference on Management of Data
researchProduct

A Hybrid Multigroup Coclustering Recommendation Framework Based on Information Fusion

2015

Collaborative Filtering (CF) is one of the most successful algorithms in recommender systems. However, it suffers from data sparsity and scalability problems. Although many clustering techniques have been incorporated to alleviate these two problems, most of them fail to achieve further significant improvement in recommendation accuracy. First of all, most of them assume each user or item belongs to a single cluster. Since usually users can hold multiple interests and items may belong to multiple categories, it is more reasonable to assume that users and items can join multiple clusters (groups), where each cluster is a subset of like-minded users and items they prefer. Furthermore, most of…

ta113Information retrievalComputer sciencebusiness.industrydata miningRecommender systemcomputer.software_genreTheoretical Computer ScienceInformation fusionKnowledge baseArtificial IntelligenceCollaborative FilteringScalabilityCluster (physics)Collaborative filteringLearning to rankData miningrecommender systemsCluster analysisbusinesscomputercluster analysisACM Transactions on Intelligent Systems and Technology
researchProduct

Support vector machine integrated with game-theoretic approach and genetic algorithm for the detection and classification of malware

2013

Abstract. —In the modern world, a rapid growth of mali- cious software production has become one of the most signifi- cant threats to the network security. Unfortunately, wides pread signature-based anti-malware strategies can not help to de tect malware unseen previously nor deal with code obfuscation te ch- niques employed by malware designers. In our study, the prob lem of malware detection and classification is solved by applyin g a data-mining-based approach that relies on supervised mach ine- learning. Executable files are presented in the form of byte a nd opcode sequences and n-gram models are employed to extract essential features from these sequences. Feature vectors o btained are…

ta113Network securitybusiness.industryComputer scienceFeature vectorFeature extractionuhatBytecomputer.file_formatMachine learningcomputer.software_genrehaittaohjelmatSupport vector machineObfuscation (software)ComputingMethodologies_PATTERNRECOGNITIONnetworknetwork securityMalwareData miningArtificial intelligenceExecutabletietoturvabusinesscomputer2013 IEEE Globecom Workshops (GC Wkshps)
researchProduct