Search results for "Dimensionality Reduction"
showing 10 items of 120 documents
Emulation of Sun-Induced Fluorescence from Radiance Data Recorded by the HyPlant Airborne Imaging Spectrometer
2021
The retrieval of sun-induced fluorescence (SIF) from hyperspectral radiance data grew to maturity with research activities around the FLuorescence EXplorer satellite mission FLEX, yet full-spectrum estimation methods such as the spectral fitting method (SFM) are computationally expensive. To bypass this computational load, this work aims to approximate the SFM-based SIF retrieval by means of statistical learning, i.e., emulation. While emulators emerged as fast surrogate models of simulators, the accuracy-speedup trade-offs are still to be analyzed when the emulation concept is applied to experimental data. We evaluated the possibility of approximating the SFM-like SIF output directly based…
Special Functions for the Study of Economic Dynamics: The Case of the Lucas-Uzawa Model
2004
The special functions are intensively used in mathematical physics to solve differential systems. We argue that they should be most useful in economic dynamics, notably in the assessment of the transition dynamics of endogenous growth models. We illustrate our argument on the Lucas-Uzawa model, which we solve by the means of Gaussian hypergeometric functions. We show how the use of Gaussian hypergeometric functions allows for an explicit representation of the equilibrium dynamics of the variables in level. In contrast to the preexisting approaches, our method is global and does not rely on dimension reduction.
Generalizability and Simplicity as Criteria in Feature Selection: Application to Mood Classification in Music
2011
Classification of musical audio signals according to expressed mood or emotion has evident applications to content-based music retrieval in large databases. Wrapper selection is a dimension reduction method that has been proposed for improving classification performance. However, the technique is prone to lead to overfitting of the training data, which decreases the generalizability of the obtained results. We claim that previous attempts to apply wrapper selection in the field of music information retrieval (MIR) have led to disputable conclusions about the used methods due to inadequate analysis frameworks, indicative of overfitting, and biased results. This paper presents a framework bas…
Adaptive framework for network traffic classification using dimensionality reduction and clustering
2012
Information security has become a very important topic especially during the last years. Web services are becoming more complex and dynamic. This offers new possibilities for attackers to exploit vulnerabilities by inputting malicious queries or code. However, these attack attempts are often recorded in server logs. Analyzing these logs could be a way to detect intrusions either periodically or in real time. We propose a framework that preprocesses and analyzes these log files. HTTP queries are transformed to numerical matrices using n-gram analysis. The dimensionality of these matrices is reduced using principal component analysis and diffusion map methodology. Abnormal log lines can then …
Gear classification and fault detection using a diffusion map framework
2015
This article proposes a system health monitoring approach that detects abnormal behavior of machines. Diffusion map is used to reduce the dimensionality of training data, which facilitates the classification of newly arriving measurements. The new measurements are handled with Nyström extension. The method is trained and tested with real gear monitoring data from several windmill parks. A machine health index is proposed, showing that data recordings can be classified as working or failing using dimensionality reduction and warning levels in the low dimensional space. The proposed approach can be used with any system that produces high-dimensional measurement data. peerReviewed
Combining PCA and multiset CCA for dimension reduction when group ICA is applied to decompose naturalistic fMRI data
2015
An extension of group independent component analysis (GICA) is introduced, where multi-set canonical correlation analysis (MCCA) is combined with principal component analysis (PCA) for three-stage dimension reduction. The method is applied on naturalistic functional MRI (fMRI) images acquired during task-free continuous music listening experiment, and the results are compared with the outcome of the conventional GICA. The extended GICA resulted slightly faster ICA convergence and, more interestingly, extracted more stimulus-related components than its conventional counterpart. Therefore, we think the extension is beneficial enhancement for GICA, especially when applied to challenging fMRI d…
Online anomaly detection using dimensionality reduction techniques for HTTP log analysis
2015
Modern web services face an increasing number of new threats. Logs are collected from almost all web servers, and for this reason analyzing them is beneficial when trying to prevent intrusions. Intrusive behavior often differs from the normal web traffic. This paper proposes a framework to find abnormal behavior from these logs. We compare random projection, principal component analysis and diffusion map for anomaly detection. In addition, the framework has online capabilities. The first two methods have intuitive extensions while diffusion map uses the Nyström extension. This fast out-of-sample extension enables real-time analysis of web server traffic. The framework is demonstrated using …
An Approach for Network Outage Detection from Drive-Testing Databases
2012
A data-mining framework for analyzing a cellular network drive testing database is described in this paper. The presented method is designed to detect sleeping base stations, network outage, and change of the dominance areas in a cognitive and self-organizing manner. The essence of the method is to find similarities between periodical network measurements and previously known outage data. For this purpose, diffusion maps dimensionality reduction and nearest neighbor data classification methods are utilized. The method is cognitive because it requires training data for the outage detection. In addition, the method is autonomous because it uses minimization of drive testing (MDT) functionalit…
Research literature clustering using diffusion maps
2013
We apply the knowledge discovery process to the mapping of current topics in a particular field of science. We are interested in how articles form clusters and what are the contents of the found clusters. A framework involving web scraping, keyword extraction, dimensionality reduction and clustering using the diffusion map algorithm is presented. We use publicly available information about articles in high-impact journals. The method should be of use to practitioners or scientists who want to overview recent research in a field of science. As a case study, we map the topics in data mining literature in the year 2011. peerReviewed
An Efficient Network Log Anomaly Detection System Using Random Projection Dimensionality Reduction
2014
Network traffic is increasing all the time and network services are becoming more complex and vulnerable. To protect these networks, intrusion detection systems are used. Signature-based intrusion detection cannot find previously unknown attacks, which is why anomaly detection is needed. However, many new systems are slow and complicated. We propose a log anomaly detection framework which aims to facilitate quick anomaly detection and also provide visualizations of the network traffic structure. The system preprocesses network logs into a numerical data matrix, reduces the dimensionality of this matrix using random projection and uses Mahalanobis distance to find outliers and calculate an a…