Search results for "Mining"
showing 10 items of 1730 documents
CUDA-Accelerated Alignment of Subsequences in Streamed Time Series Data
2014
Euclidean Distance (ED) and Dynamic Time Warping (DTW) are cornerstones in the field of time series data mining. Many high-level algorithms like kNN-classification, clustering or anomaly detection make extensive use of these distance measures as subroutines. Furthermore, the vast growth of recorded data produced by automated monitoring systems or integrated sensors establishes the need for efficient implementations. In this paper, we introduce linear memory parallelization schemes for the alignment of a given query Q in a stream of time series data S for both ED and DTW using CUDA-enabled accelerators. The ED parallelization features a log-linear calculation scheme in contrast to the naive …
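For orientation, the two distance measures named in this abstract can be sketched in plain Python. This is the textbook sequential formulation, not the paper's CUDA parallelization scheme. The function names are illustrative, and the squared pointwise cost in the DTW routine is an assumption.

```python
import math


def euclidean_distance(q, s):
    """Euclidean distance between two equal-length sequences."""
    if len(q) != len(s):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, s)))


def dtw_distance(q, s):
    """Unconstrained DTW with squared pointwise cost (an assumption).

    Only two rows of the cost matrix are kept, so memory is linear in len(s).
    """
    inf = float("inf")
    prev = [0.0] + [inf] * len(s)  # row for the empty prefix of q
    for a in q:
        curr = [inf] * (len(s) + 1)
        for j, b in enumerate(s, start=1):
            cost = (a - b) ** 2
            # extend the cheapest of: insertion, match, deletion
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[len(s)])


def best_ed_alignment(query, stream):
    """Naive scan over every window of the stream; returns (offset, distance)."""
    m = len(query)
    if m == 0 or m > len(stream):
        raise ValueError("query must be non-empty and no longer than the stream")
    candidates = (
        (euclidean_distance(query, stream[i:i + m]), i)
        for i in range(len(stream) - m + 1)
    )
    dist, offset = min(candidates)
    return offset, dist
```

The naive scan costs on the order of len(query) × len(stream) operations, which is the baseline that faster alignment schemes aim to beat.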
Criminal networks analysis in missing data scenarios through graph distances
2021
Data collected in criminal investigations may suffer from issues like: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyze nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data, and to determine which network type is most affected by it. The networks are firstly pruned using two specific m…
GEM
2014
The widespread use of digital sensor systems causes a tremendous demand for high-quality time series analysis tools. In this domain the majority of data mining algorithms rely on established distance measures like Dynamic Time Warping (DTW) or Euclidean distance (ED). However, the notion of similarity induced by ED and DTW may lead to unsatisfactory clusterings. In order to address this shortcoming we introduce the Gliding Elastic Match (GEM) algorithm. It determines an optimal local similarity measure of a query time series Q and a subject time series S. The measure is invariant under both local deformation on the measurement-axis and scaling in the time domain. GEM is compared to ED and…
Data Mining Algorithms for Knowledge Extraction
2020
In this paper, we study the methods, techniques, and algorithms used in data mining. Among the algorithms studied, we emphasize clustering algorithms, and the K-means algorithm in particular. K-means was first studied using the Euclidean distance, and then with the distance between clusters computed using the Mahalanobis and Canberra distances. After implementing the algorithms in C/C++, we compared the clusterings produced by the three variants, after which we modified them and studied the distance between the clusters.
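As a rough illustration of how the choice of distance changes a K-means assignment step, the three distances named in this abstract can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the paper's C/C++ implementation. The function names and the `assign` helper are hypothetical, and `inv_cov` is assumed to be a precomputed inverse covariance matrix given as nested lists.

```python
import math


def euclidean(x, y):
    """Standard Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def canberra(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over coordinates."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:  # a term with both coordinates zero contributes nothing
            total += abs(a - b) / denom
    return total


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance for a given inverse covariance matrix (assumed precomputed)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return math.sqrt(
        sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    )


def assign(points, centroids, dist):
    """K-means assignment step: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]
```

With centroids (3, 10) and (1, 14), the point (1, 10) is assigned to the first centroid under the Euclidean distance but to the second under the Canberra distance, because Canberra weighs each coordinate difference relative to the coordinate magnitudes.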
What explains the resilience of SMEs? Ambidexterity capability and strategic consistency
2020
The ability to be resilient, to recover and bounce back when confronted with a threatening and stressful external event, such as the most recent global economic crisis, is an important issue for strategic management research, particularly for small and medium-sized enterprises (SMEs). Research studies on firm-level antecedents of resilience offer contradictory propositions, some of which stress the need for experimentation, while others suggest focusing on reliability. To disentangle this controversy and address this gap, our study proposes that for SMEs to achieve resilience, it is necessary that these companies are able to efficiently respond to the changing environments through amb…
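Badanie pamięci lokalnej w kontekście państwowej polityki pamięci w Polsce. Deportacje w głąb Rosji i na Sybir w pamięci nieoficjalnej [A study of local memory in the context of state memory policy in Poland: deportations deep into Russia and to Siberia in unofficial memory]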
2018
Researchers in social memory are bound in their work to include the relations occurring between the memory of local communities and the official historical policy of the State. When events retained and exposed in the local memory are persistently passed over in silence, distorted, falsified or removed from the public sphere by censorship decisions, the remembrance of these events takes on the character of concealed memory, which integrates the given social group tightly (e.g., the Siberians, the Silesians). The transformations which followed in Poland after 1989 (abolition of censorship) formally introduced "commonwealths of memory" into the public debate; howeve…
On thermoeconomics of energy systems at variable load conditions: integrated optimization of plant design and operation
2007
Thermoeconomics has been assuming a growing role among the disciplines oriented to the analysis of energy systems, its different methodologies allowing solution of problems in the fields of cost accounting, plant design optimisation and diagnosis of malfunctions. However, the thermoeconomic methodologies as such are particularly appropriate to analyse large industrial systems at steady or quasi-steady operation, but they can hardly be applied to small to medium scale units operating in unsteady conditions to cover a variable energy demand. In this paper, the fundamentals of thermoeconomics for systems operated at variable load are discussed, examining the cost formation process an…
Literature, social media and questionnaire surveys identify relevant conservation areas for Carcharhinus species in the Mediterranean Sea
2023
Sharks support ecosystems’ health, but their populations are facing severe declines worldwide. Knowledge gaps on shark distribution and the negative human perception of them still represent a barrier to the implementation of effective conservation measures. Here we carried out a regional-scale analysis in the Mediterranean Sea using data on requiem shark catches and sightings available in the scientific literature and on social media platforms to: 1) depict the distribution of Carcharhinus species across the basin, 2) identify potentially relevant areas for their conservation, and 3) evaluate people’s attitude toward shark protection. In addition, we administered 112 questionnaires in one o…
Remote Sensing Image Classification with Large Scale Gaussian Processes
2017
Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources. Upcoming missions will soon provide large data streams that will make land cover/use classification difficult. Machine learning classifiers can help with this, and many methods are currently available. A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as results that are very competitive with state-of-the-art neural networks and support vector machines. However, its computational cost is prohibitive for…
Core of communities in bipartite networks
2017
We use the information present in a bipartite network to detect cores of communities of each set of the bipartite system. Cores of communities are found by investigating statistically validated projected networks obtained using information present in the bipartite network. Cores of communities are highly informative and robust with respect to the presence of errors or missing entries in the bipartite network. We assess the statistical robustness of cores by investigating an artificial benchmark network, the co-authorship network, and the actor-movie network. The accuracy and precision of the partition obtained with respect to the reference partition are measured in terms of the adjusted Ran…