Search results for "mining"

showing 10 items of 1730 documents

CUDA-Accelerated Alignment of Subsequences in Streamed Time Series Data

2014

Euclidean Distance (ED) and Dynamic Time Warping (DTW) are cornerstones in the field of time series data mining. Many high-level algorithms like kNN classification, clustering or anomaly detection make extensive use of these distance measures as subroutines. Furthermore, the vast growth of recorded data produced by automated monitoring systems or integrated sensors establishes the need for efficient implementations. In this paper, we introduce linear memory parallelization schemes for the alignment of a given query Q in a stream of time series data S for both ED and DTW using CUDA-enabled accelerators. The ED parallelization features a log-linear calculation scheme in contrast to the naive …

Euclidean distance; CUDA; Dynamic time warping; Data stream mining; Computer science; Anomaly detection; Parallel computing; Cluster analysis; Time complexity; Distance measures
2014 43rd International Conference on Parallel Processing
researchProduct
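
As a rough illustration of the two distance measures named in the entry above, the following Python sketch aligns a query Q against a stream S by sliding a window of length |Q| over S. It is a naive CPU version under stated assumptions, not the paper's CUDA scheme: the function names (`euclidean_distance`, `dtw_distance`, `best_match`) are hypothetical, the DTW variant is unconstrained with squared pointwise cost, and only the two-row (linear memory) layout echoes the abstract.

```python
import math


def euclidean_distance(q, s):
    """Euclidean distance between two equal-length sequences."""
    if len(q) != len(s):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, s)))


def dtw_distance(q, s):
    """Unconstrained DTW distance with squared pointwise cost.

    Only two rows of the cost matrix are kept at any time, so memory
    grows linearly with len(s) rather than with len(q) * len(s).
    """
    prev = [math.inf] * (len(s) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for a in q:
        curr = [math.inf] * (len(s) + 1)
        for j, b in enumerate(s, start=1):
            cost = (a - b) ** 2
            # Extend the cheapest of: insertion, deletion, match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[-1])


def best_match(query, stream, distance):
    """Return (start_index, distance) of the best window in the stream.

    Naive scan: every window of len(query) is scored independently.
    """
    m = len(query)
    if m == 0 or m > len(stream):
        raise ValueError("query must be non-empty and no longer than the stream")
    scores = [
        (distance(query, stream[i:i + m]), i)
        for i in range(len(stream) - m + 1)
    ]
    best_distance, best_index = min(scores)
    return best_index, best_distance
```

Calling `best_match([1, 2, 3], [0, 0, 1, 2, 3, 0, 0], euclidean_distance)` scans five windows and reports the start index of the closest one; passing `dtw_distance` instead tolerates local stretching inside each window at a higher cost per window.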

Criminal networks analysis in missing data scenarios through graph distances

2021

Data collected in criminal investigations may suffer from issues like: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyze nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data, and to determine which network type is most affected by it. The networks are firstly pruned using two specific m…

Euclidean distance; Data collection; Computer science; Node (networking); Law enforcement; Graph (abstract data type); Adjacency list; Data mining; Missing data; computer.software_genre; Criminal investigation; computer
CrimRxiv
researchProduct

GEM

2014

The widespread use of digital sensor systems causes a tremendous demand for high-quality time series analysis tools. In this domain the majority of data mining algorithms relies on established distance measures like Dynamic Time Warping (DTW) or Euclidean distance (ED). However, the notion of similarity induced by ED and DTW may lead to unsatisfactory clusterings. In order to address this shortcoming we introduce the Gliding Elastic Match (GEM) algorithm. It determines an optimal local similarity measure of a query time series Q and a subject time series S. The measure is invariant under both local deformation on the measurement-axis and scaling in the time domain. GEM is compared to ED and…

Euclidean distance; Dynamic time warping; Similarity (network science); Computer science; Data mining; Invariant (mathematics); Similarity measure; computer.software_genre; Measure (mathematics); Algorithm; computer; Distance measures
Proceedings of the 29th Annual ACM Symposium on Applied Computing
researchProduct

Data Mining Algorithms for Knowledge Extraction

2020

In this paper, we study the methods, techniques, and algorithms used in data mining. Among these, we focus on clustering algorithms, and more precisely on the K-means algorithm. The algorithm was first studied using the Euclidean distance; the distance between the clusters was then changed to the Mahalanobis and Canberra distances. After implementing the algorithms in C/C++, we compared the clusterings produced by the three variants, after which we modified them and studied the distance between the clusters.

Euclidean distance; Mahalanobis distance; Matrix (mathematics); ComputingMethodologies_PATTERNRECOGNITION; Knowledge extraction; Computer science; business.industry; Value (computer science); Pattern recognition; Artificial intelligence; Cluster analysis; business; Data mining algorithm
researchProduct
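
The idea of running K-means with interchangeable distance measures, as described in the entry above, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the paper's C/C++ implementation: the names `euclidean`, `canberra` and `kmeans` are hypothetical, initial centroids are passed in by the caller, and the Mahalanobis distance is left out because it needs an inverse covariance matrix. Note that updating centroids by the arithmetic mean is only guaranteed to reduce the objective for the squared Euclidean distance; with other distances it is a heuristic.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def canberra(p, q):
    """Canberra distance: sum of |a - b| / (|a| + |b|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def kmeans(points, centroids, distance, max_iter=100):
    """Lloyd-style K-means with a pluggable distance function.

    Assumption: centroids are updated with the arithmetic mean of their
    members, whatever distance is used for the assignment step.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: distance(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids
```

Swapping `euclidean` for `canberra` in the call changes only the assignment step, which makes it easy to compare the resulting partitions on the same data and the same starting centroids.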

What explains the resilience of SMEs? Ambidexterity capability and strategic consistency

2020

The ability to be resilient, to recover and bounce back when confronted with a threatening and stressful external event, such as the most recent global economic crisis, is an important issue for strategic management research, particularly for small and medium-sized enterprises (SMEs). Research studies on firm-level antecedents of resilience offer contradictory propositions, some of which stress the need for experimentation, while others suggest focusing on reliability. To disentangle this controversy and address this gap, our study proposes that for SMEs to achieve resilience, it is necessary that these companies are able to efficiently respond to the changing environments through amb…

Event (computing); Strategy and Management; 05 social sciences; Geography Planning and Development; 0211 other engineering and technologies; 02 engineering and technology; Consistency (negotiation); 0502 economics and business; Research studies; Strategic management; Business; Resilience (network); 050203 business & management; Finance; Reliability (statistics); Industrial organization; 021102 mining & metallurgy; Ambidexterity
Long Range Planning
researchProduct

Badanie pamięci lokalnej w kontekście państwowej polityki pamięci w Polsce. Deportacje w głąb Rosji i na Sybir w pamięci nieoficjalnej

2018

Researchers in social memory are bound in their work to include the relations occurring between the memory of local communities and the official historical policy of the State. When events retained and exposed in the local memory are persistently passed over in silence, distorted, falsified or removed from the public sphere by means of decisions taken by the censorship, the remembrance of these events takes on the character of concealed memory, which integrates the given social group tightly (e.g., the Siberians, the Silesians). The transformations which followed in Poland after 1989 (liquidation of the Censorship) formally introduced „commonwealths of memory” into the public debate; howeve…

Examining local memory in the context of the state policy of remembrance in Poland. Deportations into the heart of Russia to Siberia in unofficial memory
Wschodnioznawstwo
researchProduct

On thermoeconomics of energy systems at variable load conditions: integrated optimization of plant design and operation

2007

Thermoeconomics has been assuming a growing role among the disciplines oriented to the analysis of energy systems, its different methodologies allowing solution of problems in the fields of cost accounting, plant design optimisation and diagnostic of malfunctions. However, the thermoeconomic methodologies as such are particularly appropriate to analyse large industrial systems at steady or quasi-steady operation, but they can hardly be applied to small to medium scale units operating in unsteady conditions to cover a variable energy demand. In this paper, the fundamentals of thermoeconomics for systems operated at variable load are discussed, examining the cost formation process and, separately, the cost fractions related to capital depreciation (which require additional distinctions with respect to plants in steady operation) and to exergy consumption. The relevant effects of the efficiency penalty due to off-design operation on the exergetic cost of internal flows are also examined. An original algorithm is proposed for the integrated optimization of plant design and operation, based on an analytical solution by the Lagrange multipliers method and on a multi-objective decision function expressed either in terms of net cash flow or primary energy saving. The method is suitable for application in complex energy systems such as “facilities of components of a same product” connected to external networks for power or heat distribution. For demonstrative purposes, the proposed thermoeconomically aided optimization is performed for a grid-connected trigeneration system to be installed in a large hotel.

Exergy; Engineering; Primary energy; Renewable Energy Sustainability and the Environment; business.industry; Energy Engineering and Power Technology; Cost accounting; Thermoeconomics; Grid; Energy conservation; Variable (computer science); symbols.namesake; Fuel Technology; Nuclear Energy and Engineering; Lagrange multiplier; symbols; Process engineering; business; Simulation
researchProduct

Literature, social media and questionnaire surveys identify relevant conservation areas for Carcharhinus species in the Mediterranean Sea

2023

Sharks support ecosystems’ health, but their populations are facing severe declines worldwide. Knowledge gaps on shark distribution and the negative human perception of them still represent a barrier to the implementation of effective conservation measures. Here we carried out a regional-scale analysis in the Mediterranean Sea using data on requiem shark catches and sightings available in the scientific literature and on social media platforms to: 1) depict the distribution of Carcharhinus species across the basin, 2) identify potentially relevant areas for their conservation, and 3) evaluate people’s attitude toward shark protection. In addition, we administered 112 questionnaires in one o…

Extinction; Social media data mining; Conservation hotspot; Public perception; Ecotourism; Coastal sharks; Requiem sharks; Ecology Evolution Behavior and Systematics; Nature and Landscape Conservation
researchProduct

Remote Sensing Image Classification with Large Scale Gaussian Processes

2017

Current remote sensing image classification problems have to deal with an unprecedented amount of heterogeneous and complex data sources. Upcoming missions will soon provide large data streams that will make land cover/use classification difficult. Machine learning classifiers can help with this, and many methods are currently available. A popular kernel classifier is the Gaussian process classifier (GPC), since it approaches the classification problem with a solid probabilistic treatment, thus yielding confidence intervals for the predictions as well as results that are very competitive with state-of-the-art neural networks and support vector machines. However, its computational cost is prohibitive for…

FOS: Computer and information sciences; 010504 meteorology & atmospheric sciences; Computer science; Multispectral image; 0211 other engineering and technologies; Machine Learning (stat.ML); 02 engineering and technology; Land cover; 01 natural sciences; Statistics - Applications; Machine Learning (cs.LG); Kernel (linear algebra); Bayes' theorem; symbols.namesake; Statistics - Machine Learning; Applications (stat.AP); Electrical and Electronic Engineering; Gaussian process; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Remote sensing; Contextual image classification; Artificial neural network; Data stream mining; Probabilistic logic; Support vector machine; Computer Science - Learning; Kernel (image processing); symbols; General Earth and Planetary Sciences
researchProduct

Core of communities in bipartite networks

2017

We use the information present in a bipartite network to detect cores of communities of each set of the bipartite system. Cores of communities are found by investigating statistically validated projected networks obtained using information present in the bipartite network. Cores of communities are highly informative and robust with respect to the presence of errors or missing entries in the bipartite network. We assess the statistical robustness of cores by investigating an artificial benchmark network, the co-authorship network, and the actor-movie network. The accuracy and precision of the partition obtained with respect to the reference partition are measured in terms of the adjusted Ran…

FOS: Computer and information sciences; Accuracy and precision; Physics - Physics and Society; Bipartite system; Rand index; FOS: Physical sciences; Physics and Society (physics.soc-ph); computer.software_genre; 01 natural sciences; 010104 statistics & probability; Robustness (computer science); 0103 physical sciences; 01.02. Számítás- és információtudomány; 0101 mathematics; 010306 general physics; Mathematics; Social and Information Networks (cs.SI); Probability and statistics; Computer Science - Social and Information Networks; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); network theory community detection; Physics - Data Analysis Statistics and Probability; Bipartite graph; Data mining; computer; Data Analysis Statistics and Probability (physics.data-an)
researchProduct