Search results for "Data mining"
showing 10 items of 907 documents
Synergetic and redundant information flow detected by unnormalized Granger causality: application to resting state fMRI
2015
Objectives: We develop a framework for the analysis of synergy and redundancy in the pattern of information flow between subsystems of a complex network. Methods: The presence of redundancy and/or synergy in multivariate time series data makes it difficult to estimate the net flow of information from each driver variable to a given target. We show that, by adopting an unnormalized definition of Granger causality, one may reveal redundant multiplets of variables influencing the target by maximizing the total Granger causality to a given target over all the possible partitions of the set of driving variables. Consequently, we introduce a pairwise index of synergy which is zero when two in…
Plaid model for microarray data: an enhancement of the pruning step
2010
Microarrays have become a standard tool for studying gene functions. For example, we can investigate whether a subset of genes shows a coherent expression pattern under different conditions. The plaid model, a model-based biclustering method, can be used to incorporate the additive structure used for the microarray experiment. In this paper we describe an enhancement of the plaid model algorithm based on the theory of the false discovery rate.
Epistemic uncertainty in fault tree analysis approached by the evidence theory
2012
Process plants may be subjected to dangerous events. Different methodologies are nowadays employed to identify failure events that can lead to severe accidents and to assess their relative probability of occurrence. Because reliability data for rare events are generally poor, leading to a partial or incomplete knowledge of the process, the classical probabilistic approach cannot be successfully used. Such uncertainty, called epistemic uncertainty, can be treated by means of different methodologies, alternative to the probabilistic one. In this work, the Evidence Theory or Dempster–Shafer theory (DST) is proposed to deal with this kind of uncertainty. In particular, the classical Fau…
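For readers unfamiliar with the evidence theory named in this abstract, Dempster's rule of combination can be sketched as follows. This is a generic textbook illustration, not the paper's fault tree procedure; the function name `combine` and the example masses are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass;
    the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical example: two sources judging whether a component fails.
FAIL, OK = frozenset({"fail"}), frozenset({"ok"})
EITHER = FAIL | OK
source_1 = {FAIL: 0.6, EITHER: 0.4}
source_2 = {OK: 0.5, EITHER: 0.5}
result = combine(source_1, source_2)
```

The mass left on `EITHER` is what separates this from a single probability value: it represents ignorance that has not been committed to either outcome.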
TREEZZY2, a Fuzzy Logic Computer Code for Fault Tree and Event Tree Analyses
2004
In the conventional approach to reliability analysis using logical tree methodologies, uncertainties in the failure probabilities of system components or basic events are handled by assuming probability distribution functions. However, data are often insufficient for statistical estimation, so approximate estimates are required. Moreover, complicated calculations are needed to propagate uncertainties up to the final results. In our work, in order to take account of the uncertainties in system failure probabilities, the methodology based on fuzzy set theory is used in both fault tree and event tree analyses. This paper presents our work on this issue, which resulted…
PESI - a taxonomic backbone for Europe
2015
Reliable taxonomy underpins communication in all of biology, not least nature conservation and sustainable use of ecosystem resources. The flexibility of taxonomic interpretations, however, presents a serious challenge for end-users of taxonomic concepts. Users need standardised and continuously harmonised taxonomic reference systems, as well as high-quality and complete taxonomic data sets, but these are generally lacking for nonspecialists. The solution is in dynamic, expertly curated web-based taxonomic tools. The Pan-European Species-directories Infrastructure (PESI) worked to solve this key issue by providing a taxonomic e-infrastructure for Europe. It strengthened the relevant social (…
Toward Optimal LSTM Neural Networks for Detecting Algorithmically Generated Domain Names
2021
Malware detection is a problem that has become particularly challenging over the last decade. A common strategy for detecting malware is to scan network traffic for malicious connections between infected devices and their command and control (C&C) servers. However, malware developers are aware of this detection method and have begun to incorporate new strategies to go unnoticed. In particular, they generate domain names instead of using static Internet Protocol addresses or regular domain names pointing to their C&C servers. By using a domain generation algorithm, the effectiveness of the blacklisting of domains is reduced, as the large number of domain names that must be blocked g…
Web-based real-time data acquisition system as tool for energy efficiency monitoring
2013
A web-based data acquisition system is proposed as a research tool for the test-stand energy efficiency monitoring project. Basic requirements for the architecture of the data acquisition system are discussed. The proposed architecture provides a real-time interface with sensors, acquires and logs data from all sensors at a fixed rate, and delivers the logged data to the end-user through FTP.
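The three stages this abstract names (fixed-rate acquisition, logging, FTP delivery) can be illustrated with a minimal sketch. Nothing here is taken from the paper: the function names `acquire` and `deliver`, the CSV layout, and the constant-valued sensors are all assumptions made for illustration.

```python
import csv
import io
import time
from ftplib import FTP


def acquire(sensors, period_s, n_samples, out):
    """Poll every sensor at a fixed rate and log one CSV row per tick."""
    writer = csv.writer(out)
    writer.writerow(["t_s"] + list(sensors))
    start = time.monotonic()
    for i in range(n_samples):
        # Sleep until the i-th deadline so timing errors do not accumulate.
        delay = start + i * period_s - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        elapsed = time.monotonic() - start
        writer.writerow([f"{elapsed:.3f}"] + [read() for read in sensors.values()])


def deliver(host, user, password, remote_name, payload):
    """Upload the logged bytes to an FTP server (all arguments are placeholders)."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.storbinary(f"STOR {remote_name}", io.BytesIO(payload))


# Hypothetical sensors returning constant readings.
sensors = {"temperature_c": lambda: 21.5, "power_w": lambda: 140.0}
log = io.StringIO()
acquire(sensors, period_s=0.01, n_samples=3, out=log)
rows = log.getvalue().splitlines()
```

Scheduling each sample against an absolute deadline, rather than sleeping a fixed interval after each read, keeps the sampling rate steady even when a sensor read is slow. `deliver` is defined but not called here because it needs a reachable FTP server.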
A two-armed bandit collective for hierarchical examplar based mining of frequent itemsets with applications to intrusion detection
2014
Published version of a chapter in the book: Transactions on Computational Collective Intelligence XIV. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-662-44509-9_1 In this paper we address the above problem by posing frequent itemset mining as a collection of interrelated two-armed bandit problems. We seek to find itemsets that frequently appear as subsets in a stream of itemsets, with the frequency being constrained to support granularity requirements. Starting from a randomly or manually selected examplar itemset, a collective of Tsetlin-automata-based two-armed bandit players - one automaton for each item in the examplar - learns which items should be included in …
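As background for the Tsetlin automata mentioned in this abstract, a single two-action automaton playing a two-armed bandit can be sketched as follows. This is the generic textbook automaton, not the paper's collective or its feedback scheme; the class and function names, and the reading of the two actions as "exclude" and "include", are assumptions for illustration.

```python
import random


class TsetlinAutomaton:
    """Two-action Tsetlin automaton with n_states memory states per action.

    States 1..n select action 0 (exclude); states n+1..2n select action 1 (include).
    """

    def __init__(self, n_states, state=None):
        self.n = n_states
        # Start at the boundary on the "exclude" side unless told otherwise.
        self.state = n_states if state is None else state

    def action(self):
        return 0 if self.state <= self.n else 1

    def reward(self):
        # Reinforce the current action by moving away from the boundary.
        if self.state <= self.n:
            self.state = max(1, self.state - 1)
        else:
            self.state = min(2 * self.n, self.state + 1)

    def penalize(self):
        # Weaken the current action by moving toward (or across) the boundary.
        self.state += 1 if self.state <= self.n else -1


def train(automaton, reward_prob, steps, rng):
    """Play a two-armed bandit where reward_prob[a] is the chance action a pays off."""
    for _ in range(steps):
        if rng.random() < reward_prob[automaton.action()]:
            automaton.reward()
        else:
            automaton.penalize()
    return automaton.action()
```

A larger `n_states` makes the automaton slower to switch actions but more resistant to noisy feedback, which is the usual trade-off when choosing its memory depth.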
Hybrid Genetic Algorithms in Data Mining Applications
2009
Genetic algorithms (GAs) are a class of problem-solving techniques which have been successfully applied to a wide variety of hard problems (Goldberg, 1989). Although conventional GAs are interesting approaches to several problems, for which they are able to obtain very good solutions, there are cases in which the application of a conventional GA has shown poor results. Whether a GA performs poorly depends entirely on the problem. In general, severely constrained problems or problems with difficult objective functions are hard to optimize using GAs. There is a well-established theory regarding the difficulty of a problem for a GA. Traditionally, this has been studied for binary encoded …
Factorial graphical models for dynamic networks
2015
Dynamic network models describe many important scientific processes, from cell biology and epidemiology to sociology and finance. Estimating dynamic networks from noisy time series data is a difficult task since the number of components involved in the system is very large. As a result, the number of parameters to be estimated is typically larger than the number of observations. However, a characteristic of many real-life networks is that they are sparse. For example, the molecular structure of genes makes interactions with other components a highly structured and, therefore, sparse process. Until now, the literature has focused on static networks, which lack specific temporal inte…