Search results for "Mach"
showing 10 items of 3360 documents
Stability-Based Model Selection for High Throughput Genomic Data: An Algorithmic Paradigm
2012
Clustering is one of the most well known activities in scien- tific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is the model selection problem, i.e., the identifi- cation of the correct number of clusters in a dataset. In the last decade, a few novel techniques for model selection, representing a sharp departure from previous ones in statistics, have been proposed and gained promi- nence for microarray data analysis. Among those, the stability-based methods are the most robust and best performing in terms of predic- tion, but the slowest in terms of time. Unfortunately…
One-Sided Prototype Selection on Class Imbalanced Dissimilarity Matrices
2012
In the dissimilarity representation paradigm, several prototype selection methods have been used to cope with the topic of how to select a small representation set for generating a low-dimensional dissimilarity space. In addition, these methods have also been used to reduce the size of the dissimilarity matrix. However, these approaches assume a relatively balanced class distribution, which is grossly violated in many real-life problems. Often, the ratios of prior probabilities between classes are extremely skewed. In this paper, we study the use of renowned prototype selection methods adapted to the case of learning from an imbalanced dissimilarity matrix. More specifically, we propose the…
On Duality in Learning and the Selection of Learning Teams
1996
AbstractPrevious work in inductive inference dealt mostly with finding one or several machines (IIMs) that successfully learn collections of functions. Herein we start with a class of functions and considerthe learner setof all IIMs that are successful at learning the given class. Applying this perspective to the case of team inference leads to the notion ofdiversificationfor a class of functions. This enable us to distinguish between several flavours of IIMs all of which must be represented in a team learning the given class.
Positive Versions of Polynomial Time
1998
Abstract We show that restricting a number of characterizations of the complexity class P to be positive (in natural ways) results in the same class of (monotone) problems, which we denote by posP . By a well-known result of Razborov, posP is a proper subclass of the class of monotone problems in P . We exhibit complete problems for posP via weak logical reductions, as we do for other logically defined classes of problems. Our work is a continuation of research undertaken by Grigni and Sipser, and subsequently Stewart; indeed, we introduce the notion of a positive deterministic Turing machine and consequently solve a problem posed by Grigni and Sipser.
On a class of languages with holonomic generating functions
2017
We define a class of languages (RCM) obtained by considering Regular languages, linear Constraints on the number of occurrences of symbols and Morphisms. The class RCM presents some interesting closure properties, and contains languages with holonomic generating functions. As a matter of fact, RCM is related to one-way 1-reversal bounded k-counter machines and also to Parikh automata on letters. Indeed, RCM is contained in L-NFCM but not in L-DFCM, and strictly includes L-CPA. We conjecture that L-DFCM subset of RCM
Learning Molecular Classes from Small Numbers of Positive Examples Using Graph Grammars
2021
We consider the following problem: A researcher identified a small number of molecules with a certain property of interest and now wants to find further molecules sharing this property in a database. This can be described as learning molecular classes from small numbers of positive examples. In this work, we propose a method that is based on learning a graph grammar for the molecular class. We consider the type of graph grammars proposed by Althaus et al. [2], as it can be easily interpreted and allows relatively efficient queries. We identify rules that are frequently encountered in the positive examples and use these to construct a graph grammar. We then classify a molecule as being conta…
A new paradigm for pattern classification: Nearest Border Techniques
2013
Published version of a chapter in the book: AI 2013: Advances in Artificial Intelligence. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-319-03680-9_44 There are many paradigms for pattern classification. As opposed to these, this paper introduces a paradigm that has not been reported in the literature earlier, which we shall refer to as the Nearest Border (NB) paradigm. The philosophy for developing such a NB strategy is as follows: Given the training data set for each class, we shall first attempt to create borders for each individual class. After that, we advocate that testing is accomplished by assigning the test sample to the class whose border it lies closest to…
Models of Computation, Riemann Hypothesis, and Classical Mathematics
1998
Classical mathematics is a source of ideas used by Computer Science since the very first days. Surprisingly, there is still much to be found. Computer scientists, especially, those in Theoretical Computer Science find inspiring ideas both in old notions and results, and in the 20th century mathematics. The latest decades have brought us evidence that computer people will soon study quantum physics and modern biology just to understand what computers are doing.
Variability of Classification Results in Data with High Dimensionality and Small Sample Size
2021
The study focuses on the analysis of biological data containing information on the number of genome sequences of intestinal microbiome bacteria before and after antibiotic use. The data have high dimensionality (bacterial taxa) and a small number of records, which is typical of bioinformatics data. Classification models induced on data sets like this usually are not stable and the accuracy metrics have high variance. The aim of the study is to create a preprocessing workflow and a classification model that can perform the most accurate classification of the microbiome into groups before and after the use of antibiotics and lessen the variability of accuracy measures of the classifier. To ev…
Comparative analysis of architectures for monitoring cloud computing infrastructures
2015
The lack of control over the cloud resources is one of the main disadvantages associated to cloud computing. The design of efficient architectures for monitoring such resources can help to overcome this problem. This contribution describes a complete set of architectures for monitoring cloud computing infrastructures, and provides a taxonomy of them. The architectures are described in detail, compared among them, and analysed in terms of performance, scalability, usage of resources, and security capabilities. The architectures have been implemented in real world settings and empirically validated against a real cloud computing infrastructure based on OpenStack. More than 1000 virtual machin…