Search results for "Machine learning"
showing 10 items of 1464 documents
Measuring the Novelty of Natural Language Text Using the Conjunctive Clauses of a Tsetlin Machine Text Classifier
2020
Most supervised text classification approaches assume a closed world, counting on all classes being present in the data at training time. This assumption can lead to unpredictable behaviour during operation, whenever novel, previously unseen, classes appear. Although deep learning-based methods have recently been used for novelty detection, they are challenging to interpret due to their black-box nature. This paper addresses interpretable open-world text classification, where the trained classifier must deal with novel classes during operation. To this end, we extend the recently introduced Tsetlin machine (TM) with a novelty scoring mechanism. The mechanism uses the conjunctive clau…
Standard Vs Uniform Binary Search and Their Variants in Learned Static Indexing: The Case of the Searching on Sorted Data Benchmarking Software Platf…
2023
Learned Indexes are a novel approach to searching in a sorted table. A model is used to predict an interval in which to search, and a Binary Search routine is used to finalize the search. They are quite effective. For the final stage, the lower_bound routine of the Standard C++ library is usually used, although this is a natural choice rather than a requirement. However, recent studies that do not use Machine Learning predictions indicate that other implementations of Binary Search or variants, namely k-ary Search, are better suited to take advantage of the features offered by modern computer architectures. With the use of the Searching on Sorted Sets SOSD Learned Indexing bench…
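The two-stage lookup described in this abstract (a model predicts an interval, then a binary search finishes inside it) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the model is a single least-squares line, the function names build_linear_model and learned_lower_bound are hypothetical, and Python's bisect.bisect_left stands in for the C++ lower_bound routine. It is not the benchmarked software.

```python
import bisect
import math


def build_linear_model(keys):
    """Fit position ~ slope * key + intercept on a sorted list of keys.

    Returns (slope, intercept, err), where err is an integer bound on how far
    the predicted position can be from the true position of any stored key.
    """
    n = len(keys)
    mean_key = sum(keys) / n
    mean_pos = (n - 1) / 2
    var = sum((k - mean_key) ** 2 for k in keys)
    cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(keys))
    slope = cov / var if var else 0.0  # non-negative because keys are sorted
    intercept = mean_pos - slope * mean_key
    max_err = max(abs(i - (slope * k + intercept)) for i, k in enumerate(keys))
    return slope, intercept, math.ceil(max_err) + 1


def learned_lower_bound(keys, model, query):
    """Return the first index i with keys[i] >= query (len(keys) if none)."""
    slope, intercept, err = model
    guess = int(slope * query + intercept)
    n = len(keys)
    # Stage 1: the model narrows the search to a window around its guess.
    lo = min(max(0, guess - err), n)
    hi = max(lo, min(n, guess + err + 1))
    # Stage 2: an ordinary binary search finishes inside that window.
    return bisect.bisect_left(keys, query, lo, hi)


keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
model = build_linear_model(keys)
print(learned_lower_bound(keys, model, 12))
```

Because the fitted line is non-decreasing and err bounds its error on every stored key, the window always contains the correct answer, so the final stage could be swapped for another search routine (for example a k-ary search) without changing the result.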
Accelerated dinuclear palladium catalyst identification through unsupervised machine learning.
2021
Although machine learning bears enormous potential to accelerate developments in homogeneous catalysis, the frequent need for extensive experimental data can be a bottleneck for implementation. Here, we report an unsupervised machine learning workflow that uses only five experimental data points. It makes use of generalized parameter databases that are complemented with problem-specific in silico data acquisition and clustering. We showcase the power of this strategy for the challenging problem of speciation of palladium (Pd) catalysts, for which a mechanistic rationale is currently lacking. From a total space of 348 ligands, the algorithm predicted, and we experimentally verified, a number…
Basic Chemometric Tools
2013
The authentication of protected designation of origin and other protected geographical indications for foods requires deep knowledge of these kinds of samples and the correct identification of markers that are suitable for authentication purposes. For this, significance tests must be developed and applied to provide evidence in a fast and accurate way; from this, it seems clear that advances in analytical tools, to obtain data regarding food chemical composition, and chemometric data treatments must be continued to provide users with powerful identification methodologies. In this sense, the objective must be to differentiate between foods pro…
Benchmarking Wilms’ tumor in multisequence MRI data: why does current clinical practice fail? Which popular segmentation algorithms perform well?
2019
Wilms' tumor is one of the most frequent malignant solid tumors in childhood. Accurate segmentation of tumor tissue is a key step during therapy and treatment planning. Since it is difficult to obtain a comprehensive set of tumor data of children, there is no benchmark so far allowing evaluation of the quality of human or computer-based segmentations. The contributions in our paper are threefold: (i) we present the first heterogeneous Wilms' tumor benchmark data set. It contains multisequence MRI data sets before and after chemotherapy, along with ground truth annotation, approximated based on the consensus of five human experts. (ii) We analyze human expert annotations and interrater varia…
Learning the relevant image features with multiple kernels
2009
This paper proposes to learn the relevant features of remote sensing images for automatic spatio-spectral classification with the automatic optimization of multiple kernels. The method consists of building dedicated kernels for different sets of bands, contextual or textural features. The optimal linear combination of kernels is found through gradient descent on the support vector machine (SVM) objective function. Since a naïve implementation is computationally demanding, we propose an efficient model selection procedure based on kernel alignment. The result is a weight, learned from the data, for each kernel where both relevant and meaningless image features emerge after training. E…
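As a rough illustration of the kernel alignment mentioned in this abstract, the sketch below computes the standard kernel-target alignment score, A(K, y) = <K, yy^T>_F / (||K||_F ||yy^T||_F), for two kernels built on different feature groups. The excerpt does not say how the authors use alignment in their procedure, so the RBF kernel, the gamma value, the toy data, and the function names are all assumptions made for illustration.

```python
import math


def rbf_kernel(X, gamma=1.0):
    """Gram matrix of the RBF kernel exp(-gamma * ||xi - xj||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def frobenius_inner(A, B):
    """Frobenius inner product: sum of element-wise products of two matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def alignment(K, y):
    """Kernel-target alignment between Gram matrix K and labels y in {-1, +1}."""
    Y = [[yi * yj for yj in y] for yi in y]  # ideal kernel y y^T
    return frobenius_inner(K, Y) / math.sqrt(
        frobenius_inner(K, K) * frobenius_inner(Y, Y)
    )


# Toy data (hypothetical): one feature group separates the classes, one does not.
y = [-1, -1, 1, 1]
informative = [[0.0], [0.1], [1.0], [1.1]]
uninformative = [[0.0], [1.0], [0.1], [1.1]]

score_informative = alignment(rbf_kernel(informative), y)
score_uninformative = alignment(rbf_kernel(uninformative), y)
print(score_informative, score_uninformative)
```

A kernel whose similarity structure matches the class labels scores higher, which is why alignment can serve as a cheap proxy for ranking candidate kernels before any SVM is trained.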
Recent advances in remote sensing image processing
2009
Remote sensing image processing is nowadays a mature research area. The techniques developed in the field allow many real-life applications with great societal value. For instance, urban monitoring, fire detection or flood prediction can have a great impact on economical and environmental issues. To attain such objectives, the remote sensing community has turned into a multidisciplinary field of science that embraces physics, signal theory, computer science, electronics, and communications. From a machine learning and signal/image processing point of view, all the applications are tackled under specific formalisms, such as classification and clustering, regression and function approximation…
Generating Hyperspectral Skin Cancer Imagery using Generative Adversarial Neural Network
2020
In this study we develop a proof of concept of using generative adversarial neural networks in hyperspectral skin cancer imagery production. A generative adversarial neural network is a neural network in which two neural networks compete. The generator tries to produce data that is similar to the measured data, and the discriminator tries to correctly classify the data as fake or real. This is a reinforcement learning model, where both models get reinforcement based on their performance. In the training of the discriminator we use data measured from skin cancer patients. The aim of the study is to develop a generator for augmenting hyperspectral skin cancer imagery.
Setting up of a machine learning algorithm for the identification of severe liver fibrosis profile in the general US population cohort
2022
Background: The progress of digital transformation in clinical practice opens the door to transforming the current clinical line for liver disease diagnosis from a late-stage diagnosis approach to an early-stage based one. Early diagnosis of liver fibrosis can prevent the progression of the disease and decrease liver-related morbidity and mortality. We developed here a machine learning (ML) algorithm containing standard parameters that can identify liver fibrosis in the general US population. Materials and methods: Starting from a public database (National Health and Nutrition Examination Survey, NHANES), representative of the American population with 7265 eligible subjects (control populati…