Search results for "Learning"
showing 10 items of 6669 documents
Efficient and Accurate OTU Clustering with GPU-Based Sequence Alignment and Dynamic Dendrogram Cutting.
2015
De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines. This technique works by detecting an abrupt change in the merging heights of a dendrogram. To produce the output dendrograms, CRiSPy employs the OTU hierarchical clusterin…
The impact of sample reduction on PCA-based feature extraction for supervised learning
2006
"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic rise in computational complexity and classification error in high dimensions. In this paper, different feature extraction (FE) techniques are analyzed as means of dimensionality reduction and constructive induction with respect to the performance of the Naive Bayes classifier. When a data set contains a large number of instances, some sampling approach is applied to address the computational complexity of FE and classification processes. The main goal of this paper is to show the impact of sample reduction on the process of FE for supervised learning. In our study we analyzed the conventional PC…
Towards a Hierarchical Multitask Classification Framework for Cultural Heritage
2018
Digital technologies such as 3D imaging, data analytics and computer vision have opened the door to a large set of applications in cultural heritage. Digital acquisition of a cultural asset nowadays takes a couple of seconds thanks to the achievements in 2D and 3D acquisition technologies. However, enriching these cultural assets with labels and relevant metadata is still not fully automated, especially due to their nature and specificities. With the recent publication of several cultural heritage datasets, many researchers are tackling the challenge of effectively classifying and annotating digital heritage. The challenges that are often addressed are related to visual recognition and image c…
The predictive power of game-related statistics for the final result under the rule changes introduced in the men’s world water polo championship: a …
2019
The objectives of this study were (i) to compare water polo game-related statistics by match outcome (winning and losing teams) after the application of the new rules, and (ii) to develop a classif...
Editing prototypes in the finite sample size case using alternative neighborhoods
1998
The recently introduced concept of Nearest Centroid Neighborhood is applied to discard outliers and prototypes in class overlapping regions in order to improve the performance of the Nearest Neighbor rule through an editing procedure. This approach is related to graph-based editing algorithms, which also define alternative neighborhoods in terms of geometric relations. Classical editing algorithms are compared to these alternative editing schemes using several synthetic and real data problems. The empirical results show that the proposed editing algorithm constitutes a good trade-off between performance and computational burden.
Enter the Serious E-scape Room: A Cost-Effective Serious Game Model for Deep and Meaningful E-learning
2019
Escape rooms are a phenomenon that has taken the world by storm in the last decade. Simultaneously, Virtual Reality is a promising technology for innovation in education, training and e-learning. Combining these two concepts, this paper outlines a new model for designing serious games in virtual reality environments for high quality, deep and meaningful learning, the Serious E-scape Room. It describes the theoretical grounding, general guidelines and principles of the model. It also presents the case study “Room of Keys”, a serious virtual escape room for biology concepts. To test the assumptions of the model, researchers conducted a mixed research study with 148 students in a US high school…
Advancing Deep Learning for Earth Sciences: From Hybrid Modeling to Interpretability
2020
Machine learning, and deep learning in particular, have made a huge impact in many fields of science and engineering. In the last decade, advanced deep learning methods have been developed and applied to remote sensing and geoscientific data problems extensively. Applications in classification and parameter retrieval are making a difference: methods are very accurate, can handle large amounts of data, and can deal with spatial and temporal data structures efficiently. Nevertheless, several important challenges still need to be addressed. First, current standard deep architectures cannot deal with long-range dependencies so distant driving processes (in space or time) are not captured, and the…
A Machine Learning-Based Prediction Platform for P-Glycoprotein Modulators and Its Validation by Molecular Docking
2019
P-glycoprotein (P-gp) is an important determinant of multidrug resistance (MDR) because its overexpression is associated with increased efflux of various established chemotherapy drugs in many clinically resistant and refractory tumors. This leads to insufficient therapeutic targeting of tumor populations, representing a major drawback of cancer chemotherapy. Therefore, P-gp is a target for pharmacological inhibitors to overcome MDR. In the present study, we utilized machine learning strategies to establish a model for P-gp modulators to predict whether a given compound would behave as a substrate or an inhibitor of P-gp. Random forest feature selection algorithm-based leave-one-out random sampl…
Feature Extraction and Selection for Pain Recognition Using Peripheral Physiological Signals.
2019
In pattern recognition, the selection of appropriate features is paramount to both the performance and the robustness of the system. Over-reliance on machine learning-based feature selection methods can therefore be problematic, especially when conducted using small snapshots of data. The results of these methods, if adopted without proper interpretation, can lead to sub-optimal system design or, worse, the abandonment of otherwise viable and important features. In this work, a deep exploration of pain-based emotion classification was conducted to better understand differences in the results of the related literature. In total, 155 different time domain and frequency domain features were e…
ADME Prediction with KNIME: Development and Validation of a Publicly Available Workflow for the Prediction of Human Oral Bioavailability.
2020
In silico prediction of human oral bioavailability is a relevant tool for the selection of potential drug candidates and for the rejection of those molecules with a lower probability of success during the early stages of drug discovery and development. However, the high variability and complexity of oral bioavailability and the limited experimental data in the public domain have mainly restricted the development of reliable in silico models to predict this property from the chemical structure. In this study we present a KNIME automated workflow to predict human oral bioavailability of new drug and drug-like molecules based on five machine learning approaches combined into an ensemble model. Th…