Search results for "overfitting"
showing 10 items of 22 documents
Classification of Heart Sounds Using Convolutional Neural Network
2020
Heart sounds play an important role in the diagnosis of cardiac conditions. Due to the low signal-to-noise ratio (SNR), it is problematic and time-consuming for experts to discriminate different kinds of heart sounds. Thus, objective classification of heart sounds is essential. In this study, we combined a conventional feature engineering method with deep learning algorithms to automatically classify normal and abnormal heart sounds. First, 497 features were extracted from eight domains. Then, we fed these features into the designed convolutional neural network (CNN), in which the fully connected layers that are usually used before the classification layer were replaced with a global averag…
Bivariate nonlinear prediction to quantify the strength of complex dynamical interactions in short-term cardiovascular variability.
2005
A nonlinear prediction method for investigating the dynamic interdependence between short length time series is presented. The method is a generalization to bivariate prediction of the univariate approach based on nearest neighbor local linear approximation. Given the input and output series x and y, the relationship between a pattern of samples of x and a synchronous sample of y was approximated with a linear polynomial whose coefficients were estimated from an equation system including the nearest neighbor patterns in x and the corresponding samples in y. To avoid overfitting and waste of data, the training and testing stages of the prediction were designed through a specific out-of-sampl…
Feature selection for classification of music according to expressed emotion
2009
Genetic programming through bi-objective genetic algorithms with a study of a simulated moving bed process involving multiple objectives
2013
A new bi-objective genetic programming (BioGP) technique has been developed for meta-modeling and applied in a chromatographic separation process using a simulated moving bed (SMB) process. The BioGP technique initially minimizes training error through a single objective optimization procedure and then a trade-off between complexity and accuracy is worked out through a genetic algorithm based bi-objective optimization strategy. A benefit of the BioGP approach is that an expert user or a decision maker (DM) can flexibly select the mathematical operations involved to construct a meta-model of desired complexity or accuracy. It is also designed to combat bloat - a perennial problem in genetic …
A comprehensive study of automatic program repair on the QuixBugs benchmark
2021
Automatic program repair papers tend to repeatedly use the same benchmarks. This poses a threat to the external validity of the findings of the program repair research community. In this paper, we perform an empirical study of automatic repair on a benchmark of bugs called QuixBugs, which has been little studied. In this paper, (1) We report on the characteristics of QuixBugs; (2) We study the effectiveness of 10 program repair tools on it; (3) We apply three patch correctness assessment techniques to comprehensively study the presence of overfitting patches in QuixBugs. Our key results are: (1) 16/40 buggy programs in QuixBugs can be repaired with at least a test suite adequate pa…
Proposition of Convolutional Neural Network Based System for Skin Cancer Detection
2019
Automated skin cancer diagnosis tools play a vital role in timely screening, helping dermatologists focus on melanoma cases. State-of-the-art approaches to automated melanoma screening use deep learning-based methods, especially deep convolutional neural networks (CNN), to improve performance. Because of the large number of parameters that can be involved in training a CNN, many training samples are needed to avoid overfitting. Gabor filtering can efficiently extract spatial information including edges and textures, which may reduce the feature extraction burden on the CNN. In this paper, we propose a Gabor Convolutional Network (GCN) model to improve the performance of automated diagnosis of …
Automated Patch Assessment for Program Repair at Scale
2021
In this paper, we do automatic correctness assessment for patches generated by program repair systems. We consider the human-written patch as ground truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al., called Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems, and we evaluate automated patch assessment on this dataset. The results of this study are novel and significant: First, we improve the state of the art performance of automatic patch assessment with RGT by 190% by improving the oracle; Second, we show that RGT is reliable enough to h…
Assessment of susceptibility to earth-flow landslide using logistic regression and multivariate adaptive regression splines: A case of the Belice Riv…
2015
In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: Logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has been already demonstrated to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km² and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km². To explore the effect of pre-failure topography on earth-flow sp…
An analysis of the bias of variation operators of estimation of distribution programming
2018
Estimation of distribution programming (EDP) replaces standard GP variation operators with sampling from a learned probability model. To ensure a minimum amount of variation in a population, EDP adds random noise to the probabilities of random variables. This paper studies the bias of EDP's variation operator by performing random walks. The results indicate that the complexity of the EDP model is high since the model is overfitting the parent solutions when no additional noise is being used. Adding only a low amount of noise leads to a strong bias towards small trees. The bias gets stronger with an increased amount of noise. Our findings do not support the hypothesis that sampling drift is …
Generalizability and Simplicity as Criteria in Feature Selection: Application to Mood Classification in Music
2011
Classification of musical audio signals according to expressed mood or emotion has evident applications to content-based music retrieval in large databases. Wrapper selection is a dimension reduction method that has been proposed for improving classification performance. However, the technique is prone to lead to overfitting of the training data, which decreases the generalizability of the obtained results. We claim that previous attempts to apply wrapper selection in the field of music information retrieval (MIR) have led to disputable conclusions about the used methods due to inadequate analysis frameworks, indicative of overfitting, and biased results. This paper presents a framework bas…