Search results for "ample"
showing 10 items of 2398 documents
The impact of sample reduction on PCA-based feature extraction for supervised learning
2006
"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic rise of computational complexity and classification error in high dimensions. In this paper, different feature extraction (FE) techniques are analyzed as means of dimensionality reduction, and constructive induction with respect to the performance of the Naive Bayes classifier. When a data set contains a large number of instances, some sampling approach is applied to address the computational complexity of FE and classification processes. The main goal of this paper is to show the impact of sample reduction on the process of FE for supervised learning. In our study we analyzed the conventional PC…
Editing prototypes in the finite sample size case using alternative neighborhoods
1998
The recently introduced concept of Nearest Centroid Neighborhood is applied to discard outliers and prototypes in class overlapping regions in order to improve the performance of the Nearest Neighbor rule through an editing procedure. This approach is related to graph based editing algorithms which also define alternative neighborhoods in terms of geometric relations. Classical editing algorithms are compared to these alternative editing schemes using several synthetic and real data problems. The empirical results show that the proposed editing algorithm constitutes a good trade-off between performance and computational burden.
Grapevine and wine metabolomics-based guidelines for fair data and metadata management
2021
In the era of big and omics data, good organization, management, and description of experimental data are crucial for achieving high-quality datasets. This, in turn, is essential to export robust results, publish reliable papers, make data more easily available, and unlock the huge potential of data reuse. More and more journals now require authors to share data and metadata according to the FAIR (Findable, Accessible, Interoperable, Reusable) principles. This work aims to provide a step-by-step guideline for the FAIR data and metadata management specific to grapevine and wine science. In detail, the guidelines include recommendations for the organization of data and meta…
Are nonlinear model-free conditional entropy approaches for the assessment of cardiac control complexity superior to the linear model-based one?
2016
Objective: We test the hypothesis that the linear model-based (MB) approach for the estimation of conditional entropy (CE) can be utilized to assess the complexity of the cardiac control in healthy individuals. Methods: An MB estimate of CE was tested in an experimental protocol (i.e., the graded head-up tilt) known to produce a gradual decrease of cardiac control complexity as a result of the progressive vagal withdrawal and concomitant sympathetic activation. The MB approach was compared with traditionally exploited nonlinear model-free (MF) techniques such as corrected approximate entropy, sample entropy, corrected CE, two k-nearest-neighbor CE procedures and permutation CE. Electroca…
Characterization of entropy measures against data loss: Application to EEG records
2012
This study is aimed at characterizing three signal entropy measures, Approximate Entropy (ApEn), Sample Entropy (SampEn) and Multiscale Entropy (MSE), over real EEG signals when a number of samples are randomly lost due to, for example, wireless data transmission. The experimental EEG database comprises two main signal groups: control EEGs and epileptic EEGs. Results show that both SampEn and ApEn enable a clear distinction between control and epileptic signals, but SampEn shows a more robust performance over a wide range of sample loss ratios. MSE exhibits poor behavior for sample loss ratios over 40%. The EEG non-stationary and random trends are kept even when a great number of samp…
Register data in sample allocations for small-area estimation
2018
The inadequate control of sample sizes in surveys using stratified sampling and area estimation may occur when the overall sample size is small or auxiliary information is insufficiently used. Very small sample sizes are possible for some areas. The proposed allocation based on multi-objective optimization uses a small-area model and estimation method and semi-collected empirical data annually collected empirical data. The assessment of its performance at the area and at the population levels is based on design-based sample simulations. Five previously developed allocations serve as references. The model-based estimator is more accurate than the design-based Horvitz–Thompson estimator and t…
A Novel Marking Reader for Progressive Addition Lenses Based on Gabor Holography
2016
PURPOSE Progressive addition lenses (PALs) are marked with permanent engraved marks (PEMs) at standardized locations. Permanent engraved marks are very useful through the manufacturing and mounting processes, act as locator marks to re-ink the removable marks, and contain useful information about the PAL. However, PEMs are often faint and weak, obscured by scratches, partially occluded, and difficult to recognize on tinted lenses or with antireflection or scratch-resistant coatings. The aim of this article is to present a new generation of portable marking reader based on an extremely simplified concept for visualization and identification of PEMs in PALs. METHODS Permanent engraved marks o…
Identification of differential risk hotspots for collision and vehicle type in a directed linear network
2019
Traffic accidents can take place in very different ways and involve substantially different numbers and types of vehicles. Thus, it is of interest to know which parts of a road structure present an overrepresentation of a specific type of traffic accident, especially for some typologies of collisions and vehicles that tend to trigger more severe consequences for the users involved. In this study, a spatial approach is followed to estimate the risk that different types of collisions and vehicles present in the central area of Valencia (Spain), considering the accidents observed in this city during the period 2014-2017. A directed spatial linear network representing the non-pedestrian ro…
Clinically-Driven Virtual Patient Cohorts Generation: An Application to Aorta
2021
The combination of machine learning methods together with computational modeling and simulation of the cardiovascular system brings the possibility of obtaining very valuable information about new therapies or clinical devices through in-silico experiments. However, the application of machine learning methods demands access to large cohorts of patients. As an alternative to medical data acquisition and processing, which often requires some degree of manual intervention, the generation of virtual cohorts made of synthetic patients can be automated. However, the generation of a synthetic sample can still be computationally demanding to guarantee that it is clinically meaningful and that it re…
A Software Package for a Serum Bank Management
1979
A serum bank is a collection of human serum samples coming from different locations (in our case Children Hospital, schools, factories, town departments), allocated in some archives. The principal users of such a data bank are, of course, physicians and biologists, who are mainly interested in statistical analysis (computation of averages, variances, factor analysis, etc.) of immunological and epidemiological relevance, in order to investigate some haematochemical parameters common to some selected subset of the archives [1], [2].