Search results for " Selection"
showing 10 items of 1271 documents
Variable Selection with Quasi-Unbiased Estimation: the CDF Penalty
2022
We propose a new non-convex penalty in linear regression models. The new penalty function can be considered a competitor of the LASSO, SCAD or MCP penalties, as it guarantees sparse variable selection while reducing bias for the non-null estimates. We introduce the methodology and present some comparisons among different approaches.
Evaluation of the effect of chance correlations on variable selection using Partial Least Squares-Discriminant Analysis
2013
Variable subset selection is often mandatory in high-throughput metabolomics and proteomics. However, depending on the variable-to-sample ratio, variable selection is highly susceptible to chance correlations. The evaluation of the predictive capabilities of PLSDA models estimated by cross-validation after feature selection provides overly optimistic results if the selection is performed on the entire set and no external validation set is available. In this work, a simulation of the statistical null hypothesis is proposed to test whether the discrimination capability of a PLSDA model after variable selection estimated by cross-validation is statistically higher than t…
Analyses spectrale et texturale de données haute résolution pour la détection automatique des maladies de la vigne
2019
‘Flavescence dorée’ is a contagious and incurable disease present on the vine leaves. The DAMAV project (Automatic detection of Vine Diseases) aims to develop a solution for automated detection of vine diseases using a micro-drone. The goal is to offer a turnkey solution for wine growers. This tool will allow the search for potential foci, and then more generally any type of detectable vine disease on the foliage. To enable this diagnosis, the foliage is proposed to be studied using a dedicated high-resolution multispectral camera. The objective of this PhD thesis in the context of DAMAV is to participate in the design and implementation of a Multi-Spectral (MS) image acquisition system and …
Evolutionary selection and variation in family businesses
2011
Purpose: This qualitative study attempts to understand what kinds of evolutionary selection and variation occur in family businesses during the preparation of a managerial and ownership succession. Design/methodology/approach: The study was conducted by interviewing members of one family business in Louisiana, USA and one in Finland in order to contribute to the understanding of succession preparation in small family businesses with two generations. Evolutionary economics was adapted for this interdisciplinary study to explain evolutionary changes in a family business succession. Findings: The findings indicate that both selection and variation can take place through different routes during the pre…
Ventricular arrhythmias in children: The uselessness of MRI
2008
Filtered circular fingerprints improve either prediction or runtime performance while retaining interpretability.
2016
Background: Even though circular fingerprints were first introduced more than 50 years ago, they are still widely used for building highly predictive, state-of-the-art (Q)SAR models. Historically, these structural fragments were designed to search large molecular databases. Hence, to derive a compact representation, circular fingerprint fragments are often folded to comparatively short bit-strings. However, folding fingerprints introduces bit collisions, and therefore adds noise to the encoded structural information and removes its interpretability. Both representations, folded as well as unprocessed fingerprints, are often used for (Q)SAR modeling. Results: We show that it can be prefer…
Feature selection: A multi-objective stochastic optimization approach
2020
The feature subset task can be cast as a multiobjective discrete optimization problem. In this work, we study the search algorithm component of a feature subset selection method. We propose an algorithm based on the threshold accepting method, extended to the multi-objective framework by an appropriate definition of the acceptance rule. The method is used in the task of identifying relevant subsets of features in a Web bot recognition problem, where automated software agents on the Web are identified by analyzing the stream of HTTP requests to a Web server.
Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach
2020
Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which relies on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…
Energy efficient and distributed resource allocation for wireless powered OFDMA multi-cell networks
2017
In this paper, we investigate the energy efficient resource allocation problem for wireless powered OFDMA multi-cell networks. In the considered system, the users who have data to transmit in the uplink can only be powered by the wireless power obtained from multiple base stations (BSs) with a large scale of multiple antennas in the downlink. A time division protocol is considered to divide the time of wireless power transfer (WPT) in the downlink and wireless information transfer (WIT) in the uplink into separate time slots. With the objective of improving the energy efficiency (EE) of the system, we propose the antenna selection, time allocation, subcarrier and power allocation schemes…
Wind Speed Forecasting by Box-Jenkins Models
2008
The possibility of modelling observed wind speed time series and forecasting their future values is presented in this paper. Seasonal autoregressive integrated moving average (SARIMA) models are applied to time series formed by four years of hourly average wind speed measurements at thirty sites in Sicily. Our approach is considerably different from the original one (the Box-Jenkins approach) since it is completely automatic. We use a peculiar feature of wind speed on a land area, its daily period, to identify a class of SARIMA models within which to find the best fitting model by information criteria (here we employ AICC). Here we report the results, concerning the fit and forecast accuracy, …