RESEARCH PRODUCT

CovSel

Franz Rothlauf, Dominik Sobania

subject

Computer science; Process (computing); Phase (waves); Genetic programming; 02 engineering and technology; 01 natural sciences; Ensemble learning; Set (abstract data type); 010104 statistics & probability; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Point (geometry); 0101 mathematics; Symbolic regression; Algorithm; Selection (genetic algorithm)

description

Ensemble methods combine the predictions of a set of models to achieve better prediction quality than a single model. The ensemble process consists of three steps: 1) the generation phase, where the models are created; 2) the selection phase, where a set of possible ensembles is composed and one is chosen by a selection method; and 3) the fusion phase, where the predictions of the individual models in the selected ensemble are combined into the ensemble's estimate. This paper proposes CovSel, a selection approach for regression problems that ranks ensembles based on the coverage of adequately estimated training points and selects the ensemble with the highest coverage to be used in the fusion phase. An ensemble covers a training point if at least one of its models produces an adequate prediction for this training point. The more training points are covered this way, the higher the ensemble's coverage. The selection of the "right" ensemble has a large impact on the final prediction. Results for two symbolic regression problems show that CovSel improves the predictions for various state-of-the-art fusion methods for ensembles composed of independently evolved genetic programming (GP) models and also beats approaches based on single GP models.
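
The coverage-based ranking described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define what counts as an "adequate" prediction or how candidate ensembles are composed, so the absolute-error tolerance `tol`, the enumeration of all fixed-size subsets, and the names `coverage` and `select_by_coverage` are assumptions made for illustration only.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple

Model = Callable[[float], float]


def coverage(ensemble: Sequence[Model], xs: Sequence[float],
             ys: Sequence[float], tol: float) -> float:
    """Fraction of training points covered by the ensemble.

    Assumption: a prediction is "adequate" if its absolute error is
    at most `tol`. A point is covered if at least one model in the
    ensemble predicts it adequately.
    """
    covered = sum(
        1 for x, y in zip(xs, ys)
        if any(abs(model(x) - y) <= tol for model in ensemble)
    )
    return covered / len(xs)


def select_by_coverage(models: Sequence[Model], xs: Sequence[float],
                       ys: Sequence[float], size: int,
                       tol: float) -> Tuple[Model, ...]:
    """Return the candidate ensemble with the highest coverage.

    Assumption: the candidates are all subsets of `models` with
    `size` members; ties go to the first candidate enumerated.
    """
    candidates = combinations(models, size)
    return max(candidates, key=lambda e: coverage(e, xs, ys, tol))


# Toy stand-ins for independently evolved models (hypothetical).
def m_linear(x: float) -> float:
    return x


def m_square(x: float) -> float:
    return x * x


def m_zero(x: float) -> float:
    return 0.0


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 3.0]

best = select_by_coverage([m_linear, m_square, m_zero], xs, ys,
                          size=2, tol=0.1)
best_coverage = coverage(best, xs, ys, tol=0.1)
```

In this toy data, `m_linear` misses the point at x = 2 and `m_square` misses the point at x = 3, but together they cover every training point, so the pair is preferred over either pairing with `m_zero`. The selected ensemble would then be passed to whichever fusion method is in use.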

https://doi.org/10.1145/3205455.3205570