6533b82bfe1ef96bd128d4b2

RESEARCH PRODUCT

On the Generalizability of Programs Synthesized by Grammar-Guided Genetic Programming

Dominik Sobania

subject

business.industryComputer scienceSoftware developmentGenetic programming02 engineering and technologyMachine learningcomputer.software_genreTournament selectionSoftware metricTest case020204 information systems0202 electrical engineering electronic engineering information engineering020201 artificial intelligence & image processingGeneralizability theoryArtificial intelligencebusinesscomputerSelection (genetic algorithm)Program synthesis

description

Grammar-guided Genetic Programming is a common approach for program synthesis where the user’s intent is given by a set of input/output examples. For use in real-world software development, the generated programs must work on previously unseen test cases too. Therefore, we study in this work the generalizability of programs synthesized by grammar-guided GP with lexicase selection. As benchmark, we analyze proportionate and tournament selection too. We find that especially for program synthesis problems with a low output cardinality (e.g., a Boolean output) lexicase selection overfits the training cases and does not generalize well to unseen test cases. An analysis using common software metrics shows for such a problem that lexicase selection generates more complex programs with many code lines and a heavier use of control structures compared to the other studied selection methods. Nevertheless, the generalizability can be improved when we do not stop a GP run as usual after a first program is found that solves all training cases correctly, but give GP more time to find further solution candidates (also solving correctly all training cases) and select the smallest program (measured with different software metrics) out of these.

https://doi.org/10.1007/978-3-030-72812-0_9