RESEARCH PRODUCT

Finding optimal finite biological sequences over finite alphabets: the OptiFin toolbox

Stéphane Chrétien, Regis Garnier, Christophe Guyeux

subject

FOS: Computer and information sciences; 0301 basic medicine; Theoretical computer science; Optimization problem; Computer Science - Artificial Intelligence; Computer science; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Quantitative Biology - Quantitative Methods; Set (abstract data type); [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; State space; Metaheuristic; Quantitative Methods (q-bio.QM); Protein structure prediction; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Toolbox; Core (game theory); Artificial Intelligence (cs.AI); 030104 developmental biology; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; FOS: Biological sciences; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Word (computer architecture)

description

International audience; In this paper, we present a toolbox for a specific optimization problem that frequently arises in bioinformatics and genomics. In this problem, the state space is the set of words of a specified length over a finite alphabet. Each word is associated with a score, and the overall objective is to find the words with the lowest possible score. This type of general optimization problem is encountered in, e.g., 3D conformation optimization for protein structure prediction, or the discovery of the largest subset of core genes based on the best-supported phylogenetic tree for a set of species. To solve this problem, we propose a toolbox that can be easily launched using MPI and embeds 3 well-known metaheuristics. The toolbox is fully parametrized and well documented. It has been specifically designed to be easily modified, and possibly improved, by the user depending on the application, and it does not require the user to be a computer scientist. We show that the toolbox performs very well on two difficult practical problems.
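
To make the problem statement concrete, the following is a minimal Python sketch of the setting described above: words of fixed length over a finite alphabet, a score function, and a search for a lowest-scoring word. It is not the OptiFin API, and it does not reproduce any of the three metaheuristics embedded in the toolbox (the description does not name them). The function names `local_search` and `toy_score`, the toy Hamming-distance score, and the target word are all assumptions made for illustration; the search shown is a plain single-process stochastic local search, with no MPI parallelism.

```python
import random


def local_search(alphabet, length, score, iterations=2000, seed=0):
    """Stochastic local search over words of a fixed length.

    Hypothetical helper, not part of OptiFin: it mutates one position
    at a time and keeps the candidate whenever its score is not worse.
    """
    rng = random.Random(seed)
    # Start from a random word of the requested length.
    word = [rng.choice(alphabet) for _ in range(length)]
    best = score(word)
    for _ in range(iterations):
        # Propose a neighbour that differs in at most one position.
        pos = rng.randrange(length)
        candidate = word.copy()
        candidate[pos] = rng.choice(alphabet)
        s = score(candidate)
        # Lower scores are better; ties are accepted to keep moving.
        if s <= best:
            word, best = candidate, s
    return "".join(word), best


def toy_score(word, target="GATTACA"):
    """Toy score (an assumption): Hamming distance to a fixed target word."""
    return sum(a != b for a, b in zip(word, target))


if __name__ == "__main__":
    word, value = local_search("ACGT", 7, toy_score)
    print(word, value)
```

In a real application the toy score would be replaced by a domain-specific one, such as a conformation energy or a phylogenetic support measure. Such scores typically have many local minima, which is the reason for using metaheuristics rather than the simple accept-if-not-worse rule sketched here.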

https://hal.archives-ouvertes.fr/hal-02392536/file/23c798ae-611d-43b3-8a7b-f09fe3c0736f-author.pdf