
RESEARCH PRODUCT

Innovative bioinformatic approaches for the analysis of high-throughput sequencing data applied to the study of rare genetic disorders with developmental anomalies

Philippine Garret

subject

Bioinformatics; Exome; Rare genetic diseases; [SDV.BBM] Life Sciences [q-bio]/Biochemistry Molecular Biology

description

In recent years, the advent of exome sequencing (ES) in diagnosis and research has led to the identification of the genetic bases of many Mendelian disorders, ending the diagnostic odyssey of many patients. Nevertheless, ES data analysis identifies pathogenic or likely pathogenic variants in only 30 to 45% of undiagnosed cases. Limits exist at the clinical, molecular and bioinformatic levels. The constant evolution of clinical knowledge, of the number of genes implicated in human diseases, and of clinical-biological correlations has a significant impact on data analysis and is progressively improving diagnosis. Technological limits, especially uncovered regions, remain but have been significantly reduced in recent years. Although genome sequencing will solve some undiagnosed cases, especially those caused by non-coding or structural variants, much information remains to be extracted and analyzed from ES data. Finally, beyond single-nucleotide variant (SNV) and copy-number variant (CNV) analyses, other genetic events can be involved in rare disorders, and detecting them requires dedicated bioinformatic development.
The aim of the project was therefore to improve bioinformatic approaches to ES data analysis in order to identify new molecular mechanisms involved in rare genetic disorders and to shorten the diagnostic odyssey.
Several strategies were established. The first consisted in reanalysing ES data from 80 undiagnosed patients sequenced by the Laboratoire CERBA (CIFRE thesis). It led to the identification of 2 new candidate genes involved in intellectual disability (ID), notably the OTUD7A gene (article 1). The second strategy was the development of a bioinformatic pipeline to extract mitochondrial DNA data from ES data. The mitochondrial genome is not targeted by exome capture kits, but its reads can be recovered from off-target data, which makes it possible to analyze it from preexisting ES data.
In the GAD exome cohort of undiagnosed patients, 2 causal variants were identified in 2 of 928 individuals affected with neurodevelopmental disorders, providing a diagnosis for 0.2% of previously undiagnosed patients (article 2). The third strategy was the development of a bioinformatic pipeline to identify mobile element insertions in ES data, with the expectation that about 0.03% of pathogenic variants originate from de novo mobile element insertions. In the GAD exome cohort of 3322 undiagnosed patients, this approach identified two Alu element insertions, in exons of the FERMT1 and GRIN2B genes (article 3, in progress).
This PhD work pushed back some of the limits of ES. Other avenues exist and are being explored by the GAD team in connection with the European Solve-RD project.

https://tel.archives-ouvertes.fr/tel-02880120