
RESEARCH PRODUCT

Detailed analysis of inversions predicted between two human genomes: errors, real polymorphisms, and their origin and population distribution.

Carla Giner-Delgado, Marta Sabariego Puig, Magdalena Gayà-Vidal, Mario Cáceres, Isaac Noguera, Alexander Martínez-Fundichely, José Ignacio Lucas-Lledó, Xavier Estivill, David Vicente-Salvador, Aurora Ruiz-Herrera, Cristina Aguado, Sarai Pacheco, David Izquierdo

subject

0301 basic medicine; Population; Biology; Genome; Evolution Molecular; 03 medical and health sciences; Genetics; Humans; 1000 Genomes Project; Allele; Selection Genetic; education; Molecular Biology; Allele frequency; Genetics (clinical); education.field_of_study; Polymorphism Genetic; Genome Human; Sequence Inversion; Breakpoint; Molecular Sequence Annotation; General Medicine; Sequence Analysis DNA; 030104 developmental biology; Chromosome Inversion; Human genome; Reference genome

description

The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF = 0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions.

10.1093/hmg/ddw415
https://pubmed.ncbi.nlm.nih.gov/28025331