AUTHOR
Ignacio Marín
The gene encoding ganglioside-induced differentiation-associated protein 1 is mutated in axonal Charcot-Marie-Tooth type 4A disease
We identified three distinct mutations and six mutant alleles in GDAP1 in three families with axonal Charcot-Marie-Tooth (CMT) neuropathy and vocal cord paresis, which were previously linked to the CMT4A locus on chromosome 8q21.1. These results establish the molecular etiology of CMT4A (MIM 214400) and suggest that it may be associated with both axonal and demyelinating phenotypes.
Tracing the origin of the compensasome: evolutionary history of DEAH helicase and MYST acetyltransferase gene families.
Dosage compensation in Drosophila is mediated by a complex of proteins and RNAs called the "compensasome." Two of the genes that encode proteins of the complex, maleless (mle) and males-absent-on-the-first (mof), belong to the DEAH helicase and MYST acetyltransferase gene families, respectively. We performed comprehensive phylogenetic and structural analyses to determine the evolutionary histories of these two gene families and thus to better understand the origin of the compensasome. All of the members of the DEAH and MYST families of the completely sequenced Saccharomyces cerevisiae and Caenorhabditis elegans genomes, as well as those so far (June 2000) found in Drosophila melanogaster (f…
Iterative Cluster Analysis of Protein Interaction Data
Motivation: To generate fast hierarchical clustering tools that can be applied when the distances among the elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…
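The primary distance mentioned in this abstract (the minimum number of interactions needed to connect two proteins) can be illustrated with a breadth-first search over the interaction graph. The following is a minimal sketch, not the UVCLUSTER implementation: the function name `primary_distances`, the input format, and the choice to ignore interactions with proteins outside the selected group are all assumptions made for illustration.

```python
from collections import deque


def primary_distances(interactions, proteins):
    """Return {(a, b): steps} for every connected ordered pair in `proteins`.

    `interactions` is an iterable of (protein, protein) pairs.
    Assumption: only interactions among the selected proteins are used.
    """
    # Build an undirected adjacency map restricted to the selected group.
    neighbors = {p: set() for p in proteins}
    for a, b in interactions:
        if a in neighbors and b in neighbors:
            neighbors[a].add(b)
            neighbors[b].add(a)

    distances = {}
    for source in proteins:
        # Breadth-first search yields the minimum number of steps from `source`.
        seen = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in neighbors[current]:
                if nxt not in seen:
                    seen[nxt] = seen[current] + 1
                    queue.append(nxt)
        for target, steps in seen.items():
            if target != source:
                distances[(source, target)] = steps
    return distances


if __name__ == "__main__":
    chain = [("A", "B"), ("B", "C"), ("C", "D")]
    result = primary_distances(chain, ["A", "B", "C", "D"])
    print(result[("A", "D")])
```

Because these distances are small integers, many pairs share the same value, which is the source of the frequent ties that the abstract mentions.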
Basic networks: Definition and applications
7 pages, 4 figures, 1 table. PMID: 19490867 [PubMed]
Comparative Genomics of the RBR Family, Including the Parkinson's Disease–Related Gene Parkin and the Genes of the Ariadne Subfamily
Genes of the RBR family are characterized by the RBR signature (two RING finger domains separated by an IBR/DRIL domain). The RBR family is widespread in eukaryotes, with numerous members in animals (mammals, Drosophila, Caenorhabditis) and plants (Arabidopsis). But yeasts, such as Saccharomyces cerevisiae or Schizosaccharomyces pombe, contain only two RBR genes. We determined the phylogenetic relationships and the most likely orthologs in different species of several family members for which functional data are available. These include: (1) parkin, whose mutations are involved in forms of familial Parkinson's disease; (2) the ariadne genes, recently characterized in Drosophila and mammals;…
Evolutionary and structural analyses of GDAP1, involved in Charcot-Marie-Tooth disease, characterize a novel class of glutathione transferase-related genes.
Mutations in the Ganglioside-induced differentiation-associated protein-1 (GDAP1) gene cause autosomal recessive Charcot-Marie-Tooth disease type 4A. The protein encoded by GDAP1 shows clear similarity to glutathione transferases (also known as glutathione S-transferases or GSTs). The human genome contains a paralog of GDAP1 called GDAP1L1. Using comparative genomics, we show that orthologs of GDAP1 and GDAP1L1 are found in mammals, birds, amphibians, and fishes. Likely orthologs of those genes in invertebrates and a low but consistent similarity with some plant and eubacterial genes have also been found. We demonstrate that GDAP1 and GDAP1L1 do not belong to any of the known classes of GST…
Comparative genomics and protein domain graph analyses link ubiquitination and RNA metabolism.
The human gene parkin, known to cause familial Parkinson disease, as well as several other genes, likely involved in other neurodegenerative diseases or in cancer, encode proteins of the RBR family of ubiquitin ligases. Here, we describe the structural diversity of the RBR family in order to infer their functional roles. Of particular interest is a relationship detected between RBR-mediated ubiquitination and RNA metabolism: a few RBR proteins contain RNA binding domains and DEAH-box RNA helicase domains. Global protein domain graph analyses demonstrate that this connection is not RBR-specific, but instead many other proteins contain both ubiquitination and RNA-related domains. These protei…
The evolution of dosage-compensation mechanisms
Dosage compensation is the process by which the expression levels of sex-linked genes are altered in one sex to offset a difference in sex-chromosome number between females and males of a heterogametic species. Degeneration of a sex-limited chromosome to produce heterogamety is a common, perhaps unavoidable, feature of sex-chromosome evolution. Selective pressure to equalize sex-linked gene expression in the two sexes accompanies degeneration, thereby driving the evolution of dosage-compensation mechanisms. Studies of model species indicate that what appear to be very different mechanisms have evolved in different lineages: the male X chromosome is hypertranscribed in drosophilid flies, bot…
A hierarchical clustering strategy and its application to proteomic interaction data
We describe a novel strategy of hierarchical clustering analysis, particularly useful to analyze proteomic interaction data. The logic behind this method is to use the information for all interactions among the elements of a set to evaluate the strength of the interaction of each pair of elements. Our procedure allows the characterization of protein complexes starting with partial data and the detection of "promiscuous" proteins that bias the results, generating false positive data. We demonstrate the usefulness of our strategy by analyzing a real case that involves 137 Saccharomyces cerevisiae proteins. Because most functional studies require the evaluation of similar data sets, our method…
A fast algorithm for the exhaustive analysis of 12-nucleotide-long DNA sequences. Applications to human genomics
We have developed a new algorithm that allows the exhaustive determination of words of up to 12 nucleotides in DNA sequences. It is fast enough to be used at a genomic scale on a standard personal computer. As an example, we apply the algorithm to compare the number of all 12-nucleotide-long words in human chromosomes 21 and 22, each of them more than 33 million nucleotides long. Chromosome-specific sequences are detected in less than 2 minutes, and any pair of chromosomes can be analyzed at a rate of 45 million nucleotides (45 Mb) per minute. The words are long enough to allow further analyses of all significant sequences using conventional database searche…
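One common way to count all words of a fixed length, and to find the words present in one sequence but absent from another, is to encode each nucleotide in 2 bits and slide a window along the sequence. The sketch below illustrates that general technique only; it is not the algorithm from this article, and the names `count_words`, `decode`, and `specific_words` are hypothetical.

```python
from collections import Counter

# Assumed 2-bit encoding of the four nucleotides.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_words(sequence, k=12):
    """Count every k-nucleotide word, each encoded as a 2k-bit integer."""
    mask = (1 << (2 * k)) - 1
    counts = Counter()
    word = 0
    valid = 0  # consecutive unambiguous nucleotides seen so far
    for base in sequence.upper():
        code = CODES.get(base)
        if code is None:
            # Ambiguous symbol (for example N): restart the window.
            word, valid = 0, 0
            continue
        # Shift in the new nucleotide and drop the oldest one.
        word = ((word << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[word] += 1
    return counts


def decode(word, k=12):
    """Turn an encoded word back into its nucleotide string."""
    return "".join("ACGT"[(word >> (2 * (k - 1 - i))) & 3] for i in range(k))


def specific_words(seq_a, seq_b, k=12):
    """Return the words that occur in seq_a but never in seq_b."""
    counts_a = count_words(seq_a, k)
    counts_b = count_words(seq_b, k)
    return {decode(w, k) for w in counts_a if w not in counts_b}


if __name__ == "__main__":
    print(specific_words("AAAAAAAAAAAAC", "AAAAAAAAAAAA"))
```

With k = 12 there are 4^12 = 16,777,216 possible words, so a compiled implementation could replace the `Counter` with a fixed-size array indexed by the encoded word.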
Evolutionary relationships among the members of an ancient class of non-LTR retrotransposons found in the nematode Caenorhabditis elegans.
We took advantage of the massive amount of sequence information generated by the Caenorhabditis elegans genome project to perform a comprehensive analysis of a group of over 100 related sequences that has allowed us to describe two new C. elegans non-LTR retrotransposons. We named them Sam and Frodo. We also determined that several highly divergent subfamilies of both elements exist in C. elegans. It is likely that several master copies have been active at the same time in C. elegans, although only a few copies of both Sam and Frodo have characteristics that are compatible with them being active today. We discuss whether it is more appropriate under these circumstances to define only 2 elem…
Fast comparison of DNA sequences by oligonucleotide profiling
The Parkinson Disease Gene LRRK2: Evolutionary and Structural Insights
Mutations in the human leucine-rich repeat kinase 2 (LRRK2) gene are associated with both familial and sporadic Parkinson disease (PD). LRRK2 belongs to a gene family known as Roco. Roco genes encode large proteins with several protein domains. In particular, all Roco proteins have a characteristic GTPase domain, named Roc, plus a domain of unknown function called COR. In addition, LRRK2 and several other Roco proteins also contain a protein kinase domain. In this study, I use a combination of phylogenetic and structural analyses of the COR, Roc, and kinase domains present in Roco proteins to describe the origin and evolutionary history of LRRK2. Phylogenetic analyses using these domains…
Parkin and relatives: the RBR family of ubiquitin ligases
Mutations in the parkin gene cause autosomal-recessive juvenile parkinsonism. Parkin encodes a ubiquitin-protein ligase characterized by the RBR domain, which is composed of two RING fingers plus an IBR/DRIL domain. The RBR family is defined as the group of genes whose products contain an RBR domain. RBR family members exist in all eukaryotic species for which significant sequence data is available, including animals, plants, fungi, and several protists. The integration of comparative genomics with structural and functional data allows us to conclude that RBR proteins have multiple roles, not only in protein quality control mechanisms, but also as indirect regulators of transcription. A recent…
Evolution of chromatin-remodeling complexes: comparative genomics reveals the ancient origin of "novel" compensasome genes.
Dosage compensation in Drosophila is mediated by a complex, called the compensasome, composed of at least five proteins and two noncoding RNAs. Genes encoding compensasome proteins have been collectively named male-specific lethals or msls. Recent work showed that three of the Drosophila msls (msl-3, mof, and mle) have an ancient origin. In this study, I describe likely orthologues of the two remaining msls, msl-1 and msl-2, in several invertebrates and vertebrates. The MSL-2 protein is the only one found in Drosophila and vertebrate genomes that contains both a RING finger and a peculiar type of CXC domain, related to the one present in Enhancer of Zeste proteins. MSL-1 also contains two…
A sequence motif enriched in regions bound by the Drosophila dosage compensation complex
Background: In Drosophila melanogaster, dosage compensation is mediated by the action of the dosage compensation complex (DCC). How the DCC recognizes the fly X chromosome is still poorly understood. Characteristic sequence signatures at all DCC binding sites have not hitherto been found. Results: In this study, we compare the known binding sites of the DCC with oligonucleotide profiles that measure the specificity of the sequences of the D. melanogaster X chromosome. We show that the X chromosome regions bound by the DCC are enriched for a particular type of short, repetitive sequences. Their distribution suggests that these sequences contribute to chromosome recognition, the genera…
Ty3/Gypsy Retrotransposons: Description of New Arabidopsis thaliana Elements and Evolutionary Perspectives Derived from Comparative Genomic Data
We performed a comprehensive analysis of the evolution of the Ty3/Gypsy group of long-terminal-repeat retrotransposons (also known as Metaviridae). Exhaustive database searches allowed us to detect novel elements of this group. In particular, the Arabidopsis thaliana and Drosophila melanogaster genome sequencing projects have recently disclosed a large number of new Ty3/Gypsy sequences. So far, elements of three different Ty3/Gypsy lineages had been described for A. thaliana. Here, we describe six new lineages, which we have called Tit-for-tat1, Tit-for-tat2, Gimli, Gloin, Legolas, and Little Athila. We confirm that plant Ty3/Gypsy elements form two main monophyletic groups. Moreover, …
A general strategy to determine the congruence between a hierarchical and a non-hierarchical classification
This article is available from: http://www.biomedcentral.com/1471-2105/8/442
Global patterns of sequence evolution in Drosophila.
This article is available from: http://www.biomedcentral.com/1471-2164/8/408
UVPAR: fast detection of functional shifts in duplicate genes.
Background: The imprint of natural selection on gene sequences is often difficult to detect. A plethora of methods have been devised to detect genetic changes due to selective processes. However, many of those methods depend heavily on underlying assumptions regarding the mode of change of DNA sequences and often require sophisticated mathematical treatments that make them computationally slow. The development of fast and effective methods to detect modifications in the selective constraints of genes is therefore of great interest. Results: We describe UVPAR, a program designed to quickly test for changes in the functional constraints of duplicate genes. Starting with alignments of t…
A mammalian gene evolved from the integrase domain of an LTR retrotransposon.
FIG. 1. Summary of the structure and coding sequence of the human Gin-1 gene. Sequences of human cDNAs with accession numbers XM_003947.2 (a putative full-length cDNA), BE502574, AW173201.1, AW950418.1, AI631948.1, and AA766836.1 were used to deduce and confirm these data. The full-length protein is 522 amino acids long. The Gin-1 coding region spans nucleotides 36153–15345 in the genomic clone NT_002663.4. Arrowheads and the numbers above them, respectively, indicate the positions and lengths of introns. Several Alu repeats were detected within the two largest introns. Bold letters indicate the region homologous to the most conserved part of the IN domain, detailed in figure 2 and used to …
A new evolutionary paradigm for the Parkinson disease gene DJ-1.
The DJ-1 gene is extensively studied because of its involvement in familial Parkinson disease. DJ-1 belongs to a complex superfamily of genes that includes both prokaryotic and eukaryotic representatives. We determine that many prokaryotic groups, such as proteobacteria, cyanobacteria, spirochaetes, firmicutes, or fusobacteria, have genes, often incorrectly called "Thij," that are very close relatives of DJ-1, to the point that they cannot be clearly separated from the eukaryotic DJ-1 genes by phylogenetic analyses of their sequences. In addition, and contrary to a previous study that suggested that DJ-1 genes were animal specific, we show that DJ-1 genes are found in at least 5 of the 6 ma…
Selection on Coding Regions Determined Hox7 Genes Evolution
The important role of Hox genes in determining the regionalization of the body plan of the vertebrates makes them invaluable candidates for evolutionary analyses regarding functional and morphological innovation. Gene duplication and gene loss led to a variable number of Hox genes in different vertebrate lineages. The evolutionary forces determining the conservation or loss of Hox genes are poorly understood. In this study, we show that variable selective pressures acted on Hox7 genes in different evolutionary lineages, with episodes of positive selection occurring after gene duplications. Tests for functional divergence in paralogs detected significant differentiation in a region known to …