Search results for "Sequence Alignment"

showing 10 items of 447 documents

dAPE: a web server to detect homorepeats and follow their evolution.

2016

Summary: Homorepeats are low complexity regions consisting of repetitions of a single amino acid residue. There is no current consensus on the minimum number of residues needed to define a functional homorepeat, nor even on whether mismatches are allowed. Here we present dAPE, a web server that helps follow the evolution of homorepeats based on orthology information, using a sensitive but tunable cutoff to help identify emerging homorepeats. Availability and Implementation: dAPE can be accessed from http://cbdm-01.zdv.uni-mainz.de/~munoz/polyx. Supplementary information: Supplementary data are available at Bioinformatics online.
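A tunable cutoff of the kind the abstract mentions can be illustrated with a sliding window that tolerates a few mismatches. This is only a minimal sketch: the function name `find_homorepeats` and the parameters `window` and `min_count` are hypothetical, and nothing here is taken from dAPE's actual implementation.

```python
def find_homorepeats(seq, residue, window=10, min_count=8):
    """Return merged half-open (start, end) spans of candidate homorepeats.

    A window qualifies when `residue` fills at least `min_count` of its
    `window` positions, so up to window - min_count mismatches are tolerated.
    Both thresholds are illustrative assumptions, not dAPE defaults.
    """
    spans = []
    for i in range(len(seq) - window + 1):
        if seq[i:i + window].count(residue) >= min_count:
            if spans and i <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


if __name__ == "__main__":
    protein = "AAAA" + "Q" * 10 + "AAAA"
    print(find_homorepeats(protein, "Q", window=10, min_count=10))
    print(find_homorepeats(protein, "Q", window=10, min_count=8))
```

Lowering `min_count` makes the detector more sensitive and widens the reported span, which is the trade-off a tunable cutoff exposes.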

0301 basic medicine; Statistics and Probability; Repetitive Sequences Amino Acid; Web server; Internet; Computer science; computer.software_genre; Biochemistry; Applications Notes; Computer Science Applications; World Wide Web; Evolution Molecular; 03 medical and health sciences; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Animals; Humans; Data mining; Molecular Biology; computer; Sequence Alignment; Sequence Analysis; Software; Bioinformatics (Oxford, England)
researchProduct

MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems

2016

This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record, Jorge González-Domínguez, Yongchao Liu, Juan Touriño, Bertil Schmidt; MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems, Bioinformatics, Volume 32, Issue 24, 15 December 2016, Pages 3826–3828, is available online at: https://doi.org/10.1093/bioinformatics/btw558. [Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-sca…

0301 basic medicine; Statistics and Probability; Source code; Computer science; media_common.quotation_subject; 02 engineering and technology; Parallel computing; computer.software_genre; Biochemistry; Execution time; 03 medical and health sciences; 0202 electrical engineering electronic engineering information engineering; Cluster (physics); Point (geometry); Amino Acid Sequence; Molecular Biology; media_common; Sequence; Multiple sequence alignment; Protein multiple sequence; Computational Biology; Proteins; Markov Chains; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Distributed memory systems; MSAProbs; 020201 artificial intelligence & image processing; MPI; Data mining; Sequence Alignment; computer; Algorithms; Software
researchProduct

Identification of transcribed protein coding sequence remnants within lincRNAs

2018

Long intergenic non-coding RNAs (lincRNAs) are non-coding transcripts >200 nucleotides long that do not overlap protein-coding sequences. Importantly, such elements are known to be tissue-specifically expressed and to play a widespread role in gene regulation across thousands of genomic loci. However, very little is known of the mechanisms for the evolutionary biogenesis of these RNA elements, especially given their poor conservation across species. It has been proposed that lincRNAs might arise from pseudogenes. To test this systematically, we developed a novel method that searches for remnants of protein-coding sequences within lincRNA transcripts; the hypothesis is that we can t…

0301 basic medicine; Transposable element; Sequence analysis; Pseudogene; Retrotransposon; Computational biology; Biology; Open Reading Frames; 03 medical and health sciences; 0302 clinical medicine; Intergenic region; Sequence Analysis Protein; Genetics; Humans; Amino Acid Sequence; Gene; Regulation of gene expression; Base Sequence; Sequence Analysis RNA; Computational Biology; 030104 developmental biology; Gene Expression Regulation; DNA Intergenic; RNA Long Noncoding; Sequence Alignment; Algorithms; 030217 neurology & neurosurgery; Biogenesis; Nucleic Acids Research
researchProduct

Evolutionary conserved mechanisms pervade structure and transcriptional modulation of allograft inflammatory factor-1 from sea anemone Anemonia virid…

2017

The gene family encoding allograft inflammatory factor-1 (AIF-1) is well conserved among organisms; however, knowledge of it in lower organisms is limited. In this study, the first AIF-1 homologue from cnidarians was identified and characterised in the sea anemone Anemonia viridis. The full-length cDNA of AvAIF-1 was 913 bp long, with a 5'-untranslated region (UTR) of 148 bp, a 3'-UTR of 315 bp and an open reading frame (ORF) of 450 bp encoding a polypeptide with 149 amino acid residues and a predicted molecular weight of about 17 kDa. The predicted protein possesses evolutionarily conserved EF hand Ca2+ binding motifs, post-transcriptional modification sites and a 3D structure which can be superimposed …

0301 basic medicine; Untranslated region; Cnidaria; Gene expression; Homology modelling; Inflammation; Sea anemone; Environmental Chemistry; Aquatic Science; Settore BIO/11 - Biologia Molecolare; Anemonia; Evolution Molecular; 03 medical and health sciences; 0302 clinical medicine; Complementary DNA; Botany; Gene family; Animals; Amino Acid Sequence; education; Phylogeny; education.field_of_study; biology; Base Sequence; EF hand; Calcium-Binding Proteins; General Medicine; biology.organism_classification; Cell biology; Open reading frame; 030104 developmental biology; Sea Anemones; 030220 oncology & carcinogenesis; Allograft inflammatory factor 1; Sequence Alignment
researchProduct

Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters

2016

Computing alignments between two or more sequences is a common operation in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data par…
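For orientation, database scanning with Smith-Waterman amounts to scoring a query against every database sequence and ranking the results. The sketch below is a plain serial version with a linear gap penalty; the scoring values and the helper names `smith_waterman_score` and `scan_database` are assumptions chosen for illustration, and none of the paper's three-level parallelization is reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty).

    Only two rows of the dynamic-programming matrix are kept, which is
    enough when the score, not the alignment itself, is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def scan_database(query, database):
    """Rank database entries (name -> sequence) by score, best first."""
    scored = ((smith_waterman_score(query, seq), name)
              for name, seq in database.items())
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    print(scan_database("ACGT", {"x": "ACGT", "y": "TTTT"}))
```

Each query-versus-sequence score is independent of the others, which is why this workload lends itself to data-parallel distribution across nodes and cores.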

0301 basic medicine; Xeon Phi clusters; Computer science; Data parallelism; Parallel algorithm; 02 engineering and technology; Dynamic programming; Biochemistry; Pairwise sequence alignment; Computational science; 03 medical and health sciences; Structural Biology; Computer cluster; 0202 electrical engineering electronic engineering information engineering; Amino Acid Sequence; Databases Protein; Molecular Biology; 020203 distributed computing; Research; Applied Mathematics; Computational Biology; Proteins; Smith-Waterman; Computer Science Applications; 030104 developmental biology; Multiple sequence alignment; Databases Nucleic Acid; Sequence Alignment; Algorithms; Software; Xeon Phi; BMC Bioinformatics
researchProduct

A giant type I polyketide synthase participates in zygospore maturation in Chlamydomonas reinhardtii

2017

Polyketide synthases (PKSs) occur in many bacteria, fungi and plants. They are highly versatile enzymes involved in the biosynthesis of a large variety of compounds including antimicrobial agents, polymers associated with bacterial cell walls and plant pigments. While harmful algae are known to produce polyketide toxins, sequences of the genomes of non-toxic algae, including those of many green algal species, have surprisingly revealed the presence of genes encoding type I PKSs. The genome of the model alga Chlamydomonas reinhardtii (Chlorophyta) contains a single type I PKS gene, designated PKS1 (Cre10.g449750), which encodes a giant PKS with a predicted mass of 2.3 MDa. Here, we show that…

0301 basic medicine; biology; Mutant; Chlamydomonas reinhardtii; Cell Biology; Plant Science; Chlorophyta; Genes Plant; biology.organism_classification; Bacterial cell structure; Cell wall; 03 medical and health sciences; Polyketide; 030104 developmental biology; Biochemistry; Cell Wall; Seeds; Genetics; Zygospore; Polyketide Synthases; Sequence Alignment; Gene; Plant Proteins; The Plant Journal
researchProduct

Avoided motifs: short amino acid strings missing from protein datasets.

2020

According to the amino acid composition of natural proteins, it could be expected that all possible sequences of three or four amino acids will occur at least once in large protein datasets purely by chance. However, in some species or cellular contexts, specific short amino acid motifs are missing for unknown reasons. We describe these as Avoided Motifs, short amino acid combinations missing from biological sequences. Here we identify 209 human and 154 bacterial Avoided Motifs of length four amino acids, and discuss their possible functionality according to their presence in other species. Furthermore, we determine two Avoided Motifs of length three amino acids in human proteins…
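The core idea, finding short strings that never occur in a dataset, can be sketched as a set difference between all possible k-mers and the observed ones. This is a minimal illustration under stated assumptions: the function name `avoided_motifs` and the 20-letter alphabet constant are hypothetical, and the authors' actual pipeline (datasets, filtering, statistics) is not described in the visible abstract.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def avoided_motifs(proteins, k, alphabet=AMINO_ACIDS):
    """Return the sorted k-mers over `alphabet` absent from all proteins."""
    observed = {p[i:i + k] for p in proteins for i in range(len(p) - k + 1)}
    possible = {"".join(t) for t in product(alphabet, repeat=k)}
    # Observed k-mers with non-standard letters (e.g. X) are ignored here.
    return sorted(possible - observed)


if __name__ == "__main__":
    # Toy two-letter alphabet so the result is easy to check by eye.
    print(avoided_motifs(["AAB", "BA"], k=2, alphabet="AB"))
```

With the full alphabet there are 20^3 = 8,000 possible motifs of length three and 20^4 = 160,000 of length four, so exhaustive enumeration is cheap at these lengths.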

0301 basic medicine; chemistry.chemical_classification; Protein function; Amino Acid Motifs; 030102 biochemistry & molecular biology; Clinical Biochemistry; Computational Biology; Proteins; Context (language use); Computational biology; Biology; Biochemistry; Amino acid; 03 medical and health sciences; 030104 developmental biology; Secretory protein; chemistry; Amino acid composition; Cytoplasm; Molecular Biology; Human proteins; Sequence Alignment; Biological chemistry; References
researchProduct

Unexpected associated microalgal diversity in the lichen Ramalina farinacea is uncovered by pyrosequencing analyses

2017

The current literature reveals that the intrathalline coexistence of multiple microalgal taxa in lichens is more common than previously thought, and additional complexity is supported by the coexistence of bacteria and basidiomycete yeasts in lichen thalli. This replaces the old paradigm that lichen symbiosis occurs between a fungus and a single photobiont. The lichen Ramalina farinacea has proven to be a suitable model to study the multiplicity of microalgae in lichen thalli due to the constant coexistence of Trebouxia sp. TR9 and T. jamesii in long-distance populations. To date, studies involving phycobiont diversity within entire thalli are based on Sanger sequencing, but this method see…

0301 basic medicine; lcsh:Medicine; Lichenology; Artificial Gene Amplification and Extension; Plant Science; Polymerase Chain Reaction; Database and Informatics Methods; Diversity index; Microalgae; Cluster Analysis; DNA Fungal; lcsh:Science; Lichen; Phylogeny; Data Management; Multidisciplinary; Ecology; biology; Phylogenetic Analysis; Biodiversity; symbiosis; Thallus; Phylogenetics; pyrosequencing; Trebouxia; Sequence Analysis; Research Article; Computer and Information Sciences; Bioinformatics; Sequence Databases; Real-Time Polymerase Chain Reaction; Research and Analysis Methods; lichen; Ramalina farinacea; 03 medical and health sciences; Ascomycota; Algae; lichen photobionts pyrosequencing symbiosis Trebouxia; Botany; Evolutionary Systematics; Molecular Biology Techniques; Molecular Biology; DNA sequence analysis; Taxonomy; Evolutionary Biology; Ecology and Environmental Sciences; lcsh:R; Genetic Variation; Biology and Life Sciences; Sequence Analysis DNA; Reverse Transcriptase-Polymerase Chain Reaction; biology.organism_classification; Biological Databases; 030104 developmental biology; photobionts; Pyrosequencing; lcsh:Q; Sequence Alignment; PLOS ONE
researchProduct

MAGA: A Supervised Method to Detect Motifs From Annotated Groups in Alignments

2020

Multiple sequence alignments are usually phylogenetically driven. They are studied in the framework of evolution. But sometimes, it is interesting to study residue conservation at positions unconstrained by evolutionary rules. We present a supervised method to access a layer of information difficult to appreciate visually when many protein sequences are aligned. This new tool (MAGA; http://cbdm-01.zdv.uni-mainz.de/~munoz/maga/) locates positions in multiple sequence alignments differentially conserved in manually defined groups of sequences.
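One simple reading of "differentially conserved" is a column where every manually defined group is internally invariant but the groups disagree with each other. The sketch below implements only that strict reading; the function name `differentially_conserved` and the invariance criterion are assumptions for illustration, and MAGA's actual scoring is not described in the abstract.

```python
def differentially_conserved(alignment, groups):
    """Find columns that are invariant within each group but differ between groups.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    groups:    dict mapping group label -> list of sequence names.
    Returns a list of (column_index, {label: residue}) tuples.
    """
    length = len(next(iter(alignment.values())))
    hits = []
    for col in range(length):
        consensus = {}
        for label, names in groups.items():
            residues = {alignment[name][col] for name in names}
            if len(residues) != 1:
                break  # this group is not conserved at this column
            consensus[label] = residues.pop()
        else:
            # Every group is invariant; keep the column only if groups differ.
            if len(set(consensus.values())) > 1:
                hits.append((col, consensus))
    return hits


if __name__ == "__main__":
    aln = {"a1": "MKV", "a2": "MKI", "b1": "MRV", "b2": "MRL"}
    grp = {"A": ["a1", "a2"], "B": ["b1", "b2"]}
    print(differentially_conserved(aln, grp))
```

Strict invariance is brittle on real alignments, so a practical tool would more likely score conservation per group and compare the scores.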

0303 health sciences; multiple sequence alignments; Sequence analysis; Computer science; 0206 medical engineering; Methods and Protocols; lcsh:Evolution; 02 engineering and technology; Computational biology; Computer Science Applications; 03 medical and health sciences; motif finding; computational biology; web services; Genetics; lcsh:QH359-425; 020602 bioinformatics; Ecology Evolution Behavior and Systematics; 030304 developmental biology; Evolutionary Bioinformatics
researchProduct

Genomic organization and promoter characterization of the gene encoding a putative endoplasmic reticulum chaperone, ERp29

2002

ERp29 is a soluble protein localized in the endoplasmic reticulum (ER) of eukaryotic cells, which is conserved in all mammalian species. The N-terminal domain of ERp29 displays sequence and structural similarity to the protein disulfide isomerase despite the lack of the characteristic double cysteine motif. Although the exact function of ERp29 is not yet known, it was hypothesized that it may facilitate folding and/or export of secretory proteins in/from the ER. ERp29 is induced by ER stress, i.e. accumulation of unfolded proteins in the ER. To gain an insight into the mechanisms regulating ERp29 expression, we have cloned and characterized the rat ERp29 gene and studied in detail …

5' Flanking Region; Recombinant Fusion Proteins; Molecular Sequence Data; CHO Cells; Biology; Cell Line; Mice; Cricetinae; Sequence Homology Nucleic Acid; Gene expression; Tumor Cells Cultured; Genetics; Animals; Humans; RNA Messenger; Luciferases; Promoter Regions Genetic; Protein disulfide-isomerase; Gene; Heat-Shock Proteins; Phylogeny; Base Sequence; Gene Expression Profiling; Endoplasmic reticulum; Promoter; 3T3 Cells; DNA; Exons; Sequence Analysis DNA; General Medicine; Molecular biology; Introns; Rats; Housekeeping gene; Secretory protein; Genes; Unfolded protein response; Female; Transcription Initiation Site; Sequence Alignment; HeLa Cells; Gene
researchProduct