Search results for "Annotation"

showing 10 items of 161 documents

Detailed analysis of inversions predicted between two human genomes: errors, real polymorphisms, and their origin and population distribution.

2016

The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints …

0301 basic medicine; Population; Biology; Genome; Evolution Molecular; 03 medical and health sciences; Genetics; Humans; 1000 Genomes Project; Allele; Selection Genetic; education; Molecular Biology; Allele frequency; Genetics (clinical); Genetics; education.field_of_study; Polymorphism Genetic; Genome Human; Sequence Inversion; Breakpoint; Molecular Sequence Annotation; General Medicine; Sequence Analysis DNA; 030104 developmental biology; Chromosome Inversion; Human genome; Reference genome; Human molecular genetics
researchProduct

Diagnostic Targeted Resequencing in 349 Patients with Drug-Resistant Pediatric Epilepsies Identifies Causative Mutations in 30 Different Genes

2017

Targeted resequencing gene panels are used in the diagnostic setting to identify gene defects in epilepsy. We performed targeted resequencing using a 30-gene panel and a 95-gene panel in 349 patients with drug-resistant epilepsies beginning in the first years of life. We identified 71 pathogenic variants, 42 of which were novel, in 30 genes, corresponding to 20.3% of the probands. In 66% of mutation-positive patients, seizure onset occurred before age 6 months. The 95-gene panel allowed a genetic diagnosis in 22 (6.3%) patients that would otherwise have been missed using the 30-gene panel. About 50% of mutations were identified in genes coding for sodium and potassium channel components. SCN2…

0301 basic medicine; Proband; Male; CDKL5; Drug Resistance; medicine.disease_cause; Bioinformatics; Epilepsy; Anticonvulsant; STXBP1; Age of Onset; Child; Genetics (clinical); Allele; Mutation; epilepsy; next-generation sequencing; gene panel; mutation; Phenotype; Magnetic Resonance Imaging; Settore MED/39 - Neuropsichiatria Infantile; 3. Good health; Phenotype; Child Preschool; Anticonvulsants; Female; Sequence Analysis; Human; Adolescent; Genotype; Genetic Association Studie; Biology; MECP2; 03 medical and health sciences; Genetic; gene panel; Genetics; medicine; Humans; Genetic Predisposition to Disease; Preschool; Gene; Alleles; Genetic Association Studies; Gene Expression Profiling; Infant Newborn; Computational Biology; Infant; Molecular Sequence Annotation; DNA; Sequence Analysis DNA; Newborn; medicine.disease; 030104 developmental biology; epilepsy; next-generation sequencing; mutation
researchProduct

RepeatsDB 2.0: improved annotation, classification, search and visualization of repeat protein structures

2017

RepeatsDB 2.0 (URL: http://repeatsdb.bio.unipd.it/) is an update of the database of annotated tandem repeat protein structures. Repeat proteins are a widespread class of non-globular proteins carrying heterogeneous functions involved in several diseases. Here we provide a new version of RepeatsDB with an improved classification schema including high quality annotations for ∼5400 protein structures. RepeatsDB 2.0 features information on start and end positions for the repeat regions and units for all entries. The extensive growth of repeat unit characterization was possible by applying the novel ReUPred annotation method over the entire Protein Data Bank, with data quality guaranteed by a…

0301 basic medicine; Repetitive Sequences Amino Acid; [SDV.BC]Life Sciences [q-bio]/Cellular Biology; Biology; Bioinformatics; Search engine; Annotation; Structure-Activity Relationship; 03 medical and health sciences; 0302 clinical medicine; Tandem repeat; Genetics; Animals; Humans; Database Issue; Databases Protein; ComputingMilieux_MISCELLANEOUS; Repeat unit; 030304 developmental biology; 0303 health sciences; Information retrieval; Proteins; computer.file_format; Protein Data Bank; Visualization; Schema (genetic algorithms); 030104 developmental biology; Data quality; Corrigendum; computer; Software; 030217 neurology & neurosurgery; Nucleic Acids Research
researchProduct

The nucleic acid-binding protein PcCNBP is transcriptionally regulated during the immune response in red swamp crayfish Procambarus clarkii

2016

The gene family encoding cellular nucleic acid binding proteins (CNBP) is well conserved among vertebrates; however, there is limited knowledge in lower organisms. In this study, a CNBP homolog from the red swamp crayfish Procambarus clarkii was characterised. The full-length cDNA of PcCNBP was 1257 bp long, with a 5′-untranslated region (UTR) of 63 bp and a 3′-UTR of 331 bp with a poly(A) tail, and an open-reading frame (ORF) of 864 bp encoding a polypeptide of 287 amino acids with a predicted molecular weight of about 33 kDa. The predicted protein possesses 7 tandem repeats of 14 amino acids containing the CCHC zinc finger consensus sequence, two RGG-rich single-stranded RNA-binding domains an…

0301 basic medicine; Untranslated region; Nucleic acid-binding protein; DNA Complementary; Hemocytes; Transcription Genetic; Gene Expression; Hepatopancreas; Settore BIO/11 - Biologia Molecolare; Astacoidea; Biochemistry; 03 medical and health sciences; Complementary DNA; Animals; Gene expression pattern; Tissue Distribution; Amino Acid Sequence; Zinc finger motifs; Procambarus clarkii; Zinc finger; chemistry.chemical_classification; Innate immunity; Original Paper; biology; RNA-Binding Proteins; Molecular Sequence Annotation; Zinc finger motif; Cell Biology; biology.organism_classification; Crayfish; Molecular biology; Crayfish; Immunity Innate; Cell biology; Amino acid; 030104 developmental biology; chemistry; Nucleic acid; Hepatopancreas; Crayfish; Gene expression pattern; Innate immunity; Nucleic acid-binding protein; Zinc finger motifs; Biochemistry; Cell Biology
researchProduct

Draft genome sequence of Shimia marina CECT 7688T

2016

Shimia marina is a member of the family Rhodobacteraceae described in 2006. Strain CL-TA03(T) (=CECT 7688(T)) was isolated from a biofilm formed on an acrylic slide submerged in surface water in a coastal fish farm in Tongyeong, Korea. Here we report the draft genome sequence and annotation of S. marina CECT 7688(T), which is composed of 4,001,860 bp arranged in 45 scaffolds with a G+C content of 57.4%, 3878 protein coding genes, 40 tRNA genes, 4 rRNA genes and 1 repeat region. An overview of annotated genes revealed diverse genes encoding exopolysaccharide and capsular biosynthesis enzymes, secondary metabolite biosynthesis enzymes, multiple antibiotic and metal resistance and the abilit…

0301 basic medicine; Whole genome sequencing; chemistry.chemical_classification; Base Composition; 030102 biochemistry & molecular biology; biology; Biofilm; Molecular Sequence Annotation; Sequence Analysis DNA; Aquatic Science; Ribosomal RNA; biology.organism_classification; Microbiology; 03 medical and health sciences; 030104 developmental biology; Molecular Sequence Annotation; Enzyme; chemistry; Republic of Korea; Transfer RNA; Genetics; Rhodobacteraceae; Rhodobacteraceae; Gene; Genome Bacterial; Marine Genomics
researchProduct

Functional insights into the infective larval stage of Anisakis simplex s.s., Anisakis pegreffii and their hybrids based on gene expression patterns

2018

[Background]: Anisakis simplex sensu stricto and Anisakis pegreffii are sibling species of nematodes parasitic on marine mammals. Zoonotic human infection with third stage infective larvae causes anisakiasis, a debilitating and potentially fatal disease. These 2 species show evidence of hybridisation in geographical areas where they are sympatric. How the species and their hybrids differ is still poorly understood. [Results]: Third stage larvae of Anisakis simplex s.s., Anisakis pegreffii and hybrids were sampled from Merluccius merluccius (Teleostei) hosts captured in waters of the FAO 27 geographical area. Specimens of each species and hybrids were distinguished with a diagnostic genetic m…

0301 basic medicine; lcsh:QH426-470; Virulence Factors; lcsh:Biotechnology; Anisakis simplex; Breeding; Biology; Anisakis; Transcriptome; Fish Diseases; 03 medical and health sciences; lcsh:TP248.13-248.65; parasitic diseases; Genetics; Animals; Allele; Gene; Genetics; Sequence Analysis RNA; Gene Expression Profiling; Anisakis simplex; Molecular Sequence Annotation; Helminth Proteins; 030108 mycology & parasitology; Allergens; biology.organism_classification; A. Pegreffii; Anisakis; Gene expression profiling; Gadiformes; lcsh:Genetics; 030104 developmental biology; Gene Expression Regulation; Sympatric speciation; Genetic marker; Larva; Gene expression; Energy Metabolism; Transcriptome; Research Article; Biotechnology; BMC Genomics
researchProduct

WES/WGS Reporting of Mutations from Cardiovascular "Actionable" Genes in Clinical Practice: A Key Role for UMD Knowledgebases in the Era of Big Datab…

2016

High-throughput next-generation sequencing techniques such as whole-exome and whole-genome sequencing are being rapidly integrated into clinical practice. The use of these techniques leads to the identification of secondary variants, for which decisions must be made about whether to report them to the patient. The American College of Medical Genetics and Genomics recently published recommendations for the reporting of these variants in clinical practice for 56 "actionable" genes. Among these, seven are involved in Marfan Syndrome And Related Disorders (MSARD) resulting from mutations of the FBN1, TGFBR1 and 2, ACTA2, SMAD3, MYH11 and MYLK genes. Here, we show that mutations col…

0301 basic medicine; medicine.medical_specialty; Knowledge Bases; Genomics; marfan-syndrome; [SDV.GEN.GH] Life Sciences [q-bio]/Genetics/Human genetics; 030105 genetics & heredity; Biology; computer.software_genre; Genome; ExAC; 03 medical and health sciences; Annotation; incidental findings; Genetics; medicine; Humans; pathogenicity; Genetic Predisposition to Disease; tgfbr2; Exome; genome; ESP; Genetics (clinical); Exome sequencing; [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]; variants; Database; thoracic aortic-aneurysms; Genome Human; High-Throughput Nucleotide Sequencing; MYLK; Genomics; prediction; mutations; 3. Good health; Marfan syndrome; 030104 developmental biology; dissection; [SDV.GEN.GH]Life Sciences [q-bio]/Genetics/Human genetics; Cardiovascular Diseases; Mutation; Medical genetics; Identification (biology); LSDB; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; computer; exome
researchProduct

De Novo Genome Assembly of the Raccoon Dog (Nyctereutes Procyonoides)

2021

The raccoon dog, Nyctereutes procyonoides (NCBI Taxonomy ID: 34880, Figure 1a) belongs to the family Canidae, with foxes (genus Vulpes) being their closest relatives (Lindblad-Toh et al., 2005; Sun et al., 2019). Its original distribution in East Asia ranges from south-eastern Siberia to northern Vietnam and the Japanese islands. In the early 20th century, the raccoon dog was introduced into Western Russia for fur breeding and hunting purposes, which led to its widespread establishment in many European countries, Figure 1b. Together with the raccoon (Procyon lotor), it is now listed in Europe as an invasive species of Union concern (Regulation (EU) No. 1143/2014) and member states are requi…

0301 basic medicine; population genomics; Range (biology); Zoology; B chromosome; QH426-470; Genome; Population genomics; 03 medical and health sciences; 0302 clinical medicine; ddc:590; Data Report; Genetics; raccoon dog (nyctereutes procyonoides); IUCN Red List; media_common.cataloged_instance; Genetics (clinical); Synteny; media_common; B chromosome; biology; SARS-CoV-2; sequence; biology.organism_classification; genome assembly and annotation; animals; Canis lupus familiaris; 030104 developmental biology; 030220 oncology & carcinogenesis; range; Molecular Medicine; carnivora; Nyctereutes procyonoides
researchProduct

Comparing the Quality of Neural Machine Translation and Professional Post-Editing

2019

This empirical corpus study explores the quality of neural machine translations (NMT) and their post-edits (NMTPE) at the German Department of the European Commission’s Directorate-General for Translation (DGT) by evaluating NMT outputs, NMTPE, and respective revisions (REV) with the automatic error annotation tool Hjerson (Popovic 2011) and the more fine-grained manual MQM framework (Lommel 2014). Results show that quality assurance measures by post-editors and revisors at the DGT are most often necessary for lexical errors. More specifically, if post-editors correct mistranslations, terminology or stylistic errors in an NMT sentence, revisors are likely to correct the same type of error i…

050101 languages & linguistics; Transitive relation; Machine translation; Computer science; business.industry; media_common.quotation_subject; 05 social sciences; 02 engineering and technology; computer.software_genre; language.human_language; Terminology; German; Annotation; 0202 electrical engineering electronic engineering information engineering; language; 020201 artificial intelligence & image processing; 0501 psychology and cognitive sciences; Quality (business); Artificial intelligence; business; computer; Quality assurance; Natural language processing; Sentence; media_common; 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX)
researchProduct

Semi-automated annotation of page-based documents within the Genre and Multimodality framework

2016

This paper describes ongoing work on a tool developed for annotating document images for their multimodal features and compiling this information into a corpus. The tool leverages open source computer vision and natural language processing libraries to describe the content and structure of multimodal documents and to generate multiple layers of XML annotation. The paper introduces the annotation schema, describes the document processing pipeline and concludes with a brief description of future work.

060201 languages & linguistics; Structure (mathematical logic); Information retrieval; Computer science; computer.internet_protocol; business.industry; 05 social sciences; 050801 communication & media studies; 06 humanities and the arts; Temporal annotation; computer.software_genre; Document processing; Pipeline (software); Multimodality; Annotation; 0508 media and communications; Open source; 0602 languages and literature; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Artificial intelligence; business; computer; Natural language processing; XML; Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities
researchProduct