Search results for "DATABASES"

showing 10 items of 937 documents

FastaHerder2: Four Ways to Research Protein Function and Evolution with Clustering and Clustered Databases.

2016

The accelerated growth of protein databases offers great possibilities for the study of protein function using sequence similarity and conservation. However, the huge number of sequences deposited in these databases requires new ways of analyzing and organizing the data. It is necessary to group the many very similar sequences, creating clusters with automatically derived annotations useful to understand their function, evolution, and level of experimental evidence. We developed an algorithm called FastaHerder2, which can cluster any protein database, putting together very similar protein sequences based on near-full-length similarity and/or a high threshold of sequence identity. We compressed 50…

0301 basic medicine; Protein structure database; Proteomics; Proteome; Sequence analysis; Computer science; computer.software_genre; Sensitivity and Specificity; Set (abstract data type); Evolution Molecular; 03 medical and health sciences; 0302 clinical medicine; Similarity (network science); Sequence Analysis Protein; Genetics; Cluster (physics); Animals; Cluster Analysis; Humans; Cluster analysis; Databases Protein; Molecular Biology; Sequence; Database; Function (mathematics); Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Modeling and Simulation; Data mining; computer; 030217 neurology & neurosurgery; Software; Journal of computational biology : a journal of computational molecular cell biology

researchProduct

Automated selection of homologs to track the evolutionary history of proteins

2018

Background: The selection of distant homologs of a query protein under study is a common and useful application of protein sequence databases. Such sets of homologs are often applied to investigate the function of a protein and the degree to which experimental results can be transferred from one organism to another. In particular, a variety of databases facilitates static browsing for orthologs. However, these resources have limited power when identifying orthologs between taxonomically distant species. In addition, in some situations, for a given query protein, it is advantageous to compare the sets of orthologs from different specific organisms: this recursive step-wise search might give …

0301 basic medicine; Proteome; Computer science; Computational biology; Web tool; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Homology (biology); Evolution Molecular; 03 medical and health sciences; 0302 clinical medicine; Protein sequencing; Structural Biology; Homologous chromosome; Humans; Databases Protein; Molecular Biology; lcsh:QH301-705.5; Organism; Protein function; Methodology Article; Applied Mathematics; Proteins; A protein; Computer Science Applications; Homology; Evolutionary path; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; lcsh:Biology (General); Proteome; lcsh:R858-859.7; DNA microarray; Software; 030217 neurology & neurosurgery; BMC Bioinformatics

researchProduct

The Human Proteome Organization–Proteomics Standards Initiative Quality Control Working Group: Making quality control more accessible for biological …

2017

To have confidence in results acquired during biological mass spectrometry experiments, a systematic approach to quality control is of vital importance. Nonetheless, until now, only scattered initiatives have been undertaken to this end, and these individual efforts have often not been complementary. To address this issue, the Human Proteome Organization–Proteomics Standards Initiative established a new working group on quality control at its meeting in the spring of 2016. The goal of this working group is to provide a unifying framework for quality control data. The initial focus will be on providing a community-driven standardized file format for quality control. For this purpose, the…

0301 basic medicine; Proteomics; Quality Control; Proteomics Standards Initiative; Proteome; Chemistry; media_common.quotation_subject; Control (management); File format; Data science; Mass Spectrometry; Analytical Chemistry; Variety (cybernetics); 03 medical and health sciences; 030104 developmental biology; Control data; Human proteome project; Humans; Use case; Quality (business); Databases Protein; media_common

researchProduct

Proteomics Standards Initiative: Fifteen Years of Progress and Future Work.

2017

Abstract: The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, co-chairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing, ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthe…

0301 basic medicine; Proteomics; protein quantification; Emerging technologies; Computer science; computer.internet_protocol; Guidelines as Topic; computer.software_genre; Biochemistry; 03 medical and health sciences; protein identification; Human proteome project; Humans; Community standards; quality control; Databases Protein; Biology; database; mass spectrometry; Computer. Automation; 030102 biochemistry & molecular biology; Application programming interface; Proteomics Standards Initiative; General Chemistry; Reference Standards; Data science; metabolomics; Chemistry; 030104 developmental biology; Perspective; data standard; Web service; bioinformatics software; Working group; computer; XML; Software; molecular interactions; Journal of proteome research

researchProduct

MODOMICS: a database of RNA modification pathways. 2017 update

2017

Abstract: MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes. In the current database version, we included the following new features and data: extended mass spectrometry and liquid chromatography data for modified nucleosides; links between human tRNA sequences and MINTbase - a framework for the interactive exploration of mitochondrial and nuclear tRNA fragments; a new, machine-friendly system of unified abbreviations for modified nucleoside names; sets of modified tRNA sequences for two bact…

0301 basic medicine; RNA methylation; Biology; computer.software_genre; Mass Spectrometry; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; RNA Transfer; Epitranscriptomics; Terminology as Topic; RNA modification; Databases Genetic; Genetics; Database Issue; Humans; chemistry.chemical_classification; Database; 2'-O-methylation; RNA; 030104 developmental biology; Enzyme; chemistry; 030220 oncology & carcinogenesis; Transfer RNA; RNA; Ribonucleosides; N6-Methyladenosine; computer; Chromatography Liquid; Nucleic Acids Research

researchProduct

Sentinel hospital-based surveillance for norovirus infection in children with gastroenteritis between 2015 and 2016 in Italy

2018

Noroviruses are one of the leading causes of gastro-enteric diseases worldwide in all age groups. Novel epidemic noroviruses with GII.P16 polymerase and GII.2 or GII.4 capsid type emerged worldwide in late 2015 and in 2016. We performed a molecular epidemiological study of the noroviruses circulating in Italy to investigate the emergence of new norovirus strains. Sentinel hospital-based surveillance in three different Italian regions revealed an increased prevalence of norovirus infection in children (<15 years) in 2016 (14.4% versus 9.8% in 2015) and the emergence of GII.P16 strains in late 2016, which accounted for 23.0% of norovirus infections. The majority of the strains with a GII.…

0301 basic medicine; RNA viruses; European People; Settore MED/07 - Microbiologia E Microbiologia Clinica; viruses; medicine.disease_cause; Pathology and Laboratory Medicine; Pediatrics; Geographical locations; fluids and secretions; Epidemiology; Genotype; Prevalence; Medicine and Health Sciences; Ethnicities; Child; Caliciviridae Infections; Multidisciplinary; Incidence (epidemiology); Database and informatics methods; Q; R; Sequence analysis; virus diseases; Gastroenteritis; Italian People; Europe; Capsid; Italy; Medical Microbiology; Child Preschool; Viral Pathogens; Viruses; Medicine; RNA Viral; Pathogens; Pediatric Infections; Research Article; medicine.medical_specialty; Genotyping; Genotype; Bioinformatics; Science; 030106 microbiology; Sequence Databases; Microbiology; Caliciviruses; 03 medical and health sciences; Age groups; medicine; Humans; European Union; Molecular Biology Techniques; Genotyping; Microbial Pathogens; Molecular Biology; Biochemistry Genetics and Molecular Biology (all); RNA sequence analysis; Biology and life sciences; business.industry; Sequence Analysis RNA; Norovirus; Organisms; Genetic Variation; RNA-Dependent RNA Polymerase; Virology; digestive system diseases; Research and analysis methods; 030104 developmental biology; Caliciviridae Infections; Biological Databases; Agricultural and Biological Sciences (all); Norovirus; Capsid Proteins; Population Groupings; People and places; business; Sentinel Surveillance

researchProduct

Newly Digitized Database Reveals the Lives and Families of Forced Migrants from Finnish Karelia

2017

Studies on displaced persons often suffer from a lack of data on the long-term effects of forced migration. A register created during the 1960s and published as a book series ‘Siirtokarjalaisten tie’ in 1970 documented the lives of individuals who fled the southern Karelian district of Finland after its first and second occupation by the Soviet Union in 1940 and 1944. To realize the potential value of these data for scientific research, we have recently scanned the register using optical character recognition (OCR) software, and developed proprietary computer code to extract these data. Here we outline the steps involved in the digitization process, and present an overview of the Migration Kare…

0301 basic medicine; Register (sociolinguistics); History; väestönsiirrot; databases [http://www.yso.fi/onto/yso/p3056]; forced migration; marriage [http://www.yso.fi/onto/yso/p2790]; computer.software_genre; lcsh:Social Sciences; 03 medical and health sciences; births; occupations (professions) [http://www.yso.fi/onto/yso/p1179]; avioituvuus; tietokannat; rekisterit; 112 Statistics and probability; Digitization; ta119; syntyvyys; database; Finland; mobility [http://www.yso.fi/onto/yso/p252]; perheet (ryhmät); Database; register information; occupations; Displaced person; displaced persons; Optical character recognition; 113 Computer and information sciences; marriages; mobility; lcsh:H; Forced migration; 030104 developmental biology; liikkuvuus; lcsh:HB848-3697; digitization; lcsh:Demography. Population. Vital events; ta1181; Research findings; Soviet union; Karjala; computer; digiointi; Finnish Yearbook of Population Research

researchProduct

RepeatsDB 2.0: improved annotation, classification, search and visualization of repeat protein structures

2017

RepeatsDB 2.0 (URL: http://repeatsdb.bio.unipd.it/) is an update of the database of annotated tandem repeat protein structures. Repeat proteins are a widespread class of non-globular proteins carrying heterogeneous functions involved in several diseases. Here we provide a new version of RepeatsDB with an improved classification schema including high quality annotations for ∼5400 protein structures. RepeatsDB 2.0 features information on start and end positions for the repeat regions and units for all entries. The extensive growth of repeat unit characterization was possible by applying the novel ReUPred annotation method over the entire Protein Data Bank, with data quality guaranteed by a…

0301 basic medicine; Repetitive Sequences Amino Acid; [SDV.BC]Life Sciences [q-bio]/Cellular Biology; Biology; Bioinformatics; Search engine; Annotation; Structure-Activity Relationship; 03 medical and health sciences; 0302 clinical medicine; Tandem repeat; Genetics; Animals; Humans; Database Issue; Databases Protein; ComputingMilieux_MISCELLANEOUS; Repeat unit; 030304 developmental biology; 0303 health sciences; Information retrieval; Proteins; computer.file_format; Protein Data Bank; Visualization; Schema (genetic algorithms); 030104 developmental biology; Data quality; Corrigendum; computer; Software; 030217 neurology & neurosurgery; Nucleic Acids Research

researchProduct

Fragments of peer review: A quantitative analysis of the literature (1969-2015)

2018

This paper examines research on peer review between 1969 and 2015 by looking at records indexed from the Scopus database. Although it is often argued that peer review has been poorly investigated, we found that the number of publications in this field doubled from 2005. Half of this work was indexed as research articles, a third as editorial notes and literature reviews, and the rest were book chapters or letters. We identified the most prolific and influential scholars, the most cited publications and the most important journals in the field. Co-authorship network analysis showed that research on peer review is fragmented, with the largest group of co-authors including only 2.1% of the wh…

0301 basic medicine; Science and Technology Workforce; Research Quality Assessment; lcsh:Medicine; Careers in Research; Peer review co-authorship collaboration community; Citation analysis; Centrality; Data Mining; Sociology; lcsh:Science; Multidisciplinary; 05 social sciences; Scientometrics; co-authorship; Research Assessment; Knowledge sharing; Professions; Citation Analysis; community; Network Analysis; Research Article; Computer and Information Sciences; Science Policy; Abstracting and Indexing; Peer Review; Abstracting and Indexing as Topic; Animals; Data Mining; Databases Bibliographic; History 20th Century; History 21st Century; Humans; Peer Review; Scopus; Library science; 050905 science studies; Research and Analysis Methods; History 21st Century; 03 medical and health sciences; Animals; Humans; Scientific Publishing; lcsh:R; Scientometrics; History 20th Century; Databases Bibliographic; collaboration; 030104 developmental biology; Quantitative analysis (finance); People and Places; Scientists; lcsh:Q; Population Groupings; 0509 other social sciences; Scientific publishing; Centrality

researchProduct

Prediction of Chromatin Accessibility in Gene-Regulatory Regions from Transcriptomics Data

2017

Abstract: The epigenetics landscape of cells plays a key role in the establishment of cell-type specific gene expression programs characteristic of different cellular phenotypes. Different experimental procedures have been developed to obtain insights into the accessible chromatin landscape including DNase-seq, FAIRE-seq and ATAC-seq. However, current downstream computational tools fail to reliably determine regulatory region accessibility from the analysis of these experimental data. In particular, currently available peak calling algorithms are very sensitive to their parameter settings and show highly heterogeneous results, which hampers a trustworthy identification of accessible chromatin…

0301 basic medicine; Science; Computational biology; Regulatory Sequences Nucleic Acid; Biology; computer.software_genre; Article; Epigenesis Genetic; 03 medical and health sciences; Databases Genetic; Humans; Epigenetics; Computational model; Deoxyribonucleases; Multidisciplinary; Sequence Analysis RNA; Gene Expression Profiling; Decision tree learning; Q; R; Sequence Analysis DNA; Chromatin; Chromatin; Gene expression profiling; Identification (information); 030104 developmental biology; Gene Expression Regulation; Medicine; Data mining; Precision and recall; Peak calling; computer; Algorithms; Scientific reports

researchProduct