Search results for "Datasets"

showing 10 items of 45 documents

Towards A Twitter Observatory: A Multi-Paradigm Framework For Collecting, Storing And Analysing Tweets

2016

In this article we show how a multi-paradigm framework can fulfil the requirements of tweet analysis and reduce the waiting time for researchers who use computational resources and storage systems to support large-scale data analysis. The originality of our approach is to combine concerns about data harvesting, data storage, data analysis and data visualisation into a framework that supports inductive reasoning in multidisciplinary scientific research. Our main contribution is a polyglot storage system with a generic data model to support logical data independence and a set of tools that can provide a suitable solution for mixing different types of algorithms in or…

[INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; [INFO] Computer Science [cs]; Computer science; knowledge discovery; 02 engineering and technology; [INFO.INFO-SI] Computer Science [cs]/Social and Information Networks [cs.SI]; Data modeling; massive datasets; open source software; Data visualization; [INFO.INFO-IT] Computer Science [cs]/Information Theory [cs.IT]; polyglot storage; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Twitter analysis . Systems; ComputingMilieux_MISCELLANEOUS; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; business.industry; Polyglot; Inductive reasoning; Data science; [SPI.TRON] Engineering Sciences [physics]/Electronics; Data independence; Data model; 020201 artificial intelligence & image processing; Data architecture; business; Software architecture
researchProduct

Perspectives on the Impact of Sampling Design and Intensity on Soil Microbial Diversity Estimates

2019

Soil bacterial communities have long been recognized as important ecosystem components, and have been the focus of many local and regional studies. However, there is a lack of data on the biodiversity of soil microorganisms at large spatial scales; national or more extensive studies to date have typically consisted of low replication of haphazardly collected samples. This has led to large spatial gaps in soil microbial biodiversity data. Using a pre-existing dataset of bacterial community composition across a 16-km regular sampling grid in France, we show that the number of detected OTUs changes little under different sampling designs (grid, random, or representative), but increases with t…

Microbiology (medical); Biome; lcsh:QR1-502; Biodiversity; Distribution (economics); Sample (statistics); Microbiology; lcsh:Microbiology; 03 medical and health sciences; global datasets; Sampling design; Citizen science; Ecosystem; national datasets; biogeography; 030304 developmental biology; biodiversity; 0303 health sciences; 030306 microbiology; business.industry; soil bacteria; Environmental resource management; Sampling (statistics); Perspective; Environmental science; business; Frontiers in Microbiology
researchProduct

Algorithmic paradigms for stability-based cluster validity and model selection statistical methods, with applications to microarray data analysis

2012

The advent of high throughput technologies, in particular microarrays, for biological research has revived interest in clustering, resulting in a plethora of new clustering algorithms. However, model selection, i.e., the identification of the correct number of clusters in a dataset, has received relatively little attention. Indeed, although central for statistics, its difficulty is also well known. Fortunately, a few novel techniques for model selection, representing a sharp departure from previous ones in statistics, have been proposed and gained prominence for microarray data analysis. Among those, the stability-based methods are the most robust and best performing in terms of pre…

Settore INF/01 - Informatica; General Computer Science; business.industry; Computer science; Bioinformatics; Model selection; General statistics; Machine learning; computer.software_genre; Theoretical Computer Science; Computational biology; Analysis of massive datasets; Cluster (physics); Algorithms and data structures; Algorithm design; Artificial intelligence; Cluster analysis; business; Completeness (statistics); computer; Computer Science(all)
researchProduct

Assessment of the 4-factor score: Retrospective analysis of 586 CLL patients receiving ibrutinib. A campus CLL study

2021

Not Available

Oncology; Male; chronic B cell leukemia; chronic lymphocytic leukemia; ibrutinib; 4-factor score; prognosis; Datasets as Topic; Severity of Illness Index; chemistry.chemical_compound; Piperidines; Retrospective analysis; Multicenter Studies as Topic; Chronic; Leukemia; Hematology; Middle Aged; Prognosis; Lymphocytic; Progression-Free Survival; Ibrutinib; Female; medicine.medical_specialty; real-word study; Factor score; Antineoplastic Agents; Risk Assessment; NO; Internal medicine; Severity of illness; medicine; Humans; Progression-free survival; Protein Kinase Inhibitors; Survival analysis; Aged; Proportional Hazards Models; Retrospective Studies; business.industry; Proportional hazards model; Adenine; B-Cell; Reproducibility of Results; Retrospective cohort study; Leukemia Lymphocytic Chronic B-Cell; Survival Analysis; Settore MED/15 - MALATTIE DEL SANGUE; chemistry; business; chronic lymphocytic leukaemia; Follow-Up Studies
researchProduct

Fast Estimation of Diffusion Tensors under Rician noise by the EM algorithm

2016

Diffusion tensor imaging (DTI) is widely used to characterize, in vivo, the white matter of the central nervous system (CNS). This biological tissue contains much anatomic, structural and orientational information about fibers in the human brain. Spectral data from the displacement distribution of water molecules located in the brain tissue are collected by a magnetic resonance scanner and acquired in the Fourier domain. After the Fourier inversion, the noise distribution is Gaussian in both real and imaginary parts and, as a consequence, the recorded magnitude data are corrupted by Rician noise. Statistical estimation of diffusion leads to a non-linear regression problem. In this paper, we present a f…

FOS: Computer and information sciences; reduced computation; Gaussian; Models Neurological; Datasets as Topic; ta3112; Statistics - Computation; Statistics - Applications; Time; 030218 nuclear medicine & medical imaging; Methodology (stat.ME); Diffusion; 03 medical and health sciences; symbols.namesake; 0302 clinical medicine; Scoring algorithm; Rician fading; Prior probability; Expectation–maximization algorithm; Image Processing Computer-Assisted; Maximum a posteriori estimation; Humans; Applications (stat.AP); Computer Simulation; Computation (stat.CO); Statistics - Methodology; Mathematics; ta112; Likelihood Functions; General Neuroscience; Brain; Estimator; maximum likelihood estimator; Fisher scoring; Magnetic Resonance Imaging; White Matter; Rician likelihood; Diffusion Tensor Imaging; Fourier transform; Nonlinear Dynamics; symbols; maximum a posteriori estimator; Algorithm; Algorithms; 030217 neurology & neurosurgery; data augmentation
researchProduct

Compendium of TCDD-mediated transcriptomic response datasets in mammalian model systems.

2017

Background: 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) is the most potent congener of the dioxin class of environmental contaminants. Exposure to TCDD causes a wide range of toxic outcomes, ranging from chloracne to acute lethality. The severity of toxicity is highly dependent on the aryl hydrocarbon receptor (AHR). Binding of TCDD to the AHR leads to changes in transcription of numerous genes. Studies evaluating the transcriptional changes brought on by TCDD may provide valuable insight into the role of the AHR in human health and disease. We therefore compiled a collection of transcriptomic datasets that can be used to aid the scientific community in better understanding the transcriptiona…

0301 basic medicine; Male; TCDD; Polychlorinated Dibenzodioxins; Bioinformatics; Microarray datasets; AHR; White adipose tissue; Biology; Web Browser; Proteomics; 413 Veterinary science; Medical and Health Sciences; Cell Line; Transcriptome; 03 medical and health sciences; Mice; 0302 clinical medicine; Transcription (biology); Information and Computing Sciences; medicine; Genetics; Animals; Humans; heterocyclic compounds; Gene; Gene Expression Profiling; R; Computational Biology; Biological Sciences; Aryl hydrocarbon receptor; medicine.disease; 3. Good health; Rats; Chloracne; stomatognathic diseases; 030104 developmental biology; Gene Expression Regulation; 030220 oncology & carcinogenesis; Agent Orange & Dioxin; biology.protein; Environmental Pollutants; Female; DNA microarray; Software; Biotechnology
researchProduct

Genome-wide associations for birth weight and correlations with adult disease

2016

Birth weight (BW) has been shown to be influenced by both fetal and maternal factors and in observational studies is reproducibly associated with future risk of adult metabolic diseases including type 2 diabetes (T2D) and cardiovascular disease. These life-course associations have often been attributed to the impact of an adverse early life environment. Here, we performed a multi-ancestry genome-wide association study (GWAS) meta-analysis of BW in 153,781 individuals, identifying 60 loci where fetal genotype was associated with BW (P < 5 × 10⁻⁸). Overall, approximately 15% of variance in BW was captured by assays of fetal genetic variation. Using genet…

Male; 0301 basic medicine; Netherlands Twin Register (NTR); Aging; Datasets as Topic; Physiology; Blood Pressure; Genome-wide association study; Coronary Artery Disease; Type 2 diabetes; Bioinformatics; CHARGE Consortium Hematology Working Group; Cohort Studies; 0302 clinical medicine; Birth Weight; Insulin; Glucose homeostasis; 030212 general & internal medicine; education.field_of_study; Multidisciplinary; Anthropometry; 3. Good health; Phenotype; /dk/atira/pure/sustainabledevelopmentgoals/good_health_and_well_being; Female; Glycogen; Signal Transduction; Adult; hypertension; Genotype; General Science & Technology; Birth weight; intrauterine growth; Population; Quantitative trait locus; Biology; Article; quantitative trait; Genomic Imprinting; 03 medical and health sciences; Fetus; SDG 3 - Good Health and Well-being; Early Growth Genetics (EGG) Consortium; MD Multidisciplinary; Genetic variation; /dk/atira/pure/keywords/cohort_studies/netherlands_twin_register_ntr_; medicine; Humans; metabolic disorders; Genetic Predisposition to Disease; education; genome; Genetic association; Genetic Variation; birth weight; ta3121; Chromatin Assembly and Disassembly; medicine.disease; ta3123; Glucose; 030104 developmental biology; Diabetes Mellitus Type 2; Genetic Loci; genome-wide association studies; adult disease; Genome-Wide Association Study
researchProduct

Pathological significance and prognostic value of surfactant protein D in cancer

2018

Surfactant protein D (SP-D) is a pattern recognition molecule belonging to the Collectin (collagen-containing C-type lectin) family that has pulmonary as well as extra-pulmonary existence. In the lungs, it is a well-established opsonin that can agglutinate a range of microbes, and enhance their clearance via phagocytosis and super-oxidative burst. It can interfere with allergen–IgE interaction and suppress basophil and mast cell activation. However, it is now becoming evident that SP-D is likely to be an innate immune surveillance molecule against tumor development. SP-D has been shown to induce apoptosis in sensitized eosinophils derived from allergic patients and a leukemic cell line via …

Male; 0301 basic medicine; Lung Neoplasms; Datasets as Topic; 0302 clinical medicine; Epidermal growth factor; Neoplasms; Immunology and Allergy; RNA Neoplasm; Original Research; Cancer; Ovarian Neoplasms; Innate immunity; Surfactant protein D; Bioinformatics analysi; Prognosis; Pulmonary Surfactant-Associated Protein D; Immunohistochemistry; Tumor microenvironment; 030220 oncology & carcinogenesis; Adenocarcinoma; Female; Cancers; Breast Neoplasm; Human; lcsh:Immunologic diseases. Allergy; Prognosi; Immunology; Breast Neoplasms; Biology; 03 medical and health sciences; Immune system; Bioinformatics analysis; Stomach Neoplasms; Stomach Neoplasm; Biomarkers Tumor; medicine; Humans; Computer Simulation; Lung cancer; Ovarian Neoplasm; Computational Biology; medicine.disease; Survival Analysis; Lung Neoplasm; Immune surveillance; 030104 developmental biology; Cancer research; Neoplasm; lcsh:RC581-607; Ovarian cancer
researchProduct

Ventricular Fibrillation and Tachycardia detection from surface ECG using time-frequency representation images as input dataset for machine learning

2017

Parameter-less ventricular fibrillation detection with time-frequency representation. Time-frequency representations are treated as images for a classifier. A comparison of four classifiers demonstrates the validity of the proposed method. The proposed technique could be applied to any signal and research field. This is a novel approach to signal analysis. Background and objective: To safely select the proper therapy for Ventricular Fibrillation (VF), it is essential to distinguish it correctly from Ventricular Tachycardia (VT) and other rhythms. Given that the required therapy would not be the same, an erroneous detection might lead to serious injuries to the patient or even cause Ventricular Fibr…

Tachycardia; Support Vector Machine; Computer science; Speech recognition; 0206 medical engineering; Datasets as Topic; Health Informatics; 02 engineering and technology; Ventricular tachycardia; Machine learning; computer.software_genre; Machine Learning; Electrocardiography; 0202 electrical engineering electronic engineering information engineering; medicine; Humans; Fibrillation; business.industry; Signal Processing Computer-Assisted; Pattern recognition; medicine.disease; 020601 biomedical engineering; Computer Science Applications; Ventricular Fibrillation; Ventricular fibrillation; 020201 artificial intelligence & image processing; Neural Networks Computer; Artificial intelligence; medicine.symptom; business; Classifier (UML); computer; Software; Computer Methods and Programs in Biomedicine
researchProduct

Big Data in Medical Science–a Biostatistical View

2015

“Big data” is a universal buzzword in business and science, referring to the retrieval and handling of ever-growing amounts of information. It can be assumed, for example, that a typical hospital generates hundreds of terabytes (1 TB = 10¹² bytes) of data annually in the course of patient care (1). For instance, exome sequencing, which results in 5 gigabytes (1 GB = 10⁹ bytes) of data per patient, is on the way to becoming routine (2). The analysis of such enormous volumes of information, i.e., organization and description of the data and the drawing of (scientifically valid) conclusions, can already hardly be accomplished with the traditional tools of computer science and statistics. For ex…

Gigabyte; business.industry; media_common.quotation_subject; Big data; Byte; Cloud computing; General Medicine; Terabyte; Bioinformatics; Data science; Data analysis; Medicine; business; Function (engineering); media_common; Datasets as Topic; Deutsches Ärzteblatt international
researchProduct