Search results for "clustering"

showing 10 items of 446 documents

SpCLUST: Towards a fast and reliable clustering for potentially divergent biological sequences

2019

International audience; This paper presents SpCLUST, a new C++ package that takes a list of sequences as input, aligns them with MUSCLE, computes their similarity matrix in parallel and then performs the clustering. SpCLUST extends previously released software by integrating additional scoring matrices, which enables it to cover the clustering of amino-acid sequences. The similarity matrix is now computed in parallel according to the master/slave distributed architecture, using MPI. Performance analyses, conducted on two real datasets of 100 nucleotide sequences and 1049 amino-acid sequences, show that the resulting library substantially outperforms the original Python package. The proposed pac…

0301 basic medicine; Computer science; [INFO.INFO-SE] Computer Science [cs]/Software Engineering [cs.SE]; Health Informatics; [INFO.INFO-IU] Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR] Computer Science [cs]/Cryptography and Security [cs.CR]; 0302 clinical medicine; Software; [INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-DC] Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Cluster analysis; Humans; computer.programming_language; business.industry; Similarity matrix; Pattern recognition; DNA; Genomics; Sequence Analysis DNA; Python (programming language); Mixture model; [INFO.INFO-MO] Computer Science [cs]/Modeling and Simulation; Spectral clustering; Computer Science Applications; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; Artificial intelligence; business; computer; Algorithms; 030217 neurology & neurosurgery
researchProduct

Retrospective Proteomic Screening of 100 Breast Cancer Tissues.

2017

The present investigation has been conducted on one hundred tissue fragments of breast cancer, collected and immediately cryopreserved following the surgical resection. The specimens were selected from patients with invasive ductal carcinoma of the breast, the most frequent and potentially aggressive type of mammary cancer, with the objective of increasing the knowledge of breast cancer molecular markers potentially useful for clinical applications. The proteomic screening, by 2D-IPG and mass spectrometry, allowed us to identify two main classes of protein clusters: proteins expressed ubiquitously at high levels in all patients, and proteins expressed sporadically among the same patients. Wit…

0301 basic medicine; Gene isoform; Clinical Biochemistry; gel-based proteomic; lcsh:QR1-502; Motility; surgical tissue; gel-based proteomics; Biology; Bioinformatics; Proteomics; Biochemistry; lcsh:Microbiology; Article; Metastasis; 03 medical and health sciences; Breast cancer; Structural Biology; Medicine; Settore BIO/06 - Anatomia Comparata E Citologia; Molecular Biology; oncology_oncogenics; mass spectrometry; surgical tissues; business.industry; Cancer; medicine.disease; Primary tumor; 030104 developmental biology; Apoptosis; protein clustering; Cancer research; business; Proteomes
researchProduct

FragClust and TestClust, two informatics tools for chemical structure hierarchical clustering analysis applied to lipidomics. The example of Alzheime…

2016

Lipidomic analysis is able to measure simultaneously thousands of compounds belonging to a few lipid classes. In each lipid class, compounds differ only by the acyl radical, ranging between C10:0 (capric acid) and C24:0 (lignoceric acid). Although some metabolites have a peculiar pathological role, more often compounds belonging to a single lipid class exert the same biological effect. Here, we present a lipidomics workflow that extracts the tandem mass spectrometry data from individual files and uses them to group compounds into structurally homogeneous clusters by chemical structure hierarchical clustering analysis (CHCA). The case-to-control peak area ratios of the metabolites are then a…

0301 basic medicine; High-resolution mass spectrometry; Settore MED/09 - Medicina Interna; Chemical structure; Computational biology; Plasma biomarkers; 01 natural sciences; Triglyceride; Biochemistry; Homogeneous clusters; Analytical Chemistry; Ceramide; 03 medical and health sciences; Alzheimer Disease; Tandem Mass Spectrometry; Health informatics tools; Lipidomics; Humans; Statistical analysis; Data mining; Chromatography High Pressure Liquid; Aged; Aged 80 and over; Molecular Structure; Chemistry; 010401 analytical chemistry; Lipids; 0104 chemical sciences; Hierarchical clustering; Phospholipid; 030104 developmental biology; Workflow; Case-Control Studies; Settore MED/26 - Neurologia
researchProduct

A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.

2018

International audience; In this article, a new Python package for nucleotide sequence clustering is proposed. This package, freely available online, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input and produces the optimal number of clusters along with a relevant visualization. Although we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and, as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clust…

0301 basic medicine; Nematoda; 01 natural sciences; Gaussian Mixture Model; [STAT.ML] Statistics [stat]/Machine Learning [stat.ML]; [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]; ComputingMilieux_MISCELLANEOUS; computer.programming_language; [STAT.AP] Statistics [stat]/Applications [stat.AP]; Phylogenetic tree; DNA Clustering; Genomics; Helminth Proteins; Computer Science Applications; [STAT] Statistics [stat]; 010201 computation theory & mathematics; [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; Data analysis; Embedding; A priori and a posteriori; [INFO.INFO-DC] Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Health Informatics; 0102 computer and information sciences; [INFO.INFO-SE] Computer Science [cs]/Software Engineering [cs.SE]; Biology; [INFO.INFO-IU] Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR] Computer Science [cs]/Cryptography and Security [cs.CR]; Laplacian Eigenmaps; Animals; Cluster analysis; [SDV.GEN] Life Sciences [q-bio]/Genetics; Models Genetic; business.industry; Pattern recognition; NADH Dehydrogenase; Sequence Analysis DNA; Python (programming language); Mixture model; [INFO.INFO-MO] Computer Science [cs]/Modeling and Simulation; Visualization; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; Platyhelminths; [INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; Programming Languages; Artificial intelligence; [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]; business; computer; Computers in biology and medicine
researchProduct

Autoimmune polyglandular diseases.

2019

Autoimmune polyglandular diseases (APD) are defined as the presence of two autoimmune-induced endocrine failures. With respect to the significant morbidity and potential mortality of APD, the diagnostic objective is to detect APD at an early stage, with the advantage of less frequent complications, effective therapy and better prognosis. This requires that patients at risk be regularly screened for subclinical endocrinopathies prior to clinical manifestation. Regarding the time interval between manifestation of first and further endocrinopathies, regular and long-term follow-up is warranted. Quality of life and psychosocial status are poor in APD patients and involved relatives. Familial c…

0301 basic medicine; Pediatrics; medicine.medical_specialty; Endocrinology Diabetes and Metabolism; 030209 endocrinology & metabolism; Familial clustering; Clinical manifestation; Endocrine System Diseases; Autoimmune Diseases; 03 medical and health sciences; 0302 clinical medicine; Endocrinology; Quality of life; Medicine; Humans; In patient; Stage (cooking); Polyendocrinopathies Autoimmune; Subclinical infection; Patient Care Team; business.industry; musculoskeletal neural and ocular physiology; Incidence; 030104 developmental biology; cardiovascular system; Interdisciplinary Communication; High incidence; Morbidity; business; Psychosocial; circulatory and respiratory physiology; Best practice & research. Clinical endocrinology & metabolism
researchProduct

Innovative Strategies to Develop Chemical Categories Using a Combination of Structural and Toxicological Properties.

2016

Interest is increasing in the development of non-animal methods for toxicological evaluations. These methods are, however, particularly challenging for complex toxicological endpoints such as repeated dose toxicity. European legislation, e.g., the European Union's Cosmetic Directive and REACH, demands the use of alternative methods. Frameworks, such as the Read-across Assessment Framework or the Adverse Outcome Pathway Knowledge Base, support the development of these methods. The aim of the project presented in this publication was to develop substance categories for a read-across with complex endpoints of toxicity based on existing databases. The basic conceptual approach was to combine str…

0301 basic medicine; Quantitative structure–activity relationship; read across; Predictive Clustering Tree (PCT) method; Computer science; 610; 010501 environmental sciences; computer.software_genre; 600 Technik Medizin angewandte Wissenschaften::610 Medizin und Gesundheit; 01 natural sciences; 03 medical and health sciences; Pharmacology (medical); Cluster analysis; 0105 earth and related environmental sciences; Original Research; Alternative methods; Pharmacology; toxicological and structural similarity; business.industry; QSAR; lcsh:RM1-950; Identification (information); Tree (data structure); 030104 developmental biology; Conceptual approach; lcsh:Therapeutics. Pharmacology; Knowledge base; non-animal methods; Data mining; Web service; business; computer; Frontiers in pharmacology
researchProduct

CUDA-enabled hierarchical ward clustering of protein structures based on the nearest neighbour chain algorithm

2015

Clustering of molecular systems according to their three-dimensional structure is an important step in many bioinformatics workflows. In applications such as docking or structure prediction, many algorithms initially generate large numbers of candidate poses (or decoys), which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates can easily range from thousands to millions, performing the clustering on standard central processing units (CPUs) is highly time consuming. In this paper, we analyse and evaluate different approaches to parallelize the nearest neighbour chain algorithm to perform hierarc…

0301 basic medicine; Speedup; Computer science; Correlation clustering; Parallel computing; Theoretical Computer Science; 03 medical and health sciences; CUDA; 030104 developmental biology; Hardware and Architecture; Cluster analysis; Algorithm; Software; Ward's method; The International Journal of High Performance Computing Applications
researchProduct
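
For readers unfamiliar with the nearest neighbour chain algorithm named in the entry above, the following is a minimal sequential sketch in plain Python, not the CUDA implementation the paper evaluates. It assumes Ward linkage expressed through the Lance-Williams update on squared Euclidean distances; the function name `ward_nn_chain` and the toy points are hypothetical and chosen only for illustration.

```python
def ward_nn_chain(points):
    """Sequential nearest-neighbour-chain clustering with Ward linkage.

    Returns a list of merges (id_a, id_b, height). Original points have
    ids 0..n-1; each merge creates a new cluster id n, n+1, ...
    """
    n = len(points)
    size = {i: 1 for i in range(n)}       # active cluster id -> member count
    dist = {i: {} for i in range(n)}      # squared dissimilarities between active clusters
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            dist[i][j] = d
            dist[j][i] = d

    merges, chain, next_id = [], [], n
    while len(size) > 1:
        if not chain:
            chain.append(min(size))       # any active cluster may start a chain
        while True:
            top = chain[-1]
            prev = chain[-2] if len(chain) > 1 else None
            # Nearest active neighbour of `top`; ties favour `prev` so the chain ends.
            nearest = min(dist[top], key=lambda k: (dist[top][k], k != prev, k))
            if nearest == prev:
                break                     # reciprocal nearest neighbours found
            chain.append(nearest)

        a, b = chain.pop(), chain.pop()
        d_ab = dist[a][b]
        new_size = size[a] + size[b]
        # Lance-Williams update for Ward linkage on squared distances.
        new_row = {
            k: ((size[a] + size[k]) * dist[a][k]
                + (size[b] + size[k]) * dist[b][k]
                - size[k] * d_ab) / (new_size + size[k])
            for k in size if k != a and k != b
        }
        for c in (a, b):                  # retire the two merged clusters
            del size[c]
            for k in dist.pop(c):
                if k in dist:
                    del dist[k][c]
        dist[next_id] = new_row
        for k, d in new_row.items():
            dist[k][next_id] = d
        size[next_id] = new_size
        merges.append((min(a, b), max(a, b), d_ab ** 0.5))
        next_id += 1
    return merges


if __name__ == "__main__":
    toy = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
    for a, b, height in ward_nn_chain(toy):
        print(a, b, round(height, 3))
```

The chain only ever merges reciprocal nearest neighbours, so merges may be discovered out of height order; sorting the returned list by height recovers the usual dendrogram order. This sketch stores the full distance table in dictionaries for clarity, which is far from the memory layout a GPU version would use.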

Multivariate statistical analysis of a large odorants database aimed at revealing similarities and links between odorants and odors

2017

International audience; The perception of odor is an important component of smell. The first step of odor detection and the discrimination of structurally diverse odorants depend on their interactions with olfactory receptors (ORs). Indeed, the perception of an odor's quality results from a combinatorial coding, the deciphering of which remains a major challenge. Several studies have successfully established links between odors and odorants by categorizing and classifying data. Hence, the categorization of odors appears to be a promising way to manage odors. In the proposed study, we performed a computational analysis using odor descriptions of the odorants present in Flavor-Base 9th Edit…

0301 basic medicine; multidimensional scaling; media_common.quotation_subject; Agglomerative hierarchical clustering; Kohonen self-organizing maps; odorants; 03 medical and health sciences; 0302 clinical medicine; Perception; Computational analysis; media_common; Chemistry; business.industry; musculoskeletal neural and ocular physiology; Pattern recognition; Kohonen self organizing map; General Chemistry; categorization; 030104 developmental biology; Odor; odor notes; Artificial intelligence; Multivariate statistical; business; [SDV.AEN] Life Sciences [q-bio]/Food and Nutrition; 030217 neurology & neurosurgery; psychological phenomena and processes; Food Science
researchProduct

Differentiating cancer cells reveal early large-scale genome regulation by pericentric domains.

2021

Finding out how cells prepare for fate change during differentiation commitment was our task. To address whether the constitutive pericentromere-associated domains (PADs) may be involved, we used a model system with known transcriptome data, MCF-7 breast cancer cells treated with the ErbB3 ligand heregulin (HRG), which induces differentiation and is used in the therapy of cancer. PAD-repressive heterochromatin (H3K9me3), centromere-associated-protein-specific, and active euchromatin (H3K4me3) antibodies, real-time PCR, acridine orange DNA structural test (AOT), and microscopic image analysis were applied. We found a two-step DNA unfolding after 15–20 and 60 min of HRG treatment, re…

0303 health sciences; Euchromatin; Nucleolus; Centromere clustering; Heterochromatin; Neuregulin-1; Centromere; Biophysics; Breast Neoplasms; Biology; Chromatin; Cell biology; Transcriptome; 03 medical and health sciences; 0302 clinical medicine; Transcription (biology); Constitutive heterochromatin; Humans; 030217 neurology & neurosurgery; 030304 developmental biology; Biophysical journal
researchProduct

Comparison of conventional descriptive analysis and a citation frequency-based descriptive method for odor profiling: An application to Burgundy Pino…

2010

International audience; The limitations of intensity scoring when describing the odor characteristics of a complex product have been documented in the literature. In the present work, the odor properties of 12 Burgundy Pinot noir wines were described by two independent panels performing, respectively, an intensity-based (conventional descriptive analysis) and a citation frequency-based method. Methods were compared according to three criteria: similarity of the sensory maps, control of panel performance and practical aspects. Intensity scoring and citation frequency data were analyzed, respectively, by Principal Components Analysis (PCA) and Correspondence Analysis (CA) followed by Hierarch…

030309 nutrition & dietetics; [SDV.AEN] Life Sciences [q-bio]/Food and Nutrition; Sensory analysis; Correspondence analysis; 03 medical and health sciences; 0404 agricultural biotechnology; Statistics; [SDV.IDA] Life Sciences [q-bio]/Food engineering; Cluster analysis; ComputingMilieux_MISCELLANEOUS; Mathematics; Wine; PINOT NOIR; 0303 health sciences; FREQUENCY OF CITATION; Nutrition and Dietetics; Descriptive statistics; business.industry; DESCRIPTIVE PROFILE; 04 agricultural and veterinary sciences; CONVENTIONAL DA; 040401 food science; Hierarchical clustering; Odor; Principal component analysis; Artificial intelligence; business; METHOD COMPARISON; Food Science
researchProduct