Search results for "algorithm"

showing 10 items of 4887 documents

Machine learning at the interface of structural health monitoring and non-destructive evaluation

2020

While both non-destructive evaluation (NDE) and structural health monitoring (SHM) share the objective of damage detection and identification in structures, they are distinct in many respects. This paper will discuss the differences and commonalities and consider ultrasonic/guided-wave inspection as a technology at the interface of the two methodologies. It will discuss how data-based/machine learning analysis provides a powerful approach to ultrasonic NDE/SHM in terms of the available algorithms, and more generally, how different techniques can accommodate the very substantial quantities of data that are provided by modern monitoring campaigns. Several machine learning methods will be illu…

Keywords: Damage detection; Computer science; TK; General Mathematics; Interface (computing); General Physics and Astronomy; Compressive sensing; Machine learning; Non-destructive evaluation; Structural health monitoring; Transfer learning; Ultrasound; Settore ING-IND/14 - Progettazione Meccanica E Costruzione Di Macchine; Engineering; Manufacturing and Industrial Facilities; Non destructive; Humans; Ultrasonics; Feature data; Ultrasonic testing; General Engineering; Bayes Theorem; Signal Processing Computer-Assisted; Articles; Robotics; Data Compression; Identification (information); Regression Analysis; Artificial intelligence; Transfer of learning; Algorithms
Journal: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences

UVPAR: fast detection of functional shifts in duplicate genes.

2006

Background: The imprint of natural selection on gene sequences is often difficult to detect. A plethora of methods have been devised to detect genetic changes due to selective processes. However, many of those methods depend heavily on underlying assumptions regarding the mode of change of DNA sequences and often require sophisticated mathematical treatments that make them computationally slow. The development of fast and effective methods to detect modifications in the selective constraints of genes is therefore of great interest. Results: We describe UVPAR, a program designed to quickly test for changes in the functional constraints of duplicate genes. Starting with alignments of t…

Keywords: Danio; Computational biology; Biology; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; DNA sequencing; Evolution Molecular; Genes Duplicate; Sequence Analysis Protein; Structural Biology; Selection Genetic; Hox gene; Molecular Biology; Gene; lcsh:QH301-705.5; Selection (genetic algorithm); Genetics; Natural selection; Applied Mathematics; Proteins; Sequence Analysis DNA; Computer Science Applications; lcsh:Biology (General); lcsh:R858-859.7; DNA microarray; Sequence Alignment; Software; Algorithms

gcType : a high-quality type strain genome database for microbial phylogenetic and functional research

2020

Taxonomic and functional research of microorganisms has increasingly relied upon genome-based data and methods. As the depository of the Global Catalogue of Microorganisms (GCM) 10K prokaryotic type strain sequencing project, Global Catalogue of Type Strain (gcType) has published 1049 type strain genomes sequenced by the GCM 10K project which are preserved in global culture collections with a valid published status. Additionally, the information provided through gcType includes >12 000 publicly available type strain genome sequences from GenBank incorporated using quality control criteria and standard data annotation pipelines to form a high-quality reference database. This …

Keywords: Data Analysis; BACTERIAL; AcademicSubjects/SCI00010; 0206 medical engineering; 02 engineering and technology; Computational biology; Biology; Genome; 03 medical and health sciences; Multiple sequence alignment; Phylogenetics; RNA Ribosomal 16S; Databases Genetic; Genetics; PROGRAM; Database Issue; ALGORITHM; Phylogeny; 030304 developmental biology; 0303 health sciences; Base Sequence; Phylogenetic tree; Research; Genome database; Biology and Life Sciences; GCM transcription factors; Prokaryotic Cells; GenBank; Reference database; 020602 bioinformatics

Criminal networks analysis in missing data scenarios through graph distances.

2021

Data collected in criminal investigations may suffer from: (i) incompleteness, due to the covert nature of criminal organisations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyse nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data and to determine which network type is most affected by it. The networks are first pruned following two specific methods: …

Keywords: Data Analysis; FOS: Computer and information sciences; Computer and Information Sciences; Science; Intelligence; Social Sciences; Transportation; Criminology; Civil Engineering; Social Networking; Computer Science - Computers and Society; Law Enforcement; Sociology; Computers and Society (cs.CY); Psychology; Humans; Computer Networks; Social and Information Networks (cs.SI); Algorithms; Terrorism; Criminals; Settore INF/01 - Informatica; Q; Cognitive Psychology; R; Biology and Life Sciences; Eigenvalues; Computer Science - Social and Information Networks; Transportation Infrastructure; Police; Roads; Professions; Algebra; Linear Algebra; People and Places; Physical Sciences; Engineering and Technology; Cognitive Science; Medicine; Law and Legal Sciences; Population Groupings; Crime; Criminal Justice System; Mathematics; Network Analysis; Research Article; Neuroscience
Journal: PLoS ONE

Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics

2019

Background: Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g., Apache Hadoop and Spark) does not necessarily produce satisfactory results, in terms of both efficiency and effectiveness. We discuss how the development of distributed and Big Data management technologies has affected the analysis of large datasets of biological sequences. Moreover, we show how the choice of different parameter configurations and the careful engineering of the …

Keywords: Data Analysis; FOS: Computer and information sciences; Time Factors; Computer science; Statistics as Topic; Big data; Apache Spark; Distributed computing; Performance evaluation; k-mer counting; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Domain (software engineering); Databases Nucleic Acid; 03 medical and health sciences; 0302 clinical medicine; Structural Biology; Computer cluster; Statistics; Spark (mathematics); Molecular Biology; lcsh:QH301-705.5; 030304 developmental biology; 0303 health sciences; Genome; Settore INF/01 - Informatica; Base Sequence; Research; Algorithms; Software; Applied Mathematics; Computer Science Applications; Algorithm; Computer Science - Distributed Parallel and Cluster Computing; lcsh:Biology (General); 030220 oncology & carcinogenesis; Scalability; lcsh:R858-859.7; Algorithm design; Distributed Parallel and Cluster Computing (cs.DC)

Data Augmentation Approach in Bayesian Modelling of Presence-only Data

2011

Ecologists are interested in predicting the potential distribution of species in suitable areas, which is essential for planning conservation and management strategies. Unfortunately, often the only available information in such studies is the true presence of the species at a few locations of the study area and the associated environmental covariates over the entire area, referred to as presence-only data. We propose a Bayesian approach to estimate logistic linear regressions adapted to presence-only data through the introduction of a random approximation of the correction factor in the adjusted logistic model, which allows us to overcome the need to know a priori the prevalence of the species.

Keywords: Data augmentation; Presence-only data; Computer science; Bayesian probability; Logistic regression; Bayesian inference; Pseudo-absence approach; Bayesian statistics; Bayesian model; MCMC algorithm; Potential distribution; Bayesian multivariate linear regression; Statistics; Covariate; Econometrics; General Earth and Planetary Sciences; Bayesian linear regression; Bayesian average; General Environmental Science
Journal: Procedia Environmental Sciences

Controlling false match rates in record linkage using extreme value theory

2011

Cleansing data from synonyms and homonyms is a relevant task in fields where high quality of data is crucial, for example in disease registries and medical research networks. Record linkage provides methods for minimizing synonym and homonym errors, thereby improving data quality. We focus our attention on the case of homonym errors (in the following denoted as ‘false matches’), in which records belonging to different entities are wrongly classified as equal. Synonym errors (‘false non-matches’) occur when a single entity maps to multiple records in the linkage result. They are not considered in this study because in our application domain they are not as crucial as false matches. Fa…

Keywords: Data cleansing; Biomedical Research; Databases Factual; Calibration (statistics); Computer science; Health Informatics; Plot (graphics); Mean excess plot; Statistics; Registries; Extreme value theory; Linkage (software); Models Statistical; Computational Biology; Fellegi–Sunter model; Mixture model; Generalized Pareto distribution; Computer Science Applications; Data quality; Statistics of extreme values; Database Management Systems; Medical Record Linkage; Data mining; Algorithms; Medical Informatics; Record linkage
Journal: Journal of Biomedical Informatics

Regression diagnostics applied in kinetic data processing: Outlier recognition and robust weighting procedures

2010

An efficient protocol, based on advanced statistical diagnostics and robust fitting techniques applied to the least-squares processing of kinetic data of chemical reactions, is presented and discussed. The procedure, which is aimed at obtaining highly accurate estimation of the fitting parameters, consists of the identification of the outliers that remarkably impair the fitting by means of the so-called “leverage analysis” and some related diagnostics. This approach allows the elimination of the actually aberrant observations from the data set and/or their robust weighting to inhibit the negative effects induced on the data fitting, with consequent reduction of the bias introduced into the …

Keywords: Data processing; Chemistry; Organic Chemistry; Biochemistry; Regression; Robust regression; Weighting; Inorganic Chemistry; Outlier; Curve fitting; Leverage (statistics); Physical and Theoretical Chemistry; Regression diagnostic; Algorithm
Journal: International Journal of Chemical Kinetics

A new fast and fault-tolerant identification algorithm for spectral databases

1995

A new method for the automatic, computer- and database-driven identification of UV/VIS spectra is described. It is shown that an identification algorithm must consider the differences between spectra as well as their common features. The described method allows identification even if the spectra are distorted or shifted.

Keywords: Data processing; Database; Computer science; Pattern analysis; Fault tolerance; Vis spectra; Fuzzy control system; Biochemistry; Spectral line; Qualitative analysis; Analytical Chemistry; Identification (information); ComputingMethodologies_PATTERNRECOGNITION; Algorithm
Journal: Analytical and Bioanalytical Chemistry

Climate Data Records of Vegetation Variables from Geostationary SEVIRI/MSG Data: Products, Algorithms and Applications

2019

The scientific community requires long-term data records with well-characterized uncertainty and suitable for modeling terrestrial ecosystems and energy cycles at regional and global scales. This paper presents the methodology currently developed in EUMETSAT within its Satellite Application Facility for Land Surface Analysis (LSA SAF) to generate biophysical variables from the Spinning Enhanced Visible and InfraRed Imager (SEVIRI) on board MSG 1-4 (Meteosat 8-11) geostationary satellites. Using this methodology, the LSA SAF generates and disseminates at a time a suite of vegetation products, such as the leaf area index (LAI), the fraction of the photosynthetically active radiation absorbed …

Keywords: Data records; 010504 meteorology & atmospheric sciences; Data products; Science; Meteosat Second Generation (MSG); Biophysical parameters (LAI, FVC, FAPAR); SEVIRI; Climate data records (CDR); Stochastic spectral mixture model (SSMM); Satellite Application Facility for Land Surface Analysis (LSA SAF); 0211 other engineering and technologies; 02 engineering and technology; 01 natural sciences; Leaf area index; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Q; Vegetation; Mixture model; Photosynthetically active radiation; Geostationary orbit; General Earth and Planetary Sciences; Environmental science; Satellite; Algorithm
Journal: Remote Sensing; Volume 11; Issue 18; Pages: 2103