Search results for "computer.software_genre"

showing 10 items of 3858 documents

Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform

2012

Motivation: The Burrows-Wheeler transform (BWT) is the foundation of many algorithms for compression and indexing of text data, but the cost of computing the BWT of very large string collections has prevented these techniques from being widely applied to the large sets of sequences often encountered as the outcome of DNA sequencing experiments. In previous work, we presented a novel algorithm that allows the BWT of human genome scale data to be computed on very moderate hardware, thus enabling us to investigate the BWT as a tool for the compression of such datasets. Results: We first used simulated reads to explore the relationship between the level of compression and the error rate, the leng…

FOS: Computer and information sciences; Statistics and Probability; Burrows–Wheeler transform; Computer science; Data_CODINGANDINFORMATIONTHEORY; Burrows-Wheeler transform; computer.software_genre; Biochemistry; Burrows-Wheeler transform; Data Compression; Next-generation sequencing; Computer Science - Data Structures and Algorithms; Escherichia coli; Code (cryptography); Humans; Overhead (computing); Data Structures and Algorithms (cs.DS); Computer Simulation; Quantitative Biology - Genomics; Molecular Biology; Genomics (q-bio.GN); Genome Human; String (computer science); Search engine indexing; Sorting; Genomics; Sequence Analysis DNA; Construct (python library); Data Compression; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Next-generation sequencing; Data mining; Databases Nucleic Acid; computer; Algorithms; Data compression

researchProduct
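
For readers unfamiliar with the transform named in the result above, the following is a minimal sketch of the Burrows-Wheeler transform and its inverse. It uses the textbook sorted-rotations construction, not the large-scale algorithm the abstract refers to; the function names `bwt` and `inverse_bwt` and the `$` sentinel are illustrative assumptions, and the approach is only practical for short strings.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform built from sorted rotations.

    Assumes the sentinel does not occur in the input and sorts before
    every other character (true for "$" against ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Build every cyclic rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by rebuilding the sorted rotation table."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        # Prepend the last column to each row, then re-sort the rows.
        table = sorted(c + row for c, row in zip(last_column, table))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The transform tends to group equal characters into runs, which is what makes the output easier to compress. Sorting all rotations costs far too much time and memory for sequencing-scale collections, which is the scaling problem the abstract describes.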

Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach

2021

Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Due to the generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and…

FOS: Computer and information sciences; Statistics and Probability; Computer Science - Machine Learning; causality; Computer Science - Artificial Intelligence; Heuristic (computer science); Computer science; education; Machine Learning (stat.ML); transportability; computer.software_genre; 01 natural sciences; Machine Learning (cs.LG); R-kieli; missing data; QA76.75-76.765; QA273-280; 010104 statistics & probability; do-calculus; causality; do-calculus; selection bias; transportability; missing data; case-control design; meta-analysis; Statistics - Machine Learning; Search algorithm; selection bias; 0101 mathematics; Parametric statistics; päättely; meta-analyysi; case-control design; hakualgoritmit; 113 Computer and information sciences; Missing data; meta-analysis; Identification (information); Artificial Intelligence (cs.AI); Causal inference; kausaliteetti; Identifiability; Probability distribution; Data mining; Statistics Probability and Uncertainty; computer; Software; Journal of Statistical Software

researchProduct

Bayesian Checking of the Second Levels of Hierarchical Models

2007

Hierarchical models are increasingly used in many applications. Along with this increased use comes a desire to investigate whether the model is compatible with the observed data. Bayesian methods are well suited to eliminate the many (nuisance) parameters in these complicated models; in this paper we investigate Bayesian methods for model checking. Since we contemplate model checking as a preliminary, exploratory analysis, we concentrate on objective Bayesian methods in which careful specification of an informative prior distribution is avoided. Numerous examples are given and different proposals are investigated and critically compared.

FOS: Computer and information sciences; Statistics and Probability; Model checking; Model checking; Computer science; conflict; General Mathematics; Bayesian probability; Machine learning; computer.software_genre; Methodology (stat.ME); partial posterior predictive; Prior probability; Statistics - Methodology; business.industry; model criticism; Probability and statistics; Exploratory analysis; objective Bayesian methods; empirical-Bayes; posterior predictive; p-values; Artificial intelligence; Statistics Probability and Uncertainty; business; computer

researchProduct

Centrality measures for networks with community structure

2016

Understanding the network structure and identifying the influential nodes is a challenging problem in large networks. Identifying the most influential nodes can be useful in many applications, such as immunizing nodes during epidemic spreading or during intentional attacks on complex networks. Much research has been devoted to devising centrality measures that can efficiently identify the most influential nodes in a network. There are two major approaches to the problem: on the one hand, deterministic strategies that exploit knowledge about the overall network topology in order to find the influential nodes, while on the other hand, random strategies are completely agnostic ab…

FOS: Computer and information sciences; Statistics and Probability; Physics - Physics and Society; Exploit; Complex networks; FOS: Physical sciences; Network science; Physics and Society (physics.soc-ph); Network theory; Machine learning; computer.software_genre; Network topology; Immunization strategies; 01 natural sciences; 010305 fluids & plasmas; 0103 physical sciences; 010306 general physics; Mathematics; Social and Information Networks (cs.SI); Structure (mathematical logic); [PHYS.PHYS]Physics [physics]/Physics [physics]; business.industry; Community structure; Computer Science - Social and Information Networks; Complex network; Epidemic dynamics; Condensed Matter Physics; [ PHYS.PHYS ] Physics [physics]/Physics [physics]; Community structure; Artificial intelligence; Data mining; business; Centrality; computer

researchProduct

A multi-scale area-interaction model for spatio-temporal point patterns

2018

Models for fitting spatio-temporal point processes should incorporate spatio-temporal inhomogeneity and allow for different types of interaction between points (clustering or regularity). This paper proposes an extension of the spatial multi-scale area-interaction model to a spatio-temporal framework. This model allows for interaction between points at different spatio-temporal scales and the inclusion of covariates. We fit the proposed model to varicella cases registered during 2013 in Valencia, Spain. The fitted model indicates small scale clustering and regularity for higher spatio-temporal scales.

FOS: Computer and information sciences; Statistics and Probability; Scale (ratio); Computer science; Management Monitoring Policy and Law; Multi-scale area-interaction model; computer.software_genre; Varicella; 01 natural sciences; Point process; Methodology (stat.ME); 010104 statistics & probability; 0502 economics and business; Statistics; Covariate; 60D05 60G55 62M30; Point (geometry); 0101 mathematics; Computers in Earth Sciences; Cluster analysis; Statistics - Methodology; 050205 econometrics; 05 social sciences; Interaction model; Extension (predicate logic); Gibbs point processes; ComputingMethodologies_PATTERNRECOGNITION; Spatio-temporal point processes; Data mining; computer

researchProduct

Imputation Procedures in Surveys Using Nonparametric and Machine Learning Methods: An Empirical Comparison

2020

Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse, nonparametric and machine learning procedures may thus provide a useful alternative to traditional imputation procedures for deriving a set of imputed values, which are then used to estimate study parameters defined as the solution of a population estimating equation. In this paper, we conduct an extensive empirical investigation that compares a number of imputation procedures in terms of bias and efficiency in a wide variety of settings, including high-dimens…

FOS: Computer and information sciences; Statistics and Probability; Statistics::Applications; Empirical comparison; business.industry; Computer science; Applied Mathematics; Nonparametric statistics; Machine learning; computer.software_genre; Statistics - Computation; Variety (cybernetics); Methodology (stat.ME); Set (abstract data type); Statistics::Methodology; Imputation (statistics); Artificial intelligence; Statistics Probability and Uncertainty; business; computer; Statistics - Methodology; Computation (stat.CO); Social Sciences (miscellaneous); Journal of Survey Statistics and Methodology

researchProduct

Alignment-free Genomic Analysis via a Big Data Spark Platform

2021

Motivation: Alignment-free distance and similarity functions (AF functions, for short) are a well-established alternative to pairwise and multiple sequence alignments for many genomic, metagenomic and epigenomic tasks. Due to data-intensive applications, the computation of AF functions is a Big Data problem, with the recent literature indicating that the development of fast and scalable algorithms computing AF functions is a high-priority task. Somewhat surprisingly, despite the increasing popularity of Big Data technologies in computational biology, the development of a Big Data platform for those tasks has not been pursued, possibly due to its complexity. Results: We fill this impo…

FOS: Computer and information sciences; Statistics and Probability; sequence analysis; Computer science; 0206 medical engineering; Big data; 02 engineering and technology; Machine learning; computer.software_genre; Biochemistry; 03 medical and health sciences; Spark (mathematics); MapReduce; Molecular Biology; 030304 developmental biology; 0303 health sciences; Settore INF/01 - Informatica; business.industry; Bioinformatics High Performance Computing Compressed Data Structures; MapReduce; hadoop; sequence analysis; Computer Science Applications; Computational Mathematics; Task (computing); Computer Science - Distributed Parallel and Cluster Computing; Computational Theory and Mathematics; Distributed Parallel and Cluster Computing (cs.DC); Artificial intelligence; hadoop; business; computer; 020602 bioinformatics; Bioinformatics

researchProduct

Introducing Traceability in GitHub for Medical Software Development

2021

Assuring traceability from requirements to implementation is a key element when developing safety critical software systems. Traditionally, this traceability is ensured by a waterfall-like process, where phases follow each other, and tracing between different phases can be managed. However, new software development paradigms, such as continuous software engineering and DevOps, which encourage a steady stream of new features, committed by developers in a seemingly uncontrolled fashion in terms of former phasing, challenge this view. In this paper, we introduce our approach that adds traceability capabilities to GitHub, so that the developers can act like they normally do in GitHub context bu…

FOS: Computer and information sciences; Traceability; Computer science; Process (engineering); Context (language use); computer.software_genre; regulated software; GitHub; Computer Science - Software Engineering; Documentation; Medical software; jäljitettävyys; Software system; DevOps; DevOps; business.industry; turvallisuus; Software development; tietokoneohjelmat; ohjelmistot (taiteet); kehittäminen; 113 Computer and information sciences; Software Engineering (cs.SE); ohjelmistosuunnittelu; traceability; vaatimustenhallinta; business; Software engineering; ohjelmistokehitys; computer; continuous software engineering

researchProduct

Semantic Computing of Moods Based on Tags in Social Media of Music

2014

Social tags inherent in online music services such as Last.fm provide a rich source of information on musical moods. The abundance of social tags makes this data highly beneficial for developing techniques to manage and retrieve mood information, and enables study of the relationships between music content and mood representations with data substantially larger than that available for conventional emotion research. However, no systematic assessment has been done on the accuracy of social tags and derived semantic models at capturing mood information in music. We propose a novel technique called Affective Circumplex Transformation (ACT) for representing the moods of music tracks in an interp…

FOS: Computer and information sciences; Vocabulary; Computer science; Music information retrieval; media_common.quotation_subject; Semantic analysis (machine learning); Moods; computer.software_genre; Affect (psychology); Semantics; Computer Science - Information Retrieval; Semantic computing; Music information retrieval; Affective computing; media_common; Social and Information Networks (cs.SI); ta113; Probabilistic latent semantic analysis; Social tags; business.industry; Computer Science - Social and Information Networks; Multimedia (cs.MM); Semantic analysis; Computer Science Applications; Mood; Computational Theory and Mathematics; Web mining; ta6131; Vector space model; Artificial intelligence; Genres; business; computer; Computer Science - Multimedia; Information Retrieval (cs.IR); Music; Natural language processing; Prediction.; Information Systems; IEEE Transactions on Knowledge and Data Engineering

researchProduct

Measuring Semantic Coherence of a Conversation

2018

Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. We introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic relations between concepts introduced during a conversation. We propose and evaluate graph-based and machine learning-based approaches for measuring semantic coherence using knowledge graphs, their vector space embeddings and word embedding models, as sources of background knowledge. We demonstrat…

FOS: Computer and information sciences; Word embedding; Computer science; Computer Science - Artificial Intelligence; media_common.quotation_subject; ihmisen ja tietokoneen vuorovaikutus; 02 engineering and technology; computer.software_genre; keskustelu; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Conversation; conversational systems; media_common; Computer Science - Computation and Language; business.industry; koneoppiminen; Artificial Intelligence (cs.AI); Knowledge graph; semantiikka; Graph (abstract data type); 020201 artificial intelligence & image processing; Artificial intelligence; business; semantic coherence; computer; Computation and Language (cs.CL); Natural language processing

researchProduct