Search results for "Computer Science Application"
showing 10 items of 3998 documents
Pruning Incremental Linear Model Trees with Approximate Lookahead
2014
Incremental linear model trees with approximate lookahead are fast, but produce overly large trees. This is due to non-optimal splitting decisions boosted by a possibly unlimited number of examples obtained from a data source. To keep the processing speed high and the tree complexity low, appropriate incremental pruning techniques are needed. In this paper, we introduce a pruning technique for the class of incremental linear model trees with approximate lookahead on stationary data sources. Experimental results show that the advantage of approximate lookahead in terms of processing speed can be further improved by producing much smaller and consequently more explanatory, less memory consumi…
Traitpedia: a collaborative effort to gather species traits
2018
Summary: Traitpedia is a collaborative database aimed at collecting binary traits in tabular form for a growing number of species. Availability and implementation: Traitpedia can be accessed from http://cbdm-01.zdv.uni-mainz.de/~munoz/traitpedia. Supplementary information: Supplementary data are available at Bioinformatics online.
BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures
2018
Motivation: Modern bioinformatics tools for analyzing large-scale NGS datasets often need to include fast implementations of core sequence alignment algorithms in order to achieve reasonable execution times. We address this need by presenting the BGSA toolkit for optimized implementations of popular bit-parallel global pairwise alignment algorithms on modern microprocessors. Results: BGSA outperforms Edlib, SeqAn and BitPAl for pairwise edit distance computations and Parasail, SeqAn and BitPAl when using more general scoring schemes for pairwise alignments of a batch of sequence reads on both standard multi-core CPUs and Xeon Phi many-core CPUs. Furthermore, banded edit distance perf…
The Psychological Science Accelerator’s COVID-19 rapid-response dataset
2023
Funder: Amazon Web Services (AWS) Imagine Grant
CROSSMAPPER: estimating cross-mapping rates and optimizing experimental design in multi-species sequencing studies
2020
Motivation: Numerous sequencing studies, including transcriptomics of host-pathogen systems, sequencing of hybrid genomes, xenografts, mixed species systems, metagenomics and meta-transcriptomics, involve samples containing genetic material from divergent organisms. A crucial step in these studies is identifying from which organism each sequencing read originated, and the experimental design should be directed to minimize biases caused by cross-mapping of reads to incorrect source genomes. Additionally, pooling of sufficiently different genetic material into a single sequencing library could significantly reduce experimental costs but requires careful planning and assessment of the impact of…
Quantitative characterization of antigens using monoclonal antibody reactivities
1993
A multipurpose program that empirically relates antigenic reactivities with monoclonal antibodies (MAbs) to genetic distances is presented. The program uses a set of known genetic pairwise distances to weigh each MAb depending on its capacity to define groups of taxonomically related antigens. This allows highly accurate identification and classification of unknown antigens. Also, the weights obtained constitute a quantitative measure of epitope conservation and can be used for improved vaccine design. © 1993 Oxford University Press.
Expert-based versus citation-based ranking of scholarly and scientific publication channels
2016
The Finnish publication channel quality ranking system was established in 2010. The system is expert-based, where separate panels decide and update the rankings of a set of publication channels allocated to them. The aggregated rankings have a notable role in the allocation of public resources to universities. The purpose of this article is to analyze this national ranking system. The analysis is mainly based on two publicly available databases containing the publication source information and the actual national publication activity information. Using citation-based indicators and other available information with association rule mining, decision trees, and confusion matrices, …
TiFoSi: an efficient tool for mechanobiology simulations of epithelia
2020
Motivation: Emerging phenomena in developmental biology and tissue engineering are the result of feedbacks between gene expression and cell biomechanics. In that context, in silico experiments are a powerful tool to understand fundamental mechanisms and to formulate and test hypotheses.
Consensus among preference rankings: a new weighted correlation coefficient for linear and weak orderings
2021
Preference data are a particular type of ranking data where some subjects (voters, judges, ...) express their preferences over a set of alternatives (items). In most real-life cases, some items receive the same preference by a judge, thus giving rise to a ranking with ties. An important issue involving rankings concerns the aggregation of the preferences into a “consensus”. The purpose of this paper is to investigate the consensus between rankings with ties, taking into account the importance of swapping elements belonging to the top (or to the bottom) of the ordering (position weights). By combining the structure of $\tau_x$ proposed by Emond and Mason (J Multi-Criteria Decis…
Pathway analysis of high-throughput biological data within a Bayesian network framework
2011
Motivation: Most current approaches to high-throughput biological data (HTBD) analysis perform either individual gene/protein analysis or gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Results: The proposed method takes into account the connectivity and relatedness between nodes of the p…