Search results for "Retrieval"
showing 10 items of 1176 documents
Expert-based versus citation-based ranking of scholarly and scientific publication channels
2016
Abstract The Finnish publication channel quality ranking system was established in 2010. The system is expert-based: separate panels decide and update the rankings of a set of publication channels allocated to them. The aggregated rankings have a notable role in the allocation of public resources to universities. The purpose of this article is to analyze this national ranking system. The analysis is mainly based on two publicly available databases containing the publication source information and the actual national publication activity information. Using citation-based indicators and other available information with association rule mining, decision trees, and confusion matrices, …
Estimating the decomposition of predictive information in multivariate systems
2015
In the study of complex systems from observed multivariate time series, the evolution of one system is often the object of investigation; this evolution can be explained by the information storage of the system and the information transfer from other interacting systems. We present a framework for the model-free estimation of information storage and information transfer computed as the terms composing the predictive information about the target of a multivariate dynamical process. The approach tackles the curse of dimensionality employing a nonuniform embedding scheme that selects progressively, among the past components of the multivariate process, only those that contribute most, in terms of co…
A new position weight correlation coefficient for consensus ranking process without ties
2019
Preference data represent a particular type of ranking data where a group of people gives their preferences over a set of alternatives. The traditional metrics between rankings do not take into account the importance of swapping elements similar among them (element weights) or elements belonging to the top (or to the bottom) of an ordering (position weights). Following the structure of the τx proposed by Emond and Mason and the class of weighted Kemeny–Snell distances, a proper rank correlation coefficient is defined for measuring the correlation among weighted position rankings without ties. The one‐to‐one correspondence between the weighted distance and the rank correlation coefficient ho…
Textual data compression in computational biology: a synopsis.
2009
Abstract Motivation: Textual data compression, and the associated techniques coming from information theory, are often perceived as being of interest for data communication and storage. However, they are also deeply related to classification and data mining and analysis. In recent years, a substantial effort has been made for the application of textual data compression techniques to various computational biology tasks, ranging from storage and indexing of large datasets to comparison and reverse engineering of biological networks. Results: The main focus of this review is on a systematic presentation of the key areas of bioinformatics and computational biology where compression has been use…
Immune networks: multitasking capabilities near saturation
2013
Pattern-diluted associative networks were introduced recently as models for the immune system, with nodes representing T-lymphocytes and stored patterns representing signalling protocols between T- and B-lymphocytes. It was shown earlier that in the regime of extreme pattern dilution, a system with $N_T$ T-lymphocytes can manage a number $N_B = \mathcal{O}(N_T^\delta)$ of B-lymphocytes simultaneously, with $\delta < 1$. Here we study this model in the extensive load regime $N_B = \alpha N_T$, with also a high degree of pattern dilution, in agreement with immunological findings. We use graph theory and statistical mechanical analysis based on replica methods to show that in the finite-connectivit…
Visualizing categorical data in ViSta
2003
The modules in the statistical package ViSta related to categorical data analysis are presented. These modules are: visualization of frequency data with mosaic and bar plots, correspondence analysis, multiple correspondence analysis, and loglinear analysis. All these methods are implemented in ViSta with a strong emphasis on plots and graphical representations of data, as well as on the user's interactivity with the system. Together they provide a system that has proven to be easy, useful, and powerful, both for novice and experienced users.
SeqEditor: an application for primer design and sequence analysis with or without GTF/GFF files
2021
[Motivation]: Sequence analyses oriented to investigate specific features, patterns and functions of protein and DNA/RNA sequences usually require tools based on graphic interfaces whose main characteristic is their intuitiveness and interactivity with the user’s expertise, especially when curation or primer design tasks are required. However, interface-based tools usually pose certain computational limitations when managing large sequences or complex datasets, such as genome and transcriptome assemblies. With these requirements in mind, we have developed SeqEditor, an interactive software tool for the analysis of nucleotide and protein sequences.
Immune networks: Multi-tasking capabilities at medium load
2013
Associative network models featuring multi-tasking properties have been introduced recently and studied in the low load regime, where the number $P$ of simultaneously retrievable patterns scales with the number $N$ of nodes as $P\sim \log N$. In addition to their relevance in artificial intelligence, these models are increasingly important in immunology, where stored patterns represent strategies to fight pathogens and nodes represent lymphocyte clones. They allow us to understand the crucial ability of the immune system to respond simultaneously to multiple distinct antigen invasions. Here we develop further the statistical mechanical analysis of such systems, by studying the medium load r…
Hybrid recommendation methods in complex networks
2015
We propose here two new recommendation methods, based on the appropriate normalization of already existing similarity measures, and on the convex combination of the recommendation scores derived from similarity between users and between objects. We validate the proposed measures on three relevant data sets, and we compare their performance with several recommendation systems recently proposed in the literature. We show that the proposed similarity measures yield a performance improvement of up to 20% with respect to existing non-parametric methods, and that the accuracy of a recommendation can vary widely from one specific bipartite network to another, which suggests that a …
On the empirical spectral distribution for certain models related to sample covariance matrices with different correlations
2021
Given [Formula: see text], we study two classes of large random matrices of the form [Formula: see text] where for every [Formula: see text], [Formula: see text] are iid copies of a random variable [Formula: see text], [Formula: see text], [Formula: see text] are two (not necessarily independent) sets of independent random vectors having different covariance matrices and generating well concentrated bilinear forms. We consider two main asymptotic regimes as [Formula: see text]: a standard one, where [Formula: see text], and a slightly modified one, where [Formula: see text] and [Formula: see text] while [Formula: see text] for some [Formula: see text]. Assuming that vectors [Formula: see t…