Search results for "Information Systems"
showing 10 items of 1926 documents
Pruning Incremental Linear Model Trees with Approximate Lookahead
2014
Incremental linear model trees with approximate lookahead are fast, but produce overly large trees. This is due to non-optimal splitting decisions boosted by a possibly unlimited number of examples obtained from a data source. To keep the processing speed high and the tree complexity low, appropriate incremental pruning techniques are needed. In this paper, we introduce a pruning technique for the class of incremental linear model trees with approximate lookahead on stationary data sources. Experimental results show that the advantage of approximate lookahead in terms of processing speed can be further improved by producing much smaller and consequently more explanatory, less memory consumi…
Ranking coherence in topic models using statistically validated networks
2023
Probabilistic topic models have become one of the most widespread machine learning techniques in textual analysis. Topic discovering is an unsupervised process that does not guarantee the interpretability of its output. Hence, the automatic evaluation of topic coherence has attracted the interest of many researchers over the last decade, and it is an open research area. This article offers a new quality evaluation method based on statistically validated networks (SVNs). The proposed probabilistic approach consists of representing each topic as a weighted network of its most probable words. The presence of a link between each pair of words is assessed by statistically validating their co-oc…
The Psychological Science Accelerator’s COVID-19 rapid-response dataset
2023
Funder: Amazon Web Services (AWS) Imagine Grant
A multi-local optimization algorithm
1998
The development of efficient algorithms that provide all the local minima of a function is crucial to solve certain subproblems in many optimization methods. A “multi-local” optimization procedure using inexact line searches is presented, and numerical experiments are also reported. An application of the method to a semi-infinite programming procedure is included.
The Serial Property and Restricted Balanced Contributions in discrete cost sharing problems
2006
We show that the Serial Poperty and Restricted Balanced Contributions characterize the subsidy-free serial cost sharing method (Moulin (1995)) in discrete cost allocation problems.
On the analysis of a random walk-jump chain with tree-based transitions and its applications to faulty dichotomous search
2018
Random Walks (RWs) have been extensively studied for more than a century [1]. These walks have traditionally been on a line, and the generalizations for two and three dimensions, have been by extending the random steps to the corresponding neighboring positions in one or many of the dimensions. Among the most popular RWs on a line are the various models for birth and death processes, renewal processes and the gambler’s ruin problem. All of these RWs operate “on a discretized line”, and the walk is achieved by performing small steps to the current-state’s neighbor states. Indeed, it is this neighbor-step motion that renders their analyses tractable. When some of the transitions are to non-ne…
WOODIV, a database of occurrences, functional traits, and phylogenetic data for all Euro-Mediterranean trees
2021
Trees play a key role in the structure and function of many ecosystems worldwide. In the Mediterranean Basin, forests cover approximately 22% of the total land area hosting a large number of endemics (46 species). Despite its particularities and vulnerability, the biodiversity of Mediterranean trees is not well known at the taxonomic, spatial, functional, and genetic levels required for conservation applications. The WOODIV database fills this gap by providing reliable occurrences, four functional traits (plant height, seed mass, wood density, and specific leaf area), and sequences from three DNA-regions (rbcL, matK, and trnH-psbA), together with modelled occurrences and a phylogeny for all…
Spanish electoral archive. SEA database
2021
This paper introduces the SEA database (acronym for Spanish Electoral Archive). SEA brings together the most complete public repository available to date on Spanish election outcomes. SEA holds all the results recorded from the electoral processes of General (1979–2019), Regional (1989–2021), Local (1979–2019) and European Parliamentary (1987–2019) elections held in Spain since the restoration of democracy in the late 70 s, in addition to other data sets with electoral content. The data are offered for free and is presented in a homogeneous and friendly format. Most of the databases are available for download with data from various electoral levels, including from the ballot box level. This…
A database for the monitoring of thermal anomalies over the Amazon forest and adjacent intertropical oceans
2015
AbstractAdvances in information technologies and accessibility to climate and satellite data in recent years have favored the development of web-based tools with user-friendly interfaces in order to facilitate the dissemination of geo/biophysical products. These products are useful for the analysis of the impact of global warming over different biomes. In particular, the study of the Amazon forest responses to drought have recently received attention by the scientific community due to the occurrence of two extreme droughts and sustained warming over the last decade. Thermal Amazoni@ is a web-based platform for the visualization and download of surface thermal anomalies products over the Ama…
Galaxy LIMS for next-generation sequencing.
2013
Abstract Summary: We have developed a laboratory information management system (LIMS) for a next-generation sequencing (NGS) laboratory within the existing Galaxy platform. The system provides lab technicians standard and customizable sample information forms, barcoded submission forms, tracking of input sample quality, multiplex-capable automatic flow cell design and automatically generated sample sheets to aid physical flow cell preparation. In addition, the platform provides the researcher with a user-friendly interface to create a request, submit accompanying samples, upload sample quality measurements and access to the sequencing results. As the LIMS is within the Galaxy platform, the …