Search results for "Data set"
showing 10 items of 154 documents
Online Induction of Probabilistic Real Time Automata
2012
Probabilistic real time automata (PRTAs) are a representation of dynamic processes arising in the sciences and industry. Currently, the induction of automata is divided into two steps: the creation of the prefix tree acceptor (PTA) and the merge procedure based on clustering of the states. These two steps can be very time intensive when a PRTA is to be induced for massive or even unbounded data sets. The latter one can be efficiently processed, as there exist scalable online clustering algorithms. However, the creation of the PTA still can be very time consuming. To overcome this problem, we propose a genuine online PRTA induction approach that incorporates new instances by first collapsing…
Ultimate Order Statistics-Based Prototype Reduction Schemes
2013
Published version of a chapter in the book: AI 2013: Advances in Artificial Intelligence. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-319-03680-9_42 The objective of Prototype Reduction Schemes (PRSs) and Border Identification (BI) algorithms is to reduce the number of training vectors, while simultaneously attempting to guarantee that the classifier built on the reduced design set performs as well, or nearly as well, as the classifier built on the original design set. In this paper, we shall push the limit on the field of PRSs to see if we can obtain a classification accuracy comparable to the optimal, by condensing the information in the data set into a single tr…
A comprehensive analysis of Universal Soil Loss Equation-based models at the Sparacia experimental area
2020
Improving Universal Soil Loss Equation (USLE)‐based models is of great interest because simple and reliable analytical tools are necessary for sustainable land management. First, this paper introduces a general definition of the event rainfall‐runoff erosivity factor for the USLE‐based models, REFₑ = (QR)^(b₁) (EI₃₀)^(b₂), in which QR is the event runoff coefficient, EI₃₀ is the single‐storm erosion index, and b₁ and b₂ are coefficients. The rainfall‐runoff erosivity factors of the USLE (b₁ = 0 and b₂ = 1), USLE‐M (b₁ = b₂ = 1), USLE‐MB (b₁ ≠ 1 and b₂ = 1), USLE‐MR (b₁ = 1 and b₂ ≠ 1), USLE‐MM (b₁ = b₂ ≠ 1), and USLE‐M2 (b₁ ≠ b₂ ≠ 1) can be defined using REFₑ. Then t…
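The general erosivity factor quoted in this abstract is a product of two powers, so the listed model variants differ only in the exponents b₁ and b₂. A minimal Python sketch of that relationship follows; the function name `event_erosivity` and the input values are hypothetical illustrations, not taken from the paper, and only the two variants with fully specified exponents (USLE and USLE‐M) are shown.

```python
def event_erosivity(qr: float, ei30: float, b1: float, b2: float) -> float:
    """Return REFe = (QR)^b1 * (EI30)^b2.

    qr   -- event runoff coefficient QR (dimensionless)
    ei30 -- single-storm erosion index EI30
    b1, b2 -- exponents; their values select the USLE-based variant
    """
    if qr < 0 or ei30 < 0:
        raise ValueError("QR and EI30 must be non-negative")
    return (qr ** b1) * (ei30 ** b2)


# Hypothetical event values, chosen only to illustrate the arithmetic.
QR_EXAMPLE = 0.25
EI30_EXAMPLE = 80.0

# USLE: b1 = 0 and b2 = 1, so the factor reduces to EI30.
usle = event_erosivity(QR_EXAMPLE, EI30_EXAMPLE, b1=0.0, b2=1.0)

# USLE-M: b1 = b2 = 1, so the factor is QR * EI30.
usle_m = event_erosivity(QR_EXAMPLE, EI30_EXAMPLE, b1=1.0, b2=1.0)

print(usle, usle_m)
```

The other variants (USLE‐MB, USLE‐MR, USLE‐MM, USLE‐M2) would use the same function with fitted exponents, which the truncated abstract does not give.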
Intruder Pattern Identification
2008
This paper considers the problem of intrusion detection in information systems as a classification problem. In particular, the case of the masquerader is treated. This kind of intrusion is one of the more difficult to discover because it may attack already open user sessions. Moreover, this problem is complex because of the large variability of user models and the lack of available data for learning. Here, flexible and robust similarity measures, suitable also for non-numeric data, are defined; they will be incorporated into a one-class training KNN and compared with several classification methods proposed in the literature using the Masquerading User Data set (www.schonlau.net) repr…
Comparing Traditional and IRT Scoring of Forced-Choice Tests.
2018
This article explores how traditional scores obtained from different forced-choice (FC) formats relate to their true scores and item response theory (IRT) estimates. Three FC formats are considered from a block of items, and respondents are asked to (a) pick the item that describes them most (PICK), (b) choose the two items that describe them the most and the least (MOLE), or (c) rank all the items in the order of their descriptiveness of the respondents (RANK). The multi-unidimensional pairwise-preference (MUPP) model, which is extended to more than two items per block and different FC formats, is applied to obtain the responses to each item block. Traditional and IRT (i.e., expected a po…
A Comparative Study of Nonlinear Machine Learning for the "In Silico" Depiction of Tyrosinase Inhibitory Activity from Molecular Structure.
2011
In the present report, for the first time, support vector machine (SVM), artificial neural network (ANN), Bayesian networks (BNs), and k-nearest neighbor (k-NN) are applied and compared on two "in-house" datasets to describe the tyrosinase inhibitory activity from the molecular structure. The data set Data I is used for the identification of tyrosinase inhibitors (TIs) including 701 active and 728 inactive compounds. Data II consists of active chemicals for potency estimation of TIs. The 2D TOMOCOMD-CARDD atom-based quadratic indices are used as molecular descriptors. The derived models show rather encouraging results with the areas under the Receiver Operating Characteristic (AURC) curve …
Continuous SO2 flux measurements for Vulcano Island, Italy
2012
The La Fossa cone of Vulcano Island (Aeolian Archipelago, Italy) is a closed conduit volcano. Today, Vulcano Island is characterized by solfataric activity, with a large fumarolic field that is mainly located in the summit area. A scanning differential optical absorption spectroscopy instrument designed by the Optical Sensing Group of Chalmers University of Technology in Göteborg, Sweden, was installed in the framework of the European project "Network for Observation of Volcanic and Atmospheric Change", in March 2008. This study presents the first dataset of SO₂ plume fluxes recorded for a closed volcanic system. Between 2008 and 2010, the SO₂…
A Bayesian Reconstruction of a Historical Population in Finland, 1647–1850
2020
This article provides a novel method for estimating historical population development. We review the previous literature on historical population time-series estimates and propose a general outline to address the well-known methodological problems. We use a Bayesian hierarchical time-series model that allows us to integrate the parish-level data set and prior population information in a coherent manner. The procedure provides us with model-based posterior intervals for the final population estimates. We demonstrate its applicability by estimating the long-term development of Finland's population from 1647 onward and simultaneously place the country among the very few to have an annual popula…
Colorimetric Characterization of a Positive Film Scanner Using an Extremely Reduced Training Data Set
2011
In this work, we address the problem of obtaining an accurate colorimetric characterization of a scanner for traditional positive film in order to guarantee the accuracy of the color information during the digitization of a movie. The scanning of a positive film is not a usual task; however, it can happen for cultural heritage purposes. Art movies are often created and stored as positive film in museums. One of the problems one can face in a colorimetric characterization is obtaining a reasonable number of measurements from an item. In this work we succeeded in achieving a reasonable accuracy with just a small number of measurements (typically 4 to 7 ΔE*ab units with 2 t…
Coupling two radar backscattering models to assess soil roughness and surface water content at farm scale
2013
Remote sensing techniques are useful for agro-hydrological monitoring at the farm scale because the availability of spatially and temporally distributed data improves agricultural models for irrigation and crop yield optimization under water scarcity conditions. This research focuses on the surface water content retrieval using active microwave data. Two semi-empirical models were chosen as these showed the best performances in simulating cross and co-polarized backscatter. Thus, these models were coupled to obtain reliable assessments of both soil water content and soil roughness. The use of the coupled model enables one to avoid using roughness measured in situ. Remote sensing images and …