Search results for "data set"
showing 10 items of 154 documents
Multi-phase classification by a least-squares support vector machine approach in tomography images of geological samples
2016
Abstract. Image processing of X-ray-computed polychromatic cone-beam micro-tomography (μXCT) data of geological samples mainly involves artefact reduction and phase segmentation. For the former, the main beam-hardening (BH) artefact is removed by applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. A Matlab code for this approach is provided in the Appendix. The final BH-corrected image is extracted from the residual data or from the difference between the surface elevation values and the original grey-scale values. For the segmentation, we propose a novel least-squar…
On the Dependence of Cirrus Parametrizations on the Cloud Origin
2019
Particle size distributions (PSDs) for cirrus clouds are important for both climate models and many remote sensing retrieval methods. Therefore, PSD parametrizations are required. This study presents parametrizations of Arctic cirrus PSDs. The dataset used for this purpose originates from balloon-borne measurements carried out during winter above Kiruna (Sweden), i.e. north of the Arctic circle. The observations are sorted into two types of cirrus cloud origin, either in-situ or liquid. The cloud origin describes the formation pathway of the ice particles. At temperatures below −38 °C, ice particles form in-situ from solution or ice nuclea…
A critical discussion of the electromagnetic radiation (EMR) method to determine stress orientations within the crust
2012
Abstract. In recent years, the electromagnetic radiation (EMR) method has been used to detect faults and to determine main horizontal stress directions from variations in intensities and directional properties of electromagnetic emissions, which are assumed to be generated during micro-cracking. Based on a large data set taken from an area of about 250 000 km² in Northern Germany, Denmark, and southern Sweden with repeated measurements at one location during a time span of about 1.5 yr, the method was systematically tested. Reproducible observations of temporary changes in the signal patterns, as well as a strongly concentric spatial pattern of the main directions of the magnetic component …
ATLANTIC BIRDS: a data set of bird species from the Brazilian Atlantic Forest
2017
South America holds 30% of the world's avifauna, with the Atlantic Forest representing one of the richest regions of the Neotropics. Here we have compiled a data set on Brazilian Atlantic Forest bird occurrence (150,423) and abundance samples (N = 832 bird species; 33,119 bird individuals) using multiple methods, including qualitative surveys, mist nets, point counts, and line transects. We used four main sources of data: museum collections, on‐line databases, literature sources, and unpublished reports. The data set comprises 4,122 localities and data from 1815 to 2017. Most studies were conducted in the Florestas de Interior (1,510 localities) and Serra do Mar (1,280 localities) biogeogr…
Wind component estimation for UAS flying in turbulent air
2019
One of the most important problems of autonomous flight for UAS is wind identification, especially for small-scale vehicles. This research focuses on an identification methodology based on the Extended Kalman Filter (EKF). In particular, the authors focus their attention on the filter tuning problem. The proposed procedure requires low computational power, so it is very useful for UAS. Besides, it allows a robust wind component identification even when, as is usually the case, the measurement data set is affected by noticeable noise. (C) 2019 Elsevier Masson SAS. All rights reserved.
Quantum clustering in non-spherical data distributions: Finding a suitable number of clusters
2017
Quantum Clustering (QC) provides an alternative approach to clustering algorithms, several of which are based on geometric relationships between data points. Instead, QC makes use of quantum mechanics concepts to find structures (clusters) in data sets by finding the minima of a quantum potential. The starting point of QC is a Parzen estimator with a fixed length scale, which significantly affects the final cluster allocation. This dependence on an adjustable parameter is common to other methods. We propose a framework to find suitable values of the length parameter σ by optimising twin measures of cluster separation and consistency for a given cluster number. This is an extension of the Se…
Partitioned learning of deep Boltzmann machines for SNP data.
2016
Abstract. Motivation: Learning the joint distributions of measurements, and in particular identification of an appropriate low-dimensional manifold, has been found to be a powerful ingredient of deep learning approaches. Yet, such approaches have hardly been applied to single nucleotide polymorphism (SNP) data, probably due to the high number of features typically exceeding the number of studied individuals. Results: After a brief overview of how deep Boltzmann machines (DBMs), a deep learning approach, can be adapted to SNP data in principle, we specifically present a way to alleviate the dimensionality problem by partitioned learning. We propose a sparse regression approach to coarsely screen…
Dynamic Functional Connectivity Captures Individuals’ Unique Brain Signatures
2020
Recent neuroimaging evidence suggests that there exists a unique individual-specific functional connectivity (FC) pattern consistent across tasks. The objective of our study is to utilize FC patterns to identify an individual using a supervised machine learning approach. To this end, we use two previously published data sets that comprise resting-state and task-based fMRI responses. We use static FC measures as input to a linear classifier to evaluate its performance. We additionally extend this analysis to capture dynamic FC using two approaches: the common sliding window approach and the more recent phase synchrony-based measure. We found that the classification models using dynamic FC pa…
Improved polyhedral descriptions and exact procedures for a broad class of uncapacitated p-hub median problems
2019
Abstract. This work focuses on a broad class of uncapacitated p-hub median problems that includes non-stop services and setup costs for the network structures. In order to capture both the single and the multiple allocation patterns as well as any intermediate case of interest, we consider the so-called r-allocation pattern with r denoting the maximum number of hubs a terminal can be allocated to. We start by revisiting an optimization model recently proposed for the problem. For that model, we introduce several families of valid inequalities as well as optimality cuts. Moreover, we consider a relaxation of the model that contains several sets of set packing constraints. This motivates a pol…
Estimation of ADME Properties in Drug Discovery: Predicting Caco-2 Cell Permeability Using Atom-Based Stochastic and Non-stochastic Linear Indices
2007
The in vitro determination of the permeability through cultured Caco-2 cells is the most often-used in vitro model for drug absorption. In this report, we use the largest data set of measured P(Caco-2), consisting of 157 structurally diverse compounds. Linear discriminant analysis (LDA) was used to obtain quantitative models that discriminate higher absorption compounds from those with moderate-poorer absorption. The best LDA model has an accuracy of 90.58% and 84.21% for the training and test sets, respectively. The percentage of good correlation, in the virtual screening of 241 drugs with the reported values of the percentage of human intestinal absorption (HIA), was greater than 81%. In addition, multiple …