Search results for "Machine"
showing 10 items of 2592 documents
Randomized kernels for large scale Earth observation applications
2020
Abstract Current remote sensing applications of bio-geophysical parameter estimation and image classification have to deal with an unprecedented big amount of heterogeneous and complex data sources. New satellite sensors involving a high number of improved time, space and wavelength resolutions give rise to challenging computational problems. Standard physical inversion techniques cannot cope efficiently with this new scenario. Dealing with land cover classification of the new image sources has also turned to be a complex problem requiring large amount of memory and processing time. In order to cope with these problems, statistical learning has greatly helped in the last years to develop st…
Machine learning information fusion in Earth observation: A comprehensive review of methods, applications and data sources
2020
This paper reviews the most important information fusion data-driven algorithms based on Machine Learning (ML) techniques for problems in Earth observation. Nowadays we observe and model the Earth with a wealth of observations, from a plethora of different sensors, measuring states, fluxes, processes and variables, at unprecedented spatial and temporal resolutions. Earth observation is well equipped with remote sensing systems, mounted on satellites and airborne platforms, but it also involves in-situ observations, numerical models and social media data streams, among other data sources. Data-driven approaches, and ML techniques in particular, are the natural choice to extract significant i…
Disrupting resilient criminal networks through data analysis: The case of Sicilian Mafia
2020
Compared to other types of social networks, criminal networks present hard challenges, due to their strong resilience to disruption, which poses severe hurdles to law-enforcement agencies. Herein, we borrow methods and tools from Social Network Analysis to (i) unveil the structure of Sicilian Mafia gangs, based on two real-world datasets, and (ii) gain insights as to how to efficiently disrupt them. Mafia networks have peculiar features, due to the links distribution and strength, which makes them very different from other social networks, and extremely robust to exogenous perturbations. Analysts are also faced with the difficulty in collecting reliable datasets that accurately describe the…
Do-search -- a tool for causal inference and study design with multiple data sources
2020
Epidemiologic evidence is based on multiple data sources including clinical trials, cohort studies, surveys, registries, and expert opinions. Merging information from different sources opens up new possibilities for the estimation of causal effects. We show how causal effects can be identified and estimated by combining experiments and observations in real and realistic scenarios. As a new tool, we present do-search, a recently developed algorithmic approach that can determine the identifiability of a causal effect. The approach is based on do-calculus, and it can utilize data with nontrivial missing data and selection bias mechanisms. When the effect is identifiable, do-search outputs an i…
Quantum, stochastic, and pseudo stochastic languages with few states
2014
Stochastic languages are the languages recognized by probabilistic finite automata (PFAs) with cutpoint over the field of real numbers. More general computational models over the same field such as generalized finite automata (GFAs) and quantum finite automata (QFAs) define the same class. In 1963, Rabin proved the set of stochastic languages to be uncountable presenting a single 2-state PFA over the binary alphabet recognizing uncountably many languages depending on the cutpoint. In this paper, we show the same result for unary stochastic languages. Namely, we exhibit a 2-state unary GFA, a 2-state unary QFA, and a family of 3-state unary PFAs recognizing uncountably many languages; all th…
New Results on Vector and Homing Vector Automata
2019
We present several new results and connections between various extensions of finite automata through the study of vector automata and homing vector automata. We show that homing vector automata outperform extended finite automata when both are defined over $ 2 \times 2 $ integer matrices. We study the string separation problem for vector automata and demonstrate that generalized finite automata with rational entries can separate any pair of strings using only two states. Investigating stateless homing vector automata, we prove that a language is recognized by stateless blind deterministic real-time version of finite automata with multiplication iff it is commutative and its Parikh image is …
Fast Neural Machine Translation Implementation
2018
This paper describes the submissions to the efficiency track for GPUs at the Workshop for Neural Machine Translation and Generation by members of the University of Edinburgh, Adam Mickiewicz University, Tilde and University of Alicante. We focus on efficient implementation of the recurrent deep-learning model as implemented in Amun, the fast inference engine for neural machine translation. We improve the performance with an efficient mini-batching algorithm, and by fusing the softmax operation with the k-best extraction algorithm. Submissions using Amun were first, second and third fastest in the GPU efficiency track.
Automated Patch Assessment for Program Repair at Scale
2021
AbstractIn this paper, we do automatic correctness assessment for patches generated by program repair systems. We consider the human-written patch as ground truth oracle and randomly generate tests based on it, a technique proposed by Shamshiri et al., called Random testing with Ground Truth (RGT) in this paper. We build a curated dataset of 638 patches for Defects4J generated by 14 state-of-the-art repair systems, we evaluate automated patch assessment on this dataset. The results of this study are novel and significant: First, we improve the state of the art performance of automatic patch assessment with RGT by 190% by improving the oracle; Second, we show that RGT is reliable enough to h…
Joint Gaussian Processes for Biophysical Parameter Retrieval
2017
Solving inverse problems is central to geosciences and remote sensing. Radiative transfer models (RTMs) represent mathematically the physical laws which govern the phenomena in remote sensing applications (forward models). The numerical inversion of the RTM equations is a challenging and computationally demanding problem, and for this reason, often the application of a nonlinear statistical regression is preferred. In general, regression models predict the biophysical parameter of interest from the corresponding received radiance. However, this approach does not employ the physical information encoded in the RTMs. An alternative strategy, which attempts to include the physical knowledge, co…
Randomized Rx For Target Detection
2018
This work tackles the target detection problem through the well-known global RX method. The RX method models the clutter as a multivariate Gaussian distribution, and has been extended to nonlinear distributions using kernel methods. While the kernel RX can cope with complex clutters, it requires a considerable amount of computational resources as the number of clutter pixels gets larger. Here we propose random Fourier features to approximate the Gaussian kernel in kernel RX and consequently our development keep the accuracy of the nonlinearity while reducing the computational cost which is now controlled by an hyperparameter. Results over both synthetic and real-world image target detection…