RESEARCH PRODUCT

Semantics of Voids within Data: Ignorance-Aware Machine Learning

Vagan Terziyan, Anton Nikulin

subject

classification; spatial data; lcsh:G1-922; data mining; prototype selection; ignorance; machine learning; data semantics; adversarial learning; data voids; lcsh:Geography (General)

description

Operating with ignorance is an important concern of geographical information science when the objective is to discover knowledge from imperfect spatial data. Data mining (driven by knowledge discovery tools) processes available (observed, known, and understood) samples of data with the aim of building a model (e.g., a classifier) that can handle data samples that are not yet observed, known, or understood. These tools traditionally take semantically labeled samples of the available data (known facts) as input for learning. We want to challenge the indispensability of this approach, and we suggest considering things the other way around. What if the task were as follows: how to build a model based on the semantics of our ignorance, i.e., by processing the shape of “voids” within the available data space? Can we improve traditional classification by also modeling the ignorance? In this paper, we provide some algorithms for the discovery and visualization of ignorance zones in two-dimensional data spaces and design two ignorance-aware smart prototype selection techniques (incremental and adversarial) to improve the performance of nearest neighbor classifiers. We present experiments with artificial and real datasets to test the usefulness of ignorance semantics discovery.

Peer reviewed.
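
To make the idea of "voids" in a two-dimensional data space more concrete, the following is a minimal illustrative sketch, not the algorithms from the paper. It assumes a simple working definition: a location counts as a void if it lies farther than a chosen radius from every labeled sample. The names `nearest`, `void_cells`, `classify`, `SAMPLES`, and the `radius` and `steps` parameters are all hypothetical and introduced only for this illustration.

```python
import math

# Hypothetical toy data: ((x, y), label) pairs forming two small clusters.
SAMPLES = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.1), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.0), "b"),
]


def nearest(point, samples):
    """Return (distance, label) of the labeled sample closest to `point`."""
    return min((math.dist(point, xy), label) for xy, label in samples)


def void_cells(samples, steps=10, radius=0.25):
    """Scan a regular grid over the bounding box of the samples and return
    the grid points lying farther than `radius` from every sample.
    Assumption: this distance threshold stands in for an 'ignorance zone'."""
    xs = [xy[0] for xy, _ in samples]
    ys = [xy[1] for xy, _ in samples]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    cells = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            p = (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * j / steps)
            if nearest(p, samples)[0] > radius:
                cells.append(p)
    return cells


def classify(point, samples, radius=0.25):
    """1-nearest-neighbor classification that abstains inside a void."""
    distance, label = nearest(point, samples)
    return label if distance <= radius else "unknown"


if __name__ == "__main__":
    print(classify((0.05, 0.0), SAMPLES))   # close to cluster "a"
    print(classify((0.5, 0.5), SAMPLES))    # far from both clusters
    print(len(void_cells(SAMPLES)), "void grid points")
```

In this sketch the classifier simply abstains inside a void. The paper instead uses ignorance zones to guide prototype selection for nearest neighbor classifiers, which this example does not attempt to reproduce.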

10.3390/ijgi10040246
https://www.mdpi.com/2220-9964/10/4/246