AUTHOR

Mykola Pechenizkiy

0000-0003-4955-0743

Personalization of immediate feedback to learning styles

Feedback provided to a user is an important part of learning and interaction in e-learning systems. In this paper we present the results of a pilot experiment that studied the interrelation between several types of immediate feedback presentation and the learning styles (LSs) of users. In the experiment we used the feedback supported by the quiz module of the Moodle learning system. The results show tendencies in the interrelation between LSs and immediate/summative feedback presentation, and we suggest three hypotheses for future research.

research product

A holistic framework for understanding acceptance of remote patient management (RPM) systems by non-professional users

The successful integration of Information and Communication Technologies (ICT) in healthcare facilitates the use of sophisticated medical equipment and computer applications by medical practitioners. Whereas earlier medical systems were used mainly by health professionals (e.g. medical staff or nurses), with the appearance of the Internet, health systems are now becoming available to broader user groups, particularly patients and their families. eHealth has become an active research and development area within the healthcare industry. Another important tendency in the development of ICT for health is a shift from “hospital-centered” to “person-centered” health systems which can enable ma…

research product

Diversity in search strategies for ensemble feature selection

Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to achieve higher accuracy, which is often not achievable with single models. It was shown theoretically and experimentally that in order for an ensemble to be effective, it should consist of base classifiers that have diversity in their predictions. One technique, which proved to be effective for constructing an ensemble of diverse base classifiers, is the use of different feature subsets, or so-called ensemble feature selection. Many ensemble feature selection strategies incorporate diversity as an objective in the search for the best collection of feature subse…
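As a rough illustration of the idea described above (building ensemble members on different feature subsets and checking how diverse their predictions are), here is a minimal Python sketch. It uses random subspacing with a toy nearest-centroid base classifier and a plain pairwise disagreement measure. All function names are hypothetical, and neither the base classifier nor the diversity measure is taken from the paper; they are assumptions chosen to keep the example short.

```python
import random
from itertools import combinations


def random_subspaces(n_features, n_members, subset_size, seed=0):
    """Draw one random feature subset per ensemble member (random subspacing)."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_members)]


def nearest_centroid_fit(X, y, features):
    """Toy base classifier (an assumption): class centroids on the chosen features."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += row[f]
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, features, row):
    """Predict the label whose centroid is closest in the selected subspace."""
    def dist(c):
        return sum((row[f] - c[i]) ** 2 for i, f in enumerate(features))
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))


def pairwise_disagreement(predictions):
    """Average fraction of instances on which two members predict differently."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 0.0
    total = sum(sum(p != q for p, q in zip(a, b)) / len(a) for a, b in pairs)
    return total / len(pairs)


def majority_vote(predictions):
    """Combine member predictions per instance; ties go to the smallest label."""
    return [max(sorted(set(col)), key=col.count) for col in zip(*predictions)]
```

A search strategy for ensemble feature selection could use a score such as `pairwise_disagreement` alongside accuracy when comparing candidate collections of feature subsets; how the two are weighted is left open here.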

research product

The Challenge of Feedback Personalization to Learning Styles in a Web-Based Learning System

Feedback is information that is provided to a user to inform him/her about the result of his/her action and to motivate him/her to further interact with the system. In web-based learning systems (WBLS), feedback is particularly important in test and evaluation tasks. The main objective of the paper is twofold: (1) to encourage WBLS designers and specialists to pay more attention to the problem of feedback adaptation, and (2) to analyze suggestions for feedback personalization to learning styles in a WBLS.

research product

Towards more relevance-oriented data mining research

Data mining (DM) research has successfully developed advanced DM techniques and algorithms over the last few decades, and many organisations have great expectations of benefiting more from their data warehouses in decision making. Currently, most DM researchers still focus strongly on technology-oriented topics alone. DM research commonly has several stakeholders, the major ones of which can be divided into internal and external groups, each with its own point of view, and these views are at least partly conflicting. The most important internal groups of stakeholders are the DM research community and academics in other disciplines. The most important external stakeholder groups are manager…

research product

Feature extraction for supervised learning in knowledge discovery systems

Data mining aims to uncover regularities hidden in the mass of data in a database, regularities whose existence is not yet known. When the data in the database are highly multidimensional, with individual instances containing numerous features, the performance of many machine learning methods degrades decisively. This phenomenon is called the “curse of dimensionality”, because it often leads to growth in both the computational complexity of processing and the classification errors that arise during processing. On the other hand, irrelevant or only indirectly relevant features that may be present in the database provide a poor representation space for describing the conceptual structure of the database. …

research product

Class Noise and Supervised Learning in Medical Domains: The Effect of Feature Extraction

Inductive learning systems have been successfully applied in a number of medical domains. It is generally accepted that the highest accuracy results that an inductive learning system can achieve depend on the quality of data and on the appropriate selection of a learning algorithm for the data. In this paper we analyze the effect of class noise on supervised learning in medical domains. We review the related work on learning from noisy data and propose to use feature extraction as a pre-processing step to diminish the effect of class noise on the learning process. Our experiments with 8 medical datasets show that feature extraction indeed helps to deal with class noise. It clearly results i…

research product

Keynote Paper: Data Mining Researcher, Who is Your Customer? Some Issues Inspired by the Information Systems Field

Data mining as an applied research field still raises great expectations among organizations that want to increase the utility they get from their huge databases and data warehouses. There are too few success stories about organizations that have managed to satisfy even some of those expectations. This situation is very similar to the one in the information systems (IS) field, especially earlier but even now. The recent lively debate about the identity of the IS discipline also included an analysis of the customers of IS research. Inspired by IS researchers' insights related to the topic, we ask the question "who is our customer?" as data mining researchers. With…

research product

Local dimensionality reduction and supervised learning within natural clusters for biomedical data analysis

Inductive learning systems were successfully applied in a number of medical domains. Nevertheless, the effective use of these systems often requires data preprocessing before applying a learning algorithm. This is especially important for multidimensional heterogeneous data presented by a large number of features of different types. Dimensionality reduction (DR) is one commonly applied approach. The goal of this paper is to study the impact of natural clustering--clustering according to expert domain knowledge--on DR for supervised learning (SL) in the area of antibiotic resistance. We compare several data-mining strategies that apply DR by means of feature extraction or feature selection w…

research product

The impact of sample reduction on PCA-based feature extraction for supervised learning

"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic rise of computational complexity and classification error in high dimensions. In this paper, different feature extraction (FE) techniques are analyzed as means of dimensionality reduction and constructive induction with respect to the performance of the Naive Bayes classifier. When a data set contains a large number of instances, some sampling approach is applied to address the computational complexity of the FE and classification processes. The main goal of this paper is to show the impact of sample reduction on the process of FE for supervised learning. In our study we analyzed the conventional PC…

research product

Local dimensionality reduction within natural clusters for medical data analysis

Inductive learning systems have been successfully applied in a number of medical domains. Nevertheless, the effective use of these systems requires data preprocessing before applying a learning algorithm. This is especially important for multidimensional heterogeneous data represented by a large number of features of different types. Dimensionality reduction is one commonly applied approach. The goal of this paper is to study the impact of natural clustering on dimensionality reduction for classification. We compare several data mining strategies that apply dimensionality reduction by means of feature extraction or feature selection for subsequent classification. We show experimentally on micr…

research product

Effectiveness of local feature selection in ensemble learning for prediction of antimicrobial resistance

In the real world concepts are often not stable but change over time. A typical example of this in the biomedical context is antibiotic resistance, where pathogen sensitivity may change over time as pathogen strains develop resistance to antibiotics that were previously effective. This problem, known as concept drift (CD), complicates the task of learning a robust model. Different ensemble learning (EL) approaches (that instead of learning a single classifier try to learn and maintain a set of classifiers over time) have been shown to perform reasonably well in the presence of concept drift. In this paper we study how much local feature selection (FS) can improve ensemble performance for da…

research product

The influence of dataset size on the performance of cell outage detection approach in LTE-A networks

The configuration and maintenance of constantly evolving mobile cellular networks are becoming more and more complex and hence expensive. The Self-Organizing Networks (SON) concept is an umbrella term for the set of automated solutions for network operations proposed by the 3rd Generation Partnership Project (3GPP) group. Automated cell outage detection is one of the components of SON functionality. In earlier studies our research group developed a data-driven approach for the detection of malfunctioning cells. In this paper we investigate the performance of the proposed solution as a function of the density of active users and the size of the observation interval. The evaluation is conducted in Long Term E…

research product

Knowledge management challenges in knowledge discovery systems

Current knowledge discovery systems are armed with many data mining techniques that can potentially be applied to a new problem. However, a system faces the challenge of selecting the most appropriate technique(s) for the problem at hand, since in a real domain it is infeasible to compare all applicable techniques. The main goal of this paper is to consider the limitations of data-driven approaches and propose a knowledge-driven approach to enhance the use of multiple data-mining strategies in a knowledge discovery system. We introduce the concept of (meta-)knowledge management, which aims to organize a systematic process of (meta-)knowledge capture and refinement o…

research product

DOBRO: a prediction error correcting robot under drifts

We propose DOBRO, a light online learning module equipped with a smart correction policy that helps decide whether or not to correct a given prediction, depending on how likely the correction is to lead to better prediction performance. DOBRO is a standalone module that requires nothing more than a time series of prediction errors, and it is flexible enough to be integrated with any black-box model to improve its performance under drifts. We evaluated it in a real-world application, the bus arrival time prediction problem. The results show that DOBRO improved prediction performance significantly, while it did not hurt accuracy when no drift occurred.
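The abstract does not spell out DOBRO's correction policy, so the following Python sketch is only a hypothetical stand-in for the general idea: a module that sees nothing but the prediction errors of a black-box model and applies a correction only when the correction would have helped recently. The class name `ErrorCorrector`, the moving-average bias estimate, and the window-based decision rule are all assumptions, not the authors' method.

```python
from collections import deque


class ErrorCorrector:
    """Hypothetical online corrector that uses only past prediction errors."""

    def __init__(self, window=5):
        # Signed errors (actual - predicted) of the black-box model.
        self.errors = deque(maxlen=window)
        # Absolute errors without and with the correction, over the same window.
        self.raw_abs = deque(maxlen=window)
        self.corrected_abs = deque(maxlen=window)

    def _bias(self):
        """Moving average of recent signed errors (assumed correction term)."""
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def should_correct(self):
        """Correct only if correcting would have reduced recent absolute error."""
        if not self.corrected_abs:
            return False
        return sum(self.corrected_abs) < sum(self.raw_abs)

    def correct(self, prediction):
        """Return the corrected prediction, or the original if the policy says no."""
        if self.should_correct():
            return prediction + self._bias()
        return prediction

    def update(self, prediction, actual):
        """Record the outcome once the true value becomes known."""
        bias = self._bias()  # bias that was available before seeing `actual`
        self.raw_abs.append(abs(actual - prediction))
        self.corrected_abs.append(abs(actual - (prediction + bias)))
        self.errors.append(actual - prediction)
```

In use, `correct` is called on each black-box prediction before it is issued, and `update` is called once the true value arrives. With a persistent offset in the errors the sketch starts correcting after a couple of observations; with zero errors it leaves predictions untouched.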

research product

Handling local concept drift with dynamic integration of classifiers: domain of antibiotic resistance in nosocomial infections

In the real world concepts and data distributions are often not stable but change with time. This problem, known as concept drift, complicates the task of learning a model from data and requires special approaches, different from commonly used techniques, which treat arriving instances as equally important contributors to the target concept. Among the most popular and effective approaches to handle concept drift is ensemble learning, where a set of models built over different time periods is maintained and the best model is selected or the predictions of models are combined. In this paper we consider the use of an ensemble integration technique that helps to better handle concept drift at t…

research product

Quantile index for gradual and abrupt change detection from CFB boiler sensor data in online settings

In this paper we consider the problem of online detection of gradual and abrupt changes in sensor data with high levels of noise and outliers. We propose a simple heuristic method based on the Quantile Index (QI) and study how robust this method is for detecting both gradual and abrupt changes in such data. We evaluate the performance of our method on artificially generated and real datasets that represent different operational settings of a pilot circulating fluidized bed (CFB) reactor and a CFB cold model. Our experiments suggest that QI can be used for designing very simple yet effective methods for gradual change detection in noisy sensor data. It can also be used for detectin…

research product

Using Cellular Automata for feature construction - preliminary study

When first faced with a learning task, it is often not clear what a good representation of the training data should look like. We are often forced to create some set of features that appear plausible, without any strong confidence that they will yield superior learning. Besides, we often have no prior knowledge of which learning method is best to apply, and thus often try multiple methods in an attempt to find the one that performs best. This paper describes a new method, and its preliminary study, for constructing features based on cellular automata (CA). Our approach uses the self-organisation ability of cellular automata by constructing features being most efficient for making predic…

research product

Feedback adaptation in web-based learning systems

Feedback provided by a learning system to its users plays an important role in web-based education. This paper presents an overview of feedback studies and then concentrates on the problem of feedback adaptation in web-based learning systems. We introduce our taxonomy of the feedback concept with regard to its functions, complexity, intention, time of occurrence, way of presentation, and the level and way of its adaptation. We consider what can be adapted in feedback and how to facilitate feedback adaptation in web-based learning systems.

research product

Online mass flow prediction in CFB boilers with explicit detection of sudden concept drift

Fuel feeding and inhomogeneity of fuel typically cause fluctuations in the circulating fluidized bed (CFB) process. If control systems fail to compensate for the fluctuations, the whole plant will suffer from dynamics that are reinforced by the closed-loop controls. This phenomenon reduces the efficiency and the lifetime of process components. In this paper we address the problem of online mass flow prediction, which is a part of control. In particular, we consider the problem of learning an accurate predictor with explicit detection of abrupt concept drift and noise handling mechanisms. We emphasize the importance of having domain knowledge concerning the considered case and constructing the…

research product

Search strategies for ensemble feature selection in medical diagnostics

The goal of this paper is to propose, evaluate, and compare four search strategies for ensemble feature selection, and to consider their application to medical diagnostics, with a focus on the problem of the classification of acute abdominal pain. Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to get higher accuracy, sensitivity, and specificity, which are often not achievable with single models. One technique, which proved to be effective for ensemble construction, is feature selection. Lately, several strategies for ensemble feature selection were proposed, including random subspacing, hill-climbing-based se…

research product

Modelling Recurrent Events for Improving Online Change Detection

The task of online change point detection in sensor data streams is often complicated by the presence of noise, which can be mistaken for real changes and therefore degrade the performance of change detectors. Most existing change detection methods assume that changes are independent of each other and occur at random in time. In this paper we study how the performance of detectors can be improved in the case of recurrent changes. We analytically demonstrate under which conditions and for how long recurrence information is useful for improving the detection accuracy. We propose a simple, computationally efficient message passing procedure for calculating a predictive probability distribution of …

research product