0000000000172821
AUTHOR
Seppo Puuronen
Personalization of immediate feedback to learning styles
Feedback provided to a user is an important part of learning and interaction in e-learning systems. In this paper we present the results of a pilot experiment that studied the interrelation between several types of immediate feedback presentation and the learning styles (LSs) of users. In the experiment we used the feedback supported by the quiz module of the Moodle learning system. The results demonstrate tendencies in the interrelation between LS and immediate/summative feedback presentation, and we suggest three hypotheses for future research.
A Multi-Expert Based Approach to Continuous Authentication of Mobile-Device Users
The PIN-based user authentication currently used in mobile devices cannot provide a sufficient level of security. Methods based on multi-modal user authentication involving biometrics (i.e. physical and behavioral characteristics of a person) may be employed to cope with this problem. However, when they deal with physical characteristics only, these methods are either unable to provide continuous and user-friendly identity verification, or are resource consuming.
A holistic framework for understanding acceptance of remote patient management (RPM) systems by non-professional users
The successful integration of Information and Communication Technologies (ICT) in healthcare facilitates the use of sophisticated medical equipment and computer applications by medical practitioners. Whereas earlier medical systems were mainly used by health professionals (e.g. medical staff or nurses), nowadays, with the appearance of the Internet, health systems are becoming available to broader user groups, particularly patients and their families. eHealth has become an active research and development area within the healthcare industry. Another important tendency in the development of ICT for health is a shift from “hospital-centered” to “person-centered” health systems which can enable ma…
The role of Ukrainian universities in the development of the global information society
This paper presents an observation of the positive experience obtained by Kharkov State Technical University in the area of educating specialists for the development of an information society programme. Economic problems postponed the beginning of the information society programme in Ukraine. Nevertheless, Ukraine now has good possibilities to apply the best experience of western universities and speed up the process of transferring existing education to the European level. The paper also presents an analysis of the main trends that are taking place in that education. The paper uses the example of one of the leading Ukrainian technical universities to show the possible ways of positive chan…
A Similarity Evaluation Technique for Cooperative Problem Solving with a Group of Agents
Evaluation of distance or similarity is very important in cooperative problem solving with a group of agents. Distance between problems is used by agents to recognize nearest solved problems for a new problem, distance between solutions is necessary to compare and evaluate the solutions made by different agents, and distance between agents is useful to evaluate weights of the agents to be able to integrate them by weighted voting. The goal of this paper is to develop a similarity evaluation technique to be used for cooperative problem solving with a group of agents. Virtual training environment used for this goal is represented by predicates that define relationships within three sets: prob…
Estimation of Uncertain Relations between Indeterminate Temporal Points
Many database applications need to manage temporal information and sometimes to estimate relations between indeterminate temporal points. Indeterminacy means that we do not know exactly when a particular event happened. In this case, temporal points can be defined within some temporal intervals. Measurements of these intervals are not necessarily based on exactly synchronized clocks, and, therefore, possible measurement errors need to be taken into account when estimating the temporal relation between two indeterminate points. This paper presents an approach to calculate the probabilities of the basic relations (before, at the same time, and after) between any two indeterminate temporal poi…
Dynamic Integration of Decision Committees
Decision committee learning has demonstrated outstanding success in reducing classification error with an ensemble of classifiers. In a way a decision committee is a classifier formed upon an ensemble of subsidiary classifiers. Voting, which is commonly used to produce the final decision of committees has, however, a shortcoming. It is unable to take into account local expertise. When a new instance is difficult to classify, then it easily happens that only the minority of the classifiers will succeed, and the majority voting will quite probably result in a wrong classification. We suggest that dynamic integration of classifiers is used instead of majority voting in decision committees. Our…
The Challenge of Feedback Personalization to Learning Styles in a Web-Based Learning System
Feedback is information that is provided to a user to inform him/her about the result of his/her action and to motivate him/her to further interact with the system. In web-based learning systems (WBLS), feedback is particularly important in test and evaluation tasks. The main objective of the paper is twofold: (1) to encourage WBLS designers and specialists to pay more attention to the problem of feedback adaptation, and (2) to analyze suggestions for feedback personalization to learning styles in a WBLS.
Multi-State System in human reliability analysis
This paper considers the application of the mathematical model of a Multi-State System to human reliability analysis. It describes a new method that uses Dynamic Reliability Indices to estimate how changes in the states of more than one component influence Multi-State System reliability. Multi-State System failure is considered as depending on a decrease in the efficiency of some system components, and Multi-State System repair as depending on the replacement of some failed components. The mathematical approach of Logical Differential Calculus is used to analyse the change in Multi-State System reliability caused by modifications of the states of some system components.
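As a rough illustration of the Logical Differential Calculus mentioned in this abstract, the sketch below enumerates the state vectors at which a change in one component's state changes the system state, which is the idea behind a direct partial logic derivative. The series structure function `phi`, the three-state components, and all function and parameter names are assumptions made for the example; they are not taken from the paper.

```python
from itertools import product


def phi(x):
    """Hypothetical structure function: a series system is only as good
    as its worst component (assumption, not the paper's model)."""
    return min(x)


def direct_partial_derivative(phi, n, m, i, a, b, j, h):
    """Return the state vectors (with component i fixed at state a) for which
    changing component i from a to b changes the system state from j to h.

    n: number of components, m: number of states per component (0..m-1).
    """
    hits = []
    # Enumerate every combination of states of the other n-1 components.
    for rest in product(range(m), repeat=n - 1):
        before = rest[:i] + (a,) + rest[i:]
        after = rest[:i] + (b,) + rest[i:]
        if phi(before) == j and phi(after) == h:
            hits.append(before)
    return hits


# Two components, three states each: where does degrading component 0
# from state 1 to state 0 pull the system from state 1 down to state 0?
vectors = direct_partial_derivative(phi, n=2, m=3, i=0, a=1, b=0, j=1, h=0)
```

Weighting each returned vector by its probability would give one possible reliability index for that component change; how the paper actually defines its Dynamic Reliability Indices is not shown here.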
The decision support system for telemedicine based on multiple expertise
This paper discusses the application of artificial intelligence in telemedicine and some of our research results in this area. The main goal of our research is to develop methods and systems to collect, analyse, distribute and use medical diagnostics knowledge from multiple knowledge sources and areas of expertise. The use of modern communication tools enables a physician to collect and analyse information obtained from experts worldwide with the help of a decision support medical system. In this paper we discuss a multilevel representation and processing of medical data using a system which evaluates and exploits knowledge about the behaviour of statistical diagnostics methods. The presented te…
Decision support system for telemedicine based on multiple expertise
This paper discusses results of research in the area of artificial intelligence applications in telemedicine. The main goal of the research is to manage multiple expertise obtained from expert physicians in different countries in order to develop a broad-purpose medical decision support system based on telecommunication tools. The multilevel representation of medical data is discussed based on the apparatus of metastatistics. The technique is able to acquire semantically essential information from the complex dynamics of quasi-periodical medical signals by recursively applying ordinary statistical tools. A voting-type technique is used to find consensus among medical experts in their description …
Reasoning with Multilevel Contexts in Semantic Metanetworks
It is generally accepted that knowledge has a contextual component. Acquisition, representation, and exploitation of knowledge in context would have a major contribution in knowledge representation, knowledge acquisition, and explanation, as Brezillon and Abu-Hakima supposed in [Brezillon and Abu-Hakima, 1995]. Among the advantages of the use of contexts in knowledge representation and reasoning Akman and Surav [Akman and Surav, 1996] mentioned the following: economy of representation, more competent reasoning, allowance for inconsistent knowledge bases, resolving of lexical ambiguity and flexible entailment. Brezillon and Cases noticed however in [Brezillon and Cases, 1995] that knowledge-…
Correlation-Based and Contextual Merit-Based Ensemble Feature Selection
Recent research has proved the benefits of using an ensemble of diverse and accurate base classifiers for classification problems. In this paper the focus is on producing diverse ensembles with the aid of three feature selection heuristics based on two approaches: correlation-based and contextual merit-based. We have developed an algorithm and experimented with it to evaluate and compare the three feature selection heuristics on ten data sets from the UCI Repository. On average, the simple correlation-based ensemble is superior in accuracy. The contextual merit-based heuristics seem to include too many features in the initial ensembles, and iterations were most successful with them.
Ensemble Feature Selection Based on Contextual Merit and Correlation Heuristics
Recent research has proven the benefits of using ensembles of classifiers for classification problems. Ensembles of diverse and accurate base classifiers are constructed by machine learning methods manipulating the training sets. One way to manipulate the training set is to use feature selection heuristics generating the base classifiers. In this paper we examine two of them: correlation-based and contextual merit-based heuristics. Both rely on quite similar assumptions concerning heterogeneous classification problems. Experiments are conducted on several data sets from the UCI Repository. We construct a fixed number of base classifiers over selected feature subsets and refine the ensemble iter…
Using continuous user authentication to detect masqueraders
Nowadays computer and network intrusions have become more common and more complicated, challenging the intrusion detection systems. Also, network traffic has been constantly increasing. As a consequence, the amount of data to be processed by an intrusion detection system has been growing, making it difficult to efficiently detect intrusions online. This paper proposes an approach for continuous user authentication based on the user's behaviour, aiming at the development of an efficient and portable anomaly intrusion detection system. A prototype of a host-based intrusion detection system was built. It detects masqueraders by comparing the current user behaviour with his/her stored behavioural model. The m…
Towards more relevance-oriented data mining research
Data mining (DM) research has successfully developed advanced DM techniques and algorithms over the last few decades, and many organisations have great expectations of benefiting more from their data warehouses in decision making. Currently, the strong focus of most DM researchers is still only on technology-oriented topics. DM research commonly has several stakeholders, the major ones of which can be divided into internal and external groups, each having its own point of view, and these points of view are at least partly conflicting. The most important internal groups of stakeholders are the DM research community and academics in other disciplines. The most important external stakeholder groups are manager…
Class Noise and Supervised Learning in Medical Domains: The Effect of Feature Extraction
Inductive learning systems have been successfully applied in a number of medical domains. It is generally accepted that the highest accuracy results that an inductive learning system can achieve depend on the quality of data and on the appropriate selection of a learning algorithm for the data. In this paper we analyze the effect of class noise on supervised learning in medical domains. We review the related work on learning from noisy data and propose to use feature extraction as a pre-processing step to diminish the effect of class noise on the learning process. Our experiments with 8 medical datasets show that feature extraction indeed helps to deal with class noise. It clearly results i…
Local Feature Selection with Dynamic Integration of Classifiers
Multidimensional data is often heterogeneous in its feature space, so that individual features have unequal importance in different subareas of the feature space. This motivates the search for a technique that provides a strategic splitting of the instance space and is able to identify the best subset of features for each instance to be classified. Our technique applies the wrapper approach, where a classification algorithm is used as an evaluation function to differentiate between different feature subsets. In order to make the feature selection local, we apply the recent technique for dynamic integration of classifiers. This allows us to determine which classifier and which feature subset should be us…
Expanding context against weighted voting of classifiers
In the paper we propose a new method to integrate the predictions of multiple classifiers for Data Mining and Machine Learning tasks. The method assumes that each classifier stands in its own context, and the contexts are partially ordered. The order is defined by a monotonic quality function that maps each context to a value from the interval [0,1]. A classifier whose context has better quality is supposed to predict better than a classifier whose context has worse quality. The objective is to generate the opinion of a 'virtual' classifier that stands in the context with quality equal to 1. This virtual classifier must have the best prediction accuracy due to the best context. To do thi…
Feature Selection for Ensembles of Simple Bayesian Classifiers
A popular method for creating an accurate classifier from a set of training data is to train several classifiers, and then to combine their predictions. The ensembles of simple Bayesian classifiers have traditionally not been a focus of research. However, the simple Bayesian classifier has much broader applicability than previously thought. Besides its high classification accuracy, it also has advantages in terms of simplicity, learning speed, classification speed, storage space, and incrementality. One way to generate an ensemble of simple Bayesian classifiers is to use different feature subsets as in the random subspace method. In this paper we present a technique for building ensembles o…
Fuzzy Classifier Based on Fuzzy Decision Tree
A popular method for making a fuzzy decision tree for classification is the Fuzzy ID3 algorithm. We introduce a new approach that uses cumulative information estimations of initial data. Based on these estimations we propose a new greedy version of the Fuzzy ID3 algorithm to be used to generate understandable fuzzy classification rules. The goal is to find a sequence of rules that causes near minimal classification costs.
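For orientation, the sketch below shows the attribute-scoring step of a conventional Fuzzy ID3, where class frequencies are replaced by sums of membership degrees. It does not implement the cumulative information estimations or the cost-based greedy variant proposed in this abstract. The function names, the dictionary layout of `term_mu`, and the use of the product to combine memberships are assumptions made for the example.

```python
import math
from collections import defaultdict


def fuzzy_entropy(memberships, labels):
    """Entropy of the class distribution in a fuzzy node, where each example
    contributes its membership degree instead of a full count of 1."""
    totals = defaultdict(float)
    for mu, label in zip(memberships, labels):
        totals[label] += mu
    mass = sum(totals.values())
    if mass == 0:
        return 0.0
    entropy = 0.0
    for t in totals.values():
        if t > 0:
            p = t / mass
            entropy -= p * math.log2(p)
    return entropy


def fuzzy_information_gain(node_mu, labels, term_mu):
    """Gain of splitting a fuzzy node on an attribute.

    node_mu: membership of each example in the current node.
    term_mu: hypothetical layout {fuzzy term: membership of each example in it}.
    Memberships are combined with the product (an assumption).
    """
    base = fuzzy_entropy(node_mu, labels)
    branches = {
        term: [a * b for a, b in zip(node_mu, mus)]
        for term, mus in term_mu.items()
    }
    total = sum(sum(v) for v in branches.values())
    if total == 0:
        return 0.0
    expected = sum(
        sum(v) / total * fuzzy_entropy(v, labels) for v in branches.values()
    )
    return base - expected


labels = ["yes", "yes", "no", "no"]
gain = fuzzy_information_gain(
    [1.0, 1.0, 1.0, 1.0],
    labels,
    {"low": [1.0, 1.0, 0.0, 0.0], "high": [0.0, 0.0, 1.0, 1.0]},
)
```

A tree builder would call `fuzzy_information_gain` for every candidate attribute at a node and split on the one with the highest gain.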
Evaluating Classifiers for Mobile-Masquerader Detection
As a result of the impersonation of a user of a mobile terminal, sensitive information kept locally or accessible over the network can be abused. The means of masquerader detection are therefore needed to detect the cases of impersonation. In this paper, the problem of mobile-masquerader detection is considered as a problem of classifying the user behaviour as originating from the legitimate user or someone else. Different behavioural characteristics are analysed by designated one-class classifiers whose classifications are combined. The paper focuses on selecting the classifiers for mobile-masquerader detection. The selection process is conducted in two phases. First, the classification ac…
Arbiter Meta-Learning with Dynamic Selection of Classifiers and its Experimental Investigation
In data mining, the selection of an appropriate classifier to estimate the value of an unknown attribute for a new instance has an essential impact on the quality of the classification result. Recently, promising approaches using parallel and distributed computing have been presented. In this paper, we consider an approach that uses classifiers trained on a number of data subsets in parallel as in the arbiter meta-learning technique. We suggest that information is collected during the learning phase about the performance of the included base classifiers and arbiters and that this information is used during the application phase to select the best classifier dynamically. We evaluate our techn…
Estimating Accuracy of Mobile-Masquerader Detection Using Worst-Case and Best-Case Scenario
In order to resist an unauthorized use of the resources accessible through mobile terminals, masquerader detection means can be employed. In this paper, the problem of mobile-masquerader detection is approached as a classification problem, and the detection is performed by an ensemble of one-class classifiers. Each classifier compares a measure describing user behavior or environment with the profile accumulating the information about past behavior and environment. The accuracy of classification is empirically estimated by experimenting with a dataset describing the behavior and environment of two groups of mobile users, where the users within groups are affiliated with each other. It is as…
Characteristics and Measures for Mobile-Masquerader Detection
Personal mobile devices, such as mobile phones, smartphones, and communicators, can be easily lost or stolen. Due to the functional abilities of these devices, their use by an unintended person may result in a severe security incident concerning private or corporate data and services. Organizations develop their security policy and mobilize preventive techniques against unauthorized use. Current solutions, however, are still breakable and there is still a strong need for means to detect user substitution when it happens. A crucial issue in designing such means is to define what measures to monitor.
Dynamic integration of multiple data mining techniques in a knowledge discovery management system
One of the most important directions in the improvement of data mining and knowledge discovery is the integration of multiple classification techniques of an ensemble of classifiers. An integration technique should be able to estimate and select the most appropriate component classifiers from the ensemble. We present two variations of an advanced dynamic integration technique with two distance metrics. The technique is one variation of the stacked generalization method, with an assumption that each of the component classifiers is the best one inside a certain subarea of the entire domain area. Our technique includes two phases: the learning phase and the application phase. During the learnin…
The impact of sample reduction on PCA-based feature extraction for supervised learning
"The curse of dimensionality" is pertinent to many learning algorithms, and it denotes the drastic rise in computational complexity and classification error in high dimensions. In this paper, different feature extraction (FE) techniques are analyzed as means of dimensionality reduction, and constructive induction with respect to the performance of the Naive Bayes classifier. When a data set contains a large number of instances, some sampling approach is applied to address the computational complexity of FE and classification processes. The main goal of this paper is to show the impact of sample reduction on the process of FE for supervised learning. In our study we analyzed the conventional PC…
Bagging and Boosting with Dynamic Integration of Classifiers
One approach in classification tasks is to use machine learning techniques to derive classifiers using learning instances. The co-operation of several base classifiers as a decision committee has succeeded in reducing classification error. The main current decision committee learning approaches, boosting and bagging, use resampling with the training set, and they can be used with different machine learning techniques which derive base classifiers. Boosting uses a kind of weighted voting and bagging uses equal weight voting as a combining method. Neither takes into account the local aspects that the base classifiers may have inside the problem space. We have proposed a dynamic integration tech…
Learning Temporal Regularities of User Behavior for Anomaly Detection
The fast expansion of inexpensive computers and computer networks has dramatically increased the number of computer security incidents in recent years. While many computer systems are still vulnerable to numerous attacks, intrusion detection has become vitally important as a response to the constantly increasing number of threats. In this paper we discuss an approach to discover temporal and sequential regularities in user behavior. We present an algorithm that allows creating and maintaining user profiles relying not only on sequential information but also taking into account temporal features, such as events' lengths and possible temporal relations between them. The constructed profiles represent …
A framework for behavior-based detection of user substitution in a mobile context
Personal mobile devices, such as mobile phones, smartphones, and communicators can be easily lost or stolen. Due to the functional abilities of these devices, their use by unintended persons may result in severe security breaches concerning private or corporate data and services. Organizations develop their security policy and employ preventive techniques to combat unauthorized use. Current solutions, however, are still breakable and there is a strong need for means to detect user substitution when it happens. A crucial issue in designing such means is to define the measures to be monitored. In this paper, a structured conceptual framework for mobile-user substitution detection is proposed.…
Knowledge management challenges in knowledge discovery systems
Current knowledge discovery systems are armed with many data mining techniques that can be potentially applied to a new problem. However, a system faces a challenge of selecting the most appropriate technique(s) for a problem at hand, since in the real domain area it is infeasible to perform a comparison of all applicable techniques. The main goal of this paper is to consider the limitations of data-driven approaches and propose a knowledge-driven approach to enhance the use of multiple data-mining strategies in a knowledge discovery system. We introduce the concept of (meta-) knowledge management, which is aimed to organize a systematic process of (meta-) knowledge capture and refinement o…
Ensemble feature selection with the simple Bayesian classification
A popular method for creating an accurate classifier from a set of training data is to build several classifiers, and then to combine their predictions. The ensembles of simple Bayesian classifiers have traditionally not been a focus of research. One way to generate an ensemble of accurate and diverse simple Bayesian classifiers is to use different feature subsets generated with the random subspace method. In this case, the ensemble consists of multiple classifiers constructed by randomly selecting feature subsets, that is, classifiers constructed in randomly chosen subspaces. In this paper, we present an algorithm for building ensembles of simple Bayesian classifiers in random sub…
Knowledge Acquisition Based on Semantic Balance of Internal and External Knowledge
This paper presents a strategy to handle incomplete knowledge during the acquisition process. The goal of this research is to develop formal tools that make use of the law of semantic balance. It is assumed that a situation inside the object's boundary in some world should be in balance with a situation outside it. This means that continuous cognition of an object aspires to complete knowledge about it, and knowledge about the internal structure of the object will be in balance with knowledge about the relationships of the object with other objects in its environment. It is supposed that one way to discover incompleteness of knowledge about some object is to measure and compare knowledge about its …
Distance functions in dynamic integration of data mining techniques
One of the most important directions in the improvement of data mining and knowledge discovery is the integration of multiple data mining techniques. An integration method needs to be able either to evaluate and select the most appropriate data mining technique or to combine two or more techniques efficiently. A recent integration method for the dynamic integration of multiple data mining techniques is based on the assumption that each of the data mining techniques is the best one inside a certain subarea of the whole domain area. This method uses an instance-based learning approach to collect information about the competence areas of the mining techniques and applies a distance function to…
Handling local concept drift with dynamic integration of classifiers: domain of antibiotic resistance in nosocomial infections
In the real world concepts and data distributions are often not stable but change with time. This problem, known as concept drift, complicates the task of learning a model from data and requires special approaches, different from commonly used techniques, which treat arriving instances as equally important contributors to the target concept. Among the most popular and effective approaches to handle concept drift is ensemble learning, where a set of models built over different time periods is maintained and the best model is selected or the predictions of models are combined. In this paper we consider the use of an ensemble integration technique that helps to better handle concept drift at t…
Handling Context-Sensitive Temporal Knowledge from Multiple Differently Ranked Sources
In this paper we develop one way to represent and reason with temporal relations in the context of multiple experts. Every relation between temporal intervals consists of four endpoint relations. It is supposed that the known context is the value of every expert's competence concerning every endpoint relation. Thus the context for an interval temporal relation is a kind of compound expert rank, which has four components, one for each endpoint relation of the intervals. The context is updated after every new opinion is added to the previous opinions about a certain temporal relation. The context of a temporal relation collects all support given by different experts to all compon…
Dynamic reliability indices for k-out-of-n multi-state system
The k-out-of-n multi-state system is one of the basic models in reliability analysis. In this system, k is the minimum number of the n components that must work for the system to work, and both the system and its components can have more than two states. A structure function uniquely defines the relation between system and component states. A new algorithm for reliability analysis of the k-out-of-n multi-state system is proposed in this paper. We use two tools to examine this system: (a) a structure function for the system description; (b) direct partial logic derivatives for the analysis of this system. The new algorithm for reliability analysis of the k-out-of-n multi-state system makes it possible to calculate the probabi…
Knowledge Acquisition from Multiple Experts Based on Semantics of Concepts
This paper presents one approach to acquire knowledge from multiple experts. The experts are grouped into a multilevel hierarchical structure, according to the type of knowledge acquired. The first level consists of experts who have knowledge about the basic objects and their relationships. The second level of experts includes those who have knowledge about the relationships of the experts at the first level and each higher level accordingly. We show how to derive the most supported opinion among the experts at each level. This is used to order the experts into categories of their competence defined as the support they get from their colleagues.
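A minimal sketch of the two steps named in this abstract, for a single level of experts: find the most supported opinion, then order the experts by the support they receive from colleagues. Measuring support as the number of colleagues who share an expert's opinion is an assumption made for the example, as are the function names and the sample data; the paper's own semantics-based measure may differ.

```python
from collections import Counter


def most_supported_opinion(opinions):
    """opinions: {expert name: opinion}. Return the opinion held by most experts."""
    counts = Counter(opinions.values())
    return counts.most_common(1)[0][0]


def rank_by_support(opinions):
    """Order experts by how many colleagues share their opinion
    (assumed competence measure), breaking ties alphabetically."""
    counts = Counter(opinions.values())
    support = {expert: counts[op] - 1 for expert, op in opinions.items()}
    return sorted(support, key=lambda expert: (-support[expert], expert))


# Hypothetical opinions of three experts about one relationship.
opinions = {"ann": "before", "bob": "before", "eve": "after"}
consensus = most_supported_opinion(opinions)
ranking = rank_by_support(opinions)
```

For a multilevel hierarchy, the same two functions could be applied level by level, with the experts at each higher level giving opinions about the relationships of the experts below them.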
Feedback adaptation in web-based learning systems
Feedback provided by a learning system to its users plays an important role in web-based education. This paper presents an overview of feedback studies and then concentrates on the problem of feedback adaptation in web-based learning systems. We introduce our taxonomy of feedback concept with regard to its functions, complexity, intention, time of occurrence, way of presentation, and level and way of its adaptation. We consider what can be adapted in feedback and how to facilitate feedback adaptation in web-based learning systems.
A dynamic integration algorithm for an ensemble of classifiers
Numerous data mining methods have recently been developed, and there is often a need to select the most appropriate data mining method or methods. The method selection can be done statically or dynamically. Dynamic selection takes into account characteristics of a new instance and usually results in higher classification accuracy. We discuss a dynamic integration algorithm for an ensemble of classifiers. Our algorithm is a new variation of the stacked generalization method and is based on the basic assumption that each basic classifier is best inside certain subareas of the application domain. The algorithm includes two main phases: a learning phase, which collects information about the qua…
Modelling Dependencies Between Classifiers in Mobile Masquerader Detection
The unauthorised use of mobile terminals may result in an abuse of sensitive information kept locally on the terminals or accessible over the network. Therefore, there is a need for security means capable of detecting the cases when the legitimate user of the terminal is substituted. The problem of user substitution detection is considered in the paper as a problem of classifying the behaviour of the person interacting with the terminal as originating from the user or someone else. Different aspects of behaviour are analysed by designated one-class classifiers whose classifications are subsequently combined. A modification of majority voting that takes into account some of the dependencies …
Mobile information systems - executives' view
The concepts of travelling executive and executives' mobile information system are first defined in this paper. The main findings are in the form of collected data and opinions concluded from our personal discussions with 49 executives in the United Kingdom, France, Italy and Finland concerning the nature of the work of executives and the usage of information technology (IT) to support their work today. The near future expectations of the executives are also analysed, especially concerning the mobile use of IT services in order to construct executives' holistic view of mobile computing. The use of IT services was found to be very widespread. Big differences were found: some of the executive…
Search strategies for ensemble feature selection in medical diagnostics
The goal of this paper is to propose, evaluate, and compare four search strategies for ensemble feature selection, and to consider their application to medical diagnostics, with a focus on the problem of the classification of acute abdominal pain. Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. Ensembles allow us to get higher accuracy, sensitivity, and specificity, which are often not achievable with single models. One technique, which proved to be effective for ensemble construction, is feature selection. Lately, several strategies for ensemble feature selection were proposed, including random subspacing, hill-climbing-based se…
Ensemble Feature Selection Based on the Contextual Merit
Recent research has proved the benefits of using ensembles of classifiers for classification problems. Ensembles constructed by machine learning methods manipulating the training set are used to create diverse sets of accurate classifiers. Different feature selection techniques based on applying different heuristics for generating base classifiers can be adjusted to specific domain characteristics. In this paper we consider and experiment with the contextual feature merit measure as a feature selection heuristic. We use the diversity of an ensemble as evaluation function in our new algorithm with a refinement cycle. We have evaluated our algorithm on seven data sets from UCI. The experiment…