Search results for "Data type"
showing 10 items of 1183 documents
Hierarchically nested factor model from multivariate data
2005
We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap based procedure to reduce the number of factors in the model with the aim of retaining only those factors significantly robust with respect to the statistical uncertainty due to the finite length of data records.
Online Density Estimation of Heterogeneous Data Streams in Higher Dimensions
2016
The joint density of a data stream is suitable for performing data mining tasks without having access to the original data. However, the methods proposed so far only target a small to medium number of variables, since their estimates rely on representing all the interdependencies between the variables of the data. High-dimensional data streams, which are becoming more and more frequent due to increasing numbers of interconnected devices, are, therefore, pushing these methods to their limits. To mitigate these limitations, we present an approach that projects the original data stream into a vector space and uses a set of representatives to provide an estimate. Due to the structure of the est…
Reverse-Safe Text Indexing
2021
We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matc…
Verbal ordinal classification with multicriteria decision aiding
2008
Professionals in neuropsychology usually perform diagnoses of patients’ behaviour in a verbal rather than in a numerical form. This fact generates interest in decision support systems that process verbal data. It also motivates us to develop methods for the classification of such data. In this paper, we describe ways of aiding classification of a discrete set of objects, evaluated on a set of criteria that may have verbal estimations, into ordered decision classes. In some situations, there is no explicit additional information available, while in others it is possible to order the criteria lexicographically. We consider both of these cases. The proposed Dichotomic Classification (DC…
Rough Set Theory for Supporting Decision Making on Relevance in Browsing Multilingual Digital Resources
2017
Browsing digital library (DL) collections seems to pose a challenge for a user owing to a number of factors, such as the operability of the system, interface readability or clarity and the retrieval efficiency directly related to it, or the number of digital items within the user’s domain. However, when it comes to searching for an item in a language foreign to the user, the number of factors rises further, which translates proportionally into a growing number of clicks needed to retrieve the target item. Such a procedure usually discourages the user from browsing the digital collections. Our study into the user’s behavior interacting with a multilingual DL system is set…
Prediction of arrival times and human resources allocation for container terminal
2011
Increasing competition in the container shipping sector has meant that terminals are having to equip themselves with increasingly accurate analytical and governance tools. A transhipment terminal is an extremely complex system in terms of both organisation and management. Added to the uncertainty surrounding ships’ arrival time in port and the costs resulting from over- or underestimation of resources is the large number of constraints and variables involved in port activities. Predicting ships’ delays in advance means that the relative demand for each shift can be determined with greater accuracy, and the basic resources then allocated to satisfy that demand. To this end, in this article we pro…
Cholesky decomposition-based definition of atomic subsystems in electronic structure calculations
2010
Decomposing the Hartree-Fock one-electron density matrix and a virtual pseudodensity matrix, we obtain an orthogonal set of normalized molecular orbitals with local character to be used in post-Hartree-Fock calculations. The applicability of the procedure is illustrated by calculating CCSD(T) energies and CCSD molecular properties in reduced active spaces. © 2010 American Institute of Physics.
Design principles for learning analytics information systems in higher education
2020
This paper reports a design science research (DSR) study that develops, demonstrates and evaluates a set of design principles for information systems (IS) that utilise learning analytics to support learning and teaching in higher education. The initial set of design principles is created from theory-inspired conceptualisation based on the literature, and they are evaluated and revised through a DSR process of demonstration and evaluation. We evaluated the developed artefact in four courses with a total enrolment of 1,173 students. The developed design principles for learning analytics information systems (LAIS) to establish a foundation for further development and implementation of learning…
A Branch-Price-and-Cut Algorithm for the Min-Max k-Vehicle Windy Rural Postman Problem
2013
The min-max k-vehicle windy rural postman problem consists of minimizing the maximal distance traveled by a vehicle to find a set of balanced routes that jointly service all the required edges in a windy graph. This is a very difficult problem, for which a branch-and-cut algorithm has already been proposed, providing good results when the number of vehicles is small. In this article, we present a branch-price-and-cut method capable of obtaining optimal solutions for this problem when the number of vehicles is larger for the same set of required edges. Extensive computational results on instances from the literature are presented.
CADEM: calculate X-ray diffraction of epitaxial multilayers
2017
This article presents a powerful yet simple program, based on the general one-dimensional kinematic X-ray diffraction (XRD) theory, which calculates the XRD patterns of tailor-made multilayers and thus enables quantitative comparison of measured and calculated XRD data. As the multilayers are constructed layer by layer, the final material stack can be entirely arbitrary.