Search results for "Data modeling"
showing 10 items of 112 documents
Fusing storage and computing for the domain of business intelligence and analytics: research opportunities
2015
With the growing importance of external and shared data, the set of requirements for Business Intelligence and Analytics (BIA) is shifting. Current solutions still come with shortcomings, especially in multi-stakeholder environments where sensitive content is exchanged. We argue that a new level in the evolution of BIA can be unlocked by tearing down the barriers between storage and computing based on upcoming storage technologies. In particular, we propose a revitalization of ideas from object-oriented databases. We present results from a joint project that aimed at delineating design options for BIA solutions built upon this idea. The paper outlines the interplay of various architectural layers…
A Hidden Markov Model for Automatic Generation of ER Diagrams from OWL Ontology
2014
Connecting ontological representations and data models is a crucial need in enterprise knowledge management, above all in the case of federated enterprises where corporate ontologies are used to share information coming from different databases. OWL to ERD transformations are a challenging research field in this scenario, due to the loss of expressiveness arising when OWL axioms have to be represented using ERD notation. In this paper we propose an innovative technique for estimating the most likely composition of ERD constructs that correspond to a given sequence of OWL axioms. We model such a process using a Hidden Markov Model (HMM) where the OWL inputs are the observable states, while E…
Estimation of the spatially distributed surface energy budget for AgriSAR 2006, part I: remote sensing model intercomparison
2011
A number of energy balance models of variable complexity that use remotely sensed boundary conditions for producing spatially distributed maps of surface fluxes have been proposed. Validation typically involves comparing model output to flux tower observations at a handful of sites, and hence there is no way of evaluating the reliability of model output for the remaining pixels comprising a scene. To assess the uncertainty in flux estimation over a remote sensing scene requires one to conduct pixel-by-pixel comparisons of the output. The objective of this paper is to assess whether the simplifications made in a simple model lead to erroneous predictions or deviations from a more complex mod…
KIKS Creativity and Technology for All
2019
To help meet an educational and societal requirement for all students to enjoy, have confidence and ability in creativity and technology, the “Kids Inspiring Kids in STEAM” (KIKS) EU project adopted an intensive Hothousing process challenging students in Finland, Spain, Hungary and the United Kingdom to engage in collaborative problem solving to develop solutions to: “How would you get your schoolmates to LOVE STEAM?” The project provided a process and technology toolkit for students, including those with special educational needs, to achieve their solutions. A completion rate of 90% suggested that all schools and students could cope with and enjoy the process and associated techno…
Issues in synthetic data generation for advanced manufacturing
2017
To have any chance of application in the real world, advanced manufacturing research in data analytics needs to explore and prove itself with real-world manufacturing data. Limited access to real-world data contrasts sharply with the need for data of varied types and larger quantity for research. Use of virtual data is a promising approach to make up for the lack of access. This paper explores the issues, identifies challenges, and suggests requirements and desirable features in the generation of virtual data. These issues, requirements, and features can be used by researchers to build virtual data generators and gain experience that will provide data to data scientists while avoiding known or …
Code Interoperability and Standard Data Formats in Quantum Chemistry and Quantum Dynamics: The Q5/Q5cost Data Model
2014
Code interoperability and the search for domain-specific standard data formats represent critical issues in many areas of computational science. The advent of novel computing infrastructures such as computational grids and clouds makes these issues even more urgent. The design and implementation of a common data format for quantum chemistry (QC) and quantum dynamics (QD) computer programs is discussed with reference to the research performed in the course of two Collaboration in Science and Technology Actions. The specific data models adopted, Q5Cost and D5Cost, are shown to work for a number of interoperating codes, regardless of the type and amount of information (small or large datasets) …
Framework for Evaluating the Version Management Capabilities of a Class of UML Modeling Tools from the Viewpoint of Multi-Site, Multi-Partner Product…
2010
UML models are widely used in software product line engineering for activities such as modeling the software product line reference architecture, detailed design, and automation of software code generation and testing. But in high-tech companies, modeling activities are typically distributed across multiple sites and involve multiple partners in different countries, thus complicating model management. Today's UML modeling tools support sophisticated version management for managing parallel and distributed modeling. However, the literature does not provide a comprehensive set of industrial-level criteria to evaluate the version management capabilities of UML tools. This article's contributio…
Interoperability between Distributed Systems and Web-Services Composition
2009
An information system is a multi-axis system characterized by a “data” axis, a “behavioral” axis, and a “communication” axis. The data axis corresponds to the structural and schematic technologies used to store data into the system. The behavioral axis represents management and production processes carried out by the system and corresponding technologies. The processes can interact with the data to extract, generate, and store data. The communication axis relates to the network used to exchange data and activate processes between geographically distant users or machines. Nowadays, technologies required for interoperability are extended to deal with the semantic aspect of the information sys…
Adaptive Learning Process for the Evolution of Ontology-Described Classification Model in Big Data Context
2016
One of the biggest challenges in Big Data is to exploit value from large volumes of variable and changing data. For this, one must focus on analyzing the data in these Big Data sources and classify the data items according to a domain model (e.g. an ontology). To automatically classify unstructured text documents according to an ontology, a hierarchical multi-label classification process called Semantic HMC was proposed. This process uses ontologies to describe the classification model. To prevent cold start and user overload, the classification process automatically learns the ontology-described classification model from a very large set of unstructured text documen…
Towards A Twitter Observatory: A Multi-Paradigm Framework For Collecting, Storing And Analysing Tweets
2016
In this article we show how a multi-paradigm framework can fulfil the requirements of tweets analysis and reduce the waiting time for researchers that use computational resources and storage systems to support large-scale data analysis. The originality of our approach is to combine concerns about data harvesting, data storage, data analysis and data visualisation into a framework that supports inductive reasoning in multidisciplinary scientific research. Our main contribution is a polyglot storage system with a generic data model to support logical data independence and a set of tools that can provide a suitable solution for mixing different types of algorithms in or…