Search results for "data"
showing 10 items of 12992 documents
Improvements and applications of the elements of prototype-based clustering
2018
Clustering or cluster analysis is an essential part of data mining, machine learning, and pattern recognition. The most popularly applied clustering methods are partitioning-based or prototype-based methods. Prototype-based clustering methods usually have easy implementability and good scalability. These methods, such as K-means clustering, have been used for different applications in various fields. On the other hand, prototype-based clustering methods are typically sensitive to initialization, and the selection of the number of clusters for knowledge discovery purposes is not straightforward. In the era of big data, in high-velocity, ever-growing datasets, which can also be erroneous, outl…
Rationale and design of the DARWIN-T2D (DApagliflozin Real World evIdeNce in Type 2 Diabetes): A multicenter retrospective nationwide Italian study a…
2017
Background: Randomized controlled trials (RCTs) in the field of diabetes have inherent limitations because their design, setting, and patient characteristics may transfer poorly to clinical practice. Thus, evidence from studies using routinely accumulated clinical data is increasingly valued. Aims: We herein describe the rationale and design of DARWIN-T2D (DApagliflozin Real World evIdeNce in Type 2 Diabetes), a multicenter retrospective nationwide study conducted at 50 specialist outpatient clinics in Italy and promoted by the Italian Diabetes Society. Data synthesis: The primary objective of the study is to describe the baseline clinical characteristics (particularly HbA1c) of pa…
ENSEMBLE METHODS FOR RANKING DATA
2017
Recent years have seen a remarkable flowering of work on the use of decision trees for ranking data. Decision trees are useful and intuitive, but they are very unstable: small perturbations in the data can produce large changes in the tree. For this reason, more stable procedures, such as ensemble methods, may be needed to identify which predictors explain the preference structure. In this work, ensemble methods such as bagging and Random Forest are proposed, from both a theoretical and a computational point of view, for deriving classification trees when ranking data are observed. The advantages of these procedures are shown through an example on the SUSHI data set.
Ensemble methods for ranking data with and without position weights
2020
The main goal of this Thesis is to build suitable Ensemble Methods for ranking data with weights assigned to the items' positions, in the cases of rankings with and without ties. The Thesis begins with the definition of a new rank correlation coefficient, able to take into account the importance of items' positions. Inspired by the rank correlation coefficient τ_x, proposed by Emond and Mason (2002) for unweighted rankings, and the weighted Kemeny distance proposed by García-Lapresta and Pérez-Román (2010), this work proposes τ_x^w, a new rank correlation coefficient corresponding to the weighted Kemeny distance. The new coefficient is analyzed analytically and empirically and represents the main…
Repeated kidney re-transplantation—the Eurotransplant experience: a retrospective multicenter outcome analysis
2020
Transplant International (2020). doi:10.1111/tri.13569
An Automatic Ontology-Based Approach to Support Logical Representation of Observable and Measurable Data for Healthy Lifestyle Management: Proof-of-C…
2020
Background: Lifestyle diseases, which result from adverse health behavior, are the foremost cause of death worldwide. An eCoach system may encourage individuals to lead a healthy lifestyle with early health risk prediction, personalized recommendation generation, and goal evaluation. Such an eCoach system needs to collect and transform distributed heterogeneous health and wellness data into meaningful information to train an artificially intelligent health risk prediction model. However, it may produce a data compatibility dilemma. Our proposed eHealth ontology can increase interoperability between different heterogeneous networks, provide situation awareness, help in data integration, and discover…
TISSBERT: A benchmark for the validation and comparison of NDVI time series reconstruction methods
2018
This paper introduces the Time Series Simulation for Benchmarking of Reconstruction Techniques (TISSBERT) dataset, intended to provide a benchmark for the validation and comparison of time series reconstruction methods. Such methods are routinely used to estimate vegetation characteristics from optical remotely sensed data, where the presence of clouds decreases the usefulness of the data. As for their validation, these methods have been compared with previously published ones, although with different approaches, which sometimes lead to contradictory results. We designed the TISSBERT dataset to be generic so that it could simulate realistic reference and cloud-contaminated time series …
Register Variation in Electronic Business Correspondence
2011
Electronic correspondence is a highly dynamic genre within the business world in which Register Variation (RV) is frequently used as a tool to improve communication but it often can lead to misunderstanding. In order to shed some light on this still unexplored area, the present study firstly offers a practical approach to classify and analyse RV within professional communication. After this, it reviews previous studies on email writing to apply their findings to this approach and, in the third part of the study, a corpus of recent business emails in English is analysed to examine how the key parameters of RV are currently used within this genre. The results will show that, not only the cont…
Legal regulation of cross-border information exchange for the performance of state functions, and related problems, in the field of land vehicles
2022
This master's thesis analyses the legal regulation of the cross-border data exchange process in the field of land vehicles and their drivers, the organisation of the process, and the problems faced by EU member states and Latvia. The thesis examines the impact of the regulation on the exchange of natural persons' data and the related problems. The study covers the current and historical legal regulation, its development, and the present situation. The thesis consists of 3 chapters. It is 66 pages long, including 5 figures and 8 tables. The thesis draws on 55 information sources.
Jet fragmentation transverse momentum distributions in pp and p-Pb collisions at √s, √sNN = 5.02 TeV
2021
Jet fragmentation transverse momentum (jT) distributions are measured in proton-proton (pp) and proton-lead (p-Pb) collisions at √sNN = 5.02 TeV with the ALICE experiment at the LHC. Jets are reconstructed with the ALICE tracking detectors and electromagnetic calorimeter using the anti-kT algorithm with resolution parameter R = 0.4 in the pseudorapidity range |η| < 0.25. The jT values are calculated for charged particles inside a fixed cone with a radius R = 0.4 around the reconstructed jet axis. The measured jT distributions are compared with a variety of parton-shower models. Herwig and Pythia 8 based models describe the data well for the higher jT region, while they underestimate the low…