Search results for "cluster analysis."
showing 10 items of 805 documents
New Similarity Rules for Mining Data
2006
Variability and noise in data-set entries make it hard to discover important regularities among association rules in mining problems. There is a need to define flexible and robust similarity measures between association rules. This paper introduces a new class of similarity functions (SFs) that can be used to discover properties in the feature space X and to group them with standard clustering techniques. Properties of the proposed SFs are investigated, and experiments on simulated data sets are presented to evaluate the grouping performance.
Methods of spatial cluster detection in rare childhood cancers: Benchmarking data and results from a simulation study on nephroblastoma
2021
The potential existence of spatial clusters in childhood cancer incidence is a debated topic. Identification of rare disease clusters in general may help to better understand disease etiology and develop preventive strategies against such entities. The incidence of newly diagnosed childhood malignancies under 15 years of age is 140/1,000,000. In this context, the subgroup of nephroblastoma represents an extremely rare entity with an annual incidence of 7/1,000,000. We evaluated widely used statistical approaches for spatial cluster detection in childhood cancer (Ref. [22] Schundeln et al., 2021, Cancer Epidemiology). For the simulation study, random high risk clusters of 1 to 50 ad…
Parental psychological control, autonomy support and Italian emerging adult’s psychosocial well-being: a cluster analytic approach
2020
Following a person-oriented approach, this study investigated whether there are distinct groups of emerging adults (EAs), each characterized by a distinct configuration of parental psychological control and autonomy support conceptualized in terms of promotion of volitional functioning (PVF) and promotion of independence (PI). Participants were 476 Italian undergraduate students enrolled at several universities in southern Italy. Results showed the existence of four profiles: 1. the Moderate Volitional Dependence cluster; 2. the Moderate Controlling Independence cluster; 3. the Volitional Independence cluster; 4. the Controlling Dependence cluste…
Assisted labeling for spam account detection on twitter
2019
Online Social Networks (OSNs) have become increasingly popular both because of their ease of use and their availability through almost any smart device. Unfortunately, these characteristics also make OSNs a target for users interested in performing malicious activities, such as spreading malware and performing phishing attacks. In this paper we address the problem of spam detection on Twitter by providing a novel method to support the creation of large-scale annotated datasets. More specifically, URL inspection and tweet clustering are performed in order to detect some common behaviors of spammers and legitimate users. Finally, the manual annotation effort is further reduced by grouping similar u…
ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.
2015
Background: Cluster analysis is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. It is central to the life sciences due to the advent of high throughput technologies, e.g., classification of tumors. In particular, in cluster analysis, it is of relevance to assess cluster quality and to predict the number of clusters in a dataset, if any. This latter task is usually performed via internal validation measures. Despite their potentially important role, both the use of classic internal validation measures and the design of new ones, specific for microarray data, do not seem to have grea…
Description, microhabitat selection and infection patterns of sealworm larvae (Pseudoterranova decipiens species complex, nematoda: ascaridoidea) in …
2013
Third-stage larvae of the Pseudoterranova decipiens species complex (also known as sealworms) have been reported in at least 40 marine fish species belonging to 21 families and 10 orders along the South American coast. Sealworms are a cause for concern because they can infect humans who consume raw or undercooked fish. However, despite their economic and zoonotic importance, morphological and molecular characterization of species of Pseudoterranova in South America is still scarce. Methods: A total of 542 individual fish from 20 species from the Patagonian coast of Argentina were examined for sealworms. The body cavity, the muscles, internal organs, and the mesenteries were examined to dete…
The age and evolution of sociality in Stegodyphus spiders: a molecular phylogenetic perspective
2006
Social, cooperative breeding behaviour is rare in spiders and generally characterized by inbreeding, skewed sex ratios and high rates of colony turnover, processes that when combined may reduce genetic variation and lower individual fitness quickly. On these grounds, social spider species have been suggested to be unstable in evolutionary time, and hence sociality a rare phenomenon in spiders. Based on a partial molecular phylogeny of the genus Stegodyphus, we address the hypothesis that social spiders in this genus are evolutionary transient. We estimate the age of the three social species, test whether they represent an ancestral or derived state and assess diversification relative to s…
Nondestructive Direct Determination of Heroin in Seized Illicit Street Drugs by Diffuse Reflectance near-Infrared Spectroscopy
2008
A new method has been developed for the fast and nondestructive direct determination of heroin in seized street illicit drugs using partial least-squares regression analysis of diffuse reflectance near-infrared spectra. Data were obtained from untreated samples placed in standard glass chromatography vials. A heterogeneous population of 31 samples, previously analyzed by a reference method, was employed to build the calibration model and to have a separated validation set. Based on the use of zero-order data for a calibration set of 21 samples, after standard normal variate and quadratic linear removed baseline correction (detrending), in the wavelength range from 1111 to 1647 nm, 8 PLS fac…
Lexical and sublexical units in speech perception.
2009
Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clustering strategies, in which infants are assumed to group certain speech sequences together into units (Swingley, 2005). In the present study, we test the predictions of two computational models instantiating each of these strategies, i.e., Serial Recurrent Networks: Elman, 1990; and Parser: Perruchet & Vint…
A branch-and-cut algorithm for the soft-clustered vehicle-routing problem
2021
The soft-clustered vehicle-routing problem is a variant of the classical capacitated vehicle-routing problem (CVRP) in which customers are partitioned into clusters and all customers of the same cluster must be served by the same vehicle. We introduce a novel symmetric formulation of the problem in which the clustering part is modeled with an asymmetric sub-model. We solve the new model with a branch-and-cut algorithm exploiting some known valid inequalities for the CVRP that can be adapted. In addition, we derive problem-specific cutting planes and new heuristic and exact separation procedures. For square grid instances in the Euclidean plane, we provide lower-bounding techniques …