AUTHOR

Terence A. Etchells

showing 3 related works from this author

Clustering categorical data: A stability analysis framework

2011

Clustering to identify inherent structure is an important first step in data exploration. The k-means algorithm is a popular choice, but it is generally not appropriate for categorical data. A specific extension of k-means for categorical data is the k-modes algorithm. Both of these partition clustering methods are sensitive to the initialization of prototypes, which makes it difficult to select the best solution for a given problem. Selecting the number of clusters can also be an issue. Further, the k-modes method is especially prone to instability when presented with ‘noisy’ data, since the calculation of the mode lacks the smoothing effect inherent in the calculation …
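
The sensitivity to prototype initialization described in this abstract can be illustrated with a small sketch. This is not the stability analysis framework of the paper, only a minimal k-modes implementation run from several random starts; the function names (`hamming`, `column_modes`, `k_modes`, `restarts`), the toy records, and the choice of Hamming dissimilarity are assumptions made for illustration.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Most frequent value in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def k_modes(data, k, rng, max_iter=20):
    """One k-modes run starting from k randomly chosen records as prototypes."""
    modes = rng.sample(data, k)
    labels = None
    for _ in range(max_iter):
        # Assign every record to its nearest prototype.
        new_labels = [min(range(k), key=lambda j: hamming(row, modes[j]))
                      for row in data]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each prototype as the column-wise mode of its members.
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old prototype if a cluster is empty
                modes[j] = column_modes(members)
    cost = sum(hamming(row, modes[lab]) for row, lab in zip(data, labels))
    return labels, modes, cost

def restarts(data, k, n_runs=10, seed=0):
    """Repeat k-modes from different random initializations."""
    rng = random.Random(seed)
    return [k_modes(data, k, rng) for _ in range(n_runs)]

# Hypothetical toy data: (colour, size, shape).
data = [
    ("red", "small", "round"), ("red", "small", "oval"),
    ("red", "medium", "round"), ("blue", "large", "square"),
    ("blue", "large", "round"), ("blue", "medium", "square"),
]
runs = restarts(data, k=2)
costs = [cost for _, _, cost in runs]
best = min(runs, key=lambda r: r[2])
print("cost per run:", costs)
print("lowest cost:", best[2])
```

Different starts can end at different costs, which is why some rule for choosing among the resulting solutions is needed. Keeping the lowest-cost run, as done here, is one common heuristic and is not necessarily the criterion the paper uses.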

Computer science; business.industry; Single-linkage clustering; Correlation clustering; Constrained clustering; computer.software_genre; Machine learning; Determining the number of clusters in a data set; Data stream clustering; CURE data clustering algorithm; Consensus clustering; Data mining; Artificial intelligence; Cluster analysis; business; computer

2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)
researchProduct

An integrated framework for risk profiling of breast cancer patients following surgery.

2006

Objective: An integrated decision support framework is proposed for clinical oncologists making prognostic assessments of patients with operable breast cancer. The framework may be delivered over a web interface. It comprises a triangulation of prognostic modelling, visualisation of historical patient data and an explanatory facility to interpret risk group assignments using empirically derived Boolean rules expressed directly in clinical terms. Methods and materials: The prognostic inferences in the interface are validated in a multicentre longitudinal cohort study by modelling retrospective data from 917 patients recruited at Christie Hospital, Wilmslow between 1983 and 1989 and predictin…

Risk profiling; Adult; medicine.medical_specialty; Decision support system; Medicine (miscellaneous); Breast Neoplasms; Machine learning; computer.software_genre; Models Biological; Risk Assessment; Decision Support Techniques; User-Computer Interface; Breast cancer; Risk groups; Artificial Intelligence; medicine; Confidence Intervals; Health Status Indicators; Humans; Medical physics; Survival analysis; Mastectomy; Retrospective Studies; Internet; business.industry; Patient Selection; Reproducibility of Results; Patient data; Middle Aged; medicine.disease; Decision Support Systems Clinical; Prognosis; Confidence interval; Treatment Outcome; Nottingham Prognostic Index; Female; Artificial intelligence; Neural Networks Computer; business; computer; Monte Carlo Method; Algorithms

Artificial intelligence in medicine
researchProduct

A principled approach to network-based classification and data representation

2013

Measures of similarity are fundamental in pattern recognition and data mining. Typically the Euclidean metric is used in this context, weighting all variables equally and therefore assuming equal relevance, which is very rare in real applications. In contrast, given an estimate of a conditional density function, the Fisher information calculated in primary data space implicitly measures the relevance of variables in a principled way by reference to auxiliary data such as class labels. This paper proposes a framework that uses a distance metric based on Fisher information to construct similarity networks that achieve a more informative and principled representation of data. The framework ena…
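
The contrast between the Euclidean metric and a Fisher-information metric can be sketched for the simplest case. The example assumes a binary logistic model p(c=1|x) = sigmoid(w·x + b), for which the Fisher information with respect to x works out to p(1-p) w wᵀ. The model, the weights, and the function names are hypothetical choices for illustration, not the estimator or the network construction used in the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fisher_matrix(x, w, b):
    """Fisher information of p(c|x) with respect to x for a logistic model.

    Averaging the outer product of grad_x log p(c|x) over c ~ p(c|x)
    gives p * (1 - p) * w w^T.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    scale = p * (1.0 - p)
    return [[scale * wi * wj for wj in w] for wi in w]

def local_distance(x, dx, w, b):
    """Length sqrt(dx^T F(x) dx) of a small displacement dx at point x."""
    F = fisher_matrix(x, w, b)
    n = len(dx)
    quad = sum(dx[i] * F[i][j] * dx[j] for i in range(n) for j in range(n))
    return math.sqrt(max(quad, 0.0))

# Hypothetical model: only the first variable affects the class probability.
w, b = [2.0, 0.0], 0.0
x = [0.0, 0.0]

d_relevant = local_distance(x, [0.1, 0.0], w, b)    # step along variable 1
d_irrelevant = local_distance(x, [0.0, 0.1], w, b)  # step along variable 2
print(d_relevant, d_irrelevant)
```

Both steps have the same Euclidean length, but under this metric the step along the variable that does not change p(c|x) has zero length. In this sense the class labels determine the relevance of each variable. Distances between points that are far apart would require integrating this local length along a path, which the sketch does not attempt.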

business.industry; Cognitive Neuroscience; Fisher kernel; Pattern recognition; Probability density function; Conditional probability distribution; External Data Representation; computer.software_genre; Computer Science Applications; Weighting; Euclidean distance; symbols.namesake; Data point; Artificial Intelligence; symbols; Artificial intelligence; Data mining; Fisher information; business; computer; Mathematics

Neurocomputing
researchProduct