Tackling the Problem of Data Imbalancing for Melanoma Classification

RESEARCH PRODUCT

Fabrice Meriaudeau, Rafael Garcia, Mojdeh Rastgoo, Joan Massich, Guillaume Lemaitre, Olivier Morel, Franck Marzani

subject

medicine.medical_specialty; Feature vector; Melanoma; 02 engineering and technology; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Imbalanced data; Classification; 030218 nuclear medicine & medical imaging; 03 medical and health sciences; Dermoscopy; 0302 clinical medicine; 0202 electrical engineering, electronic engineering, information engineering; medicine; Imbalanced; Stage (cooking); business.industry; Cancer; medicine.disease; Dermatology; Data balancing; Feature (computer vision); 020201 artificial intelligence & image processing; Biomedical engineering; Skin cancer; business

description

Conference paper presented at: 3rd International Conference on Bioimaging, BIOIMAGING 2016 - Part of 9th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2016, Rome, Italy.

Malignant melanoma is the most dangerous type of skin cancer, yet it is also the most treatable kind of cancer when diagnosed at an early stage. In this regard, Computer-Aided Diagnosis systems based on machine learning have been developed to discern melanoma lesions from benign and dysplastic nevi in dermoscopic images. As in a large range of real-world machine learning applications, melanoma classification faces the challenge of imbalanced data: melanoma cases make up a far smaller share of the data than benign and dysplastic cases. This article analyzes the impact of data balancing strategies at the training step. Over-Sampling (OS) and Under-Sampling (US) methods are extensively compared in both feature and data space, revealing that NearMiss-2 (NM2) outperforms the other methods, achieving a Sensitivity (SE) and Specificity (SP) of 91.2% and 81.7%, respectively. More generally, the reported results highlight that methods based on US, or on a combination of OS and US, in feature space outperform the others.
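To make the terms in the abstract concrete, the following is a minimal Python sketch of NearMiss-2 under-sampling and of the Sensitivity/Specificity metrics. It is not the authors' implementation: the function names `nearmiss2` and `sensitivity_specificity`, the parameter `k`, and the toy two-dimensional feature vectors are all assumptions made for illustration. NearMiss-2 is taken here in its usual formulation, which keeps the majority-class samples whose mean distance to their `k` farthest minority-class samples is smallest.

```python
import math


def nearmiss2(majority, minority, n_keep, k=3):
    """Keep the n_keep majority samples with the smallest mean
    Euclidean distance to their k farthest minority samples."""
    def score(sample):
        # Distances to every minority sample, largest first.
        dists = sorted((math.dist(sample, m) for m in minority), reverse=True)
        farthest = dists[:k]
        return sum(farthest) / len(farthest)

    # sorted() is stable, so ties keep their original order.
    return sorted(majority, key=score)[:n_keep]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (SE, SP), where SE = TP / (TP + FN) and SP = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Toy feature vectors (hypothetical, not data from the paper):
# the minority class stands in for melanoma, the majority for nevi.
minority = [(0.0, 0.0), (1.0, 0.0)]
majority = [(0.5, 0.5), (5.0, 5.0), (0.5, -0.5), (10.0, 0.0)]

# Under-sample the majority class down to the size of the minority class.
balanced_majority = nearmiss2(majority, minority, n_keep=len(minority), k=2)

# Toy labels and predictions, used only to exercise the metric.
se, sp = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

In this toy case the two majority points lying between the minority points are retained and the distant ones are discarded, which is the behaviour expected of an under-sampler that favours majority samples close to the whole minority class.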

year: 2016-02-21

http://hdl.handle.net/10256/17715