6533b85cfe1ef96bd12bc619

RESEARCH PRODUCT

Large-scale nonlinear dimensionality reduction for network intrusion detection

Yasir HamidLudovic JournauxJohn Aldo LeeLucile SautotBushra NabiM. Sugumaran

subject

intrusion detection[INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV][ SPI.SIGNAL ] Engineering Sciences [physics]/Signal and Image processing[INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG][ INFO.INFO-CV ] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV][ INFO.INFO-LG ] Computer Science [cs]/Machine Learning [cs.LG][STAT.ML] Statistics [stat]/Machine Learning [stat.ML][INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]ComputingMethodologies_PATTERNRECOGNITION[STAT.ML]Statistics [stat]/Machine Learning [stat.ML][INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]Gower[SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing[ STAT.ML ] Statistics [stat]/Machine Learning [stat.ML][SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processingdimensionality reduction

description

International audience; Network intrusion detection (NID) is a complex classification problem. In this paper, we combine classification with recent and scalable nonlinear dimensionality reduction (NLDR) methods. Classification and DR are not necessarily adversarial, provided adequate cluster magnification occurring in NLDR methods like $t$-SNE: DR mitigates the curse of dimensionality, while cluster magnification can maintain class separability. We demonstrate experimentally the effectiveness of the approach by analyzing and comparing results on the big KDD99 dataset, using both NLDR quality assessment and classification rate for SVMs and random forests. Since data involves features of mixed types (numerical and categorical), the use of Gower's similarity coefficient as metric further improves the results over the classical similarity metric.

https://hal.archives-ouvertes.fr/hal-01517215/document