6533b852fe1ef96bd12ab966

RESEARCH PRODUCT

Criminal networks analysis in missing data scenarios through graph distances.

Ficara AnnamariaCavallaro LuciaCurreri FrancescoFiumara GiacomoDe Meo PasqualeBagdasar OvidiuSong WeiLiotta Antonio

subject

Data AnalysisFOS: Computer and information sciencesComputer and Information SciencesScienceIntelligenceSocial SciencesTransportationCriminologyCivil EngineeringSocial NetworkingComputer Science - Computers and SocietyLaw EnforcementSociologyComputers and Society (cs.CY)PsychologyHumansComputer NetworksSocial and Information Networks (cs.SI)Algorithms; Humans; Terrorism; Criminals; Data Analysis; Social NetworkingSettore INF/01 - InformaticaQCognitive PsychologyRBiology and Life SciencesEigenvaluesComputer Science - Social and Information NetworksCriminalsTransportation InfrastructurePoliceRoadsProfessionsAlgebraLinear AlgebraPeople and PlacesPhysical SciencesEngineering and TechnologyCognitive ScienceMedicineLaw and Legal SciencesPopulation GroupingsTerrorismCrimeCriminal Justice SystemMathematicsNetwork AnalysisAlgorithmsResearch ArticleNeuroscience

description

Data collected in criminal investigations may suffer from: (i) incompleteness, due to the covert nature of criminal organisations; (ii) incorrectness, caused by either unintentional data collection errors and intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyse nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data and to determine which network type is most affected by it. The networks are firstly pruned following two specific methods: (i) random edges removal, simulating the scenario in which the Law Enforcement Agencies (LEAs) fail to intercept some calls, or to spot sporadic meetings among suspects; (ii) nodes removal, that catches the hypothesis in which some suspects cannot be intercepted or investigated. Finally we compute spectral (i.e., Adjacency, Laplacian and Normalised Laplacian Spectral Distances) and matrix (i.e., Root Euclidean Distance) distances between the complete and pruned networks, which we compare using statistical analysis. Our investigation identified two main features: first, the overall understanding of the criminal networks remains high even with incomplete data on criminal interactions (i.e., 10% removed edges); second, removing even a small fraction of suspects not investigated (i.e., 2% removed nodes) may lead to significant misinterpretation of the overall network.

10.1371/journal.pone.0255067https://doaj.org/article/f912ea94763b4c5e908d3458a1ff4f37