RESEARCH PRODUCT

MAGICPL: A Generic Process Description Language for Distributed Pseudonymization Scenarios

Galina Tremper, Florian Stampe, Andreas Borg, Esther Schmidt, Martin Lablans, Martin Bialke, David Croft, Torben Brenner

subject

Service (systems architecture); Biomedical Research; Computer science; Process (engineering); computer.internet_protocol; Health Informatics; Reuse; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Component (UML); Humans; 030212 general & internal medicine; Pseudonymization; Computer Security; Language; Advanced and Specialized Nursing; Class (computer programming); Application programming interface; business.industry; 030220 oncology & carcinogenesis; Software engineering; business; computer; Confidentiality; Software; XML

description

Abstract

Objectives: Pseudonymization is an important aspect of projects dealing with sensitive patient data. Most projects build their own specialized, hard-coded solutions, yet these solutions overlap in many aspects of their functionality. Because every re-implementation ties up resources, we propose a solution that facilitates and encourages the reuse of existing components.

Methods: We analyzed established data protection concepts to gain insight into their common features and the ways in which their components are linked together. We found that these pseudonymization processes can be represented with a simple descriptive language, which we call MAGICPL, plus a relatively small set of components. We designed MAGICPL as an XML-based language to make it human-readable and accessible to nonprogrammers. A prototype implementation of the components was written in Java. MAGICPL references components by their class names, which makes it easy to extend or exchange the component set. In addition, a simple HTTP application programming interface (API) runs the tasks and allows other systems to communicate with the pseudonymization process.

Results: MAGICPL has been used in at least three projects, including the re-implementation of the pseudonymization process of the German Cancer Consortium, clinical data flows in a large-scale translational research network (National Network Genomic Medicine), and our own institute's pseudonymization service.

Conclusions: Putting our solution into productive use at both our own institute and our partner sites reduced the time and effort required to build pseudonymization pipelines in medical research.
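The idea described under Methods, an XML process description that names components by their Java class names, can be illustrated with a minimal sketch. The abstract does not show MAGICPL's actual syntax or API, so everything named here is an assumption made for illustration: the `process` and `step` elements, the `class` attribute, the `Step` interface, and the two example components. The SHA-256 digest only stands in for a real pseudonym generator.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ProcessSketch {

    /** Hypothetical component contract; not taken from the MAGICPL prototype. */
    public interface Step {
        String apply(String input) throws Exception;
    }

    /** Illustrative component: trims and upper-cases an identifier. */
    public static class Normalize implements Step {
        public String apply(String input) {
            return input.trim().toUpperCase(Locale.ROOT);
        }
    }

    /** Illustrative component: SHA-256 hex digest as a stand-in pseudonym. */
    public static class HashPseudonym implements Step {
        public String apply(String input) throws Exception {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }
    }

    /** Runs each <step class="..."/> element in document order on the input. */
    public static String run(String processXml, String input) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        processXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        NodeList steps = root.getElementsByTagName("step");
        String value = input;
        for (int i = 0; i < steps.getLength(); i++) {
            String className = ((Element) steps.item(i)).getAttribute("class");
            // Resolve the component by class name, as the abstract describes.
            Step step = (Step) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            value = step.apply(value);
        }
        return value;
    }

    public static void main(String[] args) throws Exception {
        // Nested classes are addressed by their binary name (Outer$Inner).
        String prefix = ProcessSketch.class.getName() + "$";
        String xml = "<process>"
                + "<step class=\"" + prefix + "Normalize\"/>"
                + "<step class=\"" + prefix + "HashPseudonym\"/>"
                + "</process>";
        System.out.println(run(xml, "  mustermann  "));
    }
}
```

Because the runner looks components up by name at run time, exchanging or adding a component in this sketch only requires a new class on the classpath and a changed `class` attribute; the runner itself stays untouched.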

https://doi.org/10.1055/s-0041-1731387