Hierarchical modeling for rare event detection and cell subset alignment across flow cytometry samples.

Marij J. P. Welters, Jacob Frelinger, Cedrik M. Britten, Lin Lin, Sjoerd H. Van Der Burg, Cliburn Chan, Mike West, Satwinder Kaur Singh, Cécile Gouttefangeas, Andrew Cron

subject

Computer science Adaptive Immunity computer.software_genre 0302 clinical medicine Single-cell analysis Enumeration Biology (General) Immune Response Event (probability theory) 0303 health sciences Ecology medicine.diagnostic_test T Cells Statistics Flow Cytometry 3. Good health Computational Theory and Mathematics Data model Modeling and Simulation Medicine Data mining Immunotherapy Research Article Tumor Immunology QH301-705.5 Immune Cells Immunology Context (language use) Biostatistics Models Biological Flow cytometry 03 medical and health sciences Cellular and Molecular Neuroscience Genetics medicine Humans Sensitivity (control systems) Statistical Methods Immunoassays Molecular Biology Biology Ecology Evolution Behavior and Systematics 030304 developmental biology business.industry Immunity Reproducibility of Results Pattern recognition Statistical model Immunologic Subspecialties Lymphocyte Subsets Immunologic Techniques Clinical Immunology Artificial intelligence business computer Mathematics 030215 immunology

description

Flow cytometry is the prototypical assay for multi-parameter single-cell analysis, and is essential in vaccine and biomarker research for the enumeration of antigen-specific lymphocytes that are often found at extremely low frequencies (0.1% or less). Standard analysis of flow cytometry data relies on visual identification of cell subsets by experts, a process that is subjective and often difficult to reproduce. An alternative and more objective approach is the use of statistical models to identify cell subsets of interest in an automated fashion. Two specific challenges for automated analysis are detecting extremely low-frequency event subsets without biasing the estimate through pre-processing enrichment, and aligning cell subsets across multiple data samples for comparative analysis. In this manuscript, we develop hierarchical modeling extensions to the Dirichlet Process Gaussian Mixture Model (DPGMM) approach we have previously described for cell subset identification, and show that the hierarchical DPGMM (HDPGMM) naturally generates an aligned data model that captures both commonalities and variations across multiple samples. HDPGMM also increases the sensitivity to extremely low-frequency events by sharing information across multiple samples analyzed simultaneously. We validate the accuracy and reproducibility of HDPGMM estimates of antigen-specific T cells on clinically relevant reference peripheral blood mononuclear cell (PBMC) samples with known frequencies of antigen-specific T cells. These cell samples take advantage of retrovirally TCR-transduced T cells spiked into autologous PBMC samples to give a defined number of antigen-specific T cells detectable by HLA-peptide multimer binding. We provide open-source software that can take advantage of both multiple processors and GPU acceleration to perform the numerically demanding computations.
We show that hierarchical modeling is a useful probabilistic approach that can provide a consistent labeling of cell subsets and increase the sensitivity of rare event detection in the context of quantifying antigen-specific immune responses.
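
The alignment idea described above can be illustrated with a small generative sketch. This is not the authors' software and does not perform HDPGMM inference; it only assumes a truncated hierarchical mixture in which every sample shares the same component locations but draws its own weights around common base weights. All names (`stick_breaking`, `sample_weights`, `simulate_sample`, `subset_frequencies`), the one-dimensional data, and the nearest-mean assignment rule are hypothetical simplifications chosen for illustration.

```python
import random


def stick_breaking(fractions):
    """Turn break fractions v_k in [0, 1] into mixture weights.

    w_k = v_k * prod_{l<k} (1 - v_l). The leftover stick is appended as a
    final weight so that the truncated weights sum to 1.
    """
    weights, remaining = [], 1.0
    for v in fractions:
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sample_weights(rng, base_weights, concentration):
    """Draw sample-specific weights from Dirichlet(concentration * base).

    Assumption: a finite Dirichlet stands in for the sample-level process.
    All base weights must be strictly positive.
    """
    draws = [rng.gammavariate(concentration * b, 1.0) for b in base_weights]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_sample(rng, n_events, weights, means, sd=0.3):
    """Draw 1-D events from a Gaussian mixture with shared component means."""
    labels = rng.choices(range(len(means)), weights=weights, k=n_events)
    return [rng.gauss(means[k], sd) for k in labels]


def subset_frequencies(events, means):
    """Assign each event to the nearest shared mean and return frequencies.

    Nearest-mean assignment is a crude stand-in for posterior classification.
    """
    counts = [0] * len(means)
    for x in events:
        k = min(range(len(means)), key=lambda j: abs(x - means[j]))
        counts[k] += 1
    return [c / len(events) for c in counts]


def demo(seed=0):
    rng = random.Random(seed)
    means = [0.0, 4.0, 8.0]                 # shared across all samples
    base = stick_breaking([0.7, 0.995])     # third component is rare
    for name in ("sample A", "sample B"):
        weights = sample_weights(rng, base, concentration=500.0)
        events = simulate_sample(rng, 20000, weights, means)
        print(name, subset_frequencies(events, means))


if __name__ == "__main__":
    demo()
```

Because the component locations are shared, index 2 refers to the same rare subset in every sample, so the per-sample frequencies can be compared directly without a separate matching step.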

year	journal
2013-07-31	PLoS Computational Biology

10.1371/journal.pcbi.1003130 https://pubmed.ncbi.nlm.nih.gov/23874174