6533b826fe1ef96bd1284d9e

RESEARCH PRODUCT

A Route toward Protein Sequencing using Solid-State Nanopores Assisted by Machine Learning

Andreina Nohemi Urquiola Hernandez, Adrien Nicolaï, Christophe Guyeux, Patrick Senet

subject

[INFO.INFO-ET] Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-SE] Computer Science [cs]/Software Engineering [cs.SE]; [INFO.INFO-DC] Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; [INFO.INFO-IU] Computer Science [cs]/Ubiquitous Computing; [INFO.INFO-MA] Computer Science [cs]/Multiagent Systems [cs.MA]; [INFO.INFO-MO] Computer Science [cs]/Modeling and Simulation; [INFO.INFO-CR] Computer Science [cs]/Cryptography and Security [cs.CR]

description

Solid-state nanopores (SSN) made of 2-D materials such as MoS2 have emerged as one of the most versatile sensors for single-biomolecule detection, which is essential for early disease diagnosis (biomarker detection). One of the most promising applications of SSN is DNA and protein sequencing at lower cost and higher speed than current standard methods. The detection principle relies on measuring the relatively small variations in ionic current as charged biomolecules immersed in an electrolyte traverse the nanopore in response to an external voltage applied across the membrane. The passage of a biomolecule through the pore yields information about its structure and chemical properties, as demonstrated experimentally, particularly for DNA molecules. Protein sequencing using SSN, however, remains highly challenging since the protein ensemble is far more complex than the DNA ensemble [1]. In this work, we performed extensive unbiased all-atom classical Molecular Dynamics (MD) simulations to produce data on the translocation of biological peptides through single-layer MoS2 nanopores (D = 1.3 nm). Peptides made of 12 different amino acids from the different families (nonpolar/hydrophobic, polar/neutral, basic and acidic) were chemically linked to a short polycationic charge carrier. First, ionic current time series were computed from MD, and peptide-induced blockade events were extracted and characterized using structural break detection. Second, clustering (unsupervised learning) of ionic current subdrops and duration using a Gaussian Mixture Model was applied. Using this technique, we demonstrate that each amino acid presents a large diversity of ionic current characteristics; however, charged amino acids were distinguished from the others. These promising findings may offer a route toward protein sequencing using MoS2 solid-state nanopores.
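
To make the first analysis step concrete, the sketch below shows one simple way blockade events (start time, duration, mean current drop) could be pulled out of an ionic current trace. It is only an illustration: it uses a fixed threshold relative to the open-pore current, which is a simpler stand-in for the structural break detection named in the description and is not the authors' method. The function name `extract_blockade_events`, the `drop_fraction` parameter, the synthetic trace, and the arbitrary units are all assumptions introduced here.

```python
from statistics import median


def extract_blockade_events(current, dt, drop_fraction=0.2):
    """Return a list of (start_time, duration, mean_drop) tuples.

    An event is a contiguous run of samples whose current lies below
    (1 - drop_fraction) * open-pore current. The open-pore level is
    estimated as the median of the trace, which assumes the pore is
    unobstructed for most of the recording (hypothetical simplification).
    """
    open_pore = median(current)
    threshold = (1.0 - drop_fraction) * open_pore
    events = []
    start = None  # index where the current event began, if any
    for i, value in enumerate(current):
        if value < threshold and start is None:
            start = i  # entering a blockade
        elif value >= threshold and start is not None:
            segment = current[start:i]
            mean_drop = open_pore - sum(segment) / len(segment)
            events.append((start * dt, (i - start) * dt, mean_drop))
            start = None  # leaving the blockade
    if start is not None:  # trace ended while still blocked
        segment = current[start:]
        mean_drop = open_pore - sum(segment) / len(segment)
        events.append((start * dt, (len(current) - start) * dt, mean_drop))
    return events


# Synthetic trace in arbitrary units: open-pore level 1.0 with two blockades.
trace = [1.0] * 10 + [0.5] * 3 + [1.0] * 10 + [0.6] * 2 + [1.0] * 5
events = extract_blockade_events(trace, dt=0.1)
for start_time, duration, mean_drop in events:
    print(f"start={start_time:.2f} duration={duration:.2f} drop={mean_drop:.2f}")
```

The (mean drop, duration) pairs produced this way are the kind of two-dimensional features that a Gaussian Mixture Model could then cluster, as outlined in the second step of the description.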

https://hal.science/hal-03893453