RESEARCH PRODUCT

Classification of Sequences with Deep Artificial Neural Networks: Representation and Architectural Issues

M La Rosa, Alfonso Urso, Ma Di Gangi, G. Lo Bosco, Laura La Paglia, Riccardo Rizzo, Domenico Amato, Antonino Fiannaca

subject

Sequence; Biological data; Sequence classification; Settore INF/01 - Informatica; Artificial neural network; Process (engineering); Computer science; Deep learning; Bacteria classification; Nucleosome identification; Deep neural network; Machine learning; Data type; Component (UML); Artificial intelligence; Metagenomics; Representation (mathematics)

description

DNA sequences are the basic data type processed in generic biological data analysis. A key component of such analysis is sequence classification, a methodology widely used to analyze sequential data of many kinds. Applying it to DNA sequences, however, requires a suitable representation of those sequences, which is still an open research problem. Machine Learning (ML) methodologies have contributed substantially to addressing this problem and, more recently, Deep Neural Network (DNN) models have also shown strongly encouraging results. In this chapter, we deal with specific classification problems related to two biological scenarios: (A) metagenomics and (B) chromatin organization. In both scenarios, DNA sequences are the input data for the classification methodologies. In particular, we study and test the efficacy of (1) different DNA sequence representations and (2) several Deep Learning (DL) architectures that process sequences to solve the related supervised classification problems. Although these architectures were developed for specific classification tasks, we think they could serve as a starting point for developing other DNN models that process the same kind of input.
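
The description does not say which DNA sequence representations the chapter compares. As a minimal illustration of what "representation" means here, the sketch below shows two encodings that are common in the literature, one-hot encoding and k-mer counting. The function names, the alphabet ordering, and the choice of k = 3 are assumptions made for this example and are not taken from the chapter.

```python
from collections import Counter
from itertools import product

# Assumed nucleotide alphabet and ordering (illustrative, not from the chapter).
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode each nucleotide as a 4-element indicator vector (A, C, G, T).

    Symbols outside the alphabet (for example N) become an all-zero vector.
    """
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in seq.upper()]


def kmer_counts(seq, k=3):
    """Count overlapping k-mers and return a fixed-length vector of size 4**k.

    k-mers containing symbols outside the alphabet are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocabulary = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return [counts[kmer] for kmer in vocabulary]


if __name__ == "__main__":
    example = "ACGTAC"  # hypothetical toy sequence
    print(one_hot(example)[:2])            # first two positions, one-hot
    print(len(kmer_counts(example, k=3)))  # length of the k-mer count vector
```

A one-hot matrix keeps positional information and has one row per nucleotide, which suits architectures that read the sequence position by position, whereas a k-mer count vector has a fixed length regardless of sequence length but discards the order of the k-mers.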

10.1007/978-3-030-71676-9_2
https://publications.cnr.it/doc/455358