RESEARCH PRODUCT

Testing selected optimal descriptors with artificial neural networks

Lionello Pogliani, Jesús Vicente De Julián-Ortiz

subject

Artificial neural network, business.industry, General Chemical Engineering, Model study, Pattern recognition, General Chemistry, ENCODE, Core electron, Partition (number theory), Statistical analysis, Model quality, Artificial intelligence, business, Mathematics

description

Eleven properties have been modeled to check the importance, for modeling purposes, of mixed descriptors built from empirical parameters, molecular connectivity indices and random numbers. The mixed descriptors with random indices have a descriptive character that is satisfactorily confirmed by the leave-one-out method of statistical analysis. Partitioning the set of compounds into training and evaluation sets drastically decreases the probability of finding a mixed descriptor with random indices that has good model quality. Two properties, the magnetic susceptibility and the elutropic values, persist in having optimal descriptors with random indices. The overall model study underlines the importance of semiempirical descriptors made of experimental parameters and molecular connectivity indices, as well as the importance of a perturbation parameter that has been introduced into the valence delta number to encode the contribution of the depleted hydrogen atoms. The use of complete graphs to encode the core electrons of higher-row atoms is also underlined. The model quality of the mixed descriptors obtained with combinatorial regressive methods has also been tested with three-layered feed-forward artificial neural network (ANN) methods. This methodology not only confirms the validity of the descriptors but also improves their model quality, thereby widening their predictive ability.
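
The two validation schemes named in the description, leave-one-out and a training/evaluation partition, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the model is a plain ordinary least-squares fit rather than the combinatorial regression or ANN methods of the study, and the function names (`fit_predict`, `loo_q2`, `external_r2`) are hypothetical.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    # Ordinary least squares with an intercept (an assumed stand-in model).
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_test)), X_test])
    return B @ coef

def loo_q2(X, y):
    # Leave-one-out: each compound is predicted by a model fitted to the rest.
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        preds[i] = fit_predict(X[mask], y[mask], X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def external_r2(X, y, train_idx, test_idx):
    # Training/evaluation partition: fit on one subset, score on the other.
    preds = fit_predict(X[train_idx], y[train_idx], X[test_idx])
    ss_res = np.sum((y[test_idx] - preds) ** 2)
    ss_tot = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic example (hypothetical data, not taken from the study).
rng = np.random.default_rng(0)
n = 30
x_real = rng.uniform(0.0, 5.0, n)             # informative descriptor
x_rand = rng.uniform(0.0, 5.0, n)             # random-number descriptor
y = 2.0 * x_real + 1.0 + rng.normal(0.0, 0.5, n)

q2_real = loo_q2(x_real.reshape(-1, 1), y)
q2_rand = loo_q2(x_rand.reshape(-1, 1), y)

train_idx, test_idx = np.arange(0, 20), np.arange(20, n)
r2_ext_real = external_r2(x_real.reshape(-1, 1), y, train_idx, test_idx)
r2_ext_rand = external_r2(x_rand.reshape(-1, 1), y, train_idx, test_idx)

print(f"LOO q2:      real={q2_real:.3f}  random={q2_rand:.3f}")
print(f"external r2: real={r2_ext_real:.3f}  random={r2_ext_rand:.3f}")
```

In this toy setting the random descriptor is tested alone, so it scores poorly under both schemes. In the study the random indices are combined with empirical parameters and connectivity indices and selected from many candidates, which is the situation in which leave-one-out alone can look favorable and an evaluation set becomes the stricter check.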

https://doi.org/10.1039/c3ra41435c