RESEARCH PRODUCT

Hyper-flexible Convolutional Neural Networks based on Generalized Lehmer and Power Means

Vagan Terziyan, Diana Malyk, Mariia Golovianko, Vladyslav Branytskyi

subject

neural network; Cognitive Neuroscience; Lehmer means; syväoppiminen (deep learning); neuroverkot (neural networks); Machine Learning; flexibility; koneoppiminen (machine learning); Power mean; Artificial Intelligence; convolution; adversarial robustness; pooling; Neural Networks, Computer; activation function; convolutional; generalization

description

The convolutional neural network is one of the best-known members of the deep learning family of neural network architectures and is used for many purposes, including image classification. Despite their wide adoption, such networks are known to be highly tuned to their training data (samples representing a particular problem) and are poorly reusable for new problems. One way to change this is to add, alongside the trainable weights, trainable parameters to the mathematical functions that simulate various neural computations within such networks. This lets us distinguish between narrowly focused, task-specific parameters (weights) and more generic, capability-specific parameters. In this paper, we propose two flexible mathematical functions with trainable parameters, the Generalized Lehmer Mean and the Generalized Power Mean, to replace some fixed operations (such as the ordinary arithmetic mean or simple weighted aggregation) that are traditionally used within various components of a convolutional neural network architecture. We call the resulting architecture a hyper-flexible convolutional neural network. We provide a mathematical justification for the various components of this architecture and show experimentally that it performs better than the traditional one, including better robustness to adversarial perturbations of the test data.

Peer reviewed
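To give a rough feel for the two families of means named in the description, the following minimal sketch computes the classical (textbook) Lehmer mean and power mean of a small window of positive values. It is an illustration only: the generalized forms used in the paper are not given in this record, so the function names `lehmer_mean` and `power_mean`, the parameter `p`, and the sample `window` are assumptions made for this example, and nothing here is trained.

```python
import math


def lehmer_mean(xs, p):
    """Classical Lehmer mean of positive numbers: sum(x**p) / sum(x**(p-1)).

    Assumption: this is the textbook definition, not the paper's
    generalized variant.
    """
    num = sum(x ** p for x in xs)
    den = sum(x ** (p - 1) for x in xs)
    return num / den


def power_mean(xs, p):
    """Classical power mean of positive numbers: (mean(x**p)) ** (1/p).

    The case p == 0 is handled as the geometric mean, which is the
    limit of the power mean as p approaches 0.
    """
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)


# Hypothetical pooling window of positive activations.
window = [1.0, 2.0, 4.0]

# p = 1 gives the arithmetic mean for both families.
print(lehmer_mean(window, 1), power_mean(window, 1))
# Larger p pushes the Lehmer mean towards the maximum of the window.
print(lehmer_mean(window, 20))
```

In this classical setting, a single parameter `p` moves each mean between minimum-like, average-like, and maximum-like behaviour, which suggests why treating such a parameter as trainable could make a pooling or aggregation step more flexible than a fixed arithmetic mean.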

http://urn.fi/URN:NBN:fi:jyu-202209054469