RESEARCH PRODUCT
Color illusions also deceive CNNs for low-level vision tasks: Analysis and implications.
Alexander Gomez-Villa, Marcelo Bertalmío, Javier Vazquez-Corral, Adrián Martín, Jesús Malo
subject
Computer science; Illusion; Color space; Convolutional neural network; 050105 experimental psychology; 03 medical and health sciences; 0302 clinical medicine; Perception; Humans; 0501 psychology and cognitive sciences; Vision, Ocular; Artificial neural network; Optical illusion; 05 social sciences; Illusions; Sensory Systems; Ophthalmology; Vision science; Human visual system model; Artificial intelligence; Neural Networks, Computer; 030217 neurology & neurosurgery
description
The study of visual illusions has proven to be a very useful approach in vision science. In this work we start by showing that, while convolutional neural networks (CNNs) trained for low-level visual tasks in natural images may be deceived by brightness and color illusions, some network illusions can be inconsistent with the perception of humans. Next, we analyze where these similarities and differences may come from. On one hand, the proposed linear eigenanalysis explains the overall similarities: in simple CNNs trained for tasks like denoising or deblurring, the linear version of the network has center-surround receptive fields, and global transfer functions are very similar to the human achromatic and chromatic contrast sensitivity functions in human-like opponent color spaces. These similarities are consistent with the long-standing hypothesis that considers low-level visual illusions as a by-product of the optimization to natural environments. Specifically, here human-like features emerge from error minimization. On the other hand, the observed differences must be due to the behavior of the human visual system not explained by the linear approximation. However, our study also shows that more ‘flexible’ network architectures, with more layers and a higher degree of nonlinearity, may actually have a worse capability of reproducing visual illusions. This implies, in line with other works in the vision science literature, a word of caution on using CNNs to study human vision: on top of the intrinsic limitations of the L+NL formulation of artificial networks to model vision, the nonlinear behavior of flexible architectures may easily be markedly different from that of the visual system.

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 761544 (project HDR4EU) and under grant agreement number 780470 (project SAUCE), and by the Spanish government and FEDER Fund, grant Ref. PGC2018-099651-B-I00 (MCIU/AEI/FEDER, UE). The work of AM was supported by the Spanish government under Grant FJCI-2017-31758. JM has been supported by the Spanish government under the MINECO grant Ref. DPI2017-89867 and by the Generalitat Valenciana grant Ref. GrisoliaP-2019-035. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.
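The description above states that the linearized networks have center-surround receptive fields whose transfer functions resemble human contrast sensitivity functions. The sketch below is only a toy illustration of why those two properties go together, not the authors' eigenanalysis: it builds a 1-D difference-of-Gaussians kernel and computes the magnitude of its discrete Fourier transform, which comes out band-pass. The function names and all parameter values (`sigma_center`, `sigma_surround`, `surround_weight`, the kernel length) are assumptions chosen for illustration.

```python
import cmath
import math


def gaussian(n, sigma):
    """Return a unit-sum 1-D Gaussian of length n centered at n // 2."""
    c = n // 2
    w = [math.exp(-((i - c) ** 2) / (2 * sigma ** 2)) for i in range(n)]
    total = sum(w)
    return [v / total for v in w]


def dog_kernel(n=65, sigma_center=1.0, sigma_surround=4.0, surround_weight=0.9):
    """Center-surround kernel: narrow excitatory center minus broad surround.

    All default values are hypothetical, picked only to make the shape visible.
    """
    center = gaussian(n, sigma_center)
    surround = gaussian(n, sigma_surround)
    return [c - surround_weight * s for c, s in zip(center, surround)]


def transfer_function(kernel):
    """Magnitude of the DFT of the kernel for frequency indices 0..n//2."""
    n = len(kernel)
    return [
        abs(sum(k * cmath.exp(-2j * math.pi * f * i / n)
                for i, k in enumerate(kernel)))
        for f in range(n // 2 + 1)
    ]


kernel = dog_kernel()
gain = transfer_function(kernel)
# Index of the spatial frequency with the largest gain.
peak = max(range(len(gain)), key=gain.__getitem__)
print("peak frequency index:", peak)
print("gain at DC, peak, Nyquist:", gain[0], gain[peak], gain[-1])
```

In this toy setting the gain is low at zero frequency, rises to a maximum at an intermediate frequency, and falls again toward the highest frequency, which is the qualitative band-pass shape associated with achromatic contrast sensitivity. Changing `surround_weight` toward 0 removes the low-frequency attenuation and makes the filter low-pass.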
year | journal | country | edition | language |
---|---|---|---|---|
2019-12-02 | Vision Research | | | |