AUTHOR
Jesús Malo
PerceptNet: A Human Visual System Inspired Neural Network for Estimating Perceptual Distance
Traditionally, the vision community has devised algorithms to estimate the distance between an original image and images that have been subject to perturbations. Inspiration was usually taken from the human visual system, and from how it processes different perturbations, in order to replicate the extent to which they determine our ability to judge image quality. While recent works have presented deep neural networks trained to predict human perceptual quality, very few borrow any intuitions from the human visual system. To address this, we present PerceptNet, a convolutional neural network where the architecture has been chosen to reflect the structure and various stages in the human…
Visual Cortex Performs a Sort of Non-linear ICA
Here, the standard V1 cortex model, optimized to reproduce image distortion psychophysics, is shown to have appealing statistical properties, e.g. approximate factorization of the PDF of natural images. These results confirm the efficient encoding hypothesis, which aims to explain the organization of biological sensors by information-theoretic arguments.
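A minimal sketch of a divisive-normalization stage of the kind this abstract refers to; the uniform pooling kernel, exponent, and semisaturation constant below are illustrative assumptions, not the psychophysically fitted values:

```python
# Divisive normalization sketch (illustrative parameters; the fitted
# psychophysical values are not reproduced here).
import numpy as np

def divisive_normalization(w, b=0.1, gamma=0.6, H=None):
    """Gain control of linear V1-like responses w by pooled neighbor activity."""
    e = np.abs(w) ** gamma                      # point-wise exponentiation
    if H is None:                               # assumption: uniform pooling
        H = np.ones((w.size, w.size)) / w.size
    return np.sign(w) * e / (b + H @ e)         # divide by pooled energy

w = np.random.laplace(size=8)                   # heavy-tailed "wavelet" responses
print(divisive_normalization(w))
```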
In praise of artifice reloaded: Caution with natural image databases in modeling vision
Subjective image quality databases are a major source of raw data on how the visual system works in naturalistic environments. These databases describe the sensitivity of many observers to a wide range of distortions of different nature and intensity seen on top of a variety of natural images. Data of this kind seems to open a number of possibilities for the vision scientist to check the models in realistic scenarios. However, while these natural databases are great benchmarks for models developed in some other way (e.g., by using the well-controlled artificial stimuli of traditional psychophysics), they should be carefully used when trying to fit vision models. Given the high dimensionalit…
Topographic Independent Component Analysis reveals random scrambling of orientation in visual space
Neurons at primary visual cortex (V1) in humans and other species are edge filters organized in orientation maps. In these maps, neurons with similar orientation preference are clustered together in iso-orientation domains. These maps have two fundamental properties: (1) retinotopy, i.e. correspondence between displacements at the image space and displacements at the cortical surface, and (2) a trade-off between good coverage of the visual field with all orientations and continuity of iso-orientation domains in the cortical space. There is an active debate on the origin of these locally continuous maps. While most of the existing descriptions take purely geometric/mechanistic approaches whi…
Visual information flow in Wilson-Cowan networks.
In this paper, we study the communication efficiency of a psychophysically tuned cascade of Wilson-Cowan and divisive normalization layers that simulate the retina-V1 pathway. This is the first analysis of Wilson-Cowan networks in terms of multivariate total correlation. The parameters of the cortical model have been derived through the relation between the steady state of the Wilson-Cowan model and the divisive normalization model. The communication efficiency has been analyzed in two ways: First, we provide an analytical expression for the reduction of the total correlation among the responses of a V1-like population after the application of the Wilson-Cowan interaction. Second, we empiri…
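The steady state mentioned in the abstract can be obtained by plain numerical integration. A minimal sketch, assuming an illustrative saturating nonlinearity and a random inhibitory kernel rather than the fitted cortical parameters:

```python
import numpy as np

def wilson_cowan_steady_state(e, W, alpha=1.0, dt=0.05, steps=2000):
    """Integrate dx/dt = -alpha*x + e - W @ f(x) to (approximate) rest."""
    f = lambda x: x / (1.0 + np.abs(x))        # saturation (assumption)
    x = np.zeros_like(e)
    for _ in range(steps):
        x = x + dt * (-alpha * x + e - W @ f(x))
    return x

n = 16
rng = np.random.default_rng(0)
W = 0.5 * np.abs(rng.normal(size=(n, n))) / n  # weak inhibitory coupling
print(wilson_cowan_steady_state(rng.normal(size=n), W))
```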
Visual Information Fidelity with better Vision Models and better Mutual Information Estimates
Splitting criterion for hierarchical motion estimation based on perceptual coding
A new entropy-constrained motion estimation scheme using variable-size block matching is proposed. It is known that fixed-size block matching as used in most video codec standards is improved by using a multiresolution or multigrid approach. In this work, it is shown that further improvement is possible in terms of both the final bit rate achieved and the robustness of the predicted motion field if perceptual coding is taken into account in the motion estimation phase. The proposed scheme is compared against other variable- and fixed-size block matching algorithms.
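As a rough illustration of variable-size block matching, the sketch below splits blocks in a quadtree whenever a pooled prediction error exceeds a threshold; this plain threshold is a hypothetical stand-in for the paper's entropy-constrained, perceptually weighted criterion:

```python
import numpy as np

def split_blocks(err, x0, y0, size, thr, min_size=4, out=None):
    """Quadtree split driven by pooled motion-compensation error."""
    if out is None:
        out = []
    block = err[y0:y0 + size, x0:x0 + size]
    if size > min_size and block.mean() > thr:  # stand-in split criterion
        h = size // 2
        for dy in (0, h):
            for dx in (0, h):
                split_blocks(err, x0 + dx, y0 + dy, h, thr, min_size, out)
    else:
        out.append((x0, y0, size))              # keep this block whole
    return out

err = np.random.rand(32, 32) ** 2               # toy per-pixel error map
print(len(split_blocks(err, 0, 0, 32, thr=0.3)), 'blocks')
```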
Comparison of perceptually uniform quantisation with average error minimisation in image transform coding
An alternative transform coder design criterion based on restricting the maximum perceptual error of each coefficient is proposed. This perceptually uniform quantisation of the transform domain ensures that the perceptual error will be below a certain limit regardless of the particular input image. The results show that the proposed criterion improves the subjective quality of the conventional average error criterion even if it is weighted with the same perceptual metric.
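The criterion can be sketched as uniform quantization on a perceptually transformed axis: with step 2*max_err, the perceptual error stays below max_err for any input image. The cube-root nonlinearity below is a generic stand-in for an actual perceptual transform:

```python
import numpy as np

def bounded_error_quantizer(c, perceptual_fwd, perceptual_inv, max_err):
    """Quantize coefficients with a bound on the *perceptual* error."""
    step = 2.0 * max_err
    p = perceptual_fwd(c)                 # map to perceptually uniform axis
    q = step * np.round(p / step)         # uniform quantization there
    return perceptual_inv(q)              # back to the coefficient domain

# Illustrative nonlinearity: cube-root "contrast" response and its inverse.
fwd = lambda c: np.sign(c) * np.abs(c) ** (1 / 3)
inv = lambda p: np.sign(p) * np.abs(p) ** 3
c = np.random.laplace(size=5)
print(c)
print(bounded_error_quantizer(c, fwd, inv, max_err=0.05))
```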
Perceptual Image Representations for Support Vector Machine Image Coding
Support-vector-machine image coding relies on the ability of SVMs for function approximation. The size and the profile of the ε-insensitivity zone of the support vector regressor (SVR) at some specific image representation determine (a) the number of selected support vectors (the compression ratio), and (b) the nature of the introduced error (the compression distortion). However, the selection of an appropriate image representation is a key issue for a meaningful design of the ε-insensitivity profile. For example, in image-coding applications, taking human perception into account is of paramount relevance to obtain a good rate-distortion performance. However, depending on the accuracy of t…
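A minimal sketch of the idea with scikit-learn's SVR on the DCT of a toy 1-D signal; in the papers the representation and the ε profile are perceptual, whereas here both are generic:

```python
import numpy as np
from scipy.fft import dct, idct
from sklearn.svm import SVR

# Toy 1-D "image" line; a plain DCT stands in for the perceptual domain.
x = np.linspace(0, 1, 128)
signal = np.sin(8 * x) + 0.3 * np.sin(40 * x)
coeffs = dct(signal, norm='ortho')

# epsilon sets the insensitivity zone: larger epsilon -> fewer support
# vectors (more compression) but larger reconstruction distortion.
svr = SVR(kernel='rbf', C=100.0, epsilon=0.05)
idx = np.arange(coeffs.size).reshape(-1, 1)
svr.fit(idx, coeffs)
approx = idct(svr.predict(idx), norm='ortho')

print('support vectors kept:', svr.support_.size, 'of', coeffs.size)
print('reconstruction RMSE :', np.sqrt(np.mean((signal - approx) ** 2)))
```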
Principal Polynomial Analysis
This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained…
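One deflation step of PPA can be sketched as a univariate polynomial regression from the leading principal score onto the orthogonal complement; the degree and the toy data are illustrative assumptions:

```python
import numpy as np

def ppa_step(X, degree=2):
    """One PPA deflation step (sketch): remove, from the orthogonal
    complement, a polynomial prediction driven by the leading score."""
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    s = Xc @ Vt[0]                       # leading score (univariate)
    R = Xc @ Vt[1:].T                    # orthogonal complement coordinates
    A = np.vander(s, degree + 1)         # polynomial features of the score
    coef, *_ = np.linalg.lstsq(A, R, rcond=None)
    return s, R, R - A @ coef            # score, residual before/after

# Curved 2-D manifold: PPA captures what a straight PC would miss.
t = np.random.randn(500)
X = np.c_[t, 0.5 * t ** 2 + 0.1 * np.random.randn(500)]
s, R, resid = ppa_step(X)
print('orthogonal variance before/after polynomial removal:', R.var(), resid.var())
```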
PCA Gaussianization for image processing
The estimation of high-dimensional probability density functions (PDFs) is not an easy task for many image processing applications. The linear models assumed by widely used transforms are often quite restrictive to describe the PDF of natural images. In fact, additional non-linear processing is needed to overcome the limitations of the model. In contrast, the class of techniques collectively known as projection pursuit, which solve the high-dimensional problem by sequential univariate solutions, may be applied to very general PDFs (e.g. iterative Gaussianization procedures). However, the associated computational cost has prevented their extensive use in image processing. In this work, w…
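A minimal sketch of one PCA-based Gaussianization iteration (marginal Gaussianization of each coordinate followed by a PCA rotation); the stopping rule and the tail handling of the real method are omitted:

```python
import numpy as np
from scipy.stats import norm, kurtosis

def marginal_gaussianization(X):
    """Map each column to N(0,1) through its empirical rank."""
    n = X.shape[0]
    ranks = X.argsort(0).argsort(0) + 1
    return norm.ppf(ranks / (n + 1.0))

def gaussianization_step(X):
    """Marginal Gaussianization followed by a PCA rotation (one iteration)."""
    G = marginal_gaussianization(X)
    Gc = G - G.mean(0)
    _, _, Vt = np.linalg.svd(Gc, full_matrices=False)
    return Gc @ Vt.T

t = np.random.rand(2000)
X = np.c_[t, t ** 2 + 0.05 * np.random.rand(2000)]   # nonlinearly dependent pair
for _ in range(5):
    X = gaussianization_step(X)
print('marginal kurtosis after 5 iterations (near 0 if Gaussian):', kurtosis(X))
```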
Estimating biophysical variable dependences with kernels
This paper introduces a nonlinear measure of dependence between random variables in the context of remote sensing data analysis. The Hilbert-Schmidt Independence Criterion (HSIC) is a kernel method for evaluating statistical dependence. HSIC is based on computing the Hilbert-Schmidt norm of the cross-covariance operator of mapped samples in the corresponding Hilbert spaces. The HSIC empirical estimator is very easy to compute and has good theoretical and practical properties. We exploit the capabilities of HSIC to explain nonlinear dependences in two remote sensing problems: temperature estimation and chlorophyll concentration prediction from spectra. Results show that, when the relationshi…
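The empirical estimator mentioned here is indeed short to compute. A minimal sketch with RBF kernels and a fixed bandwidth (in practice the bandwidth is often set from the median pairwise distance):

```python
import numpy as np

def hsic(x, y, sigma=1.0):
    """Empirical HSIC with RBF kernels: trace(K H L H) / (n-1)^2."""
    n = x.shape[0]
    def rbf(z):
        d2 = (z[:, None] - z[None, :]) ** 2
        return np.exp(-d2 / (2 * sigma ** 2))
    K, L = rbf(x), rbf(y)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

x = np.random.randn(300)
print('independent  :', hsic(x, np.random.randn(300)))
print('nonlinear dep:', hsic(x, x ** 2 + 0.1 * np.random.randn(300)))
```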
Visual aftereffects and sensory nonlinearities from a single statistical framework
When adapted to a particular scenery our senses may fool us: colors are misinterpreted, certain spatial patterns seem to fade out, and static objects appear to move in reverse. A mere empirical description of the mechanisms tuned to color, texture, and motion may tell us where these visual illusions come from. However, such empirical models of gain control do not explain why these mechanisms work in this apparently dysfunctional manner. Current normative explanations of aftereffects based on scene statistics derive gain changes by (1) invoking decorrelation and linear manifold matching/equalization, or (2) using nonlinear divisive normalization obtained from parametric scene models. These p…
Psychophysically Tuned Divisive Normalization Approximately Factorizes the PDF of Natural Images
The conventional approach in computational neuroscience in favor of the efficient coding hypothesis goes from image statistics to perception. It has been argued that the behavior of the early stages of biological visual processing (e.g., spatial frequency analyzers and their nonlinearities) may be obtained from image samples and the efficient coding hypothesis using no psychophysical or physiological information. In this work we address the same issue in the opposite direction: from perception to image statistics. We show that the psychophysically fitted image representation in V1 has appealing statistical properties, for example, approximate PDF factorization and substantial mutual informatio…
Regression Wavelet Analysis for Lossless Coding of Remote-Sensing Data
A novel wavelet-based scheme to increase coefficient independence in hyperspectral images is introduced for lossless coding. The proposed regression wavelet analysis (RWA) uses multivariate regression to exploit the relationships among wavelet-transformed components. It builds on our previous nonlinear schemes that estimate each coefficient from neighbor coefficients. Specifically, RWA performs a pyramidal estimation in the wavelet domain, thus reducing the statistical relations in the residuals and the energy of the representation compared to existing wavelet-based schemes. We propose three regression models to address the issues concerning estimation accuracy, component scalability, and c…
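A toy sketch of the idea behind RWA: one Haar level across the spectral dimension, then a regression of detail components from approximation ones, keeping only the residuals. The actual pyramidal scheme and the three regression models of the paper are richer:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy hyperspectral cube: bands are smooth functions of wavelength, so the
# wavelet+regression scheme has inter-band redundancy to exploit.
rng = np.random.default_rng(1)
n_pix, n_bands = 1000, 32
base = rng.normal(size=(n_pix, 1))
cube = base * np.sin(np.linspace(0, 3, n_bands)) \
       + 0.01 * rng.normal(size=(n_pix, n_bands))

# One Haar level across the spectral dimension.
approx = (cube[:, 0::2] + cube[:, 1::2]) / np.sqrt(2)
detail = (cube[:, 0::2] - cube[:, 1::2]) / np.sqrt(2)

# Regress detail components from approximation ones and keep the residuals:
# lower energy means cheaper lossless coding (sketch of the idea).
pred = LinearRegression().fit(approx, detail).predict(approx)
resid = detail - pred
print('detail energy  :', np.mean(detail ** 2))
print('residual energy:', np.mean(resid ** 2))
```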
Nonlinear data description with Principal Polynomial Analysis
Principal Component Analysis (PCA) has been widely used for manifold description and dimensionality reduction. Performance of PCA is however hampered when data exhibits nonlinear feature relations. In this work, we propose a new framework for manifold learning based on the use of a sequence of Principal Polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) is shown to generalize PCA. Unlike recently proposed nonlinear methods (e.g. spectral/kernel methods, projection pursuit techniques, and neural networks), PPA features are easily interpretable and the method leads to a fully invertible transform, which is a desirable property…
Adaptive motion estimation and video vector quantization based on spatiotemporal non-linearities of human perception
The two main tasks of a video coding system are motion estimation and vector quantization of the signal. In this work a new splitting criterion to control the adaptive decomposition for the non-uniform optical flow estimation is presented. Also, a novel bit allocation procedure is proposed for the quantization of the DCT transform of the video signal. These new approaches are founded on a perception model that reproduces the relative importance given by the human visual system to any location in the spatial frequency, temporal frequency and amplitude domain of the DCT transform. The experiments show that the proposed procedures behave better than their equivalent (fixed-block-size motion estim…
Characterization of the human visual system threshold performance by a weighting function in the Gabor domain
As evidenced by many physiological and psychophysical reports, the receptive fields of the first-stage set of mechanisms of the visual process are well fitted by two-dimensional (2D) compactly supported harmonic functions. The application of this set of band-pass filter functions to the input signal implies that the visual system carries out some kind of conjoint space/spatial frequency transform. Assuming that a conjoint transform is carried out, we present in this paper a new characterization of the visual system performance by means of a weighting function in the conjoint domain. We have called this weighting function (in the particular case of the Gabor transform) the Gabor stimuli Sensiti…
Graph matching for efficient classifiers adaptation
In this work we present an adaptation algorithm focused on the description of the measurement changes under different acquisition conditions. The adaptation is carried out by transforming the manifold in the first observation conditions into the corresponding manifold in the second. The possibly non-linear transform is based on vector quantization and graph matching. The transfer learning mapping is defined in an unsupervised manner. Once this mapping has been defined, the labeled samples in the first domain are projected into the second domain, thus allowing the application of any classifier in the transformed domain. Experiments on VHR series of images show the validity of the proposed method …
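A simplified sketch of such a mapping: vector quantization of both domains with k-means, a minimal-cost matching of centroids (the Hungarian assignment below is a stand-in for the paper's graph matching), and a piecewise translation of the samples:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

def domain_mapping(X_src, X_tgt, k=8, seed=0):
    """Unsupervised source->target mapping: VQ plus centroid matching."""
    km_s = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_src)
    km_t = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_tgt)
    cost = np.linalg.norm(km_s.cluster_centers_[:, None] -
                          km_t.cluster_centers_[None], axis=-1)
    rows, cols = linear_sum_assignment(cost)      # minimal-cost matching
    shift = km_t.cluster_centers_[cols] - km_s.cluster_centers_[rows]
    return lambda X: X + shift[km_s.predict(X)]   # per-cluster translation

rng = np.random.default_rng(3)
X1 = rng.normal(size=(400, 2))
X2 = X1 + np.array([2.0, -1.0]) + 0.1 * rng.normal(size=(400, 2))
to_target = domain_mapping(X1, X2)
print('mean residual after mapping:',
      np.mean(np.linalg.norm(to_target(X1) - X2, axis=1)))
```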
Color illusions also deceive CNNs for low-level vision tasks: Analysis and implications.
The study of visual illusions has proven to be a very useful approach in vision science. In this work we start by showing that, while convolutional neural networks (CNNs) trained for low-level visual tasks on natural images may be deceived by brightness and color illusions, some network illusions can be inconsistent with human perception. Next, we analyze where these similarities and differences may come from. On the one hand, the proposed linear eigenanalysis explains the overall similarities: in simple CNNs trained for tasks like denoising or deblurring, the linear version of the network has center-surround receptive fields, and global transfer functions are very similar to the human …
The role of perceptual contrast non-linearities in image transform quantization
The conventional quantizer design based on average error minimization over a training set does not guarantee a good subjective behavior on individual images even if perceptual metrics are used. In this work a novel criterion for transform coder design is analyzed in depth. Its aim is to bound the perceptual distortion in each individual quantization according to a non-linear model of early human vision. A common comparison framework is presented to describe the qualitative behavior of the optimal quantizers under the proposed criterion and the conventional rate-distortion based criterion. Several underlying metrics, with and without perceptual non-linearities, are used with both cr…
Computing variations of entropy and redundancy under nonlinear mappings not preserving the signal dimension: quantifying the efficiency of V1 cortex
In computational neuroscience, the Efficient Coding Hypothesis argues that the neural organization comes from the optimization of information-theoretic goals [Barlow Proc.Nat.Phys.Lab.59]. A way to confirm this requires the analysis of the statistical performance of biological systems that have not been statistically optimized [Renart et al. Science10, Malo&Laparra Neur.Comp.10, Foster JOSA18, Gomez-Villa&Malo J.Neurophysiol.19]. However, when analyzing the information-theoretic performance, cortical magnification in the retina-cortex pathway poses a theoretical problem. Cortical magnification refers to the increase in signal dimensionality [Cowey&Rolls Exp. Brain Res.74]. Conventional mo…
Perceptually weighted optical flow for motion-based segmentation in MPEG-4 paradigm
In the MPEG-4 paradigm, the sequence must be described in terms of meaningful objects. This meaningful, high-level representation should emerge from low-level primitives such as optical flow and prediction error which are the basic elements of previous-generation video coders. The accuracy of the high-level models strongly depends on the robustness of the primitives used. It is shown how perceptual weighting in optical flow computation gives rise to better motion estimates which consistently improve motion-based segmentation compared to equivalent unweighted motion estimates.
V1 non-linear properties emerge from local-to-global non-linear ICA
It has been argued that the aim of non-linearities in different visual and auditory mechanisms may be to remove the relations between the coefficients of the signal after global linear ICA-like stages. Specifically, in Schwartz and Simoncelli (2001), it was shown that masking effects are reproduced by fitting the parameters of a particular non-linearity in order to remove the dependencies between the energy of wavelet coefficients. In this work, we present a different result that supports the same efficient encoding hypothesis. However, this result is more general because, instead of assuming any specific functional form for the non-linearity, we show that by using an unconstrained approach…
Subjective image fidelity metric based on bit allocation of the human visual system in the DCT domain
Until now, subjective image distortion measures have partially used diverse empirical facts concerning human perception: non-linear perception of luminance, masking of the impairments by a highly textured surround, linear filtering by the threshold contrast frequency response of the visual system, and non-linear post-filtering amplitude corrections in the frequency domain. In this work, we develop a frequency and contrast dependent metric in the DCT domain using a fully non-linear and suprathreshold contrast perception model: the Information Allocation Function (IAF) of the visual system. It is derived from experimental data about frequency and contrast incremental thresholds and it is cons…
Dimensionality reduction via regression on hyperspectral infrared sounding data
This paper introduces a new method for dimensionality reduction via regression (DRR). The method generalizes Principal Component Analysis (PCA) in such a way that reduces the variance of the PCA scores. In order to do so, DRR relies on a deflationary process in which a non-linear regression reduces the redundancy between the PC scores. Unlike other nonlinear dimensionality reduction methods, DRR is easy to apply, it has out-of-sample extension, it is invertible, and the learned transformation is volume-preserving. These properties make the method useful for a wide range of applications, especially in very high dimensional data in general, and for hyperspectral image processing in particular…
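A minimal DRR sketch: PCA first, then a polynomial regression predicts the discarded scores from the retained ones, and only the residuals are kept, which is what reduces the variance of those scores. The degree and the toy data are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

def drr(X, n_keep, degree=2):
    """Dimensionality Reduction via Regression (sketch)."""
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    S = Xc @ Vt.T                                   # PCA scores
    lead, rest = S[:, :n_keep], S[:, n_keep:]
    P = PolynomialFeatures(degree).fit_transform(lead)
    resid = rest - LinearRegression().fit(P, rest).predict(P)
    return lead, rest, resid

t = np.random.randn(800)
X = np.c_[2 * t,                                    # dominant direction
          0.4 * t ** 2 + 0.05 * np.random.randn(800),
          0.3 * np.random.randn(800)]
lead, rest, resid = drr(X, n_keep=1)
print('discarded-score variances:', rest.var(0))
print('after regression         :', resid.var(0))
```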
Principal polynomial analysis for remote sensing data processing
Inspired by the concept of Principal Curves, in this paper, we define Principal Polynomials as a non-linear generalization of Principal Components to overcome the conditional mean independence restriction of PCA. Principal Polynomials deform the straight Principal Components by minimizing the regression error (or variance) in the corresponding orthogonal subspaces. We propose to use a projection on a series of these polynomials to set a new nonlinear data representation: the Principal Polynomial Analysis (PPA). We prove that the dimensionality reduction error in PPA is always lower than in PCA. Lower truncation error and increased independence suggest that unsupervised PPA features can be b…
Lossless coding of hyperspectral images with principal polynomial analysis
The transform in image coding aims to remove redundancy among data coefficients so that they can be independently coded, and to capture most of the image information in few coefficients. While the second goal ensures that discarding coefficients will not lead to large errors, the first goal ensures that simple (point-wise) coding schemes can be applied to the retained coefficients with optimal results. Principal Component Analysis (PCA) provides the best independence and data compaction for Gaussian sources. Yet, non-linear generalizations of PCA may provide better performance for more realistic non-Gaussian sources. Principal Polynomial Analysis (PPA) generalizes PCA by removing the non-li…
The Wilson-Cowan model describes Contrast Response and Subjective Distortion
Divisive normalization image quality metric revisited.
Structural similarity metrics and information-theory-based metrics have been proposed as completely different alternatives to the traditional metrics based on error visibility and human vision models. Three basic criticisms were raised against the traditional error visibility approach: (1) it is based on near-threshold performance, (2) its geometric meaning may be limited, and (3) stationary pooling strategies may not be statistically justified. These criticisms and the good performance of structural and information-theory-based metrics have popularized the idea of their superiority over the error visibility approach. In this work we experimentally or analytically show that the above critic…
Linear transform for simultaneous diagonalization of covariance and perceptual metric matrix in image coding
Two types of redundancies are contained in images: statistical redundancy and psychovisual redundancy. Image representation techniques for image coding should remove both redundancies in order to obtain good results. In order to establish an appropriate representation, the standard approach to transform coding only considers the statistical redundancy, whereas the psychovisual factors are introduced after the selection of the representation as a simple scalar weighting in the transform domain. In this work, we take into account the psychovisual factors in the definition of the representation together with the statistical factors, by means of the perceptual metric and the covariance matrix, res…
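Given a covariance Sigma and a perceptual metric M, one linear transform that diagonalizes both is given by a generalized eigendecomposition: scipy returns eigenvectors B with B.T @ Sigma @ B = I and B.T @ M @ B diagonal. The matrices below are random placeholders, not measured statistics or a fitted metric:

```python
import numpy as np
from scipy.linalg import eigh

# Toy covariance (statistical) and perceptual metric (psychovisual) matrices.
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)); Sigma = A @ A.T + 4 * np.eye(4)
C = rng.normal(size=(4, 4)); M = C @ C.T + np.eye(4)

# Generalized eigenproblem M v = w Sigma v: the eigenvector matrix B
# simultaneously whitens Sigma and diagonalizes M.
w, B = eigh(M, Sigma)
print(np.allclose(B.T @ Sigma @ B, np.eye(4)))   # statistical decorrelation
print(np.allclose(B.T @ M @ B, np.diag(w)))      # perceptual diagonalization
```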
Regularization operators for natural images based on nonlinear perception models.
Image restoration requires some a priori knowledge of the solution. Some of the conventional regularization techniques are based on the estimation of the power spectrum density. Simple statistical models for spectral estimation just take into account second-order relations between the pixels of the image. However, natural images exhibit additional features, such as particular relationships between local Fourier or wavelet transform coefficients. Biological visual systems have evolved to capture these relations. We propose the use of this biological behavior to build regularization operators as an alternative to simple statistical models. The results suggest that if the penalty operator take…
Channel Capacity in Psychovisual Deep-Nets: Gaussianization Versus Kozachenko-Leonenko
In this work, we quantify how neural networks designed from biology using no statistical training have a remarkable performance in information-theoretic terms. Specifically, we address the question of the amount of information that can be extracted about the images from the different layers of psychophysically tuned deep networks. We show that analytical approaches are not possible, and we propose the use of two empirical estimators of capacity: the classical Kozachenko-Leonenko estimator and a recent estimator based on Gaussianization. Results show that networks purely based on visual psychophysics are extremely efficient in two aspects: (1) the internal representation of these networks dup…
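A minimal version of the classical Kozachenko-Leonenko estimator used in the comparison, checked on a case with known entropy:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(X, k=3):
    """Kozachenko-Leonenko differential entropy estimate (nats).

    H ~ psi(n) - psi(k) + log(V_d) + (d/n) * sum_i log(eps_i),
    with eps_i the distance from sample i to its k-th neighbor and
    V_d the volume of the d-dimensional unit ball.
    """
    n, d = X.shape
    eps = cKDTree(X).query(X, k=k + 1)[0][:, -1]    # k-th neighbor (skip self)
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(eps))

X = np.random.randn(5000, 2)
true_h = np.log(2 * np.pi * np.e)                   # entropy of N(0, I_2)
print('KL estimate:', kl_entropy(X), ' analytic:', true_h)
```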
Implementations of a novel algorithm for colour constancy
In agreement with the principles of the relativistic model proposed by Creutzfeldt et al., with the photometric rule (lightness anchoring rule) and with the influence of simultaneous contrast in the appearance of a visual scene, we propose a first-stage mechanism yielding substantial colour constancy. We have defined a set of first-stage colour descriptors, and to test their utility, we have performed a simulation using a Machine Vision System (MVS). The statistical stability of the descriptors for Munsell samples under different illuminants is good.
Image quality metric based on multidimensional contrast perception models
The procedure to compute the subjective difference between two input images should be equivalent to a straightforward difference between their perceived versions; hence reliable subjective difference metrics must be founded on a proper perception model. For image distortion evaluation purposes, perception can be considered as a set of signal transforms that successively map the original image in the spatial domain into a feature and a response space. The properties of the spatial pattern analyzers involved in these transforms determine the geometry of these different signal representation domains. In this work the general relations between the sensitivity of the human visual system…
Psychophysics of Artificial Neural Networks Questions Classical Hue Cancellation Experiments
We show that classical hue cancellation experiments lead to human-like opponent curves even if the task is done by trivial (identity) artificial networks. Specifically, human-like opponent spectral sensitivities always emerge in artificial networks as long as (i) the retina converts the input radiation into any tristimulus-like representation, and (ii) the post-retinal network solves the standard hue cancellation task, e.g. the network looks for the weights of the cancelling lights so that every monochromatic stimulus plus the weighted cancelling lights match a grey reference in the (arbitrary) color representation used by the network. In fact, the specific cancellation lights (and not the …
Spatio-Chromatic Adaptation via Higher-Order Canonical Correlation Analysis of Natural Images
Independent component and canonical correlation analysis are two general-purpose statistical methods with wide applicability. In neuroscience, independent component analysis of chromatic natural images explains the spatio-chromatic structure of primary cortical receptive fields in terms of properties of the visual environment. Canonical correlation analysis similarly explains chromatic adaptation to different illuminations. But, as we show in this paper, neither of the two methods generalizes well to explain both spatio-chromatic processing and adaptation at the same time. We propose a statistical method which combines the desirable properties of independent component and canonical correlat…
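For reference, the classical second-order CCA the abstract builds on, sketched on two synthetic "views" sharing latent components; the proposed higher-order combination with ICA is not reproduced here:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Two views of the same scenes (e.g. patches under two illuminants),
# simulated as a shared latent signal seen through different mixings.
rng = np.random.default_rng(4)
latent = rng.normal(size=(500, 2))
A, B = rng.normal(size=(2, 4)), rng.normal(size=(2, 4))
X = latent @ A + 0.1 * rng.normal(size=(500, 4))
Y = latent @ B + 0.1 * rng.normal(size=(500, 4))

# CCA recovers the shared components as maximally correlated projections.
cca = CCA(n_components=2).fit(X, Y)
U, V = cca.transform(X, Y)
print([np.corrcoef(U[:, i], V[:, i])[0, 1] for i in range(2)])
```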
Corresponding-pair procedure: a new approach to simulation of dichromatic color perception.
The dichromatic color appearance of a chromatic stimulus T can be described if a stimulus S is found that verifies that a normal observer experiences the same sensation viewing S as a dichromat viewing T. If dichromatic and normal versions of the same color vision model are available, S can be computed by applying the inverse of the normal model to the descriptors of T obtained with the dichromatic model. We give analytical form to this algorithm, which we call the corresponding-pair procedure. The analytical form highlights the requisites that a color vision model must verify for this procedure to be used. To show the capabilities of the method, we apply the algorithm to different color vi…
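The procedure itself is one line once the two model versions are available: S = N^{-1}(D(T)). A sketch with a toy linear "color vision model" (the matrices are illustrative placeholders, not a validated model):

```python
import numpy as np

# Toy linear models: descriptors_normal(S) = N @ S and
# descriptors_dichromat(T) = D @ T, so the corresponding pair is
# S = N^{-1} (D @ T).  N and D are illustrative 3x3 matrices only.
N = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.1, 0.8]])
D = N.copy()
D[1] = N[0]                     # crude "missing mechanism": M duplicates L

def corresponding_pair(T):
    return np.linalg.solve(N, D @ T)

T = np.array([0.3, 0.6, 0.2])   # test stimulus (arbitrary tristimulus units)
print('stimulus S matching the dichromatic appearance of T:',
      corresponding_pair(T))
```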
Complex-Valued Independent Component Analysis of Natural Images
Linear independent component analysis (ICA) learns simple cell receptive fields from natural images. Here, we show that linear complex-valued ICA learns complex cell properties from Fourier-transformed natural images, i.e. two Gabor-like filters with quadrature-phase relationship. Conventional methods for complex-valued ICA assume that the phases of the output signals have uniform distribution. We show here that for natural images the phase distributions are, however, often far from uniform. We thus relax the uniformity assumption and model also the phase of the sources in complex-valued ICA. Compared to the original complex ICA model, the new model provides a better fit to the data, and leads…
Canonical Retina-to-Cortex Vision Model Ready for Automatic Differentiation
Canonical vision models of the retina-to-V1 cortex pathway consist of cascades of several Linear+Nonlinear layers. In this setting, parameter tuning is the key to obtain a sensible behavior when putting all these multiple layers to work together. Conventional tuning of these neural models very much depends on the explicit computation of the derivatives of the response with respect to the parameters. And, in general, this is not an easy task. Automatic differentiation is a tool developed by the deep learning community to solve similar problems without the need of explicit computation of the analytic derivatives. Therefore, implementations of canonical visual neuroscience models that are ready…
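A minimal sketch of the point using jax: a toy two-layer Linear+Nonlinear cascade whose parameter gradients come from automatic differentiation instead of hand-derived expressions. The architecture, pooling, and values are illustrative assumptions:

```python
import jax
import jax.numpy as jnp

def lnl_cascade(params, x):
    """Two Linear+Nonlinear layers with a toy divisive normalization."""
    for W, b in params:
        y = W @ x
        e = jnp.abs(y) ** 0.6
        x = jnp.sign(y) * e / (b + jnp.mean(e))   # toy normalization pool
    return x

key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
params = [(jax.random.normal(k1, (8, 8)) / 8, 0.1),
          (jax.random.normal(k2, (8, 8)) / 8, 0.1)]
x = jax.random.normal(kx, (8,))

# Autodiff replaces the explicit derivative computation the abstract
# mentions: gradient of the response energy w.r.t. all parameters at once.
loss = lambda p: jnp.sum(lnl_cascade(p, x) ** 2)
grads = jax.grad(loss)(params)
print(jax.tree_util.tree_map(jnp.shape, grads))
```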
Nonlinearities and Adaptation of Color Vision from Sequential Principal Curves Analysis
Mechanisms of human color vision are characterized by two phenomenological aspects: the system is nonlinear and adaptive to changing environments. Conventional attempts to derive these features from statistics use separate arguments for each aspect. The few statistical explanations that do consider both phenomena simultaneously follow parametric formulations based on empirical models. Therefore, it may be argued that the behavior does not come directly from the color statistics but from the convenient functional form adopted. In addition, many times the whole statistical analysis is based on simplified databases that disregard relevant physical effects in the input signal, as, for instance…
What motion information is perceptually relevant?
Importance of quantiser design compared to optimal multigrid motion estimation in video coding
Adaptive flow computation and DCT quantisation play complementary roles in motion compensated video coding schemes. Since the introduction of the intuitive entropy-constrained motion estimation of Dufaux et al. (1995), several optimal variable-size block matching algorithms have been proposed. Many of these approaches put forward their intrinsic optimality, but the corresponding visual effect has not been explored. The relative importance of optimal multigrid motion estimation with regard to quantisation is addressed in the context of MPEG-like coding. It is shown that while simpler (suboptimal) motion estimates give subjective results as good as the optimal motion estimates, small enhancem…
The Brain’s Camera. Optimal Algorithms for Wiring the Eye to the Brain Shape How We See
The problem of sending information at long distances, without significant attenuation and at a low cost, is common to both artificial and natural environments. In the brain, a widespread strategy to solve the cost-efficiency trade-off in long distance communication is the presence of convergent pathways, or bottlenecks. In the visual system, for example, to preserve resolution, information is acquired by a first layer with a large number of neurons (the photoreceptors in the retina) and then compressed into a much smaller number of units in the output layer (the retinal ganglion cells), to send that information to the brain at the lowest possible metabolic cost. Recently, we found experimen…
The Maximum Differentiation competition depends on the Viewing Conditions
Disentangling the Link Between Image Statistics and Human Perception
In the 1950s Horace Barlow and Fred Attneave suggested a connection between sensory systems and how they are adapted to the environment: early vision evolved to maximise the information it conveys about incoming signals. Following Shannon's definition, this information was described using the probability of the images taken from natural scenes. Previously, direct accurate predictions of image probabilities were not possible due to computational limitations. Although the exploration of this idea was therefore indirect, mainly based on oversimplified models of the image density or on system design methods, these methods had success in reproducing a wide range of physiological and psychophysical phenom…
Dimensionality Reduction via Regression in Hyperspectral Imagery
This paper introduces a new unsupervised method for dimensionality reduction via regression (DRR). The algorithm belongs to the family of invertible transforms that generalize Principal Component Analysis (PCA) by using curvilinear instead of linear features. DRR identifies the nonlinear features through multivariate regression to ensure the reduction in redundancy between the PCA coefficients, the reduction of the variance of the scores, and the reduction in the reconstruction error. More importantly, unlike other nonlinear dimensionality reduction methods, the invertibility, volume-preservation, and straightforward out-of-sample extension make DRR interpretable and easy to apply. The pro…
Non-linear Invertible Representation for Joint Statistical and Perceptual Feature Decorrelation
The aim of many image mappings is representing the signal in a basis of decorrelated features. Two fundamental aspects must be taken into account in the basis selection problem: data distribution and the qualitative meaning of the underlying space. The classical PCA techniques reduce the statistical correlation using the data distribution. However, in applications where human vision has to be taken into account, there are perceptual factors that make the feature space uneven, and additional interaction among the dimensions may arise. In this work a common framework is presented to analyse the perceptual and statistical interactions among the coefficients of any representation. Using a recen…
Perceptual adaptive insensitivity for support vector machine image coding.
Support vector machine (SVM) learning has been recently proposed for image compression in the frequency domain using a constant ε-insensitivity zone by Robinson and Kecman. However, according to the statistical properties of natural images and the properties of human perception, a constant insensitivity makes sense in the spatial domain but it is certainly not a good option in the frequency domain. In fact, in their approach, they made a fixed low-pass assumption as the number of discrete cosine transform (DCT) coefficients to be used in the training was limited. This paper extends the work of Robinson and Kecman by proposing the use of adaptive insensitivity SVMs [2] for image coding u…