AUTHOR

David Cox

A Comment on the Coefficient of Determination for Binary Responses

Linear logistic or probit regression can be closely approximated by an unweighted least squares analysis of the regression linear in the conditional probabilities, provided that these probabilities for success and failure are not too extreme. It is shown how this restriction on the probabilities translates into a restriction on the range of the coefficient of determination R², so that, as a consequence, R² is not suitable for judging the effectiveness of linear regressions with binary responses, even if an important relation is present.
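The restriction on the range of the coefficient of determination can be illustrated numerically. The sketch below is not taken from the paper: it assumes, purely for illustration, that "not too extreme" means success probabilities between 0.2 and 0.8, and it uses the standard variance decomposition var(Y) = E[p(1 - p)] + var(p) for a binary response Y with conditional success probability p. The function name population_r2 is hypothetical.

```python
def population_r2(probs, weights=None):
    """Share of var(Y) explained by the conditional probabilities.

    probs   -- conditional success probabilities P(Y = 1 | x), one per x value
    weights -- probabilities of the x values (equal weights if omitted)
    """
    n = len(probs)
    if weights is None:
        weights = [1.0 / n] * n
    mean_p = sum(w * p for w, p in zip(weights, probs))
    # Variance of the conditional probabilities (the "explained" part).
    var_between = sum(w * (p - mean_p) ** 2 for w, p in zip(weights, probs))
    # Total variance of a binary response with success probability mean_p.
    var_total = mean_p * (1.0 - mean_p)
    return var_between / var_total


# Assumed range 0.2 to 0.8: the most favourable design puts half the
# units at each end, and even then the value is only about 0.36.
r2_moderate = population_r2([0.2, 0.8])

# Allowing more extreme probabilities raises the ceiling (about 0.81 here).
r2_extreme = population_r2([0.05, 0.95])
```

Under these assumptions a strong relation, in which the success probability moves from 0.2 to 0.8, still gives a value of at most about 0.36, which is the kind of ceiling the abstract describes.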

research product

On Association Models Defined over Independence Graphs

Conditions on joint distributions are given under which two variables will be conditionally associated whenever an independence graph does not imply a corresponding conditional independence statement. To this end the notions of parametric cancellation, of stable paths and of quasi-linear models are discussed in some detail.

research product

An approximation to maximum likelihood estimates in reduced models

An approximation to the maximum likelihood estimates of the parameters in a model can be obtained from the corresponding estimates and information matrices in an extended model, i.e. a model with additional parameters. The approximation is close provided that the data are consistent with the first model. Applications are described to log linear models for discrete data, to models for multivariate normal distributions with special covariance matrices and to mixed discrete-continuous models.

research product

Derived variables calculated from similar joint responses: some characteristics and examples

A technique (Cox and Wermuth, 1992) is reviewed for finding linear combinations of a set of response variables having special relations of linear conditional independence with a set of explanatory variables. A theorem in linear algebra is used both to examine conditions in which the derived variables take a specially simple form and to reduce computations. Examples are discussed of medical and psychological investigations in which the method has aided interpretation.

research product

Tests of Linearity, Multivariate Normality and the Adequacy of Linear Scores

After some discussion of the purposes of testing multivariate normality, the paper concentrates on two different approaches to testing linearity: on repeated regression tests of non-linearity and on exploiting properties of a dichotomized normal distribution. Regression tests of linearity are used to examine the adequacy of linear scoring systems for explanatory variables, initially recorded on an ordinal scale. Examples from recent psychological and medical research are given in which the methods have led to some insight into subject-matter.

research product

On the calculation of derived variables in the analysis of multivariate responses

The multivariate regression of a p × 1 vector Y of random variables on a q × 1 vector X of explanatory variables is considered. It is assumed that linear transformations of the components of Y can be the basis for useful interpretation, whereas the components of X have strong individual identity. When p ≥ q a transformation is found to a new q × 1 vector of responses Y* such that in the multiple regression of, say, Y1* on X, only the coefficient of X1 is nonzero, i.e. such that Y1* is conditionally independent of X2, …, Xq, given X1. Some associated inferential procedures are sketched. An illustrative example is described in which the resulting transformation has aided interpretation.
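The idea of a transformation that leaves only one nonzero regression coefficient per derived response can be sketched for the simplest case p = q = 2. This is an illustration under stated assumptions, not the paper's construction: the coefficient matrix B is hypothetical, the regression is taken as E(Y | X) = BX, and the transformation used is simply A = B⁻¹, so that Y* = AY has coefficient matrix AB equal to the identity. The helper names invert_2x2 and matmul are introduced here.

```python
def invert_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(left, right):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(left[i][k] * right[k][j] for k in range(len(right)))
         for j in range(len(right[0]))]
        for i in range(len(left))
    ]


# Hypothetical regression coefficients of (Y1, Y2) on (X1, X2).
B = [[2.0, 1.0],
     [1.0, 3.0]]

# Transformation defining the derived responses Y* = A Y.
A = invert_2x2(B)

# Coefficients of Y* on X: off-diagonal entries vanish (up to rounding),
# so Y1* depends linearly on X1 only and Y2* on X2 only.
coef_star = matmul(A, B)
```

For p > q the matrix B is not square and a different construction is needed; that case is not reproduced here.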

research product

Response models for mixed binary and quantitative variables

A number of special representations are considered for the joint distribution of qualitative, mostly binary, and quantitative variables. In addition to the conditional Gaussian models and to conditional Gaussian regression chain models some emphasis is placed on models derived from an underlying multivariate normal distribution and on models in which discrete probabilities are specified linearly in terms of unknown parameters. The possibilities for choosing between the models empirically are examined, as well as the testing of independence and conditional independence and the estimation of parameters. Often the testing of independence is exactly or nearly the same for a number of di…

research product

Graphical Models for Dependencies and Associations

The role of graphical representations is described in distinguishing various special forms of independency structure that can arise with multivariate data, especially in observational studies in the social sciences. Conventions for constructing the graphs and strategies for analysing three sets of data are summarized. Finally some directions for desirable future work are outlined.

research product

Causal diagrams for empirical research

research product

Causal Inference and Statistical Fallacies

Fallacies are defined as plausible-seeming arguments that give the wrong conclusion. The article concentrates on those with some connection with causality. The classical definition of causality involving a necessary and sufficient condition for an effect is rejected and three possible definitions discussed. The first is that of a statistical association that cannot be explained away as the effect of admissible alternative features. To make this more precise, Markov graphical representations are introduced and the important distinction between pairs of variables on an equal footing and those in a potential explanatory-response relation described. The roles of unobserved confounders and of ra…

research product

Statistical Dependence and Independence

Statistical dependence is a type of relation between different characteristics measured on the same units. At one extreme is deterministic dependence; at the other is statistical independence, where the distribution of one variable is the same for all levels of the other. With more than two variables, an important distinction is between marginal and conditional dependence. In many contexts, the degree of dependence may be summarized by a suitable measure of association, perhaps as part of a general model. Reference is made to graphical models.

Keywords: association; correlation; marginal; conditional; exponential family; graphical Markov models
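The definition of statistical independence given in this entry can be checked directly for two discrete variables. The sketch below is an illustration with hypothetical joint probability tables, not material from the article: it tests whether every joint probability equals the product of its row and column marginals, which is equivalent to the distribution of one variable being the same at all levels of the other. The function name is_independent is introduced here.

```python
def is_independent(joint, tol=1e-12):
    """True if a two-way table of joint probabilities factorises
    into the product of its row and column marginals."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    return all(
        abs(joint[i][j] - row_marginals[i] * col_marginals[j]) <= tol
        for i in range(len(row_marginals))
        for j in range(len(col_marginals))
    )


# Hypothetical table with marginals (0.4, 0.6) and (0.3, 0.7):
# every cell is the product of its marginals, so the variables are independent.
independent_table = [[0.12, 0.28],
                     [0.18, 0.42]]

# Hypothetical table with mass concentrated on the diagonal: dependent.
dependent_table = [[0.30, 0.10],
                   [0.10, 0.50]]

result_independent = is_independent(independent_table)
result_dependent = is_independent(dependent_table)
```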

research product