  • Correlation and regression are statistical methods for examining the linear relationship between two numerical variables measured on the same subjects. Correlation describes the relationship; regression both describes the relationship and predicts an outcome.
  • Correlation coefficients range from –1 to +1; both extremes indicate a perfect linear relationship between two variables. A correlation equal to 0 indicates no linear relationship.
  • Scatterplots provide a visual display of the relationship between two numerical variables and are recommended to check for a linear relationship and extreme values.
  • The coefficient of determination, or r², is simply the squared correlation; it is the preferred statistic to describe the strength of the relationship between two numerical variables.
  • The t test can be used to test the hypothesis that the population correlation is zero.
  • The Fisher z transformation is used to form confidence intervals for the correlation or to test any hypotheses about the value of the correlation.
  • The Fisher z transformation can also be used to form confidence intervals for the difference between correlations in two independent groups.
  • It is possible to test whether the correlation between one variable and a second is the same as the correlation between a third variable and that same second variable.
  • When one or both of the variables in a correlation analysis are skewed, the Spearman rho nonparametric correlation is advised.
  • Linear regression is called linear because it measures only straight-line relationships.
  • The least squares method is the one used in almost all regression examples in medicine. With one independent and one dependent variable, the regression equation can be given as a straight line.
  • The standard error of the estimate is a statistic that can be used to test hypotheses or form confidence intervals about both the intercept and the regression coefficient (slope).
  • One important use of regression is to be able to predict outcomes in a future group of subjects.
  • When predicting outcomes, the confidence limits are called confidence bands about the regression line. The most accurate predictions are for values close to the mean of the independent variable X; predictions become less precise as X departs from its mean.
  • It is possible to test whether the regression line is the same (ie, has the same slope and intercept) in two different groups.
  • A residual is the difference between the actual and the predicted outcome; looking at the distribution of residuals helps statisticians decide if the linear regression model is the best approach to analyzing the data.
  • Regression toward the mean can result in a treatment or procedure appearing to be of value when it has had no actual effect; having a control group helps to guard against this problem.
  • Correlation and regression should not be used unless observations are independent; it is not appropriate to include multiple measurements of the same subjects.
  • Mixing two populations can also cause the correlation and the regression coefficient to be larger than they should be.
  • The use of correlation versus regression should be dictated by the purpose of the research—whether it is to establish ...
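
To make the correlation concepts above concrete, the following is a minimal Python sketch of the Pearson correlation, the coefficient of determination, the t test of a zero population correlation, and a confidence interval based on the Fisher z transformation. The function names, the sample data, and the fixed critical value of 1.96 (an approximate 95% interval) are illustrative assumptions, not taken from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for H0: population correlation = 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)            # z = 0.5 * ln((1 + r) / (1 - r))
    se = 1 / math.sqrt(n - 3)    # standard error on the z scale
    # Transform the limits back to the correlation scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical paired measurements on the same eight subjects.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.4, 7.8]

r = pearson_r(x, y)
n = len(x)
low, high = fisher_ci(r, n)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
print(f"t = {t_statistic(r, n):.2f} with {n - 2} df")
print(f"Approximate 95% CI for the correlation: ({low:.3f}, {high:.3f})")
```

The t statistic is compared with the t distribution with n – 2 degrees of freedom. The interval is formed on the z scale, where the sampling distribution is approximately normal, and then transformed back to the correlation scale, which is why it is generally not symmetric about r.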

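The regression concepts above (the least squares line, residuals, the standard error of the estimate, and the widening of confidence bands away from the mean of X) can be sketched in the same way. This is a minimal illustration under stated assumptions: the function names and data are hypothetical, and only the standard error of the predicted mean outcome is computed, since a full confidence band would also need a critical value from the t distribution.

```python
import math

def fit_line(x, y):
    """Least squares intercept a and slope b for the line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def residuals(x, y, a, b):
    """Actual minus predicted outcome for each subject."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def se_estimate(res):
    """Standard error of the estimate: sqrt(sum of squared residuals / (n - 2))."""
    return math.sqrt(sum(e ** 2 for e in res) / (len(res) - 2))

def se_mean_prediction(x, s_yx, x_new):
    """Standard error of the predicted mean outcome at x_new."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return s_yx * math.sqrt(1 / n + (x_new - mean_x) ** 2 / sxx)

# Hypothetical paired measurements on the same eight subjects.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.4, 7.8]

a, b = fit_line(x, y)
res = residuals(x, y, a, b)
s_yx = se_estimate(res)
print(f"Predicted Y = {a:.3f} + {b:.3f} * X")
print(f"Standard error of the estimate = {s_yx:.3f}")
for x_new in (4.5, 8.0):  # 4.5 is the mean of X; 8.0 is at the edge of the data
    print(f"SE of predicted mean at X = {x_new}: {se_mean_prediction(x, s_yx, x_new):.3f}")
```

The standard error of the prediction is smallest at the mean of X and grows with the squared distance from it, which is what gives confidence bands their curved shape around the regression line.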