++
Describe how binary logistic regression and Cox regression are commonly used in the drug literature
Distinguish the nature of the outcome variable for logistic regression and Cox regression compared to linear regression
Describe and appropriately interpret the coefficient estimates and predictions produced by binary logistic regression and Cox regression
Describe how the coefficient estimates and predictions produced by binary logistic regression and Cox regression can be used to evaluate a research hypothesis
Interpret an estimated survival curve
Evaluate a Kaplan-Meier plot comparing the survival curves of two different groups
++
Binary logistic regression
Censoring
Cox proportional hazards regression model
Hazard
Hazard function
Hazard ratio (HR)
Kaplan-Meier method
Logit
Log-rank test
Maximum likelihood estimation (MLE)
Odds ratio (OR)
Survival analysis
Survival curve
Survival function
++
In general, the linear regression model is used when the dependent variable or outcome variable of interest is a continuous variable, such as glycosylated hemoglobin (A1C) or systolic blood pressure. However, in clinical research, many outcome variables of interest cannot be conceptualized as being continuous. In some cases, the outcome variable may be categorical (e.g., a dichotomous or binary variable, such as whether or not one has a disease), while in many other situations the outcome variable may be the time until an event occurs and it is possible that the researcher may not know when (or whether) the event occurs for everyone in the study.
++
The purpose of this chapter is to discuss additional methods of regression analysis that are appropriate for such situations. This chapter begins with a discussion of the methodology behind, and appropriate use of, logistic regression for the analysis of an outcome variable that is binary (or dichotomous). The second part of the chapter provides an overview of two different techniques that fall under the general umbrella of survival analysis, the Kaplan-Meier method and Cox regression. Both are widely used in the clinical literature when the outcome variable is the time until the occurrence of an event.
+++
BINARY LOGISTIC REGRESSION
++
In Chapter 11, “Simple and Multiple Linear Regression,” the case scenario involved the prediction of patient A1C levels, a continuous variable (see http://biostat.mc.vanderbilt.edu/wiki/Main/DataSets).1,2 From a clinical perspective, the linear regression model using ordinary least squares (OLS) estimation has a limitation. Although higher A1C levels do indicate a more pronounced onset of diabetes, when diagnosing patients there must be some practical cutoff, above which a patient “has” diabetes, and below which the patient “does not have” diabetes. Most pharmacotherapy guidelines use an A1C level of seven as this cutoff point.3 It is conceivable to characterize this discrete decision by creating a new variable in the diabetes data set that takes a value of one for patients whose A1C levels exceed or equal seven and a value of zero ...