Methods Matter: Misinterpreting Covariate Coefficients from Multivariable Models
July 5, 2023

This blog is the second part of a two-part series on covariates in real-world evidence (RWE) studies. Part 1 explored the concept of confounding, and why it is essential to choose covariates carefully when building a multivariable model of the relationship between an exposure and a health outcome. It described how including unnecessary variables in a multivariable model might introduce bias instead of correcting it.1

The current blog discusses issues with interpreting the covariate effects derived from a multivariable model when the researcher’s intent is to explain the relationship between an exposure and a health outcome. One such issue, described by Westreich and Greenland, is known as the “Table 2 fallacy”.2

• The Table 2 fallacy is the mistaken belief that every coefficient from an adjusted regression model can be interpreted as the total effect of its variable. Even after carefully identifying the confounding variables (covariates) to adjust for, researchers commonly interpret the covariate coefficients this way, which is not correct.

Example

Suppose researchers are interested in the effect of HIV on the 10-year risk of stroke and have identified smoking and age as potential confounders. In the assumed causal structure, age influences smoking, HIV, and stroke, and smoking influences both HIV and stroke. In this case, a multivariable logistic regression model may be fitted to estimate the effect of HIV on the log odds of stroke after controlling for the effects of age and smoking. Here, β1, β2, and β3 represent the coefficients of HIV, smoking, and age, respectively, in the fitted multivariable regression model.

It is crucial to note that the coefficients from this model cannot be interpreted in the same way, despite being mutually adjusted.

• The coefficient β1 estimates the total effect of HIV (the exposure of interest) on the log odds of stroke, provided that smoking and age are sufficient to control for confounding.
• The coefficient β2 for smoking estimates only the direct effect of smoking, that is, the part that does not go through HIV. It is therefore a biased estimate of the total effect of smoking.
• The coefficient β3 for age estimates only the direct effect of age, that is, the part that does not go through smoking or HIV. It is likewise a biased estimate of the total effect of age.

The model equation is as follows:

logit[P(stroke)] = β0 + β1(HIV) + β2(smoking) + β3(age)

where β0 is the intercept.

Care must be taken when presenting and interpreting coefficients from multivariable regression models with a primary exposure of interest and confounders, because the coefficients have different interpretations.
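The difference between direct and total effects can be illustrated with a small simulation. The sketch below is not taken from Westreich and Greenland: every probability and effect size in it is an assumed, illustrative value, and it uses a linear probability model (risk differences, fitted by ordinary least squares) instead of logistic regression so that the arithmetic stays transparent. The helper ols is a hypothetical name introduced only for this example.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda idx: abs(a[idx][c]))
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [a[i][k] for i in range(k)]


random.seed(0)
n = 200_000
full_rows, smoking_rows, stroke = [], [], []
for _ in range(n):
    # Assumed data-generating process: age -> smoking -> HIV, all three -> stroke.
    age = int(random.random() < 0.5)  # 1 = older (illustrative binary coding)
    smoking = int(random.random() < 0.2 + 0.2 * age)
    hiv = int(random.random() < 0.1 + 0.1 * age + 0.3 * smoking)
    p_stroke = 0.05 + 0.20 * hiv + 0.05 * smoking + 0.10 * age
    stroke.append(int(random.random() < p_stroke))
    full_rows.append([1, hiv, smoking, age])   # mutually adjusted model
    smoking_rows.append([1, smoking, age])     # smoking as the exposure

b0, b1, b2, b3 = ols(full_rows, stroke)
_, smoking_total, _ = ols(smoking_rows, stroke)

# Total effect of smoking implied by the assumed values:
# direct effect + (effect of smoking on HIV) * (effect of HIV on stroke).
true_total = 0.05 + 0.3 * 0.20

print(f"HIV coefficient, full model (b1):        {b1:.3f}")
print(f"Smoking coefficient, full model (b2):    {b2:.3f}")
print(f"Smoking coefficient, adjusted for age:   {smoking_total:.3f}")
print(f"Total smoking effect implied by setup:   {true_total:.3f}")
```

Under these assumed values, the smoking coefficient from the mutually adjusted model should land close to the direct effect of 0.05, while the model that adjusts only for age should land close to the total effect of 0.11. With logistic regression the same qualitative pattern is expected, although odds ratios do not decompose as neatly as risk differences.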

How to avoid misinterpretation

To avoid misinterpretation, Westreich and Greenland suggest presenting the estimated effects of the primary exposure (e.g., HIV) separately from those of secondary variables from the same model (e.g., smoking and age).2 For example:

• Limit the main results table to only the estimates of the primary exposure, and list covariates that were adjusted for in a footnote.
• If necessary, use different covariate subsets to estimate the total effects of secondary covariates. For example, build a separate model in which smoking is the exposure of interest, adjusting only for its own confounders (here, age) and not for HIV, which lies on the pathway from smoking to stroke. The smoking coefficient from that model estimates the total effect of smoking.
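One way to keep track of which covariate subset belongs to which exposure is to write the assumed causal diagram down explicitly. The sketch below encodes the example's diagram as a dictionary of direct causes and returns, for each exposure, its direct causes as the adjustment set. It relies on two assumptions that should be stated plainly: the diagram is correct, and there are no unmeasured confounders. Under those assumptions, adjusting for the direct causes of an exposure is sufficient to estimate its total effect. The names parents and adjustment_set are hypothetical and introduced only for illustration.

```python
# Assumed causal diagram for the example: each variable maps to its direct causes.
parents = {
    "age": [],
    "smoking": ["age"],
    "hiv": ["age", "smoking"],
    "stroke": ["age", "smoking", "hiv"],
}


def adjustment_set(exposure, outcome="stroke"):
    """Return the direct causes of the exposure, excluding the outcome.

    Assumes the diagram above is correct and fully measured.
    """
    return sorted(v for v in parents[exposure] if v != outcome)


for exposure in ("hiv", "smoking", "age"):
    covariates = adjustment_set(exposure)
    print(f"Exposure: {exposure:8s} adjust for: {covariates if covariates else 'nothing'}")
```

Each exposure gets its own model: HIV is adjusted for age and smoking, smoking is adjusted for age only, and age needs no adjustment in this diagram.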
