Rasouliyan L, Plana E, Martinez D, Aguado J, Ziemiecki R. The missing data problem: using propensity scores to estimate non-randomised treatment effects with missing covariate data. Poster presented at the 2018 ISPOR 21st Annual European Congress; November 13, 2018. Barcelona, Spain.

OBJECTIVES: To evaluate the performance of various propensity score (PS)-based methods for estimating non-randomised treatment effects in the presence of missing covariate data.

METHODS: Patient-level data were simulated based on a published observational cohort study of overactive bladder disease. The incidence of cardiovascular (CV) mortality was evaluated in two non-randomised treatment cohorts (Treatments A and B). Patient covariates of interest included demographic, lifestyle, comorbidity, and CV history variables. Smoking status and CV history covariates were set to missing for 5%, 10%, and 20% of patients under a missing at random (MAR) mechanism that depended on cohort entry year. For each MAR scenario, the incidence rate ratio (IRR) of CV mortality comparing Treatment B to Treatment A was estimated with the following PS methods: (1) using only patients with non-missing covariate data; (2) using multiple imputation (MI); and (3) generating the PS model using only covariates with fully complete data, then adjusting for remaining covariates in the model in conjunction with MI.
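The abstract does not describe how missingness was made to depend on cohort entry year. As a minimal sketch of one possible MAR mechanism, the hypothetical function `mar_missing_flags` below assumes that earlier entry years have a higher probability of a missing covariate, scaled so that the expected overall proportion matches the target rate (5%, 10%, or 20%). The function name, the linear weighting, and the direction of the year effect are all illustrative assumptions, not the authors' simulation design.

```python
import random


def mar_missing_flags(entry_years, target_rate, seed=0):
    """Flag records as missing with a probability that depends only on entry year.

    Hypothetical mechanism (an assumption, not taken from the abstract):
    the earliest entry year gets weight 1.5, falling linearly to 0.5 for
    the latest year, and weights are rescaled so the expected overall
    missingness equals target_rate.
    """
    rng = random.Random(seed)
    first, last = min(entry_years), max(entry_years)
    span = max(last - first, 1)  # avoid division by zero for a single year
    weights = [1.5 - (year - first) / span for year in entry_years]
    mean_weight = sum(weights) / len(weights)
    probs = [min(1.0, target_rate * w / mean_weight) for w in weights]
    # Missingness depends on an observed variable (entry year) only,
    # which is what makes the mechanism MAR rather than MNAR.
    return [rng.random() < p for p in probs]


if __name__ == "__main__":
    years = [2005 + i % 10 for i in range(20000)]  # made-up entry years
    for rate in (0.05, 0.10, 0.20):
        flags = mar_missing_flags(years, rate, seed=1)
        print(f"target {rate:.0%}: observed {sum(flags) / len(flags):.1%}")
```

Because the probability of missingness is a function of entry year alone, which is always observed, the covariate values themselves play no role in whether they are missing.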

RESULTS: By design, crude and true IRR values were 3.00 and 1.80, respectively. With complete covariate information, PS stratification methods after trimming yielded an IRR point estimate of 2.04. For the 5%, 10%, and 20% MAR scenarios, respectively, IRR point estimates were as follows: 2.07, 2.02, and 1.99 using Method 1; 2.06, 2.03, and 1.94 using Method 2; and 1.96, 1.94, and 1.85 using Method 3. Relative bias in coefficient estimates ranged from 4.2% (Method 3, 20% missing) to 23.7% (Method 1, 5% missing).
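The abstract does not give the formula for relative bias. A minimal sketch, assuming it is computed on the coefficient (log-IRR) scale as (log estimated IRR − log true IRR) / log true IRR, is shown below. The function name `relative_bias` is hypothetical, and the inputs are the rounded point estimates quoted above, so the output will not match the reported percentages exactly.

```python
import math

TRUE_IRR = 1.80  # true IRR stated in the abstract


def relative_bias(estimated_irr, true_irr=TRUE_IRR):
    """Relative bias of the log-IRR coefficient (assumed definition)."""
    true_coef = math.log(true_irr)
    return (math.log(estimated_irr) - true_coef) / true_coef


# Rounded point estimates from the abstract, keyed by (method, % missing).
estimates = {
    (1, 5): 2.07, (1, 10): 2.02, (1, 20): 1.99,
    (2, 5): 2.06, (2, 10): 2.03, (2, 20): 1.94,
    (3, 5): 1.96, (3, 10): 1.94, (3, 20): 1.85,
}

if __name__ == "__main__":
    for (method, pct), irr in estimates.items():
        print(f"Method {method}, {pct}% missing: {relative_bias(irr):.1%}")
```

Under this assumed definition, the rounded estimates give roughly 4.7% for Method 3 at 20% missing and 23.8% for Method 1 at 5% missing, close to the reported 4.2% and 23.7%; the small gaps are consistent with rounding of the published IRRs.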

CONCLUSIONS: In this simulation, Methods 1 and 3 yielded the most and least biased IRR point estimates, respectively, across most evaluated MAR scenarios. For all PS methods, point estimates moved closer to the true IRR as the proportion of missing data increased. More research is needed to understand the usefulness and performance of each PS method in specific situations.
