Fortuny J, Kaye JA, Bui CL, Gilsenan AW, Bartsch J, Plana E, McQuay LJ, Calingaert B, Atsma WJ, Appenteng K, Franks B, de Vogel S, D'Silva M, Perez Gutthan S, Arana A, Margulis AV. Evaluation of free-text comments to validate common cancer diagnoses in the UK CPRD. Poster presented at the 32nd International Conference on Pharmacoepidemiology & Therapeutic Risk Management (ICPE); August 26, 2016. Dublin, Ireland. [abstract] Pharmacoepidemiol Drug Saf. 2016 Aug; 25(Suppl 3):58.


BACKGROUND: Some primary care databases include physicians' free-text comments, which reflect physicians' thinking without constraint to coded entries. Free text may be used in database studies to help validate outcomes. Starting in April 2016, free text will not be available for research in CPRD owing to transparency and governance concerns.

OBJECTIVES: To evaluate the relative contribution of free-text comments to the validation of incident cases of prostate, breast, lung, and bladder cancer.

METHODS: For potential cancer cases identified by Read codes in CPRD, we created two sets of electronic medical record profiles (prescriptions, diagnoses, procedures, laboratory tests, referrals, clinical information), one with and one without free text. One physician reviewed the patient profiles that included free text and determined cancer type (e.g., breast) and status: confirmed or not confirmed (diagnosed before cohort entry, not incident cancer, unclear). Another physician independently reviewed the profiles without free text. Before the reviews, both reviewers underwent training to decrease interrater variability.

RESULTS: We identified 168 potential cases, of which 143 (85%) were confirmed in the review with free text and were considered the gold standard for calculations. The positive predictive value of case confirmation in the review without free text was 0.93 (128 of 137; 95% confidence interval [CI], 0.88-0.97), negative predictive value was 0.52 (16 of 31; 95% CI, 0.34-0.69), sensitivity was 0.90 (128 of 143; 95% CI, 0.84-0.94) and specificity was 0.64 (16 of 25; 95% CI, 0.44-0.81). Results were similar for individual cancer types. Cancer type matched in 142 of 143 confirmed cases.
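The point estimates above follow from a 2x2 table in which the review with free text is the gold standard. The sketch below is illustrative only: the function and variable names are hypothetical, the cell counts are derived from the fractions reported in the abstract (128 of 137, 16 of 31, 128 of 143, 16 of 25), and confidence intervals are left out because the abstract does not state how they were calculated.

```python
def validation_metrics(tp, fp, fn, tn):
    """Return PPV, NPV, sensitivity, and specificity from 2x2 cell counts.

    tp: confirmed in both reviews
    fp: confirmed without free text, not confirmed with free text
    fn: not confirmed without free text, confirmed with free text
    tn: not confirmed in either review
    """
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Cell counts derived from the reported fractions (168 potential cases in total).
metrics = validation_metrics(tp=128, fp=9, fn=15, tn=16)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The four cells sum to 168, and the gold-standard row totals (143 confirmed, 25 not confirmed) match the counts given in the abstract.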

CONCLUSIONS: Free text did not add information on cancer type. The review without free text classified most cases correctly. However, about half (15 of 31) of the cases not confirmed in the review without free text were confirmed in the review with free text, and one third (9 of 25) of the cases not confirmed in the review with free text were falsely considered confirmed in the review without it. Although some discrepancies may be due to interrater variability, misclassification of case status (confirmed vs. not confirmed) would likely increase without availability of free text.
