IMPORTANCE: Predicting primary hyperparathyroidism from routinely collected clinical data may facilitate earlier diagnosis and treatment.
OBJECTIVE: Primary hyperparathyroidism (pHPT) is the leading cause of hypercalcemia, and up to 75% of hypercalcemic patients go undiagnosed. This study examined whether predictive modeling with a large clinical database can predict pHPT in patients with benign thyroid nodules.
DESIGN: Retrospective analysis and predictive modeling of pHPT using a large discharge database. A predictive model of pHPT was created using logistic regression and compared to three machine learning algorithms: a Gaussian naive Bayes classifier, a stochastic gradient descent classifier, and a histogram-based gradient boosting classifier.
SETTING: Vizient hospital discharge database from over 1000 hospitals including academic health centers.
PARTICIPANTS: In the Vizient Clinical Database (CDB), 2 541 901 patients with benign thyroid nodules were identified between 2020 and 2023, of whom 83 555 (3.29%) had pHPT.
EXPOSURES: Analyses controlled for demographics (age, sex, race), comorbidities (body mass index [BMI], diabetes, hypertension, smoking status, renal disease), and use of proton pump inhibitors and bisphosphonates.
MAIN OUTCOME(S) AND MEASURE(S): The primary outcome was the presence of pHPT, identified using ICD-10 codes. Model performance was compared using the area under the receiver operating characteristic (ROC) curve.
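As an illustration of the comparison metric, the area under the ROC curve can be computed from predicted risks and observed outcomes with the rank-sum (Mann-Whitney) identity. The sketch below is not the study's code; the function name `roc_auc` and the toy data are hypothetical, and it assumes outcomes are coded 1 when a pHPT diagnosis code is present and 0 otherwise.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 outcomes (assumed: 1 = pHPT code present).
    scores: sequence of predicted risks, one per patient.
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


# Toy data, invented for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.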
RESULTS: In the baseline predictive model, several demographic characteristics were significant predictors of pHPT. The logistic regression model had an area under the ROC curve of 68.1%, lower than that of the histogram-based gradient boosting classifier (68.7%) but equal to that of the stochastic gradient descent classifier (68.1%). The logistic regression model correctly classified 80.4% of pHPT cases, compared with 80.5% for both the histogram-based gradient boosting classifier and the stochastic gradient descent classifier. For logistic regression, a threshold of 5% yielded a sensitivity of 38.5% and a specificity of 81.8%.
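To make the threshold result concrete, the following minimal sketch shows how sensitivity and specificity follow from a cutoff on predicted risk. It assumes the 5% threshold is a cutoff on the model's predicted probability of pHPT, which the abstract does not state explicitly. The function name and toy data are hypothetical.

```python
def sensitivity_specificity(labels, risks, threshold=0.05):
    """Return (sensitivity, specificity) when patients with a predicted
    risk at or above `threshold` are flagged as possible pHPT.

    labels: sequence of 0/1 outcomes (assumed: 1 = pHPT code present).
    risks:  sequence of predicted probabilities, one per patient.
    """
    tp = fn = tn = fp = 0
    for outcome, risk in zip(labels, risks):
        flagged = risk >= threshold
        if outcome == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1

    # Each rate is undefined when its denominator class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data, invented for illustration only.
toy_labels = [1, 1, 0, 0, 0]
toy_risks = [0.10, 0.02, 0.06, 0.01, 0.03]
sens, spec = sensitivity_specificity(toy_labels, toy_risks, threshold=0.05)
```

Lowering the threshold flags more patients, which raises sensitivity at the cost of specificity. Any deployed cutoff would need to balance missed cases against alert burden.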
CONCLUSIONS AND RELEVANCE: Predictive modeling of pHPT among patients with benign thyroid nodules is possible using a large clinical database. The predictive equation could be built into decision support systems to alert clinicians to potentially undiagnosed pHPT and support timely diagnosis and treatment.