Random Forest Model Helps Predict Pulmonary Hypertension for Systemic Sclerosis Patients

June 9, 2022
Kenny Walter

Many more patients with systemic sclerosis currently undergo right heart catheterization than is necessary to diagnose pulmonary hypertension.

Utilizing a multimodal prediction model could help avoid unnecessary procedures in diagnosing pulmonary hypertension in patients with systemic sclerosis (SSc).

A team led by Justin K. Lui, MD, MS, of The Pulmonary Center at Boston University School of Medicine, examined the accuracy of 3 different models in predicting pulmonary hypertension in this patient population.

Current Practices

Diagnosing pulmonary hypertension in systemic sclerosis requires an invasive right heart catheterization (RHC), and the decision to perform one is often based on an elevated estimated pulmonary artery systolic pressure on screening echocardiography.

However, because echocardiography has poor specificity, many patients undergo unnecessary RHC, which exposes them to potentially avoidable risks of complications.

That is where improved prediction models for pulmonary hypertension in SSc could show their value.

The Study

In the retrospective study, the investigators examined 130 patients with SSc, 50.8% (n = 66) of whom were diagnosed with pulmonary hypertension by RHC.

The team used pulmonary function testing, electrocardiography, echocardiography, and computed tomography data to identify and compare the performance characteristics of 3 models for predicting the presence of pulmonary hypertension: random forest, classification and regression tree, and logistic regression.

Overall, the best performing model was the random forest with an area under the curve of 0.92 (95% CI, 0.83-1.00), sensitivity of 0.95 (95% CI, 0.75-1.00), and specificity of 0.80 (95% CI, 0.56-0.94).
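
For readers less familiar with these metrics, the following is a minimal Python sketch of how sensitivity, specificity, and area under the curve are computed from a model's predicted probabilities. The labels, scores, and the 0.5 cutoff are hypothetical illustration values, not data from the study, and the function names are our own.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve: the chance that a randomly chosen positive
    case scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 3 patients with the disease, 3 without.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]          # made-up model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, scores):.2f}")
```

Sensitivity and specificity depend on the chosen cutoff, while the area under the curve summarizes performance across all possible cutoffs.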

The investigators also found the 2 most important variables in the random forest model were pulmonary artery diameter on chest computed tomography and diffusing capacity for carbon monoxide on pulmonary function testing.

“In patients with SSc, a random forest model can aid in the detection of PH with high sensitivity and specificity, and may allow for better patient selection for RHC, thereby minimizing patient risk,” the authors wrote.

An Algorithm for Pulmonary Arterial Hypertension

A new claims-based machine-learning algorithm has been shown to be effective in identifying patients with pulmonary arterial hypertension (PAH). The model correctly identified 73% of patients with PAH 6 months before a confirmed diagnosis.

Though early diagnosis and treatment initiation for PAH have always been the objective, PAH symptoms are non-specific, which has resulted in many patients experiencing delays (an average of 2 or more years) between symptom onset and a confirmed diagnosis.

The machine-learning model analyzed data from the US-based Optum Clinformatics database from January 2015 to December 2019.

A total of 1724 patients were in the PAH cohort, while the control group had 5352 patients. Mean age was similar (69 and 70 years), and roughly two-thirds of patients were female and White.

Bettencourt and colleagues observed that the model was able to distinguish between patients with PAH and controls at 6 months prior to diagnosis, with a threshold of 0.43.

Additionally, the model correctly identified 73% of the patients with PAH, and its precision was 50%.
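
To make the two figures concrete, the 73% figure is the model's recall (the share of true PAH cases it flags), and the 50% precision is the share of flagged patients who truly have PAH. The arithmetic below converts them into counts per 100 patients who truly have PAH. It is a sketch based only on the 73% and 50% figures reported above; the per-100 framing and the function name are illustrative assumptions, not part of the study.

```python
def flagged_counts(n_positive, recall, precision):
    """Translate recall and precision into expected counts.

    n_positive: number of patients who truly have the condition
    recall:     fraction of those patients the model flags (TP / (TP + FN))
    precision:  fraction of flagged patients who truly have it (TP / (TP + FP))
    """
    true_pos = recall * n_positive                      # correctly flagged cases
    false_neg = n_positive - true_pos                   # missed cases
    false_pos = true_pos * (1 - precision) / precision  # follows from TP / (TP + FP) = precision
    return true_pos, false_neg, false_pos


# Per 100 true PAH patients (illustrative framing), using the reported 73% and 50%.
tp, fn, fp = flagged_counts(100, recall=0.73, precision=0.50)
print(f"flagged correctly: {tp:.0f}, missed: {fn:.0f}, false alarms: {fp:.0f}")
```

At 50% precision, every correctly flagged patient is matched by one patient flagged in error, so the trade-off is between earlier detection and additional follow-up testing.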

The study, “A Multimodal Prediction Model for Diagnosing Pulmonary Hypertension in Systemic Sclerosis,” was published online in Arthritis Care & Research.

