Machine learning models predicted best-corrected visual acuity (BCVA) outcomes more accurately at weeks 4 and 24 than at 1 year among eyes with diabetic macular edema (DME) in the phase 3 YOSEMITE and RHINE trials; the 1-year BCVA predictions remained challenging for the models.
The analysis, presented at the 2024 Association for Research in Vision and Ophthalmology (ARVO) Meeting, centered on how well machine learning models using baseline features could predict functional outcomes across short-, mid-, and long-term treatment durations of faricimab, a bispecific antibody that blocks both vascular endothelial growth factor-A (VEGF-A) and angiopoietin-2 (Ang-2), for DME.
In an interview with HCPLive, Daniela Ferrara, MD, PhD, a principal medical director in the ophthalmology imaging program at Genentech, discussed the performance of the machine learning models, particularly the promising accuracy of the visual function predictions at the 1- and 6-month time points.
“I think the relevance for clinicians like myself is we know that visual function prediction is a challenge clinically,” Ferrara told HCPLive. “In other words, when I have a patient sitting in front of me at the clinic, or a patient enrolling in a clinical trial, based on solely clinical considerations, it’s very difficult to predict what is going to be the vision over time. We’re very excited about seeing that some models were able to make such a prediction.”
Patients in YOSEMITE and RHINE who received every-8-week faricimab 6.0 mg were pooled and split into 70% development, 15% test, and 15% holdout sets. The development set was split into 5 folds for cross-validation. Target functional outcomes comprised BCVA letter scores, with target time points at week 4, week 24, and year 1.
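The partitioning described above can be illustrated with a minimal Python sketch. It is not the investigators' code. It assumes the split is made at random by patient identifier, and the function names, the seed, and the cohort size of 200 are hypothetical choices for illustration only.

```python
import random


def split_patients(ids, seed=0):
    """Shuffle IDs, then split 70% / 15% / 15% into development, test, holdout."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n = len(ids)
    n_dev = round(0.70 * n)
    n_test = round(0.15 * n)
    dev = ids[:n_dev]
    test = ids[n_dev:n_dev + n_test]
    holdout = ids[n_dev + n_test:]  # whatever remains, roughly 15%
    return dev, test, holdout


def k_folds(ids, k=5):
    """Partition IDs into k near-equal, non-overlapping folds."""
    return [ids[i::k] for i in range(k)]


# Hypothetical cohort of 200 patient IDs (not the trial sample size).
dev, test, holdout = split_patients(range(200))
folds = k_folds(dev, k=5)

# Cross-validation: each fold is held out once while the other four train.
for i, validation in enumerate(folds):
    training = [p for j, fold in enumerate(folds) if j != i for p in fold]
    print(f"fold {i}: train on {len(training)}, validate on {len(validation)}")
```

In this arrangement the holdout set is untouched until model selection is finished, which is what allows it to serve as an independent check on the chosen models.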
Only baseline features were used to predict BCVA over time. Four model types (ElasticNet, random forest, support vector machine, and eXtreme Gradient Boosting tree) were trained and evaluated on the data set. For each target time point, the model and input features that performed best on the test set were selected for evaluation on the holdout set.
Based on the test set results, the random forest model with tier-1 input was selected for week 4 prediction, and the ElasticNet model with tier-1A input was selected for week 24 and year 1 prediction. Investigators evaluated model performance using the percentage increase in root mean square error (RMSE) from the test set to the holdout set.
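For readers unfamiliar with the metric, the sketch below shows one way RMSE and its percentage increase from test set to holdout set can be computed. It is an illustration under stated assumptions, not the investigators' implementation: the function names are hypothetical, and the BCVA letter scores are invented values, not trial data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pct_increase(test_rmse, holdout_rmse):
    """Percentage by which the holdout RMSE exceeds the test RMSE."""
    return 100.0 * (holdout_rmse - test_rmse) / test_rmse


# Invented BCVA letter scores for illustration only (not trial data).
test_actual, test_pred = [70, 65, 80, 75], [68, 66, 77, 75]
holdout_actual, holdout_pred = [72, 60, 85, 78], [68, 64, 80, 75]

test_rmse = rmse(test_actual, test_pred)
holdout_rmse = rmse(holdout_actual, holdout_pred)
print(f"test RMSE {test_rmse:.2f}, holdout RMSE {holdout_rmse:.2f}, "
      f"increase {pct_increase(test_rmse, holdout_rmse):.0f}%")
```

A small percentage increase suggests the model generalizes to unseen eyes about as well as it performed during selection, while a large one suggests the test set performance was optimistic.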
The RMSE values on the holdout set for week 4, week 24, and year 1 predictions were 8.26, 7.80, and 13.15, respectively, corresponding to increases of 29%, 22%, and 99% from the test set. In the holdout set, Spearman’s correlations of residuals across the 3 study time points were statistically significant for each pair (P <.0001).
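The residual analysis can be sketched as follows, assuming a residual is defined as observed minus predicted BCVA for each eye. This is a plain-Python illustration of Spearman's rank correlation (Pearson correlation computed on average ranks), not the investigators' code. The residual values are invented, and the sketch does not compute the P value reported in the analysis.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented per-eye residuals (observed minus predicted letters), same eyes.
residuals_week4 = [-3.0, 1.5, 6.0, -8.0, 2.5, 0.5]
residuals_week24 = [-2.0, 3.0, 7.5, -6.0, 1.0, -0.5]
print(f"Spearman rho = {spearman(residuals_week4, residuals_week24):.2f}")
```

A positive correlation of this kind means the same eyes tend to be over- or under-predicted at both time points, which points to a shared source of error rather than independent noise at each visit.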
Ferrara indicated the performance gap may narrow in the future, as the statistically significant correlations of residuals suggest the models’ predictions could be improved through larger data sets, new input features, or more advanced modeling.
“From the technological perspective, what we’ve been thinking about next steps is we can continue to explore the baseline features, look at the results of this particular project, and understand how we can better mine or interpret clinical features at baseline,” Ferrara told HCPLive.
“From the modeling perspective, and I’m a clinician, not a data science expert, but we will continue to work on a very fast-evolving field of artificial intelligence and machine learning, and our group is at the forefront of developing new modeling approaches and evolve our efforts in the models themselves,” she added.
Disclosures: Daniela Ferrara is employed by Genentech, Inc.
References
Kikuchi Y, Abderezaei J, McLeod, Chen C, Benech AC, Ferrara D, Anegondi N, Yang Q. Predicting functional outcomes for different treatment durations of faricimab in diabetic macular edema (DME). Poster presented at the Association for Research in Vision and Ophthalmology (ARVO) 2024 Meeting, May 5–9, 2024.