Abstract
Objective: Deep-learning techniques, particularly the Transformer model, have shown great potential in enhancing prediction performance on longitudinal health records. Previous methods focused on fixed-time risk prediction; however, time-to-event prediction is often more appropriate for clinical scenarios. Here, we present STRAFE, a generalizable survival analysis Transformer-based architecture for electronic health records. Materials and Methods: The input for STRAFE is a sequence of visits with SNOMED-CT codes in OMOP-CDM format. A Transformer-based architecture was developed to calculate the probability of the event occurring in each of 48 months. Performance was evaluated using a real-world claims dataset of over 130 000 individuals with stage 3 chronic kidney disease (CKD). Results: STRAFE showed improved mean absolute error (MAE) compared to other time-to-event algorithms in predicting the time to deterioration to stage 5 CKD. Additionally, STRAFE showed an improved area under the receiver operating characteristic curve compared to binary outcome algorithms. We show that STRAFE predictions can improve the positive predictive value for high-risk patients 3-fold. Finally, we suggest a novel visualization approach to predictions on a per-patient basis. Discussion: Time-to-event predictions are the most appropriate approach for clinical predictions. Our deep-learning algorithm outperformed not only other time-to-event prediction algorithms but also fixed-time algorithms, possibly due to its ability to train on censored data. We demonstrated possible clinical usage by identifying the highest-risk patients. Conclusions: The ability to accurately identify patients at high risk and prioritize their needs can result in improved health outcomes, reduced costs, and more efficient use of resources.
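As an illustration of how per-month event probabilities can be turned into a time-to-event estimate, the following is a minimal sketch and not the STRAFE implementation. It assumes the 48 monthly outputs are discrete-time hazards (the probability of the event in a month given no event before it), which the abstract does not specify; the function names `survival_curve` and `restricted_mean_time` are hypothetical.

```python
from itertools import accumulate
from operator import mul

HORIZON_MONTHS = 48  # prediction horizon stated in the abstract


def survival_curve(hazards):
    """Return S(t) = prod_{k<=t} (1 - h_k) for t = 1..len(hazards).

    Assumption: each h_k is a conditional (hazard) probability for month k.
    """
    if any(not 0.0 <= h <= 1.0 for h in hazards):
        raise ValueError("each monthly probability must lie in [0, 1]")
    return list(accumulate((1.0 - h for h in hazards), mul))


def restricted_mean_time(hazards):
    """Expected months until the event, capped at the horizon.

    Uses E[min(T, H)] = sum_{t=0}^{H-1} S(t) with S(0) = 1.
    """
    if not hazards:
        return 0.0
    surv = survival_curve(hazards)
    return 1.0 + sum(surv[:-1])


if __name__ == "__main__":
    # Hypothetical patient with a constant 2% monthly hazard.
    hazards = [0.02] * HORIZON_MONTHS
    print(f"P(event-free at 48 months) = {survival_curve(hazards)[-1]:.3f}")
    print(f"Restricted mean time (months) = {restricted_mean_time(hazards):.1f}")
```

Under this assumption, a single expected time per patient can be compared with the observed time to compute a mean absolute error, and the survival curve at a fixed month gives a binary-outcome risk score.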
Original language | English |
---|---|
Pages (from-to) | 980-990 |
Number of pages | 11 |
Journal | Journal of the American Medical Informatics Association : JAMIA |
Volume | 31 |
Issue number | 4 |
State | Published - 1 Apr 2024 |
Keywords
- chronic kidney disease
- clinical data
- deep-learning
- survival analysis
- transformer
All Science Journal Classification (ASJC) codes
- Health Informatics