RESEARCH PRODUCT
Supervised Analysis for Phenotype Identification: The Case of Heart Failure Ejection Fraction Class
Antonio Fernandez, Julio Núñez, Jose Miguel Calderon, Cristina Lopez, Jose Luis Holgado, Redon Josep, Josep Redon, Inma Sauri, Raquel Cortés
subject
Technology; medicine.medical_specialty; phenotype; QH301-705.5; heart failure; Bioengineering; 030204 cardiovascular system & hematology; Article; primary care; 03 medical and health sciences; 0302 clinical medicine; Text mining; Lasso (statistics); Internal medicine; medicine; 030212 general & internal medicine; Myocardial infarction; Biology (General); Cluster analysis; Ejection fraction; business.industry; Unstable angina; T; allergology; left ventricular ejection fraction; Atrial fibrillation; artificial intelligence; medicine.disease; 3. Good health; Random forest; Heart failure; Cardiology; supervised analysis; business
description
Artificial intelligence is creating a paradigm shift in health care, and phenotyping patients through clustering techniques is one of the areas of interest. Objective: To develop a predictive model that classifies heart failure (HF) patients according to their left ventricular ejection fraction (LVEF), using data available in Electronic Health Records (EHR). Subjects and methods: A total of 2854 subjects over 25 years old with a diagnosis of HF and an LVEF measured by echocardiography were selected to develop a supervised algorithm that predicts which patients have reduced EF. The performance of the algorithm was then tested in heart failure patients from Primary Care. The LASSO algorithm was used to select the most influential variables, and the Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance, in which one class greatly outnumbers the other. Finally, Random Forest (RF) and XGBoost models were constructed. Results: The full XGBoost model obtained the maximum accuracy, a high negative predictive value, and the highest positive predictive value. Gender, age, unstable angina, atrial fibrillation and acute myocardial infarction are the variables that most influence the EF value. When the model was applied to the EHR dataset, which contained 25,594 patients with an ICD code of HF and no regular follow-up in cardiology clinics, 6170 (21.1%) were identified as belonging to the reduced EF group. Conclusion: The algorithm was able to identify a number of HF patients with reduced ejection fraction who could benefit from a protocol with a strong possibility of success. Furthermore, the methodology can be used in other studies based on data extracted from Electronic Health Records.
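The abstract names SMOTE for class imbalance and reports accuracy, positive predictive value and negative predictive value, but gives no implementation details. The following is a minimal sketch of both ideas in plain Python, not the authors' code. The function names `smote` and `predictive_values`, the neighbour count `k`, and the convention that label 1 means "reduced EF" are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority-class samples (illustrative SMOTE).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def predictive_values(y_true, y_pred):
    """Accuracy, PPV and NPV, assuming label 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # PPV: share of predicted positives that are truly positive.
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        # NPV: share of predicted negatives that are truly negative.
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }
```

In this sketch, oversampling would be applied to the training split only, so that the predictive values are computed on untouched data. A real analysis would more likely rely on an established implementation such as `imblearn.over_sampling.SMOTE` rather than a hand-written one.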
year | journal | country | edition | language |
---|---|---|---|---|
2021-06-21 | | | | |