Predicting Chronic Heart Failure
Clinical Use Case
Patients admitted with myocardial infarction (MI) are at risk of developing a range of complications, including chronic heart failure (CHF).
Chronic heart failure is a serious condition that can significantly impair quality of life and is associated with increased morbidity and mortality.
Following an acute myocardial infarction, some patients recover without long-term consequences, while others develop progressive cardiac dysfunction leading to heart failure.
Early identification of patients at risk of developing chronic heart failure is challenging, even for experienced clinicians, but highly relevant for optimizing treatment and improving long-term outcomes.
Goal of this analysis:
Build a machine learning model that predicts whether a patient will develop chronic heart failure during hospitalization.
You can download the dataset of myocardial infarction complications from the University of Leicester here: https://figshare.le.ac.uk/ndownloader/files/23581310
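A minimal loading sketch, assuming the download is a comma-separated file with a header row (this is not confirmed in the text and should be checked after the first read). The helper name `load_mi_data` is hypothetical, and treating "?" as a missing-value marker is a precaution, not a documented property of the file.

```python
import pandas as pd

# URL taken from the text above.
DATA_URL = "https://figshare.le.ac.uk/ndownloader/files/23581310"


def load_mi_data(source=DATA_URL):
    """Read the dataset into a DataFrame.

    `source` may be a URL, a local path, or a file-like object.
    Assumption: comma-separated values with a header row; "?" is
    additionally treated as missing in case the file uses it.
    """
    return pd.read_csv(source, na_values=["?"])


# Usage (requires network access, so it is left commented out):
# df = load_mi_data()
# print(df.shape)
# print(df.isna().mean().sort_values(ascending=False).head(10))
```

Inspecting the share of missing values per column early on helps decide which variables need imputation or should be dropped.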
About the Dataset
This dataset contains clinical information about patients admitted with myocardial infarction and was designed to evaluate real-world medical prediction problems.
Variables include:
demographic data
medical history
ECG findings
laboratory values
treatment information
Possible complications are stored in the target variables.
In this notebook, we focus on predicting:
Chronic Heart Failure
Additional information about the dataset, including variable descriptions, can be found here: https://doi.org/10.25392/leicester.data.12045261
Important methodological aspect
The dataset allows prediction at different time points during the hospital stay:
At admission
After 24 hours
After 48 hours
After 72 hours
Depending on the chosen time point, different variables are available.
For this exercise, you must decide on one time point and adapt your feature selection accordingly.
For example:
If you predict at admission, you may only use variables available at admission
Later time points make more information available, but they also increase the risk of data leakage if variables recorded after the chosen time point slip into the feature set
This reflects a key challenge in clinical machine learning:
Predictions must be based only on information that is available at the time the decision is made.
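One way to enforce this rule is to tag each time-dependent variable with the earliest time point at which it is known and to filter the feature list accordingly. The sketch below assumes the column names listed in the dataset description (for example `R_AB_1_n` for the first day and `R_AB_2_n` for the second); these names and the identifier and outcome columns are assumptions and should be verified against the variable descriptions linked above. The function name `select_features` is hypothetical.

```python
# Assumed column names; verify them against the dataset description.
# Inputs recorded after admission, mapped to the earliest time point
# (hours after admission) at which they may be used for prediction.
AVAILABLE_FROM_HOUR = {
    "R_AB_1_n": 24, "NA_R_1_n": 24, "NOT_NA_1_n": 24,
    "R_AB_2_n": 48, "NA_R_2_n": 48, "NOT_NA_2_n": 48,
    "R_AB_3_n": 72, "NA_R_3_n": 72, "NOT_NA_3_n": 72,
}

# Assumed identifier and outcome (complication) columns: never features.
NON_FEATURES = {
    "ID", "FIBR_PREDS", "PREDS_TAH", "JELUD_TAH", "FIBR_JELUD",
    "A_V_BLOK", "OTEK_LANC", "RAZRIV", "DRESSLER", "ZSN",
    "REC_IM", "P_IM_STEN", "LET_IS",
}


def select_features(columns, hours_after_admission):
    """Return the columns usable at the given time point (0, 24, 48 or 72)."""
    return [
        col for col in columns
        if col not in NON_FEATURES
        # Columns without an entry are treated as available at admission.
        and AVAILABLE_FROM_HOUR.get(col, 0) <= hours_after_admission
    ]


example_columns = ["ID", "AGE", "R_AB_1_n", "R_AB_2_n", "NA_R_3_n", "ZSN"]
print(select_features(example_columns, 0))
print(select_features(example_columns, 24))
```

Keeping the availability mapping in one place makes the chosen time point an explicit parameter, so switching from admission to 24 hours changes a single argument instead of a hand-edited column list.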
Potential clinical use:
early identification of patients at risk of chronic heart failure
timely initiation of preventive or therapeutic interventions
improved long-term management and follow-up planning
Your Tasks
Load and explore the dataset to understand its structure and contents
Decide at which time point you want to predict chronic heart failure (target variable = “ZSN”)
Adjust your feature selection accordingly
Prepare the data for machine learning
Train and compare different models (e.g. Logistic Regression, Random Forest, XGBoost)
Evaluate model performance using appropriate metrics
Interpret your results and reflect on their clinical relevance
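The training and evaluation tasks above could be approached roughly as follows. This is a sketch under stated assumptions, not a reference solution: it uses synthetic data from scikit-learn in place of the real features, median imputation as a placeholder for proper data preparation, and stratified cross-validation with ROC AUC and average precision as example metrics. XGBoost is left out to avoid an extra dependency, and the function name `compare_models` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, n_splits=5, seed=0):
    """Cross-validate two candidate models and return mean scores per model."""
    models = {
        # Imputation and scaling live inside the pipeline so they are
        # fitted on the training folds only (no leakage across folds).
        "logistic_regression": make_pipeline(
            SimpleImputer(strategy="median"),
            StandardScaler(),
            LogisticRegression(max_iter=1000, class_weight="balanced"),
        ),
        "random_forest": make_pipeline(
            SimpleImputer(strategy="median"),
            RandomForestClassifier(
                n_estimators=100, class_weight="balanced", random_state=seed
            ),
        ),
    }
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        scores = cross_validate(
            model, X, y, cv=cv, scoring=["roc_auc", "average_precision"]
        )
        results[name] = {
            "roc_auc": scores["test_roc_auc"].mean(),
            "average_precision": scores["test_average_precision"].mean(),
        }
    return results


# Synthetic stand-in data with an imbalanced binary outcome.
X_demo, y_demo = make_classification(
    n_samples=400, n_features=20, weights=[0.8, 0.2], random_state=0
)
for model_name, metrics in compare_models(X_demo, y_demo).items():
    print(model_name, {k: round(v, 3) for k, v in metrics.items()})
```

Accuracy alone can be misleading when the outcome is imbalanced, which is why the sketch reports ranking-based metrics; the right choice for the real data depends on the observed class balance and the intended clinical use.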
# Import bia-bob as a helpful Python & Medical AI expert
from bia_bob import bob
import os
bob.initialize(
    endpoint=os.getenv('ENDPOINT_URL'),
    model="vllm-llama-4-scout-17b-16e-instruct",
    system_prompt=os.getenv('SYSTEM_PROMPT_MEDICAL_AI')
)
# %bob Who are you? Just one sentence!