Predicting Chronic Heart Failure

Clinical Use Case

Patients admitted with myocardial infarction (MI) are at risk of developing a range of complications, including chronic heart failure (CHF).

Chronic heart failure is a serious condition that can significantly impair quality of life and is associated with increased morbidity and mortality.

Following an acute myocardial infarction, some patients recover without long-term consequences, while others develop progressive cardiac dysfunction leading to heart failure.

Early identification of patients at risk of developing chronic heart failure is challenging, even for experienced clinicians, but highly relevant for optimizing treatment and improving long-term outcomes.


Goal of this analysis:

Build a machine learning model that predicts whether a patient will develop chronic heart failure during hospitalization.

You can download the dataset of myocardial infarction complications from the University of Leicester here: https://figshare.le.ac.uk/ndownloader/files/23581310


About the Dataset

This dataset contains clinical information about patients admitted with myocardial infarction and was designed to evaluate real-world medical prediction problems.

Variables include:

  • demographic data

  • medical history

  • ECG findings

  • laboratory values

  • treatment information

Possible complications are stored in the target variables.

In this notebook, we focus on predicting:

Chronic Heart Failure

Additional information about the dataset, including variable descriptions, can be found here: https://doi.org/10.25392/leicester.data.12045261


Important methodological aspect

The dataset allows prediction at different time points during the hospital stay:

  1. At admission

  2. After 24 hours

  3. After 48 hours

  4. After 72 hours

Depending on the chosen time point, different variables are available.

For this exercise, you must decide on one time point and adapt your feature selection accordingly.

For example:

  • If you predict at admission, you may only use variables available at admission

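  • Later time points make more information available, but any variable recorded after your chosen time point must be excluded, otherwise it leaks future information into the model (data leakage)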

This reflects a key challenge in clinical machine learning:

Predictions must be based only on information that is available at the time the decision is made.
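
The rule above can be sketched as a small filter over column names. This is a minimal sketch under stated assumptions, not an official procedure for this dataset: the helper `select_features` is hypothetical, and the column names in `AVAILABLE_FROM_HOUR` and `TARGET_COLUMNS`, together with the hour at which each variable is assumed to become available, should be checked against the variable descriptions linked above before use.

```python
# ASSUMPTION: time-dependent input columns and the earliest prediction
# time point (hours after admission) at which each may be used.
# Verify these names and hours against the dataset description.
AVAILABLE_FROM_HOUR = {
    "R_AB_1_n": 24, "NA_R_1_n": 24, "NOT_NA_1_n": 24,
    "R_AB_2_n": 48, "NA_R_2_n": 48, "NOT_NA_2_n": 48,
    "R_AB_3_n": 72, "NA_R_3_n": 72, "NOT_NA_3_n": 72,
}

# ASSUMPTION: outcome (complication) columns. They describe what happened
# during the stay, so none of them may be used as a feature.
TARGET_COLUMNS = {
    "FIBR_PREDS", "PREDS_TAH", "JELUD_TAH", "FIBR_JELUD", "A_V_BLOK",
    "OTEK_LANC", "RAZRIV", "DRESSLER", "ZSN", "REC_IM", "P_IM_STEN", "LET_IS",
}

ALLOWED_HOURS = (0, 24, 48, 72)  # admission, 24 h, 48 h, 72 h


def select_features(columns, prediction_hour, id_column="ID"):
    """Return the columns that are usable as features at prediction_hour."""
    if prediction_hour not in ALLOWED_HOURS:
        raise ValueError(f"prediction_hour must be one of {ALLOWED_HOURS}")
    selected = []
    for col in columns:
        if col == id_column or col in TARGET_COLUMNS:
            continue  # identifiers and outcomes are never features
        # Columns not listed in the map are assumed available at admission.
        if AVAILABLE_FROM_HOUR.get(col, 0) > prediction_hour:
            continue  # recorded after the prediction time: would leak
        selected.append(col)
    return selected
```

With a pandas DataFrame `df`, a call such as `select_features(df.columns, 24)` would return the feature names for a prediction made after 24 hours. Keeping the availability map in one place makes the chosen time point an explicit, reviewable decision.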


Potential clinical use:

  • early identification of patients at risk of chronic heart failure

  • timely initiation of preventive or therapeutic interventions

  • improved long-term management and follow-up planning

Your Tasks

  • Load and explore the dataset to understand its structure and contents

  • Decide at which time point you want to predict chronic heart failure (target variable = “ZSN”)
    Adjust your feature selection accordingly

  • Prepare the data for machine learning

  • Train and compare different models (e.g. Logistic Regression, Random Forest, XGBoost)

  • Evaluate model performance using appropriate metrics

  • Interpret your results and reflect on their clinical relevance
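
For the evaluation step, accuracy alone can mislead if patients who develop the complication turn out to be a minority in the data. The following is a minimal sketch, using only the Python standard library, of metrics that look at both classes separately. The helper name `binary_metrics` is hypothetical, and labels are assumed to be coded as 1 (complication) and 0 (no complication).

```python
def binary_metrics(y_true, y_pred):
    """Compute basic metrics for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true cases detected
        "specificity": ratio(tn, tn + fp),  # share of non-cases correctly cleared
        "precision": ratio(tp, tp + fp),    # share of alarms that are correct
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Illustrative toy labels (not taken from the dataset): 2 cases in 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10  # a "model" that never predicts the complication

metrics = binary_metrics(y_true, always_negative)
print(metrics["accuracy"], metrics["sensitivity"])  # 0.8 0.0
```

The toy example shows why sensitivity matters clinically: a model that never flags anyone reaches 80% accuracy here while missing every patient at risk.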

# Import bia-bob as a helpful Python & Medical AI expert
from bia_bob import bob
import os

bob.initialize(
    endpoint=os.getenv('ENDPOINT_URL'),
    model="vllm-llama-4-scout-17b-16e-instruct",
    system_prompt=os.getenv('SYSTEM_PROMPT_MEDICAL_AI')
)
# %bob Who are you? Just one sentence!