Technical Overview

This project builds a heart failure cohort from MIMIC-IV, extracts features, and trains ML models for 30-day readmission prediction and nurse staffing regression.

Data & Cohorting
  • Identify heart failure (HF) patients via ICD-10 codes; define index stays and 30-day readmissions.
  • Features: demographics, comorbidities, vitals, labs, medications, procedures, care unit.
  • Processing: memory-efficient table joins and chunked reads; standardized schema.
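
The cohorting steps above can be sketched with pandas. This is a minimal illustration, not the repository's implementation. It assumes MIMIC-IV-style column names (subject_id, hadm_id, icd_code, icd_version, admittime, dischtime), that HF corresponds to ICD-10 codes beginning with "I50" stored without dots, and that a readmission is any admission for the same patient within 30 days of discharge. The function names and toy data are hypothetical.

```python
import io

import pandas as pd

# Assumption: HF is identified by ICD-10 codes starting with "I50",
# stored without dots (e.g. "I5023"). The project's code list may differ.
HF_PREFIX = "I50"


def hf_admission_ids(csv_source, chunksize=100_000):
    """Stream a diagnoses table in chunks and collect HF admission ids."""
    hadm_ids = set()
    reader = pd.read_csv(
        csv_source,
        usecols=["hadm_id", "icd_code", "icd_version"],
        dtype={"icd_code": str},  # keep codes such as "4280" as text
        chunksize=chunksize,
    )
    for chunk in reader:
        is_hf = (chunk["icd_version"] == 10) & chunk["icd_code"].str.startswith(
            HF_PREFIX, na=False
        )
        hadm_ids.update(chunk.loc[is_hf, "hadm_id"].tolist())
    return hadm_ids


def label_30d_readmissions(stays, window_days=30):
    """Flag stays followed by another admission within `window_days` of discharge."""
    stays = stays.sort_values(["subject_id", "admittime"]).reset_index(drop=True)
    # Next admission time for the same patient (NaT for the last stay).
    next_admit = stays.groupby("subject_id")["admittime"].shift(-1)
    gap = next_admit - stays["dischtime"]
    stays["readmit_30d"] = (gap >= pd.Timedelta(0)) & (
        gap <= pd.Timedelta(days=window_days)
    )
    return stays


# Toy data standing in for the real tables (hypothetical values).
diagnoses_csv = io.StringIO(
    "subject_id,hadm_id,seq_num,icd_code,icd_version\n"
    "1,100,1,I5023,10\n"
    "1,101,1,4280,9\n"
    "2,200,1,E119,10\n"
    "2,201,2,I509,10\n"
)
hf_ids = hf_admission_ids(diagnoses_csv, chunksize=2)

stays = pd.DataFrame(
    {
        "subject_id": [1, 1, 2, 2],
        "hadm_id": [100, 101, 200, 201],
        "admittime": pd.to_datetime(
            ["2020-01-01", "2020-01-20", "2020-03-01", "2020-06-01"]
        ),
        "dischtime": pd.to_datetime(
            ["2020-01-05", "2020-01-25", "2020-03-04", "2020-06-05"]
        ),
    }
)
labeled = label_30d_readmissions(stays)
# Keep only stays with an HF diagnosis as candidate index stays.
index_stays = labeled[labeled["hadm_id"].isin(hf_ids)]
```

Reading in chunks keeps only one slice of the diagnoses table in memory at a time, and the accumulated set of admission ids is small enough to filter the stays table afterwards.
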
Models
  • Classification: Logistic Regression, Random Forest, XGBoost.
  • Resource regression: predict nursing hours by care level and length of stay (LOS).
  • Evaluation: AUROC, AUPRC, calibration; MAE/RMSE for regression.
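
The evaluation metrics listed above could be computed with scikit-learn roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the Brier score stands in for calibration (the overview does not say which calibration measure is used), and the inputs are toy values rather than project results.

```python
import math

from sklearn.metrics import (
    average_precision_score,
    brier_score_loss,
    mean_absolute_error,
    mean_squared_error,
    roc_auc_score,
)


def evaluate_classifier(y_true, y_prob):
    """Discrimination and calibration summaries for predicted probabilities."""
    return {
        "auroc": roc_auc_score(y_true, y_prob),
        "auprc": average_precision_score(y_true, y_prob),
        # Assumption: Brier score used as a single-number calibration summary.
        "brier": brier_score_loss(y_true, y_prob),
    }


def evaluate_regressor(y_true, y_pred):
    """Error summaries for a continuous target such as nursing hours."""
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "rmse": math.sqrt(mean_squared_error(y_true, y_pred)),
    }


# Toy inputs only; these are not results from the project.
clf_scores = evaluate_classifier([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
reg_scores = evaluate_regressor([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
```

Any fitted classifier that exposes predicted probabilities can be passed through evaluate_classifier, which keeps the comparison across Logistic Regression, Random Forest, and XGBoost on the same footing.
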
Pipeline & Reproducibility
  • Modular src/ structure; tests/ for validation; config-driven runs.
  • Scripts: make_dataset.py, extract_features.py, preprocess.py, readmission_model.py, resource_model.py.
  • Automation: run_pipeline.sh for end-to-end execution.
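
As an illustration of end-to-end execution, the stages could be chained from Python as sketched below. This is not the contents of run_pipeline.sh. It assumes the scripts live under src/ and run in the order they are listed above, and it passes no arguments because the scripts' command-line options are not described here. The build_commands and run_pipeline helpers are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: execution order matches the order the scripts are listed in.
STAGES = [
    "make_dataset.py",
    "extract_features.py",
    "preprocess.py",
    "readmission_model.py",
    "resource_model.py",
]


def build_commands(scripts_dir="src"):
    """Return one interpreter command per stage, in pipeline order."""
    return [[sys.executable, str(Path(scripts_dir) / name)] for name in STAGES]


def run_pipeline(commands, runner=subprocess.run):
    """Run stages in order and stop at the first non-zero exit code."""
    completed = []
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            raise RuntimeError(f"stage failed: {Path(cmd[-1]).name}")
        completed.append(Path(cmd[-1]).name)
    return completed


commands = build_commands("src")
```

Stopping at the first failing stage avoids training models on stale or partial intermediate files.
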

Repo: cyranothebard/heart_failure_readmission

At a glance
  • Classifier: LR / RF / XGB
  • Eval: AUROC / AUPRC
  • Staffing: MAE / RMSE