Technical Overview
Heart failure cohort construction from MIMIC-IV, feature extraction, and ML models for 30-day readmission prediction and nursing-hours regression.
Data & Cohorting
- Identify heart failure (HF) patients via ICD-10 diagnosis codes; define index stays and 30-day readmissions.
- Features: demographics, comorbidities, vitals, labs, medications, procedures, care unit.
- Processing: memory-efficient table joins and chunked reads; standardized schema.
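The cohort rules above can be sketched in a few lines. This is a minimal, library-free illustration and not the repository's code. The field names (subject_id, hadm_id, admittime, dischtime, icd_code, icd_version) follow the MIMIC-IV admissions and diagnoses_icd tables. The "I50" prefix rule, the choice to treat every HF stay as an index stay, and the helper names are assumptions made for the example.

```python
from datetime import datetime, timedelta

HF_PREFIX = "I50"            # ICD-10 heart failure family (assumed selection rule)
WINDOW = timedelta(days=30)  # readmission window

def hf_admission_ids(diagnoses):
    """Return the hadm_ids that carry an ICD-10 heart failure code."""
    return {
        d["hadm_id"]
        for d in diagnoses
        if d["icd_version"] == 10 and d["icd_code"].startswith(HF_PREFIX)
    }

def label_readmissions(admissions, hf_ids):
    """Map each HF stay to True if the same patient's next admission
    starts within 30 days of this stay's discharge."""
    by_patient = {}
    for a in admissions:
        by_patient.setdefault(a["subject_id"], []).append(a)

    labels = {}
    for stays in by_patient.values():
        stays.sort(key=lambda a: a["admittime"])
        # Pair each stay with the one that follows it (None for the last stay).
        for current, following in zip(stays, stays[1:] + [None]):
            if current["hadm_id"] not in hf_ids:
                continue
            labels[current["hadm_id"]] = (
                following is not None
                and following["admittime"] - current["dischtime"] <= WINDOW
            )
    return labels

# Toy rows, not real MIMIC-IV data.
admissions = [
    {"subject_id": 1, "hadm_id": 100,
     "admittime": datetime(2020, 1, 1), "dischtime": datetime(2020, 1, 5)},
    {"subject_id": 1, "hadm_id": 101,
     "admittime": datetime(2020, 1, 20), "dischtime": datetime(2020, 1, 25)},
    {"subject_id": 1, "hadm_id": 102,
     "admittime": datetime(2020, 4, 1), "dischtime": datetime(2020, 4, 3)},
    {"subject_id": 2, "hadm_id": 200,
     "admittime": datetime(2020, 2, 1), "dischtime": datetime(2020, 2, 4)},
]
diagnoses = [
    {"hadm_id": 100, "icd_code": "I5021", "icd_version": 10},
    {"hadm_id": 101, "icd_code": "I509", "icd_version": 10},
    {"hadm_id": 102, "icd_code": "E119", "icd_version": 10},
    {"hadm_id": 200, "icd_code": "4280", "icd_version": 9},
]

hf_ids = hf_admission_ids(diagnoses)
labels = label_readmissions(admissions, hf_ids)
print(labels)
```

On the full tables the same grouping would more likely be done with chunked, columnar reads; the dictionaries here only keep the logic visible.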
Models
- Classification: Logistic Regression, Random Forest, XGBoost.
- Resource regression: predict nursing hours by care level and length of stay (LOS).
- Evaluation: AUROC, AUPRC, and calibration for classification; MAE and RMSE for regression.
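For reference, three of the metrics named above can be computed directly from their definitions. This is a sketch on made-up numbers, not the project's evaluation code; the function and variable names are illustrative. In practice a library such as scikit-learn (for example roc_auc_score) would typically be used instead.

```python
import math

def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

# Toy values for illustration only.
readmit_true = [0, 0, 1, 1]
readmit_prob = [0.1, 0.4, 0.35, 0.8]
hours_true = [6.0, 8.0, 10.0]
hours_pred = [7.0, 8.0, 8.0]

auc_value = auroc(readmit_true, readmit_prob)
mae_value = mean_absolute_error(hours_true, hours_pred)
rmse_value = root_mean_squared_error(hours_true, hours_pred)
print(auc_value, mae_value, rmse_value)
```

RMSE penalizes large staffing misses more heavily than MAE, which is one reason to report both.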
Pipeline & Reproducibility
- Modular src/ structure; tests/ for validation; config-driven runs.
- Scripts: make_dataset.py, extract_features.py, preprocess.py, readmission_model.py, resource_model.py.
- Automation: run_pipeline.sh for end-to-end execution.
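The end-to-end flow can be pictured as the listed scripts run in sequence. The sketch below is a hypothetical Python equivalent, not the contents of run_pipeline.sh: the stage order is taken from the list above, while the src/ location of the scripts, the --config flag, and the config.yaml file name are assumptions.

```python
import subprocess
import sys

# Stage order assumed from the order in which the scripts are listed.
STAGES = [
    "make_dataset.py",
    "extract_features.py",
    "preprocess.py",
    "readmission_model.py",
    "resource_model.py",
]

def build_commands(config_path, src_dir="src"):
    """Build one command per stage; the --config flag is hypothetical."""
    return [
        [sys.executable, f"{src_dir}/{stage}", "--config", config_path]
        for stage in STAGES
    ]

def run_pipeline(config_path, dry_run=True):
    """Print each stage command; execute it only when dry_run is False."""
    for cmd in build_commands(config_path):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)

commands = build_commands("config.yaml")
run_pipeline("config.yaml")  # dry run: prints the commands without executing
```

Stopping at the first failure keeps later stages from training on stale or partial intermediate files.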
Repo: cyranothebard/heart_failure_readmission
At a glance
- Classifier: LR / RF / XGB
- Eval: AUROC / AUPRC
- Staffing: MAE / RMSE