We predict the length-of-stay (LOS) of hospital inpatients.
The problem is treated as a binary classification task, solved with the XGBoost algorithm, with the following two classes:
- class 1 = '1 or 2 days'
- class 2 = '3+ days'
→ LOS = 1 (class 1 = '1 or 2 days') when release date = admission date.
→ LOS = 2 (class 1 = '1 or 2 days') when release date = admission date + 1 day.
→ LOS = 3 (class 2 = '3+ days') when release date = admission date + 2 days.
In general, LOS = (number of days between admission date and release date) + 1.
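The labelling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names `length_of_stay` and `los_class` are hypothetical, and the example dates are made up.

```python
from datetime import date


def length_of_stay(admission: date, release: date) -> int:
    """LOS in days, counting the admission day as day 1."""
    if release < admission:
        raise ValueError("release date precedes admission date")
    return (release - admission).days + 1


def los_class(los: int) -> int:
    """Map LOS to the target class: 1 = '1 or 2 days', 2 = '3+ days'."""
    if los < 1:
        raise ValueError("LOS must be at least 1")
    return 1 if los <= 2 else 2


# Hypothetical example: admitted and released on the same day.
same_day = length_of_stay(date(2021, 5, 3), date(2021, 5, 3))
print(same_day, los_class(same_day))

# Hypothetical example: released two days after admission.
two_days_later = length_of_stay(date(2021, 5, 3), date(2021, 5, 5))
print(two_days_later, los_class(two_days_later))
```

Counting the admission day as day 1 is what makes a same-day release map to LOS = 1 rather than 0.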
The code is written in Python.
- folder evaluate contains the evaluation code
- folder deploy contains the deployment code
The required packages are listed in the file requirements.txt.
Python interpreter version used for this project: 3.9.4
The data contains the following variables:
- sex : categorical variable := sex of patient
- family : categorical variable := family status id of patient
- ter : categorical variable := prefecture id of patient's residence
- wayin : categorical variable := type of patient's admission
- asfal1 : categorical variable := id of patient's 1st health insurance
- has_asfal2 : categorical variable := boolean flag indicating whether the patient has a 2nd health insurance
- has_asfal3 : categorical variable := boolean flag indicating whether the patient has a 3rd health insurance
- icd10groupid : categorical variable := id of ICD10 group assigned to patient on admission
- specialty : categorical variable := id of the doctor's specialty
- weekday : numerical variable := day of week (0, 1, ..., 6) on admission
- hh24 : numerical variable := hour of day (00, 01, 02, ..., 23) on admission
- age : numerical variable := patient age on admission day
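As a rough sketch of how the three numerical variables above could be derived from raw admission data, the snippet below uses only the standard library. Several things here are assumptions that the list does not confirm: the helper name `admission_features`, the availability of a birth date, age measured in completed years, and Monday = 0 as the weekday convention (the behaviour of Python's `datetime.weekday()`). The two column lists simply restate the variable types given above.

```python
from datetime import date, datetime

# Column names grouped by the type stated in the variable list.
CATEGORICAL = ["sex", "family", "ter", "wayin", "asfal1",
               "has_asfal2", "has_asfal3", "icd10groupid", "specialty"]
NUMERICAL = ["weekday", "hh24", "age"]


def admission_features(admitted_at: datetime, birth_date: date) -> dict:
    """Derive weekday, hh24 and age from an admission timestamp (hypothetical helper)."""
    admitted_on = admitted_at.date()
    # Age in completed years (assumption): subtract one if the birthday
    # has not yet occurred in the admission year.
    had_birthday = (admitted_on.month, admitted_on.day) >= (birth_date.month, birth_date.day)
    age = admitted_on.year - birth_date.year - (0 if had_birthday else 1)
    return {
        "weekday": admitted_at.weekday(),  # Monday = 0, ..., Sunday = 6 (assumption)
        "hh24": admitted_at.hour,          # 0..23
        "age": age,
    }


# Made-up example: admitted on Monday 2021-05-03 at 14:30, born 1950-06-15.
features = admission_features(datetime(2021, 5, 3, 14, 30), date(1950, 6, 15))
print(features)
```

If the source data numbers weekdays differently (for example Sunday = 0), the `weekday` line would need to be adjusted to match.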