The libraries needed by this project are provided by the Anaconda distribution of Python. The code should run with no issues using Python versions 3.*.
In this project, I look into the Home Credit default data and try to build a model that predicts the likelihood of each applicant repaying a loan.
Features are generated both manually and automatically.
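As a rough illustration of what manual feature generation can look like, here is a minimal pandas sketch. The tables, column names (`applicant_id`, `income`, `credit_amount`, `loan_amount`) and the helper `add_manual_features` are hypothetical stand-ins, not the columns or code used in the notebooks. It shows two common kinds of hand-built features: a ratio of two columns, and per-applicant aggregates of a history table.

```python
import pandas as pd

# Toy stand-ins for a main application table and a per-loan history table.
# All column names here are hypothetical.
applications = pd.DataFrame({
    "applicant_id": [1, 2, 3],
    "income": [50000.0, 80000.0, 30000.0],
    "credit_amount": [200000.0, 160000.0, 150000.0],
})
previous_loans = pd.DataFrame({
    "applicant_id": [1, 1, 2],
    "loan_amount": [10000.0, 30000.0, 5000.0],
})


def add_manual_features(apps, loans):
    """Return a copy of `apps` with a ratio feature and loan-history aggregates."""
    out = apps.copy()

    # Ratio feature: how large the requested credit is relative to income.
    out["credit_to_income"] = out["credit_amount"] / out["income"]

    # Aggregate the history table down to one row per applicant.
    agg = (
        loans.groupby("applicant_id")["loan_amount"]
        .agg(prev_loan_count="count", prev_loan_mean="mean")
        .reset_index()
    )

    # A left join keeps applicants that have no previous loans.
    out = out.merge(agg, on="applicant_id", how="left")

    # No history means zero previous loans; the mean is left as NaN.
    out["prev_loan_count"] = out["prev_loan_count"].fillna(0).astype(int)
    return out


features = add_manual_features(applications, previous_loans)
print(features)
```

The missing mean is left as NaN on purpose in this sketch, since gradient-boosted tree libraries such as LightGBM can handle missing values directly.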
LightGBM is used to predict this likelihood.
The data file can be found on Kaggle.
The EDA notebook contains all steps of the exploratory data analysis and data cleaning process.
The Features_Models notebook contains all steps of building the model.
The final model gives a local CV score of 0.77954 and a score of 0.78205 on the test data.
The full findings can be found in the post available here.
The source data can be acquired from Kaggle.