Credit Risk Prediction
Classification
- Here we determine whether a borrower is likely to default on a loan.
- Each borrower is classified as either "Default" or "Non-Default."
- For our dataset, we predict loan_status using the following features:
- person_home_ownership
- loan_intent
- loan_grade
- cb_person_default_on_file
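All four features listed above are categorical, so they usually have to be turned into numbers before a model such as logistic regression can use them. The following is a minimal sketch of one-hot encoding using only the Python standard library. The helper `one_hot_encode` and the sample values ("RENT", "EDUCATION", "B", "N", and so on) are assumptions made for illustration; they are not taken from the dataset used in this post.

```python
def one_hot_encode(rows, columns):
    """One-hot encode the given categorical columns.

    Returns the categories found per column and one 0/1 vector per row.
    """
    # Collect the sorted set of distinct values seen in each column.
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    encoded = []
    for row in rows:
        vector = []
        for col in columns:
            # One slot per category: 1 where the row's value matches, else 0.
            vector.extend(1 if row[col] == cat else 0 for cat in categories[col])
        encoded.append(vector)
    return categories, encoded


FEATURES = [
    "person_home_ownership",
    "loan_intent",
    "loan_grade",
    "cb_person_default_on_file",
]

# Hypothetical rows, for illustration only.
rows = [
    {"person_home_ownership": "RENT", "loan_intent": "EDUCATION",
     "loan_grade": "B", "cb_person_default_on_file": "N"},
    {"person_home_ownership": "OWN", "loan_intent": "MEDICAL",
     "loan_grade": "D", "cb_person_default_on_file": "Y"},
]

categories, encoded = one_hot_encode(rows, FEATURES)
for vector in encoded:
    print(vector)
```

In practice a library encoder would normally do this step; the sketch only shows what the transformation produces.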
Logistic Regression
- A supervised machine learning algorithm.
- It predicts a categorical dependent variable from a given set of independent variables.
- The outcome is binary: Yes or No, 0 or 1, True or False, etc.
- In logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic function whose output is bounded between 0 and 1 and is read as the probability of the positive class.
- The model's accuracy score is 0.8055853920515574.
- Its precision score is 0.7071005917159763.
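The "S"-shaped logistic function described above can be sketched in a few lines. This is an illustration only, not the code used to train the model in this post: the weights, bias, inputs, and the 0.5 threshold are hypothetical values chosen to show how a linear score is squashed into a probability and then turned into a class label.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number into the interval (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(z) >= threshold else 0


# Hypothetical parameters, for illustration only.
weights = [1.2, -0.8]
bias = -0.3

print(sigmoid(0.0))                              # the midpoint of the curve
print(predict_label([1.0, 0.0], weights, bias))  # linear score z = 0.9
print(predict_label([0.0, 1.0], weights, bias))  # linear score z = -1.1
```

A positive score gives a probability above 0.5 and a negative score gives one below it, which is why 0.5 is a common default threshold.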
Random Forest Classification
- A supervised machine learning algorithm.
- It combines an ensemble of multiple decision trees to generate predictions.
- It works both for regression, where the target variable is continuous, and for classification, where the target variable is categorical.
- The model's accuracy score is 0.9364738376553629.
- Its precision score is 0.9733079122974261.
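To make the ensemble idea concrete, here is a small sketch of the voting step only. A real random forest learns each tree from a random sample of the training data; the three "trees" below are hand-written rules, and the feature values ("Y", "RENT", grades "D" to "G") are assumptions made for illustration rather than rules learned from this dataset.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical stand-ins for trained decision trees.
def tree_a(applicant):
    return "Default" if applicant["cb_person_default_on_file"] == "Y" else "Non-Default"


def tree_b(applicant):
    return "Default" if applicant["person_home_ownership"] == "RENT" else "Non-Default"


def tree_c(applicant):
    return "Default" if applicant["loan_grade"] in {"D", "E", "F", "G"} else "Non-Default"


def forest_predict(applicant, trees):
    """Ask every tree for a prediction and combine them by majority vote."""
    return majority_vote([tree(applicant) for tree in trees])


applicant = {
    "cb_person_default_on_file": "N",
    "person_home_ownership": "RENT",
    "loan_grade": "E",
}
prediction = forest_predict(applicant, [tree_a, tree_b, tree_c])
print(prediction)  # tree_b and tree_c outvote tree_a
```

Using an odd number of trees with two classes avoids ties in the vote.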
Confusion Matrix
True Positive - 5047
True Negative - 42
False Positive - 397
False Negative - 1031
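Accuracy, precision, and recall are all simple ratios of the four confusion matrix counts. The sketch below applies the standard formulas to the counts listed above. The post does not say which model or which positive class these counts belong to, so the printed values are just the arithmetic on these four numbers; the function name `metrics_from_counts` is a hypothetical helper.

```python
def metrics_from_counts(tp, tn, fp, fn):
    """Compute accuracy, precision, and recall from confusion matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are correct
        "recall": tp / (tp + fn),        # share of actual positives that are found
    }


# Counts as listed in the confusion matrix above.
metrics = metrics_from_counts(tp=5047, tn=42, fp=397, fn=1031)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```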
Factors Affecting Loan Status
Inferences
- A person's annual income, employment length, loan grade, and credit history length have a negative relationship with the risk of loan default: the higher the annual income, the longer the employment length, the higher the loan grade, or the longer the credit history, the less likely that person is to default.
- The loan's interest rate and loan percent income have a positive relationship with the risk of loan default: the higher the interest rate or the loan percent income ratio, the higher the risk of default.
- A person's age and loan amount do not seem to have a strong correlation with the risk of loan default.
- In addition, other factors can distinguish a person with a higher risk of loan default from one with a lower risk:
- Owning a home or having a mortgage could indicate a lower default risk than renting.
- Different types of loans could be associated with different levels of default risk.
- Having a default on file could indicate a higher default risk than having no default on file.
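A "positive" or "negative" relationship like the ones above is commonly read off the sign of a correlation coefficient. The following sketch computes the Pearson correlation by hand on toy data. All numbers and the variable names `interest_rate`, `annual_income`, and `defaulted` are made up for illustration; they are not values from the dataset and say nothing about the actual strength of these relationships.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data, for illustration only (1 = default, 0 = no default).
interest_rate = [7.5, 9.0, 11.0, 13.5, 15.0, 16.5]
annual_income = [90000, 75000, 60000, 40000, 30000, 25000]
defaulted = [0, 0, 0, 1, 1, 1]

print(pearson(interest_rate, defaulted))  # positive: risk rises with the rate
print(pearson(annual_income, defaulted))  # negative: risk falls as income rises
```

A coefficient close to zero would correspond to the "no strong correlation" case described for age and loan amount.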