Monday, 18 September 2023

Credit Risk Prediction


Classification

  • Here we determine whether a borrower is likely to default on a loan. 
  • Each borrower is classified as "Default" or "Non-Default." 
  • For our dataset, we predict loan_status using the following features: 
  1. person_home_ownership 
  2. loan_intent 
  3. loan_grade 
  4. cb_person_default_on_file
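
The four features above are categorical, so they usually need a numeric encoding before a classifier can use them. The sketch below shows one common option, one-hot encoding, in plain Python. The post does not say which encoding was actually used, and the category values ("RENT", "EDUCATION", and so on) are assumptions made for illustration.

```python
# Column names come from the post; the category values are hypothetical.
FEATURES = [
    "person_home_ownership",
    "loan_intent",
    "loan_grade",
    "cb_person_default_on_file",
]


def one_hot(rows, columns):
    """Expand each categorical column into 0/1 indicator columns."""
    # Collect the sorted set of values observed in each column.
    levels = {c: sorted({r[c] for r in rows}) for c in columns}
    encoded = []
    for r in rows:
        out = {}
        for c in columns:
            for v in levels[c]:
                out[f"{c}_{v}"] = 1 if r[c] == v else 0
        encoded.append(out)
    return encoded


# Two made-up borrowers, for illustration only.
borrowers = [
    {"person_home_ownership": "RENT", "loan_intent": "EDUCATION",
     "loan_grade": "B", "cb_person_default_on_file": "N"},
    {"person_home_ownership": "OWN", "loan_intent": "MEDICAL",
     "loan_grade": "C", "cb_person_default_on_file": "Y"},
]

encoded = one_hot(borrowers, FEATURES)
print(encoded[0])
```

Each original column becomes one indicator column per observed value, so the two borrowers above produce eight 0/1 columns in total.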

Logistic Regression



  • Logistic regression is a supervised machine learning algorithm. 
  • It predicts a categorical dependent variable from a given set of independent variables. 
  • The outcome is binary: Yes or No, 0 or 1, True or False, etc. 
  • Instead of fitting a regression line, logistic regression fits an “S”-shaped logistic function whose output lies between 0 and 1 and is read as the probability of the positive class. 
  • The model's accuracy score is 0.8055853920515574. 
  • Its precision score is 0.7071005917159763.
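
To make the "S"-shaped function concrete, here is a minimal sketch of how a logistic model turns a weighted sum of features into a probability and then a class label. The weights, bias, and 0.5 threshold are hypothetical values chosen for illustration; they are not the fitted parameters behind the scores above.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_default(features, weights, bias, threshold=0.5):
    """Return (probability of default, predicted label 0 or 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return p, int(p >= threshold)


# Hypothetical parameters for two encoded features.
weights = [1.2, -0.8]
bias = -0.5

p, label = predict_default([1.0, 0.0], weights, bias)
print(f"probability={p:.3f}, label={label}")
```

The label depends on the threshold: lowering it flags more borrowers as "Default", which typically raises recall and lowers precision.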

Random Forest Classification


  • Random forest is a supervised machine learning algorithm. 
  • It combines the predictions of an ensemble of decision trees to produce a prediction or classification. 
  • It can handle datasets with continuous target variables, as in regression, and categorical target variables, as in classification. 
  • The model's accuracy score is 0.9364738376553629. 
  • Its precision score is 0.9733079122974261.
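
The ensemble idea can be sketched as a majority vote. The example below shows only the voting step: each "tree" is a hand-written one-split rule, whereas a real random forest learns its trees from bootstrap samples of the data with random feature subsets. The rules and the category values are hypothetical.

```python
from collections import Counter


def stump(feature, risky_values):
    """Build a one-split 'tree' that predicts 1 (default) for risky values."""
    def predict(row):
        return 1 if row[feature] in risky_values else 0
    return predict


def forest_predict(trees, row):
    """Classify a row by majority vote across all trees."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical rules; an odd count avoids tied votes.
trees = [
    stump("cb_person_default_on_file", {"Y"}),
    stump("person_home_ownership", {"RENT"}),
    stump("loan_grade", {"D", "E", "F", "G"}),
]

borrower = {
    "cb_person_default_on_file": "Y",
    "person_home_ownership": "RENT",
    "loan_grade": "B",
}

prediction = forest_predict(trees, borrower)
print(prediction)  # two of the three rules vote 1
```

Because the final label comes from many trees rather than one, a single poorly fitted tree has limited influence on the result.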

Confusion Matrix



True Positive - 5047
True Negative - 42
False Positive - 397
False Negative - 1031
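
Accuracy, precision, and recall all follow directly from these four counts. The sketch below computes them exactly as the counts are labeled above. The post does not say which model or data split produced this matrix. With these labels, accuracy comes out near 0.78, which matches neither score reported earlier, so the counts may come from a different run or may be labeled differently from the source matrix.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are correct
        "recall": tp / (tp + fn),        # share of actual positives that are found
    }


# Counts as listed in the post.
metrics = classification_metrics(tp=5047, tn=42, fp=397, fn=1031)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```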


Factors Affecting Loan Status



Inferences

  • A person's annual income, employment length, loan grade, and credit history length have a negative relationship with the risk of loan default. The higher the annual income, the longer the employment length, the higher the loan grade, or the longer the credit history, the less likely that person is to default.
  • The loan's interest rate and loan percent income have a positive relationship with the risk of loan default. The higher the interest rate or the loan percent income ratio, the higher the risk of default.
  • A person's age and the loan amount do not seem to be strongly correlated with loan default risk.
  • In addition, other factors can distinguish people with a higher risk of loan default from those with a lower risk:
  1. Owning a home or having a mortgage could indicate a lower default risk than renting.
  2. Different types of loans could be associated with different levels of default risk.
  3. Having a default on file could indicate a higher default risk than having no default on file.
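
One way to check the sign of such relationships is the Pearson correlation between a numeric feature and loan_status. The sketch below uses made-up toy values, not the post's dataset, and the column names loan_int_rate and person_income are assumptions. It shows only how a positive or negative relationship appears in the coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up toy values for six borrowers (1 = default, 0 = non-default).
loan_int_rate = [7.5, 9.0, 11.2, 13.5, 15.1, 16.8]
person_income = [95000, 80000, 62000, 48000, 40000, 30000]
loan_status = [0, 0, 0, 1, 1, 1]

rate_corr = pearson(loan_int_rate, loan_status)
income_corr = pearson(person_income, loan_status)
print(f"interest rate vs. default: {rate_corr:+.2f}")
print(f"income vs. default: {income_corr:+.2f}")
```

A coefficient near zero would correspond to the weak relationship described for age and loan amount.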