Data Science
Platform: Analytics Vidhya

Loan Predictions

1. Context & Objective

Dream Housing Finance wants to automate its loan eligibility process based on customer details. This project builds a classifier that predicts loan approval in real time.

2. Methodology

1. Imputed missing values using mode/median strategies.
2. Applied log transformation to handle income skewness.
3. Used SMOTE to handle class imbalance.
4. Trained an XGBoost classifier for final predictions.
In [1]:
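import numpy as np
import pandas as pd
from xgboost import XGBClassifier
from imblearn.over_sampling import SMOTE

df = pd.read_csv('loan_train.csv')

# Impute missing values: median for numeric columns, mode for categorical ones
for col in df.columns:
    is_num = pd.api.types.is_numeric_dtype(df[col])
    df[col] = df[col].fillna(df[col].median() if is_num else df[col].mode()[0])

# Log-transform the skewed loan amount
df['LoanAmount_log'] = np.log(df['LoanAmount'])

# One-hot encode the features; 'Loan_ID' is an assumed identifier column, dropped if present
X = pd.get_dummies(df.drop(columns=['Loan_Status', 'Loan_ID'], errors='ignore'), dtype=float)
# Encode the target as integer codes, which XGBoost requires
y = df['Loan_Status'].astype('category').cat.codes

# SMOTE needs numeric, NaN-free input, hence the imputation and encoding above
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)

model = XGBClassifier(eval_metric='logloss')
model.fit(X_res, y_res)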
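
To make the SMOTE step less of a black box, here is a minimal sketch of the idea behind it: each synthetic sample is placed at a random point on the line between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified illustration using only the standard library, not the imbalanced-learn implementation; the function name `smote_sketch` and the sample points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=42):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified illustration of the SMOTE idea, not imbalanced-learn's code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base point (itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment base -> neighbour
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two numeric features
rejected = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(rejected, n_new=6, k=2)
print(new_points)
```

Because every synthetic point lies between two real minority samples, the new rows stay inside the region the minority class already occupies rather than being exact duplicates of existing rows.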

3. Final Learnings

Log-transforming the skewed income data improved generalization. SMOTE kept the model from simply predicting 'Approve' for every applicant, and the final model reached an F1-score of 78%.
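
As a rough illustration of why a log transform helps with skewed income values, the sketch below measures skewness before and after the transform. The incomes are synthetic (drawn from a log-normal distribution with arbitrary, assumed parameters), not the project's data, and the helper `skewness` is written here for the example.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


# Synthetic, hypothetical incomes; the distribution parameters are assumptions
rng = random.Random(42)
incomes = [rng.lognormvariate(8.5, 0.8) for _ in range(614)]
log_incomes = [math.log(v) for v in incomes]

raw_skew = skewness(incomes)
log_skew = skewness(log_incomes)
print(f"skewness before log: {raw_skew:.2f}")
print(f"skewness after log:  {log_skew:.2f}")
```

A skewness near zero after the transform indicates a roughly symmetric distribution, so a handful of very high incomes no longer dominates the feature's range.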

Dataset details

Language

Python

Size

614 rows (Training)

Libraries Used

Pandas, Scikit-Learn, XGBoost