Program To Predict Loan Eligibility in Python Language Assignment Solution

July 09, 2024

Dr. Andrew

🇨🇦 Canada

Python

Dr. Andrew Taylor, a renowned figure in the realm of Computer Science, earned his PhD from McGill University in Montreal, Canada. With 7 years of experience, he has tackled over 500 Python assignments, leveraging his extensive knowledge and skills to deliver outstanding results.

Instructions

Objective

Write a Python program to predict loan eligibility.

Requirements and Specifications

Additional Project: Bancassurance

Description

Background and Context

The Best insurance company and My Bank have set up a bancassurance partnership (bancassurance is a relationship between a bank and an insurance company). Using the data of My Bank's liability customers, the Best insurance company wants to convert customers who hold both a life insurance policy and an account at My Bank into loan customers, that is, customers who take a loan against their life insurance policy.

A campaign that the company ran last year for liability customers showed a healthy conversion rate of over 12.56%. You are provided with data on customers who have an account at My Bank and a life insurance policy with the Best insurance company.

As a data scientist at the Best insurance company, you have to build a model that identifies the customers most likely to respond positively, that is, those with a higher probability of purchasing the loan. Targeting these customers will increase the success ratio and reduce the cost of the campaign.
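The conversion rate quoted above also sets a baseline for judging any model. As a minimal sketch, assuming the share of positive responders in the data is close to the 12.56% campaign figure (an assumption, since the dataset itself is not shown here), a model that always predicts "no loan" already reaches about 87% accuracy. Accuracy alone therefore says little about how well responders are identified.

```python
# Assumption: the campaign's 12.56% conversion rate approximates the
# fraction of positive (TARGET = 1) customers in the data.
positive_rate = 0.1256


def majority_baseline_accuracy(rate: float) -> float:
    """Accuracy of a model that always predicts the majority class."""
    return max(rate, 1.0 - rate)


baseline = majority_baseline_accuracy(positive_rate)
print(f"Always predicting 'no loan' scores {baseline:.2%} accuracy")
```

Any model trained for this task should be compared against this baseline, ideally with metrics such as precision and recall for the positive class.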

Objective

To predict whether a liability customer will buy a loan or not.
To identify which variables are most significant for making predictions.
To identify which segment of customers should be targeted more.
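The second and third objectives can be explored with two simple summaries: a per-feature chi-squared score, and the conversion rate per customer segment. The sketch below runs on synthetic data. The column names OCCUPATION, BALANCE and NOISE and the way TARGET is generated are hypothetical and chosen only for illustration; they are not taken from My_Bank.csv.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import chi2

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for the customer table (all columns are hypothetical).
toy = pd.DataFrame({
    "OCCUPATION": rng.choice(["SAL", "SELF-EMP", "PROF"], size=n),
    "BALANCE": rng.uniform(0, 1, size=n),  # already scaled to [0, 1]
    "NOISE": rng.uniform(0, 1, size=n),    # unrelated to the target
})
# Synthetic target: the purchase probability grows with BALANCE.
toy["TARGET"] = (rng.uniform(0, 1, size=n) < 0.05 + 0.3 * toy["BALANCE"]).astype(int)

# Objective 2: rank numeric features by chi-squared score against the target.
# chi2 requires non-negative feature values, which holds for [0, 1] data.
features = toy[["BALANCE", "NOISE"]]
scores, p_values = chi2(features, toy["TARGET"])
ranking = pd.Series(scores, index=features.columns).sort_values(ascending=False)
print(ranking)

# Objective 3: conversion rate and group size per segment.
segment_rates = (
    toy.groupby("OCCUPATION")["TARGET"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(segment_rates)
```

On real data, segments with a high mean and a reasonable count are natural candidates for targeting, while the ranking gives a first indication of which variables carry signal.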

Source Code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing, tree
import seaborn as sns
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

### Read Data
df = pd.read_csv('My_Bank.csv')
df.head(10)
print(f"This dataset has {len(df)} rows")

### Show the number of NaN values in each column
df.isnull().sum()

### Remove non-useful columns
df = df.drop(columns=['CUST_ID'])

### Convert ACC_OP_DATE to Numeric
# Note: the "%m%d%Y" encoding yields integers that are not in chronological order
df['ACC_OP_DATE'] = pd.to_datetime(df['ACC_OP_DATE']).dt.strftime("%m%d%Y").astype(int)
df.head(5)

### Categorize object columns
# Map each distinct value of a text column to an integer code
object_columns = df.select_dtypes(include=['object']).columns
for col in object_columns:
    values = df[col].unique()
    values_dict = {value: code for code, value in enumerate(values)}
    df[col] = df[col].map(values_dict)

### Normalize data
# Min-max scaling to [0, 1]; chi2 below requires non-negative features
df_norm = (df - df.min()) / (df.max() - df.min())
df_norm.head()

### Extract target column
Y = df_norm['TARGET']
X = df_norm.drop(columns=['TARGET'])
X.head()
print(f"There are {len(X.columns)} variables and {len(X)} records")

### Display correlation map to see the relation between variables
f = plt.figure(figsize=(10, 10))
plt.matshow(df_norm.corr(), fignum=f.number)
plt.colorbar()
plt.xticks(range(len(df_norm.columns)), df_norm.columns, rotation=90)
plt.yticks(range(len(df_norm.columns)), df_norm.columns)
plt.show()

### Split data into train and test
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=42)

### Build LogisticRegression Model
model = LogisticRegression()
model.fit(X_train, Y_train)

### Score
model.score(X_test, Y_test)

### Create a plot of model's accuracy vs. K best features
scores = []
for k in range(1, len(X.columns)):
    X_new = SelectKBest(chi2, k=k).fit_transform(X, Y)
    X_train2, X_test2, Y_train2, Y_test2 = train_test_split(X_new, Y, test_size=0.3, random_state=42)
    model = LogisticRegression()
    model.fit(X_train2, Y_train2)
    score = model.score(X_test2, Y_test2)
    scores.append(score)
plt.plot(range(1, len(X.columns)), scores)
plt.grid(True)
plt.xlabel('Number of Features')
plt.ylabel("Model's Accuracy")
plt.show()

### Pick optimal number of features
kopt = range(1, len(X.columns))[np.argmax(scores)]
print(f"The optimal number of features is {kopt}, giving a model accuracy of {max(scores)*100.0}%")
Xopt_lr = SelectKBest(chi2, k=kopt).fit_transform(X, Y)

# Build a new model but only with best features

### Select best features
X_new = SelectKBest(chi2, k=kopt).fit_transform(X, Y)

### Split into Train and Test with new X values
# Fixed: split the reduced feature matrix X_new, not the full X
X_train2, X_test2, Y_train2, Y_test2 = train_test_split(X_new, Y, test_size=0.3, random_state=42)

### Build Model
model2 = LogisticRegression()
model2.fit(X_train2, Y_train2)
model2.score(X_test2, Y_test2)

# Decision Tree
treeClf = tree.DecisionTreeClassifier()
treeClf.fit(X_train, Y_train)
treeClf.score(X_test, Y_test)

### Select K best features and run again the decision tree
scoresTree = []
for k in range(1, len(X.columns)):
    X_new = SelectKBest(chi2, k=k).fit_transform(X, Y)
    X_train3, X_test3, Y_train3, Y_test3 = train_test_split(X_new, Y, test_size=0.3, random_state=42)
    treeClf = tree.DecisionTreeClassifier()
    treeClf.fit(X_train3, Y_train3)
    score = treeClf.score(X_test3, Y_test3)
    scoresTree.append(score)
# Fixed: plot the decision tree scores, not the logistic regression scores
plt.plot(range(1, len(X.columns)), scoresTree)
plt.grid(True)
plt.xlabel('Number of Features')
plt.ylabel("Model's Accuracy")
plt.show()

koptTree = range(1, len(X.columns))[np.argmax(scoresTree)]
print(f"The optimal number of features for Decision Tree is {koptTree}, giving a model accuracy of {max(scoresTree)*100.0}%")
Xopt_tree = SelectKBest(chi2, k=koptTree).fit_transform(X, Y)

### Plot Scores of both LogisticRegression and DecisionTree vs. Number of features
plt.plot(range(1, len(X.columns)), scores, label='LogisticRegression')
plt.plot(range(1, len(X.columns)), scoresTree, label='DecisionTree')
plt.legend()
plt.grid(True)
plt.xlabel('Number of Features')
plt.ylabel("Model's Accuracy")
plt.show()
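One possible refinement, offered as a sketch and not as part of the original solution: the code above scales the data and runs SelectKBest on all rows before splitting, and it picks the number of features by test-set accuracy, so information from the test set influences the chosen model. Wrapping scaling, selection and the classifier in a scikit-learn Pipeline and tuning k by cross-validation on the training split avoids this. The example uses synthetic data from make_classification with roughly 13% positives as an assumed stand-in for My_Bank.csv, and scores by F1 because the classes are imbalanced.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced data (an assumption standing in for the real dataset).
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, n_informative=4,
    weights=[0.87, 0.13], random_state=42,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, stratify=y_demo, random_state=42
)

# Scaling and feature selection are fitted on training folds only.
pipe = Pipeline([
    ("scale", MinMaxScaler()),       # maps training data to [0, 1] for chi2
    ("select", SelectKBest(chi2)),   # k is tuned by the grid search below
    ("clf", LogisticRegression(max_iter=1000)),
])

# Choose k by 5-fold cross-validation on the training split, scored by F1.
search = GridSearchCV(pipe, {"select__k": list(range(1, 11))}, scoring="f1", cv=5)
search.fit(X_tr, y_tr)
best_k = search.best_params_["select__k"]
print(f"Best number of features by cross-validation: {best_k}")

# The held-out test split is used once, for the final report.
y_pred = search.predict(X_te)
print(confusion_matrix(y_te, y_pred))
print(classification_report(y_te, y_pred, zero_division=0))
```

The same pattern works with tree.DecisionTreeClassifier in place of LogisticRegression, which allows a like-for-like comparison of the two models.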
