The Regression Model Of The United States

First of all, it is more reasonable to compare models that are built on the same data, so I tried to use the same variables and the same missing-value treatment (except for the decision tree) for all of the models.
All three models performed at nearly the same level, according to the various lift charts produced and presented in later parts of this report.

However, the difference becomes more evident in the % captured response chart, where the most efficient and useful model turns out to be the logistic regression model.
It is described in greater detail in part 4 of this report.
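As a rough illustration of what a % captured response chart measures, the sketch below ranks customers by predicted score and reports the cumulative share of all responders found in each successive bin. The function name, the toy scores and the labels are hypothetical and are not taken from the report's data; the report's own charts may bin or round differently.

```python
from typing import List, Sequence


def cumulative_captured_response(
    scores: Sequence[float], labels: Sequence[int], bins: int = 10
) -> List[float]:
    """Cumulative % of all responders (label 1) captured after each bin."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total = sum(labels)
    if total == 0:
        raise ValueError("at least one responder (label 1) is required")

    # Rank cases from the highest to the lowest predicted score.
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    ranked = [label for _, label in pairs]

    n = len(ranked)
    captured = []
    for b in range(1, bins + 1):
        cutoff = round(n * b / bins)  # number of cases in the top b bins
        captured.append(100.0 * sum(ranked[:cutoff]) / total)
    return captured


# Hypothetical toy data: 10 customers, 3 of them responders.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
captured_by_bin = cumulative_captured_response(toy_scores, toy_labels, bins=5)
print(captured_by_bin)
```

A model whose curve rises faster in the first bins captures more responders for the same number of customers contacted, which is the sense in which one model can look better here even when the lift charts are close.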

This ROC plot indicates that logistic regression is also efficient in terms of the trade-off between …

2. Recommended Model - Decision Tree

The recommended decision tree model includes two variables: annual income and loans. Both are interval variables and represent the original observations. They were chosen for the final model because, after several trials, they proved to be the key ones in determining the rules within the decision trees.
In terms of missing values, nothing particular had to be done, because decision trees handle missing values by default.
As for the splitting criterion, after learning more about each of the criteria and performing numerous trials, Gini was chosen. It measures how mixed the class frequencies are within a node, so a split that lowers it produces purer leaves.
Presented below is the model assessment graph, which shows the misclassification rate at each number of leaves.

As can be seen from the graph, this model narrows the gap between the misclassification rates on the training and actual sets, compared with the other settings and variable selections that were tried.
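To make the Gini splitting criterion used for this tree more concrete, here is a minimal sketch of the standard Gini impurity, 1 minus the sum of squared class proportions, and of the weighted impurity of a two-way split. The function names and the toy labels are hypothetical; the modelling tool used in the report may compute or report the criterion in a slightly different form.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def split_gini(left: Sequence[Hashable], right: Sequence[Hashable]) -> float:
    """Impurity of a two-way split, weighted by the size of each child node."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


# Hypothetical node with an even mix of good and bad customers.
parent = ["bad", "bad", "good", "good"]
print(gini_impurity(parent))                           # 0.5
print(split_gini(["bad", "bad"], ["good", "good"]))    # 0.0 (perfectly pure children)
print(split_gini(["bad", "good"], ["bad", "good"]))    # 0.5 (no improvement)
```

A tree grown with this criterion prefers, at each node, the candidate split with the lowest weighted impurity.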

Another indicator of this model's usefulness is the lift value graph. The baseline represents what we would achieve without a prediction model, while the intercept of the red line shows that with this decision tree we can identify 3.7% more bad customers than we would have done without it.
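For readers unfamiliar with lift, the sketch below uses one common definition: the rate of bad customers among the top-scored fraction of cases divided by the overall rate, so a value of 1.0 corresponds to the baseline. This is an assumption about the general technique and may not match the exact quantity plotted in the report's graph; the function name and the toy data are hypothetical.

```python
from typing import Sequence


def lift_at_depth(
    scores: Sequence[float], labels: Sequence[int], depth: float = 0.1
) -> float:
    """Lift in the top `depth` fraction of cases ranked by predicted score."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < depth <= 1:
        raise ValueError("depth must be in the interval (0, 1]")
    n = len(labels)
    overall_rate = sum(labels) / n
    if overall_rate == 0:
        raise ValueError("at least one positive label is required")

    # Rank cases from the highest to the lowest predicted score.
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    ranked = [label for _, label in pairs]

    cutoff = max(1, round(n * depth))  # number of cases in the top fraction
    top_rate = sum(ranked[:cutoff]) / cutoff
    return top_rate / overall_rate


# Hypothetical toy data: label 1 marks a bad customer.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(lift_at_depth(toy_scores, toy_labels, depth=0.2))
print(lift_at_depth(toy_scores, toy_labels, depth=1.0))  # 1.0, the baseline
```

At full depth every case is selected, so the lift always returns to the baseline value of 1.0.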

The %
