Decision Tree Modeling Using R

Course Details

Course Objectives
After completing the Decision Tree course at Iteanz, you should be able to:
1. Understand the Anatomy of a Decision Tree
2. Learn to use the R platform to develop Decision Trees
3. Apply various Decision Tree techniques (CHAID / CART etc.)
4. Perform Decision Tree Model Validation
5. Learn where to use CHAID / CART / ID3, etc.
6. Learn to design data for Decision Tree modelling
7. Interpret and implement a Decision Tree model
8. Implement Decision Trees to derive business insights

Who should go for this course?
The course is designed for professionals who want to learn Decision Tree modelling and apply the modelling techniques using R. It is intended for:
1. Developers who want to step up as Data Scientists
2. Analytics Consultants
3. R / SAS / SPSS Professionals
4. Data Analysts
5. Information Architects and Data Engineers
6. Statisticians

Why learn Decision Tree?
Decision Tree modelling is a popular analytics technique. This course can give you a head start on:
1. What core analytics work involves
2. What practitioners mean when they talk about a model
3. Why modelling is such a beneficial proposition
4. How to develop a decision tree on the popular R platform
5. How to validate a model so that you know it will keep working over time

What are the pre-requisites for this Course?
The pre-requisite for this course is basic knowledge of the R programming language. The course explains only the R syntax that is required for Decision Tree model development.

Which Case-Studies will be a part of the Course?
During the course, you will work on a live project in which you use a Credit Risk data set to perform Decision Tree modelling using R.

Project #1: Development of a classification tree
Industry: Credit Card
Problem Statement: Model the propensity for a better credit rating

Project #2: Development of a regression tree, and understanding the difference between Regression Tree and Linear Regression scenarios
Industry: Education
Problem Statement: We will use R to develop a regression tree (where the dependent variable is numeric) and examine the benefits and constraints it brings in comparison to linear regression

1. Introduction to Decision Tree
Learning Objectives - In this module, you will understand what a Decision Tree is and what benefits it offers, what the core objectives of Decision Tree modelling are, how to read the gains from a Decision Tree, and how to apply them in business scenarios.
Topics - Decision Tree modelling objective, Anatomy of a Decision Tree, Gains from a decision tree (KS calculations), and Definitions related to objective segmentations
2. Data design for Modelling
Learning Objectives - In this module, you will learn how to design the data for modelling
Topics - Historical window, Performance window, Deciding the performance window horizon using Vintage analysis, General precautions related to data design
3. Data treatment before Modelling
Learning Objectives - In this module, you will learn how to run data sanity checks and perform the other necessary checks before modelling
Topics - Data sanity checks (Contents, View, Frequency Distribution, Means / Univariate), Categorical variable treatment, Missing value treatment guideline, Capping guideline
4. Classification Tree development and Algorithm details
Learning Objectives - In this module, you will learn to use R and the Algorithm to develop the Decision Tree.
Topics - Preamble to data, Installing the R package and RStudio, Developing a first Decision Tree in RStudio, Finding the strength of the model, Algorithm behind the Decision Tree, How a Decision Tree is developed, First on a categorical dependent variable, GINI method, Steps taken by software programs to learn the classification (develop the tree), Assignment on decision tree
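As a flavour of what developing a first tree with the GINI method can look like, here is a minimal sketch. It assumes the rpart package and its bundled kyphosis data as stand-ins, since the course's own package choice and credit-risk data are not specified here; the helper name gini is hypothetical.

```r
# Minimal sketch, not the course's own material.
# Assumption: rpart and its bundled kyphosis data stand in for the
# course's tools and credit-risk data.
library(rpart)

# Gini impurity of a node: 1 minus the sum of squared class proportions.
# (gini is a hypothetical helper name used only for this illustration.)
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Impurity of the root node, i.e. all observations before any split.
root_gini <- gini(kyphosis$Kyphosis)

# Fit a classification tree; split = "gini" is rpart's default
# splitting index for method = "class", spelled out here for clarity.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis,
             method = "class",
             parms = list(split = "gini"))

print(fit)  # text view of the splits and the class counts in each node

# In-sample confusion matrix. This is optimistic: a fair estimate of
# model strength needs data the tree has not seen.
pred <- predict(fit, kyphosis, type = "class")
conf_mat <- table(actual = kyphosis$Kyphosis, predicted = pred)
print(conf_mat)
```

The tree algorithm looks for the split that lowers the weighted Gini impurity of the child nodes the most, then repeats the search inside each child.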
5. Industry practice of Classification tree-Development, Validation and Usage
Learning Objectives - In this module, you will understand how Classification trees are developed, validated and used in the industry
Topics - Discussion on assignment, Finding the strength of the model, Steps taken by the software program to apply the learning to unseen data, Learning more from a practical point of view, Model Validation and Deployment
6. Regression Tree and Auto Pruning
Learning Objectives - In this module, you will understand the advanced stopping criteria of a decision tree. You will also learn to develop Decision Trees for numeric outcomes.
Topics - Introduction to Pruning, Steps of Pruning, Logic of pruning, Understanding K-fold validation for a model, Implementing Auto Pruning using R, Developing a Regression Tree, Interpreting the output, How it differs from Linear Regression, Advantages and Disadvantages over Linear Regression, Another Regression Tree using R
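The pruning and regression tree topics above can be sketched as follows. This is one common approach rather than the course's definitive method: it assumes the rpart package, uses the bundled kyphosis and mtcars data as stand-ins, and prunes to the complexity parameter with the lowest cross-validated error.

```r
# Minimal sketch, not the course's own material.
# Assumption: rpart, kyphosis and mtcars are stand-ins for the course's
# tools and data.
library(rpart)

set.seed(42)  # cross-validation folds are random; fix the seed for repeatability

# Grow a deliberately large tree: cp = 0 switches off the complexity
# stopping rule, and xval = 10 requests 10-fold cross-validation.
big <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0, minsplit = 5, xval = 10))

printcp(big)  # complexity table: CP, nsplit, rel error, xerror, xstd

# Pick the cp value with the lowest cross-validated error (xerror)
# and cut the tree back to that size.
cp_tab  <- big$cptable
best_cp <- cp_tab[which.min(cp_tab[, "xerror"]), "CP"]
pruned  <- prune(big, cp = best_cp)

# Regression tree: the dependent variable (mpg) is numeric,
# so method = "anova" is used instead of "class".
reg <- rpart(mpg ~ wt + hp + disp, data = mtcars, method = "anova")

# Linear regression on the same variables, for comparison.
lin <- lm(mpg ~ wt + hp + disp, data = mtcars)

# The tree predicts one constant value per leaf, whereas the linear
# model's predictions vary continuously with the predictors.
tree_values <- length(unique(predict(reg, mtcars)))
lm_values   <- length(unique(predict(lin, mtcars)))
```

Comparing tree_values with lm_values shows one practical difference from linear regression: a regression tree outputs a small set of step values, one per leaf.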
7. CHAID Algorithm
Learning Objectives - In this module, you will learn what Chi-square and CHAID are, how they work, and how CHAID differs from CART.
Topics - Key features of CART, Chi-square statistics, Implementing Chi-square for decision tree development, Syntax for CHAID using R, and CHAID vs CART
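The Chi-square statistic that CHAID relies on can be illustrated with base R alone. The sketch below covers only the statistic for one candidate predictor, not the full CHAID procedure (which also merges categories and adjusts significance levels), and the counts are hypothetical numbers made up for illustration.

```r
# Minimal sketch of the chi-square statistic behind CHAID-style splits.
# The counts are hypothetical: rows are the categories of one candidate
# predictor, columns are the target outcome.
counts <- matrix(c(30, 10,
                   20, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(segment = c("A", "B"),
                                 outcome = c("good", "bad")))

# Expected counts if predictor and outcome were independent:
# row total * column total / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_sq <- sum((counts - expected)^2 / expected)

# The same statistic from base R. correct = FALSE turns off the Yates
# continuity correction that chisq.test applies to 2x2 tables by default.
test <- chisq.test(counts, correct = FALSE)

print(chi_sq)
print(test$p.value)
```

A larger statistic (smaller p-value) means the predictor separates the outcome classes more strongly, which is the basis on which a chi-square driven tree chooses its split variable.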
8. Other Algorithms
Learning Objectives - In this module, you will learn about ID3, Entropy, and the Random Forest method, including how to run Random Forest in R
Topics - Entropy in the context of decision trees, ID3, Random Forest method, Using R for the Random Forest method, Project work
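Entropy and the information gain that ID3 uses to choose a split can be worked through in a few lines of base R. This is a minimal sketch on hypothetical toy labels; the function names entropy and info_gain are illustrative and not part of any package.

```r
# Minimal sketch of entropy and information gain (the ID3 split criterion).
# entropy and info_gain are hypothetical helper names; the labels and
# groups below are toy data made up for illustration.

# Shannon entropy in bits: -sum(p * log2(p)) over the classes present.
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]  # drop empty classes so 0 * log2(0) never appears
  -sum(p * log2(p))
}

# Information gain of splitting `labels` by `groups`:
# parent entropy minus the size-weighted entropy of the child nodes.
info_gain <- function(labels, groups) {
  parts   <- split(labels, groups, drop = TRUE)
  weights <- lengths(parts) / length(labels)
  child   <- sapply(parts, entropy)
  entropy(labels) - sum(weights * child)
}

labels <- c("good", "good", "good", "good", "bad", "bad", "bad", "bad")

# A predictor that separates the classes perfectly...
perfect <- c("A", "A", "A", "A", "B", "B", "B", "B")
# ...and one that carries no information about the class.
useless <- c("A", "B", "A", "B", "A", "B", "A", "B")

gain_perfect <- info_gain(labels, perfect)
gain_useless <- info_gain(labels, useless)
```

ID3 evaluates this gain for every candidate predictor at a node and splits on the one with the highest value.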