MapReduce Design Patterns

Course Objectives
After completing the MapReduce Design Patterns course, you will be able to:
1. Understand the commonly used Design Patterns in MapReduce
2. Recognize the scenarios in which to apply those Patterns to real-world problems
3. Write mature code using MapReduce
4. Learn the best practices for using MapReduce

Who should go for this course?
The course is designed for people who want to deepen their expertise in the MapReduce paradigm.

Pre-requisites
The prerequisites for this course are hands-on experience with the Hadoop framework and a basic understanding of MapReduce.

Why Learn MapReduce Design Patterns? 
Design Patterns are problem-specific templates that developers have refined over the years for writing correct and efficient code. They encode proven practices for solving a given class of problem, so that a developer need not reinvent the wheel. Bugs in MapReduce programs can be hard to track down; using well-established Design Patterns can ease that pain.

1. Introduction & Summarization Patterns
Learning Objectives - In this module, you will be introduced to Design Patterns in the context of MapReduce, the general structure of the course, and the project work. The module also covers Summarization Patterns: Patterns that give a summarized, top-level view of large data sets.
Topics - Review of MapReduce, Why Design Patterns are required for MapReduce, Discussion of different classes of Design Patterns, Discussion of project work and problem, About Summarization Patterns, Types of Summarization Patterns: Numerical Summarization Pattern, Inverted Index Pattern and Counting with Counters Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), Use cases, Analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
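
As a rough illustration of the Numerical Summarization Pattern, the sketch below simulates the map, combine, shuffle and reduce phases in plain Python. It is not the course's example code and does not use the Hadoop API; the function names (mapper, reducer, run_job) and the sample data are hypothetical. It assumes the summary is a (min, max, count) tuple per key, which is associative and commutative, so the same function can serve as both combiner and reducer.

```python
from collections import defaultdict

def mapper(record):
    """Emit (key, (min, max, count)) for one (key, value) input record."""
    key, value = record
    yield key, (value, value, 1)

def reducer(key, values):
    """Fold partial (min, max, count) tuples; also usable as the combiner."""
    lo, hi, count = None, None, 0
    for v_min, v_max, v_count in values:
        lo = v_min if lo is None else min(lo, v_min)
        hi = v_max if hi is None else max(hi, v_max)
        count += v_count
    return key, (lo, hi, count)

def run_job(splits):
    """Simulate map -> combine (per input split) -> shuffle -> reduce."""
    shuffled = defaultdict(list)
    for split in splits:
        local = defaultdict(list)
        for record in split:
            for key, value in mapper(record):
                local[key].append(value)
        # Combiner step: pre-aggregate inside the split before the shuffle.
        for key, values in local.items():
            _, partial = reducer(key, values)
            shuffled[key].append(partial)
    return dict(reducer(key, values) for key, values in shuffled.items())

# Hypothetical input: two splits of (user, value) records.
splits = [[("alice", 5), ("bob", 7), ("alice", 2)], [("alice", 9), ("bob", 1)]]
summary = run_job(splits)
```

Because the combiner collapses each split to one tuple per key, only four partial tuples cross the simulated shuffle here instead of five raw records; on real data sets the saving is usually much larger.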
 
2. Filtering Patterns
Learning Objectives - In this module, we will discuss Filtering Patterns: Patterns that create subsets of data for a more detailed view.
Topics - About Filtering Patterns, Explain & distinguish 4 different types of Filtering Patterns: Filtering Pattern, Bloom Filter Pattern, Top Ten Pattern and Distinct Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), Use cases, Analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
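
One common way to structure the Top Ten Pattern is for each mapper to keep only its local top N records and for a single reducer to merge those candidates. The plain-Python sketch below illustrates that idea under stated assumptions: records are (id, score) pairs, N is 3 for brevity, and the names map_top_n and reduce_top_n are hypothetical rather than taken from the course.

```python
import heapq

def map_top_n(split, n):
    """Mapper side: keep only the n highest-scoring records of one split."""
    return heapq.nlargest(n, split, key=lambda rec: rec[1])

def reduce_top_n(candidates_per_mapper, n):
    """Single reducer: merge every mapper's candidates and keep the top n."""
    merged = [rec for candidates in candidates_per_mapper for rec in candidates]
    return heapq.nlargest(n, merged, key=lambda rec: rec[1])

# Hypothetical input: two splits of (id, score) records.
splits = [[("a", 10), ("b", 40), ("c", 25)], [("d", 30), ("e", 5), ("f", 50)]]
top3 = reduce_top_n([map_top_n(split, 3) for split in splits], 3)
```

Each mapper sends at most N records to the reducer, so the shuffle volume depends on the number of mappers and N rather than on the size of the input.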
 
3. Data Organization Patterns
Learning Objectives - In this module, we will discuss Data Organization Patterns: Patterns for re-organizing and transforming data. Patterns from this category are often used together to achieve the end objective.
Topics - About Organization Patterns, Explain 5 different types of Organization Patterns: Structured to Hierarchical Pattern, Partitioning Pattern, Binning Pattern, Total Order Sorting Pattern and Shuffling Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), Use cases, Analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
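
The idea behind Total Order Sorting can be sketched in plain Python: a range partitioner sends each key to the reducer that owns its key range, each reducer sorts its own partition, and the partitions read in order form a globally sorted output. This is a minimal sketch, not the course's code. The split points are hard-coded assumptions (in practice they would typically come from sampling the input), and the function names are hypothetical.

```python
import bisect

def range_partition(key, split_points):
    """Return the index of the reducer whose key range contains `key`."""
    return bisect.bisect_right(split_points, key)

def total_order_sort(keys, split_points):
    """Route keys to range partitions, then sort each partition on its own."""
    partitions = [[] for _ in range(len(split_points) + 1)]
    for key in keys:
        partitions[range_partition(key, split_points)].append(key)
    return [sorted(partition) for partition in partitions]

# Hypothetical keys and split points: three reducers own <20, 20-59 and >=60.
parts = total_order_sort([42, 7, 19, 88, 3, 55, 61], split_points=[20, 60])
flat = [key for partition in parts for key in partition]
```

No merge step is needed at the end: every key in partition i is smaller than every key in partition i+1, so concatenating the reducer outputs preserves the order.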
 
4. Join Patterns
Learning Objectives - In this module, we will discuss Join Patterns: Patterns to use when your data is scattered across multiple sources and you want to uncover interesting relationships by using these sources together.
Topics - About Join Patterns, Explain 4 different types of Join Patterns: Reduce Side Join Pattern, Replicated Join Pattern, Composite Join Pattern and Cartesian Product Join Pattern. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), Use cases, Analogies to Pig & SQL, Performance Analysis, Example code walk-through & data flow.
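
To make the Reduce Side Join Pattern concrete, here is a minimal plain-Python sketch of an inner join. It assumes two hypothetical data sets, users and orders, that share a user_id key. Each mapper tags its records with the source they came from, the shuffle groups them by key, and the reducer pairs records from the two sources. The names and data are illustrative only, and this is not Hadoop API code.

```python
from collections import defaultdict

def map_users(user):
    """Tag each user record with 'U' and emit it under the join key."""
    user_id, name = user
    yield user_id, ("U", name)

def map_orders(order):
    """Tag each order record with 'O' and emit it under the join key."""
    user_id, item = order
    yield user_id, ("O", item)

def reduce_join(key, tagged_values):
    """Inner join: pair every user record with every order for the same key."""
    names = [value for tag, value in tagged_values if tag == "U"]
    items = [value for tag, value in tagged_values if tag == "O"]
    for name in names:
        for item in items:
            yield key, name, item

def run_join(users, orders):
    """Simulate the shuffle by grouping all tagged values by join key."""
    groups = defaultdict(list)
    for user in users:
        for key, value in map_users(user):
            groups[key].append(value)
    for order in orders:
        for key, value in map_orders(order):
            groups[key].append(value)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_join(key, groups[key]))
    return joined

users = [(1, "Ann"), (2, "Raj"), (3, "Lee")]
orders = [(1, "book"), (2, "pen"), (1, "lamp")]
joined = run_join(users, orders)
```

User 3 has no orders and therefore produces no output row. Emitting unmatched records with a placeholder in the reducer would turn the same structure into an outer join.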
 
5. Meta Patterns & Graph Patterns
Learning Objectives - In this module, we will discuss Meta Patterns & Graph Patterns. Meta Patterns differ from the Patterns discussed above: they are not basic patterns but Patterns about Patterns. The module also introduces Graph Patterns.
Topics - About Meta Patterns, Types of Meta Patterns: Job Chaining (Description, Use cases, Chaining with driver, Basic & parallel job chaining, Chaining with shell scripts, Chaining with job control, Example code walk-through), Chain Folding (Description, What to fold, Chain Mapper, Chain Reducer, Example code walk-through), Job Merging (Description, Steps for merging two jobs, Example code walk-through), Introduction to Graph Design Patterns, Types of Graph Design Patterns: In-mapper Combining Pattern, Schimmy Pattern and Range Partitioning Pattern, Pseudo-code for each pattern applied to the PageRank algorithm.
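
The In-mapper Combining Pattern can be sketched as a mapper that accumulates partial results in memory and emits them only once, when its input is exhausted. The plain-Python sketch below applies this to the rank-distribution step of PageRank. It is a simplified illustration, not the course's pseudo-code: the class name, the (node, rank, out_links) record layout and the sample graph are assumptions, and dangling nodes and the damping factor are left out.

```python
from collections import defaultdict

class InMapperCombiningMapper:
    """Accumulates rank contributions in memory instead of emitting each one."""

    def __init__(self):
        # Partial sums of rank mass per target node, held for the whole split.
        self.partial = defaultdict(float)

    def map(self, node, rank, out_links):
        """Split this node's rank evenly across its outgoing links."""
        if not out_links:
            return  # Simplification: dangling nodes are ignored here.
        share = rank / len(out_links)
        for target in out_links:
            self.partial[target] += share

    def cleanup(self):
        """Called once after the last record: emit one pair per target node."""
        yield from sorted(self.partial.items())

# Hypothetical split of (node, rank, out_links) records.
graph_split = [("a", 0.5, ["b", "c"]), ("b", 0.25, ["c"]), ("c", 0.25, ["a"])]
mapper = InMapperCombiningMapper()
for node, rank, links in graph_split:
    mapper.map(node, rank, links)
emitted = list(mapper.cleanup())
```

Without in-mapper combining this split would emit four contributions (two of them for node c); with it, the mapper emits three pairs. The trade-off is that the partial sums must fit in the mapper's memory.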
 
6. Input Output Pattern & Project Review
Learning Objectives - In this module, we will discuss Input Output Patterns, which are about customizing input & output to increase the value of MapReduce, and review the project work.
Topics - About Input Output Patterns, Types of Input Output Patterns: Customizing Input & Output, Generating Data, External Source Output, External Source Input and Partition Pruning. For each pattern: Description, Applicability, Structure (how mappers, combiners & reducers are used in this pattern), Use cases, Analogies to Pig & SQL, Performance Analysis, Example code walk-through. Also: reviewing the project work solution.