HDP Developer: Quick Start

This four-day training course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Apache Pig and Apache Hive, and who need to develop applications on Apache Spark.

About this course

Subject-Matter Expert Live Training

Overview:
This training course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Apache Pig and Apache Hive, and who need to develop applications on Apache Spark. Topics include an essential understanding of HDP and its capabilities, Hadoop, YARN, HDFS, MapReduce/Tez, data ingestion, using Pig and Hive to perform data analytics on Big Data, and an introduction to Spark Core, Spark SQL, Apache Zeppelin, and additional Spark features.

Target Audience:
Developers and data engineers who need to understand and develop applications on HDP. 

Prerequisites:
Students should be familiar with programming principles and have experience in software development. SQL and light scripting knowledge is also helpful. No prior Hadoop knowledge is required.

Format:
Lecture/Discussion, Hands-on Labs and Demos

Duration:
4 Days

How to Register:

  1. Click the "Purchase" button at the top of the page to start your purchase.
  2. After you have completed your purchase and registration, log in to your account and select the event you wish to attend from the classes scheduled below.
  3. Please note that you will NOT be registered for the live event until you manually register for it after purchasing this course.

HDP Developer: Quick Start - Live Training Schedule

Event: HDP Developer: Quick Start Training (VILT) | Paris
Date: May 22, 2018, 9 a.m. - May 25, 2018, 5 p.m. CEST
Spaces left: 30

Event: HDP Developer: Quick Start (Virtual) - 4
Date: June 4, 2018, 10 a.m. - June 7, 2018, 6 p.m. EDT
Spaces left: 29

Curriculum

  • Course Logistics
  • IMPORTANT - Please complete your registration by selecting a Live Training Event Date
  • HDP Developer: Quick Start - Live Training Schedule
  • Lesson 1: Case for Hadoop
  • Lesson 2: The Hadoop Ecosystem
  • Lab 1 - Starting an HDP 2.3 Cluster
  • Lesson 3: HDFS Architecture
  • Lab 2 - Using HDFS Commands
  • Lesson 4: Ingesting Data Into HDFS
  • Lesson 5: Parallel Processing Fundamentals
  • Lesson 6: YARN Architecture
  • Lesson 7: Apache Pig
  • Demonstration 1 - Understanding Pig
  • Lab 3 - Getting Started with Pig
  • Lab 4 - Exploring Data with Pig
  • Lesson 8: Advanced Pig Processing
  • Lab 5 - Splitting a Dataset
  • Lab 6 - Joining Datasets (Optional)
  • Lab 7 - Preparing Data for Hive
  • Lesson 9: Apache Hive
  • Lab 8 - Understanding Hive Tables
  • Demonstration 2 - Understanding Partitions and Skew
  • Lab 9 - Analyzing Big Data with Hive
  • Demonstration 3 - Computing ngrams (Optional)
  • Lab 10 - Joining Datasets in Hive
  • Lab 11 - Computing ngrams of Emails in Avro Format (Optional)
  • Lesson 10: Using HCatalog
  • Lab 12 - Using HCatalog with Pig (Optional)
  • Lesson 11: Advanced Hive Programming
  • Lab 13 - Advanced Hive Programming
  • Lesson 12: Overview of Zeppelin and Spark
  • Lab 14 - Introduction to Spark REPLs and Zeppelin
  • Lesson 13: RDD Programming
  • Lab 15 - Create and Manipulate RDDs
  • Lesson 14: Pair RDDs
  • Lab 16 - Create and Manipulate Pair RDDs
  • Lesson 15: Spark SQL
  • Lab 17 - Create and Save DataFrames and Tables
  • Lab 18 - Working with DataFrames
  • Lesson 16: Caching and Persisting
  • Lesson 17: Build and Submit Spark Applications
  • Lab 19 - Build and Submit Applications to YARN
  • Lesson 18: Introduction to Machine Learning with Spark (Optional)
  • Lab - Machine Learning Walkthrough.pdf (Optional)
  • Wrapping Up
  • Course & Instructor Survey
