The Ultimate Hands-On Hadoop: Tame your Big Data

Hadoop tutorial with MapReduce, HDFS, Spark, Flink, Hive, HBase, MongoDB, Cassandra, Kafka + more! Over 25 technologies.

Created by: Frank Kane

Produced in 2022

What you will learn

  • Design distributed systems that manage "big data" using Hadoop and related technologies.
  • Use HDFS and MapReduce for storing and analyzing data at scale.
  • Use Pig and Spark to create scripts to process data on a Hadoop cluster in more complex ways.
  • Analyze relational data using Hive and MySQL.
  • Analyze non-relational data using HBase, Cassandra, and MongoDB.
  • Query data interactively with Drill, Phoenix, and Presto.
  • Choose an appropriate data storage technology for your application.
  • Understand how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie.
  • Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume.
  • Consume streaming data using Spark Streaming, Flink, and Storm.

Course Description

The world of Hadoop and "Big Data" can be intimidating - hundreds of different technologies with cryptic names form the Hadoop ecosystem. With this Hadoop tutorial, you'll not only understand what those systems are and how they fit together - but you'll go hands-on and learn how to use them to solve real business problems!

Learn and master the most popular big data technologies in this comprehensive course, taught by a former engineer and senior manager from Amazon and IMDb. We'll go way beyond Hadoop itself, and dive into all sorts of distributed systems you may need to integrate with.

  • Install and work with a real Hadoop installation right on your desktop with Hortonworks (now part of Cloudera) and the Ambari UI
  • Manage big data on a cluster with HDFS and MapReduce
  • Write programs to analyze data on Hadoop with Pig and Spark
  • Store and query your data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto
  • Design real-world systems using the Hadoop ecosystem
  • Learn how your cluster is managed with YARN, Mesos, Zookeeper, Oozie, Zeppelin, and Hue
  • Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm

Understanding Hadoop is a highly valuable skill for anyone working at companies with large amounts of data.

Almost every large company you might want to work at uses Hadoop in some way, including Amazon, eBay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo! And it's not just technology companies that need Hadoop; even the New York Times uses Hadoop for processing images.

This course is comprehensive, covering over 25 different technologies in over 14 hours of video lectures. It's filled with hands-on activities and exercises, so you get some real experience in using Hadoop - it's not just theory.

You'll find a range of activities in this course for people at every level. If you're a project manager who just wants to learn the buzzwords, there are web UIs for many of the activities in the course that require no programming knowledge. If you're comfortable with command lines, we'll show you how to work with them too. And if you're a programmer, I'll challenge you with writing real scripts on a Hadoop system using Scala, Pig Latin, and Python.
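
For a rough flavor of the programming side, here is a minimal sketch of the map/reduce pattern as a word count in plain Python. It is not taken from the course materials: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and everything runs in a single process with no Hadoop cluster involved. On a real cluster the framework would handle the shuffle step and distribute the work across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data is big", "Hadoop tames big data"]
    print(word_count(sample))
```

The mapper and reducer are kept as separate functions because that separation is what lets a framework run many mappers in parallel and route each key to a single reducer.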

You'll walk away from this course with a real, deep understanding of Hadoop and its associated distributed systems, and you can apply Hadoop to real-world problems. Plus a valuable completion certificate is waiting for you at the end!

Please note that the focus of this course is on application development, not Hadoop administration, although you will pick up some administration skills along the way.

Knowing how to wrangle "big data" is an incredibly valuable skill for today's top tech employers. Don't be left behind - enroll now!

"The Ultimate Hands-On Hadoop... was a crucial discovery for me. I supplemented your course with a bunch of literature and conferences until I managed to land an interview. I can proudly say that I landed a job as a Big Data Engineer around a year after I started your course. Thanks so much for all the great content you have generated and the crystal clear explanations." - Aldo Serrano

"I honestly wouldn't be where I am now without this course. Frank makes the complex simple by helping you through the process every step of the way. Highly recommended and worth your time, especially the Spark environment. This course helped me achieve a far greater understanding of the environment and its capabilities." - Tyler Buck

Who this course is for:

  • Software engineers and programmers who want to understand the larger Hadoop ecosystem, and use it to store, analyze, and vend "big data" at scale.
  • Project, program, or product managers who want to understand the lingo and high-level architecture of Hadoop.
  • Data analysts and database administrators who are curious about Hadoop and how it relates to their work.
  • System architects who need to understand the components available in the Hadoop ecosystem, and how they fit together.

Instructor Details

Frank Kane

Frank spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.



This was a very thorough course and explained the concepts well. The only reason I rate it 4 stars instead of 5 is because many of the videos are outdated at this point.
It's a little frustrating that at the beginning of the course, it is suggested that HDP 2.6.5 is used, but all of the lectures use 2.5. This is an issue because so many of the technologies have changed just enough that the general concept being explained is still correct, but I have to dig through the Q&As or do Google searches to make minor changes to commands or lines of code. Now an activity that should have taken 20 min takes 45 min.
Also, in the Oozie courses, if you are using HDP 2.6.5, there is a chance that the workflow will get stuck in a permanent running state. The job cannot be killed and resetting the machine does not work either. I suggest you just watch that lecture and not participate or else you may need to wipe your VM.
This is still a good course to learn Hadoop, but it definitely needs to be updated in many lectures with up to date commands.

By Camille Chi on 8/21/2020

This course covers many big data technologies, which is very helpful for broadening our knowledge base. However, the material for some of the technologies, such as Flink, is not up to date, which makes it very difficult to follow. I hope the course can be updated as much as possible. Thanks.

By Mukesh Kumar K on 8/2/2020

Worth taking this amazing course. I was skeptical about this course since it had mentioned that it requires more than 8 GB RAM, but I was able to reduce the base memory and complete the whole course with Hortonworks. Learnt more than 20 technologies with hands-on experience. Thank you Udemy and Frank.

All the sessions were informative and very engaging. I took a very long time to complete the sessions with the hands-on work, but every second and the money was worth it. Thanks Frank & Friends!

By Jason Obihara on 5/25/2020

I'm training for a data science role, and I believed I needed to have some understanding of big data. This course, I believe, gave me that and much, much more. I learnt a lot about different technologies inside and outside of the Hadoop ecosystem. It was a worthwhile journey.

By Sunil Kumar Ege on 8/10/2020

Concepts are covered far too fast; you need to control your English. Overall it's a good overview of the Hadoop ecosystem and gives a basic understanding of the tech stack used in the current market.

By Davide on 8/15/2020

I wanted to learn more about distributed computing technology, and I chose this course based on the ratings and because I've already taken a few courses from Frank and liked them quite a lot. Although Hadoop is just ONE way to go, this course provides a very comprehensive view of all the underlying technologies, many of which can be used without Hadoop (e.g. Kafka, Cassandra, etc.). Frank keeps an open mind about technology, and thanks to this it's easy to translate what you learn here to other systems, for example to AWS. This course is perfect for getting an update on the progress of distributed computing in the last 10 to 15 years, and for this very reason it is not exactly an 'easy' one. Highly recommended!

By QA6 QA-Analytics on 5/10/2020

The context is good. It is annoying that the presented scripts are not available in the course contents. A different Hadoop implementation should have been proposed that could run on machines with less than 16 GB of RAM.

By Laxmi L on 8/18/2020

Great course. I absolutely love your teaching method and loved every piece of the information. I will buy any relevant course anytime if you are the teacher! And one compliment here: you really have a very soothing voice, which makes the audio enjoyable and encouraging.

By Tobias Anhuser on 9/27/2020

I am aiming for a career in Data Analysis/Science and I took this course mainly to broaden my knowledge in Hadoop, Big Data and all the other buzzwords :) I learned A LOT, that's for sure. Even though I got lost sometimes (particularly during the Streaming topic), I could follow most of the lectures and finish most of the scripts/exercises on my little sandbox. Considering that I neither have a professional IT background nor intend to become a full Data Engineer, I wasn't freaking out if I didn't understand every little detail. For some modules I could grasp the general idea, and that's fine considering my background and intentions. A tiny criticism: some scripts/modules do not work straight away and require some lookups in the Q&A section and a lot of trial and error. However, considering the number of technologies used (which are all somehow interconnected but updated individually), this seems to be an issue that cannot be prevented. Nonetheless, awesome course, I learned a lot, totally recommend it!

By Alan Treanor on 8/11/2020

A great course that covers a wide range of technologies. It has helped me understand where and how they fit together, with some examples of how to use them.
A good starting point before a deeper dive into specific technologies of interest. I have one year of experience in a Big Data Engineer role.

By Harsha Vardhan on 7/1/2020

Very good and informative. However, it tries to cover too many technologies, as the Hadoop ecosystem is quite vast.
If you want that, this course is perfect for you.