Introduction to Big Data

Drive better business decisions with an overview of how big data is organized, analyzed, and interpreted. Apply your insights to real-world problems and questions.

Do you need to understand big data and how it will impact your business? This Specialization is for you. You will gain an understanding of what insights big data can provide through hands-on experience with the tools and systems used by big data scientists and engineers. Previous programming experience is not required! You will be guided through the basics of using Hadoop with MapReduce, Spark, Pig and Hive. By following alo

Created by: Ilkay Altintas

Quality Score

Criteria: Content Quality, Video Quality, Qualified Instructor, Course Pace, Course Depth & Coverage

Overall Score : 100 / 100

Course Description

Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career. It provides an introduction to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible -- increasing the potential for data to transform our world!

At the end of this course, you will be able to:
* Describe the Big Data landscape, including examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
* Explain the V's of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.
* Get value out of Big Data by using a 5-step process to structure your analysis.
* Identify what are and what are not big data problems and be able to recast big data problems as data science questions.
* Provide an explanation of the architectural components and programming models used for scalable big data analysis.
* Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
* Install and run a program using Hadoop!

This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.

Hardware Requirements: (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free.

How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking "About This Mac." Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high speed internet connection because you will be downloading files up to 4 GB in size.

Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; VirtualBox 5+.

Instructor Details

Ilkay Altintas

Ilkay Altintas is the Chief Data Science Officer at the San Diego Supercomputer Center (SDSC), UC San Diego, where she is also the Founder and Director for the Workflows for Data Science Center of Excellence. Since joining SDSC in 2001, she has worked in the areas of computational data science and e-Sciences at the intersection of scientific workflows, provenance, distributed computing, bioinformatics, observatory systems, conceptual data querying, and software modeling. She is a co-initiator of and an active contributor to the popular open-source Kepler Scientific Workflow System. Ilkay Altintas received her Ph.D. degree from the University of Amsterdam in the Netherlands.

Reviews

5.0

135 total reviews

By Catherine B on 8-Nov-18

Very interesting course. Explained concepts I'd heard of but didn't really know about. It is a foundation course for a specialization, it's not enough by itself but very good as a foundation course.

By Isara A on 5-Oct-18

I think the course could be shortened by half. The instructors extended the course into 3 weeks, but I think the real meat can fit into just 2 weeks. They use too many examples and several similar case studies to make the same points about why big data is important and beneficial -- repeatedly. I know why it is important, that's why I took the course! Just give a few mins intro and move on already. They even have a session discussing why you should learn basic tools and theory and have a quiz about that; a quiz about why you should learn how to add and subtract before using a calculator. They can just explain once and people would understand. Anyhow, I feel like I learned quite a lot, but I could have learned as much by reading just the first chapter of a big data book or watching 1-2 YouTube tutorials. Should not waste so many hours watching the course.

By Abdul K on 27-Sep-16

The course provided almost no value, and there was almost nothing covered that you couldn't find online for free. The presentation was probably the worst part of the course: it was extremely boring and made it hard to get through the videos. I didn't expect many practical parts in the course, but there were even fewer than I expected: what was described as setting up a Hadoop cluster consisted of downloading a preconfigured VM image and running it, and then typing in a couple of commands that were provided to you. Again, nothing of value was added, and quickly going over the FAQ page of any of the products mentioned would be more helpful, faster and a lot more interesting.

By Matthew L M on 11-May-19

Good intro that also helped me understand some of my previous assumptions

By Jeffery T on 31-Aug-16

This is a great introduction for Big Data. It helps me to revisit what I learned from the meetups and webinars, then put the fundamental knowledge and information in a solid foundation. Thank you.

By Rongon C on 14-May-19

Perfect for starters!

By Hendrik B on 1-Dec-17

I have participated in better courses on Coursera. I think the PowerPoint presentations are very very ugly. The colors really don't match well. This could be a lot more visually appealing. Also, for self learners, a bit more text would be cool, even though I know that not having too much text, but rather pictures, is beneficial. The presenter could speak a bit more naturally. At the moment, it is very obvious that she reads text aloud. Also, there need to be some more FAQs in the practical parts, because there can be quite a few problems when trying to run MapReduce. I was only able to do it using Google, because I also had to install another Hadoop version (5.12.) because my Intel chipset was not compatible with the version you wanted me to download. There, the file path to MapReduce was different from the example you provided. To be honest, I did not check the Discussion Forums because I was not aware there were any. Maybe I could have gotten help there.

By Raivis J on 5-Feb-19

The general concepts of big data (week 1 and week 2) seemed a bit drawn out, much of the same was repeated over and over again. This was not necessary. The peer-graded assignment needs more detailed instructions - like mentioning that it is ok to resize the fields, and that shapes need to be copy/pasted as much as needed.

By Patricia P on 7-Nov-18

The intro part is too long. You will not start with the practice till the end of next week, and I find that the lectures and videos by then are a lot less complete even though the content is harder. I would have preferred it if the theoretical part were shorter and there were more support when we start learning about Hadoop.

By Pranav V on 7-Mar-19

This was absolutely useless. Lots of definitions with no motivation. The one saving grace is that the slides are comprehensive.

By DHANRAJ N on 19-Nov-18

Very informative and knowledgeable

By karra k k on 17-Nov-18

It is a fantastic platform to upgrade our skills, and the way of teaching is very nice to hear and understandable, really impressive. Thanks for this beautiful course