Quality Score

Content Quality
Video Quality
Qualified Instructor
Course Pace
Course Depth & Coverage

Overall Score: 98 / 100


Course Description

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained: from the web, from APIs, from databases, and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data "tidy". Tidy data dramatically speeds up downstream data analysis tasks. The course will also cover the components of a complete data set, including raw data, processing instructions, codebooks, and processed data. In short, it covers the basics needed for collecting, cleaning, and sharing data.

Instructor Details

Jeff Leek

Jeff Leek is an Assistant Professor of statistics at the Johns Hopkins Bloomberg School of Public Health and co-editor of the Simply Statistics Blog. He received his Ph.D. in statistics from the University of Washington and is recognized for his contributions to genomic data analysis and statistical methods for personalized medicine. His data analyses have helped us understand the molecular mechanisms behind brain development, stem cell self-renewal, and the immune response to major blunt force trauma. His work has appeared in the top scientific and medical journals Nature, Proceedings of the National Academy of Sciences, Genome Biology, and PLoS Medicine. He created Data Analysis as a component of the year-long statistical methods core sequence for statistics students at Johns Hopkins. The course has won a teaching excellence award, voted on by the students at Johns Hopkins, every year Dr. Leek has taught it.

Reviews

4.9

141 total reviews


By Gustavo X on 4-Feb-18

It's not really acceptable to make students google new things in order to pass the quizzes. Quizzes should assess knowledge gained through the reading and lectures, not our ability to learn via Google.

By Prathamesh N on 4-Apr-19

The course is good, but the only problem is that there is no explanation of how to solve different problems. There should be a live example of problems so that people who have some trouble can get through.

By 20e on 17-Feb-19

Swirl practice for Getting and Cleaning Data in this class is terrible. Most of my code works fine in R and RStudio, but Swirl would tell me "That's not the answer I'm looking for, try again". Then I type "skip()" and Swirl gives me the exact answers that I just typed earlier.

By Artem A on 1-Feb-19

There is a huge disconnect between the material and the HAR dataset exercise. I would suggest adding some smaller exercises to help explain how to complete it. Yes, I know you're supposed to do research to help figure out problems, and I have. As a matter of fact, I have taken other courses on data wrangling to be able to figure out this problem. Merging two datasets makes this problem very confusing. Why can't you guide students through a similar problem, instead of throwing them into the fire?

By Alfredo A B on 14-Mar-19

The material in this course is very condensed. The Data Table lecture was very much a copy of someone else's information on the web and was so terse that I would imagine even people from programming backgrounds had to listen to it many times just to understand what was going on. Expect to put a good 8-10 hours a week into this course if you want to become proficient in the course material.

By Sindre F on 13-May-19

There's too much of a jump from the theory to the practice. I had a difficult time understanding what was being asked of me.

By Philip A on 12-Jan-18

The contents of the course are extremely useful. BUT if your programming experience is the two previous courses, I think it's a very difficult course, since there are some issues that are outdated, not explained in detail, or not explained at all. To do most of the quizzes it's not enough to repeat and listen to the videos. In many cases it's necessary to read a lot of documentation, search for and apply new functions that are not explained in the videos, search forums, and realize that the packages do not work in the same way in the new versions of R, that some functions don't work correctly with RStudio but do with RGUI, and that in other cases a certain argument must be added that was not explained in the videos (e.g., on Windows, "binary" mode in the function download.file, which I still have no idea what it means). In short, there are a lot of things that make certain parts of the evaluations measure not whether you really learned what was taught in the course, but your ability to teach yourself. That is a necessary skill in general (not only in R and Data Science), but it isn't what I expect this course to teach me. All this searching is especially difficult for Spanish-speaking people because it isn't enough to have an intermediate-to-advanced level in the language, rely on Google Translate, and rewind the video many times; to really understand, you have to have some command of the technical language.

By Carlos M on 9-Apr-18

Week 1 can be more detailed as per what you expect in the quiz. The main idea of following a course is that we get all material about that topic together at one place. But here we are given just names of topics and told to research & read about them ourselves.

By jutzhang on 26-Jan-19

Modules 1 and 2 are horrible, so much to cover (several types of files) and so little actual information from the course. Yet, quizzes demand one knows every detail of each file type. Scripts and links are not available from the slides, although I did manage to find a repository with all scripts of the course (after much trouble). Why not make it available from the main page of the course? Anyways, some links were broken and could not be used to follow classes. Classes themselves are very dull, no interaction whatsoever.

By Puja G on 29-Nov-18

Horrible Assignment. So vague. So much puzzling to do. Students cannot waste their time in attempting to understand the loose, vague assignment that was made. The assignment took me 4-5 hrs of pondering and referring to online material just to freaking understand partially what the hell is expected of me. I hate this part of Coursera. It is ugly!

By Rishabh J on 17-Jul-18

Prepare to not actually learn anything; rather, you're going to go on a journey through Google to try and find obscure ways to install packages onto your Windows computer. Whether it be packages to read Excel files, SQL files, APIs and more, you'll rarely have the time or patience to put any of this into practice because you'll struggle just to get packages installed. For the record, I gave the previous two courses in the specialty a good rating, but this is clearly a low-effort showing. It's a shame because I really think this might be the most applicable and useful content in the course.

By Deliang L on 14-May-18

Loved the structure of the course. Learned a lot. The course project seemed a little funky, especially creating the codebook for an already existing set of data, but it was a useful teaching aid.