The Data Scientist's Toolbox

This Coursera course was created by Johns Hopkins University. It introduces tools and resources that are essential to working in data science, and it splits its lessons into a theoretical half and a practical half.

Created by: Jeff Leek

Produced in 2019

What you will learn

  • The fundamentals of data science.
  • R and RStudio.
  • Version control and GitHub.
  • Scientific Thinking, R Markdown and Big Data.

Quality Score

Content Quality
Video Quality
Qualified Instructor
Course Pace
Course Depth & Coverage

Overall Score : 100 / 100


Course Description

Data Science Awards: Best Coursera Course

In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.



    • Course gives a light introduction to a broad range of topics that all apply to data science. Great preparation for a full-dive, multicourse adventure into data science.
    • Covers mindset of data science in a way most courses skip.
    • Course ensures that you have the tools to take on a journey to truly master data science.
    • Course covers a substantial range of data science tools. Accessing all of them can potentially add a hefty price tag to completing the course.
    • Course is not a Capstone project. It is intended to prepare for a data science Capstone project.
    • Course teaches very little data science itself. It is more like going over the syllabus and prerequisites before diving into real learning.

Instructor Details

Jeff Leek

Jeff Leek is an Assistant Professor of statistics at the Johns Hopkins Bloomberg School of Public Health and co-editor of the Simply Statistics Blog. He received his Ph.D. in statistics from the University of Washington and is recognized for his contributions to genomic data analysis and statistical methods for personalized medicine. His data analyses have helped us understand the molecular mechanisms behind brain development, stem cell self-renewal, and the immune response to major blunt force trauma. His work has appeared in the top scientific and medical journals Nature, Proceedings of the National Academy of Sciences, Genome Biology, and PLoS Medicine. He created Data Analysis as a component of the year-long statistical methods core sequence for statistics students at Johns Hopkins. The course has won a teaching excellence award, voted on by the students at Johns Hopkins, every year Dr. Leek has taught the course.

148 total reviews


By Jitin V on 13-Aug-18

Good to set you up for advanced courses.

By kaan b on 28-May-19

Did not deliver what was offered. I really don't know why, but the instructor was in a hurry, as if he were in the position of instructor by obligation. Maybe he has knowledge of the subject, but he definitely does not have even the basic skills of teaching. Because of this course, I am not planning to follow the other courses in this specialization.

By Anthony V on 16-Aug-18

Great course, really helps get you into the right mindset for becoming a data scientist.

By Tolga T on 24-Nov-18

!!!STOP DON'T TAKE THIS COURSE!!! 100% pure advertising. There was a moment I felt like I learned something, but the rest of the course I played at x2.0; if there was more, I would have gotten it. Putting this into the Specialization requirements is smart from your perspective: you are basically saying, if you want to reach the Capstone, pay me $50 more. But at least fix the typos you made during the videos, just a little respect to your subscribers. Right now, I highly doubt that the Capstone Project will be something serious that I want to mention on my LinkedIn. There is also a downside to what you do. But since you are among the top-rated courses, either nobody uses Coursera anymore or people are silent enough and patient enough.

You are all scientists like me. I'm also a biostatistician, but I would never ever post a course like this to any platform. I'd rather use Google or Facebook ads to lead people here. If somebody is wise enough to take a Data Science course, he should be skillful enough to download R, click next and install it, and R has help for it that shows you step by step. GitHub is a free platform; anyone who can sign up for Coursera can sign up for GitHub, too. I know there are no requirements for this course or specialization, it is 0 to Scientist, but seriously, you are talking about R code, arrays, loops, regression, and model fit, and yet also about signing up for GitHub.

Your target group on Coursera is either Data Scientists or people becoming one, so they know what Data Scientist job posts require:

It requires coding blindfolded in R/Python/Java/one of the C family, at least 2 of them, hopefully all of them.
It requires SQL, MySQL, NoSQL, any kind of SQL or database solution mankind ever used.
It requires Math, Statistics, Analytics, Algebra, Finance, Economics + all kinds of computational sciences.
It requires management, social relations, advertising, psychology, anthropology + the rest of the social sciences.
+++++ it requires LOGIC and NON-ARTIFICIAL HUMAN INTELLIGENCE

So we are trying to be that guy. No need to show installing R or GitHub; I'm sure you will do it again during the rest of the Specialization.

By Frederik C on 13-Aug-18

Great intro

By Annette I on 24-Apr-18

This course was a good intro especially in setting all the necessary software for future courses. I suggest to read the manuals, books and other readings the profs suggest. The resources are helpful.

By William C on 26-Sep-17

I really don't know much about this stuff. I think the jury's still out on whether the last four weeks will be helpful in the future. We'll see how much I think I've learned at the end of the course.

By David S on 20-Dec-18

This course was in many ways the first day of lectures: get your syllabus, buy your books, install your tools, etc. I would give it 5 stars, but the lectures' inclusion of internet addresses that aren't links and aren't included in the transcript led to a lot of time spent paused and typing out long addresses.

By SANJEEVE K G on 24-Jan-19

Coursera has given new life to me

By Andrea R C on 11-Apr-19

A great intro to the course. I am not the biggest fan of the automated voice, but it gets the job done. I do like the secondary lessons written out with bulleted lists and close-ups of the slides. That is like a helpful review.

By sonal g on 3-Feb-19

Providing feedback means giving students an explanation of what they are doing correctly AND incorrectly. However, the focus of the feedback should be based essentially on what the student is doing right. It is most productive to a student's learning when they are provided with an explanation and example as to what is accurate and inaccurate about their work. Use the concept of a feedback sandwich to guide your feedback: Compliment, Correct, Compliment.

By Khaleel u r on 22-May-19

Excellent. I am very glad to get this certificate. It is so valuable for me. The first one of the data science track.