Statistics for Genomic Data Science

With genomics sparking a revolution in medical discoveries, it has become imperative to better understand the genome and to leverage the data and information in genomic datasets. Genomic Data Science is the field that applies statistics and data science to the genome. This Specialization covers the concepts and tools needed to understand, analyze, and interpret data from next-generation sequencing experiments. It teaches the most common tools used in genomic data science, including how to use the command line, along with a variety of software implementation tools like Python, R, Biocon

Created by: Jeff Leek

Quality Score

Content Quality
Video Quality
Qualified Instructor
Course Pace
Course Depth & Coverage

Overall Score: 76 / 100

Course Description

An introduction to the statistics behind the most popular genomic data science projects. This is the sixth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Instructor Details

Jeff Leek

Jeff Leek is an Assistant Professor of statistics at the Johns Hopkins Bloomberg School of Public Health and co-editor of the Simply Statistics Blog. He received his Ph.D. in statistics from the University of Washington and is recognized for his contributions to genomic data analysis and statistical methods for personalized medicine. His data analyses have helped us understand the molecular mechanisms behind brain development, stem cell self-renewal, and the immune response to major blunt force trauma. His work has appeared in the top scientific and medical journals Nature, Proceedings of the National Academy of Sciences, Genome Biology, and PLoS Medicine. He created Data Analysis as a component of the year-long statistical methods core sequence for statistics students at Johns Hopkins. The course has won a teaching excellence award, voted on by the students at Johns Hopkins, every year Dr. Leek has taught the course.

Reviews

3.8

68 total reviews

By Ian P on 30-Aug-18

I did my best to work through module 1, but encountered one problem after another with installing the various required R packages, due to version issues. From the absence of recent discussion posts it seems that this is not really a current, viable course. From what I have seen of the course, I get the impression that even if package installation went smoothly, the course is more about R than statistics or genomics - which is not what I joined for.

By Paul S on 3-Jan-18

The worst executed course I have taken in 36 years of post-graduate education.
1. The instructor speaks so fast it is difficult even for a native English speaker like myself to understand.
2. This course is only suitable as a review for people who are experts in the field already. Even if you know how to use Bioconductor and are familiar with programming in R, if you don't already know the tools being used, the instruction in the course will not give enough information to be able to do the quizzes without a great deal of difficulty.
3. The examples presented are thrown out in a cursory fashion without enough detail about how the data is being set up or manipulated. Matrices are transformed and recombined with little explanation about why things are being done.
4. Although generalizing from material presented to new applications is a valid instructional approach, the instruction does not give the student enough information to do this, and the instructor expects students to be able to figure out new algorithms from vague public domain documentation.
5. Although the instructor makes an impassioned plea for carefully thought out statistical test design, proper documentation of workflow, and appropriate use of p-values, he does not describe the interpretation of the statistical tools presented. For example, a tool for calculating thousands of principal components in seconds is given, but beyond showing that clusters of dots on a graph may indicate a genetic cluster, he does not explain what the individual points in the PCA mean.
In summary, the tools presented are very powerful but are not well described. Extensive revision to the course is needed.

By Tushar K on 25-Mar-19

Very good course, and useful for understanding the statistical aspects of data.

By Gregorio A A P on 26-Aug-17

Excellent, but I would be grateful if you could translate all your courses of absolute quality into Spanish.

By on 1-Jul-18

Really a good course for people who want to learn to use R to process genomic data.

By Zhen M on 28-Jun-18

The professor is really enthusiastic, and I was really impressed by him. His teaching is concise, and I can learn the key points through the lectures. Great course!

By Apostolos Z on 21-Oct-17

Excellent course! Thank you!

By Roman S on 4-Jan-18

Really great and in-depth class! Thank you.

By Alex Z on 7-Aug-17

The instructor talks fast and is informative! I enjoyed it a lot.

By Juan J S G on 7-Mar-17

Week 3 can get tough, but the course is very thorough and I recommend it.

By Maximo R on 22-Mar-16

Great course!!!!

By Chunyu Z on 10-Feb-16

Very helpful class. The instructor is very organized.