Web Scraping and Crawling with Python: Beautiful Soup, Requests & Selenium
Created by: GoTrained Academy
Produced in 2018
What you will learn
- Python Refresher: Review of Data Structures, Conditionals, File Handling
- How Websites are Hosted on Servers; Basic Calls to Server (GET, POST Methods)
- Web Scraping with Python Beautiful Soup and Requests
- Diverse Web Scraping Exercises
- Downloadable source code (*.py files) for all exercises
- Q&A board to send your questions and get them answered quickly
Overall Score : 84 / 100
In this course, you will learn how to perform web scraping using Python 3 and Beautiful Soup, a free, open-source Python library for parsing HTML.
We will use lxml, an extensive library that parses XML and HTML documents very quickly; it can even handle malformed tags. We will also use the Requests module instead of the built-in urllib module (urllib2 in Python 2) for its improvements in speed and readability.
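As a rough illustration of the parsing step described above, the sketch below extracts link text and URLs from a small HTML string with Beautiful Soup. The markup, the `course` class name, and the `extract_courses` helper are hypothetical, made up for this example. It uses Python's built-in `html.parser` backend rather than lxml so that it runs without an extra install; passing `"lxml"` as the second argument to `BeautifulSoup` would select lxml if it is available.

```python
from bs4 import BeautifulSoup

# Hypothetical markup. In a real scraper this string would typically come
# from requests.get(url).text for a page you are permitted to scrape.
HTML = """
<html><body>
  <h1>Course catalogue</h1>
  <ul id="courses">
    <li class="course"><a href="/python">Python Refresher</a></li>
    <li class="course"><a href="/scraping">Web Scraping</a></li>
    <li class="course"><a href="/selenium">Selenium Basics</a></li>
  </ul>
</body></html>
"""


def extract_courses(html):
    """Return (title, link) pairs for every <li class="course"> entry."""
    # "html.parser" is the standard-library backend (an assumption made
    # here to avoid requiring lxml).
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.find_all("li", class_="course"):
        link = item.find("a")
        if link is not None:
            results.append((link.get_text(strip=True), link["href"]))
    return results


courses = extract_courses(HTML)
for title, href in courses:
    print(title, href)
```

Keeping the parsing in a function that takes a plain string makes it easy to try against saved HTML before pointing it at a live site.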
The course covers the following topics: accessing web pages programmatically; scraping web pages to extract the required data, using Beautiful Soup to parse them; interacting with web pages programmatically; and using Selenium for web scraping, including when it is needed.
By the end of this course, you will understand how websites and servers function, and you will know a range of techniques for extracting, handling, and organizing data.
This Web Scraping course covers the following topics:
- Review of data structures (lists, dictionaries, tuples) and file handling
- How websites are hosted on servers
- Calls to the server (GET, POST methods)
- Review of HTML and CSS
- Requests Module and BeautifulSoup Module overview
- Parsing HTML using BeautifulSoup
- Filtering elements using BeautifulSoup and navigating the Parse Tree
- Selenium and the need for it
- Selecting elements using Selenium
- CSS selectors
- XPath selectors
- Navigating pages using Selenium
- Practical Projects
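To make the GET and POST topic in the list above a little more concrete, here is a minimal sketch that builds both kinds of request with Requests and inspects them without sending anything over the network. The URL and the `build_get` and `build_post` helpers are hypothetical; only `requests.Request` and its `prepare()` method belong to the library.

```python
import requests

# Hypothetical endpoint, used only to show how the requests are shaped.
BASE_URL = "https://example.com/search"


def build_get(query):
    """Prepare a GET request: parameters travel in the URL query string."""
    return requests.Request("GET", BASE_URL, params={"q": query}).prepare()


def build_post(query):
    """Prepare a POST request: form data travels in the request body."""
    return requests.Request("POST", BASE_URL, data={"q": query}).prepare()


get_req = build_get("python")
post_req = build_post("python")

# Nothing has been sent; these are just the prepared request objects.
print(get_req.method, get_req.url)
print(post_req.method, post_req.url, post_req.body)
```

In day-to-day scraping, `requests.get(url, params=...)` and `requests.post(url, data=...)` prepare and send in one step; preparing the request separately is mainly useful for seeing what would go over the wire.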
Who this course is for:
- Those who want to learn how to use Python for web scraping and data extraction.
GoTrained is an e-learning academy that aims to create useful content in different languages, concentrating on technology and management.
We take a selective approach to the content we provide: we mainly focus on skills that are frequently requested by clients and employers but are covered by only a few videos. We also try to build video series that cover not only the basics but also more advanced areas.