A Practical Introduction to Web Scraping in Python

Although regular expressions are great for pattern matching in general, sometimes it’s easier to use an HTML parser that’s explicitly designed for parsing HTML pages. Many Python tools exist for this purpose, and the Beautiful Soup library is a good one to start with.
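
To make the point concrete, here is a small sketch that uses only the standard library's html.parser module rather than Beautiful Soup. The two HTML snippets and the HeadingParser class are hypothetical, made up for this illustration. The regular expression is written with the first snippet in mind and misses the second, while the parser handles both.

```python
import re
from html.parser import HTMLParser

# Hypothetical snippets: same heading, different casing and attributes.
SNIPPETS = [
    "<h1>Hello</h1>",
    '<H1 class="name">Hello</H1>',
]

# A pattern that only anticipates the first form.
HEADING_RE = re.compile(r"<h1>(.*?)</h1>")


class HeadingParser(HTMLParser):
    """Collect the text found inside <h1> elements."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "h1":
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading:
            self.text += data


def heading_with_regex(html):
    match = HEADING_RE.search(html)
    return match.group(1) if match else None


def heading_with_parser(html):
    parser = HeadingParser()
    parser.feed(html)
    parser.close()
    return parser.text


for snippet in SNIPPETS:
    print(heading_with_regex(snippet), heading_with_parser(snippet))
```

The pattern could be patched to cope with the second snippet, but each new variation in the markup needs another patch, which is the maintenance burden a parser avoids.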

Install Beautiful Soup

To install Beautiful Soup, you can run the following in your terminal:

$ python -m pip install beautifulsoup4

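With this command, you’re installing the latest version of Beautiful Soup into whichever Python environment is currently active. That’s your global environment unless you’ve activated a virtual environment first.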
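
One way to confirm that the install landed in the environment you expect is to ask Python for the package's version. This is a minimal sketch using the standard library's importlib.metadata module. The installed_version helper is a hypothetical name introduced here, not part of Beautiful Soup.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "beautifulsoup4" is the distribution name used in the pip command.
bs4_version = installed_version("beautifulsoup4")
print(bs4_version or "beautifulsoup4 is not installed in this environment")
```

Run it with the same interpreter you used for the pip command, since each environment keeps its own set of packages.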

Create a BeautifulSoup Object

Type the following program into a new editor window:

# beauty_soup.py

from bs4 import BeautifulSoup
from
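
As a minimal, self-contained sketch of creating a BeautifulSoup object, the example below parses an inline HTML string instead of a downloaded page. The HTML content and variable names are assumptions made up for this illustration and are not taken from the program above.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = """
<html>
<head><title>Sample Page</title></head>
<body>
<h1 class="name">Hello, soup</h1>
<a href="/about">About</a>
<a href="/contact">Contact</a>
</body>
</html>
"""

# "html.parser" selects Python's built-in parser, so nothing else
# needs to be installed alongside Beautiful Soup.
soup = BeautifulSoup(html, "html.parser")

# Navigate by tag name, search by tag and class, and collect attributes.
page_title = soup.title.string
heading = soup.find("h1", class_="name").get_text()
links = [a["href"] for a in soup.find_all("a")]

print(page_title)
print(heading)
print(links)
```

The second argument to BeautifulSoup names the parser to use. Passing it explicitly keeps the result consistent across machines that may have other parsers installed.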

 
