Web Scraping with Python and BeautifulSoup

March 16, 2022

Agenda

In 2014, Facebook acquired WhatsApp for 22 billion US dollars, or 55 dollars per user. What was the reason behind this acquisition? To get more user data! Diverse, representative, good-quality data is the lifeblood of an analytics pipeline.

The internet is estimated to host around 2 billion websites. Web scraping turns the entire World Wide Web into your data set. In this webinar, we will show how to scrape a website using the BeautifulSoup package in Python. We will discuss how to navigate the HTML DOM to find the data that interests you, cover some best practices and the legality of web scraping, and briefly touch on how to build and automate a web scraper in the cloud using Azure Functions.
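As a taste of what navigating the HTML DOM looks like, here is a minimal sketch using BeautifulSoup with Python's built-in "html.parser". The sample HTML, the "event" class name, and the helper functions extract_heading and extract_events are hypothetical and made up for illustration; they are not taken from the webinar material.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page you have already downloaded.
SAMPLE_HTML = """
<html>
  <body>
    <h1 id="title">Upcoming Webinars</h1>
    <ul class="events">
      <li class="event"><a href="/webinars/web-scraping">Web Scraping with Python</a></li>
      <li class="event"><a href="/webinars/intro-ml">Intro to Machine Learning</a></li>
    </ul>
  </body>
</html>
"""


def extract_heading(html):
    """Return the text of the <h1 id="title"> element, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h1", id="title")
    return heading.get_text(strip=True) if heading else None


def extract_events(html):
    """Return a list of {title, url} dicts, one per <li class="event"> link."""
    soup = BeautifulSoup(html, "html.parser")
    events = []
    for item in soup.find_all("li", class_="event"):
        link = item.find("a")
        if link is None:
            # Skip list items that do not contain a link.
            continue
        events.append({
            "title": link.get_text(strip=True),
            "url": link.get("href"),
        })
    return events


if __name__ == "__main__":
    print(extract_heading(SAMPLE_HTML))
    for event in extract_events(SAMPLE_HTML):
        print(event["title"], event["url"])
```

In a real scraper, the HTML string would come from an HTTP response rather than a constant, and the tag names and classes would have to match the target site's actual markup.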


Arham Noman

Data Scientist at Data Science Dojo

Arham Noman is a Data Scientist at Data Science Dojo. He has worked on a variety of projects ranging from building cloud-based machine learning pipelines to indexing and extracting insights from large unstructured datasets. Arham is also an instructor at Data Science Dojo and takes joy in practicing the “Data Science for everyone” philosophy through his sessions.

We are looking for passionate people willing to cultivate and inspire the next generation of leaders in tech, business, and data science. If you are one of them, get in touch with us!
