HACKER Q&A
📣 ggregoire

37% of people are using Python for web scraping. What do they scrape?


Just saw this stat in this survey: https://www.jetbrains.com/lp/devecosystem-2019/python

I'm not familiar with web scraping. What are its most common applications?

I can think of a few, like public user data collection, price comparison, news aggregation… Are there other obvious ones I am missing?


  👤 princess445 Accepted Answer ✓
You can use something like https://github.com/proxycrawl/proxycrawl-python to crawl literally anything. For my last project I got more than 10 million LinkedIn profiles to build a social network for entrepreneurs.

So data can give you the power to build anything.


👤 mjhea0
Most people are probably using scrapers and crawlers to collect data from sites that either don't have an open API or have one that is difficult to use.

Examples:

Ancestry.com scraper - https://github.com/mjhea0/ancestry-scraper

Indeed job scraper - https://github.com/mjhea0/indeed-scraper

Craigslist housing scraper - https://github.com/mjhea0/craigslist-housing-scraper

Other scrapers - https://github.com/ThaWeatherman/scrapers
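
If you have never seen one, scrapers like the ones linked above mostly boil down to the same thing: get a page's HTML, parse it, and pull out the fields you care about. Here is a minimal sketch of the parse-and-extract part using only the Python standard library. The HTML snippet, the `job` class name, and the `JobLinkParser` / `scrape_jobs` names are all made up for illustration and are not taken from any of the repos above.

```python
from html.parser import HTMLParser

# Hypothetical listing page, inlined so the sketch runs without network access.
SAMPLE_HTML = """
<ul>
  <li class="job"><a href="/jobs/1">Python Developer</a></li>
  <li class="job"><a href="/jobs/2">Data Engineer</a></li>
  <li class="ad"><a href="/promo">Sponsored</a></li>
</ul>
"""


class JobLinkParser(HTMLParser):
    """Collects (title, href) pairs for links inside <li class="job"> items."""

    def __init__(self):
        super().__init__()
        self.jobs = []
        self._in_job = False   # True while inside an <li class="job">
        self._href = None      # href of the <a> currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            self._in_job = attrs.get("class") == "job"
        elif tag == "a" and self._in_job:
            self._href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None
        elif tag == "li":
            self._in_job = False

    def handle_data(self, data):
        # Only record text that sits inside a link within a job item.
        if self._href and data.strip():
            self.jobs.append((data.strip(), self._href))


def scrape_jobs(html):
    """Return a list of (title, href) tuples found in the given HTML."""
    parser = JobLinkParser()
    parser.feed(html)
    return parser.jobs


jobs = scrape_jobs(SAMPLE_HTML)
```

In a real scraper the HTML would come from an HTTP request rather than a string, and many people reach for third-party libraries such as requests and Beautiful Soup instead of `html.parser`.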

What can you then do with the data?

Monitor competition or your own brand

Sentiment Analysis

Gather sales leads

Machine learning

Generate content (blog posts, building a custom job board)

Find cheap flights
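
Once the data is scraped, most of these uses are ordinary data processing. As a rough sketch of the "find cheap flights" idea, assuming a scraper has already produced a list of records (the routes, prices, and the `cheapest_per_route` helper below are invented for illustration):

```python
# Hypothetical records, shaped like what a flight scraper might emit.
flights = [
    {"route": "SFO-JFK", "price": 420.0},
    {"route": "SFO-JFK", "price": 289.0},
    {"route": "LAX-ORD", "price": 199.0},
    {"route": "LAX-ORD", "price": 240.0},
]


def cheapest_per_route(records):
    """Return a dict mapping each route to the lowest price seen for it."""
    best = {}
    for rec in records:
        route, price = rec["route"], rec["price"]
        if route not in best or price < best[route]:
            best[route] = price
    return best


cheapest = cheapest_per_route(flights)
```

The same shape (scrape on a schedule, keep the records, compare against what you saw before) also covers price comparison and competitor monitoring.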