HACKER Q&A
📣 jorisboris

Are there ways to harvest FB group and subreddit content?


I'm trying to learn more about expat life in Crete, Mallorca, Roatan, ...

Facebook groups like "Expats in Mallorca" are a treasure trove, but they are hard for a human reader to skim. The same goes for subreddits.

Is there a way to scrape the content and feed it into an AI platform that can then summarise it?

Besides the technical challenge, scraping data from groups is probably also against the T&Cs, so are there any "legal" solutions?


  👤 ailef Accepted Answer ✓
Reddit used to be really easy to scrape. I haven't needed to do it since the API changes drama, but the trick of appending `.json` to the URL apparently still works. For example:

https://www.reddit.com/r/TheWire/comments/1aqvtoz/every_year...

I'm not sure whether this works only on posts or also on entire subreddits.
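
A minimal Python sketch of the `.json` trick, for anyone who wants to try it. The helper names (`to_json_url`, `fetch_json`, `extract_comments`) and the User-Agent string are made up for illustration. The payload layout assumed here (a two-element list of listings, with comments as children of kind `t1` carrying a `body` field) is an assumption about what Reddit returns for a post, so check it against a real response before relying on it.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def to_json_url(url: str) -> str:
    """Append .json to the path of a Reddit URL, keeping any query string."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def fetch_json(url: str, user_agent: str = "expat-research-script/0.1"):
    """Download and decode the .json version of a Reddit page.

    The User-Agent value is a placeholder; pick one that identifies you.
    """
    request = urllib.request.Request(to_json_url(url), headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_comments(payload) -> list[str]:
    """Collect comment bodies from a decoded post payload.

    Assumes payload == [post_listing, comment_listing] and that each comment
    is a child of kind "t1" whose data has "body" and (optionally) "replies".
    """
    bodies = []

    def walk(listing):
        if not isinstance(listing, dict):
            return  # "replies" may be an empty string when there are none
        for child in listing.get("data", {}).get("children", []):
            if child.get("kind") != "t1":
                continue  # skip "load more" stubs and anything unexpected
            data = child.get("data", {})
            bodies.append(data.get("body", ""))
            walk(data.get("replies"))

    if isinstance(payload, list) and len(payload) > 1:
        walk(payload[1])
    return bodies
```

Calling `extract_comments(fetch_json(post_url))` should then give a flat list of comment texts that can be pasted or piped into a summariser.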


👤 infinitedata
You are already logged in, so you are simply using your rights as a member to view the content. You could use Scrapy from Python and add generous delays between requests.
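
As a rough idea of what "good time delays" could look like, here is a sketch of a Scrapy `settings.py` fragment. The setting names are standard Scrapy settings, but the values are guesses for a deliberately slow crawl, not recommendations from the thread.

```python
# settings.py fragment for a slow, polite crawl (values are illustrative guesses)

# Wait roughly this many seconds between requests to the same site.
DOWNLOAD_DELAY = 5

# Vary the wait between 0.5x and 1.5x of DOWNLOAD_DELAY so requests are not evenly spaced.
RANDOMIZE_DOWNLOAD_DELAY = True

# Never run requests to one domain in parallel.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy slow down further when the server responds slowly.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 5
AUTOTHROTTLE_MAX_DELAY = 60

# Respect the site's robots.txt rules.
ROBOTSTXT_OBEY = True
```

Delays only address the load you put on the site; they do not change what the T&Cs allow.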

👤 toomuchtodo
Write a browser extension that scrapes as you browse the group. The export should be in a structured format you can train against.
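
One possible "structured format" is JSON Lines, with one record per post. The sketch below is in Python rather than extension code, and the `ScrapedPost` fields and the `to_jsonl` helper are hypothetical; adjust them to whatever the extension actually captures.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ScrapedPost:
    """One captured post or comment (field names are illustrative)."""
    source: str      # e.g. the group or subreddit name
    author: str
    text: str
    url: str = ""


def to_jsonl(posts) -> str:
    """Serialise posts as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(post), ensure_ascii=False) for post in posts)
```

Writing the result of `to_jsonl(posts)` to a `.jsonl` file keeps each post independent, which makes it easy to chunk the data before summarising or training.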

👤 al_borland
Would asking ChatGPT accomplish this, without the extra steps? It has likely already harvested all this data and more.