HACKER Q&A
📣 wslh

Has Hacker News stopped uploading its dataset in 2022?


  👤 jerbear4328 Accepted Answer ✓
That doesn't appear to be an official HN thing; some third party is responsible for it. I see dang has said he contacted them but can't do anything about it. The API still works.

https://news.ycombinator.com/item?id=38796861


👤 xnx
You can now get Hacker News data in real time from the Hacker News API powered by Firebase: https://github.com/HackerNews/API

This is great (real time!), but also kind of a pain (38+ million individual HTTP requests to get the whole thing).

Thankfully there's no authentication or apparent rate limiting. I fumbled my way through downloading the whole thing with curl. I screwed up a few times, so I made over 70 million requests in total.

Toy analysis of the data I downloaded here: https://public.tableau.com/app/profile/isna/viz/HackerNewsDa...
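
For anyone who wants to try the same thing without curl, here is a rough Python sketch of the one-request-per-item walk described above. It assumes the endpoints documented in the HackerNews/API repo (`/v0/maxitem.json` and `/v0/item/<id>.json`); the function names and the injectable `fetch` parameter are made up for illustration, and it is not the script actually used for the download.

```python
import json
from urllib.request import urlopen

# Base URL as documented in the HackerNews/API README.
BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url):
    """Fetch one URL and decode its JSON body (one HTTP request per call)."""
    with urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def max_item_id(fetch=fetch_json):
    """Return the largest item ID currently known to the API."""
    return fetch(f"{BASE_URL}/maxitem.json")


def iter_items(start_id, end_id, fetch=fetch_json):
    """Yield (id, item) for every ID in [start_id, end_id].

    IDs whose response decodes to None (JSON null) are skipped.
    The `fetch` argument is a hypothetical hook so the loop can be
    exercised without touching the network.
    """
    for item_id in range(start_id, end_id + 1):
        item = fetch(f"{BASE_URL}/item/{item_id}.json")
        if item is not None:
            yield item_id, item
```

Calling `max_item_id()` and then looping over `iter_items(1, max_id)` would walk the whole ID range sequentially, which is slow at this scale; a real download would want concurrency, retries, and some way to resume after a failure.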


👤 wslh
A day after posting, and after reviewing these answers, I found that a simple query [0] turns up a lot of results, for example [1] and [2].

[0] Google: dump all hackernews site:github.com

[1] https://github.com/ashish01/hn-data-dumps

[2] https://github.com/iOliverNguyen/hackernews-dump


👤 cldellow
The graveyard that is Google's issue tracker has an abandoned ticket about this: https://issuetracker.google.com/issues/261579123

👤 pvg
Email this stuff to the mods or search the site

https://news.ycombinator.com/item?id=38781031