HACKER Q&A
📣 c1sc0

What is the best way to get a data dump of HN?


I’m working on a little text indexing side project & I think the content posted to HN would be a good dataset to work on. What is the best way to get a dump of all the URLs that have been submitted to HN? Asking for ideas before firing up a crawler. Are there existing dumps? APIs?


  👤 yamrzou Accepted Answer ✓

👤 krapp
You can find a link to HN's API in the footer of the page. Unfortunately, it's a bit awkward to work with, but it isn't rate limited.
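
To give an idea of why it feels awkward: as far as I can tell, you fetch one item per request by numeric ID and filter for stories yourself. A rough sketch, assuming the Firebase-hosted endpoints described in the official API docs (`maxitem` and `item/<id>`); the helper names `fetch_json` and `submitted_urls` are made up for illustration:

```python
import json
from urllib.request import urlopen

# Assumed base URL of the public HN API (Firebase-hosted, per the API docs).
BASE = "https://hacker-news.firebaseio.com/v0"


def fetch_json(path):
    """GET one JSON document, e.g. path='maxitem' or path='item/8863'."""
    with urlopen(f"{BASE}/{path}.json", timeout=10) as resp:
        return json.load(resp)


def submitted_urls(start_id, count, fetch=fetch_json):
    """Yield (id, url) for stories among up to `count` items, walking down from start_id."""
    for item_id in range(start_id, max(start_id - count, 0), -1):
        item = fetch(f"item/{item_id}")
        # Missing items can come back as null (None); comments and
        # text-only posts carry no "url" field, so skip those too.
        if item and item.get("type") == "story" and item.get("url"):
            yield item_id, item["url"]
```

Something like `list(submitted_urls(fetch_json("maxitem"), 50))` would scan the 50 newest items. A full dump means one request per ID all the way down to 1, so you'd probably want to run requests concurrently and checkpoint the last ID you processed.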

👤 anigbrowl
Have you considered looking at the bottom of the page as well as the top?