HACKER Q&A
📣 dom96

Legality of archiving/re-hosting Reddit content


For content hosted publicly on reddit.com, what is the legality of downloading/scraping that content and re-hosting it on a separate website?

I am aware that the Archive Team is currently archiving Reddit[1]. As far as I understand, what they do is legal, but I would like some reassurance.

Are there any good articles on this topic? Contact details for lawyers specialising in this area of the law are also welcome.

1 - https://news.ycombinator.com/item?id=36254172


  👤 neovialogistics Accepted Answer ✓
It depends entirely on what nation you're operating in (and additionally on which state or territory, if the answer is the United States).

👤 tssva
When someone posts content to Reddit under the Reddit terms of use, they grant Reddit a license to use and distribute the content, but ownership of the content remains with the poster. If you scrape Reddit and post the content, in theory you open yourself up to copyright violation claims from the original poster. The odds that someone is going to sue you for redistributing, without permission, content they posted to Reddit are likely extremely small, but they are not zero.

👤 BunnyOSteele
From https://www.redditinc.com/policies/user-agreement

> Except and solely to the extent such a restriction is impermissible under applicable law, you may not, without our written agreement:

> - license, sell, transfer, assign, distribute, host, or otherwise commercially exploit the Services or Content;

Even without being a lawyer, it seems pretty clear that this is not legal, unless you have Reddit's written permission, which I guess the Archive Team has.


👤 CM30
Given the users own the content, it'd presumably be up to them whether it can be rehosted or not. Reddit gets a license to display the content, but they don't really have any control over what third parties can do with it.

Personally, I don't care if anyone reuses stuff I've posted on Reddit or other forms of social media (forums like Hacker News included), but there's always the possibility that someone else might object. And if you remove their posts when asked, I doubt most of them will take it any further than that.


👤 linuxftw
There are separate terms for the API, which to me seem to indicate that it's legal to use their API to download user content: https://www.reddit.com/wiki/api-terms

According to these terms, the content is owned by the users, and you're not to modify the content. However, if the content is owned by the users, then IMO Reddit cannot really say what you do or don't do with the content, as long as you're not building an application that acts as a proxy to Reddit.

The license to access their API is revocable, but they're not licensing you the user content, only the ability to download it. What you can do with that content is likely up to the laws in your jurisdiction. I'd say most content would qualify as public domain, though obviously some content will have copyright protection.

I would do the downloading now before they start charging for the API if you're serious about the project.


👤 raldi
For what it’s worth, Reddit’s TOS reads:

> You may not: Access, search, or collect data from the Services by any means (automated or otherwise) except as permitted in these Terms or in a separate agreement with Reddit (we conditionally grant permission to crawl the Services in accordance with the parameters set forth in our robots.txt file, but scraping the Services without Reddit’s prior written consent is prohibited)
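
To make the robots.txt part of that clause concrete, here is a rough Python sketch of how a crawler might check a URL against robots.txt rules before fetching it, using the standard library's `urllib.robotparser`. The sample rules, the `is_crawl_allowed` helper, and the `example-bot` user agent are all made up for illustration; this is not Reddit's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# this is NOT Reddit's real file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/public/page", "/private/page"):
        url = "https://example.com" + path
        print(path, is_crawl_allowed(SAMPLE_ROBOTS_TXT, "example-bot", url))
```

Going by the clause quoted above, a check like this would only cover the crawling permission. The same sentence still prohibits scraping without Reddit's prior written consent, and it says nothing about re-hosting.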


👤 Cloudef
What's with the bunch of reddit crap on the front page recently?