- Requests to download even a small amount of data get rate limited (the response says slow down/too many requests).
- It seems like this is a known issue and that Common Crawl is no longer well maintained: https://groups.google.com/g/common-crawl/c/BvMGYUY-dro
Are there any alternatives for accessing a large amount of web crawl data?
Thanks!
https://commoncrawl.org/blog/oct-nov-2023-performance-issues
And the new status website at: