HACKER Q&A
📣 tusslewake

Best approaches to archiving interactive web journalism/writing


I was scrolling through this New York Times article on the history of collage and art (https://www.nytimes.com/interactive/2021/01/29/arts/design/juan-gris-cubism-collage.html). It's a cool article with a lot of dynamic visual content that moves, shifts, and refocuses as you scroll; I believe the NYTimes calls this an "interactive story."

Is anyone trying to archive articles like this for the medium/long term? I can imagine a static video of the article, but to really capture the original experience, you'd need something like a Docker for the typical 2022-era browser that was meant to run the story.

Does archive.org have the best solution to this problem right now? Who else is working on it and what are they doing?


👤 roneoo Accepted Answer ✓
This Firefox extension does a good job most of the time:

https://addons.mozilla.org/fr/firefox/addon/single-file/

It is primarily meant for personal/offline backup. Do you need to be able to share the archive?


👤 enhdless
I just learned about this organization, Saving Ukrainian Cultural Heritage Online (SUCHO): https://www.sucho.org/

They seem to be using various tools, like Browsertrix: https://github.com/webrecorder/browsertrix-crawler

It sounds promising for interactive sites:

> Support for custom browser behaviors, using Browsertrix Behaviors including autoscroll, video autoplay and site-specific behaviors

Browsertrix links to https://replayweb.page/ for a way to view an archived site.
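
For anyone who wants to try this on a scroll-driven story, here is a rough sketch of how a single-page capture might be launched. It only assembles and prints the Docker command rather than running it. The helper name `browsertrix_command`, the `crawls` output directory, and the `juan-gris` collection name are made up for illustration, and the flags are the ones I recall from the browsertrix-crawler README, so check them against `crawl --help` for your version before relying on them.

```python
import shlex
from pathlib import Path


def browsertrix_command(url, collection, out_dir="crawls"):
    """Build the argv for a one-page browsertrix-crawler run via Docker.

    Hypothetical helper: flag names are assumed from the project's README
    and may differ between crawler versions.
    """
    out = Path(out_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{out}:/crawls/",                 # mount a host directory for the output
        "webrecorder/browsertrix-crawler", "crawl",
        "--url", url,
        "--scopeType", "page",                   # capture only this page, do not follow links
        "--behaviors", "autoscroll,autoplay",    # scroll through the story, start any media
        "--generateWACZ",                        # package the capture as a WACZ file
        "--collection", collection,
    ]


if __name__ == "__main__":
    cmd = browsertrix_command(
        "https://www.nytimes.com/interactive/2021/01/29/arts/design/juan-gris-cubism-collage.html",
        "juan-gris",
    )
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

If the crawl succeeds, the resulting WACZ file should end up somewhere under the mounted `crawls` directory, and it can then be opened in https://replayweb.page/ to check whether the scroll-triggered content was actually captured.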