HACKER Q&A
📣 labrador

Should we be saving our favorite information locally?


Here's a thought, and someone please tell me if I'm wrong, but if you have bookmarks to favorite articles, essays, poems, etc... I recommend you use the browser print function to print them to PDF format and save them locally. Reasons:

1) With the advent of chat AI, people will be googling much less, if at all, which will reduce traffic to websites. It may not make economic sense for the websites to stay up, so your favorite essay will go away. It might be saved in the Internet Archive, or it might not.

2) Hard drives are ridiculously cheap. A 10 terabyte hard drive (10,000 gigabytes) is less than $200

3) An AI like ChatGPT is not guaranteed to be trained on your favorite information, so it may be lost to you or hard to find again

4) Soon we will have AI assistants which can be trained on all the PDFs you have saved to produce a highly customized and personal AI tailored to what you like
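
To put point 2 in numbers, here is a rough back-of-the-envelope sketch in Python. The price and capacity come from the post; the 1 MB average size of a saved PDF is purely an assumption for illustration, and real pages vary a lot.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Price of one gigabyte of storage, in US dollars."""
    return price_usd / capacity_gb


def files_that_fit(capacity_gb: float, avg_file_mb: float) -> int:
    """How many files of a given average size fit on the drive.

    Uses decimal units (1 GB = 1,000 MB), the way drive vendors count.
    """
    return int(capacity_gb * 1000 / avg_file_mb)


# Figures from the post: a 10 TB (10,000 GB) drive for $200.
print(cost_per_gb(200, 10_000))       # 0.02 dollars per GB
# ASSUMPTION: an average saved PDF is about 1 MB.
print(files_that_fit(10_000, 1.0))    # 10000000
```

Under that assumption the drive holds on the order of ten million saved pages, so storage cost is unlikely to be the limiting factor.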


  👤 logicalmonster Accepted Answer ✓
I'm not certain what kind of impact AI will have on the marketplace, but it seems like a good idea to store what you value locally regardless of what happens with AI.

Sites die off. Online web archives aren't reliable or completely trustworthy. And censorship of many forms seems to be occurring more and more.

With storage so cheap, there's little downside to saving what you like.


👤 JohnFen
I have had the habit of saving my favorite stuff locally for a very long time. If it's locally stored, it's always available. If it's online, there's a chance that you'll lose access either temporarily or permanently.

👤 theandrewbailey
Absolutely. The internet has proven itself to be ephemeral. The only part of it that is guaranteed is now. Content can be silently changed. Posts get deleted. Links break and 404. Images get lost. Sites put up paywalls, or go down entirely if the owners go bankrupt.

If you find something worth saving, save it! And don't forget to back up your stuff!


👤 animesh
Yes, I had the same realization some months ago. I started building a CLI-based tool that is smaller in scope, offline-first, and only occasionally online.

👤 Terretta
Yes and no.

Instead of PDF, use MarkDownload (on iOS, use a Safari extension that converts web content to a Markdown file):

https://github.com/deathau/markdownload

And save in a journaled folder like "YYYY-MM-DD - Page Title.md" with a YAML frontmatter of all available metadata.
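
A minimal sketch of that naming scheme in Python, in case it helps. The function names and the choice of which characters to strip are my own assumptions, not something MarkDownload does for you, and the frontmatter writer only handles flat string values.

```python
import re
from datetime import date


def note_filename(title: str, saved_on: date) -> str:
    """Build a "YYYY-MM-DD - Page Title.md" filename."""
    # Drop characters that are not allowed in filenames on common systems.
    safe = re.sub(r'[\\/:*?"<>|]', "", title)
    # Collapse runs of whitespace left behind by the removal.
    safe = re.sub(r"\s+", " ", safe).strip()
    return f"{saved_on.isoformat()} - {safe}.md"


def frontmatter(meta: dict[str, str]) -> str:
    """Render flat string metadata as a YAML frontmatter block."""
    lines = ["---"]
    for key, value in meta.items():
        # Double-quote every value and escape backslashes and quotes,
        # so colons or '#' in titles and URLs stay valid YAML.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{escaped}"')
    lines.append("---")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical page, for illustration only.
    print(note_filename("Why: Save/Everything?", date(2023, 7, 1)))
    print(frontmatter({"title": "Why: Save/Everything?",
                       "source": "https://example.com/essay"}))
```

Leading with the ISO date means a plain alphabetical sort of the folder is also a chronological one.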

Have this as a folder in your PKM of choice (Obsidian, Foam, whatever).

These days, point some text embedding at it, and let it generate your own LLM brain.

But you can also static-site-generate that back into your own web knowledge site or base.

If you don't need it locally, and depending on the capture you want, consider pinboard.in or historio.us:

https://pinboard.in/

https://historio.us/


👤 sdwr
Only if you're a hoarder, or it's career-related and you're meticulous. I find that if I start saving stuff, the drive to collect starts outweighing the value of the collection rapidly.

AI search would be the only reason to. If it saves everything automatically and can query references / make inferences seamlessly, then great. Anything less, and my life eats itself like a snake.


👤 quickthrower2
I agree. Instapaper (phone app) is a good tool for doing this. But PDFs are probably more “open” in that you know the format and can choose where to put the files. The Internet Archive sometimes has copies of pages behind dead links, though.

👤 bsnnkv
I have a slightly different take on this: I save the text that I care about, and have some automation set up to archive the source URL of the text to archive.org[1] (which works well enough for me, even if it's not 100% perfect, because I'm only archiving it for the greater context of the highlighted text, which I rarely go back to).
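
For anyone who wants to try something similar, here is a rough Python sketch that asks the Wayback Machine to capture a URL through its public "Save Page Now" address (`https://web.archive.org/save/` followed by the page URL). This is not the automation described above, just an illustration. The function names and the User-Agent string are assumptions, and the service may rate-limit or refuse requests.

```python
from urllib.request import Request, urlopen

# Public "Save Page Now" prefix of the Wayback Machine.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(page_url: str) -> str:
    """Return the address that requests a capture of page_url."""
    return SAVE_PREFIX + page_url


def request_capture(page_url: str, timeout: float = 60.0) -> int:
    """Ask the Wayback Machine to archive page_url.

    Returns the HTTP status code of the response. Raises
    urllib.error.URLError (or HTTPError) if the request fails.
    """
    # Hypothetical User-Agent; identify your own script here.
    req = Request(save_url(page_url),
                  headers={"User-Agent": "personal-archiver-sketch/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Only builds the address; call request_capture() to hit the network.
    print(save_url("https://example.com/essay"))
```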

I just got myself an Nvidia 4090, and I'm looking into using local LLMs to feed my data into (I think this is called retrieval-augmented generation?) for various assistant-type use cases.
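
As a toy illustration of the retrieval half of that idea, here is a small Python sketch: rank saved notes by similarity to a question, then paste the best matches into a prompt for whatever local model you run. To stay self-contained it uses plain word counts and cosine similarity as a stand-in for a real embedding model, and all names and sample notes are made up.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words vector: lowercase word -> count.

    A stand-in for a real text-embedding model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, notes: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k notes most similar to the query."""
    q = vectorize(query)
    ranked = sorted(notes,
                    key=lambda name: cosine(q, vectorize(notes[name])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, notes: dict[str, str], k: int = 2) -> str:
    """Put the retrieved notes in front of the question."""
    context = "\n\n".join(notes[name] for name in retrieve(query, notes, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    # Made-up highlights, for illustration only.
    notes = {
        "a.md": "dragons guard the northern pass",
        "b.md": "sourdough starter feeding schedule",
        "c.md": "the northern army marches at dawn",
    }
    print(build_prompt("who guards the northern pass", notes, k=1))
```

The prompt string would then go to the local model. Swapping `vectorize` for a proper embedding model is what should make matching work on meaning rather than exact words.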

I'm particularly excited to potentially be able to go through my saved Kindle highlights for multi-novel sci-fi and fantasy series in order to refresh my memory by clarifying key story beats before continuing with the next book.

[1]: https://lgug2z.com/articles/notado-07-2023-update/