1. Besides web scraping and large language models, do I need any other technologies?
2. Besides writing a dedicated scraper for each website, is there a simpler technology available?
3. I plan to have every piece of information analyzed by a large language model to decide whether it meets my criteria (see the sketch after this list), but that seems likely to consume a lot of tokens. Is there a more cost-effective way?
4. Is my technical approach completely wrong? Is there an easier way?
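For concreteness, here is a minimal sketch of the pipeline question 3 describes: each scraped item is sent to an LLM, which gives a yes/no judgment against your criteria. This assumes the OpenAI Python SDK; the model name, the `CRITERIA` string, and the `scraped_items` list are all hypothetical placeholders, not part of the original question. The point it illustrates is the cost concern: every single item incurs one model call whose token count is roughly the item's length plus prompt overhead.

```python
# Minimal sketch of the approach in question 3: every scraped item is
# classified by an LLM. Assumes the OpenAI Python SDK; the model name,
# criteria, and item source below are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CRITERIA = "news about open-source LLM tooling"  # hypothetical criteria


def is_relevant(text: str) -> bool:
    """Ask the model for a yes/no relevance judgment on one item."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works
        messages=[{
            "role": "user",
            "content": (
                f"Criteria: {CRITERIA}\n\n"
                f"Item:\n{text[:4000]}\n\n"  # truncate to cap token cost
                "Does this item meet the criteria? Answer YES or NO."
            ),
        }],
        max_tokens=1,  # we only need the YES/NO token back
    )
    return resp.choices[0].message.content.strip().upper().startswith("Y")


# Hypothetical usage: scraped_items would come from your scrapers.
scraped_items = ["Example article text ...", "Another article ..."]
relevant = [item for item in scraped_items if is_relevant(item)]
print(f"{len(relevant)} of {len(scraped_items)} items matched")
```

Note that the output side is cheap (one token), but the input side is not: the full text of every item passes through the model, which is exactly why the question asks whether something cheaper can run first.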
1. Information is exploding, while the time you have available each day to consume it is finite. That time is split across information with short-term value, long-term value, and no value (i.e., entertainment).
2. Since information is exploding, the gap between supply and demand grows every day. An automated process that runs daily can easily overwhelm your actual needs in all three categories, because it will generate more content than you have time to consume; for example, a pipeline that surfaces even a hundred items a day is already several hours of reading.
3. Complicating everything further, people's needs constantly change over time, in all three categories.