- Oct26 3:51PM
- Nov 6, 2023 at 9:42:44 AM
- 2023-05-29T06:40:31.249-06:00
- June 3, 2011 at 4:52 AM
- Fr Nov 10, 2023, at 9:42:44 AM
- Friday November 10 at 2:20 AM
... You get the gist
[1] - https://stackoverflow.com/questions/63371125/python-how-to-c...
Divide and conquer:
- look at the first 50 or so unhandled cases, and pick the most common pattern in them.
- find a way to handle that pattern with a simple parser (e.g. a JVM DateTimeFormatter, a regex, or whatever works decently in your preferred language)
- repeat
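A rough sketch of that loop in Python, in case it helps. Everything here is an assumption for illustration: the names `FORMATS`, `try_parse`, `shape` and `most_common_shapes` are made up, the three format strings only cover three of the examples quoted above, and `%b`/`%B` assume an English locale. The "shape" trick (collapse digit runs to `9` and letter runs to `A`) is just one cheap way to spot the most common unhandled pattern.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical starting list; add one format per iteration of the loop.
FORMATS = [
    "%b %d, %Y at %I:%M:%S %p",    # Nov 6, 2023 at 9:42:44 AM
    "%B %d, %Y at %I:%M %p",       # June 3, 2011 at 4:52 AM
    "%Y-%m-%dT%H:%M:%S.%f%z",      # 2023-05-29T06:40:31.249-06:00
]

def try_parse(text):
    """Return a datetime from the first matching format, or None if unhandled."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None

def shape(text):
    """Collapse digit runs to '9' and letter runs to 'A' so similar strings group."""
    collapsed = re.sub(r"\d+", "9", text)
    return re.sub(r"[A-Za-z]+", "A", collapsed)

def most_common_shapes(unhandled, sample_size=50, top=3):
    """Tally shapes in the first `sample_size` unhandled strings."""
    return Counter(shape(t) for t in unhandled[:sample_size]).most_common(top)
```

Run `try_parse` over everything, collect the strings that come back as `None`, call `most_common_shapes` on them, write a format (or regex) for the top shape, and go again.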
That will probably cut the 10 million down to a million, then to 100,000, fairly rapidly.
Once you’re down to a manageable number, get your LLM to handle those.
(Also: this task is likely easy to run in parallel, so if you have money, you won’t need 10M+ seconds.)
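For the parallel part, something like the following could work, assuming the LLM is reached over the network so each call mostly waits on I/O. `ask_llm` is a hypothetical placeholder, not a real client API, and `parse_leftovers` and `max_workers=8` are arbitrary choices for the sketch.

```python
from concurrent.futures import ThreadPoolExecutor

def ask_llm(text):
    """Hypothetical stand-in for a real LLM request; swap in your own client call."""
    return f"parsed:{text}"

def parse_leftovers(leftovers, max_workers=8, call=ask_llm):
    """Send the leftover strings to `call` concurrently, keeping input order."""
    # Threads are enough here because each call spends its time waiting on the network.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call, leftovers))
```

With N workers the wall-clock time drops to roughly 1/N of the sequential run, up to whatever rate limit your provider imposes.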