Data: https://www.transportation.gov/developer Example: https://transportation.report/
Well under 1 GB if you focus on models and specs, not safety!
It's highly normalized, to the point where it becomes counterproductive and starts to interfere with human usability. Not sure if that helps your goal or hinders it. I used to use it for teaching both SQL and the pros, cons, and usefulness of the normal forms.
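To make the usability point concrete, here is a minimal sketch of what heavy normalization tends to look like in practice. The `author`, `book`, and `book_author` tables are hypothetical teaching examples, not this dataset's actual schema. The link table on its own is just ids, and it takes two joins to get back a row a person can read.

```python
import sqlite3

# Hypothetical schema for illustration only (not the dataset's real tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE book_author (
        book_id INTEGER NOT NULL REFERENCES book(id),
        author_id INTEGER NOT NULL REFERENCES author(id),
        PRIMARY KEY (book_id, author_id)
    );
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (10, 'Joins for Humans');
    INSERT INTO book_author VALUES (10, 1), (10, 2);
""")

# Reading the link table alone yields only opaque ids.
raw_rows = conn.execute(
    "SELECT book_id, author_id FROM book_author ORDER BY author_id"
).fetchall()

# Two joins are needed to recover a human-readable row.
readable_rows = conn.execute("""
    SELECT book.title, author.name
    FROM book_author
    JOIN book ON book.id = book_author.book_id
    JOIN author ON author.id = book_author.author_id
    ORDER BY author.name
""").fetchall()

print(raw_rows)
print(readable_rows)
```

The upside of this layout is that each fact is stored once; the cost is that nothing is readable without joins, which is the trade-off worth showing when teaching the normal forms.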
General overview: https://wahapedia.ru/wh40k9ed/the-rules/data-export/
Spreadsheet with dataset links: https://wahapedia.ru/wh40k9ed/Export%20Data%20Specs.xlsx
It’s here as a CSV if you’d like it: https://gist.github.com/connordoner/9cda1857b8fff5b8e042013d.... There’s no license attached so do as you wish.
https://github.com/RANDCorporation/milliondigits
It's probably not what you're looking for, but it's my own favorite dataset.
# Data from spinning a tire while crushing it into a "road" at an angle