HACKER Q&A
📣 sashank_1509

A good project for getting into Language Modelling?


Hi HN, I am looking for a career change and have a week-long vacation planned soon. My experience is in software engineering. What do you think would be a good project to do over the week to get into Language Modelling? Some ideas I had were:

1. Build a repo of Transformers and all its variants and try to benchmark the efficiency of each variant.

2. Train a small transformer on a small dataset that I can extract from the internet.

3. Try to reproduce a small-scale version of RLHF (maybe not realistic even at a small scale).

I'm biased towards projects that people would find useful beyond me just learning more about the field. Assume I have access to a few hundred dollars of AWS GPU compute. Thanks a lot for your help!


  👤 ArtWomb Accepted Answer ✓
The NES APU sound synthesis chip is an ideal constraint for cross-domain learning with Transformers ;)

LakhNES: Improving multi-instrumental music generation with cross-domain pre-training

https://chrisdonahue.com/LakhNES/

The NES Music Database (5k+ scores)

https://github.com/chrisdonahue/nesmdb

The Lakh MIDI Dataset v0.1 (175k+ files)

https://colinraffel.com/projects/lmd/


👤 0x008
Andrej Karpathy made a YouTube video about implementing a very small GPT (nanoGPT) from scratch. Very interesting.
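
For a sense of scale, a first step toward that kind of from-scratch project could be a character-level bigram model, about the simplest next-token predictor there is. To be clear, the sketch below is not nanoGPT and is not taken from the video; the toy corpus and the function names (`train_bigram`, `next_char_probs`, `generate`) are made up for illustration, and it uses plain counting instead of a neural network.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def next_char_probs(counts, prev):
    """Turn the raw follower counts for `prev` into a probability distribution."""
    followers = counts[prev]
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()}


def generate(counts, start, length, seed=0):
    """Sample up to `length` more characters, one at a time, from the model."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        probs = next_char_probs(counts, out[-1])
        if not probs:  # this character was never seen with a follower
            break
        chars, weights = zip(*probs.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)


# Toy corpus, made up for this example; a real project would load a dataset.
corpus = "hello world, hello language models"
model = train_bigram(corpus)
print(generate(model, "h", 20))
```

A small transformer replaces the count table with learned parameters and conditions on more than one previous character, but the loop of "predict a distribution over the next token, sample, append" stays the same.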