How do you manage to stay informed without getting overwhelmed? Are there specific strategies or resources you use to filter information and focus on what's most important?
Most of the recent work has been about scaling up transformers as much as possible and throwing them at as much data as we can, so the focus is mostly on making transformers more efficient and distributing them across large clusters.