What's the current state of in-browser GPU accelerated LLM inference?
Curious whether anything is approaching usability at GPT-3+ quality, even if it means a large binary download.
WebGPU is getting close to general browser support, which will make things a bit faster, but compute isn't as much of the issue as RAM.
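For anyone poking at this: a minimal TypeScript sketch (assuming the `@webgpu/types` package for the `navigator.gpu` typings) that feature-detects WebGPU and reads the adapter's buffer limits, which is roughly where the memory constraint for holding model weights shows up. Just an illustration, not any particular project's code.

```ts
// Feature-detect WebGPU and inspect adapter limits.
// Buffer/binding size limits bound how much model data one buffer can hold,
// which matters more for LLM weights than raw compute throughput.
async function checkWebGpu(): Promise<void> {
  if (!("gpu" in navigator)) {
    console.log("WebGPU not available in this browser");
    return;
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    console.log("No suitable GPU adapter found");
    return;
  }
  console.log("maxBufferSize:", adapter.limits.maxBufferSize);
  console.log("maxStorageBufferBindingSize:", adapter.limits.maxStorageBufferBindingSize);
}

checkWebGpu();
```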