HACKER Q&A
📣 MrAR
How do we reduce latency for AI applications?
I am connecting 3-4 AI models serially, so that the output of one model is fed as the input to the next. I am seeing a lot of latency even when running on GPUs. How can I reduce it?
👤 compressedgas
Accepted Answer ✓
Use smaller models.
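In a serial chain the end-to-end latency is roughly the sum of the per-stage latencies, so it may help to measure each stage first and shrink the slowest model before the others. Below is a minimal sketch of that measurement, not a definitive implementation. The names `profile_pipeline` and `make_fake_model`, the stage labels, and the sleep durations are all hypothetical stand-ins for your real models.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def profile_pipeline(stages: List[Stage], x: Any) -> Tuple[Any, Dict[str, float]]:
    """Run stages serially, feeding each output to the next, and time each one."""
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        x = fn(x)
        timings[name] = time.perf_counter() - start
    return x, timings


def make_fake_model(delay_s: float) -> Callable[[Any], Any]:
    """Hypothetical stand-in for a model: sleeps to simulate inference time."""
    def run(x: Any) -> Any:
        time.sleep(delay_s)
        return x + 1
    return run


# Assumed three-stage chain; replace the fake models with real inference calls.
stages: List[Stage] = [
    ("model_a", make_fake_model(0.01)),
    ("model_b", make_fake_model(0.10)),
    ("model_c", make_fake_model(0.01)),
]

output, timings = profile_pipeline(stages, 0)
slowest = max(timings, key=timings.get)
total = sum(timings.values())

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.1f} ms ({seconds / total:.0%} of total)")
print(f"slowest stage: {slowest}")
```

If the real stages run on a GPU, wall-clock timings like these can be misleading because kernels are often launched asynchronously. With PyTorch, for example, calling `torch.cuda.synchronize()` before reading the timer is one way to get meaningful numbers.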