- You have a GPU attached to each instance.
- Each request takes anywhere from 10ms to 2min.
- There's a hard limit on the number of in-flight requests/queries (I assume because of the GPUs).
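
To make the constraints above concrete, here's a minimal sketch of what "hard limit on in-flight requests" means for routing. It is not the solution I'm asking about, just an illustration. The names `Instance`, `CappedBalancer`, and `max_in_flight` are hypothetical, and it assumes a simple least-in-flight policy.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    max_in_flight: int  # hypothetical hard cap per GPU instance
    in_flight: int = 0  # requests currently being served


class CappedBalancer:
    """Least-in-flight routing that never exceeds an instance's cap."""

    def __init__(self, instances):
        self.instances = list(instances)

    def acquire(self):
        # Only instances below their cap are eligible.
        free = [i for i in self.instances if i.in_flight < i.max_in_flight]
        if not free:
            return None  # everything is saturated; caller must queue or reject
        chosen = min(free, key=lambda i: i.in_flight)
        chosen.in_flight += 1
        return chosen

    def release(self, instance):
        # Call when a request finishes, whether it took 10ms or 2min.
        if instance.in_flight <= 0:
            raise ValueError("release without matching acquire")
        instance.in_flight -= 1
```

Because request durations vary so widely, the balancer has to track completions explicitly (`release`) rather than assume slots free up at a steady rate.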
Normally, I see people fronting the instances with software load balancers, but this doesn't work very well for reasons. Assuming I have a solution in the form of a fancy load balancer, how would I go about monetizing it? Let's assume the solution is non-trivial to create, but very straightforward to use (essentially a drop-in replacement).
I ask because I don't think I can just "sell a fancy load balancer" like it's the late 90s or something. Modern companies always seem to have more complicated products, and I just want to sell a straightforward piece of infrastructure that solves a fairly hard problem.
Thanks in advance.
What were the "reasons" for "doesn't work very well"? For example, trying to do Google-search-type work on a 2mb Intel 486 over a 2mb network and expecting to be able to compete with Google is never going to work out.
What type of load balancing? Load balancing typically has to be tuned to the end-usage requirements and the production environment, not just left at factory settings.
Which reasons? In my experience, people are perfectly happy with Proxmox on big GPU-laden boxen.