HACKER Q&A
📣 lamroger

How are you managing LLM APIs in production?


Looks like LangChain has LangSmith but it's in closed beta.

I saw a couple YC launches like Hegel AI.

I'm personally interested in deployments in small teams or teams with a lot of freedom to pick and choose their own tooling.


  👤 ianpurton Accepted Answer ✓
I'm currently writing up a deployment architecture for LLMs, and the API question is answered here: https://fine-tuna.com/docs/choosing-a-model/model/

Basically, you can get a Docker container that publishes an OpenAI-compatible API endpoint. You can then choose the model that sits behind that API.
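
For example (a minimal sketch, assuming a vLLM container serving the endpoint on port 8000; the model name is illustrative):

    # Query a self-hosted, OpenAI-compatible endpoint with the official
    # openai Python client (v1+). Most self-hosted servers ignore the key.
    from openai import OpenAI

    # Point the client at the local container instead of api.openai.com.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    response = client.chat.completions.create(
        model="mistralai/Mistral-7B-Instruct-v0.1",  # whichever model the container serves
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

Swapping models is then just a matter of pointing the container at different weights; the client code doesn't change.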

As deployment will be in Kubernetes, we'll use clusters with GPU resources to max out performance, but we're not there yet.


👤 retrovrv
LangSmith is broadly for tracing chains - are you looking for prompt deployment solutions?

👤 XGBoost
Play around with LangChain and then convert all of that into decent code. After a few prototypes, you'll realize LangChain and other pipelining frameworks are mostly for non-coders. You can architect elegant solutions yourself.
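
To illustrate that point (a sketch, assuming the OpenAI Python client; the prompt and model are placeholders), a typical prompt-template "chain" is just a few lines of plain code:

    # A prompt-template -> LLM "chain" with no framework.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def summarize(text: str) -> str:
        # The "prompt template" is just an f-string.
        prompt = f"Summarize the following in one sentence:\n\n{text}"
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    print(summarize("LangChain is a framework for building LLM applications."))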