HACKER Q&A
📣 lamroger

How are you managing LLM APIs in production?


Looks like LangChain has LangSmith but it's in closed beta.

I saw a couple YC launches like Hegel AI.

I'm personally interested in deployments in small teams or teams with a lot of freedom to pick and choose their own tooling.


  👤 ianpurton Accepted Answer ✓
I'm currently writing up a deployment architecture for LLMs, and the API question is answered here: https://fine-tuna.com/docs/choosing-a-model/model/

Basically, you can get a Docker container that publishes an OpenAI-compatible API endpoint. You can then choose the model that sits behind that API.
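
For example (a minimal sketch, assuming a vLLM container serving the endpoint on port 8000; the model name is illustrative):

    # Query a self-hosted, OpenAI-compatible endpoint with the official
    # openai Python client (v1+). Most self-hosted servers ignore the key.
    from openai import OpenAI

    # Point the client at the local container instead of api.openai.com.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

    response = client.chat.completions.create(
        model="mistralai/Mistral-7B-Instruct-v0.1",  # whichever model the container serves
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

Swapping models is then just a matter of pointing the container at different weights; the client code doesn't change.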

As deployment will be in Kubernetes, we'll use clusters with GPU resources to max out performance, but we're not there yet.


👤 retrovrv
LangSmith is broadly for tracing chains - are you looking for prompt deployment solutions?

👤 XGBoost
Play around with LangChain and then convert all of that into decent code. After a few prototypes, you'll realize LangChain and other pipelining frameworks are mostly for non-coders. You can architect elegant solutions yourself.
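
To illustrate that point (a sketch, assuming the OpenAI Python client; the prompt and model are placeholders), a typical prompt-template "chain" is just a few lines of plain code:

    # A prompt-template -> LLM "chain" with no framework.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def summarize(text: str) -> str:
        # The "prompt template" is just an f-string.
        prompt = f"Summarize the following in one sentence:\n\n{text}"
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    print(summarize("LangChain is a framework for building LLM applications."))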