HACKER Q&A
📣 sebabytes

How do you monitor LLM response quality?


For those of you running systems with LLM interactions in production: how do you continuously monitor the quality of LLM outputs?

Do you use another LLM to evaluate if the response was hallucinated, and grade it across a set of metrics?


  👤 constantinum Accepted Answer ✓
Quoting Unstract’s documentation on LLMEVAL to reduce hallucination: “it is always a good idea to choose an LLM from a completely different provenance or origin as the evaluator LLM as compared to the LLM that is used to structure documents.” https://docs.unstract.com/unstract_platform/setup_accounts/s...
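
For anyone who wants to try the idea, here is a rough Python sketch of picking an evaluator from a different provider and grading a response on a few metrics. This is not Unstract's implementation: the `Model` wrapper, the provider labels, the metric names, and the 1 to 5 scale are all assumptions made up for illustration, and `complete` stands in for whatever client call you actually use.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass(frozen=True)
class Model:
    """Hypothetical wrapper: a provider label plus a prompt -> text callable."""
    provider: str
    complete: Callable[[str], str]


# Assumed metric names; swap in whatever you care about.
METRICS = ("faithfulness", "relevance")


def pick_evaluator(generator: Model, candidates: Sequence[Model]) -> Model:
    """Return the first candidate whose provider differs from the generator's."""
    for model in candidates:
        if model.provider != generator.provider:
            return model
    raise ValueError("no evaluator from a different provider available")


def parse_score(reply: str) -> Optional[int]:
    """Pull the first standalone digit 1-5 out of the evaluator's reply."""
    match = re.search(r"\b([1-5])\b", reply)
    return int(match.group(1)) if match else None


def grade(evaluator: Model, source: str, answer: str) -> Dict[str, Optional[int]]:
    """Ask the evaluator for one score per metric; None means unparseable."""
    scores = {}
    for metric in METRICS:
        prompt = (
            f"Rate the answer for {metric} on a scale of 1 to 5.\n"
            f"Source:\n{source}\n\nAnswer:\n{answer}\n\n"
            "Reply with the number only."
        )
        scores[metric] = parse_score(evaluator.complete(prompt))
    return scores
```

In production you would log these scores per request and alert on drift; treating an unparseable reply as `None` rather than a low score keeps evaluator failures separate from real quality drops.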

👤 spenrose
What We’ve Learned From A Year of Building with LLMs: https://applied-llms.org