Measure and Improve
Lyro gets better when you treat it as a loop, not a one-time setup. This guide walks you through measuring how your agent performs, acting on the gaps Lyro surfaces, and re-testing your changes before you publish them.
The improvement loop
Improving your agent comes down to four repeating steps:
- Read your Analytics dashboards to see how the agent is doing.
- Act on Findings by adding knowledge articles or intents.
- Re-test your changes with Evaluations.
- Re-publish the agent and watch the next cycle of data.
Each step feeds the next, so the more often you run the loop, the sharper your agent becomes.
Step 1 - Read your analytics
Open Analytics to see how your agent is performing. The dashboard ships with a standard set of charts and lets you build your own.
The default charts cover:
| Metric | What it tells you |
|---|---|
| Resolution rate | How often conversations get resolved without a human |
| Answer rate | How often the agent actually answers instead of deflecting |
| Intents | Which topics customers ask about most |
| CSAT | How satisfied customers are with the replies |
| Conversations | Overall volume over time |
| Response time | How fast the agent replies |
| Knowledge grounding | How well answers are backed by your sources |
| Tool health | Whether connected tools are firing reliably |
Use the controls at the top of the page to focus your view:
- Date range picker - narrow charts to the period you care about.
- New chart - build a custom chart for a metric or breakdown you want to track.
- Settings menu - export a CSV, include or exclude test traffic, or reset to the default chart set.
Tip: Test traffic from the playground is excluded by default so your numbers reflect real customers. Toggle "Include test traffic" only when you are debugging your own tests.
Low answer rate, dipping CSAT, or weak knowledge grounding on a popular intent are your cue to dig deeper.
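To make the first two metrics in the table concrete, here is a minimal Python sketch that recomputes resolution rate and answer rate from an exported CSV. The column names (`resolved_without_human`, `answered`) and the `true`/`false` encoding are assumptions for illustration, not Lyro's documented export schema, and Lyro's own formulas may differ. Check the header row of your export before adapting it.

```python
import csv
import io

# Hypothetical export: these column names and values are assumptions,
# not Lyro's documented CSV schema.
SAMPLE_CSV = """conversation_id,resolved_without_human,answered
c1,true,true
c2,false,true
c3,false,false
c4,true,true
"""

def rate(rows, column):
    """Share of rows whose `column` equals 'true' (0.0 when there are no rows)."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row[column].strip().lower() == "true")
    return hits / len(rows)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
resolution_rate = rate(rows, "resolved_without_human")
answer_rate = rate(rows, "answered")

print(f"Resolution rate: {resolution_rate:.0%}")  # 2 of 4 conversations
print(f"Answer rate: {answer_rate:.0%}")          # 3 of 4 conversations
```

To work from a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded CSV.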
Step 2 - Act on findings
Findings is where Lyro tells you exactly what to fix. The Operator sweeps your workspace and surfaces knowledge gaps, recurring questions your agent could not answer, and configuration issues.
Each finding is grouped so you can triage fast:
- Urgent - high-confidence issues worth handling first.
- Worth a look - lower-confidence items to review when you have time.
You can run a sweep on demand with the Run sweep button, or schedule recurring sweeps from Settings. Filter findings by status, knowledge base, or intent to focus on one area.
Turn a finding into a fix
Open any finding to see the detail view, then act on it:
- Draft an article - Lyro drafts a knowledge article that answers the gap. Review it and publish.
- Add or refine an intent - close the gap with a dedicated intent for a recurring topic so your agents route and answer it correctly.
- Ask the Operator - open the Operator panel for more context on why a finding was flagged.
Once you have acted, update the finding's status - mark it resolved, snooze it, or dismiss it - so your task list stays clean.
Note: Resolved and dismissed findings stay in the History tab, so you always have a record of what changed and when.
Step 3 - Re-test with evaluations
Before you trust a change, prove it works. Evaluations are test suites that measure how your agent handles real questions.
To build a suite, you can:
- Generate from KB - auto-create a starter set of cases from one of your knowledge bases.
- Save as eval - capture a real turn from the playground as a case.
- New set - create a suite by hand and add your own cases.
Each case can carry an expected answer, which Lyro uses as the ground truth to judge against.
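One way to picture a suite is as a list of cases, each pairing a question with an optional expected answer. The Python sketch below is a mental model only: `EvalCase`, `EvalSuite`, their fields, and the sample questions are hypothetical, not Lyro's data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EvalCase:
    """One test question (hypothetical shape, for illustration only)."""
    question: str
    expected_answer: Optional[str] = None  # ground truth, when provided
    source: str = "manual"  # e.g. "manual", "generated_from_kb", "saved_from_playground"

@dataclass
class EvalSuite:
    """A named collection of cases."""
    name: str
    cases: List[EvalCase] = field(default_factory=list)

    def cases_missing_expected(self) -> List[EvalCase]:
        """Cases that have no expected answer to judge against yet."""
        return [case for case in self.cases if case.expected_answer is None]

suite = EvalSuite(name="Shipping questions")
suite.cases.append(EvalCase("How long does delivery take?", "3 to 5 business days."))
suite.cases.append(EvalCase("Do you ship internationally?", source="saved_from_playground"))

for case in suite.cases_missing_expected():
    print(f"No expected answer yet: {case.question}")
```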
Run and read results
Pick the agent to test, choose a version (live, draft, or an older published version), and click Run. Lyro scores every case and shows a pass or fail along with:
- Pass rate across the whole suite.
- Per-axis scores - whether it routed to the right agent, matched the expected answer, and stayed grounded in your sources.
- Judge rationale - why each case passed or failed.
Compare the actual answer against the expected one side by side. If a new answer is better, select "Use as expected" to promote it so future runs are judged against it.
Tip: Run the same suite against your draft before publishing. If the pass rate holds or improves and no previously passing case now fails, you can be more confident your fix did not break anything that already worked.
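The draft-versus-live comparison in the tip can be sketched in a few lines of Python. The case names and pass/fail results below are made up, and `pass_rate` and `regressions` are hypothetical helpers, not something Lyro provides. The sketch also lists cases that passed on the live version but fail on the draft, because an unchanged pass rate can hide one new failure offset by one new pass.

```python
def pass_rate(results):
    """Fraction of cases that passed; `results` maps case name -> bool."""
    return sum(results.values()) / len(results) if results else 0.0

def regressions(live, draft):
    """Names of cases that passed on live but do not pass on draft."""
    return sorted(
        name for name, passed in live.items()
        if passed and not draft.get(name, False)
    )

# Made-up results for the same suite run against two versions.
live = {"refund policy": True, "delivery time": True,
        "order tracking": False, "gift cards": True}
draft = {"refund policy": True, "delivery time": False,
         "order tracking": True, "gift cards": True}

print(f"Live pass rate:  {pass_rate(live):.0%}")
print(f"Draft pass rate: {pass_rate(draft):.0%}")
print("Regressions:", regressions(live, draft))
```

Here both versions pass three of four cases, yet the draft now fails a case that the live version handled, which is worth a look before publishing.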
Step 4 - Re-publish and repeat
When your evals look good, publish the updated agent. Then come back to Analytics in a few days to confirm the metric you were chasing actually moved.
Run this loop on a regular cadence - read, act, re-test, re-publish - and each cycle closes more of the gaps in your agent's answers. For a refresher on the building blocks, see Train Lyro on Your Content.
