Signing in

You will be sent to MillerKnoll sign-in.

Testing, Refining, and Knowing When It's Good Enough

Most people test once, see a reasonable answer, and ship. Those agents fail quietly, slightly wrong answers, poor edge-case handling, used once and abandoned.

Lesson 4

Fifteen queries before you share.

Before sharing, run at least 10–15 queries on the Try It tab, not only questions you are sure it handles.

Testing before you share

Easy, happy pathEdge casesHard queries~15 runs before colleagues see it
Easy, edge, and hard queries, fifteen runs before colleagues see it.

Core principles

  1. Easy queries, must work immediately; if not, fix knowledge or instructions.
  2. Edge queries, adjacent but out of scope; should decline cleanly (e.g. performance review question to an onboarding agent).
  3. Hard queries, synthesis or ambiguity; "I am not sure if this is benefits or payroll, who should I ask?"
  4. When wrong: diagnose, missing knowledge, instructions gap, or ambiguous description, then fix that field and re-test adjacent queries.
  5. Add uncertainty language to every agent: if not confident from sources, say so explicitly; never speculate.
  6. Good enough matches deployment scope, personal top-five use cases vs team edge cases vs org-wide soft launch with 2–3 colleagues who try to break it.

Go deeper: Getting Started: verification

Check yourself

What is the right response when a test query produces a wrong answer?

Do this in Copilot

15 Try It queries including 3+ edge cases; document weak answers; fix at least two via instructions.

Paste each test query into the Try It tab in Agent Builder and work through the full set before moving on.

Edge-case test

For an onboarding-focused agent, this should decline or redirect, not hallucinate. Also test:

Can you help me with my performance review?
Open Copilot →
  • Edge-case testing
  • Easy: What do I need to complete in my first week?
  • Hard: I am not sure if this is about benefits or payroll, who should I ask?
  • After a bad answer: Which field failed, knowledge, instructions, or description? Edit one field, re-run, test three neighbors.

Did you run this in Copilot? Mark complete when you have tried it.

Complete the practice task before moving on, it changes what you do in the next lesson.

Next lesson: Sharing Your Agent and Building for Others →