Simulate Users

Use UserSimulator to drive multi-turn tests with LLM-generated user inputs.

1. Configure a generator

To get started, provide the LLM that will power the simulator. UserSimulator uses a generator to produce each user turn, so the same model you use for your checks can also drive realistic user behavior. Set a default generator once, or pass one inline.

def support_agent(message: str) -> str:
    """Stub support agent for demonstration."""
    return "I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?"

from giskard.checks import set_default_generator
from giskard.agents.generators import Generator

set_default_generator(Generator(model="azure_ai/gpt-4.1-nano"))

2. Create a `UserSimulator` with a persona

With the generator configured, define who the simulated user is. The persona field acts as a system prompt for the simulator: it describes the user’s role, goal, and stopping condition. The more specific you are, the more predictable and useful the generated conversation will be.

from giskard.checks.generators.user import UserSimulator

customer = UserSimulator(
    persona="""
    You are a customer trying to track a delayed order.
    - Start by asking about order #98765
    - Provide your name (Alex) when asked
    - Accept any resolution the support agent offers
    - Stop when the agent confirms a solution
    """,
    max_steps=8,
)

max_steps limits how many turns the simulator will generate before stopping.
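
To make the step cap concrete, here is a minimal sketch of a bounded conversation loop in plain Python. It is not giskard's implementation: `run_conversation`, `persistent_user`, `satisfied_user`, and `echo_agent` are hypothetical stand-ins, and the convention that a user signals completion by returning `None` is an assumption made only for this illustration.

```python
from typing import Callable, Optional

# Hypothetical stand-ins, not part of giskard: a user callable that returns
# the next message (or None once its goal is met) and an agent callable.
UserFn = Callable[[list], Optional[str]]
AgentFn = Callable[[str], str]


def run_conversation(user: UserFn, agent: AgentFn, max_steps: int) -> list:
    """Alternate user and agent turns until the user stops or max_steps is hit."""
    history = []
    for _ in range(max_steps):
        message = user(history)
        if message is None:  # the user decided its goal was reached
            break
        history.append((message, agent(message)))
    return history


def persistent_user(history):
    """A user that never stops on its own, e.g. one facing an unhelpful agent."""
    return f"Any update? (attempt {len(history) + 1})"


def satisfied_user(history):
    """A user that stops after receiving one reply."""
    return None if history else "Where is my order?"


def echo_agent(message: str) -> str:
    return f"You said: {message}"


capped = run_conversation(persistent_user, echo_agent, max_steps=8)
early = run_conversation(satisfied_user, echo_agent, max_steps=8)
print(len(capped), len(early))  # 8 1
```

The cap matters most in the first case: a user whose stopping condition is never met would otherwise keep generating turns indefinitely.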

3. Use the simulator as `inputs` in `.interact()`

Now wire the simulator into the scenario. Pass the UserSimulator instance as the inputs argument. This tells the scenario to call the simulator on each turn instead of using a fixed string; the scenario handles the loop automatically, up to max_steps.

from giskard.checks import Scenario, FnCheck

scenario = (
    Scenario("order_tracking")
    .interact(
        inputs=customer,
        outputs=lambda inputs: support_agent(inputs),
    )
    .check(
        FnCheck(
            fn=lambda trace: any(
                word in trace.last.outputs.lower()
                for word in ["resolved", "refund", "replacement", "shipped"]
            ),
            name="resolution_offered",
        )
    )
)
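
Before wiring a predicate into a scenario, it can help to try it on plain strings. The sketch below extracts the keyword test from the check above into a hypothetical helper, `mentions_resolution` (not a giskard function), and applies it to the stub agent's reply.

```python
RESOLUTION_KEYWORDS = ["resolved", "refund", "replacement", "shipped"]


def mentions_resolution(reply: str) -> bool:
    """Return True if the reply contains any resolution keyword (case-insensitive)."""
    text = reply.lower()
    return any(word in text for word in RESOLUTION_KEYWORDS)


# The exact reply returned by the stub support agent defined earlier.
stub_reply = (
    "I have located your order #98765. It is currently in transit and will "
    "arrive tomorrow. Is there anything else I can help you with?"
)

print(mentions_resolution(stub_reply))                       # False
print(mentions_resolution("Your replacement has SHIPPED."))  # True
```

Note that the predicate returns False for the stub reply, because "in transit" is not one of the keywords. Keyword lists like this are brittle, so it is worth checking them against the replies you expect to count as a resolution.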

4. Run the scenario and inspect the trace

With the scenario built, run it and iterate over the trace to see the full conversation the simulator generated. This is especially useful when debugging a failing check — you can see exactly what the simulated user said at each step.

import asyncio
result = asyncio.run(scenario.run())

# Print every turn
for turn in result.final_trace.interactions:
    print(f"User:  {turn.inputs}")
    print(f"Agent: {turn.outputs}")
    print()

Output

User:  Hello, I wanted to check on the status of my order #98765, please.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Thank you. Could you please confirm if my order is scheduled for delivery today or if there are any updates? My name is Alex.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Could you please confirm if my order has been shipped and an estimated delivery date? Thank you.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Thank you for the update. Could you please confirm if my order has now been shipped and when I can expect it to arrive?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Thank you for the update. Could you please confirm if my order has been shipped now and when I can expect it to arrive?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Hi, I appreciate the previous update. Just to confirm, has my order #98765 been shipped yet, and do you have an estimated delivery date?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Thank you for the update. Could you please confirm if my order #98765 has been shipped yet and when I can expect it to arrive?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

User:  Thank you for the information. Could you please confirm if my order #98765 has been shipped and provide an estimated delivery date?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?

5. Check `goal_reached` from simulator metadata

After the scenario finishes, the simulator writes an LLMGeneratorOutput into the last interaction’s metadata. Its goal_reached field tells you whether the user’s stated goal was achieved. This is a stronger signal than whether the scenario passed its checks, because it reflects the simulator’s own evaluation of the conversation outcome.

from giskard.checks.generators.base import LLMGeneratorOutput

last = result.final_trace.last
simulator_output = last.metadata.get("simulator_output")

if isinstance(simulator_output, LLMGeneratorOutput):
    print(f"Goal reached: {simulator_output.goal_reached}")
    print(f"Message: {simulator_output.message}")

Use goal_reached as an additional assertion:

if simulator_output and not simulator_output.goal_reached:
    print(f"Goal not reached: {simulator_output.message}")
else:
    print("Goal reached or no simulator output")

Output

Goal reached or no simulator output
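
The snippet above prints a message and treats missing simulator output the same as success. If you would rather have a test fail in both cases, a stricter helper along these lines is one option. This is a sketch only: `FakeSimulatorOutput` is a hypothetical stand-in that mimics the two fields read in this guide (`goal_reached` and `message`), and `require_goal_reached` is not a giskard function.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeSimulatorOutput:
    """Hypothetical stand-in exposing the two fields read in this guide."""
    goal_reached: bool
    message: Optional[str] = None


def require_goal_reached(metadata: dict) -> None:
    """Raise AssertionError unless metadata holds an output with goal_reached=True."""
    output: Any = metadata.get("simulator_output")
    if output is None:
        raise AssertionError("no simulator output found in metadata")
    if not output.goal_reached:
        raise AssertionError(f"goal not reached: {output.message}")


# Passes silently when the goal was reached.
require_goal_reached({"simulator_output": FakeSimulatorOutput(True)})

# Fails loudly, with the simulator's message, when it was not.
try:
    require_goal_reached(
        {"simulator_output": FakeSimulatorOutput(False, "agent kept repeating itself")}
    )
except AssertionError as exc:
    print(exc)  # goal not reached: agent kept repeating itself
```

With a real run you would pass the last interaction's metadata, as retrieved in the snippet above, instead of the hand-built dictionaries.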

6. Swap personas for A/B testing

With a single persona working, you can run the same agent against multiple user types to surface persona-specific failures. Each persona exercises a different interaction style, and running them concurrently with asyncio.gather means results for all three arrive in roughly the time the slowest one takes.

import asyncio

personas = [
    (
        "impatient",
        "You are impatient. Keep messages short. Escalate quickly if not helped.",
    ),
    (
        "detailed",
        "You are thorough. Ask many follow-up questions before accepting any solution.",
    ),
    (
        "confused",
        "You are unsure what you need. Describe symptoms, not the actual problem.",
    ),
]


async def run_persona(name, instructions):
    sim = UserSimulator(persona=instructions, max_steps=6)
    scenario = Scenario(name).interact(
        inputs=sim,
        outputs=lambda inputs: support_agent(inputs),
    )
    return name, await scenario.run()


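async def main():
    # asyncio.run() expects a coroutine, so call gather from inside one
    return await asyncio.gather(*(run_persona(n, i) for n, i in personas))


results = asyncio.run(main())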

for name, result in results:
    print(f"{name}: {'PASSED' if result.passed else 'FAILED'}")

Output

impatient: PASSED
detailed: PASSED
confused: PASSED
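
The concurrency pattern can be tried without any LLM calls. In this sketch, `fake_persona_run` is a hypothetical stand-in for `run_persona` that sleeps instead of running a scenario; it only illustrates how `asyncio.gather` overlaps awaited work and preserves argument order.

```python
import asyncio
import time


async def fake_persona_run(name: str, delay: float):
    """Hypothetical stand-in for run_persona: waits, then reports a pass."""
    await asyncio.sleep(delay)  # simulates time spent awaiting LLM calls
    return name, True


async def main():
    # gather runs the coroutines concurrently and returns results in argument order
    return await asyncio.gather(
        fake_persona_run("impatient", 0.2),
        fake_persona_run("detailed", 0.2),
        fake_persona_run("confused", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print([name for name, _ in results])
print(f"elapsed: {elapsed:.2f}s")  # close to 0.2s rather than 0.6s
```

The speed-up applies only to time spent awaiting. Code that blocks the event loop, such as a synchronous call that does real work, still runs one call at a time.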

Custom trace formatting

By default the trace prints interactions as raw inputs and outputs. You can write a simple formatting function to produce a human-readable transcript — for example, to log a simulated conversation or include it in a test failure message. For a subclass of Trace, Rich rendering, and how that interacts with print_report(), see Custom trace types.

def format_transcript(trace) -> str:
    """Format a trace as a human-readable chat transcript."""
    lines = []
    for turn in trace.interactions:
        lines.append(f"User:  {turn.inputs}")
        lines.append(f"Agent: {turn.outputs}")
    return "\n".join(lines)


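# Top-level await works in a notebook or async REPL; in a script, wrap this call in asyncio.run(...)
result = await (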
    Scenario("chat_trace_demo")
    .interact(
        inputs=customer,
        outputs=lambda inputs: support_agent(inputs),
    )
    .run()
)

print(format_transcript(result.final_trace))

Output

User:  Hello, I would like to inquire about the status of my order #98765. It has been delayed, and I want to know the current update.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Thank you for the update. Could you please confirm the exact delivery time for tomorrow?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Thank you for the update. Can you please tell me if there is an estimated delivery time tomorrow, or if I should expect the delivery later in the day?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Thank you for the clarification. Since my order is arriving tomorrow, could you please confirm the estimated delivery time range?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Hello, could you please tell me if there are any updates regarding my order #98765? Thank you.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Hello, I would like to confirm the delivery time for my order #98765 tomorrow. Could you please provide the estimated time range?
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Hello, could you please tell me what time the delivery is scheduled for tomorrow? Thank you.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
User:  Could you please confirm the approximate delivery time window for my order tomorrow? Thank you.
Agent: I have located your order #98765. It is currently in transit and will arrive tomorrow. Is there anything else I can help you with?
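
One use mentioned above is including the transcript in a test failure message. The sketch below shows that idea without running a scenario. The formatter is repeated so the block is self-contained, `fake_trace` is a hypothetical stand-in built from `SimpleNamespace` objects (any object with an `interactions` list whose items expose `inputs` and `outputs` would do), and `failure_message` is an illustrative helper, not a giskard function.

```python
from types import SimpleNamespace


def format_transcript(trace) -> str:
    """Same formatter as above, repeated so this sketch is self-contained."""
    lines = []
    for turn in trace.interactions:
        lines.append(f"User:  {turn.inputs}")
        lines.append(f"Agent: {turn.outputs}")
    return "\n".join(lines)


# Hypothetical stand-in for a trace with two interactions.
fake_trace = SimpleNamespace(
    interactions=[
        SimpleNamespace(inputs="Where is my order?", outputs="It is in transit."),
        SimpleNamespace(inputs="When will it arrive?", outputs="Tomorrow."),
    ]
)


def failure_message(check_name: str, trace) -> str:
    """Build an assertion message that embeds the full conversation."""
    return f"Check '{check_name}' failed. Transcript:\n{format_transcript(trace)}"


print(failure_message("resolution_offered", fake_trace))
```

Passing such a message as the second operand of an `assert` statement puts the whole conversation next to the failure in the test report.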

Next steps

Generators reference — full UserSimulator parameter reference
Multi-turn testing tutorial — multi-turn scenario basics
Debug with Spy — inspect what happens inside each interaction
