Monitoring Runs via the API

This guide is for teams who have Agents running in production and need to answer operational questions through the public API: Is this Run healthy? Did it complete? Where did it fail? Why is the failure rate rising? It covers patterns for checking Run status, diagnosing failures, building aggregated views, and integrating Duvo data into your existing observability tools. Where the API has gaps, each section notes the current limitation and the recommended workaround. All endpoints are on the base URL https://api.duvo.ai/v2; see Running Agents via API for authentication and the error model.

Checking Whether a Run Is Healthy

Poll a single Run until it finishes

Runs are asynchronous. Start a Run, capture the run_id from the response, then poll GET /runs/{run_id} until the status reaches a terminal state.

Terminal statuses: completed, failed, stopped, interrupted

Non-terminal statuses: pending, starting, running, waiting

#!/bin/bash
API_KEY="dv_your_api_key"
RUN_ID="550e8400-e29b-41d4-a716-446655440000"
BASE_URL="https://api.duvo.ai/v2"

while true; do
  RESPONSE=$(curl -s "$BASE_URL/runs/$RUN_ID" \
    -H "Authorization: Bearer $API_KEY")
  STATUS=$(echo "$RESPONSE" | jq -r '.run.status')

  echo "$(date -u +%H:%M:%S) — status: $STATUS"

  case "$STATUS" in
    completed|failed|stopped|interrupted)
      echo "Run finished: $STATUS"
      break
      ;;
  esac

  sleep 10
done

A waiting status means the Agent paused for human input. Check pending_human_request on the same response to see what it is waiting on.
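The same loop can be sketched in Python. This is an illustration, not an official client: wait_for_run, the injected fetch_run callable, the on_waiting hook, and the timeout are hypothetical names, and the sketch assumes pending_human_request sits on the run object next to status.

```python
import time
from typing import Callable, Optional

TERMINAL = {"completed", "failed", "stopped", "interrupted"}


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval: float = 10.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    on_waiting: Optional[Callable[[object], None]] = None,
) -> str:
    """Poll until the Run reaches a terminal status and return that status.

    fetch_run is a hypothetical callable that returns the parsed JSON body of
    GET /runs/{run_id}; injecting it keeps this sketch free of network code.
    """
    waited = 0.0
    while True:
        run = fetch_run().get("run", {})
        status = run.get("status")
        if status in TERMINAL:
            return status
        if status == "waiting" and on_waiting is not None:
            # Assumption: pending_human_request is a field of the run object.
            on_waiting(run.get("pending_human_request"))
        if waited >= timeout:
            raise TimeoutError(f"Run still {status!r} after {timeout} seconds")
        sleep(interval)
        waited += interval
```

Passing the fetch and sleep functions in as arguments makes the loop easy to exercise with canned responses before pointing it at the real endpoint.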

Polling interval guidance:

Run type	Suggested interval
Short-running (< 2 min)	5–10 seconds
Medium (2–15 min)	30 seconds
Long-running (> 15 min)	60–120 seconds

The rate limit is 300 requests per minute per key. One Run polled every 5 seconds uses 12 requests per minute, so about 25 Runs polled in parallel at that interval consume the entire budget. If you are monitoring many Runs in parallel, increase the interval or use a webhook instead.
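To size the interval for parallel monitoring, the arithmetic can be sketched as below. The 300 requests per minute figure comes from this guide; min_poll_interval and its budget_fraction safety margin are hypothetical, not part of the API.

```python
RATE_LIMIT_PER_MINUTE = 300  # per-key limit quoted in this guide


def min_poll_interval(parallel_runs: int, budget_fraction: float = 0.5) -> float:
    """Smallest per-Run polling interval, in seconds, that stays within budget.

    budget_fraction is a hypothetical safety margin: the share of the rate
    limit you are willing to spend on status polling.
    """
    if parallel_runs < 1:
        raise ValueError("parallel_runs must be at least 1")
    if not 0 < budget_fraction <= 1:
        raise ValueError("budget_fraction must be in (0, 1]")
    allowed_per_minute = RATE_LIMIT_PER_MINUTE * budget_fraction
    # A Run polled every `interval` seconds costs 60 / interval requests per minute.
    return 60.0 * parallel_runs / allowed_per_minute
```

For example, min_poll_interval(25, budget_fraction=1.0) gives 5.0 seconds, and leaving half the budget free for other calls pushes 50 parallel Runs to a 20-second interval.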

Get notified when a Run finishes (webhooks)

Pass a webhook_url when starting a Run to receive a POST the moment the Run changes state. This avoids polling entirely and is the preferred pattern for production pipelines. Duvo sends run_completed, run_failed, and run_interrupted events (plus human_request events when the Agent needs input), so filter by the event field for the state you care about.

curl -X POST "https://api.duvo.ai/v2/teams/$TEAM_ID/runs" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
    "webhook_url": "https://your-service.example.com/hooks/duvo"
  }'

Your endpoint receives a payload that identifies the Run and its outcome — at minimum the event type, the run_id, and the run status. Use the run_id to call GET /runs/{run_id} for the full run record if you need more detail.

{
  "event": "run_completed",
  "run_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed"
}
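On the receiving side, filtering by the event field might look like the sketch below. It only parses and routes the body; the HTTP server around it is up to you. The action labels are hypothetical, and matching human_request events by prefix is an assumption, since the exact event names are not listed here.

```python
import json
from typing import Optional, Tuple


def route_webhook(raw_body: bytes) -> Tuple[str, Optional[str]]:
    """Return (action, run_id) for a webhook request body.

    The action labels are hypothetical names for whatever your pipeline
    does next with the Run.
    """
    payload = json.loads(raw_body)
    event = payload.get("event") or ""
    run_id = payload.get("run_id")
    if event == "run_completed":
        return "record_success", run_id
    if event in ("run_failed", "run_interrupted"):
        # Follow up with GET /runs/{run_id}/messages to find the cause.
        return "fetch_messages", run_id
    if event.startswith("human_request"):
        # Assumption: human-input events share the "human_request" prefix.
        return "notify_human", run_id
    return "ignore", run_id
```

Returning "ignore" for unknown events keeps the handler tolerant of event types added later.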

For event-driven triggers that start Runs automatically, see Event-Driven Triggers.

Diagnosing a Failed Run

Identify where a Run went wrong

When a Run shows status: failed, retrieve its message log to find the point of failure. GET /runs/{run_id}/messages returns every step the Agent took in chronological order — model decisions, tool calls and their results, HITL events, and the final output.

curl -s "https://api.duvo.ai/v2/runs/$RUN_ID/messages?limit=100" \
  -H "Authorization: Bearer $API_KEY" | jq '.messages[-10:]'

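Reading the last few messages usually shows what happened at the end of the Run. If the Run has more than 100 messages, the first page does not reach the end, so use offset to fetch the final page first. Look for: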

A tool_call whose tool_result carries an error
A message where the Agent describes why it is stopping
A human_request near the end with no follow-up response (the run status shows waiting via GET /runs/{run_id} when this happens)
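The first check in that list can be automated once the messages are loaded. The sketch below assumes a failed tool step carries a truthy error key inside tool_result; that key name is an assumption, so adjust it to the shape your responses actually use.

```python
from typing import Optional


def last_tool_error(messages: list) -> Optional[dict]:
    """Return the most recent tool step whose result carries an error."""
    for message in reversed(messages):
        result = message.get("tool_result")
        # Assumption: a failed tool step has a truthy "error" key in tool_result.
        if isinstance(result, dict) and result.get("error"):
            return {
                "tool": (message.get("tool_call") or {}).get("name"),
                "error": result["error"],
                "timestamp": message.get("timestamp"),
            }
    return None
```

Scanning from the end means the function reports the error closest to the point where the Run stopped, which is usually the one that matters.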

Inspect the Agent’s tool activity

The messages endpoint returns up to 100 messages per page; use limit and offset to page through longer Runs (total in the response tells you how many there are). Each message has a type, role, timestamp, and — for tool steps — tool_call and tool_result objects.

curl -s "https://api.duvo.ai/v2/runs/$RUN_ID/messages?limit=100" \
  -H "Authorization: Bearer $API_KEY" \
  | jq '.messages[] | select(.tool_call != null) | {tool: .tool_call.name, result: .tool_result}'

Distinguish a retried success from a Run that never recovered

Duvo retries transient errors automatically. A Run that retried and then succeeded shows status: completed — you will not see intermediate retry attempts in the status field. To confirm whether a Run went through retries, check the message log for repeated identical tool_calls with error results followed eventually by a successful result. A Run that exhausted all retries shows status: failed with the final tool error in the message log. See Retries, Failures, and Skipped Steps for the full retry behavior reference.
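One way to apply that check programmatically is sketched below. It treats two tool calls as identical when their name and arguments match, and a truthy error key in tool_result as a failed attempt; both field names (arguments, error) are assumptions about the message shape.

```python
import json


def classify_retries(messages: list) -> list:
    """Report tool calls that failed at least once and whether they recovered."""
    attempts = {}
    for message in messages:
        call = message.get("tool_call")
        if not call:
            continue
        # Assumption: identical calls share the same name and arguments.
        key = json.dumps([call.get("name"), call.get("arguments")], sort_keys=True)
        result = message.get("tool_result")
        failed = isinstance(result, dict) and bool(result.get("error"))
        entry = attempts.setdefault(key, {"tool": call.get("name"), "outcomes": []})
        entry["outcomes"].append(failed)

    report = []
    for entry in attempts.values():
        outcomes = entry["outcomes"]
        if any(outcomes):
            report.append({
                "tool": entry["tool"],
                "failed_attempts": sum(outcomes),
                # The last attempt decides: success means the retries recovered.
                "outcome": "exhausted" if outcomes[-1] else "recovered",
            })
    return report
```

A "recovered" entry on a completed Run is a retried success; an "exhausted" entry on a failed Run points at the call that never came back.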

Surface Runs that need attention

The Runs list supports a has_issues filter and an issue_severity filter (critical, medium, low), so you can pull the Runs Duvo flagged for quality or reliability concerns without reading every log:

curl -s "https://api.duvo.ai/v2/teams/$TEAM_ID/runs?has_issues=true&issue_severity=critical&limit=50" \
  -H "Authorization: Bearer $API_KEY" | jq '.data[] | {run_id: .id, status: .status, agent_id: .agent_id}'

The full evaluation detail behind a flagged Run is shown in the Runs List UI; the API exposes the flags for filtering, not the per-criterion scores.

Tracking Trends Across Multiple Runs

Count Runs by status for an Agent

Use GET /teams/{teamId}/runs with the agent_id filter to retrieve Runs for a specific Agent, then count by status. The list response returns Runs under data with a total count.

#!/bin/bash
TEAM_ID="your-team-id"
AGENT_ID="7c9e6679-7425-40de-944b-e07fc1f90ae7"
API_KEY="dv_your_api_key"
BASE_URL="https://api.duvo.ai/v2"

# Fetch the last 100 Runs for this Agent
curl -s "$BASE_URL/teams/$TEAM_ID/runs?agent_id=$AGENT_ID&limit=100&sort_by=created_at&sort_order=desc" \
  -H "Authorization: Bearer $API_KEY" \
  | jq '
    .data
    | group_by(.status)
    | map({ status: .[0].status, count: length })
  '

Sample output:

[
  { "status": "completed", "count": 87 },
  { "status": "failed", "count": 9 },
  { "status": "stopped", "count": 4 }
]

A failure rate above 10% for a well-established Agent usually signals a Connection Login issue, a change in the upstream data format, or an external API outage.
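Turning the grouped counts into a rate is a small step. The sketch below takes the list produced by the jq filter above and counts only terminal statuses in the denominator; failure_rate and needs_attention are hypothetical helper names, and the 10% default mirrors the threshold mentioned here.

```python
TERMINAL = {"completed", "failed", "stopped", "interrupted"}


def failure_rate(status_counts: list) -> float:
    """Share of finished Runs that failed, from [{"status": ..., "count": ...}]."""
    finished = sum(row["count"] for row in status_counts if row["status"] in TERMINAL)
    failed = sum(row["count"] for row in status_counts if row["status"] == "failed")
    return failed / finished if finished else 0.0


def needs_attention(status_counts: list, threshold: float = 0.10) -> bool:
    """True when the failure rate is above the threshold."""
    return failure_rate(status_counts) > threshold
```

With the sample output above, 9 failures out of 100 finished Runs gives a rate of 0.09, just under the threshold.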

Page through Runs in a time window

The list endpoint uses offset-based pagination and accepts a since parameter (an ISO timestamp) to bound results to a time window. Page with limit=100 until you have read total rows.

#!/bin/bash
TEAM_ID="your-team-id"
API_KEY="dv_your_api_key"
BASE_URL="https://api.duvo.ai/v2"
SINCE="2026-05-17T00:00:00Z"   # start of the window (ISO 8601, UTC); set to 24 hours ago for a daily report
OFFSET=0

while true; do
  RESPONSE=$(curl -s "$BASE_URL/teams/$TEAM_ID/runs?since=$SINCE&limit=100&offset=$OFFSET&sort_by=created_at&sort_order=desc" \
    -H "Authorization: Bearer $API_KEY")

  echo "$RESPONSE" | jq '.data[] | {run_id: .id, status: .status, agent_id: .agent_id, started_at: .started_at, completed_at: .completed_at}'

  # Stop once every row has been read; a missing or unreadable total also ends the loop
  TOTAL=$(echo "$RESPONSE" | jq -r '.total // 0')
  OFFSET=$((OFFSET + 100))
  [ "$OFFSET" -ge "${TOTAL:-0}" ] && break
done

Use this as the basis for a nightly summary report or a feed into a BI tool.
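If the report is built in Python, the same offset loop can be written as a generator. In this sketch fetch_page is a hypothetical callable that wraps the list request and returns the parsed body with its data and total fields; iter_runs is likewise an illustrative name.

```python
from typing import Callable, Iterator


def iter_runs(fetch_page: Callable[[int, int], dict], page_size: int = 100) -> Iterator[dict]:
    """Yield every Run by walking offset-based pages until total rows are read.

    fetch_page(offset, limit) is a hypothetical callable returning the parsed
    JSON body of the list endpoint: {"data": [...], "total": N}.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        rows = page.get("data", [])
        yield from rows
        offset += len(rows)
        # An empty page or a reached total both end the walk.
        if not rows or offset >= page.get("total", 0):
            break
```

Advancing the offset by the number of rows actually returned, rather than by the page size, keeps the walk correct if the server returns a short page.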

Measure duration and spot latency regressions

The run record exposes started_at and completed_at timestamps; compute duration from the two. A jump in average duration usually means an external service your Agent depends on has slowed down.

curl -s "https://api.duvo.ai/v2/teams/$TEAM_ID/runs?agent_id=$AGENT_ID&status=completed&limit=100" \
  -H "Authorization: Bearer $API_KEY" \
  | jq '
    [ .data[]
      | select(.started_at != null and .completed_at != null)
      | ((.completed_at | fromdateiso8601) - (.started_at | fromdateiso8601)) ] as $secs
    | if ($secs | length) > 0 then
        { count: ($secs | length),
          avg_seconds: (($secs | add) / ($secs | length) | round),
          max_seconds: ($secs | max),
          min_seconds: ($secs | min) }
      else
        { count: 0, avg_seconds: 0, max_seconds: 0, min_seconds: 0 }
      end
  '
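A Python version of the same calculation is sketched below for pipelines that already hold the run records in memory. It assumes the timestamps are ISO 8601 strings, possibly ending in Z; duration_stats and parse_ts are hypothetical helper names.

```python
from datetime import datetime
from statistics import mean


def parse_ts(value: str) -> datetime:
    # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def duration_stats(runs: list) -> dict:
    """Summarize Run durations in seconds, skipping Runs missing a timestamp."""
    secs = [
        (parse_ts(r["completed_at"]) - parse_ts(r["started_at"])).total_seconds()
        for r in runs
        if r.get("started_at") and r.get("completed_at")
    ]
    if not secs:
        return {"count": 0, "avg_seconds": 0, "max_seconds": 0, "min_seconds": 0}
    return {
        "count": len(secs),
        "avg_seconds": round(mean(secs)),
        "max_seconds": max(secs),
        "min_seconds": min(secs),
    }
```

Comparing avg_seconds for the latest window against the previous one is a simple way to spot the jump described above.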

Integrating with Your Observability Stack

Send Run events to Datadog, Grafana, or Splunk

Duvo does not have a native push connector for third-party observability tools. The supported pattern is a pull-based pipeline: a scheduled process that pages through new Runs and forwards them to your tool.

Example: forward recent Runs to the Datadog Logs API

import requests

DUVO_API_KEY = "dv_your_api_key"
DATADOG_API_KEY = "your_datadog_api_key"
TEAM_ID = "your-team-id"
LAST_SEEN_OFFSET = 0  # persist this between invocations

def fetch_new_runs(offset):
    resp = requests.get(
        f"https://api.duvo.ai/v2/teams/{TEAM_ID}/runs",
        headers={"Authorization": f"Bearer {DUVO_API_KEY}"},
        params={"limit": 100, "offset": offset, "sort_by": "created_at", "sort_order": "asc"},
    )
    resp.raise_for_status()
    return resp.json()

def send_to_datadog(runs):
    logs = [
        {
            "ddsource": "duvo",
            "ddtags": f"agent:{r['agent_id']},status:{r['status']}",
            "service": "duvo-assignments",
            "message": f"Run {r['id']} {r['status']}",
            "run_id": r["id"],
            "agent_id": r["agent_id"],
            "status": r["status"],
            "started_at": r.get("started_at"),
            "completed_at": r.get("completed_at"),
        }
        for r in runs
    ]
    requests.post(
        "https://http-intake.logs.datadoghq.com/api/v2/logs",
        headers={"DD-API-KEY": DATADOG_API_KEY, "Content-Type": "application/json"},
        json=logs,
    ).raise_for_status()

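# Drain every page of new Runs, not just the first 100
while True:
    runs = fetch_new_runs(LAST_SEEN_OFFSET)["data"]
    if not runs:
        break
    send_to_datadog(runs)
    LAST_SEEN_OFFSET += len(runs)  # write this value to durable storage before exiting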

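Run this as a cron job every 5 minutes to keep your Datadog dashboard within about 5 minutes of current. Each Run is forwarded once, with the status it had when it was fetched, so a Run picked up while still in progress is logged with a non-terminal status. For Splunk, replace the send_to_datadog call with an HTTP Event Collector (HEC) POST. For Grafana Loki, send the same fields through the Loki push API, which expects log lines grouped into labeled streams rather than a flat list of records.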
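For Grafana Loki, the records need reshaping because the push API (POST /loki/api/v1/push) takes streams of [timestamp, line] pairs keyed by labels. The sketch below builds that body from the same run fields; the label names source and status are hypothetical choices, and agent_id is kept in the log line rather than as a label to limit label cardinality.

```python
import json
import time
from typing import Optional


def to_loki_payload(runs: list, now_ns: Optional[int] = None) -> dict:
    """Group Run records into Loki streams keyed by a small label set."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    streams = {}
    for r in runs:
        # Hypothetical label set; keep labels few and low-cardinality.
        labels = (("source", "duvo"), ("status", r["status"]))
        line = json.dumps(
            {
                "run_id": r["id"],
                "agent_id": r["agent_id"],
                "status": r["status"],
                "started_at": r.get("started_at"),
                "completed_at": r.get("completed_at"),
            },
            sort_keys=True,
        )
        # Loki expects the timestamp as a string of nanoseconds since the epoch.
        streams.setdefault(labels, []).append([str(now_ns), line])
    return {
        "streams": [
            {"stream": dict(labels), "values": values}
            for labels, values in streams.items()
        ]
    }
```

The result can be posted as JSON in place of the send_to_datadog call; this sketch stamps each line with the ingestion time rather than the Run's own timestamps.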

Use `run_id` as your correlation key

Duvo does not currently expose OpenTelemetry trace IDs in API responses. Use run_id as the stable identifier to correlate Duvo events with records in your SIEM or log tool.

# In your pipeline, always include run_id in the log record
log_record = {
    "run_id": run["id"],            # use this to join Duvo events with your SIEM
    "agent_id": run["agent_id"],
    "status": run["status"],
    "timestamp": run.get("completed_at") or run.get("started_at"),
}

For a full SIEM integration walkthrough — including exporting actor events (logins, role changes) and builder events (AOP edits, publishes) — see Audit Log and Activity Tracking.

Known Monitoring Gaps

These capabilities are not yet available via the public API. Each row includes the current workaround.

What you may want	Current state	Workaround
Per-criterion evaluation scores via API	Flags are filterable; full scores are UI only	Filter with `has_issues` / `issue_severity`; read full evaluation detail in the Runs List
Real-time streaming of Run events	Not available — polling or webhook only	Poll `GET /runs/{run_id}` every 10–30 seconds; use the `webhook_url` events for state changes
Run duration as a single field	Not a field on the run object	Compute from `started_at` and `completed_at`
Per-step tool timing	Not exposed via the API	Use message `timestamp`s to estimate the gap between steps
Per-step cost breakdown	Not exposed via the API	Use Team Insights for aggregated cost trends
OpenTelemetry trace IDs in responses	Not currently exposed	Use `run_id` as the stable correlation key across your systems
Actor and builder event export via API	Not available — in-product audit log only	Contact security@duvo.ai for a data extract

Related

Running Agents via API

Starting Runs, uploading files, and HITL webhook details

Retries, Failures, and Skipped Steps

How Duvo handles transient errors and permanent failures

Audit Log and Activity Tracking

SIEM integration, actor and builder events, CSV/JSON export

Event-Driven Triggers

Trigger Runs automatically from file drops, status changes, or Slack messages

Team Insights

Aggregated metrics, completion rates, and cost trends across your team
