File-drop triggers let your assignments start automatically when a new file arrives — in a monitored cloud folder, as an email attachment, or via the Duvo-hosted upload endpoint — without anyone having to kick off a Job manually.

Key Capabilities

  • Cloud folder monitoring — Watch a Google Drive folder, OneDrive folder, or SharePoint library for new files and start a Job as each one lands
  • Email attachment monitoring — Start a Job when an email arrives in Gmail or Outlook with an attachment, passing both the message and its files to the assignment
  • Duvo-hosted upload endpoint — Upload files programmatically via the Duvo API (sandbox upload), then start a Job with those files as input. Useful when an external system (a form, script, or pipeline) needs to push files to Duvo directly over HTTP
  • Deduplication with Assignment Memory — Track which files have already been processed so the assignment never acts on the same file twice
  • Multi-format support — Process PDFs, spreadsheets, images, Word documents, and other formats using the built-in Email Attachments Reader and Intelligent Document Reader connections
  • Flexible response — Extract structured data, validate against business rules, write records to downstream systems, and flag anomalies for human review

How File-Drop Triggers Work

Duvo does not currently support a native “watch folder” push event for cloud storage. Instead, folder monitoring is implemented by scheduling an assignment to run on a short interval (every 5 or 15 minutes, for example). On each run, the SOP lists the files in the target folder, compares them against the file IDs it has already seen (stored in Assignment Memory), and processes only the new ones.

For email attachment workflows, use the email trigger directly: the assignment fires as soon as a new email arrives in the connected inbox, so no polling interval is needed.

| Landing location            | Mechanism                      | Latency                         |
|-----------------------------|--------------------------------|---------------------------------|
| Gmail or Outlook inbox      | Push (email trigger)           | Seconds                         |
| Google Drive folder         | Polling (scheduled assignment) | Equal to your schedule interval |
| OneDrive folder             | Polling (scheduled assignment) | Equal to your schedule interval |
| SharePoint library          | Polling (scheduled assignment) | Equal to your schedule interval |
| Duvo-hosted upload endpoint | Push (HTTP POST via Duvo API)  | Seconds                         |

When to Use File-Drop Triggers

  • Invoice processing — Vendors drop PDF invoices in a shared folder; the assignment extracts fields and posts records to your accounting system
  • Document validation — Suppliers submit compliance documents via email; the assignment checks required fields and flags missing information
  • Report ingestion — A third-party system exports CSVs to a shared drive nightly; the assignment imports and transforms the data each morning
  • Image and scan processing — Field teams photograph forms and upload to a shared folder; the assignment reads and transcribes each image

How to Set It Up

Option A: Email attachment trigger (push, seconds latency)

This is the simplest path. Use it when files arrive as email attachments.
  1. Open your assignment and go to Setup.
  2. Enable the Gmail or Microsoft Outlook trigger.
  3. Optionally add a sender filter to restrict the trigger to emails from known senders (see Sender-Scoped Email Triggers).
  4. Add the Email Attachments Reader connection to extract data from PDF, Excel, image, and other attachment types.
  5. Write your SOP to describe what to do with each attachment — extract fields, validate, write records.

Option B: Cloud folder polling trigger (minutes latency)

Use this when files land in Google Drive, OneDrive, or SharePoint.
  1. Open your assignment and go to Setup.
  2. Set a schedule of every 5 or 15 minutes (or whatever interval is appropriate for your workflow).
  3. Add the relevant cloud storage connection (Google Drive, Microsoft OneDrive, or Microsoft SharePoint).
  4. Enable Assignment Memory and choose a memory key, such as processed_files, under which the assignment will store the list of file IDs it has already handled.
  5. Write your SOP to:
    • List all files in the target folder
    • Compare against processed_files in memory to identify new files
    • Process each new file
    • Add each processed file’s ID to processed_files in memory before finishing
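
The comparison described in the SOP steps above amounts to a set difference between the current folder listing and the IDs stored in memory. The following is a minimal shell sketch of that logic, not a description of how Duvo implements it. It assumes both lists are available as local text files with one file ID per line, and the function names new_file_ids and mark_processed are hypothetical.

```shell
#!/bin/sh
# Sketch only: the text-file layout and function names are illustrative assumptions.

# new_file_ids LISTING PROCESSED
# Print every ID in LISTING that does not appear in PROCESSED.
new_file_ids() {
  awk -v seen_file="$2" '
    # Load the already-processed IDs into a lookup table first.
    BEGIN { while ((getline id < seen_file) > 0) seen[id] = 1 }
    # Then print only the listing lines that are not in the table.
    !($0 in seen)
  ' "$1"
}

# mark_processed PROCESSED ID
# Append ID to PROCESSED once the file has been handled.
mark_processed() {
  printf '%s\n' "$2" >> "$1"
}
```

Recording an ID only after its file has been handled means a file that fails mid-run is seen as new again on the next scheduled Job.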

Option C: Duvo-hosted upload endpoint (push, seconds latency)

Use this when an external system (a form submission service, a script, or a data pipeline) needs to push files directly to Duvo over HTTP and start a Job immediately. The Duvo API provides a sandbox that acts as a staging area for files. Your external system uploads files to the sandbox, then calls the run endpoint to start a Job with those files as input.
How it works:
  1. Your external system calls POST /v1/sandboxes to create a sandbox and receives a sandbox_id.
  2. It uploads one or more files to POST /v1/sandboxes/{sandbox_id}/files, passing each file as multipart form data with an Authorization: Bearer <api_key> header.
  3. It calls POST /v1/runs with the sandbox_id and your assignment ID to start the Job. The uploaded files are available to the assignment at the path specified during upload.
  4. Optionally, provide a human_request_webhook_url when calling POST /v1/runs. Duvo will POST to this URL whenever the assignment reaches a Human-in-the-Loop step — that is, when it cannot complete processing automatically and needs a human to approve, reject, or supply information. The payload includes the run ID, request ID, request title and description, and a timestamp. Use the Respond to Human Request endpoint to resume the Job once the reviewer has responded.
Requirements:
  • A Duvo API key — create one in Team Settings → API Keys. See Running Assignments via API for authentication details.
  • Your external system should handle retries and protect its webhook endpoint (for example, by validating a shared secret or token). Duvo does not validate signatures on requests sent to your endpoint.
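
Because retries are left to the calling system, one option is a small wrapper that reruns a failed API call with exponential backoff. This is a sketch under assumptions: the retry function and the RETRY_BASE_DELAY variable are hypothetical names, and the attempt count and delays are arbitrary starting points rather than Duvo recommendations.

```shell
#!/bin/sh
# Sketch only: retry and RETRY_BASE_DELAY are illustrative names, not part of any Duvo tooling.

# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Run COMMAND until it exits 0 or MAX_ATTEMPTS is reached, doubling the wait each time.
retry() {
  max_attempts=$1
  shift
  attempt=1
  delay=${RETRY_BASE_DELAY:-1}   # seconds before the second attempt
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max_attempts" ] && return 1
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Example usage (not executed here). curl's --fail flag makes HTTP errors
# return a non-zero exit code, which is what lets retry detect them:
#   retry 4 curl --fail -s -X POST "$BASE_URL/sandboxes" \
#     -H "Authorization: Bearer $API_KEY"
```

Retrying a request that actually succeeded on the server can repeat its effect, so it is worth checking the outcome before retrying calls that create sandboxes or start Jobs.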
Example (curl):
API_KEY="dv_your_api_key"
ASSIGNMENT_ID="your-assignment-id"
BASE_URL="https://api.duvo.ai/v1"

# Step 1 — Create a sandbox
SANDBOX_ID=$(curl -s -X POST "$BASE_URL/sandboxes" \
  -H "Authorization: Bearer $API_KEY" | jq -r '.sandbox_id')

# Step 2 — Upload the file
curl -X POST "$BASE_URL/sandboxes/$SANDBOX_ID/files" \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@./invoice.pdf" \
  -F "path=/workspace/invoice.pdf"

# Step 3 — Start the Job
curl -X POST "$BASE_URL/runs" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"agent_id\": \"$ASSIGNMENT_ID\", \"sandbox_id\": \"$SANDBOX_ID\"}"
The assignment’s SOP can read the file from /workspace/invoice.pdf (or whatever path you specified in Step 2).
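
When several files need to go into the same sandbox, Step 2 can be repeated in a loop. The sketch below assumes API_KEY, BASE_URL, and SANDBOX_ID are set as in the example above and that every file should land under /workspace/ with its original name. The upload_dir function and the DRY_RUN variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: upload_dir and DRY_RUN are illustrative names.

# upload_dir DIR
# Upload every regular file in DIR to the sandbox as /workspace/<file name>.
# With DRY_RUN=1, print the planned source -> target mapping instead of calling the API.
upload_dir() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue            # skip subdirectories and an unmatched glob
    target="/workspace/$(basename "$f")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$f -> $target"
    else
      curl --fail -s -X POST "$BASE_URL/sandboxes/$SANDBOX_ID/files" \
        -H "Authorization: Bearer $API_KEY" \
        -F "file=@$f" \
        -F "path=$target" || return 1  # stop at the first failed upload
    fi
  done
}
```

Running it once with DRY_RUN=1 is a cheap way to confirm the target paths match what the SOP expects before any file is sent.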

Worked Example — Vendor Invoice Processing

Outcome: Vendor invoices arrive in a shared Google Drive folder. Each invoice is read, key fields are extracted, a record is written to your accounting spreadsheet, and anomalous invoices are held for human review.
Connections used:
  • Google Drive — list and read files in the shared folder
  • Intelligent Document Reader — extract fields from PDF invoices
  • Google Sheets — write extracted records
  • Human-in-the-Loop — pause for review when a field is missing or out of range

Before You Start

Make sure you have these ready before building the assignment:
  • Invoice folder — A shared Google Drive folder where vendors drop invoices. Note the folder name or ID.
  • Approved vendor list — A Google Sheet with a column of valid vendor names (for example, a sheet named “Approved Vendors” with a “Vendor Name” column). Optionally add an “ERP Vendor ID” column for downstream mapping. Note the sheet name and tab.
  • Accounting spreadsheet — A Google Sheet where extracted invoice records will be written. Note the sheet name and column headers you expect (Vendor Name, Invoice Number, Invoice Date, Due Date, Line Items, Total Amount, Processed Date).
  • Google Drive and Google Sheets connections — Connect Google Drive and Connect Google Sheets from the Connections page.
  • Assignment Memory enabled — You will enable this in Step 4; no pre-configuration needed.

Step 1: Create the assignment

  1. Click + Create Assignment from your dashboard.
  2. Select Use Assignment Builder.

Step 2: Write your SOP

Paste and adapt this into the Assignment Builder:
Check the Google Drive folder at [folder name or ID] for new files that have not been
processed before.

For each new file:
1. Read the file using the Intelligent Document Reader and extract:
   - Vendor name
   - Invoice number
   - Invoice date
   - Due date
   - Line items (description, quantity, unit price)
   - Total amount

2. Validate the extracted data:
   - Vendor name must appear in [your approved vendor list in Google Sheets / your ERP]
   - Invoice number must not already exist in [your accounting spreadsheet]
   - Total amount must be greater than 0 and below [your single-invoice limit, e.g. $50,000]

3. If all fields are present and pass validation:
   - Append a new row to the "Invoices" sheet with all extracted fields and today's date.

4. If any field is missing, unreadable, or fails validation:
   - Do not write the row.
   - Request human review. Title: "[Issue type] — [Invoice number or filename] — [Vendor name]".
     Include all extracted fields and the specific issue.
   - After the reviewer approves (with any corrections) or rejects, either write the corrected
     row or leave the invoice unprocessed and log the reason.

5. Record the file ID in memory under the key "processed_files" so it is not processed again.
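
To make the three validation rules in the SOP concrete, here is a shell sketch of equivalent checks. The assignment itself follows the natural-language SOP, so this is only an illustration, for example for pre-checking test invoices. It assumes the approved vendor names and existing invoice numbers have been exported to local text files with one value per line. The names validate_invoice, APPROVED_VENDORS, EXISTING_INVOICES, and INVOICE_LIMIT are hypothetical.

```shell
#!/bin/sh
# Sketch only: all names and the text-file layout are illustrative assumptions.

# validate_invoice VENDOR INVOICE_NUMBER TOTAL
# Print "ok" and return 0 if all three rules pass; otherwise print the first failure.
validate_invoice() {
  vendor=$1
  invoice_no=$2
  total=$3

  # Rule 1: vendor name must appear in the approved vendor list (exact line match).
  if ! grep -Fxq -- "$vendor" "$APPROVED_VENDORS"; then
    echo "vendor not approved"
    return 1
  fi

  # Rule 2: invoice number must not already exist.
  if grep -Fxq -- "$invoice_no" "$EXISTING_INVOICES"; then
    echo "duplicate invoice number"
    return 1
  fi

  # Rule 3: total must be greater than 0 and below the single-invoice limit.
  if ! awk -v t="$total" -v max="${INVOICE_LIMIT:-50000}" \
      'BEGIN { exit !(t + 0 > 0 && t + 0 < max + 0) }'; then
    echo "total out of range"
    return 1
  fi

  echo "ok"
}
```

The first failing rule is reported, which maps naturally onto the "[Issue type]" part of the review request title in step 4.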

Step 3: Add connections

Under Connections, enable:
  • Google Drive — to list and read files in the shared folder
  • Intelligent Document Reader — already available by default
  • Google Sheets — to write invoice records and look up the approved vendor list
  • Human-in-the-Loop — already available by default

Step 4: Enable Assignment Memory

Go to Setup and enable Assignment Memory. This allows the assignment to store the list of processed file IDs between Jobs.

Step 5: Set a schedule

  1. Click Schedule in the assignment header.
  2. Choose Custom and set an interval of every 15 minutes, or every hour if near-real-time processing is not required.
  3. Click Add schedule to save.

Step 6: Test with a sample invoice

  1. Upload a sample invoice PDF to the Google Drive folder.
  2. Click Start Work to run the assignment manually.
  3. Confirm the fields were extracted correctly and the row was written to your spreadsheet.
  4. Upload a second PDF with a missing field and verify the Human-in-the-Loop request is created.

Expected Results

When the assignment is running:
In Google Sheets:
  • A new row for each valid invoice, with all extracted fields populated.
  • No rows created for invoices with validation errors — those are held for review.
In your Activity Inbox:
  • A Human-in-the-Loop request for each invoice with a missing, unreadable, or anomalous field.
In Assignment Memory:
  • The processed_files list grows with each Job. Files already processed are skipped even if they are still in the folder.
In Duvo:
  • A session log per scheduled Job showing which files were found, processed, and skipped.

Troubleshooting

New files are not being picked up

  • Folder path: Verify the folder name or ID in your SOP matches the actual folder in Google Drive. Paste the folder URL into the SOP if the name is ambiguous.
  • Memory conflict: If the processed_files list was corrupted or contains a wrong ID, clear the memory for this assignment in Setup → Assignment Memory and let it reprocess.
  • Permissions: Confirm the Google Drive connection has read access to the shared folder, including folders owned by others. Re-authorize from the Connections page if needed.

The same file is processed more than once

  • The processed_files key in Assignment Memory is case-sensitive, so use exactly the same key name everywhere in your SOP. Also make sure the SOP writes file IDs in the same format that it reads when checking for duplicates.
  • If Assignment Memory was cleared, files previously processed will appear new on the next Job.

PDF extraction is incomplete

  • Scanned PDFs: Low-resolution scans reduce extraction accuracy. Ask vendors to send native PDFs where possible.
  • Multi-page invoices: Explicitly state in your SOP: “Read all pages of the document.”
  • Non-standard layouts: Add two or three sample invoices to Files and reference them in the SOP as formatting examples.

Files in subfolders are missed

By default, listing a folder returns only top-level files. To include subfolders, add to your SOP: “List all files in the folder and all its subfolders recursively.”

Take It Further

Route by vendor
After writing the invoice row, check if the vendor has a preferred reviewer in [your spreadsheet].
If so, include their name in the Human-in-the-Loop request title.

Send a confirmation to the vendor
After writing the invoice row, send a confirmation email to the vendor's address (extracted from
the invoice) with the subject "Invoice [number] received — processing in progress."

Trigger from Slack
Drop a file directly in a Slack channel and mention the Duvo app to process it on demand, rather than waiting for the next scheduled Job. See Slack Mention Workflows.