Async Extraction Patterns

Overview

Cyberdesk provides flexible async extraction modes that let you optimize workflow performance based on when and how you need extracted data. Understanding these patterns is key to building fast, efficient workflows.

Async extraction works seamlessly with Cyberdesk’s trajectory caching system. During trajectory replay, extract prompts re-execute to capture fresh data, so you get both the speed benefits of caching and the flexibility of dynamic data extraction.

The Three Processing Modes

Synchronous (Default)

When: process_async not set, false, or None
Behavior: Extraction blocks until complete
Processing Time: 2-5 seconds per extraction
Use When:

You need the result immediately for the next decision
Extracting a single value
The extraction determines workflow branching
Simple workflows with < 5 total extractions

Example:

"Navigate to order details. Take a screenshot with extract_prompt='Extract 
the order status as one word: Pending, Processing, or Shipped' to determine 
if the order needs manual intervention."

Timing Diagram:

Agent Step 1 → Extract (3s) → Agent Step 2 → Extract (3s) → Agent Step 3
Total: 6 seconds of extraction time

Batch-Scoped Async

When: process_async=true or process_async="batch"
Behavior: All extractions in the current tool call batch run in parallel and complete before the next agent step
Processing Time: ~3 seconds for the entire batch (no matter how many extractions)
Use When:

Scrolling through lists or paginated content
Extracting from multiple sequential views
Extractions don’t depend on each other
Results should be ready for next agent decision
Want to store runtime variables from extractions before next agent turn

Special Capability: The extraction agent has access to the upsert_runtime_values tool, so it can store specific values AND provide observations

Example:

"Scroll through the product catalog and extract all data:
- Take screenshot with extract_prompt='Extract visible products as JSON' 
  and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible products as JSON' 
  and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible products as JSON' 
  and process_async='batch'

All extractions complete in parallel before next agent turn."

Timing Diagram:

Agent Step: [Screenshot + Extract] → [Scroll] → [Screenshot + Extract] → [Scroll] → [Screenshot + Extract]
            └─────────── All 3 extractions run in parallel (3s total) ──────────┘
                                                                                  ↓
                                                                          Agent Step 2
Total: ~3 seconds for all extractions combined

Performance Benefit: 3-5x faster than synchronous when extracting from multiple views
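The difference between the two timing diagrams can be reproduced with a small simulation. This is a minimal sketch using plain asyncio, not Cyberdesk's implementation: fake_extract, run_synchronously, and run_as_batch are hypothetical names, and the 0.1 s sleep stands in for the roughly 3 s a real extraction takes.

```python
import asyncio
import time

EXTRACTION_SECONDS = 0.1  # stand-in for the ~3 s of a real extraction


async def fake_extract(label: str) -> str:
    """Simulate one extraction (hypothetical; no real model call is made)."""
    await asyncio.sleep(EXTRACTION_SECONDS)
    return f"result for {label}"


async def run_synchronously(labels: list[str]) -> list[str]:
    # Each extraction finishes before the next one starts.
    return [await fake_extract(label) for label in labels]


async def run_as_batch(labels: list[str]) -> list[str]:
    # All extractions start together; the batch ends when the slowest one does.
    return await asyncio.gather(*(fake_extract(label) for label in labels))


def timed(coro) -> tuple[list[str], float]:
    """Run a coroutine to completion and report how long it took."""
    start = time.perf_counter()
    results = asyncio.run(coro)
    return results, time.perf_counter() - start


views = ["view 1", "view 2", "view 3"]
sync_results, sync_elapsed = timed(run_synchronously(views))
batch_results, batch_elapsed = timed(run_as_batch(views))
print(f"sequential: {sync_elapsed:.2f}s, batch: {batch_elapsed:.2f}s")
```

Both variants return the same results in the same order; only the elapsed time differs, which is why batch mode suits extractions that do not depend on each other.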

Run-Scoped Async

When: process_async="run"
Behavior: Extraction runs completely in the background for the entire workflow and is only awaited at final output generation
Processing Time: Non-blocking, completes while the workflow continues
Use When:

Large data extractions not needed for navigation
Extraction is only for final output
You want maximum parallelism
Need to set runtime variables from extraction that won’t be used until later

Special Capability: The extraction agent has access to the upsert_runtime_values tool, so it can store specific values AND provide observations

Example:

"Navigate to analytics dashboard. Take screenshot with extract_prompt='Extract 
complete analytics data: all metrics, charts, KPIs, trends as detailed JSON' 
and process_async='run'

Continue with report generation workflow. The analytics extraction will complete 
in the background and be included automatically in final output."

Timing Diagram:

Agent Step 1 → [Start Extract (non-blocking)] → Agent Step 2 → Agent Step 3 → Agent Step 4
                        ↓ (running in background)
                        ↓
                        ↓
               [Extraction completes while workflow continues]
                        ↓
                        └─────────────────────────────────────→ Final Output Generation
                                                                 (waits for completion)
Total: 0 seconds of blocking time, extraction happens in parallel with workflow

Performance Benefit: Maximum parallelism, zero blocking time during workflow execution
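The run-scoped timing can be pictured as a background task that is started early and awaited only at the end. The following is an illustrative sketch in plain asyncio under that assumption; fake_extract and workflow are hypothetical names and the sleeps stand in for real extraction and agent steps.

```python
import asyncio


async def fake_extract(label: str, log: list[str]) -> str:
    """Simulate a slow extraction (hypothetical stand-in)."""
    await asyncio.sleep(0.2)
    log.append(f"extraction finished: {label}")
    return f"data for {label}"


async def workflow() -> tuple[str, list[str]]:
    log: list[str] = []
    # Start the extraction without awaiting it, so later steps are not blocked.
    pending = asyncio.create_task(fake_extract("analytics", log))
    for step in (2, 3, 4):
        await asyncio.sleep(0.01)  # stand-in for an agent step
        log.append(f"agent step {step}")
    # Only final output generation waits for the background extraction.
    extracted = await pending
    log.append("final output generated")
    return extracted, log


extracted, log = asyncio.run(workflow())
for entry in log:
    print(entry)
```

In this sketch the agent steps are logged before the extraction finishes, and the final output is produced only after the pending task has been awaited.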

Comparing the Modes

Aspect	Synchronous	Batch-Scoped	Run-Scoped
Blocking	✅ Blocks each time	✅ Blocks at end of batch	❌ Non-blocking
Parallelism	❌ Sequential	✅ Within batch	✅ Across entire run
Best For	Single extraction, decisions	Lists, pagination	Large output data
Result Available	Immediately	Before next step	At final output
Runtime Variables	❌ No	✅ Yes (via upsert_runtime_values)	✅ Yes (via upsert_runtime_values)
Performance	Slowest (N × 3s)	Fast (3s per batch)	Fastest (0s blocking)
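The accepted process_async values described for the three modes can be summarized as a small lookup. This is only a sketch of the mapping stated in this guide: resolve_mode and the returned labels "sync", "batch", and "run" are names chosen for illustration, not part of any Cyberdesk SDK.

```python
from typing import Union


def resolve_mode(process_async: Union[bool, str, None] = None) -> str:
    """Map a process_async value to the processing mode it selects."""
    if process_async is None or process_async is False:
        return "sync"  # not set, false, or None: synchronous (default)
    if process_async is True or process_async == "batch":
        return "batch"  # true or "batch": batch-scoped async
    if process_async == "run":
        return "run"  # "run": run-scoped async
    raise ValueError(f"unsupported process_async value: {process_async!r}")


for value in (None, False, True, "batch", "run"):
    print(repr(value), "->", resolve_mode(value))
```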

Performance Examples

Scenario: Extract from 10 Pages of Data

Synchronous:

Page 1: Extract (3s) → Navigate
Page 2: Extract (3s) → Navigate
...
Page 10: Extract (3s)
Total: 30 seconds of extraction time

Batch-Scoped:

Batch 1: [Page 1 Extract, Navigate, Page 2 Extract, Navigate, Page 3 Extract]
         All 3 extractions in parallel: 3s
Batch 2: [Page 4 Extract, Navigate, Page 5 Extract, Navigate, Page 6 Extract]
         All 3 extractions in parallel: 3s
Batch 3: [Page 7 Extract, Navigate, Page 8 Extract, Navigate, Page 9 Extract]
         All 3 extractions in parallel: 3s
Batch 4: [Page 10 Extract]
         1 extraction: 3s
Total: 12 seconds of extraction time (2.5x faster)

Run-Scoped (if extraction not needed for navigation):

Start: Launch extraction for all pages → Continue workflow
       All 10 extractions run in background while workflow completes
Total: 0 seconds of blocking time during the workflow (extractions are awaited only at final output generation)
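The totals in this scenario follow from simple arithmetic. The sketch below reproduces them under the scenario's own assumptions: about 3 seconds per extraction and batches of 3 extractions. The function name blocking_seconds and its parameters are illustrative.

```python
import math


def blocking_seconds(extractions: int, mode: str,
                     per_extraction: float = 3.0, batch_size: int = 3) -> float:
    """Estimate how long the workflow is blocked waiting on extractions."""
    if mode == "sync":
        # Every extraction blocks in turn.
        return extractions * per_extraction
    if mode == "batch":
        # Each batch costs about one extraction's worth of time.
        return math.ceil(extractions / batch_size) * per_extraction
    if mode == "run":
        # Nothing blocks until final output generation.
        return 0.0
    raise ValueError(f"unknown mode: {mode}")


print(blocking_seconds(10, "sync"))   # 30.0
print(blocking_seconds(10, "batch"))  # 12.0
print(blocking_seconds(10, "run"))    # 0.0
```

Larger batches reduce the batch-scoped figure further; with all 10 extractions in one batch, the estimate drops to 3 seconds.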

Advanced Pattern: Hybrid Extraction

Combine multiple modes for optimal performance:

"Navigate to customer portal.

Step 1 - Quick Decision (Synchronous):
Take screenshot with extract_prompt='Extract customer account status: 
Active, Suspended, or Closed' to determine workflow path.

Step 2 - List Processing (Batch-Scoped):
If status is Active, extract recent transactions:
- Take screenshot with extract_prompt='Extract visible transactions as JSON' 
  and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible transactions as JSON' 
  and process_async='batch'
- Repeat for all pages

Step 3 - Detailed Analytics (Run-Scoped):
Take screenshot with extract_prompt='Extract complete account analytics: 
spending patterns, category breakdown, payment history' and process_async='run'

Continue with report generation. The analytics extraction completes in background."

Result:

Fast decision making (synchronous where needed)
Efficient list processing (batch-scoped parallelism)
Zero blocking for large data (run-scoped for final output)

Async Extractions with Runtime Variables

Both batch-scoped and run-scoped extractions have a powerful capability: the extraction agent can call upsert_runtime_values to make specific extracted values available as runtime variables. The difference is in timing:

Batch-Scoped: Variables available before next agent step (good for iterative workflows)
Run-Scoped: Variables available when extraction completes (good for background processing)

The Agent Loop

When using process_async (either “batch” or “run”), the extraction becomes a proper agent that can:

Call upsert_runtime_values to store specific fields
Provide final observations as text
Do both: Store values AND provide observations

System Prompt (Any Async Mode)

When using batch or run scoped extraction, the extraction agent receives:

You are an extraction assistant. You have two capabilities:

1. Call upsert_runtime_values to store specific extracted values that should 
   be available throughout the workflow as {{key_name}} placeholders.

2. Provide final observations as text describing what you see on screen.

You can do BOTH: store specific values AND provide observations, or just do 
one or the other. Your final text message will be recorded as the extraction result.

Example: Batch-Scoped with Runtime Variables (Available Before Next Step)

"Scroll through order list:
- Take screenshot with extract_prompt='Extract order_id as runtime variable 
  using upsert_runtime_values, then describe order status and customer name' 
  and process_async='batch'
- Use {{order_id}} to determine if this order needs special handling
- If yes, click on order
- Repeat for next order"

What Happens:

Screenshot taken
Extraction agent analyzes screenshot
Calls upsert_runtime_values({order_id: "ORD-123"})
Provides observation about status and customer
All batch extractions complete in parallel
Variables available before next agent step - can be used immediately
Agent proceeds with {{order_id}} available

Example: Run-Scoped with Runtime Variables (Available When Extraction Completes)

"Open invoice {invoice_number}.

Take screenshot with extract_prompt='Extract the invoice_date and total_amount 
and store them as runtime variables using upsert_runtime_values. Then provide 
a detailed description of all line items, tax breakdown, and payment terms.' 
and process_async='run'

Continue with payment workflow. The {{invoice_date}} and {{total_amount}} 
variables will be available once extraction completes in background, and the 
full invoice details will be in the final output."

What Happens:

Extraction starts in background
Workflow continues with other tasks
Extraction agent analyzes screenshot (in background)
Calls upsert_runtime_values({invoice_date: "2024-01-15", total_amount: 1250.00})
Variables become available once extraction completes
Agent then provides detailed observation about line items, taxes, etc.
Both the runtime variables and observation text are included in final output

Example: Pure Observation (No Runtime Variables)

"Take screenshot with extract_prompt='Describe all customer information 
visible on screen including contact details, order history, and preferences. 
Format as detailed JSON.' and process_async='run'

This large extraction runs in background and will be in final output."

Example: Multiple Runtime Variables

"Take screenshot with extract_prompt='Extract customer_id, order_total, and 
expected_delivery_date as runtime variables using upsert_runtime_values. Then 
provide a summary of order contents, shipping address, and special instructions.' 
and process_async='run'

Use {{customer_id}} and {{order_total}} in subsequent workflow steps once available."
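To make the {{key_name}} placeholder idea concrete, here is a minimal sketch of how stored runtime values could be substituted into later prompt text. This is not Cyberdesk's actual substitution logic: the render function, the regular expression, and the choice to leave unresolved placeholders untouched are all assumptions made for illustration.

```python
import re

# Matches {{name}} where name is letters, digits, or underscores (assumed syntax).
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, runtime_values: dict) -> str:
    """Replace {{key}} with its stored value; keep unknown keys as-is."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key in runtime_values:
            return str(runtime_values[key])
        return match.group(0)  # value not stored yet: leave the placeholder

    return PLACEHOLDER.sub(substitute, template)


stored = {"customer_id": "CUST-42", "order_total": 1250.0}
print(render("Look up {{customer_id}} and confirm total {{order_total}}.", stored))
print(render("Deliver by {{expected_delivery_date}}.", stored))
```

The second call shows why timing matters for run-scoped extractions: a step that runs before the extraction completes would still see an unresolved placeholder in this sketch.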

Array and Object Operators

When accumulating data across multiple extractions (e.g., scrolling through a list), use MongoDB-style operators to update arrays and objects in place instead of replacing their values:

Operator	Description	Example
`$append`	Append item to array	`{"items": {"$append": "new_item"}}`
`$prepend`	Prepend item to array	`{"items": {"$prepend": "first"}}`
`$concat`	Concatenate arrays	`{"items": {"$concat": ["a", "b"]}}`
`$merge`	Shallow merge objects	`{"config": {"$merge": {"key": "val"}}}`
`$remove`	Remove first occurrence by value	`{"tags": {"$remove": "old"}}`
`$pop`	Remove last element	`{"stack": {"$pop": true}}`

Example: Accumulating Extracted Items Across Pages

"Scroll through the products list and extract all items:

Page 1:
- Take screenshot with extract_prompt='Extract all visible products as JSON array. 
  For each product, call upsert_runtime_values with {\"products\": {\"$append\": {...product data...}}} 
  to add it to the running list.' and process_async='batch'

Page 2:
- Scroll down
- Take screenshot with extract_prompt='Extract all visible products as JSON array. 
  For each product, call upsert_runtime_values with {\"products\": {\"$append\": {...product data...}}} 
  to add it to the running list.' and process_async='batch'

After all pages, {{products}} contains the complete list of all extracted products."

The $append operator creates the array if it doesn’t exist, so you don’t need to initialize {{products}} before the first extraction.
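The operator table can be read as a set of update rules on a key-value store. The sketch below models those rules in plain Python to show the accumulation behavior; it is not the real upsert_runtime_values implementation. The names split_operator and apply_update are hypothetical, and treating a missing key as an empty list for operators other than $append is an assumption.

```python
from typing import Any, Optional, Tuple


def split_operator(value: Any) -> Optional[Tuple[str, Any]]:
    """Return (operator, argument) if value looks like {"$op": arg}, else None."""
    if isinstance(value, dict) and len(value) == 1:
        (op, arg), = value.items()
        if isinstance(op, str) and op.startswith("$"):
            return op, arg
    return None


def apply_update(store: dict, update: dict) -> dict:
    """Return a new store with each key of `update` applied."""
    result = dict(store)
    for key, value in update.items():
        parsed = split_operator(value)
        if parsed is None:
            result[key] = value  # plain value: replace
            continue
        op, arg = parsed
        if op == "$merge":
            result[key] = {**result.get(key, {}), **arg}  # shallow merge
            continue
        items = list(result.get(key, []))  # assumed: missing key starts empty
        if op == "$append":
            items.append(arg)
        elif op == "$prepend":
            items.insert(0, arg)
        elif op == "$concat":
            items.extend(arg)
        elif op == "$remove":
            if arg in items:
                items.remove(arg)  # first occurrence only
        elif op == "$pop":
            if items:
                items.pop()  # last element
        else:
            raise ValueError(f"unknown operator: {op}")
        result[key] = items
    return result


values: dict = {}
for page in (["Widget", "Gadget"], ["Gizmo"]):
    for product in page:
        values = apply_update(values, {"products": {"$append": {"name": product}}})
print(values["products"])
```

Because $append creates the list on first use, the loop needs no initialization step, matching the note above about {{products}}.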

Choosing the Right Mode

Use this decision tree:

Do you need the extraction result to make the next decision?
├─ YES → Use Synchronous
│         Example: "Extract status to determine next action"
│
└─ NO → Is this a list/multiple views?
    ├─ YES → Use Batch-Scoped Async
    │         Example: "Scroll and extract from each page"
    │
    └─ NO → Is this only for final output?
        ├─ YES → Use Run-Scoped Async
        │         Example: "Extract analytics for report"
        │
        └─ NO → Does it need to set runtime variables?
            ├─ YES → Use Run-Scoped Async (with upsert_runtime_values)
            │         Example: "Extract invoice_number for later use"
            │
            └─ NO → Use Synchronous if < 3 extractions
                    Use Batch-Scoped if >= 3 extractions
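The decision tree can also be written as a function, which makes the order of the checks explicit. This is a direct transcription of the tree above into a sketch; choose_mode and its parameter names are illustrative and not part of any Cyberdesk API.

```python
def choose_mode(needed_for_next_decision: bool,
                multiple_views: bool,
                final_output_only: bool,
                sets_runtime_variables: bool,
                extraction_count: int) -> str:
    """Walk the decision tree top to bottom and return the suggested mode."""
    if needed_for_next_decision:
        return "synchronous"
    if multiple_views:
        return "batch"
    if final_output_only:
        return "run"
    if sets_runtime_variables:
        return "run"  # store values via upsert_runtime_values for later use
    # Last branch: a few extractions stay synchronous, more go batch-scoped.
    return "synchronous" if extraction_count < 3 else "batch"


# "Extract status to determine next action"
print(choose_mode(True, False, False, False, 1))
# "Scroll and extract from each page"
print(choose_mode(False, True, False, False, 5))
# "Extract analytics for report"
print(choose_mode(False, False, True, False, 1))
```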

Real-World Patterns

Healthcare: Patient Record Processing

"Navigate to patient {patient_mrn}.

Quick Check (Synchronous):
Take screenshot with extract_prompt='Extract patient age as number' to 
determine if pediatric workflow is needed.

Medical History (Batch-Scoped):
Navigate through each section of medical history:
- Take screenshot with extract_prompt='Extract diagnosis history' and process_async='batch'
- Go to medications tab
- Take screenshot with extract_prompt='Extract current medications' and process_async='batch'
- Go to allergies tab
- Take screenshot with extract_prompt='Extract allergies' and process_async='batch'

Comprehensive Record (Run-Scoped):
Take screenshot of complete patient chart with extract_prompt='Extract full 
patient record including all demographics, vitals, lab results, imaging reports, 
and visit notes. Store patient_mrn as runtime variable.' and process_async='run'

Continue with documentation workflow using {{patient_mrn}} for file naming."

E-Commerce: Inventory Extraction

"Log into inventory system.

Category Check (Synchronous):
Take screenshot with extract_prompt='Count visible categories' to determine 
navigation depth.

Per-Category Extraction (Batch-Scoped):
For each category:
- Navigate to category
- Take screenshot with extract_prompt='Extract all products as JSON' and process_async='batch'
- Repeat

Full Catalog Analytics (Run-Scoped):
Take screenshot with extract_prompt='Extract complete inventory analytics: 
stock levels, trends, low-stock alerts, reorder recommendations. Store 
total_products and low_stock_count as runtime variables.' and process_async='run'

Generate inventory report using {{total_products}} and {{low_stock_count}} in report header."

Finance: Transaction Processing

"Open account {account_number}.

Balance Check (Synchronous):
Take screenshot with extract_prompt='Extract current balance as number' to 
verify sufficient funds.

Recent Transactions (Batch-Scoped):
Scroll through transaction history:
- Take screenshot with extract_prompt='Extract visible transactions' and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible transactions' and process_async='batch'
- Repeat

Account Analysis (Run-Scoped):
Take screenshot with extract_prompt='Extract complete account analysis: 
spending categories, monthly trends, unusual patterns. Store account_type 
and risk_level as runtime variables.' and process_async='run'

Continue with report generation using {{account_type}} and {{risk_level}} for classification."

Best Practices

1. Start with Synchronous, Optimize Later

Begin with simple synchronous extraction, then optimize bottlenecks:

# Initial (works, but slow)
"Extract field A (sync) → Extract field B (sync) → Extract field C (sync)"

# Optimized (3x faster)
"Extract all three fields with process_async='batch' in one batch"

2. Use Batch-Scoped for Lists

Any time you’re iterating (scrolling, clicking next, navigating pages), use batch-scoped:

"For each item in list:
- Extract data with process_async='batch'
- Navigate to next
All extractions complete in parallel"

3. Use Run-Scoped for Final Output Only

If the extracted data doesn’t influence navigation or decisions, make it run-scoped:

"Take screenshot with extract_prompt='Extract complete report data' 
and process_async='run'

Continue workflow - extraction completes in background"

4. Combine with Other Extraction Methods

Use copy_to_clipboard for fast copyable text, and extract_prompt for vision-based extraction:

"Extract order data:
- Triple-click Order ID, use copy_to_clipboard with key 'order_id' (instant)
- Take screenshot with extract_prompt='Extract order items' and process_async='batch' (parallel)
- Take screenshot with extract_prompt='Extract shipping analytics' and process_async='run' (background)

Optimal performance using all available methods"

5. Set Runtime Variables from Run-Scoped Extractions

When you need specific values mid-workflow but also want large extractions:

"Take screenshot with extract_prompt='Extract customer_id and order_total 
as runtime variables. Then extract complete order details, history, and 
preferences as detailed JSON.' and process_async='run'

The {{customer_id}} and {{order_total}} become available once extraction completes, 
while full details are in final output."

Performance Metrics

Based on typical workflow patterns:

Scenario	Synchronous	Batch-Scoped	Run-Scoped	Speedup
1 extraction	3s	3s	0s*	1x
5 extractions (sequential)	15s	3-6s	0s*	2.5-5x
10 extractions (sequential)	30s	9-12s	0s*	2.5-3.3x
20 extractions (sequential)	60s	18-24s	0s*	2.5-3.3x

*Run-scoped shows 0s blocking time but still processes in background; workflow continues unblocked

Common Patterns Summary

Pattern	Mode	Example
Decision Making	Synchronous	"Extract status to determine next step"
List Processing	Batch-Scoped	"Scroll and extract from each page"
Final Output Data	Run-Scoped	"Extract analytics for report"
Runtime Variables	Run-Scoped	"Extract and store ID for later use"
Mixed Requirements	Hybrid	Sync for decisions + Batch for lists + Run for output

Migration Guide

From Synchronous to Batch-Scoped

Before:

"Extract from page 1
Navigate
Extract from page 2
Navigate
Extract from page 3"

After:

"Extract from page 1 with process_async='batch'
Navigate
Extract from page 2 with process_async='batch'
Navigate
Extract from page 3 with process_async='batch'"

Benefit: 3x faster with no other changes

From Batch-Scoped to Run-Scoped

Before:

"Extract complete data with process_async='batch'
Continue workflow (must wait for extraction)"

After:

"Extract complete data with process_async='run'
Continue workflow (extraction in background)"

Benefit: Zero blocking time, maximum parallelism
Caution: Only use if the extraction result is not needed for navigation

Summary

Synchronous: Simple, reliable, blocks until complete. Use for decisions.
Batch-Scoped: Parallel within batch, 3-5x faster for lists. Use for iteration.
Run-Scoped: Maximum parallelism, zero blocking. Use for output-only data.

Choose a mode based on when you need the data, not on the assumption that “async is faster”: the right pattern depends on your workflow structure.

For more information, see:

Extract Prompt - Detailed extraction syntax and examples
Trajectories 101 - How caching amplifies these performance benefits
