
Overview

Cyberdesk provides flexible async extraction modes that let you optimize workflow performance based on when and how you need extracted data. Understanding these patterns is key to building fast, efficient workflows.
Async extraction works seamlessly with Cyberdesk’s trajectory caching system. During trajectory replay, extract prompts re-execute to capture fresh data, so you get both the speed benefits of caching and the flexibility of dynamic data extraction.

The Three Processing Modes

Synchronous (Default)

When: process_async not set, false, or None
Behavior: Extraction blocks until complete
Processing Time: 2-5 seconds per extraction
Use When:
  • You need the result immediately for the next decision
  • Extracting a single value
  • The extraction determines workflow branching
  • Simple workflows with < 5 total extractions
Example:
"Navigate to order details. Take a screenshot with extract_prompt='Extract 
the order status as one word: Pending, Processing, or Shipped' to determine 
if the order needs manual intervention."
Timing Diagram:
Agent Step 1 → Extract (3s) → Agent Step 2 → Extract (3s) → Agent Step 3
Total: 6 seconds of extraction time

Batch-Scoped Async

When: process_async=true or process_async="batch"
Behavior: All extractions in the current tool call batch run in parallel, complete before next agent step
Processing Time: ~3 seconds for entire batch (no matter how many extractions)
Use When:
  • Scrolling through lists or paginated content
  • Extracting from multiple sequential views
  • Extractions don’t depend on each other
  • Results should be ready for next agent decision
  • Want to store runtime variables from extractions before next agent turn
Special Capability: Extraction agent has access to upsert_runtime_values tool - can store specific values AND provide observations
Example:
"Scroll through the product catalog and extract all data:
- Take screenshot with extract_prompt='Extract visible products as JSON' 
  and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible products as JSON' 
  and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible products as JSON' 
  and process_async='batch'

All extractions complete in parallel before next agent turn."
Timing Diagram:
Agent Step 1: [Screenshot + Extract] → [Scroll] → [Screenshot + Extract] → [Scroll] → [Screenshot + Extract]
              └─────────── All 3 extractions run in parallel (3s total) ──────────┘
                                                   ↓
                                              Agent Step 2
Total: ~3 seconds for all extractions combined
Performance Benefit: 3-5x faster than synchronous when extracting from multiple views
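The parallelism described above can be illustrated with a small simulation. This is only a sketch: `extract` and `run_batch` are hypothetical names, not Cyberdesk SDK calls, and the sleep is a scaled-down stand-in for the ~3 seconds quoted for a real extraction.

```python
import asyncio
import time

EXTRACT_SECONDS = 0.1  # scaled-down stand-in for the ~3 s quoted above


async def extract(view: str) -> str:
    """Hypothetical extraction call; simulated here with a sleep."""
    await asyncio.sleep(EXTRACT_SECONDS)
    return f"products from {view}"


async def run_batch(views: list[str]) -> tuple[list[str], float]:
    """Start every extraction in the batch, then wait for all of them."""
    started = time.perf_counter()
    results = await asyncio.gather(*(extract(view) for view in views))
    # Only after gather returns would the next agent step begin.
    return list(results), time.perf_counter() - started


batch_results, batch_elapsed = asyncio.run(
    run_batch(["view 1", "view 2", "view 3"])
)
print(batch_results, f"{batch_elapsed:.2f}s")
```

Because the three simulated extractions overlap, the elapsed time stays close to one extraction rather than three, which is the effect the timing diagram shows.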

Run-Scoped Async

When: process_async="run"
Behavior: Extraction runs completely in background for entire workflow, only awaited at final output generation
Processing Time: Non-blocking, completes while workflow continues
Use When:
  • Large data extractions not needed for navigation
  • Extraction is only for final output
  • You want maximum parallelism
  • Need to set runtime variables from extraction that won’t be used until later
Special Capability: Extraction agent has access to upsert_runtime_values tool - can store specific values AND provide observations
Example:
"Navigate to analytics dashboard. Take screenshot with extract_prompt='Extract 
complete analytics data: all metrics, charts, KPIs, trends as detailed JSON' 
and process_async='run'

Continue with report generation workflow. The analytics extraction will complete 
in the background and be included automatically in final output."
Timing Diagram:
Agent Step 1 → [Start Extract (non-blocking)] → Agent Step 2 → Agent Step 3 → Agent Step 4
                        ↓ (running in background)
               [Extraction completes while workflow continues]
                        └─────────────────────────────────────→ Final Output Generation
                                                                 (waits for completion)
Total: 0 seconds of blocking time, extraction happens in parallel with workflow
Performance Benefit: Maximum parallelism, zero blocking time during workflow execution
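As a rough illustration of the run-scoped behavior, the sketch below launches a simulated extraction as a background task and only awaits it at the end. The function names and the event log are assumptions made for this example; they are not part of Cyberdesk's API, and the sleeps merely stand in for real work.

```python
import asyncio

events: list[str] = []


async def background_extraction() -> str:
    """Hypothetical run-scoped extraction; the sleep stands in for model time."""
    await asyncio.sleep(0.2)
    events.append("extraction finished")
    return "analytics data"


async def agent_step(number: int) -> None:
    """Hypothetical agent step that does not depend on the extraction."""
    events.append(f"agent step {number}")
    await asyncio.sleep(0.01)


async def workflow() -> str:
    # Launch the extraction without awaiting it, so later steps are not blocked.
    task = asyncio.create_task(background_extraction())
    events.append("extraction launched")
    for number in (2, 3, 4):
        await agent_step(number)
    # Final output generation is the only point that waits for the result.
    result = await task
    events.append("final output")
    return result


final_result = asyncio.run(workflow())
print(events)
```

The event log shows the agent steps proceeding while the extraction is still running, with the result collected just before the final output.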

Comparing the Modes

| Aspect | Synchronous | Batch-Scoped | Run-Scoped |
|---|---|---|---|
| Blocking | ✅ Blocks each time | ✅ Blocks at end of batch | ❌ Non-blocking |
| Parallelism | ❌ Sequential | ✅ Within batch | ✅ Across entire run |
| Best For | Single extraction, decisions | Lists, pagination | Large output data |
| Result Available | Immediately | Before next step | At final output |
| Runtime Variables | ❌ No | ✅ Yes (via upsert_runtime_values) | ✅ Yes (via upsert_runtime_values) |
| Performance | Slowest (N × 3s) | Fast (3s per batch) | Fastest (0s blocking) |

Performance Examples

Scenario: Extract from 10 Pages of Data

Synchronous:
Page 1: Extract (3s) → Navigate
Page 2: Extract (3s) → Navigate
...
Page 10: Extract (3s)
Total: 30 seconds of extraction time
Batch-Scoped:
Batch 1: [Page 1 Extract, Navigate, Page 2 Extract, Navigate, Page 3 Extract]
         All 3 extractions in parallel: 3s
Batch 2: [Page 4 Extract, Navigate, Page 5 Extract, Navigate, Page 6 Extract]
         All 3 extractions in parallel: 3s
Batch 3: [Page 7 Extract, Navigate, Page 8 Extract, Navigate, Page 9 Extract]
         All 3 extractions in parallel: 3s
Batch 4: [Page 10 Extract]
         1 extraction: 3s
Total: 12 seconds of extraction time (2.5x faster)
Run-Scoped (if extraction not needed for navigation):
Start: Launch extraction for all pages → Continue workflow
       All 10 extractions run in background while workflow completes
Total: 0 seconds of blocking time (all extraction time overlaps with the rest of the workflow)
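The synchronous and batch-scoped totals above follow from simple arithmetic. The sketch below reproduces them, assuming 3 seconds per extraction and three extractions per batch as in the scenario; both numbers are illustrative and will vary in practice.

```python
import math

SECONDS_PER_EXTRACTION = 3.0  # assumed average, taken from the scenario above


def sync_seconds(extractions: int) -> float:
    """Each extraction blocks in turn, so the time grows linearly."""
    return extractions * SECONDS_PER_EXTRACTION


def batch_seconds(extractions: int, batch_size: int) -> float:
    """Each batch costs roughly one extraction, however many it contains."""
    return math.ceil(extractions / batch_size) * SECONDS_PER_EXTRACTION


pages = 10
sync_total = sync_seconds(pages)                 # 10 × 3 s
batch_total = batch_seconds(pages, batch_size=3)  # 4 batches × 3 s
speedup = sync_total / batch_total
print(sync_total, batch_total, speedup)  # prints 30.0 12.0 2.5
```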

Advanced Pattern: Hybrid Extraction

Combine multiple modes for optimal performance:
"Navigate to customer portal.

Step 1 - Quick Decision (Synchronous):
Take screenshot with extract_prompt='Extract customer account status: 
Active, Suspended, or Closed' to determine workflow path.

Step 2 - List Processing (Batch-Scoped):
If status is Active, extract recent transactions:
- Take screenshot with extract_prompt='Extract visible transactions as JSON' 
  and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible transactions as JSON' 
  and process_async='batch'
- Repeat for all pages

Step 3 - Detailed Analytics (Run-Scoped):
Take screenshot with extract_prompt='Extract complete account analytics: 
spending patterns, category breakdown, payment history' and process_async='run'

Continue with report generation. The analytics extraction completes in background."
Result:
  • Fast decision making (synchronous where needed)
  • Efficient list processing (batch-scoped parallelism)
  • Zero blocking for large data (run-scoped for final output)

Async Extractions with Runtime Variables

Both batch-scoped and run-scoped extractions have a powerful capability: the extraction agent can call upsert_runtime_values to make specific extracted values available as runtime variables. The difference is in timing:
  • Batch-Scoped: Variables available before next agent step (good for iterative workflows)
  • Run-Scoped: Variables available when extraction completes (good for background processing)

The Agent Loop

When using process_async (either "batch" or "run"), the extraction becomes a proper agent that can:
  1. Call upsert_runtime_values to store specific fields
  2. Provide final observations as text
  3. Do both: Store values AND provide observations

System Prompt (Any Async Mode)

When using batch or run scoped extraction, the extraction agent receives:
You are an extraction assistant. You have two capabilities:

1. Call upsert_runtime_values to store specific extracted values that should 
   be available throughout the workflow as {{key_name}} placeholders.

2. Provide final observations as text describing what you see on screen.

You can do BOTH: store specific values AND provide observations, or just do 
one or the other. Your final text message will be recorded as the extraction result.

Example: Batch-Scoped with Runtime Variables (Available Before Next Step)

"Scroll through order list:
- Take screenshot with extract_prompt='Extract order_id as runtime variable 
  using upsert_runtime_values, then describe order status and customer name' 
  and process_async='batch'
- Use {{order_id}} to determine if this order needs special handling
- If yes, click on order
- Repeat for next order"
What Happens:
  1. Screenshot taken
  2. Extraction agent analyzes screenshot
  3. Calls upsert_runtime_values({order_id: "ORD-123"})
  4. Provides observation about status and customer
  5. All batch extractions complete in parallel
  6. Variables available before next agent step - can be used immediately
  7. Agent proceeds with {{order_id}} available
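
To make the placeholder mechanics concrete, here is a minimal local model of a runtime-variable store and `{{key}}` substitution. It is a sketch under stated assumptions: this `upsert_runtime_values` is a stand-in written for illustration, and the `render` helper and the choice to leave unknown placeholders untouched are hypothetical, not a description of how Cyberdesk resolves placeholders internally.

```python
import re

runtime_values: dict[str, str] = {}

# Matches {{key_name}} placeholders such as {{order_id}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def upsert_runtime_values(values: dict[str, object]) -> None:
    """Local stand-in for the tool the extraction agent calls."""
    runtime_values.update({key: str(value) for key, value in values.items()})


def render(template: str) -> str:
    """Fill in known placeholders; leave unknown ones untouched (assumption)."""
    return PLACEHOLDER.sub(
        lambda match: runtime_values.get(match.group(1), match.group(0)),
        template,
    )


upsert_runtime_values({"order_id": "ORD-123"})
rendered = render("Open order {{order_id}} for {{customer_name}}")
print(rendered)  # prints: Open order ORD-123 for {{customer_name}}
```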

Example: Run-Scoped with Runtime Variables (Available When Extraction Completes)

"Open invoice {invoice_number}.

Take screenshot with extract_prompt='Extract the invoice_date and total_amount 
and store them as runtime variables using upsert_runtime_values. Then provide 
a detailed description of all line items, tax breakdown, and payment terms.' 
and process_async='run'

Continue with payment workflow. The {{invoice_date}} and {{total_amount}} 
variables will be available once extraction completes in background, and the 
full invoice details will be in the final output."
What Happens:
  1. Extraction starts in background
  2. Workflow continues with other tasks
  3. Extraction agent analyzes screenshot (in background)
  4. Calls upsert_runtime_values({invoice_date: "2024-01-15", total_amount: 1250.00})
  5. Extraction agent then provides detailed observation about line items, taxes, etc.
  6. Variables become available once extraction completes
  7. Both the runtime variables and observation text are included in final output
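
One practical consequence of the steps above is that a run-scoped variable may not exist yet when a later step tries to use it. The simulation below illustrates that timing with a plain dictionary and a background task; the names are hypothetical and the observation string is a placeholder, so treat it as a model of the behavior rather than Cyberdesk's implementation.

```python
import asyncio

runtime_values: dict[str, object] = {}


async def background_extraction() -> str:
    """Simulated run-scoped extraction that stores values when it finishes."""
    await asyncio.sleep(0.05)  # stand-in for analysing the screenshot
    runtime_values.update({"invoice_date": "2024-01-15", "total_amount": 1250.00})
    return "observation text (placeholder)"


async def workflow() -> tuple[bool, bool, str]:
    task = asyncio.create_task(background_extraction())
    await asyncio.sleep(0)  # yield once so the background task starts
    # A step running now cannot rely on the variable being set.
    available_early = "invoice_date" in runtime_values
    observation = await task  # e.g. at final output generation
    available_late = "invoice_date" in runtime_values
    return available_early, available_late, observation


available_early, available_late, observation = asyncio.run(workflow())
print(available_early, available_late)  # prints: False True
```

If a later step depends on such a value at a specific point, batch-scoped or synchronous extraction is the safer choice.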

Example: Pure Observation (No Runtime Variables)

"Take screenshot with extract_prompt='Describe all customer information 
visible on screen including contact details, order history, and preferences. 
Format as detailed JSON.' and process_async='run'

This large extraction runs in background and will be in final output."

Example: Multiple Runtime Variables

"Take screenshot with extract_prompt='Extract customer_id, order_total, and 
expected_delivery_date as runtime variables using upsert_runtime_values. Then 
provide a summary of order contents, shipping address, and special instructions.' 
and process_async='run'

Use {{customer_id}} and {{order_total}} in subsequent workflow steps once available."

Choosing the Right Mode

Use this decision tree:
Do you need the extraction result to make the next decision?
├─ YES → Use Synchronous
│         Example: "Extract status to determine next action"

└─ NO → Is this a list/multiple views?
    ├─ YES → Use Batch-Scoped Async
    │         Example: "Scroll and extract from each page"

    └─ NO → Is this only for final output?
        ├─ YES → Use Run-Scoped Async
        │         Example: "Extract analytics for report"

        └─ NO → Does it need to set runtime variables?
            ├─ YES → Use Run-Scoped Async (with upsert_runtime_values)
            │         Example: "Extract invoice_number for later use"

            └─ NO → Use Synchronous if < 3 extractions
                    Use Batch-Scoped if >= 3 extractions
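
The decision tree can also be written as a small helper. This is a sketch that mirrors the branches above: the function and parameter names are hypothetical, and the return value is the process_async setting to use (None meaning the synchronous default).

```python
from typing import Optional


def choose_process_async(
    needed_for_next_decision: bool,
    multiple_views: bool,
    final_output_only: bool,
    sets_runtime_variables: bool,
    extraction_count: int,
) -> Optional[str]:
    """Return the process_async value suggested by the decision tree."""
    if needed_for_next_decision:
        return None  # synchronous: leave process_async unset
    if multiple_views:
        return "batch"
    if final_output_only:
        return "run"
    if sets_runtime_variables:
        return "run"  # values are stored via upsert_runtime_values
    # Fallback branch: few extractions stay synchronous, more go to batch.
    return None if extraction_count < 3 else "batch"


status_check = choose_process_async(True, False, False, False, 1)
paginated_list = choose_process_async(False, True, False, False, 10)
report_data = choose_process_async(False, False, True, False, 1)
print(status_check, paginated_list, report_data)  # prints: None batch run
```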

Real-World Patterns

Healthcare: Patient Record Processing

"Navigate to patient {patient_mrn}.

Quick Check (Synchronous):
Take screenshot with extract_prompt='Extract patient age as number' to 
determine if pediatric workflow is needed.

Medical History (Batch-Scoped):
Navigate through each section of medical history:
- Take screenshot with extract_prompt='Extract diagnosis history' and process_async='batch'
- Go to medications tab
- Take screenshot with extract_prompt='Extract current medications' and process_async='batch'
- Go to allergies tab
- Take screenshot with extract_prompt='Extract allergies' and process_async='batch'

Comprehensive Record (Run-Scoped):
Take screenshot of complete patient chart with extract_prompt='Extract full 
patient record including all demographics, vitals, lab results, imaging reports, 
and visit notes. Store patient_mrn as runtime variable.' and process_async='run'

Continue with documentation workflow using {{patient_mrn}} for file naming."

E-Commerce: Inventory Extraction

"Log into inventory system.

Category Check (Synchronous):
Take screenshot with extract_prompt='Count visible categories' to determine 
navigation depth.

Per-Category Extraction (Batch-Scoped):
For each category:
- Navigate to category
- Take screenshot with extract_prompt='Extract all products as JSON' and process_async='batch'
- Repeat

Full Catalog Analytics (Run-Scoped):
Take screenshot with extract_prompt='Extract complete inventory analytics: 
stock levels, trends, low-stock alerts, reorder recommendations. Store 
total_products and low_stock_count as runtime variables.' and process_async='run'

Generate inventory report using {{total_products}} and {{low_stock_count}} in report header."

Finance: Transaction Processing

"Open account {account_number}.

Balance Check (Synchronous):
Take screenshot with extract_prompt='Extract current balance as number' to 
verify sufficient funds.

Recent Transactions (Batch-Scoped):
Scroll through transaction history:
- Take screenshot with extract_prompt='Extract visible transactions' and process_async='batch'
- Scroll down
- Take screenshot with extract_prompt='Extract visible transactions' and process_async='batch'
- Repeat

Account Analysis (Run-Scoped):
Take screenshot with extract_prompt='Extract complete account analysis: 
spending categories, monthly trends, unusual patterns. Store account_type 
and risk_level as runtime variables.' and process_async='run'

Continue with report generation using {{account_type}} and {{risk_level}} for classification."

Best Practices

1. Start with Synchronous, Optimize Later

Begin with simple synchronous extraction, then optimize bottlenecks:
# Initial (works, but slow)
"Extract field A (sync) → Extract field B (sync) → Extract field C (sync)"

# Optimized (3x faster)
"Extract all three fields with process_async='batch' in one batch"

2. Use Batch-Scoped for Lists

Any time you’re iterating (scrolling, clicking next, navigating pages), use batch-scoped:
"For each item in list:
- Extract data with process_async='batch'
- Navigate to next
All extractions complete in parallel"

3. Use Run-Scoped for Final Output Only

If the extracted data doesn’t influence navigation or decisions, make it run-scoped:
"Take screenshot with extract_prompt='Extract complete report data' 
and process_async='run'

Continue workflow - extraction completes in background"

4. Combine with Other Extraction Methods

Use copy_to_clipboard for fast copyable text, and extract_prompt for vision-based extraction:
"Extract order data:
- Triple-click Order ID, use copy_to_clipboard with key 'order_id' (instant)
- Take screenshot with extract_prompt='Extract order items' and process_async='batch' (parallel)
- Take screenshot with extract_prompt='Extract shipping analytics' and process_async='run' (background)

Optimal performance using all available methods"

5. Set Runtime Variables from Run-Scoped Extractions

When you need specific values mid-workflow but also want large extractions:
"Take screenshot with extract_prompt='Extract customer_id and order_total 
as runtime variables. Then extract complete order details, history, and 
preferences as detailed JSON.' and process_async='run'

The {{customer_id}} and {{order_total}} become available once extraction completes, 
while full details are in final output."

Performance Metrics

Based on typical workflow patterns:
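| Scenario | Synchronous | Batch-Scoped | Run-Scoped | Speedup (Batch-Scoped vs. Synchronous) |
|---|---|---|---|---|
| 1 extraction | 3s | 3s | 0s* | 1x |
| 5 extractions (sequential) | 15s | 3-6s | 0s* | 2.5-5x |
| 10 extractions (sequential) | 30s | 9-12s | 0s* | 2.5-3.3x |
| 20 extractions (sequential) | 60s | 18-24s | 0s* | 2.5-3.3x |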
*Run-scoped shows 0s blocking time but still processes in background; workflow continues unblocked

Common Patterns Summary

| Pattern | Mode | Example |
|---|---|---|
| Decision Making | Synchronous | "Extract status to determine next step" |
| List Processing | Batch-Scoped | "Scroll and extract from each page" |
| Final Output Data | Run-Scoped | "Extract analytics for report" |
| Runtime Variables | Run-Scoped | "Extract and store ID for later use" |
| Mixed Requirements | Hybrid | Sync for decisions + Batch for lists + Run for output |

Migration Guide

From Synchronous to Batch-Scoped

Before:
"Extract from page 1
Navigate
Extract from page 2
Navigate
Extract from page 3"
After:
"Extract from page 1 with process_async='batch'
Navigate
Extract from page 2 with process_async='batch'
Navigate
Extract from page 3 with process_async='batch'"
Benefit: 3x faster with no other changes

From Batch-Scoped to Run-Scoped

Before:
"Extract complete data with process_async='batch'
Continue workflow (must wait for extraction)"
After:
"Extract complete data with process_async='run'
Continue workflow (extraction in background)"
Benefit: Zero blocking time, maximum parallelism
Caution: Only use if extraction result not needed for navigation

Summary

  • Synchronous: Simple, reliable, blocks until complete. Use for decisions.
  • Batch-Scoped: Parallel within batch, 3-5x faster for lists. Use for iteration.
  • Run-Scoped: Maximum parallelism, zero blocking. Use for output-only data.
Choose based on when you need the data, not just "async is faster": the right pattern depends on your workflow structure.