Writing Effective Prompts

When creating workflows, clarity and specificity are paramount. Your prompts should be:
  • Unambiguous: Leave no room for interpretation
  • Specific: Include exact details like file paths, button names, or expected values
  • Actionable: Use clear, directive language

Example of Good vs Poor Prompting

Poor: "Download the report" (names no file, location, or button)

Understanding Variables

Cyberdesk supports three types of variables in your workflows:

Input Variables: {variable}

  • When defined: Before the workflow starts
  • Purpose: Pass dynamic data like patient names, dates, or IDs
  • Example: {patient_name}, {report_date}

Runtime Variables: {{variable}}

  • When defined: During workflow execution (ONLY by focused actions)
  • Purpose: Store values discovered during the workflow for later use
  • Example: {{extracted_invoice_id}}, {{found_file_path}}
  • Important: Only focused_action can set a runtime variable; once set, it can be used anywhere in the workflow

Key Difference:
  • Input variables are replaced before the agent sees them
  • Runtime variables are set during execution and used by subsequent steps
  • Sensitive variables are passed verbatim (as aliases) to tools; their plaintext values are never exposed to the agent and are resolved only during actual computer actions

Sensitive Variables: {$variable}

  • When defined: At run creation (via sensitive_input_values)
  • Purpose: Handle secrets like passwords, SSNs, tokens without exposing plaintext to the agent or logs
  • Example: {$password}, {$ssn}
  • Behavior:
    • Kept as {$variable} in prompts and passed verbatim to tools
    • Stored securely in a third‑party vault (Basis Theory) only for the duration of the run
    • Never logged, never shown on the dashboard, and never sent to LLMs
    • Resolved to plaintext only at the last mile (e.g., when typing), then deleted from the vault after run completion
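
The three placeholder syntaxes can be told apart mechanically. The sketch below is only an illustration of the rules described above, not Cyberdesk's implementation: the function names, the regular expressions, and the assumption that variable names are limited to letters, digits, and underscores are all hypothetical. It fills in {variable} placeholders and deliberately leaves {{variable}} and {$variable} untouched.

```python
import re

# Assumed identifier rule (not confirmed by the docs): letters, digits, underscores.
NAME = r"[A-Za-z_][A-Za-z0-9_]*"
INPUT_VAR = re.compile(r"(?<!\{)\{(" + NAME + r")\}(?!\})")  # {name}
RUNTIME_VAR = re.compile(r"\{\{(" + NAME + r")\}\}")         # {{name}}
SENSITIVE_VAR = re.compile(r"\{\$(" + NAME + r")\}")         # {$name}


def classify_variables(prompt):
    """Return the variable names found in a prompt, grouped by kind."""
    return {
        "input": sorted(set(INPUT_VAR.findall(prompt))),
        "runtime": sorted(set(RUNTIME_VAR.findall(prompt))),
        "sensitive": sorted(set(SENSITIVE_VAR.findall(prompt))),
    }


def substitute_inputs(prompt, input_values):
    """Fill in {name} placeholders only; {{name}} and {$name} pass through."""
    def replace(match):
        name = match.group(1)
        if name not in input_values:
            raise KeyError(f"missing input variable: {name}")
        return str(input_values[name])

    return INPUT_VAR.sub(replace, prompt)


prompt = "Log in with {$password}, open {patient_name}, and reuse {{extracted_invoice_id}}."
print(classify_variables(prompt))
print(substitute_inputs(prompt, {"patient_name": "Jane Doe"}))
```

A check like this can be run over a prompt before a run is created, to confirm that every input variable has a value and that no secret was written as a plain {variable} by mistake.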

Utilizing Prompt Images

Images can anchor your instructions with visual context. Add screenshots alongside your text in the dashboard editor to help the agent recognize ambiguous icons, menu locations, or tricky UI flows (especially in legacy apps).
  • Place images near the paragraph they illustrate for best results
  • Use images as guidance – the agent will still take fresh screenshots during execution
  • Keep images focused on the relevant UI area; include multiple images if needed
At runtime, the agent can retrieve prompt images with the grab_reference_image tool, referenced by filename (tail only). After analyzing any reference or zoomed image, the agent should take a fresh full screenshot before issuing coordinate actions. When specifying zoom regions, leave a small margin so labels and edges aren’t cropped.
Use your OS’s native Markup/annotation tools to add red circles and arrows around the exact UI elements you want the AI to focus on. Simple visual callouts dramatically improve recognition.
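
The margin advice for zoom regions can be sketched as a small helper. This is a minimal illustration only: the function name, the (x1, y1, x2, y2) pixel convention, and the default margin are assumptions, not the format Cyberdesk's zoom tooling expects.

```python
def pad_region(x1, y1, x2, y2, screen_width, screen_height, margin=12):
    """Grow a box by `margin` pixels on each side, clamped to the screen bounds."""
    return (
        max(0, x1 - margin),
        max(0, y1 - margin),
        min(screen_width, x2 + margin),
        min(screen_height, y2 + margin),
    )


# A label near the left edge of an assumed 1920x1080 screen.
print(pad_region(5, 200, 310, 240, 1920, 1080))
```

Clamping matters for elements near a screen edge, where a naive margin would produce negative or out-of-range coordinates.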

Understanding Cached Tools

Cyberdesk’s intelligent caching system allows workflows to run deterministically while still capturing dynamic outputs. To achieve this, you must explicitly prompt the agent to use specific tools at the right moments.
Why explicit tool prompting? When a workflow is cached, the agent replays the exact same actions. By explicitly telling the agent when to use these specialized tools, you ensure that even during cached runs, you’ll still get the dynamic outputs you need (extracted data, exported files, screenshots, etc.).
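
One way to picture this is a toy model in which recorded steps are replayed verbatim, while steps marked as dynamic are evaluated fresh on every run. The sketch below is purely conceptual: the step dictionaries, the replay function, and the read_screen callback are hypothetical and say nothing about how Cyberdesk's cache is actually built.

```python
def replay(recorded_steps, read_screen):
    """Replay recorded steps; only steps marked dynamic are evaluated fresh."""
    performed, outputs = [], {}
    for step in recorded_steps:
        if step["kind"] == "static":
            # Replayed exactly as recorded on the first run.
            performed.append(step["action"])
        else:
            # Stands in for an explicitly prompted tool that reads the live screen.
            outputs[step["name"]] = read_screen(step["name"])
    return performed, outputs


steps = [
    {"kind": "static", "action": "click 'Invoices'"},
    {"kind": "dynamic", "name": "invoice_id"},
    {"kind": "static", "action": "click 'Export'"},
]

run1 = replay(steps, lambda name: "INV-1001")
run2 = replay(steps, lambda name: "INV-1002")
print(run1)
print(run2)
```

Both runs perform the same clicks, but each returns the invoice ID that was on screen at the time. Without the dynamic step, the second run would have nothing fresh to report.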

The Essential Workflow Tools

1. Focused Action

Capture dynamic content or make context-aware decisions during cached runs.

When to use

  • Selecting different items from a list each run
  • Extracting dynamic data that changes between runs
  • Making decisions based on screen content
In prompts, refer to it as "focused_action"

2. Mark File for Export

Export files from the remote machine after workflow completion.

When to use

  • Downloading generated reports
  • Extracting processed files
  • Saving workflow outputs
In prompts, refer to it as "mark_file_for_export"

3. Save Screenshot as Run Attachment

Capture important moments as screenshots accessible after the run.

When to use

  • Documenting visual confirmation
  • Capturing patient X-rays or medical images
  • Recording transaction confirmations
In prompts, refer to it as "save_screenshot_as_run_attachment"

4. Grab Reference Image

Retrieve a user-provided prompt image during execution by filename (tail only).

When to use

  • Provide visual context to disambiguate UI elements
  • Help the agent recognize specific screens, dialogs, or popups
In prompts, refer to it as "grab_reference_image". After analyzing the image, take a fresh full screenshot before issuing coordinate actions. See the dedicated page for details.

5. Screen Extraction (just ask to extract)

Extract text/data directly from the current screen without a focused action. Simply tell the agent what to extract (and optionally the format). Under the hood, the agent takes a screenshot and reads it with a strong vision model.

When to use

  • Simple reads where you just need on‑screen values (IDs, totals, statuses)
  • Structured output from a specific region: ask for JSON and combine with zoom_bounding_box
  • Extracting data from multiple screens in parallel (e.g., scrolling through a long list)
Prompting tip: Just say “extract X from the screen” (optionally “as JSON”). For dynamic decisions and runtime variables, prefer focused_action.
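
When you ask for JSON, it is worth validating the result before downstream code relies on it. The following is a minimal sketch under stated assumptions: raw_output is a made-up example string, the key names are hypothetical, and how the extracted text reaches your code depends on your integration.

```python
import json


def parse_extraction(raw, required_keys):
    """Parse a JSON object from an extraction step and check that its keys exist."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Hypothetical output for a prompt such as:
# "Extract the invoice ID, total, and status from the screen as JSON."
raw_output = '{"invoice_id": "INV-1001", "total": "42.50", "status": "Paid"}'
result = parse_extraction(raw_output, ["invoice_id", "total", "status"])
print(result["invoice_id"])
```

Naming the expected keys in the prompt itself makes a check like this far more likely to pass consistently.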

Parallel Extraction for Multiple Screens

When you need to extract data from multiple screens or areas, requesting parallel extraction can make it 3-5x faster. The agent processes all extractions simultaneously while continuing to navigate.

How to use it: Add phrases like “in parallel”, “do parallel extraction”, or “extract all data in parallel” to your prompt.

Examples:
  1. Scrolling through lists
    Extract all customer data from this table by scrolling through the entire list. 
    Do this in parallel.
    
  2. Multiple tabs/sections
    Navigate to each tab (Overview, Details, History, Notes) and extract all 
    information from each tab in parallel.
    
  3. Grid of items
    For each product card visible on screen, extract the name, price, and 
    availability. Then click 'Next Page' and repeat. Do all extractions in parallel.
    
  4. Form with multiple pages
    Go through each page of this multi-step form and extract all the filled 
    values in parallel. Click 'Next' between pages.
    
  5. Dashboard with multiple widgets
    Extract data from all dashboard widgets in parallel - the sales chart, 
    customer count, revenue total, and recent activities list.
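
As a rough mental model of why this saves time, the sketch below captures each screen, hands the slow extraction to a background worker, and moves on without waiting. It is an analogy using Python's standard library, not Cyberdesk's implementation: extract and navigate_and_extract are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def extract(screenshot):
    """Stand-in for a slow vision-model call on one captured screenshot."""
    return f"data from {screenshot}"


def navigate_and_extract(pages):
    """Capture each page, queue its extraction, and keep navigating."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = []
        for page in pages:
            screenshot = f"screenshot of {page}"  # capture, then move on
            futures.append(pool.submit(extract, screenshot))
        # Collect in submission order so results line up with the pages.
        return [future.result() for future in futures]


results = navigate_and_extract(["Overview", "Details", "History", "Notes"])
print(results)
```

The navigation loop never blocks on an extraction, so total time approaches that of the navigation plus the slowest single extraction rather than the sum of all of them.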
    

6. Execute Terminal Command

Run PowerShell commands on the Windows machine.

When to use

  • System administration tasks
  • File operations
  • API calls or data processing
In prompts, refer to it as "execute_terminal_command"

7. Declare Task Failed

Terminate workflow execution when failure conditions are met.

When to use

  • Error handling
  • Validation failures
  • Preventing unnecessary continuation
In prompts, refer to it as "declare_task_failed"
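
Because prompts refer to tools by the exact names quoted in this section, a small pre-flight check can catch misspellings before a run. This is a hypothetical helper, not part of Cyberdesk: it uses Python's difflib to flag snake_case tokens that closely resemble one of the six names above without matching it, and the 0.8 similarity cutoff is an arbitrary choice.

```python
import difflib
import re

# The six names this section says to use in prompts.
TOOL_NAMES = [
    "focused_action",
    "mark_file_for_export",
    "save_screenshot_as_run_attachment",
    "grab_reference_image",
    "execute_terminal_command",
    "declare_task_failed",
]


def find_tool_typos(prompt):
    """Map each near-miss snake_case token to the tool name it resembles."""
    typos = {}
    for token in re.findall(r"\b[a-z]+(?:_[a-z]+)+\b", prompt):
        if token in TOOL_NAMES:
            continue
        close = difflib.get_close_matches(token, TOOL_NAMES, n=1, cutoff=0.8)
        if close:
            typos[token] = close[0]
    return typos


print(find_tool_typos("Use focus_action to pick the row, then mark_file_for_export the PDF."))
```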

Quick Examples

Using Focused Action

"When you reach the patient list, use focused_action to select the patient 
whose name matches {patient_name} and extract their ID number"

Exporting Files

"After generating the report, use mark_file_for_export to export the file 
located at C:\Reports\output.pdf"

Capturing Screenshots

"Once the payment confirmation appears, use save_screenshot_as_run_attachment 
to capture the confirmation screen as 'payment_confirmation.png'"

Running Terminal Commands

"Use execute_terminal_command to run 'Get-ChildItem C:\Exports\*.pdf | 
ConvertTo-Json' to list all PDF files in the exports folder"
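
If you consume that command's output in your own code, note that ConvertTo-Json emits a single JSON object rather than a one-element array when only one file matches, and emits nothing when no files match. The sketch below normalizes all three cases to a list. The sample strings are trimmed, made-up stand-ins for real Get-ChildItem output, and how the output text reaches your code is an assumption.

```python
import json


def parse_file_listing(raw):
    """Normalize ConvertTo-Json output to a list of dicts."""
    if not raw.strip():
        return []  # no matching files: the pipeline produced no output
    data = json.loads(raw)
    return data if isinstance(data, list) else [data]


single = '{"Name": "output.pdf", "Length": 20480}'
many = '[{"Name": "a.pdf", "Length": 1}, {"Name": "b.pdf", "Length": 2}]'

print([item["Name"] for item in parse_file_listing(single)])
print([item["Name"] for item in parse_file_listing(many)])
print(parse_file_listing(""))
```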

Handling Failures

"If the login screen shows 'Account Locked' or 'Invalid License', 
use declare_task_failed to terminate with message 'Unable to access system'"

Next Steps

For detailed information about each tool, including advanced usage patterns and real-world examples, explore the individual tool documentation pages.