
Workers

Workers are agents designed for task-based execution. Unlike interactive agents that handle multi-turn conversations, workers execute a sequence of steps and return an output value.

When to Use Workers

Workers are ideal for:

  • Background processing - Long-running tasks that don't need conversation
  • Composable tasks - Reusable units of work called by other agents
  • Pipelines - Multi-step processing with structured output
  • Parallel execution - Tasks that can run independently

Use interactive agents instead when:

  • Conversation is needed - Multi-turn dialogue with users
  • Persistence matters - State should survive across interactions
  • Session context - User context needs to persist

Worker vs Interactive

| Aspect | Interactive | Worker |
| --- | --- | --- |
| Structure | triggers + handlers + agent | steps + output |
| LLM Config | Global agent: section | Per-thread via start-thread |
| Invocation | Fire a named trigger | Direct execution with input |
| Session | Persists across triggers (24h TTL) | Single execution |
| Result | Streaming chat | Streaming + output value |

Protocol Structure

Workers use a simpler protocol structure than interactive agents:

yaml

settings.json

Workers are identified by the format field:

json

Key Differences

No Global Agent Config

Interactive agents have a global agent: section that configures a main thread. Workers don't have this - every thread must be explicitly created via start-thread:

yaml

This gives workers flexibility to use different models, tools, skills, and settings at different stages.

Steps Instead of Handlers

Workers use steps: instead of handlers:. Steps execute sequentially, like handler blocks:

yaml

Output Value

Workers can return an output value to the caller:

yaml

The output field references a variable declared in variables:. If omitted, the worker completes without returning a value.
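
As a rough mental model of sequential steps and the optional output value, the TypeScript sketch below runs a list of steps in order against a shared variable map and then looks up the output variable. The names `Step`, `WorkerDefinition`, and `runWorker` are hypothetical and not part of any Octavus package. The real runtime evaluates protocol blocks, which this sketch replaces with plain synchronous functions.

```typescript
// Hypothetical model of a worker, for illustration only (not the Octavus SDK).
type Variables = Record<string, unknown>;

// A step reads and writes shared variables. Real steps would be protocol
// blocks such as add-message or next-message; plain functions stand in here.
type Step = (vars: Variables) => void;

interface WorkerDefinition {
  variables: string[]; // names declared under `variables:`
  steps: Step[];
  output?: string; // name of a declared variable, or omitted
}

function runWorker(def: WorkerDefinition, input: Variables): unknown {
  // The output field must reference a declared variable.
  if (def.output !== undefined && def.variables.indexOf(def.output) === -1) {
    throw new Error(`output "${def.output}" is not a declared variable`);
  }
  const vars: Variables = { ...input };
  for (const step of def.steps) {
    step(vars); // strictly sequential: each step sees earlier results
  }
  // With no output field, the worker completes without returning a value.
  return def.output === undefined ? undefined : vars[def.output];
}
```

A definition with `output: "TITLE"` returns whatever the steps stored under `TITLE`, while a definition without `output` returns `undefined`.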

Available Blocks

Workers support the same blocks as handlers:

| Block | Purpose |
| --- | --- |
| start-thread | Create a named thread with LLM configuration |
| add-message | Add a message to a thread |
| next-message | Generate LLM response |
| tool-call | Call a tool deterministically |
| set-resource | Update a resource value |
| serialize-thread | Convert thread to text |
| generate-image | Generate an image from a prompt variable |

start-thread (Required for LLM)

Every thread must be initialized with start-thread before using next-message:

yaml

All LLM configuration goes here:

| Field | Description |
| --- | --- |
| thread | Thread name (defaults to block name) |
| model | LLM model to use |
| system | System prompt filename (required) |
| input | Variables for system prompt |
| tools | Tools available in this thread |
| skills | Octavus skills available in this thread |
| mcpServers | MCP servers available in this thread |
| imageModel | Image generation model |
| webSearch | Enable built-in web search tool |
| thinking | Extended reasoning level (low/medium/high/max), "off", or variable reference |
| cache | Prompt caching mode: auto (default), extended, or off |
| temperature | Model temperature (0-2), "off", or variable reference |
| maxSteps | Maximum tool call cycles (enables agentic behavior if > 1), or variable reference |
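
The value constraints in the table can be expressed as a small validator. The TypeScript sketch below covers only the fields whose allowed values are stated above. `StartThreadConfig`, `validateStartThread`, and `isAgentic` are hypothetical names, variable references are left out because their syntax is not shown in this section, and treating an omitted `maxSteps` as non-agentic is an assumption.

```typescript
// Hypothetical subset of the start-thread fields; not an Octavus type.
interface StartThreadConfig {
  system?: string; // system prompt filename (required)
  thinking?: string;
  cache?: string;
  temperature?: number | "off";
  maxSteps?: number;
}

const THINKING_LEVELS = ["low", "medium", "high", "max", "off"];
const CACHE_MODES = ["auto", "extended", "off"];

// Returns a list of problems; an empty list means the literal values are valid.
function validateStartThread(cfg: StartThreadConfig): string[] {
  const errors: string[] = [];
  if (!cfg.system) {
    errors.push("system is required");
  }
  if (cfg.thinking !== undefined && THINKING_LEVELS.indexOf(cfg.thinking) === -1) {
    errors.push("thinking must be low, medium, high, max, or off");
  }
  if (cfg.cache !== undefined && CACHE_MODES.indexOf(cfg.cache) === -1) {
    errors.push("cache must be auto, extended, or off");
  }
  if (typeof cfg.temperature === "number" && (cfg.temperature < 0 || cfg.temperature > 2)) {
    errors.push("temperature must be between 0 and 2");
  }
  return errors;
}

// Assumption: an omitted maxSteps does not enable agentic behavior.
function isAgentic(cfg: StartThreadConfig): boolean {
  return cfg.maxSteps !== undefined && cfg.maxSteps > 1;
}
```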

Simple Example

A worker that generates a title from a summary:

yaml

Advanced Example

A worker with multiple threads, tools, and agentic behavior:

yaml

MCP Servers

Workers can declare and use MCP servers, just like interactive agents. Define them in mcpServers: and reference them in start-thread:

yaml

Workers resolve their own MCP connections independently - they don't inherit MCP servers from a parent interactive agent. Remote MCP connections are project-scoped, so a worker in the same project automatically has access to the same OAuth connections.

See MCP Servers for full documentation.

Workers can use Octavus skills, image generation, and web search, configured per-thread via start-thread:

yaml

Workers define their own skills independently - they don't inherit skills from a parent interactive agent. Each thread gets its own sandbox scoped to only its listed skills.

Skills with execution: device work the same way in workers as in interactive agents - the skill runs on the agent's computer. Workers resolve their device execution independently, so a worker can use device skills even if the parent agent does not.

See Skills for full documentation.

Tool Handling

Workers support the same tool handling as interactive agents:

  • Server tools - Handled by tool handlers you provide
  • Client tools - Pause execution, return tool request to caller

typescript

See Server SDK Workers for tool handling details.
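
The two tool paths can be pictured as one dispatch function. The TypeScript sketch below is illustrative only: `ServerToolHandler`, `ToolOutcome`, and `handleToolCall` are hypothetical names, and treating "no registered handler" as the signal for a client tool is an assumption, not documented SDK behavior.

```typescript
// Hypothetical types for illustration; the real SDK surface may differ.
type ServerToolHandler = (args: Record<string, unknown>) => unknown;

type ToolOutcome =
  | { status: "completed"; result: unknown }
  | { status: "paused"; request: { toolName: string; args: Record<string, unknown> } };

function handleToolCall(
  toolName: string,
  args: Record<string, unknown>,
  handlers: Record<string, ServerToolHandler>,
): ToolOutcome {
  // Server tool: a handler you provided runs and execution continues.
  if (Object.prototype.hasOwnProperty.call(handlers, toolName)) {
    return { status: "completed", result: handlers[toolName](args) };
  }
  // Client tool (assumed here to be any tool without a handler):
  // execution pauses and the request goes back to the caller.
  return { status: "paused", request: { toolName, args } };
}
```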

Stream Events

Workers emit the same events as interactive agents, plus worker-specific events:

| Event | Description |
| --- | --- |
| worker-start | Worker execution begins |
| worker-result | Worker completes (includes output) |

All standard events (text-delta, tool calls, etc.) are also emitted.
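
A consumer of the stream might fold these events into a single result. In the TypeScript sketch below, only the event names `worker-start`, `worker-result`, and `text-delta` come from this page. The payload fields (`output`, `delta`) and the `collectWorkerRun` helper are assumptions made for illustration.

```typescript
// Event shapes are assumed; only the `type` values are taken from the docs.
type StreamEvent =
  | { type: "worker-start" }
  | { type: "worker-result"; output?: unknown }
  | { type: "text-delta"; delta: string };

interface WorkerRun {
  text: string; // concatenated text deltas
  output: unknown; // value carried by worker-result, if any
  finished: boolean; // true once worker-result has been seen
}

function collectWorkerRun(events: StreamEvent[]): WorkerRun {
  const run: WorkerRun = { text: "", output: undefined, finished: false };
  for (const event of events) {
    switch (event.type) {
      case "worker-start":
        break; // nothing to accumulate at start
      case "text-delta":
        run.text += event.delta;
        break;
      case "worker-result":
        run.output = event.output;
        run.finished = true;
        break;
    }
  }
  return run;
}
```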

Calling Workers from Interactive Agents

Interactive agents can call workers in two ways:

  1. Deterministically - Using the run-worker block
  2. Agentically - LLM calls worker as a tool

Worker Declaration

First, declare workers in your interactive agent's protocol:

yaml

run-worker Block

Call a worker deterministically from a handler:

yaml

LLM Tool Invocation

Make workers available to the LLM:

yaml

The LLM can then call workers as tools during conversation.

Display Modes

The display setting controls how worker execution appears to users. The default for workers is stream.

| Mode | Behavior |
| --- | --- |
| hidden | Worker runs silently. No events reach the client - no UIWorkerPart is created. |
| name | Shows a running/done indicator with the worker name. No nested content (text, tool calls, reasoning) is forwarded. |
| description | Shows a running/done indicator with the worker description. No nested content is forwarded. |
| stream | Full visibility. All nested events are forwarded - text, reasoning, tool calls, sources, files. Worker input is included on start. |

Progressive input streaming: When a worker with display: stream is invoked agentically (LLM calls it as a tool), the UIWorkerPart appears in the UI immediately as the LLM starts generating the worker's arguments. The worker input streams progressively into the worker part, the same way text tokens stream into a text part. Once input finishes, worker execution begins and nested content flows into the same worker part. There is no intermediate tool card.

name and description modes: Worker input is stripped from the worker-start event (it may contain sensitive data). Only the running/done status and the final worker-result are forwarded to the parent stream. Use these for workers where the user only needs to know the worker ran, not what it did internally.

hidden mode: The worker executes normally but produces no UI presence at all. Use for internal workers that are implementation details.
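
The forwarding rules for the four modes can be summarized as a filter over worker events. The TypeScript sketch below is a simplified model: `WorkerEvent` and `forwardEvent` are hypothetical names, and the `input` field on `worker-start` is an assumed payload shape.

```typescript
type DisplayMode = "hidden" | "name" | "description" | "stream";

// Assumed event shape: a type plus arbitrary payload fields.
interface WorkerEvent {
  type: string;
  input?: unknown;
  [key: string]: unknown;
}

// Returns the event to forward to the parent stream, or null to drop it.
function forwardEvent(mode: DisplayMode, event: WorkerEvent): WorkerEvent | null {
  if (mode === "hidden") {
    return null; // no events reach the client
  }
  if (mode === "stream") {
    return event; // full visibility, including worker input
  }
  // name / description: only status events, with input stripped on start.
  if (event.type === "worker-start") {
    const stripped: WorkerEvent = { ...event };
    delete stripped.input; // input may contain sensitive data
    return stripped;
  }
  if (event.type === "worker-result") {
    return event;
  }
  return null; // nested content is not forwarded
}
```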

Tool Mapping

Map parent tools to worker tools when the worker needs access to your tool handlers:

yaml

When the worker calls its search tool, your web-search handler executes.
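
Conceptually, the mapping is a lookup from the worker's tool name to one of the parent's handlers. The TypeScript sketch below models that lookup. `ParentToolHandler`, `ToolMapping`, and `resolveWorkerTool` are hypothetical names, and the handler signature is an assumption rather than the SDK's.

```typescript
// Hypothetical handler signature for illustration.
type ParentToolHandler = (args: Record<string, unknown>) => unknown;

// Keys are worker tool names, values are parent tool names.
type ToolMapping = Record<string, string>;

function resolveWorkerTool(
  workerTool: string,
  mapping: ToolMapping,
  parentHandlers: Record<string, ParentToolHandler>,
): ParentToolHandler {
  if (!Object.prototype.hasOwnProperty.call(mapping, workerTool)) {
    throw new Error(`worker tool "${workerTool}" is not mapped`);
  }
  const parentTool = mapping[workerTool];
  if (!Object.prototype.hasOwnProperty.call(parentHandlers, parentTool)) {
    throw new Error(`no handler registered for parent tool "${parentTool}"`);
  }
  return parentHandlers[parentTool];
}

// Example: the worker's `search` tool is served by the `web-search` handler.
const exampleHandlers: Record<string, ParentToolHandler> = {
  "web-search": (args) => `results for ${String(args.query)}`,
};
const exampleMapping: ToolMapping = { search: "web-search" };
const searchResult = resolveWorkerTool("search", exampleMapping, exampleHandlers)({ query: "cats" });
```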
