
Agent Config

The agent section configures the LLM model, system prompt, tools, and behavior.

Basic Configuration

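A minimal sketch, assuming the section lives under a top-level agent: key (the prompt and tool names are illustrative):

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  system: assistant          # loads prompts/assistant.md
  tools:
    - lookupOrder            # illustrative tool name
  temperature: 0.7
```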

Configuration Options

| Field | Required | Description |
| --- | --- | --- |
| model | Yes | Model identifier or variable reference |
| backupModel | No | Backup model for automatic failover on provider errors |
| system | Yes | System prompt filename (without .md) |
| input | No | Variables to pass to the system prompt |
| tools | No | List of tools the LLM can call |
| mcpServers | No | List of MCP servers to connect (see MCP Servers) |
| skills | No | List of Octavus skills the LLM can use |
| references | No | List of references the LLM can fetch on demand |
| sandboxTimeout | No | Skill sandbox timeout in ms (default: 5 min, max: 1 hour) |
| imageModel | No | Image generation model (enables agentic image generation) |
| webSearch | No | Enable built-in web search tool (provider-agnostic) |
| agentic | No | Allow multiple tool call cycles |
| maxSteps | No | Maximum agentic steps (default: 10) |
| temperature | No | Model temperature (0-2) |
| thinking | No | Extended reasoning level |
| anthropic | No | Anthropic-specific options (tools, skills) |

Models

Specify models in provider/model-id format. Any model supported by the provider's SDK will work.

Supported Providers

| Provider | Format | Examples |
| --- | --- | --- |
| Anthropic | anthropic/{model-id} | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5 |
| Google | google/{model-id} | gemini-3-pro-preview, gemini-3-flash-preview, gemini-2.5-flash |
| OpenAI | openai/{model-id} | gpt-5, gpt-4o, o4-mini, o3, o3-mini, o1 |

Examples

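One model per agent; the commented lines show the same field with other providers (model IDs taken from the table above):

```yaml
# Anthropic
model: anthropic/claude-opus-4-5

# Google
# model: google/gemini-3-flash-preview

# OpenAI
# model: openai/gpt-5
```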

Note: Model IDs are passed directly to the provider SDK. Check the provider's documentation for the latest available models.

Dynamic Model Selection

The model field can also reference an input variable, allowing consumers to choose the model when creating a session:

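A sketch of a variable-backed model field; the exact variable-reference syntax is an assumption, not confirmed by this page:

```yaml
agent:
  model: input.model       # assumed variable-reference syntax
  system: assistant
```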

When creating a session, pass the model:

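A hypothetical sketch of the session input and the runtime format check described below; the object shape and helper names are illustrative, not the real SDK API:

```typescript
// Illustrative session input: the consumer picks the model at session time.
const sessionInput = { model: "openai/gpt-5" };

// Runtime check for the provider/model-id format (the regex is an assumption).
const MODEL_FORMAT = /^[a-z]+\/[\w.-]+$/;

function isValidModelRef(ref: string): boolean {
  return MODEL_FORMAT.test(ref);
}

console.log(isValidModelRef(sessionInput.model)); // true
console.log(isValidModelRef("gpt-5"));            // false (provider prefix missing)
```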

This enables:

  • Multi-provider support — Same agent works with different providers
  • A/B testing — Test different models without protocol changes
  • User preferences — Let users choose their preferred model

The model value is validated at runtime to ensure it's in the correct provider/model-id format.

Note: When using dynamic models, provider-specific options (like anthropic:) may not apply if the model resolves to a different provider.

Backup Model

Configure a fallback model that activates automatically when the primary model encounters a transient provider error (rate limits, outages, timeouts):

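For example, an Anthropic primary with an OpenAI fallback (model IDs are illustrative):

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  backupModel: openai/gpt-5
  system: assistant
```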

When a provider error occurs, the system retries once with the backup model. If the backup also fails, the original error is returned.

Key behaviors:

  • Only transient provider errors trigger fallback — authentication and validation errors are not retried
  • Provider-specific options (like anthropic:) are only forwarded to the backup model if it uses the same provider
  • For streaming responses, fallback only occurs if no content has been sent to the client yet

Like model, backupModel supports variable references:

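A sketch with both fields driven by input variables (the reference syntax is assumed):

```yaml
agent:
  model: input.model              # assumed variable-reference syntax
  backupModel: input.backupModel
```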

Tip: Use a different provider for your backup model (e.g., primary on Anthropic, backup on OpenAI) to maximize resilience against single-provider outages.

System Prompt

The system prompt sets the agent's persona and instructions. The input field controls which variables are available to the prompt — only variables listed in input are interpolated.

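A sketch; the prompt filename and variable name are illustrative, and the source-path syntax on the right-hand side is assumed:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  system: assistant               # loads prompts/assistant.md
  input:
    userName: input.userName      # assumed source-path syntax
```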

Variables in input can come from protocol.input, protocol.resources, or protocol.variables.

Input Mapping Formats

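An illustrative mapping; the source-path prefixes mirror the three documented sources (protocol.input, protocol.resources, protocol.variables), but the exact syntax is an assumption:

```yaml
agent:
  input:
    userName: input.userName            # from protocol.input
    guidelines: resources.guidelines    # from protocol.resources
    region: variables.region            # from protocol.variables
```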

The left side (label) is what the prompt sees. The right side (source) is where the value comes from.

Example

prompts/system.md:

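An illustrative prompt file, assuming a {{variable}}-style interpolation syntax (not confirmed by this page):

```markdown
You are a helpful assistant working with {{userName}}.

Answer questions about {{topic}} concisely, and ask for
clarification when a request is ambiguous.
```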

Agentic Mode

Enable multi-step tool calling:

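A sketch (tool name illustrative); agentic: true turns on the loop and maxSteps caps it:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  system: assistant
  tools:
    - lookupOrder        # illustrative tool name
  agentic: true
  maxSteps: 15
```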

How it works:

  1. LLM receives user message
  2. LLM decides to call a tool
  3. Tool executes, result returned to LLM
  4. LLM decides if more tools needed
  5. Repeat until LLM responds or maxSteps reached

Extended Thinking

Enable extended reasoning for complex tasks:

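For example:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  thinking: high
```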
| Level | Token Budget | Use Case |
| --- | --- | --- |
| low | ~5,000 | Simple reasoning |
| medium | ~10,000 | Moderate complexity |
| high | ~20,000 | Complex analysis |

Thinking content streams to the UI and can be displayed to users.

Skills

Enable Octavus skills for code execution and file generation:

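A sketch; the skill names are illustrative, and sandboxTimeout is the optional override documented above:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  skills:
    - data-analysis          # illustrative skill names
    - report-generator
  sandboxTimeout: 600000     # 10 minutes, in ms
```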

Skills provide provider-agnostic code execution in isolated sandboxes. When enabled, the LLM can execute Python/Bash code, run skill scripts, and generate files.

See Skills for full documentation.

References

Enable on-demand context loading via reference documents:

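A sketch with illustrative reference names, each matching a file in references/:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  references:
    - pricing-guide      # references/pricing-guide.md (names illustrative)
    - api-overview
```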

References are markdown files stored in the agent's references/ directory. When enabled, the LLM can list available references and read their content using octavus_reference_list and octavus_reference_read tools.

See References for full documentation.

Image Generation

Enable the LLM to generate images autonomously:

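For example, pairing a text model with an image model from the table below:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  imageModel: openai/gpt-image-1
```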

When imageModel is configured, the octavus_generate_image tool becomes available. The LLM can decide when to generate images based on user requests. The tool supports both text-to-image generation and image editing/transformation using reference images.

Supported Image Providers

| Provider | Model Types | Examples |
| --- | --- | --- |
| OpenAI | Dedicated image models | gpt-image-1 |
| Google | Gemini native (contains "image") | gemini-2.5-flash-image, gemini-3-flash-image-generate |
| Google | Imagen dedicated (starts with "imagen") | imagen-4.0-generate-001 |

Note: Google has two image generation approaches. Gemini "native" models (containing "image" in the ID) generate images using the language model API with responseModalities. Imagen models (starting with "imagen") use a dedicated image generation API.

Image Sizes

The tool supports three image sizes:

  • 1024x1024 (default) — Square
  • 1792x1024 — Landscape (16:9)
  • 1024x1792 — Portrait (9:16)

Image Editing with Reference Images

Both the agentic tool and the generate-image block support reference images for editing and transformation. When reference images are provided, the prompt describes how to modify or use those images.

| Provider | Models | Reference Image Support |
| --- | --- | --- |
| OpenAI | gpt-image-1 | Yes |
| Google | Gemini native (gemini-*-image) | Yes |
| Google | Imagen (imagen-*) | No |

Agentic vs Deterministic

Use imageModel in agent config when:

  • The LLM should decide when to generate or edit images
  • Users ask for images in natural language

Use generate-image block (see Handlers) when:

  • You want explicit control over image generation or editing
  • Building prompt engineering pipelines
  • Images are generated at specific handler steps

Web Search

Enable the LLM to search the web for current information:

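For example (model ID illustrative):

```yaml
agent:
  model: google/gemini-3-flash-preview
  webSearch: true
```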

When webSearch is enabled, the octavus_web_search tool becomes available. The LLM can decide when to search the web based on the conversation. Search results include source URLs that are emitted as citations in the UI.

This is a provider-agnostic built-in tool — it works with any LLM provider (Anthropic, Google, OpenAI, etc.). For Anthropic's own web search implementation, see Provider Options.

Use cases:

  • Current events and real-time data
  • Fact verification and documentation lookups
  • Any information that may have changed since the model's training

Temperature

Control response randomness:

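For example, a low temperature for factual work:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  temperature: 0.3     # factual, consistent responses
```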

Guidelines:

  • 0 - 0.3: Factual, consistent responses
  • 0.4 - 0.7: Balanced (good default)
  • 0.8 - 1.2: Creative, varied responses
  • > 1.2: Very creative (may be inconsistent)

Provider Options

Enable provider-specific features like Anthropic's built-in tools and skills:

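A sketch; the option names nested under anthropic: are illustrative, not confirmed by this page:

```yaml
agent:
  model: anthropic/claude-opus-4-5
  anthropic:
    tools:
      - web_search     # illustrative option name
```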

Provider options are validated against the model—using anthropic: with a non-Anthropic model will fail validation.

See Provider Options for full documentation.

Thread-Specific Config

Override config for named threads:

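A sketch of per-thread overrides; the threads: key, thread names, and nesting are assumptions based on the description below:

```yaml
threads:
  research:
    model: google/gemini-3-pro-preview
    webSearch: true
  drafting:
    model: anthropic/claude-sonnet-4-5
    backupModel: openai/gpt-5
    skills:
      - report-generator   # must be defined in the protocol's skills: section
```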

Each thread can have its own model, backup model, MCP servers, skills, references, image model, and web search setting. Skills must be defined in the protocol's skills: section. References must exist in the agent's references/ directory. Workers use this same pattern since they don't have a global agent: section.

Full Example

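A sketch combining the documented fields; every concrete value (model IDs, names, numbers) is illustrative, and the input source-path syntax is assumed:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  backupModel: openai/gpt-5
  system: assistant               # loads prompts/assistant.md
  input:
    userName: input.userName      # assumed source-path syntax
  tools:
    - lookupOrder
  skills:
    - data-analysis
  references:
    - pricing-guide
  imageModel: openai/gpt-image-1
  webSearch: true
  agentic: true
  maxSteps: 15
  temperature: 0.7
  thinking: medium
```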