Agent Config

The agent section configures the model, system prompt, tools, and behavior.

Configuration Options

Field	Required	Description
`model`	Yes	Model identifier or variable reference
`system`	Yes	System prompt filename (without .md)
`input`	No	Variables to pass to the system prompt
`tools`	No	List of tools the LLM can call
`skills`	No	List of Octavus skills the LLM can use
`imageModel`	No	Image generation model (enables agentic image generation)
`agentic`	No	Allow multiple tool call cycles
`maxSteps`	No	Maximum agentic steps (default: 10)
`temperature`	No	Model temperature (0-2)
`thinking`	No	Extended reasoning level
`anthropic`	No	Anthropic-specific options (tools, skills)
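
As a reading aid, the fields in the table can be sketched as a TypeScript type with a few checks derived from the descriptions (required fields, the 0-2 temperature range, the default of 10 for `maxSteps`). The names `AgentConfig`, `checkAgentConfig`, and `effectiveMaxSteps`, and the value types of each field, are assumptions made for illustration and are not part of Octavus.

```typescript
// Hypothetical shape mirroring the table above; the value types are assumptions.
type ThinkingLevel = "low" | "medium" | "high";

interface AgentConfig {
  model: string; // required: "provider/model-id" or a variable reference
  system: string; // required: system prompt filename, without ".md"
  input?: unknown; // variables passed to the system prompt (format not modeled here)
  tools?: string[];
  skills?: string[];
  imageModel?: string;
  agentic?: boolean;
  maxSteps?: number; // documented default: 10
  temperature?: number; // documented range: 0-2
  thinking?: ThinkingLevel;
  anthropic?: Record<string, unknown>;
}

// Returns a list of problems; an empty list means these checks passed.
function checkAgentConfig(config: Partial<AgentConfig>): string[] {
  const problems: string[] = [];
  if (!config.model) problems.push("model is required");
  if (!config.system) problems.push("system is required");
  if (config.system && /\.md$/.test(config.system)) {
    problems.push("system should omit the .md extension");
  }
  if (
    config.temperature !== undefined &&
    (config.temperature < 0 || config.temperature > 2)
  ) {
    problems.push("temperature must be between 0 and 2");
  }
  return problems;
}

// Applies the documented default when maxSteps is not set.
function effectiveMaxSteps(config: Partial<AgentConfig>): number {
  return config.maxSteps ?? 10;
}
```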

Models

Specify models in `provider/model-id` format. Any model supported by the provider's SDK will work.

Supported Providers

Provider	Format	Examples
Anthropic	`anthropic/{model-id}`	`claude-opus-4-5`, `claude-sonnet-4-5`, `claude-haiku-4-5`
Google	`google/{model-id}`	`gemini-3-pro-preview`, `gemini-3-flash-preview`, `gemini-2.5-flash`
OpenAI	`openai/{model-id}`	`gpt-5`, `gpt-4o`, `o4-mini`, `o3`, `o3-mini`, `o1`

Examples

yaml

Note: Model IDs are passed directly to the provider SDK. Check the provider's documentation for the latest available models.

Dynamic Model Selection

The `model` field can also reference an input variable, allowing consumers to choose the model when creating a session:

yaml

When creating a session, pass the model:

typescript

This enables:

Multi-provider support: the same agent works with different providers
A/B testing: test different models without protocol changes
User preferences: let users choose their preferred model

The model value is validated at runtime to ensure it is in the correct `provider/model-id` format.
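
To make the format check concrete, here is a minimal sketch of what validating a `provider/model-id` string could look like. It is not the platform's actual validator; `parseModelRef` and `anthropicOptionsApply` are hypothetical helpers, and splitting at the first slash is an assumption.

```typescript
interface ModelRef {
  provider: string;
  modelId: string;
}

// Splits "provider/model-id" at the first slash and rejects values
// with a missing provider, a missing model ID, or no slash at all.
function parseModelRef(value: string): ModelRef {
  const slash = value.indexOf("/");
  if (slash <= 0 || slash === value.length - 1) {
    throw new Error(`Expected provider/model-id, got "${value}"`);
  }
  return {
    provider: value.slice(0, slash),
    modelId: value.slice(slash + 1),
  };
}

// Provider-specific options are only meaningful when the resolved
// provider matches, which matters once the model is chosen dynamically.
function anthropicOptionsApply(value: string): boolean {
  return parseModelRef(value).provider === "anthropic";
}
```

A caller that accepts a model from user input could run a check like this before creating a session, so malformed values fail early with a readable message.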

Note: When using dynamic models, provider-specific options (like `anthropic:`) may not apply if the model resolves to a different provider.

System Prompt

The system prompt sets the agent's persona and instructions. The `input` field controls which variables are available to the prompt: only variables listed in `input` are interpolated.

yaml

Variables in `input` can come from `protocol.input`, `protocol.resources`, or `protocol.variables`.

Input Mapping Formats

yaml

The left side (label) is what the prompt sees. The right side (source) is where the value comes from.
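
The label/source distinction can be illustrated with a small sketch. It assumes a `{{LABEL}}` placeholder syntax and a plain label-to-source object, neither of which is confirmed here; `renderPrompt` is a hypothetical helper, not an Octavus API. The point it shows is that a placeholder is only filled when its label is listed in the mapping.

```typescript
// label -> source name. Both this shape and the {{LABEL}} syntax are assumptions.
type InputMapping = Record<string, string>;

function hasKey(obj: Record<string, string>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function renderPrompt(
  template: string,
  mapping: InputMapping,
  sources: Record<string, string>
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, label: string) => {
    // Labels that are not listed in the mapping are left untouched.
    if (!hasKey(mapping, label)) return placeholder;
    const source = mapping[label];
    // A listed label whose source has no value is also left untouched.
    if (!hasKey(sources, source)) return placeholder;
    return sources[source];
  });
}
```

For example, with a mapping of `COMPANY` to `COMPANY_NAME`, the prompt sees `COMPANY` while the value is read from `COMPANY_NAME`; any other variable stays invisible to the prompt even if it exists in the protocol.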

Agentic Mode

Enable multi-step tool calling:

yaml

How it works:

1. The LLM receives the user message.
2. The LLM decides to call a tool.
3. The tool executes and its result is returned to the LLM.
4. The LLM decides whether more tool calls are needed.
5. Steps 2-4 repeat until the LLM responds or `maxSteps` is reached.
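
The loop described above can be sketched as follows. This is a simplified, synchronous illustration with a scripted stand-in for the model, not the platform's runtime. The names `runAgent`, `Model`, `Tool`, and `ModelTurn` are hypothetical, and counting each model turn as one step is an assumption.

```typescript
type ModelTurn =
  | { kind: "tool_call"; tool: string; args: string }
  | { kind: "response"; text: string };

// Stand-ins for the real model and tools: both are plain functions here.
type Model = (history: string[]) => ModelTurn;
type Tool = (args: string) => string;

interface RunResult {
  text: string | null; // null when maxSteps was reached without a response
  steps: number; // number of model turns taken
}

function runAgent(
  model: Model,
  tools: Record<string, Tool>,
  userMessage: string,
  maxSteps = 10
): RunResult {
  const history: string[] = [`user: ${userMessage}`];
  for (let step = 1; step <= maxSteps; step++) {
    const turn = model(history);
    if (turn.kind === "response") {
      // The model answered instead of calling a tool, so the loop ends.
      return { text: turn.text, steps: step };
    }
    // Execute the requested tool and feed the result back to the model.
    const tool = tools[turn.tool];
    const result = tool ? tool(turn.args) : `unknown tool: ${turn.tool}`;
    history.push(`tool ${turn.tool}: ${result}`);
  }
  return { text: null, steps: maxSteps };
}
```

The `maxSteps` cap is what keeps a model that never stops calling tools from looping indefinitely; in this sketch the caller sees a `null` text in that case.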

Extended Thinking

Enable extended reasoning for complex tasks:

yaml

Level	Token Budget	Use Case
`low`	~5,000	Simple reasoning
`medium`	~10,000	Moderate complexity
`high`	~20,000	Complex analysis

Thinking content streams to the UI and can be displayed to users.
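
For callers that pick a level programmatically, the table can be expressed as a lookup. The figures are the approximate budgets listed above, not exact limits; `APPROX_THINKING_BUDGET` and `levelForBudget` are hypothetical names, and choosing the smallest level that covers an estimate is just one possible policy.

```typescript
type ThinkingLevel = "low" | "medium" | "high";

// Approximate token budgets taken from the table above.
const APPROX_THINKING_BUDGET: Record<ThinkingLevel, number> = {
  low: 5000,
  medium: 10000,
  high: 20000,
};

// Picks the smallest level whose approximate budget covers the estimate;
// anything above the medium budget falls through to "high".
function levelForBudget(estimatedTokens: number): ThinkingLevel {
  if (estimatedTokens <= APPROX_THINKING_BUDGET.low) return "low";
  if (estimatedTokens <= APPROX_THINKING_BUDGET.medium) return "medium";
  return "high";
}
```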

Skills

Enable Octavus skills for code execution and file generation:

yaml

Skills provide provider-agnostic code execution in isolated sandboxes. When enabled, the LLM can execute Python/Bash code, run skill scripts, and generate files.

See Skills for full documentation.

Image Generation

Enable the LLM to generate images autonomously:

yaml

When `imageModel` is configured, the `octavus_generate_image` tool becomes available. The LLM can decide when to generate images based on user requests. The tool supports both text-to-image generation and image editing/transformation using reference images.

Supported Image Providers

Provider	Model Types	Examples
OpenAI	Dedicated image models	`gpt-image-1`
Google	Gemini native (contains "image")	`gemini-2.5-flash-image`, `gemini-3-flash-image-generate`
Google	Imagen dedicated (starts with "imagen")	`imagen-4.0-generate-001`

Note: Google has two image generation approaches. Gemini "native" models (containing "image" in the ID) generate images using the language model API with responseModalities. Imagen models (starting with "imagen") use a dedicated image generation API.
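
The naming rules in the note can be sketched as a small classifier. It assumes `imageModel` uses the same `provider/model-id` format as `model`, which is not stated explicitly above; `classifyImageModel` and the `ImageBackend` labels are hypothetical. One detail worth noticing is that the string "imagen" itself contains "image", so the prefix check has to run first.

```typescript
type ImageBackend = "openai-image" | "google-gemini-native" | "google-imagen";

function classifyImageModel(imageModel: string): ImageBackend {
  const slash = imageModel.indexOf("/");
  if (slash <= 0 || slash === imageModel.length - 1) {
    throw new Error(`Expected provider/model-id, got "${imageModel}"`);
  }
  const provider = imageModel.slice(0, slash);
  const modelId = imageModel.slice(slash + 1);

  if (provider === "openai") return "openai-image";
  if (provider === "google") {
    // Check the "imagen" prefix before the "image" substring,
    // because every Imagen ID would also match the substring test.
    if (/^imagen/.test(modelId)) return "google-imagen";
    if (/image/.test(modelId)) return "google-gemini-native";
  }
  throw new Error(`Unsupported image model: ${imageModel}`);
}
```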

Image Sizes

The tool supports three image sizes:

1024x1024 (default): Square
1792x1024: Landscape (7:4, roughly 16:9)
1024x1792: Portrait (4:7, roughly 9:16)
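
A caller that thinks in terms of orientation rather than pixel dimensions could map one to the other as below. This is only a sketch: `ImageSize`, `sizeFor`, and `aspectRatio` are hypothetical names, and how the size is actually passed to the tool is not shown here.

```typescript
type ImageSize = "1024x1024" | "1792x1024" | "1024x1792";
type Orientation = "square" | "landscape" | "portrait";

const SIZE_BY_ORIENTATION: Record<Orientation, ImageSize> = {
  square: "1024x1024",
  landscape: "1792x1024",
  portrait: "1024x1792",
};

// Falls back to the documented default (square) when no orientation is given.
function sizeFor(orientation?: Orientation): ImageSize {
  return orientation ? SIZE_BY_ORIENTATION[orientation] : "1024x1024";
}

// Width divided by height for a "WIDTHxHEIGHT" size string.
function aspectRatio(size: ImageSize): number {
  const parts = size.split("x");
  return Number(parts[0]) / Number(parts[1]);
}
```

Computing the ratio for the landscape size gives 1.75, slightly narrower than a true 16:9 frame (about 1.78).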

Image Editing with Reference Images

Both the agentic tool and the generate-image block support reference images for editing and transformation. When reference images are provided, the prompt describes how to modify or use those images.