Agent Config
The agent section configures the LLM model, system prompt, tools, and behavior.
Basic Configuration
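A minimal agent section might look like the following sketch. The YAML layout and the example values (model choice, prompt name, tool name) are illustrative assumptions; the field names come from the table below.

```yaml
agent:
  model: anthropic/claude-sonnet-4-5   # provider/model-id
  system: assistant                    # loads prompts/assistant.md (hypothetical filename)
  temperature: 0.7
  tools:
    - search                           # hypothetical tool name
```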
Configuration Options
| Field | Required | Description |
|---|---|---|
| model | Yes | Model identifier or variable reference |
| system | Yes | System prompt filename (without .md) |
| input | No | Variables to pass to the system prompt |
| tools | No | List of tools the LLM can call |
| skills | No | List of Octavus skills the LLM can use |
| imageModel | No | Image generation model (enables agentic image generation) |
| agentic | No | Allow multiple tool call cycles |
| maxSteps | No | Maximum agentic steps (default: 10) |
| temperature | No | Model temperature (0-2) |
| thinking | No | Extended reasoning level |
| anthropic | No | Anthropic-specific options (tools, skills) |
Models
Specify models in provider/model-id format. Any model supported by the provider's SDK will work.
Supported Providers
| Provider | Format | Examples |
|---|---|---|
| Anthropic | anthropic/{model-id} | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5 |
| Google | google/{model-id} | gemini-3-pro-preview, gemini-3-flash-preview, gemini-2.5-flash |
| OpenAI | openai/{model-id} | gpt-5, gpt-4o, o4-mini, o3, o3-mini, o1 |
Examples
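For instance, each of the following is a valid model value (model IDs taken from the provider table above; shown as alternative values, not one document):

```yaml
model: anthropic/claude-opus-4-5   # Anthropic
model: openai/gpt-5                # OpenAI
model: google/gemini-2.5-flash     # Google
```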
Note: Model IDs are passed directly to the provider SDK. Check the provider's documentation for the latest available models.
Dynamic Model Selection
The model field can also reference an input variable, allowing consumers to choose the model when creating a session:
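A sketch of the idea — the variable-reference syntax shown here is an assumption:

```yaml
agent:
  model: "{{input.model}}"   # assumed interpolation syntax for an input variable
  system: assistant          # hypothetical prompt name
```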
When creating a session, pass the model:
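For example, with a hypothetical session-creation payload (the endpoint shape and field names are assumptions, not the documented API):

```yaml
# POST /sessions (illustrative)
input:
  model: openai/gpt-5
```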
This enables:
- Multi-provider support — Same agent works with different providers
- A/B testing — Test different models without protocol changes
- User preferences — Let users choose their preferred model
The model value is validated at runtime to ensure it's in the correct provider/model-id format.
Note: When using dynamic models, provider-specific options (like anthropic:) may not apply if the model resolves to a different provider.
System Prompt
The system prompt sets the agent's persona and instructions. The input field controls which variables are available to the prompt — only variables listed in input are interpolated.
Variables in input can come from protocol.input, protocol.resources, or protocol.variables.
Input Mapping Formats
The left side (label) is what the prompt sees. The right side (source) is where the value comes from.
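A hypothetical mapping sketch (the label names, source names, and YAML layout are all illustrative):

```yaml
agent:
  input:
    userName: protocol.input.name        # label on the left, source on the right
    docs: protocol.resources.handbook
    plan: protocol.variables.plan
```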
Example
prompts/system.md:
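A sketch of what that file might contain — the {{...}} interpolation syntax and variable names are assumptions matching the labels an input mapping would define:

```md
You are a helpful assistant for {{userName}}.

Refer to the following documentation when answering:
{{docs}}
```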
Agentic Mode
Enable multi-step tool calling:
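For example (YAML layout and tool name are illustrative):

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  system: assistant     # hypothetical prompt name
  agentic: true
  maxSteps: 15          # optional; default is 10
  tools:
    - search            # hypothetical tool name
```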
How it works:
1. LLM receives user message
2. LLM decides to call a tool
3. Tool executes, result returned to LLM
4. LLM decides if more tools needed
5. Repeat until LLM responds or maxSteps reached
Extended Thinking
Enable extended reasoning for complex tasks:
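For example (YAML layout is illustrative; the levels come from the table below):

```yaml
agent:
  thinking: medium   # low | medium | high
```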
| Level | Token Budget | Use Case |
|---|---|---|
| low | ~5,000 | Simple reasoning |
| medium | ~10,000 | Moderate complexity |
| high | ~20,000 | Complex analysis |
Thinking content streams to the UI and can be displayed to users.
Skills
Enable Octavus skills for code execution and file generation:
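For example (YAML layout and the skill name are illustrative):

```yaml
agent:
  skills:
    - data-analysis   # hypothetical skill name
```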
Skills provide provider-agnostic code execution in isolated sandboxes. When enabled, the LLM can execute Python/Bash code, run skill scripts, and generate files.
See Skills for full documentation.
Image Generation
Enable the LLM to generate images autonomously:
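For example (YAML layout is illustrative; the model ID comes from the provider table below):

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  imageModel: openai/gpt-image-1   # enables the octavus_generate_image tool
```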
When imageModel is configured, the octavus_generate_image tool becomes available. The LLM can decide when to generate images based on user requests. The tool supports both text-to-image generation and image editing/transformation using reference images.
Supported Image Providers
| Provider | Model Types | Examples |
|---|---|---|
| OpenAI | Dedicated image models | gpt-image-1 |
| Google | Gemini native (contains "image") | gemini-2.5-flash-image, gemini-3-flash-image-generate |
| Google | Imagen dedicated (starts with "imagen") | imagen-4.0-generate-001 |
Note: Google has two image generation approaches. Gemini "native" models (containing "image" in the ID) generate images using the language model API with responseModalities. Imagen models (starting with "imagen") use a dedicated image generation API.
Image Sizes
The tool supports three image sizes:
- 1024x1024 (default) — Square
- 1792x1024 — Landscape (16:9)
- 1024x1792 — Portrait (9:16)
Image Editing with Reference Images
Both the agentic tool and the generate-image block support reference images for editing and transformation. When reference images are provided, the prompt describes how to modify or use those images.
| Provider | Models | Reference Image Support |
|---|---|---|
| OpenAI | gpt-image-1 | Yes |
| Google | Gemini native (gemini-*-image) | Yes |
| Google | Imagen (imagen-*) | No |
Agentic vs Deterministic
Use imageModel in agent config when:
- The LLM should decide when to generate or edit images
- Users ask for images in natural language
Use generate-image block (see Handlers) when:
- You want explicit control over image generation or editing
- Building prompt engineering pipelines
- Images are generated at specific handler steps
Temperature
Control response randomness:
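For example (YAML layout is illustrative):

```yaml
agent:
  temperature: 0.3   # 0-2; lower values give more consistent output
```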
Guidelines:
- 0 - 0.3: Factual, consistent responses
- 0.4 - 0.7: Balanced (good default)
- 0.8 - 1.2: Creative, varied responses
- Above 1.2: Very creative (may be inconsistent)
Provider Options
Enable provider-specific features like Anthropic's built-in tools and skills:
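A sketch of the shape (the nesting under anthropic: and the tool name are assumptions; see Provider Options for the actual schema):

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
  anthropic:
    tools:
      - web_search   # hypothetical Anthropic built-in tool
```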
Provider options are validated against the model—using anthropic: with a non-Anthropic model will fail validation.
See Provider Options for full documentation.
Thread-Specific Config
Override config for named threads:
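A sketch of the idea — the threads: key, the thread name, and the nesting are assumptions:

```yaml
agent:
  model: anthropic/claude-sonnet-4-5
threads:
  summary:                             # hypothetical named thread
    agent:
      model: anthropic/claude-haiku-4-5
      temperature: 0.2
```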