Agent

The Agent resource is how you declare an AI agent on Pai. There are two execution modes, selected by spec.longRunning and the presence of spec.triggers[]:

| Mode | spec.longRunning | spec.triggers[] | When to use |
| --- | --- | --- | --- |
| Long-running | true | (rejected) | Image-backed Deployment running continuously: web UIs, daemons, anything persistent. Requires spec.harness: custom + spec.image. |
| Session-spawning | false (default) | optional | The Agent is a definition. Each run is a Session (one-shot or recurring). Use pai run, the UI Chat tab, POST /agents/{name}/sessions, or configure triggers (webhook, telegram, slack, email, schedule) for spawning. |

The harness that runs inside each pod is set by spec.harness: default | claude | codex | custom. default is today's built-in Pai harness with all the file/web/email tools. custom requires spec.image.

pai create -f agent.yaml                  # create
pai apply -f agent.yaml                   # create or update
pai run <name> --agent <ref> --task "…"   # spawn a Session against an existing Agent
pai chat <name>                           # interactive chat (resumes the caller's Idle Session)
pai get agents                            # list Agents
pai get sessions [--agent <name>]         # list Sessions
pai migrate-agents [--dry-run]            # rewrite v1 Agents into v1beta2 shape
pai delete agent <name>                   # cascades to all Sessions via ownerReferences

Migrating from v1? The legacy fields (spec.type: service | task, spec.agentDefinition, spec.schedule, spec.webhook, spec.conversational, spec.mcpServers, spec.managedAgentDefinitions) are still accepted on input through the cutover but read-only — write the v1beta2 fields documented below. pai migrate-agents rewrites every Agent in-place; it's idempotent so a re-run after a fix is safe.

Complete example per mode — use these as references for what a full YAML can contain:

apiVersion: pai.io/v1
kind: Agent
metadata:
  name: dev-assistant
  namespace: team-a
spec:
  longRunning: true
  harness: custom
  image: ghcr.io/pai-platform/openclaw:latest
  replicas: 1
  runAsUser: 1000

  models:
    - google/gemini-2.5-flash
    - anthropic/claude-sonnet-4-6

  providers:
    - github-writer
    - telegram-bot

  guards:
    - binding: prompt-guard-default
      scan: {prompts: true, responses: false}
      enforcement: enforce

  inbound:
    port: 3000
    allowCIDRs: ["0.0.0.0/0"]
    customDomain: dev.helppa.io

  expose:
    - urlPath: /reports
      directory: /data/reports

  volumes:
    - name: workspace
      mountPath: /home/node/workspace
      size: "10Gi"

  configFiles:
    - path: /home/node/.openclaw/openclaw.json
      content: |
        {"channels": ["web", "telegram"], "defaultModel": "gemini-flash"}

  env:
    - name: LOG_LEVEL
      value: "info"
    - name: SLACK_WEBHOOK
      secretRef: {name: slack-credentials, key: webhook-url}

  tokens:
    maxPerDay: 1_000_000
    maxPerRequest: 16_000

  rateLimits:
    maxRequestsPerMinute: 60
    maxConcurrentRequests: 4
    maxCostPerDayUSD: 20.0

  resources:
    requests: {cpu: "500m", memory: "512Mi"}
    limits: {cpu: "2", memory: "2Gi"}

  autoscaling:
    minReplicas: 1
    maxReplicas: 5
    metrics:
      - type: tokenRate
        targetValuePerReplica: 500

  filesystem:
    readOnlyPaths: [/home/node/.openclaw/openclaw.json, /etc/cron.d]
    writablePaths: [/home/node/.npm]

  ops:
    instructions: |
      Run `openclaw status` to check health.
      Logs are at /data/logs/app.log.
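
For comparison, a session-spawning Agent might look like the sketch below. It uses only fields documented later on this page; the name weekly-report, the prompt text, and the chosen values are illustrative assumptions, not a shipped example.

```yaml
# Hypothetical session-spawning Agent; names and values are illustrative.
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: weekly-report          # hypothetical name
  namespace: team-a
spec:
  # longRunning defaults to false, so this Agent is only a definition;
  # each run is a separate Session.
  harness: default

  models:
    - anthropic/claude-sonnet-4-6

  system: |
    You are a senior data analyst. Prefer clarity over cleverness.

  tools:
    - type: bash
    - type: web_search

  triggers:
    - schedule: "0 9 * * 1"    # every Monday at 9:00am UTC

  # Session defaults, inherited by every spawned Session.
  idleTimeoutMinutes: 30
  ttlSecondsAfterFinished: 604800   # 7 days
```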

The sections below group fields by which modes they apply to:

  • Shared fields work on any Agent.
  • Harness fields apply to session-spawning Agents (longRunning: false, the default). Ignored on long-running Agents.
  • Long-running-only fields apply when longRunning: true.
  • Session-defaults are inherited by every Session this Agent spawns and can be overridden per-Session.

Shared fields (all modes)

These fields work on any Agent regardless of mode. When set on a template Agent (no type), task agents that reference it inherit them; a task agent's own value overrides the template per-field.

models

Which LLMs the agent can use. Every call routes through the Pai Gateway, which injects the provider's API key — the agent never sees credentials.

models:
  - anthropic/claude-sonnet-4-6
  - google/gemini-2.5-flash

Written as <modelprovider>/<model-id>. The first entry is the primary (default) model; later entries are fallbacks used when the primary is rate-limited or over budget.

providers

External services the agent is allowed to reach (GitHub, AWS, Telegram, MCP servers, etc.). Credentials are injected by the Pai sidecar — the agent never holds real API keys.

providers:
  - github-writer
  - name: aws-s3
    policy: {allow: ["s3:GetObject"]}
    scope: {resources: ["arn:aws:s3:::reports/*"]}

| Entry | Description |
| --- | --- |
| <name> (string) | Attach a Provider by name with its full defaults |
| {name, policy, scope} (object) | Attach inline with narrower permissions; can only tighten, never widen, the Provider's own rules |

guards

Attach prompt-injection / jailbreak scanners to this agent's LLM traffic. Each prompt, response, or tool result is classified at the gateway before it reaches the model, and blocked or logged based on the configured enforcement mode.

guards:
  - binding: prompt-guard-default
    scan:
      prompts: true
      responses: false
    enforcement: enforce

| Field | Description |
| --- | --- |
| binding | Name of a GuardBinding resource |
| scan.prompts | Scan user-role messages (default true) |
| scan.responses | Scan assistant output (audit-only when streaming) |
| scan.toolResults.tools | Tool names whose results are scanned; ["*"] scans all. Omit to disable |
| enforcement | Optional override; may only tighten (audit → enforce) |

See the prompt-injection guard guide. Setting spec.guards[] on a task Agent fully replaces any guards declared on its referenced template Agent.
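
The example above leaves tool-result scanning off. A sketch of turning it on for selected tools follows; it assumes the entries in scan.toolResults.tools use the same identifiers as the built-in tool types (web_fetch, web_search), which this page does not state explicitly.

```yaml
guards:
  - binding: prompt-guard-default
    scan:
      prompts: true
      responses: false
      toolResults:
        # Assumed to match built-in tool type names; use ["*"] to scan every tool's results.
        tools: ["web_fetch", "web_search"]
    enforcement: enforce
```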

tokens

Cap how many LLM tokens this agent can consume per day and per individual request. The gateway tracks usage in real time and rejects excess calls with HTTP 429.

tokens:
  maxPerDay: 50000
  maxPerRequest: 8192

| Field | Description |
| --- | --- |
| maxPerDay | Hard daily token cap across all of this agent's LLM calls |
| maxPerRequest | Per-request context window limit |

rateLimits

Cap request rate, concurrency, and daily USD spend for this agent. Complements tokens with shape-based controls (bursts vs. cost vs. total tokens).

rateLimits:
  maxRequestsPerMinute: 60
  maxConcurrentRequests: 4
  maxCostPerDayUSD: 5.00

| Field | Description |
| --- | --- |
| maxRequestsPerMinute | Fixed 60-second window request cap |
| maxConcurrentRequests | Simultaneous in-flight LLM calls |
| maxCostPerDayUSD | Hard daily USD spend cap, computed from the gateway's price table |

All three are independent and optional — 0 or unset means unlimited for that axis. Rejections are HTTP 429.
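
Because each axis is independent, a spec can cap only one of them. The fragment below is a minimal sketch of a cost-only limit; the dollar amount is an arbitrary illustration.

```yaml
rateLimits:
  maxCostPerDayUSD: 5.00      # hard daily spend cap
  maxRequestsPerMinute: 0     # 0 = unlimited on this axis (same as omitting it)
  # maxConcurrentRequests is omitted, so concurrency is unlimited too
```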

env

Extra environment variables injected into the agent container. Values can be literal strings or pulled from a Secret.

env:
  - name: LOG_LEVEL
    value: "debug"
  - name: SLACK_WEBHOOK
    secretRef:
      name: slack-credentials
      key: webhook-url

| Field | Description |
| --- | --- |
| name | Environment variable name |
| value | Literal string value |
| secretRef.name | Secret to read from |
| secretRef.key | Key within the Secret |

Personal vars (git author, email, etc.) can be set once with pai config set-env and are auto-injected into every agent you create.

secretRef exposes the value to the agent

Unlike Provider credentials (injected by the sidecar) and ModelProvider API keys (injected by the gateway), a Secret mounted via env[].secretRef is readable by the agent container. Use this only for secrets the agent genuinely needs to see itself — webhook URLs, library-specific API keys Pai doesn't proxy, etc. See Exposing a secret to the agent.

configFiles

Seed files into the agent at startup — config files, credentials, reference data, templates. Useful for injecting small pieces of configuration without rebuilding the image.

configFiles:
  - path: /home/node/.myapp/config.json
    content: |
      { "setting": "value" }

| Field | Description |
| --- | --- |
| path | Absolute path inside the agent |
| content | Inline file contents |

Files are only written if they don't already exist — edits made by the agent at runtime are preserved across restarts.

files

Mount versioned PaiFile resources into the agent. Useful for large or frequently-updated reference data (datasets, knowledge bases, large prompts) that don't belong in a config map.

files:
  - name: customer-data
    mountPath: /data/customers.csv
    readOnly: true

| Field | Description |
| --- | --- |
| name | PaiFile resource name |
| mountPath | Absolute path inside the agent |
| version | Specific version to pin (default: current at creation time) |
| readOnly | Mount read-only (default false) |

No size limit. Agents can edit mounted files at runtime and call register_file to publish a new version.

skills

Attach reusable capability bundles — Markdown instructions, helper scripts, JSON data, or anything else agents can use at runtime.

skills:
  - name: coding-guidelines
    mountPath: /skills

| Field | Description |
| --- | --- |
| name | Skill resource name |
| mountPath | Parent directory; files land at {mountPath}/{skill-name}/ |

Service agents need their runtime to consume the mounted files (e.g. OpenClaw's skills.load.extraDirs); harness-backed agents load them automatically.

volumes

Persistent storage that survives agent restarts. Use for workspace files, model caches, scratch space, or anything the agent needs to keep between runs.

volumes:
  - name: workspace
    mountPath: /data
    size: "10Gi"

| Field | Description |
| --- | --- |
| name | Volume identifier |
| mountPath | Mount path inside the agent |
| size | Storage size (e.g. "1Gi", "50Gi", "1Ti") |

resources

Reserve CPU and memory for the agent, and set hard limits so one runaway agent can't starve the rest of the cluster.

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "2"
    memory: "2Gi"

| Field | Description |
| --- | --- |
| requests.cpu | Guaranteed CPU (e.g. "500m" = 0.5 cores) |
| requests.memory | Guaranteed memory (e.g. "512Mi") |
| limits.cpu | CPU ceiling; the agent is throttled if exceeded |
| limits.memory | Memory ceiling; the agent is killed if exceeded |

Harness fields (task + template)

These fields drive the Pai agent harness — the built-in runtime used by task and template agents. Service agents bring their own runtime inside their container image and ignore these fields.

system

The system prompt the harness injects at agent start — defines the agent's persona, goals, and constraints.

system: |
  You are a senior data analyst. Prefer clarity over cleverness.

tools

Pick which built-in tools the harness exposes to the model, and declare any custom caller-executed tools.

tools:
  - type: bash
  - type: web_search
  - type: custom
    name: approve_pr
    description: Ask the user to approve a pull request
    inputSchema: {type: object, properties: {pr_url: {type: string}}}

| Field | Description |
| --- | --- |
| type | bash, read, write, edit, glob, grep, web_fetch, web_search, send_email, screenshot, or custom |
| enabled | Set to false to disable a built-in tool (default true) |
| name | Tool name (required when type: custom) |
| description | Description shown to the model (required for custom tools) |
| inputSchema | JSON Schema for the tool's arguments (required for custom tools) |

Omit tools entirely to enable all built-ins. Custom tools are caller-executed — the harness emits agent.custom_tool_use events and waits for user.custom_tool_result.
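
The enabled flag is documented in the table but not shown in the example. The fragment below is a sketch of using it; how an explicit list interacts with built-ins that are not listed is not spelled out on this page, so treat the combination as an assumption and check the result in status.toolSummary.

```yaml
tools:
  - type: read
  - type: grep
  - type: bash
    enabled: false    # explicitly disable this built-in
```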

packages

Language-level packages to install into the harness container before the agent starts — pandas, axios, whatever your tools need.

packages:
  pip: [pandas, matplotlib]
  npm: ["@anthropic-ai/sdk"]
  apt: [poppler-utils]

| Key | Package manager |
| --- | --- |
| pip | Python (pip install) |
| npm | Node (npm install -g) |
| apt | Debian / Ubuntu system (apt-get install) |
| go | Go (go install) |
| gem | Ruby (gem install) |
| cargo | Rust (cargo install) |

For apt, prefer baking packages into a custom harness image in production — install-at-startup is fine for iteration but adds latency per agent start.
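
The remaining keys presumably follow the same list-of-names shape. The sketch below assumes each entry is passed through to the corresponding install command unchanged (so go entries carry a module path and version suffix); that pass-through behaviour and the package choices are assumptions, not documented on this page.

```yaml
packages:
  go: ["golang.org/x/tools/cmd/goimports@latest"]   # assumed to be passed to `go install` as-is
  gem: [rubocop]
  cargo: [ripgrep]
```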

triggers + idleTimeout

Make the agent conversational — wake it on inbound events (Telegram, email, Slack, webhook) instead of running to completion. Between events the agent sits idle and exits after idleTimeout of inactivity.

triggers:
  - telegram:
      chatId: "-100123456"
      provider: telegram-bot
  - email:
      address: bot+linear@pairun.dev
      provider: office365-imap
  - webhook:
      allowCIDRs: ["0.0.0.0/0"]
idleTimeout: 15m

| Trigger | Sub-fields | Description |
| --- | --- | --- |
| telegram | chatId, provider | Wake on a Telegram message |
| email | address, provider | Wake on email to a plus-tagged address |
| slack | channel, provider | Wake on a Slack channel message |
| webhook | allowCIDRs, linearSignatureSecret | Wake on an HTTP POST to a unique URL |
| idleTimeout | duration | How long to wait after last activity before exiting (e.g. 15m, 1h). Default 15m. |

See the webhooks and email guides for full setup.
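
The example above does not include a slack trigger. A sketch using the sub-fields from the table follows; the channel value, its format (name versus ID), and the Provider name slack-bot are hypothetical.

```yaml
triggers:
  - slack:
      channel: "#ops-alerts"   # hypothetical; whether Pai expects a name or a channel ID is an assumption
      provider: slack-bot      # hypothetical Provider name
idleTimeout: 1h
```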

managedAgents

Grant this agent the ability to read and modify specific other Agents in its namespace. Used for self-improving setups where one agent tunes another agent's system prompt or tools.

managedAgents: [tuner-agent]

Pai grants narrow, per-name permissions — the agent can only touch the Agents listed here, nothing else.


Long-running-only fields (spec.longRunning: true)

image

The container image Pai runs for this long-running Agent. You bring the runtime; Pai handles the surrounding infrastructure.

longRunning: true
harness: custom
image: ghcr.io/pai-platform/openclaw:latest

replicas, runAsUser, command

Basic container controls: how many replicas to run, which Unix UID to run as, and whether to override the image's entrypoint.

| Field | Default | Description |
| --- | --- | --- |
| replicas | 1 | Number of instances |
| runAsUser | 65532 | UID to run the container as. Cannot be 0; root is rejected at reconcile time |
| command | | Override the container entrypoint + command |
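
A sketch combining the three fields on a long-running Agent. The command value is hypothetical, and writing it as a list of strings is an assumption about the field's shape.

```yaml
longRunning: true
harness: custom
image: ghcr.io/pai-platform/openclaw:latest
replicas: 2
runAsUser: 1000                   # any non-zero UID; 0 (root) is rejected
command: ["node", "server.js"]    # hypothetical entrypoint override; list form is assumed
```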

inbound

Expose a container port for external HTTPS traffic. Pai provisions a load balancer, DNS, and a TLS cert — the agent just needs to serve HTTP on the declared port.

inbound:
  port: 3000
  allowCIDRs: ["0.0.0.0/0"]
  customDomain: app.helppa.io

| Field | Description |
| --- | --- |
| port | Container port the agent listens on |
| allowCIDRs | CIDR allowlist (default [] = deny all) |
| customDomain | Custom domain; Pai provisions DNS + TLS automatically |

Without a custom domain, agents get a random <hostname>.pairun.dev URL on status.url.

expose

Serve files from the agent's filesystem at a public URL path. Useful for sharing reports, dashboards, or generated artifacts without standing up a web server inside the agent.

expose:
  - urlPath: /reports
    directory: /data/reports

| Field | Description |
| --- | --- |
| urlPath | Path under the agent's public hostname |
| directory | Directory inside the agent to serve |

Files appear at https://<hostname>.pairun.dev/<urlPath>.

autoscaling

Scale the number of replicas up and down based on load. Pick a metric (token rate, an HTTP endpoint returning a queue depth, etc.) and a per-replica target; Pai does the rest.

autoscaling:
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: tokenRate
      targetValuePerReplica: 500

| Field | Default | Description |
| --- | --- | --- |
| minReplicas | 1 | Minimum replicas (0 = paused) |
| maxReplicas | | Required. Maximum replicas |
| scaleUpCooldownSeconds | 60 | Minimum seconds between scale-up events |
| scaleDownCooldownSeconds | 300 | Minimum seconds between scale-down events |
| scaleDownStabilizationWindowSeconds | 120 | Rolling window before allowing a scale-down |
| pollIntervalSeconds | 30 | How often metrics are evaluated |

Metric types:

| Type | Description |
| --- | --- |
| tokenRate | Tokens per minute observed by the gateway |
| http | Poll any URL returning a JSON number (JIRA, SQS, custom queues) |

metrics:
  - type: http
    url: "https://your-queue.example.com/depth"
    jsonPath: "queue.depth"
    targetValuePerReplica: 10

View live scaling status with pai scaling <name> (--watch to tail).
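
The timing fields from the table can be set alongside the metric. The values below are arbitrary illustrations of overriding each default, not recommended settings.

```yaml
autoscaling:
  minReplicas: 1
  maxReplicas: 10
  scaleUpCooldownSeconds: 30                  # default 60
  scaleDownCooldownSeconds: 600               # default 300
  scaleDownStabilizationWindowSeconds: 300    # default 120
  pollIntervalSeconds: 15                     # default 30
  metrics:
    - type: http
      url: "https://your-queue.example.com/depth"
      jsonPath: "queue.depth"
      targetValuePerReplica: 10
```

With targetValuePerReplica: 10, a reported queue depth of 40 would presumably call for about four replicas, though the exact rounding and clamping Pai applies is not documented here.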

filesystem

Kernel-level write restrictions using Landlock — protect specific paths from a compromised agent overwriting its config, installing cron jobs, or corrupting persistent state.

filesystem:
  readOnlyPaths: [/etc/myapp/config.json]
  writablePaths: [/home/node/.npm]

| Field | Description |
| --- | --- |
| readOnlyPaths | Paths the agent cannot write to |
| writablePaths | Extra writable paths beyond volumes mountPaths and /tmp |

Requires Linux kernel 5.13+; on older kernels the agent starts normally with a warning. Read is always allowed.

ops.instructions

Plain-English runbook for operators. pai chat uses this when a human asks the agent for its health, logs, or common fixes.

ops:
  instructions: |
    Run `myapp status` to check health.
    Logs are at /data/logs/app.log.

cdpRelay

Let the agent drive a Chrome browser running on your laptop via the Chrome DevTools Protocol. Useful for agents that need to log into sites, click through dashboards, or automate flows that resist scripting.

cdpRelay:
  token: "my-secret-token"

| Field | Description |
| --- | --- |
| token | Shared secret pairing the agent with pai relay |

Pair with pai relay <name> --token <tok> on your machine. See the Browser Automation guide.


Session-defaults (inherited by every spawned Session)

These fields live on the session-spawning Agent (longRunning: false) and act as defaults for every Session this Agent spawns. A Session can override any of them in its own spec.

idleTimeoutMinutes

How long Pai keeps an Idle Session alive before cleaning it up. An interactive Session sits in Idle between user messages; this setting controls how long that grace period lasts.

idleTimeoutMinutes: 30

Default 30; set 0 to never auto-terminate.

ttlSecondsAfterFinished

How long to keep the record of a finished Session (Complete or Error) before the controller deletes the Session CR.

ttlSecondsAfterFinished: 604800   # 7 days

Defaults: 7 days for Complete, 14 days for Error. Set 0 to disable.
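
As a sketch, both session defaults can be set together on the Agent. The values are illustrative; the second one simply spells out the seconds arithmetic for a 14-day retention.

```yaml
idleTimeoutMinutes: 60              # keep Idle Sessions alive for an hour
ttlSecondsAfterFinished: 1209600    # 14 days = 14 * 24 * 3600 seconds
```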

triggers[].schedule

Run a Session on a recurring cron schedule. The controller materialises a CronJob that POSTs a fresh Session each tick. Each run is fully isolated — new pod, new event stream, new token counter.

triggers:
  - schedule: "0 9 * * 1"   # every Monday at 9:00am UTC

Standard 5-field cron expression. See Scheduled tasks below.


Scheduled tasks

Add a schedule entry under spec.triggers[] to run an Agent automatically. Cron syntax reference:

┌─── minute (0–59)
│ ┌── hour (0–23, UTC)
│ │ ┌─ day of month (1–31)
│ │ │ ┌ month (1–12)
│ │ │ │ ┌ day of week (0–6, Sun=0)
│ │ │ │ │
0 9 * * * → daily at 9:00am UTC
0 */6 * * * → every 6 hours
0 9 * * 1 → every Monday at 9:00am UTC
0 0 1 * * → first day of every month at midnight

Behaviour:

  • No overlap. If the previous run is still active when the next trigger fires, the new run is skipped.
  • Each run is retained for 24 hours after completion, then cleaned up.
  • The last 3 successful and 3 failed runs are kept for inspection (pai logs <run-name>).

Trigger a run immediately outside the schedule:

pai run manual-run-1 --agent <agent-name>

To pause the schedule without deleting the Agent, remove the schedule entry from spec.triggers[] and apply the change. To resume, add the entry back and re-apply.


Lifecycle

Session (one-shot or triggered):

`pai run` / `POST /agents/{name}/sessions` / inbound trigger
        │
        ▼
     Pending ──── Session CR created, waiting for compute
        │
        ▼
     Running ──── container is running, harness is active
        │
        ├──── Idle ────────── Harness done; waiting for user.message
        │       │
        │       └──── Running (resumed on next user.message)
        │
        ├──── Complete ──── Job succeeded (exit 0)
        │
        └──── Error ──────── Job failed (non-zero exit or backoff exceeded)

Scheduled (triggers[].schedule):

Agent with spec.triggers[].schedule set
        │
        ▼  (at each scheduled time, the CronJob creates a fresh Session)
  [new Session] → Pending → Running → Complete / Error
        │
        └──── visible via `pai get sessions --agent <name>`

Each Session is immutable once created. Re-running an Agent always spawns a fresh Session — there is no in-place "rerun" semantics. The parent Agent owns its Sessions via ownerReferences, so pai delete agent <name> cascade-deletes every Session.


Status fields

No-type Agents

| Field | Description |
| --- | --- |
| status.ready | true when referenced ModelProviders and Providers all exist and the spec is valid |
| status.message | Validation error details when ready is false |
| status.toolSummary | Comma-separated list of enabled tool names (shown in pai get agents output) |
| status.observedGeneration | The metadata.generation this status was computed for |

Service agents

| Field | Description |
| --- | --- |
| status.phase | Creating · Pending · Running · Failed · Terminating |
| status.url | Public HTTPS URL (e.g. https://a7x3k9.pairun.dev) |
| status.tokensToday | Token consumption for the current day |
| status.message | Failure reason when phase is Failed |
| status.currentReplicas | Current replica count (when autoscaling is active) |
| status.lastScaleTime | Timestamp of the last autoscaler action |

Task agents (one-shot)

| Field | Description |
| --- | --- |
| status.phase | Pending → Running → Idle → Complete or Error |
| status.message | Error details when phase is Error |
| status.jobName | Identifier of the backing run |
| status.podName | Container identifier once scheduled |
| status.startedAt | ISO-8601 timestamp when the pod started |
| status.completedAt | ISO-8601 timestamp when the pod finished |
| status.tokensUsed | Total tokens consumed in this run |

Task agents (scheduled)

| Field | Description |
| --- | --- |
| status.phase | Always Scheduled while the recurring job is active |
| status.cronJobName | Identifier of the recurring job |
| status.lastRunAt | Timestamp of the most recent scheduled run |
| status.lastRunPhase | Phase of the most recent run: Running, Complete, or Error |

pai get agent daily-report
# NAME           TYPE   PHASE       SCHEDULE    LAST RUN   AGE
# daily-report   task   Scheduled   0 9 * * *   Complete   2d

Listing agents

pai get agents shows all agents regardless of mode. Use --type task or --type service to filter:

NAME               TYPE      STATUS    MODELS          TOKENS/DAY   URL / TASK                                AGE
dev-assistant                          claude-sonnet   —            —                                         5d
my-app             service   Running   gemini-flash    12,450       https://a7x3k9.pairun.dev                 2h
analyze-q4-data    task      Complete  claude-sonnet   4,120        Analyze Q4 sales data and produce…        10m
fix-bug-42         task      Running   claude-sonnet   1,800        Fix the null pointer in auth middleware   2m

No-type Agents show up with an empty TYPE column.