Agent
The Agent resource is how you declare an AI agent on Pai. A single resource covers three modes, selected by the type field:
| Mode | Set type to | When to use |
|---|---|---|
| service | service | Long-running agent with your own container image — web UIs, daemons, anything persistent |
| task | task | Ephemeral run of the Pai harness — a one-off task or a recurring scheduled run |
| (template) | omit | Reusable spec that task agents reference by name via agentDefinition |
```shell
pai create -f agent.yaml                     # create
pai apply -f agent.yaml                      # create or update
pai run <name> --agent <ref> --task "…"      # run a task against an existing Agent
pai get agents                               # list all (the TYPE column shows the mode)
pai delete agent <name>
```
Complete examples per mode, in the order service, task, template. Use these as references for what a full YAML can contain:
```yaml
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: dev-assistant
  namespace: team-a
spec:
  type: service
  image: ghcr.io/pai-platform/openclaw:latest
  replicas: 1
  runAsUser: 1000
  models:
    - google/gemini-2.5-flash
    - anthropic/claude-sonnet-4-6
  providers:
    - github-writer
    - telegram-bot
  guards:
    - binding: prompt-guard-default
      scan: {prompts: true, responses: false}
      enforcement: enforce
  inbound:
    port: 3000
    allowCIDRs: ["0.0.0.0/0"]
    customDomain: dev.helppa.io
  expose:
    - urlPath: /reports
      directory: /data/reports
  volumes:
    - name: workspace
      mountPath: /home/node/workspace
      size: "10Gi"
  configFiles:
    - path: /home/node/.openclaw/openclaw.json
      content: |
        {"channels": ["web", "telegram"], "defaultModel": "gemini-flash"}
  env:
    - name: LOG_LEVEL
      value: "info"
    - name: SLACK_WEBHOOK
      secretRef: {name: slack-credentials, key: webhook-url}
  tokens:
    maxPerDay: 1_000_000
    maxPerRequest: 16_000
  rateLimits:
    maxRequestsPerMinute: 60
    maxConcurrentRequests: 4
    maxCostPerDayUSD: 20.0
  resources:
    requests: {cpu: "500m", memory: "512Mi"}
    limits: {cpu: "2", memory: "2Gi"}
  autoscaling:
    minReplicas: 1
    maxReplicas: 5
    metrics:
      - type: tokenRate
        targetValuePerReplica: 500
  filesystem:
    readOnlyPaths: [/home/node/.openclaw/openclaw.json, /etc/cron.d]
    writablePaths: [/home/node/.npm]
  ops:
    instructions: |
      Run `openclaw status` to check health.
      Logs are at /data/logs/app.log.
```
```yaml
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: analyze-q4-data
  namespace: team-a
spec:
  type: task
  agentDefinition: data-analyst   # references the template Agent below
  title: "Analyze Q4 sales data and produce charts"

  # Optional: schedule turns this into a recurring run.
  # schedule: "0 9 * * 1"         # every Monday at 9:00am UTC

  # Fields below override the template for this run only.
  env:
    - name: REPORT_PERIOD
      value: "Q4-2025"
    - name: SLACK_WEBHOOK
      secretRef: {name: slack-credentials, key: webhook-url}
  volumes:
    - name: workspace
      mountPath: /workspace
      size: "5Gi"
  tokens:
    maxPerDay: 50_000
  resources:
    limits: {memory: "8Gi"}
  idleTimeoutMinutes: 15
  ttlSecondsAfterFinished: 604800   # keep the run record for 7 days
```
```yaml
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: data-analyst
  namespace: team-a
spec:
  # No spec.type and no spec.image: this is a template.
  # Task agents reference this by name via spec.agentDefinition.
  models:
    - anthropic/claude-sonnet-4-6
    - google/gemini-2.5-flash
  system: |
    You are a senior data analyst. Prefer clarity over cleverness.
    Save all outputs under /workspace/out/.
  tools:
    - type: bash
    - type: read
    - type: write
    - type: edit
    - type: web_search
    - type: custom
      name: approve_report
      description: Ask a human to approve the final report before posting
      inputSchema:
        type: object
        properties: {report_url: {type: string}}
  skills:
    - name: company-style-guide
      mountPath: /skills
  packages:
    pip: [pandas, matplotlib, openpyxl]
    apt: [poppler-utils]
  providers:
    - github-readonly
    - slack-bot
  guards:
    - binding: prompt-guard-default
      scan:
        prompts: true
        toolResults: {tools: ["web_fetch", "web_search"]}
      enforcement: enforce
  triggers:
    - webhook:
        allowCIDRs: ["0.0.0.0/0"]
  idleTimeout: 30m
  files:
    - name: customer-segments
      mountPath: /data/segments.csv
      readOnly: true
  tokens:
    maxPerDay: 500_000
    maxPerRequest: 16_000
  rateLimits:
    maxCostPerDayUSD: 50.0
  resources:
    requests: {cpu: "500m", memory: "1Gi"}
    limits: {cpu: "2", memory: "4Gi"}
  managedAgents: [data-analyst-tuner]
```
The sections below group fields by which modes they apply to:
- Shared fields work on any Agent.
- Harness fields apply to task and template agents (ignored on service agents).
- Service-only fields apply to type: service agents.
- Task-only fields apply to type: task agents.
Shared fields (all modes)
These fields work on any Agent regardless of mode. When set on a template Agent (no type), task agents that reference it inherit them; a task agent's own value overrides the template per-field.
models
Which LLMs the agent can use. Every call routes through the Pai Gateway, which injects the provider's API key — the agent never sees credentials.
```yaml
models:
  - anthropic/claude-sonnet-4-6
  - google/gemini-2.5-flash
```
Each entry is written as modelprovider/model-id. The first entry is the primary (default) model; later entries are fallbacks used when the primary is rate-limited or over budget.
providers
External services the agent is allowed to reach (GitHub, AWS, Telegram, MCP servers, etc.). Credentials are injected by the Pai sidecar — the agent never holds real API keys.
```yaml
providers:
  - github-writer
  - name: aws-s3
    policy: {allow: ["s3:GetObject"]}
    scope: {resources: ["arn:aws:s3:::reports/*"]}
```
| Entry | Description |
|---|---|
| `<name>` (string) | Attach a Provider by name with its full defaults |
| `{name, policy, scope}` (object) | Attach inline with narrower permissions; this can only tighten, never widen, the Provider's own rules |
guards
Attach prompt-injection / jailbreak scanners to this agent's LLM traffic. Each prompt, response, or tool result is classified at the gateway before it reaches the model, and blocked or logged based on the configured enforcement mode.
```yaml
guards:
  - binding: prompt-guard-default
    scan:
      prompts: true
      responses: false
    enforcement: enforce
```
| Field | Description |
|---|---|
| binding | Name of a GuardBinding resource |
| scan.prompts | Scan user-role messages (default true) |
| scan.responses | Scan assistant output (audit-only when streaming) |
| scan.toolResults.tools | Tool names whose results are scanned; ["*"] scans all. Omit to disable |
| enforcement | Optional override; may only tighten (audit → enforce) |
See the prompt-injection guard guide. Setting spec.guards[] on a task Agent fully replaces any guards declared on its referenced template Agent.
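As a sketch of the replacement rule above, the two manifests below both declare guards. Only fields documented in this section are used; the binding name strict-guard and the Agent names are hypothetical. Because the task sets spec.guards, its list is what applies to the run, and the template's tool-result scanning is not carried over unless the task repeats it.

```yaml
# Template Agent (no spec.type). Names are illustrative.
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: data-analyst
  namespace: team-a
spec:
  models:
    - anthropic/claude-sonnet-4-6
  guards:
    - binding: prompt-guard-default
      scan:
        prompts: true
        toolResults: {tools: ["web_search"]}
---
# Task Agent. Its guards list replaces the template's list as a whole,
# so tool-result scanning is declared again here to keep it active.
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: analyze-q4-data
  namespace: team-a
spec:
  type: task
  agentDefinition: data-analyst
  guards:
    - binding: strict-guard          # hypothetical GuardBinding name
      scan:
        prompts: true
        toolResults: {tools: ["*"]}  # scan results from every tool
      enforcement: enforce
```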
tokens
Cap how many LLM tokens this agent can consume per day and per individual request. The gateway tracks usage in real time and rejects excess calls with HTTP 429.
```yaml
tokens:
  maxPerDay: 50000
  maxPerRequest: 8192
```
| Field | Description |
|---|---|
| maxPerDay | Hard daily token cap across all of this agent's LLM calls |
| maxPerRequest | Per-request context window limit |
rateLimits
Cap request rate, concurrency, and daily USD spend for this agent. These limits complement tokens: tokens bounds total token volume, while rateLimits bounds bursts, parallelism, and cost.
```yaml
rateLimits:
  maxRequestsPerMinute: 60
  maxConcurrentRequests: 4
  maxCostPerDayUSD: 5.00
```
| Field | Description |
|---|---|
| maxRequestsPerMinute | Fixed 60-second window request cap |
| maxConcurrentRequests | Simultaneous in-flight LLM calls |
| maxCostPerDayUSD | Hard daily USD spend cap, computed from the gateway's price table |
All three are independent and optional — 0 or unset means unlimited for that axis. Rejections are HTTP 429.
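Because each axis is independent, a spec can cap one and leave the others open. A minimal sketch, using only the fields in the table above, that limits daily spend while leaving request rate and concurrency unlimited (the values are illustrative):

```yaml
rateLimits:
  maxCostPerDayUSD: 5.00     # hard daily spend cap
  maxRequestsPerMinute: 0    # 0 means unlimited, same as leaving it unset
  # maxConcurrentRequests is unset, so it is also unlimited
```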
env
Extra environment variables injected into the agent container. Values can be literal strings or pulled from a Secret.
```yaml
env:
  - name: LOG_LEVEL
    value: "debug"
  - name: SLACK_WEBHOOK
    secretRef:
      name: slack-credentials
      key: webhook-url
```
| Field | Description |
|---|---|
| name | Environment variable name |
| value | Literal string value |
| secretRef.name | Secret to read from |
| secretRef.key | Key within the Secret |
Personal vars (git author, email, etc.) can be set once with pai config set-env and are auto-injected into every agent you create.
Unlike Provider credentials (injected by the sidecar) and ModelProvider API keys (injected by the gateway), a Secret mounted via env[].secretRef is readable by the agent container. Use this only for secrets the agent genuinely needs to see itself — webhook URLs, library-specific API keys Pai doesn't proxy, etc. See Exposing a secret to the agent.
configFiles
Seed files into the agent at startup — config files, credentials, reference data, templates. Useful for injecting small pieces of configuration without rebuilding the image.
```yaml
configFiles:
  - path: /home/node/.myapp/config.json
    content: |
      { "setting": "value" }
```
| Field | Description |
|---|---|
| path | Absolute path inside the agent |
| content | Inline file contents |
Files are only written if they don't already exist — edits made by the agent at runtime are preserved across restarts.
files
Mount versioned PaiFile resources into the agent. Useful for large or frequently-updated reference data (datasets, knowledge bases, large prompts) that don't belong in a config map.
```yaml
files:
  - name: customer-data
    mountPath: /data/customers.csv
    readOnly: true
```
| Field | Description |
|---|---|
| name | PaiFile resource name |
| mountPath | Absolute path inside the agent |
| version | Specific version to pin (default: current at creation time) |
| readOnly | Mount read-only (default false) |
No size limit. Agents can edit mounted files at runtime and call register_file to publish a new version.
skills
Attach reusable capability bundles — Markdown instructions, helper scripts, JSON data, or anything else agents can use at runtime.
```yaml
skills:
  - name: coding-guidelines
    mountPath: /skills
```
| Field | Description |
|---|---|
| name | Skill resource name |
| mountPath | Parent directory; files land at {mountPath}/{skill-name}/ |
Service agents need their runtime to consume the mounted files (e.g. OpenClaw's skills.load.extraDirs); harness-backed agents load them automatically.
volumes
Persistent storage that survives agent restarts. Use for workspace files, model caches, scratch space, or anything the agent needs to keep between runs.
```yaml
volumes:
  - name: workspace
    mountPath: /data
    size: "10Gi"
```
| Field | Description |
|---|---|
| name | Volume identifier |
| mountPath | Mount path inside the agent |
| size | Storage size (e.g. "1Gi", "50Gi", "1Ti") |
resources
Reserve CPU and memory for the agent, and set hard limits so one runaway agent can't starve the rest of the cluster.
```yaml
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "2"
    memory: "2Gi"
```
| Field | Description |
|---|---|
| requests.cpu | Guaranteed CPU (e.g. "500m" = 0.5 cores) |
| requests.memory | Guaranteed memory (e.g. "512Mi") |
| limits.cpu | CPU ceiling; the agent is throttled if it exceeds this |
| limits.memory | Memory ceiling; the agent is killed if it exceeds this |
Harness fields (task + template)
These fields drive the Pai agent harness — the built-in runtime used by task and template agents. Service agents bring their own runtime inside their container image and ignore these fields.
system
The system prompt the harness injects at agent start — defines the agent's persona, goals, and constraints.
```yaml
system: |
  You are a senior data analyst. Prefer clarity over cleverness.
```
tools
Pick which built-in tools the harness exposes to the model, and declare any custom caller-executed tools.
```yaml
tools:
  - type: bash
  - type: web_search
  - type: custom
    name: approve_pr
    description: Ask the user to approve a pull request
    inputSchema: {type: object, properties: {pr_url: {type: string}}}
```
| Field | Description |
|---|---|
| type | bash, read, write, edit, glob, grep, web_fetch, web_search, send_email, screenshot, or custom |
| enabled | Set to false to disable a built-in tool (default true) |
| name | Tool name (required when type: custom) |
| description | Description shown to the model (required for custom tools) |
| inputSchema | JSON Schema for the tool's arguments (required for custom tools) |
Omit tools entirely to enable all built-ins. Custom tools are caller-executed — the harness emits agent.custom_tool_use events and waits for user.custom_tool_result.
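The enabled flag can switch a single built-in off. The sketch below uses only the fields in the table above. How unlisted built-ins are treated once tools is present is not stated here, so the sketch names every tool it relies on rather than assuming a default.

```yaml
tools:
  - type: read
  - type: grep
  - type: web_fetch
  - type: send_email
    enabled: false   # explicitly disable this built-in
```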
packages
Language-level packages to install into the harness container before the agent starts — pandas, axios, whatever your tools need.
```yaml
packages:
  pip: [pandas, matplotlib]
  npm: ["@anthropic-ai/sdk"]
  apt: [poppler-utils]
```
| Key | Package manager |
|---|---|
| pip | Python (pip install) |
| npm | Node (npm install -g) |
| apt | Debian / Ubuntu system (apt-get install) |
| go | Go (go install) |
| gem | Ruby (gem install) |
| cargo | Rust (cargo install) |
For apt, prefer baking packages into a custom harness image in production — install-at-startup is fine for iteration but adds latency per agent start.
triggers + idleTimeout
Make the agent conversational — wake it on inbound events (Telegram, email, Slack, webhook) instead of running to completion. Between events the agent sits idle and exits after idleTimeout of inactivity.
```yaml
triggers:
  - telegram:
      chatId: "-100123456"
      provider: telegram-bot
  - email:
      address: bot+linear@pairun.dev
      provider: office365-imap
  - webhook:
      allowCIDRs: ["0.0.0.0/0"]
idleTimeout: 15m
```
| Trigger | Sub-fields | Description |
|---|---|---|
| telegram | chatId, provider | Wake on a Telegram message |
| email | address, provider | Wake on email to a plus-tagged address |
| slack | channel, provider | Wake on a Slack channel message |
| webhook | allowCIDRs, linearSignatureSecret | Wake on an HTTP POST to a unique URL |
| idleTimeout | duration | How long to wait after last activity before exiting (e.g. 15m, 1h). Default 15m. |
See the webhooks and email guides for full setup.
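The slack trigger takes the sub-fields listed in the table. In the sketch below the channel value and its format are assumptions (your workspace may need a channel ID instead), and slack-bot stands for a Provider you have already created:

```yaml
triggers:
  - slack:
      channel: "#ops-alerts"   # assumption: the expected channel format may differ
      provider: slack-bot
idleTimeout: 1h                # exit after one hour without events
```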
managedAgents
Grant this agent the ability to read and modify specific other Agents in its namespace. Used for self-improving setups where one agent tunes another agent's system prompt or tools.
```yaml
managedAgents: [tuner-agent]
```
Pai grants narrow, per-name permissions — the agent can only touch the Agents listed here, nothing else.
Service-only fields (spec.type: service)
image
The container image Pai runs for this service agent. You bring the runtime; Pai handles the surrounding infrastructure.
```yaml
type: service
image: ghcr.io/pai-platform/openclaw:latest
```
replicas, runAsUser, command
Basic container controls: how many replicas to run, which Unix UID to run as, and whether to override the image's entrypoint.
| Field | Default | Description |
|---|---|---|
| replicas | 1 | Number of instances |
| runAsUser | 65532 | UID to run the container as. Cannot be 0; root is rejected at reconcile time |
| command | — | Override the container entrypoint + command |
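A sketch of the three fields together. The list-of-strings form for command is an assumption borrowed from common container specs, and the entrypoint shown is hypothetical; check which form your Pai version accepts.

```yaml
spec:
  type: service
  image: ghcr.io/pai-platform/openclaw:latest
  replicas: 2
  runAsUser: 1000                  # any non-zero UID; 0 (root) is rejected
  command: ["node", "server.js"]   # assumed list form, hypothetical entrypoint
```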
inbound
Expose a container port for external HTTPS traffic. Pai provisions a load balancer, DNS, and a TLS cert — the agent just needs to serve HTTP on the declared port.
```yaml
inbound:
  port: 3000
  allowCIDRs: ["0.0.0.0/0"]
  customDomain: app.helppa.io
```
| Field | Description |
|---|---|
| port | Container port the agent listens on |
| allowCIDRs | CIDR allowlist (default [] = deny all) |
| customDomain | Custom domain; Pai provisions DNS and TLS automatically |
Without a custom domain, agents get a random <hostname>.pairun.dev URL on status.url.
expose
Serve files from the agent's filesystem at a public URL path. Useful for sharing reports, dashboards, or generated artifacts without standing up a web server inside the agent.
```yaml
expose:
  - urlPath: /reports
    directory: /data/reports
```
| Field | Description |
|---|---|
| urlPath | Path under the agent's public hostname |
| directory | Directory inside the agent to serve |
Files appear at https://<hostname>.pairun.dev/<urlPath>.
autoscaling
Scale the number of replicas up and down based on load. Pick a metric (token rate, an HTTP endpoint returning a queue depth, etc.) and a per-replica target; Pai does the rest.
```yaml
autoscaling:
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: tokenRate
      targetValuePerReplica: 500
```
| Field | Default | Description |
|---|---|---|
| minReplicas | 1 | Minimum replicas (0 = paused) |
| maxReplicas | — | Required. Maximum replicas |
| scaleUpCooldownSeconds | 60 | Minimum seconds between scale-up events |
| scaleDownCooldownSeconds | 300 | Minimum seconds between scale-down events |
| scaleDownStabilizationWindowSeconds | 120 | Rolling window before allowing a scale-down |
| pollIntervalSeconds | 30 | How often metrics are evaluated |

Metric types:

| Type | Description |
|---|---|
| tokenRate | Tokens per minute observed by the gateway |
| http | Poll any URL returning a JSON number (JIRA, SQS, custom queues) |

```yaml
metrics:
  - type: http
    url: "https://your-queue.example.com/depth"
    jsonPath: "queue.depth"
    targetValuePerReplica: 10
```
View live scaling status with pai scaling <name> (--watch to tail).
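The optional timing fields from the table can be set alongside the metric. A sketch with illustrative values, not recommendations:

```yaml
autoscaling:
  minReplicas: 1
  maxReplicas: 10
  scaleUpCooldownSeconds: 30                 # default 60
  scaleDownCooldownSeconds: 600              # default 300
  scaleDownStabilizationWindowSeconds: 300   # default 120
  pollIntervalSeconds: 15                    # default 30
  metrics:
    - type: tokenRate
      targetValuePerReplica: 500
```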
filesystem
Kernel-level write restrictions using Landlock — protect specific paths from a compromised agent overwriting its config, installing cron jobs, or corrupting persistent state.
```yaml
filesystem:
  readOnlyPaths: [/etc/myapp/config.json]
  writablePaths: [/home/node/.npm]
```
| Field | Description |
|---|---|
| readOnlyPaths | Paths the agent cannot write to |
| writablePaths | Extra writable paths beyond volumes mountPaths and /tmp |
Requires Linux kernel 5.13+; on older kernels the agent starts normally with a warning. Reads are always allowed.
ops.instructions
Plain-English runbook for operators. pai chat uses this when a human asks the agent for its health, logs, or common fixes.
```yaml
ops:
  instructions: |
    Run `myapp status` to check health.
    Logs are at /data/logs/app.log.
```
cdpRelay
Let the agent drive a Chrome browser running on your laptop via the Chrome DevTools Protocol. Useful for agents that need to log into sites, click through dashboards, or automate flows that resist scripting.
```yaml
cdpRelay:
  token: "my-secret-token"
```
| Field | Description |
|---|---|
| token | Shared secret pairing the agent with pai relay |
Pair with pai relay <name> --token <tok> on your machine. See the Browser Automation guide.
Task-only fields (spec.type: task)
agentDefinition
The template Agent this task is based on. The task inherits all harness + shared fields from the template; any fields set on the task itself override the template per-field.
```yaml
type: task
agentDefinition: dev-assistant
```
Required.
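A minimal sketch of per-field override, using only fields documented on this page (the Agent names are illustrative). The task inherits models and system from the template and sets its own tokens. Whether an override applies to the whole tokens block or to each key inside it is not specified here, so the sketch sets both keys.

```yaml
# Template: no spec.type
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: data-analyst
  namespace: team-a
spec:
  models:
    - anthropic/claude-sonnet-4-6
  system: |
    You are a senior data analyst.
  tokens:
    maxPerDay: 500000
    maxPerRequest: 16000
---
# Task: inherits models and system, overrides tokens
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: quick-summary
  namespace: team-a
spec:
  type: task
  agentDefinition: data-analyst
  title: "Summarize last week's sales"
  tokens:
    maxPerDay: 50000
    maxPerRequest: 16000
```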
title
The initial prompt sent to the agent when it starts. When set, the harness auto-starts immediately — you don't need to send a user.message. Leave unset for interactive sessions driven by pai chat or the event API.
```yaml
title: "Analyze Q4 sales data and produce charts"
```
schedule
Run the task on a recurring cron schedule instead of once. Each run is fully isolated — fresh filesystem, clean environment, independent token counter.
```yaml
schedule: "0 9 * * 1"   # every Monday at 9:00am UTC
```
Standard 5-field cron expression. See Scheduled tasks below for the full cron syntax reference and triggering manual runs.
idleTimeoutMinutes
How long Pai keeps an idle task alive before cleaning it up. An interactive task sits in Idle between user messages; this setting controls how long that grace period lasts.
```yaml
idleTimeoutMinutes: 30
```
Default 30; set 0 to never auto-terminate.
ttlSecondsAfterFinished
How long to keep the record of a finished task (Complete or Error) before Pai deletes the Agent resource.
```yaml
ttlSecondsAfterFinished: 604800   # 7 days
```
Defaults: 7 days for Complete, 14 days for Error. Set 0 to disable. Only applies to one-shot task agents (scheduled task agents are never auto-deleted).
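A sketch of an interactive one-shot task that combines the task-only fields above: title is left unset so the session is driven by pai chat or the event API, the idle grace period is extended, and the finished record is kept for one day. The Agent names are illustrative.

```yaml
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: adhoc-session
  namespace: team-a
spec:
  type: task
  agentDefinition: data-analyst
  # title omitted: the harness waits for a user.message
  idleTimeoutMinutes: 60
  ttlSecondsAfterFinished: 86400   # 24 hours
```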
Scheduled tasks
Add schedule to a task Agent to run it automatically. Cron syntax reference:
```text
┌─── minute (0–59)
│ ┌── hour (0–23, UTC)
│ │ ┌─ day of month (1–31)
│ │ │ ┌ month (1–12)
│ │ │ │ ┌ day of week (0–6, Sun=0)
│ │ │ │ │
0 9 * * *      → daily at 9:00am UTC
0 */6 * * *    → every 6 hours
0 9 * * 1      → every Monday at 9:00am UTC
0 0 1 * *      → first day of every month at midnight
```
Behaviour:
- No overlap. If the previous run is still active when the next trigger fires, the new run is skipped.
- Each run is retained for 24 hours after completion, then cleaned up.
- The last 3 successful and 3 failed runs are kept for inspection (pai logs <run-name>).
Trigger a run immediately outside the schedule:
```shell
pai run manual-run-1 --agent <agent-name>
```
To pause the schedule without deleting the agent, remove schedule from the spec and apply the change. To resume, add schedule back and re-apply.
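A minimal scheduled task, as a sketch built from the fields described on this page; the Agent names and the title are illustrative. The schedule line is the only field that distinguishes it from a one-shot task.

```yaml
apiVersion: pai.io/v1
kind: Agent
metadata:
  name: daily-report
  namespace: team-a
spec:
  type: task
  agentDefinition: data-analyst
  title: "Produce the daily sales report"
  schedule: "0 9 * * *"   # daily at 9:00am UTC
```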
Lifecycle
One-shot task:
```text
pai run  (or Agent type: task created, no schedule)
    │
    ▼
 Pending ──── run accepted, waiting for compute to become available
    │
    ▼
 Running ──── container is running, harness is active
    │
    ├──── Idle ────────── Harness done; waiting for user.message
    │       │
    │       └──── Running (resumed on next user.message)
    │
    ├──── Complete ────── Job succeeded (exit 0)
    │
    └──── Error ───────── Job failed (non-zero exit or backoff exceeded)
```
Scheduled task:
```text
Agent type: task with schedule set
    │
    ▼
 Scheduled ──── waiting for next scheduled trigger
    │
    │  (at each scheduled time)
    ▼
 [new run] → Running → Complete / Error
    │
    └──── status.lastRunAt / lastRunPhase updated
```
One-shot task agents are immutable once created. To re-run, use pai run again with a new name (or omit the name for an auto-generated one). Scheduled agents run indefinitely until deleted.
Status fields
No-type Agents
| Field | Description |
|---|---|
| status.ready | true when referenced ModelProviders and Providers all exist and the spec is valid |
| status.message | Validation error details when ready is false |
| status.toolSummary | Comma-separated list of enabled tool names (shown in pai get agents output) |
| status.observedGeneration | The metadata.generation this status was computed for |
Service agents
| Field | Description |
|---|---|
| status.phase | Creating · Pending · Running · Failed · Terminating |
| status.url | Public HTTPS URL (e.g. https://a7x3k9.pairun.dev) |
| status.tokensToday | Token consumption for the current day |
| status.message | Failure reason when phase is Failed |
| status.currentReplicas | Current replica count (when autoscaling is active) |
| status.lastScaleTime | Timestamp of the last autoscaler action |
Task agents (one-shot)
| Field | Description |
|---|---|
| status.phase | Pending → Running → Idle → Complete or Error |
| status.message | Error details when phase is Error |
| status.jobName | Identifier of the backing run |
| status.podName | Container identifier once scheduled |
| status.startedAt | ISO-8601 timestamp when the pod started |
| status.completedAt | ISO-8601 timestamp when the pod finished |
| status.tokensUsed | Total tokens consumed in this run |
Task agents (scheduled)
| Field | Description |
|---|---|
| status.phase | Always Scheduled while the recurring job is active |
| status.cronJobName | Identifier of the recurring job |
| status.lastRunAt | Timestamp of the most recent scheduled run |
| status.lastRunPhase | Phase of the most recent run: Running, Complete, or Error |
```shell
pai get agent daily-report
# NAME          TYPE  PHASE      SCHEDULE   LAST RUN  AGE
# daily-report  task  Scheduled  0 9 * * *  Complete  2d
```
Listing agents
pai get agents shows all agents regardless of mode. Use --type task or --type service to filter:
```text
NAME             TYPE     STATUS    MODELS         TOKENS/DAY  URL / TASK                               AGE
dev-assistant                       claude-sonnet  —           —                                        5d
my-app           service  Running   gemini-flash   12,450      https://a7x3k9.pairun.dev                2h
analyze-q4-data  task     Complete  claude-sonnet  4,120       Analyze Q4 sales data and produce…       10m
fix-bug-42       task     Running   claude-sonnet  1,800       Fix the null pointer in auth middleware  2m
```
No-type Agents show up with an empty TYPE column.