# Agent

An `AgentWorkload` is the core resource in Pai. It declares an AI agent: what container to run, which models it can use, which external services it can access, and how to expose it.

```shell
pai create -f agent.yaml
pai apply -f agent.yaml   # create or update
pai delete agent <name>
pai list                  # list all agents
pai status <name>         # detailed status
```
## Minimal example

```yaml
apiVersion: pai.io/v1
kind: AgentWorkload
metadata:
  name: my-agent
spec:
  image: ghcr.io/pai-platform/openclaw:latest
  modelBindings:
    - gemini-flash
```
## Full example

```yaml
apiVersion: pai.io/v1
kind: AgentWorkload
metadata:
  name: dev-assistant
spec:
  image: ghcr.io/pai-platform/openclaw:latest
  runAsUser: 1000
  modelBindings:
    - gemini-flash
    - claude-sonnet
  providers:
    - github-writer
    - telegram-bot
  inbound:
    port: 3000
    allowCIDRs:
      - "0.0.0.0/0"
  volumes:
    - name: workspace
      mountPath: /home/node/workspace
      size: "5Gi"
  configFiles:
    - path: /home/node/.openclaw/openclaw.json
      content: |
        {
          "channels": ["web", "telegram"],
          "defaultModel": "gemini-flash"
        }
  env:
    - name: LOG_LEVEL
      value: "info"
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2"
      memory: "2Gi"
```
## Field reference

### Core

| Field | Type | Default | Description |
|---|---|---|---|
| `image` | string | — | Container image to run |
| `replicas` | integer | `1` | Number of instances |
| `runAsUser` | integer | `65532` | UID to run the container as. Cannot be `0` (root is rejected) |
| `command` | string[] | — | Override the container entrypoint + command |
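Putting the core fields together, a spec fragment might look like this (the `command` value is purely illustrative, not a real entrypoint for this image):

```yaml
spec:
  image: ghcr.io/pai-platform/openclaw:latest
  replicas: 2
  runAsUser: 1000
  command: ["node", "dist/server.js"]
```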
### Models

| Field | Type | Description |
|---|---|---|
| `modelBindings` | string[] | Names of Model resources the agent can use |
The agent's LLM calls are routed through the Pai gateway, which injects the real API key. The agent never sees credentials.
### Providers

| Field | Type | Description |
|---|---|---|
| `providers` | string[] | Names of Provider resources to attach |
Each provider gives the agent access to an external service (GitHub, Telegram, AWS, etc.) through the sidecar proxy.
### Inbound traffic

Expose a port for incoming requests (webhooks, web UI, APIs):

| Field | Type | Default | Description |
|---|---|---|---|
| `inbound.port` | integer | — | Container port the agent listens on |
| `inbound.allowCIDRs` | string[] | `[]` (deny all) | CIDR blocks allowed to reach the port |
When `inbound` is set, Pai provisions a public load balancer and exposes the port at a stable IP. The agent's public URL is always available in `status.url`, regardless of the `inbound` config.
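The allow-list semantics can be pictured with a short sketch. This is ordinary CIDR matching, not Pai's actual enforcement code: a client IP is admitted if it falls inside any listed block, and an empty list denies everyone.

```python
import ipaddress

def cidr_allows(allow_cidrs: list[str], client_ip: str) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR blocks."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allow_cidrs)

# "0.0.0.0/0" admits any IPv4 address; an empty list denies everything.
print(cidr_allows(["0.0.0.0/0"], "203.0.113.7"))   # True
print(cidr_allows(["10.0.0.0/8"], "203.0.113.7"))  # False
print(cidr_allows([], "203.0.113.7"))              # False
```

To expose a port only to your own network, list a narrow block such as `"198.51.100.0/24"` instead of `"0.0.0.0/0"`.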
### Storage

Persistent volumes survive pod restarts:

```yaml
volumes:
  - name: workspace      # becomes the PVC name
    mountPath: /data     # where it's mounted inside the container
    size: "10Gi"
```

| Field | Type | Description |
|---|---|---|
| `volumes[].name` | string | Volume name (used as PVC identifier) |
| `volumes[].mountPath` | string | Mount path inside the container |
| `volumes[].size` | string | Storage size (e.g. `"1Gi"`, `"50Gi"`) |
### Config files

Inject files into the container at startup via a ConfigMap. Files are seeded into volumes on first start using copy-no-clobber: existing files are never overwritten, so user edits survive pod restarts.

```yaml
configFiles:
  - path: /home/node/.myapp/config.json
    content: |
      { "setting": "value" }
```
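The seeding behavior can be sketched as follows. This is an illustration of copy-no-clobber semantics, not Pai's actual init code:

```python
import shutil
from pathlib import Path

def seed_file(src: Path, dest: Path) -> bool:
    """Copy src to dest only if dest does not exist yet (copy-no-clobber).

    Returns True if the file was seeded, False if an existing file was kept.
    """
    if dest.exists():
        return False  # keep the user's edits from a previous run
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dest)
    return True
```

On first start the file is copied into place; on every later start the copy is skipped, which is why edits made inside the container persist across restarts.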
### Environment variables

```yaml
env:
  - name: LOG_LEVEL
    value: "debug"
  - name: API_ENDPOINT
    value: "https://api.example.com"
```

Personal env vars (git author name, email, etc.) can be set once with `pai config set-env` and are automatically injected into every agent you create.
### Resources

```yaml
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "2"
    memory: "2Gi"
```

| Field | Description |
|---|---|
| `requests.cpu` | Guaranteed CPU (e.g. `"500m"` = 0.5 cores) |
| `requests.memory` | Guaranteed memory (e.g. `"512Mi"`) |
| `limits.cpu` | CPU ceiling — agent is throttled if exceeded |
| `limits.memory` | Memory ceiling — agent is OOM-killed if exceeded |
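CPU quantities use the standard Kubernetes notation, where the `m` suffix means millicores. A minimal parser sketch (illustrative only; a real parser would also validate its input):

```python
def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity string to cores: "500m" -> 0.5, "2" -> 2.0."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores to cores
    return float(quantity)

print(parse_cpu("500m"))  # 0.5
print(parse_cpu("2"))     # 2.0
```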
### Token budgets

Override the default token limits for this agent:

```yaml
tokens:
  maxPerDay: 50000
  maxPerRequest: 8192
```
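One way to picture how the two limits interact (a hypothetical check, not Pai's actual accounting logic): a request is rejected if it exceeds `maxPerRequest` on its own, or if it would push the day's running total past `maxPerDay`.

```python
def request_allowed(tokens_today: int, requested: int,
                    max_per_day: int = 50000, max_per_request: int = 8192) -> bool:
    """Hypothetical budget check combining per-request and per-day limits."""
    if requested > max_per_request:
        return False  # single request too large
    return tokens_today + requested <= max_per_day

print(request_allowed(49000, 500))   # True
print(request_allowed(49000, 2000))  # False: would exceed the daily budget
print(request_allowed(0, 10000))     # False: exceeds the per-request cap
```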
### Expose directories

Serve files from the agent's filesystem at a public URL path:

```yaml
expose:
  - urlPath: /reports
    directory: /data/reports
```

`https://<hostname>.pairun.dev/reports` will serve the contents of `/data/reports` inside the container.
### Autoscaling

Automatically scale replicas up and down based on traffic or queue depth:

```yaml
autoscaling:
  minReplicas: 1
  maxReplicas: 5
  scaleUpCooldownSeconds: 60
  scaleDownCooldownSeconds: 300
  metrics:
    - type: tokenRate
      targetValuePerReplica: 500  # scale up at 500 tokens/min per replica
```

| Field | Default | Description |
|---|---|---|
| `minReplicas` | `1` | Minimum replica count (`0` = pause the agent) |
| `maxReplicas` | — | Maximum replica count (required) |
| `scaleUpCooldownSeconds` | `60` | Minimum seconds between scale-up events |
| `scaleDownCooldownSeconds` | `300` | Minimum seconds between scale-down events |
| `scaleDownStabilizationWindowSeconds` | `120` | Rolling window before allowing a scale-down |
| `pollIntervalSeconds` | `30` | How often metrics are evaluated |
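The replica target follows the usual target-tracking arithmetic (as in the Kubernetes HPA; assuming Pai's scaler works the same way): divide the observed metric by the per-replica target, round up, and clamp to the configured bounds.

```python
import math

def desired_replicas(metric_value: float, target_per_replica: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Target tracking: enough replicas that each handles <= target_per_replica."""
    desired = math.ceil(metric_value / target_per_replica)
    return max(min_replicas, min(max_replicas, desired))

# 1800 tokens/min with a target of 500 per replica -> ceil(3.6) = 4
print(desired_replicas(1800, 500, min_replicas=1, max_replicas=5))  # 4
print(desired_replicas(0, 500, min_replicas=1, max_replicas=5))     # 1
```

The cooldown and stabilization fields above then rate-limit how often this computed target is actually applied.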
Metric types:

| Type | Description |
|---|---|
| `tokenRate` | Tokens per minute observed by the gateway |
| `http` | Poll any URL returning a JSON number (JIRA, SQS, custom queues) |

```yaml
metrics:
  - type: http
    url: "https://your-queue.example.com/depth"
    jsonPath: "queue.depth"
    targetValuePerReplica: 10
```
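Judging from the example, `jsonPath` looks like simple dotted-key navigation into the polled JSON body. A sketch under that assumption (the real syntax may support more):

```python
import json

def resolve_json_path(body: str, json_path: str) -> float:
    """Walk a dotted path like "queue.depth" through a JSON document."""
    value = json.loads(body)
    for key in json_path.split("."):
        value = value[key]
    return float(value)

print(resolve_json_path('{"queue": {"depth": 42}}', "queue.depth"))  # 42.0
```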
View live scaling status:

```shell
pai scaling my-agent
pai scaling my-agent --watch
```
### Browser automation (CDP relay)

Give the agent access to a Chrome browser running on your machine:

```yaml
cdpRelay:
  token: "my-secret-token"
```

Then run `pai relay my-agent --token my-secret-token` on your machine. See the Browser Automation guide for full setup instructions.
### Filesystem confinement

Restrict which paths the agent can write to (Linux kernel 5.13+ only, best-effort):

```yaml
filesystem:
  readOnlyPaths:
    - /etc/myapp/config.json
  writablePaths:
    - /home/node/.npm
```

| Field | Description |
|---|---|
| `readOnlyPaths` | Paths the agent cannot write to (read is always allowed) |
| `writablePaths` | Extra writable paths beyond volumes and `/tmp` |
### Operational runbook

Plain-English instructions for `pai chat` to use when querying agent status:

```yaml
ops:
  instructions: |
    Run `myapp status` to check health.
    Logs are at /data/logs/app.log.
```
## Status fields

| Field | Description |
|---|---|
| `status.phase` | `Creating` · `Pending` · `Running` · `Failed` · `Terminating` |
| `status.url` | Public HTTPS URL (e.g. `https://a7x3k9.pairun.dev`) |
| `status.tokensToday` | Token consumption for the current day |
| `status.message` | Failure reason when phase is `Failed` |
| `status.currentReplicas` | Current replica count (when autoscaling is active) |
| `status.lastScaleTime` | Timestamp of the last autoscaler action |