Agent

An AgentWorkload is the core resource in Pai. It declares an AI agent: what container to run, which models it can use, which external services it can access, and how to expose it.

```bash
pai create -f agent.yaml
pai apply -f agent.yaml    # create or update
pai delete agent <name>
pai list                   # list all agents
pai status <name>          # detailed status
```

Minimal example

```yaml
apiVersion: pai.io/v1
kind: AgentWorkload
metadata:
  name: my-agent
spec:
  image: ghcr.io/pai-platform/openclaw:latest
  modelBindings:
    - gemini-flash
```

Full example

```yaml
apiVersion: pai.io/v1
kind: AgentWorkload
metadata:
  name: dev-assistant
spec:
  image: ghcr.io/pai-platform/openclaw:latest
  runAsUser: 1000
  modelBindings:
    - gemini-flash
    - claude-sonnet
  providers:
    - github-writer
    - telegram-bot
  inbound:
    port: 3000
    allowCIDRs:
      - "0.0.0.0/0"
  volumes:
    - name: workspace
      mountPath: /home/node/workspace
      size: "5Gi"
  configFiles:
    - path: /home/node/.openclaw/openclaw.json
      content: |
        {
          "channels": ["web", "telegram"],
          "defaultModel": "gemini-flash"
        }
  env:
    - name: LOG_LEVEL
      value: "info"
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2"
      memory: "2Gi"
```

Field reference

Core

| Field | Type | Default | Description |
|---|---|---|---|
| `image` | string | | Container image to run |
| `replicas` | integer | `1` | Number of instances |
| `runAsUser` | integer | `65532` | UID to run the container as. Cannot be `0` (root is rejected) |
| `command` | string[] | | Override the container entrypoint + command |

Models

| Field | Type | Description |
|---|---|---|
| `modelBindings` | string[] | Names of Model resources the agent can use |

The agent's LLM calls are routed through the Pai gateway, which injects the real API key. The agent never sees credentials.
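
As a sketch, binding the two models used in the full example above (the names must match existing Model resources in your environment):

```yaml
spec:
  modelBindings:
    - gemini-flash     # the first binding is typically the default model
    - claude-sonnet    # each entry must match a Model resource's name
```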

Providers

| Field | Type | Description |
|---|---|---|
| `providers` | string[] | Names of Provider resources to attach |

Each provider gives the agent access to an external service (GitHub, Telegram, AWS, etc.) through the sidecar proxy.
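
For example, attaching the providers from the full example above (the names are illustrative and must match existing Provider resources):

```yaml
spec:
  providers:
    - github-writer    # GitHub access, mediated by the sidecar proxy
    - telegram-bot     # Telegram access; the agent never holds the credentials
```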

Inbound traffic

Expose a port for incoming requests (webhooks, web UI, APIs):

| Field | Type | Default | Description |
|---|---|---|---|
| `inbound.port` | integer | | Container port the agent listens on |
| `inbound.allowCIDRs` | string[] | `[]` (deny all) | CIDR blocks allowed to reach the port |

When inbound is set, Pai provisions a public load balancer and exposes the port at a stable IP. The agent's public URL is always available at status.url regardless of inbound config.
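
A minimal sketch that exposes a webhook port to a single network rather than the whole internet (the CIDR below is an illustrative documentation range; substitute the network that should reach your agent):

```yaml
spec:
  inbound:
    port: 3000
    allowCIDRs:
      - "203.0.113.0/24"   # only this network can reach the port
```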

Storage

Persistent volumes survive pod restarts:

```yaml
volumes:
  - name: workspace      # becomes the PVC name
    mountPath: /data     # where it's mounted inside the container
    size: "10Gi"
```

| Field | Type | Description |
|---|---|---|
| `volumes[].name` | string | Volume name (used as PVC identifier) |
| `volumes[].mountPath` | string | Mount path inside the container |
| `volumes[].size` | string | Storage size (e.g. `"1Gi"`, `"50Gi"`) |

Config files

Inject files into the container at startup via a ConfigMap. Files are seeded into volumes on first start with copy-no-clobber semantics: an existing file is never overwritten, so edits made inside the container survive pod restarts.

```yaml
configFiles:
  - path: /home/node/.myapp/config.json
    content: |
      { "setting": "value" }
```

Environment variables

```yaml
env:
  - name: LOG_LEVEL
    value: "debug"
  - name: API_ENDPOINT
    value: "https://api.example.com"
```

Personal env vars (git author name, email, etc.) can be set once with `pai config set-env` and are automatically injected into every agent you create.

Resources

```yaml
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "2"
    memory: "2Gi"
```

| Field | Description |
|---|---|
| `requests.cpu` | Guaranteed CPU (e.g. `"500m"` = 0.5 cores) |
| `requests.memory` | Guaranteed memory (e.g. `"512Mi"`) |
| `limits.cpu` | CPU ceiling; the agent is throttled if exceeded |
| `limits.memory` | Memory ceiling; the agent is OOM-killed if exceeded |

Token budgets

Override the default token limits for this agent:

```yaml
tokens:
  maxPerDay: 50000
  maxPerRequest: 8192
```

Expose directories

Serve files from the agent's filesystem at a public URL path:

```yaml
expose:
  - urlPath: /reports
    directory: /data/reports
```

`https://<hostname>.pairun.dev/reports` will serve the contents of `/data/reports` inside the container.

Autoscaling

Automatically scale replicas up and down based on traffic or queue depth:

```yaml
autoscaling:
  minReplicas: 1
  maxReplicas: 5
  scaleUpCooldownSeconds: 60
  scaleDownCooldownSeconds: 300
  metrics:
    - type: tokenRate
      targetValuePerReplica: 500   # scale up at 500 tokens/min per replica
```

| Field | Default | Description |
|---|---|---|
| `minReplicas` | `1` | Minimum replica count (`0` = pause the agent) |
| `maxReplicas` | | Maximum replica count (required) |
| `scaleUpCooldownSeconds` | `60` | Minimum seconds between scale-up events |
| `scaleDownCooldownSeconds` | `300` | Minimum seconds between scale-down events |
| `scaleDownStabilizationWindowSeconds` | `120` | Rolling window before allowing a scale-down |
| `pollIntervalSeconds` | `30` | How often metrics are evaluated |

Metric types:

| Type | Description |
|---|---|
| `tokenRate` | Tokens per minute observed by the gateway |
| `http` | Poll any URL returning a JSON number (JIRA, SQS, custom queues) |

```yaml
metrics:
  - type: http
    url: "https://your-queue.example.com/depth"
    jsonPath: "queue.depth"
    targetValuePerReplica: 10
```

View live scaling status:

```bash
pai scaling my-agent
pai scaling my-agent --watch
```

Browser automation (CDP relay)

Give the agent access to a Chrome browser running on your machine:

```yaml
cdpRelay:
  token: "my-secret-token"
```

Then run `pai relay my-agent --token my-secret-token` on your machine. See the Browser Automation guide for full setup instructions.

Filesystem confinement

Restrict which paths the agent can write to (Linux kernel 5.13+ only, best-effort):

```yaml
filesystem:
  readOnlyPaths:
    - /etc/myapp/config.json
  writablePaths:
    - /home/node/.npm
```

| Field | Description |
|---|---|
| `readOnlyPaths` | Paths the agent cannot write to (read is always allowed) |
| `writablePaths` | Extra writable paths beyond volumes and `/tmp` |

Operational runbook

Plain-English instructions for pai chat to use when querying agent status:

```yaml
ops:
  instructions: |
    Run `myapp status` to check health.
    Logs are at /data/logs/app.log.
```

Status fields

| Field | Description |
|---|---|
| `status.phase` | `Creating` · `Pending` · `Running` · `Failed` · `Terminating` |
| `status.url` | Public HTTPS URL (e.g. `https://a7x3k9.pairun.dev`) |
| `status.tokensToday` | Token consumption for the current day |
| `status.message` | Failure reason when `phase` is `Failed` |
| `status.currentReplicas` | Current replica count (when autoscaling is active) |
| `status.lastScaleTime` | Timestamp of the last autoscaler action |
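
As an illustration, the status of a healthy, autoscaled agent might look like this (all values below are made up):

```yaml
status:
  phase: Running
  url: https://a7x3k9.pairun.dev
  tokensToday: 12450
  currentReplicas: 2
  lastScaleTime: "2025-01-15T10:32:00Z"
```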