Architecture
Pai is built on Kubernetes and uses a two-tier proxy model to mediate all agent traffic. This page covers the core architecture, credential flow, DNS interception, TLS strategy, and security model.
High-level overview
Two-tier proxy model
All agent traffic is mediated by two proxy layers. Agents never call external services directly.
Tier 1: LLM Gateway
The LLM Gateway is a cluster-wide service that proxies all agent-to-LLM communication.
- Exposes an OpenAI-compatible API to agents
- Routes requests to the correct provider (Anthropic, OpenAI, Gemini, OpenRouter) based on the ModelBinding
- Injects API keys from Kubernetes Secrets -- agents never see provider credentials
- Enforces token budgets (per-day and per-request limits)
- Tracks cost and usage per workload
- Supports fallback chains -- if one provider is unavailable, routes to the next
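The budget and fallback behavior above can be sketched in a few lines. This is a minimal illustration only — `TokenBudget`, `pick_provider`, and their signatures are hypothetical, not the gateway's actual API:

```python
# Illustrative sketch of gateway-side budget enforcement and provider
# fallback. All names here are stand-ins, not the real Pai gateway code.

class TokenBudget:
    def __init__(self, per_day: int, per_request: int):
        self.per_day = per_day
        self.per_request = per_request
        self.used_today = 0

    def allow(self, requested_tokens: int) -> bool:
        # Reject requests that exceed either the per-request cap or
        # the remaining daily allowance.
        if requested_tokens > self.per_request:
            return False
        return self.used_today + requested_tokens <= self.per_day


def pick_provider(chain, is_available):
    # Walk the fallback chain and return the first provider that is up.
    for provider in chain:
        if is_available(provider):
            return provider
    raise RuntimeError("no provider in the fallback chain is available")
```

For example, with `TokenBudget(per_day=100_000, per_request=4_096)`, a 2,000-token request passes while a 5,000-token request is rejected by the per-request cap.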
Tier 2: Service Binding Proxy (Sidecar)
Each agent pod includes a sidecar proxy that intercepts all calls to external services (GitHub, AWS, Azure, GCP, Telegram, Slack, etc.).
- Runs as a sidecar container in every agent pod
- Intercepts HTTPS traffic via DNS hijacking and iptables redirect
- Routes requests to the correct provider plugin based on the `Host` header
- Resolves actions (e.g., `GET /repos/org/repo/pulls` maps to `pulls:read`)
- Evaluates policy (allow/deny lists) before forwarding
- Injects credentials (Bearer tokens, SigV4 signatures, OAuth2 tokens)
- Logs every request for audit
```
Agent Container
  |
  |-- LLM calls --> PAI Gateway (port 8000) --> Anthropic / OpenAI / Gemini
  |
  |-- Tool/service calls --> Binding Proxy Sidecar (port 8081 HTTP, 8443 HTTPS)
        |
        |-- 1. Route: Host header --> PluginRegistry --> ProviderPlugin
        |-- 2. HTTP rules: check policy.httpRules (method + path globs)
        |-- 3. Resolve: plugin.resolve_action(method, path) --> "pulls:create"
        |-- 4. Policy: check allow/deny action lists
        |-- 5. Scope: check resource scope (repos, ARNs, projects)
        |-- 6. Audit: log(workload, binding, action, resource, allowed)
        |-- 7. Auth: plugin.inject_credentials(request)
        |-- 8. Forward: proxy to real endpoint
```
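The eight steps in the diagram can be condensed into a single request-handling function. This is a sketch under assumed interfaces — the registry, policy, and plugin objects here are simplified stand-ins for the sidecar's real classes:

```python
# Illustrative sketch of the sidecar's request pipeline. The plugin,
# registry, and policy objects are stand-ins, not Pai's real classes.

def handle_request(request, registry, policy, audit_log):
    # 1. Route: pick the plugin responsible for this Host header.
    plugin = registry.lookup(request["host"])

    # 2. HTTP rules: coarse method + path-glob filtering.
    if not policy.http_rules_allow(request["method"], request["path"]):
        return {"status": 403, "reason": "httpRules"}

    # 3. Resolve the request to a provider-specific action name.
    action = plugin.resolve_action(request["method"], request["path"])

    # 4-5. Policy allow/deny lists and resource-scope checks.
    allowed = policy.action_allowed(action) and policy.scope_allows(request)

    # 6. Audit: every request is logged, allowed or not.
    audit_log.append({"action": action, "allowed": allowed})
    if not allowed:
        return {"status": 403, "reason": "policy"}

    # 7-8. Inject credentials, then forward to the real endpoint.
    plugin.inject_credentials(request)
    return plugin.forward(request)
```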
Credential flow
Agents never hold real credentials. Key points:
- Kubernetes Secrets hold the raw credentials (PATs, AWS keys, OAuth client secrets, GCP service account JSON)
- The Controller injects secrets into the sidecar container only -- never into the agent container
- OAuth2 providers (Azure, GCP) perform token exchange at startup and refresh tokens in the background
- AWS SigV4 signing is computed per-request by the plugin (pure Python, no boto3)
- The agent receives responses but never sees the auth headers
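To illustrate the pure-Python SigV4 point above, this is the standard AWS signing-key derivation chain (the publicly documented SigV4 algorithm, not Pai's actual plugin code):

```python
import hashlib
import hmac

def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    # Standard SigV4 key derivation: the long-lived secret never leaves
    # the sidecar; only a derived, request-scoped signature travels in
    # the Authorization header.
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```

The derived key is scoped to a single date, region, and service, which is why the sidecar can sign per-request without ever exposing the underlying secret to the agent.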
DNS interception
Pai uses DNS hijacking to transparently route agent traffic through the sidecar proxy.
How it works:
- The Controller adds `hostAliases` entries to the pod spec, pointing intercepted hostnames (e.g., `api.github.com`, `generativelanguage.googleapis.com`) to `127.0.0.1`.
- An init container sets up `iptables` NAT rules to redirect port 443 traffic to the sidecar's HTTPS port (8443).
- The sidecar terminates TLS using a self-signed CA certificate, inspects the request, applies policy, injects credentials, and forwards to the real endpoint.
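A sketch of what those two steps produce. `hostAliases` is a real Kubernetes pod-spec field; `build_host_aliases` and the exact rule shown are illustrative assumptions:

```python
# Sketch of the pod-spec mutation for DNS interception.
# build_host_aliases is a hypothetical helper, not Pai's controller code.

def build_host_aliases(intercepted_hosts):
    # One hostAliases entry pointing every intercepted hostname at the
    # sidecar on loopback.
    return [{"ip": "127.0.0.1", "hostnames": sorted(intercepted_hosts)}]

# Simplified form of the NAT rule the init container would install:
# redirect outbound HTTPS to the sidecar's interception port. A real rule
# must also exempt the sidecar's own UID to avoid a redirect loop.
IPTABLES_RULE = (
    "iptables -t nat -A OUTPUT -p tcp --dport 443 "
    "-j REDIRECT --to-port 8443"
)
```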
TLS strategy
Pai uses two layers of TLS:
External TLS (inbound traffic)
- Agent workloads with `inbound.port` configured get a unique hostname on the `pairun.dev` domain (e.g., `a7x3k9.pairun.dev`)
- Pai creates a cert-manager `Certificate` resource for the hostname
- cert-manager obtains a trusted TLS certificate from Let's Encrypt via a DNS-01 challenge
- External clients connect over HTTPS with a valid, publicly trusted certificate
Internal TLS (interception)
- The sidecar generates a self-signed CA certificate at startup via `entrypoint.sh`
- The CA cert includes SANs for all intercepted hosts (e.g., `api.github.com`, `s3.amazonaws.com`)
- The CA cert is shared with the agent container via an `emptyDir` volume
- Standard environment variables are set to make the agent trust the CA:
  - `REQUESTS_CA_BUNDLE` (Python requests)
  - `SSL_CERT_FILE` (generic)
  - `NODE_EXTRA_CA_CERTS` (Node.js)
  - `GIT_SSL_CAINFO` (Git)
DNS: auto-generated hostnames
When an agent workload declares an `inbound.port`, Pai:
- Generates a random 6-character hostname (e.g., `a7x3k9`)
- Creates a DNS record at `a7x3k9.pairun.dev` pointing to the load balancer
- Provisions a TLS certificate for the hostname
- Sets `status.url` on the AgentWorkload to `https://a7x3k9.pairun.dev`
The hostname is stable for the lifetime of the workload. Deleting and recreating the workload generates a new hostname.
Security model
Pai enforces defense-in-depth at every layer:
Pod-level hardening
Every agent pod receives the following controls automatically — no AgentWorkload configuration required:
| Control | Implementation | Effect |
|---|---|---|
| No service account token | `automountServiceAccountToken: false` | Pod cannot call the Kubernetes API |
| Non-root execution | `runAsNonRoot: true`, default UID 65532 | Agent cannot run as root |
| No privilege escalation | `allowPrivilegeEscalation: false` | No setuid/sudo escalation |
| Syscall filtering | `seccompProfile: RuntimeDefault` | Blocks ~300 dangerous syscalls (ptrace, mount, kexec, bpf, etc.) |
| Root UID rejection | Controller validates `runAsUser != 0` at reconcile time | `spec.runAsUser: 0` sets `status.phase: Failed` and skips deployment |
| Filesystem confinement | Landlock LSM via `spec.filesystem` (opt-in, kernel 5.13+) | Per-path write restrictions; agent cannot overwrite config or install cron jobs |
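The table corresponds roughly to the securityContext below (standard Kubernetes field names, rendered as a Python dict for illustration), plus a sketch of the controller's reconcile-time root check. The function name and return values are hypothetical:

```python
# Hardened security context applied to every agent pod. Field names are
# standard Kubernetes; the specific defaults are the ones described above.
AGENT_SECURITY_CONTEXT = {
    "runAsNonRoot": True,
    "runAsUser": 65532,
    "allowPrivilegeEscalation": False,
    "seccompProfile": {"type": "RuntimeDefault"},
}

def validate_run_as_user(spec: dict) -> tuple[str, str]:
    # Sketch of the reconcile-time check: a workload that asks for UID 0
    # is marked Failed instead of being deployed. Names are illustrative.
    if spec.get("runAsUser") == 0:
        return ("Failed", "runAsUser: 0 is not permitted")
    return ("Deploying", "")
```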
Filesystem confinement (Landlock)
When `spec.filesystem.readOnlyPaths` is set, Pai injects a Landlock LSM enforcer via `LD_PRELOAD`; no entrypoint change is required:
- An init container copies `pai-landlock.so` from the proxy image into a shared `pai-sandbox` emptyDir volume.
- The controller sets `LD_PRELOAD=/pai-sandbox/pai-landlock.so` and `PAI_LANDLOCK_RW=<writable paths>` on the agent container.
- The dynamic linker loads `pai-landlock.so` before `main()`. Its constructor reads `PAI_LANDLOCK_RW`, applies the Landlock ruleset, and returns. All child processes inherit the restrictions.
Algorithm:
- Handled mask = all write rights (`WRITE_FILE`, `REMOVE_*`, `MAKE_*`, `TRUNCATE`, `REFER`)
- Declared writable paths (`spec.volumes` mountPaths + `/tmp` + `spec.filesystem.writablePaths`) receive explicit write grants
- `spec.filesystem.readOnlyPaths` receive NO write grant → kernel silently denies writes
- READ is outside the handled mask → reading is always unrestricted everywhere
ABI probing: v3 (kernel 5.19+) → v2 (5.17+) → v1 (5.13+). If Landlock is unavailable, the wrapper logs a warning and execs the original command — startup is never blocked.
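The rule computation and ABI probe can be sketched as follows. Both helpers are hypothetical mirrors of the behavior described above; the real enforcer is native code in `pai-landlock.so`:

```python
# Sketch of how writable vs read-only paths map to Landlock write grants,
# and of newest-first ABI probing. Helper names are illustrative.

def compute_write_grants(volume_mounts, writable_paths, read_only_paths):
    # Writable set: declared volume mountPaths, /tmp, and explicit
    # writablePaths. readOnlyPaths simply receive no grant; because the
    # handled mask covers all write rights, the kernel denies writes there.
    writable = set(volume_mounts) | {"/tmp"} | set(writable_paths)
    return {path: "write" for path in writable if path not in set(read_only_paths)}

def probe_landlock_abi(kernel_supports):
    # Probe newest-first: v3 (5.19+), then v2 (5.17+), then v1 (5.13+).
    # Returns 0 when Landlock is unavailable, in which case enforcement
    # falls back to libc interception and startup proceeds with a warning.
    for abi in (3, 2, 1):
        if kernel_supports(abi):
            return abi
    return 0
```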
Two-tier enforcement (automatic fallback):
| Tier | Mechanism | When active | Bypass resistance |
|---|---|---|---|
| 1 | Landlock LSM | Kernel 5.13+ with Landlock built in | Kernel-level, resists direct syscalls |
| 2 | libc interception | All other kernels (automatic fallback) | Covers libc callers (Node.js, Python, JVM); bypassed by direct syscalls |
Requirements:
- Tier 1: Linux kernel 5.13+ with Landlock compiled in (e.g. Ubuntu 22.04+, k3s; not GKE/COS)
- Tier 2: Dynamically linked runtime — works on any kernel
List only paths in `readOnlyPaths` that the application itself never writes to. If the app writes to a file at startup (config persistence, atomic renames, etc.), protecting it will crash the agent. Good candidates are system paths such as `/etc/cron.d`, `/var/spool/cron`, and `/etc/passwd`.
Network isolation
- A NetworkPolicy restricts agent egress to only the Pai gateway and the sidecar
- The agent cannot reach the Kubernetes API, other pods, or the internet directly
- All external access is mediated by the sidecar, which enforces per-binding policy
- Inbound traffic (when configured) is restricted to specified CIDR blocks via `loadBalancerSourceRanges` and NetworkPolicy `ipBlock` rules
Credential isolation
- Secrets are mounted only in the sidecar container
- The agent container has no access to secrets via environment variables, volumes, or the Kubernetes API
- OAuth2 tokens are held in memory by the sidecar and refreshed automatically
- AWS SigV4 signatures are computed per-request and never exposed to the agent
Policy enforcement modes
Providers support two enforcement modes via `spec.audit.enforcement`:
- `enforce` (default): Requests that violate `policy.allow`/`policy.deny` are blocked with HTTP 403.
- `audit`: Violations are logged with an `AUDIT (not blocked):` prefix, but the request is forwarded. Use this when rolling out new policies to production agents: validate the policy against live traffic, then switch to `enforce` once confident.
The enforcement mode is evaluated per binding independently, so you can audit one binding while enforcing others.
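The per-binding decision reduces to a few lines. This sketch is illustrative; `enforce` and `audit` are the real `spec.audit.enforcement` values, while the function and status codes are simplified:

```python
# Sketch of the per-binding enforcement decision. "enforce" and "audit"
# match the documented modes; everything else here is illustrative.

def apply_enforcement(mode: str, violates_policy: bool, log: list) -> int:
    if violates_policy and mode == "enforce":
        return 403                      # blocked
    if violates_policy and mode == "audit":
        log.append("AUDIT (not blocked): policy violation")
    return 200                          # forwarded
```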
Policy hot-reload
Updating a `Provider` (policy rules, `httpRules`, audit settings) takes effect on running agents without a pod restart.
How it works:
- When a
Providerchanges, the controller's watcher triggers a reconcile for all affectedAgentWorkloadresources. - The controller writes the updated provider specs to a
ConfigMapnamedpai-{workload}-providers. - The Kubernetes kubelet automatically syncs the ConfigMap to the pod's volume mount (typically within ~1 minute).
- The sidecar proxy's background watcher thread (polling every 30 seconds) detects the file's mtime change and reloads
_bindingsin-place under a threading lock. - The next request through that binding uses the updated policy.
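The watcher's reload step can be sketched like this, reduced to a single synchronous check so it is easy to follow; the class, attribute names, and callbacks are illustrative stand-ins for the sidecar's internals:

```python
import threading

# Sketch of the sidecar's mtime-based hot-reload, one polling iteration
# at a time. getmtime and load_bindings are injected so the sketch stays
# self-contained (in the real sidecar these would be os.path.getmtime and
# a ConfigMap-file parser).

class BindingWatcher:
    def __init__(self, getmtime, load_bindings, path):
        self._getmtime = getmtime
        self._load = load_bindings
        self._path = path
        self._lock = threading.Lock()
        self._mtime = getmtime(path)
        self.bindings = load_bindings(path)

    def check_once(self) -> bool:
        # One iteration of the 30-second polling loop: reload in place,
        # under the lock, only when the file's mtime has changed.
        mtime = self._getmtime(self._path)
        if mtime == self._mtime:
            return False
        with self._lock:
            self.bindings = self._load(self._path)
            self._mtime = mtime
        return True
```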
What triggers a reload vs. a restart:
| Change | Hot-reload | Restart required |
|---|---|---|
| `policy.allow` / `policy.deny` | ✅ | — |
| `policy.httpRules` | ✅ | — |
| `audit.logRequests` / `audit.enforcement` | ✅ | — |
| `scope.*` | ✅ | — |
| `auth.secretRef` (new credentials) | — | ✅ (secrets are env vars) |
| `spec.providers` list change | — | ✅ (changes sidecar config) |
Provider plugin system
External service integrations are implemented as self-contained plugins in `platform/proxy/providers/`. Each plugin implements the `ProviderPlugin` abstract base class.
| Method | Purpose |
|---|---|
| `provider_name()` | Canonical name (e.g., `"github"`) |
| `default_hosts()` | Hostnames to intercept (e.g., `["api.github.com"]`) |
| `host_patterns()` | Wildcard patterns (e.g., `["*.amazonaws.com"]`) |
| `resolve_action()` | Map an HTTP request to a provider-specific action |
| `inject_credentials()` | Add auth headers before forwarding |
| `start()` / `stop()` | Lifecycle hooks (token exchange, cleanup) |
| `refresh_token()` | Background token refresh (Azure, GCP) |
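A minimal plugin against that interface might look like this. The method names follow the table; the base-class shape and the toy GitHub action mapping are simplified assumptions, not the actual code in `platform/proxy/providers/`:

```python
from abc import ABC, abstractmethod

# Simplified rendering of the ProviderPlugin contract; the real base
# class has more surface (host_patterns, start/stop, refresh_token).

class ProviderPlugin(ABC):
    @abstractmethod
    def provider_name(self) -> str: ...
    @abstractmethod
    def default_hosts(self) -> list: ...
    @abstractmethod
    def resolve_action(self, method: str, path: str) -> str: ...
    @abstractmethod
    def inject_credentials(self, headers: dict) -> None: ...

class GitHubPlugin(ProviderPlugin):
    def __init__(self, token: str):
        self._token = token            # injected from a Kubernetes Secret

    def provider_name(self) -> str:
        return "github"

    def default_hosts(self) -> list:
        return ["api.github.com"]

    def resolve_action(self, method: str, path: str) -> str:
        # Toy mapping for illustration; a real plugin covers the full API.
        if path.endswith("/pulls"):
            return "pulls:read" if method == "GET" else "pulls:create"
        return "unknown"

    def inject_credentials(self, headers: dict) -> None:
        headers["Authorization"] = f"Bearer {self._token}"
```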
Supported providers:
| Provider | Auth Method | Token Refresh |
|---|---|---|
| GitHub | PAT / GitHub App | No (static) |
| AWS | SigV4 (per-request HMAC signing) | No (signed per-request) |
| Azure | OAuth2 client credentials (Entra ID) | Yes (every ~50 min) |
| GCP | Service account JWT to OAuth2 | Yes (every ~45 min) |
| Telegram | Bot token | No (static) |
| Slack | OAuth2 / API key | Varies |
Adding a new provider requires creating a single file in `platform/proxy/providers/` -- the registry auto-discovers it at startup.