Architecture

Pai is built on Kubernetes and uses a two-tier proxy model to mediate all agent traffic. This page covers the core architecture, credential flow, DNS interception, TLS strategy, and security model.

High-level overview

Two-tier proxy model

All agent traffic is mediated by two proxy layers. Agents never call external services directly.

Tier 1: LLM Gateway

The LLM Gateway is a cluster-wide service that proxies all agent-to-LLM communication.

  • Exposes an OpenAI-compatible API to agents
  • Routes requests to the correct provider (Anthropic, OpenAI, Gemini, OpenRouter) based on the ModelBinding
  • Injects API keys from Kubernetes Secrets -- agents never see provider credentials
  • Enforces token budgets (per-day and per-request limits)
  • Tracks cost and usage per workload
  • Supports fallback chains -- if one provider is unavailable, routes to the next
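
The fallback behavior can be sketched as a simple loop. This is illustrative only; `ProviderUnavailable` and the `complete` method are assumed names, not the gateway's real API:

```python
# Sketch of a provider fallback chain: try each configured provider in
# order and return the first successful response. All names here are
# illustrative assumptions about the gateway's internals.

class ProviderUnavailable(Exception):
    pass

def complete_with_fallback(providers, request):
    last_err = None
    for provider in providers:
        try:
            return provider.complete(request)
        except ProviderUnavailable as err:
            last_err = err          # try the next provider in the chain
    raise last_err or ProviderUnavailable("no providers configured")
```

The key property is that a failure is only surfaced to the agent when every provider in the chain is unavailable.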

Tier 2: Service Binding Proxy (Sidecar)

Each agent pod includes a sidecar proxy that intercepts all calls to external services (GitHub, AWS, Azure, GCP, Telegram, Slack, etc.).

  • Runs as a sidecar container in every agent pod
  • Intercepts HTTPS traffic via DNS hijacking and iptables redirect
  • Routes requests to the correct provider plugin based on the Host header
  • Resolves actions (e.g., GET /repos/org/repo/pulls maps to pulls:read)
  • Evaluates policy (allow/deny lists) before forwarding
  • Injects credentials (Bearer tokens, SigV4 signatures, OAuth2 tokens)
  • Logs every request for audit

Request flow through both tiers:

Agent Container
|
|-- LLM calls --> LLM Gateway (port 8000) --> Anthropic / OpenAI / Gemini
|
|-- Tool/service calls --> Binding Proxy Sidecar (port 8081 HTTP, 8443 HTTPS)
|
|-- 1. Route: Host header --> PluginRegistry --> ProviderPlugin
|-- 2. HTTP rules: check policy.httpRules (method + path globs)
|-- 3. Resolve: plugin.resolve_action(method, path) --> "pulls:create"
|-- 4. Policy: check allow/deny action lists
|-- 5. Scope: check resource scope (repos, ARNs, projects)
|-- 6. Audit: log(workload, binding, action, resource, allowed)
|-- 7. Auth: plugin.inject_credentials(request)
|-- 8. Forward: proxy to real endpoint
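
Steps 3 and 4 of the pipeline above, action resolution and policy evaluation, can be sketched in a few lines. This is a simplified model; the `Policy` and `Decision` shapes and the glob semantics are assumptions, not Pai's actual classes:

```python
# Sketch of the sidecar's resolve + policy steps (3 and 4 above).
# Class and field names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Policy:
    allow: list = field(default_factory=list)   # e.g. ["pulls:*"]
    deny: list = field(default_factory=list)    # e.g. ["repos:delete"]

@dataclass
class Decision:
    allowed: bool
    action: str
    reason: str = ""

def matches(pattern: str, action: str) -> bool:
    """Glob-ish match: 'pulls:*' matches 'pulls:create'."""
    if pattern.endswith("*"):
        return action.startswith(pattern[:-1])
    return pattern == action

def evaluate(plugin, policy: Policy, method: str, path: str) -> Decision:
    # 3. Resolve the HTTP request to a provider action.
    action = plugin.resolve_action(method, path)   # e.g. "pulls:create"
    # 4. Deny list wins over allow list.
    if any(matches(p, action) for p in policy.deny):
        return Decision(False, action, "denied by policy.deny")
    if policy.allow and not any(matches(p, action) for p in policy.allow):
        return Decision(False, action, "not in policy.allow")
    return Decision(True, action)
```

A denied request never reaches steps 6–8, so no credentials are ever injected into it.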

Credential flow

Agents never hold real credentials. Key points:

  • Kubernetes Secrets hold the raw credentials (PATs, AWS keys, OAuth client secrets, GCP service account JSON)
  • The Controller injects secrets into the sidecar container only -- never into the agent container
  • OAuth2 providers (Azure, GCP) perform token exchange at startup and refresh tokens in the background
  • AWS SigV4 signing is computed per-request by the plugin (pure Python, no boto3)
  • The agent receives responses but never sees the auth headers
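
To make the per-request SigV4 point concrete, here is a minimal stdlib-only signing sketch for a GET with an empty body. It follows the standard SigV4 algorithm but is not Pai's actual plugin code, which must handle query strings, payloads, and session tokens:

```python
# Minimal AWS SigV4 signing sketch (stdlib only, GET, empty body).
# Illustrative -- the real plugin covers many more cases.
import datetime, hashlib, hmac

def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def sign_request(method, host, path, region, service,
                 access_key, secret_key, now=None):
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date = now.strftime("%Y%m%d")
    payload_hash = hashlib.sha256(b"").hexdigest()   # empty body
    canonical = "\n".join([
        method, path, "",                            # no query string
        f"host:{host}\nx-amz-date:{amz_date}\n",     # canonical headers
        "host;x-amz-date",                           # signed headers
        payload_hash,
    ])
    scope = f"{date}/{region}/{service}/aws4_request"
    to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope,
                         hashlib.sha256(canonical.encode()).hexdigest()])
    key = _hmac(_hmac(_hmac(_hmac(("AWS4" + secret_key).encode(), date),
                            region), service), "aws4_request")
    sig = hmac.new(key, to_sign.encode(), hashlib.sha256).hexdigest()
    return {
        "x-amz-date": amz_date,
        "Authorization": (f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
                          f"SignedHeaders=host;x-amz-date, Signature={sig}"),
    }
```

Because the signature binds the date, region, service, and request hash, the sidecar must compute it fresh for every request -- there is no long-lived token the agent could steal.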

DNS interception

Pai uses DNS hijacking to transparently route agent traffic through the sidecar proxy.

How it works:

  1. The Controller adds hostAliases entries to the pod spec, pointing intercepted hostnames (e.g., api.github.com, generativelanguage.googleapis.com) to 127.0.0.1.
  2. An init container sets up iptables NAT rules to redirect port 443 traffic to the sidecar's HTTPS port (8443).
  3. The sidecar terminates TLS using a self-signed CA certificate, inspects the request, applies policy, injects credentials, and forwards to the real endpoint.
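
The first two steps can be sketched as the artifacts the Controller and init container produce. The helper names and the exact iptables rule shape are assumptions consistent with the description above, not Pai's actual code:

```python
# Sketch of the interception plumbing described above. Hostnames and
# port numbers follow the text; helper names are illustrative.

INTERCEPTED_HOSTS = ["api.github.com", "generativelanguage.googleapis.com"]

def host_aliases(hosts):
    """hostAliases entries pointing intercepted hosts at loopback."""
    return [{"ip": "127.0.0.1", "hostnames": list(hosts)}]

def iptables_redirect_cmd(https_port=8443):
    """NAT rule the init container could run: loopback 443 -> sidecar."""
    return ("iptables -t nat -A OUTPUT -p tcp -d 127.0.0.1 "
            f"--dport 443 -j REDIRECT --to-ports {https_port}")
```

Together these make `https://api.github.com/...` resolve to 127.0.0.1 and land on the sidecar's port 8443 without any agent-side configuration.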

TLS strategy

Pai uses two layers of TLS:

External TLS (inbound traffic)

  • Agent workloads with inbound.port configured get a unique hostname on the pairun.dev domain (e.g., a7x3k9.pairun.dev)
  • Pai creates a cert-manager Certificate resource for the hostname
  • cert-manager obtains a trusted TLS certificate from Let's Encrypt via DNS-01 challenge
  • External clients connect over HTTPS with a valid, publicly trusted certificate

Internal TLS (interception)

  • The sidecar generates a self-signed CA certificate at startup via entrypoint.sh
  • The CA cert includes SANs for all intercepted hosts (e.g., api.github.com, s3.amazonaws.com)
  • The CA cert is shared with the agent container via an emptyDir volume
  • Standard environment variables are set to make the agent trust the CA:
    • REQUESTS_CA_BUNDLE (Python requests)
    • SSL_CERT_FILE (generic)
    • NODE_EXTRA_CA_CERTS (Node.js)
    • GIT_SSL_CAINFO (Git)
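
As a sketch, the controller could emit the four variables above as container env entries. The mount path is an assumed example, not Pai's actual path:

```python
# Sketch: env entries pointing common runtimes at the sidecar's CA.
# Variable names are from the list above; the path is an assumption.
CA_PATH = "/pai-ca/ca.crt"   # emptyDir mount shared with the sidecar

def trust_env(ca_path=CA_PATH):
    return [
        {"name": "REQUESTS_CA_BUNDLE", "value": ca_path},   # Python requests
        {"name": "SSL_CERT_FILE", "value": ca_path},        # generic OpenSSL
        {"name": "NODE_EXTRA_CA_CERTS", "value": ca_path},  # Node.js
        {"name": "GIT_SSL_CAINFO", "value": ca_path},       # Git
    ]
```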

DNS: auto-generated hostnames

When an agent workload declares an inbound.port, Pai:

  1. Generates a random 6-character hostname (e.g., a7x3k9)
  2. Creates a DNS record at a7x3k9.pairun.dev pointing to the load balancer
  3. Provisions a TLS certificate for the hostname
  4. Sets status.url on the AgentWorkload to https://a7x3k9.pairun.dev

The hostname is stable for the lifetime of the workload. Deleting and recreating the workload generates a new hostname.
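
Hostname generation can be sketched in a few lines. The lowercase-alphanumeric alphabet is inferred from the `a7x3k9` example; the actual generator may differ:

```python
# Sketch of generating a random 6-character hostname like "a7x3k9".
# The alphabet is an assumption based on the examples above.
import secrets, string

ALPHABET = string.ascii_lowercase + string.digits

def random_hostname(k=6):
    return "".join(secrets.choice(ALPHABET) for _ in range(k))

def workload_url(hostname):
    return f"https://{hostname}.pairun.dev"
```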

Security model

Pai enforces defense-in-depth at every layer:

Pod-level hardening

Every agent pod receives the following controls automatically — no AgentWorkload configuration required:

Control | Implementation | Effect
No service account token | automountServiceAccountToken: false | Pod cannot call the Kubernetes API
Non-root execution | runAsNonRoot: true, default UID 65532 | Agent cannot run as root
No privilege escalation | allowPrivilegeEscalation: false | No setuid/sudo escalation
Syscall filtering | seccompProfile: RuntimeDefault | Blocks ~300 dangerous syscalls (ptrace, mount, kexec, bpf, etc.)
Root UID rejection | Controller validates runAsUser != 0 at reconcile time | spec.runAsUser: 0 sets status.phase: Failed and skips deployment
Filesystem confinement | Landlock LSM via spec.filesystem (opt-in, kernel 5.13+) | Per-path write restrictions; agent cannot overwrite config or install cron jobs

Filesystem confinement (Landlock)

When spec.filesystem.readOnlyPaths is set, Pai injects a Landlock LSM enforcer via LD_PRELOAD — no entrypoint change required:

  1. An init container copies pai-landlock.so from the proxy image into a shared pai-sandbox emptyDir volume.
  2. The controller sets LD_PRELOAD=/pai-sandbox/pai-landlock.so and PAI_LANDLOCK_RW=<writable paths> on the agent container.
  3. The dynamic linker loads pai-landlock.so before main(). Its constructor reads PAI_LANDLOCK_RW, applies Landlock, and returns. All child processes inherit the restrictions.

Algorithm:

  • Handled mask = ALL write rights (WRITE_FILE, REMOVE_*, MAKE_*, TRUNCATE, REFER)
  • Declared writable paths (spec.volumes mountPaths + /tmp + spec.filesystem.writablePaths) receive explicit write grants
  • spec.filesystem.readOnlyPaths receive NO write grant → kernel silently denies writes
  • READ is outside the handled mask → reading is always unrestricted everywhere
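
The write-grant computation above can be modeled in a few lines. This is only a model of the algorithm: the real enforcer is the `pai-landlock.so` constructor operating on kernel Landlock rules, and the helper names here are invented:

```python
# Model of the write-grant algorithm: declared writable paths get
# explicit grants; readOnlyPaths get none, so writes under them are
# denied; reads are never restricted. Names are illustrative.

def write_grants(volume_mounts, writable_paths, read_only_paths):
    granted = set(volume_mounts) | {"/tmp"} | set(writable_paths)
    # readOnlyPaths must never end up in the grant set.
    return granted - set(read_only_paths)

def write_allowed(path, grants):
    """A write is allowed iff the path is under some granted prefix."""
    return any(path == g or path.startswith(g.rstrip("/") + "/")
               for g in grants)
```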

ABI probing: v3 (kernel 5.19+) → v2 (5.17+) → v1 (5.13+). If Landlock is unavailable, the wrapper logs a warning and execs the original command — startup is never blocked.

Two-tier enforcement (automatic fallback):

Tier | Mechanism | When active | Bypass resistance
1 | Landlock LSM | Kernel 5.13+ with Landlock built in | Kernel-level, resists direct syscalls
2 | libc interception | All other kernels (automatic fallback) | Covers libc callers (Node.js, Python, JVM); bypassed by direct syscalls

Requirements:

  • Tier 1: Linux kernel 5.13+ with Landlock compiled in (e.g. Ubuntu 22.04+, k3s; not GKE/COS)
  • Tier 2: Dynamically linked runtime — works on any kernel

Choosing paths to protect

Only list paths the application itself never writes to in readOnlyPaths. If the app writes to a file at startup (config persistence, atomic renames, etc.), protecting it will crash the agent. Good candidates: system directories like /etc/cron.d, /var/spool/cron, /etc/passwd.

Network isolation

  • A NetworkPolicy restricts agent egress to only the Pai gateway and the sidecar
  • The agent cannot reach the Kubernetes API, other pods, or the internet directly
  • All external access is mediated by the sidecar, which enforces per-binding policy
  • Inbound traffic (when configured) is restricted to specified CIDR blocks via loadBalancerSourceRanges and NetworkPolicy ipBlock rules

Credential isolation

  • Secrets are mounted only in the sidecar container
  • The agent container has no access to secrets via environment variables, volumes, or the Kubernetes API
  • OAuth2 tokens are held in memory by the sidecar and refreshed automatically
  • AWS SigV4 signatures are computed per-request and never exposed to the agent

Policy enforcement modes

Providers support two enforcement modes via spec.audit.enforcement:

  • enforce (default): Requests that violate policy.allow/policy.deny are blocked with HTTP 403.
  • audit: Violations are logged with an AUDIT (not blocked): prefix, but the request is forwarded. Use this when rolling out new policies to production agents -- validate the policy against live traffic, then switch to enforce once confident.

The enforcement mode is evaluated per binding independently, so you can audit one binding while enforcing others.

Policy hot-reload

Updating a Provider (policy rules, httpRules, audit settings) takes effect on running agents without a pod restart.

How it works:

  1. When a Provider changes, the controller's watcher triggers a reconcile for all affected AgentWorkload resources.
  2. The controller writes the updated provider specs to a ConfigMap named pai-{workload}-providers.
  3. The Kubernetes kubelet automatically syncs the ConfigMap to the pod's volume mount (typically within ~1 minute).
  4. The sidecar proxy's background watcher thread (polling every 30 seconds) detects the file's mtime change and reloads _bindings in-place under a threading lock.
  5. The next request through that binding uses the updated policy.
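
Step 4 above can be sketched as an mtime-polling watcher. The real sidecar swaps its `_bindings` structure; this simplified version just re-reads a JSON file, and the class name is invented:

```python
# Sketch of the mtime-polling reload loop (step 4 above).
# Class and attribute names are illustrative.
import json, os, threading, time

class BindingWatcher:
    def __init__(self, path, interval=30):
        self.path, self.interval = path, interval
        self._lock = threading.Lock()
        self._mtime = 0.0
        self.bindings = {}
        self.reload_if_changed()

    def reload_if_changed(self):
        mtime = os.stat(self.path).st_mtime
        if mtime != self._mtime:
            with open(self.path) as f:
                data = json.load(f)
            with self._lock:          # swap in-place under the lock
                self.bindings = data
                self._mtime = mtime

    def run(self):                    # runs as a daemon thread
        while True:
            time.sleep(self.interval)
            self.reload_if_changed()
```

Polling mtime rather than re-parsing on every request keeps the hot path cheap; the lock ensures a request never observes a half-swapped binding set.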

What triggers a reload vs. a restart:

Change | Hot-reload | Restart required
policy.allow / policy.deny | ✅ |
policy.httpRules | ✅ |
audit.logRequests / audit.enforcement | ✅ |
scope.* | ✅ |
auth.secretRef (new credentials) | | ✅ (secrets are env vars)
spec.providers list change | | ✅ (changes sidecar config)

Provider plugin system

External service integrations are implemented as self-contained plugins in platform/proxy/providers/. Each plugin implements the ProviderPlugin abstract base class.

Method | Purpose
provider_name() | Canonical name (e.g., "github")
default_hosts() | Hostnames to intercept (e.g., ["api.github.com"])
host_patterns() | Wildcard patterns (e.g., ["*.amazonaws.com"])
resolve_action() | Map HTTP request to a provider-specific action
inject_credentials() | Add auth headers before forwarding
start() / stop() | Lifecycle hooks (token exchange, cleanup)
refresh_token() | Background token refresh (Azure, GCP)
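
The interface can be sketched as an abstract base class plus a minimal static-token plugin. Method names follow the table above; signatures, defaults, and the example plugin are assumptions about Pai's internals:

```python
# Sketch of the ProviderPlugin interface and a minimal PAT-style
# plugin. Method names are from the table; the rest is illustrative.
from abc import ABC, abstractmethod

class ProviderPlugin(ABC):
    @abstractmethod
    def provider_name(self) -> str: ...
    @abstractmethod
    def default_hosts(self) -> list[str]: ...
    def host_patterns(self) -> list[str]:
        return []                       # no wildcard hosts by default
    @abstractmethod
    def resolve_action(self, method: str, path: str) -> str: ...
    @abstractmethod
    def inject_credentials(self, headers: dict) -> dict: ...
    def start(self): pass               # lifecycle hooks
    def stop(self): pass

class GitHubLikePlugin(ProviderPlugin):
    """Illustrative static-token plugin in the spirit of the GitHub row."""
    def __init__(self, token):
        self._token = token
    def provider_name(self):
        return "github"
    def default_hosts(self):
        return ["api.github.com"]
    def resolve_action(self, method, path):
        if "/pulls" in path:
            return "pulls:read" if method == "GET" else "pulls:create"
        return "unknown"
    def inject_credentials(self, headers):
        return {**headers, "Authorization": f"Bearer {self._token}"}
```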

Supported providers:

Provider | Auth Method | Token Refresh
GitHub | PAT / GitHub App | No (static)
AWS | SigV4 (per-request HMAC signing) | No (signed per-request)
Azure | OAuth2 client credentials (Entra ID) | Yes (every ~50 min)
GCP | Service account JWT to OAuth2 | Yes (every ~45 min)
Telegram | Bot token | No (static)
Slack | OAuth2 / API key | Varies

Adding a new provider requires creating a single file in platform/proxy/providers/ -- the registry auto-discovers it at startup.