Provider Gateway

The Pai Provider Gateway extends the LLM Gateway pattern to Providers — GitHub, AWS, Azure, GCP, Telegram, generic HTTP, and MCP servers. A developer laptop (or any external client) reaches cluster-managed upstream services through a single URL, authenticated by a narrow, per-resource AccessKey. The laptop never holds the real upstream credential.

How it works

Developer laptop             Pai cluster                       Upstream
+---------------+   HTTPS    +-------------------+    HTTPS    +---------------+
| Claude Code   | ---------> | Pai Gateway       | ----------> | api.github    |
| gh / aws CLI  |  pak_...   | - auth check      |  real PAT   | api.notion    |
| custom code   | AccessKey  | - IP/tool/HTTP    |  / SigV4 /  | S3 / GCS      |
+---------------+            |   restrictions    |  OAuth /    | ...           |
                             | - Provider pol.   |  mcp Bearer +---------------+
                             | - MCP tool pol.   |
                             | - audit log       |
                             +-------------------+

  1. Admin creates a Provider with externalAccess.enabled: true.
  2. Admin (or the developer, via CLI) creates an AccessKey bound to that Provider (and optionally to more Providers and ModelProviders in the same namespace).
  3. Developer runs pai gateway provider <name> or pai gateway mcp <name> to emit env vars or a .mcp.json snippet with a pak_... bearer.
  4. Every request goes through the gateway, is validated against the AccessKey's restrictions and the Provider's policy, has the real upstream credential injected, and is forwarded.

Why AccessKey is a separate token type

The gateway's /ext/provider/*, /ext/mcp/*, and /ext/v1/* endpoints accept pak_... (AccessKey) bearers only. The API server's /v1/* endpoints accept pai_tok_... (admin CLI token) bearers only. This is enforced structurally at the header layer, before any resource lookup — a token carries information about WHICH surface it may be used on, not just whether it is valid. A leaked laptop AccessKey can never be turned into an API-server admin token by accident or misconfiguration.

| Surface | Accepts | Rejects |
|---|---|---|
| /v1/* (API server) | pai_tok_... | pak_... |
| /ext/v1/* (LLM gateway) | pak_... | pai_tok_... |
| /ext/provider/*, /ext/mcp/* | pak_... | pai_tok_... |
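
The table can be read as a prefix check that runs before any resource lookup. The following is a minimal sketch of that idea, not the gateway's actual middleware: the function name and the routing table are hypothetical, and the sketch does not model any client-specific wrapping of the key (such as the sk-ant-... form emitted for ANTHROPIC_API_KEY).

```python
# Hypothetical sketch of cross-surface rejection by token prefix.
# (path prefix, token prefix that surface accepts)
SURFACE_PREFIXES = [
    ("/ext/provider/", "pak_"),
    ("/ext/mcp/", "pak_"),
    ("/ext/v1/", "pak_"),
    ("/v1/", "pai_tok_"),
]


def token_allowed_on_surface(path: str, authorization: str) -> bool:
    """Return True only if the bearer's prefix matches the surface's token type."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    for path_prefix, token_prefix in SURFACE_PREFIXES:
        if path.startswith(path_prefix):
            return token.startswith(token_prefix)
    return False  # unknown surface: reject
```

Because the decision depends only on the path and the token prefix, it can run before the gateway looks up any AccessKey or Provider.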

Setup

1. Create a Provider with external access

apiVersion: pai.io/v1
kind: Provider
metadata:
  name: gh-readonly
  namespace: team-a
spec:
  type: github
  auth:
    type: pat
    secretRef: github-readonly-pat
  policy:
    allow: ["contents:read", "pulls:read", "issues:read"]
    scope:
      repositories: ["org/repo-a", "org/repo-b"]
  externalAccess:
    enabled: true
    maxRequestsPerDay: 5000

For MCP servers:

apiVersion: pai.io/v1
kind: Provider
metadata:
  name: notion-docs
  namespace: team-a
spec:
  type: mcp
  mcp:
    transport: sse
    url: https://mcp.notion.com/sse
  auth:
    type: api-key
    secretRef: notion-mcp-token
  policy:
    mcp:
      allowedTools: [search_pages, fetch_document]
      deniedTools: [delete_page]
  externalAccess:
    enabled: true

2. Mint an AccessKey

pai access-key create --name alice-laptop \
  -n team-a \
  --provider gh-readonly \
  --provider notion-docs \
  --model-provider anthropic \
  --allowed-cidr 10.0.0.0/8 \
  --allowed-http-method GET \
  --allowed-mcp-tool search_pages \
  --allowed-model claude-haiku-4-5

The CLI prints the raw pak_... once. Store it securely — it cannot be retrieved later. Rotate with pai access-key rotate alice-laptop.

3. Emit client configuration

HTTP Provider (GitHub, etc.):

eval $(pai gateway provider gh-readonly --key alice-laptop)
# export PAI_PROVIDER_URL=https://gateway.pairun.dev/ext/provider/gh-readonly
# export PAI_PROVIDER_TOKEN=pak_...
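
Custom code can pick up the same two variables. The sketch below builds a request with Python's standard library. It assumes the upstream path is appended to PAI_PROVIDER_URL and that the key is sent as an Authorization: Bearer header; both are assumptions about the gateway's URL scheme, and build_gateway_request is a hypothetical helper.

```python
import os
import urllib.request


def build_gateway_request(upstream_path: str) -> urllib.request.Request:
    """Build a GET request that goes through the gateway instead of the upstream API."""
    base = os.environ["PAI_PROVIDER_URL"].rstrip("/")
    token = os.environ["PAI_PROVIDER_TOKEN"]  # the pak_... AccessKey, not the real PAT
    return urllib.request.Request(
        f"{base}/{upstream_path.lstrip('/')}",  # assumption: upstream path is appended
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the result to urllib.request.urlopen() would send it; the real upstream credential is never present on the client side.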

MCP Provider (for Claude Code's .mcp.json):

pai gateway mcp notion-docs --key alice-laptop > ~/.mcp-notion.json

LLM Provider (existing flow, now on AccessKey):

eval $(pai gateway env)
# export ANTHROPIC_BASE_URL=https://gateway.pairun.dev/ext/v1
# export ANTHROPIC_API_KEY=sk-ant-api03-pak-...-AA

AccessKey restrictions — what can a key carry?

Each restriction is optional and only narrows (never loosens) what the bound Provider / ModelProvider already permits. The gateway ANDs both; admission rejects keys whose allowedModels or allowedMcpTools would grant more than the bound resource's own policy.

| spec.restrictions.* | Applies to | Semantics |
|---|---|---|
| allowedCIDRs | every surface | Client IP must match at least one CIDR. See Client IP extraction below. |
| allowedHttpMethods | /ext/provider/* | Request method must be in the list (case-insensitive). |
| allowedHttpPaths | /ext/provider/* | fnmatch glob allowlist. |
| deniedHttpPaths | /ext/provider/* | Denies, evaluated before allowedHttpPaths. |
| allowedMcpTools | /ext/mcp/* tools/call | Tool name must match. |
| deniedMcpTools | /ext/mcp/* tools/call | Deny, evaluated first. |
| allowedModels | /ext/v1/* | Request body model must match. |
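
As an illustration of the evaluation order for the /ext/provider/* rows, here is a minimal sketch using Python's fnmatch module. It assumes an absent restriction means "no narrowing", and that Python's glob semantics (where * also matches /) are close enough to the gateway's; the function name and the example patterns are hypothetical.

```python
from fnmatch import fnmatchcase


def http_request_allowed(method: str, path: str, restrictions: dict) -> bool:
    """Apply method, deny-path, then allow-path checks, in that order."""
    allowed_methods = restrictions.get("allowedHttpMethods")
    if allowed_methods is not None:
        # Method comparison is case-insensitive.
        if method.upper() not in {m.upper() for m in allowed_methods}:
            return False
    # Denied paths are evaluated before the allowlist.
    if any(fnmatchcase(path, pat) for pat in restrictions.get("deniedHttpPaths", [])):
        return False
    allowed_paths = restrictions.get("allowedHttpPaths")
    if allowed_paths is not None:
        if not any(fnmatchcase(path, pat) for pat in allowed_paths):
            return False
    return True


# Hypothetical restrictions for a read-only laptop key.
example = {
    "allowedHttpMethods": ["GET"],
    "allowedHttpPaths": ["/repos/org/*"],
    "deniedHttpPaths": ["/repos/org/*/secrets*"],
}
```

With these values a GET to /repos/org/repo-a/pulls passes, while a GET to /repos/org/repo-a/secrets is rejected by the deny pattern even though it also matches the allowlist.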

Per-key daily hard caps:

| spec.limits.* | Applies to |
|---|---|
| maxRequestsPerDay | /ext/provider/* + /ext/mcp/* |
| maxTokensPerDay | /ext/v1/* |

Client IP extraction (allowedCIDRs)

The gateway determines a caller's IP from X-Forwarded-For only when the direct peer (the pod's RemoteAddr) is in the configured trustedProxies CIDR list. If it isn't, the direct peer is used. Because the header is ignored for untrusted peers, they cannot spoof X-Forwarded-For to bypass allowedCIDRs.

Set trustedProxies in values.yaml:

gateway:
  externalProviderGateway:
    enabled: true
    trustedProxies:
      - 10.64.0.0/10 # e.g. the cluster's LoadBalancer CIDR

If trustedProxies is empty (the default), the gateway always uses the peer address, which is the safest setting.
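
The extraction rule can be sketched with Python's ipaddress module. The source does not say which X-Forwarded-For entry the gateway picks when the header lists several hops; this sketch assumes the rightmost entry that is not itself a trusted proxy, which is a common convention rather than confirmed behavior. Both function names are hypothetical.

```python
import ipaddress


def client_ip(peer_addr: str, x_forwarded_for: str, trusted_proxies: list) -> str:
    """Return the address that allowedCIDRs should be checked against."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(addr: str) -> bool:
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in trusted)

    # Header is honored only when the direct peer is a trusted proxy.
    if not x_forwarded_for or not is_trusted(peer_addr):
        return peer_addr
    # Assumption: walk hops right to left, skipping trusted proxies.
    for hop in reversed([h.strip() for h in x_forwarded_for.split(",")]):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            return peer_addr  # malformed entry: fall back to the peer
    return peer_addr


def cidr_allowed(ip: str, allowed_cidrs: list) -> bool:
    """True if ip falls inside at least one CIDR in the list."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

With an empty trustedProxies list, is_trusted is always false, so the header is never consulted and the peer address is used.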

Security invariants

  • Rules before credentials. Every policy gate (service, httpRules, action, scope, AccessKey restrictions, MCP tool policy) is evaluated before Plugin.inject_credentials(). A denied request never causes a Secret to be read.
  • Cross-surface rejection. The prefix check runs in middleware before any lookup; misuse can't accidentally succeed due to a config mistake.
  • Per-resource blast radius. A leaked pak_... for Provider/gh-readonly can touch only that Provider in its namespace. It cannot talk to the API server, cannot enumerate other Providers, and cannot reach the LLM gateway for any ModelProvider not in its spec.modelProviders.
  • Independent rotation. pai access-key rotate <name> mints a new hash without touching any Secret or any other AccessKey.
  • In-cluster agents are unaffected. The sidecar's agentEnvVar dummy pattern stays as-is; in-cluster agents don't hold AccessKeys.
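
The "rules before credentials" invariant amounts to an ordering constraint in the request handler. The sketch below shows one way to express it; the gate functions, the read_secret callable, and the result shape are all hypothetical stand-ins, not the gateway's code.

```python
from typing import Callable, Iterable, Optional

# A gate inspects the request and returns a denial reason, or None to pass.
Gate = Callable[[dict], Optional[str]]


def handle(request: dict, gates: Iterable[Gate], read_secret: Callable[[], str]) -> dict:
    """Run every policy gate first; read the Secret only if all of them pass."""
    for gate in gates:
        reason = gate(request)
        if reason is not None:
            # Denied: return before any credential is touched.
            return {"outcome": "denied", "reason": reason}
    credential = read_secret()  # stands in for the credential-injection step
    return {
        "outcome": "forwarded",
        "upstream_headers": {"Authorization": f"Bearer {credential}"},
    }


def method_gate(request: dict) -> Optional[str]:
    """Example gate: only GET is permitted."""
    return None if request.get("method") == "GET" else "method not allowed"
```

Keeping the secret read behind the loop makes the invariant easy to test: a counter inside read_secret stays at zero for every denied request.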

Audit mode

Provider-level audit.enforcement: audit lets the gateway log policy violations with an AUDIT (not blocked): prefix and still forward the request upstream. Use this to validate a new policy against live traffic before switching to enforce. The setting applies to all surfaces — Action policy, httpRules, MCP tool policy, and scope checks.
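
The difference between the two modes can be sketched as a single decision function. This is an illustration only: apply_enforcement and the logger name are hypothetical, and the sketch assumes the enforcement values are the strings audit and enforce as used above.

```python
import logging
from typing import Optional

log = logging.getLogger("gateway.sketch")  # hypothetical logger name


def apply_enforcement(violation: Optional[str], enforcement: str) -> bool:
    """Return True if the request should still be forwarded upstream."""
    if violation is None:
        return True  # no policy violation: always forward
    if enforcement == "audit":
        # Audit mode: record the violation but do not block.
        log.warning("AUDIT (not blocked): %s", violation)
        return True
    return False  # enforce mode: block the request
```

Searching the gateway logs for the AUDIT (not blocked): prefix then shows which live requests a policy would reject once it is switched to enforce.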

Observability

  • Gateway logs every request with source=external, access_key=..., provider=..., client_ip=....
  • Provider.status.requestsToday / AccessKey.status.requestsToday / AccessKey.status.tokensToday expose counter state.
  • AccessKey.status.danglingRefs is populated by the controller when a bound Provider or ModelProvider is deleted.

Limits (v1)

  • Cloud providers (AWS, Azure, GCP) require spec.host to be set explicitly on the Provider — the gateway does not synthesize regional hostnames. The sidecar's DNS-interception model supports dynamic subdomains; the URL-based gateway does not.
  • MCP transport — only sse is supported externally in v1. Local stdio MCP servers stay in-cluster.
  • Batched JSON-RPC — if any tools/call frame in a batch is denied, the whole batch is denied.
  • Body cap for /ext/mcp/{name}/message is 1 MiB.
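
The batch rule and the body cap can be illustrated together. The sketch below assumes standard JSON-RPC 2.0 batching (a JSON array of request objects) and the MCP tools/call shape with the tool name in params.name; the tool lists reuse the notion-docs example, and the function names are hypothetical.

```python
import json

MAX_BODY = 1 << 20  # 1 MiB cap for /ext/mcp/{name}/message

ALLOWED_TOOLS = {"search_pages", "fetch_document"}
DENIED_TOOLS = {"delete_page"}


def tool_allowed(name) -> bool:
    """Deny list is checked first, then the allow list."""
    if name in DENIED_TOOLS:
        return False
    return name in ALLOWED_TOOLS


def batch_allowed(body: bytes) -> bool:
    """Reject oversized bodies, and reject the whole batch if any tools/call is denied."""
    if len(body) > MAX_BODY:
        return False
    payload = json.loads(body)
    frames = payload if isinstance(payload, list) else [payload]
    for frame in frames:
        if isinstance(frame, dict) and frame.get("method") == "tools/call":
            params = frame.get("params")
            name = params.get("name") if isinstance(params, dict) else None
            if not tool_allowed(name):
                return False  # one denied frame denies the entire batch
    return True
```

Frames that are not tools/call (for example tools/list) pass through this particular check untouched.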