Provider Gateway
The Pai Provider Gateway extends the LLM Gateway pattern to Providers — GitHub, AWS, Azure, GCP, Telegram, generic HTTP, and MCP servers. A developer laptop (or any external client) reaches cluster-managed upstream services through a single URL, authenticated by a narrow, per-resource AccessKey. The laptop never holds the real upstream credential.
How it works
Developer laptop             Pai cluster                       Upstream
+---------------+   HTTPS    +------------------+    HTTPS     +---------------+
| Claude Code   | ---------> | Pai Gateway      | -----------> | api.github    |
| gh / aws CLI  |  pak_...   | - auth check     |   real PAT   | api.notion    |
| custom code   | AccessKey  | - IP/tool/HTTP   |  / SigV4 /   | S3 / GCS      |
+---------------+            |   restrictions   |   OAuth /    | ...           |
                             | - Provider pol.  |  mcp Bearer  +---------------+
                             | - MCP tool pol.  |
                             | - audit log      |
                             +------------------+
- Admin creates a Provider with externalAccess.enabled: true.
- Admin (or the developer, via CLI) creates an AccessKey bound to that Provider (and optionally to more Providers and ModelProviders in the same namespace).
- Developer runs pai gateway provider <name> or pai gateway mcp <name> to emit env vars or a .mcp.json snippet with a pak_... bearer.
- Every request goes through the gateway, is validated against the AccessKey's restrictions and the Provider's policy, has the real upstream credential injected, and is forwarded.
Why AccessKey is a separate token type
The gateway's /ext/provider/*, /ext/mcp/*, and /ext/v1/* endpoints accept
pak_... (AccessKey) bearers only. The API server's /v1/* endpoints accept
pai_tok_... (admin CLI token) bearers only. This is enforced structurally at
the header layer, before any resource lookup — a token carries information
about WHICH surface it may be used on, not just whether it is valid. A leaked
laptop AccessKey can never be turned into an API-server admin token by
accident or misconfiguration.
| Surface | Accepts | Rejects |
|---|---|---|
| /v1/* (API server) | pai_tok_... | pak_... |
| /ext/v1/* (LLM gateway) | pak_... | pai_tok_... |
| /ext/provider/*, /ext/mcp/* | pak_... | pai_tok_... |
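The table above can be read as a lookup from route prefix to required token prefix. The following is a minimal sketch of such a check, not the gateway's actual middleware: the function name `surface_accepts` and the table `SURFACE_TOKEN_PREFIX` are hypothetical, and the sketch only handles a plain `Authorization: Bearer ...` header (it does not model any wrapped key format).

```python
# Hypothetical sketch: map each route prefix to the only token prefix it accepts.
SURFACE_TOKEN_PREFIX = [
    ("/ext/provider/", "pak_"),
    ("/ext/mcp/", "pak_"),
    ("/ext/v1/", "pak_"),
    ("/v1/", "pai_tok_"),
]


def surface_accepts(path: str, authorization: str) -> bool:
    """Return True only if the bearer's prefix is valid for the surface of `path`.

    The decision uses the header and the path alone, so it can run before
    any resource lookup.
    """
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    for route_prefix, token_prefix in SURFACE_TOKEN_PREFIX:
        if path.startswith(route_prefix):
            return token.startswith(token_prefix)
    return False  # unknown surface: reject by default
```

Because the check needs no database or Secret access, a token presented on the wrong surface is rejected regardless of how the underlying resources are configured.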
Setup
1. Create a Provider with external access
apiVersion: pai.io/v1
kind: Provider
metadata:
  name: gh-readonly
  namespace: team-a
spec:
  type: github
  auth:
    type: pat
    secretRef: github-readonly-pat
  policy:
    allow: ["contents:read", "pulls:read", "issues:read"]
  scope:
    repositories: ["org/repo-a", "org/repo-b"]
  externalAccess:
    enabled: true
    maxRequestsPerDay: 5000
For MCP servers:
apiVersion: pai.io/v1
kind: Provider
metadata:
  name: notion-docs
  namespace: team-a
spec:
  type: mcp
  mcp:
    transport: sse
    url: https://mcp.notion.com/sse
  auth:
    type: api-key
    secretRef: notion-mcp-token
  policy:
    mcp:
      allowedTools: [search_pages, fetch_document]
      deniedTools: [delete_page]
  externalAccess:
    enabled: true
2. Mint an AccessKey
pai access-key create --name alice-laptop \
-n team-a \
--provider gh-readonly \
--provider notion-docs \
--model-provider anthropic \
--allowed-cidr 10.0.0.0/8 \
--allowed-http-method GET \
--allowed-mcp-tool search_pages \
--allowed-model claude-haiku-4-5
The CLI prints the raw pak_... once. Store it securely — it cannot be
retrieved later. Rotate with pai access-key rotate alice-laptop.
3. Emit client configuration
HTTP Provider (GitHub, etc.):
eval $(pai gateway provider gh-readonly --key alice-laptop)
# export PAI_PROVIDER_URL=https://gateway.pairun.dev/ext/provider/gh-readonly
# export PAI_PROVIDER_TOKEN=pak_...
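Custom code can pick up the two exported variables directly. The sketch below only builds a request object and does not send it. It assumes the upstream path is appended to PAI_PROVIDER_URL and that the AccessKey is sent as a standard bearer header; the helper name `build_provider_request` is hypothetical.

```python
import os
import urllib.request


def build_provider_request(path: str, method: str = "GET") -> urllib.request.Request:
    """Build a request against the gateway using the exported env vars.

    Assumption: the upstream API path is appended to PAI_PROVIDER_URL.
    """
    base = os.environ["PAI_PROVIDER_URL"].rstrip("/")
    token = os.environ["PAI_PROVIDER_TOKEN"]  # the pak_... AccessKey, never the real PAT
    return urllib.request.Request(
        f"{base}/{path.lstrip('/')}",
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the result to `urllib.request.urlopen` would send it through the gateway, which is where the real upstream credential is injected.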
MCP Provider (for Claude Code's .mcp.json):
pai gateway mcp notion-docs --key alice-laptop > ~/.mcp-notion.json
LLM Provider (existing flow, now on AccessKey):
eval $(pai gateway env)
# export ANTHROPIC_BASE_URL=https://gateway.pairun.dev/ext/v1
# export ANTHROPIC_API_KEY=sk-ant-api03-pak-...-AA
AccessKey restrictions — what can a key carry?
Each restriction is optional and only narrows (never loosens) what the
bound Provider / ModelProvider already permits. The gateway ANDs both; admission
rejects keys whose allowedModels or allowedMcpTools would grant more than the
bound resource's own policy.
| spec.restrictions.* | Applies to | Semantics |
|---|---|---|
| allowedCIDRs | every surface | Client IP must match at least one CIDR. See Client IP extraction below. |
| allowedHttpMethods | /ext/provider/* | Request method must be in the list (case-insensitive). |
| allowedHttpPaths | /ext/provider/* | fnmatch glob allowlist. |
| deniedHttpPaths | /ext/provider/* | Deny list, evaluated before allowedHttpPaths. |
| allowedMcpTools | /ext/mcp/* tools/call | Tool name must match. |
| deniedMcpTools | /ext/mcp/* tools/call | Deny list, evaluated first. |
| allowedModels | /ext/v1/* | Request body model must match. |
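The ordering rules in the table (deny before allow, optional fields impose no constraint) can be sketched as follows. This is an illustration under assumptions, not the gateway's code: the function names are hypothetical, deniedHttpPaths is assumed to use the same fnmatch globs as allowedHttpPaths, and MCP tool names are assumed to match exactly.

```python
from fnmatch import fnmatchcase


def http_request_allowed(restrictions: dict, method: str, path: str) -> bool:
    """Apply allowedHttpMethods, then deniedHttpPaths, then allowedHttpPaths."""
    methods = restrictions.get("allowedHttpMethods")
    if methods and method.upper() not in {m.upper() for m in methods}:
        return False
    # Deny patterns are checked first, so they win over any allow pattern.
    if any(fnmatchcase(path, pat) for pat in restrictions.get("deniedHttpPaths", [])):
        return False
    allowed = restrictions.get("allowedHttpPaths")
    if allowed and not any(fnmatchcase(path, pat) for pat in allowed):
        return False
    return True


def mcp_tool_allowed(restrictions: dict, tool: str) -> bool:
    """Apply deniedMcpTools first, then allowedMcpTools (exact names assumed)."""
    if tool in restrictions.get("deniedMcpTools", []):
        return False
    allowed = restrictions.get("allowedMcpTools")
    return not allowed or tool in allowed
```

A key with no restrictions passes both functions; the bound Provider's own policy still applies separately, since the gateway ANDs the two.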
Per-key daily hard caps:
| spec.limits.* | Applies to |
|---|---|
| maxRequestsPerDay | /ext/provider/* + /ext/mcp/* |
| maxTokensPerDay | /ext/v1/* |
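A hard daily cap amounts to a counter that refuses to go past its limit and resets when the day changes. The sketch below is hypothetical: the class name `DailyCounter` is invented for illustration, and the day boundary is supplied by the caller because the source does not say when counters reset.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class DailyCounter:
    """Hypothetical per-key counter for maxRequestsPerDay or maxTokensPerDay."""

    limit: int | None        # None means no cap is configured
    used: int = 0
    day: date | None = None  # day the current count belongs to

    def try_consume(self, amount: int, today: date) -> bool:
        """Count `amount` against today's budget; refuse if it would exceed the cap."""
        if self.day != today:
            # First use on a new day: start from zero.
            self.day, self.used = today, 0
        if self.limit is not None and self.used + amount > self.limit:
            return False
        self.used += amount
        return True
```

The same shape would serve both limits: `amount` is 1 per request on the provider and MCP surfaces, and a token count on the LLM surface.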
Client IP extraction (allowedCIDRs)
The gateway determines a caller's IP from X-Forwarded-For only when the
direct peer (the pod's RemoteAddr) is in the configured trustedProxies CIDR
list. If it isn't, the direct peer is used. Untrusted peers cannot spoof
X-Forwarded-For to bypass allowedCIDRs.
Set trustedProxies in values.yaml:
gateway:
  externalProviderGateway:
    enabled: true
    trustedProxies:
      - 10.64.0.0/10  # e.g. the cluster's LoadBalancer CIDR
If empty (default), the gateway always uses the peer address — safest default.
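The extraction rule can be sketched with the standard `ipaddress` module. The source does not say which X-Forwarded-For entry is used when the header lists several hops; this sketch assumes the right-most hop that is not itself a trusted proxy, which is a common convention. The function names `client_ip` and `cidr_allowed` are hypothetical.

```python
from __future__ import annotations

import ipaddress


def client_ip(peer: str, xff: str | None, trusted_proxies: list[str]) -> str:
    """Return the caller IP, honouring X-Forwarded-For only for trusted peers."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]
    peer_addr = ipaddress.ip_address(peer)
    if not xff or not any(peer_addr in net for net in trusted):
        return str(peer_addr)  # untrusted peer or empty trustedProxies: ignore the header
    hops = [h.strip() for h in xff.split(",") if h.strip()]
    # Assumption: walk from the right and take the first hop that is not a trusted proxy.
    for hop in reversed(hops):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed header: fall back to the peer address
        if not any(addr in net for net in trusted):
            return str(addr)
    return str(peer_addr)


def cidr_allowed(ip: str, allowed_cidrs: list[str]) -> bool:
    """allowedCIDRs check: an empty list means the restriction is not set."""
    if not allowed_cidrs:
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

With an empty `trusted_proxies` list the header is never consulted, which matches the default described above.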
Security invariants
- Rules before credentials. Every policy gate (service, httpRules, action, scope, AccessKey restrictions, MCP tool policy) is evaluated before Plugin.inject_credentials(). A denied request never causes a Secret to be read.
- Cross-surface rejection. The prefix check runs in middleware before any lookup; misuse can't accidentally succeed due to a config mistake.
- Per-resource blast radius. A leaked pak_... for Provider/gh-readonly can touch only that Provider in its namespace. It cannot talk to the API server, cannot enumerate other Providers, cannot reach the LLM gateway for any ModelProvider not in its spec.modelProviders.
- Independent rotation. pai access-key rotate <name> mints a new hash without touching any Secret or any other AccessKey.
- In-cluster agents are unaffected. The sidecar's agentEnvVar dummy pattern stays as-is; in-cluster agents don't hold AccessKeys.
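The "rules before credentials" invariant is an ordering guarantee, which a short sketch makes concrete. Everything here is illustrative: `forward_with_credentials`, `Denied`, and the gate and `read_secret` callables are hypothetical stand-ins, and the real injection step is Plugin.inject_credentials(), whose signature is not shown in this document.

```python
from typing import Callable, Iterable


class Denied(Exception):
    """Raised when a policy gate rejects the request."""


def forward_with_credentials(
    request: dict,
    gates: Iterable[Callable[[dict], bool]],
    read_secret: Callable[[], str],
) -> dict:
    """Run every gate first; touch the Secret only if all of them pass."""
    for gate in gates:
        if not gate(request):
            raise Denied(gate.__name__)
    # Reached only after every gate has passed.
    credential = read_secret()
    headers = {**request.get("headers", {}), "Authorization": f"Bearer {credential}"}
    return {**request, "headers": headers}
```

Because `read_secret` is called after the loop, a request that fails any gate raises before the Secret is ever read.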
Audit mode
Provider-level audit.enforcement: audit lets the gateway log policy
violations with an AUDIT (not blocked): prefix and still forward the request
upstream. Use this to validate a new policy against live traffic before
switching to enforce. The setting applies to every policy check: Action policy,
httpRules, MCP tool policy, and scope checks.
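The difference between the two modes comes down to what happens after a violation is detected. A minimal sketch, assuming the two enforcement values are the strings audit and enforce as used above; the function name `should_forward` and the logger name are hypothetical.

```python
from __future__ import annotations

import logging

log = logging.getLogger("gateway")


def should_forward(violation: str | None, enforcement: str) -> bool:
    """Decide whether a request goes upstream, given a detected policy violation."""
    if violation is None:
        return True
    if enforcement == "audit":
        # Audit mode: record the violation but let the request through.
        log.warning("AUDIT (not blocked): %s", violation)
        return True
    return False  # enforce mode: block
```

Searching the gateway logs for the AUDIT (not blocked): prefix then shows which live requests a policy would have blocked once it is switched to enforce.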
Observability
- Gateway logs every request with source=external, access_key=..., provider=..., client_ip=....
- Provider.status.requestsToday / AccessKey.status.requestsToday / AccessKey.status.tokensToday expose counter state.
- AccessKey.status.danglingRefs is populated by the controller when a bound Provider or ModelProvider is deleted.
Limits (v1)
- Cloud providers (AWS, Azure, GCP) require spec.host to be set explicitly on the Provider; the gateway does not synthesize regional hostnames. The sidecar's DNS-interception model supports dynamic subdomains; the URL-based gateway does not.
- MCP transport: only sse is supported externally in v1. Local stdio MCP servers stay in-cluster.
- Batched JSON-RPC: if any tools/call frame in a batch is denied, the whole batch is denied.
- Body cap for /ext/mcp/{name}/message is 1 MiB.
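The batch rule and the body cap can be sketched together. This is an illustration, not the gateway's implementation: `check_mcp_message` is a hypothetical name, `tool_allowed` stands in for whatever combined tool policy applies, and malformed JSON is assumed to be rejected.

```python
import json
from typing import Callable

MAX_BODY_BYTES = 1024 * 1024  # 1 MiB body cap


def check_mcp_message(body: bytes, tool_allowed: Callable[[str], bool]) -> bool:
    """Return True if the whole message (single frame or batch) may be forwarded."""
    if len(body) > MAX_BODY_BYTES:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # assumption: malformed bodies are rejected
    frames = payload if isinstance(payload, list) else [payload]
    for frame in frames:
        if isinstance(frame, dict) and frame.get("method") == "tools/call":
            params = frame.get("params")
            name = params.get("name") if isinstance(params, dict) else None
            if not isinstance(name, str) or not tool_allowed(name):
                return False  # one denied frame denies the entire batch
    return True
```

Denying the whole batch avoids forwarding a partial batch whose remaining frames may depend on the denied call.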