Provider Gateway

The Pai Provider Gateway extends the LLM Gateway pattern to Providers — GitHub, AWS, Azure, GCP, Telegram, generic HTTP, and MCP servers. A developer laptop (or any external client) reaches cluster-managed upstream services through a single URL, authenticated by a narrow, per-resource AccessKey. The laptop never holds the real upstream credential.

How it works

Developer laptop                 Pai cluster                       Upstream
+---------------+   HTTPS    +------------------+   HTTPS      +---------------+
|  Claude Code  | ---------> |  Pai Gateway     | -----------> |  api.github   |
|  gh / aws CLI |  pak_...   |  - auth check    |  real PAT    |  api.notion   |
|  custom code  |  AccessKey |  - IP/tool/HTTP  |  / SigV4 /   |  S3 / GCS     |
+---------------+            |    restrictions  |  OAuth /     |  ...          |
                             |  - Provider pol. |  mcp Bearer  +---------------+
                             |  - MCP tool pol. |
                             |  - audit log     |
                             +------------------+

1. Admin creates a Provider with externalAccess.enabled: true.
2. Admin (or the developer, via CLI) creates an AccessKey bound to that Provider (and optionally to more Providers and ModelProviders in the same namespace).
3. Developer runs pai gateway provider <name> or pai gateway mcp <name> to emit env vars or a .mcp.json snippet with a pak_... bearer.
4. Every request goes through the gateway, is validated against the AccessKey's restrictions and the Provider's policy, has the real upstream credential injected, and is forwarded.

Why `AccessKey` is a separate token type

The gateway's /ext/provider/*, /ext/mcp/*, and /ext/v1/* endpoints accept pak_... (AccessKey) bearers only. The API server's /v1/* endpoints accept pai_tok_... (admin CLI token) bearers only. This is enforced structurally at the header layer, before any resource lookup: a token's prefix identifies which surface it may be used on, not just whether it is valid. A leaked laptop AccessKey therefore cannot be used as an API-server admin token, whether by accident or through misconfiguration.

| Surface | Accepts | Rejects |
| --- | --- | --- |
| `/v1/*` (API server) | `pai_tok_...` | `pak_...` |
| `/ext/v1/*` (LLM gateway) | `pak_...` | `pai_tok_...` |
| `/ext/provider/*`, `/ext/mcp/*` | `pak_...` | `pai_tok_...` |
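
The surface split in the table can be pictured as a prefix check on the Authorization header. This is a minimal sketch, not the gateway's actual middleware: the function names are hypothetical, it assumes a standard `Bearer` scheme, and it does not model the wrapped sk-ant-... form that the LLM flow exports as ANTHROPIC_API_KEY.

```python
from typing import Optional

# Token prefixes as listed in the table above.
ACCESS_KEY_PREFIX = "pak_"
ADMIN_TOKEN_PREFIX = "pai_tok_"


def required_prefix(path: str) -> Optional[str]:
    """Return the bearer prefix a path accepts, or None for unknown paths."""
    if path.startswith("/ext/"):
        return ACCESS_KEY_PREFIX
    if path.startswith("/v1/"):
        return ADMIN_TOKEN_PREFIX
    return None


def surface_accepts(path: str, authorization: str) -> bool:
    """Decide from the header alone, before any resource lookup."""
    scheme, _, token = authorization.partition(" ")
    prefix = required_prefix(path)
    if prefix is None or scheme.lower() != "bearer":
        return False
    return token.startswith(prefix)
```

Because the decision depends only on the path and the token prefix, a valid token presented on the wrong surface is rejected without consulting any stored AccessKey or Provider.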

Setup

1. Create a Provider with external access

apiVersion: pai.io/v1
kind: Provider
metadata:
  name: gh-readonly
  namespace: team-a
spec:
  type: github
  auth:
    type: pat
    secretRef: github-readonly-pat
  policy:
    allow: ["contents:read", "pulls:read", "issues:read"]
  scope:
    repositories: ["org/repo-a", "org/repo-b"]
  externalAccess:
    enabled: true
    maxRequestsPerDay: 5000

For MCP servers:

apiVersion: pai.io/v1
kind: Provider
metadata:
  name: notion-docs
  namespace: team-a
spec:
  type: mcp
  mcp:
    transport: sse
    url: https://mcp.notion.com/sse
  auth:
    type: api-key
    secretRef: notion-mcp-token
  policy:
    mcp:
      allowedTools: [search_pages, fetch_document]
      deniedTools: [delete_page]
  externalAccess:
    enabled: true

2. Mint an AccessKey

pai access-key create --name alice-laptop \
    -n team-a \
    --provider gh-readonly \
    --provider notion-docs \
    --model-provider anthropic \
    --allowed-cidr 10.0.0.0/8 \
    --allowed-http-method GET \
    --allowed-mcp-tool search_pages \
    --allowed-model claude-haiku-4-5

The CLI prints the raw pak_... once. Store it securely — it cannot be retrieved later. Rotate with pai access-key rotate alice-laptop.

3. Emit client configuration

HTTP Provider (GitHub, etc.):

eval "$(pai gateway provider gh-readonly --key alice-laptop)"
# export PAI_PROVIDER_URL=https://gateway.pairun.dev/ext/provider/gh-readonly
# export PAI_PROVIDER_TOKEN=pak_...
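
Once those variables are exported, a client might build requests as in the sketch below. It assumes the upstream path is appended to PAI_PROVIDER_URL and that the AccessKey travels as a standard Bearer token; `gateway_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import os
import urllib.request


def gateway_request(path: str, method: str = "GET") -> urllib.request.Request:
    """Build a request aimed at the gateway instead of the upstream API."""
    base = os.environ["PAI_PROVIDER_URL"].rstrip("/")
    token = os.environ["PAI_PROVIDER_TOKEN"]  # the pak_... AccessKey, never the real PAT
    return urllib.request.Request(
        f"{base}/{path.lstrip('/')}",
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the result to `urllib.request.urlopen` would send it; for example, `gateway_request("/repos/org/repo-a/pulls")` targets a repository listed in the Provider's scope.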

MCP Provider (for Claude Code's .mcp.json):

pai gateway mcp notion-docs --key alice-laptop > ~/.mcp-notion.json

LLM Provider (existing flow, now on AccessKey):

eval "$(pai gateway env)"
# export ANTHROPIC_BASE_URL=https://gateway.pairun.dev/ext/v1
# export ANTHROPIC_API_KEY=sk-ant-api03-pak-...-AA

AccessKey restrictions — what can a key carry?

Each restriction is optional and only narrows (never loosens) what the bound Provider / ModelProvider already permits. The gateway ANDs both; admission rejects keys whose allowedModels or allowedMcpTools would grant more than the bound resource's own policy.
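
The narrowing rule can be pictured as a subset check at admission time. The sketch below assumes exact-name comparison (no globs) and uses a hypothetical helper, `excess_grants`; the real admission logic may differ.

```python
from typing import Iterable, List


def excess_grants(requested: Iterable[str], resource_allows: Iterable[str]) -> List[str]:
    """Return entries the key asks for that the bound resource does not allow."""
    allowed = set(resource_allows)
    return sorted(name for name in requested if name not in allowed)
```

A key would be admitted only when the returned list is empty, for instance when its allowedMcpTools is `["search_pages"]` and the Provider allows `search_pages` and `fetch_document`.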

| `spec.restrictions.*` | Applies to | Semantics |
| --- | --- | --- |
| `allowedCIDRs` | every surface | Client IP must match at least one CIDR. See Client IP extraction below. |
| `allowedHttpMethods` | `/ext/provider/*` | Request method must be in the list (case-insensitive). |
| `allowedHttpPaths` | `/ext/provider/*` | fnmatch glob allowlist. |
| `deniedHttpPaths` | `/ext/provider/*` | Denylist; evaluated before `allowedHttpPaths`. |
| `allowedMcpTools` | `/ext/mcp/*` `tools/call` | Tool name must match. |
| `deniedMcpTools` | `/ext/mcp/*` `tools/call` | Denylist; evaluated before `allowedMcpTools`. |
| `allowedModels` | `/ext/v1/*` | Request body `model` must match. |
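
The deny-before-allow ordering for paths and the case-insensitive method check can be sketched as follows. This assumes Python's `fnmatch` dialect matches the gateway's globs and that an unset (empty) restriction imposes no narrowing; both are assumptions, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def path_allowed(path: str, allowed: Sequence[str], denied: Sequence[str]) -> bool:
    """Apply deniedHttpPaths first, then allowedHttpPaths."""
    if any(fnmatchcase(path, pattern) for pattern in denied):
        return False
    if not allowed:
        return True  # restriction unset: nothing to narrow (assumption)
    return any(fnmatchcase(path, pattern) for pattern in allowed)


def method_allowed(method: str, allowed: Sequence[str]) -> bool:
    """Case-insensitive membership test for allowedHttpMethods."""
    if not allowed:
        return True  # restriction unset (assumption)
    return method.upper() in {m.upper() for m in allowed}
```

Evaluating the denylist first means a path matching both lists is rejected, so a broad allow pattern cannot reopen a path that was explicitly denied.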

Per-key daily hard caps:

| `spec.limits.*` | Applies to |
| --- | --- |
| `maxRequestsPerDay` | `/ext/provider/*` and `/ext/mcp/*` |
| `maxTokensPerDay` | `/ext/v1/*` |

Client IP extraction (`allowedCIDRs`)

The gateway determines a caller's IP from X-Forwarded-For only when the direct peer (the pod's RemoteAddr) is in the configured trustedProxies CIDR list. If it isn't, the direct peer is used. Untrusted peers cannot spoof X-Forwarded-For to bypass allowedCIDRs.

Set trustedProxies in values.yaml:

gateway:
  externalProviderGateway:
    enabled: true
    trustedProxies:
      - 10.64.0.0/10   # e.g. the cluster's LoadBalancer CIDR

If empty (default), the gateway always uses the peer address — safest default.
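
The extraction rule can be illustrated with Python's `ipaddress` module. The source does not say which X-Forwarded-For entry the gateway picks; this sketch assumes the rightmost entry that is not itself a trusted proxy, and the function names are hypothetical.

```python
import ipaddress
from typing import Optional, Sequence


def _in_any(addr: str, cidrs: Sequence[str]) -> bool:
    """True if addr falls inside at least one CIDR (raises ValueError on bad input)."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)


def client_ip(peer: str, forwarded_for: Optional[str], trusted_proxies: Sequence[str]) -> str:
    """Honor X-Forwarded-For only when the direct peer is a trusted proxy."""
    if not forwarded_for or not _in_any(peer, trusted_proxies):
        return peer
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    # Assumption: walk right to left and take the first hop that is not a trusted proxy.
    for hop in reversed(hops):
        if not _in_any(hop, trusted_proxies):
            return hop
    return peer


def cidr_allows(ip: str, allowed_cidrs: Sequence[str]) -> bool:
    """allowedCIDRs check; an unset list imposes no restriction (assumption)."""
    return not allowed_cidrs or _in_any(ip, allowed_cidrs)
```

With an empty `trusted_proxies` list the header is never consulted, which mirrors the default described above.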

Security invariants

Rules before credentials. Every policy gate (service, httpRules, action, scope, AccessKey restrictions, MCP tool policy) is evaluated before Plugin.inject_credentials(). A denied request never causes a Secret to be read.
Cross-surface rejection. The prefix check runs in middleware before any lookup; misuse can't accidentally succeed due to a config mistake.
Per-resource blast radius. A leaked pak_... for Provider/gh-readonly can touch only that Provider in its namespace. It cannot talk to the API server, cannot enumerate other Providers, cannot reach the LLM gateway for any ModelProvider not in its spec.modelProviders.
Independent rotation. pai access-key rotate <name> mints a new hash without touching any Secret or any other AccessKey.
In-cluster agents are unaffected. The sidecar's agentEnvVar dummy pattern stays as-is; in-cluster agents don't hold AccessKeys.
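
The "rules before credentials" ordering can be sketched as a pipeline in which the Secret reader is called only after every gate has passed. The names here (`authorize_then_inject`, `read_secret`) are hypothetical stand-ins, not the gateway's API; each gate is assumed to return None on success or a denial reason.

```python
from typing import Callable, Iterable, Optional, Tuple

Gate = Callable[[], Optional[str]]  # returns None to pass, or a denial reason


def authorize_then_inject(gates: Iterable[Gate], read_secret: Callable[[], str]) -> Tuple[str, str]:
    """Run every policy gate first; touch the Secret only if all of them pass."""
    for gate in gates:
        reason = gate()
        if reason is not None:
            return ("denied", reason)  # read_secret was never called
    return ("forward", read_secret())
```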

Audit mode

Provider-level audit.enforcement: audit lets the gateway log policy violations with an AUDIT (not blocked): prefix and still forward the request upstream. Use this to validate a new policy against live traffic before switching to enforce. The setting applies to every policy check: Action policy, httpRules, MCP tool policy, and scope checks.
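
The difference between the two modes can be sketched as below. The `AUDIT (not blocked):` prefix comes from the text above; the `DENIED:` prefix and the function name are hypothetical.

```python
from typing import List, Tuple


def apply_enforcement(mode: str, violations: List[str]) -> Tuple[bool, List[str]]:
    """Return (forward_request, log_lines) for a request's policy violations."""
    if not violations:
        return True, []
    if mode == "audit":
        # Log each violation but still forward upstream.
        return True, [f"AUDIT (not blocked): {v}" for v in violations]
    # Any other mode is treated as enforce: log and block.
    return False, [f"DENIED: {v}" for v in violations]
```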

Observability

Gateway logs every request with source=external, access_key=..., provider=..., client_ip=....
Provider.status.requestsToday / AccessKey.status.requestsToday / AccessKey.status.tokensToday expose counter state.
AccessKey.status.danglingRefs is populated by the controller when a bound Provider or ModelProvider is deleted.

Limits (v1)

Cloud providers (AWS, Azure, GCP) require spec.host to be set explicitly on the Provider — the gateway does not synthesize regional hostnames. The sidecar's DNS-interception model supports dynamic subdomains; the URL-based gateway does not.
MCP transport — only sse is supported externally in v1. Local stdio MCP servers stay in-cluster.
Batched JSON-RPC — if any tools/call frame in a batch is denied, the whole batch is denied.
Body cap for /ext/mcp/{name}/message is 1 MiB.
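
The batch rule and the body cap can be illustrated together. This sketch assumes the standard MCP `tools/call` frame shape, with the tool name in `params.name`; `denied_tools` and `admit_mcp_message` are hypothetical names, and `tool_allowed` stands in for the combined Provider and AccessKey tool policy.

```python
import json
from typing import Callable, List

MAX_BODY_BYTES = 1024 * 1024  # 1 MiB cap on /ext/mcp/{name}/message


def denied_tools(body: bytes, tool_allowed: Callable[[str], bool]) -> List[str]:
    """Collect tool names from tools/call frames that the policy rejects."""
    payload = json.loads(body)
    frames = payload if isinstance(payload, list) else [payload]
    denied = []
    for frame in frames:
        if not isinstance(frame, dict) or frame.get("method") != "tools/call":
            continue  # only tools/call frames are subject to tool policy
        params = frame.get("params")
        name = params.get("name", "") if isinstance(params, dict) else ""
        if not tool_allowed(name):
            denied.append(name)
    return denied


def admit_mcp_message(body: bytes, tool_allowed: Callable[[str], bool]) -> bool:
    """Reject oversized bodies, and reject a whole batch if any frame is denied."""
    if len(body) > MAX_BODY_BYTES:
        return False
    return not denied_tools(body, tool_allowed)
```

Denying the whole batch avoids partially executing a request whose frames may depend on one another.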
