# Key Management at Scale

Managing API keys is straightforward when you have a handful of users. It becomes an engineering challenge when you have thousands of keys across multiple teams, environments, and regions. This guide covers the patterns and practices that keep key management tractable as your platform grows.

## Organizational Patterns

How you organize keys determines how easily you can audit, rotate, and revoke them. The right model depends on your platform's structure.

### Team-Based Organization

Keys are issued to teams or departments rather than individuals. This works well when access patterns are shared and accountability sits at the team level.

- Each team gets a namespace or project that contains their keys
- Team leads manage key creation and revocation within their scope
- Audit logs are filterable by team

### Project-Based Organization

Keys are scoped to individual projects or services. This is common in microservice architectures where each service has its own identity.

- One key per service per environment (e.g., `payments-service-prod`, `payments-service-staging`)
- Keys inherit permissions from the project's role
- Decommissioning a project automatically revokes its keys

### Hierarchical Organization

For larger platforms, a hierarchy of organizations, teams, and projects provides the most flexibility. Keys exist at the project level, teams own projects, and organizations own teams. Large providers such as Stripe and AWS use broadly similar layered models for access management.
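
The hierarchy can be sketched as nested records. This is a minimal illustration rather than a prescribed data model: the class names, the `active_keys` helper, and the `decommission` function are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ApiKey:
    name: str
    revoked_at: Optional[datetime] = None


@dataclass
class Project:
    name: str
    keys: list[ApiKey] = field(default_factory=list)


@dataclass
class Team:
    name: str
    projects: list[Project] = field(default_factory=list)


@dataclass
class Organization:
    name: str
    teams: list[Team] = field(default_factory=list)

    def active_keys(self) -> list[str]:
        """Return 'team/project/key' paths for every non-revoked key."""
        return [
            f"{team.name}/{project.name}/{key.name}"
            for team in self.teams
            for project in team.projects
            for key in project.keys
            if key.revoked_at is None
        ]


def decommission(project: Project) -> int:
    """Revoke every active key in a project and return how many were revoked."""
    now = datetime.now(timezone.utc)
    count = 0
    for key in project.keys:
        if key.revoked_at is None:
            key.revoked_at = now
            count += 1
    return count
```

Because keys live only at the project level, an audit is a walk down the tree, and decommissioning a project revokes its keys without touching sibling projects.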

## Automated Lifecycle Management

At scale, manual key management is a liability. Automate every stage of the key lifecycle.

**Provisioning.** Use your platform's API or infrastructure-as-code tools to create keys as part of service deployment. Keys should be provisioned alongside the service that needs them, not as a separate manual step. Managed key services like [Zuplo's API key API](https://zuplo.com/docs/articles/api-key-api?ref=apikeys-guide&utm_source=apikeys-guide&utm_medium=web&utm_campaign=api-keys) expose a programmatic interface for creating consumers and keys, so provisioning can be wired into your deployment pipeline or Terraform workflow.

```hcl
# Example: Terraform resource for key provisioning
resource "apiplatform_api_key" "payments" {
  name        = "payments-service-prod"
  project_id  = apiplatform_project.payments.id
  scopes      = ["transactions:read", "transactions:write"]
  expires_in  = "90d"
}
```

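**[Rotation](/docs/security/key-rotation).** Schedule automated rotation using your secrets manager. The workflow has four steps: create a new key, update the consuming service's configuration, verify that the new key is in use, and then [revoke](/docs/implementation/revocation) the old key. Both keys stay valid until the final step, so the service never holds a key that has already been revoked.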
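
A rough sketch of that rotation workflow, assuming an in-memory `KeyStore` in place of a real key service. The `KeyStore` class and the `update_config` and `is_in_use` callbacks are hypothetical; in practice they would wrap your secrets manager and your usage metrics.

```python
import secrets
from typing import Callable


class KeyStore:
    """In-memory stand-in for a key service (hypothetical interface)."""

    def __init__(self) -> None:
        self.active: set[str] = set()

    def create(self) -> str:
        key = "sk_live_" + secrets.token_hex(32)
        self.active.add(key)
        return key

    def revoke(self, key: str) -> None:
        self.active.discard(key)


def rotate(
    store: KeyStore,
    old_key: str,
    update_config: Callable[[str], None],
    is_in_use: Callable[[str], bool],
) -> str:
    """Rotate old_key, revoking it only once the new key is confirmed in use."""
    new_key = store.create()      # 1. create; both keys are valid from here
    update_config(new_key)        # 2. point the consuming service at the new key
    if not is_in_use(new_key):    # 3. verify before touching the old key
        # Leave both keys valid so nothing breaks while someone investigates.
        raise RuntimeError("new key not observed in use; rotation aborted")
    store.revoke(old_key)         # 4. revoke the old key
    return new_key
```

Aborting without revoking either key is a deliberate choice in this sketch: a failed verification should never leave the service without a working key.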

**[Expiration](/docs/security/expiration-policies) and cleanup.** Set expiration policies at the organizational level. Keys that are not rotated within the policy window should trigger alerts, and keys unused for an extended period should be flagged for revocation.
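
One way to express such a policy check is a periodic audit job. The sketch below is illustrative only: the `KeyRecord` type and `audit_keys` function are hypothetical, and the 90-day rotation window and 30-day idle threshold are assumed values, not recommendations from this guide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class KeyRecord:
    name: str
    created_at: datetime
    last_used_at: Optional[datetime] = None


def audit_keys(
    keys: list[KeyRecord],
    now: datetime,
    max_age: timedelta = timedelta(days=90),   # assumed rotation window
    max_idle: timedelta = timedelta(days=30),  # assumed idle threshold
) -> dict[str, list[str]]:
    """Sort key names into 'alert' (past rotation window) and 'flag' (unused)."""
    report: dict[str, list[str]] = {"alert": [], "flag": []}
    for key in keys:
        if now - key.created_at > max_age:
            report["alert"].append(key.name)
        # A key that has never been used counts as idle since creation.
        last_activity = key.last_used_at or key.created_at
        if now - last_activity > max_idle:
            report["flag"].append(key.name)
    return report
```

Passing `now` in explicitly keeps the function deterministic, which makes the policy easy to test before wiring it into an alerting system.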

**Revocation.** When a service is decommissioned or a team member leaves, revocation should happen automatically through your identity provider or deployment pipeline integration, not through a manual checklist.

## Self-Service Key Management

A self-service portal reduces operational burden on your platform team and improves the developer experience for your users. Managed API platforms like [Zuplo](https://zuplo.com/docs/articles/developer-portal?ref=apikeys-guide&utm_source=apikeys-guide&utm_medium=web&utm_campaign=api-keys) include a self-service portal out of the box: consumers can create, rotate, and revoke their own keys, view usage data, and manage multiple keys per account without your team building a custom dashboard.

**Essential features:**
- Key creation with scope selection and optional expiration
- Dashboard showing all active keys with masked values, creation dates, and last-used timestamps
- One-click revocation with confirmation
- Rotation workflow that creates a new key and schedules the old one for deactivation
- Activity logs showing per-key usage history
- Bulk operations for managing multiple keys at once
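
Masking key values for the dashboard could look like the sketch below. It assumes keys follow a `<type>_<env>_<secret>` layout such as `sk_live_...`; the `mask_key` helper and that layout are assumptions made for illustration.

```python
def mask_key(key: str, visible_suffix: int = 4) -> str:
    """Mask a key for display, keeping its prefix and last few characters."""
    # Assumed layout: "<type>_<env>_<secret>". Anything else is treated
    # as a bare secret with no prefix.
    parts = key.split("_", 2)
    if len(parts) == 3:
        prefix, secret = f"{parts[0]}_{parts[1]}_", parts[2]
    else:
        prefix, secret = "", key
    if len(secret) <= visible_suffix * 2:
        # Too short to reveal anything safely: mask the whole secret.
        return prefix + "*" * len(secret)
    hidden = len(secret) - visible_suffix
    return prefix + "*" * hidden + secret[hidden:]
```

Showing only the prefix and a short suffix lets users tell their keys apart without the portal ever rendering the full secret.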

**Access control for the portal itself:**
- Role-based access (admin can create/revoke, member can view and use)
- Audit trail of all portal actions
- SSO integration so portal access is governed by your identity provider

## Multi-Region Considerations

If your API serves users across multiple regions, key management needs to account for distributed infrastructure.

- **Replication strategy.** Key data (hashes, metadata, scopes) must be available in every region where authentication happens. Decide between synchronous replication (stronger consistency, higher latency) and eventual consistency (faster, but brief windows where a newly created key may not work in all regions). Edge-native gateways like [Zuplo](https://zuplo.com/docs/articles/api-key-management?ref=apikeys-guide&utm_source=apikeys-guide&utm_medium=web&utm_campaign=api-keys) handle this replication automatically, propagating key state to 300+ edge locations within seconds.
- **Regional key issuance.** For compliance or performance reasons, you may need keys that are scoped to specific regions. Encode the region in the key prefix (e.g., `sk_eu_`, `sk_us_`) so routing and validation can happen locally.
- **Revocation propagation.** Revocation must propagate to all regions quickly. A key revoked in `us-east-1` should not remain valid in `eu-west-1` for an extended period. Use a push-based invalidation mechanism or short cache TTLs for key validation results.
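
Region-scoped prefixes can be resolved with a simple lookup before validation. In this sketch the prefixes and region names come from the examples above, but the mapping between them and both helper functions are assumptions.

```python
from typing import Optional

# Hypothetical mapping from key prefix to the region that validates it.
REGION_BY_PREFIX = {
    "sk_eu_": "eu-west-1",
    "sk_us_": "us-east-1",
}


def region_for_key(key: str) -> Optional[str]:
    """Return the home region encoded in the key prefix, or None if unknown."""
    for prefix, region in REGION_BY_PREFIX.items():
        if key.startswith(prefix):
            return region
    return None


def should_validate_locally(key: str, local_region: str) -> bool:
    """True when the key's prefix maps to the region handling the request."""
    return region_for_key(key) == local_region
```

A gateway in `eu-west-1` could call `should_validate_locally(key, "eu-west-1")` and reject or forward keys that belong to another region, without consulting the key database first.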

## Database Design for Key Storage

Your key storage layer needs to handle high read throughput (every API request performs a lookup) with strong security guarantees.

**Schema considerations:**

```sql
CREATE TABLE api_keys (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    key_prefix      VARCHAR(12) NOT NULL,       -- "sk_live_" + last 4 chars
    key_hash        BYTEA NOT NULL,             -- SHA-256 or argon2 hash
    organization_id UUID NOT NULL REFERENCES organizations(id),
    project_id      UUID REFERENCES projects(id),
    name            VARCHAR(255),
    scopes          TEXT[] NOT NULL DEFAULT '{}',
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at      TIMESTAMPTZ,
    last_used_at    TIMESTAMPTZ,
    revoked_at      TIMESTAMPTZ,
    created_by      UUID REFERENCES users(id)
);

CREATE INDEX idx_api_keys_hash ON api_keys (key_hash) WHERE revoked_at IS NULL;
CREATE INDEX idx_api_keys_org ON api_keys (organization_id);
```
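On the application side, the `key_prefix` and `key_hash` columns could be populated as sketched below. This assumes the SHA-256 option from the schema comment and a `sk_live_` prefix; `generate_key` and `matches` are hypothetical helpers, not part of any particular library.

```python
import hashlib
import hmac
import secrets


def generate_key(prefix: str = "sk_live_") -> tuple[str, str, bytes]:
    """Return (plaintext key, display prefix, SHA-256 digest)."""
    secret = secrets.token_hex(32)      # 32 random bytes as 64 hex characters
    key = prefix + secret
    key_prefix = prefix + secret[-4:]   # fits VARCHAR(12) for an 8-char prefix
    key_hash = hashlib.sha256(key.encode()).digest()
    return key, key_prefix, key_hash


def matches(presented_key: str, stored_hash: bytes) -> bool:
    """Compare a presented key against the stored digest in constant time."""
    candidate = hashlib.sha256(presented_key.encode()).digest()
    return hmac.compare_digest(candidate, stored_hash)
```

Only `key_prefix` and `key_hash` would be written to the table; the plaintext key is shown to the user once and then discarded.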

**Performance tips:**
- Index the [key hash](/docs/security/hashing-and-storage) column with a partial index that excludes revoked keys, so lookups only scan keys that have not been revoked. Expired keys remain in the index, so check `expires_at` at query time as well.
- Cache validated keys in memory (with a short TTL) to avoid a database round-trip on every request. Invalidate the cache entry when a key is revoked.
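- If using a hash function like argon2 that is intentionally slow, consider a two-tier approach: locate the row with a fast, indexable value (a SHA-256 of the key, or a non-secret key identifier), then verify with the slow hash. Storing a fast hash of the full key next to the slow hash lets an attacker target the fast one, so a non-secret identifier is the safer lookup value. Alternatively, skip the slow hash and store only a fast hash, provided keys are long and randomly generated so that offline brute-force resistance comes from key length and entropy.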
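
The caching tip can be sketched as a small TTL cache keyed by key hash. The `KeyCache` class, its `lookup` callback (standing in for the database query), and the 30-second default TTL are assumptions for illustration.

```python
import time
from typing import Callable


class KeyCache:
    """Caches validation results by key hash for a short TTL."""

    def __init__(
        self,
        lookup: Callable[[bytes], bool],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key hash -> (expiry time, validation result).
        self._entries: dict[bytes, tuple[float, bool]] = {}

    def is_valid(self, key_hash: bytes) -> bool:
        now = self._clock()
        entry = self._entries.get(key_hash)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh cache hit, no database round-trip
        result = self._lookup(key_hash)
        self._entries[key_hash] = (now + self._ttl, result)
        return result

    def invalidate(self, key_hash: bytes) -> None:
        """Drop a cached entry, for example when the key is revoked."""
        self._entries.pop(key_hash, None)
```

Without an explicit `invalidate` call, a revoked key stays valid in this cache until its entry expires, which is why the TTL should be short and revocation should trigger invalidation.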

**Partitioning.** If your key table grows into the millions of rows, partition by organization or by a hash of the key prefix to keep query performance stable.

Scaling key management is fundamentally about removing humans from the critical path. The more you can automate provisioning, rotation, and revocation, the fewer opportunities there are for mistakes, and the easier it is to maintain a strong security posture as your platform grows.
