The Three APIs
Before anything else, you pick an API type. They're three separate products that share a console. Pick wrong and you'll be migrating later — or paying 3.5× more than you need to.
REST API
- VTL mapping templates: Yes
- Resource policies: Yes
- Per-API throttling: Yes
- X-Ray tracing: Yes
- WAF integration: Yes
- Pricing per 1M calls: $3.50
- p99 latency: ~80 ms
HTTP API
- VTL mapping templates: No
- Native JWT authorizer: Yes
- CORS built-in: Yes
- X-Ray tracing: No
- WAF integration: No
- Pricing per 1M calls: $1.00
- p99 latency: ~50 ms
WebSocket API
- Persistent connections: Yes
- Connection routing: Yes
- 2-hour max connection: Yes
- 10-min idle timeout: Yes
- Pricing per 1M msgs: $1.00
- Plus connection minutes: $0.25 per 1M
- Max payload: 128 KB
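The choice between the three types can be sketched as a small decision function. This is a minimal illustration, not an official selection rule: the feature keys (`vtl_mapping`, `waf`, and so on) and the function name are hypothetical labels for the capabilities listed above.

```python
# Capabilities the lists above attribute to REST APIs but not to HTTP APIs.
# The key names are hypothetical labels, not AWS identifiers.
REST_ONLY_FEATURES = {"vtl_mapping", "resource_policy", "xray", "waf"}


def pick_api_type(needs: set) -> str:
    """Return the cheapest API type that covers the requested features."""
    if "persistent_connections" in needs:
        # Only WebSocket APIs keep a connection open.
        return "WebSocket"
    if needs & REST_ONLY_FEATURES:
        # Any REST-only requirement forces the more expensive product.
        return "REST"
    # Otherwise default to the cheaper HTTP API.
    return "HTTP"
```

Defaulting to HTTP and escalating only on a REST-only requirement mirrors the pricing gap: you pay the higher rate only when a listed feature demands it.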
The Authorizer Zoo
There are five different ways to authenticate a request to API Gateway. They do not work the same way across REST and HTTP APIs.
| Authorizer | REST | HTTP | Notes |
|---|---|---|---|
| IAM (SigV4) | Yes | Yes | Internal AWS-to-AWS only. Requires SigV4 signing on every request. |
| Cognito User Pools | Yes | No | REST-only. Validates Cognito-issued JWTs. For HTTP, use the JWT authorizer with Cognito as IdP. |
| JWT (native) | No | Yes | HTTP-only. JWKS-based, no Lambda needed. The reason to choose HTTP for OAuth/OIDC. |
| Lambda authorizer (REQUEST) | Yes | Yes | Custom Lambda; can authorize on any header/query/path/IP. Adds roughly 10 ms per uncached call when warm; cold starts cost far more. |
| Lambda authorizer (TOKEN) | Yes | No | REST-only legacy variant: receives a single header value. Use REQUEST type for new APIs. |
| API keys + usage plans | Yes | No | Not authentication — identification + throttling only. Pair with another authorizer. |
| mTLS (client cert) | Yes | Yes | Custom domain only. Truststore in S3. Doesn't replace another authorizer for app-level identity. |
Authorizer caching: cached decisions are keyed on the identity source, such as $request.header.Authorization. If two users send the same Authorization value during the TTL, the second one gets the first one's allow/deny decision. Set the TTL to 0 for sensitive operations.
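The caching behavior is easier to see in a toy model. The sketch below is not API Gateway's implementation; it only assumes that decisions are cached per Authorization value for a TTL. The class, the authorizer rule, and the IP addresses are all hypothetical.

```python
from typing import Callable, Dict, Tuple


class AuthorizerCache:
    """Toy model: caches allow/deny decisions keyed on the Authorization value only."""

    def __init__(self, authorize: Callable[[str, str], bool], ttl_seconds: float):
        self._authorize = authorize
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, bool]] = {}

    def check(self, authorization: str, source_ip: str, now: float) -> bool:
        cached = self._entries.get(authorization)
        if self._ttl > 0 and cached is not None and now - cached[0] < self._ttl:
            # Cache hit: the authorizer is not consulted, so source_ip is ignored.
            return cached[1]
        decision = self._authorize(authorization, source_ip)
        if self._ttl > 0:
            self._entries[authorization] = (now, decision)
        return decision


def ip_aware_authorizer(authorization: str, source_ip: str) -> bool:
    # Hypothetical rule: the token is only valid from one office address.
    return authorization == "Bearer shared-token" and source_ip == "10.0.0.1"
```

With a 300-second TTL, a call from `10.0.0.1` is allowed and cached, and a later call from any other address with the same header is allowed too until the entry expires. With a TTL of 0 every call reaches the authorizer.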
Deployment Modes
REST APIs offer three deployment modes; HTTP APIs are always Regional. Pick based on who's calling.
Edge-optimized
- Caller location: Global
- Backend latency: +30 ms
- TLS termination: CloudFront
- Best for: public APIs, mobile
Regional
- Caller location: Same region
- Backend latency: Direct
- TLS termination: API GW
- Best for: B2B, internal services
Private
- Caller location: Same VPC
- Backend latency: Direct
- Resource policy: Required
- Best for: internal-only APIs
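For the private mode, the required resource policy usually restricts invocation to a VPC endpoint. The helper below is a minimal sketch of one common shape, assuming a single interface endpoint; the function name and the endpoint ID are placeholders, and a real policy may need explicit deny statements or tighter resource scoping.

```python
import json


def private_api_resource_policy(vpce_id: str) -> dict:
    """Build a resource policy that allows invocation only through one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                # Shorthand for every stage, method, and path of this API.
                "Resource": "execute-api:/*",
                "Condition": {"StringEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder endpoint ID for illustration only.
    print(json.dumps(private_api_resource_policy("vpce-0123456789abcdef0"), indent=2))
```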
Throttling & Quotas
API Gateway throttles at four levels: account, stage, method, and usage plan. Exceed any of them and the caller gets HTTP 429. Exceed them in production and you find out which one.
| Limit | Default | Hard limit? | Adjustable |
|---|---|---|---|
| Account RPS (steady) | 10,000 RPS | Soft | Yes — open ticket |
| Account burst | 5,000 requests | Soft | Yes — open ticket |
| Per-stage RPS | Inherits account | — | In stage settings |
| Per-method RPS | Inherits stage | — | In method settings |
| Usage plan throttle | Configurable | Custom | Per usage plan |
| Integration timeout | 29 s | Soft (REST), hard (HTTP) | REST: raisable for Regional and Private APIs (since Jun 2024). HTTP API: fixed at 30 s. |
| Payload size | 10 MB | Hard | No |
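The relationship between a steady rate and a burst limit follows the token bucket model. The sketch below is a simplified single-bucket illustration of that idea, not the service's actual implementation; the class name and the small numbers in the usage note are assumptions chosen for readability.

```python
class TokenBucket:
    """Simplified token bucket: refills at a steady rate, capped at the burst size."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` (in seconds) is admitted."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller would receive HTTP 429
```

With `TokenBucket(rate_per_second=10, burst=5)`, five simultaneous requests pass, the sixth is rejected, and capacity returns as time elapses. The account defaults in the table play the same roles at a much larger scale.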
Pricing in practice
Headline pricing is requests. Real bill is requests + data transfer + Lambda invocation + CloudWatch logs + (sometimes) WAF + X-Ray + Cognito.
| Component | Rate | Notes |
|---|---|---|
| REST API requests | $3.50 / 1M | First 333M/mo; $2.80 next 667M; $2.38 next 19B; $1.51 over 20B/mo. |
| HTTP API requests | $1.00 / 1M | First 300M; tiered to $0.90 above. ~3.5× cheaper than REST. |
| WebSocket messages | $1.00 / 1M | Plus $0.25 per million connection-minutes. |
| Caching (REST) | ~$15–2,800 / month | By cache size (0.5 GB → 237 GB), billed hourly. A fixed cost in exchange for lower latency. |
| Data transfer out | $0.09 / GB | Standard AWS egress; same-region to VPC is free. |
| X-Ray traces | $5.00 / 1M | Sample to keep this in check. |
| CloudWatch logs | $0.50 / GB ingest | Easy to forget; access logs at scale add up. |
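The request tiers can be turned into a rough estimator. This is a sketch covering request charges only, using the first three REST tiers and the two HTTP tiers from the table; it ignores any further discount at very high volumes and every other line item. The function and constant names are hypothetical.

```python
# (tier size in requests, price per 1M requests); None marks the open-ended last tier.
REST_TIERS = [(333_000_000, 3.50), (667_000_000, 2.80), (None, 2.38)]
HTTP_TIERS = [(300_000_000, 1.00), (None, 0.90)]


def request_cost(requests: int, tiers) -> float:
    """Monthly request charge in USD for the given tier schedule."""
    remaining = requests
    total = 0.0
    for size, price_per_million in tiers:
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier / 1_000_000 * price_per_million
        remaining -= in_tier
        if remaining <= 0:
            break
    return total
```

For 500M requests a month this gives roughly $1,633 on REST against $480 on HTTP, before data transfer, logs, and the rest of the bill.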
Strengths & Gotchas
What it does well
- Tight AWS integration. Lambda, IAM, Cognito, VPC link, Direct Connect, EventBridge — all native.
- No infrastructure to manage. Serverless gateway, scales automatically up to account limits.
- Cheap at small scale. HTTP API at $1/M is hard to beat for <100M requests/month.
- Mature SAM, CDK, and Terraform tooling for declarative API definition.
- Per-stage canary deployments with weighted traffic split, built in.
What to watch for
- No native dev portal. Stoplight, Backstage, Redocly, or roll your own.
- VTL templates are arcane. Apache Velocity in 2026 still feels like a hostage situation.
- No real rate-limit per consumer. Usage plans + API keys is the workaround; not as clean as Apigee/Kong.
- 29-second default integration timeout — raisable as a soft quota for Regional/Private REST APIs (since 2024); HTTP API stays a hard 30 s.
- AWS lock-in. Migration to another gateway means rewriting authorizers, mappings, custom domains.
- Cold starts on Lambda authorizers add 50–800 ms p99 latency on first call.
Migration paths
| Direction | Path | Effort |
|---|---|---|
| REST → HTTP | Strip VTL, rebuild authorizers as JWT/Lambda. Most apps can swap if they don't use mapping templates. | Medium |
| HTTP → REST | Rare. Usually because team needs WAF/X-Ray/resource policies. Rebuild routes, add stages, configure throttling. | Medium |
| API GW → Kong / Apigee | OpenAPI export + rewrite authorizers + new dev portal. Plan for 3–6 weeks per ~100 endpoints. | High |
| ALB-only → API GW | Common pattern. Put API GW in front of existing ALB via VPC Link (REST) or VPC link v2 (HTTP). | Low |
| API GW → Lambda Function URLs | For single-Lambda APIs, drop API GW and use Function URLs. Loses authorizers / mappings / routing. | Low |
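Since the REST → HTTP path hinges on whether mapping templates are in use, a quick scan of an exported OpenAPI document can size the work. The sketch below assumes an export that includes the `x-amazon-apigateway-integration` extension, already parsed into a dict; the function name is hypothetical.

```python
HTTP_METHODS = {
    "get", "put", "post", "delete", "options", "head", "patch",
    "x-amazon-apigateway-any-method",
}


def find_mapping_templates(spec: dict) -> list:
    """List 'METHOD /path' entries whose integration defines VTL mapping templates."""
    hits = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS or not isinstance(operation, dict):
                continue
            integration = operation.get("x-amazon-apigateway-integration", {})
            has_request = bool(integration.get("requestTemplates"))
            has_response = any(
                response.get("responseTemplates")
                for response in integration.get("responses", {}).values()
            )
            if has_request or has_response:
                hits.append(f"{method.upper()} {path}")
    return sorted(hits)
```

An empty result suggests the swap is mostly a matter of rebuilding authorizers; every listed operation is one whose template logic would have to move into the backend.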