REST APIs Refresher
Production API design with Spring Boot 3, FastAPI, and Express: from resource modeling to deployment
1. REST Fundamentals
REST (Representational State Transfer) is an architectural style, not a protocol. Roy Fielding defined it in his 2000 dissertation. Most "REST APIs" in the wild are REST-ish — understanding the real constraints helps you make deliberate trade-offs.
The Six Architectural Constraints
- Client-Server: UI concerns are separated from data storage concerns. The client doesn't need to know how orders are persisted; the server doesn't need to know how they're rendered.
- Stateless: Each request from client to server must contain all information needed to understand the request. No session state is stored on the server between calls. This is why JWTs are popular — the token carries identity.
- Cacheable: Responses must define themselves as cacheable or non-cacheable. Correctly using Cache-Control and ETag can eliminate entire categories of load.
- Uniform Interface: The central feature that distinguishes REST. Four sub-constraints: resource identification in requests (URI), manipulation of resources through representations, self-descriptive messages, and hypermedia as the engine of application state (HATEOAS).
- Layered System: A client can't tell whether it's connected directly to the end server or to an intermediary (load balancer, CDN, API gateway). This enables horizontal scaling invisibly.
- Code on Demand (optional): Servers can extend client functionality by transferring executable code (JavaScript). The only optional constraint.
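To make the Cacheable constraint concrete, here is a minimal sketch of a conditional GET using only the Python standard library. The function names, the 16-character hash truncation, and the Cache-Control value are illustrative assumptions, not part of any framework. It also only handles a single ETag in If-None-Match, while real clients may send a list or `*`.

```python
import hashlib
import json
from typing import Optional

def make_etag(representation: bytes) -> str:
    """Build a strong ETag: a quoted hash of the exact response bytes."""
    return '"' + hashlib.sha256(representation).hexdigest()[:16] + '"'

def conditional_get(resource: dict, if_none_match: Optional[str]):
    """Return (status, headers, body) for a GET that honors If-None-Match."""
    body = json.dumps(resource, sort_keys=True).encode()
    etag = make_etag(body)
    # Hypothetical caching policy; choose values that fit your data.
    headers = {"ETag": etag, "Cache-Control": "private, max-age=60"}
    if if_none_match == etag:
        # The client's cached copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body

order = {"id": 42, "status": "processing"}
status, headers, body = conditional_get(order, None)
status_again, _, body_again = conditional_get(order, headers["ETag"])
```

The second call returns 304 with an empty body, which is where the bandwidth saving comes from. Once the resource changes, its ETag changes and the client receives a full 200 again.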
Richardson Maturity Model
Leonard Richardson's model describes four levels of API maturity. Most production APIs target Level 2. Level 3 (HATEOAS) is rarely worth the complexity unless you're building a public hypermedia API.
| Level | Name | Characteristics | Example |
|---|---|---|---|
| 0 | The Swamp of POX | Single URI, single HTTP method (POST). All operations are in the body. | POST /api → {"action":"getOrder","id":42} |
| 1 | Resources | Multiple URIs for different resources, but still usually just GET/POST. | POST /orders/42 to get order 42 |
| 2 | HTTP Verbs | Correct HTTP methods (GET, POST, PUT, DELETE). Correct status codes. The practical target for most APIs. | GET /orders/42, DELETE /orders/42 |
| 3 | Hypermedia Controls (HATEOAS) | Responses contain links describing what actions are available next. Self-documenting API state machines. | Response includes "_links": {"cancel": "/orders/42/cancel"} |
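As a rough illustration of Level 3, the sketch below derives the `_links` object from the current state of an order, so the client discovers which transitions are allowed instead of hard-coding them. The status names and the `order_links` helper are assumptions made for this example. The cancel link uses the `/cancellation` sub-resource form.

```python
def order_links(order_id: int, status: str) -> dict:
    """Return the hypermedia links that are valid for the order's current state."""
    base = f"/orders/{order_id}"
    links = {"self": base, "items": f"{base}/items"}
    # Assumed business rule: only orders that have not shipped can be cancelled.
    if status in {"pending", "processing"}:
        links["cancel"] = f"{base}/cancellation"
    return links

def order_representation(order_id: int, status: str) -> dict:
    """Build a Level 3 style response body with embedded links."""
    return {"id": order_id, "status": status, "_links": order_links(order_id, status)}

processing = order_representation(42, "processing")
shipped = order_representation(42, "shipped")
```

A client that only follows links present in `_links` never attempts to cancel the shipped order, because that link is simply absent.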
REST vs RPC vs GraphQL
| Dimension | REST | gRPC / RPC | GraphQL |
|---|---|---|---|
| Mental model | Resources & representations | Functions / procedures | Graph of typed fields |
| Transport | HTTP/1.1, HTTP/2 | HTTP/2 (gRPC), any (JSON-RPC) | HTTP/1.1, HTTP/2, WebSocket |
| Payload | JSON, XML, any | Protobuf (binary), JSON | JSON |
| Over-fetching | Common (fixed response shapes) | Low (schema-driven) | None (client selects fields) |
| Caching | HTTP-native (ETags, Cache-Control) | Manual or no caching | Hard (POST-based queries bypass HTTP cache) |
| Tooling maturity | Excellent | Good (Protobuf ecosystem) | Good (Apollo, Relay) |
| Best for | Public APIs, mobile backends, microservices with stable contracts | Internal service-to-service, streaming, polyglot systems | Complex UIs with many data shapes, BFF pattern |
2. HTTP Methods & Status Codes
Methods: Safety & Idempotency
Two properties that determine how proxies, CDNs, and clients may retry or cache requests:
- Safe: Does not modify server state. GET, HEAD, OPTIONS are safe.
- Idempotent: Calling it N times has the same effect as calling it once. GET, PUT, DELETE, HEAD, OPTIONS are idempotent. POST is neither. PATCH is neither by default.
| Method | Safe | Idempotent | Request Body | Typical Use |
|---|---|---|---|---|
| GET | Yes | Yes | No | Fetch resource or collection |
| POST | No | No | Yes | Create resource, trigger action |
| PUT | No | Yes | Yes | Full replacement of a resource |
| PATCH | No | No* | Yes | Partial update |
| DELETE | No | Yes | Rarely | Remove resource |
| HEAD | Yes | Yes | No | Check resource exists / get headers |
| OPTIONS | Yes | Yes | No | CORS preflight, API discovery |
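* PATCH can be designed to be idempotent: a JSON Patch "replace" operation yields the same result however often it is applied, while appending to an array (an "add" to a path ending in "/-") or a custom increment does not.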
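The difference between idempotent and non-idempotent updates is easiest to see by applying the same operation twice. The following sketch uses an in-memory dict as a stand-in for the server's store. The helper names are made up for illustration and do not correspond to any library.

```python
import copy

def patch_replace(store: dict, key: str, field: str, value) -> None:
    """Idempotent partial update: sets a field to an absolute value."""
    store[key][field] = value

def patch_increment(store: dict, key: str, field: str, amount: int) -> None:
    """Non-idempotent partial update: the result depends on the prior state."""
    store[key][field] += amount

store = {"order-42": {"status": "processing", "quantity": 1}}

# Applying the replace twice leaves the same state as applying it once.
patch_replace(store, "order-42", "status", "shipped")
after_one_replace = copy.deepcopy(store)
patch_replace(store, "order-42", "status", "shipped")
after_two_replaces = copy.deepcopy(store)

# Applying the increment twice changes the state a second time.
patch_increment(store, "order-42", "quantity", 1)
after_one_increment = copy.deepcopy(store)
patch_increment(store, "order-42", "quantity", 1)
after_two_increments = copy.deepcopy(store)
```

This is why a client or proxy can safely retry the first kind of request after a timeout, while the second kind needs some form of deduplication on the server.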
Status Codes That Matter in Production
| Code | Name | Real Scenario |
|---|---|---|
| 200 | OK | GET /orders/42 returns the order |
| 201 | Created | POST /orders creates an order; include Location: /orders/99 header |
| 204 | No Content | DELETE /orders/42 succeeds; no body needed |
| 301 | Moved Permanently | API versioning: /api/orders permanently moved to /api/v2/orders |
| 304 | Not Modified | Client sends If-None-Match: "abc123"; resource unchanged; save bandwidth |
| 400 | Bad Request | Request body fails validation (missing required field, wrong type) |
| 401 | Unauthorized | JWT token missing or expired; include WWW-Authenticate header |
| 403 | Forbidden | Token valid, but user lacks permission (e.g., non-admin accessing admin endpoint) |
| 404 | Not Found | GET /orders/99999 where that order doesn't exist |
| 409 | Conflict | POST /users with email that already exists (unique constraint violation) |
| 422 | Unprocessable Entity | Syntactically valid JSON but semantically wrong (e.g., end date before start date) |
| 429 | Too Many Requests | Rate limit exceeded; include Retry-After and X-RateLimit-Reset headers |
| 500 | Internal Server Error | Unhandled exception; log the full stack trace server-side, return sanitized message |
| 502 | Bad Gateway | Upstream service (payment processor, DB) returned invalid response |
| 503 | Service Unavailable | Intentional during maintenance, circuit breaker open, or DB connection exhausted |
3. Resource Design & URL Patterns
Nouns Not Verbs
URLs identify resources (things), not actions (verbs). HTTP methods already express the action.
| Bad (verbs in URL) | Good (nouns + HTTP method) |
|---|---|
| POST /createOrder | POST /orders |
| GET /getOrderById?id=42 | GET /orders/42 |
| POST /cancelOrder/42 | POST /orders/42/cancellation |
| DELETE /deleteUser/5 | DELETE /users/5 |
| GET /searchProducts | GET /products?q=laptop |
Nested vs Flat Resources
Nesting expresses ownership and context. The practical rule: nest at most one level deep. Deeper nesting creates brittle URLs and forces callers to know the full ownership chain.
| Approach | URL Example | When to Use | Trade-offs |
|---|---|---|---|
| Nested | /orders/{orderId}/items | Resource has no meaning outside parent (order items without an order) | Clear ownership; couples client to hierarchy; hard to paginate across parents |
| Flat with filter | /order-items?orderId=42 | Resource can exist independently or needs querying across parents | Flexible; less intuitive; authorization must be explicit |
| Mixed | /orders/{id}/items writes; /order-items?status=pending queries | Best of both: ownership for writes, flexibility for reads | More endpoints; document the intent clearly |
Route Definition in All Three Frameworks
Java — Spring Boot 3
@RestController
@RequestMapping("/api/v1")
public class OrderController {
// GET /api/v1/orders
@GetMapping("/orders")
public ResponseEntity<Page<OrderDto>> listOrders(
@RequestParam(defaultValue = "0") int page,
@RequestParam(defaultValue = "20") int size) {
return ResponseEntity.ok(orderService.findAll(PageRequest.of(page, size)));
}
// GET /api/v1/orders/{orderId}
@GetMapping("/orders/{orderId}")
public ResponseEntity<OrderDto> getOrder(@PathVariable UUID orderId) {
return ResponseEntity.ok(orderService.findById(orderId));
}
// GET /api/v1/orders/{orderId}/items
@GetMapping("/orders/{orderId}/items")
public ResponseEntity<List<OrderItemDto>> listItems(@PathVariable UUID orderId) {
return ResponseEntity.ok(orderService.findItems(orderId));
}
// POST /api/v1/orders
@PostMapping("/orders")
public ResponseEntity<OrderDto> createOrder(
@Valid @RequestBody CreateOrderRequest request,
UriComponentsBuilder uriBuilder) {
OrderDto created = orderService.create(request);
URI location = uriBuilder.path("/api/v1/orders/{id}")
.buildAndExpand(created.id()).toUri();
return ResponseEntity.created(location).body(created);
}
// PATCH /api/v1/orders/{orderId}
@PatchMapping("/orders/{orderId}")
public ResponseEntity<OrderDto> updateOrder(
@PathVariable UUID orderId,
@Valid @RequestBody UpdateOrderRequest request) {
return ResponseEntity.ok(orderService.update(orderId, request));
}
// DELETE /api/v1/orders/{orderId}
@DeleteMapping("/orders/{orderId}")
public ResponseEntity<Void> deleteOrder(@PathVariable UUID orderId) {
orderService.delete(orderId);
return ResponseEntity.noContent().build();
}
// POST /api/v1/orders/{orderId}/cancellation (action as sub-resource)
@PostMapping("/orders/{orderId}/cancellation")
public ResponseEntity<OrderDto> cancelOrder(
@PathVariable UUID orderId,
@Valid @RequestBody CancelOrderRequest request) {
return ResponseEntity.ok(orderService.cancel(orderId, request.reason()));
}
}
Python — FastAPI
from fastapi import APIRouter, Depends, HTTPException, Query, status
from fastapi.responses import Response
from uuid import UUID
router = APIRouter(prefix="/api/v1", tags=["orders"])
@router.get("/orders", response_model=PaginatedResponse[OrderDto])
async def list_orders(
page: int = Query(0, ge=0),
size: int = Query(20, ge=1, le=100),
db: AsyncSession = Depends(get_db),
):
return await order_service.find_all(db, page=page, size=size)
@router.get("/orders/{order_id}", response_model=OrderDto)
async def get_order(order_id: UUID, db: AsyncSession = Depends(get_db)):
order = await order_service.find_by_id(db, order_id)
if not order:
raise HTTPException(status_code=404, detail="Order not found")
return order
@router.get("/orders/{order_id}/items", response_model=list[OrderItemDto])
async def list_order_items(order_id: UUID, db: AsyncSession = Depends(get_db)):
return await order_service.find_items(db, order_id)
@router.post("/orders", response_model=OrderDto, status_code=status.HTTP_201_CREATED)
async def create_order(
request: CreateOrderRequest,
response: Response,
db: AsyncSession = Depends(get_db),
):
created = await order_service.create(db, request)
response.headers["Location"] = f"/api/v1/orders/{created.id}"
return created
@router.patch("/orders/{order_id}", response_model=OrderDto)
async def update_order(
order_id: UUID,
request: UpdateOrderRequest,
db: AsyncSession = Depends(get_db),
):
return await order_service.update(db, order_id, request)
@router.delete("/orders/{order_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_order(order_id: UUID, db: AsyncSession = Depends(get_db)):
await order_service.delete(db, order_id)
@router.post("/orders/{order_id}/cancellation", response_model=OrderDto)
async def cancel_order(
order_id: UUID,
request: CancelOrderRequest,
db: AsyncSession = Depends(get_db),
):
return await order_service.cancel(db, order_id, request.reason)
Node.js — Express
import { Router } from 'express';
import { validate } from '../middleware/validate.js';
import { createOrderSchema, updateOrderSchema, cancelOrderSchema } from '../schemas/order.js';
const router = Router();
// GET /api/v1/orders
router.get('/orders', async (req, res, next) => {
try {
const { page = 0, size = 20 } = req.query;
const result = await orderService.findAll({ page: Number(page), size: Number(size) });
res.json(result);
} catch (err) { next(err); }
});
// GET /api/v1/orders/:orderId
router.get('/orders/:orderId', async (req, res, next) => {
try {
const order = await orderService.findById(req.params.orderId);
if (!order) return res.status(404).json({ error: 'Order not found' });
res.json(order);
} catch (err) { next(err); }
});
// GET /api/v1/orders/:orderId/items
router.get('/orders/:orderId/items', async (req, res, next) => {
try {
const items = await orderService.findItems(req.params.orderId);
res.json(items);
} catch (err) { next(err); }
});
// POST /api/v1/orders
router.post('/orders', validate(createOrderSchema), async (req, res, next) => {
try {
const created = await orderService.create(req.body);
res.status(201)
.setHeader('Location', `/api/v1/orders/${created.id}`)
.json(created);
} catch (err) { next(err); }
});
// PATCH /api/v1/orders/:orderId
router.patch('/orders/:orderId', validate(updateOrderSchema), async (req, res, next) => {
try {
const updated = await orderService.update(req.params.orderId, req.body);
res.json(updated);
} catch (err) { next(err); }
});
// DELETE /api/v1/orders/:orderId
router.delete('/orders/:orderId', async (req, res, next) => {
try {
await orderService.delete(req.params.orderId);
res.status(204).send();
} catch (err) { next(err); }
});
// POST /api/v1/orders/:orderId/cancellation
router.post('/orders/:orderId/cancellation', validate(cancelOrderSchema), async (req, res, next) => {
try {
const updated = await orderService.cancel(req.params.orderId, req.body.reason);
res.json(updated);
} catch (err) { next(err); }
});
export const orderRouter = router;
4. Request & Response Design
JSON Conventions
Pick a naming convention and stick to it throughout the entire API. Mixing camelCase and snake_case in the same API is the fastest way to cause client bugs.
| Convention | Ecosystem Default | Example |
|---|---|---|
| camelCase | JavaScript, Java (Jackson), Go | {"orderId": 42, "createdAt": "..."} |
| snake_case | Python, Ruby, PostgreSQL | {"order_id": 42, "created_at": "..."} |
| PascalCase | .NET (default) | {"OrderId": 42, "CreatedAt": "..."} |
| kebab-case | Rare in JSON bodies, common in headers | {"order-id": 42} |
Envelope Patterns
A consistent envelope makes API responses predictable. Two common approaches:
Bare resource (GitHub API style)
// Single resource — return the object directly
{
"id": "order_abc123",
"status": "processing",
"total": 9999,
"currency": "usd",
"created_at": "2026-02-15T10:30:00Z"
}
// Collection — include pagination metadata
{
"data": [
{ "id": "order_abc123", "status": "processing" },
{ "id": "order_def456", "status": "shipped" }
],
"pagination": {
"total": 142,
"page": 0,
"size": 20,
"has_more": true
}
}
RFC 9457 Problem Details for errors (supersedes RFC 7807)
// Always use this for errors — standardized and machine-readable
{
"type": "https://api.example.com/errors/validation-failed",
"title": "Validation Failed",
"status": 422,
"detail": "The request body contains invalid fields.",
"instance": "/api/v1/orders/abc123",
"errors": [
{
"field": "items[0].quantity",
"message": "must be greater than 0",
"code": "MIN_VALUE"
},
{
"field": "shipping_address.zip",
"message": "invalid postal code format",
"code": "INVALID_FORMAT"
}
]
}
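A small helper keeps every error response in the same shape as the example above. This is a sketch under assumptions: the `problem_detail` function and its parameter names are hypothetical, while `about:blank` as the default `type` and the `application/problem+json` media type come from the Problem Details specification.

```python
from typing import Optional

# Media type that the Problem Details specification registers for these bodies.
PROBLEM_JSON = "application/problem+json"

def problem_detail(status: int, title: str, detail: str,
                   type_uri: str = "about:blank",
                   instance: Optional[str] = None,
                   **extensions) -> dict:
    """Build a Problem Details body; extra keyword arguments become extension members."""
    body = {"type": type_uri, "title": title, "status": status, "detail": detail}
    if instance is not None:
        body["instance"] = instance
    body.update(extensions)  # e.g. errors=[...] for field-level validation failures
    return body

validation_problem = problem_detail(
    422,
    "Validation Failed",
    "The request body contains invalid fields.",
    type_uri="https://api.example.com/errors/validation-failed",
    instance="/api/v1/orders/abc123",
    errors=[{"field": "items[0].quantity",
             "message": "must be greater than 0",
             "code": "MIN_VALUE"}],
)
```

Serving the result with the `PROBLEM_JSON` content type lets generic clients recognise the body as a problem document without inspecting it.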
Serialization in Each Framework
Java — Spring Boot 3 with Records
// DTO using Java record (immutable, auto-equals/hashCode, compact)
public record OrderDto(
UUID id,
String status,
BigDecimal total,
String currency,
List<OrderItemDto> items,
@JsonProperty("created_at") Instant createdAt
) {
// Jackson maps JSON field "created_at" to Java "createdAt"
}
public record CreateOrderRequest(
@NotNull @Size(min = 1, max = 50) List<CreateOrderItemRequest> items,
@NotNull @Valid ShippingAddressRequest shippingAddress,
@Pattern(regexp = "^[A-Z]{3}$") String currency
) {}
// Configure Jackson globally in application.yml:
// spring.jackson.property-naming-strategy: SNAKE_CASE
// spring.jackson.serialization.write-dates-as-timestamps: false
Python — FastAPI + Pydantic v2
from pydantic import BaseModel, Field, field_validator
from datetime import datetime
from uuid import UUID
from decimal import Decimal
class OrderItemDto(BaseModel):
product_id: UUID
quantity: int = Field(gt=0, le=1000)
unit_price: Decimal = Field(decimal_places=2)
name: str
class OrderDto(BaseModel):
id: UUID
status: str
total: Decimal
currency: str
items: list[OrderItemDto]
created_at: datetime
model_config = {
"from_attributes": True, # Allow creating from ORM objects
}
# Pydantic v2 already serializes Decimal as a string in JSON output, which avoids
# float precision loss; the v1-style json_encoders setting is deprecated in v2.
class CreateOrderRequest(BaseModel):
items: list[CreateOrderItemRequest] = Field(min_length=1, max_length=50)
shipping_address: ShippingAddressRequest
currency: str = Field(pattern=r"^[A-Z]{3}$", default="USD")
@field_validator("currency")
@classmethod
def currency_must_be_supported(cls, v: str) -> str:
supported = {"USD", "EUR", "GBP", "CAD"}
if v not in supported:
raise ValueError(f"Currency must be one of: {', '.join(sorted(supported))}")
return v
Node.js — Express + Zod
import { z } from 'zod';
// Response item shape, expressed as a Zod schema
const OrderItemDtoSchema = z.object({
productId: z.string().uuid(),
quantity: z.number().int().positive().max(1000),
unitPrice: z.string(), // Decimal as string to avoid float imprecision
name: z.string(),
});
// Zod schema doubles as runtime validator and TypeScript type
const CreateOrderSchema = z.object({
items: z.array(z.object({
productId: z.string().uuid(),
quantity: z.number().int().min(1).max(1000),
})).min(1).max(50),
shippingAddress: ShippingAddressSchema,
currency: z.string().regex(/^[A-Z]{3}$/).default('USD'),
});
export type CreateOrderRequest = z.infer<typeof CreateOrderSchema>;
// Validation middleware
export function validate(schema) {
return (req, res, next) => {
const result = schema.safeParse(req.body);
if (!result.success) {
return res.status(422).json({
type: 'https://api.example.com/errors/validation-failed',
title: 'Validation Failed',
status: 422,
errors: result.error.issues.map(e => ({
field: e.path.join('.'),
message: e.message,
code: e.code.toUpperCase(),
})),
});
}
req.body = result.data; // Replace with parsed + typed value
next();
};
}
5. Pagination, Filtering & Sorting
Offset vs Cursor Pagination
| Dimension | Offset / Page-Based | Cursor / Keyset |
|---|---|---|
| URL | ?page=3&size=20 | ?cursor=eyJpZCI6MTAwfQ&size=20 |
| DB Query | LIMIT 20 OFFSET 60 | WHERE id > 100 LIMIT 20 |
| Performance | Degrades at high offsets (DB scans all skipped rows) | Constant time (index seek) |
| Consistency | Rows inserted during pagination cause duplicates/skips | Stable (cursor is a fixed point) |
| Random access | Yes ("jump to page 50") | No (sequential only) |
| Best for | Admin UIs, small datasets, when users need page numbers | Infinite scroll, feeds, large tables, public APIs |
Real-world example: Stripe's API uses the starting_after / ending_before cursor pattern. All their list endpoints return "has_more": true/false and an array of objects. No page numbers. This is the right choice for financial data where consistency under concurrent writes matters.
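The keyset approach from the table can be sketched end to end with the standard library's sqlite3 module. The table layout, the helper names, and the choice of an integer `id` as the cursor key are assumptions for this example. The cursor is an opaque base64url string wrapping the last seen id, in the same shape as the `?cursor=eyJpZCI6MTAwfQ` sample above.

```python
import base64
import json
import sqlite3
from typing import Optional

def encode_cursor(last_id: int) -> str:
    """Wrap the last seen id in an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")

def decode_cursor(cursor: str) -> int:
    """Reverse encode_cursor, restoring the stripped base64 padding first."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["id"]

def fetch_page(conn: sqlite3.Connection, cursor: Optional[str], size: int) -> dict:
    """Fetch one page after the cursor; reads size + 1 rows to detect has_more."""
    after_id = decode_cursor(cursor) if cursor else 0
    rows = conn.execute(
        "SELECT id, status FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, size + 1),
    ).fetchall()
    has_more = len(rows) > size
    rows = rows[:size]
    next_cursor = encode_cursor(rows[-1][0]) if has_more else None
    return {
        "data": [{"id": r[0], "status": r[1]} for r in rows],
        "next_cursor": next_cursor,
        "has_more": has_more,
    }

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (id, status) VALUES (?, ?)",
                 [(i, "pending") for i in range(1, 6)])

first = fetch_page(conn, None, 2)
second = fetch_page(conn, first["next_cursor"], 2)
third = fetch_page(conn, second["next_cursor"], 2)
```

Because each page starts from an indexed `id > ?` comparison rather than an OFFSET, rows inserted or deleted before the cursor position do not shift later pages.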
Pagination Implementation
Java — Spring Boot with JPA
// Offset pagination (Spring Data handles it natively)
@GetMapping("/orders")
public ResponseEntity<PageResponse<OrderDto>> listOrders(
@RequestParam(defaultValue = "0") int page,
@RequestParam(defaultValue = "20") @Max(100) int size,
@RequestParam(required = false) String status,
@RequestParam(required = false) @DateTimeFormat(iso = ISO.DATE) LocalDate createdAfter,
@RequestParam(defaultValue = "created_at") String sort,
@RequestParam(defaultValue = "desc") String direction) {
Sort.Direction dir = Sort.Direction.fromString(direction);
Pageable pageable = PageRequest.of(page, size, Sort.by(dir, toColumn(sort)));
Specification<Order> spec = Specification
.where(hasStatus(status))
.and(createdAfter(createdAfter));
Page<Order> result = orderRepository.findAll(spec, pageable);
return ResponseEntity.ok(PageResponse.from(result, orderMapper::toDto));
}
// Cursor pagination for feeds
@GetMapping("/orders/feed")
public ResponseEntity<CursorPage<OrderDto>> orderFeed(
@RequestParam(required = false) String cursor,
@RequestParam(defaultValue = "20") @Max(100) int size) {
UUID afterId = cursor != null ? decodeCursor(cursor) : null;
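// The repository method takes a Pageable, so wrap the limit (size + 1) in a PageRequest
List<Order> orders = orderRepository.findAfterCursor(afterId, PageRequest.of(0, size + 1));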
boolean hasMore = orders.size() > size;
if (hasMore) orders = orders.subList(0, size);
String nextCursor = hasMore ? encodeCursor(orders.get(size - 1).getId()) : null;
return ResponseEntity.ok(new CursorPage<>(orders.stream().map(orderMapper::toDto).toList(), nextCursor, hasMore));
}
// JPA repository
@Query("SELECT o FROM Order o WHERE (:afterId IS NULL OR o.id > :afterId) ORDER BY o.id ASC")
List<Order> findAfterCursor(@Param("afterId") UUID afterId, Pageable pageable);
Python — FastAPI with SQLAlchemy
from base64 import b64encode, b64decode
from datetime import date, datetime
from typing import Literal
import json
@router.get("/orders", response_model=CursorPage[OrderDto])
async def list_orders(
cursor: str | None = Query(None),
size: int = Query(20, ge=1, le=100),
status: str | None = Query(None),
created_after: date | None = Query(None),
sort: str = Query("created_at"),
direction: Literal["asc", "desc"] = Query("desc"),
db: AsyncSession = Depends(get_db),
):
stmt = select(Order)
# Filtering
if status:
stmt = stmt.where(Order.status == status)
if created_after:
stmt = stmt.where(Order.created_at >= created_after)
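# Cursor decoding. The cursor is keyed on created_at, so paging is only
# consistent when sorting by created_at; a production cursor should carry the
# active sort column plus a unique tie-breaker such as id.
if cursor:
cursor_data = json.loads(b64decode(cursor))
cursor_ts = datetime.fromisoformat(cursor_data["created_at"])
if direction == "desc":
stmt = stmt.where(Order.created_at < cursor_ts)
else:
stmt = stmt.where(Order.created_at > cursor_ts)
# Sort and limit (whitelist sortable columns instead of trusting getattr on user input)
allowed_sort = {"created_at": Order.created_at, "total": Order.total, "status": Order.status}
col = allowed_sort.get(sort, Order.created_at)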
stmt = stmt.order_by(col.desc() if direction == "desc" else col.asc())
stmt = stmt.limit(size + 1)
result = await db.execute(stmt)
orders = result.scalars().all()
has_more = len(orders) > size
if has_more:
orders = orders[:size]
next_cursor = None
if has_more:
last = orders[-1]
next_cursor = b64encode(json.dumps({
"created_at": last.created_at.isoformat()
}).encode()).decode()
return CursorPage(
data=[OrderDto.model_validate(o) for o in orders],
next_cursor=next_cursor,
has_more=has_more,
)
Node.js — Express with pg
router.get('/orders', async (req, res, next) => {
try {
const {
cursor,
size = '20',
status,
created_after,
sort = 'created_at',
direction = 'desc',
} = req.query;
// Clamp to 1..100 and fall back to 20 when size is not a number
const pageSize = Math.min(Math.max(parseInt(size, 10) || 20, 1), 100);
const params = [];
const conditions = [];
let idx = 1;
// Filtering
if (status) {
conditions.push(`status = $${idx++}`);
params.push(status);
}
if (created_after) {
conditions.push(`created_at >= $${idx++}`);
params.push(new Date(created_after));
}
// Cursor decoding
if (cursor) {
const { created_at: cursorTs } = JSON.parse(Buffer.from(cursor, 'base64').toString());
const op = direction === 'desc' ? '<' : '>';
conditions.push(`created_at ${op} $${idx++}`);
params.push(new Date(cursorTs));
}
const where = conditions.length ? `WHERE ${conditions.join(' AND ')}` : '';
const allowedSort = ['created_at', 'total', 'status'];
const sortCol = allowedSort.includes(sort) ? sort : 'created_at';
const dir = direction === 'asc' ? 'ASC' : 'DESC';
// Fetch one extra to detect has_more
params.push(pageSize + 1);
const { rows } = await db.query(
`SELECT * FROM orders ${where} ORDER BY ${sortCol} ${dir} LIMIT $${idx}`,
params,
);
const hasMore = rows.length > pageSize;
const data = hasMore ? rows.slice(0, pageSize) : rows;
const nextCursor = hasMore
? Buffer.from(JSON.stringify({ created_at: data[data.length - 1].created_at })).toString('base64')
: null;
res.json({ data, next_cursor: nextCursor, has_more: hasMore });
} catch (err) { next(err); }
});
6. Validation & Error Handling
Validation belongs at the boundary. Never let invalid data reach your service layer or database. Structured, consistent error responses are a hallmark of a mature API.
Global Exception / Error Handlers
Java — Spring Boot @ControllerAdvice
@RestControllerAdvice
@Slf4j
public class GlobalExceptionHandler {
// Handle bean validation failures (@Valid on request body)
@ExceptionHandler(MethodArgumentNotValidException.class)
public ResponseEntity<ProblemDetail> handleValidation(
MethodArgumentNotValidException ex, HttpServletRequest request) {
ProblemDetail problem = ProblemDetail.forStatusAndDetail(
HttpStatus.UNPROCESSABLE_ENTITY, "The request body contains invalid fields.");
problem.setType(URI.create("https://api.example.com/errors/validation-failed"));
problem.setInstance(URI.create(request.getRequestURI()));
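// ApiFieldError is an application-defined record (field, message, code),
// named so that it does not clash with Spring's own FieldError class
List<ApiFieldError> fieldErrors = ex.getBindingResult().getFieldErrors()
.stream()
.map(e -> new ApiFieldError(e.getField(), e.getDefaultMessage(), e.getCode()))
.toList();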
problem.setProperty("errors", fieldErrors);
return ResponseEntity.unprocessableEntity().body(problem);
}
@ExceptionHandler(OrderNotFoundException.class)
public ResponseEntity<ProblemDetail> handleNotFound(
OrderNotFoundException ex, HttpServletRequest request) {
ProblemDetail problem = ProblemDetail.forStatusAndDetail(
HttpStatus.NOT_FOUND, ex.getMessage());
problem.setInstance(URI.create(request.getRequestURI()));
return ResponseEntity.status(404).body(problem);
}
@ExceptionHandler(DuplicateEmailException.class)
public ResponseEntity<ProblemDetail> handleConflict(
DuplicateEmailException ex, HttpServletRequest request) {
ProblemDetail problem = ProblemDetail.forStatusAndDetail(
HttpStatus.CONFLICT, "A user with this email already exists.");
problem.setType(URI.create("https://api.example.com/errors/duplicate-email"));
return ResponseEntity.status(409).body(problem);
}
// Catch-all: log + return 500 without leaking stack trace
@ExceptionHandler(Exception.class)
public ResponseEntity<ProblemDetail> handleUnexpected(
Exception ex, HttpServletRequest request) {
String traceId = MDC.get("traceId"); // From OpenTelemetry/Sleuth
log.error("Unhandled exception [traceId={}]", traceId, ex);
ProblemDetail problem = ProblemDetail.forStatusAndDetail(
HttpStatus.INTERNAL_SERVER_ERROR,
"An unexpected error occurred. Reference: " + traceId);
return ResponseEntity.internalServerError().body(problem);
}
}
Python — FastAPI exception handlers
from fastapi import Request, status
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
# Pydantic v2 validation errors are caught automatically by FastAPI,
# but we can customize the response format:
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
errors = []
for error in exc.errors():
errors.append({
"field": ".".join(str(loc) for loc in error["loc"][1:]), # Skip "body"
"message": error["msg"],
"code": error["type"].upper(),
})
return JSONResponse(
status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
content={
"type": "https://api.example.com/errors/validation-failed",
"title": "Validation Failed",
"status": 422,
"detail": "The request body contains invalid fields.",
"errors": errors,
},
)
class OrderNotFoundError(Exception):
def __init__(self, order_id: UUID):
self.order_id = order_id
@app.exception_handler(OrderNotFoundError)
async def order_not_found_handler(request: Request, exc: OrderNotFoundError):
return JSONResponse(
status_code=404,
content={
"type": "https://api.example.com/errors/not-found",
"title": "Not Found",
"status": 404,
"detail": f"Order {exc.order_id} not found.",
"instance": str(request.url),
},
)
# Catch-all: log + sanitize
@app.exception_handler(Exception)
async def unexpected_error_handler(request: Request, exc: Exception):
trace_id = getattr(request.state, "trace_id", None) # Set by middleware; None if it did not run
logger.exception("Unhandled exception [trace_id=%s]", trace_id)
return JSONResponse(
status_code=500,
content={
"title": "Internal Server Error",
"status": 500,
"detail": f"An unexpected error occurred. Reference: {trace_id}",
},
)
Node.js — Express error middleware
// Error middleware must have 4 parameters — Express identifies it by arity
export function errorHandler(err, req, res, next) {
const traceId = req.headers['x-trace-id'] || crypto.randomUUID();
// Known domain errors
if (err instanceof OrderNotFoundError) {
return res.status(404).json({
type: 'https://api.example.com/errors/not-found',
title: 'Not Found',
status: 404,
detail: err.message,
instance: req.path,
});
}
if (err instanceof DuplicateEmailError) {
return res.status(409).json({
type: 'https://api.example.com/errors/duplicate-email',
title: 'Conflict',
status: 409,
detail: 'A user with this email already exists.',
});
}
// Validation errors from Zod middleware (already handled inline, but as fallback)
if (err.name === 'ZodError') {
return res.status(422).json({
type: 'https://api.example.com/errors/validation-failed',
title: 'Validation Failed',
status: 422,
errors: err.issues.map(e => ({ field: e.path.join('.'), message: e.message })),
});
}
// Catch-all
logger.error({ err, traceId, path: req.path }, 'Unhandled exception');
res.status(500).json({
title: 'Internal Server Error',
status: 500,
detail: `An unexpected error occurred. Reference: ${traceId}`,
});
}
// Register LAST, after all routes
app.use(errorHandler);
7. Authentication & Authorization
The Problem: Auth in Distributed Systems
Traditional web auth uses server-side sessions: client logs in, server creates a session object in memory (or a DB), returns a session ID as a cookie. On every subsequent request the server looks up the session by ID.
This breaks down in distributed systems. With N application servers behind a load balancer, you need either sticky sessions (defeats the purpose of load balancing) or a shared session store (Redis/DB — adds latency and a single point of failure on every request). JWT solves this by making the token self-contained: the server can validate it cryptographically without any backend lookup.
Auth Approaches Compared
| Approach | How It Works | Stateless? | Best For | Drawbacks |
|---|---|---|---|---|
| Session Cookie | Server stores the session and issues a session ID via Set-Cookie; the client returns it in the Cookie header | No | Traditional server-rendered apps | Shared store needed for horizontal scaling; CSRF risk |
| API Key | Static secret in header (X-API-Key) | Yes | Service-to-service, public developer APIs | No user identity; rotation is manual; leaked key = full access |
| JWT (Bearer Token) | Signed token with embedded claims in Authorization: Bearer | Yes | SPAs, mobile apps, microservices | Can't revoke individual tokens without a blocklist; payload is readable |
| OAuth 2.0 Opaque Token | Random string; resource server calls auth server to validate (token introspection) | No | When instant revocation is critical | Extra network hop per request; auth server is a bottleneck |
| mTLS (Mutual TLS) | Both client and server present X.509 certificates | Yes | Internal service mesh, zero-trust networks | Complex certificate management; not suitable for end users |
JWT Structure & Flow
A JWT is three base64url-encoded segments separated by dots: header.payload.signature.
// Header — algorithm and token type
{
"alg": "RS256",
"typ": "JWT",
"kid": "key-2026-02" // Key ID — lets the verifier pick the right public key
}
// Payload — claims (data)
{
"sub": "user_8f3a2b", // Subject (user ID)
"iss": "auth.example.com", // Issuer
"aud": "api.example.com", // Audience (intended recipient)
"exp": 1740000000, // Expires at (Unix timestamp)
"iat": 1739999100, // Issued at
"roles": ["editor"], // Custom claim — used for RBAC
"org_id": "org_xyz" // Custom claim — used for tenant isolation
}
// Signature — server verifies this; tampered tokens fail
// RSASHA256(base64url(header) + "." + base64url(payload), privateKey)
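To show how the three segments fit together, here is a minimal sketch that signs and verifies a token with the standard library only. It deliberately uses HS256 (a shared secret) instead of the RS256 shown in the header above, because RSA signing needs a third-party library. The function names and the secret are illustrative. In production, prefer a maintained JWT library over hand-rolled code.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def sign_hs256(claims: dict, secret: bytes) -> str:
    """Produce header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header, separators=(",", ":")).encode())
        + "."
        + b64url(json.dumps(claims, separators=(",", ":")).encode())
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the signature and return the claims; raise ValueError on any mismatch."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    header = json.loads(b64url_decode(header_b64))
    # Pin the algorithm; never let the token choose how it is verified.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))

token = sign_hs256({"sub": "user_8f3a2b", "roles": ["editor"]}, b"demo-secret")
claims = verify_hs256(token, b"demo-secret")
```

The payload is only encoded, not encrypted: anyone holding the token can read the claims, so the signature protects integrity, not confidentiality.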
Token lifecycle:
- Client authenticates (username/password, SSO, social login).
- Auth server issues an access token (short-lived, 15 min) and a refresh token (long-lived, 7–30 days, stored server-side).
- Client sends access token on every request: Authorization: Bearer <token>.
- Resource server validates signature, then checks exp, iss, and aud, with no DB lookup.
- On expiry, client uses the refresh token to get a new access token (refresh token rotation recommended).
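The claim checks the resource server performs after signature verification can be sketched as follows. The `validate_claims` function, its parameters, and the 30-second clock-skew leeway are assumptions for illustration. The sample values are taken from the payload shown earlier.

```python
import time
from typing import Optional

def validate_claims(claims: dict, *, issuer: str, audience: str,
                    now: Optional[float] = None, leeway: int = 30) -> None:
    """Raise ValueError unless exp, iss and aud are all acceptable."""
    now = time.time() if now is None else now
    # exp is a Unix timestamp; allow a small leeway for clock skew between servers.
    if "exp" not in claims or now > claims["exp"] + leeway:
        raise ValueError("token expired")
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    # aud may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise ValueError("token not intended for this API")

sample_claims = {
    "sub": "user_8f3a2b",
    "iss": "auth.example.com",
    "aud": "api.example.com",
    "exp": 1740000000,
    "iat": 1739999100,
}

# Passes: the supplied clock value is before exp.
validate_claims(sample_claims, issuer="auth.example.com",
                audience="api.example.com", now=1739999500)
```

None of these checks touch a database, which is what keeps the resource server stateless. The trade-off is that a token stays valid until `exp` unless a blocklist is added.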
Signing Algorithms
| Algorithm | Type | How It Works | When to Use |
|---|---|---|---|
| HS256 | Symmetric (HMAC) | Same secret signs and verifies | Single-service apps; simple setups |
| RS256 | Asymmetric (RSA) | Private key signs; public key verifies | Microservices: only the auth service has the private key, all services verify with the public key via JWKS |
| ES256 | Asymmetric (ECDSA) | Same as RS256 but smaller keys (256-bit vs 2048-bit) | Modern default: smaller keys and signatures, faster signing, comparable security |
Auth servers typically publish their public keys at a JWKS endpoint (/.well-known/jwks.json). Resource servers fetch and cache these keys. The kid parameter in the JWT header tells the verifier which key to use, which enables key rotation without downtime.
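One way the fetch-and-cache behaviour could look is sketched below. The `JwksCache` class and its `fetch` callable are hypothetical; in a real service `fetch` would download the JWKS document over HTTPS, and a production cache would also rate-limit refetches so that tokens with bogus `kid` values cannot trigger a request storm.

```python
from typing import Callable


class JwksCache:
    """Caches JWKS keys by kid and refetches once when an unknown kid appears."""

    def __init__(self, fetch: Callable[[], dict]):
        self._fetch = fetch  # hypothetical callable returning {"keys": [...]}
        self._keys: dict[str, dict] = {}

    def _reload(self) -> None:
        document = self._fetch()
        self._keys = {k["kid"]: k for k in document.get("keys", []) if "kid" in k}

    def get(self, kid: str) -> dict:
        """Return the JWK for this kid, or raise KeyError if it is not published."""
        if kid not in self._keys:
            # The key may have been rotated in since the last fetch
            self._reload()
        if kid not in self._keys:
            raise KeyError(f"unknown kid: {kid}")
        return self._keys[kid]
```

During rotation the auth server publishes the new key alongside the old one, so tokens signed with either key keep verifying until the old key is retired.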
JWT + OAuth 2.0 + OIDC
These three are frequently confused. They solve different problems at different layers:
| Concept | What It Is | Analogy |
|---|---|---|
| JWT | A token format — a way to encode and sign claims | The passport document format |
| OAuth 2.0 | An authorization framework — defines how to obtain tokens (authorization code flow, client credentials, etc.) | The process for getting a passport |
| OIDC | An identity layer on top of OAuth 2.0 — adds authentication (who you are) via ID tokens | The identity verification step within the passport process |
How they fit together:
- OAuth 2.0 doesn't mandate a token format — tokens can be opaque strings or JWTs.
- In practice, most identity providers (Auth0, Okta, Cognito, Keycloak) issue JWTs as access tokens.
- OIDC requires the ID token to be a JWT (so the client can read user info without an extra call).
- Your API typically only cares about the access token — validate its signature and read claims.
OAuth 2.0 Grant Types Quick Reference
| Grant Type | Use Case | Flow |
|---|---|---|
| Authorization Code + PKCE | SPAs, mobile apps, server-side web apps | Redirect to IdP → user authenticates → redirect back with code → exchange code for tokens |
| Client Credentials | Service-to-service (no user) | Service sends client_id + client_secret directly → gets access token |
| Device Code | Smart TVs, CLI tools (no browser) | Device shows code → user enters code on phone/desktop → device polls for token |
| Refresh Token | Renewing expired access tokens | Send refresh token → get new access token (+ optionally new refresh token) |
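The PKCE part of the Authorization Code flow comes down to two values: a random code verifier the client keeps private, and a code challenge derived from it that is sent in the initial redirect. A minimal sketch of the S256 method follows; the function names are illustrative assumptions.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random URL-safe verifier; 32 random bytes encode to 43 characters."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the verifier with the code exchange; the IdP recomputes the hash, so an intercepted authorization code is useless without the verifier.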
RBAC & Authorization Models
Authentication (who are you?) and authorization (what can you do?) are separate concerns. JWT handles the transport — it carries identity and role claims. The authorization model decides how to use those claims.
| Model | How It Works | Granularity | Example |
|---|---|---|---|
| RBAC (Role-Based) | Users are assigned roles; roles have permissions | Coarse: role-level | admin can delete any order; viewer can only read |
| ABAC (Attribute-Based) | Policies evaluate attributes of user, resource, and environment | Fine: attribute-level | "Allow if user.department == resource.department AND time < 18:00" |
| ReBAC (Relationship-Based) | Authorization based on relationships in a graph (e.g., Zanzibar/SpiceDB) | Fine: relationship-level | "User is an editor of this document" (Google Docs model) |
Most APIs start with RBAC because it's the simplest to implement. The pattern:
- Auth server includes "roles": ["admin"] in the JWT payload at login time.
- API middleware reads the JWT, extracts roles, and checks against the endpoint's required role.
- For resource-level authorization (e.g., "can this user edit this order?"), you still need a DB lookup, because JWT roles alone aren't enough. This is where ownership checks live.
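The split between the role check and the ownership check can be sketched framework-independently. The `Principal` type and both function names below are assumptions made for illustration; in a real API the principal would be built from verified JWT claims and the order owner would come from the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Identity extracted from a verified token (hypothetical shape)."""
    user_id: str
    roles: frozenset[str]


def has_any_role(principal: Principal, *required: str) -> bool:
    """Coarse, endpoint-level check driven purely by the token's role claims."""
    return not principal.roles.isdisjoint(required)


def can_access_order(principal: Principal, order_owner_id: str) -> bool:
    """Resource-level check: admins see everything, others only their own orders.

    order_owner_id has to be loaded from storage; it is not in the token.
    """
    return has_any_role(principal, "admin") or principal.user_id == order_owner_id
```

Keeping the two checks separate makes it obvious which decisions can be made from the token alone and which require loading the resource first.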
JWT: Issue, Validate, Refresh — Implementation
Below are production patterns for JWT validation and RBAC enforcement in all three frameworks.
- Do not store sensitive data in JWT payload — it is base64-encoded, not encrypted (use JWE for encryption).
- Do not use long-lived access tokens. Keep them short (15 min) and use refresh tokens.
- Always validate exp, iss, and aud claims, not just the signature.
- Store refresh tokens in HttpOnly cookies, not localStorage (XSS protection).
Java — Spring Security JWT Resource Server
// build.gradle: implementation 'org.springframework.boot:spring-boot-starter-oauth2-resource-server'
@Configuration
@EnableWebSecurity
@EnableMethodSecurity
public class SecurityConfig {
@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
return http
.csrf(AbstractHttpConfigurer::disable) // Stateless API — no CSRF needed
.sessionManagement(s -> s.sessionCreationPolicy(STATELESS))
.authorizeHttpRequests(auth -> auth
.requestMatchers("/api/v1/auth/**").permitAll()
.requestMatchers(GET, "/api/v1/products/**").permitAll()
.requestMatchers("/actuator/health").permitAll()
.anyRequest().authenticated()
)
.oauth2ResourceServer(oauth2 -> oauth2
.jwt(jwt -> jwt.jwtAuthenticationConverter(jwtAuthenticationConverter()))
)
.build();
}
@Bean
public JwtAuthenticationConverter jwtAuthenticationConverter() {
JwtGrantedAuthoritiesConverter converter = new JwtGrantedAuthoritiesConverter();
converter.setAuthoritiesClaimName("roles");
// Matching is case-sensitive: hasRole('ADMIN') needs the claim value "ADMIN", not "admin"
converter.setAuthorityPrefix("ROLE_");
JwtAuthenticationConverter jwtConverter = new JwtAuthenticationConverter();
jwtConverter.setJwtGrantedAuthoritiesConverter(converter);
return jwtConverter;
}
}
// RBAC at the method level
@RestController
@RequestMapping("/api/v1/admin")
public class AdminController {
@GetMapping("/users")
@PreAuthorize("hasRole('ADMIN')")
public ResponseEntity<List<UserDto>> listUsers() {
return ResponseEntity.ok(userService.findAll());
}
// Resource-level authorization: user can only access their own data
@GetMapping("/orders/{orderId}")
@PreAuthorize("hasRole('ADMIN') or @orderAuthService.isOwner(#orderId, authentication)")
public ResponseEntity<OrderDto> getOrder(@PathVariable UUID orderId) {
return ResponseEntity.ok(orderService.findById(orderId));
}
}
Python — FastAPI JWT middleware
from fastapi import Depends, HTTPException, Security, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
import jwt # PyJWT
security = HTTPBearer()
def decode_token(token: str) -> dict:
try:
payload = jwt.decode(
token,
settings.JWT_PUBLIC_KEY,
algorithms=["RS256"],
audience="api.example.com",
issuer="auth.example.com",
)
return payload
except jwt.ExpiredSignatureError:
raise HTTPException(status_code=401, detail="Token has expired")
except jwt.InvalidTokenError as e:
raise HTTPException(status_code=401, detail=f"Invalid token: {e}")
async def get_current_user(
credentials: HTTPAuthorizationCredentials = Security(security),
db: AsyncSession = Depends(get_db),
) -> User:
payload = decode_token(credentials.credentials)
user = await user_service.find_by_id(db, UUID(payload["sub"]))
if not user or not user.is_active:
raise HTTPException(status_code=401, detail="User not found or inactive")
return user
def require_role(*roles: str):
"""Factory for role-based dependency injection."""
async def dependency(current_user: User = Depends(get_current_user)) -> User:
if current_user.role not in roles:
raise HTTPException(
status_code=403,
detail=f"Requires role: {', '.join(roles)}"
)
return current_user
return dependency
# Usage
@router.get("/admin/users", response_model=list[UserDto])
async def list_users(current_user: User = Depends(require_role("admin"))):
return await user_service.find_all()
@router.get("/orders/{order_id}", response_model=OrderDto)
async def get_order(
order_id: UUID,
current_user: User = Depends(get_current_user),
db: AsyncSession = Depends(get_db),
):
order = await order_service.find_by_id(db, order_id)
if order is None:
raise HTTPException(status_code=404, detail="Order not found")
# Resource-level auth: admins see all, users see own
if current_user.role != "admin" and order.user_id != current_user.id:
raise HTTPException(status_code=403, detail="Access denied")
return order
Node.js — Express JWT middleware
import { createRemoteJWKSet, jwtVerify } from 'jose';
// Verify JWT using JWKS endpoint (production approach)
const JWKS = createRemoteJWKSet(new URL('https://auth.example.com/.well-known/jwks.json'));
export async function authenticate(req, res, next) {
const authHeader = req.headers.authorization;
if (!authHeader?.startsWith('Bearer ')) {
return res.status(401).json({
type: 'https://api.example.com/errors/unauthorized',
title: 'Unauthorized',
status: 401,
detail: 'Missing or invalid Authorization header.',
});
}
try {
const token = authHeader.slice(7);
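const { payload } = await jwtVerify(token, JWKS, {
issuer: 'auth.example.com', // must match the token's iss claim exactly
audience: 'api.example.com',
});
// The example payload carries a roles array; this sketch uses its first entry
req.user = { id: payload.sub, role: payload.roles?.[0], email: payload.email };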
next();
} catch (err) {
const isExpired = err.code === 'ERR_JWT_EXPIRED';
res.status(401).json({
title: isExpired ? 'Token Expired' : 'Invalid Token',
status: 401,
detail: isExpired ? 'Your session has expired. Please re-authenticate.' : 'The provided token is invalid.',
});
}
}
// RBAC middleware factory
export function authorize(...roles) {
return (req, res, next) => {
if (!req.user) return res.status(401).json({ status: 401, title: 'Unauthorized' });
if (!roles.includes(req.user.role)) {
return res.status(403).json({
title: 'Forbidden',
status: 403,
detail: `This endpoint requires one of the following roles: ${roles.join(', ')}`,
});
}
next();
};
}
// Usage
router.get('/admin/users', authenticate, authorize('admin'), listUsersHandler);
router.get('/orders/:orderId', authenticate, getOrderHandler);
// Resource-level auth inside handler
async function getOrderHandler(req, res, next) {
try {
const order = await orderService.findById(req.params.orderId);
if (!order) return res.status(404).json({ status: 404, title: 'Not Found' });
if (req.user.role !== 'admin' && order.userId !== req.user.id) {
return res.status(403).json({ status: 403, title: 'Forbidden' });
}
res.json(order);
} catch (err) { next(err); }
}
Security Headers Table
| Header | Value Example | Purpose |
|---|---|---|
| Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS for 1 year |
| X-Content-Type-Options | nosniff | Prevent MIME type sniffing |
| X-Frame-Options | DENY | Prevent clickjacking |
| Content-Security-Policy | default-src 'self' | Restrict resource origins |
| Referrer-Policy | strict-origin-when-cross-origin | Control referrer information leakage |
| Permissions-Policy | camera=(), microphone=() | Disable browser features not needed |
| Cache-Control | no-store | On auth endpoints, prevent caching of tokens |
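A framework-neutral sketch of applying these headers is shown below. The function name `with_security_headers` and the `/api/v1/auth/` path prefix are assumptions for illustration; in practice this logic would live in a middleware or filter of whichever framework you use.

```python
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "default-src 'self'",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=()",
}


def with_security_headers(headers: dict[str, str], path: str) -> dict[str, str]:
    """Return a copy of the response headers with the security defaults applied."""
    # Values already set by the handler take precedence over the defaults
    merged = {**SECURITY_HEADERS, **headers}
    if path.startswith("/api/v1/auth/"):
        # Token responses must never be stored by browsers or intermediaries
        merged["Cache-Control"] = "no-store"
    return merged
```

Applying the defaults in one place means a new endpoint cannot ship without them by accident.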
8. Versioning Strategies
| Strategy | URL Example | Pros | Cons |
|---|---|---|---|
| URL Path | /api/v1/orders | Visible, easy to route, works with every client | Version is in the URI, which REST purists argue violates resource identity |
| Header | Accept: application/vnd.myapp.v1+json | Clean URIs, follows HTTP spec (content negotiation) | Hard to test in browser, less visible, CDN caching requires Vary: Accept |
| Query Param | /api/orders?version=1 | Simple to add to existing URL | Easy to forget, clutters URLs, bad for caching |
| Date-based (Stripe) | Stripe-Version: 2024-12-18 | Fine-grained control; users opt in to changes on their own schedule | Complex server logic (multiple code paths per date); only realistic for large teams |
URL Versioning Implementation
Java — Spring Boot
// Approach 1: Separate controller classes per version
@RestController
@RequestMapping("/api/v1/orders")
public class OrderControllerV1 { /* v1 implementation */ }
@RestController
@RequestMapping("/api/v2/orders")
public class OrderControllerV2 { /* v2 with breaking changes */ }
// Approach 2: Single controller, route-level versioning
@RestController
@RequestMapping("/api")
public class OrderController {
@GetMapping("/v1/orders/{id}")
public ResponseEntity<OrderDtoV1> getOrderV1(@PathVariable UUID id) {
return ResponseEntity.ok(orderMapper.toDtoV1(orderService.findById(id)));
}
@GetMapping("/v2/orders/{id}")
public ResponseEntity<OrderDtoV2> getOrderV2(@PathVariable UUID id) {
// V2 adds expanded items array, deprecates "total" in favor of "amount"
return ResponseEntity.ok(orderMapper.toDtoV2(orderService.findById(id)));
}
}
// Deprecation header for sunset planning
@GetMapping("/v1/orders/{id}")
public ResponseEntity<OrderDtoV1> getOrderV1(@PathVariable UUID id) {
return ResponseEntity.ok()
.header("Deprecation", "true")
.header("Sunset", "Sat, 31 Dec 2026 23:59:59 GMT")
.header("Link", "</api/v2/orders>; rel=\"successor-version\"")
.body(orderMapper.toDtoV1(orderService.findById(id)));
}
Python — FastAPI
from fastapi import APIRouter, Depends, FastAPI, Request
app = FastAPI()
# Separate routers per version
v1_router = APIRouter(prefix="/api/v1")
v2_router = APIRouter(prefix="/api/v2")
@v1_router.get("/orders/{order_id}", response_model=OrderDtoV1)
async def get_order_v1(order_id: UUID, db: AsyncSession = Depends(get_db)):
return await order_service.find_by_id_v1(db, order_id)
@v2_router.get("/orders/{order_id}", response_model=OrderDtoV2)
async def get_order_v2(order_id: UUID, db: AsyncSession = Depends(get_db)):
return await order_service.find_by_id_v2(db, order_id)
app.include_router(v1_router)
app.include_router(v2_router)
# Add deprecation headers via middleware on v1 routes
@app.middleware("http")
async def deprecation_header_middleware(request: Request, call_next):
response = await call_next(request)
if request.url.path.startswith("/api/v1/"):
response.headers["Deprecation"] = "true"
response.headers["Sunset"] = "Sat, 31 Dec 2026 23:59:59 GMT"
return response
Node.js — Express
import { Router } from 'express';
// v1 and v2 routers
const v1Router = Router();
const v2Router = Router();
// Deprecation middleware for v1
const deprecationWarning = (req, res, next) => {
res.setHeader('Deprecation', 'true');
res.setHeader('Sunset', 'Sat, 31 Dec 2026 23:59:59 GMT');
res.setHeader('Link', '</api/v2>; rel="successor-version"');
next();
};
v1Router.use(deprecationWarning);
v1Router.get('/orders/:id', async (req, res, next) => {
try {
const order = await orderService.findByIdV1(req.params.id);
res.json(order);
} catch (err) { next(err); }
});
v2Router.get('/orders/:id', async (req, res, next) => {
try {
const order = await orderService.findByIdV2(req.params.id);
res.json(order);
} catch (err) { next(err); }
});
// Mount versioned routers
app.use('/api/v1', v1Router);
app.use('/api/v2', v2Router);
9. Rate Limiting & Throttling
Rate limiting protects your API from abuse, ensures fair usage across tenants, and prevents a single client from overwhelming downstream services. The token bucket algorithm is the most practical to implement and reason about.
Token Bucket Algorithm
Each API key starts with a bucket of N tokens. Each request consumes one token. Tokens refill at a fixed rate (e.g., 100 tokens/minute) up to the capacity N. Requests arriving when the bucket is empty receive a 429. Unlike a fixed window, a token bucket permits short bursts of up to N requests while still enforcing the long-term rate.
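The algorithm fits in a few lines. The sketch below is a single-process, in-memory version for illustration only: the class name `TokenBucket` and the injectable `clock` are assumptions, and a multi-instance deployment would need shared storage such as Redis.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start full, so an initial burst is allowed
        self._last = clock()

    def try_consume(self, cost: int = 1) -> bool:
        """Take `cost` tokens if available; False means the caller should return 429."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Credit tokens earned since the last call, never exceeding capacity
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A limit of 100 requests per minute corresponds to `refill_per_second=100 / 60`; the capacity decides how large a burst is tolerated on top of that.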
Response Headers
| Header | Value Example | Meaning |
|---|---|---|
| X-RateLimit-Limit | 1000 | Max requests per window |
| X-RateLimit-Remaining | 742 | Requests remaining in current window |
| X-RateLimit-Reset | 1708992000 | Unix timestamp when window resets |
| Retry-After | 47 | Seconds to wait (on 429 response) |
429 Response Body
{
"type": "https://api.example.com/errors/rate-limit-exceeded",
"title": "Too Many Requests",
"status": 429,
"detail": "You have exceeded your rate limit of 1000 requests per minute.",
"retry_after": 47
}
Redis-Backed Rate Limiter
Java — Spring Boot (Redis + Bucket4j)
// build.gradle: implementation 'com.bucket4j:bucket4j-redis:8.7.0'
@Component
@RequiredArgsConstructor
public class RateLimitFilter extends OncePerRequestFilter {
private final RedissonClient redissonClient;
// Token bucket: capacity 1000, refilled with 1000 tokens each minute, keyed by API key (IP as fallback)
private Bucket resolveBucket(String key) {
ProxyManager<String> proxyManager = Bucket4jRedisson.casBasedBuilder(redissonClient)
.build();
BucketConfiguration config = BucketConfiguration.builder()
.addLimit(Bandwidth.classic(1000, Refill.intervally(1000, Duration.ofMinutes(1))))
.build();
return proxyManager.builder().build(key, () -> config);
}
@Override
protected void doFilterInternal(HttpServletRequest request,
HttpServletResponse response,
FilterChain chain) throws ServletException, IOException {
String key = resolveKey(request); // API key from header, or IP fallback
Bucket bucket = resolveBucket(key);
ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
response.addHeader("X-RateLimit-Limit", "1000");
response.addHeader("X-RateLimit-Remaining", String.valueOf(probe.getRemainingTokens()));
if (probe.isConsumed()) {
chain.doFilter(request, response);
} else {
long waitSeconds = probe.getNanosToWaitForRefill() / 1_000_000_000;
response.addHeader("Retry-After", String.valueOf(waitSeconds));
response.setStatus(429);
response.setContentType("application/json");
response.getWriter().write("""
{"type":"https://api.example.com/errors/rate-limit-exceeded",
"status":429,"title":"Too Many Requests",
"retry_after":%d}""".formatted(waitSeconds));
}
}
private String resolveKey(HttpServletRequest request) {
String apiKey = request.getHeader("X-API-Key");
return apiKey != null ? "apikey:" + apiKey : "ip:" + request.getRemoteAddr();
}
}
Python — FastAPI (Redis + sliding window)
import redis.asyncio as redis
import time
from fastapi import Depends, HTTPException, Request
redis_client = redis.from_url("redis://localhost:6379", decode_responses=True)
def rate_limiter(limit: int = 1000, window_seconds: int = 60):
"""Sliding window rate limiter using Redis sorted sets."""
async def _limit(request: Request):
# Prefer API key, fall back to IP
key = request.headers.get("X-API-Key") or request.client.host
redis_key = f"ratelimit:{key}"
now = time.time()
window_start = now - window_seconds
pipe = redis_client.pipeline()
# Remove timestamps outside the current window
pipe.zremrangebyscore(redis_key, 0, window_start)
# Count requests in window
pipe.zcard(redis_key)
# Add current request timestamp
pipe.zadd(redis_key, {str(now): now})
# Set expiry to avoid orphan keys
pipe.expire(redis_key, window_seconds * 2)
results = await pipe.execute()
count = results[1]
remaining = max(0, limit - count - 1)
reset_at = int(now) + window_seconds
request.state.rate_limit_remaining = remaining
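if count >= limit:
# Wait until the oldest request in the window ages out
oldest = await redis_client.zrange(redis_key, 0, 0, withscores=True)
retry_after = int(oldest[0][1] + window_seconds - now) + 1 if oldest else window_seconds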
raise HTTPException(
status_code=429,
detail={
"type": "https://api.example.com/errors/rate-limit-exceeded",
"title": "Too Many Requests",
"status": 429,
"retry_after": max(retry_after, 1),
},
headers={
"X-RateLimit-Limit": str(limit),
"X-RateLimit-Remaining": "0",
"X-RateLimit-Reset": str(reset_at),
"Retry-After": str(max(retry_after, 1)),
},
)
return _limit
# Usage — apply per route or globally via middleware
@router.get("/orders", dependencies=[Depends(rate_limiter(limit=100, window_seconds=60))])
async def list_orders(): ...
Node.js — Express (rate-limiter-flexible)
import { RateLimiterRedis } from 'rate-limiter-flexible';
import { createClient } from 'redis';
const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();
// Per-API-key: 1000/min; per-IP: 100/min
const apiKeyLimiter = new RateLimiterRedis({
storeClient: redisClient,
useRedisPackage: true, // needed when storeClient comes from the `redis` package (v4+)
keyPrefix: 'rl_apikey',
points: 1000, // requests
duration: 60, // per 60 seconds
});
const ipLimiter = new RateLimiterRedis({
storeClient: redisClient,
useRedisPackage: true,
keyPrefix: 'rl_ip',
points: 100,
duration: 60,
});
export async function rateLimitMiddleware(req, res, next) {
const apiKey = req.headers['x-api-key'];
const limiter = apiKey ? apiKeyLimiter : ipLimiter;
const key = apiKey || req.ip;
const limit = apiKey ? 1000 : 100;
try {
const result = await limiter.consume(key);
res.setHeader('X-RateLimit-Limit', limit);
res.setHeader('X-RateLimit-Remaining', result.remainingPoints);
res.setHeader('X-RateLimit-Reset', Math.ceil(Date.now() / 1000 + result.msBeforeNext / 1000));
next();
} catch (err) {
const retryAfter = Math.ceil(err.msBeforeNext / 1000);
res.setHeader('X-RateLimit-Limit', limit);
res.setHeader('X-RateLimit-Remaining', 0);
res.setHeader('Retry-After', retryAfter);
res.status(429).json({
type: 'https://api.example.com/errors/rate-limit-exceeded',
title: 'Too Many Requests',
status: 429,
detail: 'Rate limit exceeded.',
retry_after: retryAfter,
});
}
}
10. Caching
Cache-Control & Conditional Requests
| Mechanism | Header | Direction | Purpose |
|---|---|---|---|
| Cache duration | Cache-Control: max-age=300 | Response | Cache for 5 minutes |
| No cache | Cache-Control: no-store | Response | Never cache (use for auth endpoints) |
| Revalidate | Cache-Control: no-cache | Response | Cache but always validate with server |
| ETag | ETag: "abc123def456" | Response | Version fingerprint of the resource |
| Conditional GET | If-None-Match: "abc123def456" | Request | Return 304 if ETag unchanged |
| Last-Modified | Last-Modified: Tue, 10 Feb 2026 15:00:00 GMT | Response | Timestamp-based validation |
| Conditional GET | If-Modified-Since: Tue, 10 Feb 2026 15:00:00 GMT | Request | Return 304 if unchanged since timestamp |
ETag + Redis Caching Implementation
Java — Spring Boot
@RestController
@RequestMapping("/api/v1")
@RequiredArgsConstructor
public class ProductController {
private final ProductService productService;
private final RedisTemplate<String, String> redisTemplate;
private final ObjectMapper objectMapper; // Jackson mapper, injected by Spring
@GetMapping("/products/{productId}")
public ResponseEntity<ProductDto> getProduct(
@PathVariable UUID productId,
@RequestHeader(value = "If-None-Match", required = false) String ifNoneMatch)
throws JsonProcessingException {
// Try cache first
String cacheKey = "product:" + productId;
String cachedJson = redisTemplate.opsForValue().get(cacheKey);
String currentEtag;
if (cachedJson != null) {
currentEtag = "\"" + DigestUtils.md5DigestAsHex(cachedJson.getBytes()) + "\"";
if (currentEtag.equals(ifNoneMatch)) {
return ResponseEntity.status(HttpStatus.NOT_MODIFIED)
.eTag(currentEtag)
.build();
}
return ResponseEntity.ok()
.eTag(currentEtag)
.cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePublic())
.body(objectMapper.readValue(cachedJson, ProductDto.class));
}
// Cache miss — fetch from DB and cache
ProductDto product = productService.findById(productId);
String json = objectMapper.writeValueAsString(product);
redisTemplate.opsForValue().set(cacheKey, json, Duration.ofMinutes(5));
currentEtag = "\"" + DigestUtils.md5DigestAsHex(json.getBytes()) + "\"";
return ResponseEntity.ok()
.eTag(currentEtag)
.cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePublic())
.body(product);
}
// Invalidate cache on write
@PutMapping("/products/{productId}")
public ResponseEntity<ProductDto> updateProduct(
@PathVariable UUID productId,
@Valid @RequestBody UpdateProductRequest request) {
ProductDto updated = productService.update(productId, request);
redisTemplate.delete("product:" + productId);
return ResponseEntity.ok(updated);
}
}
Python — FastAPI
import hashlib
import json
from fastapi import Depends, HTTPException, Request, Response
from redis.asyncio import Redis
@router.get("/products/{product_id}", response_model=ProductDto)
async def get_product(
product_id: UUID,
request: Request,
response: Response,
redis: Redis = Depends(get_redis),
db: AsyncSession = Depends(get_db),
):
cache_key = f"product:{product_id}"
cached = await redis.get(cache_key)
if cached:
etag = f'"{hashlib.md5(cached).hexdigest()}"'
if request.headers.get("if-none-match") == etag:
return Response(status_code=304, headers={"ETag": etag})
response.headers["ETag"] = etag
response.headers["Cache-Control"] = "public, max-age=300"
return ProductDto.model_validate_json(cached)
product = await product_service.find_by_id(db, product_id)
if not product:
raise HTTPException(status_code=404, detail="Product not found")
product_json = product.model_dump_json().encode()
await redis.setex(cache_key, 300, product_json) # 5 min TTL
etag = f'"{hashlib.md5(product_json).hexdigest()}"'
response.headers["ETag"] = etag
response.headers["Cache-Control"] = "public, max-age=300"
return product
@router.put("/products/{product_id}", response_model=ProductDto)
async def update_product(
product_id: UUID,
request: UpdateProductRequest,
redis: Redis = Depends(get_redis),
db: AsyncSession = Depends(get_db),
):
updated = await product_service.update(db, product_id, request)
await redis.delete(f"product:{product_id}") # Invalidate cache
return updated
Node.js — Express
import crypto from 'crypto';
router.get('/products/:productId', async (req, res, next) => {
try {
const cacheKey = `product:${req.params.productId}`;
const cached = await redis.get(cacheKey);
if (cached) {
const etag = `"${crypto.createHash('md5').update(cached).digest('hex')}"`;
if (req.headers['if-none-match'] === etag) {
return res.status(304).setHeader('ETag', etag).send();
}
return res
.setHeader('ETag', etag)
.setHeader('Cache-Control', 'public, max-age=300')
.json(JSON.parse(cached));
}
const product = await productService.findById(req.params.productId);
if (!product) return res.status(404).json({ status: 404, title: 'Not Found' });
const json = JSON.stringify(product);
await redis.setEx(cacheKey, 300, json); // 5 min TTL
const etag = `"${crypto.createHash('md5').update(json).digest('hex')}"`;
res
.setHeader('ETag', etag)
.setHeader('Cache-Control', 'public, max-age=300')
.json(product);
} catch (err) { next(err); }
});
11. File Uploads & Streaming
Multipart Upload
Java — Spring Boot
@RestController
@RequestMapping("/api/v1")
public class UploadController {
// application.yml: spring.servlet.multipart.max-file-size=10MB
@PostMapping(value = "/products/{productId}/images",
consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
public ResponseEntity<ImageDto> uploadImage(
@PathVariable UUID productId,
@RequestParam("file") MultipartFile file,
@RequestParam(value = "alt", required = false) String altText) throws IOException {
// Check the declared MIME type first; it is client-supplied, so the magic-byte check below verifies it
String contentType = file.getContentType();
if (!List.of("image/jpeg", "image/png", "image/webp").contains(contentType)) {
throw new ValidationException("Unsupported image type: " + contentType);
}
if (file.getSize() > 10 * 1024 * 1024) { // 10 MB
throw new ValidationException("File exceeds maximum size of 10MB");
}
// Validate actual file header (magic bytes); 12 bytes are needed because WebP's marker sits at offset 8
byte[] header = Arrays.copyOf(file.getBytes(), 12);
if (!isSupportedImageHeader(header)) {
throw new ValidationException("File content does not match declared type");
}
ImageDto result = imageService.upload(productId, file.getInputStream(), contentType, altText);
return ResponseEntity.status(201).body(result);
}
}
Python — FastAPI
from fastapi import Depends, File, Form, HTTPException, UploadFile, status
import magic # python-magic library
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_SIZE = 10 * 1024 * 1024 # 10 MB
@router.post("/products/{product_id}/images",
response_model=ImageDto,
status_code=status.HTTP_201_CREATED)
async def upload_image(
product_id: UUID,
file: UploadFile = File(...),
alt: str | None = Form(None),
db: AsyncSession = Depends(get_db),
current_user: User = Depends(get_current_user),
):
# Read first chunk to detect MIME via magic bytes
header_bytes = await file.read(1024)
await file.seek(0) # Reset for subsequent read
mime = magic.from_buffer(header_bytes, mime=True)
if mime not in ALLOWED_TYPES:
raise HTTPException(status_code=415, detail=f"Unsupported media type: {mime}")
# Enforce the size limit (this reads the whole file into memory; read in chunks for larger limits)
contents = await file.read()
if len(contents) > MAX_SIZE:
raise HTTPException(status_code=413, detail="File exceeds 10MB limit")
return await image_service.upload(db, product_id, contents, mime, alt)
Presigned URLs (S3 Pattern)
For large uploads, avoid routing files through your application server. Instead, issue a short-lived presigned URL directly to the S3/GCS bucket. The client uploads directly to object storage, and your server only handles metadata.
# FastAPI presigned upload flow
from uuid import UUID, uuid4

import boto3
from fastapi import Depends, Query

s3_client = boto3.client("s3", region_name="us-east-1")


@router.post("/products/{product_id}/image-upload-url")
async def create_upload_url(
    product_id: UUID,
    content_type: str = Query(..., pattern="^image/(jpeg|png|webp)$"),
    current_user: User = Depends(get_current_user),
    db: AsyncSession = Depends(get_db),
):
    key = f"products/{product_id}/images/{uuid4()}"
    # A presigned PUT cannot enforce a size range. If S3 itself must enforce
    # the 10 MB limit, use generate_presigned_post with a
    # ["content-length-range", 1, 10 * 1024 * 1024] condition instead.
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={
            "Bucket": settings.S3_BUCKET,
            "Key": key,
            "ContentType": content_type,
        },
        ExpiresIn=300,  # 5 minutes
    )
    # Record pending upload in DB, confirm once client POSTs back
    await image_service.create_pending(db, product_id, key, current_user.id)
    return {"upload_url": url, "key": key, "expires_in": 300}
Server-Sent Events (SSE) & NDJSON Streaming
# FastAPI — streaming NDJSON for large exports
from fastapi.responses import StreamingResponse
import json
@router.get("/orders/export")
async def export_orders(
current_user: User = Depends(require_role("admin")),
db: AsyncSession = Depends(get_db),
):
async def generate():
async for order in order_service.stream_all(db):
yield order.model_dump_json() + "\n" # handles UUID and datetime fields, which json.dumps rejects
return StreamingResponse(
generate(),
media_type="application/x-ndjson",
headers={"Content-Disposition": "attachment; filename=orders.ndjson"},
)
# SSE — real-time order status updates
@router.get("/orders/{order_id}/events")
async def order_events(
order_id: UUID,
request: Request,
current_user: User = Depends(get_current_user),
):
async def event_stream():
async for event in order_service.subscribe(order_id):
if await request.is_disconnected():
break
yield f"data: {json.dumps(event)}\n\n"
return StreamingResponse(event_stream(), media_type="text/event-stream")
12. API Documentation (OpenAPI)
OpenAPI 3.1 Spec Structure
openapi: '3.1.0'
info:
  title: Orders API
  version: '1.0.0'
  description: |
    Manages the order lifecycle from placement to fulfillment.
    All timestamps are ISO 8601 UTC. Monetary values are integers in minor currency units (cents).
  contact:
    email: [email protected]
servers:
  - url: https://api.example.com/api/v1
    description: Production
security:
  - bearerAuth: []
paths:
  /orders/{orderId}:
    get:
      summary: Get an order by ID
      operationId: getOrder
      tags: [Orders]
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        '200':
          description: Order found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OrderDto'
        '404':
          $ref: '#/components/responses/NotFound'
        '401':
          $ref: '#/components/responses/Unauthorized'
components:
  schemas:
    OrderDto:
      type: object
      required: [id, status, total, currency, created_at]
      properties:
        id:
          type: string
          format: uuid
        status:
          type: string
          enum: [pending, processing, shipped, delivered, cancelled]
        total:
          type: integer
          description: Total in minor currency units (e.g., cents)
          example: 9999
        currency:
          type: string
          pattern: '^[A-Z]{3}$'
          example: USD
        created_at:
          type: string
          format: date-time
  responses:
    NotFound:
      description: Resource not found
      content:
        application/problem+json:
          schema:
            $ref: '#/components/schemas/ProblemDetail'
    Unauthorized:
      description: Missing or invalid authentication
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
Auto-Generation per Framework
| Framework | Library | Docs URL | Notes |
|---|---|---|---|
| Spring Boot 3 | springdoc-openapi-starter-webmvc-ui | /swagger-ui.html, /v3/api-docs | Use @Operation, @ApiResponse annotations to enrich docs |
| FastAPI | Built-in (Pydantic + FastAPI) | /docs (Swagger), /redoc, /openapi.json | Pydantic models auto-generate schemas; docstrings become descriptions |
| Express | swagger-jsdoc + swagger-ui-express | /api-docs | Write spec in JSDoc comments above routes; manual but flexible |
In production, disable the interactive docs and the schema endpoint (app = FastAPI(docs_url=None, redoc_url=None, openapi_url=None)) or protect them behind auth. The /openapi.json endpoint reveals your full API surface to anyone.
13. Testing APIs
Testing Pyramid for APIs
- Unit tests: Service layer logic, mapper/transformer functions, validators. No HTTP involved. Fast.
- Integration tests (slice tests): Controller + service + real DB (using Testcontainers). Full request/response cycle without network.
- Contract tests: Verify the API matches a consumer-defined contract (Pact, Spring Cloud Contract).
- E2E tests: Full stack with real HTTP client. Slow; use sparingly for critical paths.
Java — Spring Boot (MockMvc + Testcontainers)
@SpringBootTest
@AutoConfigureMockMvc
@Testcontainers
class OrderControllerIT {
@Container
static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16")
.withDatabaseName("testdb")
.withUsername("test")
.withPassword("test");
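@DynamicPropertySource
static void overrideDataSource(DynamicPropertyRegistry registry) {
registry.add("spring.datasource.url", postgres::getJdbcUrl);
registry.add("spring.datasource.username", postgres::getUsername);
registry.add("spring.datasource.password", postgres::getPassword);
}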
@Autowired
MockMvc mockMvc;
@Autowired
ObjectMapper objectMapper;
@Test
void createOrder_returnsCreated_withLocationHeader() throws Exception {
var request = new CreateOrderRequest(
List.of(new CreateOrderItemRequest(UUID.randomUUID(), 2)),
new ShippingAddressRequest("123 Main St", "New York", "NY", "10001", "US"),
"USD"
);
mockMvc.perform(post("/api/v1/orders")
.contentType(MediaType.APPLICATION_JSON)
.content(objectMapper.writeValueAsString(request))
.header("Authorization", "Bearer " + validJwt()))
.andExpect(status().isCreated())
.andExpect(header().exists("Location"))
.andExpect(jsonPath("$.status").value("pending"))
.andExpect(jsonPath("$.currency").value("USD"));
}
@Test
void createOrder_returnsBadRequest_whenItemsIsEmpty() throws Exception {
var request = Map.of("items", List.of(), "currency", "USD");
mockMvc.perform(post("/api/v1/orders")
.contentType(MediaType.APPLICATION_JSON)
.content(objectMapper.writeValueAsString(request))
.header("Authorization", "Bearer " + validJwt()))
.andExpect(status().isUnprocessableEntity())
.andExpect(jsonPath("$.status").value(422))
.andExpect(jsonPath("$.errors[0].field").value("items"));
}
@Test
void getOrder_returns401_whenNoToken() throws Exception {
mockMvc.perform(get("/api/v1/orders/{id}", UUID.randomUUID()))
.andExpect(status().isUnauthorized());
}
@Test
void getOrder_returns403_whenAccessingOtherUsersOrder() throws Exception {
UUID otherUserId = UUID.randomUUID();
Order order = createOrderForUser(otherUserId);
mockMvc.perform(get("/api/v1/orders/{id}", order.getId())
.header("Authorization", "Bearer " + validJwtForUser(UUID.randomUUID())))
.andExpect(status().isForbidden());
}
}
Python — FastAPI (pytest + httpx)
import pytest
import pytest_asyncio
from uuid import uuid4

from httpx import AsyncClient, ASGITransport

from app.main import app


@pytest_asyncio.fixture
async def client(db_session):
    """Fresh HTTP client per test; db_session (defined elsewhere) rolls its transaction back."""
    async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as c:
        yield c


@pytest.mark.asyncio
async def test_create_order_returns_201_with_location(client, auth_headers):
    payload = {
        "items": [{"product_id": str(uuid4()), "quantity": 2}],
        "shipping_address": {
            "street": "123 Main St", "city": "New York",
            "state": "NY", "zip": "10001", "country": "US"
        },
        "currency": "USD",
    }
    response = await client.post("/api/v1/orders", json=payload, headers=auth_headers)
    assert response.status_code == 201
    assert "location" in response.headers
    data = response.json()
    assert data["status"] == "pending"
    assert data["currency"] == "USD"


@pytest.mark.asyncio
async def test_create_order_422_when_items_empty(client, auth_headers):
    payload = {"items": [], "currency": "USD"}
    response = await client.post("/api/v1/orders", json=payload, headers=auth_headers)
    assert response.status_code == 422
    errors = response.json()["errors"]
    assert any(e["field"] == "items" for e in errors)


@pytest.mark.asyncio
async def test_pagination_cursor_is_stable(client, auth_headers, seed_orders):
    """Cursor must not skip or duplicate across pages."""
    seen_ids = set()
    cursor = None
    page_count = 0
    while True:
        params = {"size": 10}
        if cursor:
            params["cursor"] = cursor
        response = await client.get("/api/v1/orders", params=params, headers=auth_headers)
        assert response.status_code == 200
        body = response.json()
        for order in body["data"]:
            assert order["id"] not in seen_ids, "Duplicate order in paginated results"
            seen_ids.add(order["id"])
        cursor = body.get("next_cursor")
        page_count += 1
        if not body["has_more"]:
            break
    # Every seeded order was seen exactly once
    assert len(seen_ids) == len(seed_orders)
Node.js — Express (supertest + jest)
import { randomUUID } from 'node:crypto';
import request from 'supertest';
import app from '../src/app.js';
import { createTestUser, generateToken, seedOrders } from './helpers.js';

describe('POST /api/v1/orders', () => {
  let token;
  let userId;

  beforeEach(async () => {
    const user = await createTestUser();
    userId = user.id;
    token = generateToken(user);
  });

  it('returns 201 with Location header on valid request', async () => {
    const payload = {
      items: [{ productId: randomUUID(), quantity: 2 }],
      shippingAddress: { street: '123 Main St', city: 'New York', state: 'NY', zip: '10001', country: 'US' },
      currency: 'USD',
    };
    const res = await request(app)
      .post('/api/v1/orders')
      .set('Authorization', `Bearer ${token}`)
      .send(payload);
    expect(res.status).toBe(201);
    expect(res.headers.location).toMatch(/^\/api\/v1\/orders\/.+/);
    expect(res.body.status).toBe('pending');
    expect(res.body.currency).toBe('USD');
  });

  it('returns 422 when items array is empty', async () => {
    const res = await request(app)
      .post('/api/v1/orders')
      .set('Authorization', `Bearer ${token}`)
      .send({ items: [], currency: 'USD' });
    expect(res.status).toBe(422);
    expect(res.body.errors.some(e => e.field === 'items')).toBe(true);
  });

  it('returns 401 when Authorization header is missing', async () => {
    const res = await request(app).post('/api/v1/orders').send({ items: [] });
    expect(res.status).toBe(401);
  });
});

describe('GET /api/v1/orders (rate limiting)', () => {
  it('returns 429 after exceeding rate limit', async () => {
    // createTestUser() returns the user; the token is derived from it, as in the suite above
    const user = await createTestUser();
    const token = generateToken(user);
    // Exhaust limit
    const requests = Array.from({ length: 101 }, () =>
      request(app).get('/api/v1/orders').set('Authorization', `Bearer ${token}`)
    );
    const responses = await Promise.all(requests);
    const tooMany = responses.filter(r => r.status === 429);
    expect(tooMany.length).toBeGreaterThan(0);
    expect(tooMany[0].headers['retry-after']).toBeDefined();
  });
});
14. Performance & Production
Connection Pooling
| Framework | Library | Key Settings |
|---|---|---|
| Spring Boot | HikariCP (default) | spring.datasource.hikari.maximum-pool-size=20, minimum-idle=5, connection-timeout=3000 |
| FastAPI | asyncpg (via SQLAlchemy async) | pool_size=20, max_overflow=10, pool_pre_ping=True, pool_recycle=300 |
| Express | node-postgres (pg Pool) | max: 20, idleTimeoutMillis: 30000, connectionTimeoutMillis: 3000 |
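Avoid N+1 queries: load related rows with JOIN FETCH (JPA) or selectinload (SQLAlchemy) in a fixed number of queries instead of one query per parent row, or use a DataLoader pattern for GraphQL. Always check query logs under realistic load.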
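JOIN FETCH, selectinload, and DataLoader all target the N+1 query problem: one query for the parent rows plus one more per parent. The sketch below uses the standard-library `sqlite3` module and a made-up `orders` / `order_items` schema (both assumptions, not part of any framework above) to count queries for the per-row approach versus a batched `IN (...)` load, which is roughly what selectinload does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders (id, customer) VALUES (1, 'ada'), (2, 'bob'), (3, 'cy');
    INSERT INTO order_items (order_id, sku) VALUES (1, 'A'), (1, 'B'), (2, 'C');
""")


def load_items_n_plus_one(conn):
    """One query for the orders, then one extra query per order."""
    order_ids = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
    queries = 1
    items = {}
    for order_id in order_ids:
        rows = conn.execute(
            "SELECT sku FROM order_items WHERE order_id = ? ORDER BY id", (order_id,)
        ).fetchall()
        queries += 1
        items[order_id] = [sku for (sku,) in rows]
    return items, queries


def load_items_batched(conn):
    """One query for the orders, one query for all of their items."""
    order_ids = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
    # Only the placeholder list is interpolated; the values stay parameterized
    placeholders = ",".join("?" * len(order_ids))
    rows = conn.execute(
        f"SELECT order_id, sku FROM order_items WHERE order_id IN ({placeholders}) ORDER BY id",
        order_ids,
    ).fetchall()
    items = {order_id: [] for order_id in order_ids}
    for order_id, sku in rows:
        items[order_id].append(sku)
    return items, 2


per_row_items, per_row_queries = load_items_n_plus_one(conn)
batched_items, batched_queries = load_items_batched(conn)
```

With three orders the per-row version issues four queries and the batched version two; the gap grows linearly with the number of parent rows, which is why it tends to show up only under realistic data volumes.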
Health Checks & Graceful Shutdown
Java — Spring Boot (Actuator)
// build.gradle: implementation 'org.springframework.boot:spring-boot-starter-actuator'
// application.yml:
//   management.endpoints.web.exposure.include: health,info,metrics,prometheus
//   management.endpoint.health.show-details: when-authorized
//   management.health.db.enabled: true
//   management.health.redis.enabled: true

// Custom health indicator
@Component
public class PaymentServiceHealthIndicator implements HealthIndicator {

    private final PaymentClient paymentClient;

    // Constructor injection; without it the final field is never initialized
    public PaymentServiceHealthIndicator(PaymentClient paymentClient) {
        this.paymentClient = paymentClient;
    }

    @Override
    public Health health() {
        try {
            boolean ok = paymentClient.ping();
            return ok ? Health.up().withDetail("provider", "stripe").build()
                      : Health.down().withDetail("reason", "ping failed").build();
        } catch (Exception e) {
            return Health.down().withException(e).build();
        }
    }
}

// Graceful shutdown: built in since Spring Boot 2.3
// application.yml:
//   server.shutdown: graceful
//   spring.lifecycle.timeout-per-shutdown-phase: 30s
Python — FastAPI
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.responses import JSONResponse
from sqlalchemy import text

# db_engine, db, redis_client, and logger are assumed to be defined elsewhere in the app


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: fail fast if a dependency is unreachable, and release the probe connection
    async with db_engine.connect() as conn:
        await conn.execute(text("SELECT 1"))
    await redis_client.ping()
    logger.info("Application started")
    yield
    # Shutdown: Uvicorn drains in-flight requests first
    # (bounded by --timeout-graceful-shutdown when that option is set)
    await db_engine.dispose()
    await redis_client.close()
    logger.info("Application shut down cleanly")


app = FastAPI(lifespan=lifespan)


@app.get("/health")
async def health():
    checks = {}
    try:
        await db.execute(text("SELECT 1"))
        checks["db"] = "ok"
    except Exception as e:
        checks["db"] = f"error: {e}"
    try:
        await redis_client.ping()
        checks["redis"] = "ok"
    except Exception as e:
        checks["redis"] = f"error: {e}"
    status = "healthy" if all(v == "ok" for v in checks.values()) else "degraded"
    code = 200 if status == "healthy" else 503
    return JSONResponse(status_code=code, content={"status": status, "checks": checks})


@app.get("/ready")
async def ready():
    """Kubernetes readiness probe: only pass when ready to serve traffic."""
    return {"ready": True}
Node.js — Express
// Health check routes
app.get('/health', async (req, res) => {
  const checks = {};
  try {
    await db.query('SELECT 1');
    checks.db = 'ok';
  } catch (err) {
    checks.db = `error: ${err.message}`;
  }
  try {
    await redis.ping();
    checks.redis = 'ok';
  } catch (err) {
    checks.redis = `error: ${err.message}`;
  }
  const healthy = Object.values(checks).every(v => v === 'ok');
  res.status(healthy ? 200 : 503).json({ status: healthy ? 'healthy' : 'degraded', checks });
});

app.get('/ready', (req, res) => res.json({ ready: true }));

// Graceful shutdown
const server = app.listen(PORT, () => logger.info(`Listening on :${PORT}`));

process.on('SIGTERM', () => {
  logger.info('SIGTERM received, draining connections...');
  server.close(async () => {
    await db.end();
    await redis.quit();
    logger.info('Server shut down cleanly');
    process.exit(0);
  });
  // Force exit after 30 seconds if not drained
  setTimeout(() => process.exit(1), 30_000);
});
Compression Middleware
// Spring Boot: response compression is off by default; once enabled, the default threshold is 2KB
// application.yml:
// server.compression.enabled: true
// server.compression.min-response-size: 2048
// server.compression.mime-types: application/json,text/plain
# FastAPI: add gzip middleware (the built-in middleware is gzip only; Brotli needs a third-party package)
from fastapi.middleware.gzip import GZipMiddleware
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=5)
// Express
import compression from 'compression';
app.use(compression({ threshold: 1024 })); // Compress responses > 1KB
API Deployment (AWS)
Deployment architecture differs significantly depending on whether your API is internal (service-to-service) or public-facing (clients on the internet).
Internal APIs (Service-to-Service)
# Typical path:
# Service A → ALB (internal) → ECS Fargate / EKS pods (private subnet)
#
# Key decisions:
# - ALB (internal, scheme=internal) — stays within VPC, no public IP
# - Service discovery via AWS Cloud Map or internal DNS (service.local)
# - VPC Endpoints for AWS services (S3, DynamoDB, Secrets Manager)
# → traffic stays on AWS backbone, never hits public internet
# - Security groups: allow only specific source SGs, not CIDR blocks
# - No API Gateway needed — saves cost and latency for internal traffic
# - mTLS via service mesh (App Mesh / Istio) for zero-trust
Public APIs (Internet-Facing)
# Two common patterns:
#
# Pattern A: API Gateway (managed)
# Client → Route 53 → API Gateway → VPC Link → ALB (internal) → ECS/EKS
# Pros: managed throttling, API keys, usage plans, request validation, WAF integration
# Cons: per-request pricing (list price around $3.50/million for REST APIs; varies by region and API type) + data transfer; adds ~10-30ms latency
# Best for: public developer APIs with usage plans and metering
#
# Pattern B: ALB + CloudFront (self-managed)
# Client → Route 53 → CloudFront → ALB (public) → ECS/EKS
# Pros: lower cost at high volume, full control, global edge caching
# Cons: you implement rate limiting / API keys yourself (or use WAF)
# Best for: high-traffic first-party products (SPA backends, mobile APIs)
#
# Both patterns:
# - ACM certificate on the edge (CloudFront or API GW) + ALB
# - WAF rules: rate limiting, geo-blocking, SQL injection, XSS detection
# - Secrets in Secrets Manager or SSM Parameter Store (never plaintext env vars in the task definition)
Compute Options
| Option | Cold Start | Scaling | Cost Model | Best For |
|---|---|---|---|---|
| ECS Fargate | None (always-on) | Target tracking on CPU/memory/ALB requests | Per vCPU-hour + memory-hour | Most APIs — predictable latency, simple ops |
| EKS | None | HPA + Cluster Autoscaler / Karpenter | EC2 instances + $0.10/hr control plane | Large-scale microservices; team already on K8s |
| Lambda | 100ms–2s (depends on runtime + VPC) | Automatic (1000 concurrent default) | Per-invocation + duration | Sporadic traffic, event-driven, cost-sensitive |
| App Runner | None (min 1 instance) or ~2s (scale to zero) | Automatic based on concurrency | Per vCPU-hour + memory-hour | Simple APIs that don't need VPC features |
Production Checklist
- Networking: API in private subnets, ALB in public subnets, NAT Gateway for outbound, VPC endpoints for AWS services
- DNS: Route 53 alias record → ALB/CloudFront; health-check-based failover for multi-region
- TLS: ACM certificates on ALB + CloudFront; enforce TLS 1.2+ minimum; HSTS header
- Secrets: Secrets Manager with automatic rotation; injected at task start through the task definition's secrets field (not baked into the image or stored as plaintext env vars)
- Observability: Structured JSON logs → CloudWatch Logs; X-Ray or OpenTelemetry for distributed tracing; CloudWatch alarms on p99 latency, 5xx rate, and error budget
- CI/CD: CodePipeline or GitHub Actions → ECR image push → ECS rolling update (min healthy 100%, max 200%) or blue/green with CodeDeploy
- Database: RDS in private subnet with Multi-AZ; connection via IAM auth or Secrets Manager; RDS Proxy for Lambda (connection pooling)
- Rate limiting: WAF rate rules (IP-based) at the edge; application-level token bucket (Redis) for per-user limits
- DDoS: Shield Standard (free, automatic) protects against L3/L4; Shield Advanced for L7 + cost protection if needed
- Multi-region: Route 53 latency-based routing → regional ALBs; DynamoDB Global Tables or Aurora Global Database for data replication
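The application-level token bucket mentioned under rate limiting can be sketched as follows. This is an in-memory, single-process illustration only; the class name `TokenBucket` and its parameters are hypothetical, and a production deployment would keep the bucket state in Redis (as the checklist says) so that all instances share it.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_second` tokens per second."""

    def __init__(self, capacity: int, refill_per_second: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now  # injectable clock, so tests do not need to sleep
        self._tokens = float(capacity)
        self._updated = now()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller should get a 429."""
        current = self._now()
        elapsed = current - self._updated
        # Refill based on elapsed time, never exceeding capacity
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._updated = current
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# Usage with a fake clock: a burst of 2 is allowed, the third request is rejected,
# and one more request is allowed after one second has passed.
fake_time = [0.0]
bucket = TokenBucket(capacity=2, refill_per_second=1.0, now=lambda: fake_time[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_time[0] = 1.0
after_refill = bucket.allow()
```

One bucket per user or API key gives per-user limits; the WAF rate rules at the edge remain the coarse IP-based layer in front of it.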
15. Security Checklist
CORS Configuration
Java — Spring Boot
@Configuration
public class CorsConfig {

    @Bean
    public CorsConfigurationSource corsConfigurationSource() {
        CorsConfiguration config = new CorsConfiguration();
        // Never use * in production; enumerate allowed origins
        config.setAllowedOrigins(List.of("https://app.example.com", "https://admin.example.com"));
        config.setAllowedMethods(List.of("GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"));
        config.setAllowedHeaders(List.of("Authorization", "Content-Type", "X-API-Key"));
        // Needed for cookies; a Bearer token only needs Authorization in the allowed headers
        config.setAllowCredentials(true);
        config.setMaxAge(3600L); // Cache preflight for 1 hour

        UrlBasedCorsConfigurationSource source = new UrlBasedCorsConfigurationSource();
        source.registerCorsConfiguration("/api/**", config);
        return source;
    }
}
Python — FastAPI
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com", "https://admin.example.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
    allow_headers=["Authorization", "Content-Type", "X-API-Key"],
    max_age=3600,
)
Node.js — Express (cors + helmet)
import cors from 'cors';
import helmet from 'helmet';

// Helmet sets secure headers in one call
app.use(helmet({
  crossOriginEmbedderPolicy: false, // Adjust for your CDN/iframe needs
}));

app.use(cors({
  origin: (origin, callback) => {
    const allowed = ['https://app.example.com', 'https://admin.example.com'];
    if (!origin || allowed.includes(origin)) {
      callback(null, true);
    } else {
      callback(new Error('Not allowed by CORS'));
    }
  },
  credentials: true,
  methods: ['GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
  allowedHeaders: ['Authorization', 'Content-Type', 'X-API-Key'],
  maxAge: 3600,
}));
Security Checklist Table
| Category | Check | How to Enforce |
|---|---|---|
| Transport | HTTPS everywhere | HSTS header, redirect HTTP → HTTPS at load balancer |
| Auth | Short-lived JWT access tokens (15 min) | Set exp claim, validate on every request |
| Auth | Refresh tokens in HttpOnly cookies | Never in localStorage; set Secure; SameSite=Strict |
| Auth | No secrets in JWT payload | Code review; consider JWE for sensitive claims |
| Input | Validate all inputs at boundary | Bean Validation / Pydantic / Zod before service layer |
| Input | Parameterized queries only | ORM / prepared statements; never string interpolation in SQL |
| Input | Request size limits | spring.servlet.multipart.max-request-size (multipart uploads), client_max_body_size in nginx |
| Headers | Security headers on all responses | Helmet (Node), Spring Security defaults, FastAPI middleware |
| Headers | CORS restricted to known origins | Enumerate allowed origins; never * with credentials |
| Rate Limits | Per-key and per-IP limits | Redis-backed token bucket at API gateway or middleware |
| Errors | No stack traces in responses | Global error handler sanitizes output; log internally with trace ID |
| Logging | No PII in logs | Redact email, phone, SSN in log formatters; audit log pipeline |
| Dependencies | Vulnerability scanning | OWASP Dependency-Check Maven plugin, pip-audit, npm audit in CI |
| Authorization | Resource-level checks (not just role) | Verify resource.ownerId == currentUser.id in service layer |
| Files | Validate MIME via magic bytes | Use apache-tika / python-magic — never trust Content-Type |
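The last row of the table (validate MIME via magic bytes) can be illustrated without any third-party library. This is a deliberately small sketch covering three well-known signatures; the function names `sniff_mime` and `is_upload_allowed` are hypothetical, and a real service would more likely rely on apache-tika or python-magic as listed above.

```python
from typing import Optional

# Leading bytes ("magic numbers") of a few common formats
MAGIC_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/pdf": b"%PDF-",
}


def sniff_mime(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the file's leading bytes, or None if unknown."""
    for mime, signature in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return mime
    return None


def is_upload_allowed(data: bytes, declared_content_type: str,
                      allowed=frozenset({"image/png", "image/jpeg"})) -> bool:
    """Accept only when the sniffed type is allowlisted and matches the declared type."""
    detected = sniff_mime(data)
    return detected is not None and detected in allowed and detected == declared_content_type


png_upload = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
disguised_script = b"<script>alert(1)</script>"
```

Here `is_upload_allowed(disguised_script, "image/png")` is rejected even though the client claimed an image type, because the Content-Type header is never trusted on its own.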
16. Interview Quick Reference
"Design a REST API for X" Framework
When asked to design an API in an interview, walk through these steps explicitly — the interviewer wants to see your thought process, not just endpoint names.
- Clarify scope: Who are the consumers (mobile app, third-party, internal service)? Read or write heavy? Estimated scale?
- Identify resources: Nouns from the domain. For an e-commerce system: User, Product, Order, OrderItem, Payment, Shipment.
- Define relationships: An Order belongs to a User, has many OrderItems. One Payment per Order. One Shipment per fulfilled Order.
- Define endpoints: CRUD for each resource, plus actions that don't fit CRUD (use sub-resource nouns: /orders/{id}/cancellation).
- Design request/response schemas: Fields, types, required vs optional, nested vs flat IDs.
- Address cross-cutting concerns: Authentication, authorization, pagination, rate limiting, versioning, error format.
- Identify edge cases: Idempotency for payments, optimistic locking for inventory, eventual consistency if distributed.
Common Interview Questions
Q: What's the difference between PUT and PATCH?
PUT replaces the entire resource. If you omit a field, it is set to null/default. The request body must represent the complete desired state. PUT is idempotent: calling it N times produces the same result.
PATCH applies a partial update: only the fields you include are changed. PATCH is not inherently idempotent (e.g., {"increment_stock": 5} would increase stock on each call). To keep a PATCH idempotent, send absolute values rather than deltas, for example a JSON Patch replace operation such as {"op": "replace", "path": "/status", "value": "shipped"}.
Production choice: Prefer PATCH for most update operations because it avoids clients needing to know the full current state. Use PUT when replacing entire documents (e.g., replacing a config file).
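The difference between the two methods can be shown on plain dictionaries. This is a minimal sketch with hypothetical helper names (`apply_put`, `apply_patch`) and a made-up order shape; the PATCH variant shown is a simple merge-style update, not a full JSON Patch implementation.

```python
# Defaults the server applies when a PUT body omits a field (illustrative values)
ORDER_DEFAULTS = {"status": "pending", "note": None}


def apply_put(body: dict) -> dict:
    """PUT: the body is the complete desired state; omitted fields fall back to defaults."""
    return {**ORDER_DEFAULTS, **body}


def apply_patch(current: dict, body: dict) -> dict:
    """PATCH (merge-style): only the fields present in the body change."""
    return {**current, **body}


current = {"status": "pending", "note": "leave at door"}
body = {"status": "shipped"}

after_put = apply_put(body)               # note is reset because it was omitted
after_patch = apply_patch(current, body)  # note is preserved
```

Both operations are idempotent in this form because the body carries absolute values: applying the same PUT or the same merge PATCH twice yields the same state as applying it once.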
Q: How do you handle concurrent updates? (Optimistic locking)
Use optimistic concurrency control via ETag / version field:
- GET returns resource with ETag: "v42"
- Client sends PATCH with If-Match: "v42"
- Server checks if current version matches. If yes, apply update and increment version.
- If version mismatch (another client updated first), return 412 Precondition Failed
In the DB, this maps to: UPDATE orders SET status='shipped', version=43 WHERE id=? AND version=42. If 0 rows updated, someone else updated first.
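The conditional UPDATE above can be exercised end to end with the standard-library `sqlite3` module. This is a sketch under assumptions: the `orders` table, the `"v<number>"` ETag format, and the `update_status` helper are all illustrative, and a real handler would first return 404 for a missing row rather than treating zero updated rows as a conflict.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, version INTEGER)")
conn.execute("INSERT INTO orders VALUES (1, 'pending', 42)")
conn.commit()


def update_status(conn, order_id: int, new_status: str, if_match: str) -> int:
    """Apply the update only if the client's version is current; return an HTTP status code."""
    # Parse an ETag of the form "v42" (quotes included) into the integer 42
    expected_version = int(if_match.strip('"').lstrip("v"))
    cursor = conn.execute(
        "UPDATE orders SET status = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_status, order_id, expected_version),
    )
    conn.commit()
    # Zero rows updated means another writer got there first
    return 200 if cursor.rowcount == 1 else 412


first = update_status(conn, 1, "shipped", '"v42"')     # version matches
second = update_status(conn, 1, "cancelled", '"v42"')  # stale version
row = conn.execute("SELECT status, version FROM orders WHERE id = 1").fetchone()
```

The check and the write happen in a single statement, so there is no window between reading the version and updating the row in which a concurrent writer could slip through.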
Q: How do you make a POST idempotent? (Idempotency keys)
Use an idempotency key header, exactly like Stripe does: Idempotency-Key: <uuid-from-client>
- Client generates a UUID for the request and stores it.
- Server caches the response keyed by idempotency_key + endpoint + user_id in Redis (TTL: 24h).
- On retry, return the cached response instead of re-executing the operation.
- Return 409 if same key is used for a different request body (likely a bug).
This is critical for payment APIs: if a network timeout occurs, the client can safely retry with the same key knowing the payment won't be charged twice.
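The server side of this flow can be sketched with an in-memory dictionary standing in for Redis. The `IdempotencyStore` class and the body fingerprinting via SHA-256 are assumptions made for illustration; the sketch omits the TTL and is not safe for concurrent requests arriving with the same key, both of which a real implementation would need to handle.

```python
import hashlib
import json


class IdempotencyStore:
    """Cache responses by (key, endpoint, user) so that retries do not re-run the handler."""

    def __init__(self):
        self._entries = {}  # stand-in for Redis; no TTL in this sketch

    def execute(self, key: str, endpoint: str, user_id: str, body: dict, handler):
        cache_key = (key, endpoint, user_id)
        # Fingerprint the body so reuse of a key with a different payload can be detected
        fingerprint = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        cached = self._entries.get(cache_key)
        if cached is not None:
            stored_fingerprint, stored_response = cached
            if stored_fingerprint != fingerprint:
                return 409, {"detail": "Idempotency-Key reused with a different request body"}
            return stored_response
        response = handler(body)
        self._entries[cache_key] = (fingerprint, response)
        return response


charges = []


def charge(body: dict):
    """Hypothetical payment handler with a side effect that must happen only once."""
    charges.append(body["amount"])
    return 201, {"charged": body["amount"]}


store = IdempotencyStore()
first = store.execute("key-1", "/payments", "user-1", {"amount": 10}, charge)
retry = store.execute("key-1", "/payments", "user-1", {"amount": 10}, charge)
conflict = store.execute("key-1", "/payments", "user-1", {"amount": 99}, charge)
```

After these three calls the handler has run exactly once: the retry receives the cached 201 response and the mismatched body receives a 409.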
Q: Should you return 200 with an error field, or use proper HTTP status codes?
Always use proper HTTP status codes. Returning 200 {"success": false, "error": "not found"} is an anti-pattern because:
- Monitoring systems count all 200s as successes — errors become invisible in metrics.
- HTTP clients (fetch, Axios, Spring RestTemplate) have built-in error handling keyed on status codes — you bypass it entirely.
- Load balancers and CDNs make caching decisions based on status codes.
- Contract testing (consumer-driven) relies on status codes to express expectations.
Use RFC 9457 Problem Details (the successor to RFC 7807) for a structured, standardized error body.
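A Problem Details error pairs a real HTTP status code with an `application/problem+json` body. The helper below is a framework-neutral sketch; the function name `problem_response` and the example values are hypothetical, while the member names (`type`, `title`, `status`, `detail`, `instance`) are the standard ones.

```python
import json


def problem_response(status: int, title: str, detail: str,
                     type_uri: str = "about:blank", **extensions):
    """Build (status, headers, body) for a Problem Details error response."""
    body = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        **extensions,  # e.g. instance, or an errors list for validation failures
    }
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, payload = problem_response(
    404,
    "Not Found",
    "Order 42 does not exist",
    instance="/api/v1/orders/42",
)
```

The status code appears both on the HTTP response and in the body, so monitoring, clients, and caches all see the failure without parsing the payload.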
Q: How would you design pagination for a feed with real-time inserts?
Use cursor-based (keyset) pagination on a stable, unique, indexed column (typically id or created_at + id composite).
- The cursor encodes the last seen row's sort key (e.g., {"created_at": "2026-02-15T10:00:00Z", "id": "uuid"}), base64-encoded.
- Each page fetches: WHERE created_at < $cursor_ts OR (created_at = $cursor_ts AND id < $cursor_id) ORDER BY created_at DESC, id DESC LIMIT N+1
- Fetching N+1 rows lets you detect has_more without a COUNT query.
- No duplicates or skips even with concurrent inserts, because the cursor is a fixed point in the data.
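The keyset approach described above can be sketched with the standard-library `sqlite3` module. The table layout, integer ids, and helper names (`encode_cursor`, `fetch_page`) are assumptions for illustration; the response shape mirrors the `data` / `has_more` / `next_cursor` fields used in the pagination test earlier in this section.

```python
import base64
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
# Seven rows; some share a timestamp, so the id tiebreaker matters
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, f"2026-02-15T10:00:0{i // 2}Z") for i in range(1, 8)],
)


def encode_cursor(row) -> str:
    """Encode the last row's sort key (created_at, id) as an opaque base64 string."""
    return base64.urlsafe_b64encode(
        json.dumps({"created_at": row[1], "id": row[0]}).encode()
    ).decode()


def fetch_page(conn, size: int, cursor: str = None) -> dict:
    if cursor is None:
        rows = conn.execute(
            "SELECT id, created_at FROM orders ORDER BY created_at DESC, id DESC LIMIT ?",
            (size + 1,),
        ).fetchall()
    else:
        key = json.loads(base64.urlsafe_b64decode(cursor))
        rows = conn.execute(
            "SELECT id, created_at FROM orders "
            "WHERE created_at < ? OR (created_at = ? AND id < ?) "
            "ORDER BY created_at DESC, id DESC LIMIT ?",
            (key["created_at"], key["created_at"], key["id"], size + 1),
        ).fetchall()
    has_more = len(rows) > size  # the extra row only signals that another page exists
    page = rows[:size]
    next_cursor = encode_cursor(page[-1]) if has_more else None
    return {"data": [r[0] for r in page], "has_more": has_more, "next_cursor": next_cursor}


page1 = fetch_page(conn, 3)
# A newer row arrives between page requests; it must not shift the later pages
conn.execute("INSERT INTO orders VALUES (8, '2026-02-15T10:00:09Z')")
page2 = fetch_page(conn, 3, page1["next_cursor"])
page3 = fetch_page(conn, 3, page2["next_cursor"])
```

With offset pagination the insert between requests would push row 5 onto the second page a second time; with the keyset cursor the second page starts strictly after the last row already seen.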
Q: How do you prevent CSRF in a stateless JWT API?
An API that reads its JWT from the Authorization header (with the token held in memory on the client) is not vulnerable to CSRF: browsers never attach a Bearer token automatically, unlike cookies. CSRF attacks only work against cookie-based auth.
If you use HttpOnly cookies for refresh tokens:
- Set SameSite=Strict on the cookie; browsers won't send it in cross-site requests.
- For SPAs that need SameSite=Lax, add a Double Submit Cookie pattern or check the Origin header server-side.
- Spring Security enables CSRF protection by default; for a stateless API that authenticates only via the Authorization header, disable it explicitly with http.csrf(csrf -> csrf.disable()).
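The Origin check and the Double Submit Cookie comparison can be sketched as one framework-neutral function. The text treats these as alternatives; this sketch applies both as defense in depth. The function name, the allowlist, and the idea of the token arriving in a custom header such as X-CSRF-Token are assumptions for illustration.

```python
import hmac
from typing import Optional

ALLOWED_ORIGINS = {"https://app.example.com"}


def passes_csrf_checks(origin: Optional[str],
                       csrf_cookie: Optional[str],
                       csrf_header: Optional[str]) -> bool:
    """Return True only for requests from an allowed origin with matching CSRF tokens."""
    # 1. Origin check: reject cross-site requests outright
    if origin not in ALLOWED_ORIGINS:
        return False
    # 2. Double submit: the token in the cookie must be echoed in a request header,
    #    which a cross-site attacker cannot read and therefore cannot copy
    if not csrf_cookie or not csrf_header:
        return False
    # Constant-time comparison avoids leaking the token through timing differences
    return hmac.compare_digest(csrf_cookie.encode(), csrf_header.encode())


same_site = passes_csrf_checks("https://app.example.com", "tok-123", "tok-123")
cross_site = passes_csrf_checks("https://evil.example", "tok-123", "tok-123")
missing_header = passes_csrf_checks("https://app.example.com", "tok-123", None)
```

A check like this would apply only to state-changing requests that rely on the refresh-token cookie; endpoints authenticated purely by the Authorization header do not need it.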
API Design Review Checklist
- Resources: Are all URLs nouns? Are collections plural? No verbs in paths?
- Methods: Are GET/DELETE idempotent? Is PUT doing full replacement?
- Status codes: Is POST returning 201 with Location? Is DELETE returning 204?
- Creation: Does POST return a Location header pointing to the new resource?
- Errors: Are all errors RFC 9457 (formerly RFC 7807) Problem Details? No stack traces in bodies?
- Validation: Are all inputs validated at the boundary before reaching service layer?
- Auth: Are all non-public endpoints protected? Is resource-level authorization checked?
- Versioning: Is there a versioning strategy? Are deprecated endpoints flagged?
- Pagination: Do all list endpoints paginate? Is the cursor stable under concurrent writes?
- Filtering/sorting: Are query param names consistent? Are sort fields allowlisted?
- Consistency: Is JSON field naming consistent (all camelCase or all snake_case)?
- Idempotency: Do payment/side-effect endpoints support idempotency keys?
- Rate limiting: Is there per-key and per-IP rate limiting? Are headers set on all responses?
- Documentation: Is there an OpenAPI spec? Are all fields documented with examples?
- Tests: Are happy paths, validation errors, auth failures, and edge cases all tested?
HTTP Method Decision Tree
# What operation are you performing?
#
# Read-only, no side effects?
# └─► GET (cacheable, idempotent)
#
# Creating a new resource?
# └─► POST → returns 201 + Location header
#
# Replacing entire resource (client sends full state)?
# └─► PUT (idempotent; omitted fields → null/default)
#
# Partial update (only changed fields)?
# └─► PATCH (not idempotent by default)
#
# Removing a resource?
# └─► DELETE → returns 204 No Content (idempotent)
#
# Triggering an action that doesn't fit CRUD?
# └─► POST to a sub-resource noun:
# POST /orders/{id}/cancellation
# POST /invoices/{id}/payment
# POST /users/{id}/password-reset
#
# Checking if resource exists / fetching headers only?
# └─► HEAD
#
# CORS preflight (browser does this automatically)?
# └─► OPTIONS