
1. REST Fundamentals

REST (Representational State Transfer) is an architectural style, not a protocol. Roy Fielding defined it in his 2000 dissertation. Most "REST APIs" in the wild are REST-ish — understanding the real constraints helps you make deliberate trade-offs.

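The Six Architectural Constraints

Fielding's dissertation defines six constraints: client-server separation, statelessness, cacheability, a uniform interface, a layered system, and (optionally) code on demand.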

Richardson Maturity Model

Leonard Richardson's model describes four levels of API maturity. Most production APIs target Level 2. Level 3 (HATEOAS) is rarely worth the complexity unless you're building a public hypermedia API.

| Level | Name | Characteristics | Example |
|---|---|---|---|
| 0 | The Swamp of POX | Single URI, single HTTP method (POST). All operations are in the body. | POST /api → {"action":"getOrder","id":42} |
| 1 | Resources | Multiple URIs for different resources, but still usually just GET/POST. | POST /orders/42 to get order 42 |
| 2 | HTTP Verbs | Correct HTTP methods (GET, POST, PUT, DELETE). Correct status codes. The practical target for most APIs. | GET /orders/42, DELETE /orders/42 |
| 3 | Hypermedia Controls (HATEOAS) | Responses contain links describing what actions are available next. Self-documenting API state machines. | Response includes "_links": {"cancel": "/orders/42/cancel"} |
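
To make Level 3 concrete, here is a minimal sketch of a server choosing which links to advertise based on the order's state. The function names, the status values, and the link layout are illustrative assumptions, not part of any framework.

```python
def order_links(order_id: int, status: str) -> dict[str, str]:
    """Return the hypermedia links that are valid for the order's current state."""
    base = f"/orders/{order_id}"
    links = {"self": base}
    # Assumption: only pending or processing orders can still be cancelled.
    if status in {"pending", "processing"}:
        links["cancel"] = f"{base}/cancel"
    return links


def order_representation(order_id: int, status: str) -> dict:
    """Build a response body that carries its own next-step links."""
    return {
        "id": order_id,
        "status": status,
        "_links": order_links(order_id, status),
    }
```

A client that follows `_links` never has to hard-code when cancellation is allowed: once the order ships, the `cancel` link simply disappears from the response.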

REST vs RPC vs GraphQL

| Dimension | REST | gRPC / RPC | GraphQL |
|---|---|---|---|
| Mental model | Resources & representations | Functions / procedures | Graph of typed fields |
| Transport | HTTP/1.1, HTTP/2 | HTTP/2 (gRPC), any (JSON-RPC) | HTTP/1.1, HTTP/2, WebSocket |
| Payload | JSON, XML, any | Protobuf (binary), JSON | JSON |
| Over-fetching | Common (fixed response shapes) | Low (schema-driven) | None (client selects fields) |
| Caching | HTTP-native (ETags, Cache-Control) | Manual or no caching | Hard (POST-based queries bypass HTTP cache) |
| Tooling maturity | Excellent | Good (Protobuf ecosystem) | Good (Apollo, Relay) |
| Best for | Public APIs, mobile backends, microservices with stable contracts | Internal service-to-service, streaming, polyglot systems | Complex UIs with many data shapes, BFF pattern |

When to use what
Use REST for public APIs or when HTTP caching matters. Use gRPC for internal microservice communication where performance and strict contracts are paramount. Use GraphQL when you have a mobile/web BFF layer and multiple clients requesting different data shapes from the same backend.

2. HTTP Methods & Status Codes

Methods: Safety & Idempotency

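Safety and idempotency determine how proxies, CDNs, and clients may retry or cache requests. A safe method does not change server state; an idempotent method has the same effect whether it is sent once or several times.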

| Method | Safe | Idempotent | Request Body | Typical Use |
|---|---|---|---|---|
| GET | Yes | Yes | No | Fetch resource or collection |
| POST | No | No | Yes | Create resource, trigger action |
| PUT | No | Yes | Yes | Full replacement of a resource |
| PATCH | No | No* | Yes | Partial update |
| DELETE | No | Yes | Rarely | Remove resource |
| HEAD | Yes | Yes | No | Check resource exists / get headers |
| OPTIONS | Yes | Yes | No | CORS preflight, API discovery |

* PATCH can be designed to be idempotent. In JSON Patch (RFC 6902), a "replace" operation is idempotent, while an "add" that appends to an array (path ending in "/-") is not.
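
The difference is easy to see by applying the same patch twice. The sketch below implements a deliberately tiny subset of RFC 6902 (only "replace" on a top-level key and "add" with the array-append path "/key/-"); `apply_op` and `is_idempotent` are illustrative helpers, not a real JSON Patch library.

```python
import copy


def apply_op(doc: dict, op: dict) -> dict:
    """Apply one simplified JSON Patch operation to a copy of doc.

    Supports only single-segment paths such as "/status" and the
    array-append form "/tags/-"; real JSON Patch covers far more.
    """
    result = copy.deepcopy(doc)
    parts = op["path"].lstrip("/").split("/")
    if op["op"] == "replace" and len(parts) == 1:
        result[parts[0]] = op["value"]
    elif op["op"] == "add" and len(parts) == 2 and parts[1] == "-":
        result[parts[0]].append(op["value"])
    else:
        raise ValueError(f"unsupported operation: {op}")
    return result


def is_idempotent(doc: dict, op: dict) -> bool:
    """True when applying op twice gives the same document as applying it once."""
    once = apply_op(doc, op)
    return apply_op(once, op) == once


order = {"status": "pending", "tags": ["gift"]}
replace_status = {"op": "replace", "path": "/status", "value": "shipped"}
append_tag = {"op": "add", "path": "/tags/-", "value": "priority"}
```

Here `is_idempotent(order, replace_status)` holds, while `is_idempotent(order, append_tag)` does not, because the second application appends a duplicate tag. A retried PATCH of the second kind needs an idempotency key or a precondition such as If-Match.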

Status Codes That Matter in Production

| Code | Name | Real Scenario |
|---|---|---|
| 200 | OK | GET /orders/42 returns the order |
| 201 | Created | POST /orders creates an order; include Location: /orders/99 header |
| 204 | No Content | DELETE /orders/42 succeeds; no body needed |
| 301 | Moved Permanently | API versioning: /api/orders permanently moved to /api/v2/orders (clients may switch POST to GET when following a 301; 308 preserves the method) |
| 304 | Not Modified | Client sends If-None-Match: "abc123"; resource unchanged; save bandwidth |
| 400 | Bad Request | Request body fails validation (missing required field, wrong type) |
| 401 | Unauthorized | JWT missing or expired; include WWW-Authenticate header |
| 403 | Forbidden | Token valid, but user lacks permission (e.g., non-admin accessing admin endpoint) |
| 404 | Not Found | GET /orders/99999 where that order doesn't exist |
| 409 | Conflict | POST /users with email that already exists (unique constraint violation) |
| 422 | Unprocessable Entity | Syntactically valid JSON but semantically wrong (e.g., end date before start date) |
| 429 | Too Many Requests | Rate limit exceeded; include Retry-After and X-RateLimit-Reset headers |
| 500 | Internal Server Error | Unhandled exception; log the full stack trace server-side, return sanitized message |
| 502 | Bad Gateway | Upstream service (payment processor, DB) returned invalid response |
| 503 | Service Unavailable | Intentional during maintenance, circuit breaker open, or DB connection exhausted |

401 vs 403: The Interview Trap
401 means "I don't know who you are" (missing or invalid credentials). 403 means "I know who you are, but you can't do this." Avoid returning 200 with an error body: clients, retry logic, and monitoring that key off the status code will all treat the failure as a success.

3. Resource Design & URL Patterns

Nouns Not Verbs

URLs identify resources (things), not actions (verbs). HTTP methods already express the action.

| Bad (verbs in URL) | Good (nouns + HTTP method) |
|---|---|
| POST /createOrder | POST /orders |
| GET /getOrderById?id=42 | GET /orders/42 |
| POST /cancelOrder/42 | POST /orders/42/cancellation |
| DELETE /deleteUser/5 | DELETE /users/5 |
| GET /searchProducts | GET /products?q=laptop |

Nested vs Flat Resources

Nesting expresses ownership and context. The practical rule: nest at most one level deep. Deeper nesting creates brittle URLs and forces callers to know the full ownership chain.

| Approach | URL Example | When to Use | Trade-offs |
|---|---|---|---|
| Nested | /orders/{orderId}/items | Resource has no meaning outside parent (order items without an order) | Clear ownership; couples client to hierarchy; hard to paginate across parents |
| Flat with filter | /order-items?orderId=42 | Resource can exist independently or needs querying across parents | Flexible; less intuitive; authorization must be explicit |
| Mixed | /orders/{id}/items for writes; /order-items?status=pending for queries | Best of both: ownership for writes, flexibility for reads | More endpoints; document the intent clearly |

Route Definition in All Three Frameworks

Java — Spring Boot 3

@RestController
@RequestMapping("/api/v1")
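public class OrderController {

    private final OrderService orderService;

    // Constructor injection; OrderService is the application's own service class (not shown)
    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }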

    // GET /api/v1/orders
    @GetMapping("/orders")
    public ResponseEntity<Page<OrderDto>> listOrders(
            @RequestParam(defaultValue = "0") int page,
            @RequestParam(defaultValue = "20") int size) {
        return ResponseEntity.ok(orderService.findAll(PageRequest.of(page, size)));
    }

    // GET /api/v1/orders/{orderId}
    @GetMapping("/orders/{orderId}")
    public ResponseEntity<OrderDto> getOrder(@PathVariable UUID orderId) {
        return ResponseEntity.ok(orderService.findById(orderId));
    }

    // GET /api/v1/orders/{orderId}/items
    @GetMapping("/orders/{orderId}/items")
    public ResponseEntity<List<OrderItemDto>> listItems(@PathVariable UUID orderId) {
        return ResponseEntity.ok(orderService.findItems(orderId));
    }

    // POST /api/v1/orders
    @PostMapping("/orders")
    public ResponseEntity<OrderDto> createOrder(
            @Valid @RequestBody CreateOrderRequest request,
            UriComponentsBuilder uriBuilder) {
        OrderDto created = orderService.create(request);
        URI location = uriBuilder.path("/api/v1/orders/{id}")
                .buildAndExpand(created.id()).toUri();
        return ResponseEntity.created(location).body(created);
    }

    // PATCH /api/v1/orders/{orderId}
    @PatchMapping("/orders/{orderId}")
    public ResponseEntity<OrderDto> updateOrder(
            @PathVariable UUID orderId,
            @Valid @RequestBody UpdateOrderRequest request) {
        return ResponseEntity.ok(orderService.update(orderId, request));
    }

    // DELETE /api/v1/orders/{orderId}
    @DeleteMapping("/orders/{orderId}")
    public ResponseEntity<Void> deleteOrder(@PathVariable UUID orderId) {
        orderService.delete(orderId);
        return ResponseEntity.noContent().build();
    }

    // POST /api/v1/orders/{orderId}/cancellation (action as sub-resource)
    @PostMapping("/orders/{orderId}/cancellation")
    public ResponseEntity<OrderDto> cancelOrder(
            @PathVariable UUID orderId,
            @Valid @RequestBody CancelOrderRequest request) {
        return ResponseEntity.ok(orderService.cancel(orderId, request.reason()));
    }
}

Python — FastAPI

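from uuid import UUID

from fastapi import APIRouter, Depends, HTTPException, Query, Response, status
from sqlalchemy.ext.asyncio import AsyncSession

# order_service, get_db, and the DTO/request models are application modules (not shown)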

router = APIRouter(prefix="/api/v1", tags=["orders"])

@router.get("/orders", response_model=PaginatedResponse[OrderDto])
async def list_orders(
    page: int = Query(0, ge=0),
    size: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    return await order_service.find_all(db, page=page, size=size)

@router.get("/orders/{order_id}", response_model=OrderDto)
async def get_order(order_id: UUID, db: AsyncSession = Depends(get_db)):
    order = await order_service.find_by_id(db, order_id)
    if not order:
        raise HTTPException(status_code=404, detail="Order not found")
    return order

@router.get("/orders/{order_id}/items", response_model=list[OrderItemDto])
async def list_order_items(order_id: UUID, db: AsyncSession = Depends(get_db)):
    return await order_service.find_items(db, order_id)

@router.post("/orders", response_model=OrderDto, status_code=status.HTTP_201_CREATED)
async def create_order(
    request: CreateOrderRequest,
    response: Response,
    db: AsyncSession = Depends(get_db),
):
    created = await order_service.create(db, request)
    response.headers["Location"] = f"/api/v1/orders/{created.id}"
    return created

@router.patch("/orders/{order_id}", response_model=OrderDto)
async def update_order(
    order_id: UUID,
    request: UpdateOrderRequest,
    db: AsyncSession = Depends(get_db),
):
    return await order_service.update(db, order_id, request)

@router.delete("/orders/{order_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_order(order_id: UUID, db: AsyncSession = Depends(get_db)):
    await order_service.delete(db, order_id)

@router.post("/orders/{order_id}/cancellation", response_model=OrderDto)
async def cancel_order(
    order_id: UUID,
    request: CancelOrderRequest,
    db: AsyncSession = Depends(get_db),
):
    return await order_service.cancel(db, order_id, request.reason)

Node.js — Express

import { Router } from 'express';
import { validate } from '../middleware/validate.js';
import { createOrderSchema, updateOrderSchema, cancelOrderSchema } from '../schemas/order.js';

const router = Router();

// GET /api/v1/orders
router.get('/orders', async (req, res, next) => {
  try {
    const { page = 0, size = 20 } = req.query;
    const result = await orderService.findAll({ page: Number(page), size: Number(size) });
    res.json(result);
  } catch (err) { next(err); }
});

// GET /api/v1/orders/:orderId
router.get('/orders/:orderId', async (req, res, next) => {
  try {
    const order = await orderService.findById(req.params.orderId);
    if (!order) return res.status(404).json({ error: 'Order not found' });
    res.json(order);
  } catch (err) { next(err); }
});

// GET /api/v1/orders/:orderId/items
router.get('/orders/:orderId/items', async (req, res, next) => {
  try {
    const items = await orderService.findItems(req.params.orderId);
    res.json(items);
  } catch (err) { next(err); }
});

// POST /api/v1/orders
router.post('/orders', validate(createOrderSchema), async (req, res, next) => {
  try {
    const created = await orderService.create(req.body);
    res.status(201)
       .location(`/api/v1/orders/${created.id}`) // Express helper that sets the Location header
       .json(created);
  } catch (err) { next(err); }
});

// PATCH /api/v1/orders/:orderId
router.patch('/orders/:orderId', validate(updateOrderSchema), async (req, res, next) => {
  try {
    const updated = await orderService.update(req.params.orderId, req.body);
    res.json(updated);
  } catch (err) { next(err); }
});

// DELETE /api/v1/orders/:orderId
router.delete('/orders/:orderId', async (req, res, next) => {
  try {
    await orderService.delete(req.params.orderId);
    res.status(204).send();
  } catch (err) { next(err); }
});

// POST /api/v1/orders/:orderId/cancellation
router.post('/orders/:orderId/cancellation', validate(cancelOrderSchema), async (req, res, next) => {
  try {
    const updated = await orderService.cancel(req.params.orderId, req.body.reason);
    res.json(updated);
  } catch (err) { next(err); }
});

export const orderRouter = router;

4. Request & Response Design

JSON Conventions

Pick a naming convention and stick to it throughout the entire API. Mixing camelCase and snake_case in the same API is the fastest way to cause client bugs.

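| Convention | Ecosystem Default | Example |
|---|---|---|
| camelCase | JavaScript, Java (Jackson) | {"orderId": 42, "createdAt": "..."} |
| snake_case | Python, Ruby, PostgreSQL | {"order_id": 42, "created_at": "..."} |
| PascalCase | .NET serializers outside ASP.NET Core (which defaults to camelCase), Go (encoding/json without struct tags) | {"OrderId": 42, "CreatedAt": "..."} |
| kebab-case | Rare in JSON bodies, common in headers | {"order-id": 42} |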

Envelope Patterns

A consistent envelope makes API responses predictable. Two common approaches:

Bare resource (GitHub API style)

// Single resource — return the object directly
{
  "id": "order_abc123",
  "status": "processing",
  "total": 9999,
  "currency": "usd",
  "created_at": "2026-02-15T10:30:00Z"
}

// Collection — include pagination metadata
{
  "data": [
    { "id": "order_abc123", "status": "processing" },
    { "id": "order_def456", "status": "shipped" }
  ],
  "pagination": {
    "total": 142,
    "page": 0,
    "size": 20,
    "has_more": true
  }
}

RFC 7807 Problem Details for errors (now superseded by RFC 9457; media type application/problem+json)

// Always use this for errors — standardized and machine-readable
{
  "type": "https://api.example.com/errors/validation-failed",
  "title": "Validation Failed",
  "status": 422,
  "detail": "The request body contains invalid fields.",
  "instance": "/api/v1/orders/abc123",
  "errors": [
    {
      "field": "items[0].quantity",
      "message": "must be greater than 0",
      "code": "MIN_VALUE"
    },
    {
      "field": "shipping_address.zip",
      "message": "invalid postal code format",
      "code": "INVALID_FORMAT"
    }
  ]
}

Serialization in Each Framework

Java — Spring Boot 3 with Records

// DTO using Java record (immutable, auto-equals/hashCode, compact)
public record OrderDto(
    UUID id,
    String status,
    BigDecimal total,
    String currency,
    List<OrderItemDto> items,
    @JsonProperty("created_at") Instant createdAt
) {
    // Jackson maps JSON field "created_at" to Java "createdAt"
}

public record CreateOrderRequest(
    @NotNull @Size(min = 1, max = 50) List<CreateOrderItemRequest> items,
    @NotNull @Valid ShippingAddressRequest shippingAddress,
    @Pattern(regexp = "^[A-Z]{3}$") String currency
) {}

// Configure Jackson globally in application.yml:
// spring.jackson.property-naming-strategy: SNAKE_CASE
// spring.jackson.serialization.write-dates-as-timestamps: false

Python — FastAPI + Pydantic v2

from pydantic import BaseModel, Field, field_validator
from datetime import datetime
from uuid import UUID
from decimal import Decimal

class OrderItemDto(BaseModel):
    product_id: UUID
    quantity: int = Field(gt=0, le=1000)
    unit_price: Decimal = Field(decimal_places=2)
    name: str

class OrderDto(BaseModel):
    id: UUID
    status: str
    total: Decimal
    currency: str
    items: list[OrderItemDto]
    created_at: datetime

    model_config = {
        "from_attributes": True,  # Allow creating from ORM objects
        # Pydantic v2 already serializes Decimal as a string in JSON output,
        # so the deprecated v1-style "json_encoders" setting is not needed
    }

class CreateOrderRequest(BaseModel):
    items: list[CreateOrderItemRequest] = Field(min_length=1, max_length=50)
    shipping_address: ShippingAddressRequest
    currency: str = Field(pattern=r"^[A-Z]{3}$", default="USD")

    @field_validator("currency")
    @classmethod
    def currency_must_be_supported(cls, v: str) -> str:
        supported = {"USD", "EUR", "GBP", "CAD"}
        if v not in supported:
            raise ValueError(f"Currency must be one of: {', '.join(sorted(supported))}")
        return v

Node.js — Express + Zod

import { z } from 'zod';

// Response shape (a Zod schema; derive the TypeScript type with z.infer)
const OrderItemDtoSchema = z.object({
  productId: z.string().uuid(),
  quantity: z.number().int().positive().max(1000),
  unitPrice: z.string(), // Decimal as string to avoid float imprecision
  name: z.string(),
});

// Zod schema doubles as runtime validator and TypeScript type
const CreateOrderSchema = z.object({
  items: z.array(z.object({
    productId: z.string().uuid(),
    quantity: z.number().int().min(1).max(1000),
  })).min(1).max(50),
  shippingAddress: ShippingAddressSchema,
  currency: z.string().regex(/^[A-Z]{3}$/).default('USD'),
});

export type CreateOrderRequest = z.infer<typeof CreateOrderSchema>;

// Validation middleware
export function validate(schema) {
  return (req, res, next) => {
    const result = schema.safeParse(req.body);
    if (!result.success) {
      return res.status(422).json({
        type: 'https://api.example.com/errors/validation-failed',
        title: 'Validation Failed',
        status: 422,
        // .issues is the canonical list of validation problems on a ZodError
        errors: result.error.issues.map(e => ({
          field: e.path.join('.'),
          message: e.message,
          code: e.code.toUpperCase(),
        })),
      });
    }
    req.body = result.data; // Replace with parsed + typed value
    next();
  };
}

5. Pagination, Filtering & Sorting

Offset vs Cursor Pagination

| Dimension | Offset / Page-Based | Cursor / Keyset |
|---|---|---|
| URL | ?page=3&size=20 | ?cursor=eyJpZCI6MTAwfQ&size=20 |
| DB Query | LIMIT 20 OFFSET 60 | WHERE id > 100 LIMIT 20 |
| Performance | Degrades at high offsets (DB scans all skipped rows) | Roughly constant per page (index seek) |
| Consistency | Rows inserted during pagination cause duplicates/skips | Stable (cursor is a fixed point) |
| Random access | Yes ("jump to page 50") | No (sequential only) |
| Best for | Admin UIs, small datasets, when users need page numbers | Infinite scroll, feeds, large tables, public APIs |

Stripe uses cursor pagination
Stripe's API uses a starting_after / ending_before cursor pattern. All their list endpoints return "has_more": true/false and an array of objects. No page numbers. This is the right choice for financial data where consistency under concurrent writes matters.

Pagination Implementation

Java — Spring Boot with JPA

// Offset pagination (Spring Data handles it natively)
@GetMapping("/orders")
public ResponseEntity<PageResponse<OrderDto>> listOrders(
        @RequestParam(defaultValue = "0") int page,
        @RequestParam(defaultValue = "20") @Max(100) int size,
        @RequestParam(required = false) String status,
        @RequestParam(required = false) @DateTimeFormat(iso = ISO.DATE) LocalDate createdAfter,
        @RequestParam(defaultValue = "created_at") String sort,
        @RequestParam(defaultValue = "desc") String direction) {

    Sort.Direction dir = Sort.Direction.fromString(direction);
    Pageable pageable = PageRequest.of(page, size, Sort.by(dir, toColumn(sort)));

    Specification<Order> spec = Specification
        .where(hasStatus(status))
        .and(createdAfter(createdAfter));

    Page<Order> result = orderRepository.findAll(spec, pageable);
    return ResponseEntity.ok(PageResponse.from(result, orderMapper::toDto));
}

// Cursor pagination for feeds
@GetMapping("/orders/feed")
public ResponseEntity<CursorPage<OrderDto>> orderFeed(
        @RequestParam(required = false) String cursor,
        @RequestParam(defaultValue = "20") @Max(100) int size) {

    UUID afterId = cursor != null ? decodeCursor(cursor) : null;
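    // The repository method takes a Pageable, so wrap the limit (one extra row detects hasMore)
    List<Order> orders = orderRepository.findAfterCursor(afterId, PageRequest.of(0, size + 1));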
    boolean hasMore = orders.size() > size;
    if (hasMore) orders = orders.subList(0, size);

    String nextCursor = hasMore ? encodeCursor(orders.get(size - 1).getId()) : null;
    return ResponseEntity.ok(new CursorPage<>(orders.stream().map(orderMapper::toDto).toList(), nextCursor, hasMore));
}

// JPA repository
@Query("SELECT o FROM Order o WHERE (:afterId IS NULL OR o.id > :afterId) ORDER BY o.id ASC")
List<Order> findAfterCursor(@Param("afterId") UUID afterId, Pageable pageable);

Python — FastAPI with SQLAlchemy

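from base64 import urlsafe_b64decode, urlsafe_b64encode
from datetime import date, datetime
import json
from typing import Literal
from uuid import UUID

from fastapi import Depends, Query
from sqlalchemy import and_, or_, select
from sqlalchemy.ext.asyncio import AsyncSession

@router.get("/orders", response_model=CursorPage[OrderDto])
async def list_orders(
    cursor: str | None = Query(None),
    size: int = Query(20, ge=1, le=100),
    status: str | None = Query(None),
    created_after: date | None = Query(None),
    direction: Literal["asc", "desc"] = Query("desc"),
    db: AsyncSession = Depends(get_db),
):
    # Keyset pagination only works when the cursor encodes the sort key, so this
    # endpoint always sorts by (created_at, id); id breaks ties between equal timestamps
    stmt = select(Order)

    # Filtering
    if status:
        stmt = stmt.where(Order.status == status)
    if created_after:
        stmt = stmt.where(Order.created_at >= created_after)

    # Cursor decoding: parse the stored values back into their column types
    if cursor:
        cursor_data = json.loads(urlsafe_b64decode(cursor))
        last_ts = datetime.fromisoformat(cursor_data["created_at"])
        last_id = UUID(cursor_data["id"])
        if direction == "desc":
            stmt = stmt.where(or_(
                Order.created_at < last_ts,
                and_(Order.created_at == last_ts, Order.id < last_id),
            ))
        else:
            stmt = stmt.where(or_(
                Order.created_at > last_ts,
                and_(Order.created_at == last_ts, Order.id > last_id),
            ))

    # Sort and limit (fetch one extra row to detect has_more)
    if direction == "desc":
        stmt = stmt.order_by(Order.created_at.desc(), Order.id.desc())
    else:
        stmt = stmt.order_by(Order.created_at.asc(), Order.id.asc())
    stmt = stmt.limit(size + 1)

    result = await db.execute(stmt)
    orders = result.scalars().all()

    has_more = len(orders) > size
    if has_more:
        orders = orders[:size]

    next_cursor = None
    if has_more:
        last = orders[-1]
        # URL-safe base64 keeps the cursor usable in a query string
        next_cursor = urlsafe_b64encode(json.dumps({
            "created_at": last.created_at.isoformat(),
            "id": str(last.id),
        }).encode()).decode()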

    return CursorPage(
        data=[OrderDto.model_validate(o) for o in orders],
        next_cursor=next_cursor,
        has_more=has_more,
    )

Node.js — Express with pg

router.get('/orders', async (req, res, next) => {
  try {
    const {
      cursor,
      size = '20',
      status,
      created_after,
      sort = 'created_at',
      direction = 'desc',
    } = req.query;

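    // Clamp to 1..100 and fall back to 20 when size is not a number
    const parsedSize = parseInt(size, 10);
    const pageSize = Number.isNaN(parsedSize) ? 20 : Math.min(Math.max(parsedSize, 1), 100);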
    const params = [];
    const conditions = [];
    let idx = 1;

    // Filtering
    if (status) {
      conditions.push(`status = $${idx++}`);
      params.push(status);
    }
    if (created_after) {
      conditions.push(`created_at >= $${idx++}`);
      params.push(new Date(created_after));
    }

    // Cursor decoding
    if (cursor) {
      const { created_at: cursorTs } = JSON.parse(Buffer.from(cursor, 'base64').toString());
      const op = direction === 'desc' ? '<' : '>';
      conditions.push(`created_at ${op} $${idx++}`);
      params.push(new Date(cursorTs));
    }

    const where = conditions.length ? `WHERE ${conditions.join(' AND ')}` : '';
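    // The cursor encodes created_at, so keyset paging is only correct when sorting by
    // created_at; other sort columns would need their value stored in the cursor, and
    // rows that share a timestamp additionally need an id tiebreaker
    const allowedSort = ['created_at'];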
    const sortCol = allowedSort.includes(sort) ? sort : 'created_at';
    const dir = direction === 'asc' ? 'ASC' : 'DESC';

    // Fetch one extra to detect has_more
    params.push(pageSize + 1);
    const { rows } = await db.query(
      `SELECT * FROM orders ${where} ORDER BY ${sortCol} ${dir} LIMIT $${idx}`,
      params,
    );

    const hasMore = rows.length > pageSize;
    const data = hasMore ? rows.slice(0, pageSize) : rows;

    const nextCursor = hasMore
      ? Buffer.from(JSON.stringify({ created_at: data[data.length - 1].created_at })).toString('base64')
      : null;

    res.json({ data, next_cursor: nextCursor, has_more: hasMore });
  } catch (err) { next(err); }
});

6. Validation & Error Handling

Validation belongs at the boundary. Never let invalid data reach your service layer or database. Structured, consistent error responses are a hallmark of a mature API.

Global Exception / Error Handlers

Java — Spring Boot @ControllerAdvice

@RestControllerAdvice
@Slf4j
public class GlobalExceptionHandler {

    // Handle bean validation failures (@Valid on request body)
    @ExceptionHandler(MethodArgumentNotValidException.class)
    public ResponseEntity<ProblemDetail> handleValidation(
            MethodArgumentNotValidException ex, HttpServletRequest request) {

        ProblemDetail problem = ProblemDetail.forStatusAndDetail(
            HttpStatus.UNPROCESSABLE_ENTITY, "The request body contains invalid fields.");
        problem.setType(URI.create("https://api.example.com/errors/validation-failed"));
        problem.setInstance(URI.create(request.getRequestURI()));

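        // ApiFieldError is an application-defined record (field, message, code);
        // naming it FieldError would clash with Spring's own FieldError type
        List<ApiFieldError> fieldErrors = ex.getBindingResult().getFieldErrors()
            .stream()
            .map(e -> new ApiFieldError(e.getField(), e.getDefaultMessage(), e.getCode()))
            .toList();
        problem.setProperty("errors", fieldErrors);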

        return ResponseEntity.unprocessableEntity().body(problem);
    }

    @ExceptionHandler(OrderNotFoundException.class)
    public ResponseEntity<ProblemDetail> handleNotFound(
            OrderNotFoundException ex, HttpServletRequest request) {
        ProblemDetail problem = ProblemDetail.forStatusAndDetail(
            HttpStatus.NOT_FOUND, ex.getMessage());
        problem.setInstance(URI.create(request.getRequestURI()));
        return ResponseEntity.status(404).body(problem);
    }

    @ExceptionHandler(DuplicateEmailException.class)
    public ResponseEntity<ProblemDetail> handleConflict(
            DuplicateEmailException ex, HttpServletRequest request) {
        ProblemDetail problem = ProblemDetail.forStatusAndDetail(
            HttpStatus.CONFLICT, "A user with this email already exists.");
        problem.setType(URI.create("https://api.example.com/errors/duplicate-email"));
        return ResponseEntity.status(409).body(problem);
    }

    // Catch-all: log + return 500 without leaking stack trace
    @ExceptionHandler(Exception.class)
    public ResponseEntity<ProblemDetail> handleUnexpected(
            Exception ex, HttpServletRequest request) {
        String traceId = MDC.get("traceId"); // From OpenTelemetry/Sleuth
        log.error("Unhandled exception [traceId={}]", traceId, ex);
        ProblemDetail problem = ProblemDetail.forStatusAndDetail(
            HttpStatus.INTERNAL_SERVER_ERROR,
            "An unexpected error occurred. Reference: " + traceId);
        return ResponseEntity.internalServerError().body(problem);
    }
}

Python — FastAPI exception handlers

from fastapi import Request, status
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

# Pydantic v2 validation errors are caught automatically by FastAPI,
# but we can customize the response format:
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    errors = []
    for error in exc.errors():
        errors.append({
            "field": ".".join(str(loc) for loc in error["loc"][1:]),  # Skip "body"
            "message": error["msg"],
            "code": error["type"].upper(),
        })
    return JSONResponse(
        status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
        content={
            "type": "https://api.example.com/errors/validation-failed",
            "title": "Validation Failed",
            "status": 422,
            "detail": "The request body contains invalid fields.",
            "errors": errors,
        },
    )

class OrderNotFoundError(Exception):
    def __init__(self, order_id: UUID):
        self.order_id = order_id

@app.exception_handler(OrderNotFoundError)
async def order_not_found_handler(request: Request, exc: OrderNotFoundError):
    return JSONResponse(
        status_code=404,
        content={
            "type": "https://api.example.com/errors/not-found",
            "title": "Not Found",
            "status": 404,
            "detail": f"Order {exc.order_id} not found.",
            "instance": str(request.url),
        },
    )

# Catch-all: log + sanitize
@app.exception_handler(Exception)
async def unexpected_error_handler(request: Request, exc: Exception):
    trace_id = getattr(request.state, "trace_id", None)  # Set by middleware; None if it did not run
    logger.exception("Unhandled exception [trace_id=%s]", trace_id)
    return JSONResponse(
        status_code=500,
        content={
            "title": "Internal Server Error",
            "status": 500,
            "detail": f"An unexpected error occurred. Reference: {trace_id}",
        },
    )

Node.js — Express error middleware

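import { randomUUID } from 'node:crypto';

// Error middleware must have 4 parameters: Express identifies it by arity
export function errorHandler(err, req, res, next) {
  // If the response has already started, delegate to Express's default handler
  if (res.headersSent) return next(err);
  const traceId = req.headers['x-trace-id'] || randomUUID();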

  // Known domain errors
  if (err instanceof OrderNotFoundError) {
    return res.status(404).json({
      type: 'https://api.example.com/errors/not-found',
      title: 'Not Found',
      status: 404,
      detail: err.message,
      instance: req.path,
    });
  }

  if (err instanceof DuplicateEmailError) {
    return res.status(409).json({
      type: 'https://api.example.com/errors/duplicate-email',
      title: 'Conflict',
      status: 409,
      detail: 'A user with this email already exists.',
    });
  }

  // Validation errors from Zod middleware (already handled inline, but as fallback)
  if (err.name === 'ZodError') {
    return res.status(422).json({
      type: 'https://api.example.com/errors/validation-failed',
      title: 'Validation Failed',
      status: 422,
      errors: err.issues.map(e => ({ field: e.path.join('.'), message: e.message })),
    });
  }

  // Catch-all
  logger.error({ err, traceId, path: req.path }, 'Unhandled exception');
  res.status(500).json({
    title: 'Internal Server Error',
    status: 500,
    detail: `An unexpected error occurred. Reference: ${traceId}`,
  });
}

// Register LAST, after all routes
app.use(errorHandler);

7. Authentication & Authorization

The Problem: Auth in Distributed Systems

Traditional web auth uses server-side sessions: client logs in, server creates a session object in memory (or a DB), returns a session ID as a cookie. On every subsequent request the server looks up the session by ID.

This breaks down in distributed systems. With N application servers behind a load balancer, you need either sticky sessions (defeats the purpose of load balancing) or a shared session store (Redis/DB — adds latency and a single point of failure on every request). JWT solves this by making the token self-contained: the server can validate it cryptographically without any backend lookup.

Auth Approaches Compared

| Approach | How It Works | Stateless? | Best For | Drawbacks |
|---|---|---|---|---|
| Session Cookie | Server stores the session and issues the session ID via Set-Cookie; the client returns it in the Cookie header | No | Traditional server-rendered apps | Shared store needed for horizontal scaling; CSRF risk |
| API Key | Static secret in header (X-API-Key) | Yes | Service-to-service, public developer APIs | No user identity; rotation is manual; leaked key = full access |
| JWT (Bearer Token) | Signed token with embedded claims in Authorization: Bearer | Yes | SPAs, mobile apps, microservices | Can't revoke individual tokens without a blocklist; payload is readable |
| OAuth 2.0 Opaque Token | Random string; resource server calls auth server to validate (token introspection) | No | When instant revocation is critical | Extra network hop per request; auth server is a bottleneck |
| mTLS (Mutual TLS) | Both client and server present X.509 certificates | Yes | Internal service mesh, zero-trust networks | Complex certificate management; not suitable for end users |

JWT Structure & Flow

A JWT is three base64url-encoded segments separated by dots: header.payload.signature.

// Header — algorithm and token type
{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "key-2026-02"        // Key ID — lets the verifier pick the right public key
}

// Payload — claims (data)
{
  "sub": "user_8f3a2b",       // Subject (user ID)
  "iss": "auth.example.com",  // Issuer
  "aud": "api.example.com",   // Audience (intended recipient)
  "exp": 1740000000,          // Expires at (Unix timestamp)
  "iat": 1739999100,          // Issued at
  "roles": ["editor"],        // Custom claim — used for RBAC
  "org_id": "org_xyz"         // Custom claim — used for tenant isolation
}

// Signature — server verifies this; tampered tokens fail
// RSASHA256(base64url(header) + "." + base64url(payload), privateKey)
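
To show how the three segments are assembled and checked, here is a minimal sketch using only the Python standard library. It uses HS256 (HMAC with a shared secret) instead of the RS256 shown above, because RSA signing needs a third-party library; the function names and the demo secret are illustrative assumptions, and production code should use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    ]
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url(signature)])


def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the payload if the signature is valid, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    # Pin the expected algorithm instead of trusting the token's own header.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))


token = sign_hs256({"sub": "user_8f3a2b", "roles": ["editor"]}, b"demo-secret")
claims = verify_hs256(token, b"demo-secret")
```

Changing any character of the header or payload segment invalidates the signature, which is what lets the resource server trust the claims without a database lookup. Note that this only proves integrity; the payload is still readable by anyone holding the token.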

Token lifecycle:

  1. Client authenticates (username/password, SSO, social login).
  2. Auth server issues an access token (short-lived, 15 min) and a refresh token (long-lived, 7–30 days, stored server-side).
  3. Client sends access token on every request: Authorization: Bearer <token>.
  4. Resource server validates signature, then checks exp, iss, aud — no DB lookup.
  5. On expiry, client uses the refresh token to get a new access token (refresh token rotation recommended).

Signing Algorithms

Algorithm | Type | How It Works | When to Use
HS256 | Symmetric (HMAC) | Same secret signs and verifies | Single-service apps; simple setups
RS256 | Asymmetric (RSA) | Private key signs; public key verifies | Microservices: only the auth service holds the private key, and all services verify with the public key via JWKS
ES256 | Asymmetric (ECDSA) | Same sign/verify model as RS256, with much smaller keys (256-bit vs 2048-bit) | Modern default: smaller signatures and tokens, faster signing, comparable security

JWKS — Key Distribution
In production, the auth server publishes its public keys at a JWKS endpoint (/.well-known/jwks.json). Resource servers fetch and cache these keys. The kid parameter in the JWT header tells the verifier which key to use, which enables key rotation without downtime.
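
A rough sketch of the fetch-and-cache behaviour described here, assuming a hypothetical fetch callable that returns the parsed JWKS document ({"keys": [...]}). The class name and the one-hour TTL are illustrative choices.

```python
import time
from typing import Callable


class JwksCache:
    """Caches a JWKS document and looks keys up by kid."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: int = 3600):
        self._fetch = fetch  # hypothetical callable returning {"keys": [...]}
        self._ttl = ttl_seconds
        self._keys: dict[str, dict] = {}
        self._loaded_at = 0.0

    def _reload(self) -> None:
        document = self._fetch()
        self._keys = {k["kid"]: k for k in document.get("keys", []) if "kid" in k}
        self._loaded_at = time.monotonic()

    def get(self, kid: str) -> dict:
        stale = time.monotonic() - self._loaded_at > self._ttl
        if not self._keys or stale or kid not in self._keys:
            # An unknown kid usually means the auth server rotated keys: refetch once
            self._reload()
        if kid not in self._keys:
            raise KeyError(f"no key with kid {kid!r}")
        return self._keys[kid]
```

Because an unknown kid triggers a refetch, a real deployment would also rate-limit reloads so that tokens with random kid values cannot hammer the auth server.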

JWT + OAuth 2.0 + OIDC

These three are frequently confused. They solve different problems at different layers:

Concept | What It Is | Analogy
JWT | A token format: a way to encode and sign claims | The passport document format
OAuth 2.0 | An authorization framework: defines how to obtain tokens (authorization code flow, client credentials, etc.) | The process for getting a passport
OIDC | An identity layer on top of OAuth 2.0: adds authentication (who you are) via ID tokens | The identity verification step within the passport process

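How they fit together: an OAuth 2.0 flow is how the client obtains tokens, OIDC adds an ID token to that flow to say who the user is, and JWT is the format used to encode the ID token (and, in many deployments, the access token).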

OAuth 2.0 Grant Types Quick Reference
Grant Type | Use Case | Flow
Authorization Code + PKCE | SPAs, mobile apps, server-side web apps | Redirect to IdP → user authenticates → redirect back with code → exchange code for tokens
Client Credentials | Service-to-service (no user) | Service sends client_id + client_secret directly → gets access token
Device Code | Smart TVs, CLI tools (no browser) | Device shows code → user enters code on phone/desktop → device polls for token
Refresh Token | Renewing expired access tokens | Send refresh token → get new access token (+ optionally new refresh token)

Deprecated: Implicit Grant
The implicit grant (returning tokens directly in URL fragments) is deprecated in OAuth 2.1. Always use Authorization Code + PKCE for browser-based apps.
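
The PKCE part of the Authorization Code flow comes down to one hash. The sketch below generates a code_verifier and derives the S256 code_challenge as RFC 7636 defines it; the function names are illustrative. The client sends the challenge with the initial redirect and the verifier with the code exchange, so an intercepted code is useless without the verifier.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(code_verifier)), without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method."""
    # 32 random bytes give a 43-character verifier (RFC 7636 allows 43 to 128)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    return verifier, challenge_for(verifier)
```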

RBAC & Authorization Models

Authentication (who are you?) and authorization (what can you do?) are separate concerns. JWT handles the transport — it carries identity and role claims. The authorization model decides how to use those claims.

Model | How It Works | Granularity | Example
RBAC (Role-Based) | Users are assigned roles; roles have permissions | Coarse (role-level) | admin can delete any order; viewer can only read
ABAC (Attribute-Based) | Policies evaluate attributes of user, resource, and environment | Fine (attribute-level) | "Allow if user.department == resource.department AND time < 18:00"
ReBAC (Relationship-Based) | Authorization based on relationships in a graph (e.g., Zanzibar/SpiceDB) | Fine (relationship-level) | "User is an editor of this document" (Google Docs model)

RBAC + JWT in Practice

Most APIs start with RBAC because it's the simplest to implement. The pattern:

  1. Auth server includes "roles": ["admin"] in the JWT payload at login time.
  2. API middleware reads the JWT, extracts roles, and checks against the endpoint's required role.
  3. For resource-level authorization (e.g., "can this user edit this order?"), you still need a DB lookup — JWT roles alone aren't enough. This is where ownership checks live.
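
Steps 2 and 3 can be sketched without any framework. The role-to-permission table, the permission strings, and the Order type below are hypothetical; the point is that the role check uses only JWT claims, while the ownership check needs a record loaded from the database.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping; real systems usually load this from config
ROLE_PERMISSIONS = {
    "admin": {"orders:read", "orders:write", "orders:delete"},
    "editor": {"orders:read", "orders:write"},
    "viewer": {"orders:read"},
}


@dataclass
class Order:
    id: str
    owner_id: str


def has_permission(claims: dict, permission: str) -> bool:
    """Step 2: coarse check using only the roles carried in the JWT."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in claims.get("roles", [])
    )


def can_edit_order(claims: dict, order: Order) -> bool:
    """Step 3: resource-level check; `order` comes from a database lookup."""
    if "admin" in claims.get("roles", []):
        return True
    return has_permission(claims, "orders:write") and order.owner_id == claims.get("sub")
```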

JWT: Issue, Validate, Refresh — Implementation

Below are production patterns for JWT validation and RBAC enforcement in all three frameworks.

JWT Anti-patterns
  • Do not store sensitive data in JWT payload — it is base64-encoded, not encrypted (use JWE for encryption).
  • Do not use long-lived access tokens. Keep them short (15 min) and use refresh tokens.
  • Always validate exp, iss, and aud claims — not just the signature.
  • Store refresh tokens in HttpOnly cookies, not localStorage (XSS protection).

Java — Spring Security JWT Resource Server

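// build.gradle: implementation 'org.springframework.boot:spring-boot-starter-oauth2-resource-server'
// Static imports assumed below:
//   import static org.springframework.http.HttpMethod.GET;
//   import static org.springframework.security.config.http.SessionCreationPolicy.STATELESS;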

@Configuration
@EnableWebSecurity
@EnableMethodSecurity
public class SecurityConfig {

    @Bean
    public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
        return http
            .csrf(AbstractHttpConfigurer::disable)          // Stateless API — no CSRF needed
            .sessionManagement(s -> s.sessionCreationPolicy(STATELESS))
            .authorizeHttpRequests(auth -> auth
                .requestMatchers("/api/v1/auth/**").permitAll()
                .requestMatchers(GET, "/api/v1/products/**").permitAll()
                .requestMatchers("/actuator/health").permitAll()
                .anyRequest().authenticated()
            )
            .oauth2ResourceServer(oauth2 -> oauth2
                .jwt(jwt -> jwt.jwtAuthenticationConverter(jwtAuthenticationConverter()))
            )
            .build();
    }

    @Bean
    public JwtAuthenticationConverter jwtAuthenticationConverter() {
        JwtGrantedAuthoritiesConverter converter = new JwtGrantedAuthoritiesConverter();
        converter.setAuthoritiesClaimName("roles");
        converter.setAuthorityPrefix("ROLE_");
        JwtAuthenticationConverter jwtConverter = new JwtAuthenticationConverter();
        jwtConverter.setJwtGrantedAuthoritiesConverter(converter);
        return jwtConverter;
    }
}

// RBAC at the method level
@RestController
@RequestMapping("/api/v1/admin")
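public class AdminController {

    // Service types are assumed to exist elsewhere in the application
    private final UserService userService;
    private final OrderService orderService;

    public AdminController(UserService userService, OrderService orderService) {
        this.userService = userService;
        this.orderService = orderService;
    }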

    @GetMapping("/users")
    @PreAuthorize("hasRole('ADMIN')")
    public ResponseEntity<List<UserDto>> listUsers() {
        return ResponseEntity.ok(userService.findAll());
    }

    // Resource-level authorization: user can only access their own data
    @GetMapping("/orders/{orderId}")
    @PreAuthorize("hasRole('ADMIN') or @orderAuthService.isOwner(#orderId, authentication)")
    public ResponseEntity<OrderDto> getOrder(@PathVariable UUID orderId) {
        return ResponseEntity.ok(orderService.findById(orderId));
    }
}

Python — FastAPI JWT middleware

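from uuid import UUID

from fastapi import Depends, HTTPException, Security, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from sqlalchemy.ext.asyncio import AsyncSession
import jwt  # PyJWT

# Assumed to exist elsewhere in the app: settings, get_db, user_service, order_service, User, router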

security = HTTPBearer()

def decode_token(token: str) -> dict:
    try:
        payload = jwt.decode(
            token,
            settings.JWT_PUBLIC_KEY,
            algorithms=["RS256"],
            audience="api.example.com",
            issuer="auth.example.com",
        )
        return payload
    except jwt.ExpiredSignatureError:
        raise HTTPException(status_code=401, detail="Token has expired")
    except jwt.InvalidTokenError as e:
        raise HTTPException(status_code=401, detail=f"Invalid token: {e}")

async def get_current_user(
    credentials: HTTPAuthorizationCredentials = Security(security),
    db: AsyncSession = Depends(get_db),
) -> User:
    payload = decode_token(credentials.credentials)
    user = await user_service.find_by_id(db, UUID(payload["sub"]))
    if not user or not user.is_active:
        raise HTTPException(status_code=401, detail="User not found or inactive")
    return user

def require_role(*roles: str):
    """Factory for role-based dependency injection."""
    async def dependency(current_user: User = Depends(get_current_user)) -> User:
        if current_user.role not in roles:
            raise HTTPException(
                status_code=403,
                detail=f"Requires role: {', '.join(roles)}"
            )
        return current_user
    return dependency

# Usage
@router.get("/admin/users", response_model=list[UserDto])
async def list_users(current_user: User = Depends(require_role("admin"))):
    return await user_service.find_all()

@router.get("/orders/{order_id}", response_model=OrderDto)
async def get_order(
    order_id: UUID,
    current_user: User = Depends(get_current_user),
    db: AsyncSession = Depends(get_db),
):
    order = await order_service.find_by_id(db, order_id)
    if order is None:
        raise HTTPException(status_code=404, detail="Order not found")
    # Resource-level auth: admins see all, users see own
    if current_user.role != "admin" and order.user_id != current_user.id:
        raise HTTPException(status_code=403, detail="Access denied")
    return order

Node.js — Express JWT middleware

import { createRemoteJWKSet, jwtVerify } from 'jose';

// Verify JWT using JWKS endpoint (production approach)
const JWKS = createRemoteJWKSet(new URL('https://auth.example.com/.well-known/jwks.json'));

export async function authenticate(req, res, next) {
  const authHeader = req.headers.authorization;
  if (!authHeader?.startsWith('Bearer ')) {
    return res.status(401).json({
      type: 'https://api.example.com/errors/unauthorized',
      title: 'Unauthorized',
      status: 401,
      detail: 'Missing or invalid Authorization header.',
    });
  }

  try {
    const token = authHeader.slice(7);
    const { payload } = await jwtVerify(token, JWKS, {
      issuer: 'auth.example.com',   // must match the token's iss claim exactly
      audience: 'api.example.com',
    });
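    // The payload example earlier carries a "roles" array; fall back to its first entry when there is no single "role" claim
    req.user = { id: payload.sub, role: payload.role ?? payload.roles?.[0], email: payload.email };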
    next();
  } catch (err) {
    const isExpired = err.code === 'ERR_JWT_EXPIRED';
    res.status(401).json({
      title: isExpired ? 'Token Expired' : 'Invalid Token',
      status: 401,
      detail: isExpired ? 'Your session has expired. Please re-authenticate.' : 'The provided token is invalid.',
    });
  }
}

// RBAC middleware factory
export function authorize(...roles) {
  return (req, res, next) => {
    if (!req.user) return res.status(401).json({ status: 401, title: 'Unauthorized' });
    if (!roles.includes(req.user.role)) {
      return res.status(403).json({
        title: 'Forbidden',
        status: 403,
        detail: `This endpoint requires one of the following roles: ${roles.join(', ')}`,
      });
    }
    next();
  };
}

// Usage
router.get('/admin/users', authenticate, authorize('admin'), listUsersHandler);
router.get('/orders/:orderId', authenticate, getOrderHandler);

// Resource-level auth inside handler
async function getOrderHandler(req, res, next) {
  try {
    const order = await orderService.findById(req.params.orderId);
    if (!order) return res.status(404).json({ status: 404, title: 'Not Found' });
    if (req.user.role !== 'admin' && order.userId !== req.user.id) {
      return res.status(403).json({ status: 403, title: 'Forbidden' });
    }
    res.json(order);
  } catch (err) { next(err); }
}

Security Headers Table

Header | Value Example | Purpose
Strict-Transport-Security | max-age=31536000; includeSubDomains | Force HTTPS for 1 year
X-Content-Type-Options | nosniff | Prevent MIME type sniffing
X-Frame-Options | DENY | Prevent clickjacking
Content-Security-Policy | default-src 'self' | Restrict resource origins
Referrer-Policy | strict-origin-when-cross-origin | Control referrer information leakage
Permissions-Policy | camera=(), microphone=() | Disable browser features not needed
Cache-Control | no-store | On auth endpoints: prevent caching tokens
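
One way to apply the table is a single helper that every response passes through. The sketch below is framework-agnostic; the function name and the "/api/v1/auth/" prefix (borrowed from the Spring configuration earlier) are assumptions. Header names are matched case-sensitively here, which a real middleware would need to normalize.

```python
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "default-src 'self'",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=()",
}


def with_security_headers(headers: dict[str, str], path: str) -> dict[str, str]:
    """Return a copy of `headers` with the defaults applied (existing values win)."""
    merged = {**SECURITY_HEADERS, **headers}
    if path.startswith("/api/v1/auth/"):
        merged["Cache-Control"] = "no-store"  # never cache token responses
    return merged
```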

8. Versioning Strategies

Strategy | Example | Pros | Cons
URL Path | /api/v1/orders | Visible, easy to route, works with every client | Version is in the URI, which REST purists argue violates resource identity
Header | Accept: application/vnd.myapp.v1+json | Clean URIs, follows HTTP spec (content negotiation) | Hard to test in browser, less visible, CDN caching requires Vary: Accept
Query Param | /api/orders?version=1 | Simple to add to existing URL | Easy to forget, clutters URLs, bad for caching
Date-based (Stripe) | Stripe-Version: 2024-12-18 | Fine-grained control; users opt in to changes on their own schedule | Complex server logic (multiple code paths per date); only realistic for large teams

Stripe's Date-Based Versioning
Stripe pins each API key to a version date at creation time. When they ship breaking changes, existing keys continue using the old behavior. Users explicitly upgrade by changing their version header. This decouples Stripe's deployment cycle from their customers' migration cycle — sophisticated but requires significant infrastructure investment.
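
A common way to keep date-based versioning manageable is to render every response in the latest shape and then undo each breaking change that shipped after the client's pinned date. The sketch below illustrates that idea only; it is not Stripe's implementation, and the change dates, field names, and function names are hypothetical.

```python
from datetime import date


def _amount_to_total(body: dict) -> dict:
    """Clients pinned before this change still expect "total" instead of "amount"."""
    body = dict(body)
    body["total"] = body.pop("amount")
    return body


def _collapse_items(body: dict) -> dict:
    """Clients pinned before this change expect item IDs, not expanded item objects."""
    body = dict(body)
    body["items"] = [item["id"] for item in body["items"]]
    return body


# Hypothetical change log: (date the breaking change shipped, how to undo it)
CHANGES = [
    (date(2024, 6, 1), _amount_to_total),
    (date(2024, 12, 18), _collapse_items),
]


def render_for_version(latest_body: dict, pinned: date) -> dict:
    """Walk changes newest-first, undoing every change newer than the pin."""
    body = latest_body
    for shipped, downgrade in sorted(CHANGES, key=lambda c: c[0], reverse=True):
        if shipped > pinned:
            body = downgrade(body)
    return body
```

Handlers then contain a single code path, and each breaking change costs one small downgrade function instead of a branch in every endpoint.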

URL Versioning Implementation

Java — Spring Boot

// Approach 1: Separate controller classes per version
@RestController
@RequestMapping("/api/v1/orders")
public class OrderControllerV1 { /* v1 implementation */ }

@RestController
@RequestMapping("/api/v2/orders")
public class OrderControllerV2 { /* v2 with breaking changes */ }

// Approach 2: Single controller, route-level versioning
@RestController
@RequestMapping("/api")
public class OrderController {

    @GetMapping("/v1/orders/{id}")
    public ResponseEntity<OrderDtoV1> getOrderV1(@PathVariable UUID id) {
        return ResponseEntity.ok(orderMapper.toDtoV1(orderService.findById(id)));
    }

    @GetMapping("/v2/orders/{id}")
    public ResponseEntity<OrderDtoV2> getOrderV2(@PathVariable UUID id) {
        // V2 adds expanded items array, deprecates "total" in favor of "amount"
        return ResponseEntity.ok(orderMapper.toDtoV2(orderService.findById(id)));
    }
}

// Deprecation header for sunset planning
@GetMapping("/v1/orders/{id}")
public ResponseEntity<OrderDtoV1> getOrderV1(@PathVariable UUID id) {
    return ResponseEntity.ok()
        .header("Deprecation", "true")
        .header("Sunset", "Sat, 31 Dec 2026 23:59:59 GMT")
        .header("Link", "</api/v2/orders>; rel=\"successor-version\"")
        .body(orderMapper.toDtoV1(orderService.findById(id)));
}

Python — FastAPI

from uuid import UUID

from fastapi import APIRouter, Depends, FastAPI, Request
from sqlalchemy.ext.asyncio import AsyncSession

app = FastAPI()

# Separate routers per version
v1_router = APIRouter(prefix="/api/v1")
v2_router = APIRouter(prefix="/api/v2")

@v1_router.get("/orders/{order_id}", response_model=OrderDtoV1)
async def get_order_v1(order_id: UUID, db: AsyncSession = Depends(get_db)):
    return await order_service.find_by_id_v1(db, order_id)

@v2_router.get("/orders/{order_id}", response_model=OrderDtoV2)
async def get_order_v2(order_id: UUID, db: AsyncSession = Depends(get_db)):
    return await order_service.find_by_id_v2(db, order_id)

app.include_router(v1_router)
app.include_router(v2_router)

# Add deprecation headers via middleware on v1 routes
@app.middleware("http")
async def deprecation_header_middleware(request: Request, call_next):
    response = await call_next(request)
    if request.url.path.startswith("/api/v1/"):
        response.headers["Deprecation"] = "true"
        response.headers["Sunset"] = "Sat, 31 Dec 2026 23:59:59 GMT"
    return response

Node.js — Express

import express, { Router } from 'express';

const app = express();

// v1 and v2 routers
const v1Router = Router();
const v2Router = Router();

// Deprecation middleware for v1
const deprecationWarning = (req, res, next) => {
  res.setHeader('Deprecation', 'true');
  res.setHeader('Sunset', 'Sat, 31 Dec 2026 23:59:59 GMT');
  res.setHeader('Link', '</api/v2>; rel="successor-version"');
  next();
};

v1Router.use(deprecationWarning);

v1Router.get('/orders/:id', async (req, res, next) => {
  try {
    const order = await orderService.findByIdV1(req.params.id);
    res.json(order);
  } catch (err) { next(err); }
});

v2Router.get('/orders/:id', async (req, res, next) => {
  try {
    const order = await orderService.findByIdV2(req.params.id);
    res.json(order);
  } catch (err) { next(err); }
});

// Mount versioned routers
app.use('/api/v1', v1Router);
app.use('/api/v2', v2Router);

9. Rate Limiting & Throttling

Rate limiting protects your API from abuse, ensures fair usage across tenants, and prevents a single client from overwhelming downstream services. The token bucket algorithm is the most practical to implement and reason about.

Token Bucket Algorithm

Each API key starts with a bucket of N tokens. Each request consumes one token. Tokens refill at a fixed rate (e.g., 100 tokens/minute) up to the bucket's capacity. Requests arriving when the bucket is empty receive a 429. Unlike a fixed window, which lets a client spend two full quotas back to back across a window boundary, a token bucket caps any burst at the bucket capacity while still enforcing the long-term rate.

Response Headers

Header | Value Example | Meaning
X-RateLimit-Limit | 1000 | Max requests per window
X-RateLimit-Remaining | 742 | Requests remaining in current window
X-RateLimit-Reset | 1708992000 | Unix timestamp when window resets
Retry-After | 47 | Seconds to wait (on 429 response)

429 Response Body

{
  "type": "https://api.example.com/errors/rate-limit-exceeded",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "You have exceeded your rate limit of 1000 requests per minute.",
  "retry_after": 47
}

Redis-Backed Rate Limiter

Java — Spring Boot (Redis + Bucket4j)

// build.gradle: implementation 'com.bucket4j:bucket4j-redis:8.7.0'

@Component
@RequiredArgsConstructor
public class RateLimitFilter extends OncePerRequestFilter {

    private final RedissonClient redissonClient;

    // Token bucket: 1000 requests per minute per key (API key, or client IP as a fallback)
    private Bucket resolveBucket(String key) {
        ProxyManager<String> proxyManager = Bucket4jRedisson.casBasedBuilder(redissonClient)
            .build();
        BucketConfiguration config = BucketConfiguration.builder()
            .addLimit(Bandwidth.classic(1000, Refill.intervally(1000, Duration.ofMinutes(1))))
            .build();
        return proxyManager.builder().build(key, () -> config);
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        String key = resolveKey(request); // API key from header, or IP fallback
        Bucket bucket = resolveBucket(key);
        ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);

        response.addHeader("X-RateLimit-Limit", "1000");
        response.addHeader("X-RateLimit-Remaining", String.valueOf(probe.getRemainingTokens()));

        if (probe.isConsumed()) {
            chain.doFilter(request, response);
        } else {
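            // Round up so Retry-After is never 0 while the bucket is still empty
            long waitSeconds = Math.max(1, (probe.getNanosToWaitForRefill() + 999_999_999L) / 1_000_000_000L);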
            response.addHeader("Retry-After", String.valueOf(waitSeconds));
            response.setStatus(429);
            response.setContentType("application/json");
            response.getWriter().write("""
                {"type":"https://api.example.com/errors/rate-limit-exceeded",
                 "status":429,"title":"Too Many Requests",
                 "retry_after":%d}""".formatted(waitSeconds));
        }
    }

    private String resolveKey(HttpServletRequest request) {
        String apiKey = request.getHeader("X-API-Key");
        return apiKey != null ? "apikey:" + apiKey : "ip:" + request.getRemoteAddr();
    }
}

Python — FastAPI (Redis + sliding window)

import redis.asyncio as redis
import time
from fastapi import Depends, HTTPException, Request

redis_client = redis.from_url("redis://localhost:6379", decode_responses=True)

def rate_limiter(limit: int = 1000, window_seconds: int = 60):
    """Sliding window rate limiter using Redis sorted sets."""
    async def _limit(request: Request):
        # Prefer API key, fall back to IP
        key = request.headers.get("X-API-Key") or request.client.host
        redis_key = f"ratelimit:{key}"
        now = time.time()
        window_start = now - window_seconds

        pipe = redis_client.pipeline()
        # Remove timestamps outside the current window
        pipe.zremrangebyscore(redis_key, 0, window_start)
        # Count requests in window
        pipe.zcard(redis_key)
        # Add current request timestamp
        pipe.zadd(redis_key, {str(now): now})
        # Set expiry to avoid orphan keys
        pipe.expire(redis_key, window_seconds * 2)
        results = await pipe.execute()

        count = results[1]
        remaining = max(0, limit - count - 1)
        reset_at = int(now) + window_seconds

        request.state.rate_limit_remaining = remaining

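        if count >= limit:
            # A slot frees up when the oldest request in the window ages out
            oldest = await redis_client.zrange(redis_key, 0, 0, withscores=True)
            retry_after = int(oldest[0][1] + window_seconds - now) + 1 if oldest else window_seconds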
            raise HTTPException(
                status_code=429,
                detail={
                    "type": "https://api.example.com/errors/rate-limit-exceeded",
                    "title": "Too Many Requests",
                    "status": 429,
                    "retry_after": max(retry_after, 1),
                },
                headers={
                    "X-RateLimit-Limit": str(limit),
                    "X-RateLimit-Remaining": "0",
                    "X-RateLimit-Reset": str(reset_at),
                    "Retry-After": str(max(retry_after, 1)),
                },
            )

    return _limit

# Usage — apply per route or globally via middleware
@router.get("/orders", dependencies=[Depends(rate_limiter(limit=100, window_seconds=60))])
async def list_orders(): ...

Node.js — Express (rate-limiter-flexible + Redis)

import { RateLimiterRedis } from 'rate-limiter-flexible';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

// Per-API-key: 1000/min; per-IP: 100/min
const apiKeyLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  useRedisPackage: true,   // needed when storeClient comes from the 'redis' (node-redis v4+) package
  keyPrefix: 'rl_apikey',
  points: 1000,        // requests
  duration: 60,        // per 60 seconds
});

const ipLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  useRedisPackage: true,   // needed when storeClient comes from the 'redis' (node-redis v4+) package
  keyPrefix: 'rl_ip',
  points: 100,
  duration: 60,
});

export async function rateLimitMiddleware(req, res, next) {
  const apiKey = req.headers['x-api-key'];
  const limiter = apiKey ? apiKeyLimiter : ipLimiter;
  const key = apiKey || req.ip;
  const limit = apiKey ? 1000 : 100;

  try {
    const result = await limiter.consume(key);
    res.setHeader('X-RateLimit-Limit', limit);
    res.setHeader('X-RateLimit-Remaining', result.remainingPoints);
    res.setHeader('X-RateLimit-Reset', Math.ceil(Date.now() / 1000 + result.msBeforeNext / 1000));
    next();
  } catch (err) {
    const retryAfter = Math.ceil(err.msBeforeNext / 1000);
    res.setHeader('X-RateLimit-Limit', limit);
    res.setHeader('X-RateLimit-Remaining', 0);
    res.setHeader('Retry-After', retryAfter);
    res.status(429).json({
      type: 'https://api.example.com/errors/rate-limit-exceeded',
      title: 'Too Many Requests',
      status: 429,
      detail: 'Rate limit exceeded.',
      retry_after: retryAfter,
    });
  }
}

10. Caching

Cache-Control & Conditional Requests

Mechanism | Header | Direction | Purpose
Cache duration | Cache-Control: max-age=300 | Response | Cache for 5 minutes
No cache | Cache-Control: no-store | Response | Never cache (use for auth endpoints)
Revalidate | Cache-Control: no-cache | Response | Cache but always validate with server
ETag | ETag: "abc123def456" | Response | Version fingerprint of the resource
Conditional GET | If-None-Match: "abc123def456" | Request | Return 304 if ETag unchanged
Last-Modified | Last-Modified: Tue, 10 Feb 2026 15:00:00 GMT | Response | Timestamp-based validation
Conditional GET | If-Modified-Since: Tue, 10 Feb 2026 15:00:00 GMT | Request | Return 304 if unchanged since timestamp

ETag vs Last-Modified
Prefer ETags for resources that may be updated multiple times per second: Last-Modified timestamps have only one-second granularity, so two updates within the same second are indistinguishable. ETags can encode arbitrary versioning, such as a hash of the content, a database row version number, or a composite of multiple fields.

ETag + Redis Caching Implementation

Java — Spring Boot

@RestController
@RequestMapping("/api/v1")
@RequiredArgsConstructor
public class ProductController {

    private final ProductService productService;
    private final RedisTemplate<String, String> redisTemplate;
    private final ObjectMapper objectMapper;   // used below to (de)serialize cached JSON

    @GetMapping("/products/{productId}")
    public ResponseEntity<ProductDto> getProduct(
            @PathVariable UUID productId,
            @RequestHeader(value = "If-None-Match", required = false) String ifNoneMatch)
            throws JsonProcessingException {

        // Try cache first
        String cacheKey = "product:" + productId;
        String cachedJson = redisTemplate.opsForValue().get(cacheKey);
        String currentEtag;

        if (cachedJson != null) {
            currentEtag = "\"" + DigestUtils.md5DigestAsHex(cachedJson.getBytes()) + "\"";
            if (currentEtag.equals(ifNoneMatch)) {
                return ResponseEntity.status(HttpStatus.NOT_MODIFIED)
                    .eTag(currentEtag)
                    .build();
            }
            return ResponseEntity.ok()
                .eTag(currentEtag)
                .cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePublic())
                .body(objectMapper.readValue(cachedJson, ProductDto.class));
        }

        // Cache miss — fetch from DB and cache
        ProductDto product = productService.findById(productId);
        String json = objectMapper.writeValueAsString(product);
        redisTemplate.opsForValue().set(cacheKey, json, Duration.ofMinutes(5));
        currentEtag = "\"" + DigestUtils.md5DigestAsHex(json.getBytes()) + "\"";

        return ResponseEntity.ok()
            .eTag(currentEtag)
            .cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePublic())
            .body(product);
    }

    // Invalidate cache on write
    @PutMapping("/products/{productId}")
    public ResponseEntity<ProductDto> updateProduct(
            @PathVariable UUID productId,
            @Valid @RequestBody UpdateProductRequest request) {
        ProductDto updated = productService.update(productId, request);
        redisTemplate.delete("product:" + productId);
        return ResponseEntity.ok(updated);
    }
}

Python — FastAPI

import hashlib
from uuid import UUID

from fastapi import Depends, HTTPException, Request, Response
from redis.asyncio import Redis
from sqlalchemy.ext.asyncio import AsyncSession

@router.get("/products/{product_id}", response_model=ProductDto)
async def get_product(
    product_id: UUID,
    request: Request,
    response: Response,
    redis: Redis = Depends(get_redis),
    db: AsyncSession = Depends(get_db),
):
    cache_key = f"product:{product_id}"
    cached = await redis.get(cache_key)

    if cached:
        etag = f'"{hashlib.md5(cached).hexdigest()}"'
        if request.headers.get("if-none-match") == etag:
            return Response(status_code=304, headers={"ETag": etag})
        response.headers["ETag"] = etag
        response.headers["Cache-Control"] = "public, max-age=300"
        return ProductDto.model_validate_json(cached)

    product = await product_service.find_by_id(db, product_id)
    if not product:
        raise HTTPException(status_code=404, detail="Product not found")

    product_json = product.model_dump_json().encode()
    await redis.setex(cache_key, 300, product_json)  # 5 min TTL

    etag = f'"{hashlib.md5(product_json).hexdigest()}"'
    response.headers["ETag"] = etag
    response.headers["Cache-Control"] = "public, max-age=300"
    return product

@router.put("/products/{product_id}", response_model=ProductDto)
async def update_product(
    product_id: UUID,
    request: UpdateProductRequest,
    redis: Redis = Depends(get_redis),
    db: AsyncSession = Depends(get_db),
):
    updated = await product_service.update(db, product_id, request)
    await redis.delete(f"product:{product_id}")  # Invalidate cache
    return updated

Node.js — Express

import crypto from 'crypto';

router.get('/products/:productId', async (req, res, next) => {
  try {
    const cacheKey = `product:${req.params.productId}`;
    const cached = await redis.get(cacheKey);

    if (cached) {
      const etag = `"${crypto.createHash('md5').update(cached).digest('hex')}"`;
      if (req.headers['if-none-match'] === etag) {
        return res.status(304).setHeader('ETag', etag).send();
      }
      return res
        .setHeader('ETag', etag)
        .setHeader('Cache-Control', 'public, max-age=300')
        .json(JSON.parse(cached));
    }

    const product = await productService.findById(req.params.productId);
    if (!product) return res.status(404).json({ status: 404, title: 'Not Found' });

    const json = JSON.stringify(product);
    await redis.setEx(cacheKey, 300, json); // 5 min TTL

    const etag = `"${crypto.createHash('md5').update(json).digest('hex')}"`;
    res
      .setHeader('ETag', etag)
      .setHeader('Cache-Control', 'public, max-age=300')
      .json(product);
  } catch (err) { next(err); }
});

11. File Uploads & Streaming

Multipart Upload

Java — Spring Boot

@RestController
@RequestMapping("/api/v1")
public class UploadController {

    // application.yml: spring.servlet.multipart.max-file-size=10MB
    @PostMapping(value = "/products/{productId}/images",
                 consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public ResponseEntity<ImageDto> uploadImage(
            @PathVariable UUID productId,
            @RequestParam("file") MultipartFile file,
            @RequestParam(value = "alt", required = false) String altText) throws IOException {

        // Validate MIME type — never trust Content-Type header alone
        String contentType = file.getContentType();
        if (!List.of("image/jpeg", "image/png", "image/webp").contains(contentType)) {
            throw new ValidationException("Unsupported image type: " + contentType);
        }
        if (file.getSize() > 10 * 1024 * 1024) { // 10 MB
            throw new ValidationException("File exceeds maximum size of 10MB");
        }

        // Validate actual file header (magic bytes); read only the first 8 bytes, not the whole upload
        byte[] header;
        try (InputStream in = file.getInputStream()) {
            header = in.readNBytes(8);
        }
        if (!isSupportedImageHeader(header)) {
            throw new ValidationException("File content does not match declared type");
        }

        ImageDto result = imageService.upload(productId, file.getInputStream(), contentType, altText);
        return ResponseEntity.status(201).body(result);
    }
}

Python — FastAPI

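from uuid import UUID

from fastapi import Depends, File, Form, HTTPException, UploadFile, status
from sqlalchemy.ext.asyncio import AsyncSession
import magic  # python-magic library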

ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_SIZE = 10 * 1024 * 1024  # 10 MB

@router.post("/products/{product_id}/images",
             response_model=ImageDto,
             status_code=status.HTTP_201_CREATED)
async def upload_image(
    product_id: UUID,
    file: UploadFile = File(...),
    alt: str | None = Form(None),
    db: AsyncSession = Depends(get_db),
    current_user: User = Depends(get_current_user),
):
    # Read first chunk to detect MIME via magic bytes
    header_bytes = await file.read(1024)
    await file.seek(0)  # Reset for subsequent read

    mime = magic.from_buffer(header_bytes, mime=True)
    if mime not in ALLOWED_TYPES:
        raise HTTPException(status_code=415, detail=f"Unsupported media type: {mime}")

    # Enforce the size limit; reading at most MAX_SIZE + 1 bytes bounds memory use
    contents = await file.read(MAX_SIZE + 1)
    if len(contents) > MAX_SIZE:
        raise HTTPException(status_code=413, detail="File exceeds 10MB limit")

    return await image_service.upload(db, product_id, contents, mime, alt)

Presigned URLs (S3 Pattern)

For large uploads, avoid routing files through your application server. Instead, issue a short-lived presigned URL directly to the S3/GCS bucket. The client uploads directly to object storage, and your server only handles metadata.

# FastAPI presigned upload flow
from uuid import UUID, uuid4

import boto3
from fastapi import Depends, Query
from sqlalchemy.ext.asyncio import AsyncSession

s3_client = boto3.client("s3", region_name="us-east-1")

@router.post("/products/{product_id}/image-upload-url")
async def create_upload_url(
    product_id: UUID,
    content_type: str = Query(..., pattern="^image/(jpeg|png|webp)$"),
    current_user: User = Depends(get_current_user),
    db: AsyncSession = Depends(get_db),
):
    key = f"products/{product_id}/images/{uuid4()}"
    # A presigned PUT cannot enforce a size range; if you need one, use
    # generate_presigned_post with a ["content-length-range", min, max] condition.
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={
            "Bucket": settings.S3_BUCKET,
            "Key": key,
            "ContentType": content_type,
        },
        ExpiresIn=300,  # 5 minutes
    )
    # Record pending upload in DB, confirm once client POSTs back
    await image_service.create_pending(db, product_id, key, current_user.id)
    return {"upload_url": url, "key": key, "expires_in": 300}

Server-Sent Events (SSE) & NDJSON Streaming

# FastAPI — streaming NDJSON for large exports
from fastapi.responses import StreamingResponse
import json

@router.get("/orders/export")
async def export_orders(
    current_user: User = Depends(require_role("admin")),
    db: AsyncSession = Depends(get_db),
):
    async def generate():
        async for order in order_service.stream_all(db):
            # mode="json" converts UUIDs and datetimes to JSON-safe strings first
            yield json.dumps(order.model_dump(mode="json")) + "\n"

    return StreamingResponse(
        generate(),
        media_type="application/x-ndjson",
        headers={"Content-Disposition": "attachment; filename=orders.ndjson"},
    )

# SSE — real-time order status updates
@router.get("/orders/{order_id}/events")
async def order_events(
    order_id: UUID,
    request: Request,
    current_user: User = Depends(get_current_user),
):
    async def event_stream():
        async for event in order_service.subscribe(order_id):
            if await request.is_disconnected():
                break
            yield f"data: {json.dumps(event)}\n\n"

    return StreamingResponse(event_stream(), media_type="text/event-stream")

12. API Documentation (OpenAPI)

OpenAPI 3.1 Spec Structure

openapi: '3.1.0'
info:
  title: Orders API
  version: '1.0.0'
  description: |
    Manages the order lifecycle from placement to fulfillment.
    All timestamps are ISO 8601 UTC. Monetary values are integers in minor currency units (cents).
  contact:
    email: [email protected]

servers:
  - url: https://api.example.com/api/v1
    description: Production

security:
  - bearerAuth: []

paths:
  /orders/{orderId}:
    get:
      summary: Get an order by ID
      operationId: getOrder
      tags: [Orders]
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        '200':
          description: Order found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OrderDto'
        '404':
          $ref: '#/components/responses/NotFound'
        '401':
          $ref: '#/components/responses/Unauthorized'

components:
  schemas:
    OrderDto:
      type: object
      required: [id, status, total, currency, created_at]
      properties:
        id:
          type: string
          format: uuid
        status:
          type: string
          enum: [pending, processing, shipped, delivered, cancelled]
        total:
          type: integer
          description: Total in minor currency units (e.g., cents)
          example: 9999
        currency:
          type: string
          pattern: '^[A-Z]{3}$'
          example: USD
        created_at:
          type: string
          format: date-time

  responses:
    NotFound:
      description: Resource not found
      content:
        application/problem+json:
          schema:
            $ref: '#/components/schemas/ProblemDetail'
    Unauthorized:
      description: Missing or invalid authentication

  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT

Auto-Generation per Framework

Framework | Library | Docs URL | Notes
Spring Boot 3 | springdoc-openapi-starter-webmvc-ui | /swagger-ui.html, /v3/api-docs | Use @Operation, @ApiResponse annotations to enrich docs
FastAPI | Built-in (Pydantic + FastAPI) | /docs (Swagger), /redoc, /openapi.json | Pydantic models auto-generate schemas; docstrings become descriptions
Express | swagger-jsdoc + swagger-ui-express | /api-docs | Write spec in JSDoc comments above routes; manual but flexible

FastAPI docs in production
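Disable the interactive docs and the schema endpoint in production (app = FastAPI(docs_url=None, redoc_url=None, openapi_url=None)) or protect them behind auth. Setting only docs_url and redoc_url to None still leaves /openapi.json served, and that endpoint reveals your full API surface to anyone.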

13. Testing APIs

Testing Pyramid for APIs

Java — Spring Boot (MockMvc + Testcontainers)

@SpringBootTest
@AutoConfigureMockMvc
@Testcontainers
class OrderControllerIT {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16")
        .withDatabaseName("testdb")
        .withUsername("test")
        .withPassword("test");

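    @DynamicPropertySource
    static void overrideDataSource(DynamicPropertyRegistry registry) {
        // Point Spring at the container, including the credentials configured above
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }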

    @Autowired
    MockMvc mockMvc;

    @Autowired
    ObjectMapper objectMapper;

    @Test
    void createOrder_returnsCreated_withLocationHeader() throws Exception {
        var request = new CreateOrderRequest(
            List.of(new CreateOrderItemRequest(UUID.randomUUID(), 2)),
            new ShippingAddressRequest("123 Main St", "New York", "NY", "10001", "US"),
            "USD"
        );

        mockMvc.perform(post("/api/v1/orders")
                .contentType(MediaType.APPLICATION_JSON)
                .content(objectMapper.writeValueAsString(request))
                .header("Authorization", "Bearer " + validJwt()))
            .andExpect(status().isCreated())
            .andExpect(header().exists("Location"))
            .andExpect(jsonPath("$.status").value("pending"))
            .andExpect(jsonPath("$.currency").value("USD"));
    }

    @Test
    void createOrder_returnsUnprocessableEntity_whenItemsIsEmpty() throws Exception {
        var request = Map.of("items", List.of(), "currency", "USD");

        mockMvc.perform(post("/api/v1/orders")
                .contentType(MediaType.APPLICATION_JSON)
                .content(objectMapper.writeValueAsString(request))
                .header("Authorization", "Bearer " + validJwt()))
            .andExpect(status().isUnprocessableEntity())
            .andExpect(jsonPath("$.status").value(422))
            .andExpect(jsonPath("$.errors[0].field").value("items"));
    }

    @Test
    void getOrder_returns401_whenNoToken() throws Exception {
        mockMvc.perform(get("/api/v1/orders/{id}", UUID.randomUUID()))
            .andExpect(status().isUnauthorized());
    }

    @Test
    void getOrder_returns403_whenAccessingOtherUsersOrder() throws Exception {
        UUID otherUserId = UUID.randomUUID();
        Order order = createOrderForUser(otherUserId);

        mockMvc.perform(get("/api/v1/orders/{id}", order.getId())
                .header("Authorization", "Bearer " + validJwtForUser(UUID.randomUUID())))
            .andExpect(status().isForbidden());
    }
}

Python — FastAPI (pytest + httpx)

from uuid import uuid4

import pytest
import pytest_asyncio
from httpx import AsyncClient, ASGITransport
from app.main import app

@pytest_asyncio.fixture
async def client(db_session):
    """Each test gets a fresh HTTP client with a transaction that rolls back."""
    async with AsyncClient(transport=ASGITransport(app=app), base_url="http://test") as c:
        yield c

@pytest.mark.asyncio
async def test_create_order_returns_201_with_location(client, auth_headers):
    payload = {
        "items": [{"product_id": str(uuid4()), "quantity": 2}],
        "shipping_address": {
            "street": "123 Main St", "city": "New York",
            "state": "NY", "zip": "10001", "country": "US"
        },
        "currency": "USD",
    }
    response = await client.post("/api/v1/orders", json=payload, headers=auth_headers)
    assert response.status_code == 201
    assert "location" in response.headers
    data = response.json()
    assert data["status"] == "pending"
    assert data["currency"] == "USD"

@pytest.mark.asyncio
async def test_create_order_422_when_items_empty(client, auth_headers):
    payload = {"items": [], "currency": "USD"}
    response = await client.post("/api/v1/orders", json=payload, headers=auth_headers)
    assert response.status_code == 422
    errors = response.json()["errors"]
    assert any(e["field"] == "items" for e in errors)

@pytest.mark.asyncio
async def test_pagination_cursor_is_stable(client, auth_headers, seed_orders):
    """Cursor must not skip or duplicate across pages."""
    seen_ids = set()
    cursor = None
    page_count = 0

    while True:
        params = {"size": 10}
        if cursor:
            params["cursor"] = cursor
        response = await client.get("/api/v1/orders", params=params, headers=auth_headers)
        assert response.status_code == 200
        body = response.json()
        for order in body["data"]:
            assert order["id"] not in seen_ids, "Duplicate order in paginated results"
            seen_ids.add(order["id"])
        cursor = body.get("next_cursor")
        page_count += 1
        if not body["has_more"]:
            break

    assert len(seen_ids) == len(seed_orders)

Node.js — Express (supertest + jest)

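import crypto from 'node:crypto'; // explicit import; the global `crypto` is not available on every Node/Jest setup
import request from 'supertest';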
import app from '../src/app.js';
import { createTestUser, generateToken, seedOrders } from './helpers.js';

describe('POST /api/v1/orders', () => {
  let token;
  let userId;

  beforeEach(async () => {
    const user = await createTestUser();
    userId = user.id;
    token = generateToken(user);
  });

  it('returns 201 with Location header on valid request', async () => {
    const payload = {
      items: [{ productId: crypto.randomUUID(), quantity: 2 }],
      shippingAddress: { street: '123 Main St', city: 'New York', state: 'NY', zip: '10001', country: 'US' },
      currency: 'USD',
    };

    const res = await request(app)
      .post('/api/v1/orders')
      .set('Authorization', `Bearer ${token}`)
      .send(payload);

    expect(res.status).toBe(201);
    expect(res.headers.location).toMatch(/^\/api\/v1\/orders\/.+/);
    expect(res.body.status).toBe('pending');
    expect(res.body.currency).toBe('USD');
  });

  it('returns 422 when items array is empty', async () => {
    const res = await request(app)
      .post('/api/v1/orders')
      .set('Authorization', `Bearer ${token}`)
      .send({ items: [], currency: 'USD' });

    expect(res.status).toBe(422);
    expect(res.body.errors.some(e => e.field === 'items')).toBe(true);
  });

  it('returns 401 when Authorization header is missing', async () => {
    const res = await request(app).post('/api/v1/orders').send({ items: [] });
    expect(res.status).toBe(401);
  });
});

describe('GET /api/v1/orders (rate limiting)', () => {
  it('returns 429 after exceeding rate limit', async () => {
    const token = generateToken(await createTestUser());
    // Exhaust limit
    const requests = Array.from({ length: 101 }, () =>
      request(app).get('/api/v1/orders').set('Authorization', `Bearer ${token}`)
    );
    const responses = await Promise.all(requests);
    const tooMany = responses.filter(r => r.status === 429);
    expect(tooMany.length).toBeGreaterThan(0);
    expect(tooMany[0].headers['retry-after']).toBeDefined();
  });
});

14. Performance & Production

Connection Pooling

Framework | Library | Key Settings
Spring Boot | HikariCP (default) | spring.datasource.hikari.maximum-pool-size=20, minimum-idle=5, connection-timeout=3000
FastAPI | asyncpg (via SQLAlchemy async) | pool_size=20, max_overflow=10, pool_pre_ping=True, pool_recycle=300
Express | node-postgres (pg Pool) | max: 20, idleTimeoutMillis: 30000, connectionTimeoutMillis: 3000

N+1 Query Prevention
When returning a list of orders with items, a naïve implementation runs one query to fetch the orders, then one more per order to fetch its items: N+1 queries for N orders. Fix: load the items eagerly, with JOIN FETCH (JPA, one joined query) or selectinload (SQLAlchemy, one extra SELECT ... IN query covering all orders), or use a DataLoader pattern for GraphQL. Always check query logs under realistic load.
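
To make the query counts concrete, here is a minimal sketch using the standard-library sqlite3 module. The table layout and the helper names (make_db, load_naive, load_joined) are assumptions made for illustration; they are not part of any framework mentioned above.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database: 3 orders with 2 items each (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE order_items (
            id INTEGER PRIMARY KEY,
            order_id INTEGER REFERENCES orders(id),
            sku TEXT
        );
        """
    )
    for order_id in (1, 2, 3):
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'pending')", (order_id,))
        for n in (1, 2):
            conn.execute(
                "INSERT INTO order_items (order_id, sku) VALUES (?, ?)",
                (order_id, f"sku-{order_id}-{n}"),
            )
    return conn


def load_naive(conn: sqlite3.Connection):
    """N+1 pattern: one query for the orders, then one more query per order."""
    queries = 1
    result = {}
    for (order_id,) in conn.execute("SELECT id FROM orders ORDER BY id").fetchall():
        rows = conn.execute(
            "SELECT sku FROM order_items WHERE order_id = ? ORDER BY id", (order_id,)
        ).fetchall()
        queries += 1
        result[order_id] = [sku for (sku,) in rows]
    return result, queries


def load_joined(conn: sqlite3.Connection):
    """Eager loading: a single LEFT JOIN returns every order together with its items."""
    rows = conn.execute(
        """
        SELECT o.id, i.sku
        FROM orders AS o
        LEFT JOIN order_items AS i ON i.order_id = o.id
        ORDER BY o.id, i.id
        """
    ).fetchall()
    result = {}
    for order_id, sku in rows:
        result.setdefault(order_id, [])
        if sku is not None:  # orders without items still appear, with an empty list
            result[order_id].append(sku)
    return result, 1
```

Both loaders return the same data; only the number of round trips differs, which is what shows up in query logs under load.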

Health Checks & Graceful Shutdown

Java — Spring Boot (Actuator)

// build.gradle: implementation 'org.springframework.boot:spring-boot-starter-actuator'
// application.yml:
// management.endpoints.web.exposure.include: health,info,metrics,prometheus
// management.endpoint.health.show-details: when-authorized
// management.health.db.enabled: true
// management.health.redis.enabled: true

// Custom health indicator
@Component
public class PaymentServiceHealthIndicator implements HealthIndicator {

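    private final PaymentClient paymentClient;

    // Constructor injection; required because the field is final
    public PaymentServiceHealthIndicator(PaymentClient paymentClient) {
        this.paymentClient = paymentClient;
    }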

    @Override
    public Health health() {
        try {
            boolean ok = paymentClient.ping();
            return ok ? Health.up().withDetail("provider", "stripe").build()
                      : Health.down().withDetail("reason", "ping failed").build();
        } catch (Exception e) {
            return Health.down().withException(e).build();
        }
    }
}

// Graceful shutdown — built-in with Spring Boot 2.3+
// application.yml:
// server.shutdown: graceful
// spring.lifecycle.timeout-per-shutdown-phase: 30s

Python — FastAPI

from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app: FastAPI):
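    # Startup: fail fast if the database is unreachable.
    # Assumes db_engine is a SQLAlchemy AsyncEngine; the connection goes back to the pool on exit.
    async with db_engine.connect() as conn:
        await conn.execute(text("SELECT 1"))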
    await redis_client.ping()
    logger.info("Application started")
    yield
    # Shutdown: runs once Uvicorn has finished in-flight requests
    # (the wait is bounded by --timeout-graceful-shutdown, if set)
    await db_engine.dispose()
    await redis_client.close()
    logger.info("Application shut down cleanly")

app = FastAPI(lifespan=lifespan)

@app.get("/health")
async def health(db: AsyncSession = Depends(get_db)):
    checks = {}
    try:
        await db.execute(text("SELECT 1"))
        checks["db"] = "ok"
    except Exception as e:
        checks["db"] = f"error: {e}"

    try:
        await redis_client.ping()
        checks["redis"] = "ok"
    except Exception as e:
        checks["redis"] = f"error: {e}"

    status = "healthy" if all(v == "ok" for v in checks.values()) else "degraded"
    code = 200 if status == "healthy" else 503
    return JSONResponse(status_code=code, content={"status": status, "checks": checks})

@app.get("/ready")
async def ready():
    """Kubernetes readiness probe — only pass when ready to serve traffic."""
    return {"ready": True}

Node.js — Express

// Health check routes
app.get('/health', async (req, res) => {
  const checks = {};

  try {
    await db.query('SELECT 1');
    checks.db = 'ok';
  } catch (err) {
    checks.db = `error: ${err.message}`;
  }

  try {
    await redis.ping();
    checks.redis = 'ok';
  } catch (err) {
    checks.redis = `error: ${err.message}`;
  }

  const healthy = Object.values(checks).every(v => v === 'ok');
  res.status(healthy ? 200 : 503).json({ status: healthy ? 'healthy' : 'degraded', checks });
});

app.get('/ready', (req, res) => res.json({ ready: true }));

// Graceful shutdown
const server = app.listen(PORT, () => logger.info(`Listening on :${PORT}`));

process.on('SIGTERM', () => {
  logger.info('SIGTERM received, draining connections...');
  server.close(async () => {
    await db.end();
    await redis.quit();
    logger.info('Server shut down cleanly');
    process.exit(0);
  });
  // Force exit after 30 seconds if not drained
  setTimeout(() => process.exit(1), 30_000);
});

Compression Middleware

// Spring Boot: response compression is off by default; enable it explicitly
// application.yml:
// server.compression.enabled: true
// server.compression.min-response-size: 2048
// server.compression.mime-types: application/json,text/plain

# FastAPI: built-in gzip middleware (Brotli requires a third-party package)
from fastapi.middleware.gzip import GZipMiddleware
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=5)

// Express
import compression from 'compression';
app.use(compression({ threshold: 1024 })); // Compress responses > 1KB

API Deployment (AWS)

Deployment architecture differs significantly depending on whether your API is internal (service-to-service) or public-facing (clients on the internet).

Internal APIs (Service-to-Service)

# Typical path:
# Service A → ALB (internal) → ECS Fargate / EKS pods (private subnet)
#
# Key decisions:
# - ALB (internal, scheme=internal) — stays within VPC, no public IP
# - Service discovery via AWS Cloud Map or internal DNS (service.local)
# - VPC Endpoints for AWS services (S3, DynamoDB, Secrets Manager)
#   → traffic stays on AWS backbone, never hits public internet
# - Security groups: allow only specific source SGs, not CIDR blocks
# - No API Gateway needed — saves cost and latency for internal traffic
# - mTLS via service mesh (App Mesh / Istio) for zero-trust

Public APIs (Internet-Facing)

# Two common patterns:
#
# Pattern A: API Gateway (managed)
# Client → Route 53 → API Gateway → VPC Link → ALB (internal) → ECS/EKS
#   Pros: managed throttling, API keys, usage plans, request validation, WAF integration
#   Cons: $3.50/million requests + data transfer; adds ~10-30ms latency
#   Best for: public developer APIs with usage plans and metering
#
# Pattern B: ALB + CloudFront (self-managed)
# Client → Route 53 → CloudFront → ALB (public) → ECS/EKS
#   Pros: lower cost at high volume, full control, global edge caching
#   Cons: you implement rate limiting / API keys yourself (or use WAF)
#   Best for: high-traffic first-party products (SPA backends, mobile APIs)
#
# Both patterns:
# - ACM certificate on the edge (CloudFront or API GW) + ALB
# - WAF rules: rate limiting, geo-blocking, SQL injection, XSS detection
# - Secrets in Secrets Manager or SSM Parameter Store (never plaintext env vars in the task definition)

Compute Options

Option | Cold Start | Scaling | Cost Model | Best For
ECS Fargate | None (always-on) | Target tracking on CPU/memory/ALB requests | Per vCPU-hour + memory-hour | Most APIs: predictable latency, simple ops
EKS | None | HPA + Cluster Autoscaler / Karpenter | EC2 instances + $0.10/hr control plane | Large-scale microservices; team already on K8s
Lambda | 100ms–2s (depends on runtime + VPC) | Automatic (1000 concurrent default) | Per-invocation + duration | Sporadic traffic, event-driven, cost-sensitive
App Runner | None (min 1 instance) or ~2s (scale to zero) | Automatic based on concurrency | Per vCPU-hour + memory-hour | Simple APIs that don't need VPC features

Production Checklist

  • Networking: API in private subnets, ALB in public subnets, NAT Gateway for outbound, VPC endpoints for AWS services
  • DNS: Route 53 alias record → ALB/CloudFront; health-check-based failover for multi-region
  • TLS: ACM certificates on ALB + CloudFront; enforce TLS 1.2+ minimum; HSTS header
  • Secrets: Secrets Manager with automatic rotation; injected as env vars at task start (not baked into image)
  • Observability: Structured JSON logs → CloudWatch Logs; X-Ray or OpenTelemetry for distributed tracing; CloudWatch alarms on p99 latency, 5xx rate, and error budget
  • CI/CD: CodePipeline or GitHub Actions → ECR image push → ECS rolling update (min healthy 100%, max 200%) or blue/green with CodeDeploy
  • Database: RDS in private subnet with Multi-AZ; connection via IAM auth or Secrets Manager; RDS Proxy for Lambda (connection pooling)
  • Rate limiting: WAF rate rules (IP-based) at the edge; application-level token bucket (Redis) for per-user limits
  • DDoS: Shield Standard (free, automatic) protects against L3/L4; Shield Advanced for L7 + cost protection if needed
  • Multi-region: Route 53 latency-based routing → regional ALBs; DynamoDB Global Tables or Aurora Global Database for data replication
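
The application-level token bucket mentioned in the rate-limiting item can be sketched as follows. This is an in-process illustration only: the checklist calls for Redis so that limits are shared across instances, whereas this version keeps state in a local dict. The class names (TokenBucket, PerUserRateLimiter) and the explicit `now` parameter are assumptions made so the behavior is deterministic and easy to test.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """A single bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: int, refill_per_second: float, now: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated_at = now

    def try_consume(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerUserRateLimiter:
    """Keeps one bucket per user ID (in memory; a shared store is needed across instances)."""

    def __init__(self, capacity: int, refill_per_second: float) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        bucket = self._buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_second, current)
            self._buckets[user_id] = bucket
        return bucket.try_consume(current)
```

A request for which allow() returns False would map to a 429 response with a Retry-After header.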

Cost Tip: API Gateway vs ALB
At 100M requests/month, API Gateway costs ~$350 in request charges alone. An ALB costs ~$25/month (fixed) + a few dollars in LCU charges. Use API Gateway when you need its managed features (API keys, usage plans, request validation, WebSocket support). Use ALB when your API is a backend for your own app and you handle auth/throttling at the application layer.
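
The arithmetic behind this comparison can be written out as a rough sketch. The $3.50 per million requests and the ~$25 fixed ALB figure come from the text above; the $5 LCU value is a placeholder assumption standing in for "a few dollars", and the function names are hypothetical. Real bills also include data transfer and vary by region.

```python
def api_gateway_request_cost(requests_per_month: int, price_per_million: float = 3.50) -> float:
    """Request charges only, using the per-million price quoted above."""
    return requests_per_month / 1_000_000 * price_per_million


def alb_monthly_cost(fixed_monthly: float = 25.0, lcu_monthly: float = 5.0) -> float:
    """Fixed ALB charge plus an assumed LCU charge (lcu_monthly is a placeholder)."""
    return fixed_monthly + lcu_monthly


def break_even_requests(alb_cost: float, price_per_million: float = 3.50) -> float:
    """Monthly request volume at which API Gateway request charges equal the ALB cost."""
    return alb_cost / price_per_million * 1_000_000


gateway = api_gateway_request_cost(100_000_000)
alb = alb_monthly_cost()
print(f"API Gateway: ${gateway:.2f}/month, ALB: ${alb:.2f}/month")
print(f"Break-even at about {break_even_requests(alb):,.0f} requests/month")
```

Under these assumptions the ALB is cheaper once traffic exceeds roughly 8.6M requests per month, which is why the managed features, not the price, should drive the choice of API Gateway.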

15. Security Checklist

CORS Configuration

Java — Spring Boot

@Configuration
public class CorsConfig {
    @Bean
    public CorsConfigurationSource corsConfigurationSource() {
        CorsConfiguration config = new CorsConfiguration();
        // Never use * in production — enumerate allowed origins
        config.setAllowedOrigins(List.of("https://app.example.com", "https://admin.example.com"));
        config.setAllowedMethods(List.of("GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"));
        config.setAllowedHeaders(List.of("Authorization", "Content-Type", "X-API-Key"));
        // Needed only when the browser must send cookies; a manually set
        // Authorization header just has to be listed in allowedHeaders
        config.setAllowCredentials(true);
        config.setMaxAge(3600L);           // Cache preflight for 1 hour

        UrlBasedCorsConfigurationSource source = new UrlBasedCorsConfigurationSource();
        source.registerCorsConfiguration("/api/**", config);
        return source;
    }
}

Python — FastAPI

from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com", "https://admin.example.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
    allow_headers=["Authorization", "Content-Type", "X-API-Key"],
    max_age=3600,
)

Node.js — Express (cors + helmet)

import cors from 'cors';
import helmet from 'helmet';

// Helmet sets secure headers in one call
app.use(helmet({
  crossOriginEmbedderPolicy: false, // Adjust for your CDN/iframe needs
}));

app.use(cors({
  origin: (origin, callback) => {
    const allowed = ['https://app.example.com', 'https://admin.example.com'];
    if (!origin || allowed.includes(origin)) {
      callback(null, true);
    } else {
      callback(new Error('Not allowed by CORS'));
    }
  },
  credentials: true,
  methods: ['GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
  allowedHeaders: ['Authorization', 'Content-Type', 'X-API-Key'],
  maxAge: 3600,
}));

Security Checklist Table

Category | Check | How to Enforce
Transport | HTTPS everywhere | HSTS header, redirect HTTP → HTTPS at load balancer
Auth | Short-lived JWT access tokens (15 min) | Set exp claim, validate on every request
Auth | Refresh tokens in HttpOnly cookies | Never in localStorage; set Secure; SameSite=Strict
Auth | No secrets in JWT payload | Code review; consider JWE for sensitive claims
Input | Validate all inputs at boundary | Bean Validation / Pydantic / Zod before service layer
Input | Parameterized queries only | ORM / prepared statements; never string interpolation in SQL
Input | Request size limits | spring.servlet.multipart.max-request-size, client_max_body_size in nginx
Headers | Security headers on all responses | Helmet (Node), Spring Security defaults, FastAPI middleware
Headers | CORS restricted to known origins | Enumerate allowed origins; never * with credentials
Rate Limits | Per-key and per-IP limits | Redis-backed token bucket at API gateway or middleware
Errors | No stack traces in responses | Global error handler sanitizes output; log internally with trace ID
Logging | No PII in logs | Redact email, phone, SSN in log formatters; audit log pipeline
Dependencies | Vulnerability scanning | OWASP Dependency-Check (Maven), pip-audit, npm audit in CI
Authorization | Resource-level checks (not just role) | Verify resource.ownerId == currentUser.id in service layer
Files | Validate MIME via magic bytes | Use apache-tika / python-magic; never trust Content-Type
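
As an illustration of the last row, the sketch below checks the leading bytes of an upload against the well-known PNG, JPEG, and WebP signatures using only the standard library. It is a simplified stand-in for python-magic or Apache Tika, which recognize far more formats; the function names are hypothetical.

```python
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def sniff_image_mime(header: bytes) -> Optional[str]:
    """Return the MIME type implied by the file's first bytes, or None if unrecognized."""
    if header.startswith(PNG_SIGNATURE):
        return "image/png"
    if header.startswith(JPEG_SIGNATURE):
        return "image/jpeg"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP"
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    return None


def is_allowed_upload(header: bytes, declared_content_type: str) -> bool:
    """Accept only when the sniffed type is known and matches the client's Content-Type."""
    detected = sniff_image_mime(header)
    return detected is not None and detected == declared_content_type
```

A script uploaded with Content-Type: image/png fails this check because its first bytes do not match the PNG signature.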

16. Interview Quick Reference

"Design a REST API for X" Framework

When asked to design an API in an interview, walk through these steps explicitly — the interviewer wants to see your thought process, not just endpoint names.

  1. Clarify scope: Who are the consumers (mobile app, third-party, internal service)? Read or write heavy? Estimated scale?
  2. Identify resources: Nouns from the domain. For an e-commerce system: User, Product, Order, OrderItem, Payment, Shipment.
  3. Define relationships: An Order belongs to a User, has many OrderItems. One Payment per Order. One Shipment per fulfilled Order.
  4. Define endpoints: CRUD for each resource, plus actions that don't fit CRUD (use sub-resource nouns: /orders/{id}/cancellation).
  5. Design request/response schemas: Fields, types, required vs optional, nested vs flat IDs.
  6. Address cross-cutting concerns: Authentication, authorization, pagination, rate limiting, versioning, error format.
  7. Identify edge cases: Idempotency for payments, optimistic locking for inventory, eventual consistency if distributed.

Common Interview Questions

Q: What's the difference between PUT and PATCH?

PUT replaces the entire resource. If you omit a field, it is set to null/default. The request body must represent the complete desired state. PUT is idempotent: calling it N times produces the same result.

PATCH applies a partial update: only the fields you include are changed. PATCH is not inherently idempotent (e.g., {"increment_stock": 5} would increase stock on each call). To keep a PATCH idempotent, send absolute values rather than relative operations, for example a JSON Patch document such as [{"op": "replace", "path": "/status", "value": "shipped"}].

Production choice: Prefer PATCH for most update operations because it avoids clients needing to know the full current state. Use PUT when replacing entire documents (e.g., replacing a config file).
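
The difference in semantics can be shown with plain dictionaries. This is a minimal sketch, assuming a hypothetical product resource whose defaults are listed in DEFAULTS; apply_put and apply_patch are illustrative helpers, not framework functions.

```python
from copy import deepcopy

# Hypothetical resource defaults, applied to any field a PUT omits
DEFAULTS = {"name": None, "price": None, "tags": []}


def apply_put(current: dict, body: dict) -> dict:
    """Full replacement: the stored state is discarded; omitted fields fall back to defaults."""
    replaced = deepcopy(DEFAULTS)
    replaced.update(body)
    return replaced


def apply_patch(current: dict, body: dict) -> dict:
    """Partial update (merge style): only the supplied fields change."""
    patched = deepcopy(current)
    patched.update(body)
    return patched


product = {"name": "Mug", "price": 1200, "tags": ["kitchen"]}
print(apply_put(product, {"name": "Mug XL"}))   # price and tags are reset
print(apply_patch(product, {"price": 1500}))    # name and tags are kept
```

Applying the same PUT body twice yields the same stored state, which is the idempotency property described above.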

Q: How do you handle concurrent updates? (Optimistic locking)

Use optimistic concurrency control via ETag / version field:

  1. GET returns resource with ETag: "v42"
  2. Client sends PATCH with If-Match: "v42"
  3. Server checks if current version matches. If yes, apply update and increment version.
  4. If version mismatch (another client updated first), return 412 Precondition Failed

In the DB, this maps to: UPDATE orders SET status='shipped', version=43 WHERE id=? AND version=42. If 0 rows updated, someone else updated first.
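
A minimal sketch of that conditional UPDATE, using the standard-library sqlite3 module. The orders table, the PreconditionFailed exception, and the helper names are assumptions for illustration; in an API handler, expected_version would be parsed from the If-Match header and the exception mapped to 412.

```python
import sqlite3


class PreconditionFailed(Exception):
    """Raised when the version no longer matches; the API layer would return 412."""


def make_db() -> sqlite3.Connection:
    """In-memory database holding one order at version 42 (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, version INTEGER)")
    conn.execute("INSERT INTO orders (id, status, version) VALUES (1, 'processing', 42)")
    conn.commit()
    return conn


def update_status(conn: sqlite3.Connection, order_id: int, new_status: str,
                  expected_version: int) -> int:
    """Apply the update only if the row is still at expected_version; return the new version."""
    cur = conn.execute(
        "UPDATE orders SET status = ?, version = version + 1 WHERE id = ? AND version = ?",
        (new_status, order_id, expected_version),
    )
    conn.commit()
    if cur.rowcount == 0:
        # Either another writer got there first or the row does not exist;
        # a real handler would distinguish 412 from 404 with a follow-up lookup.
        raise PreconditionFailed(f"order {order_id} is not at version {expected_version}")
    return expected_version + 1
```

The check and the write happen in one statement, so two concurrent writers cannot both succeed with the same expected version.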

Q: How do you make a POST idempotent? (Idempotency keys)

Use an idempotency key header, exactly like Stripe does: Idempotency-Key: <uuid-from-client>

  1. Client generates a UUID for the request and stores it.
  2. Server caches the response keyed by idempotency_key + endpoint + user_id in Redis (TTL: 24h).
  3. On retry, return the cached response instead of re-executing the operation.
  4. Return 409 if same key is used for a different request body (likely a bug).

This is critical for payment APIs: if a network timeout occurs, the client can safely retry with the same key knowing the payment won't be charged twice.
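
The steps above can be sketched with an in-memory store. This is an illustration under stated assumptions: a dict stands in for Redis, there is no TTL or locking for concurrent duplicates, and the names IdempotencyStore and IdempotencyConflict are hypothetical.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple


class IdempotencyConflict(Exception):
    """Same key reused with a different body; the API layer would return 409."""


class IdempotencyStore:
    def __init__(self) -> None:
        # (key, endpoint, user_id) -> (body fingerprint, cached response)
        self._entries: Dict[Tuple[str, str, str], Tuple[str, Any]] = {}

    def execute(self, key: str, endpoint: str, user_id: str, body: dict,
                handler: Callable[[dict], Any]) -> Any:
        cache_key = (key, endpoint, user_id)
        # Canonical JSON so that key order does not change the fingerprint
        fingerprint = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        cached = self._entries.get(cache_key)
        if cached is not None:
            stored_fingerprint, response = cached
            if stored_fingerprint != fingerprint:
                raise IdempotencyConflict("idempotency key reused with a different body")
            return response  # retry: replay the original response, do not re-execute
        response = handler(body)
        self._entries[cache_key] = (fingerprint, response)
        return response
```

On a retry with the same key and body, the handler is not called again, so a payment handler charges once even if the client sends the request several times.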

Q: Should you return 200 with an error field, or use proper HTTP status codes?

Always use proper HTTP status codes. Returning 200 {"success": false, "error": "not found"} is an anti-pattern because:

  • Monitoring systems count all 200s as successes — errors become invisible in metrics.
  • HTTP clients (fetch, Axios, Spring RestTemplate) have built-in error handling keyed on status codes — you bypass it entirely.
  • Load balancers and CDNs make caching decisions based on status codes.
  • Contract testing (consumer-driven) relies on status codes to express expectations.

Use RFC 9457 Problem Details (the successor to RFC 7807) for a structured, standardized error body.
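
A small helper can keep error bodies consistent. The sketch below builds a Problem Details body with the standard members (type, title, status, detail, instance) and the application/problem+json media type; the helper name problem_response and the trace_id extension member are assumptions for illustration.

```python
from typing import Any, Dict, Optional, Tuple

PROBLEM_JSON = "application/problem+json"


def problem_response(status: int, title: str, detail: Optional[str] = None,
                     instance: Optional[str] = None, type_uri: str = "about:blank",
                     **extensions: Any) -> Tuple[int, Dict[str, str], Dict[str, Any]]:
    """Return (status code, headers, body) for a Problem Details error."""
    body: Dict[str, Any] = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    body.update(extensions)  # extension members, e.g. a trace ID for log correlation
    return status, {"Content-Type": PROBLEM_JSON}, body


status, headers, body = problem_response(
    404, "Order not found",
    detail="Order 42 does not exist",
    instance="/orders/42",
    trace_id="abc123",  # hypothetical extension member
)
print(status, headers, body)
```

The HTTP status line and the status member carry the same code, so monitoring and clients both see the failure.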

Q: How would you design pagination for a feed with real-time inserts?

Use cursor-based (keyset) pagination on a stable, unique, indexed column (typically id or created_at + id composite).

  • The cursor encodes the last seen row's sort key (e.g., {"created_at": "2026-02-15T10:00:00Z", "id": "uuid"}), base64-encoded.
  • Each page fetches: WHERE created_at < $cursor_ts OR (created_at = $cursor_ts AND id < $cursor_id) ORDER BY created_at DESC, id DESC LIMIT N+1
  • Fetching N+1 rows lets you detect has_more without a COUNT query.
  • No duplicates or skips even with concurrent inserts, because the cursor is a fixed point in the data.
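
The points above can be sketched end to end with sqlite3. The orders table, the helper names, and the use of ISO 8601 strings for created_at (which sort correctly as text) are assumptions made for illustration.

```python
import base64
import json
import sqlite3
from typing import Optional, Tuple


def make_orders_db() -> sqlite3.Connection:
    """In-memory table with five orders, some sharing a timestamp (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, created_at TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, created_at) VALUES (?, ?)",
        [
            ("a", "2026-02-15T10:00:00Z"),
            ("b", "2026-02-15T10:00:00Z"),
            ("c", "2026-02-15T09:00:00Z"),
            ("d", "2026-02-15T08:00:00Z"),
            ("e", "2026-02-15T08:00:00Z"),
        ],
    )
    return conn


def encode_cursor(created_at: str, row_id: str) -> str:
    raw = json.dumps({"created_at": created_at, "id": row_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> Tuple[str, str]:
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["created_at"], data["id"]


def fetch_page(conn: sqlite3.Connection, size: int, cursor: Optional[str] = None) -> dict:
    """Return one page, newest first; fetches size + 1 rows to detect has_more."""
    if cursor is None:
        rows = conn.execute(
            "SELECT id, created_at FROM orders ORDER BY created_at DESC, id DESC LIMIT ?",
            (size + 1,),
        ).fetchall()
    else:
        ts, last_id = decode_cursor(cursor)
        rows = conn.execute(
            """
            SELECT id, created_at FROM orders
            WHERE created_at < ? OR (created_at = ? AND id < ?)
            ORDER BY created_at DESC, id DESC
            LIMIT ?
            """,
            (ts, ts, last_id, size + 1),
        ).fetchall()
    has_more = len(rows) > size
    page = rows[:size]
    # The cursor points at the last row actually returned, not the extra lookahead row
    next_cursor = encode_cursor(page[-1][1], page[-1][0]) if has_more else None
    return {"data": [row[0] for row in page], "has_more": has_more, "next_cursor": next_cursor}
```

A row inserted at the head of the feed between two page requests sorts before the cursor position, so it neither shifts later pages nor appears as a duplicate.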

Q: How do you prevent CSRF in a stateless JWT API?

An API that keeps the JWT in memory and sends it in the Authorization header is not vulnerable to CSRF: browsers never attach the Authorization header automatically (unlike cookies). CSRF attacks only work against credentials the browser sends on its own, such as cookies.

If you use HttpOnly cookies for refresh tokens:

  • Set SameSite=Strict on the cookie — browsers won't send it in cross-site requests.
  • For SPAs that need SameSite=Lax, add a Double Submit Cookie pattern or check the Origin header server-side.
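  • Spring Security enables CSRF protection by default. For a stateless API that authenticates only via the Authorization header it is commonly disabled explicitly in the security configuration, but keep it on for any cookie-authenticated endpoint.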
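
The Double Submit Cookie and Origin checks can be sketched framework-independently. The cookie name csrf_token and the header name X-CSRF-Token are assumptions, as are the function names; the allowed origins reuse the example origins from the CORS configuration earlier in this guide.

```python
import hmac
import secrets

# Example origins, matching the CORS configuration shown earlier
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}


def new_csrf_token() -> str:
    """Random token the server sets in a readable (non-HttpOnly) cookie."""
    return secrets.token_urlsafe(32)


def origin_allowed(headers: dict) -> bool:
    """Reject state-changing requests whose Origin is missing or not allowlisted."""
    return headers.get("Origin") in ALLOWED_ORIGINS


def double_submit_ok(cookies: dict, headers: dict) -> bool:
    """The token echoed in the header must equal the cookie value.

    A cross-site attacker can make the browser send the cookie but cannot read it,
    so the attacker cannot supply a matching header.
    """
    cookie_token = cookies.get("csrf_token", "")
    header_token = headers.get("X-CSRF-Token", "")
    if not cookie_token or not header_token:
        return False
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```

In a handler, both checks would run before any cookie-authenticated, state-changing operation, such as the refresh-token endpoint.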

API Design Review Checklist

  • Resources: Are all URLs nouns? Are collections plural? No verbs in paths?
  • Methods: Are GET/DELETE idempotent? Is PUT doing full replacement?
  • Status codes: Is POST returning 201 with Location? Is DELETE returning 204?
  • Creation: Does POST return a Location header pointing to the new resource?
  • Errors: Are all errors RFC 7807 Problem Details? No stack traces in bodies?
  • Validation: Are all inputs validated at the boundary before reaching service layer?
  • Auth: Are all non-public endpoints protected? Is resource-level authorization checked?
  • Versioning: Is there a versioning strategy? Are deprecated endpoints flagged?
  • Pagination: Do all list endpoints paginate? Is the cursor stable under concurrent writes?
  • Filtering/sorting: Are query param names consistent? Are sort fields allowlisted?
  • Consistency: Is JSON field naming consistent (all camelCase or all snake_case)?
  • Idempotency: Do payment/side-effect endpoints support idempotency keys?
  • Rate limiting: Is there per-key and per-IP rate limiting? Are headers set on all responses?
  • Documentation: Is there an OpenAPI spec? Are all fields documented with examples?
  • Tests: Are happy paths, validation errors, auth failures, and edge cases all tested?

HTTP Method Decision Tree

# What operation are you performing?
#
# Read-only, no side effects?
#   └─► GET (cacheable, idempotent)
#
# Creating a new resource?
#   └─► POST → returns 201 + Location header
#
# Replacing entire resource (client sends full state)?
#   └─► PUT (idempotent; omitted fields → null/default)
#
# Partial update (only changed fields)?
#   └─► PATCH (not idempotent by default)
#
# Removing a resource?
#   └─► DELETE → returns 204 No Content (idempotent)
#
# Triggering an action that doesn't fit CRUD?
#   └─► POST to a sub-resource noun:
#       POST /orders/{id}/cancellation
#       POST /invoices/{id}/payment
#       POST /users/{id}/password-reset
#
# Checking if resource exists / fetching headers only?
#   └─► HEAD
#
# CORS preflight (browser does this automatically)?
#   └─► OPTIONS

Final Interview Tip
When given an open-ended "design an API" question, always ask: "What are the primary clients?" and "What's the expected scale?" These two answers will determine whether you need cursor pagination vs offset, sync vs async operations, strict versioning vs loose versioning, and whether gRPC or REST is the right choice at all. Demonstrating you ask before you design signals senior-level thinking.