API Design

Master API design paradigms for system design. Learn REST, GraphQL, and gRPC in depth — when to use each, their trade-offs, versioning strategies, pagination, and idempotency.

An Application Programming Interface (API) is a contract between two pieces of software. It defines how a client (e.g., a mobile app or another microservice) can request data or perform actions on a server. In system design, the API is the single most important interface you design — every user interaction, every service-to-service call, flows through an API.

Think of an API like a restaurant menu. The menu (API) tells you what dishes (operations) are available, what ingredients you need to provide (request parameters), and what you'll get back (response format). The kitchen (server) handles the complexity internally.


1. The Three Major API Paradigms

There are three dominant paradigms for building APIs. Each was designed for a different set of problems.

1. REST (Representational State Transfer)

REST is the most widely adopted API style on the web. It models everything as a Resource — a noun you can perform actions on using standard HTTP methods.

Core Principles:

  • Resources are identified by URLs: Each entity (user, order, product) has a unique URL endpoint.
  • Standard HTTP methods map to CRUD operations:
| HTTP Method | Operation | Example | Description |
| --- | --- | --- | --- |
| GET | Read | GET /api/users/123 | Fetch user 123 |
| POST | Create | POST /api/users | Create a new user |
| PUT | Replace | PUT /api/users/123 | Replace the entire user 123 resource |
| PATCH | Partial Update | PATCH /api/users/123 | Update only specific fields of user 123 |
| DELETE | Delete | DELETE /api/users/123 | Delete user 123 |
  • Stateless: Each request from the client must contain all the information the server needs. The server does not store client session state between requests.

Example: A REST API for a Blog

Code
# Get all posts (with pagination)
GET /api/v1/posts?page=1&limit=20
Response: { "data": [...], "nextPage": 2, "totalCount": 150 }
 
# Get a single post
GET /api/v1/posts/42
Response: { "id": 42, "title": "Hello", "author": { ... }, "comments": [...] }
 
# Create a post
POST /api/v1/posts
Body: { "title": "New Post", "content": "..." }
Response: 201 Created { "id": 43, ... }
 
# Update a post
PATCH /api/v1/posts/42
Body: { "title": "Updated Title" }
Response: 200 OK { "id": 42, "title": "Updated Title", ... }
 
# Delete a post
DELETE /api/v1/posts/42
Response: 204 No Content
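
To make the method-to-operation mapping concrete, here is a minimal in-memory sketch of the blog endpoints above. It is not tied to any web framework: `handle`, `Post`, and `ApiResponse` are hypothetical names, the `Map` stands in for a database, and pagination is left out for brevity.

```typescript
// In-memory sketch of the blog API. All names here are illustrative.
interface Post { id: number; title: string; content: string; }
interface ApiResponse { status: number; body?: unknown; }

const posts = new Map<number, Post>();
let nextId = 1;

function handle(method: string, path: string, body: Partial<Post> = {}): ApiResponse {
  // Matches /api/v1/posts and /api/v1/posts/{id}
  const match = /^\/api\/v1\/posts(?:\/(\d+))?$/.exec(path);
  if (!match) return { status: 404 };
  const id = match[1] === undefined ? undefined : Number(match[1]);

  if (id === undefined) {
    // Collection URL: list or create
    if (method === "GET") return { status: 200, body: { data: Array.from(posts.values()) } };
    if (method === "POST") {
      const post: Post = { id: nextId++, title: body.title ?? "", content: body.content ?? "" };
      posts.set(post.id, post);
      return { status: 201, body: post };
    }
    return { status: 405 }; // Method Not Allowed
  }

  // Item URL: read, partially update, or delete one post
  const existing = posts.get(id);
  if (!existing) return { status: 404 };
  switch (method) {
    case "GET":
      return { status: 200, body: existing };
    case "PATCH": {
      const updated: Post = { ...existing, ...body, id }; // only supplied fields change
      posts.set(id, updated);
      return { status: 200, body: updated };
    }
    case "DELETE":
      posts.delete(id);
      return { status: 204 };
    default:
      return { status: 405 };
  }
}
```

Calling `handle("POST", "/api/v1/posts", { title: "New Post", content: "..." })` returns a 201 with the stored post, and a later `GET` on a deleted id returns 404.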

When to Use REST:

  • Public-facing APIs consumed by third-party developers (GitHub, Stripe, and Twitter all offer REST APIs).
  • Standard CRUD applications (e-commerce, content management).
  • When HTTP caching is important (GET responses can be cached by browsers, CDNs, and proxies using standard headers such as Cache-Control and ETag).

Trade-offs:

  • Over-fetching: GET /api/users/123 might return 50 fields when the client only needs name and avatar.
  • Under-fetching: To display a user's profile page, the client may need to make 3 separate requests: GET /users/123, GET /users/123/posts, GET /users/123/followers.

2. GraphQL

GraphQL, developed internally at Facebook in 2012 and open-sourced in 2015, addresses the over-fetching and under-fetching problems of REST. Instead of the server deciding what data to return, the client specifies exactly what it needs in a single request.

Core Principles:

  • Single endpoint: All requests go to one URL, typically POST /graphql.
  • Client-defined queries: The client sends a query describing the exact shape of the data it needs.
  • Strongly typed schema: The server publishes a schema that describes all available data types and operations.

Example: Fetching a User Profile in GraphQL

Code
# Client sends this query:
query {
  user(id: "123") {
    name
    avatar
    posts(limit: 5) {
      title
      createdAt
    }
    followersCount
  }
}
Code
// Server responds with EXACTLY the requested shape:
{
  "data": {
    "user": {
      "name": "Alice",
      "avatar": "https://cdn.example.com/alice.jpg",
      "posts": [
        { "title": "Hello World", "createdAt": "2024-01-15" },
        { "title": "GraphQL is great", "createdAt": "2024-02-01" }
      ],
      "followersCount": 1024
    }
  }
}

Notice how a single request returned data that would have required 3 separate REST calls. The response contains no extra fields — only what was requested.
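
The idea that the response mirrors the query can be sketched without a GraphQL library. The snippet below assumes the query has already been parsed into a nested selection object; `Selection`, `Json`, and `project` are hypothetical names, and a real server would run resolvers per field rather than trim an existing object.

```typescript
// A selection is a tree: `true` means "return this field as is",
// a nested object means "descend into this field".
type Selection = { [field: string]: true | Selection };
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Return only the parts of `value` that the selection asks for.
function project(value: Json, selection: Selection): Json {
  if (Array.isArray(value)) return value.map((item) => project(item, selection));
  if (value === null || typeof value !== "object") return value;

  const out: { [key: string]: Json } = {};
  for (const field of Object.keys(selection)) {
    const sub = selection[field];
    const child = value[field];
    if (sub === undefined || child === undefined) continue; // unknown field: skip
    out[field] = sub === true ? child : project(child, sub);
  }
  return out;
}
```

With a stored user that has many fields, `project(user, { name: true, posts: { title: true } })` yields an object containing only `name` and the `title` of each post.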

When to Use GraphQL:

  • Complex frontend applications with diverse data requirements (e.g., Facebook, Shopify).
  • When multiple client types (mobile app, web, smartwatch) need different data shapes from the same backend.
  • As a Backend-for-Frontend (BFF) aggregation layer sitting in front of REST microservices.

Trade-offs:

  • No native HTTP caching: Since queries are usually sent as POST /graphql, CDNs and browsers do not cache responses by default. Caching takes extra work, such as persisted queries sent over GET or a normalized client-side cache.
  • Security complexity: Malicious clients can craft deeply nested queries (e.g., user.posts.comments.author.posts.comments...) that exhaust server resources. You must implement query depth limiting and query cost analysis.
  • Backend complexity: Resolvers (the functions that fetch data for each field) can lead to the N+1 query problem if not carefully optimized with tools like DataLoader.
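
Query depth limiting, mentioned in the trade-offs above, can be approximated in a few lines. This is a rough sketch that measures brace nesting in the raw query text and assumes no braces appear inside string arguments; a production server would walk the parsed query instead. `queryDepth`, `rejectIfTooDeep`, and the limit of 5 are hypothetical.

```typescript
// Approximate the nesting depth of a GraphQL query by counting braces.
function queryDepth(query: string): number {
  let depth = 0;
  let deepest = 0;
  for (let i = 0; i < query.length; i++) {
    const ch = query.charAt(i);
    if (ch === "{") {
      depth += 1;
      deepest = Math.max(deepest, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  return deepest;
}

const MAX_QUERY_DEPTH = 5; // illustrative limit; tune per schema

// Reject a query before executing any resolver if it nests too deeply.
function rejectIfTooDeep(query: string): void {
  const depth = queryDepth(query);
  if (depth > MAX_QUERY_DEPTH) {
    throw new Error(`Query depth ${depth} exceeds the limit of ${MAX_QUERY_DEPTH}`);
  }
}
```

The user profile query shown earlier has depth 3 and passes, while a query that chains user, posts, comments, author, posts, and comments is rejected before it reaches the database.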

3. gRPC (gRPC Remote Procedure Calls)

gRPC is a high-performance RPC framework, originally developed at Google, that uses a binary wire format and is designed primarily for internal service-to-service communication in microservice architectures. Instead of resources and queries, gRPC models APIs as remote function calls.

Core Principles:

  • Protocol Buffers (Protobuf): Data is serialized in a compact binary format, not JSON. Payloads are typically smaller and faster to parse than the equivalent JSON.
  • HTTP/2: gRPC runs on HTTP/2, enabling multiplexing (multiple requests over a single TCP connection), header compression, and bi-directional streaming.
  • Contract-first development: You define your API in a .proto file, and the gRPC toolchain generates type-safe client and server code in any language.
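
To see why Protobuf payloads are compact, it helps to encode one field by hand. The sketch below implements Protobuf's varint encoding for non-negative integers and applies it to a hypothetical message with a single integer field numbered 1 (not one of the messages in this article). It is meant for illustration only; real code would use a generated serializer.

```typescript
// Encode a non-negative integer as a varint: 7 bits per byte, least
// significant group first, high bit set on every byte except the last.
function encodeVarint(value: number): number[] {
  if (value < 0 || Math.floor(value) !== value) {
    throw new RangeError("expected a non-negative integer");
  }
  const bytes: number[] = [];
  let rest = value;
  while (rest > 0x7f) {
    bytes.push((rest % 128) | 0x80);
    rest = Math.floor(rest / 128);
  }
  bytes.push(rest);
  return bytes;
}

// Encode one integer field: a tag of (fieldNumber << 3) | wireType 0,
// followed by the value itself, both as varints.
function encodeVarintField(fieldNumber: number, value: number): number[] {
  return encodeVarint(fieldNumber * 8).concat(encodeVarint(value));
}

// Field 1 with value 150 takes 3 bytes: 0x08 0x96 0x01.
const encodedField = encodeVarintField(1, 150);
// The JSON text {"id":150} takes 10 bytes for the same information.
const jsonLength = JSON.stringify({ id: 150 }).length;
```

The field name never appears on the wire, only its number, which is one reason the binary form is shorter than JSON.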

Example: Defining a gRPC Service

Code
// user_service.proto
 
syntax = "proto3";
 
service UserService {
  // Unary RPC: Simple request-response
  rpc GetUser (GetUserRequest) returns (UserResponse);
 
  // Server Streaming: Server sends a stream of responses
  rpc ListUsers (ListUsersRequest) returns (stream UserResponse);
}
 
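message GetUserRequest {
  string user_id = 1;
}
 
// Request type for ListUsers, added so the file compiles.
// The page_size field is illustrative.
message ListUsersRequest {
  int32 page_size = 1;
}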
 
message UserResponse {
  string user_id = 1;
  string name = 2;
  string email = 3;
}

From this .proto file, gRPC auto-generates client and server stubs in Python, Go, Java, TypeScript, etc. The developer simply implements the GetUser function on the server and calls client.GetUser(request) on the client.

gRPC Streaming Modes:

| Mode | Description | Use Case |
| --- | --- | --- |
| Unary | Client sends one request, server sends one response. | Standard request-response (like REST). |
| Server Streaming | Client sends one request, server sends a stream of responses. | Real-time stock prices, live logs. |
| Client Streaming | Client sends a stream of requests, server sends one response. | Uploading a large file in chunks. |
| Bi-directional Streaming | Both client and server send streams simultaneously. | Real-time chat, multiplayer games. |

When to Use gRPC:

  • Internal microservice-to-microservice calls where latency is critical (payment service to fraud detection service).
  • Systems requiring real-time bi-directional streaming (live dashboards, IoT telemetry).
  • Polyglot environments where services are written in different languages (Go, Java, Python) and need auto-generated, type-safe clients.

Trade-offs:

  • Not browser-friendly: Browsers cannot natively make gRPC calls. You need a proxy layer (like Envoy with gRPC-Web) to translate.
  • Not human-readable: Binary Protobuf payloads cannot be read directly with curl or browser developer tools. Debugging requires dedicated tools such as grpcurl.

2. Choosing the Right Paradigm

| Criteria | REST | GraphQL | gRPC |
| --- | --- | --- | --- |
| Data Format | JSON (text) | JSON (text) | Protobuf (binary) |
| Transport | HTTP/1.1 or HTTP/2 | HTTP (POST) | HTTP/2 |
| Caching | Native (HTTP headers) | Complex (no native) | No native |
| Best For | Public APIs, CRUD | Flexible frontends | Fast microservices |
| Streaming | No (needs WebSockets) | Subscriptions | Native (4 modes) |
| Code Generation | Optional (OpenAPI) | Optional | Required (Protobuf) |

[!IMPORTANT] In a system design interview, the "correct" answer is almost never "always use X." Demonstrate that you can match the right tool to the right problem. A common production pattern is: gRPC between internal services, a GraphQL BFF to aggregate data for frontends, and a REST API for public third-party integrations.


3. API Versioning Strategies

APIs evolve over time. You must support old clients while introducing new features. Breaking an existing client's integration is one of the most costly mistakes in software engineering.

Strategy 1: URL Path Versioning

GET /api/v1/users/123   →  Returns { name, email }
GET /api/v2/users/123   →  Returns { name, email, avatar, bio }
  • Pros: Simple, explicit, easy to route and cache.
  • Cons: The URL changes, which can break bookmarks and requires client-side code updates.
  • Used by: Stripe, Twitter. (GitHub's REST API, by contrast, selects its version through a request header.)

Strategy 2: Header Versioning

GET /api/users/123
Header: Accept: application/vnd.myapi.v2+json
  • Pros: Clean URLs. The API version is a metadata concern, not a resource identity concern.
  • Cons: Harder to test (you can't just paste a URL into a browser). Responses must carry a Vary header (for example, Vary: Accept) so that caches store each version separately.

Strategy 3: Query Parameter Versioning

GET /api/users/123?version=2
  • Pros: Easy to add to existing requests.
  • Cons: Can pollute URLs and make caching more complex.
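
The three strategies differ only in where the server looks for the version number. The sketch below checks all three places; `resolveVersion` is a hypothetical helper, the `vnd.myapi` media type is taken from the header example above, and the precedence (path, then header, then query string) is an arbitrary choice for illustration.

```typescript
// Find the requested API version in the URL path, the Accept header,
// or the query string. Returns undefined when no version is given.
function resolveVersion(url: string, acceptHeader?: string): number | undefined {
  const queryStart = url.indexOf("?");
  const path = queryStart === -1 ? url : url.slice(0, queryStart);
  const query = queryStart === -1 ? "" : url.slice(queryStart + 1);

  // Strategy 1: /api/v2/users/123
  const fromPath = /\/v(\d+)(?:\/|$)/.exec(path);
  if (fromPath) return Number(fromPath[1]);

  // Strategy 2: Accept: application/vnd.myapi.v2+json
  const fromHeader = /application\/vnd\.myapi\.v(\d+)\+json/.exec(acceptHeader ?? "");
  if (fromHeader) return Number(fromHeader[1]);

  // Strategy 3: /api/users/123?version=2
  const fromQuery = /(?:^|&)version=(\d+)(?:&|$)/.exec(query);
  if (fromQuery) return Number(fromQuery[1]);

  return undefined;
}
```

A server would typically fall back to a default version, or reject the request, when this returns `undefined`.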

[!TIP] URL Path Versioning (/v1/, /v2/) is the industry default. It is the simplest to implement, test, and document. Start with this unless you have a strong reason not to.


4. Pagination Patterns

Any API endpoint that can return an unbounded list of items should implement pagination. Without it, a single query that matches 10 million rows can exhaust server memory and overwhelm the client.

1. Offset-Based Pagination

GET /api/posts?page=3&limit=20

The server skips (page - 1) * limit rows and returns the next limit rows.

  • Pros: Simple to implement. Allows random access to any page.
  • Cons: Performance degrades on large datasets. OFFSET 1000000 in SQL forces the database to scan and discard 1 million rows before returning results. Also, if a new item is inserted while the user is paginating, items can shift and be duplicated or skipped.
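
The shifting problem is easy to reproduce. In this small sketch an in-memory array stands in for a newest-first table, and `pageByOffset` is a hypothetical helper that mirrors what LIMIT and OFFSET do.

```typescript
// Offset pagination over an array standing in for a database table.
function pageByOffset<T>(rows: T[], page: number, limit: number): T[] {
  const offset = (page - 1) * limit; // rows the database would skip
  return rows.slice(offset, offset + limit);
}

// A newest-first feed with post ids 5 down to 1.
const feed = [5, 4, 3, 2, 1];

const firstPage = pageByOffset(feed, 1, 2); // [5, 4]
feed.unshift(6); // a new post arrives while the user is reading page 1
const secondPage = pageByOffset(feed, 2, 2); // [4, 3]: post 4 appears again
```

Because every row moved down by one position, page 2 starts one row earlier than the user expects and repeats post 4.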

2. Cursor-Based Pagination

GET /api/posts?cursor=eyJpZCI6IDQyfQ==&limit=20

The cursor is an opaque, encoded pointer (usually the last item's ID or timestamp) to the position in the dataset. The server queries items after the cursor.

Code
-- Instead of OFFSET (slow):
SELECT * FROM posts ORDER BY id LIMIT 20 OFFSET 1000000;
 
-- Use a cursor (fast):
SELECT * FROM posts WHERE id > 42 ORDER BY id LIMIT 20;
  • Pros: Consistent performance regardless of dataset size ($O(\log N)$ with an index). No duplicated or skipped items during concurrent inserts.
  • Cons: Cannot jump to an arbitrary page (no "Go to page 50" feature).
  • Used by: Twitter, Facebook, Slack, Stripe.
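
Here is a minimal sketch of the cursor approach over an in-memory array. The names `Row`, `Page`, `encodeCursor`, `decodeCursor`, and `pageByCursor` are hypothetical, and the cursor is URL-encoded JSON purely for simplicity; real APIs often base64-encode or sign it so clients treat it as opaque.

```typescript
interface Row { id: number; title: string; }
interface Page { data: Row[]; nextCursor: string | null; }

// The cursor wraps the id of the last row the client has seen.
function encodeCursor(lastId: number): string {
  return encodeURIComponent(JSON.stringify({ id: lastId }));
}

function decodeCursor(cursor: string): number {
  const parsed: unknown = JSON.parse(decodeURIComponent(cursor));
  if (typeof parsed === "object" && parsed !== null) {
    const id = (parsed as { id?: unknown }).id;
    if (typeof id === "number") return id;
  }
  throw new Error("Invalid cursor");
}

// In-memory equivalent of:
//   SELECT * FROM posts WHERE id > :afterId ORDER BY id LIMIT :limit
function pageByCursor(rows: Row[], limit: number, cursor?: string): Page {
  const afterId = cursor === undefined ? -Infinity : decodeCursor(cursor);
  const data = rows
    .filter((row) => row.id > afterId)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit);
  const last = data[data.length - 1];
  // A full page may be followed by more rows, so hand back a cursor.
  // If the table ended exactly here, the next call returns an empty page.
  const nextCursor = last !== undefined && data.length === limit ? encodeCursor(last.id) : null;
  return { data, nextCursor };
}
```

Rows inserted while the client is paginating do not shift the position, because the next page is defined by the last id seen rather than by a row count.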

5. Idempotency in API Design

Idempotency means that making the same request multiple times has the same effect on server state as making it once. This is critical in distributed systems, where timeouts and network failures cause clients to retry requests.

| HTTP Method | Idempotent? | Explanation |
| --- | --- | --- |
| GET | Yes | Reading data multiple times doesn't change it. |
| PUT | Yes | Replacing a resource with the same data leaves it in the same state every time. |
| DELETE | Yes | Deleting an already-deleted resource leaves the server in the same state, although the response may differ (for example, 404 on the repeat). |
| POST | No | Creating a new resource twice creates two resources. |
| PATCH | It depends | SET balance = 100 is idempotent. INCREMENT balance BY 10 is not. |
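
The PATCH distinction can be checked directly. In this small sketch `Account`, `setBalance`, and `incrementBalance` are hypothetical names, and each operation is applied twice to simulate a retried request.

```typescript
interface Account { balance: number; }

// Idempotent: repeating the call leaves the account in the same state.
function setBalance(account: Account, value: number): Account {
  return { ...account, balance: value };
}

// Not idempotent: every repeat changes the state again.
function incrementBalance(account: Account, delta: number): Account {
  return { ...account, balance: account.balance + delta };
}

const initial: Account = { balance: 50 };

// The same "set" request delivered twice still ends at 100.
const afterSetTwice = setBalance(setBalance(initial, 100), 100);

// The same "increment" request delivered twice ends at 70, not 60.
const afterIncrementTwice = incrementBalance(incrementBalance(initial, 10), 10);
```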

Making Non-Idempotent Operations Safe

For critical POST operations (like charging a credit card), use an Idempotency Key:

Code
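// Client sends a unique key with the request:
POST /api/payments
Headers: { "Idempotency-Key": "a1b2c3d4-uuid" }
Body: { "amount": 50.00, "currency": "USD" }

// Server logic (sketch). Assumes an Express-style `req`, an ioredis-style
// `redis` client, and a `chargeCard` helper defined elsewhere.
async function processPayment(req: Request) {
    // Node.js lowercases incoming header names
    const key = req.headers["idempotency-key"];
    if (typeof key !== "string" || key.length === 0) {
        throw new Error("Missing Idempotency-Key header");
    }

    // 1. Check if this key was already processed
    const existing = await redis.get(`idempotency:${key}`);
    if (existing) {
        return JSON.parse(existing); // Return the cached response
    }

    // 2. Process the payment. Note: two concurrent requests with the same
    //    key can both pass step 1, so production code should first claim
    //    the key atomically (for example with Redis SET ... NX).
    const result = await chargeCard(req.body);

    // 3. Cache the response for 24 hours (86400 s) for future retries
    await redis.setex(`idempotency:${key}`, 86400, JSON.stringify(result));

    return result;
}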

If the client's network drops after sending the payment request, it can safely retry with the same Idempotency Key. The server recognizes the duplicate and returns the original response without charging the card again.
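
A retry can also arrive while the first attempt is still in flight. One way to handle that, sketched below with an in-memory `Map` standing in for the shared store, is to record the pending promise under the key before the work completes, so duplicates share the original outcome. `IdempotencyStore` and this `chargeCard` are hypothetical, there is no expiry, and forgetting failed attempts is a design assumption rather than a rule.

```typescript
// Remembers the outcome of each idempotency key, including attempts that
// are still running.
class IdempotencyStore<T> {
  private readonly results = new Map<string, Promise<T>>();

  run(key: string, operation: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) return existing; // duplicate: reuse the original outcome

    const pending = operation();
    this.results.set(key, pending); // claimed before the operation finishes
    // Assumption: failed attempts are forgotten so the client may retry them.
    pending.catch(() => this.results.delete(key));
    return pending;
  }
}

let chargeCount = 0;

// Stand-in for a real payment call; it only counts invocations.
async function chargeCard(amount: number): Promise<{ chargeId: number; amount: number }> {
  chargeCount += 1;
  return { chargeId: chargeCount, amount };
}

const store = new IdempotencyStore<{ chargeId: number; amount: number }>();
const firstAttempt = store.run("a1b2c3d4-uuid", () => chargeCard(50));
const retry = store.run("a1b2c3d4-uuid", () => chargeCard(50));
// firstAttempt and retry are the same promise, and chargeCount is 1.
```

In a multi-server deployment the same claim step has to happen in a shared store, which is where an atomic set-if-absent operation comes in.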


6. HTTP Status Codes: Speaking the Language

A well-designed API communicates intent clearly through HTTP status codes:

| Code | Meaning | When to Use |
| --- | --- | --- |
| 200 OK | Success | Successful GET, PUT, PATCH, DELETE. |
| 201 Created | Resource created | Successful POST that creates a new resource. |
| 204 No Content | Success, no body | Successful DELETE. |
| 400 Bad Request | Client error | Invalid request body, missing required fields. |
| 401 Unauthorized | Not authenticated | No valid authentication token provided. |
| 403 Forbidden | Not authorized | Authenticated but lacks permission for this action. |
| 404 Not Found | Resource doesn't exist | The requested URL or resource ID is invalid. |
| 409 Conflict | State conflict | Trying to create a resource that already exists. |
| 429 Too Many Requests | Rate limited | Client has exceeded the allowed request rate. |
| 500 Internal Server Error | Server failure | Unhandled exception on the server. |
| 503 Service Unavailable | Temporarily down | Server is overloaded or undergoing maintenance. |

[!TIP] A common anti-pattern is returning 200 OK with an error message in the body (e.g., { "status": "error", "message": "User not found" }). This breaks standard HTTP tooling. Use the correct status code (404) and include details in the body.
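
One way to avoid that anti-pattern is to translate application failures into status codes in a single place. The sketch below assumes failures are modeled as a tagged union; `Failure`, `HttpError`, and `toHttpError` are hypothetical names, and the body shape is only one possible convention.

```typescript
// Application-level failures, independent of HTTP.
type Failure =
  | { kind: "validation"; message: string }
  | { kind: "unauthenticated" }
  | { kind: "forbidden" }
  | { kind: "not_found"; resource: string }
  | { kind: "conflict"; message: string }
  | { kind: "rate_limited"; retryAfterSeconds: number };

interface HttpError {
  status: number;
  headers: { [headerName: string]: string };
  body: { error: string; message: string };
}

// Map each failure to the status code from the table above.
function toHttpError(failure: Failure): HttpError {
  switch (failure.kind) {
    case "validation":
      return { status: 400, headers: {}, body: { error: failure.kind, message: failure.message } };
    case "unauthenticated":
      return { status: 401, headers: {}, body: { error: failure.kind, message: "Authentication required" } };
    case "forbidden":
      return { status: 403, headers: {}, body: { error: failure.kind, message: "Permission denied" } };
    case "not_found":
      return { status: 404, headers: {}, body: { error: failure.kind, message: `${failure.resource} not found` } };
    case "conflict":
      return { status: 409, headers: {}, body: { error: failure.kind, message: failure.message } };
    case "rate_limited":
      return {
        status: 429,
        // Retry-After tells the client how long to wait before retrying.
        headers: { "Retry-After": String(failure.retryAfterSeconds) },
        body: { error: failure.kind, message: "Too many requests" },
      };
  }
}
```

For example, `toHttpError({ kind: "not_found", resource: "User" })` produces a 404 whose body carries the message "User not found".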