How to Think in System Design: Mental Models

TL;DR / Key Takeaways

  • Interviews test your thinking process more than a perfect architecture.
  • Start with requirements, then quantify load and constraints.
  • Define the core metrics: throughput, latency, availability, durability.
  • A good design is a series of explicit trade-offs, not a single right answer.

What System Design Interviews Actually Test

Most interviews are not asking for the most complex system. They are asking whether you can:

  • Clarify the problem before building a solution.
  • Translate vague goals into measurable requirements.
  • Reason about scale using simple math.
  • Explain trade-offs calmly and clearly.

A strong answer looks like an architect thinking out loud, not a checklist of technologies.

Requirements First: Functional and Non-Functional

Functional requirements define what the system must do. Non-functional requirements define how well it must do it.

graph TD
  R[Requirements] --> F[Functional]
  R --> N[Non-Functional]
  N --> S[Scalability]
  N --> A[Availability]
  N --> L[Latency]
  N --> C[Cost]

Ask questions early:

  • Who uses it, and how often?
  • What is the read vs write mix?
  • What is the acceptable latency and error rate?
  • What happens when a dependency fails?

The Four Core Metrics

Use these terms precisely:

  • Throughput: requests per second or transactions per second.
  • Latency: time to serve a single request, usually reported as percentiles such as p50 and p95.
  • Availability: percentage of time the system is usable (for example 99.9%).
  • Durability: probability that committed data is not lost.

A design that optimizes one metric often harms another. Recognizing and naming that tension is the point.
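
An availability percentage is easier to reason about once it is converted into a downtime allowance. The sketch below is one way to do that conversion; the function name and the 365-day year are illustrative assumptions, not part of the original text.

```python
# Convert an availability target into an allowed-downtime budget.
SECONDS_PER_DAY = 86_400


def allowed_downtime_seconds(availability_pct: float, period_days: float = 365.0) -> float:
    """Seconds of downtime permitted over the period at the given availability."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    unavailable_fraction = 1.0 - availability_pct / 100.0
    return unavailable_fraction * period_days * SECONDS_PER_DAY


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_seconds(target) / 3600
    print(f"{target}% -> {hours:.2f} hours of downtime per 365-day year")
```

At 99.9% this works out to about 8.76 hours per year, which is often a more useful number to discuss than the percentage itself.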

Translate Requirements into Constraints

Turn vague goals into hard constraints you can design against:

  • Read vs write mix drives caching, storage choice, and scaling direction.
  • Consistency expectations drive data model and replication strategy.
  • Data retention and growth define storage and backup needs.
  • Latency targets define how much work each hop can do.

Write these constraints down early so you can defend trade-offs later.

Latency Budgets and Critical Paths

If the user-facing target is a p95 latency of 200 ms, split that budget across the components on the critical path. For example:

  • 30 ms network and TLS
  • 20 ms load balancer and routing
  • 50 ms service logic
  • 80 ms database or cache
  • 20 ms safety margin

Any single dependency that routinely exceeds its budget will push the end-to-end latency past the target unless another hop gives that time back.
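
The budget above can be kept honest with a few lines of code. This is a minimal sketch: the hop names mirror the example list, the measured values are hypothetical, and adding per-hop p95 figures is a planning simplification, since percentiles do not sum exactly.

```python
# Hop budgets in milliseconds, copied from the example list above.
BUDGET_MS = {
    "network_and_tls": 30,
    "load_balancer_and_routing": 20,
    "service_logic": 50,
    "database_or_cache": 80,
    "safety_margin": 20,
}
TARGET_P95_MS = 200


def over_budget(measured_ms: dict, budget_ms: dict) -> dict:
    """Return each hop whose measured latency exceeds its budget, with the overshoot in ms."""
    return {
        hop: measured_ms[hop] - limit
        for hop, limit in budget_ms.items()
        if measured_ms.get(hop, 0) > limit
    }


# The individual budgets must add up to the end-to-end target.
assert sum(BUDGET_MS.values()) == TARGET_P95_MS

# Hypothetical measurements, for illustration only.
measured = {
    "network_and_tls": 28,
    "load_balancer_and_routing": 12,
    "service_logic": 45,
    "database_or_cache": 110,
}
print(over_budget(measured, BUDGET_MS))  # {'database_or_cache': 30}
```

Here the database hop overshoots by 30 ms, more than the 20 ms safety margin can absorb, so it is the first place to look.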

Back-of-the-Envelope Sizing

Quick math sets the scale before you design.

Example:

  • 10 million daily active users
  • 5 actions per day each
  • 50 million actions per day
  • 50,000,000 / 86,400 seconds = about 580 QPS average

Now add realistic peaks and storage:

  • Peak factor 5x -> 2,900 QPS peak
  • 1 KB per action -> 50 GB written per day
  • 30 days retention -> 1.5 TB raw data (before replication)

You now know the system must handle roughly 3,000 QPS at peak and about 1.5 TB of raw data, which grows to multiple terabytes once replication is included.
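
The same arithmetic can be sketched as a small function so the inputs are easy to vary. The function and parameter names are illustrative, and 1 KB is assumed to be 1,000 bytes to match the round numbers above.

```python
SECONDS_PER_DAY = 86_400


def estimate_load(dau, actions_per_user, peak_factor, bytes_per_action, retention_days):
    """Back-of-the-envelope QPS and storage estimate from a handful of inputs."""
    actions_per_day = dau * actions_per_user
    avg_qps = actions_per_day / SECONDS_PER_DAY
    daily_bytes = actions_per_day * bytes_per_action
    return {
        "actions_per_day": actions_per_day,
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "daily_gb": daily_bytes / 1e9,
        "retained_tb": daily_bytes * retention_days / 1e12,
    }


# Inputs from the example: 10M DAU, 5 actions each, 5x peak, 1 KB per action, 30 days.
est = estimate_load(10_000_000, 5, 5, 1_000, 30)
for name, value in est.items():
    print(f"{name}: {value:,.1f}")
```

Without intermediate rounding, the average comes out near 579 QPS and the peak near 2,894 QPS; the list above rounds to 580 first, which gives 2,900. Either figure is accurate enough for sizing.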

Worked Example: From Requirements to Size

Assume 2,000,000 daily active users and 4 actions per user per day:

  • Actions per day = 2,000,000 * 4 = 8,000,000
  • Average QPS = 8,000,000 / 86,400 = about 93
  • Peak factor 5x -> about 465 QPS
  • Read/write split 90/10 -> about 420 reads/s and 47 writes/s
  • Writes per day = 10% of 8,000,000 = 800,000
  • Each write is 2 KB -> about 1.6 GB/day (800,000 * 2 KB)
  • 30-day retention -> about 48 GB raw
  • 3x replication -> about 144 GB total
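
The storage figures in this example hinge on how many of the daily actions are actually persisted. The sketch below makes that explicit with a hypothetical `stored_fraction` parameter (an assumption introduced here): 1.0 stores a 2 KB record for every action, while 0.1 stores one only for the write share of a 90/10 split.

```python
SECONDS_PER_DAY = 86_400


def split_and_store(actions_per_day, peak_factor, write_fraction,
                    stored_fraction, bytes_per_record, retention_days, replicas):
    """Peak read/write rates plus storage, given the share of actions that is persisted."""
    peak_qps = actions_per_day / SECONDS_PER_DAY * peak_factor
    daily_gb = actions_per_day * stored_fraction * bytes_per_record / 1e9
    raw_gb = daily_gb * retention_days
    return {
        "peak_reads_per_s": peak_qps * (1 - write_fraction),
        "peak_writes_per_s": peak_qps * write_fraction,
        "daily_gb": daily_gb,
        "raw_gb": raw_gb,
        "replicated_gb": raw_gb * replicas,
    }


# Compare the two storage assumptions for 8,000,000 actions per day.
for stored_fraction in (1.0, 0.1):
    result = split_and_store(8_000_000, 5, 0.1, stored_fraction, 2_000, 30, 3)
    print(stored_fraction, {k: round(v, 1) for k, v in result.items()})
```

Storing every action gives 16 GB/day, 480 GB raw, and about 1.44 TB replicated; storing only the 10% that are writes gives 1.6 GB/day, 48 GB raw, and 144 GB replicated. The tenfold gap is a reminder to state which of the two you are assuming. Peak QPS comes out near 463 here because the average is not rounded to 93 first.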

The Assumptions Ledger

Keep a short list of assumptions you are willing to defend:

  • Peak factor and traffic distribution (steady vs spiky).
  • Read/write ratio and request fan-out.
  • Availability target and data loss tolerance.

If any assumption changes, explicitly describe how the architecture changes.
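
One way to make a changed assumption visible is to keep the ledger as data and recompute the derived numbers. This is a minimal sketch; the field names, the fan-out of 3, and the 20x spiky peak factor are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, replace

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Assumptions:
    daily_actions: int
    peak_factor: float
    write_fraction: float
    fan_out: int  # downstream calls triggered by one incoming request (hypothetical)

    def peak_qps(self) -> float:
        return self.daily_actions / SECONDS_PER_DAY * self.peak_factor

    def peak_backend_calls_per_s(self) -> float:
        return self.peak_qps() * self.fan_out


base = Assumptions(daily_actions=8_000_000, peak_factor=5, write_fraction=0.1, fan_out=3)
# Change a single assumption (steady traffic becomes spiky) and compare.
spiky = replace(base, peak_factor=20)

for label, ledger in (("base", base), ("spiky", spiky)):
    print(f"{label}: {ledger.peak_qps():.0f} peak QPS, "
          f"{ledger.peak_backend_calls_per_s():.0f} backend calls/s")
```

Quadrupling the peak factor quadruples both derived numbers, which is the kind of change that might move a design from a single database node toward caching or sharding.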

Common Failure Modes in Interviews

  • Skipping requirements and jumping to architecture.
  • Ignoring read vs write patterns.
  • Not stating assumptions or estimating load.
  • Treating low latency, high availability, and low cost as if they come for free.

How to Narrate Trade-offs

A good structure:

  1. Clarify the use case and constraints.
  2. State assumptions and estimate load.
  3. Propose a simple baseline design.
  4. Identify bottlenecks and scale points.
  5. Offer alternatives with trade-offs.

This keeps the conversation grounded and shows senior-level thinking.

Checklist for a Strong Start

  • Define the users and core user actions.
  • Split functional from non-functional requirements.
  • Estimate QPS, data size, and storage growth.
  • Call out the key trade-offs you will make.

A clean mental model is the foundation for every system you design.
