
Learnings from: Designing Data-Intensive Applications

A short overview of what stood out to me from reading the O'Reilly book "Designing Data-Intensive Applications" by Martin Kleppmann.

June 20, 2025 · 5 min read


Designing Data-Intensive Applications (DDIA) by Martin Kleppmann is one of those books that keeps getting recommended in system design discussions — and for good reason. Rather than teaching you how to use specific tools, it teaches you how to think about data systems.

This post is not a full summary, but a collection of ideas, mental models, and quotes that stuck with me after reading it.


A short summary of the book

At its core, DDIA is about building systems that work reliably at scale, even when things inevitably go wrong.

The book is structured around three main goals of data systems:

  • Reliability – the system continues to work correctly, even in the face of faults.
  • Scalability – the system can handle growth (in data, traffic, or complexity).
  • Maintainability – the system remains understandable and modifiable over time.

Kleppmann walks through these goals by exploring:

  • Storage engines and data models (relational, document, graph)
  • Replication and partitioning
  • Consistency models and distributed systems trade-offs
  • Batch vs stream processing
  • The limits of coordination, time, and ordering in distributed systems

What makes the book stand out is that it’s concept-driven. Technologies like Kafka, MySQL, or Hadoop are used as examples, not as the point.

Some quotes that stuck with me

"Debates about normalization and denormalization become largely irrelevant if you can translate data from a write-optimized event log to a read-optimized application state.”

After working through hundreds of schema normalization exercises at university, and spending so many hours in meetings about how best to structure data, this statement really stuck with me. Debates that chase the perfect schema often fail to address the actual use cases, and it's worth remembering that different parts of the system can complement each other to cover all the requirements.
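To make that idea concrete, here is a minimal sketch of my own (not code from the book) of folding a write-optimized event log into a read-optimized view; the event names and shapes are made up for illustration:

```python
from collections import defaultdict

# Write side: an append-only event log. Events are simply recorded as they
# happen, so the write path stays simple and write-optimized.
event_log = [
    {"type": "cart_item_added",   "user": "alice", "item": "book", "qty": 1},
    {"type": "cart_item_added",   "user": "alice", "item": "mug",  "qty": 2},
    {"type": "cart_item_removed", "user": "alice", "item": "book"},
]

def build_cart_view(events):
    """Fold the log into a read-optimized view: the current cart per user."""
    carts = defaultdict(dict)
    for e in events:
        cart = carts[e["user"]]
        if e["type"] == "cart_item_added":
            cart[e["item"]] = cart.get(e["item"], 0) + e["qty"]
        elif e["type"] == "cart_item_removed":
            cart.pop(e["item"], None)
    return dict(carts)

print(build_cart_view(event_log))  # {'alice': {'mug': 2}}
```

Whether the derived view is normalized or denormalized becomes a secondary concern, because it can always be rebuilt from the log to suit whatever the reads actually need.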

"The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair.”

As a non-native English speaker I find this one a bit of a mouthful, but it's completely true. Kleppmann quotes it from Mostly Harmless by Douglas Adams. Thoughts like "this will never happen" tend to shape system designs, leading developers to give little thought to observability or recovery paths for those cases. If those cases carry a high risk, the system ends up vulnerable to whatever turns out to be the black swan.

“If you’re designing a system that must scale, you must ask: which operations will become expensive?”

Scalability is less about adding machines and more about understanding bottlenecks.
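As a toy illustration of that question (my own example, not the book's), take a hypothetical "unread messages" counter: scanning every message on each read is fine at small scale, but it is exactly the operation that becomes expensive as data grows, and maintaining a counter at write time is one way to shift that cost:

```python
messages = []      # grows without bound over time
unread_count = 0   # maintained incrementally on the write path

def add_message(text):
    """Write path: O(1) extra work to keep the counter in sync."""
    global unread_count
    messages.append({"text": text, "read": False})
    unread_count += 1

def unread_by_scanning():
    """Read path A: O(n) per request -- the operation that becomes expensive."""
    return sum(1 for m in messages if not m["read"])

def unread_by_counter():
    """Read path B: O(1) per request, paid for with slightly more work on writes."""
    return unread_count

add_message("hello")
add_message("world")
assert unread_by_scanning() == unread_by_counter() == 2
```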

"Building for scale that you don't need is wasted effort and may lock you into an inflexible design. In effect, it is a form of premature optimization.”

Premature scaling often comes from following trends rather than concrete needs. Money keeps things running; don't waste it on users you don't have.

"Design and build software, even operating systems, to be tried early, ideally within weeks. Don’t hesitate to throw away the clumsy parts and rebuild them."

This one requires the right mindset. Nothing you build will be perfect, so it's better to get it out quickly and check whether your priorities are sound. Especially at the beginning, a system can be rebuilt from the ground up very quickly; the longer you wait, the more complicated that gets. Instead of being scared of people criticizing our systems, we should be scared of how much rework it would take to redo everything later.


Key takeaways and mental models

A few ideas I keep coming back to:

1. There is no “best” architecture

Almost every chapter reinforces this: every design decision is a trade-off. Consistency vs availability, latency vs throughput, simplicity vs flexibility. The book trains you to ask: which trade-off fits my use case?

2. Data modeling is more important than tools

Whether you choose SQL or NoSQL matters less than:

  • How your data is queried
  • How it evolves over time
  • How failures affect correctness

Bad data models scale poorly no matter how modern the stack is.
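The "evolves over time" point is easy to see in code. A hedged sketch of my own (the field names are hypothetical): readers should tolerate records written before a new field existed, for example by falling back to a default, so old and new data can coexist:

```python
# Records written by two versions of the same application:
old_record = {"id": 1, "name": "alice"}                    # before "status" existed
new_record = {"id": 2, "name": "bob", "status": "active"}  # after the field was added

def read_user(record):
    """Read path that works for both old and new records."""
    return {
        "id": record["id"],
        "name": record["name"],
        "status": record.get("status", "unknown"),  # default for pre-migration rows
    }

assert read_user(old_record)["status"] == "unknown"
assert read_user(new_record)["status"] == "active"
```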

3. Failure is the default

Instead of assuming things will work, DDIA encourages you to design with the assumption that:

  • Machines will crash
  • Networks will partition
  • Messages will arrive late or out of order

Systems that acknowledge this upfront tend to be simpler and more robust.
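A small sketch of my own of what designing with that assumption can look like: if messages may arrive late, duplicated, or out of order, the consumer can deduplicate by message id and apply updates idempotently instead of trusting delivery to be exactly-once and ordered (the message shapes here are made up):

```python
processed_ids = set()   # in a real system this would be durable, not in-memory
balances = {}

def apply_deposit(msg):
    """Apply a deposit exactly once, even if the message is redelivered."""
    if msg["id"] in processed_ids:
        return  # duplicate delivery: safe to ignore
    balances[msg["account"]] = balances.get(msg["account"], 0) + msg["amount"]
    processed_ids.add(msg["id"])

# The same deposits delivered out of order and with a duplicate:
deliveries = [
    {"id": "m2", "account": "alice", "amount": 5},
    {"id": "m1", "account": "alice", "amount": 10},
    {"id": "m2", "account": "alice", "amount": 5},   # redelivered
]
for msg in deliveries:
    apply_deposit(msg)

assert balances["alice"] == 15
```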

4. Time is surprisingly tricky

Clock skew, ordering, and causality are not edge cases — they are central problems in distributed systems. The book explains why “just use timestamps” is often not enough.
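One direction the book explores instead of wall-clock timestamps is logical clocks. A minimal sketch of a Lamport clock (my own simplified version): each node keeps a counter, bumps it on every event, and fast-forwards past any larger value it sees from another node, which yields an ordering consistent with causality even when physical clocks disagree:

```python
class LamportClock:
    """Minimal Lamport clock: a logical counter per node."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the counter."""
        self.time += 1
        return self.time

    def send(self):
        """Attach the current timestamp to an outgoing message."""
        return self.tick()

    def receive(self, remote_time):
        """On receive, jump ahead of the sender's timestamp if it is larger."""
        self.time = max(self.time, remote_time) + 1
        return self.time

# Node A's wall clock could be hours behind node B's, but causality still holds:
a, b = LamportClock(), LamportClock()
t_send = a.send()           # A does something and sends a message
t_recv = b.receive(t_send)  # B's timestamp is guaranteed to be greater
assert t_recv > t_send
```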


Why this feels like a modern classic

Despite being published years ago, DDIA hasn’t aged much — and that’s its biggest strength.

  • It avoids framework hype.
  • It focuses on fundamentals that don’t change quickly.
  • It gives you vocabulary to reason about systems clearly.

It’s the kind of book where you understand why certain architectures exist, not just how to implement them.

If you work in backend engineering or data engineering, or you're preparing for system design interviews, this book quietly raises your baseline.


Conclusion

Reading Designing Data-Intensive Applications doesn’t make you an expert in distributed systems overnight — but it does something more valuable: it reshapes how you think about them.

You come away with:

  • Better questions
  • Stronger intuitions
  • A healthier skepticism of “silver bullet” solutions