Systems Thinking for Engineering

Today’s post for Day 3 of the Mercari Advent Calendar 2021 is brought to you by Darren from the Mercari Data Team.


I was recently asked to share my thoughts on what it means to be a "distinguished engineer" at Mercari, and my first thought was about systems. Below is an internal note that I shared with our Engineering organization about how I approach engineering. I hope you find this useful for guiding your own work and careers.

-Darren


If you want to be a better engineer, you need to hack the system. Or more accurately, you need to understand the system.

Engineering is about wrangling systems to accomplish amazing things.

I’m not talking about operating systems, or distributed systems, but rather systems in the general sense. A "system" is simply a collection of heterogeneous components that work together to achieve some purpose. It might be a rocket; it might be an e-commerce platform; it might be the internet itself.

Changes in one part of a system can greatly influence nodes that are quite distant. The overall behavior of the system might only be understandable in a statistical sense, but the failure of the system can arise from any specific piece.

Learning as much as I can about the structure and function of a system is what I like to call a "systems mindset". I try to apply a systems mindset to all of my engineering, and if you want to become a better engineer, I encourage you to do the same.

What does this mean in practice? Well, when I started at Mercari, one of the first things I did was trace a purchase transaction through the entire stack.

Mercari, at its core, is an e-commerce company. We operate a marketplace that connects buyers and sellers. Transactions, where an item is listed by one person and purchased by another, represent the core unit of truth in our company. (We have since added other fundamental units of truth such as payments.)

I don’t work directly on the transaction subsystem, but my mental model of our marketplace needed a base to hang everything on. A good base for engineering systems is the source-of-truth datum at the core of the system. If you can develop a good model of the data that’s flowing through your system, and which subsystems it touches along the way, then you have a very solid foundation for becoming a productive engineer. Conversely, without that foundation, your work just floats off in space, connected tenuously with other black boxes. And it’s hard to build on top of a fragile foundation.

Transactions begin when a user presses a purchase button, which sends a request to a server, which sends other requests to other servers, and finally, a response is returned to the user.

Simple enough, but what does the system dealing with requests look like? Roughly, the initial request is fired off by the network layer of a native app architecture; a reverse proxy handles the incoming request, authenticates it, and passes it off to the handling microservice; that microservice, an independently and horizontally scalable collection of Kubernetes pods behind a Kubernetes service, processes the request and updates some data store; and everything unwinds back to the user.
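To make that path concrete, here is a minimal Python sketch of a request passing through the layers described above. It is a toy model, not Mercari's actual code: the function names, the `valid-token` check, and the in-memory `PURCHASES` dictionary are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    user_id: str
    item_id: str
    token: str
    trace: list = field(default_factory=list)  # ordered list of hops visited


# Hypothetical in-memory stand-in for the data store a microservice updates.
PURCHASES = {}


def purchase_service(req: Request) -> dict:
    """Stand-in for the handling microservice: records the purchase."""
    req.trace.append("purchase_service")
    PURCHASES[req.item_id] = req.user_id
    req.trace.append("data_store")
    return {"status": 200, "item_id": req.item_id}


def reverse_proxy(req: Request) -> dict:
    """Stand-in for the reverse proxy: authenticates, then routes."""
    req.trace.append("reverse_proxy")
    if req.token != "valid-token":  # placeholder for real authentication
        return {"status": 401}
    return purchase_service(req)


def app_network_layer(user_id: str, item_id: str, token: str):
    """Stand-in for the native app's network layer firing the request."""
    req = Request(user_id, item_id, token)
    req.trace.append("app_network_layer")
    response = reverse_proxy(req)
    return response, req.trace


response, trace = app_network_layer("buyer-1", "item-42", "valid-token")
print(response["status"], " -> ".join(trace))
```

Even in a toy like this, the recorded trace shows which components a single tap touches, which is the kind of map worth building for a real system.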

Each part of the system generates logs, events, and other data. Each part abides by certain service level objectives as well as resource constraints. A simple tap on a button might ultimately touch myriad different services, some micro and some quite large. It is a carefully orchestrated dance that belies the underlying complexity, and tracing that journey is beneficial regardless of the precise piece you work on.

Systems thinking can be applied to not just the actual servers through which requests flow, but also to the IDEs, debugging tools, and other technology that we use on an everyday basis. In order to become a better engineer, you need to make these systems work for you.

Beginner engineers tend to rely on build and deployment environments that were set up for them, and to assume those environments never need modification. But these are the systems that build your code and launch your servers, and they can be engineered with the same attentiveness and rigor as the service itself.

Think of how many times per day you build your code. If you can make that ten times faster, that is a huge multiplier on your daily productivity, and that daily productivity multiplies your weekly and monthly productivity, and so on. Similarly, learning how to be a good debugger is a skill that pays off in spades whether you work on the client side, the server side, or somewhere in between.
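A quick back-of-the-envelope calculation shows how the multiplier adds up. The numbers here are assumptions picked purely for illustration (40 builds per day, 90 seconds per build, 230 working days per year), not measurements from any real team.

```python
def daily_build_minutes(builds_per_day: int, seconds_per_build: float) -> float:
    """Total time spent waiting on builds in one day, in minutes."""
    return builds_per_day * seconds_per_build / 60


# Assumed, illustrative numbers only.
BUILDS_PER_DAY = 40
WORKING_DAYS_PER_YEAR = 230

before = daily_build_minutes(BUILDS_PER_DAY, 90)  # 90-second builds
after = daily_build_minutes(BUILDS_PER_DAY, 9)    # ten times faster

saved_minutes_per_day = before - after
saved_hours_per_year = saved_minutes_per_day * WORKING_DAYS_PER_YEAR / 60

print(f"before: {before:.0f} min/day, after: {after:.0f} min/day")
print(f"saved: {saved_hours_per_year:.0f} hours/year")
```

Under these assumptions the wait drops from 60 minutes to 6 minutes per day, which works out to roughly 207 hours per year, before counting the cost of context switches while waiting.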

Remember: Engineering is about solving problems, and if you cannot pinpoint the cause of a problem, you’re merely going to be guessing at potential solutions.

Finally, systems thinking can be applied to the actual APIs and distributed systems that you design. As a younger engineer, I used to get caught up in relatively inconsequential decisions such as directory structure and function names or, far worse, premature optimization. When you zoom out and look at the system as a whole, you understand that if your system is well designed and well unit tested, you can fix the names later, and even the macro structure of your code.

The same principles that apply to good API design also apply to good distributed systems design – your system needs built-in decoupling points in order to be resilient to bad decisions or unforeseen consequences. It is impossible to be a perfect engineer, which is why I now design for failure just as much as I design for success. I want to minimize the consequences of the failures that inevitably emerge, and I want to maximize the chance of success.
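One common way to build in such a decoupling point is a circuit breaker with a fallback. The sketch below is a deliberately simplified illustration, not a description of how any Mercari service is built: the class, the `max_failures` threshold, and the example functions are hypothetical, and a production breaker would normally also reset itself after a cool-down period.

```python
class CircuitBreaker:
    """Stops calling a failing dependency after repeated consecutive errors."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures seen so far

    @property
    def is_open(self) -> bool:
        # "Open" means the dependency is considered down and is skipped.
        return self.failures >= self.max_failures

    def call(self, func, fallback):
        if self.is_open:
            return fallback()
        try:
            result = func()
        except Exception:
            self.failures += 1
            return fallback()
        self.failures = 0  # a success resets the consecutive-failure count
        return result


def flaky_recommendations():
    # Hypothetical downstream call that is currently failing.
    raise TimeoutError("recommendation service timed out")


def no_recommendations():
    # Degraded but safe answer: the page still renders without recommendations.
    return []


breaker = CircuitBreaker(max_failures=3)
for _ in range(5):
    items = breaker.call(flaky_recommendations, no_recommendations)

print(items, breaker.is_open)
```

The design choice is that the caller always gets an answer: a failure in the optional dependency degrades the response instead of failing the whole request.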

Engineering is a blend of science, art, and creativity, all held together with systems thinking. Understanding how the components of our system, as well as your personal skill set, interact and build on top of each other is the key to becoming a better engineer.

And that is how you hack the system.


Tomorrow’s article is by @aymeric, and will be about joining Mercari from overseas! Looking forward to it!