How "Code Wins" Can Lead to Bad Decisions
A few months ago, I saw a post on LinkedIn about the concept of "code wins" or sometimes "code wins arguments". As someone who used to espouse this idea, I decided to stop and think hard about what it means and why I stopped using it years ago.
Before getting into this, I want to be clear that I agree with a lot of the arguments for "code wins", which I will cover in a moment. The problem, as I see it, lies in the application of the idea to everyday software engineering.
Best intentions
If you search the internet for information about this approach, you will find some really useful material. A lot of the proponents talk about the value of prototyping versus talking or writing. I agree that prototyping is extremely valuable and can easily take the place of a lot of the design process.
If you've read my series on design reviews, you probably know that I don't put a lot of stock in design documents. Design documents have a short-lived value as communication tools before they become out-of-date and misleading. Especially in cases where the system design relies heavily on third-party libraries or tooling, the design document becomes a narrative about reading other documents.
In these cases, I absolutely support prototyping over document-writing, code over words. Several years ago, I was building Salesforce's ML platform. One major piece of that platform orchestrated the various processes (data extraction, model training, inference, prediction recording, etc.). Like every good project, we had to deal with a build vs. buy decision around how this orchestrator would work. We had documents for technical evaluations and designs. We finally had to throw those documents out and try to use the libraries before realizing that we really needed our own[1]. Ultimately, "code won".
Why did the code win, though? Is it because the code was there? No. We threw all that code away. Instead, the code proved that the coupling of the external libraries was too high to fit into our service frameworks. We'd have to reimplement monitoring, authentication, log management, configuration management, and a dozen other things.
When it comes to decision-making, I view prototype code much like I view diagrams. They provide a level of understanding above what words alone can provide, and they can allow for easier proof or dismissal of alternatives.
...meet bad decisions
So what's my problem? Why am I writing an article that seems to be negative about "code wins"?
The problem isn't the theory I discussed above. The problem (as is often the case) comes from how it appears in the real world.
Let me give you a hypothetical scenario. During a design review, a principal engineer is picking at a particular part of the system over scalability concerns. Frustrated (and possibly defensive), the engineer presenting the design announces that it's already implemented[2] and working just fine in synthetic performance testing.
In almost all cases, that is going to put a damper on the conversation. The principal engineer may have a valid concern, which could be the root cause of an outage down the road. The declaration about existing code has basically killed the conversation, though.
Did code win? Probably. Is that a good thing? We won't know until it's too late.
Especially during my time at Amazon, I saw plenty of instances of my hypothetical. In other cases, a group that disagreed on the path forward would have their hands tied by whoever implemented something first that seemed to work, even if it had enormous architectural shortcomings.
You can argue that what I'm describing is outside the spirit of the "code wins" ideal. I would respond (probably politely) that I simply don't care. Ideals don't matter when they are abused.
The phenomenon I'm describing boils down to an exploitation of the sunk cost fallacy. When the rubber hits the road on "code wins", the decision is no longer about the best path forward but about whether the effort already put in should be discarded. Real-world software engineering doesn't usually have enough schedule leeway to decide to discard work.
Can we fix it?
Fundamentally, I do believe that code is more valuable for exploring some problems than documentation or discussions. I just don't think we should go with whatever is coded and "working"[3].
So if you agree with the idealistic view of "code wins", why not say "code proves ideas" or just support prototyping and exploration through code? We don't need more winners and losers, and we don't need more bad decisions based on the sunk cost fallacy. We don't need a one-size-fits-all approach, either. Some things lend themselves to prototyping. Other problems work better with deliberation and documentation. If you keep an open mind and pivot when one approach doesn't seem to be working, you have a much better chance of making the right decision and not needing to have an argument.
[1] Did we really, really need our own? Maybe not, but the options all came with a surprising level of coupling to other technologies. There is a point where managing open source approval for 200 packages is more effort than just writing your own.
[2] I can almost add this to the design review series. This is such a common issue, and it's why I always push for design reviews early and without large formal documents or processes.
[3] Just to be clear, prototypes don't "work" in the product sense. They are always missing features and usually just prove a certain pattern. This is where I get nervous because extensibility often gets left out, making it harder to retrofit into the system later.