Never Just One
Writing software these days involves asking the question "How many?" over and over again. There is one answer that's always wrong.
The world of enterprise software is complicated. Oftentimes, we have so many moving parts that we can't really hold an entire system in our brains at once. We make simplifying assumptions. It's natural.
I want to talk about one simplifying assumption that you shouldn't make, though.
Never assume that there is exactly one of anything.
What about the singleton pattern?
Ah, the singleton pattern. You probably shouldn't use it in enterprise software. Over the course of my career, singletons have moved from something you code (e.g. a static INSTANCE member in a class) to something you configure (e.g. singleton-scoped beans in Spring). Why did this change happen?
Because there's never just one.
Sure, there may be just one per runtime. There isn't just one overall, though. Inversion of Control (IoC) frameworks like Spring encouraged this change because even a coded singleton can have dependencies. Maybe you want to mock those dependencies in tests. Maybe the dependencies differ based on environment. Maybe you use a different singleton for the same job in different environments.
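Here is a minimal sketch of the "configured, not coded" idea in plain Python, without any IoC framework. Every name in it (Clock, AuditLog, build_audit_log) is hypothetical and only there for illustration. The point is that AuditLog never reaches for a global instance; whoever wires the application decides which dependency it gets.

```python
import time
from typing import Protocol


class Clock(Protocol):
    def now(self) -> float: ...


class SystemClock:
    """Real dependency: reads the wall clock."""

    def now(self) -> float:
        return time.time()


class FixedClock:
    """Test double: always returns the same instant."""

    def __init__(self, instant: float) -> None:
        self._instant = instant

    def now(self) -> float:
        return self._instant


class AuditLog:
    """Needs a Clock, but does not decide which one it gets."""

    def __init__(self, clock: Clock) -> None:
        self._clock = clock
        self.entries = []

    def record(self, message: str) -> None:
        self.entries.append((self._clock.now(), message))


def build_audit_log(environment: str) -> AuditLog:
    """Wiring code: one AuditLog per runtime, chosen by configuration.

    The environment names are assumptions for this sketch.
    """
    if environment == "test":
        return AuditLog(FixedClock(0.0))
    return AuditLog(SystemClock())
```

There is still only one AuditLog per runtime if the wiring code is called once, but nothing in AuditLog itself assumes that.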
Ignoring the dangers of singletons can lead to insanely complex workarounds. Long ago at Amazon, my team owned a framework that had such a static and singleton problem that we implemented static byte code weaving rather than fixing the application code.
Well, I only need a single global instance of this service
Oh, really? So you're not going to integration test it? No GDPR or CCPA implications? Certainly, no country is going to pass a data residency law, right?
This may feel a bit harsh, but we've been living for years with multiple privacy and data residency laws. We need testing environments (and not just 1!) and different production environments for various compliance and legal requirements. That's what working in enterprise software means right now.
Let's say that your use case really does require a single global instance. You are still going to have other production-like instances that need to operate independently. Maybe there's a big refactor that needs to be tested. Or a database migration. Or an authentication change. Don't paint yourself into a corner.
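One cheap way to avoid that corner is to never hard-code "the" instance and instead look it up by environment and region, even if production only has one entry today. The sketch below assumes a hypothetical Deployment record and DeploymentRegistry class; the environment names and example.com URLs are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    environment: str  # e.g. "prod", "staging", "migration-rehearsal"
    region: str       # e.g. "global", "eu"
    base_url: str


class DeploymentRegistry:
    """Maps (environment, region) to a deployment instead of assuming one."""

    def __init__(self, deployments):
        self._by_key = {(d.environment, d.region): d for d in deployments}

    def resolve(self, environment: str, region: str = "global") -> Deployment:
        try:
            return self._by_key[(environment, region)]
        except KeyError:
            raise LookupError(
                f"no deployment for {environment}/{region}"
            ) from None


# Hypothetical configuration: one global production instance plus an
# independent production-like copy for rehearsing a migration.
registry = DeploymentRegistry([
    Deployment("prod", "global", "https://svc.example.com"),
    Deployment("migration-rehearsal", "global", "https://svc-rehearsal.example.com"),
])
```

Adding an EU-only production instance later becomes one more entry in the list rather than a refactor of every caller.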
This goes for data stores too
I might stir up some controversy here, but I think it's needed. Your system should assume that it will have to operate in a hybrid data store mode. What do I mean by that? Let me provide an example.
Several years ago, I wrote the best data store code I've ever written. We were using a proprietary (because Redis/Valkey didn't exist back then) key-value store. It was partitioned and replicated. The hosts only had so much disk space, though. We knew that we'd need to potentially re-partition or generally expand our data store cluster.
So I wrote a data store layer that could have multiple data store configurations active at any given time. We could set up multiple interaction patterns between them: dual write, single write, single read, fallback read, and fallback read with replication.
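To make those interaction patterns concrete, here is a minimal sketch of what such a layer could look like. It is not the original implementation, which isn't shown here. The class names, the two-store limit, and the reading of "fallback read with replication" as "copy the value into the primary store when it was only found in the fallback" are all assumptions.

```python
from enum import Enum
from typing import Optional


class InMemoryStore:
    """Stand-in for a real key-value store client."""

    def __init__(self) -> None:
        self.data = {}

    def get(self, key: str) -> Optional[bytes]:
        return self.data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.data[key] = value


class WriteMode(Enum):
    SINGLE = "single"  # write to the primary only
    DUAL = "dual"      # write to primary and secondary


class ReadMode(Enum):
    SINGLE = "single"                          # read the primary only
    FALLBACK = "fallback"                      # on a miss, try the secondary
    FALLBACK_REPLICATE = "fallback_replicate"  # ...and copy hits to primary


class MultiStore:
    """Routes reads and writes across two stores according to config."""

    def __init__(self, primary, secondary, write_mode, read_mode) -> None:
        self.primary = primary
        self.secondary = secondary
        self.write_mode = write_mode
        self.read_mode = read_mode

    def put(self, key: str, value: bytes) -> None:
        self.primary.put(key, value)
        if self.write_mode is WriteMode.DUAL:
            self.secondary.put(key, value)

    def get(self, key: str) -> Optional[bytes]:
        value = self.primary.get(key)
        if value is not None or self.read_mode is ReadMode.SINGLE:
            return value
        value = self.secondary.get(key)
        if value is not None and self.read_mode is ReadMode.FALLBACK_REPLICATE:
            self.primary.put(key, value)  # lazily migrate the key
        return value
```

In a migration, the new cluster would be the primary and the old one the secondary: start with dual write plus fallback read with replication, run the re-propagation job, then switch to single write and single read through config changes alone.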
It was beautiful. It worked. All we had to do was walk through a series of config changes to the data store layer and run a job to re-propagate all of our data to make sure it was on the new data store. I'm proud of that code to this day.
I've never seen an ORM library that can do this. They always assume that there's just one database, violating the entire premise of this post.
Without allowing for multiple databases, you cannot have zero-downtime upgrades. You cannot have safe, revertible migrations (schema or otherwise). We've spent so much time trying to make blue-green deployments of database upgrades work with minimal downtime through CNAME flips, but we've ignored that we can eliminate the downtime entirely by accounting for failovers and flips in the client.
Some other things that have more than one
I could go on and on, but here's a list of things of which there is more than one, just to drive my point home:
- CPU architectures (x86_64, ARM64, RISC-V)
- OSes (Linux, BSD, Windows, and macOS, because your systems need to run on dev machines)
- Core libraries (glibc/musl)
- Clouds (AWS/GCP/Azure) and all their native offerings
- Database software (PostgreSQL, MySQL, Oracle, Db2, and all their various versions)
- Cache software (Memcached, Redis, Valkey)
- Language versions
- API versions (this is a whole post in itself)
- UI languages
I could probably go on for far too long, if I haven't already. I'm not saying that we need to write software that can handle the full matrix of options above. I am saying that if you decide on one (say, x86_64 CPUs), don't assume that the choice will last forever. It doesn't take much to stick a CPU architecture tag on an artifact. Then, it'll be easier to add a new one later and probably won't force you to rebuild half the system.
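As a rough illustration of how little that tag costs, the sketch below puts the architecture into an artifact's filename. The Artifact class, the alias table, and the naming scheme are assumptions for this example; platform.machine() is the standard library call that reports the host's machine type.

```python
import platform
from dataclasses import dataclass

# Different operating systems report the same architecture under different
# names; this alias table is illustrative and not exhaustive.
_ARCH_ALIASES = {
    "amd64": "x86_64",
    "x86_64": "x86_64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "riscv64": "riscv64",
}


def normalize_arch(machine: str) -> str:
    """Map a platform-reported machine name to one canonical tag."""
    name = machine.lower()
    return _ARCH_ALIASES.get(name, name)


@dataclass(frozen=True)
class Artifact:
    name: str
    version: str
    arch: str  # always present, even while only one value is ever built

    @property
    def filename(self) -> str:
        return f"{self.name}-{self.version}-{self.arch}.tar.gz"


def artifact_for_host(name: str, version: str) -> Artifact:
    """Tag an artifact with the architecture of the machine building it."""
    return Artifact(name, version, normalize_arch(platform.machine()))
```

If every artifact is x86_64 today, the tag is redundant. The day an ARM64 build shows up, the storage layout, download logic, and deploy tooling already have a place for it.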