Programming is the art of trade-off
No matter which programming language you use, you have probably heard this common piece of advice:
Don’t use switch statements
Besides the fact that people often forget the break statement, the more profound reason is that developers tend to avoid special cases in their code. Instead, they prefer more flexible and powerful constructs such as polymorphism or dictionaries.
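To make the advice concrete, here is a minimal TypeScript sketch of the same logic written first as a switch and then as a dictionary lookup. The shipping methods and prices are made up purely for illustration.

```typescript
// Hypothetical example: shipping cost by delivery method.
type Method = "standard" | "express" | "pickup";

// Version 1: a switch statement with one explicit case per method.
function costWithSwitch(method: Method): number {
  switch (method) {
    case "standard":
      return 5;
    case "express":
      return 15;
    case "pickup":
      return 0;
  }
}

// Version 2: a dictionary lookup. Adding a method means adding one entry.
const COSTS: Record<Method, number> = {
  standard: 5,
  express: 15,
  pickup: 0,
};

function costWithLookup(method: Method): number {
  return COSTS[method];
}
```

Both versions return the same values. The dictionary is shorter, but it only works while every case fits the same shape, which is exactly where special cases start to hurt.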
I have heard many experienced developers say that they view their code as art. That is not hard to understand, since they spend hours or even days working on a particular piece of code. They come to see it as a reflection of their personal style and aesthetic preferences, much like artists and their works of art.
However, this pursuit of elegant, well-designed code can sometimes lead to a disregard for special case handling. It's easy to get carried away with the desire to create perfect code, but this obsession can blind developers to the real problems they are trying to solve. In other words, this fixation on perfect code can be like a stain on white paper, obscuring the bigger picture.
Special Case 1
At my previous company, we upgraded our product from a web data analytics platform to a complete marketing platform. The major feature behind this upgrade was a CDP (Customer Data Platform) under the hood, which lets you create a user segment and serve personalized content to its members when they visit your website.
How did we achieve that? Technically, we stored all the user behavior data collected from the website in MongoDB. When a visitor arrived, we checked whether he matched a particular user segment based on his identity: either a cookie ID generated by our SDK for an anonymous user, or a real user ID identified by a tracking event such as login.
Some time after launch, we found that the segment-matching process was getting slower and slower. The main reason turned out to be that MongoDB held far more data than we had expected, which increased read latency dramatically.
It is a frequently cited figure that up to 98% of potential customers landing on corporate websites are completely anonymous.
When we checked our database, the share was almost the same. In the real world, most anonymous users will probably never visit the website again, so why should we hold their behavior data forever? I shared this idea with the leader of the backend team, and we started discussing potential solutions.
- He: If you want to remove an anonymous user’s data, how could you make sure he will never come back again?
- Me: Well, I couldn’t. But we could guess that he is unlikely to return based on a heuristic. For example, we could treat an anonymous user as gone if his last activity was more than 7 days ago, which is the cookie lifetime in Safari.
- He: What if he comes again one day after 7 days?
- Me: Hm, then we need a way to recover his behavior data. How about creating a separate cold database to store all the deleted anonymous user data? When an anonymous user comes to the website, we first look for his data in the normal database. If it is not there, we look in the cold database and reactivate the data by moving it from the cold database back to the normal one.
- He: It sounds like a solution. But we need to write additional code for this special case. And also if it happens, it will become even slower than now because of an extra database query.
- Me: Yes, it’s a special case that we need to handle, but it could solve the problem. And since it’s a special case, let’s see how often it could happen.
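The design discussed above can be sketched roughly as follows. This is an in-memory illustration that uses plain objects in place of the two MongoDB databases, not the code we shipped; ProfileStore, archiveInactive, find, and coldHits are hypothetical names, and the 7-day threshold is the heuristic from the conversation.

```typescript
type Profile = {
  id: string;
  anonymous: boolean;
  lastActiveAt: number; // epoch milliseconds
  events: string[];
};

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

class ProfileStore {
  private hot: Record<string, Profile> = {};  // stands in for the normal database
  private cold: Record<string, Profile> = {}; // stands in for the cold database
  coldHits = 0; // counts how often the special case actually happens

  save(profile: Profile): void {
    this.hot[profile.id] = profile;
  }

  // Move anonymous profiles idle for more than 7 days into the cold store.
  archiveInactive(now: number): number {
    let moved = 0;
    for (const id of Object.keys(this.hot)) {
      const p = this.hot[id];
      if (p.anonymous && now - p.lastActiveAt > SEVEN_DAYS_MS) {
        this.cold[id] = p;
        delete this.hot[id];
        moved++;
      }
    }
    return moved;
  }

  // Common path: a single lookup in the hot store.
  // Special case: fall back to the cold store and reactivate the profile.
  find(id: string, now: number): Profile | undefined {
    const hit: Profile | undefined = this.hot[id];
    if (hit) return hit;

    const archived: Profile | undefined = this.cold[id];
    if (!archived) return undefined;

    this.coldHits++;
    delete this.cold[id];
    const revived: Profile = { ...archived, lastActiveAt: now };
    this.hot[id] = revived;
    return revived;
  }
}
```

The extra lookup only runs when the hot store misses, and the coldHits counter is one simple way to measure how often that special case really occurs.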
After we made the change, the problem was solved, and according to the logs, the special case rarely happened.
Special Case 2
My co-founder and I are building the full-stack toolkit ZenStack on top of Prisma. One of our customers asked whether we could address an open Prisma issue:
Add findManyAndCount to return count of queried items #7550
I don’t know why Prisma doesn’t provide that, but it looks easy to achieve with a small wrapper:
const { items, count } = await prisma.$transaction(async (tx) => {
  const items = await tx.user.findMany(query);
  const count = await tx.user.count({ where: query.where });
  return { items, count };
});
Obviously, this requires two queries for each API call. But that’s a fundamental database limitation, so there seems to be nothing we can do, right?
One of our customers didn’t think so. He offered this special-case handling:
const { items, count } = await prisma.$transaction(async (tx) => {
  const items = await tx.user.findMany(query);
  // A short first page already contains every match, so we could avoid another query.
  // (With skip or cursor set, items.length alone is not the total.)
  if (query.take && !query.skip && !query.cursor && items.length < query.take) {
    return { items, count: items.length };
  }
  const count = await tx.user.count({ where: query.where });
  return { items, count };
});
Here is his quote:
Especially when building an admin dashboard, extra counts every time you change a filter or a sort direction on a table add up if your database usage is metered, so if the assuredly awesome ZenStack solution could factor that in
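One way to package this idea is a small generic helper. The sketch below is an assumption-laden illustration, not ZenStack's or Prisma's actual API: PageQuery and CountableDelegate are hypothetical types that only mimic the shape of findMany and count, the helper assumes offset pagination (take and skip, no cursor), and it leaves out the transaction for brevity.

```typescript
// Hypothetical minimal shape of the query arguments this sketch cares about.
type PageQuery<W> = { where?: W; take?: number; skip?: number };

// Hypothetical interface: anything exposing findMany and count in this shape.
interface CountableDelegate<T, W> {
  findMany(query: PageQuery<W>): Promise<T[]>;
  count(args: { where?: W }): Promise<number>;
}

async function findManyAndCount<T, W>(
  delegate: CountableDelegate<T, W>,
  query: PageQuery<W>,
): Promise<{ items: T[]; count: number }> {
  const items = await delegate.findMany(query);
  const skip = query.skip ?? 0;
  const take = query.take;

  // Special case: a page shorter than `take` means the result set ended here,
  // so the total is skip + items.length and the count query can be skipped.
  // An empty page after a non-zero skip tells us nothing, so it falls through.
  const reachedEnd = take !== undefined && take > 0 && items.length < take;
  if (reachedEnd && (items.length > 0 || skip === 0)) {
    return { items, count: skip + items.length };
  }

  // General case: one extra query for the total.
  const count = await delegate.count({ where: query.where });
  return { items, count };
}
```

With a real Prisma client, the delegate would be something like tx.user inside a transaction, possibly with some type adaptation. The trade-off is a few extra lines of branching in exchange for one fewer metered query on the last page of every listing.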
Conclusion
Many people consider programming to be closely related to mathematics, so they try to find a single solution that fits all cases, like a mathematical formula. However, programming is not just about writing code that runs; it is about finding effective solutions to real-world problems. Real-world problems are often complex and multifaceted, and they rarely have a one-size-fits-all solution.
I do believe programming is an art, but it’s an art of trade-offs: between time and space, stability and flexibility, performance and code complexity, and so on. The ability to balance these trade-offs is an essential part of programming and is what makes it an art.
Therefore, as long as you know it’s the right trade-off, don’t be afraid to handle the special case and write more code for it. That’s part of the art.