Generalization vs. Specialization

One of the more difficult tasks in software design is striking a balance between creating a general solution and a more specific one. Should we build something that can handle all sorts of hypothetical future requirements or one that solves a specific problem, but may not work for anything else down the road? Should we take longer and spend more up front but reap dividends down the road? Or should we go quick and dirty now and “cross that bridge when we get to it?”

For example, suppose you were building a web app that facilitated introductions over email. Jane wants to introduce John and Tom. Jane visits a website and enters John and Tom’s emails into a form. The application then sends emails to John and Tom saying that Jane would like to introduce them. John and Tom can click on a link in the email to confirm they would like to be introduced, and the application sends the two of them one another’s email addresses.

Thinking about the data model for this application, you reason that there should be a table that stores each introduction, with fields for Jane, John and Tom’s email addresses. But what if, down the road, you might want the application to support introductions for arbitrary numbers of people, rather than just two? Or what if you may want to handle not just email addresses, but phone numbers, mailing addresses, etc. Should you build the data model now to support these potential features, or should you wait until you need them to implement them?

On the positive side, building a robust data model now would enable you to quickly and easily build new features later; on the negative side, if, for lack of demand, you end up never implementing those features, well then you’ve wasted your time. On the other hand if you only build those things you need right now, over time you will end up with a tangle of spaghetti code.

So what’s the answer? Well, like many other things in life, it depends. It is really a case by case question, and that’s what makes it difficult. There are a few factors to weigh in making this decision:

  • What’s the real likelihood of implementing this feature down the road? In the example above I would argue that introducing more than two people to one another at a time is a corner case and therefore not a likely future feature. On the other hand supporting phone numbers and mailing address has a high likelihood.
  • What’s the cost of generalizing now? If the present cost of generalizing is just way too high (either in time or money), then you have no choice but to put it off for later.
  • Have we been here before? If you find yourself designing something that’s eerily similar to something you’ve done before, and you can foresee having to do it again, you should take the time to generalize now.

I find that many good software engineers fall into the trap of over-generalizing, because that is what they were taught to do in college, and so everything they do just ends up taking forever. On the flip-side, bad coders never plan for the future and just keep heaping crap on top of crap as they go. So keep in mind the guidelines above next time you have to make this trade-off. And also keep in mind that this trade-off is happening all the time in software design, so if you’re not thinking about it, you’re either wasting money or building crummy code.