Constants and Assumptions
As programmers, we’re taught to avoid hard-coded (“magic”) numbers in code. Instead, we replace them with constants or some other kind of named value. This is good advice, and helps with the understanding of our code. Also, if one of these magic numbers needs to change in the future, we can make the change in one place. This is a relatively basic rule.
A slightly higher-level rule is to make sure that the name we use for
the constant communicates the intent of the value and not the value
itself. Otherwise, you end up with something silly like
#define TWO 2
. Or worse, once the value has to change you get
something like #define TWO 3
.
One of the systems I work on uses image data that comes from a custom hardware device. This device always captures images of a certain width and so we dutifully encoded that width in a well-named constant, following the two rules above.
Those two rules are not sufficient. We’re now in a situation where we sometimes want to get back narrower images than those provided by the hardware. The hardware still provides images of the standard width, but parts of those images will be irrelevant to us. We want to eliminate the irrelvant data as soon as it enters our system.
This is turning out to be a big change, because the well-named image width constant represents an assumption, and we allowed that assumption to pervade our system. What seemed like a good idea at the time is now an “obvious” problem in the light of this new requirement. It wasn’t so obvious as we allowed this assumption to spread over many years. Some of this code has been around for 12-15 years.
We are now working to remove this assumption from our system. We are introducing data structures that carry along the necessary information, and modifying the code that works with the data structures to use the provided information rather than the constant. Given the amount of time and code that has passed, this is not a small task.
The higher-level principle here is that global data is dangerous, and constants are a form of global data. Even though we weren’t duplicating a magic number through our code, we were duplicating the assumption represented by that magic number.
We were protected against one kind of change: if we’d built a new hardware device that caputured images of a different width, we could have changed the constant in one place and been fine. However, that’s not the change that happened.
There is a tradeoff here, though. You can try to protect your system from all kinds of assumptions and end up with a bloated, over-designed mess. So you need to be careful not to go too far.
I will definitely keep this lesson in mind as I design code in the future. I need to think carefully about what assumptions I’m making, whether I’m adequately protecting myself from them, and whether it’s worth the cost of protecting myself from them.