Structural vs Conceptual Refactoring
When I was at Rogue Rails last month, our team was working on a story-tracking application. At one point, we had a pair of Cucumber specs that looked like this:
They’re not the greatest specs in the world, but they’re handy for illustrating the point I want to make.
As you can see, the last five lines of each spec are nearly identical. This is a good place to remove some duplication.
There are at least two ways to approach such a refactoring.
The first is a structural approach, where we just look at the structure of the code and mechanically remove all of the duplication. Using this approach, we’d extract all five lines to a higher-level step definition and completely eliminate the duplication.
I’m not sure that’s even possible in Cucumber, because the duplicated lines span part of the When and the Then sections of the spec.
Even if it is possible, how would we word it? What kind of name could we give those five lines that would communicate what they do?
The second approach is more conceptual. Here, we notice that the duplication of And I fill in "I want to" with "<something>" is really somewhat incidental. In the Create a story scenario, that line really belongs with the lines before it, where we’re filling in the rest of the form.
The duplication of the And I click "Save Story" line is less incidental, but it is also the core action of the spec and so shouldn’t necessarily be extracted.
However, the three parts of the Then section are very much related to each other and should be extracted. In our case, we refactored to this:
In my experience, conceptual refactorings tend to turn out better in the long run. The concepts that bind parts of the code to each other tend to last longer than incidental structural similarity at this fine grain size.
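To make that concrete outside of Cucumber, here is a minimal plain-Ruby sketch of grouping three related checks under one concept. Everything in it is hypothetical and only for illustration: the Page struct, the story_saved? helper, and the notice, path, and title values are my assumptions, not the steps from the specs above.

```ruby
# Hypothetical stand-in for the page a spec would inspect after saving.
# None of these fields come from the original application.
Page = Struct.new(:notice, :path, :story_titles)

# Conceptual extraction: the three checks belong together because they all
# answer one question ("was the story saved?"), so they share one name.
def story_saved?(page, title)
  page.notice == "Story saved" &&
    page.path == "/stories" &&
    page.story_titles.include?(title)
end

# Filling in the form and clicking save would stay inline in each scenario;
# only the related checks are grouped behind the helper.
page = Page.new("Story saved", "/stories", ["track my stories"])
puts story_saved?(page, "track my stories")
```

The helper gets a name that describes a concept rather than a run of lines, which is what makes it easy to word and likely to survive later changes to the form.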
It takes a deeper understanding of the code to be able to find the right concepts to apply. When working on an unfamiliar legacy codebase, some simple structural refactorings are often needed in order to even begin to get a handle on things. In that case, I recommend starting with the obvious structural refactorings, while paying attention and learning as much as you can along the way, so that you can begin to move to more conceptual refactorings over time.
The structural vs conceptual divide happens at coarser grain sizes as well, all the way up to overall system design and architecture.
Often, we decide to architect our systems around structural considerations: “these parts of the system do the same kinds of things, so they should go together.” This gives us client-server and N-tier architectures, for example.
The alternative would be to architect our systems around conceptual considerations: “these parts of the system are about accomplishing goal X, so they should go together.” I’m sure there’s a name for such architectures, but I don’t know of one. I tend to see less of this style of system in the wild.
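As a rough sketch of the difference, here is the same tiny feature organized both ways in plain Ruby. The module names (Layered, TrackingStories) and the Story struct are hypothetical, chosen only to contrast the two groupings; this is not a recommendation for how to lay out a real application.

```ruby
# Structural grouping: modules are named for the kind of work they do.
module Layered
  module Models
    Story = Struct.new(:title)
  end

  module Views
    def self.story_line(story)
      "Story: #{story.title}"
    end
  end

  module Controllers
    # A single feature has to reach into every layer.
    def self.create_story(title)
      Views.story_line(Models::Story.new(title))
    end
  end
end

# Conceptual grouping: one module holds everything for one goal.
module TrackingStories
  Story = Struct.new(:title)

  def self.create(title)
    render(Story.new(title))
  end

  def self.render(story)
    "Story: #{story.title}"
  end
end

puts Layered::Controllers.create_story("track my stories")
puts TrackingStories.create("track my stories")
```

Both versions produce the same string; what changes is which files and modules a new feature has to touch.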
At the architecture level, I think it makes more sense to have structural divisions in the code, largely for reasons of infrastructure and deployment. In web applications, for example, it makes a lot of sense to separate what happens on the server from what happens on the client. However, when working on a system like this, I notice that every new feature I build has to touch many or all of the architectural layers, and I wonder if there is a better way to do things.
I don’t have any good answers for that yet. Do you?