Packaging
If you follow software development news at all, you probably heard about the “left-pad” issue that affected npm this past week. If you haven’t heard about it, here’s npm’s explanation of the situation.
In short, an open source developer decided to remove all of his packages from the npm registry. One of these packages was left-pad, a short utility that adds padding to the left of a string to make it a certain length. It turns out that a significant number of high-profile and heavily used JavaScript packages depended on left-pad. Its removal ended up causing builds to fail all over the place.
There has been much reaction to this event. One post asks, “Have We Forgotten How to Program?” The author, Haney, is concerned that so many package authors depended on such a simple utility rather than writing their own version.
What concerns me here is that so many packages took on a dependency for a simple left padding string function, rather than taking 2 minutes to write such a basic function themselves.
He goes on to state that “[f]unctions are too small to make into a package and dependency.”
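For a sense of scale, here is a minimal sketch of the kind of function under discussion. It is my own illustration, not the code of the removed package; the name leftPad, its parameters, and the choice to use only the first pad character are assumptions made for the example.

```typescript
// Hypothetical sketch of a left-padding helper (not the npm package's code).
// Lengths are counted in UTF-16 code units, as String.prototype.length does.
function leftPad(input: string, targetLength: number, padChar: string = " "): string {
  // Nothing to do if the string is already long enough.
  const missing = targetLength - input.length;
  if (missing <= 0) {
    return input;
  }
  // Use only the first character of padChar so the result is exactly targetLength long.
  const fill = padChar.length > 0 ? padChar[0] : " ";
  return fill.repeat(missing) + input;
}

console.log(leftPad("42", 5, "0")); // prints "00042"
console.log(leftPad("hello", 3));   // prints "hello"
```

Even at this size there are decisions to make, such as what to do with an over-long input or an empty pad character.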
Is he right? How do we decide that? When do we reuse an existing package, and when do we rewrite it ourselves?
This seemed like a good opportunity to revisit a three-part series of posts I wrote a couple of years ago where I talked about Uncle Bob Martin’s packaging principles.
Part 1 talks about the three principles of package cohesion.
Part 2 talks about the three principles of package coupling.
And Part 3 talks about how, why, and when to apply the principles.
Of particular interest is the Common-Reuse Principle (CRP), which states:
The classes in a package are reused together. If you reuse one of the classes in a package, you reuse them all.
The principle was written from the perspective of an object-oriented language, but if you replace “classes” with “functions”, it still applies.
This principle pushes us in the direction of making many small packages that only do one thing. The other principles help push us back in the other direction.
Taking the principles together, we might group left-pad, right-pad (another of the modules that was removed), and some other string manipulation utilities into a single npm module.
But in JavaScript, especially in the browser environment, there is a heavy emphasis on minimizing the download size of applications. If our application only needs the left-pad functionality, then carrying right-pad and the other utilities along for the ride adds to the overall size of the application.
That additional force seems to be what pushes npm modules to become smaller and smaller in violation of some of the other principles.
So what’s the answer?
As with most things, it depends.
As I said in Part 3 of my earlier series:
Applying any principle mindlessly is not likely to work out well. This is true of these packaging principles as well. There are sometimes-conflicting forces at play and we need to make some tradeoffs.
There is a tradeoff between having to write and maintain a bunch of code that solves problems that have already been solved, and having to maintain a large number of small dependencies. Both approaches have risks.
As I mentioned a few posts ago:
There have been a couple of recent articles about [minimizing dependencies]. Mike Perham wrote Kill Your Dependencies. And then Elle Meredith wrote a very balanced and thoughtful piece, To gem, or not to gem.
I expect there will be many more posts written in response to left-pad that suggest the same thing.
We could decide to have no dependencies at all, ending up in a world that resembles Greenspun’s tenth rule of programming:
Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
When we rewrite code that we could have reused, we could end up missing some of the corner cases, or writing something that doesn’t perform as well as the available package.
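As a small illustration of a missed corner case, consider a quick hand-written padding function. This is a hypothetical sketch; naivePad and tryNaivePad are names invented for the example. It works for the common case but throws when the input is already longer than the target length, because String.prototype.repeat rejects a negative count.

```typescript
// Hypothetical "two-minute" version: handles the common case only.
function naivePad(input: string, targetLength: number): string {
  // If input is longer than targetLength, the count is negative and repeat() throws.
  return " ".repeat(targetLength - input.length) + input;
}

// Reports what happened instead of letting the error escape.
function tryNaivePad(input: string, targetLength: number): string {
  try {
    return naivePad(input, targetLength);
  } catch (err) {
    return err instanceof RangeError ? "RangeError" : "other error";
  }
}

console.log(tryNaivePad("7", 3));       // prints "  7"
console.log(tryNaivePad("toolong", 3)); // prints "RangeError"
```

A widely used package has usually had cases like this reported and fixed already, which is part of what we give up when we rewrite.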
In general, you have to decide on a case-by-case basis which approach works better for you. There are no easy answers.
But I recommend not overreacting to the left-pad issue, because the other extreme has issues of its own. Balance is key.
I’m starting to see some ideas and tools that might help with this balance.
First, there are some JavaScript tools that are starting to use an approach called tree-shaking. The idea is that any code that isn’t used gets “shaken out” and is not shipped to clients. I know Webpack 2 will implement tree-shaking, and I assume there are other tools as well.
Once tree-shaking becomes more mainstream, we’ll be able to have larger (but still cohesive) packages without paying the price of larger download sizes.
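To make that concrete, here is a rough sketch of a small string-utilities module written with named exports. The module name string-utils and the functions padLeft and padRight are hypothetical. Tree-shaking relies on the static import and export syntax of ES modules, so a bundler can tell which exports an application actually uses; how much it removes in practice depends on the tool and on whether the module has side effects.

```typescript
// string-utils.ts: a hypothetical package that groups related helpers.
export function padLeft(input: string, targetLength: number): string {
  const missing = Math.max(0, targetLength - input.length);
  return " ".repeat(missing) + input;
}

export function padRight(input: string, targetLength: number): string {
  const missing = Math.max(0, targetLength - input.length);
  return input + " ".repeat(missing);
}

// An application that only needs padLeft would import it by name:
//   import { padLeft } from "string-utils";
// Because padRight is never imported, a tree-shaking bundler is free to
// leave it out of the code that ships to the browser.
```

With this approach, both helpers can live in one cohesive package without every consumer paying for both.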
Another idea is to bundle all of your dependencies before publishing. This was described well by Rich Harris in How to not break the internet with this one weird trick.
As I write this, ESLint 2.5.0 was just released and touts bundled dependencies as one of the highlights of the release.
This is the first version of ESLint that bundles its dependencies. Recent events have made it clear that for a development tool like ESLint, bundling dependencies makes a lot of sense. This will ensure a couple of things:
- That everyone using v2.5.0 of ESLint will be using the same dependencies, meaning that dependency updates won’t break a previously working ESLint version.
- We won’t fall victim to dependencies that were available at release time suddenly disappearing.
Bundling dependencies does mean that npm cannot dedupe ESLint dependencies upon installation, but as ESLint is a development tool only, we felt like this tradeoff was worth making to ensure that any given ESLint version that was validated to work at release time will continue to work for everyone no matter what.
As they mention, there are tradeoffs in taking this approach.
As I further edit this before posting, ESLint 2.5.3 was just released, and it backs out this change.
Unfortunately, this process turned out to be more complicated than we expected. As we continued to get bug reports over the past couple of days, we decided to revert the bundling of dependencies for now until we can investigate further.
I expect that bundled dependencies will come back in a future ESLint release, and that other packages will start moving in this direction as well. Time will tell.
How are you managing the balance between taking advantage of reusable code and having to support too many dependencies?