[Update Nov 27: This post had issues, and I retract some of my more provocative claims. See the errata at the end.]
All software comes with a version, some sequence of digits, periods and characters that seems to march ever upward. Rarely are the optimistically increasing versions accompanied by a commensurate increase in robustness. Instead, upgrading to new versions often causes regressions, and the stream of versions ends up spawning an extensive grapevine to disseminate information about the best version to use. Unsatisfying as this state of affairs is to everyone, I didn't think that the problem lay with these version numbers themselves. They're just names, right? However, over the past year I've finally had my attention focused on them, thanks to two people:
Why is version pinning so prevalent? The proximal reason is that modern package managers uniformly[1] fail to provide the sane default of "give me the latest compatible version, excluding breaking changes."
These are all deep, competent projects. Why are their defaults so uniformly useless and misleading? The underlying reason is the traditional format of version numbers: mashing together multiple numbers into a single string, and more importantly separating the version string from the name of a package. A dependency that is just a name provides no hint on what version you want compatibility with, so a package manager has no easy way to pick a good default version.
Towards a better approach
To begin with, it's weird that versions are strings. Parsing versions is non-trivial. Let's just make them a tuple. Instead of "3.0.2", we'll say "(3, 0, 2)".
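To see why the string form is awkward, here is a minimal sketch in Python. The helper name parse_version is made up for illustration, and it assumes a purely numeric dotted version with no pre-release suffixes like "-rc1".

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as "3.0.2" into a tuple of ints.

    Hypothetical helper: assumes every dot-separated part is a plain integer.
    """
    return tuple(int(part) for part in text.split("."))


# Strings compare character by character, so "3.10.0" sorts before "3.9.0".
string_order_is_wrong = "3.10.0" < "3.9.0"

# Tuples compare element by element as numbers, which is what we want.
tuple_order_is_right = parse_version("3.10.0") > parse_version("3.9.0")
```

Both flags come out true: the string comparison gets the order backwards, while the tuple comparison needs no special version-aware logic at all.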
Next, move the major version to part of the name of a package. "Rails 5.1.4" becomes "Rails-5 (1, 4)". By following Rich Hickey's suggestion above, we also sidestep the question of what the default version should be. There's just no way to refer to a package without its major version.
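As a rough sketch of that renaming step in Python (split_major is a hypothetical name, not something any package manager provides), the major version moves into the package name and only the remainder stays behind as a tuple:

```python
from typing import Tuple


def split_major(name: str, version: Tuple[int, ...]) -> Tuple[str, Tuple[int, ...]]:
    """Fold the major version into the package name.

    Hypothetical helper: assumes the first element of the tuple is the
    major version and everything after it is minor/patch detail.
    """
    major, rest = version[0], version[1:]
    return f"{name}-{major}", rest


renamed = split_major("Rails", (5, 1, 4))
# renamed == ("Rails-5", (1, 4))
```

After this step there is simply no value you can write down that names the package without also naming its major version.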
Since we always want to provide the latest version by default, the distinction between minor versions and patch levels is moot. Just combine the 2-tuple into a single number. "LeftPad 17.5.0" would now become something like "LeftPad-17 37".
At this point you could even get rid of the version altogether and just use the commit hash on the rare occasions when you need a version identifier. We're all on the internet, and we're all constantly running npm install or equivalent. Just say "LeftPad-17"; it's cleaner.
And that's it. Package managers should provide no options for version pinning.
A package manager that followed such a proposal would foster an ecosystem with greater care for introducing incompatibility[2]. Packages that wantonly broke their consumers would "gain a reputation" and get out-competed by packages that didn't, rather than gaining a "temporary" pinning that serves only to perpetuate them. The occasional unintentional breakage would necessitate people downstream cloning repositories and changing dependency URLs, which would create a much more stringent atmosphere of accountability for the breaking package. As a result, breaking changes wouldn't live so long that they gain new users.
In particular, Semantic Versioning is misguided, an attempt to fix something that is broken beyond repair. The correct way to practice semantic versioning is without any version strings at all, just Rich Hickey's directive: if you change behavior, rename. Or ok, keep a number around if you really need a security blanket. Either way, we programmers should be manually messing with version numbers a whole lot less. They're a holdover from the pre-version-control pre-internet days of shrink-wrapped software, vestigial in a world of pervasive connectivity and constant push updates. All version numbers do is provide cover for incompatibilities to hide under.
Update (Nov 27)
This post drew a lot of great feedback on Hacker News and Lobste.rs. After a day of engaging with comments, my conclusion was that I should have been more explicit about my focus: the flow for upgrading (and testing) software in development, not deploying to production. In particular, the post doesn't make any claims about versions in production. Reproducible builds are great! But you just need a hash for them. Right?
A prolonged exchange with Joel Parker Henderson convinced me that it's just not feasible to separate operational concerns from development concerns. A common question when managing software in production is, "what version is this running?" And that question quickly requires drilling down to the constituent pieces of a release and their versions. A hash makes that too hard. And you can't have separate version strings for development and deployment either; that's just a recipe for confusion. Therefore, if you take operational considerations into account, my claim that we don't need versions at all is invalid.
What, if anything, remains of value in this post? Package managers should by default never upgrade dependencies past a major version.
The design goal of a package manager should be that a dependency once added to Gemfile or package.json should never need to be modified until it's deleted. What people specify manually goes there, what the package manager deduces goes somewhere else (like Gemfile.lock). If people are editing version strings en masse in Gemfile or equivalent, that is a smell.
In the next mainstream platform, the versions people specify for dependencies should consist of just a major version, because that's the part that the package manager can never deduce. SemVer is a siren here because it conflates pieces from multiple jurisdictions. The major version is the user's responsibility, and minor and patch versions are the package manager's. Why coalesce the two? That just necessitates baroque syntax like twiddle-waka (the ~> operator in a Gemfile) to do the safe thing.
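Here is a minimal sketch in Python of what that split of responsibilities could look like. Everything in it is hypothetical: the names manifest, index, resolve and latest_within_major are mine, and the in-memory dictionary stands in for a real package registry. The manifest is what a person writes (name and major version only); the lock is what the tool deduces.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, int, int]  # (major, minor, patch)


def latest_within_major(available: List[Version], major: int) -> Optional[Version]:
    """Pick the newest release that shares the requested major version."""
    candidates = [v for v in available if v[0] == major]
    return max(candidates) if candidates else None


def resolve(manifest: Dict[str, int], index: Dict[str, List[Version]]) -> Dict[str, Version]:
    """Deduce a lock (exact versions) from a manifest of majors.

    Never crosses a major version boundary, even if a newer major exists.
    """
    lock: Dict[str, Version] = {}
    for name, major in manifest.items():
        chosen = latest_within_major(index.get(name, []), major)
        if chosen is None:
            raise LookupError(f"no release of {name} with major version {major}")
        lock[name] = chosen
    return lock


# What a person specifies: just the major version.
manifest = {"rails": 5}

# Stand-in for a registry; these version lists are invented for the example.
index = {"rails": [(4, 2, 10), (5, 0, 6), (5, 1, 4), (6, 0, 0)]}

lock = resolve(manifest, index)
# lock == {"rails": (5, 1, 4)}
```

Note that the manifest never changes as new minor and patch releases appear; only the deduced lock does. The 6.0.0 release is ignored until someone deliberately edits the manifest to ask for major version 6.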
(And oh, if RubyGems and NPM are smelly, the Clojure approach totally stinks. Clojure requires manual intervention to pull in compatible fixes, including security fixes, for dependencies. It follows the existing Java approach, but Java's ecosystem predates the advent of package managers, half of whose reason for existence, after installing dependencies, is updating dependencies. I may still be unaware of some design rationale, but for now I think Leiningen really missed an opportunity to improve on Java here.)
1. One exception here is Go, where the standard go get command requires no versions, and always grabs the head of the repo. However, the community seems to be turning from the light to the darkness with a proposal for a tool called go dep. It's unclear to me if this is due to a failure of communication on the part of the original authors of Go, or if there's a deeper justification for go dep that I'm missing. If you know, set me straight in the comments below.
Comments are gratefully received. Please send them to me by any method of your choice and I'll include them here.