"Stop using semantic versioning" — any writings on this?

I’ve just watched Stuart Halloway’s talk “Stewardship Made Practical”; at 12:50 he advises against semantic versioning.

Basically, he’s saying that if there’s a breaking change, it’s better to either bump a number in the library name or to rename it completely, because it’s not clear if and how old code would break. And that sometimes relying on multiple versions of a library is OK. A couple of slides:

(screenshots of the two slides)

Is there any writing that discusses pros and cons of this approach?

The link to the talk, with the timestamp:

2 Likes

Somewhat related: encore/BREAK-VERSIONING.md at master · ptaoussanis/encore · GitHub

As my 5c: I don’t care about versions. Whenever I update a dependency, I approach the whole process very carefully and regardless of whether it is a patch, a minor, or a major update.
It’s more important what kind of a dependency I’m updating and what the actual changes are than whether any particular number got updated or the library’s name got changed:

  • A dependency that’s used in a non-critical place (e.g. highlight.js that’s used only during development): I upgrade it and see if it still works, at least at a glance. Done
  • A dependency that drives a particular user feature and nothing besides it: I read the changelog and the diff if it’s short, and carefully test that particular feature
  • A “core” dependency that would break many or all things if updated incorrectly: I definitely read the changelog and the full diff. If the diff makes me suspicious of something, check all the places where I use the dependency to see if something could break. Finally, after all the changes to my code to adapt to the newer version of the library, I thoroughly test the app in a way a user would use it

Any other label-based approach (be it a version or a library name) is, while usually simpler, also much more error-prone.

2 Likes

I’m not aware of any discussion of the pros and cons of Stuart’s approach, and while I have huge respect for Stuart’s work, with some trepidation I think I have to disagree with his suggestion. However, I’ve not yet watched the full video, so perhaps I’m missing some crucial point.

I think what really underlies Stuart’s position is ultimately expectation management, and the reason I don’t agree with his suggestion is that I don’t see it significantly improving the current situation. In fact, I feel it just substitutes one set of issues for another.

My own position is probably closer to that of the response from @p-hinik, which is why I feel it is essentially about expectation management. Stuart seems to expect more from semantic versioning than I do. For me, it really just provides a guide to how much effort I can expect to invest when updating a dependency to a new version. If the version only represents a bug fix or a minor version change, I expect minimal effort and no API compatibility changes. If, on the other hand, there is a major version change, then all bets are off and all I can assume is that there are major changes which will most likely break things. In either case, there is no substitute for comprehensive testing after updating a dependency.

Versioning in any form is a challenge. Semantic versioning can provide valuable additional information to clients of a library. However, depending on the complexity, size, and breadth of change between versions, it can be extremely challenging to get right. It is easy to miss a change which has broken API compatibility in a subtle way, to miss one simply because of the sheer number of changes, or to fail to recognise when one of your own dependency changes bubbles up to require a major version change. In one of my JS libraries, I’ve had issues logged because I increased the major version number and a user could not find any breaking change, so they felt changing the major version number was a mistake. Again, an expectation mismatch.

Stuart’s solution feels a bit like a response to an expectation that semantic versioning would somehow protect against things blowing up when there is a major version change. I’m not sure how it could, and I don’t see how using a different name, whether it is a different library name or a different namespace reference, changes things. If foo() from a new version of the library is going to blow up in my application, whether the library has a new name or is imported with a new namespace reference is irrelevant: it will still blow up. In some senses, Stuart’s approach is even worse from a maintenance/change perspective, as it will likely require more changes sprinkled through my code (everywhere the namespace is imported, compared to just the deps) and, in the worst case, could result in some weird situation where my application depends on two versions of a library at the same time, which in itself could introduce weird issues.

2 Likes

I’ll give you a concrete example:

HoneySQL 1.x is widely used. It has specific Clojars coordinates and specific namespaces.

HoneySQL 2.x is mostly compatible but there are some changes in the API and some changes in the underlying DSL.

If I had pushed out 2.x using the same namespaces, folks are faced with an all-or-nothing upgrade.

So I added the updated API in new namespaces so folks could start using 2.x without changing their code, using the new API alongside the old API, based on which namespaces they required.

In addition, because of transitive dependencies potentially bringing in different versions of any given library, I decided to push 2.x out using a different group/artifact ID so that folks can guarantee that code expecting the 1.x APIs continues to work while any code expecting the 2.x APIs gets those – completely independent of any transitive dependency issues – and by ensuring that 2.x is always accretive/fixative and never breaks the API, folks are always safe to use the latest 2.x version alongside 1.x.
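
Concretely, a project can depend on both at once; a deps.edn sketch (the version numbers here are illustrative):

```clojure
;; deps.edn: both major versions coexist because they use different
;; group/artifact IDs *and* different namespaces (honeysql.core vs honey.sql),
;; so neither can shadow or conflict with the other on the classpath
{:deps
 {honeysql/honeysql                {:mvn/version "1.0.461"}   ; 1.x API
  com.github.seancorfield/honeysql {:mvn/version "2.4.1066"}  ; 2.x API
  }}
```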

The big problem with Semantic Versioning is that people aren’t generally very good at following it and transitive dependencies make this worse. Consider Library A 1.1.0 uses Library B 2.8.0 and B gets an update to 3.0.0 but none of the changes from 2.8.0 cause breakage for the code within A… so Library A decides to depend on B 3.0.0 as part of a minor update and releases A 1.2.0 with some existing functionality updated to depend on some non-breaking enhancement that B 3.0.0 provides. Library A 1.2.0 has no breaking changes in its code or API so it looks like a safe upgrade for Application X which depends on A 1.1.0… but Application X also depends on C 4.1.2 which deep down in its dependencies depends on B 2.7.4… in a way that B 3.0.0 breaks.

So, Application X changes a single dependency, for a minor release and explodes in a completely unrelated area of the code.
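
To make the shape of that failure concrete, here is the dependency tree from the hypothetical above, sketched as a comment:

```clojure
;; Application X's resolved tree after bumping A from 1.1.0 to 1.2.0:
;;
;;   Application X
;;   |-- A 1.2.0          ; "minor" bump, looks safe
;;   |   `-- B 3.0.0      ; A moved to B's new major; harmless *for A's code*
;;   `-- C 4.1.2
;;       `-- ... B 2.7.4  ; C (transitively) needs the old B
;;
;; Maven-style resolution selects a single B for the whole application,
;; so C ends up running against B 3.0.0, and breaks.
```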

Are you going to argue that a) Library A should never update a transitive dep to a new major version without also moving to a new major version itself? b) Application X should consider the entire transitive dependency tree when considering even a minor library update? c) Some other option? Or would you say that Semantic Versioning is “broken” at this point?

That sounds a bit confrontational, as written, but it is intended as a genuine question: I’m just really not sure what the answer is here, but this sort of thing has been enough to convince me that SemVer is just not terribly useful in the real world… :frowning:

8 Likes

I certainly agree there is a use case where it makes better sense to release a library under a new name rather than just a new major version number. Your example with HoneySQL is a good one. However, I don’t see that as sufficient justification for not using semantic versioning. There is nothing in semantic versioning which prevents anyone from releasing a library under a new name, and in fact I have done this in the past, when a library changed so significantly it no longer resembled the original API, or when I wanted to provide a more controlled or fine-grained upgrade path (as you did with HoneySQL).

I also agree with the issue you raised regarding transitive dependencies. This is a real problem, and in fact my first draft response included it as an example of one of the weaknesses in semantic versioning. However, I removed it, partly to make the response shorter, but also because I don’t believe the alternative approach of releasing the library under a new name addresses that issue either. In fact, I wonder whether, in a gradual transition approach such as the one you outlined with HoneySQL, the situation could be even worse, as you are likely going to be loading two versions of the same transitive dependency. (I guess if all libraries stopped using semantic versioning and all libraries used a new name when releasing a new version, this might not be an issue, but I don’t see that as terribly likely.)

No, I do not consider semantic versioning to be broken. However, this goes back to my original point regarding expectations. I don’t have any expectation that semantic versioning solves the messy reality associated with updating dependencies. I don’t think there is any great solution. All we have are some practices which can help with assessing the likely impact of an upgrade, and that is only a likely indicator, not a solid-gold guarantee. All a semantic version number can really provide is a high-level indication of what has changed in the version. It provides no guarantee regarding how the changes will impact your application. In fact, I would argue that the real problem with semantic versioning is that people interpret the version as implying too much with respect to any application which uses it. A library’s version number, semantic or otherwise, cannot imply anything about the applications which use it; it can only impart information about the library itself. Anyone who is surprised to find their application broken after upgrading a dependency which only had minor or patch/bugfix level changes (i.e. from 1.0.0 to 1.0.1 or 1.1.0) is misinterpreting what a semantic version is telling them. It isn’t telling them that the changes are not going to break their application. It is only telling them that the changes either add new functionality or fix a known issue. As the major number hasn’t changed, it is also telling them that existing API signatures and return values have not changed ‘shape’. However, as the library author doesn’t know how you use the library, they cannot know, for example, whether your application relied on the very issue/bug which has been fixed.

Version and dependency management is tough. I don’t think there are any shortcuts. I find semantic versioning useful as it provides, at a glance, some additional information which can help manage dependencies for my project. I don’t interpret semantic versioning as providing any hard guarantee, and I approach all dependency updates as potential breakage regardless of the version number. A semantic version gives me a high-level summary of the extent of change within a library and can set some expectations regarding the effort associated with the upgrade, but it implies no guarantees. The alternative is a version number which is bespoke for each library. It tells me nothing other than that it is a different version. This isn’t necessarily a bad thing, and I certainly don’t argue that everyone should use semantic versioning. You should use whatever approach is right for your project (just be clear about that, to avoid confusion and misaligned expectations). Likewise, I don’t think we should tell people not to use semantic versioning.

Like many things, the better solution likely lies somewhere in the middle. For example, there is nothing preventing you from using semantic versioning and, when it is appropriate, releasing a library under a new name with a new semantic version.

2 Likes

But they are different dependencies – different group/artifact and different namespaces – with a commitment to only fixative/accretive changes in each. That’s kind of the whole point: don’t make breaking changes, use different names.

The whole point of SemVer is to set those expectations: the fact that you don’t have any expectations about SemVer solving the problems means that you don’t believe SemVer works. What value do you get from SemVer if you can’t have those expectations? You might as well just use pretty much random version numbers at that point and “force” everyone to read your release notes (and rely on them being accurate) – or force everyone to treat every release as potentially breaking…

So SemVer is no better than any other versioning system? In which case, we agree. Which means SemVer is “broken” because it doesn’t achieve its goals.

Semantic Versioning 2.0.0 | Semantic Versioning (semver.org) repeatedly talks about “backwards compatible”. It specifically says “A simple example will demonstrate how Semantic Versioning can make dependency hell a thing of the past.”

Perhaps the underlying issue we might agree on is that almost no one is doing SemVer correctly? And therefore that claims made by a library that it “uses SemVer” should be taken with a grain of salt.

I suspect we will just have to agree to disagree.

The whole point of SemVer is to set those expectations: the fact that you don’t have any expectations about SemVer solving the problems means that you don’t believe SemVer works. What value do you get from SemVer if you can’t have those expectations? You might as well just use pretty much random version numbers at that point and “force” everyone to read your release notes (and rely on them being accurate) – or force everyone to treat every release as potentially breaking…

This seems to be where our expectations differ. For me, semver tells me about the API changes within a new version. It tells me whether the API has changed; or whether it has remained the same but new functionality has been added; or whether the API has not changed and no new functionality has been added, but a bug has been fixed. That is all. It makes no promises regarding how the new version will behave with my application. That is my issue to assess, and not something a library author is in a position to assess: they don’t know my app and cannot assess how those changes will impact my application.

Your position seems to be that, because it would still be possible for a new version with no major version number change to blow up in your app, semver is totally broken and should not be used. I think that is overstating the issue and is based on an unreasonable expectation.

So SemVer is no better than any other versioning system? In which case, we agree. Which means SemVer is “broken” because it doesn’t achieve its goals.

I would agree that semver should not claim to have solved the ‘dependency hell’ problem. I do think it is a tool which can help with managing that problem, but stating that it solves it is similar to the JS claim that promises solved callback hell. However, I don’t agree that because semver does not solve dependency hell it has no use, or that it is no better than any other versioning system. Semver does provide useful information. In addition to telling me that the library has changed, it tells me how it has changed: the existing API has changed, the existing API has been extended, or there are no API changes and only internal bug-fix changes. It is up to me to determine how those changes can and do impact my application.

Flip things around the other way. Assume we don’t use semver and instead adopt the practice of always releasing a library under a new name, possibly by appending the version number to the name. As a user of the library, all that tells me is that a new version has been released. I don’t know if the API has changed or stayed the same. I have no way to assess the effort required to update to the new version without reading through the documentation, and how easy that is will depend heavily on the quality of the documentation. Non-semver version numbers represent a reduction in information. I guess less information means less chance of misinformation, but that seems like a high cost.

At this point, you are possibly thinking, “Ah, but people don’t always follow that process, and sometimes there are API changes in existing APIs without a major version change, or something classified as a minor version extension or a bug fix breaks API backward compatibility in subtle ways, etc.” To that I would argue there is a difference between something being used incorrectly and something being inherently broken. A system which fails because people didn’t use it correctly is not necessarily a broken system. Incorrect use can indicate a problem, especially if the majority of people use it incorrectly; however, I don’t think that is the case here. My experience has been that most people use it correctly, and incorrect use is most often just a release error which is quickly addressed once identified. Incorrect use might simply be an example of the need to provide better documentation or training.

I don’t believe any versioning system can provide any assurance regarding how well a new version will interact with your application. This is an unreasonable expectation as library developers have no knowledge or control over how their libraries are used within applications. A version number can convey information about how a library has changed, but not about how that will impact the user of the library. I also don’t believe semver is making any claims regarding how a new version will behave in your application. All it is telling you is how the API for the library has changed. The rest is something for you to determine through testing.

But that’s not what Stu (and Rich and others) are saying.

They’re saying, for an existing dependency, always be backward compatible – only fix bugs and add new functionality. If you need to break backward compatibility, use a new name.

If you follow that process, you are “guaranteed” to be able to update to any new version of the library and expect backward compatibility – updating should always be safe.

Any backward incompatible behavior will either be:

  • a new function in the existing library
  • a new namespace in the existing library
  • a new group/artifact (and new namespaces to avoid collisions)

In other words, you can safely update and ignore the (new) backward incompatible behavior until you specifically decide to switch over some or all of your code to that new behavior.
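
From the consumer’s side, an upgrade under that discipline looks like this (the library and namespace names here are hypothetical):

```clojure
;; my-app depends on a hypothetical "acme.json" library, which shipped
;; incompatible new behavior under a *new* namespace instead of changing
;; the old namespace in place
(ns my.app
  (:require [acme.json :as json]        ; old API: untouched, still correct
            [acme.json.v2 :as json2]))  ; new behavior: opt in per call site

;; existing call sites keep using `json/...`;
;; new code can use `json2/...` whenever you decide to switch over
```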

At that point, the only real purpose of versioning is for your dependency management tool to be able to select the “most recent” version (which is what tools.deps and the Clojure CLI does – unlike Leiningen/Maven). And tools.deps can figure out “most recent” even in the case of git dependencies as long as all of the versions in play can be linearly arranged, unambiguously, to identify a “most recent” SHA.
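
For git deps, “most recent” is determined by commit ancestry rather than by comparing version strings; a deps.edn sketch (the coordinates and sha are hypothetical):

```clojure
;; tools.deps compares two candidate shas of the same git library by
;; checking whether one commit is an ancestor of the other, and keeps
;; the descendant; no version number is needed at all
{:deps
 {io.github.example/somelib
  {:git/sha "0123456789abcdef0123456789abcdef01234567"}}}
```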

It’s why, in the Clojure world, you’re starting to see quite a few libraries move to MAJOR.MINOR.COMMITS: the major/minor part is informational for users really – a library could easily stay at 1.0.z forever and just indicate “most recent” release via the commit count (z).
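
The commit count is easy to automate; a build.clj sketch using tools.build (the major/minor numbers are illustrative, but git-count-revs is a real tools.build function):

```clojure
;; build.clj: MAJOR.MINOR.COMMITS versioning
(ns build
  (:require [clojure.tools.build.api :as b]))

;; major/minor are informational and hand-maintained;
;; the commit count makes every release's z strictly increase
(def version (format "1.0.%s" (b/git-count-revs nil)))
```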

3 Likes

Assume we don’t use semver and instead adopt the practice of always releasing a library with a new name

  • possibly by appending the version number to the name. As a user of the library, all that tells me is
    that a new version has been released.

But that’s not what Stu (and Rich and others) are saying.

They’re saying, for an existing dependency, always be backward compatible – only fix bugs and add new
functionality. If you need to break backward compatibility, use a new name.

OK, that wasn’t totally clear to me from the snippet in the video.
Thanks for clarifying.

That approach does seem to make more sense. I still wouldn’t argue
semver is broken, but I can see how this approach, especially for
clojure, would have an advantage. For other languages, semver is likely
still the better approach.

I do have some concerns regarding implementation/use and possible
‘ripple’ change and how consistently people will adopt this technique,
but that is likely more just due to me not knowing what I don’t know
yet. Actually adopting this approach is likely the only real way to get
a solid grasp.

If you follow that process, you are “guaranteed” to be able to update to any new version of the library and
expect backward compatibility – updating should always be safe.

I’m not sure I understand how this ‘fixes’ the transitive dependency issue (or perhaps we have different definitions of this issue) unless all the transitive dependencies are also using this approach. Perhaps that is an implicit assumption required to claim this approach is a guarantee?

Any backward incompatible behavior will either be:

  • a new function in the existing library
  • a new namespace in the existing library
  • a new group/artifact (and new namespaces to avoid collisions)

In other words, you can safely update and ignore the (new) backward incompatible behavior until you
specifically decide to switch over some or all of your code to that new behavior.

I see this is the approach you used for HoneySQL. Just wondering: if/when you need to release v3, what approach do you think you will take? Specifically, would you just change the artifact name, perhaps by adding a version number? What about the namespace? As both honey and honeysql are already taken, will it become harder to identify good namespace names?

1 Like

I think the advice from some of the Clojure community is that you do:

  1. Never make any breaking changes, ever. Then it doesn’t matter what versioning scheme you use: it is ALWAYS safe for your users to upgrade. Now your version is just whatever you want to use to indicate any change: a timestamp, a commit hash, an incrementing int, etc.

  2. If you want to break something, because you don’t want to put in the effort to maintain backward compatibility, or you just no longer like the interface of the library/framework and want to release something that improves on it but would be breaking, then just don’t release it as a new version. If you break backwards compatibility, this is no longer a version upgrade of the same lib; breaking changes are fundamentally different enough that it’s now a different library with incompatible interfaces and behavior. Therefore it warrants a new library/framework, with a different namespace, different Maven coordinates, different branch or repo, different documentation, etc.

I don’t think it fixes it unless the whole ecosystem adopts this approach. If you force the library to use a newer version of a dependency which has introduced breaking changes that break the library, there’s not much you can do. But if that transitive dep followed the same model, it would never release a newer, breaking version; instead there would be a new lib (or no breaking change in the first place), which you are free to depend on for your use cases, while the library can continue to depend on the old lib.

To be honest, it’s a high bar for sure. As a library author, you might want to make a small breaking change and not care about your users being broken, because you work on your lib for your own fun, and maybe some current interface/behavior of it annoys you. And creating a new lib for a minor breaking change is a lot of work: new repo, new docs, new name, renaming all namespaces, etc. Just introducing an alternate function isn’t so bad, but you still need to make sure the older functions continue to work, etc. So yes, it’s a high bar, but I think people are trying to insist on it to create a very reliable ecosystem.

2 Likes

Really cool discussion. I didn’t realise this was an issue but it is really about strategies on how to evolve code.

The first time I saw <library>.v1.<module> in a namespace was in the techascent libraries and, to be honest, it went against my sense of code aesthetics because of the Don’t Repeat Yourself principle that we had all subscribed to when learning how to program.

I still don’t like it to be honest but I can see the real benefits of being able to write new code without affecting the legacy interface - much like one would do with an API.


I don’t think it’s necessary for every library. It’s good to be able to support legacy code but this is what I would do to evolve a library with a small change to what cognitect is advocating - taking midje as an example:

  • v1 exposes midje.sweet as the main namespace
  • v2 renames v1 midje.sweet to midje.v1.sweet and implements a new midje.sweet namespace
  • v3 renames v2 midje.sweet to midje.v2.sweet and implements a new midje.sweet namespace
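
A sketch of what the consumer sees after the v2 step above (the namespaces are hypothetical; note the rename itself means legacy tests need a one-time change of their require from midje.sweet to midje.v1.sweet):

```clojure
;; after the hypothetical v2 release described above:
;;   midje.v1.sweet  ; the old API, moved here and frozen
;;   midje.sweet     ; the new API, under the original name
(ns my.tests
  (:require [midje.v1.sweet :as m1]  ; legacy tests, after updating the require
            [midje.sweet :as m]))    ; new tests use the evolved API
```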

Evolving code has little to do with dependencies. I would argue that if v1 uses 1.4.0 of a subdependency and v2 uses 1.5.0 of a subdependency, and you package both versions in a single library, you might end up having dependency conflicts right then and there. Dependencies are tricky. The only solution I know is to have as little of them around as possible.

It would very much depend on how much “breakage” I wanted to introduce and why.

Going from v1 to v2, I fundamentally changed the entire internal model – and v1 exposed a lot of that model as multi-methods for extension, as well as using a lot of typed objects and wrappers. I tried to retain as much of the pure data DSL as I could, and I’ve gradually added more migration tools and improved backward compatibility.

I don’t have any idea right now what a v3 might need to look like. I think I can maintain a clear fixative/accretive path with v2 for the foreseeable future so perhaps v3 might rewrite some internals but the API in v2 is pretty minimal and stable (and much smaller than the v1 API) so I’d want to keep the same API, I think. So perhaps honey3.sql and honey3.sql.helpers? Assuming I really did need to break backward compatibility…

Part of the group/artifact change was driven by the Verified Group Names policy that Clojars introduced to improve security – essentially requiring a verified domain or identity for the group. I switched many of my libraries over to com.github.seancorfield/* at that point.

Some of my libraries are adopting org.corfield.* as a prefix so that’s another possibility (org.corfield.honey.sql etc). It’s a big space :slight_smile:

Interesting discussion. Just some random thoughts:

  • SemVer is a heuristic
  • dynamic linking is a heuristic
  • if you want exact, you get static linking or equivalent, e. g. as in nix, guix etc.
  • in dynamic linking, updating a dependency to a new major version /always/ bumps your own major version, too, even if it changes nothing in your own code nor API
  • different major versions should always be able to live alongside each other in the same image (different coordinates, different namespaces etc.)
  • what is or isn’t a breaking change is not always clear cut — can adding a new method to a multimethod break something /in general/?
  • how do you handle versions of an experimental library that you want to test out in the wild while working out the API and concepts?
3 Likes

Cool thoughts, only for:

how do you handle versions of an experimental library that you want to test out in the wild while working out the API and concepts?

I know that generally, that’s version 0, or alpha, or beta, and it’s just made clear that backwards compatibility is not retained and the API is quickly evolving. And once you say “we commit to never breaking backwards compatibility”, then you never do :stuck_out_tongue:

1 Like

I believe, this is a typo, should be "Any backward compatible behavior ", right?

How would you introduce fixes? I see a couple of ways. Which is preferable? Depends on supposed usage frequency?

  • Add a fixed function and name it slightly differently, then mark the old one as deprecated in docs.
  • Add a next version namespace with the fixed function and then instruct users to update versions in requirements.
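
The first option might look like this (the function names and the fix itself are hypothetical; the :deprecated metadata is a real Clojure convention that linters such as clj-kondo understand):

```clojure
(defn parse-date
  "Original behavior, kept so existing call sites never break.
  DEPRECATED: prefer parse-date-strict."
  {:deprecated "2.3.0"}
  [s]
  ;; ... old (buggy) implementation stays exactly as it was ...
  )

(defn parse-date-strict
  "The fix, under a new name: callers opt in explicitly."
  [s]
  ;; ... corrected implementation ...
  )
```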

Another question: what if one decides to introduce breaking changes, squash all these namespaces and start over again, how bad is it when there are multiple versions of the same library in a large codebase? Would it put more cognitive load on users to figure out which version to use when they introduce changes in their own code? I mean, this sounds great, but how confusing would this be in practice? Still sounds better than semver though.

One more concern: what if a library is following this non-breaking changes way of versioning, but its dependencies are semversioned, can’t updates of these dependencies potentially break the users of the library? It seems that for this to really work, all dependencies should follow such non-breaking versioning scheme.

Any ideas on how to write tests that catch breaking changes? In statically-typed languages the typing systems themselves are a way of testing. How would you do this in clojure?

No. Backward-compatible changes can be made to existing code.

Changes that are not backward compatible require a new function, a new namespace, or a new group/artifact.

Purely fixative changes should be backward compatible – no new function or new namespace required.

At work, we have “legacy code” that uses clojure.java.jdbc and HoneySQL v1, and we have more recent code that uses next.jdbc and HoneySQL v2. It’s a large codebase – 136K lines – so the presence of different libraries in different parts of the code is not a big deal. Even where we have both libraries in the same namespace, the aliases tell you which library you’re using.

And the whole point of this approach is to let you migrate piecemeal from the old version to the new version, to make it as smooth as possible: you can convert a single expression, a single function, or an entire namespace.

I specifically mentioned this in an earlier post in this thread – although I didn’t offer a resolution. As a library maintainer, you have to be very careful about your dependencies, especially if some of them are not “well-behaved” about updates: if you only depend on Clojure libraries, you’re generally fairly safe; if you depend on Java libraries, you have to be very careful about updating them.

Spec can help here because it can check values at runtime during testing – as well as the shape of data. Static typing only helps with the latter half of that checking.
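
A minimal clojure.spec sketch of that kind of check (the ::user shape is made up for illustration):

```clojure
(require '[clojure.spec.alpha :as s])

;; describe the shape *and* the values you rely on from the library
(s/def ::id pos-int?)
(s/def ::name string?)
(s/def ::user (s/keys :req-un [::id ::name]))

;; after upgrading the dependency, a test asserts its output still conforms
(s/valid? ::user {:id 1 :name "Rich"})  ; => true
(s/explain-str ::user {:id "1"})        ; returns a string describing what broke
```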

Purely fixative changes should be backward compatible – no new function or new namespace required.

Except, of course, XKCD 1172 usually gets in the way :blush: (that is, any observable behaviour ends up being part of the API, even if we didn’t intend to).

Years ago, the Microsoft Office team ended up depending on a bug in the Microsoft C++ compiler. The compiler team didn’t know about this, so when they found the bug, they fixed it. The Office team protested about the fix, so it was reverted.

A friend who worked there told me that story, and it’s always stuck with me – and I would view controversial bug fixes the same way: if the fix affected more people than the bug did, I’d consider making the bug the “official behavior” and reverting the fix.

1 Like

Surprised no one seems to have brought up Rich Hickey’s commentary on this (link should take you to the 29:54 mark):

I’m not saying this settles things, or even brings new points to this discussion (he even references the Stuart comments at the top of the thread, apparently made at the same conference), but as with many of his talks I found it persuasive.

3 Likes

The transcript: talk-transcripts/Spec_ulation.md at master · matthiasn/talk-transcripts (github.com)

In case anyone (like me) does better with the written word than the spoken word. He starts talking about dependencies about five minutes in and covers a lot of ground…

6 Likes