Build license management into your pipelines
Floor Drees
Posted on March 15, 2023
Below is more or less my FOSS Backstage (lightning) talk, titled "Build license management into your pipelines".
The 2022 Open Source Security and Risk report examined the results of more than 2,400 audits of commercial codebases, of which 97% contained open source. Some sectors even contained open source in 100% of their audited codebases.
We’re all moving fast and in order to do so we’re relying on a lot of dependencies to give us that commercial edge. In doing so we’re trusting that strangers on the internet understand the license that they release their software under, and also that vendors won’t change their mind on who can benefit from their software.
If you install Electron and have to add 87 packages, that means 87 license dependencies. Every single package is likely to have its own dependencies, and therefore another license you need to comply with. As you can imagine, license management can’t be done manually, and when done incorrectly it can create technical debt.
A graph of the 1600 dependencies referenced when you start a new React app
Let’s look at some projects that were relicensed recently, how we can track and manage our license dependencies upon deployment, and how we can be set up to respond to projects changing their terms.
About (not so) open source licenses
In one contributing.today episode we talked about licenses with folks from the Open Source Initiative (OSI), from the Ethical Source Movement, Tidelift, and ClearlyDefined.
I learned a few things.
- Like that combining licenses for 1 project (so not a different license for a project’s core and IDK its documentation) is possible but is mostly great for lawyers - and most definitely not great for your users’ license compatibility quest
- That most developers can’t tell the difference between MIT, Apache 2.0, and AGPL, and, if you’re lucky, they choose a license that is standard in their ecosystem
- That if you’re thinking of eventually building a business around a project, you’ll need to choose wisely, because unless you’ve been using CLAs (Contributor License Agreements), relicensing is a huge pain for even a mildly popular project. Contributors are hard to contact, may have moved on from the project or the platform, passed away… You’ll risk having to rewrite parts of the project, and you’ll need to do a major release, since you'll introduce breaking changes to your API. That can cause loss of community, especially when the release before the change gets forked and further developed under a community-friendlier license.
- And I wouldn’t have the license you adopt depend on whether or not you secure funding either (a real thing that happened recently). People will think twice about spending their time on a project that might change just how open it is.
On the topic of “not so open”, in recent years we've seen an increase in "kinda open source" licenses that in some cases conflict with the Free Software Definition (right to use software for any purpose) and Open Source Definition (in that the license shall not restrict any party from selling or giving away the software).
Pamela Chestek, member of the Board of Directors of the Open Source Initiative and current chair of the License Committee, presented proposed changes to the OSI’s License Review Process at FOSDEM last month.
Whether there should be a process for decertifying licenses wasn’t discussed in the working group, but Pamela did mention that “just because something was approved in the past, doesn't mean it’d be approved again”, which I thought was interesting.
Let’s look at some examples of projects relicensing.
Lightbend changed Akka’s license from Apache 2.0 to the BSL v1.1 (Business Source License), starting with Akka v2.7, delivered in October last year. Side-rant: it should have been a major release since it’s breaking its API, but let’s park that discussion.
With any such change there's talk of a fork.
I've seen people advocating for forks with an aggressive copyleft license, so that the now proprietary licensed "original" can't make use of community bug fixes. The question remains how effective this would be, and whether hurting our fellow developers is anything but misdirected anger.
The Apache Software Foundation is currently incubating Pekko, an Akka fork, and some of the Aiven OSPO members are involved in the project, which is obviously licensed under Apache 2.0, so no copyleft.
When Elastic announced its license change, a shockwave went through the community. The Elastic License 2.0 contains clauses that prevent hosted or managed service providers from offering the project as a service. It's copyleft-style (like SSPL), and:
- prevents third parties from obscuring trademark notices and branding
- prohibits circumventing the license keys that can be embedded in the software
Its impact: Elasticsearch, Kibana, et al got removed from hosted service infrastructures like Azure and AWS, which was kinda the point. Several players eventually decided to collaborate and fork Elasticsearch and start the OpenSearch project - including AWS and Aiven.
Dotan Horovits talked about Elasticsearch’s license change at FOSS Backstage in 2022: https://youtu.be/v7JfIupF1Lk
Grafana, Loki, and Tempo were relicensed from Apache 2.0 to AGPL (Affero General Public License), an "infectious" copyleft license. In response to third-party dependencies changing their license to AGPL, the CNCF encourages projects to "switch to an alternative component, or freeze the component at the version prior to the license change." Needless to say, they're no big fans.
Licenses are no fun, litigation is worse though
License litigation may end up forcing you to release your code under the same license as the package dependency you used. Other potential problems include being sued for financial liability by the creator of the component, and/or losing reputation and getting negative press coverage.
Shifting dependency and license management left, to the build or deploy stage, prevents you from being noncompliant in production. That avoids a lot of hairy problems and keeps you from losing your customers’ trust.
There are 300+ OSS licenses (116 OSI approved), and that list is growing. However, the good news is that around 20 licenses account for 80% of the open source commonly used in enterprises.
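Because a short list of licenses covers most of what you use, a build-time check can be as simple as an allowlist. Below is a minimal sketch of that idea, not a real tool: the allowlist contents, the function names, and the sample dependencies are all made up for illustration, and the identifiers follow the SPDX license list.

```python
# Hypothetical allowlist of SPDX license identifiers your organisation accepts.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}


def find_violations(dependencies, allowed=ALLOWED_LICENSES):
    """Return (name, license) pairs whose license is unknown or not allowed."""
    violations = []
    for name, license_id in sorted(dependencies.items()):
        if license_id is None or license_id not in allowed:
            violations.append((name, license_id or "UNKNOWN"))
    return violations


def exit_code(dependencies):
    """Print every violation and return the exit code a CI step should use."""
    violations = find_violations(dependencies)
    for name, license_id in violations:
        print(f"license check failed: {name} ({license_id})")
    # A non-zero exit code is what makes the pipeline step fail.
    return 1 if violations else 0


# Made-up sample data; in a real pipeline this would come from a scanner.
sample = {"left-pad": "MIT", "some-db-client": "SSPL-1.0", "mystery-lib": None}
status = exit_code(sample)
```

In a pipeline step you would pass `status` to `sys.exit()`. Treating an unknown license as a failure is a deliberate choice here: missing data is reviewed by a human instead of slipping through.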
An SBOM (Software Bill of Materials) lists the components used in the codebase, their versions and patch status, and the licenses that govern them, which allows security teams to quickly identify any associated security or license risks.
Despite the romantic idea we have about containers, getting info from your package manager is not easy, and likely incomplete.
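To get a feel for how patchy package manager data can be, here is a small sketch for a Python environment. It only uses the standard library's `importlib.metadata`; the function names are my own, and the fallback order (license expression, then license field, then trove classifiers) is an assumption about what is reasonable, not a standard.

```python
from importlib.metadata import distributions


def installed_licenses():
    """Map each installed distribution to whatever license its metadata declares."""
    result = {}
    for dist in distributions():
        try:
            meta = dist.metadata
        except Exception:
            # Broken or partial installs: skip them rather than crash.
            continue
        if meta is None:
            continue
        name = meta.get("Name") or "UNKNOWN"
        # Any of these fields can be absent, or contain free-form text.
        declared = meta.get("License-Expression") or meta.get("License")
        if not declared:
            classifiers = meta.get_all("Classifier") or []
            from_classifiers = [
                c.split(" :: ")[-1] for c in classifiers if c.startswith("License ::")
            ]
            declared = "; ".join(from_classifiers) or None
        result[name] = declared
    return result


def missing_license_info(licenses):
    """Return the names of packages that declare no license at all."""
    return sorted(name for name, declared in licenses.items() if not declared)


licenses = installed_licenses()
print(f"{len(missing_license_info(licenses))} of {len(licenses)} packages declare no license")
```

Even when a field is present, it is self-declared by the package author and may be a full license text, an outdated value, or a name that isn't a valid SPDX identifier.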
Developers write tools to solve a problem, not to generate an SBOM.
Thomas Steenbergen, Head of Open Source Program Office at EPAM
A software composition analysis (SCA) tool can help you do the job. One example is Thomas Steenbergen’s OSS Review Toolkit (ORT), which supports Software Package Data Exchange (SPDX), an open standard for SBOMs. SPDX allows the expression of components, licenses, copyrights, security references and other metadata relating to software.
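To make SPDX a bit more tangible, here is a sketch that reads license information out of an SPDX JSON document. The sample is a hand-written fragment in the shape of an SPDX 2.3 document (a complete one needs more fields), the package names are made up, and the rule of preferring the concluded license over the declared one is my assumption.

```python
import json

# Hand-written fragment shaped like an SPDX 2.3 JSON document.
# Package names and versions are made up for illustration.
SAMPLE_SBOM = """
{
  "spdxVersion": "SPDX-2.3",
  "name": "example-app",
  "packages": [
    {"SPDXID": "SPDXRef-Package-a", "name": "lib-a", "versionInfo": "1.2.0",
     "licenseConcluded": "MIT", "licenseDeclared": "MIT"},
    {"SPDXID": "SPDXRef-Package-b", "name": "lib-b", "versionInfo": "0.9.1",
     "licenseConcluded": "NOASSERTION", "licenseDeclared": "AGPL-3.0-only"}
  ]
}
"""


def licenses_from_spdx(document):
    """Return {"name@version": license} for every package in the document."""
    licenses = {}
    for package in document.get("packages", []):
        key = f'{package["name"]}@{package.get("versionInfo", "?")}'
        concluded = package.get("licenseConcluded", "NOASSERTION")
        declared = package.get("licenseDeclared", "NOASSERTION")
        # Prefer the concluded license; fall back to the declared one.
        licenses[key] = concluded if concluded != "NOASSERTION" else declared
    return licenses


sbom_licenses = licenses_from_spdx(json.loads(SAMPLE_SBOM))
for component, license_id in sbom_licenses.items():
    print(component, license_id)
```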
Now what?
What can shield us from licenses going wrong? Humor me while I go through some solutions I’ve seen in the wild.
Forking / mirror
One solution is to keep a mirror of all the components you have in use. While useful in some cases (like when stuff’s no longer maintained), there are downsides:
- Congratulations, you’re now the maintainer of a bunch of mirrors
- Sometimes vulnerabilities are in the code for many weeks, months or even years before anyone notices. New releases bring the fixes, and a frozen mirror misses out on them.
Pin to version: big bank example
At a recent conference, a big bank that shall remain unnamed presented a slide detailing the components in their setup, including several open source tools and their version numbers. When I checked these tools (Helm, Splunk, Flux, among others), a non-trivial number were a major release behind, if not more.
When I asked about their update strategy, the answer was that there is none; it's a very manual process (still). On the upside: they know exactly what versions they're using, which is more than many other companies can claim.
But this approach too prevents you from benefiting from updates.
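If you do pin, you can at least automate the check I did by hand on that slide. A rough sketch, assuming versions that start with a numeric major component; the tool names and version numbers are invented, and where the "latest" versions come from is left open.

```python
def major_version(version):
    """Extract the leading major number from a version like 'v3.11.2'."""
    return int(version.lstrip("v").split(".")[0])


def majors_behind(pinned, latest):
    """Return {tool: gap} for tools pinned at least one major release behind."""
    behind = {}
    for tool, pinned_version in pinned.items():
        if tool not in latest:
            # No upstream information for this tool; nothing to compare.
            continue
        gap = major_version(latest[tool]) - major_version(pinned_version)
        if gap >= 1:
            behind[tool] = gap
    return behind


# Invented example data.
pinned = {"tool-a": "2.16.1", "tool-b": "v1.4.0", "tool-c": "0.3.0"}
latest = {"tool-a": "3.11.0", "tool-b": "1.9.2"}
print(majors_behind(pinned, latest))
```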
Recognize and support healthy projects
This is a whole talk, and in fact there were a number of talks on this very topic at FOSS Backstage. Look up what Bitergia is doing, and/or read the blog version of my FOSDEM talk: "what I learned about leading a healthy project from talking to 50+ open source maintainers"
Recognizing projects at risk of relicensing allows you to…
… do due diligence (of alternatives)
Having multiple tools in a space that cover a similar use case is a good thing. Make sure you’re aware of alternative components and how well they interop with your setup, for critical pieces of infrastructure.
And build license management into your CI/CD. Watch Thomas’ talk to find out how, once FOSS Backstage publishes the recordings!
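One thing a pipeline can do to catch a project changing its terms is compare the licenses found in this build with those of the previous one. The sketch below assumes you already have two `{component: license}` snapshots from whatever scanner you use; the function name and the sample components are hypothetical.

```python
def license_changes(previous, current):
    """Compare two {component: license} snapshots.

    Returns (component, old_license, new_license) tuples; old_license is
    None for components that were not in the previous snapshot.
    """
    changes = []
    for component in sorted(current):
        old = previous.get(component)
        new = current[component]
        if old != new:
            changes.append((component, old, new))
    return changes


# Invented snapshots from two consecutive builds.
previous = {"search-engine": "Apache-2.0", "dashboard": "Apache-2.0"}
current = {
    "search-engine": "Elastic-2.0",
    "dashboard": "AGPL-3.0-only",
    "queue": "MIT",
}
for component, old, new in license_changes(previous, current):
    print(f"{component}: {old or 'new dependency'} -> {new}")
```

A change doesn't have to fail the build; flagging it for review before the upgrade is merged is already a lot better than finding out in production.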