Writing a Ruby Gem Specification

Piotr Murach

Posted on March 2, 2020

I created my very first Ruby gem in 2009. Since then I have released many packages. It's been a long ride. A lot of things have changed in a Ruby gem manifest file. Some configuration options are no longer necessary. A few new options appeared. Despite having Bundler to help you generate a new gemspec, there are still questions you need to answer for yourself. The net effect is that everyone seems to be a little bit unsure about what to include in their gem specification and what to keep out of it. No two gemspecs you see in the wild will be alike. Confusing, isn't it?

In this article, I will examine the most common gemspec attributes and clarify the intention behind them. To help us, we will write a gemspec by hand, cover best practices related to each attribute and offer practical advice. As we learn, we will discover ways to make our gemspec simple and neat like Marie Kondo's drawers. Once we're done, we will have a manifest file that can serve as a reasonable starting point for any gem author. The final gemspec won't have the ultimate list of attributes that works in every scenario. But, I'm doing this in the hope that my practical experience will help you avoid issues that I've been bitten by myself. Whether you're a Ruby newcomer or a veteran with battle scars to show, I'm certain you will pick up a few useful nuggets of information. Ready?

There's no better way to learn than to write our own gem manifest from scratch!

A Manifest File By Any Other Name

To build a Ruby gem, at the very least you need to have a manifest file: a file that describes how to put source files together to do useful things. In the Ruby world, this file is often named package.gemspec. The convention is to use the .gemspec suffix, but you could name the file anything you want. Inside the manifest file, we use Ruby to specify metadata and release information about our source code. Soon, we will explore each configuration attribute in more detail. Hold tight!

To aid our discussion, we are going to write a gemspec for an emoticon gem. This fictitious gem will help display ASCII emoticons in a terminal. Life is so much better when we can express our frustration with a bit of table-flipping (╯°□°)╯︵ ┻━┻. Instead of relying on gem generators like bundler or jeweler, we're going to take small steps and see what's necessary to write a gemspec by hand.

The Essential Ingredients

Any modern Ruby file starts with a magic comment that makes all string literals in that file immutable. This means that we cannot accidentally alter a string's content after its creation. So let's open emoticon.gemspec in an editor and add the following line:

# frozen_string_literal: true

There are many pieces of metadata we can provide about our gem. The good news is that despite the long list of possible attributes, we don't need to include all of them. There is a subset of only a handful that we need to use as the bare minimum. These attributes are:

  • name
  • version
  • summary
  • author(s)

To start describing our gem metadata, all we need to do is create a Gem::Specification instance and pass it a configuration block. Within the block, we can start defining various attributes:

# frozen_string_literal: true

Gem::Specification.new do |spec|
end

Now, we are ready to add a minimal set of attributes. First, we add a name for our gem. A valid gem name can only include letters, numbers, dashes, underscores and dots. No other funky characters are allowed. You cannot start your gem name with a dash, underscore or dot either. Any gem name you pick needs to be unique. So, like domain names, good names are often at a premium. You may need to be a little bit creative here. To see the naming conventions, read the "Name your gem" guide.

Bearing all this in mind, we add a name:

spec.name = "emoticon"
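The naming rules above can be expressed as a small check. This is only a sketch: valid_gem_name? is a hypothetical helper written for this article, not the validation that RubyGems itself runs, and it cannot tell you whether a name is already taken on rubygems.org.

```ruby
# Hypothetical helper that mirrors the naming rules described above.
# It is NOT the validation code RubyGems itself uses.
def valid_gem_name?(name)
  return false unless name.is_a?(String)
  # Only letters, numbers, dashes, underscores and dots are allowed.
  return false unless name.match?(/\A[a-zA-Z0-9._-]+\z/)

  # The name cannot start with a dash, underscore or dot.
  !name.start_with?("-", "_", ".")
end

puts valid_gem_name?("emoticon")   # true
puts valid_gem_name?("tty-prompt") # true
puts valid_gem_name?("-emoticon")  # false
puts valid_gem_name?("emoti con")  # false
```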

Versioning

Next in line, we need to specify a version. This is made up of three numbers separated by dots in the format of "major.minor.patch". It's common to start gem development with the "0.1.0" version and then follow what is referred to as "semantic versioning" when releasing new versions. You can find out more on how to manage this number on semver.org. In my experience, it's best to stay safe and do at least one release before the first stable release "1.0.0". When you think about it, a lot of Ruby infrastructure depends on this little innocuous number. It's equally impressive and scary.

We can provide a version number directly as a string value. However, most often this number is read from another file located inside the lib/ folder. This way other libraries can also read a version number directly from the source code. For example, for our emoticon gem the file would look like this:

# lib/emoticon/version.rb

module Emoticon
  VERSION = "0.1.0"
end

It is common for gems to load the version file with the following three lines:

lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require "emoticon/version"

That's a mouthful to just load a number. What is going on here?

Enter The $LOAD_PATH

We start by getting hold of the path to the lib/ directory relative to our gemspec location. The lib/ folder contains all our source code files. Next, we check Ruby's global variable $LOAD_PATH to see if it already has the lib/ directory in its list of paths. This variable contains a list of paths that point to installation directories where Ruby files are saved. For example, on my system, one of the paths points to where Rbenv keeps the Ruby 2.7 installation "~/.rbenv/versions/2.7.0/lib/ruby/2.7.0".

Ruby uses $LOAD_PATH when you call the require method. For example, if we wanted to load the YAML library in our code, we would do the following:

require "yaml"

The above would cause a search through all the paths in the $LOAD_PATH global variable. Since the YAML library ships with Ruby, it will be located in the language installation directory, for example "~/.rbenv/versions/2.7.0/lib/ruby/2.7.0/yaml.rb". Once found, Ruby will load the file for us automatically. In our case, we add our local lib/ folder to the list of searched paths so that the require method can load our version file.
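To watch this lookup happen, here is a minimal sketch that uses a temporary directory in place of our lib/ folder. The file name emoticon_demo_version.rb and the constant inside it are made up for the demonstration.

```ruby
require "tmpdir"

# Create a throwaway directory holding a single Ruby file.
# The file name and the constant are invented for this demonstration.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "emoticon_demo_version.rb"),
             "EMOTICON_DEMO_VERSION = \"0.1.0\"\n")

  begin
    require "emoticon_demo_version"
  rescue LoadError
    puts "not found: the directory is not on $LOAD_PATH yet"
  end

  # Same idiom as in the gemspec: add the directory only once.
  $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)

  # Now the search through $LOAD_PATH succeeds.
  require "emoticon_demo_version"
  puts EMOTICON_DEMO_VERSION

  # Clean up so the temporary path does not linger.
  $LOAD_PATH.delete(dir)
end
```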

You must admit that's a lot of ceremony just to read a number! I agree.

This was a popular method to load a version file during the 1.8.7 and 1.9.2 Ruby era. It still remains popular, for example, for loading binary files. However, we can do better now. That's where the require_relative method comes in handy. It loads a file relative to the location of the current file. Exactly what we need. All three lines can be replaced in one fell swoop with:

require_relative "lib/emoticon/version"

Nice. Once the version is loaded we can access it to populate the version number:

spec.version = Emoticon::VERSION
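RubyGems treats this string as a Gem::Version rather than as plain text, which is why the dotted format matters. The short sketch below uses made-up version numbers to show the difference:

```ruby
# Gem::Version is part of RubyGems, which Ruby loads by default.
current = Gem::Version.new("0.9.0")
newer   = Gem::Version.new("0.10.0")

# Segments are compared as integers, so 0.10.0 is newer than 0.9.0 ...
puts newer > current        # true
# ... whereas a plain string comparison gets the order wrong.
puts "0.10.0" > "0.9.0"     # false

puts newer.segments.inspect # [0, 10, 0]
puts Gem::Version.new("0.1.0") < Gem::Version.new("1.0.0") # true
```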

Explaining The Purpose

The next piece of information we're required to give is the summary. There is no limit on how long it should be but my rule of thumb is to use no more than ten words. I kind of see this as a tagline for a gem. A sales pitch if you wish. Let's look at examples of summaries from popular gems:

  • rake - "Rake is a Make-like program implemented in Ruby"
  • sequel - "The Database Toolkit for Ruby"
  • rails - "Full-stack web application framework."

With that said, we add a plain and simple summary:

spec.summary = "Display emoticons in your terminal"

The summary, for example, is used to display information when you run the gem list command for any installed gem package:

$ gem list -d package

Hand in hand with the summary goes the description. It is an optional attribute but I do recommend you add it as well. This is where we're expected to provide more details. You can often see developers use a heredoc to format many lines and keep their description readable:

  spec.description = <<-DESC
Display emoticons in your terminal. Communicate with your
users not only through words but also emotions.
  DESC

If you want to keep your text indented, you can use the "squiggly" heredoc introduced in Ruby 2.3. It removes the common leading whitespace from each line:

  spec.description = <<~DESC
    Display emoticons in your terminal. Communicate with your
    users not only through words but also emotions.
  DESC
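A quick way to see the difference between the two heredoc styles is to inspect the strings they produce. The variable names here are only for illustration:

```ruby
# The dash heredoc keeps the leading whitespace of every line;
# it only allows the closing identifier to be indented.
dash = <<-DESC
    Display emoticons in your terminal.
    DESC

# The squiggly heredoc strips the common leading whitespace.
squiggly = <<~DESC
    Display emoticons in your terminal.
DESC

puts dash.inspect     # "    Display emoticons in your terminal.\n"
puts squiggly.inspect # "Display emoticons in your terminal.\n"
```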

Another approach, for those who wish to keep their spec attributes aligned, is to use a line continuation:

spec.description = "Display emoticons in your terminal. " \
                   "Communicate with your users not only " \
                   "through words but also emotions."

It's your choice. Once you publish your gem, the description will be displayed on the rubygems.org site when users search for your gem. For example, the tty-prompt gem description looks like this:

[Image: the tty-prompt gem description on rubygems.org]

Attribution

The final bit of information necessary to make our gemspec build successfully is the author field. We have the choice of two methods here - author and authors. Regardless of which one we use, the author names will be stored in an array. The author method is a convenient shortcut that creates a single-element array. Because of this, my preference is to always use the authors method. You never know when you will share the gem development responsibility with others. The following are equivalent:

spec.author = "Firstname Lastname"
spec.authors = ["Firstname Lastname"]
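If you want to convince yourself, the following sketch builds two throwaway specifications and compares what each one stores:

```ruby
# Two throwaway specifications, one per setter.
single = Gem::Specification.new { |spec| spec.author  = "Firstname Lastname" }
plural = Gem::Specification.new { |spec| spec.authors = ["Firstname Lastname"] }

# Both end up with the same one-element array.
puts single.authors.inspect           # ["Firstname Lastname"]
puts single.authors == plural.authors # true
```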

The Minimal Viable Gemspec

Putting it all together, our minimal gemspec should look like this:

# frozen_string_literal: true

require_relative "lib/emoticon/version"

Gem::Specification.new do |spec|
  spec.name        = "emoticon"
  spec.version     = Emoticon::VERSION
  spec.summary     = "Display emoticons in your terminal"
  spec.description = <<~DESC
    Display emoticons in your terminal. Communicate with your
    users not only through words but also emotions.
  DESC
  spec.authors     = ["Firstname Lastname"]
end

Nice! Now we have everything we need to build a gem, albeit a useless one. If we were to try to build our gem, we would succeed but get quite a few warnings about missing useful information and actual source code files! Let's change this.

Choosing a License

One of the extremely useful bits of information is the license. From a legal perspective, if you don't provide any license with your gem, it will be assumed that you're the sole copyright holder and that all rights are reserved. In turn, nobody else will have permission to use your gem in their code.

There are many open-source licenses to choose from. It is common to use a license identifier in place of a full name. The ones I tend to use are MIT and AGPL-3.0. The former to allow anyone to use and modify a gem for any purpose they may like. The latter for when I want any changes to the source to benefit the project and keep it freely available.

Similar to the author option, there are license and licenses methods. For our example, we pick the permissive MIT license:

spec.license = "MIT"

Occasionally, gems are released under more than one license. Depending on the use case, one or more licenses may be applicable. For example, the rouge gem is distributed with MIT and BSD-2-Clause licenses. Choosing the right license for a gem is important so take your time in deciding. The choosealicense.com site may help you make up your mind.
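For a gem with more than one license, a minimal sketch of the plural form could look like this. The two identifiers are the ones mentioned above for rouge, and the specification itself is a throwaway:

```ruby
dual = Gem::Specification.new do |spec|
  spec.licenses = ["MIT", "BSD-2-Clause"]
end

# The singular setter is a shortcut that stores a one-element array.
single = Gem::Specification.new { |spec| spec.license = "MIT" }

puts dual.licenses.inspect   # ["MIT", "BSD-2-Clause"]
puts single.licenses.inspect # ["MIT"]
```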

Contact Information

Being part of an open-source community means that it's good to make yourself available in case people wish to contact you. You can do this by providing your email address with the email attribute:

spec.email = "email@example.com"

You can also provide more than one email address:

spec.email = ["email@example.com", "another@example.com"]

A port of call for anyone wishing to learn more about your gem is a homepage. I tend to use a GitHub repository link here as it includes a readme file with project documentation:

spec.homepage = "https://github.com/your-handle/emoticon"

All the configuration options that we used so far are great. But, what if we need a little bit more flexibility and wish to add some custom information?

Metadata Inception

Since RubyGems version 2.0, you can add arbitrary information to your gem release. For example, we can tell people where they can find a changelog or submit an issue. This can be done using the metadata attribute. I know this is a bit like the Inception film - metadata in the metadata specification file. But not as mind-bending.

The metadata attribute must be a hash. Both keys and values need to be strings. Also, each key cannot exceed 128 characters, and the value cannot be longer than 1024 characters. Otherwise, you will get validation errors. Apart from that, there is no limit on what you can put inside the hash. Specifying metadata is optional but I'd strongly encourage you to use it to help others find out more about your gem.
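The constraints above are easy to check by hand. The sketch below is a hypothetical helper, metadata_problems, that only restates the rules from this section; it is not how RubyGems implements its validation, and the constant names are my own.

```ruby
# Limits as described in the text; the constant names are made up.
MAX_METADATA_KEY_LENGTH   = 128
MAX_METADATA_VALUE_LENGTH = 1024

# Hypothetical helper returning a list of problems found in a metadata hash.
def metadata_problems(metadata)
  return ["metadata must be a hash"] unless metadata.is_a?(Hash)

  metadata.each_with_object([]) do |(key, value), problems|
    if !key.is_a?(String)
      problems << "key #{key.inspect} is not a string"
    elsif key.size > MAX_METADATA_KEY_LENGTH
      problems << "key #{key.inspect} is too long"
    end

    if !value.is_a?(String)
      problems << "value for #{key.inspect} is not a string"
    elsif value.size > MAX_METADATA_VALUE_LENGTH
      problems << "value for #{key.inspect} is too long"
    end
  end
end

good = { "changelog_uri" => "https://github.com/your-handle/emoticon/blob/master/CHANGELOG.md" }
bad  = { changelog_uri: "x" * 2000 }

puts metadata_problems(good).inspect # []
puts metadata_problems(bad).size     # 2
```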

There is a set of predefined keys whose values are expected to point to various web resources. You can use the following metadata keys to link to sources like a bug tracker or a changelog:

  • bug_tracker_uri
  • changelog_uri
  • documentation_uri
  • homepage_uri
  • mailing_list_uri
  • source_code_uri
  • wiki_uri
  • funding_uri

Out of the above, I tend to use the following five keys:

spec.metadata = {
  "bug_tracker_uri"   => "https://github.com/your-handle/emoticon/issues",
  "changelog_uri"     => "https://github.com/your-handle/emoticon/blob/master/CHANGELOG.md",
  "documentation_uri" => "https://www.rubydoc.info/gems/emoticon",
  "homepage_uri"      => spec.homepage,
  "source_code_uri"   => "https://github.com/your-handle/emoticon"
}

The gem's profile page on rubygems.org site will provide links based on the metadata information.

[Image: links on a gem's rubygems.org profile page]

We said a lot about our gem already but one essential bit we haven't added is the source code itself.

What Files to Include?

This is one of the most important but sometimes controversial attributes. What files should you include as part of the release? Should test files be part of it or not? Let me take a step back, and consider what a Ruby gem is and what it isn't. When we share a gem with others, we want them to use our code. Test files, build files like Rakefile or Gemfile, CI configuration files and other such artefacts don't mean anything to our users. These files are helpful when developing but don't serve any purpose in a packaged gem. Similar to, for example, a game distribution. You don't need all the files that were used to create the game. You want only the source files - the game itself!

But things are not as straightforward. In other programming communities like Perl or Python, it is common to find the test files as part of the package. Tests are used to verify that the package works correctly on a given operating system. Publishing a gem to rubygems.org is not the only way to make your gem available for download. For example, my gems are distributed and repackaged on systems like Debian or ArchLinux. The package maintainers use custom scripts that convert a gem into a suitable package on that platform. As part of this process, they run tests.

I did a survey recently on Twitter to see whether people prefer to keep or remove test files from a gem release. As it turns out, around 64% of devs are in favour of dropping test files.

[Image: Twitter poll on whether to include tests in a gem release]

This mirrors my own thoughts. After all, most of the time gems are simply downloaded and installed, so it makes sense to keep them as small as possible. By distributing only the source code, which often means files in the lib/ directory, you will reduce a gem's size significantly. In my case, it often means a reduction between 25% and 50% in size when I skip the tests.

To package files as part of a gem release, we can use the files attribute. There are other attributes that will add files as well which we will cover a little bit later. The value needs to be an Enumerable object that responds to each call.

Most often you will see gems that list their files using git and its ls-files command:

spec.files = `git ls-files -z`.split("\x0").reject do |f|
  f.match(%r{^(test|spec|features)/})
end

The advantage of this approach is that git already knows about all the files in the project. You can exclude files from the release with the .gitignore file. The disadvantage is that git is the one that knows about all the files. As we discussed, there are many files that shouldn't be included in the release. Relying on git also makes the gemspec more brittle. The code needs to call out to an external dependency which may or may not be present on the system when the gem is built.
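To make the git-based snippet less cryptic: the -z flag separates file names with a NUL byte, which is what split("\x0") cuts on. The sketch below replaces the real git call with a hand-written string, so the file names are made up:

```ruby
# Simulated output of `git ls-files -z`; the file names are invented.
git_output = "lib/emoticon.rb\0spec/emoticon_spec.rb\0README.md\0"

# Split on the NUL separator, then drop anything under test directories.
files = git_output.split("\x0").reject do |f|
  f.match(%r{^(test|spec|features)/})
end

puts files.inspect # ["lib/emoticon.rb", "README.md"]
```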

There is a simpler way!

We can use Ruby and the Dir class to help us list source files. We can provide a list of shell glob patterns to seek files that we wish to include. For example, to add all the files from the lib/ directory recursively, we can call the glob method like this:

Dir.glob("lib/**/*")

But there is an even more convenient array-like access method:

Dir["lib/**/*"]

A note of warning about the * wildcard character in the globbing pattern. Any files that start with a dot are considered hidden and excluded from matching the star pattern by default. This is in line with how the shell works on Unix systems. To include hidden files from the lib/ folder you need to use the File::FNM_DOTMATCH flag. Pass the flag as a second argument to the glob method like so (this won't work with the bracket alias):

spec.files = Dir.glob("lib/**/*", File::FNM_DOTMATCH)
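Here is a small sketch of the difference, run against a temporary directory so it doesn't touch a real project. The file names are made up, and the results are filtered down to regular files because the exact directory entries returned alongside them can vary between Ruby versions.

```ruby
require "tmpdir"
require "fileutils"

visible_only = nil
with_hidden  = nil

# Build a throwaway project layout; the file names are invented.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "lib", "emoticon"))
  File.write(File.join(root, "lib", "emoticon.rb"), "")
  File.write(File.join(root, "lib", "emoticon", ".secret"), "")

  Dir.chdir(root) do
    # Keep regular files only, so directory entries do not clutter the lists.
    visible_only = Dir.glob("lib/**/*").select { |f| File.file?(f) }
    with_hidden  = Dir.glob("lib/**/*", File::FNM_DOTMATCH).select { |f| File.file?(f) }
  end
end

puts visible_only.include?("lib/emoticon/.secret") # false
puts with_hidden.include?("lib/emoticon/.secret")  # true
```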

Occasionally, you will see gemspecs that declare test files separately with the test_files attribute:

spec.test_files = spec.files.grep(%r{^(test|spec|features)/})

My recommendation is to skip this attribute as this is a thing of the past and no longer needed. Especially, given what we have discussed about the test files.

Autoloading Files

After declaring which files we wish to package, we can also tell RubyGems which files to make readily available from our gem. We met the $LOAD_PATH variable before. To add our lib/ directory location to it and make our code available via the require statement, we use the require_paths attribute:

spec.require_paths = %w[lib]

However, if adding the lib/ folder is all you need then you can skip this attribute as this is the default anyway.

Sometimes though, we also need to distribute an executable file with our gem similar to what bundler or rake gems do. How can we achieve this?

Executables

For a long time, it was common to store binary files in the bin/ directory. But, more recent bundler versions started generating development scripts in the bin/ folder. I've seen people add other scripts for profiling or running tests in there as well. The unfortunate scenario is when a gem's executable finds its way into this folder. If you're not careful a gem's executable can be released with all the development scripts as executables as well. Oops!

The new place to keep the gem's executables is in the exe/ folder. It is common to see executable names deduced with the following:

spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
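In case the block form of grep is unfamiliar, this sketch runs the same expression over a hand-written file list (the paths are made up) to show what it produces:

```ruby
# A made-up list standing in for spec.files.
files = ["exe/emote", "lib/emoticon.rb", "lib/emoticon/version.rb"]

# grep selects the entries matching the pattern and, when given a block,
# returns the block's result for each match instead of the entry itself.
executables = files.grep(%r{^exe/}) { |f| File.basename(f) }

puts executables.inspect # ["emote"]
```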

There is nothing wrong with this code. But to my taste, it's overly complex. Instead, I recommend being explicit about the executable names. Chances are that's a single file anyway that won't change. So instead consider listing names like this:

spec.executables = %w[emote]

If we attempt to build a gem right now, we will get an error that the bin/emote file cannot be found. That's because by default RubyGems assumes that all executables are in the bin/ folder. Using the bindir attribute, we can prepend each executable name with the new location of the binary folder. In our case, this is the exe/ directory:

spec.bindir = "exe"

Once you specify the bin folder name, all the executable files will be automatically added to the gem files list. There is no need to do it yourself.

Executables and source code files are not the only files we can add to our gem. There are other files that will help people get started and learn more about our project.

API Documentation

The RDoc tool takes any comments in your source code and turns them into comprehensive documentation. It is compatible with many document formats like Markdown or inline code comments like YARD. By default, it only processes files in the lib/ folder. But, a lot of gems include other standard files like README.md, CHANGELOG.md, LICENSE.txt. These files provide a good source of information to begin using our code. We can instruct RDoc to add extra documentation files via the extra_rdoc_files attribute:

spec.extra_rdoc_files = Dir["README.md", "CHANGELOG.md", "LICENSE.txt"]

The above files will automatically get combined and included in a gem. There is no need to add them to the files attribute listing separately.
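Both this and the earlier point about executables can be checked on a throwaway specification. In the sketch below nothing exists on disk; we only look at what the files attribute reports after the other attributes are set:

```ruby
spec = Gem::Specification.new do |s|
  s.name             = "emoticon"
  s.version          = "0.1.0"
  s.files            = ["lib/emoticon.rb"]
  s.bindir           = "exe"
  s.executables      = %w[emote]
  s.extra_rdoc_files = ["README.md"]
end

# The executable (prefixed with bindir) and the extra RDoc file
# are reported as part of the files list.
puts spec.files.include?("exe/emote")       # true
puts spec.files.include?("README.md")       # true
puts spec.files.include?("lib/emoticon.rb") # true
```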

During a gem's installation process, the rdoc executable is run on the user's system to generate documentation. To control what users will see when browsing API examples, we can provide various options. Below is an example of a few self-explanatory options you may use to format documentation:

spec.rdoc_options += [
  "--title", "Emoticon - emotions in terminal",
  "--main", "README.md",
  "--line-numbers",
  "--inline-source",
  "--quiet"
]

At this point, we have included all the files. It's time to turn our attention to system requirements. It would be good to tell others which platform, Ruby or RubyGems version our library is compatible with.

System Requirements

We can specify the minimum version of Ruby that our gem works with using the required_ruby_version attribute. As a value, it accepts a string that conforms to the regular version comparison rules. For example, to specify Ruby version 2.4 or higher we can do the following:

spec.required_ruby_version = ">= 2.4.0"

If you don't provide any Ruby version then the default ">= 0" value will be used, allowing for any Ruby version. Under the covers, the string is converted into a Gem::Requirement object. Our previous example is the same as saying:

spec.required_ruby_version = Gem::Requirement.new(">= 2.4.0")
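A Gem::Requirement can be asked directly whether a given version satisfies it, which is a handy way to sanity-check a constraint. The versions tested below are arbitrary examples:

```ruby
requirement = Gem::Requirement.new(">= 2.4.0")

puts requirement.satisfied_by?(Gem::Version.new("2.7.0")) # true
puts requirement.satisfied_by?(Gem::Version.new("2.3.8")) # false

# The default requirement accepts any version.
puts Gem::Requirement.default.satisfied_by?(Gem::Version.new("1.8.7")) # true

# A string assigned in a specification is stored as a requirement object.
spec = Gem::Specification.new { |s| s.required_ruby_version = ">= 2.4.0" }
puts spec.required_ruby_version.is_a?(Gem::Requirement) # true
```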

Which Ruby versions you support depends on a few considerations. One constraint may be the type of Ruby features you wish to use that are not backwards compatible. The maintenance burden of supporting the older versions may lead you to choose more modern Ruby versions. Given these, I tend to support as many versions as I comfortably can. Even though a version of Ruby may have officially reached its end of life, it's important to remember that the business world moves slowly.

Next, you can force a minimum supported version of RubyGems. The required_rubygems_version attribute works exactly the same way as the Ruby version one. However, I tend to skip specifying this attribute. Why? Every Ruby installation already ships with its own RubyGems, so the Ruby version constraint effectively limits the compatible RubyGems versions as well.

There is a more esoteric attribute called the platform that allows you to limit your gem installation to certain systems only. With it, you can specify an exact processor architecture and system your gem is designed for. For example, to limit your gem to x86 CPU on the Windows system with VC8, you could specify it this way:

spec.platform = "x86-mswin32-80"

And if we wanted to automate the architecture detection to the current system, we could do the following:

spec.platform = Gem::Platform.local

By default, a gem's platform is ruby, which means that the gem can be installed pretty much anywhere:

spec.platform = Gem::Platform::RUBY

Chances are you won't have to ever worry about this attribute. I recommend you skip it.

Our code doesn't live in a vacuum. It often needs other gems for its functionality.

External Dependencies

Our emoticon gem needs to store the icons somewhere in a file. We decided that the most convenient format will be TOML. There is a gem named tomlrb that does the job. To ensure this dependency is present on the user's system, we can use the add_runtime_dependency method. I prefer to use the shorter add_dependency alias:

spec.add_dependency "tomlrb"

But declaring a dependency without a version is dangerous. It will make the gem build process issue warnings. You should always strive to select a version your gem is compatible with. Otherwise, you're almost guaranteed that some future release of the dependency will introduce breaking changes and your gem will stop working. There is an art to choosing a version number. It's a risk management strategy. It's good to check if the dependency author uses semantic versioning. Given this, I tend to use the pessimistic operator to constrain a version number:

spec.add_dependency "tomlrb", "~> 1.2"
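To get a feel for what the pessimistic operator allows, you can ask a Gem::Requirement directly. The version numbers below are arbitrary and are not tied to actual tomlrb releases:

```ruby
# "~> 1.2" means: at least 1.2, but below 2.0.
two_part = Gem::Requirement.new("~> 1.2")
puts two_part.satisfied_by?(Gem::Version.new("1.2.0")) # true
puts two_part.satisfied_by?(Gem::Version.new("1.9.3")) # true
puts two_part.satisfied_by?(Gem::Version.new("2.0.0")) # false
puts two_part.satisfied_by?(Gem::Version.new("1.1.9")) # false

# "~> 1.2.3" is stricter: at least 1.2.3, but below 1.3.
three_part = Gem::Requirement.new("~> 1.2.3")
puts three_part.satisfied_by?(Gem::Version.new("1.2.9")) # true
puts three_part.satisfied_by?(Gem::Version.new("1.3.0")) # false
```

The more segments you give, the narrower the allowed range, so the choice depends on how much you trust the dependency's versioning.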

I won't cover how to specify different constraints. For more details, please read the official "Declaring Dependencies" guide.

Apart from runtime dependencies, you can add development dependencies with the add_development_dependency method. The process is exactly the same. This method was added at a time when Bundler didn't exist. It was a way to set up a gem manually to start contributing. You can run gem install package --development to install all the development gems. Now, we have a Gemfile for declaring our development dependencies. What's the solution?

Based on my recent discussions, I realised that not everyone uses Bundler to develop their Ruby gems. Some developers even have a strong aversion to using it. Whether you like it or not, a few years from now Bundler may not be the tool that everyone uses to install their dependencies. But, you want to be able to continue building your gem. Because of this, I tend to include the most essential development dependencies necessary to test and build a gem. For me these are rake and rspec gems.

spec.add_development_dependency "rake", "~> 13.0"
spec.add_development_dependency "rspec", "~> 3.0"
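RubyGems keeps the two kinds of dependencies apart, which you can verify on a throwaway specification. The sketch below reuses the dependencies from this article:

```ruby
spec = Gem::Specification.new do |s|
  s.name    = "emoticon"
  s.version = "0.1.0"

  s.add_dependency "tomlrb", "~> 1.2"
  s.add_development_dependency "rake", "~> 13.0"
  s.add_development_dependency "rspec", "~> 3.0"
end

# Only runtime dependencies are needed by people who install the gem.
puts spec.runtime_dependencies.map(&:name).inspect     # ["tomlrb"]
puts spec.development_dependencies.map(&:name).inspect # ["rake", "rspec"]
```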

If you depend on a piece of software to build your gem - add it to your development list. This means that gems like rubocop, simplecov, pry or bundler, though useful, are not essential to the process. So I keep them separate in a Gemfile.

Keeping Secure

Given recent abuses of the rubygems.org service by nefarious types, I suggest you do anything possible to protect your gems. At the gem level, the solution is to sign your gem with a cryptographic key. There is a security guide that explains all the details on how to generate your private and public keys. It will also tell you how to verify a gem's integrity. Here, we are only concerned with including what's needed in a gemspec.

By default, a gem will be distributed with your public key if both the public and private keys are generated in the ~/.gem folder. You don't have to do anything else here. Alternatively, you can link to your public and private keys from different locations. It is common to distribute a gem with a public key in the certs/ folder and link to it with the cert_chain attribute:

spec.cert_chain = ["certs/yourhandle.pem"]

Then keep the signing key together with other secure keys in the ~/.ssh directory (applies to Unix compatible systems). And only assign the private key via the signing_key attribute when a gem is built, that is, when the gem build package.gemspec command runs:

if $PROGRAM_NAME.end_with?("gem") && ARGV == ["build", __FILE__]
  spec.signing_key = File.expand_path("~/.ssh/gem-private_key.pem")
end

The Final Reveal

You have seen almost all of the available RubyGems attributes and code snippets demonstrating their usage. It may feel a little bit overwhelming at first. You may also have noticed that I introduced each attribute in the order in which I tend to place them in a gemspec. I like to group attributes together based on their conceptual affinity. Let's take one final look at the entire gem specification in its full glory:

# frozen_string_literal: true

require_relative "lib/emoticon/version"

Gem::Specification.new do |spec|
  spec.name        = "emoticon"
  spec.version     = Emoticon::VERSION
  spec.summary     = "Display emoticons in your terminal"
  spec.description = <<~DESC
    Display emoticons in your terminal. Communicate with your
    users not only through words but also emotions.
  DESC
  spec.authors     = ["Your Name"]

  spec.license  = "MIT"
  spec.email    = "email@example.com"
  spec.homepage = "https://github.com/your-handle/emoticon"
  spec.metadata = {
    "bug_tracker_uri"   => "https://github.com/your-handle/emoticon/issues",
    "changelog_uri"     => "https://github.com/your-handle/emoticon/blob/master/CHANGELOG.md",
    "documentation_uri" => "https://www.rubydoc.info/gems/emoticon",
    "homepage_uri"      => spec.homepage,
    "source_code_uri"   => "https://github.com/your-handle/emoticon"
  }

  spec.files       = Dir["lib/**/*"]
  spec.bindir      = "exe"
  spec.executables = %w[emote]

  spec.extra_rdoc_files = Dir["README.md", "CHANGELOG.md", "LICENSE.txt"]
  spec.rdoc_options    += [
    "--title", "Emoticon - emotions in terminal",
    "--main", "README.md",
    "--line-numbers",
    "--inline-source",
    "--quiet"
  ]

  spec.required_ruby_version = ">= 2.4.0"

  spec.add_dependency "tomlrb", "~> 1.2"

  spec.add_development_dependency "rake", "~> 13.0"
  spec.add_development_dependency "rspec", "~> 3.0"
end

Sorted. Now our gem is ready to be released into the world.

Conclusion

My goal in this article was to remove the confusion that surrounds the RubyGems specification file. I selected the attributes I believe are necessary to make your gem release process smooth. If you stick to the advice in this article, your gemspec will surely spark joy! It won't be cluttered with unnecessary configuration options, cryptic method calls or spurious files. You will keep the gem in good shape by reducing its size significantly and help make your gem installation super quick. You will also avoid annoying your users with unrequested executable files.

I hope you enjoyed this whirlwind tour of the RubyGems specification. If you already have some open-source Ruby gems or plan to release one soon, I hope you have learnt enough to apply it in your working code! Next time, we will take a closer look at the process of turning a specification into a gem.

Stay hungry and creative!


This article was originally published on PiotrMurach.com.

Photo by Jas on Unsplash
