Write your own Domain Specific Language in Ruby
Tom de Bruijn
Posted on February 6, 2023
Let's do some metaprogramming in Ruby! This article assumes you have some familiarity with writing Ruby and are interested in learning how to use metaprogramming.
This article is also available as a video recording of a talk.
What's a DSL?
Let's start with some definitions on what we're going to be looking at. The abbreviation DSL stands for Domain Specific Language.
A DSL is a sub-language within a Ruby app (or an app in any other programming language) that is specific to a certain domain, like a Ruby gem or part of an app. Without that gem or app, the DSL won't work. It's specific to that domain.
In Ruby there's a gem named Rake. This gem provides a DSL to create tasks to be run from the command line. A small example looks like this:
# Rakefile
task :hello do
  puts "Hello there"
end
You can call this task from the terminal with the following command. When called, it will perform the block we've given to the task method.
$ rake hello
Hello there
This is the power of a DSL. We don't need to know how to listen to arguments given to Ruby from the command line. Rake does this for us. We only need to define the commands.
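To get a feeling for what such a DSL does behind the scenes, here is a minimal sketch of a task registry. It is not how Rake is implemented: the TinyTasks module and its task and run methods are hypothetical names made up for this illustration. It only shows the idea of storing a block under a name and calling it later.

```ruby
# A toy task registry. This is NOT Rake's implementation;
# TinyTasks, task and run are hypothetical names.
module TinyTasks
  TASKS = {}

  # Store the block under the task name so it can be run later.
  def self.task(name, &block)
    TASKS[name.to_sym] = block
  end

  # Look up the task by name and call its block.
  def self.run(name)
    block = TASKS.fetch(name.to_sym) do
      raise ArgumentError, "Unknown task: #{name}"
    end
    block.call
  end
end

TinyTasks.task :hello do
  "Hello there"
end

# A command line tool could pass ARGV.first here instead.
TinyTasks.run("hello") # => "Hello there"
```

The person defining the task only writes the task block. Looking up the name and running the block stays hidden inside the registry.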
Why use a DSL?
Configuration
One of the most common reasons to create your own DSL is for configuration, for an app or part of a gem. In Rails, the Application class contains configuration for the different gems that make up Rails. It's even possible to define your own custom config options specific to your app.
# config/application.rb
module MyApp
  class Application < Rails::Application
    config.load_defaults 7.0
    config.time_zone = "UTC"
    # Custom options live under the config.x namespace
    config.x.custom_option = "some value"
  end
end
Within the Application class, the config object can be used to set config options or load defaults, as shown above.
Automate writing code
A DSL can be used to abstract away a bunch of repetitive code. In a Rails app there can be many different HTTP endpoints the app listens to. Writing a Ruby app that listens to all these endpoints can require a lot of code. In Rails, we can declare several different routes with one method: resources.
Rails.application.routes.draw do
  resources :posts do
    resources :comments
  end
  root "homepage#show"
end
The resources method will do several things:
- create several endpoints like GET /posts, GET /posts/new, POST /posts, DELETE /posts/:id, etc.;
- nest endpoints, if they are nested within another resources method call's block, like in the example above for comments;
- route the request data on the endpoint to the relevant controller.
This is all defined with one method call. We don't need to go through several files and methods to make sure these steps are all done for every route. This makes what would otherwise be high-churn code more maintainable*.
*: At times, we may be swapping out a lot of code with less code, but that code is more complex because of the metaprogramming involved.
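As a rough illustration of how one method call can expand into several routes, here is a toy sketch. It is not the Rails router: TinyRoutes is a hypothetical class, it only covers a handful of actions, and it leaves out nesting entirely.

```ruby
# A toy route table, NOT the Rails router. TinyRoutes is a hypothetical name.
class TinyRoutes
  attr_reader :routes

  def initialize
    @routes = []
  end

  # One call expands into a conventional set of endpoints.
  def resources(name)
    {
      "index"   => ["GET",    "/#{name}"],
      "new"     => ["GET",    "/#{name}/new"],
      "create"  => ["POST",   "/#{name}"],
      "show"    => ["GET",    "/#{name}/:id"],
      "destroy" => ["DELETE", "/#{name}/:id"]
    }.each do |action, (verb, path)|
      # Store the HTTP verb, the path and the "controller#action" target.
      @routes << [verb, path, "#{name}##{action}"]
    end
  end
end

router = TinyRoutes.new
router.resources :posts
router.routes.first # => ["GET", "/posts", "posts#index"]
router.routes.size  # => 5
```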
Writing your own Ruby DSL
The start of a DSL
Configuration is usually the first thing that gets a DSL: a gem, or a part of an app, ships with a configuration class. We start with a new Config class that has some config options. (In this case it's for a CLI tool that has a verbose option to print more information and a particular output format called documentation.)
class Config
  def initialize(options = {})
    @options = options
  end
end

config = Config.new(
  :verbose => true,
  :format => :documentation
)
# => #<Config:...
#      @options={:verbose=>true, :format=>:documentation}>
The above example is usually enough for a small component.
Let's look at how we can make this DSL feel more like some of the DSLs we've seen. In the example below we don't configure the Config class with a Hash of options, but use method calls like the verbose= attribute writer.
Ruby has helpers for defining reader and writer methods on a class for instance variables. Calling attr_accessor :verbose in a class definition will define both the verbose and verbose= methods.
class Config
  attr_accessor :verbose, :format
end

config = Config.new
config.verbose = true
config.verbose # => true
config.format = :documentation
pp config
# => #<Config:...
#      @format=:documentation,
#      @verbose=true>
While this is quick to set up, it has some downsides. Like before, we want to store all config options on an @options Hash instead of on their own instance variables. To export the config options as a Hash, we would need to export them all manually. Every new config option then has to be added in more than one place. Easily forgotten, easily broken.
class Config
  # ...
  def options
    {
      :verbose => @verbose,
      :format => @format
    }
  end
end

pp config.options
# => {:verbose=>true, :format=>:documentation}
To make the Config class work with a Hash of options, we can define our own methods for each config option. In these methods we set the values on the @options Hash, using the key of the config option we want to set. Great! We now can export our configuration with ease.
class Config
  attr_reader :options

  def initialize
    @options = {}
  end

  def verbose=(value)
    @options[:verbose] = value
  end

  def format=(value)
    @options[:format] = value
  end
end

config = Config.new
config.verbose = true
config.format = :documentation
pp config.options
# => {:verbose=>true, :format=>:documentation}
With this setup, we run into a familiar problem. Every time a new config option is added we need to add another method for that config option. As more config options are added, the Config class grows and grows, becoming more difficult to maintain.
Dynamically define methods in Ruby
Instead of defining a method per config option, we can dynamically define these methods. Dynamically defining methods allows us to quickly define several methods that share almost the same implementation.
In the example below I've created a new class method, as indicated with the def self.def_options syntax (1). This method, "define options", receives an array of option names to define methods for (2). In the def_options method it will loop through these option names (3). In each iteration of the loop it will call the define_method method (4). With this method we can define a new method, like with the def <method name> syntax, but with a dynamic method name.
class Config
  def self.def_options(*option_names) # 1
    option_names.each do |option_name| # 3
      define_method "#{option_name}=" do |value| # 4
        @options[option_name] = value # 5
      end
    end
  end

  def_options :verbose, :format # 2

  def initialize
    @options = {}
  end

  def options
    @options
  end
end

config = Config.new
config.verbose = true
config.format = :documentation
pp config.options
# => {:verbose=>true, :format=>:documentation}
The define_method method receives the name of the method, and a block (4). This block will be the implementation of the method we are defining. In this case, it reads the @options instance variable to get the Hash of configured options, and sets the value for the option name we've declared (5). The DSL syntax for the end-user remains the same.
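The same loop can define more than one method per option. The sketch below is an extension for illustration only, not part of the example above: next to the writer it also defines a reader method per option, so a value can be read back without going through the options Hash.

```ruby
class Config
  def self.def_options(*option_names)
    option_names.each do |option_name|
      # Writer, as before: config.verbose = true
      define_method("#{option_name}=") do |value|
        @options[option_name] = value
      end

      # Reader, added for this illustration: config.verbose
      define_method(option_name) do
        @options[option_name]
      end
    end
  end

  def_options :verbose, :format

  attr_reader :options

  def initialize
    @options = {}
  end
end

config = Config.new
config.verbose = true
config.verbose               # => true
config.format                # => nil, not set yet
config.respond_to?(:format=) # => true
```

Methods created with define_method behave like methods written with def: they show up in respond_to? and can be called like any other method.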
Using Ruby blocks
In my opinion, no Ruby DSL is complete without blocks*. They look good for one, but they also provide a new scope of context to users of our DSL, in which our new DSL rules apply.
*: It's perfectly fine to have a DSL without blocks.
Config.configure do |config|
  config.verbose = true
  config.format = :documentation
end
In the next example, I've created a new class method called configure. To interact with blocks given to a method, we can use the yield keyword. If a method has been given a block, yield will call that block. If we pass a value to yield, the block receives that value as a block argument, as shown with |config| in the example above.
class Config
  def self.configure
    instance = new # Create a new instance of the Config class
    yield instance # Call the block and give the Config instance as an argument
    instance
  end
  # ...
end
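One thing to keep in mind with yield: when the method is called without a block, yield raises a LocalJumpError. A small sketch of a configure method that guards against this with block_given? could look like the following. The attr_accessor is only used here to keep the example short.

```ruby
class Config
  attr_accessor :verbose, :format

  def self.configure
    instance = new
    # Without this guard, calling `yield` when no block was given
    # raises a LocalJumpError.
    yield instance if block_given?
    instance
  end
end

config = Config.configure do |c|
  c.verbose = true
end
config.verbose # => true

# No block given: nothing is configured, and no error is raised.
Config.configure.verbose # => nil
```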
Using instance_eval
We can make this shorter, and in my opinion, feel more like a DSL with its own block context. We can drop the |config| block argument and avoid having to call config. in front of every config option.
Config.configure do
  verbose true
  format :documentation
end
I've omitted the equals sign (=) from the example here. More on that later.
To make this work we go back to our configure class method. Instead of yield, we'll use instance_eval. (Eval may sound scary, and it has some things to look out for, but in this small example it's quite safe.)
To call instance_eval we need to pass in the block given to the method. We can't use yield for this. We need to explicitly specify the block argument of the configure method with &block. The key part is the ampersand (&) in front of the argument name, which tells Ruby it's the block argument.
class Config
  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end
  # ...
end
(In Ruby 3.1 and newer the &block argument can also be written as only the ampersand &, the name is optional.)
When called on another object, like our Config class instance, instance_eval will execute that block within the context of the Config instance. The context of that block is changed from where it was defined to the Config instance. When a method is called within that block, it will call it directly on the Config instance.
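The change of context can be made visible by returning self from the block. In this sketch the with_yield and with_instance_eval class methods are hypothetical helpers that only exist to compare the two ways of calling a block.

```ruby
class Config
  # Hypothetical helper: calls the block with yield.
  def self.with_yield
    instance = new
    yield instance
  end

  # Hypothetical helper: calls the block with instance_eval.
  def self.with_instance_eval(&block)
    instance = new
    instance.instance_eval(&block)
  end
end

outer_self = self # The `main` object when run as a script

# With yield, the block keeps the `self` from where it was written.
yielded_self = Config.with_yield { |_config| self }
yielded_self.equal?(outer_self) # => true

# With instance_eval, `self` inside the block is the Config instance.
evaluated_self = Config.with_instance_eval { self }
evaluated_self.class # => Config
```

Because method calls without an explicit receiver are sent to self, this is what makes calls like verbose true end up on the Config instance.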
Local variable assignment gotcha
Remember how in the example above I omitted the equals sign from the method names? When a method name ending with an equals sign is called without an explicit receiver, as happens in an instance_eval block, Ruby will interpret it as a local variable assignment instead. It will not call the method on the Config instance. That's how Ruby works, and we can't change that. I omitted the equals sign to make the DSL still work.
Config.configure do
  # This won't work
  verbose = true
  format = :documentation
end
It's possible to work around this by explicitly calling the methods on self., but this syntax becomes basically the same as using yield. Instead, I removed the equals sign from the method name.
Config.configure do
  # This works, but requires `self.` in front of every method call
  self.verbose = true
  self.format = :documentation
end
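To see the gotcha in action, the following sketch defines both a verbose= writer and a verbose method without the equals sign on the same class. Having both is only done here for demonstration purposes, to show that the writer is never reached from inside the block.

```ruby
class Config
  attr_reader :options

  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  def initialize
    @options = {}
  end

  # Without the equals sign, `verbose true` is a regular method call.
  def verbose(value)
    @options[:verbose] = value
  end

  # The writer exists too, but the first block below never calls it.
  def verbose=(value)
    @options[:verbose] = value
  end
end

broken = Config.configure do
  verbose = true # Local variable assignment, Config#verbose= is not called
end
broken.options.empty? # => true

working = Config.configure do
  verbose true # Method call on the Config instance
end
working.options[:verbose] # => true
```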
When to use yield or instance_eval?
I've shown two ways of calling blocks in Ruby. Which one should be used is up to the authors of the DSL. I have my personal preference for using instance_eval, but there are definitely scenarios where using yield works better.
When something needs to be configured, using a lot of methods with an equals sign (writer methods), using yield with a block argument is quite common.
Config.configure do |config|
  config.verbose = true # Set a value
end
When defining something or calling actions, I see the instance_eval approach used a lot.
Config.configure do
  load_defaults! # Perform an action
end
That doesn't mean one excludes the other. Gems like Puma use the instance_eval method for their configuration file.
# config/puma.rb
workers 3
preload_app!
Use the approach you feel works best for your DSL.
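If you can't decide, one possible option is to support both styles in the same method. The sketch below is an assumption of mine, not something the approaches above require: it checks the arity of the block, and passes the instance as an argument when the block asks for one, or falls back to instance_eval when it doesn't.

```ruby
class Config
  attr_reader :options

  def self.configure(&block)
    instance = new
    return instance unless block

    if block.arity == 1
      # The block declares an argument: Config.configure do |config| ... end
      block.call(instance)
    else
      # No block argument: evaluate the block on the instance itself
      instance.instance_eval(&block)
    end
    instance
  end

  def initialize
    @options = {}
  end

  def verbose(value)
    @options[:verbose] = value
  end
end

with_argument = Config.configure { |config| config.verbose true }
with_argument.options[:verbose] # => true

without_argument = Config.configure { verbose true }
without_argument.options[:verbose] # => true
```

The downside is that the DSL now has two syntaxes to document and support, so it's worth considering if the flexibility is needed.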
Share DSL behavior with modules
With the above methods you can start building your DSL. This gives us a good toolbox. As the DSL grows, or more DSLs get added to the domain, it's good to share behavior between the DSL classes.
For example, this domain has two separate configuration classes: a Config and an Output class. One configures how the gem works, the other how it outputs results to the terminal.
# Multiple configuration classes
Config.configure do
  enabled true
end

Output.configure do
  verbose true
  format :documentation
end
We don't want to repeat ourselves, so we move the DSL definition behavior to a module called Configurable. When that module is included, the class method def_options is made available in the class definition.
class Config
  include Configurable
  def_options :enabled
end

class Output
  include Configurable
  def_options :verbose, :format
end
When the Configurable module is included, it adds the options method to the class. The module also extends the class it's included in, using the included callback in Ruby modules. In that callback, the class is extended with the ClassMethods module, defined inside the Configurable module. This module will add the class methods (def_options and configure) to the class. This is also how things like ActiveSupport::Concern work.
module Configurable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def def_options(*option_names)
      # ...
    end

    def configure(&block)
      # ...
    end
  end

  def options
    @options ||= {}
  end
end
With all the logic moved to the Configurable module, we now have a reusable module to create as many config classes as the domain needs.
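For reference, here is one way the elided method bodies could be filled in, combining the define_method and instance_eval examples from earlier in this article. Treat it as a sketch: the option methods are defined without the equals sign so they work inside the instance_eval block.

```ruby
module Configurable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def def_options(*option_names)
      option_names.each do |option_name|
        # No equals sign, so `verbose true` works inside instance_eval
        define_method(option_name) do |value|
          options[option_name] = value
        end
      end
    end

    def configure(&block)
      instance = new
      instance.instance_eval(&block)
      instance
    end
  end

  def options
    @options ||= {}
  end
end

class Config
  include Configurable
  def_options :enabled
end

class Output
  include Configurable
  def_options :verbose, :format
end

config = Config.configure { enabled true }
output = Output.configure do
  verbose true
  format :documentation
end
config.options[:enabled] # => true
output.options[:format]  # => :documentation
```

Each class keeps its own set of options: the Config class has no verbose method, and the Output class has no enabled method.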
Nesting DSLs
The Output class is really a sub domain of the Config class though. It's part of the gem config: the configurations should be nested within one another.
Config.configure do
  enabled true
  output do
    verbose true
    format :documentation
  end
end
Rather than Output being a top-level config class, we store it on the Config class. In the example below, the output method creates a new Output instance the first time it's called. When a block is given, it will instance_eval the block on the @output object. The method will always return the @output object so the sub-config can be accessed.
class Config
  include Configurable
  def_options :enabled

  class Output
    include Configurable
    def_options :verbose, :format
  end

  def output(&block)
    @output ||= Output.new
    @output.instance_eval(&block) if block_given?
    @output
  end
end
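Put together with a Configurable module, the nested DSL can be tried out as shown below. The Configurable implementation in this sketch is my own compact fill-in of the elided method bodies, so consider it an assumption rather than the definitive version.

```ruby
# Compact, assumed implementation of the Configurable module
module Configurable
  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    def def_options(*option_names)
      option_names.each do |option_name|
        define_method(option_name) { |value| options[option_name] = value }
      end
    end

    def configure(&block)
      instance = new
      instance.instance_eval(&block)
      instance
    end
  end

  def options
    @options ||= {}
  end
end

class Config
  include Configurable
  def_options :enabled

  class Output
    include Configurable
    def_options :verbose, :format
  end

  def output(&block)
    @output ||= Output.new
    @output.instance_eval(&block) if block_given?
    @output
  end
end

config = Config.configure do
  enabled true
  output do
    verbose true
    format :documentation
  end
end
config.options[:enabled]        # => true
config.output.options[:verbose] # => true
config.output.options[:format]  # => :documentation
```

Note that the context changes again inside the nested block. Calling enabled true within the output block raises a NoMethodError, because the Output instance doesn't have that method.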
Module builder pattern
The Module builder pattern is a really neat design pattern in Ruby that allows us to do away with the two step process of including a module and then calling methods included by it. This pattern is described in more detail in The Ruby Module Builder Pattern by Chris Salzberg.
In the example below, a new ConfigOption module is used to include a dynamically defined module. For the end-user, the resulting DSL remains the same.
class Config
  include ConfigOption.new(:verbose, :format)
end
When ConfigOption.new is called, the desired config option names are given to the initialize method. Like before, we iterate over this list. Using define_method the necessary methods are defined on the module. It's possible to do so in this initialize method, because ConfigOption is a subclass of Module: every call to ConfigOption.new creates a new module, and the methods are defined on that module.
class ConfigOption < Module
  def initialize(*option_names)
    define_method :options do
      @options ||= {}
    end

    option_names.each do |option_name|
      define_method option_name do |value|
        options[option_name] = value
      end
    end
  end
end
For a much more in-depth look, I can highly recommend reading The Ruby Module Builder Pattern by Chris Salzberg.
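To try the pattern out, here's a small usage sketch. It repeats the ConfigOption class so it can be run on its own, and then looks at the ancestors of the Config class to show where the generated methods live.

```ruby
class ConfigOption < Module
  def initialize(*option_names)
    define_method :options do
      @options ||= {}
    end

    option_names.each do |option_name|
      define_method option_name do |value|
        options[option_name] = value
      end
    end
  end
end

class Config
  include ConfigOption.new(:verbose, :format)
end

config = Config.new
config.verbose true
config.format :documentation
config.options[:verbose] # => true
config.options[:format]  # => :documentation

# The generated module is anonymous and sits in the ancestor chain
# of the class, like any other included module.
Config.ancestors.any? { |ancestor| ancestor.is_a?(ConfigOption) } # => true

# The methods are defined on the generated module, not on Config itself.
Config.instance_methods(false).include?(:verbose) # => false
```

Every call to ConfigOption.new returns a different module, so two classes can each include their own set of options without affecting each other.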
Create DSL objects
Another step I recommend is to separate the behavior of the DSL itself and the actual app code that uses it, by creating DSL classes. These classes are only used when the end-user interacts with the DSL, like configuring a gem. When the configuration is done, the options are read from the DSL class and set on whatever config object the gem uses.
Like before, we have a Config class. When configured using Config.configure, it creates a ConfigDSL instance (1). This ConfigDSL instance will become the context of the block given to the configure class method.
class Config
  def self.configure(&block)
    dsl = ConfigDSL.dsl(&block) # 1
    new(dsl.options) # 2
  end

  attr_reader :options

  def initialize(options)
    @options = options
  end
end

class ConfigDSL
  include ConfigOption.new(:verbose, :format)

  def self.dsl(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end
end
After the configure block has been called, the Config class reads the options from the ConfigDSL and initializes a new instance of itself with these options (2).
With this approach the DSL that configures the gem is only used when the app starts. The app doesn't carry around all that extra weight of how the DSL works all the time. The separation of this behavior makes it easier to not accidentally call the configure DSL when it should not be available.
This approach also solves a downside of using instance_eval. When calling blocks this way, private methods are accessible inside the block. That's something we usually don't want to allow.
class Config
  def self.configure(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end

  private

  def secret_method
    "super secret method"
  end
end

Config.configure do
  # This should raise an error about calling
  # a private method, but it doesn't
  secret_method
  # => "super secret method"
end
Private methods can be accessed in blocks called using instance_eval, because the block is evaluated in the context of the instance. It's as if it's being run from a method within that object. With separate DSL classes you have to worry less about private methods that you don't want anyone to call.
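The sketch below combines the Config, ConfigDSL and ConfigOption classes from this article with the private secret_method, to check that the private method can no longer be reached from the configure block. How the classes are combined here is my own arrangement for the purpose of the demonstration.

```ruby
class ConfigOption < Module
  def initialize(*option_names)
    define_method(:options) { @options ||= {} }
    option_names.each do |option_name|
      define_method(option_name) { |value| options[option_name] = value }
    end
  end
end

class ConfigDSL
  include ConfigOption.new(:verbose, :format)

  def self.dsl(&block)
    instance = new
    instance.instance_eval(&block)
    instance
  end
end

class Config
  attr_reader :options

  def self.configure(&block)
    dsl = ConfigDSL.dsl(&block) # The block runs on the ConfigDSL instance
    new(dsl.options)
  end

  def initialize(options)
    @options = options
  end

  private

  def secret_method
    "super secret method"
  end
end

config = Config.configure do
  verbose true
end
config.options[:verbose] # => true

begin
  Config.configure { secret_method }
rescue NameError => error
  # The block ran on ConfigDSL, which has no secret_method at all
  error.name # => :secret_method
end
```

The resulting Config instance also has no verbose or format DSL methods. It only exposes the options it was initialized with.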
Conclusion
Now that our toolbox is complete we can create our own Domain Specific Language using Ruby. We have the following tools at our disposal:
- Use define_method to dynamically define new methods on classes.
- Use Ruby blocks to give your DSL that real DSL feeling and create a context wherein your DSL shines.
  - Use yield to return a DSL object as a block argument, or;
  - Use instance_eval to change how the block works, and allow end-users to directly call methods on the new context of the block.
- Use Modules to:
  - share behavior between many classes, and;
  - use the Module builder pattern to remove any logic of how the DSL is constructed from the class that uses the DSL.
- DSL objects separate the logic of how the DSL works from how the rest of the gem works, creating a better separation of responsibilities in your code.
Now go forth, and build your own DSL!