Utilizing Macros & Annotations in a Web Framework

blacksmoke16

Posted on April 4, 2019

Macros, and especially annotations, are among the least documented features of Crystal. Because of this, I wanted to take the time to write a little post about how I used annotations and macros to build my web framework, focusing on how they are used, not necessarily on the framework itself.

A few months ago I started on yet another web framework project. However, I wanted to take a different approach, taking into consideration the experience I have from my other side projects as well as those at my work. The result of this was Athena. Unlike other frameworks that rely mostly on macros to define the routes, Athena uses annotations together with macros. This allows for code that is very readable, with route actions being just class methods. This has a few benefits, such as:

  • Easier to document - Actions are just methods, so doc comments work just fine.
  • Easier to test - Again, actions are just methods, so you can call them directly in a test.
  • Easier to read/understand - There is no special syntax needed (other than learning the annotations).
  • Inheritance - Controllers are just classes, so they can inherit from one another.

Annotations are used heavily, as the main mechanism to define routes, callbacks, and even CLI commands. I also used them in my other serialization/validation shard, CrSerializer, as an alternative way to manage property serialization/validation of models.

Macros

Macros are defined (in the Crystal GitBook) as "Methods that receive AST nodes at compile-time and produce code that is pasted into a program." In other words, a macro is code that you write that expands into Crystal code at compile time and gets inserted into the program. Macros are almost a language of their own; quite powerful and flexible. They allow you to reduce boilerplate by writing code that can define classes, methods, etc.
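To make that a bit more concrete, here is a minimal sketch of a macro that writes methods for you. The macro name `define_status_checks` and the generated method names are made up for this example and have nothing to do with Athena.

```crystal
# Hypothetical macro: generates one predicate method per status name.
macro define_status_checks(*names)
  {% for name in names %}
    def {{name.id}}?(status : String) : Bool
      status == {{name.id.stringify}}
    end
  {% end %}
end

# Expands at compile time into top-level `active?` and `archived?` methods.
define_status_checks active, archived

puts active?("active")   # => true
puts archived?("active") # => false
```

Nothing here runs a loop at runtime; the `for` is evaluated by the compiler and only the two generated `def`s end up in the program.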

Annotations

Annotations are one of the least documented but most powerful features in Crystal. In short, annotations allow types (classes, structs), methods, and properties to have information added to them that is available at compile time and readable via macros.

Annotations are defined similarly to a class or struct, but using the annotation keyword.

annotation Get; end

I'm going to define some scenarios I wanted to solve when setting out to create Athena, then walk through my thinking and how I used macros and annotations to solve those problems.

  1. Develop a way of dynamically registering routes based on a specific annotation as opposed to using a standard get "/path" do { } macro for example.
  2. Ability to easily validate properties on the ORM models by applying annotations to each property, instead of using a validate xxx macro.
  3. Similar to 1, but to have self registering CLI commands that get automatically exposed to an OptionParser interface.

Registering Routes

In order to solve this first obstacle, I had to think: "How can I get an Array of controllers with routes defined?"

My ideal solution would have been like:

@[Athena::Routing::Controller]
class MyController
  @[Athena::Routing::Get(path: "/me")]
  def get_me : String
    "Jim"
  end

  @[Athena::Routing::Post(path: "/user")]
  def new_me(body : String) : String
    body
  end
end

{% for controller in Athena::Routing::Controller.types %}
  # Iterate over all the classes with the `Controller` annotation
{% end %}

Thus, any class that has the Athena::Routing::Controller annotation would be known to be a controller, and its class methods would be annotated to tie a given route to the code that should be executed when the route is matched. However, this is not currently possible. See #7274. So I had to think of a plan B.

The Crystal generated API docs were super helpful in this regard, especially TypeNode. A TypeNode represents a type in a program, such as String, or MyController, etc. Reading through the available methods, I noticed there was an all_subclasses method; which returns an array of all the subclasses of that type. Ah ha! I then had the idea, similar to Rails and their ApplicationController, that I could make my own class, have controllers inherit from that, and use all_subclasses to iterate over all controllers at compile time. Great!
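As a rough illustration of that idea, the sketch below collects the names of every class that inherits, directly or indirectly, from a base class. `BaseController` and the controllers under it are placeholder names for this example, not Athena's real types.

```crystal
# Hypothetical base class that every controller inherits from.
abstract class BaseController; end

class UsersController < BaseController; end
class AdminUsersController < UsersController; end
class PostsController < BaseController; end

# The macro loop runs at compile time; the method body that ends up in the
# program is just an array literal of strings.
def controller_names : Array(String)
  {% begin %}
    [
      {% for c in BaseController.all_subclasses %}
        {{ c.name.stringify }},
      {% end %}
    ] of String
  {% end %}
end

puts controller_names.sort # => ["AdminUsersController", "PostsController", "UsersController"]
```

Note that `AdminUsersController` is included even though it does not inherit from the base class directly, which is the difference between `all_subclasses` and `subclasses`.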

This discovery led to the syntax that is used today:

class MyController < Athena::Routing::Controller
  @[Athena::Routing::Get(path: "/me")]
  def get_me : String
    "Jim"
  end

  @[Athena::Routing::Post(path: "/user")]
  def new_me(body : String) : String
    body
  end
end

However, there was a new problem: where do I do this? I turned my attention to the various routers available, such as Amber Router. In the examples, each router required the user to define a tree, add routes to it, and find those routes, all in the same file. The key problem here was that I needed the user not to have to know how to do any of that. I needed a way to hide the tree creation, the adding of routes, and the resolving of routes behind the scenes.

I went back to the Crystal API docs, specifically the HTTP::Handler module. This module, when included, requires a class to define a call method that will be executed on each HTTP request. Also, since the handler is nothing more than a class, I thought I could add an initialize method that would run once when newing up the handler for the HTTP::Server.
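Here is a minimal sketch of that shape, using only the standard library: `initialize` fills a lookup table once, and `call` consults it on every request. `TableHandler` and its plain Hash are stand-ins I made up for this example; the real handler uses a router tree instead.

```crystal
require "http/server"

# Hypothetical handler: a Hash stands in for the real router tree.
class TableHandler
  include HTTP::Handler

  @routes = {} of String => String

  # Runs once, when the handler is instantiated for the server.
  def initialize
    @routes["/GET/me"] = "Jim"
  end

  # Runs on every request.
  def call(context : HTTP::Server::Context)
    key = "/#{context.request.method}#{context.request.path}"
    if body = @routes[key]?
      context.response.print body
    else
      # No match here, let the next handler in the chain deal with it.
      call_next(context)
    end
  end
end
```

The user only ever sees `TableHandler.new`; how the table is built and searched stays hidden inside the handler.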

Perfect! A simplified, cut-down version of this ultimately ended up looking like:

class RouteHandler < Athena::Routing::Handlers::Handler
  @routes : Amber::Router::RouteSet(Action) = Amber::Router::RouteSet(Action).new

  def initialize
    {% for c in Athena::Routing::Controller.all_subclasses %}
      {% methods = c.class.methods.select { |m| m.annotation(Get) || m.annotation(Post) || m.annotation(Put) || m.annotation(Delete) } %}
      {% for m in methods %}
        {% if d = m.annotation(Get) %}
          {% method = "GET" %}
          {% route_def = d %}
        {% elsif d = m.annotation(Post) %}
          {% method = "POST" %}
          {% route_def = d %}
        {% end %}

        # Action is a record that contains data about the action.
        # Such as params, the action to execute, path, controller etc. 
        {% action = xxx %}

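        # The HTTP verb is prefixed to the path so that `GET /me` and `POST /me` resolve to different actions.
        @routes.add {{"/" + method + route_def[:path]}}, {{action}}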
      {% end %}
    {% end %}
  end
end

I first defined an instance variable, @routes, which is initialized to a new instance of my router tree. I then defined an initialize method. Within this method, I used a macro loop to iterate over the subclasses of the base controller class. I then used .select to get an array of actions defined on each controller, specifically looking for class methods annotated with one of the HTTP verb annotations. This allows a controller to have other class methods, such as private helpers, that are not treated as routes. I then set the method and route_def variables based on which annotation was matched. Finally, I registered the route with the router.
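If you want to play with the technique without pulling in a router shard, the steps above can be sketched in a fully self-contained way. Everything here is simplified and hypothetical: the top-level `Get`/`Post` annotations, the `Controller` base class, and a Hash of procs in place of the router tree and the Action record.

```crystal
annotation Get; end
annotation Post; end

abstract class Controller; end

class MeController < Controller
  @[Get(path: "/me")]
  def self.get_me : String
    "Jim"
  end

  @[Post(path: "/user")]
  def self.new_me : String
    "created"
  end

  # Not annotated, so it is never registered as a route.
  def self.helper : String
    "internal"
  end
end

class RouteTable
  getter routes = {} of String => Proc(String)

  def initialize
    {% begin %}
      {% for c in Controller.all_subclasses %}
        {% for m in c.class.methods %}
          {% if ann = m.annotation(Get) %}
            @routes[{{ "/GET" + ann[:path] }}] = ->{ {{ c }}.{{ m.name }} }
          {% elsif ann = m.annotation(Post) %}
            @routes[{{ "/POST" + ann[:path] }}] = ->{ {{ c }}.{{ m.name }} }
          {% end %}
        {% end %}
      {% end %}
    {% end %}
  end
end

table = RouteTable.new
puts table.routes["/GET/me"].call # => Jim
```

After expansion, `initialize` contains nothing but two plain Hash assignments, one per annotated method.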

Annotation fields can be accessed using the [] method, using either a String or Symbol for named fields, or an Int32 for positional fields.
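A small sketch of the three access styles, using a made-up `Route` annotation that takes one positional and one named field:

```crystal
# Hypothetical annotation with a positional field (the verb) and a named field (the path).
annotation Route; end

class PingController
  @[Route("GET", path: "/ping")]
  def self.ping : String
    "pong"
  end
end

def route_fields : Array(String)
  {% begin %}
    {% m = PingController.class.methods.find { |meth| meth.annotation(Route) } %}
    {% ann = m.annotation(Route) %}
    verb = {{ ann[0] }}            # positional field, read by index
    path = {{ ann[:path] }}        # named field, read with a Symbol
    same_path = {{ ann["path"] }}  # named field, read with a String
    [verb, path, same_path]
  {% end %}
end

puts route_fields # => ["GET", "/ping", "/ping"]
```

Reading a field that was not supplied gives back `nil` at compile time, which is handy for falling back to defaults.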

Since this is all macro code, the actual code that gets added to the program at compile time would look like:

class RouteHandler < Athena::Routing::Handlers::Handler
  @routes : Amber::Router::RouteSet(Action) = Amber::Router::RouteSet(Action).new

  def initialize
    @routes.add "/GET/me", RouteAction.new("get_me", xxx)
    @routes.add "/POST/user", RouteAction.new("new_me", xxx)
  end
end

Cool! Now I could define the call method to handle routing. Again, a simplified, trimmed-down version of this ended up looking like:

def call(ctx : HTTP::Server::Context)
  search_key = '/' + ctx.request.method + ctx.request.path
  route = @routes.find search_key

  if route.found?
    action = route.payload.not_nil!
  else
    raise Athena::Routing::Exceptions::NotFoundException.new "No route found for '#{ctx.request.method} #{ctx.request.path}'"
  end

  call_next ctx
end

The call_next at the end hands the request to the other handlers I defined, which take care of CORS and of executing the actual action tied to the given route. However, that doesn't involve any other notable uses of annotations/macros, so I'm just going to skip it. If you're interested in learning more about how Athena is set up, feel free to message me on the Crystal Gitter.

This portion is now complete and could be used similarly to:

server = HTTP::Server.new([RouteHandler.new])

server.bind_tcp "127.0.0.1", 8080
server.listen

Property Validation

The next challenge I wanted to solve was validations, also following an annotation based approach. The idea was you could add annotations to a property, then on a POST endpoint for example, run those validations. If the validations pass, pass an instance of your model to the route action, otherwise return a 400.

I wanted to have this built on top of JSON::Serializable, so that validations would work out of the box for any class/struct using that as its serialization/deserialization method.

It sounds simple: "Just create some validation annotations, add a validate method that iterates over those annotations like you did for the routes, and there you go." However, it was not that simple :P

Originally I came up with the syntax of:

@[CrSerializer::Assertions(greater_than_or_equal: 0, nil: false)] 
property age : Int32?

Then, I could read the fields off the annotation, and run corresponding logic to run the assertions.

However it quickly became apparent this approach has some flaws.

  • The line quickly gets long if there are a lot of assertions.
  • There is no easy way to force required fields, must supply min AND max for example.
  • It's not expandable. If a user wanted to add their own assertion, I could not possibly know the keys they would define let alone what to do with them.

So after some thinking I pivoted in favor of this syntax:

@[Assert::NotNil] 
@[Assert::GreaterThanOrEqual(value: 0)] 
property age : Int32?

The benefits of this:

  • Easier to read, as the lines do not get as long
  • More expandable
  • Easier to see related fields

But of course there were some challenges with this approach. There is currently no way to just get an array of all the annotations applied to a type; each annotation has to be read by its name. I now needed a way to know the names of each assertion annotation, what fields to read off of it, and a way to tie a given annotation to some specific logic.

I worked around the first two problems by knowing that constants are available at compile time. I defined a constant hash, mapping annotation names to an array of fields that would be read off of it.

ASSERTIONS = {
  Assert::Email              => [:mode],
  Assert::IP                 => [:version],
  Assert::Uuid               => [:versions, :variants, :strict],
  Assert::Url                => [:protocols, :relative_protocol],
  Assert::RegexMatch         => [:pattern, :match],
  ...
}

Since constants are available at compile time, it is possible to use them in a macro loop; which I did to iterate over the name/fields of each assertion.

{% for ivar in @type.instance_vars %}
  {% for t, v in CrSerializer::Assertions::ASSERTIONS %}
    {% ann = ivar.annotation(t.resolve) %}
    {% if ann %}
      assertions << {{t.resolve.name.split("::").last.id}}Assertion({{ivar.type.id}}).new({{ivar.stringify}},{{ann[:message]}},{{ivar.id}},{{v.select { |fi| ann[fi] != nil }.map { |f| %(#{f.id}: #{ann[f]}) }.join(',').id}})
    {% end %}
  {% end %}
{% end %}

This example introduces a new macro concept, @type. When used within a macro, @type represents the current scope or type as a TypeNode. In my case, @type would represent the class that included CrSerializer. I used @type.instance_vars to be able to iterate over each instance variable, read the annotations from it, and add the assertions if any exist. I also use it to add the type and name of each instance variable to the assertion instantiation.
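Stripped of everything CrSerializer-specific, the core of that loop can be sketched like this. The `NotBlank` annotation, the `Signup` class, and the `blank_fields` method are all invented for the example.

```crystal
# Hypothetical marker annotation for properties that must not be blank.
annotation NotBlank; end

class Signup
  @[NotBlank]
  property name : String

  # Not annotated, so the generated code never checks it.
  property note : String

  def initialize(@name, @note)
  end

  # `@type.instance_vars` is evaluated at compile time; only the annotated
  # instance variables produce a line of code in this method.
  def blank_fields : Array(String)
    failed = [] of String
    {% for ivar in @type.instance_vars %}
      {% if ivar.annotation(NotBlank) %}
        failed << {{ ivar.name.stringify }} if @{{ ivar.name }}.blank?
      {% end %}
    {% end %}
    failed
  end
end

puts Signup.new("", "").blank_fields # => ["name"]
```

For this class the method body expands to a single check on `@name`; `@note` is skipped at compile time rather than at runtime.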

NOTE: @type.instance_vars currently only works in the context of an instance/class method. See #7504.

Now that I had a way to handle the first two issues, I still needed a way to tie a given assertion to the logic specific to that assertion. To do this, I used the key of the ASSERTIONS hash, plus Assertion as the class name to instantiate a new class of that type. For example the logic for Assert::NotNil would be handled in a class called NotNilAssertion; which looks like:

class NotNilAssertion(ActualValueType) < Assertion
  @message : String = "'{{field}}' should not be null"

  def initialize(field : String, message : String?, @actual : ActualValueType)
    super field, message
  end

  def valid? : Bool
    !@actual.nil?
  end
end

Each assertion class has instance variables that map to the fields in the array from the ASSERTIONS hash. I use a macro to iterate over those fields, find the ones that are set, then new up an instance of that assertion with those values. I added this to a validate method that gets added to the type when including the CrSerializer module. Again, since this is all macro code, the actual code that gets inserted into the program would look like, based on the example above:

def validate : Nil
  assertions = [] of CrSerializer::Assertions::Assertion
  assertions << NotNilAssertion(Int32?).new
  assertions << GreaterThanOrEqualAssertion(Int32?).new(value: 0)
  @validator = CrSerializer::Validator.new assertions
end

But wait, if the ASSERTIONS is a constant, how would this help with allowing users to add their own assertions?

Using a macro, I'm able to define the annotation with the given name and add entries to the hash at compile time, so that the hash contains the standard assertions plus any assertions defined by the user.

macro register_assertion(name, fields)
  module CrSerializer::Assertions
    annotation {{name.id}}; end
    {% CrSerializer::Assertions::ASSERTIONS[name] = fields %}
  end
end

The user would use this like register_assertion Assert::MyCustom, [:value, :high_is_good], and of course have a MyCustomAssertion class to handle the logic.

@[Assert::MyCustom(value: 100, high_is_good: true)]
property age : Int32

Auto Registering CLI Commands

This is quite similar to the registering routes problem, but I'll include it as another example of what macros can do.

struct Registry
  macro finished
    class_getter commands : Array(Athena::Cli::Command.class) = {{Athena::Cli::Command.subclasses}}{% if Athena::Cli::Command.subclasses.size > 0 %} of Athena::Cli::Command.class {% end %}
  end
end

This defines a class getter with a value of an array of classes that inherit from the parent Command class. The value of the array is built out at compile time by simply calling {{Athena::Cli::Command.subclasses}}, which returns an array of classes. This has to be done within a finished macro, so that the types are known to prevent type mismatches.
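A standalone sketch of the same trick is below. It assumes at least one command is defined, so it leaves out the empty-array handling, and it uses classes instead of structs to keep things simple; `Command`, `Registry`, and the two commands are placeholder names rather than the `Athena::Cli` types.

```crystal
# Hypothetical base type that every CLI command inherits from.
abstract class Command; end

class Registry
  # `finished` runs after the whole program has been parsed, so commands
  # declared further down in the file (or in other files) are already known.
  macro finished
    class_getter commands : Array(Command.class) = {{ Command.subclasses }} of Command.class
  end
end

class GreetCommand < Command; end
class SeedCommand < Command; end

puts Registry.commands.map(&.name).sort # => ["GreetCommand", "SeedCommand"]
```

The commands are declared after `Registry` on purpose: the `finished` hook is what lets the registry see them anyway.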

I then used a macro that inserts an OptionParser that acts as an interface to list/run commands:

macro register_commands
  OptionParser.parse! do |parser|
    parser.banner = "Usage: YOUR_BINARY [arguments]"
    parser.on("-h", "--help", "Show this help") { puts parser; exit }
    parser.on("-l", "--list", "List available commands") { puts Athena::Cli::Registry.to_s; exit }
    parser.on("-c NAME", "--command=NAME", "Run a command with the given name") do |name|
      Athena::Cli::Registry.find(name).command.call ARGV
      exit
    end
  end
end

A command itself is just a struct that inherits from the base command struct:

struct MigrateEventsCommand < Athena::Cli::Command
  self.command_name = "migrate:events"
  self.description = "Migrates legacy events for a given customer"

  def self.execute(customer_id : Int32, event_ids : Array(Int64)) : Nil
    # Do stuff for the migration
    # Params are converted from strings to their expected types
  end
end

All that has to be done to use the commands is to call the register_commands macro in your main app file and create your command struct inheriting from the base command struct. It will then get picked up by Athena::Cli::Command.subclasses and registered at compile time.

Ultimately it could then be used like ./MyApp -c migrate:events --customer_id=83726 --event_ids=1,2,3,4,5, which would search the registry for the command and pass the args to it if it was defined.

Addendums

I didn't really know what to cover, as it's kind of hard to give examples out of the blue. I'll leave this section open for future additions. If there is something specific you would like to see an example of, or have explained further, feel free to ask.

ORM

@jwoertink made me think of another example I can walk through to go over the idea of "a reason I'd use them." I have been working on an annotation version of Granite, mostly for the lols, that I can give some examples from.

Annotations require a different view of how to structure an application as compared to simply using normal macros. We are all aware of the macro syntax ORMs use to define columns, i.e. field name : String. Behind the scenes the macro would add that field's name, type, and any options supplied to a constant so that at a later stage, once the macro inherited and finished hooks run, the constant could be used to create properties for those fields, applying the proper types/options to the property definition.

Begin Personal Opinion

However, in my opinion, this is a bit convoluted and is a good example of the type of situation where annotations could be useful. The main point being, what advantage does using field name : String get you over property name : String and then applying an annotation? The latter allows you to just have syntax like every other class in your app: inheritance works, properties can be documented, and third-party annotations can be easily applied (like JSON::Field). Annotations would allow you to remove the extra, unnecessary steps of going from macro to constant to property instead of straight to property.

End Personal Opinion

However, when taking an annotation based approach, you have to look at it from a different point of view. In the case of an ORM, the main challenge becomes "how can I pass column data around without adding everything to a mutable constant?" My solution to this is using records and building out, at compile time, an immutable array of objects that represents the columns in a model. These objects would contain data like the name of the column, the type, if it's the PK, if it's nullable, etc. The type, name, and nullability can all come directly from an instance variable and its TypeNode without touching annotations. Annotations come into play when you want to attach data that cannot be auto-generated to a column, aka property. Annotations allow fields/values to be defined on it that can also be read at compile time when building out our array of records.

    private record ColumnInfo(T) < ColumnBase, name : String, nilable : Bool, auto : Bool = false, primary : Bool = false, default : T? = nil, type : T.class = T

    protected def self.columns : Array(ColumnBase)
      columns = [] of ColumnBase
      {% begin %}
        {% fields = @type.instance_vars.select { |ivar| ivar.annotation(Granite::Column) } %}
        {% raise "Composite primary keys are not yet supported." if fields.select { |ivar| ann = ivar.annotation(Granite::Column); ann && ann[:primary] }.size > 1 %}
        {% for field in fields %}
          {% type = field.type.union? ? field.type.union_types.reject { |t| t == Nil }.first : field.type %}
          {% col_ann = field.annotation(Granite::Column) %}
          {% primary = col_ann && col_ann[:primary] ? col_ann[:primary] : false %}
          # `auto` falls back to the primary flag unless it was set explicitly on the annotation.
          {% auto = col_ann[:auto] == nil ? primary : col_ann[:auto] %}
          {% raise "Primary key '#{field.name}' of '#{@type.name}' must be nilable." if primary && !field.type.nilable? %}
          columns << ColumnInfo({{type}}).new({{field.stringify}}, {{field.type.nilable?}}, {{auto}}, {{primary}}, {{field.default_value}})
        {% end %}
      {% end %}
      columns
    end

And this is what I came up with. Similar to building out routes, I iterate over the instance variables of the current class, find those with a Granite::Column annotation, then check various annotation fields to see if they are set, otherwise using defaults. All the while, compile time errors are generated if a user adds two primary keys or has a non-nilable PK.
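For anyone who wants to try the idea outside of Granite, here is a heavily reduced sketch under my own assumptions: a top-level `Column` annotation, a non-generic `ColumnInfo` record with only three fields, and no primary key validation.

```crystal
# Hypothetical stand-ins for Granite::Column and the ColumnInfo record above.
annotation Column; end

record ColumnInfo, name : String, nilable : Bool, primary : Bool

class Parent
  @[Column(primary: true)]
  property id : Int64?

  @[Column(name: "createdAt")]
  property created_at : Time?

  # No annotation, so this property is not treated as a column.
  property meta : String = ""

  def self.columns : Array(ColumnInfo)
    columns = [] of ColumnInfo
    {% for ivar in @type.instance_vars %}
      {% if ann = ivar.annotation(Column) %}
        columns << ColumnInfo.new(
          {{ ann[:name] || ivar.name.stringify }}, # annotation override, else the property name
          {{ ivar.type.nilable? }},
          {{ ann[:primary] == true }}
        )
      {% end %}
    {% end %}
    columns
  end
end

puts Parent.columns.map(&.name).sort # => ["createdAt", "id"]
```

The name and nilability come straight from the instance variable, while the annotation only supplies what cannot be derived, such as the primary flag or a column name that differs from the property name.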

Here is an example of what the model could look like:

  @[Granite::Model(table: "parents", adapter: "my_db")]
  class Parent < Granite::Base
    @[Granite::Column(primary: true)]
    property id : Int64?

    @[Granite::Column]
    property name : String

    @[Granite::Column(name: "createdAt")]
    property created_at : Time?

    @[Granite::Column(name: "updatedAt")]
    property updated_at : Time?

    # A property that would be read from a request body but _not_ persisted to DB.
    property meta : String
  end

IMHO, much cleaner and not dependent on shard specific macro syntax.

Hope this helps as well.
