Utilizing Macros & Annotations in a Web Framework
Blacksmoke16
Posted on April 4, 2019
Macros, and especially annotations, are among the least documented features of Crystal. Because of this, I wanted to take the time to write a little post about how I used annotations and macros to build my web framework, focusing on how they are used, not necessarily on the framework itself.
A few months ago I started on yet another web framework project. However, I wanted to take a different approach, taking into consideration the experience I have from my other side projects as well as from work. The result was Athena. Unlike other frameworks that rely mostly on macros to define routes, Athena uses annotations together with macros. This allows for code that is very readable, with route actions being just class methods. This has a few benefits, such as:
- Easier to document - Are just methods so doc comments work just fine.
- Easier to test - Are just methods again, can just call your actions directly to test.
- Easier to read/understand - There is no special syntax needed (other than learning the annotations)
- Inheritance.
Annotations are used heavily, as the main mechanism to define routes, callbacks, and even CLI commands. I also used them in my serialization/validation shard, CrSerializer, as an alternative way to manage property serialization/validation of models.
Macros
Macros are defined as (from the git book), "Methods that receive AST nodes at compile-time and produce code that is pasted into a program." Or in other words, code that you write that expands into Crystal code at compile time and gets inserted into the program. Macros are almost a language on their own; quite powerful and flexible. Macros allow you to reduce boilerplate by writing code that can define classes, methods, etc.
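As a minimal sketch of the "reduce boilerplate" point, here is a macro that generates one method per name it is given. The names define_labels and Config are made up for illustration and are not part of Athena.

```crystal
# Hypothetical macro: defines a `<name>_label` method for each name given.
macro define_labels(*names)
  {% for name in names %}
    def {{name.id}}_label : String
      {{name.stringify}}
    end
  {% end %}
end

class Config
  # Expands at compile time into `host_label` and `port_label` methods.
  define_labels host, port
end

puts Config.new.host_label
puts Config.new.port_label
```

The two methods never appear in the source; the compiler pastes them into Config when the macro call is expanded.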
Annotations
Annotations are one of the least documented but most powerful features in Crystal. In short, annotations allow types (classes, structs, etc.), methods, and instance variables (properties) to have information added to them that is available at compile time and readable via macros.
Annotations are defined similarly to a class or struct, but using the annotation keyword.
annotation Get; end
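To show the full round trip of defining an annotation, applying it, and reading it back from a macro, here is a small sketch. The class and the annotated_routes method are illustrative only; Athena's real code is more involved.

```crystal
annotation Get; end

class MyController
  @[Get(path: "/me")]
  def get_me : String
    "Jim"
  end

  # Collects the name and path of every method annotated with `Get`.
  # The loop runs at compile time; only the `routes[...] = ...` lines
  # end up in the compiled method body.
  def annotated_routes : Hash(String, String)
    routes = {} of String => String
    {% for m in @type.methods %}
      {% if ann = m.annotation(Get) %}
        routes[{{m.name.stringify}}] = {{ann[:path]}}
      {% end %}
    {% end %}
    routes
  end
end

puts MyController.new.annotated_routes
```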
I'm going to define some scenarios I wanted to solve when setting out to create Athena, then walk through my thinking and how I used macros and annotations to solve those problems.
1. Develop a way of dynamically registering routes based on a specific annotation, as opposed to using a standard get "/path" do { } macro, for example.
2. Ability to easily validate properties on ORM models by applying annotations to each property, instead of using a validate xxx macro.
3. Similar to 1, but to have self-registering CLI commands that get automatically exposed to an OptionParser interface.
Registering Routes
In order to solve this first obstacle, I had to think: "How can I get an array of controllers with routes defined?"
My ideal solution would have looked like:
@[Athena::Routing::Controller]
class MyController
  @[Athena::Routing::Get(path: "/me")]
  def self.get_me : String
    "Jim"
  end
  @[Athena::Routing::Post(path: "/user")]
  def self.new_me(body : String) : String
    body
  end
end
{% for controller in Athena::Routing::Controller.types %}
  # Iterate over all the classes with the `Controller` annotation
{% end %}
Thus, any class that has the Athena::Routing::Controller annotation would be known to be a controller, and its class methods would be annotated to tie a given route to the code that should be executed when the route is matched. However, this is not currently possible. See #7274. So I had to think of a plan B.
The Crystal generated API docs were super helpful in this regard, especially TypeNode. A TypeNode represents a type in a program, such as String or MyController. Reading through the available methods, I noticed there was an all_subclasses method, which returns an array of all the subclasses of that type. Ah ha! I then had the idea, similar to Rails and its ApplicationController, that I could make my own class, have controllers inherit from it, and use all_subclasses to iterate over all controllers at compile time. Great!
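A tiny sketch of what all_subclasses gives you, using made-up controller names rather than Athena's real base class:

```crystal
abstract class BaseController; end

class UsersController < BaseController; end

class PostsController < BaseController; end

# Indirect subclasses are included as well.
class AdminUsersController < UsersController; end

# The macro expression is replaced at compile time with an array literal
# holding the name of every (direct or indirect) subclass.
def controller_names : Array(String)
  {{ BaseController.all_subclasses.map(&.name.stringify) }}
end

puts controller_names.sort
```

I sort the result because I would not rely on the order in which the compiler returns the subclasses.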
This discovery led to the syntax that is used today:
class MyController < Athena::Routing::Controller
  @[Athena::Routing::Get(path: "/me")]
  def self.get_me : String
    "Jim"
  end
  @[Athena::Routing::Post(path: "/user")]
  def self.new_me(body : String) : String
    body
  end
end
However, there was a new problem: where do I do this? I turned my attention to the various routers available, such as Amber Router. In the examples, each router required the user to define a tree, add routes to it, and find those routes, all in the same file. The key problem was that I needed a way for the user to not have to know how to do any of that. I needed to hide the tree creation, the adding of routes, and the resolving of routes behind the scenes.
I went back to the Crystal API docs, specifically the HTTP::Handler module. This module, when included, requires a class to define a call method that will be executed on each HTTP request. Also, since the handler is nothing more than a class, I thought I could add an initialize method that would run once when newing up the handler for the HTTP::Server.
Perfect! A simplified, cut-down version of this ultimately ended up looking like:
class RouteHandler < Athena::Routing::Handlers::Handler
  @routes : Amber::Router::RouteSet(Action) = Amber::Router::RouteSet(Action).new
  def initialize
    {% for c in Athena::Routing::Controller.all_subclasses %}
      {% methods = c.class.methods.select { |m| m.annotation(Get) || m.annotation(Post) || m.annotation(Put) || m.annotation(Delete) } %}
      {% for m in methods %}
        {% if d = m.annotation(Get) %}
          {% method = "GET" %}
          {% route_def = d %}
        {% elsif d = m.annotation(Post) %}
          {% method = "POST" %}
          {% route_def = d %}
        {% end %}
        # (Put and Delete would be handled the same way.)
        # Action is a record that contains data about the action.
        # Such as params, the action to execute, path, controller etc.
        {% action = xxx %}
        # The HTTP method is prefixed to the path, e.g. "/GET/me".
        @routes.add {{"/" + method + route_def[:path]}}, {{action}}
      {% end %}
    {% end %}
  end
end
I first defined an instance variable, routes, which is initialized to a new instance of my router tree. I then defined an initialize method. Within this method, I used a macro loop to iterate over the subclasses of the base controller class. I then used .select to get an array of actions defined on each controller, specifically looking for class methods annotated with one of the HTTP verb annotations. This allows a controller to have private class methods that are not tied to a route. I then set method and route_def based on which annotation was matched. Finally, I register the route with the router.
Annotation fields can be accessed using the [] method, either using a String or Symbol for named fields, or an Int32 for index-based fields.
Since this is all macro code, the actual code that gets added to the program at compile time would look like:
class RouteHandler < Athena::Routing::Handlers::Handler
  @routes : Amber::Router::RouteSet(Action) = Amber::Router::RouteSet(Action).new
  def initialize
    @routes.add "/GET/me", RouteAction.new("get_me", xxx)
    @routes.add "/POST/user", RouteAction.new("new_me", xxx)
  end
end
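To make the earlier note about reading annotation fields with [] concrete, here is a short sketch. The Route annotation and the Demo class are hypothetical; they only exist to show positional and named access side by side.

```crystal
annotation Route; end

class Demo
  # One positional field ("GET") and one named field (path).
  @[Route("GET", path: "/me")]
  def me : Nil
  end

  def route_fields : Array(String)
    {% begin %}
      {% ann = @type.methods.find { |m| m.annotation(Route) }.annotation(Route) %}
      # Int32 index for positional fields, Symbol or String for named fields.
      [{{ann[0]}}, {{ann[:path]}}, {{ann["path"]}}]
    {% end %}
  end
end

puts Demo.new.route_fields
```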
Cool! Now I could define the call method to handle routing. Again, a simplified, trimmed-down version of this ended up looking like:
def call(ctx : HTTP::Server::Context)
  search_key = '/' + ctx.request.method + ctx.request.path
  route = @routes.find search_key
  if route.found?
    action = route.payload.not_nil!
  else
    raise Athena::Routing::Exceptions::NotFoundException.new "No route found for '#{ctx.request.method} #{ctx.request.path}'"
  end
  call_next ctx
end
This calls the other handlers I defined for CORS and for executing the actual action tied to the given route. However, that doesn't involve any other notable uses of annotations/macros, so I'm just going to skip it. If you're interested in learning more about how Athena is set up, feel free to message me on the Crystal Gitter.
This portion is now complete and could be used similarly to:
server = HTTP::Server.new([RouteHandler.new])
server.bind_tcp "127.0.0.1", 8080
server.listen
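For anyone who wants to try the whole idea without pulling in Athena or Amber Router, here is a self-contained sketch. It is an assumption-laden stand-in: a plain Hash replaces the router tree, only Get is handled, and the names Controller and MiniRouteHandler are made up for this example.

```crystal
require "http/server"

annotation Get; end

# Stand-in for Athena's base controller class.
abstract class Controller; end

class MyController < Controller
  @[Get(path: "/me")]
  def self.get_me : String
    "Jim"
  end
end

class MiniRouteHandler
  include HTTP::Handler

  # A Hash keyed by "/METHOD/path" stands in for the router tree.
  @routes = {} of String => Proc(String)

  def initialize
    # Runs at compile time; only the `@routes[...] = ...` lines remain.
    {% for c in Controller.all_subclasses %}
      {% for m in c.class.methods %}
        {% if ann = m.annotation(Get) %}
          @routes[{{"/GET" + ann[:path]}}] = ->{ {{c.id}}.{{m.name.id}} }
        {% end %}
      {% end %}
    {% end %}
  end

  def call(context : HTTP::Server::Context)
    key = "/#{context.request.method}#{context.request.path}"
    if action = @routes[key]?
      context.response.print action.call
    else
      # No match: fall through to the next handler (a 404 by default).
      call_next context
    end
  end
end

# Usage, mirroring the snippet above:
# server = HTTP::Server.new([MiniRouteHandler.new])
# server.bind_tcp "127.0.0.1", 8080
# server.listen
```

The user of the handler never touches the route table; adding a new annotated class method to any Controller subclass is enough for it to be registered.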
Property Validation
The next challenge I wanted to solve was validations, also following an annotation-based approach. The idea was that you could add annotations to a property, then, on a POST endpoint for example, run those validations. If the validations pass, an instance of your model is passed to the route action; otherwise a 400 is returned.
I wanted to have this built on top of JSON::Serializable, so that validations would work out of the box for any class/struct using that as its serialization/deserialization method.
It sounds simple: "Just create some validation annotations, add a validate method that iterates over those annotations like you did for the routes, and there you go." However, it was not that simple :P
Originally I came up with the syntax of:
@[CrSerializer::Assertions(greater_than_or_equal: 0, nil: false)]
property age : Int32?
Then, I could read the fields off the annotation, and run corresponding logic to run the assertions.
However, it quickly became apparent that this approach has some flaws.
- The line quickly gets long if there are a lot of assertions.
- There is no easy way to force required fields, such as having to supply both min AND max.
- It's not expandable. If a user wanted to add their own assertion, I could not possibly know the keys they would define, let alone what to do with them.
So after some thinking I pivoted in favor of this syntax:
@[Assert::NotNil]
@[Assert::GreaterThanOrEqual(value: 0)]
property age : Int32?
The benefits of this:
- Easier to read, as the lines do not get as long
- More expandable
- Easier to see related fields
But of course there were some challenges with this approach. There is currently no way to just get an array of the annotations applied to a type. Each annotation has to be read by its name. I now needed a way to know the name of each assertion annotation, what fields to read off of it, and a way to tie a given annotation to some specific logic.
I worked around the first two problems by knowing that constants are available at compile time. I defined a constant hash, mapping annotation names to an array of fields that would be read off of it.
ASSERTIONS = {
  Assert::Email => [:mode],
  Assert::IP => [:version],
  Assert::Uuid => [:versions, :variants, :strict],
  Assert::Url => [:protocols, :relative_protocol],
  Assert::RegexMatch => [:pattern, :match],
  ...
}
Since constants are available at compile time, it is possible to use them in a macro loop; which I did to iterate over the name/fields of each assertion.
{% for ivar in @type.instance_vars %}
  {% for t, v in CrSerializer::Assertions::ASSERTIONS %}
    {% ann = ivar.annotation(t.resolve) %}
    {% if ann %}
      assertions << {{t.resolve.name.split("::").last.id}}Assertion({{ivar.type.id}}).new({{ivar.stringify}},{{ann[:message]}},{{ivar.id}},{{v.select { |fi| ann[fi] != nil }.map { |f| %(#{f.id}: #{ann[f]}) }.join(',').id}})
    {% end %}
  {% end %}
{% end %}
This example introduces a new macro concept, @type. When used within a macro, @type represents the current scope or type as a TypeNode. In my case, @type would represent the class that included CrSerializer. I used @type.instance_vars to iterate over each instance variable, read the annotations from it, and add the assertions if any exist. I also use it to add the type and name of each instance variable to the assertion instantiation.
NOTE: @type.instance_vars currently only works in the context of an instance/class method. See #7504.
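Here is a much smaller sketch of the same pattern: iterate @type.instance_vars inside a method, read one annotation per instance variable, and emit a check only for the variables that carry it. MaxLength, User, and errors are hypothetical names; CrSerializer's real assertions are class based, as described next.

```crystal
annotation MaxLength; end

class User
  @[MaxLength(value: 5)]
  property name : String

  # No annotation, so no check is generated for this property.
  property nickname : String

  def initialize(@name, @nickname); end

  def errors : Array(String)
    errors = [] of String
    {% for ivar in @type.instance_vars %}
      {% if ann = ivar.annotation(MaxLength) %}
        if @{{ivar.name}}.size > {{ann[:value]}}
          errors << "'{{ivar.name}}' is longer than {{ann[:value]}} characters"
        end
      {% end %}
    {% end %}
    errors
  end
end

puts User.new("Bartholomew", "Bart").errors
```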
Now that I had a way to handle the first two issues, I still needed a way to tie a given assertion to the logic specific to that assertion. To do this, I used the key of the ASSERTIONS hash, plus Assertion, as the class name to instantiate a new class of that type. For example, the logic for Assert::NotNil would be handled in a class called NotNilAssertion, which looks like:
class NotNilAssertion(ActualValueType) < Assertion
  @message : String = "'{{field}}' should not be null"
  def initialize(field : String, message : String?, @actual : ActualValueType)
    super field, message
  end
  def valid? : Bool
    @actual.nil? == false
  end
end
Each assertion class has instance variables that map to the fields in the array from the ASSERTIONS hash. I use a macro to iterate over those fields, find the ones that are set, then new up an instance of that assertion with those values. I added this to a validate method that gets added to the type when including the CrSerializer module. Again, since this is all macro code, the actual code that gets inserted into the program, based on the example above, would look like:
def validate : Nil
  assertions = [] of CrSerializer::Assertions::Assertion
  assertions << NotNilAssertion(Int32?).new("age", nil, age)
  assertions << GreaterThanOrEqualAssertion(Int32?).new("age", nil, age, value: 0)
  @validator = CrSerializer::Validator.new assertions
end
But wait, if ASSERTIONS is a constant, how would this help with allowing users to add their own assertions?
Using a macro, I'm able to define the annotation with the given name and add entries to the hash at compile time, so that the hash contains the standard assertions plus any assertions defined by the user.
macro register_assertion(name, fields)
  module CrSerializer::Assertions
    annotation {{name.id}}; end
    {% CrSerializer::Assertions::ASSERTIONS[name] = fields %}
  end
end
The user would use this like register_assertion Assert::MyCustom, [:value, :high_is_good], and of course have a MyCustomAssertion class to handle the logic.
@[Assert::MyCustom(value: 100, high_is_good: true)]
property age : Int32
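The trick behind register_assertion, mutating a constant's hash literal from macro code, can be seen in isolation with a sketch like the one below. HANDLERS, register, and handler_names are made-up names, and string keys are used instead of annotation names to keep it small.

```crystal
# A constant whose hash literal is visible to macros at compile time.
HANDLERS = {"builtin" => "BuiltinHandler"}

# Adds an entry to the hash literal while the program is being compiled.
macro register(name, klass)
  {% HANDLERS[name] = klass %}
end

register "custom", "CustomHandler"

# Expanded when the method is first used, after `register` has run,
# so the macro sees both entries.
def handler_names : Array(String)
  {{ HANDLERS.keys }}
end

puts handler_names
```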
Auto Registering CLI Commands
This is quite similar to the registering routes problem, but I'll include it as another example of what macros can do.
struct Registry
  macro finished
    # Always append `of ...` so the literal is also valid when there are no commands.
    class_getter commands : Array(Athena::Cli::Command.class) = {{Athena::Cli::Command.subclasses}} of Athena::Cli::Command.class
  end
end
This defines a class getter whose value is an array of the classes that inherit from the parent Command class. The value of the array is built at compile time by simply calling {{Athena::Cli::Command.subclasses}}, which returns an array of classes. This has to be done within a finished macro, so that all the types are known, which prevents type mismatches.
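A standalone sketch of the registry idea follows. It uses a plain abstract class named Command and two empty commands instead of Athena's command structs, so treat the names as placeholders.

```crystal
abstract class Command; end

class Registry
  # The `finished` hook runs after all types have been parsed, so commands
  # defined further down in the program are still picked up.
  macro finished
    class_getter commands : Array(Command.class) = {{Command.subclasses}} of Command.class
  end
end

class HelloCommand < Command; end

class GoodbyeCommand < Command; end

puts Registry.commands.map(&.name).sort
```

Note that both commands are declared after Registry and are still registered.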
I then used a macro that inserts an OptionParser that acts as an interface to list/run commands:
macro register_commands
  OptionParser.parse! do |parser|
    parser.banner = "Usage: YOUR_BINARY [arguments]"
    parser.on("-h", "--help", "Show this help") { puts parser; exit }
    parser.on("-l", "--list", "List available commands") { puts Athena::Cli::Registry.to_s; exit }
    parser.on("-c NAME", "--command=NAME", "Run a command with the given name") do |name|
      Athena::Cli::Registry.find(name).command.call ARGV
      exit
    end
  end
end
struct MigrateEventsCommand < Athena::Cli::Command
  self.command_name = "migrate:events"
  self.description = "Migrates legacy events for a given customer"
  def self.execute(customer_id : Int32, event_ids : Array(Int64)) : Nil
    # Do stuff for the migration
    # Params are converted from strings to their expected types
  end
end
All that has to be done to use the commands is to call the register_commands macro in your main app file and create your command struct inheriting from the base command struct. It will then get picked up by Athena::Cli::Command.subclasses and registered at compile time.
Ultimately it could then be used like ./MyApp -c migrate:events --customer_id=83726 --event_ids=1,2,3,4,5, which would search the registry for the command and pass the args to it if it was defined.
Addendums
I didn't really know what to cover, as it's kind of hard to give examples out of the blue. I'll leave this section open for future additions. If there is something specific you would like to see an example of, or explained further feel free to ask.
ORM
@jwoertink made me think of another example I can walk through to go over the idea of "a reason I'd use them." I have been working on an annotation-based version of Granite, mostly for the lols, that I can give some examples from.
Annotations require a different view of how to structure an application compared to simply using normal macros. We are all aware of the macro syntax ORMs use to define columns, i.e. field name : String. Behind the scenes, the macro would add that field's name, type, and any options supplied to a constant so that at a later stage, once the inherited and finished macro hooks run, the constant could be used to create properties for those fields, applying the proper types/options to the property definition.
Begin Personal Opinion
However, in my opinion, this is a bit convoluted and is a good example of the type of situation where annotations could be useful. The main point being, what advantage does using field name : String get you over property name : String and then applying an annotation? The latter allows you to have the same syntax as every other class in your app: inheritance works, properties can be documented, and third-party annotations can be easily applied (like JSON::Field). Annotations would allow you to remove the extra unnecessary steps of going from macro to constant to property, instead going straight to property.
End Personal Opinion
However, when taking an annotation-based approach, you have to look at it from a different point of view. In the case of an ORM, the main challenge becomes "how can I pass column data around without adding everything to a mutable constant?" My solution to this is using records and building out, at compile time, an immutable array of objects that represent the columns in a model. These objects would contain data like the name of the column, the type, whether it's the PK, whether it's nullable, etc. The type, name, and nullability can all come directly from an instance variable and its TypeNode without touching annotations. Annotations come into play when you want to attach non-autogeneratable data to a column, aka property. Annotations allow fields/values to be defined on the property that can also be read at compile time when building out our array of records.
private record ColumnInfo(T) < ColumnBase, name : String, nilable : Bool, auto : Bool = false, primary : Bool = false, default : T? = nil, type : T.class = T
protected def self.columns : Array(ColumnBase)
  columns = [] of ColumnBase
  {% begin %}
    {% fields = @type.instance_vars.select { |ivar| ivar.annotation(Granite::Column) } %}
    {% raise "Composite primary keys are not yet supported." if fields.select { |ivar| ann = ivar.annotation(Granite::Column); ann && ann[:primary] }.size > 1 %}
    {% for field in fields %}
      {% type = field.type.union? ? field.type.union_types.reject { |t| t == Nil }.first : field.type %}
      {% col_ann = field.annotation(Granite::Column) %}
      {% auto = col_ann && col_ann[:auto] ? col_ann[:auto] : false %}
      {% primary = col_ann && col_ann[:primary] ? col_ann[:primary] : false %}
      # Default auto to true for a primary key, but keep an explicitly supplied value.
      {% auto = auto || (col_ann[:auto] == nil && primary) %}
      {% raise "Primary key '#{field.name}' of '#{@type.name}' must be nilable." if primary && !field.type.nilable? %}
      columns << ColumnInfo({{type}}).new({{field.stringify}}, {{field.type.nilable?}}, {{auto}}, {{primary}}, {{field.default_value}})
    {% end %}
  {% end %}
  columns
end
And this is what I came up with. Similar to building out routes, I iterate over the instance variables of the current class, find those with a Granite::Column annotation, then check various annotation fields to see if they are set, otherwise using defaults. All the while generating compile-time errors if a user adds two primary keys or has a non-nilable PK.
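If you want to play with the idea outside of Granite, here is a stripped-down sketch that compiles on its own. Column, ColumnInfo, and Parent are simplified stand-ins that I made up for this example; there is no generic type, no auto handling, and no compile-time errors.

```crystal
annotation Column; end

# Immutable description of one column, built at compile time.
record ColumnInfo, name : String, nilable : Bool, primary : Bool

class Parent
  @[Column(primary: true)]
  property id : Int64?

  @[Column(name: "createdAt")]
  property created_at : Time?

  # Not annotated, so it is not treated as a column.
  property meta : String = ""

  def self.columns : Array(ColumnInfo)
    columns = [] of ColumnInfo
    {% for ivar in @type.instance_vars %}
      {% if ann = ivar.annotation(Column) %}
        columns << ColumnInfo.new(
          {{ann[:name] || ivar.name.stringify}}, # annotation value wins over the property name
          {{ivar.type.nilable?}},                # taken from the type, no annotation needed
          {{ann[:primary] == true}}
        )
      {% end %}
    {% end %}
    columns
  end
end

Parent.columns.each { |column| puts column }
```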
An example of what the model could look like.
@[Granite::Model(table: "parents", adapter: "my_db")]
class Parent < Granite::Base
  @[Granite::Column(primary: true)]
  property id : Int64?
  @[Granite::Column]
  property name : String
  @[Granite::Column(name: "createdAt")]
  property created_at : Time?
  @[Granite::Column(name: "updatedAt")]
  property updated_at : Time?
  # A property that would be read from a request body but _not_ persisted to DB.
  property meta : String
end
IMHO, much cleaner and not dependent on shard specific macro syntax.
Hope this helps as well.