Peter Strøiman
Posted on June 3, 2024
Recently, I decided to get back to OCaml, a very powerful functional language that has also received some great additions since the last version I used.
My approach to development is almost always to practice TDD, not so much for the testing part as for the fast feedback loop it sets up, allowing me to work effectively with code. I looked into the choices of available test tools, and I wasn't really happy with any of them. They all seem to place a strong emphasis on the "test" part, rather than on the efficiency gained from a fast feedback loop.
So I decided to write my own, and I am now ready to present it to the world: Speed.
This is still very early access, and things may change, but I invite everyone interested in TDD and OCaml to check it out.
The name
I had been looking a bit into another test library, Alcotest, and for some weird reason I made an association to amphetamine, a drug mostly known for its illegal recreational use, but also used in the treatment of ADHD. A nickname for amphetamine is speed.
Because the process of test-driven development makes me a more efficient programmer, I found the name to be fitting.
Features
Current features of speed include:
- Structured test suites with nested contexts
- Friendly DSL for building tests decoupled from the internal data structure
- Support for custom setup (though I am searching for a prettier syntax in the DSL)
- Metadata in a child context can be used in setup in a parent context.
- Support for synchronous tests and asynchronous tests using Lwt
  - Technically, these are currently two separate suites, but the execution of both is transparent.
- Assertions library with potential for nice output
- Composable assertions (e.g. Ok 42 |> should @@ be_ok >=> equal_int 42)
- A few PPX rewriters to help with some trivial code
- Test focus (the ability to only run a subset of the tests when working on a new feature)
The following features are planned, but not implemented:
- Customisation of test output (ability to plug in your own reporting module)
- Teardown/cleanup hooks
- Ability to fail on the presence of focused tests (important for CI servers)
- Better messages for assertions, particularly when assertions are composed into larger assertions.
- Multicore support
- Mixing Lwt/Async/Sync code in the same suite.
This is generally developed in parallel with a pet project, and features are added to Speed as I need them. E.g. you can expect to see teardown support added once I start writing integration tests of message publishing.
How It Looks
There are two different DSLs, one based on lists, and one based on effect handlers.
List Based DSL
This constructs the tests using lists of objects.
open Speed.Dsl

let suite = parse [
  context "Feature" [
    test "Some behaviour" (fun _ -> (* ... *) ());
    test "Something else" (fun _ -> ());
    context "Child context" [
      test "More stuff" (fun _ -> ())
    ]
  ]
]
This has some very powerful properties. As your tests are lists, you can use List functions to generate tests, e.g. from data.
context "Email validation" @@
  ([ "jd@example.com", true;
     "jd@example", false;
     "@example.com", false ]
  |> List.map (fun (email, expected) ->
       test (Format.sprintf "Email %s is valid: %b" email expected) (fun _ ->
         expect (Email.is_valid email) @@ equal_bool expected)))
By treating tests as data, you have all the power of OCaml's data manipulation functions available. This also allows you to compose larger test suites from smaller components. E.g. if you have multiple things that should share the same behaviour, that specification could be reused between the suites.
Effect Based DSL
There is an alternate DSL based on OCaml 5 effect handlers. Effect handlers are still treated as experimental in OCaml, so this should probably be avoided ... unless there is a significant benefit. That benefit is readability.
Although the list-based approach is powerful when creating test suites, it isn't always easy to read. I've experimented a lot with setting up ocamlformat, but I never quite liked the output. Each individual test never really stands out, partly because ocamlformat removes blank lines.
When using the effect handler DSL, tests are expressed as function calls. This allows for a syntax more like that of similar tools in other languages, e.g. RSpec from Ruby (which is a direct inspiration for the metadata feature), and tools like Mocha, Jasmine, and Jest in JavaScript.
But more importantly, ocamlformat can be configured to preserve a blank line between tests, providing much better visual separation in the test code.
open Speed.Dsl.Effect.Simple

let suite = root_context "Some feature" (fun _ ->
  describe "in some context" (fun _ ->
    test "Behaviour 1" (fun _ ->
      ()
    );

    test "Behaviour 2" (fun _ ->
      ()
    );
  );

  describe "In another context" (fun _ ->
    test "Behavior 3" (fun _ -> ());
  );
)
Assertions
The library contains a module for writing assertions. Eventually, I will probably extract that into its own library, as I believe that the test runner and verification are two different problems; the only relation between the two is that failed verifications must in some way be communicated back to the test runner.
But right now, for the purpose of faster development, they are placed in the same library.
Expectations
An "expectation" is currently just a function that can verify a value. An expectation that verifies an int value equals 42 can be created like this:
let expectation = equal_int 42
Two functions, expect and should, verify that the actual value matches an expectation; the only difference between them is the order in which they receive the arguments:
expect "foo" (equal_string "foo");
"foo" |> should (equal_string "foo");
(* or *)
expect "foo" @@ equal_string "foo";
"foo" |> should @@ equal_string "foo";
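Under the simplified expectation shape described later (a function returning a result), the relationship between the two functions can be sketched like this. This is a toy model with invented error handling, not Speed's actual implementation:

```ocaml
(* Toy sketch, NOT Speed's actual code: `should` is just `expect` with the
   arguments flipped, which enables the pipeline style `actual |> should exp`. *)
let expect actual expectation =
  match expectation actual with
  | Ok _ -> ()
  | Error msg -> failwith msg

let should expectation actual = expect actual expectation

(* A toy expectation for illustration *)
let equal_string expected actual =
  if String.equal actual expected then Ok actual
  else Error (Printf.sprintf "Expected %S, got %S" expected actual)

let () =
  expect "foo" (equal_string "foo");
  "foo" |> should (equal_string "foo");
  "foo" |> should @@ equal_string "foo"
```

Because `@@` binds tighter than `|>`, the last line parses as `should (equal_string "foo") "foo"`, which is why the pipeline style works.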
In both cases, an optional ~name argument can be added, which will be written to the output:
expect ~name:"HTTP Status" status @@ equal_int 200;
status |> should ~name:"HTTP Status" @@ equal_int 200;
(Screenshot of the assertion output.)
A successful expectation can return a new value, allowing you to compose matchers. E.g. be_ok, which verifies that an ('a, 'b) result has the value Ok x, will pass the actual x along. Expectations can be composed using the fish operator (>=>).
let be_ok_equal_42 = be_ok >=> equal_int 42 in
Ok 42 |> should be_ok_equal_42; (* Success *)
Ok 43 |> should be_ok_equal_42; (* Error *)
Error () |> should be_ok_equal_42; (* Error *)
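A plausible way such composition could work is Kleisli composition over the result type. The following is a self-contained sketch under that assumption, with toy be_ok/equal_int definitions; Speed's actual internals may differ:

```ocaml
(* Sketch: the fish operator as Kleisli composition over result.
   These definitions are illustrative assumptions, not Speed's internals. *)
let ( >=> ) f g x =
  match f x with
  | Ok y -> g y
  | Error e -> Error e

(* Toy expectations for illustration *)
let be_ok = function
  | Ok x -> Ok x
  | Error _ -> Error "Expected an Ok value"

let equal_int expected actual =
  if actual = expected then Ok actual
  else Error "Values differ"

let be_ok_equal_42 = be_ok >=> equal_int 42
(* be_ok_equal_42 (Ok 42) evaluates to Ok 42;
   Ok 43 and Error values produce an Error. *)
```

The key design point is that each expectation returns the (possibly transformed) value in Ok, so the next expectation in the chain receives it as input.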
Writing custom expectations
An expectation is currently just a function that takes the actual value as input and returns a result.

(* Note: this type is not actually defined in code; it is the shape inferred from expect/should *)
type ('a, 'b) expectation =
  'a -> ('b, [ `AssertionErrorWithFormat of Format.formatter -> unit ]) result
The Error value carries a polymorphic variant. The special AssertionErrorWithFormat carries a function that can write a nice error message; in the previous example, the "expected"/"actual" messages originate from the assertion error (as does the output colouring). The should/expect functions are responsible for adding the "Assertion error" heading, along with the name if specified, and the test runner/reporter integrates this into the test output where appropriate.
But as these are polymorphic variants, you can really return any polymorphic variant value you like; it just will not be formatted nicely by the current reporter unless you use AssertionErrorWithFormat.
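Given that inferred shape, a hypothetical custom expectation might look like this (be_non_empty is an invented name for illustration, not part of Speed):

```ocaml
(* Hypothetical custom expectation: verifies a string is non-empty, and
   passes the value along in Ok so it can be composed with other expectations. *)
let be_non_empty actual =
  if String.length actual > 0 then Ok actual
  else
    Error
      (`AssertionErrorWithFormat
         (fun fmt -> Format.fprintf fmt "Expected a non-empty string"))
```

Because the passing value is returned in Ok, such an expectation could be chained with >=> just like the built-in ones.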
NOTE: I expect to change the type of an expectation in order to support better error messages when composing expectations. But I will write that in a new module, keeping the existing module unchanged for a reasonable amount of time, so as not to break existing code.
Record matcher ppx_deriver
My philosophy is that a test should only contain details related to the behaviour the test describes. When verifying a record type, you should just verify the properties relevant to the behaviour described by the test.
There is a ppx deriver that can help with just that. When combined with ppx_import, you can generate matcher functions for record types, with optional parameters for each field.
Let's say you have this in your production code:
(* /lib/user.ml *)
type user = {
  first_name: string;
  last_name: string;
  email: string;
}
You could easily have different tests verifying different parts of a user.
(* /test/user_test.ml *)
open Speed.Dsl.Effect.Simple

type user = [%import: User.user] [@@deriving matcher]

root_context "user" (fun _ ->
  test "Has right email" (fun _ ->
    create_user ()
    |> should @@ match_user ~email:(equal_string "jd@example.com")
  );

  test "Has right name" (fun _ ->
    create_user ()
    |> should @@ match_user
         ~first_name:(equal_string "John")
         ~last_name:(equal_string "Doe")
  )
)
NOTE: In the dune file, you need to use staged_pps instead of pps for the ppx_import/speed.ppx_matcher combination to work.
General setup and shared data
In a child group, you can attach a setup function. The return value of that function is made available to the tests inside that scope. Metadata is specified using the extensible variant Speed.metadata.
Fixtures
A fixture allows you to add a value that will be made available to the tests in the scope. Nested scopes also have access to this value, and can create a new value (of a different type):
open Speed.Dsl.Effect

s.fixture ~setup:(fun _ -> 42) "Value is an int" (fun s ->
  s.test "Test 42" (fun x -> x.subject |> should @@ equal_int 42);

  s.fixture ~setup:(fun x -> Int.to_string x.subject) "Value is a string" (fun s ->
    s.test "Test '42'" (fun x -> x.subject |> should @@ equal_string "42")
  )
)
NOTE: The previous examples used Speed.Dsl.Effect.Simple, which does not need the s value in the nested test builder functions. But that version is incapable of carrying type information from the setup to the tests, so when fixtures are needed, a slightly more verbose syntax is required.
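To see why the typed s value matters, here is a minimal, self-contained model (a simplification I'm assuming for illustration, not Speed's implementation) of how a fixture's setup value can flow into tests with its type intact:

```ocaml
(* Simplified model: a fixture runs its setup and passes a typed context
   record to the body; nested fixtures can derive values of other types. *)
type 'a ctx = { subject : 'a }

let fixture ~setup (body : 'a ctx -> unit) : unit =
  body { subject = setup () }

let () =
  fixture ~setup:(fun () -> 42) (fun x ->
    (* the subject is an int here ... *)
    assert (x.subject = 42);
    fixture ~setup:(fun () -> Int.to_string x.subject) (fun y ->
      (* ... and a string in the nested scope *)
      assert (y.subject = "42")))
```

Each nesting level gets its own typed context, which is exactly what a DSL without an explicit context value cannot express.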
Metadata
The setup function can receive metadata specified on the test, allowing you to have common setup code while each test specifies the specific scenario. Without any helpers, reading the metadata requires a bit of cumbersome pattern matching.
(* Define metadata by extending the metadata type *)
type Speed.Domain.metadata += Username of string

(* The setup function receives a record `{ metadata: Domain.metadata list; ... }` *)
fixture ~setup:(fun x ->
  (* To get the metadata, we have to search the list for the presence of a
     Username constructor. As it's not guaranteed that one exists, the result
     is a `string option` rather than a `string`. *)
  let username =
    x.metadata
    |> List.find_map (function Username x -> Some x | _ -> None)
  in
  ...
)
To simplify this task, Speed contains a ppx rewriter that can generate that code for you:
fixture ~setup:(fun x ->
let username = x.metadata |> [%m Username] in
...
);
(* Or the simpler version that performs get_metadata behind the scenes *)
fixture ~setup:(fun x ->
let username = x |> [%mx Username] in (* [%mx] looks up the x.metadata field itself *)
...
)
Like before, the returned value here is a string option, but you can supply a default value to get a string:
fixture ~setup:(fun x ->
let username = x.metadata |> [%m Username "missing"] in
(* username inferred to be: string *)
...
);
fixture ~setup:(fun x ->
let username = x |> [%mx Username "missing"] in
...
)
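To make the behaviour concrete, here is a plain OCaml sketch of what the generated lookups plausibly do. This is an assumption based on the description above, not the actual ppx output:

```ocaml
(* A stand-in extensible variant, mirroring Speed.Domain.metadata *)
type metadata = ..
type metadata += Username of string

(* Roughly what [%m Username] could expand to: a string option lookup *)
let find_username items =
  List.find_map (function Username u -> Some u | _ -> None) items

(* Roughly what [%m Username "missing"] could expand to: lookup with default *)
let find_username_or default items =
  Option.value ~default (find_username items)
```

The default-value form simply collapses the option, which is why the inferred type changes from string option to string.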
To use this, add speed.ppx_metadata to your dune file:
(test
 (preprocess (pps speed.ppx_metadata ...)))
A larger example
This example shows the kind of scenario these features were intended to support. An application using the Dream HTTP framework handles the route POST /registration. In this test, the application logic has been replaced with a test double that accepts one hardcoded email address and reports another as a duplicate.
So the general setup creates the request, and the nested contexts define the different request bodies for the two cases handled by the mocked domain logic, AND two cases that should be handled by the HTTP layer's request validation. In each group, there are separate specifications for the generated body as well as the response status.
NOTE: At the time of writing, Lwt promises are not handled in setup code, which results in the subject being a response promise instead of a response, thus the liberal use of infix operators. This should eventually be handled by Speed, eliminating the excessive use of custom operators.
type Speed.Domain.metadata += Body of string;;
root_context "POST /registration" (fun s ->
s.setup
(fun x ->
let body = x |> [%mx Body ""] in
Dream.request ~method_:`POST ~target:"/registration"
~headers:["content-type", "application/x-www-form-urlencoded"]
body
|> handle_request
)
(fun s ->
s.context ~metadata:[Body "email=john.doe%40example.com"] "Email is valid" (fun s ->
s.test "Should generate a status 200" (fun { subject; _ } ->
let+ status = subject >|= Dream.status >|= Dream.status_to_int in
status |> should @@ equal_int 200
);
s.test "Should return a 'please check your email' reply" (fun { subject; _ } ->
let+ body = subject >>= Dream.body in
body |> should @@ contain "Please check your email"
)
);
s.context ~metadata:[Body "email=jane.doe%40example.com"] "Email is a duplicate"
(fun s ->
s.test "Should generate a status 200" (fun { subject; _ } ->
let+ status = subject >|= Dream.status >|= Dream.status_to_int in
status |> should @@ equal_int 200
);
    s.test "Should return a duplicate email message" (fun { subject; _ } ->
let+ body = subject >>= Dream.body in
body |> should @@ contain "duplicate"
)
);
s.context ~metadata:[Body "email=invalid"] "Email is not a valid email" (fun s ->
(* ... *)
);
s.context ~metadata:[Body "bad_name=jd%40example.com"] "Body does not contain an email" (fun s ->
(* ... *)
)
)
)
This doesn't read as well as I would like, and at the time of writing, I am contemplating which language features could be used to make this read better, or whether I possibly need another ppx rewriter.
An alternative syntax uses the with_metadata function, but ocamlformat may still squash it all into one line.
with_metadata [Body "bad_name=jd%40example.com"] s.context "Body does not contain an email" (fun s ->
(* ... *)
)
Can I use this now?
Yes you can! (with OCaml 5)
At the time of writing, the best option is to get the sources and build directly:
> git clone https://github.com/stroiman/opam-speed.git
> cd opam-speed
> opam install . --deps-only
> make install
It is published to the official opam repository, but releasing new packages involves a manual approval flow, which takes time, which in turn makes me reluctant to create releases frequently.
I don't commit non-working changes to master (but I am human, not an LLM, so I might make mistakes)
Remember, this will pin the package to the local path, so be sure to opam unpin speed if you later want to use the distributed package instead.
Be aware that this is still at a very early stage and may not follow standard OCaml practices; e.g. interfaces are not defined for the modules, thus not hiding the parts intended for internal use. Documentation in the code is virtually non-existent; the only real documentation is in the README files, and that is not guaranteed to be up to date.
The best documentation is really to read the test code ;)
Internal structures may change, but I do intend to keep the essential modules stable: that means the different DSLs, the root test runner, and the assertions module. Whenever possible, breaking changes will be developed in a new module, in order not to break existing usages.
E.g. when the fixture feature was added, the effect handler based DSL needed a significant rewrite, including a breaking change, as the original version was incapable of carrying the type information from the fixture to child suites and tests. The new version was originally called Effect_dsl_2, letting all the old code still work without modifications. I have since renamed the modules, but both still exist, as the original is a bit less verbose.
Can I help?
If you find this valuable, and have ideas how to improve this, please reach out :)
Also, I am not the most experienced OCaml programmer. Am I missing some idiomatic OCaml patterns? Or are there bad OCaml practices in the code?