Ilya Averyanov
Posted on July 16, 2023
General cosiderations
Testing code that deals with time, timeouts, and timers is hard. The main reason is that getting time and setting timers is often not treated as a public contract, but it actually is.
Often, to test such code, one provides custom significantly reduced timeouts and uses :timer.sleep/1
. This approach has several drawbacks:
- it makes the tests slow
- if we use large timeouts, it makes the tests even slower
- if we use small timeouts, it makes the tests flaky
One of the approaches to testing without sleeps is injecting time-related functions directly, thus making the contract explicitly public. However, this makes the code much more complex and harder to read.
Klotho
(created by the author) takes a different approach. It injects timer-based functions globally.
With Klotho
you do not use time-related functions directly, but instead, you use Klotho
functions that wrap the original ones in production code.
In tests, these functions are replaced with a mock implementation that allows controlling time "flow".
Managing time flow
Klotho
provides the following methods:
-
Klotho.Mock.reset/0
— resets the state of the mocked time flow. The mocked functions start to behave in the same manner as real ones. -
Klotho.Mock.freeze/0
— freezes time. In frozen mode, all calls tomonotonic_time
,system_time
, etc. will return constant value. Timers will be not triggered unless warp is called. The function is idempotent. -
Klotho.Mock.unfreeze/0
— unreezes time. In running (unfrozen) mode, all calls tomonotonic_time
,system_time
, etc. will start produce values increasing with "normal" monotonic time pace. Timers will be triggered according to their schedule. -
Klotho.Mock.warp_by/1
andKlotho.Mock.warp_by/2
— warps time forward. Regardless of the current mode (frozen or running), all the timers that are due to trigger within the warp interval will be triggered just after warp. -
Klotho.Mock.timer_event_history/1
— returns history of all messages (Klotho.Mock.TimerMsg.t()
) sent by triggered timers. Most recent messages are first.
Example
Assume we have a module implementing a simple "timeout map" functionality. It allows setting a timeout for a key and records are automatically removed from the map when the timeout expires.
A possible (and a bit naive) implementation could look like this:
defmodule TimeoutCache do
@moduledoc false
use GenServer
# API
def start_link() do
GenServer.start_link(__MODULE__, [])
end
def set(pid, key, value, timeout) do
GenServer.call(pid, {:set, key, value, timeout})
end
def get(pid, key) do
GenServer.call(pid, {:get, key})
end
# gen_server
def init([]) do
{:ok, %{}}
end
def handle_call({:set, key, value, timeout}, _from, state) do
new_st =
state
|> maybe_delete(key)
|> put_new(key, value, timeout)
{:reply, :ok, new_st}
end
def handle_call({:get, key}, _from, state) do
{:reply, get_value(state, key), state}
end
def handle_info({:timeout, ref, key}, state) do
new_st = maybe_delete_timeout(state, key, ref)
{:noreply, new_st}
end
# private
defp maybe_delete(state, key) do
case state do
%{^key => {_value, ref}} ->
:erlang.cancel_timer(ref)
Map.delete(state, key)
_ ->
state
end
end
defp put_new(state, key, value, timeout) do
ref = :erlang.start_timer(timeout, self(), key)
Map.put(state, key, {value, ref})
end
defp maybe_delete_timeout(state, key, ref) do
case state do
%{^key => {_value, ^ref}} ->
Map.delete(state, key)
_ ->
state
end
end
defp get_value(state, key) do
case state do
%{^key => {value, _ref}} ->
{:ok, value}
_ ->
:not_found
end
end
end
How do we test that keys are actually removed from the map after the timeout expires? A possible test suite could look like this:
defmodule TimeoutCacheTest do
use ExUnit.Case
setup do
{:ok, pid} = TimeoutCache.start_link()
{:ok, pid: pid}
end
test "set and get", %{pid: pid} do
TimeoutCache.set(pid, :foo, :bar, 100)
assert {:ok, :bar} == TimeoutCache.get(pid, :foo)
end
test "get after timeout", %{pid: pid} do
TimeoutCache.set(pid, :foo, :bar, 100)
:timer.sleep(200)
assert :not_found == TimeoutCache.get(pid, :foo)
end
test "renew lifetime", %{pid: pid} do
TimeoutCache.set(pid, :foo, :bar1, 100)
:timer.sleep(50)
TimeoutCache.set(pid, :foo, :bar2, 100)
:timer.sleep(70)
assert {:ok, :bar2} == TimeoutCache.get(pid, :foo)
end
end
This test suite is slow and flaky. It is slow because of :timer.sleep/1
calls. Also, if the test machine is under heavy load, the timeouts may expire later than expected, thus making the tests flaky. On the other hand, if we increase the timeouts, the tests will become even slower.
With Klotho
we may rewrite the implementation as follows:
defmodule TimeoutCache do
...
defp maybe_delete(state, key) do
case state do
%{^key => {_value, ref}} ->
Klotho.cancel_timer(ref)
Map.delete(state, key)
_ ->
state
end
end
defp put_new(state, key, value, timeout) do
ref = Klotho.start_timer(timeout, self(), key)
Map.put(state, key, {value, ref})
end
...
end
We just replaced :erlang.cancel_timer/1
with Klotho.cancel_timer/1
and :erlang.start_timer/3
with Klotho.start_timer/3
. See timeout_cache.erl
. Now we can rewrite the test suite as follows:
defmodule TimeoutCacheTest do
use ExUnit.Case
setup do
{:ok, pid} = TimeoutCache.start_link()
Klotho.Mock.reset()
Klotho.Mock.freeze()
{:ok, pid: pid}
end
test "set and get", %{pid: pid} do
TimeoutCache.set(pid, :foo, :bar, 1000)
assert {:ok, :bar} == TimeoutCache.get(pid, :foo)
end
test "get after timeout", %{pid: pid} do
TimeoutCache.set(pid, :foo, :bar, 1000)
## Timers whose time has passed are triggered in the end of warp
Klotho.Mock.warp_by(2000)
assert :not_found == TimeoutCache.get(pid, :foo)
end
test "renew lifetime", %{pid: pid} do
TimeoutCache.set(pid, :foo, :bar1, 1000)
Klotho.Mock.warp_by(500)
TimeoutCache.set(pid, :foo, :bar2, 1000)
Klotho.Mock.warp_by(700)
assert {:ok, :bar2} == TimeoutCache.get(pid, :foo)
end
end
We may use arbitrary timeouts in tests, just warping the time by the required amount. The code does not use sleeps and runs fast. Also, the tests are not flaky anymore.
See the difference in the test execution time:
$ mix test test/klotho_timeout_cache_test.exs --trace
Klotho.TimeoutCacheTest.Klotho [test/klotho_timeout_cache_test.exs]
* test set and get (1.7ms) [L#44]
* test renew lifetime (0.05ms) [L#55]
* test get after timeout (0.07ms) [L#49]
Klotho.TimeoutCacheTest.Timer [test/klotho_timeout_cache_test.exs]
* test set and get (0.05ms) [L#12]
* test renew lifetime (121.8ms) [L#23]
* test get after timeout (200.9ms) [L#17]
Finished in 0.3 seconds (0.00s async, 0.3s sync)
6 tests, 0 failures
Limitations
Klotho
does not intend to be suitable for any case and provide beam-wide time injection. It is designed to be used in a wide but still limited scope of cases when the code deals with some medium-intensive self-contained time-related logic.
- See the documentation for the full list of supported functions.
- The library may not be suitable for testing logic with some high-frequency events or events with an order of millisecond latency because the time is managed by a
GenServer
process. - The library may not work well if code actively uses other modules with their own and intensive time-related logic.
-
Klotho
cannot be used in tests running parallely.
Posted on July 16, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.