Elixir, Ecto, OTP? The beginner's view, Part I.

mpevec9

Milan Pevec

Posted on June 24, 2020

In my previous post, "Elixir, Weird Syntax, Is It Really?", I started talking about Elixir syntax as a total outsider. We touched on the configuration for Ecto and its syntax. Let's continue here with my understanding of how Ecto integrates with Elixir.

The Ecto getting-started guide says that when we generate an Elixir application with mix, we need to pass the --sup option:

mix new our_app --sup

The --sup option ensures that our application has a supervision tree, which is mandatory when using Ecto. For a second, let's forget about these two keywords.

The starting point for using Ecto is the Ecto Repository. Without going into too much detail, we can see it as the heart of Ecto, a proxy that communicates with the database. After creating it, we need to do "the last step" of its setup. The guide says:

"The final piece of configuration is to setup the repository as a supervisor within the application's supervision tree... This piece of configuration will start the Ecto process which receives and executes our application's queries. Without it, we wouldn't be able to query the database at all!"

OK, that makes some sense. First, we created our application with support for a supervision tree, and then we configured our Ecto repository as a supervisor in the application's supervision tree.

Let's take a moment to pause and write down all the keywords (in no particular order):

supervision tree, supervisor, process, application

What's that all about? How do I understand these keywords without any background in Elixir?

The answer is hidden in the OTP. Let's go through its basic concepts to better understand how all things work together and then to understand how Ecto is configured/integrated in our applications. That's the goal of this blog, that's the mission!

OTP

One of the many benefits of Elixir is the underlying Erlang runtime, which gives us access to OTP (Open Telecom Platform).

Wait, Telecom what? The initials are not so important; let's just say that when we talk about OTP, we talk about a set of libraries and tools such as Supervisor, GenServer, Application, Agent, Task, and many more. Let's try the impossible and use a couple of them in one meaningful sentence:

Using OTP, we can organize our software into lightweight execution units called processes, which can be observed and restarted on failure by special processes called supervisors, which can in turn be supervised by other supervisors, all of which together form a supervision tree.

Interesting...

Process

Processes are the basic game piece. They are very lightweight and managed by the Erlang VM, so don't confuse them with OS processes. You can have thousands of them, as many as you need. If you want to hold on to state, you spawn a process; if you want concurrency, you spawn a process, and so on.

Let's spawn a process in iex and give it a function to run:

pid = spawn(fn -> IO.puts("hello world") end)

hello world
#PID<0.168.0>

With the pid variable we capture the process id for tracking purposes:

Process.alive?(pid)

false

This means we give the work (a function) to the process. It does the work and then it dies.
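To watch that "does the work and then dies" lifecycle without guessing at timing, here is a minimal sketch that spawns a monitored process and waits for the notification the VM sends when it exits. The module name ShortLived and the one-second timeout are my own assumptions for illustration, not something from the guide.

```elixir
defmodule ShortLived do
  # Spawns a monitored process and waits until the VM reports that it is gone.
  def run do
    # spawn_monitor/1 returns the pid and a monitor reference in one step
    {pid, ref} = spawn_monitor(fn -> :work_done end)

    receive do
      # The :DOWN message arrives only after the process has exited
      {:DOWN, ^ref, :process, ^pid, reason} ->
        {reason, Process.alive?(pid)}
    after
      1_000 -> :timeout
    end
  end
end

IO.inspect(ShortLived.run())
# prints {:normal, false}
```

The :normal reason tells us the function simply returned; a crash would show up here as a different reason.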

Process and the message

The most intuitive way to communicate between processes is with messages. In order to read messages, each process has a single mailbox - everyone can send a message to it but only the owner can read from its mailbox.

Using pattern matching you can decide which messages to react to and, by the nature of pattern matching itself, in which order.
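A small sketch of that idea, with the current process sending messages to its own mailbox. The module name Mailbox and the :low/:high tags are made up for this illustration. The first receive only matches {:high, _}, so the older {:low, _} message stays in the mailbox until a later receive asks for it.

```elixir
defmodule Mailbox do
  def demo do
    me = self()
    send(me, {:low, "first sent"})
    send(me, {:high, "second sent"})

    # Only {:high, _} matches here, so the older {:low, _} message is skipped
    first =
      receive do
        {:high, text} -> text
      after
        1_000 -> :none
      end

    # The skipped message is still waiting in the mailbox
    second =
      receive do
        {:low, text} -> text
      after
        1_000 -> :none
      end

    {first, second}
  end
end

IO.inspect(Mailbox.demo())
# prints {"second sent", "first sent"}
```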

Let's take a look at an example. Let's say we want to design a process that will:

  • hold the state of all greetings (like "Hello" etc.);
  • receive a message with a new greeting that will be put into the current state;
  • receive a message that will make our process send its current state.

We will communicate with this process from iex (which is a process itself, with its own mailbox!).

defmodule Greetings.Process do
  def start do
    spawn(__MODULE__, :loop, [%{}])
  end

  def loop(state) do
    receive do
      {:add, greeting_id, greeting} ->
        new_state = Map.put(state, greeting_id, greeting )
        loop(new_state)
      {:greetings, destination_pid} ->
        send(destination_pid, state)
        loop(state)
    end
  end
end

The code is fairly self-explanatory. In the start function, we spawn a process and hand it the loop function with an empty map as its argument. The map represents our initial, empty state.

In the loop function we handle two types of messages by using pattern matching:

  • one for adding a new greeting into our state;
  • one for sending the current state (also as a message!) to the process identified by destination_pid.

By calling loop again after each message, we keep our process, and hence the state, alive.

Let's try it! In iex we first spawn the process and store its pid, then send it two greetings:

iex(42)> pid = Greetings.Process.start
#PID<0.293.0>

iex(43)> send(pid, {:add, 1, "Hello"})
{:add, 1, "Hello"}

iex(44)> send(pid, {:add, 2, "Bon jour"})
{:add, 2, "Bon jour"}

Lastly, we send a message that triggers our process to send us back the current state:

iex(45)> send(pid, {:greetings, self()})
{:greetings, #PID<0.140.0>}

self() returns the pid of the iex process, so we are telling our process to send its state to us, to the iex mailbox.
Now that we know that, we can ask iex to flush its mailbox:

iex(47)> flush()
%{1 => "Hello", 2 => "Bon jour"}
:ok

Great. Feels very natural and clear.
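Outside of iex there is no flush helper, so the caller has to receive the reply itself. The following is a sketch of how the same loop could be wrapped in a small client API. The module name Greetings.SyncProcess, the reference tag added to the messages, and the one-second timeout are my assumptions, not part of the original example.

```elixir
defmodule Greetings.SyncProcess do
  def start do
    spawn(__MODULE__, :loop, [%{}])
  end

  # Fire-and-forget: we only send the message
  def add(pid, greeting_id, greeting) do
    send(pid, {:add, greeting_id, greeting})
    :ok
  end

  # Request/reply: send our own pid plus a unique reference, then wait
  def list(pid, timeout \\ 1_000) do
    ref = make_ref()
    send(pid, {:greetings, self(), ref})

    receive do
      # Matching on ^ref ensures we only accept the reply to this request
      {^ref, state} -> {:ok, state}
    after
      timeout -> {:error, :timeout}
    end
  end

  def loop(state) do
    receive do
      {:add, greeting_id, greeting} ->
        loop(Map.put(state, greeting_id, greeting))

      {:greetings, caller, ref} ->
        send(caller, {ref, state})
        loop(state)
    end
  end
end

pid = Greetings.SyncProcess.start()
Greetings.SyncProcess.add(pid, 1, "Hello")
Greetings.SyncProcess.add(pid, 2, "Bon jour")
IO.inspect(Greetings.SyncProcess.list(pid))
# prints {:ok, %{1 => "Hello", 2 => "Bon jour"}}
```

Writing the reference matching and the timeout by hand is exactly the kind of boilerplate that a generic solution can take over.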

Apparently this is a very common pattern in Elixir, so there is a more generic solution for it called GenServer. Why is it needed? Because the example above is missing a lot of functionality and is exposed to issues like deadlocks, message ordering, incompatibility with supervisors, etc.

GenServer

If we want to tackle the incompatibility with supervisors - remember our initial goal - we need to introduce GenServer.

On Hexdocs it's written:
"A GenServer is a process like any other Elixir process and it can be used to keep state, execute code asynchronously and so on... It will also fit into a supervision tree."

Let's implement the previous example using GenServer:

defmodule Greetings.Server do
  use GenServer

  @name __MODULE__

  # API
  def start do
    GenServer.start(__MODULE__, :empty_state, name: @name)
  end

  def add(greeting_id, greeting) do
    GenServer.cast(@name, {:add, greeting_id, greeting})
  end

  def list do
    GenServer.call(@name, {:greetings})
  end

  # Callbacks
  def init(:empty_state) do
    {:ok, %{}}
  end

  def handle_cast({:add, greeting_id, greeting}, state) do
    new_state = Map.put(state, greeting_id, greeting )
    {:noreply, new_state}
  end

  def handle_call({:greetings}, _from, state) do
    {:reply, state, state}
  end
end

GenServer client APIs

API functions are used by the client to communicate with the server.

We need three functions: start/0, add/2, and list/0.

Inside the start function we ask GenServer to start a process. With the name option we say that the started process should be registered under the module name. What have we achieved with that? On the downside, we now have a "single server", which is fine for this purpose; on the upside, we don't have to carry a process id through every API call, which keeps the API cleaner. We also pass the :empty_state atom, which will be explained later.

In the add/2 function, the client sends a cast message, which is asynchronous, so the client does not wait for a reply. We use this for adding new greetings.

In the list function, the client sends a call message, which is synchronous, so the client waits for a reply from the server. We use this to ask the server to reply with the list of all greetings, that is, the state.

GenServer callback functions - Server APIs

With the Elixir use macro we are announcing that our Greetings.Server will use GenServer and will be populated with its specific behavior. I see this approach as a template method pattern.

In practice, that means we need to provide a series of callback functions that follow naming and typing conventions and are called by the server (GenServer).

We need three callback functions: init/1, handle_cast/2, handle_call/3.

When init/1 is called, we return a tuple with our empty greetings state. The function has one parameter, which receives the second argument we passed to GenServer.start/3 in the client API. That means the atom :empty_state exists here only for function clause matching.

With handle_cast/2, the server builds a new state (by adding the new greeting to it) and returns it.

With handle_call/3, the server replies with the current state of all greetings.

The second parameter, _from, is not needed, hence the underscore.

Let's talk about the return value, {:reply, state, state}. The first state (the second element of the tuple) is the payload that will be sent back to the client (our greetings), and the second state (the third element) is the state the server will keep.
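In our server the reply and the saved state happen to be the same value, which hides the difference between the two. Here is a sketch of a variation where they differ: a hypothetical take/2 call that hands one greeting back to the client and stores the map without it. The module name Greetings.PopServer and the take function are invented for this illustration.

```elixir
defmodule Greetings.PopServer do
  use GenServer

  # API
  def start_link(initial) when is_map(initial) do
    GenServer.start_link(__MODULE__, initial)
  end

  def take(pid, greeting_id), do: GenServer.call(pid, {:take, greeting_id})
  def list(pid), do: GenServer.call(pid, :greetings)

  # Callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call({:take, greeting_id}, _from, state) do
    {greeting, new_state} = Map.pop(state, greeting_id)
    # 2nd element: sent back to the caller; 3rd element: kept by the server
    {:reply, greeting, new_state}
  end

  def handle_call(:greetings, _from, state) do
    {:reply, state, state}
  end
end

{:ok, pid} = Greetings.PopServer.start_link(%{1 => "Hello", 2 => "Hi"})
IO.inspect(Greetings.PopServer.take(pid, 1))
# prints "Hello"
IO.inspect(Greetings.PopServer.list(pid))
# prints %{2 => "Hi"}
```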

Let's try our server now. We are going to start a server, add greetings, and list them:

iex(56)> Greetings.Server.start
{:ok, #PID<0.645.0>}

iex(57)> Greetings.Server.add(1, "Hello")
:ok

iex(58)> Greetings.Server.add(2, "Hi")   
:ok

iex(59)> Greetings.Server.list           
%{1 => "Hello", 2 => "Hi"}

Great, very clear again. What we have seen so far is how OTP works with processes by sending messages. Elixir also provides higher-level abstractions for common cases: Agent (built on top of GenServer) for storing state, and Task for concurrent jobs.

So we could rewrite our example again by using Agent and make the code even clearer. But this is out of the scope of this blog.

So now we can imagine better how a library like Ecto as a process could receive a message for querying DB and reply with results.

Let's now try to monitor our server with a supervisor.

Supervisor and monitoring

As you may have noticed, we used the start/spawn functions, not start_link/spawn_link, when creating processes. That means that if the created process dies, nothing happens to its parent, and the process is not recreated either. There is no link between them. Had we used start_link/spawn_link, a crash in one process would also take down the other (unless it traps exits).
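To see a link in action without taking down the calling process, here is a small sketch that traps exits, so the exit signal of the linked process arrives as an ordinary message. The module name LinkDemo and the :boom reason are assumptions for illustration.

```elixir
defmodule LinkDemo do
  def run do
    # With :trap_exit set, exit signals from linked processes become messages.
    # Without it, the abnormal exit below would terminate the caller as well.
    old_flag = Process.flag(:trap_exit, true)

    pid = spawn_link(fn -> exit(:boom) end)

    result =
      receive do
        {:EXIT, ^pid, reason} -> {:linked_process_exited, reason}
      after
        1_000 -> :timeout
      end

    # Restore the previous setting so the caller behaves as before
    Process.flag(:trap_exit, old_flag)
    result
  end
end

IO.inspect(LinkDemo.run())
# prints {:linked_process_exited, :boom}
```

Trapping exits and reacting to them is, roughly speaking, the building block that supervisors rely on.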

What if I told you there is a way for the parent process to monitor its children and restart them in case of a crash? That's possible when we have a supervisor process that monitors its linked children. When we have supervisors monitoring other supervisors, we talk about a supervision tree.

An example can be seen even in iex: if the iex process spawns a linked process that crashes, the iex process crashes as well. But because iex runs in its own supervision tree, a new iex process is spawned after the crash.

Let's now create a supervisor and configure it to monitor our previous GenServer.
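To make the word "tree" concrete, here is a minimal sketch of a supervisor that supervises another supervisor, which in turn supervises a worker. All module names (Tree.Worker, Tree.SubSupervisor, Tree.RootSupervisor) and the registered names are made up for this illustration.

```elixir
defmodule Tree.Worker do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

defmodule Tree.SubSupervisor do
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok) do
    # Leaf of the tree: a plain worker process
    Supervisor.init([{Tree.Worker, name: :tree_worker}], strategy: :one_for_one)
  end
end

defmodule Tree.RootSupervisor do
  use Supervisor

  def start_link(opts \\ []), do: Supervisor.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok) do
    # The child here is itself a supervisor, which is what makes it a tree
    children = [{Tree.SubSupervisor, name: :tree_sub_supervisor}]
    Supervisor.init(children, strategy: :one_for_one)
  end
end

{:ok, root} = Tree.RootSupervisor.start_link()
[{Tree.SubSupervisor, sub, :supervisor, _}] = Supervisor.which_children(root)
[{Tree.Worker, worker, :worker, _}] = Supervisor.which_children(sub)
IO.inspect({is_pid(sub), is_pid(worker)})
# prints {true, true}
```

Note the :supervisor and :worker atoms in the which_children/1 results: that is how each level of the tree describes its children.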

First, we will make little changes in the APIs of a GenServer:

  ...
  # API
  def start_link(opts) do
    GenServer.start_link(__MODULE__, :empty_state, opts)
  end

  def add(pid, greeting_id, greeting) do
    GenServer.cast(pid, {:add, greeting_id, greeting})
  end

  def list(pid) do
    GenServer.call(pid, {:greetings})
  end
  ...

We made two changes. First, we use start_link/1 instead of start. Second, our client API functions now take the server (a pid or a registered name) as a parameter. Why?
We will name our GenServer process in its supervisor and pass the name in the options (opts), which start_link/1 forwards to GenServer.start_link/3, so the module no longer hard-codes a process name.

The implementation of supervisor looks like this:

defmodule Greetings.Supervisor do
  use Supervisor

  # API
  def start_link do
    Supervisor.start_link(__MODULE__, [])
  end

  # Callbacks
  def init([]) do
    children = [
      {Greetings.Server, name: :supervised_greetings_genserver}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end

You can see we implement two functions: start_link, where we spawn a linked supervisor process, and the init/1 callback, where we initialize the supervisor by defining its workers (child processes). We saw earlier how the use macro and callbacks work.

In our case, the worker is our GenServer (Greetings.Server), and we name its process :supervised_greetings_genserver. This name is passed in the options when Greetings.Server.start_link/1 is called. Without it, we would have to work with the process pid, which is less readable.

We also choose the basic :one_for_one strategy, which means that if a child process dies, the supervisor restarts only that child.

So now that we have a supervisor in place, we don't start Greetings.Server directly, but we let the supervisor start it.

We will also use auxiliary call Supervisor.which_children/1 which will give us a list of supervised processes.

And last, we will use our Greetings.Server API for adding and listing greetings.

iex(1)> {:ok, sup} = Greetings.Supervisor.start_link
{:ok, #PID<0.158.0>}

iex(2)> Supervisor.which_children(sup)
[{Greetings.Server, #PID<0.159.0>, :worker, [Greetings.Server]}]

iex(3)> Greetings.Server.add(:supervised_greetings_genserver, 1, "Hello")
:ok

iex(4)> Greetings.Server.add(:supervised_greetings_genserver, 2, "Hi")   
:ok

iex(5)> Greetings.Server.list(:supervised_greetings_genserver)        
%{1 => "Hello", 2 => "Hi"}
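What we have not seen yet is the restart itself. The sketch below kills the supervised server and waits until the supervisor has started a replacement under the same name. To keep it self-contained it uses its own copies of the server and supervisor; the RestartDemo module names, the registered name, and the polling helper are my assumptions. The supervisor may also log a report about the terminated child.

```elixir
defmodule RestartDemo.Server do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, :empty_state, opts)
  def add(server, id, greeting), do: GenServer.cast(server, {:add, id, greeting})
  def list(server), do: GenServer.call(server, {:greetings})

  @impl true
  def init(:empty_state), do: {:ok, %{}}

  @impl true
  def handle_cast({:add, id, greeting}, state), do: {:noreply, Map.put(state, id, greeting)}

  @impl true
  def handle_call({:greetings}, _from, state), do: {:reply, state, state}
end

defmodule RestartDemo.Supervisor do
  use Supervisor

  def start_link, do: Supervisor.start_link(__MODULE__, [])

  @impl true
  def init([]) do
    children = [{RestartDemo.Server, name: :restart_demo_server}]
    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule RestartDemo do
  def run do
    {:ok, _sup} = RestartDemo.Supervisor.start_link()
    old_pid = Process.whereis(:restart_demo_server)
    RestartDemo.Server.add(:restart_demo_server, 1, "Hello")

    # Kill the worker and wait until the VM confirms it is gone
    ref = Process.monitor(old_pid)
    Process.exit(old_pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
    end

    new_pid = wait_for_restart(old_pid)
    {old_pid != new_pid, RestartDemo.Server.list(:restart_demo_server)}
  end

  # Polls until the supervisor has registered a fresh process under the name
  defp wait_for_restart(old_pid) do
    case Process.whereis(:restart_demo_server) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        wait_for_restart(old_pid)
    end
  end
end

IO.inspect(RestartDemo.run())
# prints {true, %{}}
```

The server comes back with a new pid, but with an empty map: a restart runs init/1 again, so the greetings held by the old process are gone.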

Perfect. In order to complete our mission, we need to understand also the basics of the Application concept, which we will do in the follow-up blog.
