Etienne Depaulis
Posted on November 9, 2019
At Lifen, a healthcare communication SaaS, we rely on a microservices architecture with more than 50 services written in 4 different languages (Ruby, Node, Java & Python) and therefore have a bunch of GitHub repos to monitor.
I've been using Ruby for quite some time now, and last year I decided to learn Elixir for multiple reasons: mainly the functional paradigm and curiosity. Then I fell in love with pattern matching, but that's another story...
In my opinion, the best, and maybe only, way to learn a new language is to build something that goes into production as soon as possible. So I decided to build a Phoenix app to support a GitHub App which handles webhooks and provides an overview of key data about our repos.
This article covers two potential strategies to handle GitHub App authentication in an Elixir app.
Let's consider the following use case: retrieve and store the CODEOWNERS content, which enables the "bot" to link each repo to a team. The objective is to expose an endpoint to trigger the retrieval.
GitHub Authentication Flow
The whole flow is described in the GitHub API documentation. The key aspect here is that you need to obtain a temporary (1h) access token per organization on which your application is installed, and then use it to authenticate your API calls.
As this process requires "extra" API calls, we would like to cache the provided token.
Let's persist it in the database!
The first strategy is to store the access token in the database and then check on each call if a valid one is available. If not, the whole token generating flow is activated and the new token is stored:
defmodule Github.JwtToken do
  use Joken.Config

  def signed_jwt do
    current_timestamp = DateTime.utc_now() |> DateTime.to_unix()
    {github_app_id, _remainder} = System.get_env("GITHUB_APP_ID") |> Integer.parse()

    extra_claims = %{
      "iat" => current_timestamp,
      "exp" => current_timestamp + 10 * 60,
      "iss" => github_app_id
    }

    generate_and_sign!(extra_claims)
  end
end
Github.JwtToken.signed_jwt generates the self-signed JWT which will then be used for the /app/ GitHub API endpoints.
defmodule Github.InstallationToken do
  import Ecto.Query, warn: false
  alias MyApplication.Repo
  use Ecto.Schema
  import Ecto.Changeset
  alias Github.InstallationToken

  schema "installation_tokens" do
    field :expires_at, :utc_datetime
    field :token, :string
    field :organization, :string
    timestamps()
  end

  @doc false
  def changeset(installation_token, attrs) do
    installation_token
    |> cast(attrs, [:token, :organization, :expires_at])
    |> validate_required([:token, :organization, :expires_at])
  end

  def create_installation_token(attrs \\ %{}) do
    %InstallationToken{}
    |> InstallationToken.changeset(attrs)
    |> Repo.insert()
  end

  # Returns a token that is still valid, generating and storing a new one if needed.
  def valid_token_for(organization) do
    installation_token = find_or_create_installation_token(organization)
    installation_token.token
  end

  defp find_installation_id(client, organization) do
    {200, installations, _response} = Tentacat.App.Installations.list_mine(client)

    matched_installation =
      Enum.find(installations, fn installation ->
        installation["account"]["login"] == organization
      end)

    matched_installation["id"]
  end

  defp generate_installation_token(organization) do
    jwt = Github.JwtToken.signed_jwt()
    client = Tentacat.Client.new(%{jwt: jwt})
    installation_id = find_installation_id(client, organization)
    {201, access_tokens, _response} = Tentacat.App.Installations.token(client, installation_id)

    %{
      token: access_tokens["token"],
      expires_at: access_tokens["expires_at"],
      organization: organization
    }
  end

  defp find_or_create_installation_token(organization) do
    current_datetime = DateTime.utc_now()

    query =
      from i in InstallationToken,
        where: ^current_datetime < i.expires_at and i.organization == ^organization

    query |> first() |> Repo.one() |> eventually_create_installation_token(organization)
  end

  # No valid token in the database: generate one and persist it.
  defp eventually_create_installation_token(nil, organization) do
    {:ok, installation_token} =
      generate_installation_token(organization) |> create_installation_token()

    installation_token
  end

  defp eventually_create_installation_token(installation_token = %InstallationToken{}, _organization) do
    installation_token
  end
end
The token cache strategy discussed in this section is implemented in Github.InstallationToken, which groups together both the schema and the GitHub logic.
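To make the "reuse a valid token, otherwise generate one" logic easier to see in isolation, here is a minimal sketch that replaces the database with a plain map. The module name TokenCacheSketch and the generate_fun argument are hypothetical; generate_fun stands in for the JWT and access-token API calls.

```elixir
defmodule TokenCacheSketch do
  # Returns {token, updated_cache}. `cache` is a plain map keyed by organization,
  # standing in for the installation_tokens table.
  def valid_token_for(cache, organization, now, generate_fun) do
    case Map.get(cache, organization) do
      %{token: token, expires_at: expires_at} when expires_at > now ->
        # The cached token is still valid: no extra API call is needed.
        {token, cache}

      _missing_or_expired ->
        # Hypothetical generator standing in for the GitHub API calls.
        fresh = generate_fun.(organization)
        {fresh.token, Map.put(cache, organization, fresh)}
    end
  end
end

now = 1_000
generate = fn org -> %{token: "token-for-#{org}", expires_at: now + 3_600} end

{token, cache} = TokenCacheSketch.valid_token_for(%{}, "lifen", now, generate)

# A second lookup half an hour later is served from the cache (same token, same cache).
{^token, ^cache} = TokenCacheSketch.valid_token_for(cache, "lifen", now + 1_800, generate)
```

Timestamps are plain integers here to keep the comparison in a guard; the Ecto version above does the same comparison in the query.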
defmodule Github.Content do
  def sync_codeowners(application) do
    token = Github.InstallationToken.valid_token_for(application.organization)
    client = Tentacat.Client.new(%{access_token: token})

    # NOTE: `repo` (the repository name) is not defined in this excerpt;
    # it must be bound before this call.
    {200, %{"content" => encoded_content}, _response} =
      Tentacat.Repositories.Contents.content(
        client,
        application.organization,
        repo,
        ".github/CODEOWNERS"
      )

    # The API returns the file content Base64-encoded with line breaks.
    attrs = %{raw_codeowners: :base64.mime_decode(encoded_content)}
    application |> MyApplication.Main.update_application(attrs)
  end
end
defmodule MyApplicationWeb.ApplicationController do
  use MyApplicationWeb, :controller

  def sync_codeowners(conn, %{"id" => id}) do
    # The piped value is the only argument of sync_codeowners/1.
    id
    |> MyApplication.Main.get_application!()
    |> Github.Content.sync_codeowners()

    conn
    |> put_flash(:info, "Sync done")
    |> redirect(to: Routes.application_path(conn, :index))
  end
end
We then simply have to call MyApplicationWeb.ApplicationController.sync_codeowners to trigger the desired action.
This code could be greatly improved by running Github.Content.sync_codeowners in a separate process (aka asynchronous job) using Task.start(fn -> slow_method() end).
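As a rough illustration of that "fire & forget" idea, the sketch below runs a slow function in a separate process with Task.start/1. The slow_sync function and the reply_to field are hypothetical and exist only so the demo can observe the result; slow_sync stands in for Github.Content.sync_codeowners.

```elixir
# `slow_sync` is a hypothetical stand-in for Github.Content.sync_codeowners/1.
slow_sync = fn application ->
  Process.sleep(100)
  send(application.reply_to, {:synced, application.id})
end

# Build the map in the caller so that self() is the caller's pid, not the task's.
application = %{id: 42, reply_to: self()}

# Task.start/1 returns immediately; the work runs in a separate, unlinked process.
{:ok, _pid} = Task.start(fn -> slow_sync.(application) end)

# A controller would respond to the HTTP request here.
# For the demo, wait for the background work to report back.
receive do
  {:synced, id} -> IO.puts("application #{id} synced in the background")
after
  1_000 -> IO.puts("sync did not finish in time")
end
```

A task started this way is not supervised, so a crash goes unnoticed by the caller; in a real application, starting it under a Task.Supervisor is usually the safer choice.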
This approach is quite standard and could easily have been written in other frameworks, such as Rails for example. It is quick to implement, and thanks to the "fire & forget" background job strategy provided by Task it can handle high volumes of requests. There are 2 potential bottlenecks though:
- the GitHub API for Apps has a rate limit of 5,000 requests per hour per organization (with a higher limit for organizations with 20+ members)
- if multiple requests are triggered simultaneously with an invalid token, there is no locking mechanism to avoid requesting multiple access tokens
This approach also lacks a strategy for cleaning up expired tokens.
What about a supervised GenServer strategy?
After reading Elixir in Action by Saša Jurić, a lot of what I've been reading about Elixir, Erlang, BEAM and the OTP philosophy started to make sense. I strongly recommend it! (I also strongly recommend Elixir for programmers by Dave Thomas).
Let's put this knowledge into practice with our use case by creating a separate process per organization (processes are super cheap in the BEAM world) using the GenServer approach. Each process will hold its own state, meaning we don't have to persist it to the database anymore.
Also, as we will only have one process per organization, we should be able to handle the rate limit and concurrent token refresh strategy.
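The reason a single process helps with concurrent refreshes is that a GenServer handles its messages one at a time. The following sketch is only an illustration of that property: TokenHolder, generate_fun and the counter Agent are hypothetical names, and the generator returns a fake token instead of calling GitHub.

```elixir
defmodule TokenHolder do
  use GenServer

  # `generate_fun` is a hypothetical stand-in for the GitHub token API calls.
  def start_link(generate_fun), do: GenServer.start_link(__MODULE__, generate_fun)

  def token(pid), do: GenServer.call(pid, :token)

  @impl GenServer
  def init(generate_fun) do
    # Start with an already expired token so the first request triggers a refresh.
    {:ok, %{generate_fun: generate_fun, token: nil, expires_at: 0}}
  end

  @impl GenServer
  def handle_call(:token, _from, state) do
    now = System.system_time(:second)

    state =
      if state.expires_at > now do
        state
      else
        # Messages are handled one at a time, so only one refresh can run at once.
        Map.merge(state, state.generate_fun.())
      end

    {:reply, state.token, state}
  end
end

# Counts how many times the (fake) token generation runs.
{:ok, counter} = Agent.start_link(fn -> 0 end)

generate = fn ->
  Agent.update(counter, &(&1 + 1))
  %{token: "fake-token", expires_at: System.system_time(:second) + 3_600}
end

{:ok, holder} = TokenHolder.start_link(generate)

# Twenty concurrent callers ask for a token at the same time.
tokens =
  1..20
  |> Enum.map(fn _ -> Task.async(fn -> TokenHolder.token(holder) end) end)
  |> Enum.map(&Task.await/1)

IO.inspect(Enum.uniq(tokens), label: "distinct tokens")
IO.inspect(Agent.get(counter, & &1), label: "token generations")
```

Every caller receives the same token and the generator runs a single time, because the second and later requests find a valid token already in the process state.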
defmodule MyApplication.Application do
  use Application

  def start(_type, _args) do
    children = [
      MyApplication.Repo,
      MyApplicationWeb.Endpoint,
      Github.System
    ]

    opts = [strategy: :one_for_one, name: MyApplication.Supervisor]
    Supervisor.start_link(children, opts)
  end
end
When we start our Phoenix server, we simply add an additional process (Github.System) to be supervised by the "main" supervisor.
defmodule Github.System do
  def start_link do
    Supervisor.start_link(
      [
        Github.ProcessRegistry,
        Github.Cache
      ],
      strategy: :one_for_one
    )
  end

  def child_spec(_arg) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, []},
      type: :supervisor
    }
  end
end
This process will be responsible for supervising 2 processes:
- Github.ProcessRegistry: a registry to register our per-organization processes
- Github.Cache: which acts as a switch to start or select our dynamic processes
defmodule Github.ProcessRegistry do
  def start_link do
    Registry.start_link(keys: :unique, name: __MODULE__)
  end

  def via_tuple(key) do
    {:via, Registry, {__MODULE__, key}}
  end

  def child_spec(_) do
    Supervisor.child_spec(
      Registry,
      id: __MODULE__,
      start: {__MODULE__, :start_link, []}
    )
  end
end
defmodule Github.Cache do
  def start_link() do
    # IO.puts("Starting Github Cache")
    DynamicSupervisor.start_link(name: __MODULE__, strategy: :one_for_one)
  end

  def child_spec(_arg) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, []},
      type: :supervisor
    }
  end

  # Starts the process for this organization, or returns the existing one.
  def organization_process(organization_name) do
    case start_child(organization_name) do
      {:ok, pid} -> pid
      {:error, {:already_started, pid}} -> pid
    end
  end

  defp start_child(organization_name) do
    DynamicSupervisor.start_child(__MODULE__, {Github.Client, organization_name})
  end
end
We can then simply call Github.Cache.organization_process("lifen") to either start or retrieve the process in charge of the Lifen organization.
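The "start or retrieve" behaviour comes from combining a unique Registry with a DynamicSupervisor: starting a child whose name is already registered returns {:error, {:already_started, pid}}. Here is a self-contained sketch of that mechanism; OrgWorker, OrgRegistry and OrgSupervisor are hypothetical names, and an Agent replaces the real GenServer to keep it short.

```elixir
defmodule OrgWorker do
  # Hypothetical per-organization process; an Agent keeps the sketch short.
  use Agent

  def start_link(organization) do
    Agent.start_link(fn -> organization end, name: via_tuple(organization))
  end

  def via_tuple(organization), do: {:via, Registry, {OrgRegistry, organization}}
end

{:ok, _registry} = Registry.start_link(keys: :unique, name: OrgRegistry)
{:ok, _supervisor} = DynamicSupervisor.start_link(name: OrgSupervisor, strategy: :one_for_one)

organization_process = fn organization ->
  case DynamicSupervisor.start_child(OrgSupervisor, {OrgWorker, organization}) do
    # First call for this organization: a new process is started.
    {:ok, pid} -> pid
    # The name is already registered: reuse the running process.
    {:error, {:already_started, pid}} -> pid
  end
end

first = organization_process.("lifen")

# The second call returns the very same pid instead of starting a new process.
^first = organization_process.("lifen")

IO.inspect(Agent.get(OrgWorker.via_tuple("lifen"), & &1), label: "state held for")
```

Because the registry keys are unique, the uniqueness check and the registration happen together, so two simultaneous callers cannot both start a process for the same organization.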
defmodule Github.Client do
  use GenServer

  def start_link(name) do
    GenServer.start_link(Github.Client, name, name: via_tuple(name))
  end

  def extract_codeowners(organization_process, application) do
    GenServer.cast(organization_process, {:extract_codeowners, application})
  end

  defp via_tuple(name) do
    Github.ProcessRegistry.via_tuple({__MODULE__, name})
  end

  @impl GenServer
  def init(organization) do
    {:ok, set_token(organization)}
  end

  @impl GenServer
  def handle_cast({:extract_codeowners, application}, state) do
    state = eventually_refresh_token(state)
    state.token |> set_raw_codeowners(application)
    MyApplicationWeb.Endpoint.broadcast("applications", "updated_application", %{})
    {:noreply, state}
  end

  defp set_raw_codeowners(token, %MyApplication.Main.Application{} = application) do
    client = Tentacat.Client.new(%{access_token: token})

    # NOTE: `repo` (the repository name) is not defined in this excerpt;
    # it must be bound before this call.
    {200, %{"content" => encoded_content}, _response} =
      Tentacat.Repositories.Contents.content(
        client,
        application.organization,
        repo,
        ".github/CODEOWNERS"
      )

    attrs = %{raw_codeowners: :base64.mime_decode(encoded_content)}
    application |> MyApplication.Main.update_application(attrs)
  end

  # Requests a new token only once the current one has expired.
  defp eventually_refresh_token(state) do
    if state.expires_at <= DateTime.utc_now() |> DateTime.to_unix() do
      set_token(state.organization)
    else
      state
    end
  end

  # The returned map (token, expires_at, organization) is the process state.
  defp set_token(organization) do
    Github.InstallationToken.generate_installation_token(organization)
  end
end
Finally, once we have the process PID, we can send it the :extract_codeowners message via Github.Client.extract_codeowners(process, application).
The main dashboard relies on Phoenix LiveView so we simply have to broadcast a message to refresh it via MyApplicationWeb.Endpoint.broadcast("applications", "updated_application", %{}).
Also, eventually_refresh_token uses the only if .. else .. end of the whole application thanks to pattern matching!
defmodule Github.JwtToken do
  use Joken.Config

  def signed_jwt do
    current_timestamp = DateTime.utc_now() |> DateTime.to_unix()
    {github_app_id, _remainder} = System.get_env("GITHUB_APP_ID") |> Integer.parse()

    extra_claims = %{
      "iat" => current_timestamp,
      "exp" => current_timestamp + 10 * 60,
      "iss" => github_app_id
    }

    generate_and_sign!(extra_claims)
  end
end
defmodule Github.InstallationToken do
  def generate_installation_token(organization) do
    jwt = Github.JwtToken.signed_jwt()
    client = Tentacat.Client.new(%{jwt: jwt})
    installation_id = find_installation_id(client, organization)
    create_access_token(client, organization, installation_id)
  end

  def find_installation_id(client, organization) do
    {200, installations, _response} = Tentacat.App.Installations.list_mine(client)

    matched_installation =
      Enum.find(installations, fn installation ->
        installation["account"]["login"] == organization
      end)

    matched_installation["id"]
  end

  def create_access_token(client, organization, installation_id) do
    {201, access_tokens, _response} = Tentacat.App.Installations.token(client, installation_id)

    %{
      token: access_tokens["token"],
      expires_at: unix_expires_at(access_tokens["expires_at"]),
      organization: organization
    }
  end

  # Converts the ISO 8601 expiry string returned by the API to a Unix timestamp.
  defp unix_expires_at(expires_at) do
    {:ok, naive_expires_at} = NaiveDateTime.from_iso8601(expires_at)
    {:ok, expires_at} = DateTime.from_naive(naive_expires_at, "Etc/UTC")
    expires_at |> DateTime.to_unix()
  end
end
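As a side note on the timestamp conversion, the same result can probably be obtained in one step with DateTime.from_iso8601/1, which returns the UTC datetime together with the parsed offset. This is an alternative sketch, not the code used above; ExpirySketch is a hypothetical module and the sample timestamp is made up for the demo.

```elixir
defmodule ExpirySketch do
  # Converts an ISO 8601 timestamp string to Unix seconds.
  def unix_expires_at(expires_at) do
    {:ok, datetime, _utc_offset} = DateTime.from_iso8601(expires_at)
    DateTime.to_unix(datetime)
  end
end

# Sample value in the same shape as the API's expires_at field (made up for the demo).
unix = ExpirySketch.unix_expires_at("2019-11-09T10:00:00Z")

# Converting back gives the original instant.
{:ok, roundtrip} = DateTime.from_unix(unix)
IO.puts(DateTime.to_iso8601(roundtrip))
```

Unlike the NaiveDateTime route, this variant takes an explicit offset in the input into account before converting to UTC.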
defmodule MyApplicationWeb.ApplicationController do
  use MyApplicationWeb, :controller

  def sync_codeowners(conn, %{"id" => id}) do
    application = MyApplication.Main.get_application!(id)

    # Find (or start) the process for this organization and send it the job.
    Github.Cache.organization_process(application.organization)
    |> Github.Client.extract_codeowners(application)

    conn
    |> put_flash(:info, "Sync in progress")
    |> redirect(to: Routes.application_path(conn, :index))
  end
end
This strategy was a bit more complex to implement as it is more advanced and specific to Elixir.
It is important to note that we only have one process per organization, meaning there is a single "queue" per organization and its requests are handled one at a time. If we want to handle a heavier load, we could spawn processes from it.
Also, an interesting optimization for Github.Client could be to avoid calling the API during the initialization of the process (the solution to this problem is in Saša's book).
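One common OTP way to defer slow work out of init/1 is the handle_continue/2 callback. The sketch below shows that option under stated assumptions; it is not necessarily the technique described in the book. LazyClient and fetch_token are hypothetical names, and fetch_token simulates the slow API calls with a sleep.

```elixir
defmodule LazyClient do
  use GenServer

  # `fetch_token` is a hypothetical function standing in for the slow API calls.
  def start_link(fetch_token), do: GenServer.start_link(__MODULE__, fetch_token)

  def token(pid), do: GenServer.call(pid, :token)

  @impl GenServer
  def init(fetch_token) do
    # Return immediately so the caller of start_link is not blocked; the slow work
    # is deferred to handle_continue/2, which runs before any other message.
    {:ok, %{token: nil}, {:continue, {:fetch_token, fetch_token}}}
  end

  @impl GenServer
  def handle_continue({:fetch_token, fetch_token}, state) do
    {:noreply, %{state | token: fetch_token.()}}
  end

  @impl GenServer
  def handle_call(:token, _from, state), do: {:reply, state.token, state}
end

slow_fetch = fn ->
  Process.sleep(200)
  "fake-token"
end

{elapsed_us, {:ok, pid}} = :timer.tc(fn -> LazyClient.start_link(slow_fetch) end)
IO.puts("start_link returned after #{div(elapsed_us, 1000)} ms")

# The first call simply waits in the mailbox until the token has been fetched.
IO.inspect(LazyClient.token(pid), label: "token")
```

With this shape, starting the process under a supervisor no longer waits for the network, while callers still never observe a process without a token.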