Validating Data in Elixir: Using Ecto and NimbleOptions

davidsulc

David Sulc

Posted on November 14, 2023


In the previous part of this series about validating data at the boundary of an Elixir application, we covered a few general programming tactics to try and reject invalid and unexpected data in our software.

Continuing with that subject, we'll now explore how two libraries, namely Ecto and NimbleOptions, can further assist us.

Let's get started!

Using Ecto

As we've seen previously, Elixir provides many native techniques to help us guarantee data quality in our systems.

But it's also possible to leverage Ecto to cast, validate, and prune data even if there's no database interaction.

Schemaless Changesets in Ecto

Schemaless changesets (basically Ecto changesets that aren't tied to a database table) are a convenient way to create data structures. They also prevent bad data from making its way into structs (this can happen when bad data makes it past constructors or is added directly to a struct).

This approach is typically helpful when dealing with untrusted input (e.g., from an API request or as part of a module's public API). We can expose a function that will accept a plain map and, after proper vetting, will yield a struct. Other functions in the same module can then accept such a struct instance (rather than a map) to indicate that the data has been vetted and is safe to consume.

Here's what that approach can look like:

# in the Account module

defstruct [:name, suspended: false]

def from_params(%{} = params) do
  data  = %{}
  types = %{name: :string, suspended: :boolean}

  changeset =
    {data, types}
    |> Ecto.Changeset.cast(params, Map.keys(types))
    |> Ecto.Changeset.validate_required(...)
    |> Ecto.Changeset.validate_length(...)

  # apply_action/2 is defined in Ecto.Changeset and isn't imported here
  case Ecto.Changeset.apply_action(changeset, :insert) do
    {:ok, data} -> {:ok, struct(__MODULE__, data)}
    {:error, _} = error -> error
  end
end

Note that we're applying an :insert action, so we could even use our function's error to display directly in a Phoenix form: those require an :insert or :update action to render possible errors with the form's data.

Applying This To Phoenix Forms

Indeed, Phoenix forms inspect the action to determine whether error hints should be displayed: if no action is set, no errors will be rendered in the form. This is useful if an empty changeset is being used to render a Phoenix form: the changeset is invalid, but we don't want to berate the user with errors when they haven't (yet) made any actual errors. But this also means that if you want the validation errors resulting from from_params to show up in a Phoenix form, you need to set the :action value yourself, which is what we're accomplishing with our call to apply_action.

Of course, if you don't plan on using this with a Phoenix-rendered form, it won't be needed and can be safely skipped.

Creating a new validated account is now a simple matter of Account.from_params(%{name: "ACME", suspended: false}).
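To make this concrete, here's a minimal, self-contained sketch of the schemaless approach that can be run as a script. The validation rules (a required name between 2 and 50 characters) and the Mix.install version requirement are assumptions made for illustration, since the snippet above leaves the validation arguments open:

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule Account do
  defstruct [:name, suspended: false]

  @types %{name: :string, suspended: :boolean}

  def from_params(%{} = params) do
    changeset =
      {%{}, @types}
      |> Ecto.Changeset.cast(params, Map.keys(@types))
      # Assumed rules, for illustration only
      |> Ecto.Changeset.validate_required([:name])
      |> Ecto.Changeset.validate_length(:name, min: 2, max: 50)

    case Ecto.Changeset.apply_action(changeset, :insert) do
      {:ok, data} -> {:ok, struct(__MODULE__, data)}
      {:error, _changeset} = error -> error
    end
  end
end

# "suspended" is cast from a string to a boolean,
# and the unknown "admin" key is pruned
{:ok, account} =
  Account.from_params(%{"name" => "ACME", "suspended" => "true", "admin" => true})

IO.inspect(account)

# A blank name fails validation: the returned changeset carries
# the :insert action along with the errors
{:error, changeset} = Account.from_params(%{name: ""})
IO.inspect(changeset.action)
IO.inspect(changeset.errors)
```

The error case is the one a Phoenix form would care about: the changeset's action is set, so the errors would be rendered.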

And since the types are dynamic, they could also be passed in as arguments if your domain requires it.
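As a rough sketch of what passing the types in could look like, the hypothetical DynamicParams.cast/2 helper below (its name and shape are assumptions, not part of Ecto) takes the type map as its second argument:

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule DynamicParams do
  # Hypothetical helper: the caller decides on the field types at runtime
  def cast(%{} = params, %{} = types) do
    {%{}, types}
    |> Ecto.Changeset.cast(params, Map.keys(types))
    |> Ecto.Changeset.apply_action(:insert)
  end
end

# The same function handles different shapes of data
{:ok, data} =
  DynamicParams.cast(%{"age" => "42", "name" => "Bob"}, %{age: :integer, name: :string})

IO.inspect(data)

# Values that can't be cast to the requested type produce an error changeset
{:error, changeset} = DynamicParams.cast(%{"age" => "forty-two"}, %{age: :integer})
IO.inspect(changeset.errors)
```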

Embedded Schemas in Ecto

Let's say you already have an Account struct that isn't persisted to a database but still want to use Ecto to validate the data. In that case, you can transition to using an embedded schema rather than a basic struct. The trade-off between a schemaless changeset and an embedded schema is that the latter provides a bit more convenience at the expense of flexibility.

Namely, embedded schemas require you to have a struct within which to define the schema. Schemaless changesets don't have that requirement since, as their name implies, they don't need a schema definition.

Also, the datatypes for attributes can be dynamic in the schemaless changeset case (and provided as an argument to a function call, for example). In contrast, embedded schema types cannot vary from their defined value. Here's the above example rewritten to use an embedded schema:

# in the Account module

# embedded_schema/1 is a macro made available by Ecto.Schema
use Ecto.Schema

@primary_key false
embedded_schema do
  field :name, :string
  field :suspended, :boolean
end

def from_params(%{} = params) do
  changeset =
    %__MODULE__{}
    |> Ecto.Changeset.cast(params, [:name, :suspended])
    |> Ecto.Changeset.validate_required(...)
    |> Ecto.Changeset.validate_length(...)

  # The changeset was built from %__MODULE__{}, so a successful
  # apply_action/2 already returns {:ok, struct}: no extra wrapping is needed
  Ecto.Changeset.apply_action(changeset, :insert)
end
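For comparison with the schemaless version, here's a complete module that can be run as a script. It's only a sketch: the single validate_required rule, the default value for suspended, and the Mix.install version requirement are assumptions made for illustration.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule Account do
  use Ecto.Schema

  @primary_key false
  embedded_schema do
    field :name, :string
    # Assumed default, mirroring the plain struct's `suspended: false`
    field :suspended, :boolean, default: false
  end

  def from_params(%{} = params) do
    %__MODULE__{}
    |> Ecto.Changeset.cast(params, [:name, :suspended])
    # Assumed rule, for illustration only
    |> Ecto.Changeset.validate_required([:name])
    |> Ecto.Changeset.apply_action(:insert)
  end
end

# The successful result is already an Account struct
{:ok, account} = Account.from_params(%{name: "ACME"})
IO.inspect(is_struct(account, Account))
IO.inspect(account.suspended)
```

Since the types now live in the schema definition, from_params no longer needs a separate type map.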

If you'd like to dig deeper into how Ecto can also help you with your non-persisted data, I highly recommend reading Ecto's Data mapping and validation guide.

Validating Options

There are many circumstances where data is passed in as keyword lists (e.g., options given to OTP modules such as GenServers). These values need to be validated for conformity, but we also want that validation to be easy to understand and communicate. Enter NimbleOptions!

The NimbleOptions Library for Elixir

NimbleOptions is a great tool for validating options, as it is a lightweight library that verifies keyword lists and returns errors on invalid data. Let's see how it can be used!

Say we want to email suspended Account records. The email template to use will be different if we email a corporate or personal account. Additionally, we want to specify which values to pass into the template.

We'll want to adapt the signature: corporate accounts, for example, should have an account manager's name attached, while private accounts can simply have a generic signature. We'll also specify which URL to include as the call to action:

def email(%Account{} = account, opts) when is_list(opts) do
  template = Keyword.fetch!(opts, :template)
  values = Keyword.get(opts, :values, [])
  landing_url = Keyword.get(values, :landing_url)
  signature = Keyword.get(values, :signature, "Yours truly,\nACME.com")

  # email generation and sending goes here

  :ok
end

This is a great start: we require the :template option, and the program will crash if it isn't given. This keeps out bad data, but isn't a great experience for callers: if they make a mistake (even one as simple as a typo in the :template key!), everything is going to crash instead of just generating an error they can handle.
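To see the difference from the caller's point of view, here's a small sketch using only the standard library. The misspelled key is made up for the example:

```elixir
# Note the typo in the option key
opts = [tempalte: :personal]

# Keyword.fetch/2 reports the missing key as a value the caller can match on
IO.inspect(Keyword.fetch(opts, :template))

# Keyword.fetch!/2 raises a KeyError instead; we rescue it here
# only to show what the caller would be faced with
outcome =
  try do
    Keyword.fetch!(opts, :template)
  rescue
    error in KeyError -> {:raised, Exception.message(error)}
  end

IO.inspect(outcome)
```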

We could, of course, do something like this:

def email(%Account{} = account, opts) when is_list(opts) do
  with template when template in ~w(personal corporate)a <- Keyword.get(opts, :template) do
    values = Keyword.get(opts, :values, [])
    landing_url = %URI{} = Keyword.get(values, :landing_url, URI.parse("https://www.example.com/sign_in"))
    signature = Keyword.get(values, :signature, "Yours truly,\nACME.com")

    # email generation and sending goes here

    :ok
  else
    nil -> {:error, :missing_template}
    other when is_atom(other) -> {:error, :invalid_template_value}
    _ ->  {:error, :invalid_template_type}
  end
end

But that doesn't scale very well once the number of options grows: it gets pretty difficult to see what options are permitted, what their expected type is, and so on. Not to mention, this is going to get repetitive really fast.

Bringing In NimbleOptions

Let's try out NimbleOptions. We need to specify a schema that the options are expected to conform to:

@email_opts_schema [
  template: [
    required: true,
    type: {:in, ~w(personal corporate)a}
  ],
  values: [
    type: :keyword_list,
    default: [],
    keys: [
      landing_url: [
        type: {:struct, URI},
        default: URI.parse("https://www.example.com/sign_in")
      ],
      signature: [
        type: :string,
        default: "Yours truly,\nACME.com"
      ]
    ]
  ]
]

def email(%Account{} = account, opts) when is_list(opts) do
  with {:ok, validated_options} <- NimbleOptions.validate(opts, @email_opts_schema) do
    template = Keyword.fetch!(validated_options, :template)
    landing_url = validated_options[:values][:landing_url]
    signature = validated_options[:values][:signature]

    # email generation and sending goes here

    {:ok, validated_options}
  end
end

Do note we've specified default: [] in the values configuration: that way, an empty list is used when values isn't given, which lets the defaults of nested keys such as landing_url be filled in. Without it, no default landing_url is filled in when the values key isn't present at all.
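Here's a trimmed-down sketch of that behavior, comparing two schemas that differ only in the presence of default: []. The variable names and the Mix.install version requirement are assumptions made for the example:

```elixir
Mix.install([{:nimble_options, "~> 1.0"}])

nested_keys = [
  signature: [type: :string, default: "Yours truly,\nACME.com"]
]

with_default = [values: [type: :keyword_list, default: [], keys: nested_keys]]
without_default = [values: [type: :keyword_list, keys: nested_keys]]

# With `default: []`, the nested :signature default gets filled in
{:ok, cascaded} = NimbleOptions.validate([], with_default)
IO.inspect(cascaded)

# Without it, the :values key stays absent, and so does the nested default
{:ok, bare} = NimbleOptions.validate([], without_default)
IO.inspect(bare)
```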

Here it is in action:

iex> email(an_account, template: :personal)

{:ok,
 [
   values: [
     signature: "Yours truly,\nACME.com",
     landing_url: %URI{
       authority: "www.example.com",
       fragment: nil,
       host: "www.example.com",
       path: "/sign_in",
       port: 443,
       query: nil,
       scheme: "https",
       userinfo: nil
     }
   ],
   template: :personal
 ]}

iex> email(an_account, template: :personal, values: [signature: "With love,\nBob"])

{:ok,
 [
   template: :personal,
   values: [
     landing_url: %URI{
       authority: "www.example.com",
       fragment: nil,
       host: "www.example.com",
       path: "/sign_in",
       port: 443,
       query: nil,
       scheme: "https",
       userinfo: nil
     },
     signature: "With love,\nBob"
   ]
 ]}

As we can see, default values are being filled in when required, and they won't overwrite any provided values.

Validation also works out of the box:

iex> email(an_account, template: :boom)

{:error,
 %NimbleOptions.ValidationError{
   key: :template,
   keys_path: [],
   message: "invalid value for :template option: expected one of [:personal, :corporate], got: :boom",
   value: :boom
 }}

It will also verify that we're not passing in unexpected option keys, such as typos:

iex> email(an_account, template: :personal, values: [singature: "With love,\nBob"])

{:error,
 %NimbleOptions.ValidationError{
   key: [:singature],
   keys_path: [:values],
   message: "unknown options [:singature], valid options are: [:landing_url, :signature]",
   value: nil
 }}
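Because the error is a regular struct (and an exception), callers can pattern match on it and decide what to do. Below is a rough sketch of how that could look. OptionErrors.describe/1 is a hypothetical helper invented for this example, and the schema is a reduced version of the one above:

```elixir
Mix.install([{:nimble_options, "~> 1.0"}])

defmodule OptionErrors do
  # Hypothetical helper: turns a validation result into a caller-friendly value
  def describe({:ok, _opts}), do: :ok

  def describe({:error, %NimbleOptions.ValidationError{} = error}) do
    # keys_path says where in the nested options the problem is located;
    # key can be a single atom or a list of atoms (for unknown options)
    path = error.keys_path ++ List.wrap(error.key)
    {:invalid_option, path, Exception.message(error)}
  end
end

schema = [
  template: [required: true, type: {:in, [:personal, :corporate]}],
  values: [type: :keyword_list, default: [], keys: [signature: [type: :string]]]
]

[template: :boom]
|> NimbleOptions.validate(schema)
|> OptionErrors.describe()
|> IO.inspect()

[template: :personal, values: [singature: "With love,\nBob"]]
|> NimbleOptions.validate(schema)
|> OptionErrors.describe()
|> IO.inspect()
```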

Allowing Strings for landing_url

While having the landing_url as a URI ensures it at least looks right, having to provide it as a URI struct is a bit of a pain. It would be nice to provide the option as a string, but here's how that currently behaves:

iex> email(an_account, template: :personal, values: [landing_url: "https://my.app"])

{:error,
 %NimbleOptions.ValidationError{
   key: :landing_url,
   keys_path: [:values],
   message: "invalid value for :landing_url option: expected URI, got: \"https://my.app\"",
   value: "https://my.app"
 }}

Luckily, there's an easy fix — we just have to tweak the type definition:

@email_opts_schema [
  template: [
    required: true,
    type: {:in, ~w(personal corporate)a}
  ],
  values: [
    type: :keyword_list,
    default: [],
    keys: [
      landing_url: [
        # improved type to accept strings
        type: {:or, [{:struct, URI}, :string]},
        default: URI.parse("https://www.example.com/sign_in")
      ],
      signature: [
        type: :string,
        default: "Yours truly,\nACME.com"
      ]
    ]
  ]
]

Strings are now accepted for the landing_url value:

iex> email(an_account, template: :personal, values: [landing_url: "https://my.app"])

{:ok,
 [
   template: :personal,
   values: [signature: "Yours truly,\nACME.com", landing_url: "https://my.app"]
 ]}

While a step in the right direction, there are now two minor issues:

  • Our code has to handle both string and URI data types (or we need to manually convert one to another after validation).
  • Leaving the value as a string doesn't indicate anything. In particular, there's no indication that the value is a valid URI "safe" to process.

Elixir's NimbleOptions To the Rescue!

Once again, NimbleOptions comes to our rescue: we can use a :custom data type that specifies a parser function. In our case, that parser function can simply be URI.new/1, as it already conforms to the expectation set by :custom (namely, returning {:ok, value} or {:error, message}):

@email_opts_schema [
  template: [
    required: true,
    type: {:in, ~w(personal corporate)a}
  ],
  values: [
    type: :keyword_list,
    default: [],
    keys: [
      landing_url: [
        # also accept strings, but parse them into %URI{}
        type: {:or, [{:struct, URI}, {:custom, URI, :new, []}]},
        default: URI.parse("https://www.example.com/sign_in")
      ],
      signature: [
        type: :string,
        default: "Yours truly,\nACME.com"
      ]
    ]
  ]
]

Check it out:

iex> email(an_account, template: :personal, values: [landing_url: "https://my.app"])

{:ok,
 [
   template: :personal,
   values: [
     signature: "Yours truly,\nACME.com",
     landing_url: %URI{
       authority: nil,
       fragment: nil,
       host: "my.app",
       path: nil,
       port: 443,
       query: nil,
       scheme: "https",
       userinfo: nil
     }
   ]
 ]}

We've now got the best of both worlds: convenient argument types and a standardized representation post-validation!
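One thing to keep in mind: URI.new/1 only has function clauses for strings and URI structs, so a value of another type (say, an integer) would raise instead of producing an error tuple. If that matters in your case, a small wrapper can cover it. The OptionParsers module below is a hypothetical helper written for this example, not something provided by NimbleOptions:

```elixir
defmodule OptionParsers do
  # Hypothetical helper module (not part of NimbleOptions)

  # A %URI{} is passed through untouched
  def parse_uri(%URI{} = uri), do: {:ok, uri}

  # Strings are parsed with URI.new/1 (available since Elixir 1.13)
  def parse_uri(value) when is_binary(value) do
    case URI.new(value) do
      {:ok, uri} -> {:ok, uri}
      {:error, part} -> {:error, "invalid URI, failed to parse: #{inspect(part)}"}
    end
  end

  # Anything else yields an error tuple rather than a FunctionClauseError
  def parse_uri(other) do
    {:error, "expected a string or a %URI{}, got: #{inspect(other)}"}
  end
end

# In the schema, the landing_url type would then read:
#   type: {:custom, OptionParsers, :parse_uri, []}

IO.inspect(OptionParsers.parse_uri("https://my.app"))
IO.inspect(OptionParsers.parse_uri(42))
```

Since the wrapper handles URI structs itself, the {:or, ...} type is no longer needed around it.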

Wrapping Up

I hope you've enjoyed these two blog posts covering how you can keep bad data out of your Elixir applications while also making your code more expressive.

As we've seen, there are "plain vanilla" Elixir techniques as well as libraries like Ecto and NimbleOptions that help us keep invalid data from being processed in our code.

By introducing these approaches to our modules, we'll provide more immediate feedback to callers and also make our programs more resilient when faced with unexpected data.

Hopefully, you've now got a few more tools in your belt to try out. Enjoy!

P.S. If you'd like to read Elixir Alchemy posts as soon as they get off the press, subscribe to our Elixir Alchemy newsletter and never miss a single post!
