How to use database triggers in Rails

George Gavrilchik

Posted on May 27, 2024

Introduction

In modern web development, efficiently managing data within a database is crucial for ensuring optimal application performance and data integrity. Rails provides robust tools for interacting with databases. However, there are scenarios where leveraging the power of the database itself can yield significant benefits. This is where database triggers come into play.

Database triggers are special procedures that are automatically executed in response to certain events on a particular table or view in a database. These events can include actions such as insertions, updates, or deletions. Triggers are powerful tools for enforcing rules, maintaining audit trails, and ensuring data consistency without the need for manual intervention. By encapsulating logic within the database, triggers can help offload processing from the application layer, reduce network overhead, and ensure that critical operations are performed reliably.

When to Use Them

While database triggers can be incredibly useful, their application should be carefully considered. Triggers can be highly effective for specific use cases such as data denormalization. Data denormalization involves storing redundant data to improve read performance. For example, in a denormalized database, you might maintain a summary table that aggregates data from multiple related tables. A common scenario is updating a total sales column in a customers table whenever a new order is inserted into the orders table. Instead of recalculating the total sales every time it is needed, a trigger can automatically update the total whenever an order is added, modified, or deleted, thereby reducing the complexity and number of queries required to fetch this data. This use of triggers helps maintain data integrity while significantly improving query performance.
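
To make this concrete, here is a minimal PL/pgSQL sketch of such a trigger. The orders and customers tables, the amount and total_sales columns, and the customer_id foreign key are assumptions made for this illustration only, not part of the example app built later in this post.

-- Sketch (assumed schema): keep customers.total_sales in sync with an orders
-- table that has amount and customer_id columns.
CREATE OR REPLACE FUNCTION refresh_customer_total_sales()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE customers SET total_sales = total_sales + NEW.amount
        WHERE id = NEW.customer_id;
    ELSIF TG_OP = 'UPDATE' THEN
        -- Assumes the order stays with the same customer; a real trigger would
        -- also handle customer_id changes.
        UPDATE customers SET total_sales = total_sales + NEW.amount - OLD.amount
        WHERE id = NEW.customer_id;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE customers SET total_sales = total_sales - OLD.amount
        WHERE id = OLD.customer_id;
    END IF;
    RETURN NULL; -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER refresh_customer_total_sales_on_orders
AFTER INSERT OR UPDATE OF amount OR DELETE ON orders
FOR EACH ROW
EXECUTE FUNCTION refresh_customer_total_sales();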

Another practical case for denormalization with database triggers is adding a reference to another table during an insert operation to simplify queries. For instance, consider a scenario where you have a companies table, each company has many branches, each branch has many employees, and each employee has many sales. If you frequently need to aggregate sales by company, you may benefit from adding a company_id column to the sales table and populating it on insert.

This simplifies your queries by allowing direct access to the company information without needing to join multiple tables. For example, instead of joining the sales, employees, branches, and companies tables to get the total sales for a company, you can directly query the sales table using the company_id. This not only makes your queries more straightforward but also improves performance by reducing the complexity and number of joins required. This approach ensures that your data remains consistent and that the company_id is always correctly populated, making your data aggregation tasks more efficient.
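
A minimal sketch of that trigger could look like the following; the companies, branches, employees, and sales tables and their foreign key columns are assumed here for illustration only.

-- Sketch (assumed schema): copy the company_id onto each new sales row by
-- walking sales.employee_id -> employees.branch_id -> branches.company_id.
CREATE OR REPLACE FUNCTION set_sales_company_id()
RETURNS TRIGGER AS $$
BEGIN
    SELECT branches.company_id INTO NEW.company_id
    FROM employees
    JOIN branches ON branches.id = employees.branch_id
    WHERE employees.id = NEW.employee_id;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_sales_company_id_on_insert
BEFORE INSERT ON sales
FOR EACH ROW
EXECUTE FUNCTION set_sales_company_id();

With company_id stored on sales, an aggregation such as total sales per company becomes a single-table query instead of a three-join one.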

When Not to Use Them

While database triggers can offer significant advantages in certain scenarios, there are situations where their use can complicate the architecture and maintenance of an application. For example, consider a scenario where we decide to create a record in the invoices table every time a record is created in the sales table. Implementing this logic with a trigger would require embedding business rules directly within the database layer, which can violate the separation of concerns principle.

By placing business logic in triggers, you risk making your application harder to maintain and understand. Business logic is typically best handled within the application code, where it is more visible, easier to test, and simpler to modify as requirements evolve. When business rules are hidden within database triggers, it becomes challenging for developers to trace the flow of data and understand the application's behavior, leading to potential bugs and increased maintenance overhead. Additionally, debugging issues that arise from triggers can be more complex, as the logic is executed automatically by the database in response to events, rather than being explicitly called within the application code.

In this example, the preferred approach would be to handle the creation of invoice records within the application's service layer. This ensures that the business logic remains centralized, making the application more modular, maintainable, and easier to understand. By keeping business rules within the application code, developers can leverage standard development tools and practices, such as unit testing and version control, to ensure the integrity and reliability of the application.
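
As a rough sketch of that application-side alternative (the Sale and Invoice models, their attributes, and the service class name are all hypothetical):

# Hypothetical service object: the invoice is created alongside the sale in
# application code, where the business rule stays visible and easy to test.
module Sales
  class CreateWithInvoice
    def initialize(sale_params)
      @sale_params = sale_params
    end

    def call
      ActiveRecord::Base.transaction do
        sale = Sale.create!(@sale_params)
        Invoice.create!(sale: sale, amount: sale.amount, issued_on: Date.current)
        sale
      end
    end
  end
end

# Usage, e.g. from a controller:
# Sales::CreateWithInvoice.new(amount: 100, employee_id: 1).call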

Let's Start

Let's create an example trigger in the Rails environment. I suspect you have come to read this post expecting to see one. As an example, I've chosen to create a URL shortener that works the following way: For each long URL, we create a record in the database with the corresponding short alphanumeric code. For the short code, we will use the first 5 characters of the long URL's MD5 hash. If the short code is already taken by another URL, we will take the first 6 characters, and so on, until we find a free code or reach the limit of 32 characters, which is the length of an MD5 hash. By design, if the same URL is added again, it will generate the same short code, and instead of creating a new record, it will update the updated_at field of the existing record. This approach ensures that we avoid duplicate entries and maintain the uniqueness of short codes efficiently. Now, let's dive into the details of how to implement this in a Rails application with a database trigger.

class CreateUrls < ActiveRecord::Migration[7.1]
  def up
    create_table :urls do |t|
      t.string :long
      t.string :short

      t.timestamps
    end
    add_index :urls, :short, unique: true
    add_index :urls, :long, unique: true

    # Install the trigger function and trigger from the SQL file kept in db/triggers.
    path = "#{Rails.root}/db/triggers/add_unique_short_to_urls_on_insert.sql"
    sql = File.read(path)
    ActiveRecord::Base.connection.execute sql
  end

  def down
    # Dropping the table also drops the trigger attached to it;
    # the function has to be removed explicitly.
    drop_table :urls
    ActiveRecord::Base.connection.execute 'DROP FUNCTION IF EXISTS add_unique_short'
  end
end

Like any other database change, triggers are added to a Rails app through migrations. The migration above creates the table, adds the indices, and executes the SQL stored in a file in the db/triggers folder. It is important to include a down method that removes the trigger function when the migration is reverted. I've also kept the SQL code for the trigger in a separate file: this lets us use the syntax highlighting and autocomplete our IDE provides for SQL, and it signals to other developers (and our future selves) that the app contains database triggers, making it easy to get a full list of them. Giving these files obvious names is also important for clarity and ease of maintenance.

I don't think it's superfluous to clarify, for folks who might not be familiar with Rails, how migrations work. In Rails, migrations are classes in the db/migrate folder that inherit from ActiveRecord::Migration[:rails_version]. This is a nice example of metaprogramming in Ruby: instead of inheriting directly from ActiveRecord::Migration, we inherit from the class returned by its [] (square brackets) method, which dynamically builds a class exposing the migration features of the specified Rails version. This lets migrations written for older Rails versions keep working after you upgrade Rails. A migration class typically has two methods, up and down, which apply and revert the database changes respectively; alternatively, it can have a single change method if Rails can reverse the changes automatically. The file containing the class is named with a leading 14-digit timestamp, and Rails keeps a table (schema_migrations) with a single column recording the timestamps of migrations that have already run, so when you type rails db:migrate each migration runs exactly once and in the order it was created. This keeps schema changes consistent across environments.

It's important to mention that the SQL code we'll write in the next section is compiled and executed by the database engine, so if we need to change it, we have to revert the migration, edit the SQL, and run the migration again. You can read more about migrations here: Rails Migrations Guide.
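
As an aside, the same migration could be written with a single change method, using reversible to tell Rails how to undo the raw SQL. This is just an equivalent sketch of the up/down version shown earlier.

class CreateUrls < ActiveRecord::Migration[7.1]
  def change
    create_table :urls do |t|
      t.string :long
      t.string :short

      t.timestamps
    end
    add_index :urls, :short, unique: true
    add_index :urls, :long, unique: true

    # create_table and add_index are reversible on their own;
    # the raw SQL needs explicit up/down branches.
    reversible do |dir|
      dir.up   { execute File.read(Rails.root.join('db/triggers/add_unique_short_to_urls_on_insert.sql')) }
      # CASCADE is needed here because on rollback this runs before drop_table,
      # while the trigger on urls still depends on the function.
      dir.down { execute 'DROP FUNCTION IF EXISTS add_unique_short CASCADE' }
    end
  end
end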

SQL Code

Now let's dive into the SQL code. Since I am using PostgreSQL, I will be writing code in PL/pgSQL. Here is the code:

CREATE OR REPLACE FUNCTION add_unique_short()
RETURNS TRIGGER AS $$
DECLARE
    l int := 5;
    hash varchar(32);
    found boolean;
BEGIN
    hash := MD5(NEW.long); -- Generate the hash based on the 'long' URL

    -- Check for uniqueness of the short URL
    LOOP
        SELECT EXISTS(SELECT 1 FROM "urls" WHERE "urls"."short" = LEFT(hash, l)) INTO found;
        EXIT WHEN NOT found OR l >= 32;
        l := l + 1;
    END LOOP;

    -- Handle the case where a unique short hash is not found
    IF l >= 32 AND found THEN
        RAISE EXCEPTION 'Unable to generate a unique short URL';
    END IF;

    -- Set the short URL
    NEW.short := LEFT(hash, l);

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS add_unique_short_to_urls_on_insert ON urls;

CREATE TRIGGER add_unique_short_to_urls_on_insert
BEFORE INSERT OR UPDATE OF long ON urls
FOR EACH ROW
EXECUTE FUNCTION add_unique_short();

The code is straightforward, but let's highlight a few details about how PL/pgSQL works. In PL/pgSQL, all variables must be declared in a specific section at the beginning of the function, and the language is strictly typed, requiring each variable to have a defined data type. Additionally, functions must be declared before they can be used in triggers, ensuring that the database recognizes the function and its capabilities when setting up the trigger.

Since this logic runs at the data layer, the queries inside the loop are far cheaper than they would be if issued from Rails, because there is no per-query network round trip. They are also cheap for the database itself: thanks to the unique index on the short column, the EXISTS check doesn't have to scan the table; PostgreSQL can answer it from the index tree, which keeps the overhead low even as the table grows.
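
If you want to verify this, you can run the same probe through EXPLAIN in psql or another SQL client; on a populated table the plan should show an index (or index-only) scan on the short index rather than a sequential scan, though the exact output depends on your PostgreSQL version and data volume.

-- Check how PostgreSQL resolves the uniqueness probe used inside the loop.
EXPLAIN ANALYZE
SELECT EXISTS(SELECT 1 FROM urls WHERE urls.short = LEFT(MD5('https://dev.to'), 5));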

Alternatives and Debugging

While PL/pgSQL is a powerful language for writing stored procedures and triggers in PostgreSQL, other database engines use different languages for similar tasks. For example, MySQL uses a language called SQL/PSM (Persistent Stored Modules), and Oracle uses PL/SQL (Procedural Language/SQL). Microsoft SQL Server employs Transact-SQL (T-SQL), which extends SQL with procedural programming capabilities. Each of these languages has its own syntax and features tailored to the specific database engine, allowing developers to write complex logic directly within the database.

For debugging and testing SQL code, it's often more efficient to use a database management application like DBeaver. These tools provide a graphical interface where you can execute SQL code directly, inspect the results, and interact with your database more intuitively. This can be particularly useful when working with stored procedures and triggers, as you can immediately see the impact of your code and make adjustments as needed.
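
For example, once the function and trigger are loaded, you can smoke-test them straight from DBeaver or psql with a plain INSERT; RETURNING shows the short code the trigger filled in (the timestamps are supplied by hand here because the Rails-generated columns are NOT NULL).

-- Manual smoke test of the trigger from a SQL client.
INSERT INTO urls (long, created_at, updated_at)
VALUES ('https://example.com', NOW(), NOW())
RETURNING short;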

Creating a URL in Rails

Now let's try to create a URL. Assuming the Url model already exists, we would expect Url.create(long: 'https://dev.to') to return a Url object with both long and short populated. By default, though, this won't happen: after an INSERT, Active Record only reads back the primary key from the database, so it never sees the short value the trigger filled in. To make this work, I override the save method so that it reads the short attribute back as well. We also need to handle the case where the long URL already exists in the database. The unique index we created in the migration acts as another database-side safeguard that rejects a duplicate insert, so in Rails I use the upsert method and specify what should be updated when a duplicate is found: if the long URL already exists, only its updated_at timestamp is touched instead of creating a new record.

require 'uri'
class Url < ApplicationRecord
  before_validation :set_timestamps
  validate :long_is_a_valid_url

  # Replace the normal INSERT with an upsert so that a duplicate long URL only
  # bumps updated_at, and read back the short code generated by the trigger.
  def save(*_args)
    return false unless valid?

    result = self.class.upsert(attributes.except(*%w[id short]),
                               returning: %i[id short created_at updated_at],
                               unique_by: :long,
                               record_timestamps: false,
                               update_only: [:updated_at])
    update_attributes_from_upsert_result(result)
  rescue StandardError => e
    errors.add(:base, e.message)
    false
  end

  private

  # Copy the columns returned by the upsert (id, short, timestamps) back onto
  # this in-memory object so the caller sees the trigger-generated short code.
  def update_attributes_from_upsert_result(result)
    return false unless result&.first

    assign_attributes(result.first)
    true
  end

  # We bypass Rails' automatic timestamping (record_timestamps: false),
  # so set the timestamps ourselves before validation.
  def set_timestamps
    self.updated_at = Time.now
    self.created_at ||= Time.now
  end

  def long_is_a_valid_url
    uri = URI.parse(long)
    uri.is_a?(URI::HTTPS) && uri.host.present? || raise
  rescue StandardError
    errors.add(:long, 'Should be a valid and secure URL')
  end
end
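
With the model above in place, the behaviour we expect from the Rails console looks roughly like this (output shown as comments; the actual short value is whatever prefix of the MD5 hash turns out to be free):

url = Url.create(long: 'https://dev.to')
url.short  # => a 5-character (or longer, on collision) prefix of the URL's MD5 hash

# Creating the same long URL again does not add a second row: the upsert only
# bumps updated_at on the existing record and returns the same short code.
again = Url.create(long: 'https://dev.to')
again.short == url.short # => true
Url.count                # => 1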

Conclusion

Now, everything should be working smoothly. You can find a working example here, and the source code is available in this GitHub Repository. If you have any suggestions or feedback, please feel free to leave a comment.
