Michael Chaney
Posted on August 31, 2024
I ran across the excellent "auto_strip_attributes" gem years ago, and it's been a staple gem that gets added to my Gemfile right off the bat when making a new project. I even have a standard initializer that adds some extra functionality to it:
AutoStripAttributes::Config.setup do
  set_filter(fix_curly_quotes: false) do |value|
    !value.blank? && value.respond_to?(:gsub) ? value.gsub(/[\u201c\u201d]/, '"').gsub(/[\u2018\u2019]/, '\'') : value
  end
  set_filter(downcase: false) do |value|
    !value.blank? && value.respond_to?(:downcase) ? value.downcase : value
  end
  set_filter(unicode_normalize: false) do |value|
    !value.blank? && value.respond_to?(:unicode_normalize) ? value.unicode_normalize : value
  end
end
These should be self-explanatory, but just in case:
- fix_curly_quotes: Changes Unicode curly double quotes “” to a standard ASCII quotation mark ", and changes Unicode curly single quotes ‘’ to a standard ASCII apostrophe.
- downcase: Forces text to lowercase.
- unicode_normalize: Normalizes Unicode characters to the "nfc" form.
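If you want to see what the three filters do without loading Rails, here is a minimal plain-Ruby sketch. The method names (`fix_curly_quotes`, `downcase_filter`, `unicode_normalize_filter`) are made up for illustration, and an `is_a?(String)` check stands in for the `blank?`/`respond_to?` guards used in the initializer.

```ruby
# Plain-Ruby stand-ins for the three filters; the method names are hypothetical.
def fix_curly_quotes(value)
  return value unless value.is_a?(String)
  # Curly double quotes become ", curly single quotes become '
  value.gsub(/[\u201c\u201d]/, '"').gsub(/[\u2018\u2019]/, "'")
end

def downcase_filter(value)
  value.is_a?(String) ? value.downcase : value
end

def unicode_normalize_filter(value)
  # String#unicode_normalize uses the :nfc form unless told otherwise
  value.is_a?(String) ? value.unicode_normalize : value
end

curly = "\u201cIt\u2019s fine\u201d"
puts fix_curly_quotes(curly)             # prints "It's fine" with ASCII quotes

decomposed = "Beyonce\u0301"             # "e" followed by a combining acute accent
composed   = unicode_normalize_filter(decomposed)
puts composed == "Beyonc\u00e9"          # true: now a single precomposed character
puts downcase_filter("BEYONC\u00c9")     # lowercases the accented letter as well
```

The normalization step matters for lookups: the decomposed and precomposed spellings look identical on screen but compare as different strings until they are normalized.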
This has been an awesome tool in the tool belt for years now. With the advent of Rails 7.1, we now have "normalizes" built in to Active Record. It can take the place of auto_strip_attributes, but there are some differences that require your attention if you're changing from one to the other.
Here's a standard setup for auto_strip_attributes that is pulled from my code:
class Track < ApplicationRecord
  auto_strip_attributes :title, :original_title, :artist, squish: true, nullify: false, fix_curly_quotes: true
  auto_strip_attributes :lyric_notes, nullify: true, fix_curly_quotes: true, squish: true
  auto_strip_attributes :lyrics, nullify: true
  ...
end
This is all normal stuff. In general, you don't want to "squish" multi-line text blocks, and if a field can be null then you add "nullify: true" so that it is stored as "null" when there is no value.
One serious "gotcha" here is that there is no checking for bad filter names. If you misspell a filter name it will be silently ignored.
The code for this gem is really simple, and you can read through it in the gem's source. It creates a "before_validation" callback to apply the filters right before validation.
Keep that in mind.
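To make that timing concrete, here is a small plain-Ruby sketch. `ValidationTimeTrack` is a hypothetical stand-in, not the gem's code: it only mimics the idea that cleanup runs as part of validation, so the attribute keeps its raw value until then.

```ruby
# Hypothetical stand-in for a model cleaned by a before-validation hook.
class ValidationTimeTrack
  attr_accessor :title

  def initialize(title)
    @title = title
  end

  # Mimics "run the before_validation callback, then validate".
  def valid?
    self.title = title.strip if title.respond_to?(:strip)
    !title.nil? && !title.empty?
  end
end

track = ValidationTimeTrack.new("  Halo  ")
raw_title = track.title      # still "  Halo  ": nothing has run yet
track.valid?
clean_title = track.title    # "Halo": cleaned only once validation ran

puts raw_title.inspect
puts clean_title.inspect
```

Anything that reads the attribute between assignment and validation sees the unstripped value.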
Starting in Rails 7.1 we have "normalizes", a declaration we can make in our models. It provides similar functionality, but in a different manner.
From a different project, also with a "Track" model:
class Track < ApplicationRecord
  normalizes :title, with: -> (value) { value.strip.unicode_normalize }
  normalizes :description, with: -> (value) { value.strip.unicode_normalize }
  normalizes :lyrics, with: -> (value) { value.strip.unicode_normalize }
  normalizes :lyrics_chorus_only, with: -> (value) { value.strip.unicode_normalize }
  ...
end
This is pretty similar, but there are differences:
- There are no "built-in" normalizers. You can munge the string with any methods on string or create a lambda/proc that'll do what you need.
- There are no "default" normalizers. auto_strip_attributes strips by default, "normalizes" has no defaults.
- By default, nils are ignored, so you don't have to worry about getting something other than a string. However, you can use the "apply_to_nil" option to force your normalizer to be called even with nil values.
- Normalizers are registered per attribute (a single declaration can name several attributes). This is interesting because the model class has a method "normalize_value_for" which accepts an attribute name plus a value, allowing you to "normalize" an arbitrary string as if it were assigned to the named attribute.
- Big one: the normalizer is applied when the attribute is assigned a value.
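The points above can be illustrated with a toy model of the behavior in plain Ruby. This is not how Rails implements it: `ToyNormalization` and `ToyTrack` are hypothetical names, and the sketch only mirrors the documented shape of `normalizes` (a `with:` callable, the `apply_to_nil` option, and a class-level `normalize_value_for`).

```ruby
# Toy stand-in for the behavior described above; NOT Rails' implementation.
module ToyNormalization
  def normalizes(*names, with:, apply_to_nil: false)
    @normalizers ||= {}
    names.each do |name|
      @normalizers[name] = { with: with, apply_to_nil: apply_to_nil }
      attr_reader name
      # The writer normalizes immediately, at assignment time.
      define_method("#{name}=") do |value|
        instance_variable_set("@#{name}", self.class.normalize_value_for(name, value))
      end
    end
  end

  # Runs a value through the normalizer registered for the named attribute.
  def normalize_value_for(name, value)
    rule = @normalizers.fetch(name)
    return value if value.nil? && !rule[:apply_to_nil]  # nil is skipped by default
    rule[:with].call(value)
  end
end

class ToyTrack
  extend ToyNormalization
  normalizes :title, with: ->(value) { value.strip.unicode_normalize }
  normalizes :subtitle, with: ->(value) { value.to_s.strip }, apply_to_nil: true
end

track = ToyTrack.new
track.title = "  Crazy in Love  "
puts track.title.inspect        # already stripped, with no validation step involved

track.title = nil               # the normalizer is never called for nil here
track.subtitle = nil            # apply_to_nil: true, so the normalizer sees nil
puts track.title.inspect
puts track.subtitle.inspect

puts ToyTrack.normalize_value_for(:title, "  Halo ").inspect
```

Note that the `:title` lambda would raise on nil if it were ever called with it, which is why skipping nil by default is convenient.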
The last point has an interesting and useful side effect. Consider this code:
class Artist < ApplicationRecord
  normalizes :name, with: -> name { name.strip.downcase.unicode_normalize }
end
Artist.create(name: "Beyoncé")
At this point, we have a record where the name is "beyoncé". What does this code do?
beyonce = Artist.find_or_initialize_by(name: " Beyoncé ")
This is the big difference between the two. If you're using "auto_strip_attributes" here, you'll get a new record. With "normalizes", the existing record is found instead, because " Beyoncé " is run through the normalizer before the find is done.
This isn't a gotcha, though. It's very desirable behavior, as it lets you do this lookup without duplicating the normalization code somewhere else.
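Here is a rough in-memory sketch of why the lookup succeeds. `ToyArtistStore` and `NORMALIZE_NAME` are hypothetical stand-ins for the table and the normalizer, not Active Record itself. The idea is only that the same normalizer runs on the value being stored and on the value being searched for.

```ruby
# Hypothetical in-memory stand-in for the artists table; not Active Record.
NORMALIZE_NAME = ->(name) { name.strip.downcase.unicode_normalize }

class ToyArtistStore
  def initialize
    @rows = []
  end

  # Stores the normalized name, as assignment-time normalization would.
  def create(name:)
    row = { name: NORMALIZE_NAME.call(name), persisted: true }
    @rows << row
    row
  end

  # Normalizes the query value first, then looks for an exact match.
  def find_or_initialize_by(name:)
    wanted = NORMALIZE_NAME.call(name)
    @rows.find { |row| row[:name] == wanted } || { name: wanted, persisted: false }
  end
end

store = ToyArtistStore.new
store.create(name: "Beyonc\u00e9")                            # stored in lowercase

found = store.find_or_initialize_by(name: " Beyonc\u00e9 ")
puts found[:persisted]   # true: padded, capitalized input matched the stored row

other = store.find_or_initialize_by(name: "Solange")
puts other[:persisted]   # false: nothing stored under that name
```

If the query value were compared raw, as happens when cleanup only runs before validation, the padded " Beyoncé " would not match and a second record would be initialized.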
These are both great ways to accomplish normalizing your attributes, which is a very important aspect of bringing data from outside into your database. I am moving to "normalizes" for all new code simply because it's one less gem, but I still have lots of code using "auto_strip_attributes".