Symbols in Ruby: A deep dive
Alexandre Calaça
Posted on July 16, 2024
Introduction
How's it going, guys?
We've all used symbols in Ruby in various situations, especially with hashes.
In this article, my goal is to share some of what I have studied. Feel free to drop your comments.
Symbols
Unique Integer Identifier
In Ruby, a symbol is represented internally as a number (specifically, an integer) with an attached identifier, which is a series of characters or bytes. This means that each symbol has a unique numeric identifier associated with it.
Symbols with the same identifier share the same internal representation.
Demonstration
Let's open irb and create two symbols with the same identifier, which is going to be symbol_identifier in the following example:
symbol1 = :symbol_identifier
symbol2 = :symbol_identifier
Now, let's print the object id of each symbol:
symbol1 = :symbol_identifier
symbol2 = :symbol_identifier
puts "Object ID of symbol1: #{symbol1.object_id}"
puts "Object ID of symbol2: #{symbol2.object_id}"
As you can see, they seem to have the same object id, but let's compare them anyway just to make sure:
symbol1 = :symbol_identifier
symbol2 = :symbol_identifier
puts "Object ID of symbol1: #{symbol1.object_id}"
puts "Object ID of symbol2: #{symbol2.object_id}"
symbol1.object_id == symbol2.object_id ? "Object id of symbol1 and symbol2 are equal" : "Object id of symbol1 and symbol2 are not equal"
When you run the previous code in irb, you'll notice that both symbol1 and symbol2 have the same object ID, indicating that they refer to the same internal object.
This demonstrates that symbols with the same identifier share the same internal representation: Ruby keeps a single object per symbol name, which is consistent with symbols being backed by unique numeric identifiers.
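To contrast this behavior with strings, here is a minimal sketch. The helper name same_object_report is made up for this illustration; equal? is Ruby's built-in identity check, and String.new is used so that each string is guaranteed to be a fresh object.

```ruby
# Hypothetical helper (not part of the article's code): compares the
# identity of two symbols and two strings built from the same characters.
def same_object_report(name)
  first_symbol  = name.to_sym
  second_symbol = name.to_sym
  first_string  = String.new(name) # String.new always allocates a new object
  second_string = String.new(name)

  {
    symbols_identical: first_symbol.equal?(second_symbol),
    strings_identical: first_string.equal?(second_string),
    strings_equal_by_value: first_string == second_string
  }
end

p same_object_report("symbol_identifier")
```

The two symbols should be reported as identical, while the two strings are equal by value but are different objects.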
Object Wrapper for Internal ID Type
Ruby treats symbols as object wrappers for an internal type called ID, which is essentially an integer type. This ID serves as the unique identifier for each symbol.
Since small integers are immediate objects in Ruby (their value is stored directly in the reference rather than in a separately allocated object), working with them tends to be faster.
By the way, I might cover the topic of immediate objects in another article.
Demonstration
Let's go to irb again and create one symbol:
# The symbol identifier is `object_wrapper`
symbol3 = :object_wrapper
puts "Object id of symbol3 is #{symbol3.object_id}"
Let's create an integer from the symbol identifier:
# The symbol identifier is `object_wrapper`
symbol3 = :object_wrapper
puts "Object id of symbol3 is #{symbol3.object_id}"
# integer
integer_identifier = :object_wrapper.object_id
puts "The value of integer_identifier is #{integer_identifier}"
As we can see, the values are the same, but let's create a comparison case anyway.
# The symbol identifier is `object_wrapper`
symbol3 = :object_wrapper
puts "Object id of symbol3 is #{symbol3.object_id}"
# integer
integer_identifier = :object_wrapper.object_id
puts "The value of integer_identifier is #{integer_identifier}"
# Comparison
symbol3.object_id == integer_identifier ? "Object id of the symbol is equal to the integer value" : "Object id of the symbol is not equal to the integer value"
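As a small extension of the demonstration above, the following sketch reaches the same symbol through three different routes and collects the object id each time. The method name symbol_id_routes is an assumption made for this example; to_sym and intern are standard String methods.

```ruby
# Hypothetical helper: returns the object_id of :object_wrapper
# obtained through a literal, a String conversion, and String#intern.
def symbol_id_routes
  literal_id   = :object_wrapper.object_id
  converted_id = "object_wrapper".to_sym.object_id
  interned_id  = ("object_" + "wrapper").intern.object_id
  [literal_id, converted_id, interned_id]
end

ids = symbol_id_routes
puts "All routes give the same id: #{ids.uniq.size == 1}"
puts "The id is an Integer: #{ids.first.is_a?(Integer)}"
```

Both lines should print true: however the symbol is obtained, Ruby hands back the same object with the same integer id.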
Efficiency of Integer Representation
Internally representing symbols as integers (IDs) rather than character strings is more efficient for computers. Manipulating integers is generally faster and requires less memory compared to dealing with strings of characters or bytes.
Let's demonstrate this by using ObjectSpace and Benchmark. Let's start with the ObjectSpace module.
ObjectSpace
The ObjectSpace module contains a number of routines that interact with the garbage collection facility and allow you to traverse all living objects with an iterator.
The objspace library extends the ObjectSpace module and adds several methods to get internal statistic information about object/memory management.
require 'objspace'
symbol4 = :hello
string4 = "hello"
puts "Memory usage of symbol: #{ObjectSpace.memsize_of(symbol4)} bytes"
puts "Memory usage of string: #{ObjectSpace.memsize_of(string4)} bytes"
When you run this code in Ruby, you'll likely observe that the memory usage of the symbol is significantly lower than that of the string.
Something else noticeable is that the reported size for the symbol is 0 bytes.
This is because symbols are represented internally as integers, so a literal symbol like :hello does not need a separately allocated object the way a string does.
Symbols are immutable and unique, and the Ruby interpreter optimizes their memory usage by storing them in a shared internal table.
This ensures that each symbol is only stored once, regardless of how many times it is used.
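The "stored only once" point can be checked with a short sketch that counts how many distinct objects sit behind a collection. The helper distinct_objects is a name invented for this example, and String.new is used to force a new string each time.

```ruby
# Hypothetical helper: counts how many distinct objects back a
# collection, using object_id as the identity.
def distinct_objects(collection)
  collection.map(&:object_id).uniq.size
end

symbols = Array.new(1_000) { :hello }
strings = Array.new(1_000) { String.new("hello") }

puts "Distinct symbol objects: #{distinct_objects(symbols)}" # 1
puts "Distinct string objects: #{distinct_objects(strings)}" # 1000
```

A thousand references to :hello all point at one object, while a thousand strings with the same content are a thousand separate objects.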
Benchmark
In Ruby, Benchmark is a module that provides methods to measure and report the time taken to execute code. It is useful for comparing the performance of different pieces of code or for profiling parts of the application to identify bottlenecks.
In the next example, let's check the performance of symbols and strings when it comes to comparison.
require 'benchmark'
# Define a symbol and a string
symbol5 = :hello
string5 = "hello"
# Benchmark symbol operations
symbol_time = Benchmark.realtime do
  1_000_000.times { symbol5 == :hello }
end
# Benchmark string operations
string_time = Benchmark.realtime do
  1_000_000.times { string5 == "hello" }
end
puts "Time taken for symbol operations: #{symbol_time} seconds"
puts "Time taken for string operations: #{string_time} seconds"
puts "String comparison took #{string_time / symbol_time} times as long as symbol comparison"
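A single timing like the one above can be noisy, so here is a hedged variation that repeats each measurement and keeps the fastest run. The helper best_time and the iteration count are assumptions made for this sketch; it relies only on Benchmark.realtime, and the actual numbers will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical helper: runs the block several times and keeps the
# fastest run, which reduces noise from other processes.
def best_time(runs: 5, &block)
  Array.new(runs) { Benchmark.realtime(&block) }.min
end

iterations   = 200_000
symbol_value = :hello
string_value = "hello"

symbol_best = best_time { iterations.times { symbol_value == :hello } }
string_best = best_time { iterations.times { string_value == "hello" } }

puts "Best symbol time: #{symbol_best} seconds"
puts "Best string time: #{string_best} seconds"
```

Taking the minimum of several runs is one common way to compare two small snippets; it does not replace a proper benchmarking tool.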
Immutability
Strings can be modified; symbols cannot.
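:hello.upcase   # returns the symbol :HELLO; :hello itself is unchanged
:hello.upcase!  # raises NoMethodError
symbol = :hey
symbol = symbol << " adding info" # raises NoMethodError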
In Ruby, methods ending with an exclamation mark (bang methods) typically indicate that the method will modify the object in place. However, not all methods have a bang version.
For example, Symbol#upcase returns the symbol whose name is the uppercase version of the receiver (:hello.upcase gives :HELLO), but there isn't a built-in upcase! method to modify the symbol in place.
Symbols are immutable, so in our code :hello.upcase! will raise a NoMethodError because the upcase! method doesn't exist for symbols.
Similarly, symbol = symbol << " adding info" will raise a NoMethodError because there's no << method defined for symbols.
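The snippet above stops at the first error when run as a script, so here is a minimal sketch that rescues each failure and keeps going. The helper attempt is a made-up name for this example; it also shows one way to get a "modified" symbol, which is to build a different symbol from the name.

```ruby
# Hypothetical helper: runs the block and returns either its result
# or the class of the NoMethodError it raised.
def attempt
  yield
rescue NoMethodError => error
  error.class
end

p attempt { :hello.upcase }          # :HELLO
p attempt { :hello.upcase! }         # NoMethodError
p attempt { :hey << " adding info" } # NoMethodError
p :hello.frozen?                     # true

# "Changing" a symbol means building a different one from its name.
extended = "#{:hey} adding info".to_sym
p extended                           # :"hey adding info"
```

The original :hey symbol is untouched throughout; extended is simply another symbol.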
Conclusion
Symbols are a powerful and efficient tool in Ruby, especially useful when you need immutable identifiers or keys, such as in hashes.
Their unique integer representation, memory efficiency, and immutability make them ideal for scenarios where performance and consistency are critical.
Understanding and leveraging symbols can lead to cleaner, more efficient Ruby code.
Final thoughts
Thank you for reading this article.
If you have any questions, thoughts, suggestions, or corrections, please share them with us.
We appreciate your feedback and look forward to hearing from you.
Feel free to suggest topics for future blog articles. Until next time!