Symbols in Ruby: A deep dive

alexandrecalaca

Alexandre Calaça

Posted on July 16, 2024

Introduction

How's it going, guys?

We've all used symbols in Ruby in various situations, especially as hash keys.

In this article, my goal is to share some of what I have studied. Feel free to drop your comments.


Symbols

Unique Integer Identifier

In Ruby, a symbol is represented internally as a number (specifically, an integer) paired with a name, which is a series of characters or bytes. Each distinct name maps to exactly one numeric identifier, so a given symbol always refers to the same internal entry.

Symbols with the same identifier share the same internal representation

Demonstration

Let's open irb and create two symbols with the same identifier, which is going to be symbol_identifier in the following example:

symbol1 = :symbol_identifier
symbol2 = :symbol_identifier

Now, let's print the object id of each symbol:

symbol1 = :symbol_identifier
symbol2 = :symbol_identifier

puts "Object ID of symbol1: #{symbol1.object_id}"
puts "Object ID of symbol2: #{symbol2.object_id}"

As you can see, they seem to have the same object id, but let's compare them anyway just to make sure:

symbol1 = :symbol_identifier
symbol2 = :symbol_identifier

puts "Object ID of symbol1: #{symbol1.object_id}"
puts "Object ID of symbol2: #{symbol2.object_id}"

symbol1.object_id == symbol2.object_id ? "Object id of symbol1 and symbol2 are equal" : "Object id of symbol1 and symbol2 are not equal"

When you run the previous code in irb, you'll notice that both symbol1 and symbol2 have the same object ID, indicating that they refer to the same internal object.

This demonstrates that symbols with the same identifier share the same internal representation. It doesn't expose the internal numeric ID directly, but it is consistent with each symbol name being stored once and looked up by a unique identifier.
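To make the contrast with strings explicit, here is a small sketch that checks object identity with equal? instead of comparing object ids by hand. The variable names are only illustrative, and String.new is used so the result does not depend on whether frozen string literals are enabled.

```ruby
# Two symbol literals with the same name are the very same object.
first  = :symbol_identifier
second = :symbol_identifier
same_symbol = first.equal?(second) # identity check, not just equality

# Two strings with the same content are equal but are distinct objects.
left  = String.new("symbol_identifier")
right = String.new("symbol_identifier")
same_string    = left.equal?(right)
equal_contents = (left == right)

puts "Same symbol object?    #{same_symbol}"
puts "Same string object?    #{same_string}"
puts "Equal string contents? #{equal_contents}"
```

The symbols pass the identity check, while the strings only pass the content check.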


Object Wrapper for Internal ID Type

Ruby treats symbols as object wrappers for an internal type called ID, which is essentially an integer type. This ID serves as the unique identifier for each symbol.

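Small integers are immediate objects in Ruby: the value lives in the reference itself instead of in a separately allocated object. That makes them cheap to create, copy, and compare, and symbols benefit from being backed by this kind of value.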

By the way, I might cover the topic of immediate objects in another article.

Demonstration

Let's go to irb again and create one symbol:

# The symbol identifier is `object_wrapper`
symbol3 = :object_wrapper
puts "Object id of symbol3 is #{symbol3.object_id}"

Now let's store the object id of the same symbol in an integer variable:

# The symbol identifier is `object_wrapper`
symbol3 = :object_wrapper
puts "Object id of symbol3 is #{symbol3.object_id}"

# integer
integer_identifier = :object_wrapper.object_id
puts "The value of integer_identifier is #{integer_identifier}"

As we can see, the values are the same, but let's add a comparison anyway.

# The symbol identifier is `object_wrapper`
symbol3 = :object_wrapper
puts "Object id of symbol3 is #{symbol3.object_id}"

# integer
integer_identifier = :object_wrapper.object_id
puts "The value of integer_identifier is #{integer_identifier}"

# Comparison
symbol3.object_id == integer_identifier ? "Object id of the symbol is equal to the integer value" : "Object id of the symbol is not equal to the integer value"
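Another way to see the "one identifier per name" idea from plain Ruby is to build the symbol from a string at runtime. The following is a minimal sketch with illustrative variable names; it only uses String#to_sym and Symbol#to_s.

```ruby
# A symbol written as a literal...
from_literal = :object_wrapper

# ...and a symbol created from a string with the same characters.
from_string = "object_wrapper".to_sym

# Both refer to the same symbol, so the identity check succeeds.
interned_once = from_literal.equal?(from_string)

# Going the other way, to_s gives back the symbol's name as a string.
name = from_literal.to_s

puts "Same symbol from literal and to_sym? #{interned_once}"
puts "Name of the symbol: #{name}"
```

However the symbol is produced, the same name always leads to the same symbol.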


Efficiency of Integer Representation

Internally representing symbols as integers (IDs) rather than character strings is more efficient for computers. Manipulating integers is generally faster and requires less memory compared to dealing with strings of characters or bytes.

Let's demonstrate this by using ObjectSpace and Benchmark. Let's start with the ObjectSpace module.

ObjectSpace

The ObjectSpace module contains a number of routines that interact with the garbage collection facility and allow you to traverse all living objects with an iterator.

The objspace library extends the ObjectSpace module and adds several methods to get internal statistic information about object/memory management.

require 'objspace'

symbol4 = :hello
string4 = "hello"

puts "Memory usage of symbol: #{ObjectSpace.memsize_of(symbol4)} bytes"
puts "Memory usage of string: #{ObjectSpace.memsize_of(string4)} bytes"

When you run this code in Ruby, you'll likely observe that the reported memory usage of the symbol is lower than that of the string.

Something else worth noticing is that the result in bytes for the symbol is 0.

That 0 does not mean the symbol is free. ObjectSpace.memsize_of returns 0 for immediate values such as literal symbols and small integers, because there is no separately allocated object to measure. The symbol's name still has to be stored somewhere, but only once, in a table shared by the whole interpreter.

Symbols are immutable and unique, so Ruby can store each one a single time, regardless of how many times it is used. A string literal, by contrast, normally creates a new object every time it is evaluated.
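The "stored once" point can be sketched by creating many strings and many references to one symbol, then counting how many distinct objects exist. This is an illustration rather than a precise memory measurement: the names are made up for the example, and String.new is used to force a fresh string each time.

```ruby
require 'objspace'

# One thousand separate strings versus one thousand uses of one symbol.
strings = Array.new(1_000) { String.new("hello") }
symbols = Array.new(1_000) { :hello }

# Count how many different objects each array actually holds.
distinct_strings = strings.map(&:object_id).uniq.size
distinct_symbols = symbols.map(&:object_id).uniq.size

# Sum the size reported for every string; the symbol is measured once.
string_bytes = strings.sum { |s| ObjectSpace.memsize_of(s) }
symbol_bytes = ObjectSpace.memsize_of(:hello)

puts "Distinct string objects: #{distinct_strings}"
puts "Distinct symbol objects: #{distinct_symbols}"
puts "Bytes reported for all strings: #{string_bytes}"
puts "Bytes reported for the symbol:  #{symbol_bytes}"
```

The strings add up because each one is its own object, while every use of :hello points at the same symbol.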

Benchmark

In Ruby, Benchmark is a module that provides methods to measure and report the time taken to execute code. It is useful for comparing the performance of different pieces of code or for profiling parts of the application to identify bottlenecks.

In the next example, let's check the performance of comparison operations for symbols and strings.

require 'benchmark'

# Define a symbol and a string
symbol5 = :hello
string5 = "hello"

# Benchmark symbol operations
symbol_time = Benchmark.realtime do
  1_000_000.times { symbol5 == :hello }
end

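# Benchmark string operations
# Note: unless frozen string literals are enabled, the "hello" literal below
# creates a new String on every iteration, so this timing includes
# allocation as well as comparison.
string_time = Benchmark.realtime do
  1_000_000.times { string5 == "hello" }
end

puts "Time taken for symbol operations: #{symbol_time} seconds"
puts "Time taken for string operations: #{string_time} seconds"
puts "Ratio of string time to symbol time: #{string_time / symbol_time}"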
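A string literal inside the timed block may allocate a new string on each iteration, which mixes allocation cost into the comparison timing. The variation below is a sketch that creates both strings once, outside the timed blocks, so that only the comparison is measured. The variable names are illustrative, and the numbers will differ between machines and Ruby versions.

```ruby
require 'benchmark'

iterations = 1_000_000

symbol = :hello
string = String.new("hello")
other  = String.new("hello") # created once, outside the timed block

# Time only the comparisons themselves.
symbol_time = Benchmark.realtime { iterations.times { symbol == :hello } }
string_time = Benchmark.realtime { iterations.times { string == other } }

puts format("Symbol comparison: %.4f seconds", symbol_time)
puts format("String comparison: %.4f seconds", string_time)
puts format("Ratio of string time to symbol time: %.2f", string_time / symbol_time)
```

With a string this short the gap may be small, so it is worth running the benchmark several times before drawing conclusions.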


Immutability

Strings can be modified; symbols cannot.

:hello.upcase   # returns a new symbol, :HELLO
:hello.upcase!  # raises NoMethodError

symbol = :hey
symbol = symbol << " adding info"  # raises NoMethodError


In Ruby, methods ending with an exclamation mark (bang methods) typically indicate that the method will modify the object in place. However, not all methods have a bang version.

For example, when called on a symbol, upcase returns a new symbol with all characters converted to uppercase, but there isn't a built-in upcase! method to modify the symbol in place.

Because symbols are immutable, :hello.upcase! in our code raises a NoMethodError: the upcase! method doesn't exist for symbols.

Similarly, symbol = symbol << " adding info" will raise a NoMethodError because there's no << method defined for symbols.
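The behaviour described above can be checked without crashing the irb session by rescuing the error. This is a small sketch with illustrative variable names; it also shows the usual workaround, which is to build a new symbol from a string.

```ruby
symbol = :hey

# Symbols are always frozen.
frozen = symbol.frozen?

# Symbol has no << method, so the call raises NoMethodError.
append_error =
  begin
    symbol << " adding info"
    nil
  rescue NoMethodError => e
    e.class
  end

# The non-bang method returns a different symbol and leaves the original alone.
shouted = symbol.upcase

# To "extend" a symbol, build a new one from a string.
extended = "#{symbol}_there".to_sym

puts "Frozen? #{frozen}"
puts "Error raised by <<: #{append_error}"
puts "Original: #{symbol.inspect}, upcased: #{shouted.inspect}"
puts "New symbol: #{extended.inspect}"
```

Every "change" here produces a different symbol; the original :hey is never altered.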


Done


Conclusion

Symbols are a powerful and efficient tool in Ruby, especially useful when you need immutable identifiers or keys, such as in hashes.

Their unique integer representation, memory efficiency, and immutability make them ideal for scenarios where performance and consistency are critical.

Understanding and leveraging symbols can lead to cleaner, more efficient Ruby code.



Reach out to me

Github

LinkedIn

Twitter

Dev.to

Youtube


Final thoughts

Thank you for reading this article.

If you have any questions, thoughts, suggestions, or corrections, please share them with us.

We appreciate your feedback and look forward to hearing from you.

Feel free to suggest topics for future blog articles. Until next time!

