Published November 1, 2021 Updated November 7, 2021
Ruby Structs

Ruby’s Struct is one of several powerful core classes which are often overlooked and underutilized compared to the more popular Hash class. This is a shame and I’m often surprised when working with others who don’t know about structs or, worse, abuse them entirely. I’d like to set the record straight by sharing the joy of structs with you and how you can leverage their power to improve your Ruby code further. 🚀

Definition

To begin, structs are a hybrid between a class and a hash and are meant to be a container for data. They are best used to give a name to an object which encapsulates one or more data attributes. This allows you to avoid using arrays or hashes, a habit which leads to Primitive Obsession code smells.

To illustrate further, let’s consider an object which is a point on a graph and consists of x and y coordinates. Here’s how you might define that object using an Array (tuple), Hash, Struct, Class, and OpenStruct:

# Array
point = [1, 2]
point.first                        # 1
point.last                         # 2

# Hash
point = {x: 1, y: 2}
point[:x]                          # 1
point[:y]                          # 2

# Struct
Point = Struct.new :x, :y
point = Point.new 1, 2
point.x                            # 1
point.y                            # 2

# Class
class Point
  attr_accessor :x, :y

  def initialize x, y
    @x = x
    @y = y
  end
end

point = Point.new 1, 2
point.x                            # 1
point.y                            # 2

# OpenStruct
require "ostruct"
point = OpenStruct.new x: 1, y: 2
point.x                            # 1
point.y                            # 2

Based on the above you can immediately see the negative effects of Primitive Obsession with the Array and Hash instances. With the Array tuple, you can only use #first to obtain the value of x and #last to obtain the value of y. Those are terrible method names to represent a point object because the methods are not self describing. With the Hash, the methods are more readable but you have to message #[] with the key of the value you want to obtain which isn’t ideal when having to type the brackets each time you send a message.

When we move away from the Array and Hash by switching to the Struct, you can see the elegance of messaging our point instance via the #x or #y methods to get the desired values.

The Class example has the same capabilities as the struct but at the cost of being more cumbersome. Plus, the class is more heavyweight compared to the struct. There are additional reasons why a Struct is better than a Class, like performance, but I’ll expand upon this more later.

Finally, we have OpenStruct which ships with Ruby as part of the standard library rather than core (hence the require). At this point, you might be thinking: "Hey, looks like an OpenStruct is better in terms of melding Struct and Class syntax and functionality." Well, you’d be very wrong in terms of performance but, as mentioned with Class usage, I promise to expand upon this more later.
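
To make the Struct versus Class comparison above more concrete, here is a minimal sketch of what a struct answers without any extra code. The StructPoint and ClassPoint names are hypothetical and only exist so both definitions can live side by side:

```ruby
# Hypothetical names so both definitions can coexist in one file.
StructPoint = Struct.new :x, :y

class ClassPoint
  attr_accessor :x, :y

  def initialize x, y
    @x = x
    @y = y
  end
end

struct_point = StructPoint.new 1, 2
class_point = ClassPoint.new 1, 2

# The struct converts to primitives without extra code.
struct_point.to_a                      # [1, 2]
struct_point.to_h                      # {x: 1, y: 2}
StructPoint.members                    # [:x, :y]

# Value equality is built into the struct while the class compares by identity.
struct_point == StructPoint.new(1, 2)  # true
class_point == ClassPoint.new(1, 2)    # false

# The class would need conversions like these written by hand.
class_point.respond_to? :to_h          # false
```

The class can be taught all of this, of course, but each capability is more code to write and maintain.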

Construction

Now that we’ve gone over what a struct is and even, briefly, how it can be advantageous, let’s delve into construction.

New

There are two ways to define a struct: with positional arguments or with keyword arguments. For example, here’s the same struct defined first with positional arguments (as used earlier) and then with keyword arguments:

# Positional
Point = Struct.new :x, :y
point = Point.new 1, 2

# Keyword
Point = Struct.new :x, :y, keyword_init: true
point = Point.new x: 1, y: 2

The difference is using keyword_init: true to construct a struct via keywords instead of positional arguments. Astute readers will recognize this as a Boolean Parameter Control Couple code smell, which is a shame because the Ruby Core team could have easily introduced a method other than .new to handle construction of a keyworded struct rather than use a boolean. Anyway, at a high level, here are the advantages and disadvantages of each:

  • Positional Arguments

    • Advantages: Concise with minimal typing to construct.

    • Disadvantages: Must ensure each argument is in the same position in which it was defined.

  • Keyword Arguments

    • Advantages: Arguments can be passed in out of order or using any subset of arguments.

    • Disadvantages: Must use the key and value for each argument.

Here’s a closer look:

# Positional
Point.new 1           # <struct Point x=1, y=nil>
Point.new nil, 2      # <struct Point x=nil, y=2>

# Keyword
Point.new y: 2, x: 1  # <struct Point x=1, y=2>
Point.new y: 2        # <struct Point x=nil, y=2>

While keyword arguments require more typing to define your key and value, you are free of positional constraints and can even construct with a subset of attributes which isn’t possible with positional arguments unless you fill in all positions prior to the position you desire to set. As for me, I tend to default to using a struct with keywords more than a struct with positional arguments. Only in rare cases where all arguments are necessary or where all preceding arguments will at least be filled in will I use a positional struct.
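
The trade-offs above also show up in how each form fails. The following is a small sketch, where PositionalPoint, KeywordPoint, and the rejected? helper are hypothetical names used for illustration only:

```ruby
PositionalPoint = Struct.new :x, :y
KeywordPoint = Struct.new :x, :y, keyword_init: true

# Hypothetical helper: answers true when the block raises an ArgumentError.
def rejected?
  yield
  false
rescue ArgumentError
  true
end

# Positional: swapped arguments are silently accepted.
PositionalPoint.new(2, 1).x                  # 2

# Positional: too many arguments are rejected.
rejected? { PositionalPoint.new 1, 2, 3 }    # true

# Keyword: unknown keys are rejected.
rejected? { KeywordPoint.new x: 1, z: 3 }    # true

# Keyword: positional arguments are rejected.
rejected? { KeywordPoint.new 1, 2 }          # true
```

In other words, keywords catch typos and misordered arguments at construction time while positional arguments only catch a wrong argument count.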

Subclass

So far you’ve only seen class construction using .new but you can use a subclass as well. Example:

class Inspectable < Struct
  def inspect = to_h.inspect
end

# Positional
Point = Inspectable.new :x, :y
point = Point.new 1, 2
point.inspect                                       # "{:x=>1, :y=>2}"

# Keyword
Point = Inspectable.new :x, :y, keyword_init: true
point = Point.new x: 1, y: 2
point.inspect                                       # "{:x=>1, :y=>2}"

Subclassing can be useful in rare cases but you can also see it’s not as much fun to use due to the additional lines of code (even with the included/overwritten #inspect method). While subclassing is good to be aware of, use with caution because inheritance carries a lot of baggage with it. You’re much better off using Dependency Inversion which is the D in SOLID design. So compose your objects rather than inheriting them.

Anonymous

Of all forms of Struct construction mentioned so far, anonymous structs are the quickest. Example:

# Positional
point = Struct.new(:x, :y).new 1, 2

# Keyword
point = Struct.new(:x, :y, keyword_init: true).new x: 1, y: 2

Only problem with the above is, well, anonymous structs are only useful within the scope they are defined as temporary and short lived objects. For anything more permanent, you’ll need to define a constant for improved reuse. That said, anonymous structs can be handy in a pinch for one-off situations.
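
To see what "anonymous" means in practice, here is a short sketch comparing a struct class with no constant against one assigned to a constant. NamedPoint and RegisteredPoint are hypothetical names:

```ruby
# Without a constant, the struct class has no name.
anonymous = Struct.new :x, :y
anonymous.name                               # nil
anonymous.new(1, 2).inspect                  # "#<struct x=1, y=2>"

# Assigning the class to a constant gives it a name.
NamedPoint = Struct.new :x, :y
NamedPoint.name                              # "NamedPoint"
NamedPoint.new(1, 2).inspect                 # "#<struct NamedPoint x=1, y=2>"

# A leading String argument registers the class as a constant under Struct instead.
Struct.new("RegisteredPoint", :x, :y).name   # "Struct::RegisteredPoint"
```

The last form is worth knowing about because the class is no longer anonymous at all: it is globally reachable as Struct::RegisteredPoint.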

Initialization

Now that we know how to construct a Struct, let’s move on to initialization which consists of two forms. We’ll continue with our Point struct for these examples.

New

As shown earlier — and with nearly all Ruby objects — you can initialize a struct via the .new class method:

point = Point.new x: 1, y: 2

The above is great, straightforward, and what every Ruby engineer is used to. But what if there was a better way?

Brackets

Turns out there is a shorter way to initialize a struct and that’s via square brackets:

point = Point[x: 1, y: 2]

This is my favorite method and for two important reasons:

  1. Brackets require three fewer characters to type. ⚡️

  2. Brackets signify, more clearly, that you are working with a struct versus a class, which improves readability. Calling out structs like this when reading through code makes a big difference over time and I encourage you to do the same.

Defaults

Structs, as with classes, can set defaults. The way to do this is to override the #initialize method as shown below:

Point = Struct.new :x, :y, keyword_init: true do
  def initialize *arguments
    super

    self[:x] ||= 0
    self[:y] ||= 0
  end
end

With the above, I’ve effectively made it so any new Point instance will default to x = 0, y = 0:

point = Point.new
point.x            # 0
point.y            # 0

There are three things to call out with the above:

  1. You must call super in order to forward the incoming arguments to the superclass. You are a subclass of Struct, after all, so you can lean on inheritance to avoid reinventing attribute assignment.

  2. Once all attributes are assigned, refer to your self to set the attribute.

  3. Use ||= to set a default only when the original value is missing (i.e. nil). Otherwise, whatever argument was passed in is kept, which avoids surprising behavior. Keep in mind that ||= isn’t foolproof because explicitly passing in nil or false will also trigger the default assignment and replace the value you passed in. An explicit nil? check can be a stronger alternative if ||= doesn’t serve your situation.

I use this pattern when needing an instance of struct with safe defaults. This can also be abused so keep your defaults simple and without side effects. If you don’t need a default or can’t think of a safe default, then don’t override the initializer unnecessarily.
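
Here is a sketch of the ||= caveat using a boolean attribute, where it matters most. LooseToggle and StrictToggle are hypothetical structs and the nil? check is only one possible alternative:

```ruby
# Pitfall: `||=` treats an explicit false the same as a missing value.
LooseToggle = Struct.new :enabled, keyword_init: true do
  def initialize(**attributes)
    super
    self[:enabled] ||= true
  end
end

# Alternative: only apply the default when the attribute is nil.
StrictToggle = Struct.new :enabled, keyword_init: true do
  def initialize(**attributes)
    super
    self[:enabled] = true if enabled.nil?
  end
end

LooseToggle.new.enabled                    # true
LooseToggle.new(enabled: false).enabled    # true (the explicit false was replaced)
StrictToggle.new.enabled                   # true
StrictToggle.new(enabled: false).enabled   # false
```

Neither version can tell an explicit nil apart from a missing argument, so if nil is a meaningful value for your attribute you’ll need a different approach.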

Transformations

Along the same lines as initialization is the ability for structs to transform an incoming data type to itself. This is a variant of the Adapter Pattern but instead of having a second object which adapts one object into an instance of your struct, you have the struct do the transformation. For example, consider the following:

module Graphs
  POINT_KEY_MAP = {horizontal: :x, vertical: :y}.freeze

  Point = Struct.new(*POINT_KEY_MAP.values, keyword_init: true) do
    def self.for(location, key_map: POINT_KEY_MAP) = new(location.transform_keys(key_map))
  end
end

With the above, you can now transform an incoming Hash into the Struct we need:

location = {horizontal: 1, vertical: 2}
point = Graphs::Point.for location
point.inspect                            # <struct Graphs::Point x=1, y=2>

This is a lot of power for a small amount of code because you can now convert one data type — which looks roughly similar to your struct but has the wrong keys — into your struct which is properly named and has a better interface. Let’s break this down further:

  1. The Graphs module gives you a namespace to group related constants (i.e. POINT_KEY_MAP and Point).

  2. The POINT_KEY_MAP constant allows you to define — in one place — the mapping of keys you need to transform. The hash keys are the foreign keys to transform while the hash values are the keys used to define your struct’s attributes.

  3. The .for class method allows you to consume the location hash along with an optional key map for transforming the foreign keys. Since location is a hash, we can ask it to transform its keys using the provided key map. The result is then used to initialize the struct with the newly transformed keys and values of the original hash.

The reason this is powerful is because, in Domain Driven Design, you have a single method — .for in this case — serving as a boundary for converting a foreign type into a struct with more flexibility and reuse with minimal effort. This is handy in situations where you might be dealing with an external API or any kind of similar data which is almost shaped the way you need but isn’t quite right.

I should point out that if .for isn’t to your liking, you can use .with, .for_location, .with_location, and so forth for the class method name. I tend to stick with short and simply named transforming method names like .for or .with until I find I need something more specific.

You can take all of this too far and put too much responsibility on your struct. Should that happen, consider crafting an adapter class that consumes and converts the incoming data into an instance of your struct. Otherwise, for simple situations like the above example, this is a nice way to give your struct extra superpowers with concise syntactic sugar.
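
For comparison, here is one way the adapter class alternative might look. This is a sketch only: LocationAdapter, its KEY_MAP constant, and the #call method name are all hypothetical choices rather than an established convention:

```ruby
Point = Struct.new :x, :y, keyword_init: true

# Hypothetical adapter which keeps key translation out of the struct.
class LocationAdapter
  KEY_MAP = {horizontal: :x, vertical: :y}.freeze

  def initialize key_map: KEY_MAP, model: Point
    @key_map = key_map
    @model = model
  end

  # Renames foreign keys (unmapped keys are left as-is) and builds the struct.
  def call location
    @model.new(**location.transform_keys { |key| @key_map.fetch key, key })
  end
end

adapter = LocationAdapter.new
adapter.call({horizontal: 1, vertical: 2})   # #<struct Point x=1, y=2>
```

The struct stays a plain data container while the adapter owns the knowledge of the foreign format, which makes it easier to add validation or support multiple sources later.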

Whole Values

Another superpower of structs is that they are whole value objects by default. This is lovely because you can have two or more structs with the same values and they’ll be equal even though their object IDs are different. Here’s an example where, again, we reach for our Point struct:

a = Point[x: 1, y: 2]
b = Point[x: 1, y: 2]

a == b      # true
a === b     # true
a.eql? b    # true
a.equal? b  # false

This is exactly what makes the Versionaire gem so powerful: it provides a primitive, semantic, version type for use within your Ruby applications. Example:

a = Version major: 1, minor: 2, patch: 3  # <struct Versionaire::Version major=1, minor=2, patch=3>
b = Version [1, 2, 3]                     # <struct Versionaire::Version major=1, minor=2, patch=3>
c = Version "1.2.3"                       # <struct Versionaire::Version major=1, minor=2, patch=3>

a == b && b == c                          # true

Another advantage of having a whole value object shows up when writing RSpec specs where you expect the Struct answered back to be comprised of the correct set of values. Example:

expect(client.call).to contain_exactly(Point[x: 1, y: 2])
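
Whole value semantics also extend to hashing, which means equal structs behave as the same key or element in collections. Here is a brief sketch, assuming the keyworded Point struct used throughout:

```ruby
Point = Struct.new :x, :y, keyword_init: true

a = Point[x: 1, y: 2]
b = Point[x: 1, y: 2]

# Equal structs produce equal hash codes.
a.hash == b.hash     # true

# Collections treat them as the same value.
[a, b].uniq.size     # 1
labels = {a => "start"}
labels[b]            # "start"

# Structs are mutable, so changing an attribute changes equality.
b.x = 5
a == b               # false
```

Because of that last point, it’s safest to avoid mutating a struct once it is used as a hash key.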

Pattern Matching

If you’ve seen me talk about pattern matching, you’ll know I’m a fan. Structs, along with arrays and hashes, natively support pattern matching. If we use the same point object, defined earlier as a keyworded struct, we can write code like this:

By Key And Value

case Point[x: 1, y: 1]
  in x: 1, y: 1 then puts "Low."
  in x: 10, y: 10 then puts "High."
  else puts "Unknown point."
end

# Prints: "Low."

By Position and Value

case Point[x: 10, y: 10]
  in 1, 1 then puts "Low."
  in 10, 10 then puts "High."
  else puts "Unknown point."
end

# Prints: "High."

By Range

case Point[x: -5, y: -1]
  in 0, 0 then puts "Neutral."
  in ..0, ..0 then puts "Negative."
  in 0.., 0.. then puts "Positive."
  else puts "Mixed."
end

# Prints: "Negative."

By Explicit Type

case {x: 1, y: 1}
  in Point[x: 1, y: 1] then puts "Low."
  in Point[x: 10, y: 10] then puts "High."
  else puts "Unknown point."
end

# Prints: "Unknown point."

In the above examples, you’d typically not inline an instance of your struct for pattern matching purposes but pass in the instance as an argument to your case expression. I inlined the instance to keep things concise. That aside — and as you can see — being able to pattern match gives you a lot of power and the above is by no means exhaustive.
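
As a sketch of passing the instance in as an argument, the following combines an explicit type check with binding attribute values to local variables. The describe method is a hypothetical name:

```ruby
Point = Struct.new :x, :y, keyword_init: true

# Hypothetical method which matches on type and binds unmatched attributes.
def describe point
  case point
    in Point[x: 0, y: 0] then "origin"
    in Point[x:, y: 0] then "on the x axis at #{x}"
    in Point[x: 0, y:] then "on the y axis at #{y}"
    in Point[x:, y:] then "at (#{x}, #{y})"
    else "not a point"
  end
end

describe Point[x: 0, y: 0]   # "origin"
describe Point[x: 3, y: 0]   # "on the x axis at 3"
describe Point[x: 0, y: 4]   # "on the y axis at 4"
describe Point[x: 3, y: 4]   # "at (3, 4)"
describe({x: 3, y: 4})       # "not a point"
```

Order matters here: the most specific patterns come first since the first match wins.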

Refinements

Structs, as with any Ruby object, can be refined. I’ve written extensively about Refinements and have a gem, of the same name, which refines several Ruby core primitives, including structs. Here’s an example of some of the ways in which we can refine our Point struct even further:

#! /usr/bin/env ruby
# frozen_string_literal: true

# Save as `snippet.rb` and run as `ruby snippet.rb`

require "bundler/inline"

gemfile true do
  source "https://rubygems.org"

  gem "refinements"
end

require "refinements/structs"

Point = Struct.new :x, :y

module Demo
  using Refinements::Structs

  def self.run
    puts Point.with_keywords(x: 1, y: 2)            # #<struct x=1, y=2>
    puts Point.keyworded?                           # false

    point = Point[1, 2]

    puts point.merge x: 0, y: 1                     # #<struct x=0, y=1>
    puts point.revalue { |position| position * 2 }  # #<struct x=2, y=4>
  end
end

Demo.run

If you were to run the above script, you’d see the same output as shown in the code comments. The above is only a small taste of how you can refine your structs. Feel free to check out the Refinements gem for details or even add it to your own projects.
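
If you’d rather not add a gem, you can write a refinement by hand. The sketch below is plain Ruby and is not taken from the Refinements gem: the Scalable module and #scale method are hypothetical and assume numeric attributes:

```ruby
Point = Struct.new :x, :y

# Hypothetical refinement, not part of the Refinements gem.
module Scalable
  refine Struct do
    # Answers a copy with each value multiplied by the factor.
    def scale factor
      dup.tap { |copy| members.each { |member| copy[member] *= factor } }
    end
  end
end

module Demo
  using Scalable

  def self.run
    Point[1, 2].scale 3
  end
end

Demo.run                          # #<struct Point x=3, y=6>

# Outside the refined scope, the method is not visible.
Point[1, 2].respond_to? :scale    # false
```

The benefit over monkey patching is that #scale only exists where using Scalable is in effect, so the rest of your application sees an unmodified Struct.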

Benchmarks

Earlier, when defining structs, I hinted at additional reasons for reaching for a Struct over a Class or, worse, an OpenStruct. Well, improved performance is one of them. Consider the following benchmark script which compares the performance of an Array, Hash, Struct, OpenStruct, and Class.

#! /usr/bin/env ruby
# frozen_string_literal: true

# Save as `snippet.rb` and run as `ruby snippet.rb`

require "bundler/inline"

gemfile true do
  source "https://rubygems.org"

  gem "benchmark-ips"
end

require "benchmark/ips"
require "ostruct"

MAX = 1_000_000

ExampleStruct = Struct.new :to, :from

ExampleClass = Class.new do
  attr_reader :to, :from

  def initialize to:, from:
    @to = to
    @from = from
  end
end

Benchmark.ips do |benchmark|
  benchmark.config time: 5, warmup: 2

  benchmark.report "Array" do
    MAX.times { %w[Mork Mindy] }
  end

  benchmark.report "Hash" do
    MAX.times { {to: "Mork", from: "Mindy"} }
  end

  benchmark.report "Struct" do
    MAX.times { ExampleStruct[to: "Mork", from: "Mindy"] }
  end

  benchmark.report "OpenStruct" do
    MAX.times { OpenStruct.new to: "Mork", from: "Mindy" }
  end

  benchmark.report "Class" do
    MAX.times { ExampleClass.new to: "Mork", from: "Mindy" }
  end

  benchmark.compare!
end

If you save the above script to file and run locally, you’ll get output that looks roughly like this:

Warming up --------------------------------------
               Array     2.000  i/100ms
                Hash     1.000  i/100ms
              Struct     1.000  i/100ms
          OpenStruct     1.000  i/100ms
               Class     1.000  i/100ms
Calculating -------------------------------------
               Array     23.854  (± 0.0%) i/s -    120.000  in   5.030763s
                Hash      8.908  (±11.2%) i/s -     45.000  in   5.072652s
              Struct      4.798  (± 0.0%) i/s -     24.000  in   5.005263s
          OpenStruct      0.178  (± 0.0%) i/s -      1.000  in   5.618325s
               Class      4.230  (± 0.0%) i/s -     22.000  in   5.203019s

Comparison:
               Array:       23.9 i/s
                Hash:        8.9 i/s - 2.68x  (± 0.00) slower
              Struct:        4.8 i/s - 4.97x  (± 0.00) slower
               Class:        4.2 i/s - 5.64x  (± 0.00) slower
          OpenStruct:        0.2 i/s - 134.02x  (± 0.00) slower

Based on the benchmark statistics above, the Array is the clear winner with Hash as a runner up. No surprises there. You get great performance at the cost of readability/usage as mentioned earlier in this article.

When you ignore the Array and Hash, you are left with Struct, Class, and OpenStruct. This is where having a Struct truly shines. Granted, using a Class wouldn’t be the end of the world in terms of performance but when you compare the results against an OpenStruct, you can clearly see why an OpenStruct is not advised. This is why I don’t recommend using a Class over a Struct for encapsulating data and definitely avoid OpenStruct altogether.

Avoidances

Before wrapping up this article, there are a few avoidances worth pointing out when using structs in your Ruby code. Please don’t use these techniques yourself or, if you find others writing code this way, send them a link to this section of the article. 🙂

Anonymous Inheritance

We talked earlier about how you can subclass a struct but you can also create a subclass of an anonymous struct. Example:

class Point < Struct.new(:x, :y)
end

I didn’t bring this up earlier because the distinction is worth highlighting here due to the dangerous nature of creating a subclass from an anonymous struct superclass. The distinction might be subtle but Point < Struct.new is being used in the above example instead of class Point < Struct as discussed earlier. To make this more clear, consider the following for comparison:

# Normal Superclass
class Point < Struct
end

# Anonymous Superclass
class Point < Struct.new(:x, :y)
end

The normal superclass example is using proper inheritance as discussed earlier but the anonymous superclass example is creating a subclass from a temporary superclass which is not recommended. Even the official documentation on Ruby structs says as much:

Subclassing an anonymous struct creates an extra anonymous class that will never be used.

Ruby’s documentation goes on to state that the recommended way to use or even customize a struct is what we discussed earlier which is:

Point = Struct.new :x, :y do
  def inspect = to_h.inspect
end
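
You can observe the extra anonymous class by inspecting the inheritance chain. In this sketch, InheritedPoint and BlockPoint are hypothetical names used to compare the two approaches:

```ruby
# Anonymous superclass: a nameless class sits between the subclass and Struct.
class InheritedPoint < Struct.new(:x, :y)
end

# Block form: the constant is a direct subclass of Struct.
BlockPoint = Struct.new :x, :y do
  def inspect
    to_h.inspect
  end
end

InheritedPoint.superclass.name         # nil
InheritedPoint.superclass.superclass   # Struct
BlockPoint.superclass                  # Struct
```

The block form gives you the same ability to add methods without the additional layer in the ancestry.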

OpenStruct

By now I hope I have convinced you to avoid OpenStruct usage in your code. Don’t get me wrong, they are fun to play around with in your console for modeling data quickly but shouldn’t be used in any professional capacity. The reason is made clear in the official Ruby documentation:

An OpenStruct utilizes Ruby’s method lookup structure to find and define the necessary methods for properties. This is accomplished through the methods method_missing and define_singleton_method.

This should be a consideration if there is a concern about the performance of the objects that are created, as there is much more overhead in the setting of these properties compared to using a Hash or a Struct. Creating an open struct from a small Hash and accessing a few of the entries can be 200 times slower than accessing the hash directly.

This is a potential security issue; building OpenStruct from untrusted user data (e.g. JSON web request) may be susceptible to a “symbol denial of service” attack since the keys create methods and names of methods are never garbage collected.

Not only do you suffer a performance penalty but you expose a security vulnerability too. Even the RuboCop Performance gem has a Performance/OpenStruct linter to throw an error when detected in your code.
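
To illustrate the point about keys becoming methods, here is a small sketch contrasting how an OpenStruct and a keyworded struct treat unexpected input. The payload hash is a hypothetical stand-in for untrusted data such as a parsed JSON request:

```ruby
require "ostruct"

Point = Struct.new :x, :y, keyword_init: true

# Hypothetical stand-in for untrusted input.
payload = {x: 1, y: 2, admin: true}

# An OpenStruct accepts every key and exposes each one as a method.
open_point = OpenStruct.new payload
open_point.admin                             # true

# A struct only accepts the attributes it was defined with.
begin
  Point.new(**payload)
rescue ArgumentError => error
  puts "Rejected: #{error.message}"
end

# Selecting the known attributes first keeps construction explicit.
Point.new(**payload.slice(*Point.members))   # #<struct Point x=1, y=2>
```

With a struct, the set of attributes is fixed when the struct is defined so surprises in the incoming data surface immediately instead of quietly becoming part of your object.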

Conclusions

Structs come with a ton of power and are a joy to use. My Ruby code is better because of them. Hopefully, this will inspire you to use structs more effectively within your own code without reaching for classes or more complex objects. Even better, maybe this will encourage you to write cleaner code where data which consists of related attributes is given a proper name and Primitive Obsession is avoided altogether. 🎉