
Ruby’s Struct is one of several powerful core classes that are often overlooked and underutilized compared to the more popular Hash class. This is a shame and I’m often surprised when working with others who don’t know about structs or, worse, abuse them entirely. I’d like to set the record straight by sharing the joy of structs with you and how you can leverage their power to improve your Ruby code further. 🚀
Overview
Structs are a hybrid between a class and a hash where, by default, they are a mutable container for data. They are best used to give a name to an object which encapsulates one or more attributes. This allows you to avoid using arrays or hashes, which leads to the Primitive Obsession code smell.
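To make the hybrid nature concrete, here is a minimal sketch showing class-like accessors alongside hash-like and array-like access. The Color struct and its attributes are made up purely for illustration:

```ruby
# Hypothetical struct used only for illustration.
Color = Struct.new :red, :green, :blue

color = Color.new 255, 128, 0

# Class-like: attribute readers and writers.
color.red        # 255
color.blue = 64  # Structs are mutable by default.

# Hash-like: access by symbol or string key.
color[:green]    # 128
color["blue"]    # 64

# Array-like: access by position.
color[0]         # 255

# Conversions back to primitives.
color.to_a       # [255, 128, 64]
color.to_h       # {red: 255, green: 128, blue: 64}
Color.members    # [:red, :green, :blue]
```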
To illustrate further, let’s consider an object which is a point on a graph and consists of x
and y
coordinates. Here’s how you might define that object using an Array
, Hash
, Data
, Struct
, Class
, and OpenStruct
:
# Array
point = [1, 2]
point.first # 1
point.last # 2
# Hash
point = {x: 1, y: 2}
point[:x] # 1
point[:y] # 2
# Data
Point = Data.define :x, :y
point = Point.new 1, 2
point.x # 1
point.y # 2
# Struct
Point = Struct.new :x, :y
point = Point.new 1, 2
point.x # 1
point.y # 2
# Class
class Point
  attr_accessor :x, :y
  def initialize x, y
    @x = x
    @y = y
  end
end
point = Point.new 1, 2
point.x # 1
point.y # 2
# OpenStruct
require "ostruct"
point = OpenStruct.new x: 1, y: 2
point.x # 1
point.y # 2
Based on the above you can immediately see the negative effects of Primitive Obsession with the Array and Hash instances. With the Array tuple, you have to use #first to obtain the value of x and #last to obtain the value of y (or index by position). Those are terrible method names to represent a point object because the methods are not self-describing. With the Hash, the keys are more readable but you have to message #[] with the key of the value you want to obtain which isn’t ideal when having to type the brackets each time you send a message.
When we move away from the Array
and Hash
by switching to the Struct
, you can see the elegance of messaging our point
instance via the #x
or #y
methods to get the desired values. The same can be said about Data
objects and you might want to check out my Data article for a deeper dive.
The Class
example has the same values as the struct but at the cost of being more cumbersome to define. A class also differs in behavior:
-
Isn’t a value object, by default, since object equality isn’t based on the values that make up the class.
-
Doesn’t support pattern matching by default.
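Both differences are easy to check for yourself. The following is a small sketch, using hypothetical PlainPoint and StructPoint names, which compares a hand-written class with an equivalent struct:

```ruby
# A plain class (hypothetical name) compared with an equivalent struct.
class PlainPoint
  attr_reader :x, :y

  def initialize x, y
    @x = x
    @y = y
  end
end

StructPoint = Struct.new :x, :y

# Classes compare by identity unless you define #== yourself.
PlainPoint.new(1, 2) == PlainPoint.new(1, 2)    # false
StructPoint.new(1, 2) == StructPoint.new(1, 2)  # true

# Structs implement #deconstruct_keys for pattern matching; plain classes don't.
PlainPoint.new(1, 2).respond_to? :deconstruct_keys   # false
StructPoint.new(1, 2).respond_to? :deconstruct_keys  # true
```

You can add #==, #hash, and #deconstruct_keys to the class by hand but that is even more code to write and maintain.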
Finally, we have OpenStruct which ships with Ruby’s standard library (hence the require in the example above) rather than being a core class. At this point, you might be thinking: "Hey, looks like OpenStruct is better in terms of melding Struct and Class syntax and functionality." Well, you’d be very wrong in terms of performance and I’ll expand upon this more later.
History
Before proceeding, it’s important to note that this article is written using modern Ruby syntax so the following highlights the differences between each Ruby version in order to reduce confusion.
3.2.0
Use of the keyword_init: true
flag was no longer required which means you can define structs using the same syntax but initialize using either positional or keyword arguments. Example:
# Construction
Point = Struct.new :x, :y
# Initialization (positional)
point = Point.new 1, 2
# Initialization (keyword)
point = Point.new x: 1, y: 2
3.1.0
Warnings were added when initializing a struct, defined without the keyword_init: true flag, using only keyword arguments so people would be aware that this behavior was going to change in 3.2.0.
3.0.0 (and earlier)
Earlier versions of Ruby required you to choose, when defining a struct, between positional or keyword arguments. For example, here’s the same struct defined first with positional arguments and then with keyword arguments:
# Positional
Point = Struct.new :x, :y
point = Point.new 1, 2
# Keyword
Point = Struct.new :x, :y, keyword_init: true
point = Point.new x: 1, y: 2
The difference is using keyword_init: true
to construct a struct via keywords instead of
positional parameters. For astute readers, you’ll recognize this as a
Boolean Parameter
Control Couple code smell.
Construction
Now that we’ve gone over what a struct is along with historical context, let’s delve into construction.
New
You can construct a struct multiple ways. Example:
# Accepts positional or keyword arguments.
Point = Struct.new :x, :y
# Accepts keyword arguments only.
Point = Struct.new :x, :y, keyword_init: true
# Accepts positional arguments only.
Point = Struct.new :x, :y, keyword_init: false
# Accepts positional arguments only.
Point = Struct.new :x, :y, keyword_init: nil
Being restricted to only using positional arguments is no longer recommended so you should avoid using keyword_init
when constructing your structs and only use the first example shown above. Here’s a closer look:
Point = Struct.new :x, :y
# Positional
Point.new 1 # <struct Point x=1, y=nil>
Point.new nil, 2 # <struct Point x=nil, y=2>
# Keyword
Point.new y: 2, x: 1 # <struct Point x=1, y=2>
Point.new y: 2 # <struct Point x=nil, y=2>
While keyword arguments require more typing to define your key and value, you are free of positional constraints and can even construct with a subset of attributes which isn’t possible with positional arguments unless you fill in all positions prior to the position you desire to set. Even better — once you’ve constructed your struct, you can use positional or keyword arguments freely during initialization.
Subclass
So far you’ve only seen class construction using .new
but you can use a subclass as well. Example:
class Inspectable < Struct
  def inspect = to_h.inspect
end
Point = Inspectable.new :x, :y
# Positional
Point.new(1, 2).inspect # "{:x=>1, :y=>2}"
# Keyword
Point.new(x: 1, y: 2).inspect # "{:x=>1, :y=>2}"
Subclassing can be useful in rare cases but you can also see it’s not as much fun to use due to the additional lines of code (even with the overridden #inspect method). Additionally, you can’t add more attributes to your subclass like you can with a regular subclass. For example, trying to add a z subclass attribute in addition to your existing x and y superclass attributes won’t work. The reason is that once a struct is defined, it can’t be resized.
While subclassing is good to be aware of, use with caution because inheritance carries a lot of baggage with it. You’re much better off using Dependency Inversion which is the D in SOLID design. So compose your objects rather than inheriting them.
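Here is a minimal sketch of what "can’t be resized" looks like in practice, followed by a composed alternative. The Point3D and Elevated names are hypothetical and only serve the illustration:

```ruby
Point = Struct.new :x, :y

# Hypothetical subclass which tries to bolt on a third attribute.
class Point3D < Point
  attr_accessor :z
end

point = Point3D.new 1, 2
point.z = 3

# The accessor works but z is a plain instance variable, not a struct member.
point.members  # [:x, :y]
point.to_h     # {x: 1, y: 2}

# Equality ignores z because only members are compared.
other = Point3D.new 1, 2
other.z = 99
point == other  # true

# A third positional argument is rejected outright.
begin
  Point3D.new 1, 2, 3
rescue ArgumentError => error
  error.message  # "struct size differs"
end

# Composition: wrap the existing struct instead of inheriting from it.
Elevated = Struct.new :point, :z
elevated = Elevated.new Point.new(1, 2), 3
elevated.point.x  # 1
elevated.z        # 3
```

With the composed version, z is a real member so it participates in equality, #to_h, and pattern matching.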
Initialization
Now that we know how to construct a Struct
, let’s move on to initialization. We’ll continue with
our Point
struct for these examples.
New
As shown earlier — and with nearly all Ruby objects — you can initialize a struct via the .new
class method:
# Positional
point = Point.new 1, 2
# Keyword
point = Point.new x: 1, y: 2
Keep in mind that even though you can initialize an instance of your Point
struct using positional or keyword arguments, you can’t mix and match them. Example:
point = Point.new 1, y: 2
point.x # 1
point.y # {y: 2}
Notice {y: 2} was assigned to y when the value should have been 2 so use either all positional arguments or all keyword arguments to avoid this situation. Don’t mix them!
Anonymous
You can anonymously create a new instance of a struct by constructing your struct and initializing it via a single line. Example:
# Positional
point = Struct.new(:x, :y).new 1, 2
# Keyword
point = Struct.new(:x, :y).new x: 1, y: 2
The problem with the above is anonymous structs are only useful within the scope they are defined as temporary and short lived objects. Worse, you must redefine the struct each time you want to use it. For anything more permanent, you’ll need to define a constant for improved reuse. That said, anonymous structs can be handy in a pinch for one-off situations like scripts, specs, or code spikes.
Inline
While anonymous structs suffer from not being reusable, you can define inline and semi-reusable structs using a single line of code as follows (pay attention to the string which is the first argument):
# Positional
point = Struct.new("Point", :x, :y).new 1, 2
# Keyword
point = Struct.new("Point", :x, :y).new x: 1, y: 2
You can even use shorter syntax but I don’t recommend this because it’s harder to read and a bit too clever:
# Positional
point = Struct.new("Point", :x, :y)[1, 2]
# Keyword
point = Struct.new("Point", :x, :y)[x: 1, y: 2]
To create new instances of the above struct, you’d need to use the following syntax:
Struct::Point.new x: 3, y: 4 # <struct Struct::Point x=3, y=4>
Struct::Point.new x: 5, y: 6 # <struct Struct::Point x=5, y=6>
The downside is you must keep typing Struct::Point which isn’t great for a constant you’d want to reuse on a permanent basis. Regardless, the difference between an inline struct and the earlier anonymous struct is that the first argument we pass in is the name of our struct which makes it a constant and reusable. To illustrate further, consider the following:
# Anonymous
point = Struct.new(:x, :y).new 1, 2
point = Struct.new(:x, :y).new 1, 2
# No warnings issued.
# Constant
point = Struct.new("Point", :x, :y).new 1, 2
point = Struct.new("Point", :x, :y).new 1, 2
# warning: redefining constant Struct::Point
With anonymous initialization, we don’t get a Ruby warning because no constant is defined. On the other hand, with constant initialization, we do get a warning that the Struct::Point constant is being redefined when we try to define it twice.
While it can be tempting to define a struct via a single line — and sometimes useful in one-off scripts — I would recommend not using this in production code since it’s too easy to obscure finding these constants within your code.
Brackets
As hinted at earlier, there is a shorter way to initialize a struct and that’s via square brackets:
point = Point[x: 1, y: 2]
This is my favorite form of initialization and for two important reasons:
-
Brackets require three fewer characters to type. ⚡️
-
Brackets signify, more clearly, you are working with a struct versus a class which improves readability. Calling out structs like this when reading through code makes a big difference over time and I encourage you to do the same.
Defaults
Structs, as with classes, can set defaults. The way to do this is to define an #initialize
method. Here are a couple examples:
First
Point = Struct.new :x, :y do
  def initialize x: 1, y: 2
    super
  end
end
Second
Point = Struct.new :x, :y do
  def initialize(**)
    super
    self[:x] ||= 1
    self[:y] ||= 2
  end
end
Third
Point = Struct.new :x, :y, keyword_init: true do
  def initialize(*)
    super
    self[:x] ||= 1
    self[:y] ||= 2
  end
end
With each of the above, any Point
instance will default to x = 1, y = 2
:
point = Point.new
point.x # 1
point.y # 2
There are a few important aspects of the above to call out:
-
In most cases, you should call super to ensure incoming arguments are forwarded to the superclass.
-
The first example leverages concise syntax to define defaults and is recommended if your parameter list has three or fewer parameters.
-
The second example allows you to forward all keyword arguments to super and then define defaults by using self with conditional assignment (||=). This is the recommended approach if you have more than three parameters to prevent your parameter list from getting long and unwieldy.
-
The third example is nearly identical to the second example but splats positional arguments instead of keyword arguments since the older keyword_init: true argument is used.
I use either the first or second example when needing an instance of a struct with safe defaults and generally avoid using the third example since keyword_init: true
probably won’t be supported in the future.
Heavy use of defaults can also be abused so keep your defaults simple and without side effects. If you don’t need a default or can’t think of a safe default, then don’t override the initializer unnecessarily.
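One caveat worth knowing about the ||= approach: it treats an explicit false (or nil) as missing and overwrites it with the default. The following is a minimal sketch, assuming Ruby 3.2 or later and a hypothetical Feature struct, which checks for nil instead when a member is a boolean:

```ruby
# Hypothetical struct; assumes Ruby 3.2+ so keyword_init isn't needed.
Feature = Struct.new :name, :enabled do
  def initialize(**)
    super
    self[:name] ||= "unknown"

    # Using `||=` here would turn an explicit `false` back into `true`.
    self[:enabled] = true if enabled.nil?
  end
end

Feature.new.to_h                                # {name: "unknown", enabled: true}
Feature.new(name: "beta", enabled: false).to_h  # {name: "beta", enabled: false}
```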
Transformations
Along the same lines as initialization is the ability for structs to transform an incoming data type to itself. This is a variant of the Adapter Pattern but instead of having a second object which adapts one object into an instance of your struct, you have the struct do the transformation. For example, consider the following:
module Graphs
  POINT_KEY_MAP = {horizontal: :x, vertical: :y}.freeze
  Point = Struct.new(*POINT_KEY_MAP.values) do
    def self.for(location, key_map: POINT_KEY_MAP) = new(**location.transform_keys(key_map))
  end
end
With the above, you can now transform an incoming Hash into the Struct we need:
location = {horizontal: 1, vertical: 2}
point = Graphs::Point.for location
point.inspect # <struct Graphs::Point x=1, y=2>
This is a lot of power for a small amount of code because you can now convert one data type — which looks roughly similar to your struct but has the wrong keys — into your struct which is properly named and has a better interface. Let’s break this down further:
-
The Graphs module gives you a namespace to group related constants (i.e. POINT_KEY_MAP and Point).
-
The POINT_KEY_MAP constant allows you to define, in one place, the mapping of keys you need to transform. The hash keys are the foreign keys to transform while the hash values are the keys used to define your struct’s attributes.
-
The .for class method allows you to consume the location hash along with an optional key map for transforming the foreign keys. Since location is a hash, we can ask it to transform its keys using the provided key map. The result is then used to initialize the struct with the newly transformed keys and values of the original hash.
The reason this is powerful is because, in
Domain Driven Design, you have
a single method — .for
in this case — serving as a boundary for converting a foreign type into a
struct with more flexibility and reuse with minimal effort. This is handy in situations where you
might be dealing with an external API or any kind of similar data which is almost shaped the way
you need but isn’t quite right.
I should point out that if .for isn’t to your liking, you can use .with, .for_location, .with_location, and so forth for the class method name. I tend to stick with short and simple names for transforming methods, like .for or .with, until I find I need something more specific.
You can take all of this too far and put too much responsibility on your struct. Should that happen, consider crafting an adapter class that consumes and converts the incoming data into an instance of your struct. Otherwise, for simple situations like the above example, this is a nice way to give your struct an extra superpower with concise syntactic sugar.
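For comparison, here is a rough sketch of what such an adapter class could look like once the conversion grows beyond renaming keys. The PointAdapter name, the payload shape, and the integer coercion are all assumptions made for illustration, and Ruby 3.2 or later is assumed for keyword initialization:

```ruby
Point = Struct.new :x, :y

# Hypothetical adapter; the class name and payload shape are assumptions.
class PointAdapter
  KEY_MAP = {horizontal: :x, vertical: :y}.freeze

  def initialize model: Point, key_map: KEY_MAP
    @model = model
    @key_map = key_map
  end

  # Keeps only known keys, coerces values to integers, and builds the struct.
  def call payload
    attributes = payload.slice(*key_map.keys)
                        .transform_keys(key_map)
                        .transform_values { |value| Integer value }

    model.new(**attributes)
  end

  private

  attr_reader :model, :key_map
end

adapter = PointAdapter.new
adapter.call({horizontal: "1", vertical: 2, label: "ignored"})  # #<struct Point x=1, y=2>
```

The struct stays a simple container while filtering and coercion live in an object which can be tested and swapped on its own.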
Whole Values
Another superpower of structs is that they are whole value objects by default. This is lovely
because you can have two or more structs with the same values and they’ll be equal even though their
object IDs are different. Here’s an example where, again, we reach for our Point
struct:
a = Point[x: 1, y: 2]
b = Point[x: 1, y: 2]
a == b # true
a === b # true
a.eql? b # true
a.equal? b # false
This is exactly what makes the Versionaire gem so powerful: it provides a primitive, semantic version type for use within your Ruby applications. Example:
a = Version major: 1, minor: 2, patch: 3 # <struct Versionaire::Version major=1, minor=2, patch=3>
b = Version [1, 2, 3] # <struct Versionaire::Version major=1, minor=2, patch=3>
c = Version "1.2.3" # <struct Versionaire::Version major=1, minor=2, patch=3>
a == b && b == c # true
Another advantage of having a whole value object shows up when writing
RSpec specs where you expect the Struct
answered back to be comprised of
the correct set of values. Example:
expect(client.call).to contain_exactly(Point[x: 1, y: 2])
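Value equality also extends to hashing, which means equal structs collapse in sets, #uniq, and hash keys. A short sketch using the same kind of Point struct:

```ruby
require "set"

Point = Struct.new :x, :y

a = Point.new 1, 2
b = Point.new 1, 2

# Duplicates collapse because #eql? and #hash are based on member values.
[a, b].uniq.size  # 1
Set[a, b].size    # 1

# Equal structs find the same hash entry.
labels = {a => "start"}
labels[b]         # "start"
```

Since structs are mutable by default, avoid changing a struct after using it as a hash key because its hash value changes along with its members.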
Pattern Matching
I’ve written about pattern matching before, so you’ll know I’m a fan. Structs, along with arrays and hashes, natively support pattern matching. If we use the same point object, defined earlier as a keyworded struct, we can write code like this:
By Key And Value
case Point[x: 1, y: 1]
in x: 1, y: 1 then puts "Low."
in x: 10, y: 10 then puts "High."
else puts "Unknown point."
end
# Prints: "Low."
By Position And Value
case Point[x: 10, y: 10]
in 1, 1 then puts "Low."
in 10, 10 then puts "High."
else puts "Unknown point."
end
# Prints: "High."
By Range
case Point[x: -5, y: -1]
in 0, 0 then puts "Neutral."
in ..0, ..0 then puts "Negative."
in 0.., 0.. then puts "Positive."
else puts "Mixed."
end
# Prints: "Negative."
By Explicit Type
case {x: 1, y: 1}
in Point[x: 1, y: 1] then puts "Low."
in Point[x: 10, y: 10] then puts "High."
else puts "Unknown point."
end
# Prints: "Unknown point."
In the above examples, you’d typically not inline an instance of your struct for pattern matching
purposes but pass in the instance as an argument to your case
expression. I inlined the instance
to keep things concise. That aside — and as you can see — being able to pattern match gives you a
lot of power and the above is by no means exhaustive.
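To show the more typical shape, here is a small sketch where the instance is passed in as an argument and the patterns also bind the matched values to local variables. The describe method is a hypothetical helper and not part of any library:

```ruby
Point = Struct.new :x, :y

# Hypothetical helper which receives the instance as an argument.
def describe point
  case point
  in Point[x: 0, y: 0] then "Origin."
  in Point[x:, y: 0] then "On the x axis at #{x}."
  in Point[x: 0, y:] then "On the y axis at #{y}."
  in Point[x:, y:] then "At (#{x}, #{y})."
  else "Not a point."
  end
end

describe Point[0, 0]    # "Origin."
describe Point[3, 0]    # "On the x axis at 3."
describe Point[2, 5]    # "At (2, 5)."
describe({x: 2, y: 5})  # "Not a point."
```

Order matters here since the first matching pattern wins, which is why the most specific patterns come first.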
Refinements
Structs, as with any Ruby object, can be refined. I’ve written extensively about
refinements and have a gem, of the same
name, which refines several Ruby core primitives, including structs. Here’s an example of some of
the ways in which we can refine our Point
struct even further:
#!/usr/bin/env ruby
# frozen_string_literal: true
# Save as `snippet.rb` and run as `ruby snippet.rb`
require "bundler/inline"
gemfile true do
  source "https://rubygems.org"
  gem "refinements"
end
require "refinements/structs"
Point = Struct.new :x, :y
module Demo
  using Refinements::Structs
  def self.run
    puts Point.with_keywords(x: 1, y: 2) # #<struct x=1, y=2>
    puts Point.keyworded? # false
    point = Point[1, 2]
    puts point.merge x: 0, y: 1 # #<struct x=0, y=1>
    puts point.revalue { |position| position * 2 } # #<struct x=2, y=4>
  end
end
Demo.run
If you were to run the above script, you’d see the same output as shown in the code comments. The above is only a small taste of how you can refine your structs. Feel free to check out the Refinements gem for details or even add it to your own projects.
Benchmarks
Structs used to be more performant than classes but that is no longer the case. Consider the following YJIT-enabled benchmarks:
#!/usr/bin/env ruby
# frozen_string_literal: true
# Save as `benchmark`, then `chmod 755 benchmark`, and run as `./benchmark`.
require "bundler/inline"
gemfile true do
  source "https://rubygems.org"
  gem "benchmark-ips"
end
require "ostruct"
DataExample = Data.define :a, :b, :c, :d, :e
StructExample = Struct.new :a, :b, :c, :d, :e
ExampleClass = Class.new do
  attr_reader :a, :b, :c, :d, :e
  def initialize a:, b:, c:, d:, e:
    @a = a
    @b = b
    @c = c
    @d = d
    @e = e
  end
end
Benchmark.ips do |benchmark|
  benchmark.config time: 5, warmup: 2
  benchmark.report("Array") { [1, 2, 3, 4, 5] }
  benchmark.report("Hash") { {a: 1, b: 2, c: 3, d: 4, e: 5} }
  benchmark.report("Data") { DataExample[a: 1, b: 2, c: 3, d: 4, e: 5] }
  benchmark.report("Struct") { StructExample[a: 1, b: 2, c: 3, d: 4, e: 5] }
  benchmark.report("OpenStruct") { OpenStruct.new a: 1, b: 2, c: 3, d: 4, e: 5 }
  benchmark.report("Class") { ExampleClass.new a: 1, b: 2, c: 3, d: 4, e: 5 }
  benchmark.compare!
end
If you save the above script to file and run locally, you’ll get output that looks roughly like this:
Warming up --------------------------------------
               Array     1.503M i/100ms
                Hash   903.074k i/100ms
                Data   245.690k i/100ms
              Struct   239.277k i/100ms
          OpenStruct   660.000 i/100ms
               Class   305.773k i/100ms
Calculating -------------------------------------
               Array     21.932M (± 2.6%) i/s -    109.697M in   5.005225s
                Hash     11.647M (± 7.8%) i/s -     58.700M in   5.069809s
                Data      2.680M (± 4.1%) i/s -     13.513M in   5.048739s
              Struct      2.590M (± 3.1%) i/s -     13.160M in   5.085549s
          OpenStruct      2.665k (±25.3%) i/s -     13.200k in   5.250260s
               Class      3.566M (± 7.6%) i/s -     17.735M in   5.001820s
Comparison:
               Array: 21931846.7 i/s
                Hash: 11647135.3 i/s - 1.88x slower
               Class:  3565606.4 i/s - 6.15x slower
                Data:  2680440.0 i/s - 8.18x slower
              Struct:  2590171.8 i/s - 8.47x slower
          OpenStruct:     2664.9 i/s - 8229.99x slower
Based on the benchmark statistics above, the Array
is the clear winner with Hash
as a runner up.
No surprises there. You get great performance at the cost of readability/usage as mentioned earlier
in this article.
When you ignore the Array and Hash, you are left with Data, Struct, Class, and OpenStruct. You can clearly see why OpenStruct is not advised while Class, Data, and Struct are relatively close in performance but, as mentioned above, classes have become slightly more performant where they used to be slower.
Avoidances
Before wrapping up this article, there are a few avoidances worth pointing out when using structs in your Ruby code. Please don’t use these techniques yourself or, if you find others writing code this way, send them a link to this section of the article. 🙂
Anonymous Inheritance
We talked about how you can subclass a struct earlier but you can also create a subclass of an anonymous struct as well. Example:
class Point < Struct.new(:x, :y)
end
I didn’t bring this up earlier because the distinction is worth highlighting here due to the
dangerous nature of creating a subclass from an anonymous struct superclass. The distinction might
be subtle but Point < Struct.new
is being used in the above example instead of class Point <
Struct
as discussed earlier. To make this more clear, consider the following for comparison:
# Normal Subclass
class Point < Struct
end
Point.ancestors
# [Point, Struct, Enumerable, Object, Kernel, BasicObject]
# Anonymous Subclass
class Point < Struct.new(:x, :y)
end
Point.ancestors
# [Point, #<Class:0x000000010da8e248>, Struct, Enumerable, Object, Kernel, BasicObject]
The normal subclass example is using proper inheritance as discussed earlier but the anonymous subclass example is creating a subclass from a temporary superclass which is not recommended. This is most apparent when you see the anonymous #<Class:0x000000010da8e248> appear in the hierarchy. Even the official documentation on Ruby structs says as much:
Subclassing an anonymous struct creates an extra anonymous class that will never be used.
Ruby’s documentation goes on to state that the recommended way to use or even customize a struct is what we discussed earlier which is:
Point = Struct.new :x, :y do
  def inspect = to_h.inspect
end
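A quick way to confirm the difference is to inspect the superclass of each form. In this sketch the Vector name is hypothetical and only exists to show the anonymous form side by side with the block form:

```ruby
# Block form: no intermediate anonymous class.
Point = Struct.new :x, :y do
  def inspect = to_h.inspect
end

Point.ancestors.take 2        # [Point, Struct]
Point.superclass              # Struct

# Anonymous inheritance, shown only for comparison (hypothetical name).
class Vector < Struct.new(:x, :y)
end

Vector.superclass == Struct   # false
Vector.superclass.name        # nil (an anonymous class)
Vector.superclass.superclass  # Struct
```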
OpenStruct
By now I hope I have convinced you to avoid OpenStruct
usage in your code. Don’t get me wrong, they are fun to play with in your console for modeling data quickly but shouldn’t be used in any professional capacity. The reason is made clear in the official Ruby documentation:
An OpenStruct utilizes Ruby’s method lookup structure to find and define the necessary methods for properties. This is accomplished through the methods method_missing and define_singleton_method.
This should be a consideration if there is a concern about the performance of the objects that are created, as there is much more overhead in the setting of these properties compared to using a Hash or a Struct. Creating an open struct from a small Hash and accessing a few of the entries can be 200 times slower than accessing the hash directly.
This is a potential security issue; building OpenStruct from untrusted user data (e.g. JSON web request) may be susceptible to a “symbol denial of service” attack since the keys create methods and names of methods are never garbage collected.
Not only do you suffer a performance penalty but you expose a security vulnerability too. Even the RuboCop Performance gem has a Performance/OpenStruct cop which flags OpenStruct usage when detected in your code.
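If the appeal of OpenStruct is quickly wrapping external data, a struct with fixed members can do the same job without defining methods from untrusted keys. Here is a minimal sketch where the JSON payload is a made-up example:

```ruby
require "json"

Point = Struct.new :x, :y

# Hypothetical untrusted payload with an unexpected key.
payload = JSON.parse '{"x": 1, "y": 2, "admin": true}'

# Only the struct's own members are read so unknown keys are ignored.
point = Point.new(*payload.values_at(*Point.members.map(&:to_s)))

point                     # #<struct Point x=1, y=2>
point.respond_to? :admin  # false
```

Because the members are fixed when the struct is defined, extra keys in the payload never become methods on the resulting object.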
Conclusion
Structs come with a ton of power and are a joy to use. My Ruby code is better because of them. Hopefully, this will inspire you to use structs more effectively within your own code without reaching for classes or more complex objects. Even better, maybe this will encourage you to write cleaner code where data which consists of related attributes is given a proper name and Primitive Obsession is avoided altogether. 🎉