Wikipedia defines pattern matching as:
[T]he act of checking a given sequence of tokens for the presence of the constituents of some pattern.
Using the above definition, this means a pattern could be applied to something as simple as a hash with ship and crew information:
{
ship: "Serenity",
crew: [
{
name: "Malcolm Reynolds",
title: "Captain"
},
{
name: "Zoe Alleyne Washburne",
title: "Soldier"
},
{
name: "Kaywinnet Lee Frye",
title: "Engineer"
}
]
}
Upon inspection, you can see we have a hash containing an array where each element of the array is also a hash that repeats the same name
and title
keys. That’s a pattern we can match against! To learn how, we’ll spend the rest of this article understanding what pattern matching is and how best to use it.
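As a first taste, here’s a minimal sketch of how the hash above could be checked. It relies on syntax explained later in this article, and the serenity, result, and consistent variable names are my own assumptions for illustration:

```ruby
# The ship and crew hash from above, assigned to a local variable for illustration.
serenity = {
  ship: "Serenity",
  crew: [
    {name: "Malcolm Reynolds", title: "Captain"},
    {name: "Zoe Alleyne Washburne", title: "Soldier"},
    {name: "Kaywinnet Lee Frye", title: "Engineer"}
  ]
}

# Match the outer hash while binding the ship name and crew array to local variables.
result = case serenity
         in {ship: String => ship, crew: Array => crew}
           # Check that every crew member repeats the same name and title keys.
           consistent = crew.all? { |member| member in {name: String, title: String} }
           "#{ship}: #{crew.size} crew, consistent: #{consistent}"
         else "unmatched"
         end

puts result
# "Serenity: 3 crew, consistent: true"
```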
Overview
From a high level, here are the various patterns we’ll be discussing in this article:
Pattern | Example |
---|---|
Standalone | in, => |
Arrays | [<pattern>, <pattern>] |
Hashes | {key: <pattern>, key: <pattern>} |
Remainders | *, **, *remainder, **remainder |
Finds | [*, <pattern>, *] |
Excludes | _ |
Voids | **nil |
Variable Pinning | ^variable |
Combinations | \| |
Guards | if, unless |
Classes | deconstruct, deconstruct_keys |
The above can be useful as a quick reference should you need it. We’ll get to the details of each shortly but first a quick look at the history of pattern matching can provide some helpful context.
History
The evolution of the pattern matching feature can be summarized over a series of Ruby versions:
- 3.2.0 - Array find patterns (i.e. <object> in [*, query, *]) were no longer experimental and became officially supported.
- 3.1.0 - Rightward assignment and boolean checks (i.e. standalone) became officially supported while the pin operator (i.e. ^) gained the ability to use statements and/or more complex expressions instead of only constants, literals, and local variables.
- 3.0.0 - Officially supported while introducing new experimental features:
  - Standalone rightward assignment (i.e. =>).
  - Standalone boolean checks (i.e. in).
  - Find patterns (i.e. <object> in [*, query, *]).
- 2.7.0 - Introduced for the first time as an experimental feature.
Syntax
Now that you have a sense of what pattern matching is and how it evolved over time, we can dive into the specifics of syntax.
Standalone
Standalone syntax is the simplest form of pattern matching and is most useful when only needing to write a single line of code. Anything beyond a one-liner will require the use of a case
statement (explained shortly). There are two forms of standalone syntax which are explained next.
Boolean Check
A boolean check — denoted by the in
keyword — is a handy way to pattern match when only needing to know if the result is true
or false
. Example:
basket = [{kind: "apple", quantity: 1}, {kind: "peach", quantity: 5}]
basket.any? { |fruit| fruit in {kind: /app/}} # true
basket.any? { |fruit| fruit in {kind: /berry/}} # false
Boolean checks are also useful when used in guard clauses or other kinds of conditional logic.
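As a small sketch of the guard clause usage, here’s how a boolean check might read inside a method. The label_for method name is hypothetical and only exists for this illustration:

```ruby
# Hypothetical helper: answers a label only when the fruit hash has the expected shape.
def label_for fruit
  # Guard clause: bail out early unless the boolean check passes.
  return "unknown" unless fruit in {kind: String, quantity: Integer}

  "#{fruit[:quantity]} x #{fruit[:kind]}"
end

puts label_for({kind: "apple", quantity: 1})  # "1 x apple"
puts label_for({kind: "peach"})               # "unknown"
```

Since the boolean check never raises, the method stays a simple one-line guard without any exception handling.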
Rightward Assignment
Rightward assignment — denoted by =>
— provides a quick way to assign a value based on the matched pattern. Example:
{character: {first_name: "Malcolm", last_name: "Reynolds"}} => {character: {last_name:}}
puts "Last Name: #{last_name}" # "Last Name: Reynolds"
⚠️ Rightward assignment, unlike a boolean check, will raise an error when a pattern can’t be matched. For example:
{character: {first_name: "Malcolm", last_name: "Reynolds"}} => {character: {middle_name:}}
# NoMatchingPatternKeyError: key not found: :middle_name
For more robust code, you’ll either want to catch the exception (expensive) or switch to using a case
statement (recommended).
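Here’s a rough sketch of what the case statement alternative could look like for the failing example above. The record and middle variable names, along with the "(none)" fallback, are assumptions I’ve made for illustration:

```ruby
record = {character: {first_name: "Malcolm", last_name: "Reynolds"}}

# The else branch handles the missing key instead of raising NoMatchingPatternKeyError.
middle = case record
         in {character: {middle_name: String => name}} then name
         else "(none)"
         end

puts "Middle Name: #{middle}"
# "Middle Name: (none)"
```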
In some circumstances, rightward assignment can be useful for type checking. Here’s an example where initialization of a weight object performs type checking upon input:
class Weight
def initialize value, unit
@value = value => Numeric
@unit = unit => String
end
def to_s = "#{value} #{unit}"
private
attr_reader :value, :unit
end
puts Weight.new(1, "gram")
# "1 gram"
puts Weight.new("1", "gram")
# "1": Numeric === "1" does not return true (NoMatchingPatternError)
puts Weight.new(1, :gram)
# :gram: String === :gram does not return true (NoMatchingPatternError)
In the case where a weight is initialized with the correct types you’ll get a valid instance but in situations with invalid types you’ll get a NoMatchingPatternError
instead. Use with caution since you are introducing an exception which can be expensive and possibly more jarring than necessary.
Case Statement
The case
statement is where you leverage the full power of pattern matching. The difference is with using in
instead of when
because the traditional case...when...else...end
syntax is for conditional logic while the new case...in...else...end
is for pattern matching. Here’s a better illustration:
case <expression>
in <pattern> then <statement>
in <pattern> then <statement>
in <pattern> then <statement>
else <statement>
end
This doesn’t mean you can mix and match when
and in
statements together. You must either use a case
statement for conditional logic or pattern matching but not both. The following will fail with a syntax error:
case "example"
in text then puts text
when "example" then puts "Found"
else puts "Unknown"
end
# SyntaxError: unexpected `when', expecting `end'
Unlike case statement conditional logic — where else
is optional — pattern matching is exhaustive in that you’ll get a NoMatchingPatternError
if none of the patterns match. Example:
# Faulty code.
case %i[a b c d]
in Symbol then puts "Single"
in Symbol, Symbol then puts "Double"
end
# NoMatchingPatternError
# Robust code.
case %i[a b c d]
in Symbol then puts "Single"
in Symbol, Symbol then puts "Double"
else puts "Unable to match input."
end
# "Unable to match input."
This is a nice enhancement over case statements using conditional logic because you always want to use the else
branch of your logic for more robust code that doesn’t cause surprising downstream exceptions or the more insidious nil
which is never fun to debug.
💡 You can enforce case statements to always have an else
by ensuring RuboCop’s Style/MissingElse is enabled.
Syntax-wise, brackets and braces are optional when using case statements:
# With brackets and braces.
case [1, 2, 3]
in [Integer, Integer] then "match" # With brackets.
else "unmatched"
end
case {a: 1, b: 2, c: 3}
in {a: Integer} then "matched" # With braces.
else "unmatched"
end
# Without brackets and braces.
case [1, 2, 3]
in Integer, Integer then "match" # Without brackets.
else "unmatched"
end
case {a: 1, b: 2, c: 3}
in a: Integer then "matched" # Without braces.
else "unmatched"
end
⚠️ You must use brackets and braces with nested patterns, though. The same applied to standalone syntax prior to Ruby 3.1.
Whole versus Partial
Arrays and hashes behave differently when it comes to whole and partial matches. Consider the following:
case [1, 2, 3]
in Integer, Integer then "matched"
else "unmatched"
end
# "unmatched"
case {a: 1, b: 2, c: 3}
in a: Integer then "matched"
else "unmatched"
end
# "matched"
The reason the hash matched and the array didn’t is because arrays default to being whole matches while hashes are partial matches. The nuance of all this will be explained in more detail once we delve into the specifics of arrays and hashes.
Variable Binding
Variable binding allows you to bind the result of a pattern match to a local variable. Example (using arrays and hashes):
# Array
case [1, 2]
in first, Integer then "matched: #{first}"
else "unmatched"
end
# "matched: 1"
# Hash
case {a: 1, b: 2}
in a: first then "matched: #{first}"
else "unmatched"
end
# "matched: 1"
In the above example, the local variable is bound by position (array) and key (hash) but you can expand upon this by adding in type checks too:
# Array
case [1, 2]
in Integer => first, Integer then "matched: #{first}"
else "unmatched"
end
# "matched: 1"
# Hash
case {a: 1, b: 2}
in a: Integer => first then "matched: #{first}"
else "unmatched"
end
# "matched: 1"
You’ll notice we’ve bound our local variable, as before, except this time we check if the type is an Integer
for more robust pattern matching in situations where we need type safety.
Hashes
Unique to hashes is the ability to bind a local variable by key alone. Example:
case {a: 1, b: 2}
in a: then "matched: #{a}"
else "unmatched"
end
# "matched: 1"
In situations where your hash keys are adequately named, this saves you extra typing while still having readable code.
Nesting
Variable binding works with more complex and nested arrays and hashes but requires you to make use of explicit braces and brackets:
case [%w[Apple Apricot], %w[Blueberry Blackberry]]
in [[first, String], *] then "matched: #{first}"
else "unmatched"
end
# Yields "matched: Apple"
case {label: "Basket", fruits: [{label: "Apple"}, {label: "Peach"}]}
in label:, fruits: [{label: first}, *] then "matched: #{first}"
else "unmatched"
end
# Yields "matched: Apple"
Remainders
Remainders, which you’ve seen a few examples of already, are expressed with single and double splats. These splats can be either bare or named. Starting with bare splats, here’s what they look like when used with arrays and hashes:
case [1, 2, 3]
in Integer, * then "matched"
else "unmatched"
end
# "matched"
case {a: 1, b: 2, c: 3}
in a: Integer, ** then "matched"
else "unmatched"
end
# "matched"
The above results in a match for both the array and hash examples where only the first elements of the pattern are matched since the single and double splats ignore everything else. This is great when you don’t care to use the remaining elements. In situations where you do care about the remainders, you’ll want to name them. Example:
case [1, 2, 3]
in first, *remainder then "matched: #{first}, #{remainder}"
else "unmatched"
end
# "matched: 1, [2, 3]"
case {a: 1, b: 2, c: 3}
in a:, **remainder then "matched: #{a}, #{remainder}"
else "unmatched"
end
# "matched: 1, {:b=>2, :c=>3}"
By using *remainder
for the array and **remainder
for the hash, you can match against specific positions/keys while still having access to the remaining elements, which allows you to do more with the matched results.
Variable Pinning
Variable pinning allows you to match against global, class, instance, and local variables. This works for both arrays and hashes but, for illustration purposes, we’ll stick to arrays:
$global = 1
class Demo
@@class_variable = 2
def initialize
@instance_variable = 3
end
def call
local = 4
case [1, 2, 3, 4]
in ^$global, ^@@class_variable, ^@instance_variable, ^local then "matched"
else "unmatched"
end
end
end
Demo.new.call # "matched"
The above is a match because the [1, 2, 3, 4]
array matched all of the variable values but let’s dive into this further by focusing on local variables for simplicity (these examples apply to global, class, and instance variables too):
expectation = 5
case [1, 2]
in expectation, * then "matched: #{expectation}"
else "unmatched: #{expectation}"
end
# "matched: 1"
expectation = 5
case [1, 2]
in ^expectation, * then "matched: #{expectation}"
else "unmatched: #{expectation}"
end
# "unmatched: 5"
A local variable doesn’t have to exist prior to pattern matching because you can bind a local variable and then pin it later within the same pattern. Example:
case [1, 1]
in value, ^value then "values are identical"
else "values are different"
end
# "values are identical"
case [1, 2]
in value, ^value then "values are identical"
else "values are different"
end
# "values are different"
You are not limited to pinning and using local variables within the same level of your pattern because pinning can be leveraged within a nested pattern as well. Example:
case {school: "high", schools: [{id: 1, level: "middle"}, {id: 2, level: "high"}]}
in school:, schools: [*, {id:, level: ^school}] then "matched: #{id}"
else "unmatched"
end
# "matched: 2"
case {school: "high", schools: [{id: 1, level: "middle"}]}
in school:, schools: [*, {id:, level: ^school}] then "matched: #{id}"
else "unmatched"
end
# "unmatched"
The reason the first example works is because school
is pinned to "high" which happens to equal the level of school record ID 2 but the second example is not a match because only one school record (i.e. "middle") exists which is not equal to the school level of "high".
Finally, you can use variable pinning with expressions as long as they are wrapped in parentheses. Example:
multiplier = 2
case [2, 4]
in Integer, ^(2 * multiplier) then "matched"
else "unmatched"
end
# "matched"
case [2, 4]
in Integer, ^(2 * multiplier) => expectation then "matched: #{expectation}"
else "unmatched: #{expectation}"
end
# "matched: 4"
⚠️ Be aware that local variables will be overwritten if not pinned:
expectation = 5
case [1, 2]
in expectation, * then "matched: #{expectation}"
else "unmatched: #{expectation}"
end
expectation # 1
Combinations
You can match against any one of several patterns when they are separated by a pipe (i.e. |
):
case [:a, 1, 2.0]
in [Symbol] | [Symbol, Integer, Float] then "matched"
else "unmatched"
end
# "matched"
case {a: 1, b: 2, c: 3}
in Array | {a: Integer} then "matched"
else "unmatched"
end
# "matched"
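In both of the examples above, the first alternative failed but the second alternative (i.e. the "or" condition) was satisfied so both are a match. Using different data with the same patterns, we still have matches because the first alternative is satisfied without reaching the "or" condition: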
case [:a]
in [Symbol] | [Symbol, Integer, Float] then "matched"
else "unmatched"
end
# "matched"
case [1, 2, 3]
in Array | {a: Integer} then "matched"
else "unmatched"
end
# "matched"
⚠️ Be careful when matching arrays since implicit or explicit use of brackets can yield different results when using combinations. Using standalone syntax, these variations of the above highlight the differences:
[:a, 1, 2.0] in Symbol | Symbol, Integer, Float # true
[:a] in String | Symbol, Integer, Float # false
[:a] in String | [Symbol, Integer, Float] # false
[:a] in [Symbol] | [Symbol, Integer, Float] # true
Basically, a single element array will only match if both combinations are wrapped in brackets.
⚠️ You can’t bind a variable when using combinations either. Example:
case [1, 2, 3]
in [Integer => first] | [Integer, Integer] then "matched"
else "unmatched"
end
# SyntaxError: illegal variable in alternative pattern (first)
case {a: 1, b: 2, c: 3}
in {a:} | Array then "matched"
else "unmatched"
end
# SyntaxError: illegal variable in alternative pattern (a)
That said, you can — for hashes only — discard a key value using an underscore:
case {a: 1, b: 2, c: 3}
in {a: _} | Array then "matched"
else "unmatched"
end
# "matched"
This also means that _
is bound as a local variable but the Ruby Core team strongly recommends against using the _
local variable because it is meant to denote the discarding of a key’s value only.
Guards
Guard statements — as used elsewhere in your code — can be used as one-liners for pattern matching. Example:
case %w[apple peach blueberry]
in String, middle, String if middle == "peach" then "matched"
else "unmatched"
end
# "matched"
case {animal: "bat", vehicle: "tumbler", color: "black"}
in {color: String => color} if color == "green" then "matched"
else "unmatched"
end
# "unmatched"
💡 While only if
is shown above, unless
works as a guard clause too.
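For completeness, here’s a minimal sketch of the unless variant. The check method name is hypothetical and only wraps the case statement so the same pattern can be tried against different input:

```ruby
# Hypothetical helper which wraps the pattern match for reuse.
def check fruits
  case fruits
  # The guard rejects the match when the middle element is a peach.
  in String, middle, String unless middle == "peach" then "matched: #{middle}"
  else "unmatched"
  end
end

puts check(%w[apple plum blueberry])   # "matched: plum"
puts check(%w[apple peach blueberry])  # "unmatched"
```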
Arrays
There are aspects of pattern matching which are unique to arrays. The following will highlight these differences.
Order
Due to the inherent nature of arrays, the order of elements matters when using arrays versus hashes. Example:
case [1, :a]
in Symbol, Integer then "matched"
else "unmatched"
end
# "unmatched"
case [:a, 1]
in Symbol, Integer then "matched"
else "unmatched"
end
# "matched"
As you can see, only the second example is a match because the type of elements being pattern matched are in the correct order. On the other hand, hashes don’t need to be concerned about order since the benefit of having keys means they can be in any order:
case {a: 1, b: 2}
in b: Integer then "matched"
else "unmatched"
end
# "matched"
case {b: 2, a: 1}
in b: Integer then "matched"
else "unmatched"
end
# "matched"
Due to the importance of array order, it is wise to ensure your arrays are sorted prior to pattern matching them because this will increase the accuracy of your results.
Finds
Another feature unique to arrays is the use of bare single splats to find elements within an array. Here’s an example where we only care that the last elements of the array match a specific pattern:
case [:a, 1, :b, :c, 2]
in *, Symbol, Integer then "matched"
else "unmatched"
end
# "matched"
You can also use the find pattern (experimental prior to Ruby 3.2) where bare single splats are applied to both sides of the pattern to match elements in the middle of an array:
case [:a, 1, :b, :c, 2]
in *, Symbol, Symbol, * then "matched"
else "unmatched"
end
# "matched"
An enhancement over bare single splats is named single splats for situations in which you need to make use of the elements found in the splats. Example:
case [:a, 1, :b, :c, 2]
in *first, Symbol, Symbol, *last then {first: first, last: last}
else "unmatched"
end
# {first: [:a, 1], last: [2]}
⚠️ Leading splats, as used in the examples above, can also be expensive with large arrays since the pattern match may need to scan the entire array to obtain the match.
Hashes
Hashes also have unique characteristics when it comes to pattern matching. The following highlights these differences.
Empty
Hashes have the ability to match on empty hashes which is a slight exception to the partial matching behavior mentioned earlier in the Whole versus Partial matches section.
case {a: 1, b: 2, c: 3}
in {} then "matched"
else "unmatched"
end
# "unmatched"
case {}
in {} then "matched"
else "unmatched"
end
# "matched"
This difference is important to note since hashes use partial matching by default so the only way to look for an empty hash is to explicitly match against an empty hash.
Voids
You can use voids to obtain a whole match on a hash when you don’t want additional pairs in your hash. Example:
case {a: 1, b: 2}
in {a: Integer, **nil} then %(matched "a" part)
in {a: Integer, b: Integer, **nil} then "matched whole hash"
else "unmatched"
end
# "matched whole hash"
The reason the first pattern didn’t match is because b: 2
was an extra element. The second pattern, on the other hand, matched because it allowed exactly two keys. Had there been a third key then none of the patterns would have matched due to the void constraint. Voids are a nice way to ensure no additional elements show up when needing a whole match since hashes use partial matching by default.
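To illustrate the third key scenario, here’s a small sketch which runs the same void patterns against hashes of different sizes. The classify method name is an assumption of mine for illustration purposes:

```ruby
# Hypothetical helper which applies the void patterns from above.
def classify hash
  case hash
  in {a: Integer, **nil} then %(matched "a" part)
  in {a: Integer, b: Integer, **nil} then "matched whole hash"
  else "unmatched"
  end
end

puts classify({a: 1})              # matched "a" part
puts classify({a: 1, b: 2})        # matched whole hash
puts classify({a: 1, b: 2, c: 3})  # unmatched
```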
Structs
So far we’ve mostly been focused on arrays and hashes when it comes to pattern matching but structs are excellent as well. In fact, structs — along with classes which we’ll get to in a moment — unlock a new capability with pattern matching which is: types. To start, let’s say we have the following struct:
Point = Struct.new :x, :y
Now watch what happens when we pattern match using array syntax:
case Point[1, 2]
in Point[..5, ..5] then "matched"
else "unmatched"
end
# "matched"
case Point[1, 2]
in ..5, ..5 then "matched"
else "unmatched"
end
# "matched"
case [1, 2]
in Point[..5, ..5] then "matched"
else "unmatched"
end
# "unmatched"
In the first example we get a match because we are matching by type and value which is about as explicit as you can get. In contrast, the second example is also a match but this time we don’t care about type and only match by value instead. In other words, the second example’s result is the same but less rigid than the first example. Finally, the third example is not a match because we have an incoming array but we are pattern matching against the Point
struct which is not the same type as the Array
type so this match fails.
We can also pattern match using hash syntax:
case Point[1, 2]
in Point(x: ..5, y: ..5) then "matched"
else "unmatched"
end
# "matched"
case Point[1, 2]
in x: ..5, y: ..5 then "matched"
else "unmatched"
end
# "matched"
case {x: 1, y: 2}
in Point(x: ..5, y: ..5) then "matched"
else "unmatched"
end
# "unmatched"
As you can see, the results are the same using array and hash syntax. The reason is because structs implement both the #deconstruct
and #deconstruct_keys
methods.
Finally, and this is subtle, when matching array types, I used brackets (i.e. []
) but when matching hash types, I used parentheses (i.e. ()
). Both syntaxes are interchangeable but part of the reason for using brackets is that they read nicely with arrays. It’s worth noting that the Ruby Core Team doesn’t specify which syntax you should use when type checking for structs and classes, in general. I tend to use parentheses except when pattern matching with arrays. Whatever your choice, make sure you are consistent.
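As a quick sketch of how interchangeable [] and () are, the following applies both syntaxes to both the array and hash forms of the same Point struct. The local variable names are my own and each check is wrapped in parentheses so the boolean result is what gets assigned:

```ruby
Point = Struct.new :x, :y
point = Point[1, 2]

# Array style patterns using [] and then ().
array_brackets = (point in Point[..5, ..5])
array_parentheses = (point in Point(..5, ..5))

# Hash style patterns using [] and then ().
hash_brackets = (point in Point[x: ..5, y: ..5])
hash_parentheses = (point in Point(x: ..5, y: ..5))

puts [array_brackets, array_parentheses, hash_brackets, hash_parentheses].inspect
# [true, true, true, true]
```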
💡 For a deeper dive on Structs, you might enjoy this related article.
Classes
For situations where primitives such as arrays, hashes, and structs are not enough for your needs, you can teach non-primitive objects to be matchable by implementing the #deconstruct
method for arrays and/or the #deconstruct_keys
method for hashes. Example:
class Point
ALLOWED_KEYS = %i[x y].freeze
def initialize x, y
@x = x
@y = y
end
def to_a = [x, y]
def to_h = {x:, y:}
alias deconstruct to_a
def deconstruct_keys(keys) = to_h.slice(*(keys ? ALLOWED_KEYS.intersection(keys) : ALLOWED_KEYS))
private
attr_reader :x, :y
end
At a minimum, you only need to implement the #deconstruct
and #deconstruct_keys
methods for pattern matching purposes but, for completeness, you should also implement the corresponding #to_a
and #to_h
methods above. You’ll also notice, for integrity, that I constrain #deconstruct_keys
with the intersection of the given keys
and ALLOWED_KEYS
which ensures you can match when given nil
or unsupported keys. All of this makes it so you can allow a Point
instance to be matchable as an array or a hash. Here’s how we can use pattern matching:
case Point.new 1, -2
in x, Integer then "matched: #{x}"
else "unmatched"
end
# "matched: 1"
case Point.new 1, -2
in x: 0.. then "matched: #{x}"
else "unmatched"
end
# "matched: 1"
In the first example, we get a match because we are matching by position and type within an array. The second example is also a match but this time we are only looking for the x
key which has a value of zero to infinity (i.e. endless range).
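To see why #deconstruct_keys needs to cope with both an array of keys and nil, here’s a minimal sketch using a hypothetical Probe class (not part of Ruby or any library) which simply records whatever keys Ruby hands it:

```ruby
# Hypothetical class which records the keys passed to #deconstruct_keys.
class Probe
  attr_reader :requested

  def initialize
    @requested = []
  end

  def deconstruct_keys keys
    requested << keys
    {x: 1, y: -2}
  end
end

probe = Probe.new

probe in {x: Integer}          # Only the :x key is requested.
probe in {x: Integer, **rest}  # A named double splat requests all keys by passing nil.

puts probe.requested.inspect
# [[:x], nil]
```

This is why the Point implementation above falls back to all allowed keys when given nil.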
Examples
If you’d like more examples of pattern matching, several of this site’s projects use pattern matching, especially those with Command Line Interfaces (CLIs). For example, here’s a snippet from Rubysmith where pattern matching is used to process command line input:
module Rubysmith
module CLI
# The main Command Line Interface (CLI) object.
class Shell
# Truncated.
def call arguments = []
case parse arguments
in action_config: Symbol => action then config.call action
in action_build: true then build.call
in action_version: true then kernel.puts configuration.version
else kernel.puts parser.to_s
end
end
end
end
end
Another example is the Janus system which uses pattern matching to process Slack commands:
module Janus
module Actions
class Processor
# Truncated.
def call request
# Truncated.
text = request.text
case text.split
in ["help"] then help
in "status", String => name then status.call name
in "version", String => name, String => number then version.call name, number
in "deploy", String => name, String => version then deploy.call name, version
else unknown.call text
end
end
end
end
end
Finally, you can always use Gemsmith to build your own gem with CLI support which automatically generates code that uses pattern matching. To experiment, run the following locally:
gem install gemsmith
gemsmith build --name demo --cli
Now you can edit your demo
gem in your favorite editor and look at the Shell
class and corresponding spec.
Guidelines
Pattern matching, while powerful, can get unwieldy when used without constraint. Here are a few guidelines for consideration to keep your code clean:
- Use sorted arrays for consistent matching but also better performance.
- Use one-line standalone pattern matches for boolean results instead of the more verbose multiple line syntax.
- Avoid unnecessary deconstruction in your classes by only adding pattern matching support when you need it.
- Avoid heavy use of the array find pattern (i.e. [*, desired_match, *]) since it can be expensive and hurt performance.
- Avoid shadow variables when variable binding since duplicate names can be hard to debug (unless you are using variable pinning).
- Avoid complicated nested patterns by breaking them down into smaller patterns which can be composed together or by using a simplified data structure instead.
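As one possible approach to the last guideline (an assumption on my part and not the only way to do this), you can wrap small patterns in lambdas stored as constants. Since constants in a pattern are compared with ===, and a lambda’s === calls the lambda, the smaller pieces compose into larger ones. The MEMBER, CREW, and SHIP names are hypothetical:

```ruby
# Hypothetical building blocks where each lambda wraps one small pattern.
MEMBER = -> value { value in {name: String, title: String} }
CREW = -> value { value in [MEMBER, *] }
SHIP = -> value { value in {ship: String, crew: CREW} }

serenity = {ship: "Serenity", crew: [{name: "Malcolm Reynolds", title: "Captain"}]}
ghost = {ship: "Ghost", crew: []}

# The constant is matched via SHIP === value which calls the lambda.
results = [serenity, ghost].map do |candidate|
  case candidate
  in SHIP then "valid"
  else "invalid"
  end
end

puts results.inspect
# ["valid", "invalid"]
```

The trade-off is that lambdas can only answer true or false so you lose variable binding within the composed pieces.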
Resources
The original documentation and source for this article can be found via the Ruby Pattern Matching documentation. I’ve also spoken on this specific topic in the past so here are the slides.
Conclusion
Congratulations and thanks for sticking with me through to the end. By now, you can see the power pattern matching brings to Ruby and hopefully this’ll help you write better and more robust code as well. Enjoy and may this be a future reference to you.