Published October 1, 2022 Updated October 10, 2022
Ruby Pattern Matching

Wikipedia defines pattern matching as:

[T]he act of checking a given sequence of tokens for the presence of the constituents of some pattern.

Using the above definition, this means a pattern could be applied to something as simple as a hash with ship and crew information:

{
  ship: "Serenity",
  crew: [
    {
      name: "Malcolm Reynolds",
      title: "Captain"
    },
    {
      name: "Zoe Alleyne Washburne",
      title: "Soldier"
    },
    {
      name: "Kaywinnet Lee Frye",
      title: "Engineer"
    }
  ]
}

Upon inspection, you can see we have a hash with a crew array where each element of the array is also a hash that repeats the same name and title keys. That’s a pattern we can match against! To learn how, we’ll spend the rest of this article understanding what pattern matching is and how best to use it.
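
As a quick preview, here is a minimal sketch of how the hash above could be matched. The describe method is a hypothetical helper made up for illustration, and the pattern only checks the shape of the first crew member. The syntax itself is explained throughout the rest of this article.

```ruby
# Hypothetical helper (not part of any library) which checks the shape of the data.
def describe data
  case data
    # Ship must be a string and the first crew member must have a name and title.
    in {ship: String => ship, crew: [{name: String, title: String}, *] => crew}
      "#{ship} has #{crew.size} crew members"
    else "unknown shape"
  end
end

serenity = {
  ship: "Serenity",
  crew: [
    {name: "Malcolm Reynolds", title: "Captain"},
    {name: "Zoe Alleyne Washburne", title: "Soldier"},
    {name: "Kaywinnet Lee Frye", title: "Engineer"}
  ]
}

describe(serenity)   # "Serenity has 3 crew members"
describe({ship: 1})  # "unknown shape"
```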

Overview

From a high level, here are the various patterns we’ll be discussing in this article:

Pattern            Example
Standalone         in, =>
Arrays             [<pattern>, <pattern>]
Hashes             {key: <pattern>, key: <pattern>}
Remainders         *, **, *remainder, **remainder
Finds              [*, <pattern>, *]
Excludes           _
Voids              **nil
Variable Pinning   ^variable
Combinations       |
Guards             if, unless
Classes            deconstruct, deconstruct_keys

The above can be useful as a quick reference should you need it. We’ll get to the details of each shortly but first, a quick look at the history of pattern matching can provide some helpful context.

History

The evolution of the pattern matching feature can be summarized over a series of versions (years):

  • Ruby 2.7.0 - Introduced for the first time as an experimental feature.

  • Ruby 3.0.0 - Officially supported while introducing new experimental features:

    • Standalone rightward assignment (i.e. =>).

    • Standalone boolean checks (i.e. in).

    • Find patterns (i.e. <object> in [*, query, *]).

  • Ruby 3.1.0 - Rightward assignment and boolean checks (i.e. standalone) became officially supported while the pin operator (i.e. ^) gained the ability to use statements and/or more complex expressions instead of only constants, literals, and local variables.

Syntax

Now that you have a sense of what pattern matching is and how it evolved over time, we can dive into the specifics of syntax.

Standalone

Standalone syntax is the simplest form of pattern matching and is most useful when only needing to write a single line of code. Anything beyond a one-liner will require the use of a case statement (explained shortly). There are two forms of standalone syntax which are explained next.

Boolean Check

A boolean check — denoted by the in keyword — is a handy way to pattern match when only needing to know if the result is true or false. Example:

basket = [{kind: "apple", quantity: 1}, {kind: "peach", quantity: 5}]

basket.any? { |fruit| fruit in {kind: /app/}}    # true
basket.any? { |fruit| fruit in {kind: /berry/}}  # false

Boolean checks are also useful when used in guard clauses or other kinds of conditional logic.
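
Here is a minimal sketch of a boolean check used as a guard clause. The total method is a hypothetical name chosen for illustration and assumes each fruit is a hash like the ones in the basket above.

```ruby
# Hypothetical method: answers the quantity or zero when the input has the wrong shape.
def total fruit
  # Guard clause: bail out early unless the fruit matches the expected pattern.
  return 0 unless fruit in {kind: String, quantity: Integer => quantity}

  quantity
end

total({kind: "apple", quantity: 3})  # 3
total({kind: "apple"})               # 0
total("apple")                       # 0
```

The match binds quantity as a side effect so the rest of the method can use it without digging into the hash again.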

Rightward Assignment

Rightward assignment — denoted by => — provides a quick way to assign a value based on the matched pattern. Example:

{character: {first_name: "Malcolm", last_name: "Reynolds"}} => {character: {last_name:}}

puts "Last Name: #{last_name}"  # "Last Name: Reynolds"

⚠️ Rightward assignment — unlike a boolean check — will throw an error when a pattern can’t be matched. For example:

{character: {first_name: "Malcolm", last_name: "Reynolds"}} => {character: {middle_name:}}

# NoMatchingPatternKeyError: key not found: :middle_name

For more robust code, you’ll either want to catch the exception (expensive) or switch to using a case statement (recommended).
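
To make the two options concrete, here is a minimal sketch of both. The method names are hypothetical, and the first one rescues NoMatchingPatternError, which is the parent class of NoMatchingPatternKeyError, so it covers either failure.

```ruby
# Option 1: Rescue the exception (works but is comparatively expensive).
def middle_name_via_rescue data
  data => {character: {middle_name:}}
  middle_name
rescue NoMatchingPatternError
  "Unknown"
end

# Option 2: Use a case statement with an else branch (recommended).
def middle_name_via_case data
  case data
    in {character: {middle_name:}} then middle_name
    else "Unknown"
  end
end

character = {character: {first_name: "Malcolm", last_name: "Reynolds"}}

middle_name_via_rescue character  # "Unknown"
middle_name_via_case character    # "Unknown"
```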

Case Statement

The case statement is where you leverage the full power of pattern matching. The difference is that you use in instead of when: the traditional case...when...else...end syntax is for conditional logic while the new case...in...else...end syntax is for pattern matching. Here’s a better illustration:

case <expression>
  in <pattern> then <statement>
  in <pattern> then <statement>
  in <pattern> then <statement>
  else <statement>
end

This doesn’t mean you can mix and match when and in statements together. You must either use a case statement for conditional logic or pattern matching but not both because the following will fail with a syntax error:

case "example"
  in text then puts text
  when "example" then puts "Found"
  else puts "Unknown"
end

# SyntaxError: unexpected `when', expecting `end'

Unlike case statement conditional logic — where else is optional — pattern matching is exhaustive in that you’ll get a NoMatchingPatternError if none of the patterns match. Example:

# Faulty code.
case %i[a b c d]
  in Symbol then puts "Single"
  in Symbol, Symbol then puts "Double"
end

# NoMatchingPatternError

# Robust code.
case %i[a b c d]
  in Symbol then puts "Single"
  in Symbol, Symbol then puts "Double"
  else puts "Unable to match input."
end

# "Unable to match input."

This is a nice enhancement over case statements using conditional logic because you always want to use the else branch of your logic for more robust code that doesn’t cause surprising downstream exceptions or the more insidious nil which is never fun to debug.

💡 You can enforce case statements to always have an else by ensuring RuboCop’s Style/MissingElse is enabled.

Syntax-wise, brackets and braces are optional when using case statements:

# With brackets and braces.
case [1, 2, 3]
  in [Integer, Integer] then "match"  # With brackets.
  else "unmatched"
end

case {a: 1, b: 2, c: 3}
  in {a: Integer} then "matched"      # With braces.
  else "unmatched"
end

# Without brackets and braces.
case [1, 2, 3]
  in Integer, Integer then "match"   # Without brackets.
  else "unmatched"
end

case {a: 1, b: 2, c: 3}
  in a: Integer then "matched"       # Without braces.
  else "unmatched"
end

⚠️ You must use brackets and braces with nested patterns, though. Standalone syntax also required them prior to Ruby 3.1, which is when the ability to omit them was added.

Whole versus Partial

Arrays and hashes behave differently when it comes to whole and partial matches. Consider the following:

case [1, 2, 3]
  in Integer, Integer then "matched"
  else "unmatched"
end

# "unmatched"

case {a: 1, b: 2, c: 3}
  in a: Integer then "matched"
  else "unmatched"
end

# "matched"

The reason the hash matched and the array didn’t is because arrays default to being whole matches while hashes are partial matches. The nuance of all this will be explained in more detail once we delve into the specifics of arrays and hashes.

Variable Binding

Variable binding allows you to bind the result of a pattern match to a local variable. Example (using arrays and hashes):

# Array
case [1, 2]
  in first, Integer then "matched: #{first}"
  else "unmatched"
end

# "matched: 1"

# Hash
case {a: 1, b: 2}
  in a: first then "matched: #{first}"
  else "unmatched"
end

# "matched: 1"

In the above example, the local variable is bound by position (array) and key (hash) but you can expand upon this by adding in type checks too:

# Array
case [1, 2]
  in Integer => first, Integer then "matched: #{first}"
  else "unmatched"
end

# "matched: 1"

# Hash
case {a: 1, b: 2}
  in a: Integer => first then "matched: #{first}"
  else "unmatched"
end

# "matched: 1"

You’ll notice we’ve bound our local variable, as before, except this time we check if the type is an Integer for more robust pattern matching in situations where we need type safety.

Hashes

Unique to hashes is the ability to bind a local variable by key alone. Example:

case {a: 1, b: 2}
  in a: then "matched: #{a}"
  else "unmatched"
end

# "matched: 1"

In situations where your hash keys are adequately named, this saves you extra typing while still having readable code.

Nesting

Variable binding works with more complex and nested arrays and hashes but requires you to make use of explicit braces and brackets:

case [%w[Apple Apricot], %w[Blueberry Blackberry]]
  in [[first, String], *] then "matched: #{first}"
  else "unmatched"
end

# Yields "matched: Apple"

case {label: "Basket", fruits: [{label: "Apple"}, {label: "Peach"}]}
  in label:, fruits: [{label: first}, *] then "matched: #{first}"
  else "unmatched"
end

# Yields "matched: Apple"

Remainders

Remainders, which you’ve seen a few examples of already, are captured through the use of single and double splats. These splats can be either bare or named. Starting with bare splats, here’s what they look like when used with arrays and hashes:

case [1, 2, 3]
  in Integer, * then "matched"
  else "unmatched"
end

# "matched"

case {a: 1, b: 2, c: 3}
  in a: Integer, ** then "matched"
  else "unmatched"
end

# "matched"

The above results in a match for both the array and hash examples where only the first elements of the pattern are matched since the single and double splats ignore everything else. This is great when you don’t care to use the remaining elements. In situations where you do care about the remainders, you’ll want to name them. Example:

case [1, 2, 3]
  in first, *remainder then "matched: #{first}, #{remainder}"
  else "unmatched"
end

# "matched: 1, [2, 3]"

case {a: 1, b: 2, c: 3}
  in a:, **remainder then "matched: #{a}, #{remainder}"
  else "unmatched"
end

# "matched: 1, {:b=>2, :c=>3}"

By using *remainder for the array and **remainder for the hash, you can match against specific positions/keys while still having access to the remaining elements, which allows you to do more with the matched results.
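
One thing a named remainder makes possible is recursion. The following is a minimal sketch, where sum is a hypothetical helper written for illustration rather than a recommended way to add numbers:

```ruby
# Hypothetical recursive helper: splits the first element from the remainder.
def sum numbers
  case numbers
    in [] then 0
    in [Integer => first, *remainder] then first + sum(remainder)
    else raise ArgumentError, "Expected only integers: #{numbers.inspect}"
  end
end

sum([1, 2, 3])  # 6
sum([])         # 0
```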

Variable Pinning

Variable pinning allows you to match against global, class, instance, and local variables. This works for both arrays and hashes but, for illustration purposes, we’ll stick to arrays:

$global = 1

class Demo
  @@class_variable = 2

  def initialize
    @instance_variable = 3
  end

  def call
    local = 4

    case [1, 2, 3, 4]
      in ^$global, ^@@class_variable, ^@instance_variable, ^local then "matched"
      else "unmatched"
    end
  end
end

Demo.new.call  # matched

The above is a match because the [1, 2, 3, 4] array matched all of the variable values but let’s dive into this further by focusing on local variables for simplicity (these examples apply to global, class, and instance variables too):

expectation = 5

case [1, 2]
  in expectation, * then "matched: #{expectation}"
  else "unmatched: #{expectation}"
end

# "matched: 1"

expectation = 5

case [1, 2]
  in ^expectation, * then "matched: #{expectation}"
  else "unmatched: #{expectation}"
end

# "unmatched: 5"

A local variable doesn’t have to exist prior to pattern matching because you can bind a local variable and then pin it later within the same pattern. Example:

case [1, 1]
  in value, ^value then "values are identical"
  else "values are different"
end

# "values are identical"

case [1, 2]
  in value, ^value then "values are identical"
  else "values are different"
end

# "values are different"

You are not limited to binding and pinning local variables within the same level of your pattern because pinning can be leveraged within a nested pattern as well. Example:

case {school: "high", schools: [{id: 1, level: "middle"}, {id: 2, level: "high"}]}
  in school:, schools: [*, {id:, level: ^school}] then "matched: #{id}"
  else "unmatched"
end

# "matched: 2"

case {school: "high", schools: [{id: 1, level: "middle"}]}
  in school:, schools: [*, {id:, level: ^school}] then "matched: #{id}"
  else "unmatched"
end

# "unmatched"

The reason the first example works is because school is bound to "high" which happens to be equal to the level of the school record with ID 2. The second example is not a match because only one school record (i.e. "middle") exists which is not equal to the school level of "high".

Finally, you can use variable pinning with expressions as long as they are wrapped in parentheses. Example:

multiplier = 2

case [2, 4]
  in Integer, ^(2 * multiplier) then "matched"
  else "unmatched"
end

# "matched"

case [2, 4]
  in Integer, ^(2 * multiplier) => expectation then "matched: #{expectation}"
  else "unmatched: #{expectation}"
end

# "matched: 4"

⚠️ Be aware that local variables will be overridden if not pinned:

expectation = 5

case [1, 2]
  in expectation, * then "matched: #{expectation}"
  else "unmatched: #{expectation}"
end

expectation  # 1

Combinations

You can match against one of several patterns when they are separated by a pipe (i.e. |):

case [:a, 1, 2.0]
  in [Symbol] | [Symbol, Integer, Float] then "matched"
  else "unmatched"
end

# "matched"

case {a: 1, b: 2, c: 3}
  in Array | {a: Integer} then "matched"
  else "unmatched"
end

# "matched"

In both of the examples above, the "or" condition was satisfied so both are a match. Using different data with the same patterns we still see we have matches without reaching the "or" condition:

case [:a]
  in [Symbol] | [Symbol, Integer, Float] then "matched"
  else "unmatched"
end

# "matched"

case [1, 2, 3]
  in Array | {a: Integer} then "matched"
  else "unmatched"
end

# "matched"

⚠️ Be careful when matching arrays since implicit or explicit use of brackets can yield different results when using combinations. Using standalone syntax, here are variations on the above which highlight the differences:

[:a, 1, 2.0] in Symbol | Symbol, Integer, Float  # true
[:a] in String | Symbol, Integer, Float          # false
[:a] in String | [Symbol, Integer, Float]        # false
[:a] in [Symbol] | [Symbol, Integer, Float]      # true

Basically, a single element array will only match if both combinations are wrapped in brackets.

⚠️ You can’t bind a variable when using combinations either. Example:

case [1, 2, 3]
  in [Integer => first] | [Integer, Integer] then "matched"
  else "unmatched"
end

# SyntaxError: illegal variable in alternative pattern (first)

case {a: 1, b: 2, c: 3}
  in {a:} | Array then "matched"
  else "unmatched"
end

# SyntaxError: illegal variable in alternative pattern (a)

That said, you can discard a value by using an underscore since names starting with an underscore are exempt from this rule:

case {a: 1, b: 2, c: 3}
  in {a: _} | Array then "matched"
  else "unmatched"
end

# "matched"

This also means that _ is bound as a local variable but the Ruby Core team strongly recommends against using the _ local variable because it is meant to denote the discarding of a key’s value only.

Guards

Guard statements — as used elsewhere in your code — can be used as one-liners for pattern matching. Example:

case %w[apple peach blueberry]
  in String, middle, String if middle == "peach" then "matched"
  else "unmatched"
end

# "matched"

case {animal: "bat", vehicle: "tumbler", color: "black"}
  in {color: String => color} if color == "green" then "matched"
  else "unmatched"
end

# "unmatched"

💡 While only if is shown above, unless works as a guard clause too.
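
For completeness, here is a minimal sketch of an unless guard. The classify method is a hypothetical name used for illustration only.

```ruby
# Hypothetical method: the pattern only applies when the first element isn't empty.
def classify fruits
  case fruits
    in [String => first, *] unless first.empty? then "starts with #{first}"
    else "unmatched"
  end
end

classify(%w[apple peach])  # "starts with apple"
classify(["", "peach"])    # "unmatched"
```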

Arrays

There are aspects of pattern matching which are unique to arrays. The following will highlight these differences.

Order

Due to the inherent nature of arrays, the order of elements matters when using arrays versus hashes. Example:

case [1, :a]
  in Symbol, Integer then "matched"
  else "unmatched"
end

# "unmatched"

case [:a, 1]
  in Symbol, Integer then "matched"
  else "unmatched"
end

# "matched"

As you can see, only the second example is a match because the type of elements being pattern matched are in the correct order. On the other hand, hashes don’t need to be concerned about order since the benefit of having keys means they can be in any order:

case {a: 1, b: 2}
  in b: Integer then "matched"
  else "unmatched"
end

# "matched"

case {b: 2, a: 1}
  in b: Integer then "matched"
  else "unmatched"
end

# "matched"

Due to the importance of array order, it is wise to ensure your arrays are sorted prior to pattern matching them because this will increase the accuracy of your results.
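
As a minimal sketch of that advice, the hypothetical primary_colors? method below sorts its input before matching so the caller’s ordering doesn’t matter:

```ruby
# Hypothetical method: sorting first means the pattern only needs one known order.
def primary_colors? colors
  colors.sort in ["blue", "red", "yellow"]
end

primary_colors? %w[red yellow blue]  # true
primary_colors? %w[yellow blue red]  # true
primary_colors? %w[red green blue]   # false
```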

Finds

Another uniqueness of arrays is using bare single splats to find elements within an array. Here’s an example where we only care that the last elements of the array match a specific pattern:

case [:a, 1, :b, :c, 2]
  in *, Symbol, Integer then "matched"
  else "unmatched"
end

# "matched"

You can also use an experimental feature where bare single splats are applied to both sides of the pattern to match for elements in the middle of an array:

case [:a, 1, :b, :c, 2]
  in *, Symbol, Symbol, * then "matched"
  else "unmatched"
end

# "matched"

An enhancement over bare single splats is named single splats for situations in which you need to make use of the elements found in the splats. Example:

case [:a, 1, :b, :c, 2]
  in *first, Symbol, Symbol, *last then {first: first, last: last}
  else "unmatched"
end

# {first: [:a, 1], last: [2]}

⚠️ Find patterns, where splats are applied to both sides as in the last two examples above, are an experimental feature so you’ll see warnings when using them. They can also be expensive with large arrays since the pattern match might need to scan the entire array to obtain the match.
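
Finds also work on arrays of hashes, which is handy for lookups. The following is a minimal sketch where title_for is a hypothetical helper and the crew data is made up for illustration (expect an experimental warning on Ruby versions where find patterns are still experimental):

```ruby
# Hypothetical helper: finds the first crew member with the given name.
def title_for crew, name
  case crew
    # The name is pinned while the title is bound from the found element.
    in [*, {name: ^name, title:}, *] then title
    else "Unknown"
  end
end

crew = [
  {name: "Malcolm Reynolds", title: "Captain"},
  {name: "Zoe Alleyne Washburne", title: "Soldier"}
]

title_for crew, "Zoe Alleyne Washburne"  # "Soldier"
title_for crew, "Jayne Cobb"             # "Unknown"
```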

Hashes

Hashes also have unique characteristics when it comes to pattern matching. The following highlights these differences.

Empty

Hashes have the ability to match on empty hashes which is a slight exception to the partial matching behavior mentioned earlier in the Whole versus Partial matches section.

case {a: 1, b: 2, c: 3}
  in {} then "matched"
  else "unmatched"
end

# "unmatched"

case {}
  in {} then "matched"
  else "unmatched"
end

# "matched"

This difference is important to note since hashes use partial matching by default so the only way to look for an empty hash is to explicitly match against an empty hash.

Voids

You can use voids to obtain a whole match on a hash when you don’t want additional pairs in your hash. Example:

case {a: 1, b: 2}
  in {a: Integer, **nil} then %(matched "a" part)
  in {a: Integer, b: Integer, **nil} then "matched whole hash"
  else "unmatched"
end

# "matched whole hash"

The reason the first pattern didn’t match is because b: 2 was an extra element. The second pattern, on the other hand, matched because it allowed exactly two keys. Had there been a third key then none of the patterns would have matched due to the void constraint. Voids are a nice way to ensure no additional elements show up when needing a whole match since hashes use partial matching by default.
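
To see the third key scenario in action, here is a minimal sketch using a boolean check. The exact_pair? method is a hypothetical name used for illustration.

```ruby
# Hypothetical method: true only when the hash has exactly the a and b keys.
def exact_pair? data
  data in {a: Integer, b: Integer, **nil}
end

exact_pair?({a: 1, b: 2})        # true
exact_pair?({a: 1, b: 2, c: 3})  # false
exact_pair?({a: 1})              # false
```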

Structs

So far we’ve mostly been focused on arrays and hashes when it comes to pattern matching but structs are excellent as well. In fact, structs — along with classes which we’ll get to in a moment — unlock a new capability with pattern matching which is: types. To start, let’s say we have the following struct:

Point = Struct.new :x, :y

Now watch what happens when we pattern match using array syntax:

case Point[1, 2]
  in Point[..5, ..5] then "matched"
  else "unmatched"
end

# "matched"

case Point[1, 2]
  in ..5, ..5 then "matched"
  else "unmatched"
end

# "matched"

case [1, 2]
  in Point[..5, ..5] then "matched"
  else "unmatched"
end

# "unmatched"

In the first example we get a match because we are matching by type and value which is about as explicit as you can get. In contrast, the second example is also a match but this time we don’t care about type and only match by value instead. In other words, the second example’s result is the same but less rigid than the first example. Finally, the third example is not a match because we have an incoming array but we are pattern matching against the Point struct which is not the same type as the Array type so this match fails.

We can also pattern match using hash syntax:

case Point[1, 2]
  in Point(x: ..5, y: ..5) then "matched"
  else "unmatched"
end

# "matched"

case Point[1, 2]
  in x: ..5, y: ..5 then "matched"
  else "unmatched"
end

# "matched"

case {x: 1, y: 2}
  in Point(x: ..5, y: ..5) then "matched"
  else "unmatched"
end

# "unmatched"

As you can see, the results are the same using array and hash syntax. This is because structs implement both the #deconstruct and #deconstruct_keys methods.
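
You can message these methods directly to see what the pattern match receives. A minimal sketch using the same Point struct:

```ruby
Point = Struct.new :x, :y
point = Point[1, 2]

# Used by array patterns.
point.deconstruct             # [1, 2]

# Used by hash patterns. The argument is the list of requested keys (or nil for all keys).
point.deconstruct_keys(nil)   # {x: 1, y: 2}
point.deconstruct_keys([:x])  # {x: 1}
```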

Finally, and this is subtle, when matching with array syntax I used brackets (i.e. []) but when matching with hash syntax I used parentheses (i.e. ()). Both syntaxes are interchangeable but part of the reason for using brackets is that they read nicely with arrays. It’s worth noting that the Ruby Core Team doesn’t specify which syntax you should use when type checking structs and classes, in general. I tend to use parentheses except when pattern matching with arrays. Whatever your choice, make sure you are consistent.

💡 For a deeper dive on Structs, you might enjoy this related article.

Classes

For situations where primitives such as arrays, hashes, and structs are not enough for your needs, you can teach non-primitive objects to be pattern matchable by implementing the #deconstruct method for arrays and/or the #deconstruct_keys method for hashes. Example:

class Point
  attr_reader :x, :y

  def initialize x, y
    @x = x
    @y = y
  end

  def deconstruct = [x, y]

  def deconstruct_keys(keys) = {x: x, y: y}
end

The above implementation allows a Point instance to be matched as an array or a hash. Here’s how we can use pattern matching:

case Point.new 1, -2
  in x, Integer then "matched: #{x}"
  else "unmatched"
end

# "matched: 1"

case Point.new 1, -2
  in x: 0.. => x then "matched: #{x}"
  else "unmatched"
end

# "matched: 1"

In the first example, we get a match because we are matching by position and type within an array. The second example is also a match but this time we are only looking for the x key which has a value of zero to infinity (i.e. endless range).
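
The keys argument passed to #deconstruct_keys is worth a closer look: it is an array of the keys the pattern asks for, or nil when the pattern uses a named double splat. Here is a minimal sketch which honors it and also type checks against the class. The Vector class and describe method are hypothetical names chosen so they don’t clash with the Point examples above.

```ruby
# Hypothetical class which only answers the keys a pattern asks for.
class Vector
  attr_reader :x, :y

  def initialize x, y
    @x = x
    @y = y
  end

  # Keys are nil when the pattern wants everything (i.e. **remainder).
  def deconstruct_keys keys
    attributes = {x: x, y: y}
    keys ? attributes.slice(*keys) : attributes
  end
end

# Hypothetical method which matches by type and keys.
def describe vector
  case vector
    in Vector(x: 0, y: 0) then "origin"
    in Vector(x:, y: 0) then "on the x axis at #{x}"
    in Vector(x:, y:) then "at #{x}, #{y}"
    else "not a vector"
  end
end

describe Vector.new(0, 0)  # "origin"
describe Vector.new(3, 0)  # "on the x axis at 3"
describe Vector.new(1, 2)  # "at 1, 2"
describe [1, 2]            # "not a vector"
```

Slicing by the requested keys matters most when attributes are expensive to compute. For a small object like this one, answering the full hash would work just as well.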

Examples

If you’d like more examples of pattern matching, several of this site’s projects use pattern matching, especially those with Command Line Interfaces (CLIs). For example, here’s a snippet from Rubysmith where pattern matching is used to process command line input:

module Rubysmith
  module CLI
    # The main Command Line Interface (CLI) object.
    class Shell
      # Truncated.

      def call arguments = []
        case parse arguments
          in action_config: Symbol => action then config.call action
          in action_build: true then build.call
          in action_version: true then logger.info { configuration.version }
          else logger.any { parser.to_s }
        end
      end
    end
  end
end

Another example is the Janus system which uses pattern matching to process Slack commands:

module Janus
  module Actions
    class Processor
      # Truncated.
      def call request
        # Truncated.
        text = request.text

        case text.split
          in ["help"] then help
          in "status", String => name then status.call name
          in "version", String => name, String => number then version.call name, number
          in "deploy", String => name, String => version then deploy.call name, version
          else unknown.call text
        end
      end
    end
  end
end

Finally, you can always use Gemsmith to build your own gem with CLI support which automatically generates code that uses pattern matching. To experiment, run the following locally:

gem install gemsmith
gemsmith --build demo --cli

Now you can edit your demo gem in your favorite editor and look at the Shell class and corresponding spec.

Guidelines

Pattern matching, while powerful, can get unwieldy when used without constraint. Here are a few guidelines for consideration to keep your code clean:

  • Use sorted arrays.

  • Use limited finds since they can be expensive (i.e. [*, <pattern>, *]).

  • Avoid unnecessary deconstruction in your classes by only adding pattern matching support when you need it.

  • Avoid shadow variables when variable binding since the duplicate names can be hard to debug (unless you are using variable pinning).

  • Avoid complicated nested patterns by breaking them down into smaller patterns which can be composed together or by using a simplified data structure instead.
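
As one possible way to apply the last guideline, smaller patterns can be stored as constants since any object which responds to #=== (ranges, regular expressions, classes, and so forth) can be used within a pattern. This is only a sketch: the Patterns module, the constant names, and the contactable_adult? method are all hypothetical, and the email expression is deliberately simplistic.

```ruby
# Hypothetical module of small, reusable patterns.
module Patterns
  ADULT = (18..)
  EMAIL = /\A[^@\s]+@[^@\s]+\z/
end

# Hypothetical method which composes the smaller patterns into a larger one.
def contactable_adult? person
  person in {age: Patterns::ADULT, email: Patterns::EMAIL}
end

contactable_adult?({age: 30, email: "mal@serenity.test"})  # true
contactable_adult?({age: 12, email: "mal@serenity.test"})  # false
contactable_adult?({age: 30, email: "unknown"})            # false
```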

Resources

The original documentation and source for this article can be found via the Ruby Pattern Matching documentation. I’ve also spoken on this specific topic in the past so here are the slides.

Conclusion

Congratulations and thanks for sticking with me through to the end. By now, you can see the power pattern matching brings to Ruby and hopefully this’ll help you write better and more robust code as well. Enjoy and may this be a future reference to you.