Many years ago, when I considered myself a C++ expert (which was presumptuous yet excusable for a 19-year-old), I liked to think something like “My language is the most powerful and expressive in the entire world! I can implement literally any other language’s feature with my templates and operator redefinitions and …”
Now, being a Ruby developer in my mid-30s, I have become much more pragmatic, but I am still curious (and sometimes envious) about other languages, with a passion for experiments like “what can we do to have this Erlang/Scala/Rust/you-name-it cool thing in our everyday code?”
So, here is one of those exercises: taking the whole idea of “pattern matching” from functional languages and seeing how we can make use of it.
What is Pattern Matching?
In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. (Wikipedia)
Doesn’t sound really cool?
In the practice of languages like Haskell, it is a very powerful tool, allowing, in particular:
- checking values against complex patterns (like “list consisting of element, then several, then exactly the same element”);
- decomposition of values (after checking a list against the (x:xs) pattern, you can immediately work with the “x” variable, which holds the matched element);
- code execution dispatching by patterns (you can have several definitions of a function for different “argument patterns”);
- guards, which not only check the data structure but can also perform any checks on the value (x < 15).
Read about Haskell pattern matching. It’s seriously mind-blowing.
Ruby Has (Sort Of) Pattern Matching
Three Equality Signs
Not everyone is aware of the power of the #=== operator. In the official documentation it is named the “case equality operator” (suggesting its typical, but not only, use) and you implicitly use it in your everyday code:
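A minimal illustration of that implicit use (the method name kind_of_value is made up for this sketch): every when clause below calls pattern === value behind the scenes.

```ruby
# Each `when` clause is evaluated as `pattern === value`.
def kind_of_value(value)
  case value
  when Integer   then 'integer'         # Module#=== checks the class
  when 1.0..2.0  then 'float in range'  # Range#=== checks inclusion
  when /\A\d+\z/ then 'numeric string'  # Regexp#=== checks for a match
  when nil       then 'nothing'         # plain equality for ordinary objects
  else 'unknown'
  end
end

int_kind    = kind_of_value(7)
float_kind  = kind_of_value(1.5)
string_kind = kind_of_value('42')
nil_kind    = kind_of_value(nil)
```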
You can also define #=== for your class and have very complicated checks:
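As a rough sketch of a user-defined #=== (the DivisibleBy class and the fizzbuzz method are invented for this illustration, they are not from any library):

```ruby
# Hypothetical pattern class: matches any Integer divisible by a given number.
class DivisibleBy
  def initialize(divisor)
    @divisor = divisor
  end

  # Called implicitly by `when DivisibleBy.new(...)`.
  def ===(value)
    value.is_a?(Integer) && (value % @divisor).zero?
  end
end

def fizzbuzz(number)
  case number
  when DivisibleBy.new(15) then 'FizzBuzz'
  when DivisibleBy.new(3)  then 'Fizz'
  when DivisibleBy.new(5)  then 'Buzz'
  else number.to_s
  end
end
```

Because the check lives in the pattern object, the case statement itself stays flat and readable.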
This is close as hell to pattern matching. I constantly find myself using this feature exactly for dispatching. It is not as cool as method overloading, yet concise enough:
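Dispatching on the argument’s type could look roughly like this (the area method and its conventions for shapes are assumptions made up for the example):

```ruby
# One method, several "overloads" selected by the class of the argument.
def area(shape)
  case shape
  when Numeric then shape * shape                       # square, given its side
  when Array   then shape.reduce(1, :*)                 # rectangle as [width, height]
  when Hash    then Math::PI * shape.fetch(:radius)**2  # circle as { radius: r }
  else raise ArgumentError, "don't know how to handle #{shape.inspect}"
  end
end

square = area(3)
rectangle = area([2, 5])
circle = area(radius: 1)
```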
It is less known that the #=== method is also used outside of case, for example by Enumerable#grep:
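One place where this is easy to observe is Enumerable#grep, which selects the elements for which pattern === element is true. A small sketch with made-up data:

```ruby
values = [1, 'two', 3.0, :four, 15, 'six', nil]

integers   = values.grep(Integer)  # class as a pattern
strings    = values.grep(String)
in_range   = values.grep(10..20)   # range as a pattern
starts_s   = values.grep(/\As/)    # regexp as a pattern
```

Here integers is [1, 15], strings is ["two", "six"], in_range is [15] and starts_s is ["six"].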
And in fact, for many tasks like parsing and evaluating complex contexts, you can store your own array of heterogeneous patterns and check some values against them.
But it is only a part of the “pattern matching” puzzle.
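A sketch of such a pattern table (RULES and classify_line are hypothetical names invented here): regexps, a lambda and a class sit in one array, and the first entry whose pattern === line wins.

```ruby
# Each rule pairs a pattern with a label; patterns are of different kinds
# but all of them respond to #===.
RULES = [
  [/\A\s*\z/,                   :blank],
  [/\A#/,                       :comment],
  [->(line) { line.size > 80 }, :too_long], # Proc#=== calls the lambda
  [String,                      :code]
].freeze

def classify_line(line)
  _pattern, label = RULES.find { |pattern, _label| pattern === line }
  label
end

labels = ['', '# note', 'x' * 100, 'puts 1'].map { |line| classify_line(line) }
```

The order of the rules matters: the most specific patterns go first and the catch-all String goes last.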
Argument Decomposition in Expressions
Ruby definitely has some decomposing-values-by-pattern features:
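For instance, multiple assignment can take nested arrays apart (the values are arbitrary):

```ruby
# Parentheses on the left side mirror nesting on the right side;
# a splat collects "everything else".
first, (second, third), *rest = [1, [2, 3], 4, 5]

head, *tail = %w[a b c]
*init, last = %w[a b c]
```

After this, first is 1, second is 2, third is 3 and rest is [4, 5]; head is "a" with tail ["b", "c"], and last is "c" with init ["a", "b"].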
It is extremely useful for method/block arguments. When we are talking about Ruby 2.2, we also have **args for arguments, catching hashes, and keyword arguments, deconstructing them. Unfortunately, in other places you can’t have it:
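To make the working half concrete, here is a sketch of decomposition in block and method parameters (the data and the describe_user method are made up for the example). In Ruby 2.2 there is no comparable syntax for taking a hash apart in a plain assignment.

```ruby
pairs = { apple: [3, 0.5], pear: [2, 0.75] }

# Block parameters can mirror the nested structure of each yielded element.
totals = pairs.map { |name, (count, price)| [name, count * price] }

# Keyword arguments take a hash apart by key; **rest catches the remaining keys.
def describe_user(name:, age: nil, **rest)
  "#{name} (#{age || '?'}), extra keys: #{rest.keys.inspect}"
end

user = { name: 'Ann', age: 30, city: 'Kyiv' }
description = describe_user(**user)
```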
Even more unfortunately, decomposition in expressions has nothing in common with #===, so you can’t do anything like inline decomposition inside a case branch.
Regexps are different. They are patterns. They can do #===, as we’ve seen before. They can also decompose:
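Two ways regexps decompose a string while matching it, as a sketch (the date string and the parse_token method are invented for illustration):

```ruby
date = '2015-03-14'

# A regexp literal on the left side of =~ assigns its named groups
# to local variables of the same names.
if /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/ =~ date
  parts = [year, month, day]
end

# Inside case/when, Regexp#=== sets $~, so $1, $2, ... hold the captures.
def parse_token(token)
  case token
  when /\A(\d+)\z/       then [:number, $1.to_i]
  when /\A(\w+)=(\w+)\z/ then [:pair, $1, $2]
  else [:unknown, token]
  end
end
```

Here parts ends up as ["2015", "03", "14"], and parse_token('a=b') returns [:pair, "a", "b"].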
You can think of them as a separate sub-language inside Ruby, which is of no use outside the domain of string checking. Yet regexps can teach us a lesson about how matching + decomposition could look in Ruby.
RSpec matchers are used, well, for specs and inside the framework. But, in fact, many RSpec matchers are complicated patterns for compound types. Like this:
That’s cool. Really, seriously cool and very Ruby-way. In fact, you can try it right now:
And it just works! I am, in fact, puzzled that we are not seeing more and more of this in real-life code (or rather, I am not; skip to the “Being Pragmatic” section if you want to know why).
Yet those cool matchers still lack “decomposition” of values into variables. (To be fair, it’s not needed in the context they are used in today.)
So, we need to go deeper.
Implementing Missing Features
For me, all of them (looking pretty similar) are missing the point. They are trying to invent some DSL, mimicking existing pattern-matching languages. Yet, IMO, pattern matching is not some “domain”; it’s rather a basic language feature. So, the “domain-specific language” approach to it is doomed by design.
Let’s try to invent a solution defined by the following requirements:
- it should play well with existing Ruby quasi-pattern-matching features;
- it should not pollute the global namespace (and monkey-patch core classes) with a lot of “cool inventions”;
- it should be an “opt-in” solution, which you can use in 2-3 places in your code (not changing anything else), and everything should just work and just stay readable, not implying some “new paradigm”;
- it should provide all the cool features of deep matching, value decomposition, value guards and more.
So, meet matchish. For now, it is NOT a gem, but rather an attempt to show the way. If you are wondering why I’m not over-enthusiastic about matchish’s adoption, please read the “Being Pragmatic” section.
Types Being Algebraic
Task: treat any Ruby object as an “algebraic” data type, which allows checking its deep internal structure.
Solution: At first, I assumed something like RSpec’s instance_of(...) and so on. But for a “core language feature” it seemed to be too verbose and not very “guessable”. So, the current solution looks like this:
The solution seems pretty straightforward to implement; you can look at it in matchish/matchers.
As it is a proof of concept rather than a library, not all of the useful matchers are implemented.
- one (and only one) method, m, is monkey-patched into each object;
- (and it can even be a refinement, not a patch, in modern Ruby).
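To give a feeling for how little machinery deep matching needs, here is a minimal sketch. It is NOT matchish’s real code or API: the DeepMatcher class, its behaviour for arrays and hashes, and the deep_kind method are all assumptions of this example; only the idea of a single monkey-patched method m comes from the text above.

```ruby
# Hypothetical wrapper: its #=== walks the pattern and the value in parallel
# and delegates leaf checks to the #=== of each leaf pattern.
class DeepMatcher
  def initialize(pattern)
    @pattern = pattern
  end

  def ===(value)
    match?(@pattern, value)
  end

  private

  def match?(pattern, value)
    case pattern
    when Array
      value.is_a?(Array) && value.size == pattern.size &&
        pattern.zip(value).all? { |p, v| match?(p, v) }
    when Hash
      value.is_a?(Hash) &&
        pattern.all? { |key, p| value.key?(key) && match?(p, value[key]) }
    else
      pattern === value
    end
  end
end

# The single monkey-patched method: turns any object into a deep pattern.
class Object
  def m
    DeepMatcher.new(self)
  end
end

def deep_kind(value)
  case value
  when [Integer, [String, 1..10]].m    then 'number with labelled score'
  when { name: String, tags: Array }.m then 'tagged record'
  else 'something else'
  end
end
```

With this sketch, deep_kind([5, ['x', 3]]) picks the first branch, while deep_kind([5, ['x', 30]]) falls through to the else branch because 30 is outside 1..10.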
Values Being Decomposed
Task: alongside matching a pattern, store the matched values in variables.
Solution 1: strangely beautiful
The .value part is a bit ugly, but the overall solution looks cute.
How is it done?
The x = Object.m part is a simple variable assignment. x holds a Matchish::Matcher. After the very first comparison with a real value, the pattern is “bound” to this value, and all subsequent comparisons only return true if the value is exactly the same.
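The binding behaviour described here can be sketched in a few lines. This is an illustration of the idea, not the actual Matchish::Matcher: the BindingMatcher class and the same_ends? method are hypothetical names.

```ruby
# Hypothetical matcher that remembers the first value it matched.
class BindingMatcher
  attr_reader :value

  def initialize(pattern)
    @pattern = pattern
    @bound = false
  end

  def ===(other)
    # Once bound, only the very same value matches again.
    return @value == other if @bound
    return false unless @pattern === other

    @value = other
    @bound = true
    true
  end
end

x = BindingMatcher.new(Integer)
results = [x === 'text', x === 5, x === 7, x === 5]
bound_value = x.value

# "An element, then anything, then exactly the same element."
def same_ends?(list)
  x = BindingMatcher.new(Object)
  (x === list.first) && (x === list.last)
end
```

Here results is [false, true, false, true] and bound_value is 5: the failed comparison with 'text' binds nothing, and after 5 is bound, 7 no longer matches.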
NB: at the moment, I haven’t invented an elegant solution to name the splatted part of a match (like y = *any).
The downside of this approach is that you can’t save this kind of pattern in a constant (as it will not know the context of “x”, and it is usable only once, after which the pattern is “bound”).
Looks like a “more complex DSL”, yet, at the same time, less magic. (By “less magic”, BTW, I mean that there is no need for a “how is it done?” explanation, as any experienced Ruby programmer reads this as pretty obvious code.)
Compromise: It seems both solutions are mere compromises, each having its pros and cons. I honestly don’t know which is “more Ruby-way”. The first one seems “smarter”, while the second definitely involves less magic and is therefore more usable.
Imperative Inside Functional: Guards
Task: Last but not least. Sometimes you find yourself writing code like…
…which is not too bad, yet it is branches inside branches inside branches… The whole idea of pattern matching is: let’s do a single match, with no branching.
Implementation, again, is pretty straightforward.
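A guard can be sketched as a pattern combined with an arbitrary check on the value. The Guarded class and the size_label method below are hypothetical names for this illustration and are not taken from matchish:

```ruby
# Hypothetical guard: the value must fit the pattern AND pass the block.
class Guarded
  def initialize(pattern, &check)
    @pattern = pattern
    @check = check
  end

  def ===(value)
    @pattern === value && @check.call(value)
  end
end

# One flat case instead of `when Integer` with an `if` nested inside.
def size_label(value)
  case value
  when Guarded.new(Integer) { |x| x < 15 } then 'small number'
  when Integer                             then 'big number'
  when Guarded.new(String, &:empty?)       then 'empty string'
  else 'other'
  end
end
```

For the simplest cases plain Ruby is already enough, since Proc#=== calls the proc: a lambda such as ->(x) { x < 15 } can be used directly in a when clause, as long as it is safe to call on every value that may reach it.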
Being Pragmatic
All the examples above are implemented in my matchish repository and are working. But currently I’m not releasing “matchish” as a gem (and therefore have not tried to make it feature-complete: for example, of all possible RSpec-like matchers, only the several necessary for the showcase were implemented).
Look, I’ve invented matchish and tried to use it when working on a parser project (unreleased yet) – that’s the domain where pattern matching should definitely be in use. And there was a hard lesson learned:
All in all, powerful pattern matching needs to be a core language feature.
That’s because of:
- natural look: the more I used the “cool” matchish features, the more I feared other developers would be puzzled or disgusted by the code… it became a bit “extraterrestrial”;
- more important: speed, optimization, overhead; it became really easy to write a “clever” algorithm implementation in some small method, then call it 1000 times, and then the profiler shows you tons of Matchish objects created, dispatching checks furiously while binding a complex “matching context” and feasting on your memory and CPU time;
- even more important: after all the considerations, it now seems to me that pattern matching IS a paradigm; either you do all the execution dispatching through pattern-matched function alternatives, or there will always be another (typically clearer) way to express your intentions, a way that your language (Ruby) supports more naturally.
Though, it still seems that some incremental enhancements of Ruby’s existing “match-y” features would be appreciated by most of us. Personally, I’d be happy to write case code with inline decomposition, as shown above:
Also, what about #=== for types like
P.S. How I Did This Article
As I wanted just to “show the point”, I didn’t want to write extensive specifications for weeks, then updating them, and only THEN writing an article.
But as I wanted to show the point, I needed some tool to check that at least all the code I’d written for the article (before writing an implementation, BTW :) works.
So, I’ve used the so-called “readme-driven development” approach (using your readme as a specification) and the dokaz tool, which helps to enforce this approach (OK, it’s shameless self-promotion, as I’m the author of the tool).
So, at the time I am writing this paragraph, I only need to run bundle exec dokaz Article.md and look at what it outputs.