http://qs321.pair.com?node_id=1089424

No I'm not kidding, please hear me out... :)

History of smartmatch in Perl

Smartmatching was invented for Perl 6, where it turned out to be a very useful and well-loved[1] feature, but the attempt to backport it to Perl 5.10 in 2007 did not turn out so great (and it was consequently downgraded to experimental status, in effect deprecated, in Perl 5.18). Among the Perl 6 community the commonly accepted explanation for that is, from what I heard:

These limitations are hard to circumvent, but I don't think that means Perl should have no smartmatching at all; it just means it should have less ambitious, more focused smartmatching.

I wasn't around at the time, but it looks to me as if the Perl 5.10+ smartmatching was designed with these goals:

  1. Support all use-cases that the Perl 6 smartmatch operator supports
  2. Use it as an opportunity to sneak in useful new comparison/searching operations into the core, without having to invent separate operator names for them, and without having to justify them individually

...and was thus doomed to failure.

Some later proposals for re-designing the smartmatch operator (like this 2011 post by brian d foy) tend to avoid mistake no. 1, but still fall into the second trap.

If there are comparison/searching operations that are deemed worthy of being added to Perl (say, "deep comparison" of two arrays, or checking whether an array contains a given scalar), then each of them should get its own operator. That's the normal Perl way: One operator per type of operation (that's why we have both == and eq for example).

How smartmatch should be designed

Smartmatching explicitly breaks with the conventional "one meaning per operator" rule by dynamically deciding what operation to perform based on its arguments. This means it should be carefully designed around use-cases where you actually need to dynamically decide what operation to perform. Operations that you would likely never want to mix and match have no business being part of the smartmatch operator, even if they would be useful to have in core by themselves.

So, what are those use-cases where you actually need dynamic smartmatching? I can think of two major ones:

  1. When you want to avoid writing out  ($_ <operator> ...)  in a given/when construct, for the purpose of brevity/elegance:

    use v6;
    given $username {
        when 'root'             { dostuff }
        when /^guest\d*$/       { die "You're not allowed to do stuff." }
        when any(<http apache>) { authenticate :web; dostuff }
        default                 { authenticate :local; dostuff }
    }

    Of course it is only elegant when the meaning is self-evident without consulting a manual, so this use-case only makes sense for commonly used & unambiguous comparison operations.

  2. When you want your code to test things against a "filter/pattern/rule" that is passed in from the outside, and you don't want to restrict it to just one way of filtering (e.g. only by string comparison, or only by regex, or only by callback etc.)

    For example, consider the Perl 6 built-in function dir, which lists the contents of a directory in the filesystem. It takes an optional 'test' argument, against which it promises to smartmatch each filename and only return the matching ones. Since smartmatch is built into the language, Perl 6 programmers need no further documentation to understand that parameter; they know they can use anything that would be valid as the right-hand-side argument of ~~ as the test, for example:

    use v6;
    dir '/some/directory', test => /\.txt$/;            # a regex
    dir '/some/directory', test => none('.', '..');     # a junction [2]
    dir '/some/directory', test => &validate_filename;  # a coderef

    The result is a very flexible but still elegant and predictable API that is easy to imitate in your own functions/modules that want to allow their users to "match" or filter stuff: Just use smartmatch as your filter implementation!
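To make the second use-case a bit more concrete for Perl 5, here is a minimal sketch of what such a filter API could look like today without any smartmatch operator. The names list_dir and matches_test are made up for illustration (they are not core functions or part of any module), and the sketch only accepts a plain string, a qr// regex, or a coderef as the test:

```perl
use strict;
use warnings;

# Hypothetical helper: decide how to test $name by looking only at the type of $test.
sub matches_test {
    my ($name, $test) = @_;
    return scalar($name =~ $test) if ref($test) eq 'Regexp';   # qr// object
    return !!$test->($name)       if ref($test) eq 'CODE';     # callback
    return $name eq $test         if !ref($test);              # plain string
    die "Unsupported test type: " . ref($test) . "\n";
}

# Hypothetical imitation of Perl 6's dir(): list entries of $path that pass the test.
# Without a 'test' argument, '.' and '..' are skipped.
sub list_dir {
    my ($path, %opt) = @_;
    my $test = exists $opt{test}
        ? $opt{test}
        : sub { $_[0] ne '.' && $_[0] ne '..' };

    opendir(my $dh, $path) or die "Cannot open $path: $!\n";
    my @names = sort grep { matches_test($_, $test) } readdir $dh;
    closedir $dh;
    return @names;
}

# Usage (paths are placeholders):
#   my @txt   = list_dir('/some/directory', test => qr/\.txt$/);
#   my @clean = list_dir('/some/directory', test => sub { $_[0] !~ /^\./ });
```

Every module that wants this flexibility currently has to write and document its own matches_test; a built-in smartmatch with predictable rules would make that helper unnecessary.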

We can make new Perl 5 smartmatching rules useful for those use-cases, while still keeping them sane and predictable, by adhering to these two principles:

  1. Decide what operation to perform, based on the type of the right-hand-side argument (and nothing else!)
    (Put another way, this means that  LHS ~~ RHS  can always be expressed in words as the question "Does LHS fit the constraint/template defined by RHS?")

  2. Blindly coerce the left-hand-side argument to the type that the chosen operation requires, just as normal Perl operators like eq also coerce their arguments.
    (So, for example, @foo ~~ /foo/ would be the same as @foo =~ /foo/, even though that may not be useful, rather than doing anything special just because it's an array!)

Sensible smartmatch rules

With that in mind, we can start to think about the kind of right-hand-side "things" that it should be possible to smartmatch against.

The following are no-brainers imo:

if RHS is an...               | (example)          | then  LHS ~~ RHS  should do...
undefined scalar              | $x ~~ undef        | !defined(LHS)
simple scalar                 | $x ~~ 'foo'        | LHS eq RHS
regex (literal or reference)  | $x ~~ /foo/        | LHS =~ RHS
code reference                | $x ~~ sub { ... }  | RHS->(LHS)
an object that overloads ~~   | $x ~~ $object      | call the overload method, with LHS as argument
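As a rough illustration, the rules in this table can be emulated with an ordinary function. This is only a sketch under a few assumptions: smart_match is a made-up name, and since a plain function cannot hook into a real ~~ overload, it looks for a MATCH method on the object instead (that method name is my assumption for the sketch). Note how the dispatch looks at the right-hand side only and never inspects the left-hand side:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical emulation of the proposed rules: smart_match($lhs, $rhs) ~ "LHS ~~ RHS".
sub smart_match {
    my ($lhs, $rhs) = @_;

    return !defined($lhs)       if !defined($rhs);           # undefined scalar
    return $lhs eq $rhs         if !ref($rhs);               # simple scalar: string comparison
    return scalar($lhs =~ $rhs) if ref($rhs) eq 'Regexp';    # regex reference (qr//)
    return !!$rhs->($lhs)       if ref($rhs) eq 'CODE';      # code reference
    return !!$rhs->MATCH($lhs)                               # object with a match hook
        if blessed($rhs) && $rhs->can('MATCH');              # (MATCH is an assumed name)

    # Everything else is a runtime error rather than a guess.
    die "smart_match: unsupported right-hand side\n";
}

# Usage:
#   smart_match($x, undef);
#   smart_match($x, 'foo');
#   smart_match($x, qr/foo/);
#   smart_match($x, sub { length($_[0]) > 3 });
```

The left-hand side is simply coerced by whichever operation was chosen (eq, =~, or the callback), just like with normal Perl operators.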

The 'simple scalar' case is not as elegant as one might wish it to be; ideally it would be able to dynamically decide between string or numeric comparison like it does in Perl 6, but I don't think that is possible to do safely in Perl (its type system being what it is), so we need to take what we can get.

The following two rules also tend to be pretty useful in Perl 6, and it might make sense to add them to our hypothetical new Perl smartmatch, but I'm unsure about them because range literals and typename barewords are not usually treated as first-class "things" in Perl, so it might feel strange:

if RHS is a...   | (example)                    | then  LHS ~~ RHS  should do...
bareword         | $node ~~ XML::LibXML::Node   | ref(LHS) eq "RHS"
range literal    | $age ~~ 0..17                | interpret LHS as a number, and check if it is within the range
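Spelled out as plain functions, these two rules would behave roughly as follows. Again this is just a sketch: match_class and match_range are made-up names, and the class name and range bounds are passed as ordinary arguments because there is no way to pass a bareword or a range literal as a first-class value:

```perl
use strict;
use warnings;

# Hypothetical stand-in for  LHS ~~ Some::Class : exact class-name comparison,
# exactly as written in the table (ref(LHS) eq "RHS").
sub match_class {
    my ($lhs, $class) = @_;
    return ref($lhs) eq $class;
}

# Hypothetical stand-in for  LHS ~~ MIN..MAX : LHS is coerced to a number
# and compared against both bounds, inclusive.
sub match_range {
    my ($lhs, $min, $max) = @_;
    return $lhs >= $min && $lhs <= $max;
}

# Usage:
#   match_class($node, 'XML::LibXML::Node');
#   match_range($age, 0, 17);
```

One thing the sketch makes visible: comparing ref(LHS) by string only matches the exact class, so an object of a subclass would not match.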

Lastly, the lack of junctions in Perl could be partially remedied by interpreting an array/list on the right-hand-side like an any() junction:

if RHS is an...  | (example)                      | then  LHS ~~ RHS  should do...
array or list    | $switch ~~ qw(yes true on 1)   | (grep { LHS ~~ $_ } RHS) >= 1
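Here is a small sketch of how that "array means any()" rule could work, written as a recursive function. The name match_any is made up, the array is passed as a reference, and only string and regex elements are handled to keep the example short:

```perl
use strict;
use warnings;

# Hypothetical emulation: an array reference on the right-hand side acts like
# an any() junction, i.e. it matches if at least one element matches.
sub match_any {
    my ($lhs, $rhs) = @_;

    if (ref($rhs) eq 'ARRAY') {
        # Each element is matched with the same rules, so elements may be regexes too.
        my $hits = grep { match_any($lhs, $_) } @$rhs;
        return $hits >= 1;
    }
    return scalar($lhs =~ $rhs) if ref($rhs) eq 'Regexp';
    return $lhs eq $rhs         if !ref($rhs);
    die "match_any: unsupported right-hand side\n";
}

# Usage:
#   match_any($switch,   [qw(yes true on 1)]);
#   match_any($username, ['root', qr/^guest\d*$/]);
```

The array still only ever appears on the right-hand side, so the "decide based on RHS only" principle stays intact.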

Of course, a better solution would be to add junctions to Perl together with re-adding smartmatch... :)
(Perl6::Junction already exists on CPAN, but it relies on at least one awful hack due to the fact that it is non-core.)

Anyway, the above rules would be more or less a subset of both Perl 6 smartmatching and the deprecated Perl 5.10+ smartmatching, but without the craziness of the latter.

And that's it; all cases not handled by these rules should generate a runtime error.
I don't think any other special cases need to be added. In particular, all the arbitrary behaviors that Perl 5.10+ smartmatching added for when one or both arguments were arrays/hashes only served to confuse people and made the operator "not safe to use" in practice. Let's not repeat that mistake.

PS: In case you want to get a "feel" for what this kind of smart-matching is like in practice, check out Toby Inkster's match::simple module which implements very similar rules to what is discussed here (but suffers from some unavoidable limitations due to the fact that it is not in core).

---

1) Among the small but passionate fan base of Perl 6 :)
2) This particular junction is in fact used as the default when the 'test' argument is omitted.