PerlMonks  

Teaching Regular Expression Pattern Matching

by Jim (Curate)
on Aug 21, 2011 at 02:34 UTC ( [id://921464]=perlmeditation )

I'm contemplating teaching a four- or eight-hour class on regular expression pattern matching to my colleagues at a large consulting firm. The intended audience would be mostly computer forensics professionals and "accidental" programmers who use various tools and languages that support regular expressions, but who don't often use regular expressions because they're either unfamiliar with them or intimidated by them. One objective would be to convince those colleagues who tend to do a lot of elaborate string manipulation using only built-in string functions—a common anti-pattern I've observed—to use regular expression pattern matching instead.

How would you teach a class on regular expressions? What would your approach be? Are there any particularly good resources or teaching aids you would use?

Jim


Replies are listed 'Best First'.
Re: Teaching Regular Expression Pattern Matching
by eyepopslikeamosquito (Archbishop) on Aug 21, 2011 at 04:26 UTC

    ...colleagues who tend to do a lot of elaborate string manipulation using only built-in string functions—a common anti-pattern I've observed
    Curiously, I more frequently see the converse anti-pattern, namely over-using regexes. For example, I've often seen Perl beginners essaying:
    if ($nodename =~ /$mynode/)
    when they should have been using:
    if ($nodename eq $mynode)
    Apart from the obvious problem of embedded names (e.g. "freddy" vs. "fred", where the unanchored pattern matches both), I've lost count of the number of times I've asked a Perl rookie to consider what happens if $mynode contains regex metacharacters. Update: So I suggest you mention \Q and \E and quotemeta in your course.
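
    A minimal sketch of the difference, assuming made-up node names ("fred", "freddy", "web.1") and hypothetical helper subs that are not from any real code base:

```perl
use strict;
use warnings;

# Unanchored pattern match: succeeds whenever $node occurs anywhere in
# $nodename, and treats any metacharacters in $node as regex syntax.
sub loose_match {
    my ($nodename, $node) = @_;
    return $nodename =~ /$node/ ? 1 : 0;
}

# Plain string equality: usually what was intended.
sub exact_match {
    my ($nodename, $node) = @_;
    return $nodename eq $node ? 1 : 0;
}

# If a pattern really is needed, \Q...\E escapes metacharacters in the
# interpolated variable, and \A / \z anchor the match to the whole string.
sub quoted_match {
    my ($nodename, $node) = @_;
    return $nodename =~ /\A\Q$node\E\z/ ? 1 : 0;
}

# quotemeta does the same escaping as \Q...\E, as a function.
my $escaped = quotemeta('web.1');
```

    Here loose_match('freddy', 'fred') succeeds while exact_match('freddy', 'fred') does not. Likewise loose_match('webX1', 'web.1') succeeds because the unescaped "." matches any character, whereas quoted_match('webX1', 'web.1') fails.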

    I faced a similar Perl training problem a few years back and sent out an email with seven specific word puzzle problems against the Unix "words" file (e.g. /usr/dict/words). I offered some sort of prize for the winner IIRC. All could be solved as one-liners using: perl -ne 'your-program-here' words or via a longer program, if you prefer. For example: find all palindromes; find the longest word in the dictionary; find all words that contain a particular letter four or more times; find all words that start with "e", have "n" as their second-to-last letter, and are more than seven characters long; find all words that are of even length and contain an even number of each and every distinct vowel in the word. There is an endless supply of interesting word puzzles. You can further ask them to produce both regex and non-regex solutions, to compare and contrast which approach is more appropriate for each problem. It is not hard to invent problems where the regex solution is vastly superior to the non-regex one, which may help convince them of the power of regexes.
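
    As an illustration of the compare-and-contrast exercise, here is a rough sketch of two of the puzzles above. The sub names are invented for this example, and each sub assumes a single word with its trailing newline already removed:

```perl
use strict;
use warnings;

# Puzzle: words that start with "e", have "n" as their second-to-last
# letter, and are longer than seven characters.

# Regex version: "e", at least five more characters, then "n" and one
# final character, which forces a length of eight or more.
sub e_n_regex {
    my ($word) = @_;
    return $word =~ /\Ae.{5,}n.\z/ ? 1 : 0;
}

# String-function version of the same test.
sub e_n_substr {
    my ($word) = @_;
    return (length($word) > 7
        && substr($word, 0, 1) eq 'e'
        && substr($word, -2, 1) eq 'n') ? 1 : 0;
}

# Puzzle: palindromes. Here the non-regex version is arguably the
# clearer one: compare the word with its own reversal.
sub is_palindrome {
    my ($word) = @_;
    my $lower = lc $word;
    return $lower eq scalar reverse($lower) ? 1 : 0;
}
```

    As a one-liner against the words file, the first puzzle could be written as: perl -ne 'print if /^e.{5,}n.$/' words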

Re: Teaching Regular Expression Pattern Matching
by moritz (Cardinal) on Aug 21, 2011 at 06:04 UTC

    Since "Mastering Regular Expressions" by Jeffrey Friedl is widely recognized as a very good resource on regular expressions (and I found it a very good read too), I'd closely follow the example-driven approach used in that book, and use some examples that are relevant for your colleagues.

Re: Teaching Regular Expression Pattern Matching
by zentara (Archbishop) on Aug 21, 2011 at 12:37 UTC
Re: Teaching Regular Expression Pattern Matching
by armstd (Friar) on Aug 21, 2011 at 14:39 UTC

    Regular expressions often look more complicated than they are. It's easy to get lazy, look at a ridiculous string, and just not start. I would spend time emphasizing the need to break them down into their components in order to understand them. Yeah, it's work, but that's the job. Sometimes a long regexp is the perfect tool for the job.
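
    One way to show that breakdown in class is Perl's /x modifier, which lets a pattern be spread over several lines with comments. The sketch below assumes a hypothetical "YYYY-MM-DD hh:mm:ss" timestamp format and made-up sub names. It only checks the shape of the string, not whether the date is valid:

```perl
use strict;
use warnings;

# The pattern written densely on one line.
sub parse_dense {
    my ($text) = @_;
    return $text =~ /\A(\d{4})-(\d{2})-(\d{2})[ T](\d{2}):(\d{2}):(\d{2})\z/;
}

# The same pattern broken into its components with /x.
# Whitespace and comments in the pattern are ignored, except inside
# the bracketed character class, where the space stays literal.
sub parse_commented {
    my ($text) = @_;
    return $text =~ /
        \A
        (\d{4}) - (\d{2}) - (\d{2})   # date: year, month, day
        [ T]                          # separator: a space or a "T"
        (\d{2}) : (\d{2}) : (\d{2})   # time: hour, minute, second
        \z
    /x;
}
```

    In list context both subs return the six captured fields on a match and an empty list otherwise.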

    I would emphasize simplicity. Find real-world examples in the code you maintain that might have been more easily accomplished with regular expressions. Be careful not to pick on the authors while doing so. It's a tough balance, but these examples will speak directly to practicality and value.

    Another tactic I've used in refactoring work is clearly describing how complex the thing they've created truly is. Any code replacing a complicated regular expression is bound to be complex itself. I get lots of emotional feedback in my refactoring work about "overly complicated code" (read: objects). What they're missing is the complexity in what it's replacing. It's just a different complexity, the one they understand (and aren't maintaining so well). Get to the truth of the matter, break down the emotional knee-jerk responses.
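
    To make that comparison concrete, here is a rough sketch that parses a hypothetical "key = value" line two ways. Both subs and the input format are invented for illustration. They are meant to behave roughly the same on single-line input, not to be proven equivalent:

```perl
use strict;
use warnings;

# String functions only: split on the first "=", then trim spaces
# from both halves one character at a time.
sub parse_pair_manual {
    my ($line) = @_;
    my $pos = index($line, '=');
    return if $pos < 0;                       # no separator
    my $key   = substr($line, 0, $pos);
    my $value = substr($line, $pos + 1);
    for ($key, $value) {
        $_ = substr($_, 1)     while length($_) && substr($_, 0, 1) eq ' ';
        $_ = substr($_, 0, -1) while length($_) && substr($_, -1)   eq ' ';
    }
    return if $key eq '';                     # nothing before the "="
    return ($key, $value);
}

# Regex version: the same rules stated as one pattern. The key starts
# with a character that is neither "=" nor a space and stops at the
# first "="; surrounding spaces are dropped from key and value.
sub parse_pair_regex {
    my ($line) = @_;
    return $line =~ /\A *([^= ][^=]*?) *= *(.*?) *\z/;
}
```

    Neither version is free of complexity. The manual one spreads it over loops and index arithmetic, and the regex one concentrates it in a single line that has to be read piece by piece.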

    --Dave

Re: Teaching Regular Expression Pattern Matching
by JavaFan (Canon) on Aug 21, 2011 at 13:10 UTC
    One objective would be to convince those colleagues who tend to do a lot of elaborate string manipulation using only built-in string functions—a common anti-pattern I've observed—to use regular expression pattern matching instead.
    This makes me wonder, is your intention to really teach about regular expressions, or are you pushing a political agenda?
    How would you teach a class on regular expressions? What would your approach be?
    That depends. 4 to 8 hours isn't a lot. I've done regexes as part of a Perl training course, and although I typically spend 4-6 hours of a 4- to 5-day training on regular expressions, I only barely scratch the surface.

    How much do your co-workers know? Do they know the concept? Are they familiar with at least v8 style regexes (as used by grep, sed, vi, awk, etc.)? Then you can skip the basics and focus on more advanced things. Otherwise, 4-8 hours is just enough to do the basics.

Re: Teaching Regular Expression Pattern Matching
by kejohm (Hermit) on Aug 22, 2011 at 02:34 UTC
Re: Teaching Regular Expression Pattern Matching
by sundialsvc4 (Abbot) on Aug 21, 2011 at 15:38 UTC

    Consider also including tools such as Parse::RecDescent.

    Many tasks involve regular expressions (as this module of course does), but are best described using a more complex logical structure of which “regular expressions” are simply a primitive part. Sometimes the task involves choosing among multiple alternatives, such that you need to “back up and try another alternative.” That is what parsers do, and this one happens to be a very good one.
