
It's a little strange to me how few regexp tutorials start with the basics and move on from there. Maybe I'm too close to the trees, or maybe I'm too advanced to see what a beginner would need, but it strikes me that omitting the basics is a bad start.

There are five fundamental building blocks of a regular expression. They are "characters", "concatenation", "alternation", "grouping", and "Kleene closure".

Characters are literal characters that must be matched. A character is matched by finding the leftmost occurring equivalent in the input string.
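As a minimal sketch of the "leftmost" rule (using Python's re module purely as a stand-in; the pattern itself would be written the same way in Perl, and the variable names are just for illustration):

```python
import re

# A literal character matches itself. When searching, the engine
# reports the leftmost position at which the pattern can match.
text = "banana"
match = re.search(r"a", text)

# Offset of the leftmost "a" in the input string.
first_a_position = match.start()
```

Here the match is reported at offset 1, the first "a", even though later ones exist.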

Concatenation is the principle that two characters are concatenated together when not separated by an operator. Concatenation is implied in a pattern: there is no special operator for it. It has the lowest precedence of all operators except for alternation.

Alternation is the way to say "match this subpattern or that subpattern". It is denoted by putting a | symbol in between the two subpatterns. Alternation has the lowest precedence of all the operators.
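The precedence of concatenation over alternation can be checked with a small sketch (Python's re module is assumed here only for illustration; the pattern reads the same in Perl):

```python
import re

# Concatenation binds tighter than alternation, so "ab|cd" means
# ("ab") or ("cd"), not "a", then "b or c", then "d".
pattern = re.compile(r"ab|cd")

matches_ab = pattern.fullmatch("ab") is not None
matches_cd = pattern.fullmatch("cd") is not None
# These would only match if "|" bound tighter than concatenation.
matches_abd = pattern.fullmatch("abd") is not None
matches_acd = pattern.fullmatch("acd") is not None
```

The first two tests succeed and the last two fail, which is what you would expect if alternation really is the loosest operator.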

Grouping is a way to combine multiple components into a self-contained subpattern. Alternation is often placed inside a grouping construct. In Perl, grouping is denoted by putting the subpattern in parentheses.
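A rough illustration of grouping limiting the reach of an alternation (again using Python's re module as a stand-in; the parenthesized pattern is the same in Perl):

```python
import re

# The parentheses confine the alternation to "b or c", so the
# pattern as a whole means: "a", then "b" or "c", then "d".
grouped = re.compile(r"a(b|c)d")

grouped_abd = grouped.fullmatch("abd") is not None
grouped_acd = grouped.fullmatch("acd") is not None
# Without the trailing "d" the grouped pattern cannot match.
grouped_ab = grouped.fullmatch("ab") is not None
```

Compare this with the ungrouped `ab|cd`, which accepts "ab" but rejects "abd".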

Kleene closure is a special construct that matches zero or more consecutive occurrences of a subpattern in a string. This is denoted by a postfix * operator, or in less technical terms by placing a * after the subpattern.
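A small sketch of the closure applied to a single character versus a group (Python's re module is assumed for illustration only; the patterns are identical in Perl):

```python
import re

# "*" applies to the item immediately before it. In "ab*" that is
# just "b"; in "(ab)*" it is the whole group.
single = re.compile(r"ab*")
group = re.compile(r"(ab)*")

single_a = single.fullmatch("a") is not None        # zero "b"s
single_abbb = single.fullmatch("abbb") is not None  # three "b"s
single_abab = single.fullmatch("abab") is not None  # not "a" plus "b"s

group_empty = group.fullmatch("") is not None       # zero repetitions
group_abab = group.fullmatch("abab") is not None    # two repetitions
```

Note that the closure happily matches the empty string, since zero repetitions count.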

It turns out that many of the common tasks one would wish to perform with a regex are quite clumsy when restricted to such a sparse language. Therefore various extensions have been made which allow common constructs to be written more elegantly.

It's common to want to match one or more occurrences of a subpattern. While this can be expressed using Kleene closure alone, it can be clumsy, so the postfix plus operator is provided. P+ is defined to match the same thing as PP*.
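The equivalence can be spot-checked on a handful of inputs. This is only a sketch over a few hand-picked sample strings, not a proof, and it uses Python's re module as a stand-in for Perl:

```python
import re

# Compare "a+" against its expansion "aa*" on a few sample strings.
samples = ["", "a", "aa", "aaa", "b", "ab"]

plus_results = [re.fullmatch(r"a+", s) is not None for s in samples]
expanded_results = [re.fullmatch(r"aa*", s) is not None for s in samples]

# True when both patterns accept and reject exactly the same samples.
plus_agrees = plus_results == expanded_results
```

Both patterns reject the empty string and accept any non-empty run of "a".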

It's common to want to match any one of several characters at a given point in a string. Therefore the "character class" bracket construct is provided. [ABC] matches the same text that (A|B|C) matches. Note that this is restricted to single characters and not longer subpatterns.
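A quick comparison of the two spellings on a few sample strings (a sketch using Python's re module; the samples are arbitrary and the same patterns work in Perl):

```python
import re

# A character class and the equivalent single-character alternation.
samples = ["A", "B", "C", "D", "AB", ""]

class_results = [re.fullmatch(r"[ABC]", s) is not None for s in samples]
alt_results = [re.fullmatch(r"(A|B|C)", s) is not None for s in samples]

# True when both forms accept and reject exactly the same samples.
class_agrees = class_results == alt_results
```

Both forms accept exactly one character from the set, so "AB" and the empty string are rejected by each.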

The ability to optionally match something is a common requirement. Therefore the ? postfix operator is provided. P? matches the same thing as (P|) matches, that is, P or nothing.
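The same kind of spot check works for the optional operator. The "colour" example below is my own illustration, and Python's re module stands in for Perl:

```python
import re

# "u?" versus the spelled-out alternation "(u|)", i.e. "u" or nothing.
samples = ["color", "colour", "colouur", "colr"]

optional_results = [re.fullmatch(r"colou?r", s) is not None for s in samples]
spelled_out_results = [re.fullmatch(r"colo(u|)r", s) is not None for s in samples]

# True when both forms accept and reject exactly the same samples.
optional_agrees = optional_results == spelled_out_results
```

Both patterns accept "color" and "colour" and reject the other two samples.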

Anyway, just some thoughts for you. Obviously it all could use more polishing, but it's basic material that I think makes it easier to understand regexes.


In reply to Re: RFC - Regular Expressions Tutorial, the Basics (for BEGINNERS) by demerphq
in thread Regular Expressions Tutorial, the Basics (for BEGINNERS) by brusimm
