PerlMonks  


It's a little strange to me how few regexp tutorials start with the basics and move on from there. Maybe I'm too close to the trees, or maybe I'm too advanced to see what a beginner would need, but it strikes me that omitting the basics is a bad start.

There are five fundamental building blocks of a regular expression. They are "characters", "concatenation", "alternation", "grouping", and "Kleene closure".

Characters are literal characters that must be matched. A character is matched by finding the leftmost occurring equivalent in the input string.
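As a rough illustration of the "leftmost" part, here is a small sketch. The sample string is made up for the example, and `$-[0]` is Perl's built-in offset of the start of the last successful match.

```perl
use strict;
use warnings;

# The pattern /a/ is a single literal character.
my $input = "banana";
my $pos;
if ($input =~ /a/) {
    # $-[0] holds the offset where the match started.
    $pos = $-[0];
    print "matched 'a' at offset $pos\n";
}
```

Although "banana" contains three a's, the match is reported at the first one.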

Concatenation is the principle that two subpatterns written next to each other, with no operator separating them, must match one after the other. Concatenation is implied in a pattern: there is no special operator for it. It has the lowest precedence of all operators except for alternation.

Alternation is the way to say "match this subpattern or that subpattern". It is denoted by putting a | symbol in between the two subpatterns. Alternation has the lowest precedence of all the operators.
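A quick sketch of how concatenation and alternation interact, using made-up sample words. Because concatenation binds tighter than |, the pattern below should read as "c, a, t" or "d, o, g", not as "ca, then t or d, then og".

```perl
use strict;
use warnings;

# /cat|dog/ is two concatenated runs of characters joined by alternation.
my %matched;
for my $s ("cat", "dog", "cow") {
    $matched{$s} = ($s =~ /cat|dog/) ? 1 : 0;
    print "$s: ", ($matched{$s} ? "match" : "no match"), "\n";
}
```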

Grouping is a way to combine multiple components into a self-contained subpattern. Alternation is often placed inside a grouping construct. In Perl, grouping is denoted by putting the subpattern in parentheses.
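To see why alternation so often ends up inside a group, compare a pattern with and without the parentheses. This is only a sketch; the patterns and test strings are invented for the example.

```perl
use strict;
use warnings;

my %flat;     # results for /ab|cd/    ("ab" or "cd")
my %grouped;  # results for /a(b|c)d/  ("a", then "b" or "c", then "d")
for my $s ("ab", "cd", "abd", "acd") {
    $flat{$s}    = ($s =~ /ab|cd/)   ? 1 : 0;
    $grouped{$s} = ($s =~ /a(b|c)d/) ? 1 : 0;
    print "$s  flat=$flat{$s}  grouped=$grouped{$s}\n";
}
```

Without the group, the alternation splits the whole pattern in two; with it, only the middle character varies, so "ab" on its own no longer matches.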

Kleene closure is a special operator that matches zero or more consecutive repetitions of a subpattern. It is denoted by a postfix * operator, or in less technical terms, by placing a * after the subpattern.
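A small sketch of the closure applied to a grouped subpattern. The x and y on either side are arbitrary literals chosen for the example, so that "zero repetitions" is visible.

```perl
use strict;
use warnings;

# /x(ab)*y/ : an x, then zero or more copies of "ab", then a y.
my %star;
for my $s ("xy", "xaby", "xababy", "xay") {
    $star{$s} = ($s =~ /x(ab)*y/) ? 1 : 0;
    print "$s: ", ($star{$s} ? "match" : "no match"), "\n";
}
```

"xy" matches with zero repetitions, while "xay" fails because a lone "a" is not a whole copy of the subpattern.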

It turns out that many of the common tasks one would wish to perform with a regex are quite clumsy when restricted to such a sparse language. Therefore various extensions have been made which allow common constructs to be written more elegantly.

It's common to want to match one or more repetitions of a subpattern. While this can be expressed using Kleene closure alone, it can be clumsy, so the postfix plus operator is provided. P+ is defined to match the same thing as PP*.
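The equivalence can be checked informally by running both spellings over the same inputs. This is a sketch with made-up strings, taking P to be the single character a.

```perl
use strict;
use warnings;

my %plus;      # results for /xa+y/
my %expanded;  # results for /xaa*y/, the PP* spelling
for my $s ("xy", "xay", "xaaay") {
    $plus{$s}     = ($s =~ /xa+y/)  ? 1 : 0;
    $expanded{$s} = ($s =~ /xaa*y/) ? 1 : 0;
    print "$s  a+=$plus{$s}  aa*=$expanded{$s}\n";
}
```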

It's common to want to match any one of several characters at a given point in a string. Therefore the "character class" bracket construct is provided. [ABC] matches the same text that (A|B|C) matches. Note that this is restricted to single characters and not longer subpatterns.
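Here is a sketch comparing the two spellings side by side. The surrounding x and y are arbitrary literals added for the example.

```perl
use strict;
use warnings;

my %class;  # results for /x[ABC]y/
my %alt;    # results for /x(A|B|C)y/
for my $s ("xAy", "xBy", "xCy", "xDy", "xABy") {
    $class{$s} = ($s =~ /x[ABC]y/)   ? 1 : 0;
    $alt{$s}   = ($s =~ /x(A|B|C)y/) ? 1 : 0;
    print "$s  class=$class{$s}  alt=$alt{$s}\n";
}
```

"xABy" fails under both patterns, since each accepts exactly one character at that position.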

The ability to optionally match something is a common requirement. Therefore the ? postfix operator is provided. P? matches the same thing as (P|) matches, that is, P or nothing.
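A sketch of the same idea in Perl, using the familiar color/colour spelling as an illustrative input. The empty branch in (u|) is what stands for "nothing".

```perl
use strict;
use warnings;

my %optional;  # results for /colou?r/
my %either;    # results for /colo(u|)r/, the (P|) spelling
for my $s ("color", "colour", "colouur") {
    $optional{$s} = ($s =~ /colou?r/)   ? 1 : 0;
    $either{$s}   = ($s =~ /colo(u|)r/) ? 1 : 0;
    print "$s  u?=$optional{$s}  (u|)=$either{$s}\n";
}
```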

Anyway, just some thoughts for you. Obviously it all could use more polishing, but it's basic material that I think makes it easier to understand regexes.

---
$world=~s/war/peace/g


In reply to Re: RFC - Regular Expressions Tutorial, the Basics (for BEGINNERS) by demerphq
in thread Regular Expressions Tutorial, the Basics (for BEGINNERS) by brusimm
