in reply to Parsing/regex help required
You generally need to figure out how to describe the problem to yourself to guide yourself to a solution. You didn't present any requirements, but let's assume from your example that you want to recognize lines that are numbered (i.e., begin with a number followed by a period) and include a hyphen surrounded by whitespace.
There are several ways you can accomplish it. You've already mentioned index and substr; another way could be to use split or, as you mention in the title, a regular expression.
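To make the non-regex routes concrete, here is a minimal sketch of the split and index/substr approaches. The helper names parse_with_split and parse_with_index are hypothetical, and the sketch assumes the requirements guessed at above: a leading number, a period, and a hyphen set off by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: use split to break the line at the first
# whitespace-delimited hyphen, then peel off the leading number.
sub parse_with_split {
    my ($line) = @_;
    # A limit of 2 keeps any later " - " inside the second field.
    my ($head, $tail) = split /\s+-\s+/, $line, 2;
    return unless defined $tail;               # no separator found
    my ($num, $text1) = split /\.\s+/, $head, 2;
    return unless defined $text1 && $num =~ /^\d+$/;
    return ($num, $text1, $tail);
}

# Hypothetical helper: the same idea with index and substr.
# Assumption: exactly one space follows the period and surrounds the hyphen.
sub parse_with_index {
    my ($line) = @_;
    my $dot  = index $line, '. ';
    my $dash = index $line, ' - ';
    # Need at least one character before the period, and the hyphen
    # must come after the ". " pair.
    return if $dot < 1 || $dash < $dot + 2;
    my $num = substr $line, 0, $dot;
    return if $num =~ /\D/;                    # leading field must be all digits
    my $text1 = substr $line, $dot + 2, $dash - ($dot + 2);
    my $text2 = substr $line, $dash + 3;
    return ($num, $text1, $text2);
}

my $str = "123. The quick brown fox - Jumps over the";
for my $parser (\&parse_with_split, \&parse_with_index) {
    my ($num, $text1, $text2) = $parser->($str)
        or next;
    print "num=$num, text1=<$text1>, text2=<$text2>\n";
}
```

Both helpers break the line at the first " - " they find. A greedy regular expression would break at the last one instead, so the approaches can disagree on lines containing more than one hyphen; which behaviour is right depends on requirements that weren't stated.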
For a regular expression, you just build the expression bit by bit, like this:
    $ cat t.pl
    use strict;
    use warnings;

    my $str = "123. The quick brown fox - Jumps over the";

    if ($str =~ /^       # start of line\/string
                 (\d+)   # capture one or more digits
                 \.\s+   # a literal period followed by some space
                 (.*)    # some characters
                 \s+-\s+ # some space, a hyphen and more space
                 (.*)    # more characters
                 $       # end of the line or string
                /x) {    # x means allow whitespace and comments in regex
        my ($num, $text1, $text2) = ($1, $2, $3);
        print "num=$num, text1=<$text1>, text2=<$text2>\n";
    }
    else {
        print "No match!\n";
    }

    $ perl t.pl
    num=123, text1=<The quick brown fox>, text2=<Jumps over the>
The parentheses tell perl to capture the parts of the string you care about, so that if you find a match you can use the matched parts afterwards. The first capture group will be in variable $1, the next in $2 and so on. A normal perl installation will have a good bit of documentation on regular expressions, so be sure to look over:
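As a small aside on using the captures, the sketch below shows two alternatives to reading $1, $2 and $3: assigning the match in list context, and named captures (available since Perl 5.10). The helper name parse_line and the group names num, before and after are made up for illustration.

```perl
use strict;
use warnings;

my $str = "123. The quick brown fox - Jumps over the";

# In list context a successful match returns the captured groups,
# so they can be assigned straight to variables.
if (my ($num, $text1, $text2) = $str =~ /^(\d+)\.\s+(.*)\s+-\s+(.*)$/) {
    print "num=$num, text1=<$text1>, text2=<$text2>\n";
}

# Hypothetical helper: named captures are stored in the %+ hash.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ /^(?<num>\d+)\.\s+(?<before>.*)\s+-\s+(?<after>.*)$/;
    my %caps = %+;    # copy now, since the next match overwrites %+
    return \%caps;
}

if (my $caps = parse_line($str)) {
    print "num=$caps->{num}, before=<$caps->{before}>, after=<$caps->{after}>\n";
}
```

Named groups tend to pay off once a pattern has more than two or three captures, because adding a group no longer renumbers the others.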
- perldoc perlreref : a quick reference,
- perldoc perlretut : a tutorial,
- perldoc perlrequick : a quick start guide,
- and there are more, too!
Don't forget that you can check the perl documentation index via perldoc perl (or the fuller table of contents in perldoc perltoc) to see which documents may be helpful at a given time.
...roboticus
When your only tool is a hammer, all problems look like your thumb.