http://qs321.pair.com?node_id=934502


in reply to A regex question

emelianenko:

Here's a quick bit of code to get you started:

use strict;
use warnings;
$/ = undef;
while (my $line = <DATA>) {
    for ($line =~ m/<a[^>]*>(.*?)<\/a>/gs) {
        print "Name '$_'\n";
    }
}
__DATA__
<a href="foo">Jon.Martinez</a><li>gabba, gabba, hey!</li><a href=bar>Mary
Jones</a><p>Gazebo!</p><a href="baz">Rob Oticus</a><a>Joe Blow</a>

Note that we slurp the whole file in at once ($/=undef); otherwise we can't find names spread over two lines (like Mary Jones). We also need the 's' switch on the regular expression so that '.' matches newlines (again, to pick up Mary Jones).
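If you want to convince yourself that both pieces are needed, here is a minimal sketch that runs the same pattern three ways. It works on an in-memory string rather than a file, and the helper name anchor_texts and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Sample input: the second name is split across two lines.
my $html = qq{<a href="foo">Jon.Martinez</a>\n<a href=bar>Mary\nJones</a>\n};

# Hypothetical helper: return the text of every <a>...</a> found in $text.
# $dot_matches_newline switches the /s modifier on or off.
sub anchor_texts {
    my ($text, $dot_matches_newline) = @_;
    return $dot_matches_newline
        ? $text =~ m/<a[^>]*>(.*?)<\/a>/gs
        : $text =~ m/<a[^>]*>(.*?)<\/a>/g;
}

# 1. Line by line: no single line holds both the opening and closing
#    tag for the split name, so it is never matched.
my @by_line = map { anchor_texts($_, 1) } split /^/, $html;

# 2. Whole string, but without /s: '.' stops at the newline.
my @no_s = anchor_texts($html, 0);

# 3. Whole string with /s: both names are found.
my @with_s = anchor_texts($html, 1);

print scalar(@by_line), " ", scalar(@no_s), " ", scalar(@with_s), "\n";
```

Only the third variant picks up the name that spans two lines, which is why the code above uses both the slurp and the 's' switch.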

Running it gives you:

$ perl foo.pl 1
Name 'Jon.Martinez'
Name 'Mary
Jones'
Name 'Rob Oticus'
Name 'Joe Blow'

Now, having said all that: remember to review perlre and perlop. Also, you may want to use a real HTML parser instead of hacking away with regular expressions; otherwise you can run into trouble with unexpected formatting.
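For a rough idea of what the parser route could look like, here is a sketch using HTML::TokeParser. That module comes with the HTML-Parser distribution on CPAN, not with core Perl, so this assumes you have it installed; the helper name link_texts and the sample string are just for illustration:

```perl
use strict;
use warnings;
use HTML::TokeParser;   # from the CPAN HTML-Parser distribution, not core Perl

# Hypothetical helper: return the text of every <a> element in $html.
sub link_texts {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot set up parser";
    my @names;
    while (my $tag = $parser->get_tag('a')) {
        # get_trimmed_text reads up to the closing </a> and collapses
        # runs of whitespace (newlines included) into single spaces.
        push @names, $parser->get_trimmed_text('/a');
    }
    return @names;
}

my $html = qq{<a href="foo">Jon.Martinez</a><li>gabba</li><a href=bar>Mary\nJones</a>};
print "Name '$_'\n" for link_texts($html);
```

The parser takes care of attributes, quoting and line breaks for you, so you don't have to think about slurping or the 's' switch at all.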

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Update: changed 'e' to 's' (thanks for catching that, hbm!)

Re^2: A regex question
by emelianenko (Initiate) on Oct 28, 2011 at 20:58 UTC
    Thank you wholeheartedly. I am going to study what you wrote. I definitely want to add Perl to my baggage, but I am finishing C now. Right after that I will, because I am a fanatic about managing information. Thank you again. Best regards.