http://qs321.pair.com?node_id=484599

beeconn66 has asked for the wisdom of the Perl Monks concerning the following question:

hello,
let me first say i'm new to perl and pattern matching as well, so this may be a dumb question.
I was wondering if I can pattern match a record which contains variables. It's kind of hard to explain what I mean, so let me give an example.
Say I have an employee record of first name, last name, and employee #, in the form

employee(firstname, lastname, soc)

Based on that record I want to do pattern matching to see if it meets specific requirements, requirements that may change in the future. With that being said, is there some way to declare valid patterns for first name, last name, and employeenum?

For example

firstname = [\w]{1,100}
lastname = [\w]{1,100}
employeenum = [\d]{1,100}

I think what I am asking is: is there a way to declare a regular expression as a variable?

Does any of this make sense to anyone? Your thoughts/suggestions are appreciated.

20050817 Janitored by Corion: Added formatting

Re: Regular expression help
by graff (Chancellor) on Aug 17, 2005 at 21:21 UTC
    Yes. You can look up the "qr//" operator in perlre, (update: better yet, look it up in perlop), or you can simply assign a regex string to a variable, and use that variable in matches:
    #!/usr/bin/perl
    use strict;

    my $testdata = 'This is a foo bar test';
    while ( my $testregex = <DATA> ) {
        chomp $testregex;
        my $result = ( $testdata =~ /$testregex/ ) ? "matched" : "did not match";
        print "Test data $result $testregex\n";
    }
    __DATA__
    foo
    o{1,2}
    \w{5,10}
    \w{1,5}
    \d
    bar
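When the pattern strings come from a file or from user input, as in the loop above, a malformed one will make the match die at run time. A minimal sketch of one way to guard against that is to compile each string with qr// inside an eval block. The helper name compile_pattern is hypothetical, not something from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: compile a pattern string once with qr//.
# Returns the compiled regex, or undef if the string is not a valid regex.
sub compile_pattern {
    my ($string) = @_;
    my $re = eval { qr/$string/ };   # a bad pattern dies here; eval traps it
    return $re;
}

my $digits = compile_pattern('\d{1,100}');
print "12345 is all digits\n" if '12345' =~ /^$digits$/;

# An unterminated character class cannot be compiled.
print "bad pattern rejected\n" unless defined compile_pattern('[a-');
```

The compiled value can then be interpolated into later matches just like a hand-written qr// pattern.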
Re: Regular expression help
by philcrow (Priest) on Aug 17, 2005 at 21:29 UTC
    You can use regex quoting to build up expressions from pieces. It looks something like this:
    my $name = qr/\w{1,100}/;
    my $num  = qr/\d{1,100}/;

    $input =~ m{
        employee \(
            ($name) ,\s*
            ($name) ,\s*
            ($num)
        \)
    }x;
    my ($first_name, $last_name, $number) = ($1, $2, $3);
    So you can store regexes in vars using the qr quoting operator, then use them in subsequent regexes merely by mentioning them. They will interpolate.
    Phil
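Since the original question mentions requirements that may change, one possible arrangement is to keep the per-field patterns in a single table and build the record regex from it. This is only a sketch: the %rule table, the parse_employee name, and the sample records are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical rule table; edit these entries when the requirements change.
my %rule = (
    firstname   => qr/\w{1,100}/,
    lastname    => qr/\w{1,100}/,
    employeenum => qr/\d{1,100}/,
);

# Build the whole-record pattern from the pieces (they interpolate).
my ($first, $last, $num) = @rule{qw(firstname lastname employeenum)};
my $record_re = qr{
    ^ employee \( \s*
        ($first) \s* , \s*
        ($last)  \s* , \s*
        ($num)   \s*
    \) \z
}x;

# Returns a hash ref of the three fields, or undef if the record is invalid.
sub parse_employee {
    my ($record) = @_;
    if ( my ($f, $l, $n) = $record =~ $record_re ) {
        return { firstname => $f, lastname => $l, employeenum => $n };
    }
    return;
}

my $emp = parse_employee('employee(John, Smith, 12345)');
print "$emp->{firstname} $emp->{lastname} #$emp->{employeenum}\n" if $emp;
```

Anchoring with ^ and \z makes the whole record subject to the rules, so a stray character anywhere causes a rejection rather than a partial match.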
Re: Regular expression help
by CountZero (Bishop) on Aug 17, 2005 at 21:14 UTC
    There must be a pattern in your record for the regex to match. It is easiest if you give a number of practical examples and then we can all have a go at drafting a regex to match those examples.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Regular expression help
by gryphon (Abbot) on Aug 17, 2005 at 21:19 UTC

    Greetings beeconn66,

    I'm not sure if this is exactly what you're looking for, and I'm certainly not a regex-master or anything, but how about this:

    my $text = 'employee(firstname, lastname, soc)';
    if ($text =~ /employee\(([^,\s)]+)\s*,\s*([^,\s)]+)\s*,\s*([^,\s)]+)/) {
        print join("\n", $1, $2, $3);
    }

    gryphon
    Whitepages.com Development Manager (DSMS)
    code('Perl') || die;

Re: Regular expression help
by QM (Parson) on Aug 17, 2005 at 21:40 UTC
    First, be careful about "odd" names, such as "Jean-Michel Paris" and "Mary de Konig".

    I often do quick and dirty parsing jobs like this:

    # "-" goes last in each class so it is a literal hyphen, not a range.
    my $name = qr/([\w=-]+)\s*/;
    my $enum = qr/([\d.+,-]+)\s*/;

    while ( <> ) {
        if ( my @captures = /$name$name$enum/ ) {
            do_something_here(@captures);
        }
    }

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: Regular expression help
by admiral_grinder (Pilgrim) on Aug 18, 2005 at 18:48 UTC
    O'Reilly has books on regular expressions. I just bought the pocket reference on them and gave it a read today, and I think it will help you out a lot. It is only 10 bucks, too.
Re: Regular expression help
by beeconn66 (Novice) on Aug 18, 2005 at 12:49 UTC
    Wow...I wasn't expecting so many responses so fast. Great feedback monks. This should give me some things to ponder. Thanks again.
      Okay...so I overcame that hurdle thanks to you guys...but now I need to find where a pattern match fails. Is it possible? For instance, if I'm looking for digits and the string is 123a45, is there a way to inform the user that the pattern match is failing because of the "a"?
        You can match with m/^\d+$/ (anchored, so the whole string must be digits) and, if that fails, tell the user "string contains a non-numeric character" or something like that.
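To name the offending character in the digits-only case, one option is to search for the first non-digit with /\D/ and read its offset from the @- array. This is a sketch for that specific case, not a general way to find where an arbitrary regex fails; the digit_error name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns undef if $value is all digits, otherwise a
# message naming the first offending character and its zero-based position.
sub digit_error {
    my ($value) = @_;
    return "value is empty" if $value eq '';
    if ( $value =~ /(\D)/ ) {
        # $-[0] holds the offset where the last successful match started.
        return sprintf "unexpected '%s' at position %d", $1, $-[0];
    }
    return;
}

print digit_error('123a45'), "\n";   # prints: unexpected 'a' at position 3
```

The same idea extends to other single-class fields by searching for the negated class, for example /\W/ for a field that must match \w+.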