PerlMonks  

RegEx advice needed

by Anonymous Monk
on Oct 23, 2003 at 18:32 UTC ( [id://301667]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have several hundred lines of firmware I'm parsing out and have gotten them all with the exception of three.

#!/usr/bin/perl
use strict;
use warnings;

my @array = ();
while (<DATA>) {
    push @array, $1 if ( /Version:?\s*([^\s,]+)/i );
    push @array, $1 if ( /SW:?\s*$|SW_REV:?\s*([^.]+)/ );
    push @array, $1 if ( /Rev\s*$|\s+Revision:?\s*([^\s,]+)/i );
}

foreach (@array) {
    $_ =~ s/Copyright//i;
    print "FW: $_\n";
}

# OUTPUT IS:
#FW: rdtg7.0.4.7
#FW: 0;
#FW: CG4D_05

#OUTPUT SHOULD BE
#FW: rdtg7.0.47
#FW: 4.1.4p
#FW: CG4D_05.3.02

__DATA__
Company: Nuera Communications, Inc., ProductFamily: ORCA Series, Product: RDT-8, Version: rdtg7.0.4.7, HardwareRevision: A
Motorola Corporation SB4100E Cable Modem: Hardware version: 0; OS: VxWorks 5.3.1; Software version: 4.1.4p
<<HW_REV: 0; VENDOR: Motorola; BOOTR: CG4D_05.3.02; SW_REV: CG4D_05.3.02; MODEL: SBV4200>>OS: VxWorks 5.4

Any suggestions are appreciated!

Replies are listed 'Best First'.
Re: RegEx advice needed
by talexb (Chancellor) on Oct 23, 2003 at 19:04 UTC

    I deleted your original regexen and wrote my own:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @array = ();
    while (<DATA>) {
        if ( /Version:\s*(\S+),/ ) {
            push ( @array, $1 ), next;
        }
        if ( /Software version:\s*(\S+)/ ) {
            push ( @array, $1 ), next;
        }
        if ( /SW_REV:\s*(\S+);/ ) {
            push ( @array, $1 ), next;
        }
    }

    foreach (@array) {
        $_ =~ s/Copyright//i;
        print "FW: $_\n";
    }

    __DATA__
    Company: Nuera Communications, Inc., ProductFamily: ORCA Series, Product: RDT-8, Version: rdtg7.0.4.7, HardwareRevision: A
    Motorola Corporation SB4100E Cable Modem: Hardware version: 0; OS: VxWorks 5.3.1; Software version: 4.1.4p
    <<HW_REV: 0; VENDOR: Motorola; BOOTR: CG4D_05.3.02; SW_REV: CG4D_05.3.02; MODEL: SBV4200>>OS: VxWorks 5.4

    The code produces the output that you wanted:
    [tab@fred dev]$ perl -w pm23-10-03.pl
    FW: rdtg7.0.4.7
    FW: 4.1.4p
    FW: CG4D_05.3.02

    but I skipped the alternative options with 'SW_REV' in the third regex that suggest you had to do some additional fiddling around.

    I also did a next after finding a match, on the assumption that once you've found a match, it's a waste of time to try to find another. This isn't too bad when you're dealing with 100 lines of data, but it begins to be a real problem when you have 1000000 lines of data and dozens of regexen.
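
The "stop at the first match" idea can also be written as an ordered list of patterns. This is only a sketch: the sub name `first_firmware` is hypothetical, and the three patterns are the ones from the code above, with no claim that they cover the rest of the firmware data.

```perl
use strict;
use warnings;

# Patterns are tried in order; the first one that matches wins.
my @patterns = (
    qr/Version:\s*(\S+),/,
    qr/Software version:\s*(\S+)/,
    qr/SW_REV:\s*(\S+);/,
);

# Hypothetical helper: return the first capture found, or undef if none match.
sub first_firmware {
    my ($line) = @_;
    for my $re (@patterns) {
        return $1 if $line =~ $re;
    }
    return;    # no pattern matched
}

my $line = 'Motorola Corporation SB4100E Cable Modem: Hardware version: 0; '
         . 'OS: VxWorks 5.3.1; Software version: 4.1.4p';
my $fw = first_firmware($line);
print "FW: $fw\n" if defined $fw;
```

Because the first pattern is case-sensitive, the lowercase "Hardware version: 0;" is skipped and the second pattern picks up 4.1.4p. Returning from the sub plays the same role as next in the loop version: later patterns are never tried once one has matched.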

    --t. alex
    Life is short: get busy!
Re: RegEx advice needed
by monktim (Friar) on Oct 23, 2003 at 19:01 UTC
    Do you really want to strip off the last dot in rdtg7.0.4.7?

    In the SW regex you can do this:
    push @array, $1 if ( /SW(?:_REV)?:?\s*([^;\$]+)/ );
    What do you want to do in the version regex if there is more than one version?
    Update: Added ? after (?:_REV).
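
A quick check of the SW pattern above against the third sample line from the question. This is only a sketch using that one line; how the pattern behaves on the rest of the firmware data is unknown.

```perl
use strict;
use warnings;

# Third sample line from the original question.
my $line = '<<HW_REV: 0; VENDOR: Motorola; BOOTR: CG4D_05.3.02; '
         . 'SW_REV: CG4D_05.3.02; MODEL: SBV4200>>OS: VxWorks 5.4';

my @array;
# [^;\$]+ captures everything up to the next ';' or literal '$'.
push @array, $1 if ( $line =~ /SW(?:_REV)?:?\s*([^;\$]+)/ );

print "FW: $_\n" for @array;
```

One thing to watch: since both `_REV` and the colon are optional, the pattern will also fire on any line that merely contains the letters SW, so it may need tightening for other inputs.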
Re: RegEx advice needed
by TomDLux (Vicar) on Oct 23, 2003 at 19:03 UTC

    For the first line of input, the first RE matches Version ..., and stores rdtg7.0.4.7.

    For the second line, the first RE matches Hardware version ... and stores 0;.

    For the third line, the second RE matches SW_REV... and stores CG4D_05.

    Update: zby, I don't see how you get the third line of your output, FW: 4.1.4p. That can only be grabbed by the first RE, which has already used up its one try at the input data by matching Hardware version.

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

Re: RegEx advice needed
by zby (Vicar) on Oct 23, 2003 at 19:06 UTC
    My output was:
    FW: rdtg7.0.4.7
    FW: 0;
    FW: 4.1.4p
    FW: CG4D_05
    The first line seems to be correct: in the data you have 'rdtg7.0.4.7'. The second line is due to matching "Hardware version: 0;", and the third is correct. In the last one, the pattern ([^.]+) matches anything but '.', so it is clear why 'SW_REV: CG4D_05.3.02' was matched only up to the first '.'.

    I can't say how you should fix the pattern because I don't know the other data.
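
To see the effect of the negated character class in isolation, here is a small sketch on a fragment of the third sample line. The second pattern is only an illustration of a different stop set (whitespace, ';' or ','); whether it suits the rest of the data is an open question.

```perl
use strict;
use warnings;

my $line = 'SW_REV: CG4D_05.3.02; MODEL: SBV4200';

# Original capture class: consume everything up to the first '.'
my ($up_to_dot) = $line =~ /SW_REV:?\s*([^.]+)/;

# Illustrative alternative: stop at whitespace, ';' or ',' instead
my ($up_to_delim) = $line =~ /SW_REV:?\s*([^\s;,]+)/;

print 'with [^.]+     : ', $up_to_dot,   "\n";
print 'with [^\s;,]+  : ', $up_to_delim, "\n";
```

The first capture ends at CG4D_05 because the dot is the one character the class excludes; the second keeps the dots and stops at the semicolon.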

Re: RegEx advice needed
by Anonymous Monk on Oct 23, 2003 at 19:14 UTC
    I made some adjustments and corrected two of the three with:
    push @array, $1 if ( /Version:?\s*([^\s;,\n\r]+)/i );
    push @array, $1 if ( /SW:?\s*$|SW_REV:?\s*([^\s;,\n\r]+)/ );
    push @array, $1 if ( /Rev\s*$|\s+Revision:?\s*([^\s;,\n\r]+)/i );

    # OUTPUT IS:
    # FW: rdtg7.0.4.7
    # FW: 0
    # FW: CG4D_05.3.02

    It's still finding Hardware version first!....any thoughts?
      This is getting ugly. This uses your code to work for the given example. I don't know if this is what you really want though.
      push @array, $1 if ( /(?<!Hardware)(?=\s+Version:?\s*([^\s;,\n\r]+))/i );
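
A sketch of how the lookbehind version behaves on the sample lines. The sub name `version_not_hardware` is hypothetical; the pattern is the one above, unchanged.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the lookbehind pattern.
# (?<!Hardware) rejects a position directly preceded by "Hardware";
# the lookahead then requires whitespace + "Version" and captures the value.
sub version_not_hardware {
    my ($line) = @_;
    return $1
        if $line =~ /(?<!Hardware)(?=\s+Version:?\s*([^\s;,\n\r]+))/i;
    return;    # nothing matched
}

my $line = 'Motorola Corporation SB4100E Cable Modem: Hardware version: 0; '
         . 'OS: VxWorks 5.3.1; Software version: 4.1.4p';
my $fw = version_not_hardware($line);
print "FW: $fw\n" if defined $fw;
```

On this line the position after "Hardware" is rejected and the match lands on "Software version", giving 4.1.4p. The check is fragile, though: it only looks at the eight characters immediately before the whitespace, so "Hardware" followed by two spaces would slip past it, because the match can then start after the first space.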
