PerlMonks  

Re: validate variable-length lines in one regex?

by poj (Abbot)
on Jul 06, 2015 at 19:32 UTC ([id://1133427])


in reply to validate variable-length lines in one regex?

I'm not sure this is of any use but I'll offer it anyway. The idea is to create a mask of codes, one per column, where each code selects the regex used to validate that column.

#!perl
use strict;

# one validation regex per column code
my %REGEX = (
  'A'  => qr'^[A-Z]\d\d$',
  'N'  => qr'^[0-9]+$',
  'N3' => qr'^[0-3]+$',
  'D'  => qr'^\d+\.\d+$',
);

# the mask: the code expected in each column, left to right
my @p = qw(A N N N3 D D D D D D);

while (<DATA>){
  chomp;
  my @f = split '\s+';
  my $chk = 'OK ';
  for my $i (0..$#f){
    if ($f[$i] !~ $REGEX{$p[$i]}){
      # flag the line and mark the bad field with the regex it failed
      $chk = 'ERR';
      $f[$i] = '**'.$f[$i]." $REGEX{$p[$i]}**";
    }
  }
  print join ' ',$chk,@f,"\n";
}
__DATA__
C3 6 3 2.4 1.5 2.6
C32 2 7 3 1.0
H31 1 1 0 21.0 11.2 5.3 1.4
T11 2 1 0 6.0 1.1 2.2
L06 1 1 0 1.0 3.3
L06 1 4 0 1.1 1.8

poj

Replies are listed 'Best First'.
Re^2: validate variable-length lines in one regex?
by uhClem (Scribe) on Jul 06, 2015 at 20:38 UTC

    Oho! Now that is slick, in a gruesome way. I like it; just might use that. Bonus for doing exactly what I want in an almost unrelated way. It even seems like that could be the basis of a script that could figure out for itself what the probable pattern for each lousy file is, and just yank any outliers... Let's see how big a mess I can make with THAT!

    And just the same, there is still that "D D D D D D" -- a fixed-length sequence, so you just have to hope you don't run into any lines with seven Ds. I bet there's a way around that (and I know I'll never have more than nine -- in this file...) but that does point back to my original question: Can you make a single regex carefully validate a variable number of fields (and return all matches)? Will a Perl regex do that, or does it exceed the possibilities?

    Anyway, thanks!
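On the single-regex question, one possible approach is sketched below. A quantified group such as `(?: \s+ \d+\.\d+ ){1,6}` can validate a variable number of trailing fields, but a capture group inside a repetition keeps only its last iteration, so the sketch captures the whole run of D fields as one string and splits it afterwards. The column layout (A N N N3 followed by one to six D fields) is an assumption taken from the mask above, and the name `validate_line` is hypothetical.

```perl
use strict;
use warnings;

# Assumed layout, taken from the mask above: A N N N3, then 1..6 D fields.
my $LINE = qr/
    ^ ([A-Z]\d\d)                  # A
    \s+ ([0-9]+)                   # N
    \s+ ([0-9]+)                   # N
    \s+ ([0-3]+)                   # N3
    ( (?: \s+ \d+\.\d+ ){1,6} )    # run of D fields, captured as one string
    \s* \z
/x;

# Hypothetical helper: returns all fields of a valid line,
# or an empty list if the line does not match.
sub validate_line {
    my ($line) = @_;
    my ($code, $n1, $n2, $n3, $ds) = $line =~ $LINE or return;
    # split ' ' skips the leading whitespace left in the captured run
    return ($code, $n1, $n2, $n3, split ' ', $ds);
}

for my $line ('H31 1 1 0 21.0 11.2 5.3 1.4', 'C3 6 3 2.4 1.5 2.6') {
    my @fields = validate_line($line);
    print @fields ? "OK  @fields\n" : "ERR $line\n";
}
```

Raising the upper bound of the quantifier (say to `{1,9}`) is all it takes to allow more D fields; the trade-off against the mask approach is that a failing line is rejected as a whole, without saying which field was bad.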

      Another thought: if it's possible to edit the file, I would put the mask on the first line, so there is no need to edit the script. Failing that, put something in the filename that selects the correct mask for you. This would of course mean editing the file for each new mask.

      poj
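The mask-on-the-first-line idea could look roughly like the sketch below. It reuses the code table from the script above; the helper name `check_with_header_mask` and the choice to flag lines that have more fields than the mask are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;

my %REGEX = (
    A  => qr/^[A-Z]\d\d$/,
    N  => qr/^[0-9]+$/,
    N3 => qr/^[0-3]+$/,
    D  => qr/^\d+\.\d+$/,
);

# Hypothetical helper: the first line read from the handle is the mask,
# every following line is data. Returns one 'OK' or 'ERR' per data line.
sub check_with_header_mask {
    my ($fh) = @_;
    defined(my $header = <$fh>) or die "empty input\n";
    my @mask = split ' ', $header;
    for my $code (@mask) {
        die "unknown mask code '$code'\n" unless exists $REGEX{$code};
    }
    my @status;
    while (my $line = <$fh>) {
        my @f = split ' ', $line;
        # Bad if there are more fields than mask codes, or any field fails
        # its regex (the second test only runs when the first is false).
        my $bad = @f > @mask
               || grep { $f[$_] !~ $REGEX{ $mask[$_] } } 0 .. $#f;
        push @status, $bad ? 'ERR' : 'OK';
    }
    return @status;
}

# Demo on an in-memory file; a real script would open the data file instead.
my $text = "A N N N3 D D D\nC32 2 7 3 1.0\nC3 6 3 2.4 1.5 2.6\n";
open my $fh, '<', \$text or die $!;
print join(' ', check_with_header_mask($fh)), "\n";
```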

        Hmmm.... Not sure that I would trust the preparer of such files that much. Nor would I, personally, want to so much as touch the data. However, you might have some kind of catalog or configuration file, external to the script, which provides the necessary information. (And, if the script could not locate exactly one appropriate entry for whichever file it has been given, it would obligingly die().)
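A minimal sketch of such a catalog lookup follows. The entry format (`<filename regex> = <mask codes>`, one per line), the helper name `mask_for`, and the example file names are all hypothetical; the only behaviour taken from the suggestion above is that the script dies unless exactly one entry matches.

```perl
use strict;
use warnings;

# Hypothetical catalog format: "<filename regex> = <mask codes>", one per line.
# Returns the mask codes for $file; dies unless exactly one entry matches.
sub mask_for {
    my ($file, @catalog) = @_;
    my @hits;
    for my $entry (@catalog) {
        next if $entry =~ /^\s*(?:#|\z)/;    # skip comments and blank lines
        my ($pattern, $mask) = $entry =~ /^\s*(\S+)\s*=\s*(.+?)\s*$/
            or die "malformed catalog entry: $entry\n";
        push @hits, [ split ' ', $mask ] if $file =~ /$pattern/;
    }
    die "expected exactly one catalog entry for '$file', found "
        . scalar(@hits) . "\n" unless @hits == 1;
    return @{ $hits[0] };
}

# Demo with an inline catalog; in practice the lines would be read from
# a configuration file kept outside the script.
my @catalog = (
    '# pattern         = mask',
    '^survey_\d+\.txt$ = A N N N3 D D D D D D',
    '^counts_          = A N N',
);
my @mask = mask_for('survey_2015.txt', @catalog);
print "@mask\n";
```

The returned list can be dropped in where the script above hard-codes `@p`, so neither the script nor the data files need editing when a new layout turns up.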
