PerlMonks  

Regular Expressions-Finding info between semicolons

by adrianm96 (Novice)
on Nov 09, 2016 at 23:07 UTC ( [id://1175635]=perlquestion )

adrianm96 has asked for the wisdom of the Perl Monks concerning the following question:

Hello All, For my input I've got a set of character groups, each surrounded by semicolons ';'. I'm trying to find one specific letter in the input and pull out everything between the semicolons around it. It sounds easy but it has stumped me for a day and a half. I've looked into look-aheads, look-behinds, and anchors, but whatever I try, my regex never meets all the criteria. I could use some help. Here are some examples I've tried so far:


/\;(.*?W.*?)\;/

Output:
Letter to find=W
Example Input=;A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;
RegExResult=A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W
Issue: The match starts at the very first semicolon in the whole string
Desired Result: T-W


 /\;(.*?L.*?)\;/

Output:
Letter to find=L
Example Input=;A-B-C-D;E-F-G-H;J-K-L-M;N-P-R-S;T-W-Y-Z;
RegExResult=A-B-C-D;E-F-G-H;J-K-L-M
Issue: The match starts at the very first semicolon in the whole string
Desired Result: J-K-L-M


 /\;(.*?B.*?)\;/

Output:
Letter to find=B
Example Input=;A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;
RegExResult=A-B
Issue: No issue, works in this case since it's at the beginning of the string
Desired Result: A-B

Replies are listed 'Best First'.
Re: Regular Expressions-Finding info between semicolons
by shmem (Chancellor) on Nov 09, 2016 at 23:21 UTC

    Use an inverted character class: [^;]* instead of .*? - then

    $_ = ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;'; print $1, "\n" if /;([^;]*W[^;]*);/;

    yields the desired result.
    You don't have to escape the semicolon, since it is not a metacharacter.

    See perlre and "Bracketed Character Classes" in perlrecharclass.
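
A minimal sketch of the same idea wrapped in a reusable sub. The name `field_with` is hypothetical (it does not appear in the thread), and the `\Q...\E` quoting is an added assumption, in case the text being searched for ever contains regex metacharacters:

```perl
use strict;
use warnings;

# Hypothetical helper: return the first semicolon-delimited field
# that contains $letter, or undef if no field matches.
sub field_with {
    my ($letter, $string) = @_;
    # [^;]* cannot cross a semicolon, so the capture stays inside one field.
    # \Q...\E quotes any regex metacharacters in $letter (an added precaution).
    return $string =~ /;([^;]*\Q$letter\E[^;]*);/ ? $1 : undef;
}

print field_with('W', ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;'), "\n";   # T-W
print field_with('L', ';A-B-C-D;E-F-G-H;J-K-L-M;N-P-R-S;T-W-Y-Z;'), "\n";   # J-K-L-M
print field_with('B', ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;'), "\n";   # A-B
```

Note that the field must have a semicolon on both sides, which holds for the sample input because it starts and ends with ';'.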


      This is exactly what I was looking for! I appreciate it! I'll look into what you wrote to fully understand it.

Re: Regular Expressions-Finding info between semicolons
by choroba (Cardinal) on Nov 09, 2016 at 23:18 UTC
    I'd first split the string on semicolons, then you can just grep the string that contains the letter:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    for my $tuple (
        [ W => ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;' ],
        [ L => ';A-B-C-D;E-F-G-H;J-K-L-M;N-P-R-S;T-W-Y-Z;' ],
        [ B => ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;' ],
        [ A => ';A-B-C;D-E-F-G-H-J-K-L;M-N-P-R-S-T;W-Y-Z;' ],
        [ Z => ';A-B-C;D-E-F-G-H-J-K-L;M-N-P-R-S-T;W-Y-Z;' ],
    ) {
        my ($letter, $string) = @$tuple;
        say "$letter: ", grep index($_, $letter) != -1, split /;/, $string;
    }
Re: Regular Expressions-Finding info between semicolons
by Paladin (Vicar) on Nov 09, 2016 at 23:25 UTC

    This is a fairly common issue I see with Regex. People use . in their regexen when that's not what they actually mean.

    In your example you have: /\;(.*?L.*?)\;/. You are looking for an L surrounded by a bunch of stuff that isn't semicolons, but . means "any character" (newline aside), not "any character except semicolons". If you want to say "anything except semicolons" in the regex, then say that: /\;([^;]*?L[^;]*?)\;/

    [~]$ perl -le '$_ = ";A-B-C-D;E-F-G-H;J-K-L-M;N-P-R-S;T-W-Y-Z;"; /\;([^;]*?L[^;]*?)\;/; print $1;'
    J-K-L-M
    tl;dr: Don't use . if you don't mean .
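
To see the difference concretely, here is a small sketch that runs both patterns against the sample input from the question and reports where each match starts, using the built-in `@-` array. The helper name `match_info` and the variable names are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return (offset where the match starts, captured text),
# or an empty list if the pattern does not match.
sub match_info {
    my ($re, $input) = @_;
    return $input =~ $re ? ($-[0], $1) : ();
}

my $input = ';A-B-C-D;E-F-G-H;J-K-L-M;N-P-R-S;T-W-Y-Z;';

# Lazy dot: the engine tries the leftmost starting point first, and
# .*? is allowed to walk across semicolons until it reaches the L.
my ($dot_start, $dot_field) = match_info(qr/;(.*?L.*?);/, $input);

# Negated class: [^;]* stops at every semicolon, so the attempts at the
# earlier semicolons fail and the engine moves on to later starting points.
my ($neg_start, $neg_field) = match_info(qr/;([^;]*L[^;]*);/, $input);

print "dot:     start=$dot_start captured=$dot_field\n";
print "negated: start=$neg_start captured=$neg_field\n";
```

The first pattern matches from offset 0 and captures A-B-C-D;E-F-G-H;J-K-L-M; the second matches from offset 16 and captures only J-K-L-M. Making the quantifier lazy changes where a match ends, not where it is allowed to begin.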

      Thanks for your help! Yeah I saw what was happening, I was just having trouble translating what I wanted into code to match the pattern. Thanks again!

Re: Regular Expressions-Finding info between semicolons
by tybalt89 (Monsignor) on Nov 09, 2016 at 23:24 UTC
    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1175635
    use strict;
    use warnings;

    my @testcases = (
        [ 'W', ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;' ],
        [ 'L', ';A-B-C-D;E-F-G-H;J-K-L-M;N-P-R-S;T-W-Y-Z;' ],
        [ 'B', ';A-B;C-D;E-F;G-H;J-K;L-M;N-P;R-S;T-W;Y-Z;' ],
    );

    for ( @testcases ) {
        my ( $letter, $string ) = @$_;
        $string =~ /.*;(.*?$letter.*?);/ and print "$letter $1\n";
    }
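
A hedged reading of why this works without a negated class: the leading greedy `.*` first consumes the whole string and then backtracks, so the `;` after it ends up being the rightmost semicolon from which the rest of the pattern can still match. One consequence is that when the letter occurs in several fields, the last such field is captured. A minimal sketch under that reading (the sub name `last_field_with` is hypothetical, and `\Q...\E` is an added precaution not present in the code above):

```perl
use strict;
use warnings;

# Hypothetical helper: return the LAST semicolon-delimited field
# containing $letter, or undef if there is none.
sub last_field_with {
    my ($letter, $string) = @_;
    # The greedy .* backtracks from the end of the string, so the ';'
    # that follows it is the rightmost one that still precedes $letter.
    return $string =~ /.*;(.*?\Q$letter\E.*?);/ ? $1 : undef;
}

# W appears in two fields here; the greedy prefix selects the later one.
print last_field_with('W', ';A-W;C-D;T-W;Y-Z;'), "\n";   # T-W
```

With the negated-class pattern /;([^;]*W[^;]*);/ the same input would yield A-W instead, since that pattern finds the leftmost matching field.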
