PerlMonks  

Regex. Combination of words block. Need help.

by nickwest (Initiate)
on Feb 09, 2020 at 10:37 UTC ( [id://11112671]=perlquestion )

nickwest has asked for the wisdom of the Perl Monks concerning the following question:

Hello, dear Monks.

$string = 'aaa, bbb,  ccc,ddddd;eee-12345abc Qwerty';

I need to get an array that consists of:

'aaa, bbb, ccc,ddddd;eee-12345abc Qwerty'
'aaa, bbb'
'aaa, bbb, ccc'
'aaa, bbb, ccc,ddddd'
'aaa, bbb, ccc,ddddd;eee-12345abc'
'bbb, ccc,ddddd;eee-12345abc Qwerty'
'bbb, ccc'
'bbb, ccc,ddddd'
'bbb, ccc,ddddd;eee-12345abc'
'ccc,ddddd;eee-12345abc Qwerty'
'ccc,ddddd'
'ccc,ddddd;eee-12345abc'
'ddddd;eee-12345abc Qwerty'
'ddddd;eee-12345abc'
'eee-12345abc Qwerty'
'aaa'
'bbb'
'ccc'
'ddddd'
'eee-12345abc'
'Qwerty'

A word may consist of the characters [\w\-\/\_].
The delimiter is \s+ or , or ;

I think the solution should look like this:
@a = $string =~ /............../gi;
Please help me complete the regexp.

Replies are listed 'Best First'.
Re: Regexp. Combination of words block. Need help.
by Corion (Patriarch) on Feb 09, 2020 at 10:47 UTC

    Why do you need to do this all in one regular expression?

    Also, do you mean that the space in front of  ccc should be treated differently from the space between abc Qwerty?

    Also, why does Qwerty not show up as a separate string in your other examples?

    If you want a simple approach, just split the string on /[;,\s]+/ to get an array, and then get all the sub-sequences of that array:

    #!perl
    use strict;
    use warnings;

    my $string = 'aaa, bbb, ccc,ddddd;eee-12345abc Qwerty';
    my @items = split /[,;\s]+/, $string;
    for my $i (0..$#items) {
        for my $j ($i..$#items) {
            print join " - ", @items[ $i .. $j ];
            print "\n";
        };
    }
    __END__
    aaa
    aaa - bbb
    aaa - bbb - ccc
    aaa - bbb - ccc - ddddd
    aaa - bbb - ccc - ddddd - eee-12345abc
    aaa - bbb - ccc - ddddd - eee-12345abc - Qwerty
    bbb
    bbb - ccc
    bbb - ccc - ddddd
    bbb - ccc - ddddd - eee-12345abc
    bbb - ccc - ddddd - eee-12345abc - Qwerty
    ccc
    ccc - ddddd
    ccc - ddddd - eee-12345abc
    ccc - ddddd - eee-12345abc - Qwerty
    ddddd
    ddddd - eee-12345abc
    ddddd - eee-12345abc - Qwerty
    eee-12345abc
    eee-12345abc - Qwerty
    Qwerty
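
    The split-based listing above joins the words with " - ", so the original delimiters are lost. If the pieces should keep the delimiters exactly as they appear in the input, one possible variation is to record where each word starts and ends and cut the runs out with substr. This is only a sketch: the helper name word_runs is hypothetical, and the word class [-\w/] is taken from the question's [\w\-\/\_].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every run of consecutive words, cut from the original string
# so that the delimiters between the words are preserved as written.
# (word_runs is a hypothetical helper name, not from the thread.)
sub word_runs {
    my ($string) = @_;

    # Collect [start, end] offsets of each word using @- and @+.
    my @spans;
    while ( $string =~ m{[-\w/]+}g ) {
        push @spans, [ $-[0], $+[0] ];
    }

    # For each first word $i and last word $j >= $i, take the substring
    # from the start of word $i to the end of word $j.
    my @runs;
    for my $i ( 0 .. $#spans ) {
        for my $j ( $i .. $#spans ) {
            my $start = $spans[$i][0];
            my $end   = $spans[$j][1];
            push @runs, substr( $string, $start, $end - $start );
        }
    }
    return @runs;
}

my $string = 'aaa, bbb, ccc,ddddd;eee-12345abc Qwerty';
print "$_\n" for word_runs($string);
```

    With six words in the sample string this yields 21 runs, starting with 'aaa' and 'aaa, bbb' and ending with 'Qwerty'.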
Re: Regex. Combination of words block. Need help.
by tybalt89 (Monsignor) on Feb 09, 2020 at 15:48 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11112671
    use warnings;

    my $string = 'aaa, bbb, ccc,ddddd;eee-12345abc Qwerty';

    my @a;
    $string =~ m{
      (?<![-\w/])[-\w/]
      .*
      (?<=[-\w/])(?![-\w/])
      (?{ push @a, $& })
      (*FAIL)
      }x;

    use Data::Dump 'dd'; dd \@a;

    Outputs:

    [
      "aaa, bbb, ccc,ddddd;eee-12345abc Qwerty",
      "aaa, bbb, ccc,ddddd;eee-12345abc",
      "aaa, bbb, ccc,ddddd",
      "aaa, bbb, ccc",
      "aaa, bbb",
      "aaa",
      "bbb, ccc,ddddd;eee-12345abc Qwerty",
      "bbb, ccc,ddddd;eee-12345abc",
      "bbb, ccc,ddddd",
      "bbb, ccc",
      "bbb",
      "ccc,ddddd;eee-12345abc Qwerty",
      "ccc,ddddd;eee-12345abc",
      "ccc,ddddd",
      "ccc",
      "ddddd;eee-12345abc Qwerty",
      "ddddd;eee-12345abc",
      "ddddd",
      "eee-12345abc Qwerty",
      "eee-12345abc",
      "Qwerty",
    ]
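
    For readers unfamiliar with the idiom, here is my reading of how the single-regex approach works, shown on a shorter input. The pattern is the same as in the post above; the sample string 'a, b;c' and the array name @found are only illustrative. The code block records each candidate, and (*FAIL) then rejects it so the engine backtracks to the next shorter ending and, once those are exhausted, moves on to the next starting word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @found;    # illustrative name; collects every candidate match
'a, b;c' =~ m{
    (?<![-\w/]) [-\w/]       # a word character that begins a word
    .*                       # anything, longest attempt first
    (?<=[-\w/]) (?![-\w/])   # must stop right after the end of a word
    (?{ push @found, $& })   # record the text matched so far
    (*FAIL)                  # reject it, which forces backtracking
}x;

print "$_\n" for @found;
# Prints, one per line:
#   a, b;c
#   a, b
#   a
#   b;c
#   b
#   c
```

    The overall match always fails, so only the side effect of the code block matters; that is why the results arrive longest-first for each starting word.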
Re: Regexp. Combination of words block. Need help.
by Anonymous Monk on Feb 09, 2020 at 10:48 UTC
    Why that array? I'd use split:
      $b = 'aaa,bbb,ccc ddd; eee fff, kv.6 end';
      @a = split(/[\s\,\.\;]+/, $b);
      for ($i=0;$i<$#a-1;$i++) {
          $_ = $b;
          $k = $a[$i];
          s/^.*$k[\s\,\.\;]+//;
          push @c, $_;
      }
      push @a, @c, $b;
      foreach (@a) {
          print "$_\n";
      }

      and get:
      aaa
      bbb
      ccc
      ddd
      eee
      fff
      kv
      6
      end
      bbb,ccc ddd; eee fff, kv.6 end
      ccc ddd; eee fff, kv.6 end
      ddd; eee fff, kv.6 end
      eee fff, kv.6 end
      fff, kv.6 end
      kv.6 end
      6 end
      aaa,bbb,ccc ddd; eee fff, kv.6 end
      Yes! Thank you! I'll go your way and split first. But I still think it's possible to do it with one regex.
