Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

List::MoreUtils before, after and ... between?

by Boldra (Deacon)
on Feb 21, 2012 at 14:41 UTC ( [id://955318]=perlquestion: print w/replies, xml ) Need Help??

Boldra has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to clean up a list to take rubbish away from the top and the bottom (it's a stack trace from a dbic/dancer app).

This works:

perl -MList::MoreUtils -E ' @foo = reverse List::MoreUtils::before { $_ =~ /DBIC/ } reverse List::MoreUtils::before { $_ =~ /Dancer/ } qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Dancer::3 + Dancer::2 Dancer::1>; say for @foo'

But it looks to me like there should be a more readable way of writing this. I had a look at using a flip-flop, but it seems that that would also require two reverses.

Any suggestions how to make this more readable?


- Boldra

Replies are listed 'Best First'.
Re: List::MoreUtils before, after and ... between?
by toolic (Bishop) on Feb 21, 2012 at 15:05 UTC
    Not very pretty, but it doesn't even go though your whole list once:
    use warnings; use strict; use Data::Dumper; my @foo; my $flag = 0; for (qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Dancer +::3 Dancer::2 Dancer::1>) { last if /Dancer/; $flag = 1 if /DBIC/; $flag = 2 if $flag and !/DBIC/; push @foo, $_ if $flag == 2; } print Dumper(\@foo); __END__ $VAR1 = [ 'MyApp::3', 'MyApp::2', 'MyApp::1' ];
Re: List::MoreUtils before, after and ... between?
by moritz (Cardinal) on Feb 21, 2012 at 15:00 UTC

    If you are sure that both regexes will be found, you can write something like this:

    use List::MoreUtils qw/first_index last_index/; use 5.010; use strict; use warnings; my @a = qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Dan +cer::3 Dancer::2 Dancer::1>; my $from = last_index { /DBIC/ } @a; my $to = first_index { /Dancer/ } @a; say for @a[$from + 1..$to - 1];

    If one or both regexes might not match, you'd need to special-case undefined $from and $to.

      This looks like the most readable to me, thanks Moritz!

      I'm surprised you didn't post a Perl 6 version too ... I love those!



      - Boldra

        I just didn't think of coming up with a Perl 6 solution, and even now I don't see a particularly elegant and readable way.

        So I've just ported toolic's solution to Perl 6, making the returned list of found values lazy:

        my @a = <eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Dance +r::3 Dancer::2 Dancer::1>; my @filtered := gather { my $flag = 0; for @a { last if /Dancer/; $flag = 1 if /DBIC/; $flag = 2 if $flag && !/DBIC/; take $_ if $flag == 2; } } .say for @filtered;

        (Tested with current Niecza)

Re: List::MoreUtils before, after and ... between?
by BrowserUk (Patriarch) on Feb 21, 2012 at 14:55 UTC

    Your specific example is more simply accomplished with grep:

    @foo = grep /MyApp/, qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 + MyApp::1 Dancer::3 Dancer::2 Dancer::1>;; print for @foo;; MyApp::3 MyApp::2 MyApp::1

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      This assumes that it is *known* that we search for "MyApp". I understood the problem in the following way: Throw away everything in the beginning of the list up to, and including, DBIC. Then keep everythin which follows, up to (but not including) the first Dancer. Then throw away what follows. For instance, a MyApp which precedes a DBIC or follows a Dancer, should also be thrown away. This can't be done with your grep solution.

      -- 
      Ronald Fischer <ynnor@mm.st>
        Your specific example...
Re: List::MoreUtils before, after and ... between? (SMoP)
by tye (Sage) on Feb 21, 2012 at 15:59 UTC
    my @stack; for my $level ( qw< eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Dancer::3 Dancer::2 Dancer::1 > ) { last if $level =~ /Dancer/; push @stack, $level; @stack = () if $level =~ /DBIC/; } say for @stack;

    (Updated to move 'last' statement above 'push' statement. Thanks, AnomalousMonk.)

    - tye        

Re: List::MoreUtils before, after and ... between?
by rovf (Priest) on Feb 21, 2012 at 15:53 UTC
    I don't think your solution is that bad. If you want to get rid of the two reverse, I think you have to buy and additional list function instead, for instance:
    List::MoreUtils::after_incl { $_ !~ /DBIC/ } List::MoreUtils::after_incl { /DBIC/ } List::MoreUtils::before { /Dancer/ } @YourList
    I consider this even more ugly. Alternatively you could try
    my $partition=0; my @parts = List::MoreUtils::part { if($pivots[0]) { if($partition % 2 == 0) { if($_ =~ $pivots[0]) { ++$partition; } } else { if($_ !~ $pivots[0]) { ++$partition; shift @pivots; } } } $partition } qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Dancer::3 + Dancer::2 Dancer::1>; print join(',',@{ $parts[2] }),"\n";
    Whether this is more readable, is something to be questioned, but at least the code can be generalized easier to more than 2 "pattern changes". However, it silently assumes that part traverses the list left to right, which is not guaranteed by the documentation.

    Another solution (which does not use reverse) would be

    my @input=qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 D +ancer::3 Dancer::2 Dancer::1>; my @firsts=@input[0..@input-2]; my @seconds=@input[1..@input-1]; my $inside=0; my @result; List::MoreUtils::pairwise { if($inside == 1) { my $pat=qr(Dancer); push @result,$a; if($a !~ $pat && $b =~ $pat) { $inside=-1; } } elsif($inside == 0) { my $pat=qr(DBIC); if($a =~ $pat && $b !~ $pat) { $inside=1; } } } @firsts, @seconds; print("@result\n");
    but this is forcing the pairwise function to something in a way it was not really meant to be used (since the result of pairwise is never used). Maybe you are better off writing a conventional loop and storing the state (inside / outside your pattern sequence) in a status variable, similar to my part example, but without (mis-)using List::MoreUtils.

    -- 
    Ronald Fischer <ynnor@mm.st>
Re: List::MoreUtils before, after and ... between?
by RichardK (Parson) on Feb 21, 2012 at 16:09 UTC

    How about something along these lines?

    shift @array while($array[0] ~~ /DBIC/); pop @array while ($array[$#array] ~~ /Dancer/);
      I think you overlooked the first element of my sample array: 'eval'. It's worse than that, I can get an non-DBIC element in the middle of the DBIC elements.

        Well, obviously you can change the regex to match whatever you need.

        You didn't mention that about the DBICs, that's a more difficult and messy problem.

        It's probably going to be much more successful to look for what you do want, rather than what you don't, so I'd design a regex to do just that ;)

Re: List::MoreUtils before, after and ... between?
by LanX (Saint) on Feb 21, 2012 at 16:15 UTC
    Not sure if I understand your problem completely, why not simply combining grep and flip-flop?

    This should be close:

    DB<134> @list= ("a".."c","DBIC","A".."C","DANCER","a".."c") => ("a", "b", "c", "DBIC", "A", "B", "C", "DANCER", "a", "b", "c") DB<135> grep { /DBIC/.. /DANCER/ } @list => ("DBIC", "A", "B", "C", "DANCER")

    Cheers Rolf

      That's not the requested output.
        ...This should be close..

        UPDATE:

        Still not sure what the requested output is, but if it's only about eliminating the edges try

        DB<101> @list= ("a".."c","DBIC","A".."C","DANCER","a".."c") => ("a", "b", "c", "DBIC", "A", "B", "C", "DANCER", "a", "b", "c") DB<102> grep { /DBIC/ .. /DANCER/ and ! /DBIC/ and ! /DANCER/ } @lis +t => ("A", "B", "C")

        not DRY but effective.

Re: List::MoreUtils before, after and ... between?
by tobyink (Canon) on Feb 21, 2012 at 15:04 UTC

    I don't know if this counts as better or worse...

    @list = qw<eval DBIC::3 DBIC::2 DBIC::1 MyApp::3 MyApp::2 MyApp::1 Da +ncer::3 Dancer::2 Dancer::1>; $begin = 1 + last_index { /^(DBIC|eval)/ } @list; $end = (-1) + first_index { /^Dancer/ } @list; @foo2 = @list[ $begin .. $end ]; say for @foo2;
Re: List::MoreUtils before, after and ... between?
by LanX (Saint) on Feb 21, 2012 at 16:43 UTC
    sometimes I wished we could simply apply regexes to lists (preferably in a layzy functional way) ...

    DB<152> @list= ("a".."c","DBIC","A".."C","DANCER","a".."c") => ("a", "b", "c", "DBIC", "A", "B", "C", "DANCER", "a", "b", "c") DB<153> $list = join "\0", @list => "a\0b\0c\0DBIC\0A\0B\0C\0DANCER\0a\0b\0c" DB<154> ($match)= $list =~ /DBIC\0(.*)\0DANCER/ => "A\0B\0C" DB<155> split /\0/,$match => ("A", "B", "C")

    this will find the longest interval between the first DBIC and the last Dancer and you are free to use more powerful regexes.

    UPDATE: hmm maybe newlines are here a better choice as delimiter.

    Cheers Rolf

      Heh, I like that solution. Note that adding a second 'DBIC' and/or a second 'Dancer' demonstrates flaws in the exact regex you chose. But both of those flaws are easily overcome:

      $list =~ /.*DBIC\0(.*?)\0DANCER/ # ^^ ^

      But perhaps people less comfortable with regexes would like it less...

      - tye        

        And did you think about the edgecase of choosing a wrong delimiter which already appears in the elements? ;)

        FWIW, I'm quite often in a situation where I would prefere to apply regexes on lists, so I started meditating yesterday about a module abstracting the delimiter problem away by locally redefining a special var like $; or $\ for the delimiter and $_ for flattened list.

        something like

        @newlist = flat { s/START$;(.*)$;END/$1/ } @list sub flat (&@) { my ($code,@list) = @_; local ( $;, $_ ) = join_reliably (@list); $code->(); return split $;, $_ }

        untested.

        Cheers Rolf

      This looks great, because the list is generated by splitting a string on newlines in the first place:

      eval { confess }; my @stack = split /\n/, $@;
      The regex becomes a bit more complicated, but that usually also means I'm a bit more certain about what I'm matching.

      Thanks!

Re: List::MoreUtils before, after and ... between?
by ikegami (Patriarch) on Feb 21, 2012 at 23:05 UTC

    Wow, so many complicated solutions! It's actually quite simple:

    1. Find start of opening bound, then
    2. Find start of data, then
    3. Find start of closing bound
    my ($seen_opening, $seen_data, $seen_closing); my @filtered = grep { ($seen_opening ||= /DBIC/) # Find opening bound && ($seen_data ||= !/DBIC/) # Find data && !($seen_closing ||= /Dancer/) # Find closing bound } @list;

    Using the range operator is possible too.

    my @filtered = grep { (/DBIC/ .. /Dancer/) && !/DBIC/ && !/Dancer/ } @list;

    Update: Added range operator solution.

      I guess some definitions of "quite simple" can be applied to that code. It looks more like "clever" to me. (:

      I rejected using 'grep' at all since it seems likely to me that there could be cases where there is nothing to trim from the front. Your solution (and any simple 'grep' solution) will produce nothing for that case (or else can't deal with extra, unexpected leading garbage). I also dislike solutions that I can tell will require fundamental changes in the face of simple changes to the requirements.

      - tye        

        > it seems likely to me that there could be cases where there is nothing to trim from the front.

        Well not clearly phrased in the OP but at least the way the example was coded!

        But maybe that's the real bug???

        Please notice the word "between" in this thread's title.

        > I rejected using 'grep' at all

        Maybe you mean mean "grep w/o reverse" ?

        DB<130> @list= ("a".."c","DBIC","A".."C","DANCER","a".."c") => ("a", + "b", "c", "DBIC", "A", "B", "C", "DANCER", "a", "b", "c") DB<131> reverse grep { not /DBIC/ .. 1 } reverse grep { not /DANCE +R/ .. 1 } @list => ("A", "B", "C") DB<132> reverse grep { not /DBIC/ .. 1 } reverse grep { not /NONSE +NSE/ .. 1 } @list => ("A", "B", "C", "DANCER", "a", "b", "c")

        ... or maybe not ?

        Personally, I reject posting solutions for every possible interpretation of a poorly phrased question.

        Human discussions are an iterative process.

        Cheers Rolf

        I guess some definitions of "quite simple" can be applied to that code.

        Linear, clear, concise.

        I did not mean to include yours in the set of "not simple". I wasn't paying much attention by the time I got that far.

        will produce nothing for that case

        I was viewing the problem as "print the stuff between the bounds", in which case it *should* print nothing for the case with no bound.

        But it also shouldn't print anything for a missing trailing bound (while it does), so point made.

        I rejected using 'grep' at all since it seems likely to me that there could be cases where there is nothing to trim from the front. Your solution (and any simple 'grep' solution) ...

        So, you reject simple now, in favour of complicated, because of what might be the requirement in some speculative future.

        Yep! That just about sums up all that is wrong with the software industry. Throwing good money after bad chasing what-if scenarios and may-be requirements instead of just taking care of the real requirements as they exist now; and dealing with future requirements -- actual future requirements -- when and if they actually arise.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        The start of some sanity?

      FYI, looking at this a second time, I noticed that both of those solutions would handle input containing qw< DBIC::4 eval DBIC::3 > in a manner that seems to me rather clearly against the stated wish (of removing leading/trailing "rubbish" from a stack trace).

      - tye        

        Well spotted. That is a realistic possibility for me.


        - Boldra
Re: List::MoreUtils before, after and ... between?
by BrowserUk (Patriarch) on Feb 21, 2012 at 18:08 UTC

    BTW: If the your main concern with your method, is the time taken to perform the reverses, don't even consider it unless your list is huge. Reverse is a particularly efficient operation. 1000 items take a just 2/10s of a millisecond:

    @a = 1 .. 1e3;; $t = time; @a = reverse @a; print time() - $t;; 0.000192165374755859 $t = time; @a = reverse @a; print time() - $t;; 0.000243186950683594 $t = time; @a = reverse @a; print time() - $t;; 0.000186920166015625

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

Re: List::MoreUtils before, after and ... between?
by tobyink (Canon) on Feb 21, 2012 at 18:11 UTC

    No good for one-liners, but this looks pretty sweet:

    use 5.010; use strict; use List::MoreUtils qw/first_index last_index/; sub narrow { my ($from, $to) = map { $_->(@_) } @{(pop)}; return @_[$from .. $to]; } sub from (&$) { [ @_[0..1] ] } sub to (&) { $_[0] } my @list = qw/eval DBIC::1 DBIC::2 My::1 My::2 My::3 Dancer::1 Dancer: +:2 Dancer::3 Dancer::4/; say foreach narrow @list, from { 1 + last_index { /^DBIC/ } @_ } to { (-1) + first_index { /^Dancer/ } @_ };

    Obviously you'd want to factor narrow, from and to out into a separate module rather than defining them inline.

    And given that from and to are quite generic names, perhaps something like start_at and finish_at might be better in practice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://955318]
Approved by Corion
Front-paged by moritz
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (1)
As of 2024-04-25 01:59 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found