PerlMonks
reduce like iterators

by LanX (Cardinal)
on Jan 03, 2011 at 17:57 UTC ( #880228=perlquestion )

LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

First I thought about writing a longer meditation about iterators and the semipredicate problem, but maybe a simple request for Perl wisdom is a better start.

In gnu.emacs.help someone asked:

How would you code this simple list compression problem in Ruby:

1.08 (**) Eliminate consecutive duplicates of list elements. If a list contains repeated elements they should be replaced with a single copy of the element. The order of the elements should not be changed.

Example:

?- compress([a,a,a,a,b,c,c,a,a,d,e,e,e,e],X).
X = [a,b,c,a,d,e]

My first idea to solve it in Perl was grep:

DB<1> my $p; print grep { $p = $_ if $_ ne $p } (a,a,a,a,b,c,c,a,a,d,e,e,e,e);
abcade

But this approach fails for false list elements, e.g. "0"!

A nice example where a grep-like map is better than plain grep:

DB<2> my $p; print map { $p ne $_ ? $p = $_ : () } (a,a,a,a,b,c,c,a,a,d,e,e,e,e)
abcade

But besides the need to lexicalize $p in the outer scope ... wouldn't using state be more elegant:

DB<3> use feature "state"; print map { state $p; $p ne $_ ? $p = $_ : () } (a,a,a,a,b,c,c,a,a,d,e,e,e,e)
abcade

... it's still buggy if the compressed list starts with undef values. That's a variation of the old semipredicate problem:

"One can't find an initial value which is not also potentially a part of the iterated list!"

So I had a look into List::Util, and the only appropriate approach was reduce:

The first call will be with $a and $b set to the first two elements of the list, subsequent calls will be done by setting $a to the result of the previous call and $b to the next element in the list.

Looks perfect, but reduce {} LIST only returns a scalar, and List::MoreUtils doesn't seem to provide anything better¹.

(Sure, I could use $a as an array-ref for accumulation, but that's not very elegant...)
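For illustration, a sketch (my own code, not from the thread) of that array-ref accumulation idea, where $a carries the growing result list and $b is the next element:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# $a accumulates the result as an array ref; $b is the next element;
# the block must return $a each time so it is passed to the next call
my @compressed = @{ reduce {
    push @$a, $b if !@$a || $b ne $a->[-1];
    $a;
} [], qw(a a a a b c c a a d e e e e) };

print "@compressed";   # a b c a d e
```

It works, but the block has to remember to return $a every time, which is exactly the clumsiness complained about above.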

Temporary conclusion:

There seems to be a lack of iterators that allow comparing successive elements.

And now I'm wondering about the best design ...

Could a special variable like $^PRE generally help augment existing iterators, without the need to define and appropriately name a whole new family of reduce-like iterators?

(IMHO too many specialized functions, as in LISP, are hard to remember.)

Something like:

print grep { $_ ne $^PRE } (a,a,a,a,b,c,c,a,a,d,e,e,e,e);
abcade
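$^PRE does not exist in Perl; just to make the idea concrete, here is a sketch using an ordinary package variable $PREV (the name and the sub grep_prev are my invention):

```perl
use strict;
use warnings;

our $PREV;   # hypothetical stand-in for the proposed $^PRE

# grep-like iterator exposing the previous element as $PREV;
# the first element is always kept, sidestepping the semipredicate problem
sub grep_prev (&@) {
    my $code = shift;
    my @out;
    my $first = 1;
    local $PREV;
    for (@_) {
        push @out, $_ if $first || $code->();
        ($first, $PREV) = (0, $_);
    }
    @out;
}

my @compressed = grep_prev { $_ ne $PREV } qw(a a a a b c c a a d e e e e);
print @compressed;   # abcade
```

Because the first element bypasses the block, no out-of-band initial value for $PREV is needed, so leading 0 or undef elements survive.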

Thoughts?

Cheers Rolf

¹) I couldn't find much more on CPAN...

Replies are listed 'Best First'.
Re: reduce like iterators
by ELISHEVA (Prior) on Jan 03, 2011 at 18:27 UTC

    This is fairly easy to implement and wrap in a sub, even with the undefs. Once the sub is written you are back to having a one-liner. What would you like that this wouldn't do?

    use strict;
    use warnings;

    sub compress {
        my $x;
        map {
            if (!defined($x)) {
                defined($_) ? ($x = $_) : ()
            }
            else {
                defined($_) && ($x eq $_) ? () : ($x = $_)
            }
        } @_;
    }

    my @aData = (qw(a a a a b c c a a d e e e e), undef, undef, qw(f g g));
    my @aCompressed = compress @aData;
    print "compressed: @aCompressed\n";
    # outputs a b c a d e undef f g

    Or if you want to play golf (though others I'm sure can do better)

    sub compress {
        my$x;map{defined($x)?(defined($_)&&($x eq$_)?():($x=$_)):defined($_)?($x=$_):()}@_;
    }

    Update: added golf

    Update: fixed mistake - undefs were strings.

      Sure, I also wrote a function like this in the emacs help group.

      But I was asking about designing a generic iterator solution, opening up a variety of applications and improving readability.

      Already a reduce_list {BLOCK} LIST variant of reduce could be sufficient in this case:

       reduce_list { $a ne $b } qw(a a a a b c c a a d e e e e undef undef f g g)

      Another application would be Haskell's group iterator, which partitions a list into runs of successive equal elements:

      group [1, 1, 1, 1, 2, 3, 3, 1, 1, 4, 5, 5, 5, 5]
        ==> [[1,1,1,1], [2], [3,3], [1,1], [4], [5,5,5,5]]
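      A sketch (my code, not from the thread) of Haskell's group in plain Perl, comparing each element to the head of the current run:

```perl
use strict;
use warnings;

# partition a list into runs of consecutive equal elements
sub group {
    my @groups;
    for my $elem (@_) {
        if (@groups && $groups[-1][0] eq $elem) {
            push @{ $groups[-1] }, $elem;   # extend the current run
        }
        else {
            push @groups, [$elem];          # start a new run
        }
    }
    @groups;
}

my @runs = group(1, 1, 1, 1, 2, 3, 3, 1, 1, 4, 5, 5, 5, 5);
print join ' ', map { "[@$_]" } @runs;
# [1 1 1 1] [2] [3 3] [1 1] [4] [5 5 5 5]
```

Note the comparison against the first element of the current run rather than a separate "previous" variable, so no sentinel value is needed (though undef elements would still warn under warnings).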

      Simply allowing easy access to the previous iteration would help implement this with part from List::MoreUtils:

       part { $a eq $b } LIST

      or

       part { $^PRE eq $_ } LIST

      Without the need to invent and name a whole new group of iterators.

      Cheers Rolf

        Your reduce_list is named similarly to reduce, but bears no resemblance to it, since it doesn't reduce at all.

        • reduce allows an arbitrary state to be passed from one pass to another. Your reduce_list doesn't.

        • reduce can return any value, not just the input. reduce_list can return at most one scalar, and it can only be the input. (That's not very "listy"!)

        For those reasons, reduce is a general-purpose function. (It can implement any other function in List::Util.) Your reduce_list is just grep with access to the last element.

        This accounts for the differences from what I suggested it should look like.

        Perl also lets you use prototypes to roll your own block syntax. Here is a slightly different definition that would allow you to do arbitrary operations on the first element of each run in a list:
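        (The code that originally followed here was not preserved in this copy; purely as a hypothetical sketch, a prototyped block iterator of that shape — on_runs and its internals are my invention — might look like this:)

```perl
use strict;
use warnings;

# call BLOCK once per run of equal elements, with $_ set to the
# first element of that run; collect the block's results
# (runs of undef are not handled in this sketch)
sub on_runs (&@) {
    my $code = shift;
    my (@out, $prev, $started);
    for (@_) {
        # skip elements that merely continue the current run
        next if $started && defined($_) && defined($prev) && $_ eq $prev;
        ($started, $prev) = (1, $_);
        push @out, $code->();
    }
    @out;
}

my @result = on_runs { uc } qw(a a a a b c c a a d e e e e);
print @result;   # ABCADE
```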

        Or if you want to be able to work with the current run value ($b) and the previous run value $a, you could do something like this:

        There is almost no limit to what you could create. For example, if you wanted to be able to group items, you could additionally track the number of items in each run and set $_ to number of items in the current run:

        Would that do what you want?

        Update: added yet another block iterator, this time one that could be used for generating groups as in the Haskell function above.

        Update: put in readmore tags

        Update: fixed 3rd example so it prints count of current (not previous) run.

Re: reduce like iterators
by BrowserUk (Pope) on Jan 03, 2011 at 18:49 UTC

    Compress() is easily defined in terms of a generic adjacent pairing routine:

    #! perl -slw use strict; sub adjacentPairs (&@) { my $code = shift; map { local( $a, $b ) = @_[ $_-1, $_ ]; $code->(); } 1 .. @_; } print adjacentPairs{ defined $b && $a eq $b ? () : $a } qw[a a a b c c a a d e e e e]; __END__ C:\test>compress abcade

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: reduce like iterators
by JavaFan (Canon) on Jan 03, 2011 at 19:05 UTC
    You have to be careful with any state solution. It will work fine until you go through the same code path again (due to a loop, the sub it's in being called again, etc.). Being a state variable, it will retain whatever was in it last, so you have to be careful you aren't comparing the last element of the first list with the first element of the second list.
    An index-based alternative:

    @cleaned = @uncleaned[ 0, grep { $uncleaned[$_] ne $uncleaned[$_-1] } 1 .. $#uncleaned ];
    It needs some additional hackery to deal with undefs correctly.
      argh!!!

      I missed that:

      DB<3> use feature "state"; map { state $p; print $p; $p=$_ } 1..3
      12
      DB<5> use feature "state"; for (1..3) { map { state $p; print $p; $p=$_ } 1..3 }
      12312312

      thanks a lot! :)

      Cheers Rolf

        This particular problem is solved in Perl 6 by always cloning all blocks as closures the same way, so state is reset in loops as well as in map. (In fact, the for loop is defined in terms of map in Perl 6. So for loops return a list of values just like map does.) Perl 5 could probably move in the direction of fewer special-cased blocks that aren't true closures; this would have many subtle benefits. Not everything in Perl 6 can be borrowed back, but I suspect this is one of them.
Re: reduce like iterators
by moritz (Cardinal) on Jan 03, 2011 at 19:20 UTC
    I couldn't resist writing a "golfed" Perl 6 version, heavily inspired by functional programming techniques:
    sub compress(*@a) { flat @a Zxx 1, (@a Zne @a[1..*-1]) }
    say compress(<a a a a b c c a a d e e e e>).perl;

    The *-1 is a workaround for a Rakudo bug; in a perfect Perl 6 implementation, 1..* would work too.

Re: reduce like iterators
by furry_marmot (Pilgrim) on Jan 04, 2011 at 01:16 UTC

    Hi LanX,

    map{} will pass on undefs, but grep{} needs undef to tell whether something passed the filter. You can, however, stack any number of maps and greps together, which is how I came up with a workaround, using two maps and a grep.

    The first map{}, reading from the bottom up, simply marks duplicates in-a-row (as opposed to any dupes at all) by changing them to 'aardvark'. Undefs and zeroes go through without problem. The second map{} changes actual undefined elements to the text 'undef'. Change these as suits your algorithm. The grep{} is where the duplicates/aardvarks are finally removed.

    It seems to work, accommodating undefined values while still being simple enough for a one-liner, and it uses array refs so it doesn't pass whole lists back and forth.

    use strict;

    my @orig = (
        qw(a a b b c 0 c d d u u),
        (undef, undef, 'blink', 'blink'),
        qw(0 0 v v w w a a a b b b c c c)
    );

    my @list;
    nodupes( \@orig, \@list );
    print join ' ', @list, "\n";

    sub nodupes {
        my ($ar1, $ar2) = @_;
        my $p;
        push @$ar2,
            grep { $_ ne 'aardvark' }
            map  { defined $_ ? $_ : 'undef' }
            map  { $p ne $_ ? $p = $_ : 'aardvark' } @$ar1;
    }

    __END__
    Prints --> a b c 0 c d u undef blink 0 v w a b c
    And the one-liner -- actually broken up for easier viewing:
    perl -e "@list = grep{$_ ne 'aardvark'}
             map{defined $_ ? $_ : 'undef'}
             map{$p ne $_ ? $p = $_ : 'aardvark'}
             (qw(a a b b c 0 c u u), (undef, undef), qw(0 0 v v w w));
             print join ' ', @list;"

    Prints --> a b c 0 c u undef 0 v w

    I hope you find this interesting/useful.

    --marmot

    UPDATE: I was thinking about this some more, and realized I had made it waaaay too complicated. The nodupes() below pretty much is a one-liner, and accommodates 0, "0", and undef just fine. And nodupes() can be used inline with other maps and greps. In the end, the grep was the only thing needed.

    use strict;

    sub nodupes {
        my $p;
        return grep { $_ ne '~~' }
               map  { $p ne $_ ? $p = $_ : '~~' } @_;
    }

    my @orig = (
        qw(a a b b c 0 c d d u u),
        (undef, undef, 'blink', 'blink'),
        qw(0 0 v v "0" "0" "0" w w a a a b b b c c c)
    );
    my @new = nodupes @orig;
    print join ' ', @new, "\n";

    __END__
    Prints --> a b c 0 c d u  blink 0 v "0" w a b c
                            ^^ There's an undef between these two spaces.
      Any reason not to just do:
      my $p;
      my @new = grep { $p = $_ or 1 if $p ne $_ } @orig;
      ?
        Ummm....because I didn't think of that? :-) Thanks! You just expanded my understanding of what you can do with grep.

        --marmot

        Try @orig starting with undef.

        Also see the OP for the "semi-predicate problem" discussion.

        If this seems too theoretical, consider the practical task of run-length encoding sparsely populated arrays (where undef is a real value).
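        To make the failure concrete, a small demo (my own, with an invented @orig): $p starts out undef, so "$p ne $_" is false for a leading undef element and the whole leading undef run is dropped.

```perl
use strict;
no warnings 'uninitialized';   # "$p ne $_" warns when either side is undef

my @orig = (undef, undef, 'a', 'a', 'b');
my $p;
my @new = grep { $p = $_ or 1 if $p ne $_ } @orig;

# compress should yield (undef, 'a', 'b') -- 3 elements --
# but the leading undef run has vanished:
print scalar @new;   # 2
```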

        Cheers Rolf

Re: reduce like iterators
by ikegami (Pope) on Jan 03, 2011 at 18:51 UTC
    my @b = uniq @a;

    my %seen;
    my @b = grep !$seen{$_}++, @a;

    But if you really want to use reduce with just one expression, it's definitely possible.

    my @b = @{(
        reduce { push @{$a->[1]}, $b if !$a->[0]{$b}++; $a } [ {}, [] ], @a
    )->[1]};
      Read the example again; it's not about uniq but about identifying sequences!

      update:

      DB<1> use List::MoreUtils "uniq"; print uniq qw(a a a a b c c a a d e e e e)
      abcde

      Cheers Rolf

        Oops, adjusted:

        my @b = @{ reduce { push @$a, $b if !@$a || $b ne $a->[-1]; $a } [], @a };

        If one were to make a list version of reduce, the callback would need access to three variables: the list (say $_), the state (say $a), and the current value (say $b). The problem could be solved as follows:

        my @b = list_reduce { push @$_, $b if !@$_ || $b ne $_->[-1]; undef } undef, @a;
        my @b = list_reduce { push @$_, $b if $a || $b ne $_->[-1]; 0 } 1, @a;
        my @b = list_reduce { push @$_, $b if defined($b) && $b ne $a; $b } undef, @a;
        my @b = list_reduce { push @$_, grep defined && $_ ne $a, $b; $b } undef, @a;

        (@a may not start with undef for the last two to work properly.)
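        The thread never defines list_reduce itself; a hypothetical implementation of that interface (my sketch) could localize $_ to the output array ref, $a to the state, and $b to the current element, with the block's return value becoming the new state:

```perl
use strict;
use warnings;

# hypothetical list_reduce: $_ is the output list (array ref),
# $a the state, $b the current element
sub list_reduce (&@) {
    my $code = shift;
    local $a = shift;          # initial state
    local $_ = [];             # the output list, exposed as an array ref
    for my $elem (@_) {
        local $b = $elem;
        $a = $code->();        # block's return value becomes the new state
    }
    @$_;
}

my @vals = qw(a a a a b c c a a d e e e e);
my @b = list_reduce { push @$_, $b if !@$_ || $b ne $_->[-1]; undef } undef, @vals;
print @b;   # abcade
```

This assumes caller and sub share a package, since $a and $b are package variables.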

        Update: Added everything after the first block of code.

Re: reduce like iterators
by Anonymous Monk on Jan 03, 2011 at 19:43 UTC

Node Type: perlquestion [id://880228]
Approved by ELISHEVA
Front-paged by Arunbear