in reply to Re^2: Date Array Convolution
in thread Date Array Convolution

OK, here is the new code I wrote on the underground on my way to work :-) I used OO this time. The second case is your second example input. If your expected output is different, can you show it?
Update: I read the discussion you had with BrowserUk and tried to accommodate the code appropriately.
Update2: zero-intervals removed from output.
    #!/usr/bin/perl
    use Data::Dump qw/dump/;
    use warnings;
    use strict;

    my $d = My::ConvoluteDate->new;
    while (<>) {
        my ($from, $to, $value) = split;
        $d->populate([$from, $to, $value]);
    }
    print dump $d->combine;

    package My::ConvoluteDate;
    use Data::Dump qw/dump/;
    use List::Util qw/min/;

    sub new {
        my $class = shift;
        my $self = {};
        bless $self, $class;
    } # new

    sub populate {
        my $self = shift;
        while (my $triple = shift) {
            my ($start, $end, $value) = @$triple;
            push @{ $self->{times}{$start}{push} }, $value;
            push @{ $self->{times}{$end}{pop} }, $value;
        }
    } # populate

    sub combine {
        my $self = shift;
        my @result;
        my @stack;
        for my $time_string (sort keys %{ $self->{times} }) {
            my $time = $self->{times}{$time_string};
            my $old_value = min @stack;
            for my $push (@{ $time->{push} }) {
                push @stack, $push;
            }
            for my $pop (@{ $time->{pop} }) {
                my $value = $pop;
                @stack = grep { not defined $value or $_ != $value or undef $value } @stack;
            }
            my $new_value = min @stack;
            if (not defined $old_value) {
                push @result, [$time_string];
            } elsif (not defined $new_value) {
                push @{ $result[-1] }, $time_string, $old_value;
            } elsif ($new_value != $old_value) {
                my $end_string = $time_string;
                my $start_string = $time_string;
                if ($new_value < $old_value) {
                    $end_string = _dec($end_string);
                } else {
                    $start_string = _inc($start_string);
                }
                # avoid zero length intervals
                if ($result[-1][0] le $end_string) {
                    push @{ $result[-1] }, $end_string, $old_value;
                } else {
                    pop @result;
                }
                push @result, [$start_string];
            }
        }
        return _day_split(@result);
    } # combine

    sub _day_split {
        return map {
            my ($start, $end, $value) = @$_;
            my $from = 0 + substr $start, 0, 2;
            my $to = 0 + substr $end, 0, 2;
            if ($from < $to) {
                my $split;
                my $newfrom = sprintf('%02d', $from) . '2359';
                $split = [[$start, $newfrom, $value]];
                push @$split,
                    map { [sprintf('%02d', $_) . '0000', sprintf('%02d', $_) . '2359', $value] }
                    $from + 1 .. $to - 1;
                my $newto = sprintf('%02d', $to) . '0000';
                push @$split, [$newto, $end, $value];
                @$split;
            } elsif ($from == $to) {
                $_;
            } else {
                die "Start later than end\n";
            }
        } @_;
    } # _day_split

    sub _dec {
        my $time = shift;
        my ($day, $hour, $min) = $time =~ /(..)(..)(..)/;
        $min--;
        if ($min < 0) {
            $min = 59;
            $hour--;
            if ($hour < 0) {
                $hour = 23;
                $day--;
                die "Cannot go before 010000\n" if $day < 1;
            }
        }
        return sprintf '%02d' x 3, $day, $hour, $min;
    } # _dec

    sub _inc {
        my $time = shift;
        my ($day, $hour, $min) = $time =~ /(..)(..)(..)/;
        $min++;
        if ($min > 59) {
            $min = 00;
            $hour++;
            if ($hour > 23) {
                $hour = 00;
                $day++;
            }
        }
        return sprintf '%02d' x 3, $day, $hour, $min;
    } # _inc

Re^4: Date Array Convolution
by alanonymous (Sexton) on Nov 05, 2011 at 03:30 UTC
    You *Sir* are also a gentleman AND a scholar! I think the output is right, but I'm having trouble understanding exactly what's going on in your code. Like I mentioned to Mr. UK, you guys are awesome at coming up with solutions, but it's hard for newbies to understand it all! I *hate* taking code without understanding it :/
      I'll try to explain my code:
      The new method just creates an empty object. The object is then populated by the populate method: it remembers the start and end points of all the intervals in the times subhash. For each time point, it also remembers which values begin to be valid there (push) and which cease to be valid (pop).
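To make the shape of that subhash concrete, here is a minimal stand-alone sketch. The name build_times and the two sample intervals are made up for illustration, and the DDHHMM time format is inferred from how the code above slices its time strings.

```perl
use strict;
use warnings;

# build_times is a hypothetical stand-alone version of populate:
# it returns the hash that populate stores under $self->{times}.
sub build_times {
    my %times;
    for my $triple (@_) {
        my ($start, $end, $value) = @$triple;
        push @{ $times{$start}{push} }, $value;   # value becomes valid here
        push @{ $times{$end}{pop} },    $value;   # value stops being valid here
    }
    return \%times;
}

# Two overlapping sample intervals (assumed DDHHMM strings).
my $times = build_times(['010100', '010700', 4], ['010400', '011200', 3]);

for my $t (sort keys %$times) {
    printf "%s push=[%s] pop=[%s]\n", $t,
        join(',', @{ $times->{$t}{push} || [] }),
        join(',', @{ $times->{$t}{pop}  || [] });
}
# 010100 push=[4] pop=[]
# 010400 push=[3] pop=[]
# 010700 push=[] pop=[4]
# 011200 push=[] pop=[3]
```

Each time point is a key, so two intervals that share a boundary simply end up with several values in the same push or pop list.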

      The most important method is combine. It sorts all the time points and walks through them, keeping track of the values that are "active" at each point (they are stored in @stack). The values whose intervals start at the given point are pushed onto the stack, and the values whose intervals end there are removed from it. The grep { not defined $value or $_ != $value or undef $value } construct makes sure that repeated values are removed from the stack one at a time. The effective value is the minimum of the stack, taken before and after the update. If there was no value before but there is one now, a new interval is started. If there is no value left, the current output interval is closed and stored in @result. If the value has changed, the current interval is closed and a new one is opened.
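The grep construct is the least obvious part, so here it is in isolation. This is only a sketch; remove_one is a hypothetical wrapper name that does not appear in the code above.

```perl
use strict;
use warnings;

# Remove a single occurrence of $value from a list, using the same
# grep idiom as combine. remove_one is a hypothetical helper name.
sub remove_one {
    my ($value, @stack) = @_;
    # While $value is still defined, elements that differ from it are kept.
    # The first element equal to it falls through to `undef $value`, which
    # returns false (so the element is dropped) and clears $value.
    # After that, `not defined $value` is true and everything else is kept.
    return grep { not defined $value or $_ != $value or undef $value } @stack;
}

my @stack = remove_one(3, 4, 3, 3, 5);
print "@stack\n";    # prints "4 3 5"
```

Only the first 3 is removed. That matters when two overlapping intervals carry the same value and only one of them ends.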

      Finally, the _day_split function is called on the result. It examines each interval, and if the interval spans several days, it splits it into one interval per day. The first and last days are special: the first keeps the interval's own start time and the last keeps its own end time, while every day in between runs from 0000 to 2359.
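A worked example of the splitting may help. The sketch below handles a single interval; split_by_day is a hypothetical name, the sample interval is invented, and the DDHHMM format is assumed as in the code above.

```perl
use strict;
use warnings;

# split_by_day is a hypothetical single-interval version of _day_split.
# It returns one [start, end, value] triple per day covered.
sub split_by_day {
    my ($start, $end, $value) = @_;
    my $from = 0 + substr($start, 0, 2);    # day of the start point
    my $to   = 0 + substr($end,   0, 2);    # day of the end point
    die "Start later than end\n" if $from > $to;
    return [$start, $end, $value] if $from == $to;

    # First day: keep the original start, end at 23:59.
    my @pieces = ([$start, sprintf('%02d2359', $from), $value]);
    # Full days in between.
    push @pieces, [sprintf('%02d0000', $_), sprintf('%02d2359', $_), $value]
        for $from + 1 .. $to - 1;
    # Last day: start at 00:00, keep the original end.
    push @pieces, [sprintf('%02d0000', $to), $end, $value];
    return @pieces;
}

print join(' ', @$_), "\n" for split_by_day('011800', '030600', 5);
# 011800 012359 5
# 020000 022359 5
# 030000 030600 5
```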

      The _dec and _inc functions are just helpers that subtract one minute from a time stamp or add one minute to it. They are needed to keep the output intervals from overlapping, e.g. in

      1     7
      |-----|           (4)
         |-------|      (3)
         4       12

      the first interval will end at 3, which is _dec(4).