processing columns of text

sitnalta has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone,
I am graphing data with gnuplot and need to convert the running totals my script pulls into the difference between each day and the day before. So for 02/03/2007, columns 3-6 need to have the same columns from 02/02/2007 subtracted from them.
Here is how the data looks that I pulled:
02/02/2007 00:00:00 719267027 719244316 719233953 719240015
02/03/2007 00:00:00 720375777 720336674 720325633 720329849
02/04/2007 00:00:00 721640280 721640267 721522690 721552815
02/05/2007 00:00:00 722297206 722297203 722297203 722297206

And here is how I'm trying to get the data to look:
02/03/2007 00:00:00 1108750 1092358 1091680 1089834
02/04/2007 00:00:00 1264503 1303593 1197057 1222966
02/05/2007 00:00:00 656926 656936 774513 744391

As you can see, I'm just subtracting one day from the next to get the difference. That's all I want, but I'm not sure how to go about this with Perl. I am able to do it in bash, but it's so ugly that I decided to attempt to learn how it's done in Perl. I would like to see how people go about doing this and learn something from their expertise. This data is in a flat file, and I am having trouble even getting off the ground on how to approach such a thing without using temp files.
Thanks in advance!


Replies are listed 'Best First'.
Re: processing columns of text by GrandFather (Saint) on Feb 26, 2007 at 20:47 UTC
Think about what you need to do:

- loop over the lines
- for each line extract the fields
- calculate the differences between the required fields
- print the result
- save the current line as the last line

(you did skip for the first record didn't you?)

```perl
use warnings;
use strict;

my @last;
my @current;

while (<DATA>) {                                        # For each line
    chomp;                                              # Strip any trailing line end sequence
    @current = split;                                   # extract the fields
    next if ! @last;                                    # Skip for the first line
    my @diffs = map {$current[$_] - $last[$_]} 2 .. 5;  # Calculate
    print "@current[0, 1, 2] @diffs\n";                 # Print
} continue {
    @last = @current;                                   # Save current line as last
}

__DATA__
02/02/2007 00:00:00 719267027 719244316 719233953 719240015
02/03/2007 00:00:00 720375777 720336674 720325633 720329849
02/04/2007 00:00:00 721640280 721640267 721522690 721552815
02/05/2007 00:00:00 722297206 722297203 722297203 722297206
```

Prints:

```
02/03/2007 00:00:00 720375777 1108750 1092358 1091680 1089834
02/04/2007 00:00:00 721640280 1264503 1303593 1197057 1222966
02/05/2007 00:00:00 722297206 656926 656936 774513 744391
```

You may care to take a look at the docs for split, map and 'Loop Control' in perlsyn.

DWIM is Perl's answer to Gödel
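The loop in the reply above depends on one detail of Perl's loop control: a `continue` block still runs when `next` is taken, which is why the first record gets saved even though it is skipped. A minimal sketch of just that behaviour, using made-up items `a`, `b`, `c` and hypothetical variable names rather than the thread's data:

```perl
use strict;
use warnings;

my @queue = qw(a b c);   # made-up items, standing in for input lines
my @pairs;               # collects "previous-current" strings
my ($prev, $item);

while (defined($item = shift @queue)) {
    next unless defined $prev;     # nothing to compare on the first pass
    push @pairs, "$prev-$item";
}
continue {
    $prev = $item;                 # runs on every pass, including after `next`
}

print "@pairs\n";
```

If the assignment to `$prev` sat at the bottom of the loop body instead, the `next` on the first pass would skip it and `$prev` would never be set.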
Re: processing columns of text by TGI (Parson) on Feb 26, 2007 at 21:57 UTC
Take a look at the data structures cookbook. Your basic approach should be to build an array of arrays (actually array references). You can then loop over your array to generate a new array.

Here's an example of working with an AoA. In this example, key things to understand are how data structures are built up in Perl, and how to use the map function to make a new array out of an existing one. If you can't grok map right away, you can use a foreach or while loop instead.

```perl
use strict;
use warnings;
use diagnostics;
use Data::Dumper;

# Generate primary data array
my @data = map {[ split ]} <DATA>;
print Dumper \@data;

my $first = $data[0];
my @first_vs_x = map {
    [
        "$first->[0] vs $data[$_][0]",  # Generate new name entry
        $first->[1] - $data[$_][1],     # Fred - X
        $first->[2] + $data[$_][2],     # Fred + X
    ]
} 1..$#data;  # use indexes from 1 to last in @data

print Dumper \@first_vs_x;

__DATA__
Fred 15 20
Wilma 23 19
Barney 1 22
Betty 99 63
```

It's worth noting that this example keeps everything in memory, and for large data sets, you'll need to use a different approach.

TGI says moo
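As a rough illustration of how the array-of-arrays idea might carry over to the data in the question, here is a sketch that builds the AoA and then maps over indexes 1 to last, subtracting the previous row from the current one. The helper name `day_diffs` is hypothetical, and the rows are hard-coded here only to keep the example self-contained.

```perl
use strict;
use warnings;

# Sample rows copied from the question; in practice they would come from a file.
my @lines = (
    '02/02/2007 00:00:00 719267027 719244316 719233953 719240015',
    '02/03/2007 00:00:00 720375777 720336674 720325633 720329849',
    '02/04/2007 00:00:00 721640280 721640267 721522690 721552815',
    '02/05/2007 00:00:00 722297206 722297203 722297203 722297206',
);

# Build the array of arrays: one array reference per row.
my @data = map { [ split ] } @lines;

# day_diffs is a hypothetical helper name, not something from the thread.
# For each row from the second onward, keep the date and time and subtract
# the previous row's counters (columns 2 .. last) from the current row's.
sub day_diffs {
    my @rows = @_;
    return map {
        my ($cur, $prev) = ($rows[$_], $rows[$_ - 1]);
        [ @{$cur}[0, 1], map { $cur->[$_] - $prev->[$_] } 2 .. $#{$cur} ];
    } 1 .. $#rows;
}

my @diffs = day_diffs(@data);
print "@$_\n" for @diffs;
```

Like the example above it, this holds every row in memory, so it suits small files better than large ones.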
Re: processing columns of text by ptum (Priest) on Feb 26, 2007 at 20:19 UTC
Hi. Welcome to the Monastery. I'm sure many here will want to help you, but you may wish to gain some credibility for the future by showing a little more effort before you ask for help. What have you tried so far? :)

It sounds like you want to step through a data file, reading in one record at a time, splitting each record into meaningful chunks and then performing an arithmetic operation on those chunks using the previous (N-1th) record. You'll want to store the results of those arithmetic operations somewhere useful. As a hint, I would probably use an array (apart from the one that holds the results from the split operation) to store the dates and resulting difference values.

Do you know how to open a file handle, and how to read the file contents line-by-line?
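For readers unsure about the file-handle part, the following is one possible sketch of the steps described above: open a handle, read line by line, split each record, and push the differences onto a separate results array. The name `read_diffs` is hypothetical. The demo passes a reference to a string so that it runs without any file on disk; passing a file name to the same three-argument `open` would read a real file instead.

```perl
use strict;
use warnings;

# read_diffs is a hypothetical helper name. $source may be a file name or,
# as in the demo below, a reference to a string holding the file contents.
sub read_diffs {
    my ($source) = @_;
    open my $fh, '<', $source or die "Cannot open input: $!";

    my @results;    # one output line per day, starting with the second day
    my @prev;       # counters from the previous record
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;    # ignore blank lines
        my ($date, $time, @counts) = split ' ', $line;
        if (@prev) {
            my @diff = map { $counts[$_] - $prev[$_] } 0 .. $#counts;
            push @results, join ' ', $date, $time, @diff;
        }
        @prev = @counts;
    }
    close $fh;
    return @results;
}

# Demo on an in-memory copy of the first two rows from the question.
my $sample = "02/02/2007 00:00:00 719267027 719244316 719233953 719240015\n"
           . "02/03/2007 00:00:00 720375777 720336674 720325633 720329849\n";
my @results = read_diffs(\$sample);
print "$_\n" for @results;
```

Only the previous record's counters are kept between iterations, so memory use for the input stays constant; the results array could be dropped in favour of printing each line as it is computed.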
Re: processing columns of text by bart (Canon) on Feb 27, 2007 at 12:19 UTC
I'd try something like this, in the assumption that all lines have the same number of numerical columns:

```perl
my @prev;
local($\, $,) = ("\n", " ");
while(<DATA>) {
    my($date, $time, @current) = split " ";
    if(@prev) {
        print $date, $time, map { $current[$_] - $prev[$_] } 0 .. $#current;
    }
    @prev = @current;
}
__DATA__
02/02/2007 00:00:00 719267027 719244316 719233953 719240015
02/03/2007 00:00:00 720375777 720336674 720325633 720329849
02/04/2007 00:00:00 721640280 721640267 721522690 721552815
02/05/2007 00:00:00 722297206 722297203 722297203 722297206
```

(written so it uses the enclosed test data)

Result:

```
02/03/2007 00:00:00 1108750 1092358 1091680 1089834
02/04/2007 00:00:00 1264503 1303593 1197057 1222966
02/05/2007 00:00:00 656926 656936 774513 744391
```

To turn this into something that reads from a file instead of from the list of data near the end of the script, replace the `<DATA>` with `<>` and pass the file name on the command line.

