text sorting question, kinda

ybiC has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: text sorting question by plaid (Chaplain) on Aug 03, 2000 at 00:37 UTC
The first way that comes to mind offhand would be to do something like

```perl
my @top_lines;
my @other_lines;
while (<FILE>) {
    # whatever regex to strip out lines
    (/blahblah$/) ? push @top_lines, $_ : push @other_lines, $_;
}
print OUTFILE @top_lines;
print OUTFILE @other_lines;
```

This would work well provided that the file doesn't get too big, as the entire thing is going to have to be kept in memory. But, if it's about 200k as you say, that should work fine.
(Ovid) Re: text sorting question by Ovid (Cardinal) on Aug 03, 2000 at 00:53 UTC
For what it's worth, here's my take on the situation:

```perl
#!/usr/bin/perl -w
use strict;

my (@data, $i);
my @logfile = <DATA>;
for (@logfile) {
    /blahblah/ ? splice @data, $i++, 0, $_ : push @data, $_;
}
print @data;

__DATA__
foo 1 zot
foo 2 blahblah
bar 1 zot
bar 2 zot
bat 1 blahblah
bat 2
baz 1
baz 2 zot
```

It works fine and only needs one array.

Cheers, Ovid

Update: Clearly, I am smoking crack. larsen pointed out that I kept an extra array. Here's what I meant to post:

```perl
#!/usr/bin/perl -w
use strict;

my (@data, $i);
for (<DATA>) {
    /blahblah/ ? splice @data, $i++, 0, $_ : push @data, $_;
}
print @data;

__DATA__
foo 1 zot
foo 2 blahblah
bar 1 zot
bar 2 zot
bat 1 blahblah
bat 2
baz 1
baz 2 zot
```

There. Only one array. (sigh)
RE: (Ovid) Re: text sorting question by larsen (Parson) on Aug 03, 2000 at 01:05 UTC
Yes, but there's @logfile that contains the entire file. And you have: `$#logfile + $#data > $#toplines + $#otherlines`

see you
Larsen
Re: text sorting question by ferrency (Deacon) on Aug 03, 2000 at 01:03 UTC
You can also get away with one array by storing only the non-matching lines and printing out (or processing) the matching ones immediately. (slight rework of Plaid's code...)

```perl
my @other_lines;
while (<FILE>) {
    # whatever regex to strip out lines
    (/blahblah$/) ? print OUTFILE : push @other_lines, $_;
}
print OUTFILE @other_lines;
```

This way, not only do you use just one array, but you don't even store all of the lines in it; it only stores the nonmatching lines.

Alan
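The streaming idea above can be tried out without touching real files. This is only a sketch: the helper name `matching_first`, the lexical filehandles, and the in-memory sample data are assumptions for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out, with lines matching $pattern first.
sub matching_first {
    my ($in, $out, $pattern) = @_;
    my @other_lines;
    while (my $line = <$in>) {
        if ($line =~ $pattern) {
            print {$out} $line;          # matching lines go out immediately
        }
        else {
            push @other_lines, $line;    # only non-matching lines are held
        }
    }
    print {$out} @other_lines;
}

# Demo on in-memory filehandles, with sample lines borrowed from this thread.
my $input = "foo 1 zot\nfoo 2 blahblah\nbar 1 zot\nbat 1 blahblah\nbat 2\n";
my $output = '';
open my $in,  '<', \$input  or die "Can't open input: $!";
open my $out, '>', \$output or die "Can't open output: $!";
matching_first($in, $out, qr/blahblah$/);
close $out;
print $output;
```

Memory use is bounded by the non-matching lines only, which is the point of the rework; the order within each group is the order in the input.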
Re: text sorting question by DrManhattan (Chaplain) on Aug 03, 2000 at 08:12 UTC
Here's the second shortest one I could come up with:

```perl
#!/usr/bin/perl -l
print for
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [$_, ( /blahblah/ ? 0 : 1 ) . $_] } <>;
```

It uses a Schwartzian Transform that compares the second elements of an array looking like this:

```perl
(
    ["bat 1 blahblah", "0bat 1 blahblah"],
    ["foo 2 blahblah", "0foo 2 blahblah"],
    ["bar 1 zot", "1bar 1 zot"],
    ["bar 2 zot", "1bar 2 zot"]
)
```

The lines that match the regex get prepended with a 0 and the ones that don't match get a 1. That way the matching lines always sort ahead of the non-matching ones in a cmp.

Update: Fixed a typo in the array prefixes.

Here's a shorter, faster one using substr instead of arrays:

```perl
#!/usr/bin/perl -l
print for
    map { substr($_, 1) }
    sort
    map { ( /blahblah/ ? 0 : 1 ) . $_ } <>;
```

-Matt
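One thing to keep in mind with the sort-based versions: because the whole line is part of the sort key, each group comes out in alphabetical order rather than file order. Also, `-l` without `-n` or `-p` sets the output record separator but does not chomp the input, so each printed line ends up with two newlines. A possible variant that keeps the original sequence uses the line's position as a tiebreaker. The helper name `matching_first_sorted` and the sample lines are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: matching lines first, file order kept inside each group.
sub matching_first_sorted {
    my ($pattern, @lines) = @_;
    my $i = 0;
    return map  { $_->[2] }                                   # unwrap the line
           sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] } # flag, then index
           map  { [ ($_ =~ $pattern ? 0 : 1), $i++, $_ ] } @lines;
}

my @sorted = matching_first_sorted(
    qr/blahblah/,
    "foo 1 zot\n", "foo 2 blahblah\n", "bar 1 zot\n", "bat 1 blahblah\n",
);
print @sorted;
```

Comparing two small numbers per line instead of a prefixed string costs a little more setup, but the result no longer depends on what the lines happen to contain.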
Re: text sorting question by Boogman (Scribe) on Aug 03, 2000 at 00:51 UTC
If you don't want to be saving any of the elements in arrays or anything like that, you could always just make two runs through the file. The first would check for the 'blah blah' lines and write those to a temporary file. The second would append the remaining lines to the end. Sure it means you have to read through the file twice, but if you're worried about memory usage, this only holds on to one line at a time.
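A rough sketch of the two-pass idea follows. It writes straight to the output file instead of going through a temporary file, which is a simplification of what is described above. The helper name `two_pass`, the file names, and the sample data are all made up for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: read $infile twice, holding only one line at a time.
sub two_pass {
    my ($infile, $outfile, $pattern) = @_;
    open my $out, '>', $outfile or die "Can't write $outfile: $!";
    for my $want_match (1, 0) {    # pass 1: matching lines, pass 2: the rest
        open my $in, '<', $infile or die "Can't read $infile: $!";
        while (my $line = <$in>) {
            my $matched = $line =~ $pattern ? 1 : 0;
            print {$out} $line if $matched == $want_match;
        }
        close $in;
    }
    close $out or die "Can't close $outfile: $!";
}

# Demo in a scratch directory that is removed when the script exits.
my $dir = tempdir(CLEANUP => 1);
my ($infile, $outfile) = ("$dir/log.in", "$dir/log.out");
open my $fh, '>', $infile or die "Can't write $infile: $!";
print {$fh} "foo 1 zot\nfoo 2 blahblah\nbar 1 zot\nbat 1 blahblah\n";
close $fh or die "Can't close $infile: $!";

two_pass($infile, $outfile, qr/blahblah/);
```

The trade-off is the one already named: twice the reading, but memory use stays constant no matter how large the log grows.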
Fast simple approach by gryng (Hermit) on Aug 03, 2000 at 02:15 UTC
Just put the blah blahs in output file 1, and non blah blahs in output file 2. Then when you are done: `cat out1.txt out2.txt > final-output.txt; rm out[12].txt`

You could use arrays if you don't mind the memory usage and feel dirty using temporary files :) . Don't know how clean this is supposed to be :)

Ciao, Gryn
Re: text sorting question by eLore (Hermit) on Aug 03, 2000 at 00:40 UTC
I don't know if it's the most efficient, but what about pushing all of the "top of the list" items into one array, the rest into another, then prepending the first array, in reverse order, onto the second?

```perl
my (@to_move, @regular_order);
while (<INFILE>) {
    if (/blahblah/) {    # stand-in for the string to move
        push @to_move, $_;
    }
    else {
        push @regular_order, $_;
    }
}
# take from the end so the moved lines keep their original sequence
unshift @regular_order, pop @to_move while @to_move;
```

Completely untested, no warranty offered or implied! Someone please do it better...

UPDATE Plaid did it better.
Re: text sorting question by eak (Monk) on Aug 03, 2000 at 00:58 UTC
If you can get the data into an array of arrays, the following would work very nicely. There is probably a nicer way to do the loop than a foreach, though.

```perl
#!/usr/bin/perl -w
my @array = (
    ['foo', 1, 'zot'],
    ['foo', 2, 'blahblah'],
    ['bar', 1, 'zot'],
    ['bar', 2, 'zot'],
    ['bat', 1, 'blahblah'],
    ['bat', 2, ''],
    ['baz', 1, ''],
    ['baz', 2, 'zot'],
);
my @sorted;
foreach my $array (@array) {
    ($array->[2] eq 'blahblah' ? unshift @sorted, $array : push @sorted, $array);
}
```

--eric
(Ovid) RE(2): text sorting question by Ovid (Cardinal) on Aug 03, 2000 at 01:01 UTC
That's a nice example, but it reverses the order of the "blahblah" lines. That was something that ybiC was trying to avoid. I spotted that immediately because I made the same mistake at first :)

Cheers, Ovid
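The reversal is easy to see side by side. The snippet below is only an illustration with made-up variable names: it runs the same four sample lines through `unshift` and through `splice` at a running index.

```perl
use strict;
use warnings;

my @lines = ('foo 1 zot', 'foo 2 blahblah', 'bar 1 zot', 'bat 1 blahblah');

# unshift puts each new match in front of the matches already seen
my @with_unshift;
for my $line (@lines) {
    if ($line =~ /blahblah/) { unshift @with_unshift, $line }
    else                     { push    @with_unshift, $line }
}

# splice at a running index puts each new match after the earlier ones
my @with_splice;
my $next = 0;
for my $line (@lines) {
    if ($line =~ /blahblah/) { splice @with_splice, $next++, 0, $line }
    else                     { push   @with_splice, $line }
}

print join(', ', @with_unshift), "\n";
# bat 1 blahblah, foo 2 blahblah, foo 1 zot, bar 1 zot
print join(', ', @with_splice), "\n";
# foo 2 blahblah, bat 1 blahblah, foo 1 zot, bar 1 zot
```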
Re: text sorting question by ray (Initiate) on Aug 03, 2000 at 01:43 UTC
The following will do the sort you want, after the log has been loaded into the array `@v`:

```perl
my @sorted =
    map  { $_->[2] }
    sort { ($a->[0])*($#v+1) + $a->[1] <=> ($b->[0])*($#v+1) + $b->[1] }
    map  { [ !((split '\s+', $v[$_])[2] eq 'blahblah'), $_, $v[$_] ] }
    0 .. $#v;
```

Later, Ray.
RE: Re: text sorting question by eak (Monk) on Aug 04, 2000 at 19:43 UTC
Here is a slightly modified version of the code above, using a `||` in the sort block to make sure the lines are in the proper order.

```perl
my @sorted =
    map  { $_->[2] }
    sort { $b->[0] <=> $a->[0] || $a->[1] <=> $b->[1] }
    map  { [ (split '\s+', $file[$_])[2] eq 'blahblah', $_, $file[$_] ] }
    0 .. $#file;
```
RE: text sorting question, kinda (simple result) by ybiC (Prior) on Aug 03, 2000 at 05:38 UTC
Here's the snippet I ended up with - it's not much more than verbatim from plaid's answer. Ovid's answer looked interesting too, but I try not to take things from people on crack. <big grin>

Update: Thanks as well to other fine Monks who offered answers.

cheers, ybiC

```perl
#!/usr/bin/perl -w

# parse a log file and move lines with important text to top
# of file while keeping sequence within each of two sections:
# important and not-so-important

use strict;

my $infile  = '/dir/file.in';
my $outfile = '/dir/file.out';
my @important;
my @normal;

open IN,  "$infile"   or die "Couldn't open $infile";
open OUT, ">$outfile" or die "Couldn't open $outfile";
while (<IN>) {
    s/unwanted text//g;         # strip unwanted text
    s/more unwanted text//g;    # strip unwanted text
    s/^\s+//g;                  # strip leading whitespace, which empties blank lines
    (/important text/) ? push @important, $_ : push @normal, $_;
}
print OUT @important;
print OUT @normal;
close IN  or die "Couldn't close $infile";
close OUT or die "Couldn't close $outfile";
# END
```