Learning to use the hash effectively

Stamp_Guy has asked for the wisdom of the Perl Monks concerning the following question:

I've been trying to teach myself hashes and have come up with the following block of code to change a line in a pipe-delimited flat file. I was wondering if my fellow monks here could give me some advice on how to make this code simpler, more efficient, etc. I'm also wondering whether there are better ways to do the same thing. All suggestions, comments, etc. would be appreciated! Thanks!
Stamp_Guy

#!/usr/bin/perl -w
use strict;

# Predeclare variables.
my %test;
my @fileData;
my @sorted;

# Initialize the variable that will hold the data to be changed.
my $change = "this is cool";

# Open the file
open(TEST, "test.txt") || die "File couldn't be opened for reading: $!";

# Create a hash
while (<TEST>) {
my ($key,@fileData) = split /\|/;
chomp @fileData;
$test{$key} = \@fileData;
}

# Change the data
$test{mykey}->[1] = "$change";

# Place all the hash data into an array of pipe-separated values for sorting.
foreach my $key (keys %test) {
    push (@sorted, "$key|$test{$key}->[0]|$test{$key}->[1]|$test{$key}->[2]|$test{$key}->[3]");
}

# Sort the data by the number in the last part of the array.
@sorted = map $_->[1],
    sort { $a->[0] <=> $b->[0] }
        map [ substr($_,rindex($_,'|')+1), $_ ],
        @sorted;

# Put the line breaks back in.
for (@sorted) {
$_ = "$_\n";
}

open(TEST, ">test.txt") || die "File couldn't be opened for writing: $!";
print TEST @sorted;
close(TEST)

Replies are listed 'Best First'.
Re: Learning to use the hash effectively by btrott (Parson) on Jun 25, 2001 at 02:36 UTC
Two things I will recommend:

File locking. You have a very big race condition in your code: you read in the contents of the file, alter them, then write them back out. Between the time that you've read the file and the time you write it, someone else (i.e. another process) could have changed that file. You would then overwrite that change when you write the file with your own contents.

One way to fix this is to open the file in read/append mode, flock it, seek to the beginning of the file, read from it, alter the file contents in memory, seek back to the beginning, truncate it, then rewrite the contents from memory. The flock will prevent the race condition (at least with another copy of your program that also uses flock).

Another way to fix the problem is to use a semaphore file, as in tilly's Simple Locking. Here, you flock a semaphore file when you want to enter the "critical section" of your program, and other processes running your program cannot enter that critical section until you have released the lock. I would recommend the second approach.

My second suggestion is that you could just use a DBM file for this, particularly since you already have the notion of keys mapping to values. In particular, you could use MLDBM to serialize the data structure into the DBM format of your choice.
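The first approach described above (lock, read, modify, truncate, rewrite on a single handle) might look roughly like the sketch below. It is not btrott's code: the sub name `update_record` and its arguments are invented for illustration, and it opens the file with `+<` (read/write) rather than a read/append mode so that `seek` decides where writing starts. It assumes the file already exists and that every other writer also calls `flock`.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Hypothetical helper: set field $index of the record whose key is $key.
sub update_record {
    my ($path, $key, $index, $value) = @_;

    open(my $fh, '+<', $path) or die "Can't open $path: $!";
    flock($fh, LOCK_EX)       or die "Can't lock $path: $!";

    # Read every record while holding the lock.
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        my ($k, @fields) = split /\|/, $line, -1;   # -1 keeps trailing empty fields
        $fields[$index] = $value if $k eq $key;
        push @lines, join('|', $k, @fields);
    }

    # Rewind, empty the file, and write the records back.
    seek($fh, 0, SEEK_SET) or die "Can't seek in $path: $!";
    truncate($fh, 0)       or die "Can't truncate $path: $!";
    print {$fh} "$_\n" for @lines;

    close($fh) or die "Can't close $path: $!";      # closing releases the lock
    return scalar @lines;
}
```

A call such as `update_record("test.txt", "mykey", 1, "this is cool")` would then replace the separate read and write passes. The lock is advisory, so it only protects against other programs that also lock the file.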
Re: Re: Learning to use the hash effectively by Stamp_Guy (Monk) on Jun 25, 2001 at 04:41 UTC
btrott, thanks for your suggestions. I normally use the second method of file locking; however, when I am testing on my Win98 box, I leave it off because it causes errors.
Stamp_Guy
Re: Learning to use the hash effectively by suaveant (Parson) on Jun 25, 2001 at 03:26 UTC
First of all, change

    while (<TEST>) {
        my ($key,@fileData) = split /\|/;
        chomp @fileData;
        $test{$key} = \@fileData;
    }
    #to
    while (<TEST>) {
        chomp;
        my ($key,@fileData) = split /\|/;
        $test{$key} = \@fileData;
    }

Otherwise you are chomping every item in the array, when you know there can only be a newline at the end of the line.

I would change

    foreach my $key (keys %test) {
        push (@sorted, "$key|$test{$key}->[0]|$test{$key}->[1]|$test{$key}->[2]|$test{$key}->[3]");
    }
    #to
    foreach my $key (keys %test) {
        push @sorted, join('|', $key, @{$test{$key}});
    }

except that that is overkill... since you can't have multiples of the same key, you don't need to sort on the whole line, just the key. I would do...

    open(TEST, ">test.txt") || die "File couldn't be opened for writing: $!";
    foreach my $key (sort keys %test) {
        print TEST join('|', $key, @{$test{$key}});
        print TEST "\n";
    }
    close(TEST);

instead of the whole @sorted thing.

Update: Sorry, you wanted to sort by the last item in the array of data... change

    foreach my $key (sort keys %test) {
    #to
    foreach my $key (sort { $test{$a}[-1] <=> $test{$b}[-1] } keys %test) {

The `$test{$a}[-1] <=> $test{$b}[-1]` sorts numerically based on the final item in the data array at each key.

Untested, but I believe it all works fine.

- Ant
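Putting these suggestions together (chomp the whole line, join the fields, sort the keys by the last field), the read, change, and write steps might look like the following. This is only a sketch: the sub names `parse_records` and `sorted_lines` and the sample records are invented for illustration, and it works on in-memory lines rather than on `test.txt`.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "key|f0|f1|..." lines into a hash of array refs.
sub parse_records {
    my %rec;
    for my $line (@_) {
        chomp(my $copy = $line);                 # one chomp per line, not per field
        my ($key, @fields) = split /\|/, $copy;
        $rec{$key} = \@fields;
    }
    return \%rec;
}

# Hypothetical helper: rebuild the lines, ordered numerically by the last field.
sub sorted_lines {
    my ($rec) = @_;
    return map  { join('|', $_, @{ $rec->{$_} }) . "\n" }
           sort { $rec->{$a}[-1] <=> $rec->{$b}[-1] }
           keys %$rec;
}

# Made-up sample data standing in for the contents of the flat file.
my $rec = parse_records("beta|x|y|z|20\n", "mykey|a|b|c|3\n", "alpha|p|q|r|100\n");
$rec->{mykey}[1] = "this is cool";
print sorted_lines($rec);
```

With the sample data the records come out in the order mykey, beta, alpha, because the last fields are 3, 20, and 100. No intermediate `@sorted` array of strings is needed, since the sort block reads the last field straight from the hash.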
Re: Re: Learning to use the hash effectively by Stamp_Guy (Monk) on Jun 25, 2001 at 04:38 UTC
Suaveant, thanks for your excellent suggestions! They work quite well. I had thought that since hashes are by nature unsorted, I could only sort them by ASCII value. This is much more compact and efficient.
Re: Re: Re: Learning to use the hash effectively by suaveant (Parson) on Jun 25, 2001 at 06:42 UTC
You can't sort hashes, but you can sort their keys :) - Ant
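As a small illustration of that point: the sort block decides the order of the key list, while the hash itself stays unordered. The hash `%stock` and its contents are made up for this sketch.

```perl
use strict;
use warnings;

my %stock = (pear => 12, apple => 7, fig => 30);   # made-up data

# Default sort: string order of the keys.
my @by_name  = sort keys %stock;
# Numeric order of the values each key maps to.
my @by_count = sort { $stock{$a} <=> $stock{$b} } keys %stock;
# Same comparison with $a and $b swapped: largest value first.
my @by_desc  = sort { $stock{$b} <=> $stock{$a} } keys %stock;

print "@by_name\n";    # apple fig pear
print "@by_count\n";   # apple pear fig
print "@by_desc\n";    # fig pear apple
```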
Re: Learning to use the hash effectively by particle (Vicar) on Jun 25, 2001 at 03:38 UTC
this bit looks overly complex (my formatting)~

    # Place all the hash data into an array of pipe-separated values for sorting
    foreach my $key (keys %test) {
        push @sorted, "$key|$test{$key}->[0]|$test{$key}->[1]|" .
                      "$test{$key}->[2]|$test{$key}->[3]";
    }

    # Sort the data by the number in the last part of the array.
    @sorted = map $_->[1],
        sort { $a->[0] <=> $b->[0] }
            map [ substr($_,rindex($_,'|')+1), $_ ],
            @sorted;

    # Put the line breaks back in.
    for (@sorted) {
        $_ = "$_\n";
    }

you build an array, break it down to sort, then add to it again. how about using element 3 from the array already in the hash for the compare... why not try something like (untested for errors)~

    # Sort the data by the number in the last part of the array
    @sorted =
        # return just the hash->key
        map { $_->[1] }
        # compare hash->key->value[3]'s
        sort { $a->[0] <=> $b->[0] }
        # anon array w/ hash->key->value[3], hash->key
        map { [ $test{$_}[3], $_ ] }
        # keys from hash
        keys %test;

    # rebuild array of pipe-separated values, and put the line breaks back in
    $_ .= "|$test{$_}[0]|$test{$_}[1]|$test{$_}[2]|$test{$_}[3]\n" for (@sorted);

~Particle
Re: Learning to use the hash effectively by bikeNomad (Priest) on Jun 25, 2001 at 05:13 UTC
You might want to look at DBD::CSV, which will let you deal with your pipe-separated file as if it were a real database.

