Reading directories and parsing standard names

Amoe has asked for the wisdom of the Perl Monks concerning the following question:

Re: Reading directories and parsing standard names by abstracts (Hermit) on Aug 11, 2001 at 14:07 UTC
Hello,

Solving your problem involves four steps:
1. Get the names of the files in the directory matching `foo_*.bar`.
2. Map that list to a list of numbers by capturing with `foo_(\d+)\.bar`.
3. Sort that list numerically in descending order (using the spaceship operator, `<=>`).
4. Take the first element and add 1.

`print get_num('.'); sub get_num{ my $dir = shift; (sort{$b<=>$a} map{/foo_(\d+)\.bar$/}<$dir/foo_*.bar>)[0] + 1; }`

Hope this helps,,, Aziz,,,

Update: The space complexity of this algorithm is O(n) and its time complexity is O(n log n). That is fine for a small number of files, but there are better ways for larger numbers. Zaxo's algorithm works the same way but performs worse because it runs the regexp inside the sort comparison: this algorithm does n matches, while his does on the order of n log n. A better answer is:

`sub get_num{ my $dir = shift; my $c = 0; /foo_(\d+)\.bar$/ and $1>$c and $c=$1 while <$dir/foo_*.bar>; return $c+1; }`

It has O(1) space complexity and O(n) time complexity. It is also a complete solution that requires no special cases, and it is much shorter than my previous example. Enjoy. Aziz,,,
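A minimal sketch of the single-pass idea above, spelled out for readability. The name `next_num` is hypothetical, and matching against the basename is an added assumption so that digits in the directory path are not mistaken for the sequence number. Note that `glob` splits its pattern on whitespace, so this sketch assumes a directory name without spaces.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: return the next free sequence number for
# foo_<n>.bar files in $dir, scanning the names once without sorting.
sub next_num {
    my ($dir) = @_;
    my $max = 0;
    for my $path (glob("$dir/foo_*.bar")) {
        # Match the basename only, so digits in $dir are ignored.
        next unless basename($path) =~ /^foo_(\d+)\.bar$/;
        $max = $1 if $1 > $max;
    }
    return $max + 1;    # 1 when no matching file exists
}

print next_num('.'), "\n";
```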
Re: Reading directories and parsing standard names by Zaxo (Archbishop) on Aug 11, 2001 at 14:18 UTC
I suspect the problem is in the manner of sorting. If you did a string sort on the array of names, `foo_10.bar` would come before `foo_2.bar`. A numeric sort on the captured digits should work. Here is a stab at it (it needs several `foo_<n>.bar` files to exist):

`{ my $re = qr/foo_(\d+)\.bar$/; sub seq_num { my ($name) = @_; return $name =~ $re ? $1 : undef; } } my @files = </dir/to/use/foo_*.bar>; my @sorted_files = sort { seq_num($b) <=> seq_num($a) } @files; my $next = 1 + seq_num($sorted_files[0]); open NEXT, "> /dir/to/use/foo_$next.bar" or die "Cannot create foo_$next.bar: $!"; close(NEXT);`

Print the content to `NEXT` between the `open` and the `close`. A full solution would special-case `@files` for 0 or 1 elements. After Compline, Zaxo
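To make the sorting point concrete, here is a small sketch comparing a string sort with a numeric sort on the captured digits. The sample names are made up for illustration, and the last line shows one possible way to handle an empty list.

```perl
use strict;
use warnings;

# Made-up sample names for illustration.
my @names = map { "foo_$_.bar" } (1, 2, 10);

# Extract the sequence number, or undef if the name does not match.
sub seq_num {
    my ($name) = @_;
    return $name =~ /foo_(\d+)\.bar$/ ? $1 : undef;
}

my @by_string = sort @names;                                  # compares character by character
my @by_number = sort { seq_num($a) <=> seq_num($b) } @names;  # compares the captured digits

print "string:  @by_string\n";   # foo_1.bar foo_10.bar foo_2.bar
print "numeric: @by_number\n";   # foo_1.bar foo_2.bar foo_10.bar

# Guard against an empty list instead of calling seq_num on undef.
my $next = @by_number ? 1 + seq_num($by_number[-1]) : 1;
print "next: $next\n";           # 11
```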
Re: Re: Reading directories and parsing standard names by Amoe (Friar) on Aug 11, 2001 at 14:53 UTC
This works great. Thanks loads :) credits zaxo
Re: Reading directories and parsing standard names by George_Sherston (Vicar) on Aug 11, 2001 at 15:13 UTC
In the spirit of TIMTOWTDI, why not add to the end of your sub a tiny routine that saves the name of the highest-numbered file in a text file in the same directory? And a tiny routine at the beginning that opens this text file and reads the contents? I mean, why search for something you hid yourself? Sorry if there's something I didn't pick up on that makes this an impractical suggestion. § George Sherston
Re: Re: Reading directories and parsing standard names by Amoe (Friar) on Aug 11, 2001 at 17:37 UTC
I thought about doing that, but it seemed kinda messy :P
Re: Re: Re: Reading directories and parsing standard names by koolade (Pilgrim) on Aug 11, 2001 at 20:14 UTC
Messy? With this you're looking at one step versus four or five. Whether you use a text file to hold the id or not, I smell a possible race condition. See File::CounterFile for more info.
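One way to sidestep the race without a counter file, sketched here as an assumption rather than as what File::CounterFile does: let the filesystem arbitrate by creating the file with `O_EXCL`, and move on to the next number if the name is already taken. The helper name `claim_next` is hypothetical, and the demo runs in a scratch directory.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use Errno qw(EEXIST);
use File::Temp qw(tempdir);

# Hypothetical helper: create foo_<n>.bar for the first n >= $start that
# does not exist yet. With O_EXCL the create fails if another process
# made the same name first, so two writers never end up sharing a file.
sub claim_next {
    my ($dir, $start) = @_;
    for (my $n = $start; ; $n++) {
        my $path = "$dir/foo_$n.bar";
        if (sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL)) {
            return ($fh, $path);
        }
        # Only "already exists" means "try the next number".
        die "Cannot create $path: $!" unless $! == EEXIST;
    }
}

my $dir = tempdir(CLEANUP => 1);           # scratch directory for the demo
my ($fh1, $path1) = claim_next($dir, 1);   # claims foo_1.bar
my ($fh2, $path2) = claim_next($dir, 1);   # foo_1.bar is taken, claims foo_2.bar
print "$path1\n$path2\n";
close $_ for $fh1, $fh2;
```

Starting the search from the highest number found by a directory scan keeps the loop short; the exclusive create is what makes the final step safe.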
Re: Reading directories and parsing standard names by runrig (Abbot) on Aug 11, 2001 at 18:41 UTC
No need to sort if you just want the highest number: `my ($max_file, $max_num); while (defined(my $file = <foo_*.bar>)) { ($max_file, $max_num) = ($1, $2) if $file =~ /^(foo_(\d+)\.bar)$/ and (!defined $max_num or $2 > $max_num); } print "$max_file\n";`
Re: Reading directories and parsing standard names by kjherron (Pilgrim) on Aug 11, 2001 at 21:56 UTC
In the spirit of thinking outside the box: do the file numbers have to be sequential or start at 1? When I've needed to produce unique files within a directory, it's often been sufficient to use the current time as part of the filename, viz:

`$name = 'foo_' . time() . '.bar';`

or

`my($sec, $min, $hr, $day, $mon, $year) = (gmtime)[0..5]; $name = sprintf("foo_%04d%02d%02d.%02d%02d%02d.bar", $year + 1900, $mon + 1, $day, $hr, $min, $sec);`

As long as you don't try to create more than one file per second, this will produce a new filename every time.
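The same timestamped name can also be built with `strftime` from the core POSIX module, which avoids the manual `+ 1900` and `+ 1` adjustments. This is only a sketch; the helper name `stamp_name` is hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: build foo_YYYYMMDD.HHMMSS.bar from an epoch time (UTC).
sub stamp_name {
    my ($epoch) = @_;
    return strftime('foo_%Y%m%d.%H%M%S.bar', gmtime($epoch));
}

print stamp_name(time), "\n";
print stamp_name(0), "\n";    # foo_19700101.000000.bar
```

The one-file-per-second limit still applies, since the name carries no finer resolution than seconds.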
Re: Re: Reading directories and parsing standard names by abstracts (Hermit) on Aug 12, 2001 at 14:06 UTC
Hello,

To generate unique names for files, you can even use the File::Temp module. The function `tempfile`, given a template, returns the filehandle and filename of a new file that has just been opened. This way, you don't need to worry about the one-file-per-second limit or similar problems.

`use File::Temp qw/tempfile/; ($fh, $filename) = tempfile( $template, DIR => $dir, SUFFIX => '.dat', UNLINK => 0);`

Hope this helps,,, Aziz,,,
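A self-contained sketch of the `tempfile` call with concrete values filled in. The template `foo_XXXXXX`, the `.bar` suffix, and the scratch directory are assumptions chosen to match the thread's naming; File::Temp requires the template to end in at least four `X` characters.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Scratch directory for the demo only; it is removed when the script exits.
my $dir = tempdir(CLEANUP => 1);

# The trailing X characters are replaced with random characters.
# UNLINK => 0 asks File::Temp not to delete the file itself.
my ($fh, $filename) = tempfile('foo_XXXXXX', DIR => $dir, SUFFIX => '.bar', UNLINK => 0);

print {$fh} "content\n";
close $fh or die "Cannot close $filename: $!";
print "$filename\n";
```

The names are unique but random, so this fits only when the files do not need to be numbered in sequence.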

