Re: Cultural and Bibliometric Perl

Some comments:

use strict, warnings and diagnostics, or die.
The file regex may produce unwanted results with Unix-like filenames such as file.txt.bak. Think about which part you want to retain. You can find the first and last dot with index and rindex, respectively.
You slurp the contents in list context only to join the list again. You can read the whole file into a scalar directly by setting $/ to undef:
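To make the index/rindex suggestion concrete, here is a minimal sketch. The file name and the variable names are only illustrative; they are not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical file name, chosen only to illustrate the problem.
my $file = 'file.txt.bak';

my $first = index  $file, '.';   # position of the first dot
my $last  = rindex $file, '.';   # position of the last dot

# Both functions return -1 when there is no dot, so check before cutting.
my $upto_first = $first >= 0 ? substr( $file, 0, $first ) : $file;
my $upto_last  = $last  >= 0 ? substr( $file, 0, $last )  : $file;

print "$upto_first\n";   # file
print "$upto_last\n";    # file.txt
```

Which of the two you want depends on whether the output name should keep the inner extension.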
```
{
  local $/ = undef;
  $contenido = <LIBRO>;
}
```
You can leave the newlines intact; they will be caught by '\s'. Even better, tr will take care of them.
Use lc or uc to change the case.
You can simplify the transliteration by complementing the search list, so that everything outside the alphabetic range is replaced (see perlop):
```
$contenido = uc $contenido;
$contenido =~ tr/A-Z/ /cs;
```
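As a quick check of what the two lines above do, here is a small sketch on a made-up string (the sample text is an assumption, not part of the original book file):

```perl
use strict;
use warnings;

# Made-up sample text with punctuation, a newline and digits.
my $contenido = "Hola, mundo!\nHola... 123 otra vez";

$contenido = uc $contenido;   # fold everything to upper case
# c: complement the search list, i.e. match everything that is NOT A-Z
# s: squeeze each run of replaced characters into a single space
$contenido =~ tr/A-Z/ /cs;

print "$contenido\n";   # HOLA MUNDO HOLA OTRA VEZ
```

One caveat: with the plain A-Z range, accented letters and ñ fall outside the list and are turned into spaces as well, which may matter for Spanish text.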
Use '\s+' rather than '\s', so you don't have to test for empty cases.
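The difference is easy to see on a small made-up string. The sketch below also shows one case that '\s+' does not cover: a separator at the very start of the string still gives an empty first field, which the special pattern ' ' avoids.

```perl
use strict;
use warnings;

my $texto = "UNO  DOS TRES";   # made-up sample with a double space

my @single = split /\s/,  $texto;   # ('UNO', '', 'DOS', 'TRES')
my @plus   = split /\s+/, $texto;   # ('UNO', 'DOS', 'TRES')

# Leading whitespace: /\s+/ yields an empty first field, ' ' skips it.
my @leading = split /\s+/, " UNO DOS";   # ('', 'UNO', 'DOS')
my @awk     = split ' ',   " UNO DOS";   # ('UNO', 'DOS')

print join( ' ', scalar @single, scalar @plus, scalar @leading, scalar @awk ), "\n";   # 4 3 3 2
```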
You can get the total number without array assignment: $npalabras = keys %PF; In scalar context, keys returns the number of keys directly.
I would print to LIBROUT inside the loop, so the system gets the chance to buffer the output nicely.
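A short sketch of keys in the different contexts, using a made-up hash. Note that the key count is the number of distinct words; the $total line is an extra illustration (not in the original script) for the case where the sum of all occurrences is wanted instead.

```perl
use strict;
use warnings;

my %PF = ( HOLA => 2, MUNDO => 1, OTRA => 1 );   # made-up counts

my $npalabras = keys %PF;          # scalar context: number of keys
my @lista     = keys %PF;          # list context: the keys themselves
my $tambien   = scalar keys %PF;   # explicit scalar, handy inside a list

# Assumption: if the total number of occurrences is needed, sum the values.
my $total = 0;
$total += $_ for values %PF;

print "$npalabras $total\n";   # 3 4
```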

It's quite a list, but I hope it will give you the chance to learn some new idioms. Result:

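```
#....
my $contenido;
{
  local $/ = undef;            # slurp mode: read the whole file at once
  $contenido = <LIBRO>;
}
$contenido = uc $contenido;
$contenido =~ tr/A-Z/ /cs;     # each run of non-letters becomes one space

my %PF;
# split ' ' works like /\s+/ but also skips leading whitespace
$PF{$_}++ for split ' ', $contenido;

open LIBROUT, ">$ar.csv" or die "Cannot open $ar.csv: $!";
my $npalabras = keys %PF;      # number of distinct words
for ( keys %PF ){              # for sets $_ to each key in turn
  my $f = $PF{$_};
  print LIBROUT join( ';', $_, $f, $f / $npalabras ), "\n";
}
```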

Well, you see how the use of $_ simplifies things.

Hope this helps,

Jeroen
"We are not alone"(FZ)


Re: Re: Cultural and Bibliometric Perl
by Ignatius Monk (Novice) on Jun 29, 2001 at 15:53 UTC
