PerlMonks  

a method to get words from a file into an array

by aufrank (Pilgrim)
on Aug 08, 2002 at 20:41 UTC (#188733)

aufrank has asked for the wisdom of the Perl Monks concerning the following question:

hey--
I am working on my first project using OO Perl, and have a quick question. I have several different files (some of which are very large) that are simple word lists: words separated by a comma and a space. I want to load all of the words from a file into an array via a method. The following code seems to do what I want:

    sub words_from_file {
        my $self = shift;
        my $file_path = shift;
        my @word_list;
        open WORD_FILE, "< $file_path"
            or die "Could not open $file_path: $!\n";
        while (<WORD_FILE>) {
            chomp;
            push @word_list, split ", ";
        }
        \@word_list;
    }

However, I am not at all sure that this is the best way to implement the functionality I want. In terms of security and speed, do you all have any suggestions? Are there file tests I should be making before the open? Should I do something involving tie to the filehandle to speed things up? (I know nothing about tie so I'm not even sure if that last question made sense :) Is there any benefit to using return \@word_list instead of what I've done in the last line?

just looking for any feedback before I get too much further into this,
--au

Replies are listed 'Best First'.
Re: a method to get words from a file into an array
by sauoq (Abbot) on Aug 08, 2002 at 20:52 UTC

    I think what you have is basically fine.

    You might not want to die() in your module. Better to carp() and return undef.

    You might also want to have a more robust split pattern. Maybe /,\s*/ for instance.

    The benefit of using return is that you make your code more readable. Making the return explicit will help the newbie they hire (for half your wages after they lay you off) maintain your code. I skip return in one-liners and quickie throw-aways but I use it otherwise.

    -sauoq
    "My two cents aren't worth a dime.";
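The suggestions in the reply above (carp and return undef instead of die, a looser split pattern, an explicit return) can be sketched roughly as follows. This is only one possible reading of that advice, not the poster's own code: the package name `WordList` and its bare-bones constructor are hypothetical, and the three-argument open with a lexical filehandle is an addition that the thread itself does not mention.

```perl
use strict;
use warnings;

{
    package WordList;    # hypothetical class name, for illustration only

    use Carp;

    # Hypothetical minimal constructor so the method can be called.
    sub new { return bless {}, shift }

    sub words_from_file {
        my ($self, $file_path) = @_;

        # Three-argument open with a lexical filehandle (an assumption,
        # not from the thread). On failure, warn from the caller's point
        # of view and return undef instead of dying inside the module.
        open my $fh, '<', $file_path
            or do { carp "Could not open $file_path: $!"; return };

        my @word_list;
        while (my $line = <$fh>) {
            chomp $line;
            # Split on a comma followed by any amount of whitespace.
            push @word_list, split /,\s*/, $line;
        }
        close $fh;

        return \@word_list;    # explicit return of an array reference
    }
}
```

A caller would then check the result before using it, for example `my $words = WordList->new->words_from_file($path) or next;`, where `$path` is whatever file name the caller has at hand.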
    
Re: a method to get words from a file into an array
by Juerd (Abbot) on Aug 08, 2002 at 21:16 UTC

    words separated by a comma and a space.

    sub readthingy {
        local @ARGV = @_;
        local $/ = ', ';
        return <>; # ref to array might be faster
    }

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      sub readthingy {
          local @ARGV = @_;
          local $/ = ', ';
          return <>; # ref to array might be faster
      }

      This is not equivalent to the original. Ask yourself, "what happens to newlines?"

      -sauoq
      "My two cents aren't worth a dime.";
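To make the newline question concrete, here is a small sketch that reads from an in-memory string with `$/` set to `', '`, the same separator readthingy uses. The helper name `read_records` and the sample text are made up for illustration; the point is only to show what the records look like when the input also contains newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from a string, using ', ' as the
# input record separator, much as readthingy does with <>.
sub read_records {
    my ($text) = @_;
    open my $fh, '<', \$text
        or die "Could not open in-memory file: $!";
    local $/ = ', ';
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @records = read_records("foo, bar\nbaz, qux\n");

# Print the records with newlines made visible.
# Prints: [foo, |bar\nbaz, |qux\n]
print '[', join('|', map { (my $r = $_) =~ s/\n/\\n/g; $r } @records), "]\n";
```

Each record keeps its trailing `', '` because nothing chomps it, and the newline between "bar" and "baz" ends up inside a single record, so words on either side of a line break are not separated the way they are in the original loop.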
      

        This is not equivalent to the original. Ask yourself, "what happens to newlines?"

        Did I ever claim it was? Besides, if it were equivalent, there would be no point in posting it.

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        

Re: a method to get words from a file into an array
by DamnDirtyApe (Curate) on Aug 08, 2002 at 23:14 UTC

    Looks like you can do this quicker with map:

    Perl Code:
    #! /usr/bin/perl
    use strict ;
    use warnings ;
    $|++ ;
    use Benchmark qw( :all ) ;

    # Construct the test string.
    our @arr ;
    for ( 1..100_000 ) {
        push @arr, "foo, bar, zoot, blarg,\n"
    }

    cmpthese( 10_000_000, {
        'Mine'  => \&mine,
        'Yours' => \&yours
    } ) ;

    sub mine {
        local @arr ;
        return map { split /,\s*|\n/ } @arr ;
    }

    sub yours { # Roughly.
        local @arr ;
        my @word_list ;
        while ( @arr ) {
            chomp;
            push @word_list, split ", ";
        }
        \@word_list;
    }
    Results:
    Benchmark: timing 10000000 iterations of Mine, Yours...
          Mine:  9 wallclock secs ( 8.82 usr +  0.02 sys =  8.84 CPU) @ 1131221.72/s (n=10000000)
         Yours: 25 wallclock secs (21.95 usr +  0.04 sys = 21.99 CPU) @ 454752.16/s (n=10000000)

               Rate  Yours   Mine
    Yours  454752/s     --   -60%
    Mine  1131222/s   149%     --
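One caveat when reading those numbers: `local @arr` gives each sub an empty `@arr` for the duration of the call, so neither sub ever sees the 100,000 test lines, and `while ( @arr )` only tests whether the array is non-empty without reading from it. The rates above therefore probably reflect call overhead more than splitting work. A sketch of a comparison that does process the data might look like the following; the sub names, the smaller data set, and the `-1` (at least one CPU second per case) setting are choices made here, not part of the original post.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Test data: fewer lines than the original 100_000 (an assumption,
# chosen only to keep each iteration short).
my @lines = ("foo, bar, zoot, blarg,\n") x 1_000;

# map-based version: split every line on comma-plus-whitespace or newline.
sub with_map {
    my ($lines) = @_;
    return [ map { split /,\s*|\n/ } @$lines ];
}

# Loop-based version, following the original method's chomp and split.
# Note: a trailing comma with no following space stays attached to the
# last word of each line ("blarg,").
sub with_loop {
    my ($lines) = @_;
    my @word_list;
    for my $line (@$lines) {
        chomp(my $copy = $line);    # work on a copy, leave the input intact
        push @word_list, split ", ", $copy;
    }
    return \@word_list;
}

# Run each case for at least one CPU second and print a comparison table.
cmpthese(-1, {
    map  => sub { with_map(\@lines) },
    loop => sub { with_loop(\@lines) },
});
```

No timings are quoted here, since they depend on the machine and the Perl version; the two versions also differ slightly in output because of the split patterns, so this compares approaches rather than strictly equivalent code.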

    _______________
    DamnDirtyApe
    Those who know that they are profound strive for clarity. Those who
    would like to seem profound to the crowd strive for obscurity.
                --Friedrich Nietzsche
