PerlMonks  

Re: Script to create huge sample files

by wazoox (Prior)
on Jan 04, 2010 at 18:00 UTC (#815618)


in reply to Script to create huge sample files

I wrote the following script to generate a large set of text files. The generated files look like real text files: they are compressible, but not excessively so (about 50%). It should work on any Unix-like system, or on Windows if you supply a dictionary file as the source of words (the path is hard-coded as /usr/share/dict/words). Arguments are the test folder, the number of files, and an optional seed. Feel free to test and adapt.
#!/usr/bin/perl
use strict;
use warnings;
use Carp;

# Load a word list (one word per line) and return it as an array reference.
sub loaddict {
    my $dict = shift;
    open my $fh, '<', $dict or croak "can't open $dict: $!";
    my @words = <$fh>;
    close $fh;
    chomp @words;
    return \@words;
}

#######################
# main
my $usage = "usage : $0 <test folder> <number of files> [seed]\n";
my $testdir   = $ARGV[0] or die $usage;
my $filecount = $ARGV[1] or die $usage;
my $seed = 0;
$seed = $ARGV[2] if defined $ARGV[2];

# force number
$filecount += 0;

if ( not -d $testdir ) {
    mkdir $testdir or die "can't mkdir $testdir: $!";
}

my $wordlist = loaddict("/usr/share/dict/words");

# Fixed base seed, so the same arguments produce the same random sequence.
srand( 42 + $seed );

for my $n ( 1 .. $filecount ) {
    open my $file, '>', "$testdir/$n" or croak "can't open file $testdir/$n: $!";

    # Each file gets between 5000 and 14999 words.
    my $filesize = int( rand(10000) ) + 5000;
    for my $i ( 1 .. $filesize ) {
        # rand(@$wordlist) covers every index; rand($#{$wordlist}) would
        # never pick the last word.
        my $dice = int( rand( scalar @$wordlist ) );
        print $file $wordlist->[$dice] . " ";

        # Break the line every 12 words.
        print $file "\n" if $i % 12 == 0;
    }
    close $file or croak "can't close file $testdir/$n: $!";
}
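If you want to check the "about 50%" figure on your own system, here is a minimal sketch that gzips each file in memory and prints the size ratio. It assumes the core module IO::Compress::Gzip is installed; the helper name gzip_ratio is made up for this example and is not part of the script above. The ratio you get will depend on your word list.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical helper: compressed size divided by original size.
# Returns 0 for empty input to avoid dividing by zero.
sub gzip_ratio {
    my ($data) = @_;
    return 0 if length($data) == 0;
    my $compressed;
    gzip( \$data => \$compressed ) or die "gzip failed: $GzipError";
    return length($compressed) / length($data);
}

# Report the ratio for every file named on the command line.
for my $path (@ARGV) {
    open my $fh, '<', $path or die "can't open $path: $!";
    binmode $fh;
    local $/;    # slurp the whole file in one read
    my $data = <$fh>;
    close $fh;
    printf "%s: %.0f%% of original size\n", $path, 100 * gzip_ratio($data);
}
```

Run it with the generated files as arguments, for example every file in the test folder.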
