PerlMonks  

Re: Re: Estimating Vocabulary

by I0 (Priest)
on Mar 28, 2002 at 00:41 UTC ( [id://154854]=note )


in reply to Re: Estimating Vocabulary
in thread Estimating Vocabulary

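my (@lines, $line);
open(FILE, shift) || die;
# First pass: count the lines.
1 while <FILE>;
$line = $.;
# Rewind and reset the line counter.
seek(FILE, 0, $. = 0);
# Second pass: keep a line with probability (still needed) / (lines left).
# $. already counts the current line, so $line - $. + 1 lines are left.
rand($line - $. + 1) < $ARGV[0] - @lines && push(@lines, $_) while <FILE>;
print @lines, "wc -l could have told you this is $. words\n";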
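
The snippet above reads as two-pass selection sampling: count the lines, then walk them again and keep each one with probability (still needed) / (lines left). A minimal sketch of the same idea as a subroutine over an in-memory list might look like this. The name sample_lines and the list interface are assumptions made for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: pick $k items from @items in input order,
# with every item equally likely to be chosen (selection sampling).
sub sample_lines {
    my ($k, @items) = @_;
    my @picked;
    my $left = @items;               # items not yet examined, current one included
    for my $item (@items) {
        my $need = $k - @picked;     # how many items still have to be picked
        push @picked, $item if rand($left) < $need;
        $left--;
    }
    return @picked;
}

my @sample = sample_lines(3, 'a' .. 'z');
print "@sample\n";
```

Once the number still needed equals the number of items left, the test always succeeds, so the result should contain exactly $k items (or all of them when $k exceeds the list length).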

Replies are listed 'Best First'.
Re: Re: Re: Estimating Vocabulary
by belg4mit (Prior) on Mar 28, 2002 at 01:00 UTC
    UPDATE: Excellent!

    WAS: That does not appear to work: I ask for one line and get 13-18 lines. It is also heavily weighted towards the Zs.

    --
    perl -pe "s/\b;([st])/'\1/mg"

      Are you sure? It's working for me, with only a small bias towards the Zs.

      Update: Apparently, the observed bias was mostly an artifact of small sample size.
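
One way to tell a real bias from small-sample noise is to repeat the draw many times and tally how often each position is chosen. The sketch below does this for a toy input of 10 positions. The sizes and variable names are assumptions chosen for illustration, and it uses the same keep-with-probability test as the sampler rather than reading a file.

```perl
use strict;
use warnings;

# Hypothetical check: draw $k of $n positions many times and count
# how often each position is chosen. All names here are illustrative.
my ($n, $k, $trials) = (10, 1, 20_000);
my @count = (0) x $n;

for (1 .. $trials) {
    my $picked = 0;
    for my $i (0 .. $n - 1) {
        # $n - $i positions are left, including the current one
        if (rand($n - $i) < $k - $picked) {
            $picked++;
            $count[$i]++;
        }
    }
}

# With no bias, every position should land near $trials * $k / $n.
printf "%2d: %d\n", $_ + 1, $count[$_] for 0 .. $n - 1;
```

If the later positions were favoured, their counts would sit consistently above the expected value instead of scattering around it.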
        On a Linux 2.4.9 box running perl 5.6.0, perl /tmp/a /usr/share/dict/words 1 yields such things as:

        databases fritter Saracens stammerer when Whitmanize willing writes Wuhan youthfully zigzag Zoroaster Zulu Zulus Zurich wc -l could have told you this is 45424 words
        shrivel topologies wetter Wilkins wristwatch Yeager yellowed zoom zooms Zoroastrian Zulu Zulus Zurich wc -l could have told you this is 45424 words
        valuably wins wriggles Zennist zoning Zoroaster Zulu Zulus Zurich wc -l could have told you this is 45424 words
        knave requisitioning seismology sentimentally tail Telnet Welles Whipple winner workbooks workmen Yates yeas Yokohama zodiac zonally zone Zoroaster Zoroastrian Zulu Zulus Zurich wc -l could have told you this is 45424 words

        --
        perl -pe "s/\b;([st])/'\1/mg"
