Perl Destroys Interview Question

Re: Perl Destroys Interview Question by Abigail-II (Bishop) on Jan 13, 2004 at 02:26 UTC
Of course, your Perl solution (which is incorrect, as it counts lines, not words) takes more than five times as many lines as a shell solution would: `cat words.dat | tr 'A-Z ' 'a-z\012' | sort | uniq -c` I'd like to point out that for some problems, other solutions are better suited than Perl. Abigail
Re: Re: Perl Destroys Interview Question by redsquirrel (Hermit) on Jan 13, 2004 at 15:17 UTC
Amen to that! The more languages I learn, the more I can see the strengths and weaknesses of each language.
Re: Re: Perl Destroys Interview Question by Rhose (Priest) on Jan 13, 2004 at 16:48 UTC
Your solution also breaks down if there is punctuation in the file. (OS: HP-UX 11.0) File: `This is a test file. How many unique words are in this file? Do you know? Does the file contain more than ten words?` Results: `1 1 a 1 are 1 contain 1 do 1 does 1 file 1 file. 1 file? 1 how 1 in 1 is 1 know? 1 many 1 more 1 ten 1 test 1 than 1 the 2 this 1 unique 1 words 1 words? 1 you` Update: Changed the test file.
Re: Re: Re: Perl Destroys Interview Question by redsquirrel (Hermit) on Jan 13, 2004 at 20:00 UTC
Here are the requirements I was given...
Program Purpose: The goal of the program is to count the occurrences of all words in a file, and write this count into a new file.
Requirements: The input file will contain 1 word per line (lines will be terminated by the newline character), and the file will contain an arbitrary number of lines. The file will be terminated by an end-of-file character. The word count must be case insensitive, as there may be varying case throughout the file. The output file must write each word once, and include the number of occurrences of that word on the same line. The lines in the output file must be sorted in ascending order.
Sample input: Chicago Paris chicago London red blue Green Red REd london
Sample output: blue;1 Chicago;2 Green;1 London;2 Paris;1 red;3
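For readers following along, here is a minimal sketch of one way to meet the requirements as quoted above. It is not the code redsquirrel submitted. The sub name `count_words` is made up for illustration, and two choices are assumptions read off the sample output and not stated in the requirements: the first spelling seen is the one reported, and the sort is on the lowercased word.

```perl
use strict;
use warnings;

# Count words case-insensitively, one word per input line.
# Returns "word;count" strings sorted by the lowercased word.
# Assumption: the first spelling seen is the one reported (inferred
# from the sample output, not stated in the requirements).
sub count_words {
    my @lines = @_;
    my (%count, %first_seen);
    for my $line (@lines) {
        chomp(my $word = $line);
        next if $word eq '';              # ignore blank lines
        my $key = lc $word;               # case-insensitive key
        $count{$key}++;
        $first_seen{$key} //= $word;      # keep the first spelling (needs Perl 5.10+)
    }
    return map { "$first_seen{$_};$count{$_}" } sort keys %count;
}

# The sample input from the requirements, one word per line.
my @sample = map { "$_\n" }
    qw(Chicago Paris chicago London red blue Green Red REd london);
print "$_\n" for count_words(@sample);
```

Run on the sample input, this prints the six lines of the sample output, one per line. Reading from and writing to real files is left out to keep the sketch short.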
Re: Re: Re: Re: Perl Destroys Interview Question by mr_mischief (Monsignor) on Jan 13, 2004 at 22:51 UTC
Re: Re: Re: Re: Re: Perl Destroys Interview Question by ihb (Deacon) on Apr 11, 2004 at 23:33 UTC
Re: Perl Destroys Interview Question by Abigail-II (Bishop) on Jan 13, 2004 at 16:59 UTC
That just depends on how a word is defined. Which the OP didn't. And considering the suggestions on how to fix the OP's solution (split with no arguments/-a without a -F), I wasn't the only one taking the not uncommon "non-whitespace" definition. But I'd like to see the version you would write during a job interview. Make sure you take into account punctuation, Unicode and words like `O'Reilly` and `home-brew`. Abigail
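To make the point about definitions concrete, here is a rough sketch of one possible definition of a word: a run of Unicode letters, optionally joined by single apostrophes or hyphens. This is an illustrative choice, not a definition anyone in the thread proposed, and `words_in` is a hypothetical name. It also assumes the text has already been decoded into Perl characters (for example by reading through an `:encoding(UTF-8)` layer); digits and typographic apostrophes are not treated as part of a word.

```perl
use strict;
use warnings;

# Return the lowercased "words" in a string, where a word is a run of
# Unicode letters optionally joined by single ' or - characters.
# Assumption: $text holds decoded characters, not raw bytes.
sub words_in {
    my ($text) = @_;
    return map { lc $_ } $text =~ /\p{L}+(?:['-]\p{L}+)*/g;
}

my %count;
$count{$_}++ for words_in("Does the file contain O'Reilly? The home-brew file.");
print "$_;$count{$_}\n" for sort keys %count;
```

With this definition `file.` and `file?` both count as `file`, while `O'Reilly` and `home-brew` stay whole. Other definitions are just as defensible, which is the point being made above.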
Re: Perl Destroys Interview Question by Zaxo (Archbishop) on Jan 12, 2004 at 23:23 UTC
What mr_mischief says, which can be fixed by replacement with `$words{$_}++ for split;` (no chomp needed). Also, I'd prefer an output loop which didn't construct a potentially long list of keys to iterate. Something like this: `while ($_ = each %words) { print $_, ';', $words{$_}, $/; }` I like to name my hashes singular for their values, not their keys. That makes the doc-suggested pronunciation work: `$count{'foo'}` is "count of foo" and so on. After Compline, Zaxo
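A small sketch of the `each`-based output loop described above, written with list-context `each` so the key and value arrive together. The helper name `for_each_count` and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Call $callback->($word, $n) for every entry of the hash, using each
# so that no full list of keys is built up front.
sub for_each_count {
    my ($count, $callback) = @_;
    keys %$count;    # void-context keys resets the hash iterator
    while (my ($word, $n) = each %$count) {
        $callback->($word, $n);
    }
    return;
}

my %count = (blue => 1, chicago => 2, red => 3);
for_each_count(\%count, sub { print "$_[0];$_[1]\n" });
```

The trade-off is that `each` walks the hash in its internal order, so this only helps when sorted output is not needed. The requirements quoted elsewhere in the thread ask for sorted lines, which brings back `sort keys` and the key list with it.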
Re: Re: Perl Destroys Interview Question by redsquirrel (Hermit) on Jan 13, 2004 at 15:13 UTC
I agree, I like the name `%count` better than `%words`. The hash (Map) in my Java solution was named `wordCount`.
Re: Perl Destroys Interview Question by mr_mischief (Monsignor) on Jan 12, 2004 at 22:56 UTC
This doesn't count words in a file. It almost counts unique lines in a file. What it actually does is list each unique line in a file and the number of times it occurs. This is useful in some situations, and I'm sure it's quicker to do in Perl than in Java. It's hardly a case-insensitive word count. Is this exactly the code you submitted to solve their problem, or did you retype this from memory? Christopher E. Stith
Re: Re: Perl Destroys Interview Question by redsquirrel (Hermit) on Jan 13, 2004 at 15:02 UTC
I copy/pasted this code. I didn't re-type it. Why do you ask?
Re: Perl Destroys Interview Question by Anonymous Monk on Jan 12, 2004 at 23:00 UTC
While you were at it you should have used strict, or else a one-liner would have served the purpose just the same. `perl -lane '$w{lc $_}++ for @F;END{print for sort keys %w}' text.txt`
Re: Re: Perl Destroys Interview Question by redsquirrel (Hermit) on Jan 13, 2004 at 15:09 UTC
I am usually a strict zealot, but in this case, I felt it would take a little away from the conciseness of the solution. So I consciously left it out. A one-liner certainly would have served the purpose, but it wouldn't have been as readable. This was an interview question, not a Perl Golf competition. :-)
Re: Perl Destroys Interview Question by LAI (Hermit) on Jan 12, 2004 at 22:53 UTC
Well done, redsquirrel. This seems to point out what most Java-Perl holy wars miss: that for certain applications Perl is far more useful than Java (and, by extension, vice versa). LAI `__END__`

