
Re: Perl text processing

by wjw (Priest)
on Jun 09, 2014 at 15:31 UTC ( [id://1089287]=note )


in reply to Perl text processing

Start with this:

#!/usr/bin/perl
use strict;
use warnings;

# pseudo code from here on
# open the file with categories
# read the categories in (probably to a hash where the key is the category and the value 0)
# close the categories file (you have your hash, you don't need to read from the file anymore)
# open the file with 3000k entries
# read the file line by line
# for each line read
#   trim to the first 8 characters
#   look for that value in the hash keys
#   increment the value of the hash key that is matched if any
# you now have a hash with category as the key, and the number found as the value
# you should be able to figure out how to find the top 100 values and print out the key and value for each of them or store them to file

You can do this assignment with nothing but the basic Perl functionality.
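
The steps listed above can be sketched with core Perl only. This is a rough illustration, not the one true answer: the sub names count_categories and top_n, the sample category strings, and the in-memory arrays standing in for the two files are all made up here. A real script would open each file and read it line by line (while (my $line = <$fh>) { ... }) instead of using arrays.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how many entries start with each known category.
# $categories: array ref of category strings (assumed 8 characters long)
# $entries:    array ref of entry lines (stand-in for the big file)
sub count_categories {
    my ($categories, $entries) = @_;

    # Key = category, value = running count, starting at 0.
    my %count = map { $_ => 0 } @$categories;

    for my $line (@$entries) {
        my $prefix = substr $line, 0, 8;    # trim to the first 8 characters
        # Only count prefixes that are known categories.
        $count{$prefix}++ if exists $count{$prefix};
    }
    return \%count;
}

# Return up to $n [category, count] pairs, highest count first.
# Ties are broken alphabetically so the order is stable.
sub top_n {
    my ($count, $n) = @_;
    my @sorted = sort { $count->{$b} <=> $count->{$a} or $a cmp $b }
                 keys %$count;
    $n = @sorted if $n > @sorted;
    return map { [ $_, $count->{$_} ] } @sorted[ 0 .. $n - 1 ];
}

# Hypothetical sample data in place of the two files.
my @categories = qw(AAAAAAAA BBBBBBBB CCCCCCCC);
my @entries    = (
    'AAAAAAAA first entry',
    'BBBBBBBB second entry',
    'AAAAAAAA third entry',
    'ZZZZZZZZ not a known category',
);

my $count = count_categories(\@categories, \@entries);
printf "%s %d\n", @$_ for top_n($count, 2);
```

For the assignment itself you would call top_n with 100 rather than 2, and either print the pairs or write them to a file.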

Run the program under the debugger (perl -d) and use it to examine what those hashes and any other variables look like. It is a quick tool to pick up if all you need is basic inspection for self-enlightenment.

perldoc is your friend. Pages like perldsc, perlop, and perlfunc will all help you solve this pretty quickly, and they include example code much of the time.

Hope you find this helpful...

Update:

Note that by using a hash, you eliminate the possibility of duplicate categories, which simplifies the code and possibly makes it more efficient.
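
A small sketch of that point, assuming a category list that accidentally contains the same entry twice (the category names are invented for the example). Loading the list into a hash leaves one key per distinct category:

```perl
use strict;
use warnings;

# Hypothetical category list with one duplicate entry.
my @categories = qw(ALPHA001 BETA0002 ALPHA001);

# Hash keys are unique, so the duplicate collapses into a single key.
my %count = map { $_ => 0 } @categories;

my @unique = sort keys %count;
print scalar(@unique), " unique categories: @unique\n";
```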

Restated the increment step for clarity (I hope)

...the majority is always wrong, and always the last to know about it...

Insanity: Doing the same thing over and over again and expecting different results...

A solution is nothing more than a clearly stated problem...otherwise, the problem is not a problem, it is a fact
