Start with this:

#!/usr/bin/perl
use strict;
use warnings;

# pseudocode from here on
# open the file with categories
# read the categories in (probably to a hash where the key is the category and the value is 0)
# close the categories file (you have your hash, you don't need to read from the file anymore)
# open the file with 3000k entries
# read the file line by line
# for each line read
#   trim to the first 8 characters
#   look for that value in the hash keys
#   increment the value of the hash key that is matched, if any
# you now have a hash with category as the key, and the number found as the value
# you should be able to figure out how to find the top 100 values and print out the key and value for each of them or store them to file

You can do this assignment with nothing but the basic Perl functionality.
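
For what it's worth, here is one way the steps above might look in core Perl. It is only a sketch under some assumptions: the category file holds one 8-character category per line, and the sub names (load_categories, count_entries, top_n) and the sample data are made up for illustration. The data is read from in-memory strings so the script runs as-is; swap in real file names for your own files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one category per line into a hash, each count starting at 0.
# (Assumption: one category per line, no surrounding whitespace.)
sub load_categories {
    my ($fh) = @_;
    my %count;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line eq '';
        $count{$line} = 0;
    }
    return \%count;
}

# Count entries whose first 8 characters match a known category.
sub count_entries {
    my ( $fh, $count ) = @_;
    while ( my $line = <$fh> ) {
        chomp $line;
        my $key = substr $line, 0, 8;
        $count->{$key}++ if exists $count->{$key};
    }
    return $count;
}

# Return up to $n category names, highest count first (ties by name).
sub top_n {
    my ( $count, $n ) = @_;
    my @sorted =
      sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
    $n = @sorted if $n > @sorted;
    return @sorted[ 0 .. $n - 1 ];
}

# Hypothetical sample data, held in memory so the sketch runs as-is.
my $categories = "AAAA0001\nBBBB0002\nCCCC0003\n";
my $entries =
  "AAAA0001 first\nBBBB0002 x\nAAAA0001 second\nZZZZ9999 unknown\n";

open my $cat_fh, '<', \$categories or die "Cannot open categories: $!";
my $count = load_categories($cat_fh);
close $cat_fh;

open my $ent_fh, '<', \$entries or die "Cannot open entries: $!";
count_entries( $ent_fh, $count );
close $ent_fh;

# Print the top 100 (or fewer) categories with their counts.
print "$_ $count->{$_}\n" for top_n( $count, 100 );
```

With the sample data this prints AAAA0001 2, then BBBB0002 1, then CCCC0003 0. The ZZZZ9999 line is ignored because it is not a key in the hash.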

Run the program under the debugger (perl -d) and learn to use that tool to examine what those hashes and any other variables look like. It is quick to pick up if you only need basic inspection for self-enlightenment.

perldoc is your friend. Pages like perldsc, perlop, and perlfunc will all help you solve this pretty quickly, and much of the time they include example code.

Hope you find this helpful...

Update:

Note that by using a hash, you eliminate the possibility of duplicate categories, which simplifies the job and possibly makes it more efficient.

Restated the increment step for clarity (I hope)

...the majority is always wrong, and always the last to know about it...

Insanity: Doing the same thing over and over again and expecting different results...

A solution is nothing more than a clearly stated problem...otherwise, the problem is not a problem, it is a fact


In reply to Re: Perl text processing by wjw
in thread Perl text processing by biboshakan
