chiburashka has asked for the wisdom of the Perl Monks concerning the following question:
Replies are listed 'Best First'.
Re: makeing refering faster ?
by perldeveloper (Scribe) on Aug 15, 2004 at 15:20 UTC
As you can see, I first build a hash keyed by the first word of every sentence, where each value is a reference to an array holding the indices of the sentences which start with that word. Then, for every sentence, I build an array of these hash values, one for each word which happens to start any of the sentences (including the one under scrutiny). I believe this code is a better starting point for optimization: my code ran within a second on a 3,000-line file. A few other remarks:
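Since the code itself is not reproduced here, the two steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample sentences and the names `%starts_with` and `@refers_to` are hypothetical, and words are assumed to be separated by whitespace with no case folding or punctuation handling.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real input file is not shown in the thread.
my @sentences = (
    'cats chase mice',
    'mice eat cheese',
    'cheese attracts mice',
    'dogs chase cats',
);

# Step 1: map each first word to the indices of the sentences starting with it.
my %starts_with;
for my $i (0 .. $#sentences) {
    my ($first) = split ' ', $sentences[$i];
    next unless defined $first;    # skip empty lines
    push @{ $starts_with{$first} }, $i;
}

# Step 2: for every sentence, collect the index lists (array references)
# for each of its words that also starts some sentence, itself included.
my @refers_to;
for my $i (0 .. $#sentences) {
    $refers_to[$i] = [
        map  { $starts_with{$_} }
        grep { exists $starts_with{$_} }
        split ' ', $sentences[$i]
    ];
}

# Report, for each sentence, the indices of the sentences its words point to.
for my $i (0 .. $#sentences) {
    my @targets = map { @$_ } @{ $refers_to[$i] };
    print "$i: $sentences[$i] -> @targets\n";
}
```

Each sentence is read once to build the hash and once more to look its words up, so the work grows with the total number of words rather than with the square of the number of sentences.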
Re: makeing refering faster?
by Zero_Flop (Pilgrim) on Aug 15, 2004 at 17:18 UTC
You really need to get a good book on Perl. Start on page one and work your way through. Posting bad code over and over again is not going to get you anywhere! There are numerous problems with this code: no strict, no warnings. It all boils down to this: if you want to learn, we can help you, but you have to learn to walk before you run.

Zero

janitored by ybiC: Moved from reaped parent thread into this'un, for better site searching
Re: makeing refering faster ?
by wfsp (Abbot) on Aug 15, 2004 at 14:24 UTC
I suspect there may be some typos. If you do that it would be easier to help.
Re: makeing refering faster ?
by CountZero (Bishop) on Aug 15, 2004 at 19:34 UTC
CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
by bart (Canon) on Aug 15, 2004 at 21:33 UTC
Re: makeing refering faster ?
by graff (Chancellor) on Aug 17, 2004 at 08:23 UTC
ps : this script is supposed to get all the lines from a file and refer each word to the sentence that starts with that word (and, there aren't 2 sentences that start identically).

Without that, there'd be no hope of helping with the problem. But even with that, there's still not quite enough to go on. (Looks like perldeveloper made a lucky guess, but I confess that I am still confused.)

Does the input data file really contain exactly one "sentence" per line? Are you certain that the "words" in each sentence are always separated by exactly a single space character? Are the words in "mixed case", and do they include punctuation marks? (And does this have an effect on what you are trying to do?) Why should it matter if a sentence contains a "word" that consists of the single letter "v"?

Let's suppose a particular word (e.g. "bar") occurs at the beginning of one sentence (e.g. sentence #23), and also occurs in the middle or at the end of 4 other sentences (e.g. #5, #12, #47, #69). What do you want to accomplish with regard to this word? Locate just the one sentence that begins with "bar"? Locate just the other four sentences that contain "bar"? Locate all five sentences (and identify the one that begins with "bar")?

What do you want to do with words that only occur in the middle or at the end of sentences but never at the beginning of any sentence? Ignore them?

How you answer those questions will determine how you should read through the sentences and words, what sort of data structure you should create from the input data, and how you would use that data structure after you've built it.

As for the code you posted at the start of this thread, the reason it takes so long for more sentences is the nesting of your "for" loops. As you have learned from experience, this sort of approach "does not scale well" to large numbers of sentences. But to work out a good approach, you need to clarify your goals.
You seem to be content with perldeveloper's solution (assuming his additional reply makes sense to you), but it's not clear to me that it is the best approach, or that it does what you really want, mostly because you haven't provided a clear description of what you really want.
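To make the scaling point concrete, here is a rough sketch comparing a nested-loop scan with a single hash lookup per word. It is not the code from the start of the thread, which is not reproduced here. The subroutine names, the sample sentences, and the normalization in `words_of` (lower-casing, stripping surrounding punctuation) are all assumptions made for illustration, and one possible answer to the questions above rather than the intended one.

```perl
use strict;
use warnings;

# Assumed normalization: split on whitespace, lower-case each word,
# strip leading/trailing punctuation, and drop anything left empty.
sub words_of {
    my ($line) = @_;
    return grep { length }
           map  { my $w = lc; $w =~ s/^\W+|\W+$//g; $w }
           split ' ', $line;
}

# Nested-loop approach: every word triggers a rescan of all sentences,
# so the work grows roughly with the square of the sentence count.
sub link_by_scanning {
    my @sentences = @_;
    my @firsts = map { my ($f) = words_of($_); $f } @sentences;
    my @links;
    for my $i (0 .. $#sentences) {
        $links[$i] = [];
        for my $word (words_of($sentences[$i])) {
            for my $j (0 .. $#sentences) {
                push @{ $links[$i] }, $j
                    if defined $firsts[$j] && $firsts[$j] eq $word;
            }
        }
    }
    return \@links;
}

# Indexed approach: build the first-word hash once, then each word
# costs a single hash lookup.
sub link_by_index {
    my @sentences = @_;
    my %first_at;
    for my $j (0 .. $#sentences) {
        my ($first) = words_of($sentences[$j]);
        push @{ $first_at{$first} }, $j if defined $first;
    }
    my @links;
    for my $i (0 .. $#sentences) {
        $links[$i] = [ map { @{ $first_at{$_} || [] } } words_of($sentences[$i]) ];
    }
    return \@links;
}

# Hypothetical sample input, including an empty line.
my @sample = ('Bar opens at nine.', 'Meet me at the bar', 'Nine is too early', '');
my $links  = link_by_index(@sample);
print "$_: @{ $links->[$_] }\n" for 0 .. $#sample;
```

Both subroutines return the same links; the difference is only in how the cost grows as the number of sentences increases.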