PerlMonks  

Re: What's the best way to do a pattern search like this?

by CharlesClarkson (Curate)
on Jul 20, 2001 at 10:58 UTC


in reply to What's the best way to do a pattern search like this?

Some things to ponder:

How should the algorithm handle hyphenated words? Should pre-paid become pre and paid or remain pre-paid? Will any words wrap to the next line using a hyphen?
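To make the hyphen question concrete, here is a minimal sketch of both choices. The sub name `split_words` and its `$keep_hyphens` flag are made up for illustration, and a "word" is assumed to be a run of ASCII letters and digits; the original question may need a different definition.

```perl
use strict;
use warnings;

# Hypothetical helper: return the words found in a string.
# With a true $keep_hyphens, "pre-paid" stays one word;
# otherwise it comes back as "pre" and "paid".
# Call it in list context so the /g match returns every word.
sub split_words {
    my ($text, $keep_hyphens) = @_;
    return $keep_hyphens
        ? $text =~ /[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*/g
        : $text =~ /[A-Za-z0-9]+/g;
}

print join('|', split_words('a pre-paid card', 1)), "\n";   # a|pre-paid|card
print join('|', split_words('a pre-paid card', 0)), "\n";   # a|pre|paid|card
```

Neither pattern deals with a word that was hyphenated only because it wrapped at the end of a line. That case needs the two lines joined first, and the data alone may not say whether the hyphen belongs in the word.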

Are there any slang or shortcut words in the file? How should b4 be handled?

Is the file short or long? Should the algorithm read the entire file into memory or would it be better to process each line?
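If the file turns out to be long, processing it a line at a time might look something like the sketch below. It assumes the goal is to tally words, which may not match the original question, and the sub name `count_words_by_line` is hypothetical. The demo reads from an in-memory filehandle so it runs without a data file.

```perl
use strict;
use warnings;

# Hypothetical helper: tally words one line at a time, so only
# the current line of text (plus the tally) is held in memory.
sub count_words_by_line {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        # Letters and digits both count, so a shortcut like "b4" is one word.
        $count{ lc($_) }++ for $line =~ /[A-Za-z0-9]+/g;
    }
    return \%count;
}

# Demo on an in-memory filehandle; a real script would open the file instead.
my $sample = "The cat\nsaw the dog b4\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $count = count_words_by_line($fh);
close $fh;

print "$_ $count->{$_}\n" for sort keys %$count;
```

Reading the whole file at once (for example with `local $/;` before the read) is simpler when a match can span a line break, but the entire text then has to fit in memory.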

How might you handle dates such as 500 A.D. or c. 1500 bc?

And what about other abbreviations: Mr. Jr. Ave. etc. e.g.
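One way to keep abbreviations in one piece is to try a list of known ones before falling back to the generic word pattern. This is only a sketch: the list below is illustrative rather than complete, and the sub name `tokens` is hypothetical.

```perl
use strict;
use warnings;

# Illustrative list only; a real one would have to come from the data.
my @abbrev = ('A.D.', 'Mr.', 'Jr.', 'Ave.', 'etc.', 'e.g.');

# Escape the dots and put longer entries first, so a longer
# abbreviation is never cut short by a shorter one.
my $abbrev_re = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @abbrev;

# Hypothetical helper: a known abbreviation (not preceded by a letter
# or digit) is one token; otherwise take a run of letters and digits.
sub tokens {
    my ($text) = @_;
    return $text =~ /(?<![A-Za-z0-9])(?:$abbrev_re)|[A-Za-z0-9]+/g;
}

print join('|', tokens('Mr. Smith, Jr. lived c. 500 A.D.')), "\n";
# Mr.|Smith|Jr.|lived|c|500|A.D.
```

The sketch cannot tell whether the period after an abbreviation also ends the sentence, and anything missing from the list (the c. above, for instance) loses its period.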


HTH,
Charles K. Clarkson