Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Dear Monks,
I am looking to build or use a search engine for my Internet Website. I searched here and at CPAN but found no useful results. (May be I am missing right terms in CPAN search). Search will be for only HTML files. Search engine has to search different set of files for queries coming from different customers. The platform could be Windows or Unix using Apache server (no mod_perl) if that matters.

Building Google like search engine would excellent but anything and everything that has good value would help. I don't want to put Google front-end on my website.

Thus, what I am looking for is exisiting search mechanism or CPAN modules that helps me to build a search engine. Of course, I am open to any other suggestions.

Thank you,
{artist}

Update: Please note that I do not have access to database for this purpose. My files are not changed. They are added/deleted on daily base. Volume: Around 10000 files

Update2:(20031114)

  • Volume : total around 200,000 files and 5000 files are added daily.. Some are removed at periodical interval.
  • Perlfect Issues:.No incremental/differential index. And it takes large amount of time to build the index. So cannot search on recently updated things. The data has to be published ASAP. So cannot wait for indexing to finish before publishing. If someone has idea: How to encorporate differential indexing with perlfect, I would really appreciate it.
  • Cannot use Google. The main reason is I have password protected sites and not everybody should be able see everybody else's contents. Think: -> Building a search system for mail accounts.

20031113 Edit by Corion: Fixed unclosed blockquote


In reply to Building a search engine by artist

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (7)
As of 2024-04-25 08:57 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found