Re: Site Suggestion

by chromatic (Archbishop)
on Jun 18, 2002 at 17:09 UTC ( [id://175431] )


in reply to Site Suggestion

Unlikely, for performance reasons. Super Search performs a much more detailed scan and hits the database much harder than Search does. By keeping the common case the less expensive one (albeit the less powerful), we keep the site responsive.

I sometimes use Google to search the Perl Monks archive. It works pretty well... they have the hardware we don't.
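
For what it's worth, a quick sketch of the sort of query URL that works, using the CPAN module URI::Escape and Google's site: operator (the search terms are just an example):

    use strict;
    use URI::Escape;   # uri_escape() handles the URL encoding

    my $terms = 'super search';   # hypothetical search terms
    my $q     = uri_escape("site:www.perlmonks.org $terms");
    print "http://www.google.com/search?q=$q\n";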

Replies are listed 'Best First'.
Re: Re: Site Suggestion
by u914 (Pilgrim) on Jun 18, 2002 at 18:48 UTC
    How about using Google's API within a search box here?

    The way I read their agreement, that might be OK, though it is limited to 1,000 searches a day.

    The search power of Google combined with the favorable signal-to-noise ratio here could be mighty fine indeed!

    There are lots of threads here related to Google, its API, and WWW::Google; just use the search box on "google api".

    just a thought...
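
    To make the idea concrete, here is a minimal sketch, assuming the CPAN module Net::Google (one wrapper around Google's SOAP API) and a placeholder license key; the method names follow its documentation as best I recall, so treat them as approximate:

        use strict;
        use Net::Google;

        # Placeholder key: the Google API issued per-account license
        # keys, each limited to 1,000 queries per day.
        my $google = Net::Google->new( key => '**your-key-here**' );

        my $search = $google->search();
        $search->query( 'site:www.perlmonks.org', 'golf' );   # hypothetical query
        $search->max_results(10);

        # results() returns a reference to a list of result objects;
        # title() is per Net::Google's docs, so double-check before relying on it.
        foreach my $result ( @{ $search->results() } ) {
            print $result->title(), "\n";
        }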

      This might not be a bad idea. You could cache the results so that when you hit the limit, you could just show the slightly staler cached results. Perhaps even do a search for each word of the query (after removing common words like "the") rather than for the whole string, then merge the results for display; a rough sketch of the caching side follows below.

      -Lee

      "To be civilized is to deny one's nature."

        I don't think the idea of merging results would work. It would take about as much power per search as a standard search engine system, where you effectively have a lookup table of which words appear in which documents and merge the document sets to answer a query (sometimes with relevancy weightings and other such funky things). And the limitation of only being able to refresh 1,000 words a day would mean that any cached per-word results you wanted to merge would soon get very out of date.
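
        For illustration, the "lookup table" described above is an inverted index; a toy version (hypothetical data, no relevancy weighting) might look like:

            use strict;

            my %index;   # word => { doc id => 1 }

            sub index_document {
                my ($doc_id, $text) = @_;
                $index{ lc $_ }{$doc_id} = 1 for $text =~ /(\w+)/g;
            }

            # AND-merge: keep only documents that contain every query word.
            sub search_words {
                my @words = map { lc } @_;
                my %seen;
                for my $word (@words) {
                    $seen{$_}++ for keys %{ $index{$word} || {} };
                }
                return grep { $seen{$_} == @words } keys %seen;
            }

            index_document( 175431, 'Super Search hits the database harder' );
            index_document( 175432, 'Google API searches are limited' );
            print join( ',', search_words( 'search', 'database' ) ), "\n";   # prints 175431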

        Any saving you made by not needing to index the documents yourself would be lost by the added network traffic and latency required to use an external site for this.

        The idea of using Google's API to offload the searches may be doable, but it would rely on there being no more than about 1,000 searches per day. And don't forget this needs to be scalable too, so fewer than 500 would be more realistic.

        Is there a Perl module that provides a nice, efficient search engine for a site? I don't think I've seen one, but I'd be amazed if one doesn't exist.

      I don't think that this is a good idea. Relying upon a crippled, experimental program which, if it succeeds, will most likely become commercial is unwise for PM, given its amount of traffic, limited budget, and emphasis on open source software.

      A fully operational Super Search will be a very useful tool. One would not want it beholden to Google's commercial whims.

      ()-()
       \"/
        `                                                     
      
