PerlMonks  

Re^7: Running SuperSearch off a fast full-text index.

by creamygoodness (Curate)
on Jun 10, 2007 at 20:04 UTC (#620353)


in reply to Re^6: Running SuperSearch off a fast full-text index.
in thread Running SuperSearch off a fast full-text index.

Groovy. I've successfully checked out from the svn repository via https. I'll wait for you to set up trunk like a module distro. If you haven't got a name you prefer for the base module namespace over "MonkSearch", that would be my suggestion. BTW, I just had a glance at the code in your CPAN distros, and your stuff looks great.

My impulse is to conduct our discussions here on PerlMonks... what do you think?

--
Marvin Humphrey
Rectangular Research ― http://www.rectangular.com

Replies are listed 'Best First'.
Re^8: Running SuperSearch off a fast full-text index.
by dmitri (Priest) on Jun 10, 2007 at 20:13 UTC
    I've successfully checked out from the svn repository via https. I'll wait for you to set up trunk like a module distro. If you haven't got a name you prefer for the base module namespace over "MonkSearch", that would be my suggestion.
      MonkSearch works for me. Should it be App::MonkSearch maybe? Also, we may end up with more than just the indexers -- there will also be the web interface(s).
    I just had a glance at the code in your CPAN distros, and your stuff looks great.
      Except SQL::Tidy! Thanks ;)
    My impulse is to conduct our discussions here on PerlMonks... what do you think?
      In this thread?
      App::MonkSearch

      Sure, that'll work... Randy Kobes' CPAN search modules are under CPAN::Search::Lite, FWIW.

      In this thread?

      Probably not all in this thread. We might start a new thread called "MonkSearch - spider" to deal with spidering issues, for example.

      I think it's important to facilitate participation by anyone in the PerlMonks community who wants to join in. The downside is that we might end up creating more messages than the PerlMonks threading model is optimized for, but this isn't that big a project and I think the volume will be manageable.

      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
        It is a good point you make about other monks' participation. We can link to the threads we create from this thread's parent node. The only thing that baffles me is which category threads about MonkSearch development would belong to. They seem off-topic wherever we place them.
      I would also prefer discussion here, and then a summary on a wiki.

      OTOH, storing nodes locally in SQLite seems like overkill. With a good filesystem there is no reason to complicate the crawler with DBI code; just dump files on disk.


      2share!2flame...
        The reason I think that SQLite would be useful is that if we want to separate the spider from indexer, finding the articles to update in the index is as simple as
        SELECT * FROM ARTICLES WHERE LAST_UPDATED > $LAST_TIME_I_RAN
        instead of searching the filesystem. Stored on the filesystem, we will need code to
        • search,
        • store, and
        • update
        the documents. SQLite provides all of that for free. Want to move to a different machine? -- The database is a single file. Plus, who knows what other useful things SQLite's flexibility will allow us to do?
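A minimal sketch of the spider/indexer split described above, written against Python's standard sqlite3 module purely for illustration (the project itself would presumably go through Perl's DBI). The table layout, the column names `node_id` and `body`, and the helper names `open_store`, `store_article`, and `articles_since` are all assumptions; only the `LAST_UPDATED` comparison comes from the query quoted above.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the single-file article store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        " node_id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " last_updated INTEGER NOT NULL)"
    )
    return conn


def store_article(conn, node_id, body, last_updated):
    """Spider side: insert a new node or overwrite an existing one."""
    conn.execute(
        "INSERT OR REPLACE INTO articles (node_id, body, last_updated)"
        " VALUES (?, ?, ?)",
        (node_id, body, last_updated),
    )
    conn.commit()


def articles_since(conn, last_run):
    """Indexer side: fetch only the nodes changed since the previous run."""
    cur = conn.execute(
        "SELECT node_id, body, last_updated FROM articles"
        " WHERE last_updated > ? ORDER BY last_updated",
        (last_run,),
    )
    return cur.fetchall()
```

The spider would call `store_article` for each node it fetches, and the indexer would call `articles_since` with the timestamp of its previous run, so neither needs to know how the other works. Timestamps are plain integers here (for example, epoch seconds), which is also an assumption.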
Re^8: Running SuperSearch off a fast full-text index.
by dmitri (Priest) on Jun 10, 2007 at 22:08 UTC
    My impulse is to conduct our discussions here on PerlMonks... what do you think?
      I was poking around some other projects hosted on code.google.com, and another way to go is to create a google group for it -- they integrate nicely and seem to be an accepted way to communicate for a lot of code.google.com projects. You also get an email address.
      Right, that's the other option. I'd prefer to have the discussions here. educated_foo has already made a worthwhile contribution. If we go off on our own isolated corner of the 'net, we won't get the benefit of having our ideas challenged by this rich community.
      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
