PerlMonks  

Fast shared data structures

by gildir (Pilgrim)
on Nov 08, 2001 at 22:09 UTC ( [id://124115]=perlquestion )

gildir has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I was working on a mod_perl templating system. As speed is the primary issue, I read and pre-process all template data at server start-up. The pre-processed templates are stored in the memory of the Apache master server as one huge Perl hash, and I let the OS copy-on-write do efficient sharing in the subprocesses. All works fine so far.
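The preload-then-fork pattern described above can be sketched roughly as follows. This is a minimal illustration, not the poster's code: `%TEMPLATES`, `preload_templates`, `render`, and `render_in_child` are hypothetical names, and a plain `fork` stands in for Apache spawning its children.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical template store, filled once in the parent before forking.
our %TEMPLATES;

sub preload_templates {
    # Stand-in for reading and pre-processing real template files.
    %TEMPLATES = (
        header => { tag => 'h1', text => 'Welcome' },
        footer => { tag => 'p',  text => 'Bye' },
    );
    return scalar keys %TEMPLATES;
}

sub render {
    my ($name) = @_;
    my $t = $TEMPLATES{$name} or return '';
    return "<$t->{tag}>$t->{text}</$t->{tag}>";
}

# Fork one child, let it render from the inherited hash,
# and hand its output back to the parent through a pipe.
sub render_in_child {
    my ($name) = @_;
    pipe(my $reader, my $writer) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        close $reader;
        print {$writer} render($name);
        close $writer;          # flushes the pipe
        POSIX::_exit(0);        # skip END blocks in the child
    }
    close $writer;
    my $out = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    return $out;
}

preload_templates();
print render_in_child('header'), "\n";
```

The child never loads the templates itself; it only reads pages inherited from the parent. Note that even read-only use of Perl values can update reference counts or flags, so some pages may still become unshared over time.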

But now I want to allow users of my system to change templates while the server is running, a kind of 'hot-swap' template feature. I know how to detect changed templates, how to re-parse them and all such things. The problem is memory. If I do the template re-parsing in every Apache child process, memory consumption will be $size_of_template*$num_of_processes, because copy-on-write sharing will not take place here. A most undesirable situation if the size of the parsed templates is 10MB ...

Some kind of shared memory came to mind at this point and I began a search. I found IPC::Shareable and it looks ideal at first glance. But alas, it does freeze/thaw cycles all the time and that will affect performance very badly. I want the templating system to be very fast, and in-memory native Perl hashes seem to be the best solution.
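To make the freeze/thaw concern concrete, here is a rough sketch that contrasts a native hash lookup with a lookup that first rebuilds the structure from a serialized blob. Storable's `freeze` and `thaw` stand in for the serialization step the post describes; the `%tree` data and the `lookup_native` and `lookup_thawed` helpers are made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical pre-parsed template data.
my %tree;
$tree{"tmpl$_"} = { name => "tmpl$_", nodes => [ 1 .. 50 ] } for 1 .. 100;

# What a serializing shared store holds: one frozen blob.
my $frozen = freeze(\%tree);

# Native access: a plain hash lookup.
sub lookup_native {
    my ($key) = @_;
    return $tree{$key}{name};
}

# Serialized access: the whole structure is rebuilt before the lookup.
sub lookup_thawed {
    my ($key) = @_;
    my $copy = thaw($frozen);
    return $copy->{$key}{name};
}

print lookup_native('tmpl7'), " ", lookup_thawed('tmpl7'), "\n";
```

Both paths return the same value, but the second one pays for deserializing all 100 entries on every call. Wrapping the two subs in the core Benchmark module would show the gap on a given machine.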

Is there any way to share a Perl hash directly between Apache processes? Most of the time I will be reading that hash. When writing, all operations on the hash may block; that is no problem. Can anyone please point my nose in the right direction?

Or maybe my thoughts have been going the wrong way from the very beginning ....

Replies are listed 'Best First'.
Re: Fast shared data structures
by perrin (Chancellor) on Nov 09, 2001 at 00:36 UTC
    First, don't do it. Don't write a templating system. At least read my article on the subject and see if there's something out there that meets your needs.

    Now, as for the shared memory, the question is what your data looks like. If you're compiling the templates to code refs, there is no way to share them. IPC::Shareable won't do it, and neither will anything else currently available. If they're not code refs, you should look at MLDBM::Sync or Cache::Cache. These are currently the fastest options available for sharing data safely across processes.
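The difference between plain data and code refs can be seen with Storable, which is assumed here as a representative serializer. This is only a sketch; `try_freeze` is a hypothetical helper. By default Storable refuses to store a code ref. Later Storable releases added `$Storable::Deparse` and `$Storable::Eval`, which store a sub as source text and recompile it on the other side, so even then the compiled code itself is not shared.

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Try to serialize a structure; return the blob (undef on failure)
# together with the error message, if any.
sub try_freeze {
    my ($data) = @_;
    my $blob = eval { freeze($data) };
    my $err  = defined $blob ? '' : "$@";
    return ($blob, $err);
}

my ($ok_blob)        = try_freeze({ title => 'plain data' });
my ($bad_blob, $err) = try_freeze({ handler => sub { 'compiled template' } });

print "plain hash:   ", (defined $ok_blob  ? "frozen" : "failed"), "\n";
print "with coderef: ", (defined $bad_blob ? "frozen" : "failed: $err"), "\n";
```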

      First of all, I did RTFM and found nothing suitable. My data is XML. I started with the standard, XSLT, and found it terribly slow. I have found that anything that is DOM-based or string-parsing based is too slow for my needs.

      So I have used XML::Grove to represent and build my XML trees in the application. It is a simple, fast and clean way to do it. As no suitable XML templating system was available for XML::Grove, I have written a very simple but powerful templating system with embedded Perl commands. It is very fast indeed, because crawling an XML::Grove tree is a fast operation: no text parsing, no extensive searches, no overkill DOM interface overhead.
      I pre-parse all the templates on Apache start-up and keep them in memory. Embedded-Perl templates are stored as coderefs, indeed.

      From 10,000 feet, my application looks like:

        +-------------+ XML  +----------------+ HTML/WML/whatever
        | Application |----->| tmpl processor |----->
        +-------------+      +----------------+
                                     ^
                                     |
                                 templates
      
      Note that I represent XML trees as Perl hashes, not as strings, so any IPC between the application and the templating modules would make the whole system ineffective.

      But if there is no module to share that data (as you indicated), could anyone give me an idea of how to share a coderef between two processes? I thought that P-code is the same, no matter which Perl interpreter it runs in. I need this only for processes with the same (or very similar) symbol tables: processes forked from the same parent (typical Apache/mod_perl processes).

        AxKit has made progress on the speed of XML processing. I think it uses a processor from Gnome now. You can also use Template Toolkit with the XML plugin or XML::Simple.

        There's no way to share a coderef short of actually hacking some XS code, and even then it may not be possible. If it were easy, I assume that Storable would already do it. However, it sounds like you're saying that your data is hashes, not code refs. Those can be serialized. You could do a multi-level cache, with an in-memory version and a shared version serialized on disk. Take a look at MLDBM::Sync for an example, or HTML::Template.
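The multi-level cache idea can be sketched along these lines. It is a minimal illustration using only core Storable calls, not the design of MLDBM::Sync or HTML::Template: `publish`, `fetch`, and `%memory_cache` are hypothetical names, and the file's modification time is assumed to be a good enough change signal (it has one-second granularity on many systems).

```perl
use strict;
use warnings;
use Storable qw(lock_nstore lock_retrieve);
use File::Temp qw(tempdir);

# Level 1: per-process cache, path => { mtime => ..., data => ... }
my %memory_cache;

# Writer side: publish freshly parsed template data to the shared file
# (level 2), holding an exclusive lock while writing.
sub publish {
    my ($path, $data) = @_;
    lock_nstore($data, $path);
    return;
}

# Reader side: reuse the in-memory copy unless the file on disk changed.
sub fetch {
    my ($path) = @_;
    my $mtime = (stat $path)[9];
    return unless defined $mtime;      # nothing published yet
    my $entry = $memory_cache{$path};
    if (!$entry || $entry->{mtime} != $mtime) {
        # Stale or missing: thaw once, then serve from memory.
        $entry = { mtime => $mtime, data => lock_retrieve($path) };
        $memory_cache{$path} = $entry;
    }
    return $entry->{data};
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/templates.sto";
publish($file, { header => { tag => 'h1' } });
print fetch($file)->{header}{tag}, "\n";
```

The thaw cost is paid only when the file changes, so steady-state reads are native hash lookups. Each process still keeps its own thawed copy in memory, so this trades the per-access serialization cost for per-process memory.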

Re: Fast shared data structures
by Masem (Monsignor) on Nov 08, 2001 at 22:33 UTC
    I'm not sure about efficient sharing of memory between processes, but if one assumes this isn't possible, then the solution, to me, seems to be some other process that caches and *builds* template requests when given the file and the data to be filled in, and returns the text stream that's needed. That is, the first thought that came to mind was backending this with a database (a single server process), in which you replicate what you have with the shared hash. Unless the database server has sufficient ability to mimic everything else that you need, you'll then need a Perl server script that sits on top of this; you can pass the raw data from your Apache children to this process via either XML (slow) or Storable freeze/thaw (reasonably faster). Part of this request would be the template file, on which you should be able to do a timestamp check and recache if necessary. Then deliver back the text stream, and you're all set.

    Of course, if you then think about it, you don't need the database server anymore, as your perl server can simply act as it.

    I think the key point is that if you want to do this effectively, you need a separate process, unassociated with the web server, in order to retain only one memory store while keeping reasonable speed as well as the ability to hot-swap.
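The separate-process idea can be sketched as below. Everything here is an assumption for illustration: a forked child on a `socketpair` stands in for a standalone daemon, a one-line text protocol replaces the Storable or XML payload mentioned above, and the `[% name %]` placeholder syntax, `%TEMPLATES`, `serve`, `start_server`, and `request` are all hypothetical.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use POSIX ();

# Filled only inside the server process, so there is one copy in memory.
my %TEMPLATES;

sub render {
    my ($name, %vars) = @_;
    return 'ERROR unknown template'
        unless defined $name && defined $TEMPLATES{$name};
    my $text = $TEMPLATES{$name};
    $text =~ s/\[% (\w+) %\]/defined $vars{$1} ? $vars{$1} : ''/ge;
    return $text;
}

# Server loop: one request per line ("template key=value key=value"),
# one rendered line back.
sub serve {
    my ($sock) = @_;
    %TEMPLATES = ( greeting => 'Hello, [% name %]!' );
    while (my $line = <$sock>) {
        chomp $line;
        my ($name, @pairs) = split ' ', $line;
        my %vars = map { split /=/, $_, 2 } grep { /=/ } @pairs;
        print {$sock} render($name, %vars), "\n";
    }
    return;
}

# Start the server as a child and return the client end of the socket.
sub start_server {
    socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";
    $_->autoflush(1) for $client, $server;
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        close $client;
        serve($server);
        POSIX::_exit(0);
    }
    close $server;
    return ($client, $pid);
}

# Client side: send one request line and read one reply line.
sub request {
    my ($client, $line) = @_;
    print {$client} "$line\n";
    my $reply = <$client>;
    chomp $reply if defined $reply;
    return $reply;
}

my ($client, $pid) = start_server();
print request($client, 'greeting name=Pinky'), "\n";
close $client;
waitpid($pid, 0);
```

Hot-swapping then only has to happen inside `serve`, for example by checking template timestamps before each render. The price is a socket round trip per request.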

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important
