PerlMonks
Sharing data "cache" between forked processes

by Anonymous Monk
on Nov 23, 2018 at 11:31 UTC [id://1226220]

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm trying to implement/adopt a "cache" for frequently accessed data that is currently loaded from and written back to files on every access or modification.

Situation

The data loaded from files is represented in hashes (one hash per file). The primary purpose of the "cache" would be to hide the reading/writing of files from the rest of the program, so that data retrieval and modifications stay reasonably fast even when there are high disk loads. The program is a forking server, so the "cache" needs to be shared across multiple processes.

Since I don't see much overlap between traditional Perl caches (CHI, Cache, and similar) and my needs (essentially an in-(shared-)memory database that reads and writes entries to files under specific conditions), I don't think using those would be wise.

When trying to implement this "cache" I've considered using some sort of shared-memory module that could store a hash of hashes. On that front, I've looked into IPC::MMA and IPC::Shareable, but neither seems to fit the bill: IPC::MMA can only store scalars in its hash structures, so I can't nest the hashes, while IPC::Shareable risks conflicts with its 4-character glue (I need to share lots of relatively simple hashes) and might run out of usable shared-memory segments.

I also looked at in-memory databases, but I'm not sure how that would affect memory usage (I imagine anything retrieved from a table gets copied), and all the databases I've looked at would need a ramdisk, since they don't support in-memory connections from multiple processes.

Question

Primarily, I'd like to ask if you can recommend a Perl module that supports nesting hashes in shared memory and doesn't suffer from the limitations of IPC::Shareable. However, I'm also open to suggestions of alternative approaches to solving my main issue (the "cache").

Replies are listed 'Best First'.
Re: Sharing data "cache" between forked processes (MCE!)
by 1nickt (Canon) on Nov 23, 2018 at 14:37 UTC

    Hi, the correct solution depends on your specific needs, but marioroy's Perl Many-Core Engine offers several options. Please see MCE::Shared, MCE::Shared::Hash, MCE::Shared::Minidb, MCE::Shared::Cache.

    From what I understand from your post, you basically want a shared DB where individual keys can be handled as with a cache, but sub-keys can also be accessed. Presumably you also need to be able to search for a key or keys by the value(s) of a sub-key or sub-keys. You might like:

    use strict;
    use warnings;
    use feature 'say';
    use Data::Dumper;
    use MCE::Shared;

    my $db = MCE::Shared->minidb();

    my %hash = ( problem => 'foo', technique => 'blorgle', answer => 41 );
    my %junk = ( problem => 'bla', technique => 'blargle' );

    $db->hset( my_key => %hash );
    $db->hset( junkey => %junk );    # sorry for the bad pun

    my $pid = fork;
    die 'Fork failed' if not defined $pid;

    if ( $pid == 0 ) {    # child
        $db->happend( my_key => ( problem => 'bar' ) );
        $db->hincr( my_key => 'answer' );
        $db->hset( my_key => ( technique => 'frobnicate' ) );
        exit;
    }

    # parent
    wait;
    my @rows = $db->select_href( ':hashes', ':WHERE answer > 0' );
    say Dumper \@rows;
    __END__
    Output:
    $ perl monks/1226220.pl
    $VAR1 = [
              [
                'my_key',
                {
                  'answer' => 42,
                  'problem' => 'foobar',
                  'technique' => 'frobnicate'
                }
              ]
            ];

    (A note from the docs that helps explain the unfamiliar query syntax: "Several methods take a query string for an argument. The format of the string is described below. In the context of sharing, the query mechanism is beneficial for the shared-manager process. It is able to perform the query where the data resides versus the client-process grep locally involving lots of IPC.")

    Hope this helps!


    The way forward always starts with a minimal test.
Re: Sharing data "cache" between forked processes
by hippo (Bishop) on Nov 23, 2018 at 13:55 UTC
    all databases I've looked at would need a ramdisk, since they don't support in-memory connections from multiple processes.

    Doubtless I am misunderstanding here but does the memory engine of MariaDB not satisfy your requirements? I've used it for lower-latency stores in several projects (including FCGI-based access) without problems. Could you explain where this falls short for you?
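
    A minimal sketch of what hippo describes, assuming a running MariaDB server, DBD::mysql, and a database named cache_db (the table, credentials, and key names here are all hypothetical):

    ```perl
    #!/usr/bin/env perl
    # Hedged sketch: a MEMORY-engine table shared by forked workers.
    # Assumes a local MariaDB server and a database named 'cache_db'.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect( 'dbi:mysql:database=cache_db', 'user', 'password',
        { RaiseError => 1, AutoCommit => 1 } );

    # The MEMORY engine keeps the table's rows in server RAM, so every
    # connected process sees the same data without needing a ramdisk.
    $dbh->do(q{
        CREATE TABLE IF NOT EXISTS kv_cache (
            k VARCHAR(64) PRIMARY KEY,
            v TEXT
        ) ENGINE=MEMORY
    });

    # Write and read back a value; any process with a connection sees it.
    $dbh->do( 'REPLACE INTO kv_cache (k, v) VALUES (?, ?)',
        undef, 'answer', '42' );
    my ($v) = $dbh->selectrow_array(
        'SELECT v FROM kv_cache WHERE k = ?', undef, 'answer' );
    print "$v\n";
    ```

    Note that each forked child must open its own `$dbh`, since DBI handles cannot safely be used across a fork.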

Re: Sharing data "cache" between forked processes
by cavac (Parson) on Nov 23, 2018 at 12:52 UTC

    Really depends on your needs. But just to shamelessly plug my own stuff: Interprocess messaging with Net::Clacks

    Net::Clacks implements real-time messaging as well as a memory-only cache. Basically, if you read a file, you could just store() it in Clacks as Base64. Structures could be encoded with JSON::XS + Base64. At least, that's how I'm doing it.
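
    The JSON + Base64 flattening step can be sketched like this (a minimal example; JSON::PP, which ships with core Perl, stands in for JSON::XS, and the key names are made up):

    ```perl
    # Flatten a nested hash to a single Base64 string that can be stored
    # as a plain scalar in any text-only cache or messaging channel.
    use strict;
    use warnings;
    use JSON::PP qw(encode_json decode_json);
    use MIME::Base64 qw(encode_base64 decode_base64);

    my %record = ( problem => 'foo', nested => { answer => 42 } );

    # hash -> JSON -> Base64 ('' suppresses line breaks in the output)
    my $payload = encode_base64( encode_json( \%record ), '' );

    # ...store the payload under some key, then on retrieval reverse it:
    my $copy = decode_json( decode_base64($payload) );
    print $copy->{nested}{answer}, "\n";    # prints 42
    ```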

    If you want to handle the file loading/saving/deleting on the server side for some reason, it would be sort of trivial to implement. Just adding some flags to the OVERHEAD command handling in Net::Clacks::Server.pm should do the trick.

    perl -e 'use MIME::Base64; print decode_base64("4pmsIE5ldmVyIGdvbm5hIGdpdmUgeW91IHVwCiAgTmV2ZXIgZ29ubmEgbGV0IHlvdSBkb3duLi4uIOKZqwo=");'
Re: Sharing data "cache" between forked processes
by kschwab (Vicar) on Nov 23, 2018 at 13:53 UTC
Re: Sharing data "cache" between forked processes
by localshop (Monk) on Nov 26, 2018 at 05:43 UTC
    I've had good results using CHI, but I'd also first look at hippo's suggestion of solving this in the DB: almost all databases allow either pinning tables to memory or using an in-memory engine, and you can even put the backend DB's files on a RAM drive. If you're not interested in solving this at the persistence layer, though, I'd suggest looking at CHI along with the other suggestions.

    CHI has been around a long time, and although it hasn't seen much updating recently, it's proven in many production environments. There's also some interest in porting it to Perl 6, which I assume is a good thing.
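
    A hedged sketch of the CHI suggestion, using the file-backed driver so that forked processes share the cache through a common root_dir (the path, key, and stored hash here are all made up):

    ```perl
    # Minimal CHI sketch: a file-backed cache visible to parent and child.
    use strict;
    use warnings;
    use CHI;

    my $cache = CHI->new(
        driver   => 'File',
        root_dir => '/tmp/my_app_cache',    # hypothetical location
    );

    # CHI serializes nested structures transparently (Storable by
    # default), so a whole per-file hash can live under one key.
    $cache->set( 'config.ini' => { host => 'localhost', port => 8080 },
        '10 minutes' );

    my $pid = fork // die "Fork failed: $!";
    if ( $pid == 0 ) {    # child sees the same on-disk cache
        my $conf = $cache->get('config.ini');
        print $conf->{port}, "\n";
        exit;
    }
    wait;
    ```

    Other CHI drivers (e.g. the Cache::FastMmap-based one) trade the file layout for a shared mmap'ed file, which may fit the "high disk load" concern better.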

Front-paged by Corion