http://qs321.pair.com?node_id=589642

hardburn has asked for the wisdom of the Perl Monks concerning the following question:

We have a large number of objects that we'd like to cache in an application running on mod_perl. We normally do this with MLDBM or some other file cache module that supports multi-level Perl data structures, but I'd like to explore an alternative using read-only scalars implemented via SvREADONLY, such as what Scalar::Readonly uses.

A big requirement is that the data stays shared between apache processes. Most of our applications don't need to write to the cache after server startup, so we can build the data structures, set the SvREADONLY flag on all their scalars, and expect that the data stays shared.

We should be able to expect that our own code leaves the SvREADONLY flag alone to maintain sharing. However, I'm worried that perl will do something on its own that will end up with the scalars becoming unshared.

So my question is: how far can we rely on perl to keep SvREADONLY scalars shared?


"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re: Reliability of SvREADONLY
by perrin (Chancellor) on Dec 13, 2006 at 18:05 UTC
    Reading a variable in perl can modify it. Using a number as text (or vice versa) can trigger a conversion, and the converted value is written back into the variable. I don't know how SvREADONLY works, but I doubt it prevents this type of conversion. For the most part, though, variables that you read in before forking will stay shared. I don't think SvREADONLY will affect this either way.
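
    To make the "reading can modify" point concrete, here is a minimal sketch using only the core B module. The helper name sv_class and the variable names are made up for illustration; the script just reports which internal SV class perl uses for the same variable before and after it is read in string context.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the internal SV class of a scalar
# (for example B::IV or B::PVIV) given a reference to it.
sub sv_class { return ref B::svref_2object($_[0]) }

my $n      = 15;
my $before = sv_class(\$n);   # scalar holding only an integer
my $text   = "$n";            # a plain read in string context
my $after  = sv_class(\$n);   # same variable, inspected after the read

print "before: $before\n";
print "after:  $after\n";
```

    On the perls I am aware of, the class differs between the two prints, because the string form is cached inside the scalar. That cache write is the kind of memory write that can unshare a copy-on-write page.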
Re: Reliability of SvREADONLY
by Joost (Canon) on Dec 13, 2006 at 18:47 UTC
    So my question is: how far can we rely on perl to keep SvREADONLY scalars shared?
    Not very far, probably. As noted above, type coercion will still modify the underlying data structure. Also, creating references to the data can make the memory unshared if the SvNULL's REFCNT field is on the same page as the actual data. (IIRC the copy-on-write mechanism works on a per-page basis.) That's probably especially relevant if you have a lot of nested data (i.e. much of the data is references).
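
    The reference-count effect can be observed with the core Devel::Peek module. This is only a sketch with made-up variable names: it shows that taking a reference writes to the scalar's REFCNT field, not whether that field happens to share a page with your data, which depends on how memory was laid out.

```perl
use strict;
use warnings;
use Devel::Peek ();

my $value = 'cached string';

my $count_before = Devel::Peek::SvREFCNT($value);
my $ref          = \$value;    # no assignment to $value, only a new reference
my $count_after  = Devel::Peek::SvREFCNT($value);

print "refcount before: $count_before\n";
print "refcount after:  $count_after\n";
```

    The value itself is never assigned to, yet its header is updated, so a child process doing this would write to whatever page holds that header.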

    See also http://gisle.aas.no/perl/illguts/

    P.S.: If you keep the parent process alive and periodically kill and refork children, you might be able to keep this under control. It probably depends a lot on the data and the kind of access you need to it.

Re: Reliability of SvREADONLY (PV, need)
by tye (Sage) on Dec 13, 2006 at 21:05 UTC

    If I were doing this, only the PV would be shared and I'd expect SvREADONLY to work until the scalar is destroyed.

    I think Perl really needs to support this type of thing. I run into lots of cases where it would be nice to have Perl offer (efficient) read-only access to a block of memory that it didn't allocate and be smart enough to not try to free that block when the scalar is destroyed (and possibly call a hook instead). There are convoluted and inefficient ways to get Perl to kind-of do this.

    - tye

Re: Reliability of SvREADONLY
by jbert (Priest) on Dec 13, 2006 at 19:00 UTC
    Shameless plug: if you want to see how much memory is really being used by different processes (taking copy-on-write etc. into account), and you're running a relatively recent Linux, you might want to try Exmap. (It's packaged in Debian testing and recent Ubuntu, so there is no real need to grab the source in those cases, unless you want the latest version.)

    It's not the world's most polished tool, but can give you some useful numbers.

    Note: it's not particularly useful for single-process analysis, just for examining the degree of sharing between different processes (and breaking that down per file/ELF section/ELF symbol). In this case, you'll just get a big [heap] figure, but you'll be able to see the 'effective memory used' for each apache process.

    It's not perl-specific, so apologies if this is off-topic.

Re: Reliability of SvREADONLY
by diotalevi (Canon) on Dec 14, 2006 at 02:41 UTC

    It has occurred to me that you could stuff such process-wide strings into the optree as constants by using user-pragmas. I've recently been wondering what possible uses having data storage in the optree would have other than as plain user pragmas. This might be one. You'll also find that since optrees are shared between threads that this part might also be shared. I'm not sure of that though.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re: Reliability of SvREADONLY
by zshzn (Hermit) on Dec 14, 2006 at 03:56 UTC
    I am not quite sure why it is said above that type coercion is going to cause a problem here. Let me provide a little example.
        use ExtUtils::testlib;
        use Flag; #all local, sorry
        my $c = 'test';
        Flag::sv_set_flag($c, 'READONLY');
        #$c++;
        #$c = 'not a test';
        $c = 5;
    Each of those modifications produces a "Modification of a read-only value" error. Naturally. And as long as that holds, so that we cannot directly modify the value, we're set. Certain types of behavior can modify the underlying data structure, but not to the point of changing the eventual data.
        use Devel::Peek;
        use ExtUtils::testlib;
        use Flag;
        my $c = 15;
        Flag::sv_set_flag($c, 'READONLY');
        print Dump $c;
        print "$c";
        print Dump $c;
    Naturally the interpolation causes the IV to become a PVIV. But our PV value will be "15"\0. So in any circumstance our behavior is normal. Yes, the structure has changed, but from a user standpoint our data is read-only.
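
    Since the Flag module used here is local, the following is a rough equivalent that should run with only core pieces. It assumes Internals::SvREADONLY (a core but sparsely documented helper) as a stand-in for Flag::sv_set_flag, and uses B to report the SV class; the variable names are illustrative only.

```perl
use strict;
use warnings;
use B ();

my $c = 15;
Internals::SvREADONLY($c, 1);    # assumption: stand-in for Flag::sv_set_flag

my $class_before = ref B::svref_2object(\$c);

# A direct write is rejected with an exception.
my $write_ok = eval { $c = 5; 1 };
my $error    = $@;

# A read in string context is allowed, and perl may still cache the
# string form inside the read-only scalar.
my $text        = "$c";
my $class_after = ref B::svref_2object(\$c);

print "write allowed: ", ($write_ok ? "yes" : "no"), "\n";
print "error: $error";
print "class: $class_before -> $class_after\n";
print "value: $c\n";
```

    The visible value stays 15 throughout, which is the user-level guarantee. The class change reported on the third line is the internal write that matters for page sharing.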

    Is there a specific need to protect the data from perl internals that could possibly be ignoring the readonly state of a variable? If not, then type coercion isn't a problem at all.

      Naturally the interpolation causes the IV to become a PVIV.

      That causes a write to the memory page, causing it to become unshared, ...

      I am not quite sure why it is said above that type coercion is going to cause a problem here.

      ...thus type coercion causes a problem.