Re: Efficient partial deep cloning

Can you "flatten" each of the data structures? By that, I mean store everything in two ways simultaneously: in the actual structure, and also in a separate table that maps string representations of the paths to the appropriate value or variable in the structure. Or perhaps a reference to the actual value. Or separate the flattened representation into data and variables -- then, when you bind($s1,$s2), you'd scan through the keys to the table of variables, and look up each one in the flattened representation of the other value. It would be a simple string lookup in a hashtable at that point.

When to generate or update the flattened representation would depend on how these structures are populated. If you were to generate all of them at once, then yes, recursively walk the data structure. I don't think localizing the stack helps, though -- I think that would require copying the whole stack every time you localized it with the old value plus one new path token. Better to just push the token on and pop it back off after you're done. Actually, if you're doing it my way (with a string representation of the path), then you would just call bind(..., $path . $token) recursively and it will manage your stack for you.

And if you're maintaining the flattened version at all times, you just need to have the path so far available anywhere you're mutating things.

Why are you worried about this particular optimization already? It sounds like you aren't done with the rest of the algorithm. You might want to make it work first, just in case this optimization turns out to be invalid. Can variables be non-leaf nodes in your structures? If so, then you may have to do full searches rather than simple cached path lookups. Can the same variable appear more than once? If so, then you have to do speculative bindings or loop through all possibilities or something. Can the right-hand side also contain variables? Then you need to be able to bind the two variables together, and if they can appear more than once, I think the speculation gets hairier.

By the way, I think your function bind() should be called unify(). The terminology I am familiar with is that you unify the two structures, resulting in a set of bindings (possibly recorded via calls to bind()). But perhaps you're doing things differently from what I remember; I can't figure out when you would ever want to unify a single structure with multiple other structures in succession. Then again, it's been a very long time since I last thought about this stuff.

Comment on Re: Efficient partial deep cloning

Replies are listed 'Best First'.

Re^2: Efficient partial deep cloning
by Ovid (Cardinal) on Jan 09, 2008 at 10:22 UTC

Unification and data binding, in this context, are different things. Prolog:

  gives(tom, book, SOMEONE) :-
    person(SOMEONE),
    likes(tom, SOMEONE).
[download]

Let's look at the like( tom, SOMEONE ) bit. Prolog would fetch all of the like/2 facts and attempt to unify those facts, one at a time, with like( tom, SOMEONE ). As soon as a fact matches (unifies), then the SOMEONE variable will be bound to the corresponding value. So if we have these facts:

  likes( mary, tom ).
  likes( tom, susan ).
[download]

The first fact won't unify and SOMEONE won't be bound to tom. When the second fact is encountered, unification occurs and SOMEONE is bound to the value susan. Once that's bound, then the engine goes to person(SOMEONE) and since the variable is already bound, we are attempting to unify this fact with a fact in the database which asserts that susan is, in fact, a person. If we find that fact, we get to the top of the rule and know that tom gives a book to susan.

So the terminology is right, but this means we tend to have exhaustive, depth-first backtracking searches. If we have lots of facts in our database, these searches can be very slow. Having done a lot of work in this area before, I know how incredibly common (and slow) the binding and unbinding of data can be, so while I generally tell people not to do premature optimization, like many others, I caution folks that this isn't always true. If you're writing a ray-tracing program and you need it to be fast, you know you're probably not going to reach for Perl first. (Of course, if you want predicate logic, you also don't want to reach for Perl first, but I'm strange that way).

Cheers,
Ovid

New address of my CGI Course.

[reply]
[d/l]
[select]


Think about Loose Coupling
	PerlMonks