
Apologies if you're not interested in a P6 solution. I was, and I thought the one that is natural in P6 was worth writing up, including its downsides and complications.

Given your data code exactly as written, one could write this parallel iterator in P6. Given correctly formed data, it should work to any depth:

sub infix:<landr> ($l, $r) { $l and $r }; say $filter «landr» $source

Run by the Rakudo Perl 6 compiler it would display:

{f10 => [important data], f3 => important data, f5 => {f6 => more important data, f7 => {f8 => important data}}}
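For readers who would rather see the semantics spelled out without P6 syntax, here is a rough Python sketch of what the landr hyperop appears to compute. It assumes the hyperop keeps only keys present in both hashes and recurses into nested hashes, which is what the output above suggests. The function name hyper_and and the sample data are made up for illustration (they are not your data), and lists are treated as plain leaves.

```python
def hyper_and(filter_node, source_node):
    """Walk two nested dicts in parallel, keeping only keys present in both."""
    if isinstance(filter_node, dict) and isinstance(source_node, dict):
        return {
            key: hyper_and(filter_node[key], source_node[key])
            for key in filter_node
            if key in source_node
        }
    # Leaf: mirrors `$l and $r`, which yields the right side when the left is true.
    return filter_node and source_node


# Hypothetical data, only loosely modelled on the output shown above.
source = {
    "f1": "garbage",
    "f3": "important data",
    "f5": {
        "f6": "more important data",
        "f7": {"f8": "important data"},
        "f9": "garbage",
    },
}
filter_ = {"f3": 1, "f5": {"f6": 1, "f7": {"f8": 1}}}

print(hyper_and(filter_, source))
```

Like the P6 one-liner, this builds a new structure and leaves the source untouched, and it does no validation of either input.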

While I think it nicely addresses the elegance focus of your OP, other issues obviously arise.

Before touching heavier aspects, I'll get a couple lighter ones out of the way. First, I didn't add any validation of the input. I haven't even considered it. Second, the above code constructs a new data structure rather than pruning the existing $source. Perhaps that's not desirable if the structure is extremely large. I'd expect there to be some P6 way to iterate over two data structures in parallel, mutating one of them en passant, but it may not be as elegant, and again I haven't even considered it. The above at least illustrates one conceptually/syntactically elegant approach to the problem in P6 and I've decided to stop there because there are obviously heavier issues...
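To make the "prune in place" alternative concrete, here is a minimal Python sketch of walking the filter and the source in parallel while deleting unwanted keys from the source. It is not P6 and not a claim about how a P6 version would look. The name prune_in_place and the sample data are hypothetical, and it assumes the filter is itself a nested dict whose true leaf values mark the keys to keep.

```python
def prune_in_place(filter_node, source_node):
    """Delete from source_node every key the filter does not select."""
    for key in list(source_node):  # copy the keys: we delete while walking
        wanted = filter_node.get(key)
        if not wanted:
            # Key is missing from the filter, or its filter entry is false.
            del source_node[key]
        elif isinstance(wanted, dict) and isinstance(source_node[key], dict):
            # Both sides are nested: descend and prune the sub-structure.
            prune_in_place(wanted, source_node[key])


# Hypothetical data for illustration only.
source = {
    "f1": "garbage",
    "f3": "important data",
    "f5": {"f6": "more important data", "f9": "garbage"},
}
filter_ = {"f3": 1, "f5": {"f6": 1}}

prune_in_place(filter_, source)
print(source)
```

The trade-off is the usual one: no second copy of a very large structure, at the cost of destroying the original.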

The elephant in the room is of course that it's P6, not P5. The rest of this comment discusses the two most important elements of this latter aspect, namely plausible integration into a P5 environment, and performance.

Plausible integration of the above code into a P5 environment

While the technical / deployment aspects of integrating P6 into a P5 environment aren't trivial, the much bigger and thornier issues in most situations are social, not technical. But I'm not going to address social issues in this post. Either a reader is interested in mixing P6 into a P5 environment or they aren't. I'm going to assume they are, and that the only useful thing I can provide under this heading is an explanation of the two usual ways to do so.

One way to integrate the two is to add a use Inline::Perl6; statement to P5 code. (The linked module hasn't been updated since December 2016, but I don't think that reflects a lack of interest in maintenance on Stefan Seifert's part so much as that the documentation is still valid. Yes, P5's Inline::Perl6 (IP6) syntax is very primitive compared to P6's Inline::Perl5 (IP5). See the next paragraph, and perhaps consider encouraging Stefan to develop P5's IP6 to catch up with P6's IP5 sweetness. IP6 relies on the regularly updated Inline::Perl5 to do its thing, so bug fixes, performance, semantics, etc. should be basically the same.)

The usual way to integrate P5 and P6 is instead to reverse the approach by adding a use Inline::Perl5; statement to P6 code. The underlying tech is the same. As far as I know the only substantive difference is the highly polished syntactic sugar available doing it this second way around.

Rakudo hyper op performance

Ignoring the many other issues with use of P6, it's plausible that any P6 solution is currently unacceptably slow compared to any P5 solution because most Rakudo operations are currently significantly slower than their P5 cousins.

Rakudo is already fast enough for some production use cases for some users. And it has begun the long hard slog of reaching faster performance than P5 with some success for a handful of things. But I think it'll be years before its progress is sufficiently compelling to quiet the grinches. Also, happily, perl (5) is getting significantly faster at some of its basic operations each year and there's at least one plausibly serious performance oriented alternate P5. (We may well be about to witness an interesting long haul race between old and new Perls, Pythons, and Rubys throughout the 2020s.)

In theory, use of hyperops (« and » or their ASCII equivalents) in P6 will one day suddenly jump to become one or several orders of magnitude faster for some use cases. This is because they're semantically defined to be done in parallel and the compiler is architected to take advantage of that (though does not yet do so).

While there may be an order of magnitude jump or so if the target hardware has multiple cores, it could well be several orders if the target is a GPU.
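The property that makes this possible is that each element-wise application is independent of the others, so the order of execution doesn't matter. A small Python sketch of that idea follows. The hyper function here is a made-up stand-in, not anything Rakudo provides, and Python threads won't actually speed up CPU-bound work, so treat it purely as an illustration of the semantics rather than of the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def hyper(op, left, right, workers=4):
    """Apply op to corresponding elements; each pair is independent of the rest."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() may run the calls concurrently but returns results in input order.
        return list(pool.map(op, left, right))


left = [1, 2, 3, 4]
right = [10, 20, 30, 40]

print(hyper(lambda l, r: l + r, left, right))
```

Because no pair depends on another, the same call could in principle be split across cores or handed to a GPU without changing the result.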

I doubt the compiler would ever parallelize for your use case based purely on heuristics. These heuristics will presumably only consider attempting parallelization of "hot" code.

(I say "attempting parallelization" because I'm expecting the heuristic approach to be speculative, backed by the same deoptimization mechanism that backs the new speculative static analysis that is just recently beginning to be applied -- see How does deoptimization help us go faster? which is a link to a video that jumps right to the point where Jonathan Worthington delivers his presentation's punchline.)

As I understand it, code would have to be repeated at least several hundred times before it would be considered hot. Your use case seems to be a one-time-per-run deal so speculative parallelization would not apply.

Instead, I'm anticipating compiler directives, perhaps in the form of pragmas like use hyper <NUMA>; or use GPU; or similar, that direct Rakudo to unconditionally parallelize if the appropriate hardware is present.

As I understand things, this parallelization could be attempted today but tuits are currently focused elsewhere and likely to remain so for the next few years. So this is likely waiting until someone with sufficient C chops and a fearless attitude jumps in and makes it happen.

In reply to [Perl 6] Re: A more elegant way to filter a nested hash? by raiph
in thread A more elegant way to filter a nested hash? by jimpudar
