PerlMonks  

IO::Lambda: call for participation

by dk (Chaplain)
on Jan 01, 2009 at 22:50 UTC ( [id://733676]=perlmeditation )

Hello everyone!

During the last year, I was busy writing an async I/O module, IO::Lambda, which I believe expresses callback-based I/O logic much more elegantly than was ever done before, by using a different concept. The module includes async versions of DNS, SNMP, HTTP, and DBI (the cool part about DBI is that it can work asynchronously by using either forks, threads, or even a socket connection). It's a lot, but not as much as I think I need. I'm planning to use IO::Lambda to write a new separate module for httpd, and possibly modules for ftp and irc too, but I don't have enough time for it all. I also find it hard to determine the right balance for httpd: where the module ends, and where an httpd application begins. So this is also a call for help: if anyone wants to write new modules, or contribute to the development of the existing ones, please volunteer. That would be greatly appreciated.

If you don't know where to start, there's documentation and examples. There's also a mailing list at io-lambda-general at lists.sourceforge.net, and in realtime I'm McFist on #perl.

Thanks!

Update: the next version includes many (but not all) of the recommendations. I went for the low-hanging fruit with small changes; larger changes in the structure of the manual are underway.

Next update: 1.03. It includes IO::Lambda::Mutex, as suggested by tilly - thanks!

Also, as promised, I made a wiki, where everyone is welcome to contribute.

Replies are listed 'Best First'.
Re: IO::Lambda: call for participation
by merlyn (Sage) on Jan 04, 2009 at 15:49 UTC
    I stared at IO::Lambda, and it made my head hurt.

    I'm pretty good at event loops, and POE, and even coderefs and closures. But I'm just not getting it from your docs and examples... too many things are magical. I think it's because you're referencing currying and monads and other stuff that I never totally understood.

    I'm saying this because I want to understand, but can't. I'm looking for something that's easier and deeper than POE. But I still think I'd have to stare at this for a long time before I could even start.

    Maybe if you remove most of the "just like X in Y" references in the docs, and make the docs more standalone. Maybe I could just go through the docs and point out every external reference I don't know about. I realize it must seem tedious to describe something that you already know from somewhere else, and you think everyone else must already know, but we don't all have your experience.

      Oh. Thanks a lot for the response! What's most disheartening is that even for a person with your experience the doc is overly complicated, and of course I too can relate to reading docs that are totally incomprehensible. What I can say in my defence is that I absolutely didn't want the doc to appear aloof, or to pose as if I have huge experience in functional programming, which I don't. What it shows, rather, is that I'm a lousy documentation writer, if even my best intentions turn into such a result.

      In the long run, I know what to do: learn to write better, write more, all the usual advice for aspiring writers. However, it'd be a real shame NOT to rewrite this doc and to leave it in its current state, where it is hard to understand what it is about. I'll try to rephrase and remove "X like Y", thank you for the advice. And I'll definitely try to rewrite hard sections into simpler ones, but here's a question to you, as a writer: where is the acceptable level of simplification? It probably wouldn't be a good idea to explain functional programming tricks just to make the reader fully appreciate the tricks IO::Lambda does. However, without referring to concepts from functional programming, a single paragraph can easily explode into twenty. Possibly that's not a bad thing though.

      Also, I don't really understand what you mean by "make the docs more standalone". Is your idea to split the large document into smaller pods for readability?

      Finally, thank you for taking the time to bring up problems with the docs. But I need to ask you and everyone interested: I need your help with the docs too. I didn't realize it until now, but apparently the documentation is the biggest showstopper. Would anyone like to write, or help me write, a gentler introduction to the module? I'll help all I can.

        Likewise, after puzzling my way through the documentation, I'm intrigued, but see a steep learning curve before I'd understand how I'd use this in practice. I don't think this is unique to IO::Lambda -- POE has a similar, perhaps even steeper learning curve. It wasn't until I struggled through POE::Kernel and POE::Session and similar documentation that I really understood what was going on.

        Some reactions:

        In general, I think you assume too much knowledge of functional programming and use too much unfamiliar jargon (e.g. "predicates") without clearly explaining each term. The whole "apologetics" section should be moved to the end as it distracts from explaining how to use the module.

        The synopsis itself is too complex or else insufficiently commented to explain what is going on.

        Many of your examples might be clearer if you were explicit about return and fixed up some indentation. E.g. edited from the synopsis:

        # return an IO::Lambda object for a given $host
        sub http
        {
            my $host = shift;
            my $socket = IO::Socket::INET-> new(
                PeerAddr => $host,
                PeerPort => 80
            );
            return lambda {
                context $socket; # argument stack for other calls
                # register a callback for when $socket is writable
                write {
                    print $socket "GET /index.html HTTP/1.0\r\n\r\n";
                    my $buf = '';
                    # register a callback for when $socket is readable
                    read {
                        return $buf unless
                            sysread( $socket, $buf, 1024, length($buf));
                        # restart the "read" state if more is available
                        again;
                    }
                }
            }
        }

        There are too many ways of doing things described too early and without context (no pun intended). For example:

        A lambda is initially called with some arguments passed from the outside. These arguments can be stored using the call method; wait and tail also issue call internally, thus replacing any previous data stored by call. Inside the lambda these arguments are available as @_.

        The part in bold (the sentence about call, wait, and tail) is extraneous and distracting, as it forces the reader to ponder the relationship between different ways of calling a lambda -- it exposes implementation details that are irrelevant to initial understanding. Moreover, I think you mean that the arguments are made available as @_ in the callback attached to the lambda object. "Inside the lambda" is a bit vague.

        Overall, I think you might need to explain more of the core principles for async programming and why they are necessary. Likewise, I think you need to explain subtleties in your examples, like why the "read" predicate is inside the "write" predicate. (Presumably, we don't want to start listening until we've actually sent the request via the write predicate.)

        Then I think you need to walk through what happens in an example. So the "http" lambda is executed via "wait" -- a "write" state callback is registered. (Is it executed immediately? Does it execute after the lambda callback executes?) The "write" state is the only one registered and it's immediately valid (socket is writeable) so the "write" callback is executed. During that callback, a "read" state callback is registered. When the "write" callback finishes, the socket is readable so the "read" callback is executed one or more times. And so on. (Why does execution stop?)

        So my general advice is to start very small, very simple and explicit and then build up from there.

        -xdg

        Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

        This might be just me - but I stumbled at 'context'. It is introduced in the examples - but the explanation is way further and even reading that explanation I cannot fully internalize what it is.
      I concur. Reading that documentation was a humbling experience :)

      Many of the operations there are higher-order functions - perhaps listing their signatures (types) would help people understand what they do?

        Thanks! Would that be too much to ask you to help me with the docs, by telling what exactly felt hard? The signatures are actually included for all higher-level functions, but apparently either not visible or not explained enough.

      The key words that are missing from all of this, including the documentation, are "Continuation-passing style". As far as I can tell, that is exactly what IO::Lambda provides helpers for.

      See this Wikipedia Article on Continuation-Passing Style

      dk, I suggest you mention this in the documentation. Folks who are familiar with the concept should grok it immediately, and folks who aren't will have a nice phrase to google.
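
      For anyone who has not met the term, here is a minimal sketch of continuation-passing style in plain Perl. It does not use IO::Lambda at all; add_cps and square_cps are made-up names for illustration. The only point is the shape: instead of returning a value, each function hands its result to a code ref it was given, which is why the calls end up nested.

```perl
use strict;
use warnings;

# Direct style: each function returns its result to the caller.
sub add    { my ( $x, $y ) = @_; return $x + $y }
sub square { my ($x) = @_;       return $x * $x }

my $direct = square( add( 1, 2 ) );

# Continuation-passing style: each function takes an extra code ref ($k,
# the "continuation") and passes its result to $k instead of returning it.
# These helpers are hypothetical and exist only for this illustration.
sub add_cps    { my ( $x, $y, $k ) = @_; $k->( $x + $y ) }
sub square_cps { my ( $x, $k ) = @_;     $k->( $x * $x ) }

my $cps;
add_cps( 1, 2, sub {
    my ($sum) = @_;
    # The second step is only reachable from inside the first step's
    # continuation, so the nesting encodes the order of execution.
    square_cps( $sum, sub {
        ($cps) = @_;
    } );
} );

print "direct=$direct cps=$cps\n";
```

      Both variables end up holding the same number; what differs is who decides what happens next. In the second form the callee does, which is what lets an event loop postpone the next step until a socket is ready.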

        You're absolutely right, IO::Lambda is and was all about CP-style from the very beginning. However, I remember I was reluctant to mention monads because they strangely spook people, so I thought that if I talked about CPS it would be even worse :) Still, I agree, for those who know about CPS, the analogy will be immediately clear. Thank you for the advice, I'll try to include it so that it won't scare off newcomers but will still be visible.

        btw, adding links to wikipedia inside a pod might be just the right thing to do.

regarding 1.02 (was Re: IO::Lambda: call for participation)
by merlyn (Sage) on Jan 13, 2009 at 16:12 UTC
    I appreciate you taking the time to incorporate changes.

    I'm still just not fundamentally getting it though. Why is everything nested in:

    lambda {
        context $socket;
    writable {
        print $socket "GET $url HTTP/1.0\r\n\r\n";
        my $buf = '';
    readable {
        my $n = sysread( $socket, $buf, 1024, length($buf));
        return "read error:$!" unless defined $n;
        return $buf unless $n;
        again;
    }}}

    I mean, I don't get this continual nesting you have in all the examples. If you explained why this is nested as you introduce your first example, and that that's not a typo, and then explain the control flow (does "readable" get called first or last here??), I could get past this into the interesting details

    I'm not stupid. But I keep beating my head against that. And I think it's something fundamental to your approach, because I see it everywhere.

    I think it might have something to do with this paragraph:

    Whatever is returned by a condition callback (including the lambda condition itself), will be passed further on as @_ to the next callback, or to the outside, if the lambda is finished. The result of the finished lambda is available by peek method, that returns either all array of data available in the array context, or first item in the array otherwise. wait returns the same data as peek does
    But I can't connect that abstraction to what it means to me as a programmer, or how that would result in all these nested definitions.

    If you could enlighten me as to why you end up with "}}" in every example, that would help.

      Does it help at all if I re-phrase and re-format it like this (pretend that the predicates are prefixed with 'on_' even though they aren't):

      lambda sub {
          context $socket;
          on_writable sub {
              print $socket "GET $url HTTP/1.0\r\n\r\n";
              my $buf = '';
              on_readable sub {
                  my $n = sysread( $socket, $buf, 1024, length($buf));
                  return "read error:$!" unless defined $n;
                  return $buf unless $n;
                  again;
              }
          }
      }

      When the lambda closure is executed, the 'on_writable' sets a callback to be executed when the $socket is writable. When the closure finishes, IO::Lambda sees that the socket is writable and executes the callback. That callback executes and it sets another callback for when the socket is readable. When the writable callback finishes, IO::Lambda sees that the socket is readable and executes the readable callback. That callback returns a value when all input is read, or else re-queues itself for the next time the socket is readable using again.

      When all that is done, the value returned from running (er, 'wait'-ing for the lambda) is the value returned from the last callback to run -- in this case, the "return $buf".

      The 'readable' part has to be set after the 'writable' part runs, otherwise, IO::Lambda could call them in any order, trying to read from the socket before the request is sent.
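
      To make that ordering concrete, here is a minimal sketch using only core Perl (IO::Select and a socketpair, so it assumes a Unix-like system). It is not IO::Lambda's implementation: on_writable, on_readable and run_loop are made-up names for a toy table of one-shot callbacks, and the "server" is just the other end of the socketpair. The readable callback is registered from inside the writable callback, so it cannot fire before the request has been sent.

```perl
use strict;
use warnings;
use IO::Select;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical mini-engine for illustration only; not IO::Lambda's code.
# Each table maps fileno => [ handle, one-shot callback ].
my ( %want_read, %want_write );

sub on_writable { my ( $fh, $cb ) = @_; $want_write{ fileno($fh) } = [ $fh, $cb ]; return }
sub on_readable { my ( $fh, $cb ) = @_; $want_read{ fileno($fh) }  = [ $fh, $cb ]; return }

sub run_loop {
    while ( %want_read or %want_write ) {
        my $r = IO::Select->new( map { $_->[0] } values %want_read );
        my $w = IO::Select->new( map { $_->[0] } values %want_write );
        my ( $can_r, $can_w ) = IO::Select->select( $r, $w, undef, 5 )
            or die "nothing became ready within 5 seconds";
        # Callbacks are one-shot: drop the registration, then call it.
        for my $fh (@$can_w) { ( delete $want_write{ fileno($fh) } )->[1]->() }
        for my $fh (@$can_r) { ( delete $want_read{ fileno($fh) } )->[1]->() }
    }
}

socketpair( my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC )
    or die "socketpair: $!";

my ( @trace, $result );

# Stand-in server: answer the first request, then close the connection.
on_readable( $server, sub {
    sysread( $server, my $request, 1024 );
    push @trace, 'server got request';
    syswrite( $server, 'hello' );
    close $server;
} );

on_writable( $client, sub {
    push @trace, 'client writable';
    syswrite( $client, "GET /\n" );
    my $buf = '';    # shared by every run of the reader below
    my $reader;
    $reader = sub {
        push @trace, 'client readable';
        my $n = sysread( $client, $buf, 1024, length $buf );
        die "read error: $!" unless defined $n;
        if   ($n) { on_readable( $client, $reader ) }    # plays the role of "again"
        else      { $result = $buf }                      # EOF: we are done
    };
    # Registered only now, i.e. after the request has been written.
    on_readable( $client, $reader );
} );

run_loop();
print join( "\n", @trace ), "\nresult: $result\n";
```

      If the reader were registered at the top level next to the writer, the loop would be free to run them in whatever order the sockets became ready. Nesting is simply the place where "do this only after that" gets written down.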

      At least, that's how I think it works.

      I agree that the nesting syntax is confusing and the way values are returned is likewise confusing.

      -xdg


        I thought it should be read differently, then I read it more carefully and I agree with your reading. But since seeing the same thing said multiple ways often helps, let me present it my way. First merlyn's example, formatted differently.
        lambda {
            context $socket;
            writable {
                print $socket "GET $url HTTP/1.0\r\n\r\n";
                my $buf = '';
                readable {
                    my $n = sysread( $socket, $buf, 1024, length($buf));
                    return "read error:$!" unless defined $n;
                    return $buf unless $n;
                    again;
                };
            };
        };
        Here it is again with comments stating what each piece does according to my understanding.
        # This sets up one of many parallel closures that process
        # in parallel.  It will be called at the start.
        lambda {
            # This sets the context of what connection this happens
            # on.  This association is remembered within the engine.
            context $socket;
            # writeable sets up a possible event to monitor, when
            # $socket is writeable, execute the closure.
            writable {
                # The engine discovered we can write, so do so.
                print $socket "GET $url HTTP/1.0\r\n\r\n";
                # This variable needs to stay shared across
                # multiple invocations of our readable closure, so
                # it needs to be outside that closure.
                my $buf = '';
                # readable registers another event to monitor -
                # that $socket is readable.  Note that we do not
                # need to set the context again because when we get
                # here, the engine knows what context this command
                # took place in, and assumes the same context.
                readable {
                    # This closure is executed when we can read.
                    my $n = sysread( $socket, $buf, 1024, length($buf));
                    # If we return without registering a follow-up
                    # handler, this return will be processed as the
                    # end of this sequence of events for whoever is
                    # waiting on us.
                    return "read error:$!" unless defined $n;
                    return $buf unless $n;
                    # We're not done so we need to do this again.
                    # Note that the engine knows that it just
                    # called this closure because $socket was
                    # readable, so it can infer that it is supposed
                    # to set up a callback that will call this
                    # closure when $socket is next readable.
                    again;
                };
            };
        };
        And here we see the reason for the nesting. You nest whenever one action is contingent on another having already happened. Given that lambda just registers a callback, and you always want to do something, somewhere, you always nest at least once. But you can nest more times.
      I've just discovered your question and all the answers here. Luckily, the answers are for all practical reasons correct, there are microscopic things that I'd otherwise comment on, but I won't to avoid further confusion.

      Now, back to your question. The "{{" notation is not tied programmatically to the way results are returned; it's just a set of nested closures. However, it is tied conceptually, and indeed it means more to a programmer, because it introduces two important parallels between the plain old declarative style and all this callback and closure mess.

      The first parallel is that the sequence matters. In normal, blocking code, you would expect to program an HTTP request as

      print $socket ...
      readline $socket;

      Here, the code is different:

      writable {
          syswrite ...
      readable {
          sysread ...
      }}

      but the sequence is the same! To emphasize this fact, there's no additional indentation for inner closures, to highlight that the execution is top-down, one way, linear.

      This fact, in my opinion, is also a step forward when comparing with programming using the traditional event-driven frameworks, like POE or IO::whatever:

      on_write => sub { ... },
      on_read  => sub { ... },

      where sequence doesn't (syntactically) matter. One has to deal with sequencing by other means. In IO::Lambda, as in declarative programming, sequence is a part of syntax.

      The second parallel is that return quits the scope. In the declarative style, no matter how deep the execution is inside inner loops, return $x quits the scope of the current subroutine and sends $x to the caller. In IO::Lambda, no matter how many "}}" deep the execution is, readable inside writable inside whatever, return $x does basically the same (given of course some restrictions: that no other callbacks are waiting, and that again is not called). Whoever waits for the lambda, be that asynchronously by tail/tails/etc, or synchronously by wait, can count on $x being returned no matter how many stages the lambda internally went through. Again, compare with the event-driven style, where this is surely possible, but is not as elegant and is not part of the syntax.
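
      A toy model may help here. The sketch below is plain Perl with no I/O and no IO::Lambda; next_step and run are hypothetical names, and the scheduler is only my reading of the rule quoted from the docs earlier in this thread (whatever a callback returns is passed on as @_ to the next callback, or to the outside if nothing else is pending). It should not be taken as how the module is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical toy scheduler; the names below are not IO::Lambda's API.
my @pending;
sub next_step { my ($cb) = @_; push @pending, $cb; return }

# Run $start, then every step that gets queued along the way. Whatever a
# step returns is handed to the following step as @_; the return value of
# the last step to run becomes the result seen by the caller.
sub run {
    my ( $start, @args ) = @_;
    my @result = $start->(@args);
    while ( my $cb = shift @pending ) {
        @result = $cb->(@result);
    }
    return wantarray ? @result : $result[0];
}

my $greet = sub {
    my ($name) = @_;
    next_step( sub {                  # stage 1
        next_step( sub {              # stage 2, queued while stage 1 runs
            my ($greeting) = @_;
            return "$greeting!";      # nothing else pending: final value
        } );
        return "hello, $name";        # handed to stage 2 as @_
    } );
    return;
};

my $answer = run( $greet, 'world' );
print "$answer\n";
```

      The caller of run never learns how many stages there were. It only sees the value returned by the innermost stage, which is the property that makes wrapping one lambda inside another straightforward.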

      From this parallel one gets an important feature: easy wrapping of lambdas, one inside another. For example, we have a lambda that returns an HTTP response, as we've just discussed, created by the function http. Let's write a wrapper that, instead of accepting a URL and returning plain text, accepts an HTTP::Request and returns an HTTP::Response:

      sub http2
      {
          my $req = shift;
          lambda {
              my $uri = $req-> uri-> as_string;
              my $host = $req-> uri-> host;
              my $port = $req-> uri-> port;
              my $addr = sockaddr_in( $port, inet_aton( $host));
              context http( $addr, $uri);
          tail {
              my $res = shift;
              return ( $res =~ /^(\w+ error:)/) ? $res : HTTP::Response-> parse( $res)
          }}
      }

      From here emerges the third parallel, that is about calling one sub inside another: the caller won't care what happens inside the callee. http2 doesn't care how http returns the data. To make it even more clear, let's write a wrapper that understands redirects:

      sub http3
      {
          my $req = shift;
          lambda {
              context http2($req);
          tail {
              my $res = shift;
              return $res unless ref($res);
              return $res if $res-> code !~ /^3/;
              $req-> uri( $res-> header('Location'));
              context http2( $req);
              again;
          }}
      }

      (These examples were part of my talk at YAPC::EU; slides (no text, sorry) can be found here.)

      I hope that helps, please ask further if it doesn't.

        Now, you may think you just communicated something to me in:
        First parallel is that the sequence matters. As in normal, blocking code, you would expect programming a HTTP request as

        print $socket ...
        readline $socket;

        Here, the code is different:

        writable {
            syswrite ...
        readable {
            sysread ...
        }}

        but the sequence is the same! To emphasize this fact, there's no additional indentation for inner closures, to highlight that the execution is top-down, one way, linear.
        But nothing clicked for me. Why are they nested? When does the closure passed to readable {} get executed? When does the code within the closure passed to writable get executed {}? And why?

        This is the magic that is then fundamental to the rest of your description. Please explain it so we mere mortals can follow along. I can't understand anything further, because it seems to be based on something you think you are showing with this code.

        In other words, you have writable, which takes a coderef. And readable, which takes a coderef. Somehow, the coderef passed to writable is built up by calling some other code and then the result of calling readable. Ooof. Too many layers. Head is hurting. What's the flow, and why?

        And why are they nested!?!

Re: IO::Lambda: call for participation
by zby (Vicar) on Jan 02, 2009 at 08:42 UTC
    Interesting!

    Out of curiosity - why the name?

      As the others have answered, "lambda" being a reference to "callback", but not limited to that. It's also a reference to techniques from functional languages (currying, map/filter/folds) that also apply to the module.

      An example. lambda {} is a (lightweight) object capable of holding the state for you (like monads do), so the following (deliberately simplified) construct

      lambda {
          read {
              sysread(...)
          write {
              syswrite(...)
          }}
      }

      makes sure that a socket handle will only receive writing (and only writing) events after reading events were received.

      With this approach, other functional tricks can be done, f.ex. a map analog will make sure that all lambda objects (connections, states) execute sequentially:

      print mapcar( lambda { 1 + shift })-> wait(1..5);
      23456

      where instead of "1 + shift" there could be a full-fledged http connection, for example, or a DBI statement, or a lock waiting procedure - all non-blocking, of course. So again, back to the question, it's mainly "lambda" because it shares some interesting hacks with functional programming.
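
      To show what "execute sequentially" means for a map over callback-style steps, here is a small sketch in plain Perl. map_seq is a hypothetical helper written for this post, not IO::Lambda's mapcar, and the step function answers immediately where a real one would wait for a socket or a database. Each item is started only after the previous one has reported its result through the callback it was given.

```perl
use strict;
use warnings;

# Hypothetical helper, not IO::Lambda's mapcar: apply a callback-style
# $step to each item in turn, starting the next item only after the
# previous one has delivered its result.
sub map_seq {
    my ( $step, $items, $done ) = @_;
    my @todo = @$items;
    my @results;
    my $next;
    $next = sub {
        return $done->(@results) unless @todo;
        $step->( shift @todo, sub { push @results, @_; $next->() } );
    };
    $next->();
    return;
}

my @got;
map_seq(
    sub { my ( $x, $k ) = @_; $k->( 1 + $x ) },    # stands in for "1 + shift"
    [ 1 .. 5 ],
    sub { @got = @_ },
);
print @got, "\n";
```

      With these inputs the sketch prints 23456, the same digits as the mapcar line above. The difference from an ordinary map is only that the step decides when to hand control back.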
        Yeah - actually I did make that mental link - I am just not sure how that applies to the library. But after reading the other post (mentioning callback) I must admit that yeah - it was quite obvious.
Re: IO::Lambda: call for participation
by xdg (Monsignor) on Jan 15, 2009 at 11:17 UTC

    To summarize my general reaction to several of the other threads, I think that the particular code layout you use -- breaking the usual expectations of indentation for nested code blocks -- is getting in the way of the "elegance" you're trying to offer.

    It may seem elegant to you, but because it's very different, you're making the learning curve a lot steeper than it needs to be.

    I think the same is true of syntax. I think the move to 'readable' and 'writeable' and so on is a step in the right direction, but those names don't naturally communicate that they set a one-time event callback. I know you were looking to move away from 'on_' or 'when_' prefixes, but I really wonder if a more expressive name would make it easier for people to learn.

    For many people in a work context, I suspect that maintainability will be an important consideration. So making IO::Lambda more expressive and easier to learn may seem less 'elegant' but ultimately may make it more useful to others.

    -xdg


      That was my initial impression, then I began asking questions about how to write more complex stuff, and began thinking about how it would look. I'm also thinking of trying to write an introductory level document that should take someone who has no background in this stuff to the point where they could think about writing a complex application. (The direction I'm heading right now is a proof of concept asynchronous webserver that uses connection pooling to control how many database connections it uses.)

      In that process I realized that if you indent normally, then after any complex sequence of events you are indented off of the right hand side! And what, exactly, does that indentation tell you? Basically that this happens before that happens before the other thing. Nesting of braces is carrying sequencing information, which we normally don't bother indenting at all.

      So once you get past the mechanics of what it is doing under the hood and try to think in terms of this library, what you really need to do is imagine that someone added a very small vaguely Lisp-like language to Perl, and that language is used to achieve the asynchronous magic. And once you think of it that way, the indentation makes perfect sense. You indent all of your Perl in a block by 4. Then outdent all of the commands in this second language by 2 (to indicate that they are this other language). Then let your closing braces pile up. In short at this point you're formatting the Perl bits like Perl, and the IO::Lambda bits like they were Lisp. (And once I figured that out, I understood as I never have before why Lisp people universally format their code that way.)

        In that process I realized that if you indent normally, then after any complex sequence of events you are indented off of the right hand side! And what, exactly, does that indentation tell you? Basically that this happens before that happens before the other thing. Nesting of braces is carrying sequencing information, which we normally don't bother indenting at all.

        Well, rjbs got me converted to 2-space indents, so indentation is less of a beast to me. ;-)

        On the serious side, I think as long as this is used for a linear execution sequence, what you say is true. And I think all the examples are pretty linear. But if you ever do multiple predicates at the same level, I would suspect that lack of indentation would make following the execution sequence a bit challenging.

        My point is not that it can't be understood, but that one has to actually learn the library first and grok how it works before the code is "skimmable". That's an adoption barrier and probably makes maintenance in a larger team setting problematic.

        -xdg


        ++! And I think it would be a very good idea to include reasoning about indentation into the introductory document.
      Well, if indentation makes the learning curve too steep, I have no problems with reformatting code in docs and examples with the standard indentation, but shall explain the benefits of the {{-style. I agree.

      About the choice between on_write, writable, and write, it is not that simple. While I agree that on_write makes a reader instantly recognize that the coderef is an event handler, it's not that obvious with the higher-level conditions. For example, let's take four conditions declared in IO::Lambda::Socket: connect(), accept(), send(), and recv(). One thing is that they clash with the CORE:: names, which I think is good, at least for the three latter names, because one uses either blocking CORE::send, or non-blocking IO::Lambda::Socket::send. Also, connect() differs semantically from accept(), send(), and recv(): connect() doesn't do anything, it's basically a wrapped writable(), and is the only one that can be renamed to either on_connect() or connected() without losing its meaning. However, consider send() for example. It waits for a handle to become writable, then sends whatever data was provided, and returns the CORE::send() return value.

      What I'm getting at is that the imperative names actually have their niche: like in declarative programming, they actually order something to be done. Now I'm getting onto shaky ground, because I don't have the command of the English language that allows me to make statements like the following, but please tell me if there is sense in it or not. Names like on_write, on_execute, on_ready are, as I understand, appropriate when a programmer did some setup and then waits for an event. This is true for writable and readable, because all the setup is done outside of these conditions. It's not true though for send() and accept(): they do the setup themselves, and it seems to me that there's no place for any word other than an imperative to describe their functions.

      Let's take for example POE:

      POE::Wheel::ListenAccept->new(
          Handle      => $socket,
          AcceptEvent => ...,
      )

      where AcceptEvent is semantically separated from ListenAccept. In IO::Lambda::Socket::accept, it's not.

      So, I was thinking then and am also thinking now. on_ and when_ (and for that matter, the Event postfix in POE) have one great property: they unify the event names. Again, the best my English could do was to counter that with names in the imperative mode (is it called modes or moods?), which are also expected to unify conditions. But conditions are not events; while some do look more like events, like writable and readable, the majority of the others do not.

      Finally, there are names that I think are very fitting, f.ex. tail and tails. Would that be better to have them changed into on_lambda_done and on_all_lambdas_done? That's too far I think.

      I'd like then to ask you, and everyone too, to help me find that grammatical or semantic unifying principle, or at least a division line between imperative and non-imperative conditions, that could be unambiguously declared and easily recognized. These features, I agree with you, are important both for learning and extensibility.

      And here's the list of the existing conditions: dns flock process forked http_request message snmpget signal pid spawn connect accept recv send rxw readable writable timeout tail tails tailo any_tail

      Out of these signal(), connect(), pid(), rxw(), readable(), writable(), and timeout() are non-imperative.

Re: IO::Lambda: call for participation
by sundialsvc4 (Abbot) on Jan 08, 2009 at 15:26 UTC

    The world is full of ideas, even brilliant ones, that collapsed or passed into oblivion because nobody could explain them to anyone other than their peers...

      Let's change that! How can I help you to learn and use IO::Lambda today? :D

      upd: aargh. that was me doing -- instead of intended ++. I'm sorry!

Node Type: perlmeditation [id://733676]
Approved by ikegami
Front-paged by Arunbear