Re: Is it worth using Monads in Perl ? and what the Monads are ?

by BrowserUk (Patriarch)
on Jun 11, 2007 at 07:02 UTC


in reply to Is it worth using Monads in Perl ? and what the Monads are ?

Disclaimer: This is the viewpoint of a 'failed', would-be Haskell programmer, with all the jaundiced opinion that entails.

Haskell needs Monads; Perl does not.

One simple description of "pure functional code" is: Every function always gives the same output, for the same set of inputs; and has no side effects.

Which sounds quite reasonable, and given a certain spin, very laudable. But here are a few (and just a very few) of the things that you cannot do whilst remaining 'purely functional'*; things that most programmers in most programming languages take for granted.

*...without Monads

  • You cannot read input--from a file, database or user or socket or another program via pipe, etc.

    The function you call to get the input would produce a different output for the same inputs.

  • You cannot read from a random number generator.

    Ditto. And the PRNG (or other source) would be left in a different state to when you called it--that's a side effect.

  • You cannot read the date or time; or the cpu usage; or free capacity of your harddrive, etc.

    Ditto.

  • You cannot write output--to a file, or screen or database or socket or pipe etc.

    The output medium would be left in a different state--a side effect.

    The call could return a failure: the disk was full; the socket was closed; the database had been modified by another program.

In essence, a purely functional program has to be completely self-contained at compile time. This is because the compiler has to be able to freely substitute the value of any given function wherever that function is used within the program. A purely functional program has no concept of sequence. It is defined in a declarative manner, in terms of function definitions, whereby function A is defined in terms of some other function or functions (or itself), and a few constants. Once it is declared, its result (for any given set of inputs) is fixed forever. That means, in theory, the compiler can substitute its defined value at every point in the program where it is used, in any order, and so reduce the program by substitution, until finally everything is reduced and the result is known.
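For illustration, here is a small, made-up sketch of that reduction by substitution (every name in it is invented):

square :: Int -> Int
square x = x * x

answer :: Int
answer = square 3 + square 4
-- With no inputs anywhere, the compiler is free to rewrite this as
-- 3*3 + 4*4, then 9 + 16, then 25: the whole thing is a constant.

main :: IO ()
main = print answer  -- printing it is, of course, where purity ends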

A constant program

In other words, the result of every purely functional program is decidable by the compiler and known by the time the code is compiled. There is no need to actually run the program ever, because the result will never vary--it is a constant!

Besides, even if you did run it, the program could not output the result, because that would have the side effect of changing the file, screen or wherever you output the result to. I.e. if you output the result twice--to, say, the screen--the second time you output it, the output from the first time would still be on the screen, meaning the result of doing it the second time would be different from the first. It would have had a side effect. And that's not purely functional.

PFP? Purely Functional Programming or Pretty F..useless Programming*

Read on; it's a joke :)

Of course, that would make for a pretty useless programming language. For example, you could never write a purely functional compiler--that is, a compiler that is purely functional, rather than a compiler for a purely functional language--because you couldn't read the source code. So every program the compiler would ever compile would have to be embedded into the compiler. And you wouldn't be able to write out the compiled program, which would kind of negate the purpose of having it.

Sequence in a program == procedural

So, in order for a purely functional language to do anything useful, it has to incorporate some mechanism for performing IO. For getting data in and outputting results. And it has to introduce the concept of sequence. That is, it is necessary to read the data before you can perform the calculation upon that data; which in turn has to happen before you can output the results.

But to do that, it becomes necessary to distinguish those functions that are purely functional--give the same results every time and have no side effects--from those that interact with the world outside of the program, and so are not: read() will produce a different result each time you call it; write() will have a permanent (side) effect upon the world each time you call it. It needs to distinguish them so that the compiler knows which functions it can safely perform its reduction-by-substitution magic upon, and which it cannot.
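In Haskell, that distinction ends up in the types. A rough sketch (the function names here are my own invention):

-- Pure: same result for the same input, no side effects.
double :: Int -> Int
double x = x * 2

-- Effectful: the IO in its type marks it as touching the outside world.
askName :: IO String
askName = putStr "Name? " >> getLine

main :: IO ()
main = do
  name <- askName
  print (double (length name))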

The mechanism chosen to distinguish the two is 'Monad'

Ignoring the formal (and extremely obscure) mathematics behind this concept--which I admit to not understanding at all and could never paraphrase--in simplistic terms a monad is a black box that represents (or contains) the world (or some small part of it), outside of a given, purely functional program.

And a monadic function is a function to which you must pass one of these monads as one of its parameters, and which returns (a different) monad as its only result. The output monad is the (constant and unchanging) result of applying that function to the input monad. If you call that same monadic function a second time, you will be passing a different monad as input, and receive another, different, new monad as its result. And each time you run the program, the initial monad representing the outside world will be different--time will have passed; things will have changed--and when you call the monadic function, the input monad will again be different from any other time it has been called, so it will be free to return a different monad to every other time it has, or will be, called.

And so the "same output for the same inputs" rule of functional programming has not been violated. And as every time you call it you pass a different monad in and get a different monad out, any given monad only ever has one state, so there are no side effects. In this way, you have introduced the concept of sequence into a purely functional program, without violating the purely functional ethics(*).
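A toy model of that world-passing idea (the World type and both functions here are inventions for illustration; this is not how GHC actually implements IO):

-- Each 'monadic' function takes the current world and hands back a new one,
-- so calling it twice means calling it with two different worlds.
data World = World { pending :: [String], written :: [String] }
  deriving Show

readLine :: World -> (String, World)
readLine (World (l:ls) out) = (l, World ls out)
readLine (World []     out) = ("", World [] out)

writeLine :: String -> World -> ((), World)
writeLine s (World ins out) = ((), World ins (out ++ [s]))

-- Sequence falls out of the data dependency: w1 must exist before w2 can.
echoTwice :: World -> ((), World)
echoTwice w0 =
  let (line, w1) = readLine w0
      (_,    w2) = writeLine line w1
      (_,    w3) = writeLine line w2
  in ((), w3)

main :: IO ()
main = print (snd (echoTwice (World ["hello"] [])))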

Every Perl program is a Monad (or maybe lives inside a Monad)

In Haskell's terms, every function in a Perl program is monadic, carrying an implicit monad as an input parameter, and implicitly returning another as one of its return values. One is passed into the 'main()' function at startup, and another is returned from whence it came--the OS? the world?--when the program terminates.

As Perl does not subscribe to the purely functional ethic, it doesn't need the obscure, high maths concept of monads to 'excuse' its use of state, side-effects and the real world.


Smoke and mirrors

*Of course, some of you may, like me, consider this a diversionary tactic--so much smoke & mirrors--to obscure and defend against the charge that you cannot have a purely functional language and do anything useful.

You may also consider that the need to latch onto such an obscure and ethereal concept to explain the ethics of a programming language would simply go away if it were admitted that you cannot have a purely functional language. That is, if you accept that some parts of any given program lend themselves to being coded in a (purely) functional style whilst other parts do not, then you can use functional techniques where they are advantageous, and avoid them where they lead to clumsy or awkward coding, or rely on obscure and difficult-to-understand concepts.

However, to a non-mathematician like me, the concept of imaginary numbers is difficult and obscure, but there is no denying their usefulness in allowing some things to be described and calculated that would be very difficult, if not impossible, without them. And just as I have to accept the concept of imaginary numbers and the rules that define them, without trying to visualise what they look like :), so it may be that there are benefits to doing the same thing for monads: just accept that they follow a set of rules, and in doing so allow X, Y & Z that would be much harder, if not impossible, to do without them.

The flaw

The flaw that I see in most of the descriptions and treatises on monads I've seen is that they fail to outline what X, Y & Z are.

One possibility that seems to make sense to me, is that monads are a way of informing the (Haskell) compiler which functions it can perform its reduction through substitution processes on at compile time, and which it cannot. In this view, they are simply (or not so simply) a flag applied to each function definition saying: This function has side effects; it can only be reduced at run time.

Provability

I have seen the argument that purely functional code is 'provable'. And therefore purely functional programs are more reliable than non-PFP. But there is an old & well-known saying that a chain is only as strong as its weakest link. And if a program requires input--and every useful program must, otherwise its result is a constant; indeed, the entire program would be a constant--then you cannot pre-prove it. You have to test it with a variety of inputs. And any non-trivial program will have far too many possible inputs to test exhaustively, so you will never be able to 'prove' any, non-trivial, useful program.

The counter argument to that is: If you segregate the program into its purely functional, provable parts; and its monadic, non-provable parts, then you can reduce the number of possible errors, and therefore the risks, and therefore the amount of testing required. And that's a pretty good argument. Reduced risk and reduced testing are very laudable, real-world goals.

However, in order to prove even the functional parts of the program, it becomes necessary not only to prove the algorithms they represent, but also that the code the compiler produces from the high-level language descriptions of those algorithms is correct. That is, you have to prove that the code the compiler produces from the Haskell language source code descriptions you feed it actually performs exactly what that description says it should.

The rub

And there's the rub: "...the Haskell language source code descriptions you feed it...". The compiler takes input. That input is infinitely variable. And we just got done saying that you cannot 'prove' a program that takes input; you can only test it.

Chicken & egg

So, it's a chicken and egg situation. If you had a provable implementation of a compiler that was built from provable descriptions of its algorithms, then you could use it to build (implement) programs from provable descriptions of provable algorithms.

Until then, you will need to test programs--statistically. And as long as that is true, there will never be a 100% guaranteed, bug-free program.

A question

So, the question becomes: are there (any) benefits to imposing this obscure view of procedural code--which is really all that monads are (or perhaps 'allow' is a better term)--that make the alienation and conceptual difficulties their use introduces to the world of programming worth it? Is provability of algorithm descriptions, without provability of the code that converts those descriptions into code, worth the costs?


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Is it worth using Monads in Perl ? and what the Monads are ?
by Anonymous Monk on Jun 12, 2007 at 01:10 UTC
    It seems like your entire article would make a lot more sense if every instance of the word "monads" was replaced with "the IO monad". There are other monads, you know. Sequencing IO actions is probably the worst example of a monad, because it gives the least insight into how monads in general are put together and used. (In particular, the IO monad is primitive in Haskell by necessity, but it's the only monad like that.)

    It seems like an especially useless abstraction to be making if you only have that one example, and you're coming from an imperative background (like perl), which is quite a lot like working in the IO monad all the time.

    If you'd like to see other examples of monads and what they're good for, I highly recommend starting off with a look at Parsec, which is a library for parsing implemented in Haskell (and then reimplemented in a bunch of other languages). A simpler decent example is the list monad, which gives you what are essentially list comprehensions, or what computer scientists call nondeterministic computations.
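    For a concrete taste of the list monad (a made-up snippet, not taken from Parsec), 'do' notation here enumerates every combination, exactly like a list comprehension:

    pairs :: [(Int, Int)]
    pairs = do
      x <- [1, 2, 3]
      y <- [10, 20]
      return (x, y)

    main :: IO ()
    main = print pairs  -- [(1,10),(1,20),(2,10),(2,20),(3,10),(3,20)]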

    Also, having the monadic abstraction lets you write algorithms which are generic over a wide class of monads, and then apply those at specific monads for a variety of effects.

    Let me give a somewhat involved example, so you can hopefully get a sense for what I mean. A friend and I wrote an L-system fractal generator which was nondeterministic, in the sense that each rule gave a list of possible expansions for a symbol, and it enumerated all possible expansions of the L-system. For display on the screen however, we wanted to choose a random one. Generating the whole list was wasteful, so rather than do that, we replaced the lists of rules in the input to the algorithm with computations in a random-generation monad which would emit one of the results in the original list at random. Since the expansion algorithm itself was generic over monad, only the input to it had to change, and we had a randomised L-system generator.

    Similar changes to the input could be made to incorporate state transitions into the expansion of the L-systems (or even continuations, if you're crazy), all without changing the code that actually does the expansion.
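    A sketch of the shape of that (the names expand and listRule are invented here; this is not their actual code):

    -- One generation of an L-system, generic over the monad supplying the rules.
    expand :: Monad m => (Char -> m String) -> String -> m String
    expand rule s = fmap concat (mapM rule s)

    -- Nondeterministic rules in the list monad: every possible expansion.
    listRule :: Char -> [String]
    listRule 'F' = ["F[+F]F", "F[-F]F"]
    listRule c   = [[c]]

    main :: IO ()
    main = mapM_ putStrLn (expand listRule "FF")

    Swapping listRule for rules written in a random-generation monad changes what gets produced without touching expand itself.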

    So what are monads really? They're embedded domain specific languages which have been constructed to conform to a particular basic structure so that various control structures and generic algorithms can be shared across those embedded languages. There are lots of good examples in Control.Monad. For instance, forM/mapM is essentially a for-each loop which works in any monad. This saves you the trouble of implementing a new for-each loop in every new EDSL you write.
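    For instance (a toy illustration, not lifted from Control.Monad's documentation):

    import Control.Monad (forM_)

    -- The same 'for-each' loop, here in the IO monad...
    main :: IO ()
    main = forM_ ["one", "two", "three"] putStrLn

    -- ...and mapM in the Maybe monad, where it stops at the first failure.
    halveAll :: [Int] -> Maybe [Int]
    halveAll = mapM (\n -> if even n then Just (n `div` 2) else Nothing)
    -- halveAll [2,4,6] == Just [1,2,3];  halveAll [2,3] == Nothing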

    That is, when you're implementing your EDSL, if you just bother to implement return and bind, you immediately get a nice API of control structures for free. Implement mzero and mplus, and you get an instance of MonadPlus, which gives you a bunch more.

    By encouraging people to write and use EDSLs (which are essentially just really good APIs) you encourage people to write code which is closer to the domain, and hence easier to think about. I/O is a red-herring -- I/O is handled by a monad in Haskell not because it's absolutely necessary that it be handled by a monad, but because it's possible, and so you get all the basic control structures for I/O from that same monad library for free.

      It seems like your entire article would make a lot more sense if every instance of the word "monads" was replaced with "the IO monad". There are other monads, you know.

      Yes, I do know, but please think about where you are reading this and the target audience, as well as my admission to being a 'failed' Haskell programmer. You should not expect deep and authoritative discussion on Haskell in this environment :)

      It seems like an especially useless abstraction to be making if ... you're coming from an imperative background, which is ... like working in the IO monad all the time.

      That was and is pretty much exactly my point. I'm aware that the bits I've elided make the statement much less definitive than I did, and within that lies the power of the monad abstraction, but that brings me back to my final question above, which can be paraphrased as: do I, as a jobbing programmer, need that power? It may be an endlessly fascinating subject for the academic/language designer, but does it really give the working stiff something that he does not already have in other languages?

      Isn't there a dichotomy between these sentences from your first and last paragraphs?

      • "(In particular, the IO monad is primitive in Haskell by necessity, but it's the only monad like that.)"
      • "I/O is handled by a monad in Haskell not because it's absolutely necessary that it be handled by a monad, but because it's possible,"

      From this point I'm going to have to resist the temptation to get drawn into arguing about the efficacy and effectiveness of Haskell (here). This is the wrong place, and I am not qualified. I've already allowed myself to over-extend my reach, and I'll probably regret doing so. But I'd be willing to continue (and would relish the opportunity to learn) off-post via email, but not here.

      However, your arguments above remind me of a post I made some time ago. Please do not take offense. Nor think I am aiming the first line (or most of the rest of it) at you. And definitely ignore the context in which it was posted. But...there is one bit that I think is applicable, though even that isn't worded quite right for application in this context.

      ...and succeeded in re-creating your system using theirs, will you be in a position to counter their objections to your system, in their own terms.

      Only then will you have the ammunition with which to prove the veracity and accuracy of your system, in a way such that they can accept it.

      What I'm trying to get at here is that if you 'buy in' to the whole Haskell experience--from the efficacy of Hindley-Milner and functional programming, and purity up--then everything you say above, and all the deep discussions and treatises and papers in the Haskell world, may (probably) make sense.

      But, if you are coming from a long and deeply imperative background as I am--and, with respect, the vast majority of the world's programmers are--then without the mathematical grounding to fully appreciate the full implications of HM, and the category theory underlying monads, it does seem like, as you put it above, "an especially useless abstraction".

      The problem with the rich and extensive set of information available on Haskell, and monads in particular, is that for the most part, it assumes a deep understanding of those underlying (and often obscure) mathematical concepts. It is often also somewhat pejorative and condescending ("... or what computer scientists call nondeterministic computations.") of those that do not come from that background.

      I'm not offended by that. It may well be true that I am either too old and set in my ways, wrongly educated, or just too stupid, to ever make the journey of enlightenment required. Required to allow me to get beyond the feelings of frustration when I encounter something (in Haskell) that seems clumsy, over-complicated or outright impossible, when I know that I could do that same thing in any of a dozen other languages in my sleep.

      You'll note (or rather I'm going to do it for you), that I never said that Haskell didn't need or benefit from monads. Indeed, I said the exact opposite. And when I asked "Are monads worth [the bother]", I was doing so from the perspective of someone who has written hundreds of programs and 10s or 100s of thousands of lines of code (guesstimate!), without ever once needing 'monads'.

      And to explain that monads are necessary "because they ensure referential integrity", isn't enough. Because I've written those same number of programs and lines of code without needing that either. And that's where the quote from my old post above comes in.

      Until I can understand how the provision of the same facilities Haskell has would benefit my programs in Perl (or C, or Java, or ...), it's just not possible to see beyond the argument that Haskell only has those facilities because Haskell needs them.

      And Haskell only needs them because it makes certain (dogmatic) decisions about the need for purity, referential integrity, strong typing and provability. And all of those are just too easy to dismiss as unnecessary and/or non-beneficial.

      (I can see) There are several aspects of Haskell that are desirable, but as a language it is impossible to have those bits without taking all the rest, and the rest imposes such burdens upon the development and thought processes, that I have to question whether they are worth it.

      Don't misunderstand me. I know (for example) that proficient Haskellers tend to win against-the-clock programming competitions with unerring regularity, almost to the exclusion of all other languages. So I know that, in the right hands, Haskell very much challenges and often surpasses Perl at its own game in the programmer productivity stakes. But, how many expert Haskellers are there?

      And how many of those programmers earning a living on a daily basis in other languages have the (maths theory) background and nous to make the transition to using Haskell? And if they have to acquire a good understanding of category theory and type system theory and dog knows what else in order to see the benefits of the transition to using Haskell, how many of them are going to have the incentive to try?

      Until someone can explain its benefits to them in their own terms, the incentive to learn enough to understand the concepts in Haskell's terms, and those of functional programming purity, is minimal. Mostly just curiosity, which is easily 'cured' by work pressures, or condescension, or apparently self-describing, self-confirming 'belief' systems.


      If you'd rather read that with proper spacing, see: http://programming.reddit.com/info/1xlxp/comments/c1xoho
      Oops, all the spacing got squashed out of that. Oh well.
Re^2: Is it worth using Monads in Perl ? and what the Monads are ?
by Anonymous Monk on Jun 11, 2007 at 20:09 UTC
    Unfortunately, I/O is probably the worst example of a monad there is. The next worst example is the state monad. The "Maybe" monad is a much better example, especially for beginners.
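    To illustrate with a made-up beginner example: chaining computations in the Maybe monad short-circuits on the first Nothing, so there is no explicit error checking between the steps.

    safeDiv :: Int -> Int -> Maybe Int
    safeDiv _ 0 = Nothing
    safeDiv x y = Just (x `div` y)

    calc :: Int -> Int -> Int -> Maybe Int
    calc a b c = do
      q1 <- safeDiv a b
      q2 <- safeDiv q1 c
      return (q2 + 1)

    main :: IO ()
    main = do
      print (calc 100 5 2)  -- Just 11
      print (calc 100 0 2)  -- Nothing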

    You might like the Clean language more, where I/O is done with unique types. And then, once you get tired of carrying this "world parameter", you might say to yourself, "Why don't we tuck this boilerplate away into a function, so that we don't have to mess with it all the time?" Bingo, there's the usefulness of monads.
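    A sketch of that "tucking away" (this is essentially the standard State monad, written out under an invented name, WorldM):

    newtype WorldM w a = WorldM { runWorldM :: w -> (a, w) }

    instance Functor (WorldM w) where
      fmap f (WorldM g) = WorldM $ \w -> let (a, w') = g w in (f a, w')

    instance Applicative (WorldM w) where
      pure a = WorldM $ \w -> (a, w)
      WorldM mf <*> WorldM mg = WorldM $ \w ->
        let (f, w')  = mf w
            (a, w'') = mg w'
        in (f a, w'')

    instance Monad (WorldM w) where
      WorldM g >>= k = WorldM $ \w ->
        let (a, w') = g w
        in runWorldM (k a) w'

    main :: IO ()
    main = print (runWorldM (pure "ok") (0 :: Int))

    The world-threading boilerplate is written once, inside (>>=); everything built on WorldM then gets it for free.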

    And to answer the OP's question: there is essentially no good reason why anyone would use monads in Perl. Besides the fact that Perl's syntax makes it heavyweight and messy, its lack of static typing makes it hard to program in a functional manner.

      Unfortunately, I/O is probably the worst example of a monad there is. The next worst example is the state monad. The "Maybe" monad is a much better example, especially for beginners.

      As I hope I demonstrated above, I understand what monads, and the IO Monad, are and do.

      The problem I have with monads, is not the name, or their obscure math derivation, or using them (much, though I admit I probably couldn't derive my own any time soon).

      The problem I have with them, is the need for, and the benefits of, having them at all.

      They are, emotionally, just a way of concealing the fact that Haskell programs contain code that has state, has side effects and is procedural. Technically, they are a way of indicating to the compiler which functions have no side effects and can therefore be safely re-written at compile time, and which cannot. And that makes them a really awkward way of setting a flag.

      It would be so much simpler to have a keyword that could be placed at the top of a module, or even halfway down, that says: any definitions above this line are purely functional; anything below it is procedural code that has side effects. The 'monadic code' then just becomes normal procedural code that carries an implicit 'world state' from one step/statement to the next; it could be coded in a normal, procedural style with loops and state and side effects, and without all that faffing around that monads introduce.

      No need to (badly) re-invent all the standard procedural constructs--like try/catch et al.

      At this point, I'm gonna stop, and ask you to email me (address on my home node) if you feel like continuing this, because this has become about me and Haskell (and Clean--which I do like more, though it lacks the community and available tutorial information that Haskell has. Eg. A search for lang:clean on google codesearch finds zero hits!), and has ceased to have any relevance to Perl. I'd enjoy the rebuttal :)


          They are, emotionally, just a way of concealing the fact that Haskell programs contain code that has state

        Rephrasing to mitigate the emotional stress: monads encapsulate effects. They aren't there to conceal in the sense of deception, but rather in the sense of information hiding.

        I agree with the other poster who said the IO monad's the poorest one to look at, because I agree with you when you said a Perl program is in a(n IO) monad. But that's trivially true; the value comes when you look at the richness of different monads, and when you can take your code and separate functions out of it into those that really are pure and those that can be made to fit in monads a, b, c.

        Here's a simple example.

        ruleSubName :: RuleParser String
        ruleSubName = verbatimRule "subroutine name" $ do
            twigil <- option "" (string "*")
            name   <- ruleOperatorName <|> ruleQualifiedIdentifier
            return $ ('&':twigil) ++ name

        This is part of the Pugs parser, the code to parse out the name of a sub. Inside the RuleParser monad, sequencing means "demand the following parse to work". But it also means take care of bookkeeping (source code position, name of what we're trying to do for use in error messages in case we fail). If a parse fails, the associated further bookkeeping is automatic. Here we say "first of all, look for a '*' string, but it's okay if we don't find it". The function string is monadic; if it failed then the function 'option ""' works much like a try/catch and provides the fallback. Anyway, now twigil is either "" or "*", and the position has advanced by 0 or 1 columns.

        Now we try to parse a "name". We try a first parser, ruleOperatorName, and if it fails, ruleQualifiedIdentifier. If that one fails too, ruleSubName will fail. The actual behavior of that failure depends on who called us; if they were ready for us to fail (for example, by modifying us with "option" or "<|>") then a different parse might result higher up. Rewinding the position happens automatically, for example if we had already consumed a "*". But if not -- if we're in part of the Perl 6 grammar where the only conceivable term is a SubName and one could not be obtained -- then the user will get an error message saying Pugs was looking for a "subroutine name".

        What I'm hoping I've shown is that all these bits and pieces of combining small monadic functions are common to the domain of parsing. Haskell isn't being deceptive in not making me put try/catch blocks over every line. It's encapsulating very specific and repetitive functionality.

        I think one great difficulty in understanding monads is they take "if you look at it that way" to the extreme; and Haskell programmers often look at problems in surprising ways. Since parsing is a very practical problem, and familiar to Perl programmers at that, it's a good monad to learn with.

        They are, emotionally, just a way of concealing the fact that Haskell programs contain code that has state, has side effects and is procedural.
        I'd reword that a little differently. The type system loudly proclaims to everyone that these particular functions are contaminated by state and side effects, so by-golly you'd better don the bio-hazard suit and proceed with caution.
        it could be coded in a normal, procedural style with loops and state and side-effects; and without all that faffing around that monads introduce.

        No need to (badly) re-invent all the standard procedural constructs--like try/catch et al.

        I think there's probably a large subjective element in choosing a favorite programming language. Some people like cats, others like dogs. When I first encountered Haskell, my initial reaction was, "Yes! This is how programming was intended to be done". I no longer needed to faff (?) around with side effects willy-nilly everywhere.

        It sounds like you've had the opposite reaction. No harm, no foul. If you don't like broccoli, there is no sense in feeling bad just because others rave about it. And anyone who nags you about the health benefits of broccoli is a bore.

        Haskell programmers truly think it is a better way to program. No apologies. (Heck, maybe we're just crazy). And some of us think that Haskell doesn't even go far enough.

        I don't view monads as just a way of telling the compiler what effects I'm using (it doesn't really do that anyway). It's a way of telling *me* (or someone else reading the code) that this is effectful code, and that care must be taken.
Re^2: Is it worth using Monads in Perl ? and what the Monads are ?
by Anonymous Monk on Jun 11, 2007 at 21:55 UTC

    You may be interested to know that Haskell did not always use monads for I/O. This paper describes three alternative, purely functional methods for implementing I/O in Haskell that were used before monads:

    http://citeseer.ist.psu.edu/30726.html

    It is also important to note that monads are not special things wired into the compiler. They are built using normal Haskell code and live in the standard libraries.

    It is quite possible to write real programs in Haskell without using any monads. Of course, you would probably just reinvent them without realizing it.

      On the Expressiveness of Purely Functional I/O Systems

      It's possibly a very interesting paper, but all I can find online are citations and references--which means it might as well be printed in a 9 point font on the Moon for all the accessibility it has.

      It is also important to note that monads are not special things wired into the compiler. They are built using normal Haskell code and live in the standard libraries.

      That sounds promising. Care to show an example of how to read from stdin and write to stdout without using an IO Monad? Or just perform a question and answer session interactively without them/it?

      It is quite possible to write real programs in Haskell without using any monads.

      Again, care to post an example that can be compiled with GHC v6.4?



        On the page I linked to, in the top-right hand corner, there are links to the paper in several different formats.

        The paper includes a bunch of examples. As far as I know, there is no fundamental reason why those libraries could not be used with a modern version of GHC.

        A simple experiment you can do to see what non-monad I/O is like is to use the interact function.

        main = interact yourFunction

        yourFunction is a pure function which takes the entire contents of stdin as one lazy String, and returns a lazy String which is written to stdout.
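        For instance, a complete little program that reverses each line of its input (the choice of filter is arbitrary; any pure String-to-String function will do):

        main :: IO ()
        main = interact yourFunction

        yourFunction :: String -> String
        yourFunction = unlines . map reverse . lines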

        For some examples, (such as interactive questions using stdin/stdout) check out this page:
        http://cod3po37ry.blogspot.com/2007/02/more-on-haskell-io-and-interact.html

        Although interact is currently implemented in the IO monad, I believe it pre-dates monadic IO.

        ...all I can find online are citations and references...
        What happens when you click on the PDF link in the light blue box in the upper left hand corner? What about this link instead?
Provability possible for PL with side effects
by Anonymous Monk on Mar 16, 2008 at 02:10 UTC

      Wow! A rave from the grave.

      First off, thank you for the links. I've only browsed most of them so far, but they will make for interesting reading over time.

      If I am not wrong, FP-people say that absence of side effects simplify proofs of correctness (wrt specifications), not that proofs of programs with side effects are impossible.

      Maybe they do, but not in most papers I've read. When I read a formal proofs paper that deals with a real world problem, rather than isolated, mathematically pure problems--like incrementing a number or multiplying two numbers together--then I (and other programmers) might start to sit up and take notice.

      What do I mean by a real world problem? Let's start with something really trivial. Command line filter programs. Say wc and tee. Show me (and explain) a formal proof for one of these programs. How it will handle taking input from a local disk file; or a remote disk file; or via pipe from another program; or from the keyboard; and ditto for the sink; and still do its job.

      But they never do that. Somehow, they manage to persuade someone, somewhere to pay them public money (it has to be public, because no one would fund this stuff privately), to write papers on how to "prove" a computer can successfully increment a number. And they come up with:

      Let's try it on a trivial example. Informally, the specification is to increase the value of integer variable x. Formally, we face a problem: we are given no way to relate the final value of a variable to its initial value. We have to use a trick: introduce an extra variable, not part of the state space, to carry the relation. We write

      ∀X· {x = X} S {x > X}

      meaning: whatever x may have been equal to at the start, it is greater than that at the end. For S we might propose the assignment x:= x+1. To prove it correct, we must use the assignment rule: {substitute e for x in R} x:= e {R}. In the example, this means

      {x+1 > X} x:= x+1 {x > X}

      The precondition is not what we want, so we now use the consequence rule: (∀σ· A ⇒ P) ∧ {P} S {R} ∧ (∀σ· R ⇒ Z) ⇒ {A} S {Z}, where σ (the state) stands for all variables. In the example, this means we must prove

      ∀x· (x = X) ⇒ (x+1 > X)

      which is now a simple theorem of ordinary logic. All that, just to prove the obvious!

      No shit Sherlock. All of that, just to prove $i++? And it doesn't even handle the case where $i = 2147483647;. They'll undoubtedly determine the outcome of incrementing that to be 2147483648, but the reality is that, on 32-bit computer systems, it might be. Or it might be -2147483648.

      But that formalism above won't consider the case of what happens when you increment a signed integer across its register size boundaries, because mathematicians live in a theoretical world of Real and Natural (who knows how many other obscure theoretically infinite, mathematically convenient) number domains, and none of them pay the slightest consideration to mundane realities like register sizes and word lengths.
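      As a quick, concrete illustration of that point (a Haskell snippet using the fixed-width Int32 type; it is not from the paper):

      import Data.Int (Int32)

      main :: IO ()
      main = do
        let i = maxBound :: Int32  -- 2147483647
        print (i + 1)              -- prints -2147483648: the increment wraps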

      Now let's look at what they come up with for multiplying two numbers. This is the modern, advanced, abbreviated, compacted version. See the last linked paper above, page 9, for the unabridged nightmare:

      x' = x×y  ⇐  s:= 0; s' = s + x×y; x:= s

      s' = s + x×y  ⇐  if y=0 then ok
          else if even(y) then (x:= x×2; y:= y/2; s' = s + x×y)
          else (s:= s+x; x:= x×2; y:= (y−1)/2; s' = s + x×y)

      I won't pretend to fully understand that, so I'll ask a few questions: Does it handle register overflow? How about one or both inputs being #NaN? Or either or both being one of the two flavours of infinity?

      I'm no mathematician, but take a closer look at the case above where if y=0 then ok. According to the paper, "ok is the identity relation", and then later on, "The empty (do nothing) program ok is the identity relation". Que? The way I read that, if the second (or one?) of the inputs to multiplying two numbers is zero, we do nothing! So-orry, but my schoolboy arithmetic tells me that anything times zero is zero--not throw your arms up in disgust and do nothing.

      Bad rhetoric

      I realise that the author is a much cleverer person than I am, but right from the start of this paper there are a bunch of things that just don't ring true.

      Formal programming methods allows us to prove that a computation does what its specification says it should do. More importantly, formal methods help us to write specifications, and to design programs so that their computations will provably satisfy a specification.

      This is analogous to the use of mathematics in all professional engineering: civil engineers must know and apply the theories of geometry and material stress; electrical engineers must know and apply electromagnetic theory.

      So why don't software engineers know and apply a theory of programming?

      I've broken that into three paragraphs. The first is the sales pitch. A broad-termed, theoretically desirable goal. A panacea. A magic bullet.

      The second paragraph is supporting argument.

      • Argumentum ad verecundiam

        in all professional engineering

        Most civil engineers do not use theories of geometry and material stress. They use tables.

        If this member is at this angle carrying this load, it will need to be--scan across for the angle, scan down for the load--this strong. And looking that up in the manufacturer's tables for (say) steel girders or reinforced concrete lintels, including the statutory increment for safe working load for this type of construction, it will need to have this cross-sectional area. And that means part number xyz. Of course, for the most part these days, they'll feed their proposed design into a finite element analysis simulation, and increase the strength of the members until they eliminate any "hot spots".

        And an electrical engineer will rarely need to remember, let alone actually calculate with, any "electromagnetic theory" once he's left college. It's mostly just a case of picking the appropriately rated parts from the manufacturer's catalog. Sure, someone somewhere had to design those parts, but I guarantee that not one single rating was arrived at through mathematical processes. The ratings are arrived at by testing--probably to destruction--and then applying a substantial "fudge factor" to indemnify the manufacturer against failures when close to the limits.

        As an example, there is a whole sub-culture of computer geeks who get their jollies by seeing how far they can push the clock rates of CPUs beyond the manufacturers' ratings.

      • Cum hoc ergo propter hoc.

        The paper postulates that because of the use of mathematics in all professional engineering, all programs should be written using the theories of mathematics. And all programmers should be mathematicians.

        Except, as I've shown above, the cum hoc is false, hence the ergo doesn't.

      To date, in all the sources I've read, mathematicians completely miss the point about engineering.

      Engineering is not about specifying products, from the ground up, in logically provable, theoretical terms. Engineering is about assembling a collection of standard parts, to meet a set of requirements, in a timely fashion, for the lowest cost. And whilst some of those requirements will be measurable in ISO units--speed, weight, strength etc.--another bunch of requirements will be ethereal things like looks, feel, effects on the environment, impact on the view, life-time costs, built-in obsolescence, usability, desirability, etc.

      And the impact of those ethereal elements on the success of products becomes ever more important. Take the success of things like:

      • the mediocre technical capabilities of the iPhone:

        but boy does that interface ever make for good TV ads.

      • the heavy, slow and expensive Toyota Prius:

        it's not that much more fuel efficient than a modern say 1.2 litre VVT petrol engined four seater, and far less fuel efficient than a modern, common rail 1.4 diesel engined equivalent.

        The secret of its success? You cannot mistake a Prius for anything else. And there are no 200+ HP, 16-valve, turbo charged variants of it. So when you pull up alongside the red carpet in it, amongst all the 40' long, gas-guzzling stretched limos, no one is in any doubt that "you care".

        Even if you did just fly half way around the world in your private jet to attend.

      Engineering is not theoretical

      Engineering is not about proving that the product complies with theories. It's about satisfying a few theoretically quantifiable requirements, and a whole bunch more unquantifiable ones. Read about the construction of the Millau Viaduct. A very small part of the effort involved the solving of well-known formulae ensuring the strength of the beams, cables and towers. A larger proportion was to do with the logistics of constructing those huge skinny towers in a "continuous pour", running the cables and manoeuvring the decks into position.

      But a far larger proportion of the effort was involved in minimising the impact of the construction on the environment; ensuring the aesthetics of the final construction; engendering national pride in the French psyche.

      Very few if any software projects have anything like these logistics and emotional criteria to deal with.

      Dark Ages

      In engineering terms, the software industry is currently roughly equivalent to the 16th-century cathedral-building industry in terms of sophistication. Like them, we build each construction from scratch, individually, to fit its landscape. Each building block is carved to fit its individual neighbours. We build the infrastructure just as thick as we think it needs to be, and if it starts to fall down when we're half way up, we build it thicker.

      In metalworking terms, we may no longer need to start by extracting the raw metal from the ore, but we are still at the point of fettling pieces of standard stock to manufacture individual components.

      Software development has as much to do with "engineering" as does cookery. Indeed, cookery is now routinely done on a mass production, engineering scale, and software hasn't even come close to achieving that.

      I sit here at this computer with a dozen or more languages installed and, despite the theoretically provable, analysed and documented to death nature of a simple sort function, there are probably 2 or 3 dozen implementations of quicksort in the various libraries on my system. And they probably all have bugs.

      Until we start constructing programs from standard parts, the terms 'software' and 'engineering' do not belong in the same sentence. And I don't mean standard libraries of functions, or extended class hierarchies.

      The former come in huge great globs of disparate facilities, all lumped together into single statically or dynamically linkable chunks. If I want to replace a single function, be it sort() or strcmp(), I have to replace everything else its library contains as well. And that means re-testing everything.

      And the latter is source code. The approximate equivalent of blueprints. To replace the implementation of a single class means rebuilding the program (and every program) that uses it. And that means re-testing everything from scratch.

      More craft than engineering

      Until OSs provide us with a catalog of standard parts, specified in high-level, generic terms, that we can replace individually, and assemble together, at runtime, into higher level sub-components; test in the abstract and then assemble those into higher level components still; software development will remain more craft than engineering.

      And formal methodologies will remain a theoretical pipe dream, beloved of academics but largely ignored by the industry, just as they have been--as the paper points out--ever since Hoare first proposed them in 1969.

      It took metalworking nearly two millennia to move from the bronze age to engineering. I doubt that software will take quite so long, but we are currently still somewhere in the Dark Ages and (IMO) Hehner's optimism: "I am optimistic that Kieburtz is right, that the use of formal design methods is just around the corner." is still wildly misplaced.

