http://qs321.pair.com?node_id=707836

Since computers nowadays all have multiple cores (I just got my first multi-core computer), it looks like everything is going to have to be multi-threaded or multi-process, or fail to take advantage of the hardware. If it's true that "cores are the new megahertz", this is going to get kind of crazy. I mean, we might be looking at a future where everyone has a 32-core machine on their desk, and beyond.

How are we, the Perl community, going to address this? How do we see it affecting us? I'd love anyone's thoughts on this general question.

Re: Multi-core and the future
by kyle (Abbot) on Aug 29, 2008 at 21:53 UTC

    A lot of my work is in a web application environment. There, Perl's threading or forking or what-have-you isn't really relevant. Apache spawns processes for multiple requests, and the database spawns threads to handle them. I get the benefit of many CPUs essentially for free.

    In back end code, when I want to really max out the machine, I generally write things so that more than one can run at a time and then run as many as I want. I have to think about it a little more, but not much.
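
    A minimal sketch of that pattern, assuming the CPAN module Parallel::ForkManager is available; the job list and process_job() are placeholders:

        use strict;
        use warnings;
        use Parallel::ForkManager;

        my @jobs = map { "job_$_" } 1 .. 20;      # hypothetical work items

        # Let up to four jobs run at once (for example, one per core).
        my $pm = Parallel::ForkManager->new(4);

        for my $job (@jobs) {
            $pm->start and next;    # parent: schedule the next job
            process_job($job);      # child: do the heavy lifting
            $pm->finish;            # child exits here
        }
        $pm->wait_all_children;

        sub process_job {
            my ($job) = @_;
            sleep 1;                # stand-in for the real work
        }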

    I can imagine there could be situations where an individual request would benefit from being spread across as many CPUs as are available, and the "natural" solution just doesn't do that. Still, I think that will be the exception rather than the rule.

Re: Multi-core and the future
by tilly (Archbishop) on Aug 30, 2008 at 00:03 UTC
    There was a good thread on this on another forum I visit. See here for the thread, and my ideas on how I see concurrency playing out in the next few decades.

      Hmm... interesting.

      Don't take it badly, but you also seem a bit old, like me. :-) So I'll assume that you know what I am talking about when I ask:
      What is the present thinking on the programming model for the Connection Machine? How does that weird vector processing stuff for non-numerical data relate to your argument?

        I don't know enough to tell you the present thinking. I can, however, tell you my thinking.

        My thinking is that it is an interesting computing architecture that they did a lot of research on but never found enough real applications for, which is why the company folded. Nobody else has found applications for it either, which is why you haven't seen it reintroduced.

        However, your vector-processing comment reminds me of some stuff I saw about using GPUs for MD5 hashing, resulting in code that runs dozens of times faster than it does on a regular CPU. See http://majuric.org/software/cudamd5/ and http://www.schneier.com/blog/archives/2007/10/speeding_up_pas.html for the details.

        Of course a GPU is not good for general purpose computing. It is fast for certain tasks, and certain tasks only. So what you'd want to do is to make a call to a function that offloads the work to the GPU.

        My suspicion is that a similar strategy will work well for dealing with many cores. For example consider your basic webserver. You can get lots of naive parallelism by having one or two processes per CPU. And then the heavy lifting can be done in a database. Within the database certain algorithms, for instance sorting, may be written to make use of as much parallelism as is available.

        So the result is that the programmer writes, as today, naive single-threaded code. However, both in the webservers and in the database, parallelism will be extracted and used where appropriate. Better yet, this is done with no need for special support in the language.

Re: Multi-core and the future
by Tanktalus (Canon) on Aug 29, 2008 at 22:55 UTC

    I agree with kyle. I've had a quad-core on my desktop for a few months now, and there are just some things a user can think about differently. For example, converting AVIs to DVD MPEGs: just have more than one thing to do, and let make take care of running them in parallel.

    As for Perl, well, I'm going to pretty much have to wait for a properly thread-safe Perl before I can really take advantage of it. Sure, I can use some shared memory and forking, or try to share objects across ithreads, but that's way too much fiddling with protocols to bother with. I'll wait for Perl 6 before really delving into parallelisable tasks (and I have *plenty* at work, they just need to communicate too much information!).
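
    For what it's worth, a minimal sketch of the ithreads sharing mentioned above, using the core threads and threads::shared modules; the workers and their results are placeholders, and only plain shared scalars, arrays and hashes can be handed around this way, which is a good part of the fiddling being complained about:

        use strict;
        use warnings;
        use threads;
        use threads::shared;

        my %results :shared;     # only plain scalars/arrays/hashes can be shared

        my @workers = map {
            threads->create(sub {
                my $id     = shift;
                my $answer = $id * $id;      # stand-in for real work
                lock(%results);              # protect the shared hash
                $results{$id} = $answer;
            }, $_);
        } 1 .. 4;

        $_->join for @workers;
        print "$_ => $results{$_}\n" for sort keys %results;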

    What multiple cores do on the desktop is really just keep one process from locking up the system. Unless it's actually X that is hung, I can generally go to another window and pkill a runaway process. Or, if it really WILL take a long time, I can simply ignore it - I still have 3 full cores that the initial application can't muck with. If every app became multi-threaded or multi-process, I wouldn't be able to do that ;-)

Re: Multi-core and the future
by jdrago_999 (Hermit) on Aug 29, 2008 at 22:57 UTC

    For now, we have threads and forks. I prefer forks when coding for Linux.
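
    A bare-bones fork(2)-based sketch of that approach, with the per-chunk work left as a placeholder:

        use strict;
        use warnings;

        my @pids;
        for my $chunk (1 .. 4) {
            my $pid = fork;
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {           # child
                do_work($chunk);       # placeholder for the real work
                exit 0;
            }
            push @pids, $pid;          # parent keeps going
        }
        waitpid($_, 0) for @pids;

        sub do_work { sleep 1 }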

    Perl6 will magically provide for parallelism by allowing the VM to handle that kind of thing. Basically, whenever your code could benefit from multiple threads or whatever, it will Just Happen.

    This is a completely different paradigm from other languages (C, Java, C#, etc.), where using threads means a fair amount of extra work managing those extra threads.

    Being Perl though, I expect we will also be able to reach down and touch the bare metal of threading and parallelism ourselves whenever we like.

Re: Multi-core and the future
by Gavin (Archbishop) on Aug 30, 2008 at 12:58 UTC
Re: Multi-core and the future
by CountZero (Bishop) on Aug 30, 2008 at 15:13 UTC
    As was said during the last YAPC::EU in Copenhagen, multiple cores also mean a lower clock speed, so any application that runs on only one core is by definition running slower, and is essentially at the mercy of the OS as to whether its core gets shared with other applications, slowing it down even more.

    Not all applications are open to parallel execution. A "read one line, do something with it, write the result to a file, start over again" loop is basically sequential and would need a lot of overhead to split over multiple cores. I think the majority of the scripts we write are of this type.
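
    A sketch of that sequential shape (the file names and the per-line transform are placeholders): even though each line is independent, the loop still touches them one at a time, in order.

        use strict;
        use warnings;

        open my $in,  '<', 'input.txt'  or die "input.txt: $!";
        open my $out, '>', 'output.txt' or die "output.txt: $!";

        while (my $line = <$in>) {
            chomp $line;
            print {$out} transform($line), "\n";   # one line at a time, in order
        }

        sub transform { scalar reverse $_[0] }     # stand-in for "do something with it"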

    CountZero


      Sorry, but multiple cores do not by definition run any slower.

      Sure, if you have a CPU that comes in multiple configurations, the option with more cores will be clocked slower than the one with a single core. However, we've hit intrinsic limits on how fast we can make CPUs, so Moore's Law can only continue by increasing the number of cores. Over time, then, we expect to see clock speeds remain about where they are while the number of cores increases.

      Now you're right when you say that you're at the mercy of the OS as to whether your core is shared with other applications. But with one core you had to share that CPU anyway. And as you increase the number of cores, the odds that other applications need to be scheduled on your core go down, which is good. The bad news is that managing more cores takes OS time, so if you have too many cores, then on an SMP system the core you're on won't spend all of its time doing your work.

      However, modulo that small effect, if you're happy with the current speed of your code, multiple cores are not a big deal either way. They only become an issue if your code needs to go faster than it already does.

        Don't be sorry, you are probably right!

        What I meant, but failed to make clear, was that in general processors with more cores have a lower clock speed. Sure, there are high-speed multi-core processors, but they tend to be very much more expensive; so "dollar for dollar" I feel that you get more processing capacity, but at a lower clock speed.

        CountZero


Re: Multi-core and the future
by LesleyB (Friar) on Aug 30, 2008 at 12:54 UTC

    I'm nowhere near clever enough to know how the Perl community will deal with multi-core machines but I suspect there will be a lot of fun involved.

    Mathematically, single-core computing serialized a lot of calculations.

    Before computers, there would be a number of mathematicians working on one problem together, so parallelisation was high. Multi-core has brought back that parallelisation.

      "Computer" used to be a job title. Quite a lot of effort went into parallelizing algorithms so they could be dispatched to a (large) room full of people (usually men) with desktop calculators who would each play CPU.

      One of my college professors had a summer job as a "computer" at Grumman. Reading Nevil Shute's autobiography, Slide Rule, will give a flavor of that era. One of Mr Shute's titles was that of "Chief Calculator," and he was a noted aeronautical engineer.


      — emc

        There's also a wonderful book called "Weather prediction by numerical process" by Lewis Fry Richardson, published in, IIRC, the 1920s. In it he explains how to predict the weather using methods similar to what we use now, the significant difference being that instead of handing the data to a computer, he would hand it to a vast amphitheatre full of people, all of them trained to perform particular mathematical operations accurately, who would pass their results one to the other until all the data was crunched. His "computers" even had a clock signal just like ours - a conductor standing where they could all see him.
Re: Multi-core and the future
by dragonchild (Archbishop) on Aug 30, 2008 at 23:34 UTC
    This is one of the many goals of Perl6. The Perl5 engine just isn't capable of being reworked into a multi-threading engine without a complete rewrite (hence, Perl6).

    For my own, I'm looking at Erlang.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      I can elaborate on that a bit.

      Consider control structures such as for loops. In Fortran, they put the "for" after the statement to indicate vector hardware usage. Perl has the same syntax, but it works the same as the ordinary form. In general, though, the idea is to define looping constructs that are more like SIMD than traditional loops: perform this step on each of these items, but in no particular order. The order of the iterations is not defined, so they may just as well run in parallel, or compile down to the SIMD instructions on the CPU.

      In particular, the "hyper" operator syntax is defined this way. @c = @a »+« @b; will add the corresponding elements in parallel. @list».run(); will execute the method on every item in the list, in parallel.

      Also, lists may be "lazy", using co-routines to delay evaluation. But if there are more cores free, why not start working on the list AND return at the same time? Rather than waiting until items are known to be needed, or always computing them up front, the runtime can compute the list in the background.
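
      The Perl 6 described here was not yet runnable in 2008; as a small sketch, here is how the same two ideas look in present-day Raku (the language Perl 6 became), where hyper operators are allowed, though not guaranteed, to evaluate element-wise operations in any order, and sequences can be lazy:

          # Raku (Perl 6)
          my @a = 1, 2, 3;
          my @b = 10, 20, 30;

          # Element-wise addition; the order of evaluation is unspecified,
          # so an implementation is free to do it in parallel.
          my @c = @a »+« @b;
          say @c;                          # [11 22 33]

          # A lazy, potentially infinite sequence: elements are only
          # computed when they are asked for.
          my $squares := (1 .. *).map(* ** 2);
          say $squares[9];                 # 100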

      —John

        All of this sounds great. And by great, I mean: brilliant; fantastic; amazing; extraordinary; just what the doctor ordered. I intend no hint of 'faint praise' here. But...

        Does any of this actually work? Anywhere?

        Each time a new Parrot comes out I dutifully download it and go exploring for the infrastructure to support these constructs, and I come up short.

        And every now and again I pop over to PDD 25 and look for changes, but rarely notice any. It still talks the talk of supporting every concurrency model known to man, and then some, but comes up way short of explaining how it's going to achieve this feat.

        And that's essentially why I stopped following the P6 lists. There is an awful lot of talk about what P6 will do. Discussion after excruciating discussion about almost irrelevant minutiae of what Perl 6 will do. But almost nothing, and fiery defence of there being nothing, when the question is asked: how will P6/Parrot pull this off? (Not to mention the now sad joke of "by Christmas" for the question: when.)

        I want to believe. I really, really want to believe...


        Another typical example of unordered evaluation is junctions.
        all($x, $y, $z) > 0
        autothreads to
        all($x > 0, $y > 0, $z > 0)
        At the same time it short-circuits, so as soon as one of these conditions is false, the execution of all the other autothreaded comparisons (or, in general, sub calls) can be interrupted.
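
        A small sketch of junction autothreading in present-day Raku syntax (the 2008 design differed in detail); the is-even sub is just an illustrative stand-in:

            # Raku (Perl 6)
            my @values = 3, 7, 42;

            # The comparison autothreads over all members of the junction.
            if all(@values) > 0 { say 'all positive' }

            # Autothreading also happens through ordinary sub calls.
            sub is-even($x) { $x %% 2 }
            say so is-even(any(@values));   # True, because 42 is even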

        Parallelism is one of the reasons why F90 introduced array operations, e.g. something like Perl's @a = @b or grep; an implementation of these would be an obvious part of a parallel version of Perl (∀Perl?). In F90 there is no special syntax akin to @a = @b »+« @c needed; it's not distinguished from a regular addition.


        — emc

Re: Multi-core and the future
by swampyankee (Parson) on Aug 30, 2008 at 15:11 UTC

    I suspect that one viable long-term solution is to modify Perl proper to manage the level of parallelism provided by multi-core processors, perhaps by binding something like Linda or MPI to Perl (another task for the Perl 6 developers).

    I tend to think that a better long-term solution is to incorporate parallelism into the language itself, as is done in languages such as Fortran-M. Yes, I know that Fortran is considered horridly outdated, but a lot of effort has gone into extending Fortran to reduce the burden on application programmers of writing parallel programs.



Re: Multi-core and the future
by John M. Dlugosz (Monsignor) on Aug 31, 2008 at 21:36 UTC
    I'm using a 16-core machine at work for testing the server apps I'm working on. I found that MS Visual Studio will spawn multiple copies of cl to compile, so it really does compile the large project faster!

    A server application built around the concept of multiple worker threads dispatching incoming events will naturally like more cores.

    Also, even if the threads can't really work concurrently and instead proceed in lock-step, having each on a different core still improves the switching time.
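
    For illustration, a minimal sketch of that worker-thread shape in Perl 5 using ithreads and the core Thread::Queue module; the "events" here are just strings and handle_event() is a placeholder:

        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        my $queue = Thread::Queue->new;

        # A small pool of workers, each pulling events off the shared queue.
        my @workers = map {
            threads->create(sub {
                while (defined(my $event = $queue->dequeue)) {
                    handle_event($event);
                }
            });
        } 1 .. 4;

        # The dispatcher side: enqueue incoming events (stand-ins here),
        # then one undef per worker as a shutdown marker.
        $queue->enqueue("event_$_") for 1 .. 20;
        $queue->enqueue(undef) for @workers;

        $_->join for @workers;

        sub handle_event { my ($e) = @_; print "handled $e\n" }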

    —John

Re: Multi-core and the future
by dwm042 (Priest) on Sep 04, 2008 at 15:48 UTC
    I'm offering this opinion just to be a little contrary, buuut..

    On a desktop, I don't want most apps using every processor they can find. I'd be much happier if they efficiently used one processor. I'd even be happy if they ate up one processor and stuck there. That way I could run my badly behaved app that eats up only one CPU and do the things I like (surfing perl monks, for example) while the badly behaved app beats up one of my cores in the background.

    The problem with encouraging applications to seek out new cores, to boldly go where code has never gone before, is that the crummy programmers who can't write things that coexist with others are going to take your whole machine, because they can.

      Under Win32 there is an API, SetProcessAffinityMask(), that allows you to restrict which processor(s) a process can run on. There is a similar call (sched_setaffinity()) on Linux also.

      You would need to write a small command line utility to start the program, get the process handle (probably pid under Linux) and apply the call to it. There may even be existing utilities out there to do this for you.


        On Linux that would be taskset (part of the util-linux-ng package).
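
        A trivial wrapper of the kind the parent node suggests, assuming taskset(1) is on the PATH; the choice of CPU 0 is just an example:

            #!/usr/bin/perl
            use strict;
            use warnings;

            # Launch a command pinned to CPU 0 so it cannot spread to other cores.
            my @cmd = @ARGV or die "usage: $0 command [args...]\n";
            exec 'taskset', '-c', '0', @cmd;
            die "exec failed: $!";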

      "I'd be much happier if they efficiently used one processor"

      There's the rub: they don't seem to be able to. Ever tried running Photoshop and Dreamweaver together? I don't know for sure, but I would imagine they would run better and faster on a multi-core machine.

        The question is, would you rather have the application be single-threaded and dragging just one of your processors through the mud, or have it be greedy and multiprocessor aware, so that it can drag all your processors down at the same time? :)
Re: Multi-core and the future
by vrk (Chaplain) on Sep 02, 2008 at 08:26 UTC

    The multicore boom is just as silly as the mega and gigahertz race. No-one really needs that much computing power. When I say no-one, I really mean it: you think you do, but in reality you could do everything you do now with machines from ten years ago. The obsession with producing faster and leaner computing units yearly, quarterly, and even monthly is sometimes fun to watch, but it makes you wonder whether there isn't a better target for all that intellectual and financial effort.

    Rambling aside, my firm belief is that parallelism implementations and techniques should be invisible to the user of the programming language or library. This isn't to say they shouldn't be available; on the contrary. Having trivial-to-use implementations that prevent you from programming race conditions or deadlocks is crucial.

    Consider sorting in standard libraries of any programming language. Most of the time you use the standard mergesort or quicksort. You don't need to know how the implementation was done; you just feed in an array and out pops a correctly sorted one. Parallelism is a harder problem than sorting, so the interface can likely never be this simple, but ideally all you would need to do is define which pieces of code may be run in parallel, and the language implementation would do the rest.

    The obvious benefit is that the programmer can then make less of a mess of it. Parallelism is hard in the general case, but there are many good solutions to the problem. There is no reason why you should have to manage threads or mutexes yourself, unless you are writing the library code. We already have automatic memory management; we should have automatic parallelism.
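
    As a rough illustration of that caller's-eye view, here is a sketch of a hypothetical pmap() that hides the forking and result collection behind a map-like interface. It leans on the CPAN module Parallel::ForkManager (its run_on_finish callback and the finish($exit, \$data) form) rather than on any real language support, and the squaring job is a placeholder:

        use strict;
        use warnings;
        use Parallel::ForkManager;

        # The caller only says *what* may run in parallel; the plumbing
        # (forking, collecting results) is hidden inside pmap().
        my @squares = pmap(sub { $_[0] ** 2 }, 1 .. 10);
        print "@squares\n";

        sub pmap {
            my ($code, @items) = @_;
            my @results;
            my $pm = Parallel::ForkManager->new(4);

            # Store each child's result (sent back via finish) as it completes.
            $pm->run_on_finish(sub {
                my ($pid, $exit, $ident, $signal, $core, $data) = @_;
                $results[$ident] = $$data if defined $data;
            });

            for my $i (0 .. $#items) {
                $pm->start($i) and next;                # parent continues the loop
                $pm->finish(0, \ $code->($items[$i]));  # child hands back its result
            }
            $pm->wait_all_children;
            return @results;
        }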

    Note that the above remark is not condescending. It is just wasted effort and time when you insist on doing something manually that could and should be automated. No offense to C programmers either!

    Perl 6 comes very close to the ideal, I think. It might develop even closer to it. Hyperoperators and junctions are an excellent start, though I haven't seen any documentation or planning on how they will work with side effects (for example, two functions &foo and &bar assigning to the same variable). Obviously they were meant for SIMD- and MIMD-style operations, not for that, so there is still a long way to go. Quite a lot of research has gone into parallelism (sometimes called multiprogramming, which has a nice sound), and threads, mutexes and monitors are not the only options. The problem is, as always, finding a good compromise.

    Since Perl 6 is not side-effect free (unlike, say, Haskell, where side effects are confined to I/O), it won't be as convenient to describe to the compiler which code blocks depend on each other and which ones don't -- though there may be a way through analysing lexical variables. In any case, parallelism and concurrency in the first version of Perl 6 won't be revolutionary.

    --
    print "Just Another Perl Adept\n";

      The multicore boom is just as silly as the mega and gigahertz race. No-one really needs that much computing power. When I say no-one, I really mean it: you think you do, but in reality you could do everything you do now with machines from ten years ago.

      I must respectfully disagree. This is true for many, but I am running microsimulations that I need to parallelize across ten multi-processor 4 GHz machines so that we get answers by the end of the weekend, and this was not possible with the commodity machines of ten years ago.

      I think the web-browsing, email-reading, word-processing, spreadsheet-viewing masses only need such massive computing power because of OS bloat and a demand for pretty colours, but I need this much computing power to do my work.

      The multicore boom is just as silly as the mega and gigahertz race. No-one really needs that much computing power. When I say no-one, I really mean it: you think you do, but in reality you could do everything you do now with machines from ten years ago.

      Speak for yourself. We routinely process jobs that take many days to run. They are already distributed to multiple dual-core worker machines. We would greatly benefit from much more powerful machines (for the money) with more cores per box. More cores per box would help us somewhat more than more boxes for the same cost.

      I don't think our company is anything special. We're not some fancy schmancy research lab or something. Just a company that chugs through lots of data in our daily course of business.