Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask

(Golf): Sieve of Eratosthenes

by tilly (Archbishop)
on May 19, 2001 at 11:03 UTC ( #81695=perlmeditation: print w/replies, xml ) Need Help??

I got the idea for this from Re: RE: sieve of Eratosthenes.

Here is the challenge.

Write a function p that takes one argument $n, and returns an array of all of the primes up to and including $n.

Now there is a very short answer to this problem using the infamous RE from Abigail, namely:

sub p{ grep{(1x$_)!~/^(11+)\1+$/}2..pop }
Therefore I will add that the function must be clearly based on the Sieve of Eratosthenes. The sieve algorithm goes, "Form a list of integers. Knock out the evens other than 2. Knock out the multiples of 3. Knock out the multiples of 5. etc through the primes." For the purposes of this golf I will allow the following relaxations of the algorithm:
  1. Any finite amount of special case logic is allowed as long as your general test for non-divisibility by most primes looks like the sieve.
  2. As long as the general pattern of the algorithm is present, excess work is explicitly allowed. For instance at the striking out stage, there is no need to only strike out multiples of prime numbers.
So basically if you form a list of possible numbers and then go through rounds of elimination, then I will accept that as a solution. But if, like the above RE solution, you go through a series of numbers and then test each number for primality, that will be rejected.

To sweeten the bait, sometime later tomorrow I will post my best solution. If anyone had come without 5 strokes of that answer, the best entry gets a free PerlMonks t-shirt. (The unlikely event of a tie will be resolved by whoever got there first.) Entries that I can find a failing boundary case for will not count.

For bonus marks, and a second possible t-shirt, the same problem but without the relaxation on the sieve. That is in the elimination round you must only mark off multiples of primes, and you cannot have sufficient wasted operations to change the Big-O of the algorithm. (ie You can waste a constant factor of overhead. But you cannot, for instance, spend most of your running time marking off array elements that are out of bounds.) However I will let you assume that $n is above a fixed number. (I am not sure how people will tackle this, but sometimes it is convenient to make a special case out of 2.)

A final note. Most mathematicians say that the first prime is 2. However those who produce lists of primes like to say 1. I don't care whether your sequence starts with 2 or 1, either is acceptable.

Minor clarifications on the wording.

Masem said I should state an approximate time when I will post my solutions. I will make it whenever I get a chance after 5PM EST. That may be well after because I am likely to be off doing other things.

Golfing closed. My solutions got trashed. Now who won? Well on technicalities you can argue chipmunk for the main problem and MeowChow for the bonus. But I think all will agree looking at the answers that the winning ideas were MeowChow for the main problem and tye for the bonus. I think the only fair answer is to declare 3 winners. tye, chipmunk and MeowChow can contact me with the desired size, color, and location to send the shirts.

And an honorable mention goes to Arguile. OK, so he forgot to test whether pop populates $_ (it doesn't) but if that is what he can do after 6 weeks of Perl, I can only wonder what he will be like with a few more months under his belt...

Replies are listed 'Best First'.
(tye)Re: (Golf): Sieve of Eratosthenes
by tye (Sage) on May 19, 2001 at 12:00 UTC

    Though I somehow doubt this is a winner, I think it might qualify for the "bonus marks". You'll note that no division (or modulus) is used, so I consider it to be using the sieve with no special cases or wasted operations (though it probably has plenty of wasted comparisons O-:). Though it might be more efficient to break out of the grep when a non-prime is found (or, if we had things in a sorted order, to stop once $_*$_ passes $n) and compensate for that by incrementing $p{$_} in an extra loop (rather than incrementing it, at most, one time per prime candidate), but I'm not even sure if that would be a net win (without resorting to sorting).

    sub sieve { my%p=(2,2);for my$n(3..pop){grep$n==($p{$_}+=$_*($p{$_}<$n)),keys%p or$p{$n}=$n}keys%p } for( @ARGV ) { print "$_: ",join(" ",sort{$a<=>$b}sieve($_)),$/; }

    I count 86 characters for the part that counts. And, no, I don't think it'd be fair to add 13 characters for sorting since the original challange didn't say anything about returning the list in sorted order. q-:

    Sample output:

    1: 2 2: 2 3: 2 3 4: 2 3 5: 2 3 5 6: 2 3 5 50: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 1000: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997

    Update: BTW, this is strict compliant.

            - tye (but my friends call me "Tye")

      BTW, the above was based on solving the Sieve using functional programming techniques and then analysing what the computer ends up doing when that code is run and then reimplementing that in a procedural style. I find that technique often gives you interesting insights.

      Anyway, I tried to reverse engineer what tilly's solution was from his description and came up with this:

      sub sieve2 { grep{@_[map$a*$_,2..@_/($a=$_)]=0if$_[$_]>1}@_=0..pop; } for( @ARGV ) { print "$_: ",join(" ",sieve2($_)),$/; }
      which is 54 characters. It is strict compliant "to the letter" but not "in spirit". But it is even more Sieve-like than my previous one (I just don't like algorithms where you give an upper bound and once you get there you have to start over if you want to go further).

              - tye (but my friends call me "Tye")
        53 characters:
        sub sieve2 { grep{@_[map$a*$_,2..@_/($a=$_)]=0if$_[$_]>1}@_=0..pop }
        All your t-shirts are belong to me! j/k :-)
                       s aamecha.s a..a\u$&owag.print

        tilly's comment about things being "nearly linear" threw me for a bit. Then I realized that the quadratic nature is countered by the outer loop only needing to run to sqrt(N) and the inner loop being somewhat similarly restricted.

        Which made me realize that my solution was suboptimal. Here is a faster one at the same number of characters [ thanks to MeowChow noting that I'd stupidly left in a trailing semicolon in my previous one ;) ]:

        sub sieve3 { grep{@_[map$a*$_,$_..@_/($a=$_)]=0if$_[$_]>1}@_=0..pop } # ^^ for( @ARGV ) { print "$_: ",join(" ",sieve3($_)),$/; }

        In playing with this and verifying that I didn't break it, I noticed something interesting and expanded on it. For how long of a stretch can you go without hitting any prime numbers? Well, stopping at 0.5million (because of memory limits), here are the results. "xN" means there were N ties before a new "winner" appeared:

        2=5-3(x2) # 3..5, 5..7 4=11-7(x3) # 7..11, 13..17, 19..23 6=29-23(x7) 8=97-89 14=127-113(x3) 18=541-523 20=907-887 22=1151-1129 34=1361-1327(x2) 36=9587-9551(x3) 44=15727-15683 52=19661-19609(x2) 72=31469-31397 86=156007-155921(x2) 96=360749-360653 112=370373-370261 114=492227-492113

                - tye (but my friends call me "Tye")
(kudra: basketball) Re: (Golf): Sieve of Eratosthenes
by kudra (Vicar) on May 19, 2001 at 15:44 UTC
    Much worse than the previous offering and just plain bad approaches (what can I say--I'm bad at golf and I wanted to play with using an array. look here if you want to see some shorter solutions):

    # Example use: @result = sieve(60); sub sieve { my$m=pop;my@l=(0..$m);for(my$x=0;$x<$m;$x++){next if($l[$x]<2);my $i=2 +;my$n=0;for(;$n<$m;$i++){$n=$x*$i;next if($n>$m);$l[$ n]=0}}@l=grep{$l[$_]}(0..$m) }

    # Example of use: sieve(60); sub sieve { my$m=pop;@_=(2..$m);for my $c(@_){my%b=map{$_,1}grep{($_%$c)==0}($c+1. +.$m);@_=grep{!$b{$_}}@_}die"@_\n" }

    Too bad we're not playing basketball.


    # Example of use: @result = sieve(60); sub sieve{ L:for(2..pop){for my $b(2..$_-1){next L if($_%$b<1)}push@_,$_}@_ }
    65. Does not mark of multiples but tests each number (only as far as needed, at least), so not really allowed.
Re: (Golf): Sieve of Eratosthenes
by Arguile (Hermit) on May 19, 2001 at 16:14 UTC

    Well, looking through tilly's previous golf entries I think I'll feel fine if length($my_code) <= 3*length($tilly_code).

    sub c{($k,@_)=($#_)?@_:2..($n=@_[0]);@_=map{($_%$k)?$_:0}@_;push @_,$k;$k<=sqrt($n)?c(grep{!/^0/}@_):@_}
    As for the lenght:
    [arguile@cobalt ~]$ wc -L sieve_golf 104 sieve_golf

    True to the seive, it starts at k=2 eleminates n*k, then steps to next prime and repeats until k >= sqrt(n). Instead of mapping to 0 I would have loved to just drop them -- not bothering w/ the regexp for cleanup -- but I couldn't for the life of me figure out how.

    The subroutine is recursive mainly b/c I had yet to try that technique and this seemed like a good time to try.

    Sample output:

    # display procedure blatantly stolen from Tye 2: 2 10: 2 3 5 7 50: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 100: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 +89 97 169: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 +89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167

    Thanks to Petruchio for his rewording of ($k,@_)=($#_)?@_:2..($n=@_[0]) so I didn't produce an error. So far this seems like a great place to learn programming.</fawning>


    Scratched recursion and went w/ a conventional for loop, brought it down to 87 chars w/ strict compliance.

    # non-strict sub c{$n=pop;@_=2..$n;for($k=2;$k<=sqrt($n);($k)=@_){@_=grep!/^0/,(map +{($_%$k)?$_:0}@_),$k}@_} # strict w/ less named vars sub c{pop;@_=2..$_;for(my$k=2;$k<=sqrt;($k)=@_){@_=grep!/^0/,(map{$_%$ +k?$_:0}@_),$k}@_}


    No one else was mapping, and now I see why (75 chars).
    sub e{pop;@_=2..$_;for(my$k=2;$k<=sqrt;($k)=@_){@_=((grep$_%$k,@_),$k) +;}@_}
Re: (Golf): Sieve of Eratosthenes
by japhy (Canon) on May 19, 2001 at 17:00 UTC
    Here is one of 68 characters in length:
    sub p{my@x;map{my$c=$_;$_%$c||++$x[$_]for@_}@_=2..pop;grep$x[$_]==1,0. +.$#x}
    UPDATE: And another of 65:
    sub p{@_=2..pop;my$c;while($_[$c]){@_=($_[$c],grep$_%$_[$c],@_);$c++}@ +_}
    UPDATE: And another of 60:
    sub p{@_=2..pop;my$c;@_=($_[$c++],grep$_%$_[$c-1],@_)while$c<@_;@_}
    UPDATE: And another of 40 (but I fear it strays from the rules...):
    sub p{grep{my$c=$_;$#_==grep$c%$_,@_}@_=2..pop}

    japhy -- Perl and Regex Hacker
      I think I have to rule the 40 character solution out of bounds on the basis of the fact that you are walking through the numbers and testing whether each is prime. This loses the central idea of Eratosthenes which is that when you find a prime you immediately mark off its multiples.

      Basically you reversed the role of the 2 loops.

      The 60 character answer is impressive. However it seems to scale quadratically. I must confess that I don't see why it is scaling quadratically, but it clearly is. So unless you can explain why this is a bug in Perl, I am going to have to call this a solution to only the first problem.

      I now understand why your 60 character solution is scaling slowly. It actually scales like O(n*n/log(n)), and it is because you are walking all of the primes with each strike rather than just the multiples of the current prime. So it definitely only is a solution to the first problem.

Re: (Golf): Sieve of Eratosthenes
by chipmunk (Parson) on May 19, 2001 at 19:59 UTC
    This solution is very similar to japhy's, although I started it without looking at his. I did use his solutions to inspire me to shorten mine, though!

    57 characters:

    sub sieve { my@p;@_=2..pop;{push@p,$p=shift;redo if@_=grep$_%$p,@_}@p }
    58 characters, strict-compliant:
    sub sieve { my@p;@_=2..pop;{push@p,shift;redo if@_=grep$_%$p[-1],@_}@p }
    BTW, what is the Big-O of the Sieve of Eratosthenes? :)
      The sieve of Eratosthenes is O(n*log(log(n))) arithmetic operations. Arithmetic operations themselves scale logarithmically officially there is another log operation in there. So within memory limits it should look basically linear in n. I might be willing to bend a log factor here or there. But if it isn't roughly linear in my tests, you don't qualify.

      And BTW I am not worrying about strict.

Re: (Golf): Sieve of Eratosthenes
by MeowChow (Vicar) on May 19, 2001 at 20:54 UTC
    How about this at 45 (strict compliant too):
    sub sieve { sub p{$_[0],@_?p(grep$_%$_[0],@_):()}p 2..pop }
                   s aamecha.s a..a\u$&owag.print
      I think that undef is not prime. :)

      But at 47 characters I think MeowChow still won't be beat:

      sub sieve { sub p{$_[0],@_>1?p(grep$_%$_[0],@_):()}p 2..pop }
        I had thought of that, but assumed it would slide :)

        Make that 46 by the way...

        sub sieve { sub p{$_[0],$#_?p(grep$_%$_[0],@_):()}p 2..pop }
                       s aamecha.s a..a\u$&owag.print
        Abusing the rules for fun and profit (46 chars):
        sub sieve { sub p{$_[0],@_>1?p(grep$_%$_[0],@_):1}p 2..pop }
                       s aamecha.s a..a\u$&owag.print
      Unless someone improves on this I think I know who is getting the t-shirt for the main problem. If anyone comes within 5 characters of beating my solution - HAH! How about if anyone completely trashes the best I could do..?

      Of course the bonus question is still wide open...

Re (tilly) 1: (Golf): Sieve of Eratosthenes
by tilly (Archbishop) on May 19, 2001 at 23:08 UTC
    Well both of my solutions have been beaten so I am willing to post them. But there are still a couple of hours for people to try to beat the current entries. But right now I am in a quandry about who is leading in the first. The best solution so far is 47 characters. It is basically MeowChow's solution with a trivial fix from chipmunk. I feel that it is "really" MeowChow's, but technically it was chipmunk who posted it. I will have to decide what to do with that one. For the bonus round tye is clearly leading.

    What I did differently than everyone else is I used a hash. My idea was to use a hash and then mark off entries by assigning to a hash reference. If you assign a range of numbers starting with an odd, the odds conveniently become the keys. This unfortunately makes 2 a special case. For the general problem I snuck 2 past like this in 60 characters:

    sub p{ %p=@_=3..1+pop;@p{2,map$f*$_,@_}=$f=$_ for@_;grep$p{$_},2,@_ }
    For the second problem I just dropped the special logic for 2 and stuck it at the beginning of the list with this 61 character solution:
    sub p{ 2,grep{$f=$_,@p{map$f*$_,3..$_[0]/$f}=0if$p{$_}}%p=3..1+$_[0] }
    The observant will notice that this is indeed startlingly similar to tye's solution. If he tried to reverse-engineer mine, the copy is better than the original! (Possibly because he did not have my blind spot for using a hash...)
Re: (Golf): Sieve of Eratosthenes
by chromatic (Archbishop) on May 19, 2001 at 21:45 UTC
    72, and there's some nasty bits at the end:

    sub p{@_=(2..pop);while($a=$_[0]){@_=grep{$_%$a}@_;$_.=" $a";}split' ';}

    Be glad I didn't post the regex one.

Re: (Golf): Sieve of Eratosthenes
by MeowChow (Vicar) on May 20, 2001 at 00:45 UTC
    Ok, how's 50 characters for the bonus:
    sub sieve { grep!(map$_[$a*$_]++,1..$_[0]/($a=$_))[0],2..$_[0] }
    update: Hmm, in my golfing frenzy, I seem to have increased the big-O back to that of the main problem, so I don't think this qualifies after all.
                   s aamecha.s a..a\u$&owag.print
Re: (Golf): Sieve of Eratosthenes
by srawls (Friar) on May 19, 2001 at 22:17 UTC
    This is my very first golf attempt, and I know it's pretty bad, but here it goes:

    Well, it does return duplicates (2 2 5 5 7 7) since I use a hash, but I didn't see anything in the problem that said you couldn't. I count 77 chars (see update). (Note: that is excluding the new-lines, I ran it with out them, but I posted new lines here to be more pleasing to the eye, also, it is excluding the sub a { and the end } parts).

    sub a { @a{2..$_[0]}=2..$_[0]; for$b(%a){!($b%$_)&&$b!=$_&&delete@a{$b}for(%a)} %a }
    One last note, since this is my first golf, I would very much appreciate any suggestions or improvements to my code. I realize I took a very straight forward approach to this solution, but that is how I think.

    Update:Took it down to 73 chars. I took out one set of un-needed parenentheses and two un-needed semicolons.

    Note:If you put keys before the %a it takes away the problem of repeats, but it brings me back to where I started with 77 chars.

    The 15 year old, freshman programmer,
    Stephen Rawls

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://81695]
Approved by root
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chanting in the Monastery: (6)
As of 2023-12-08 09:07 GMT
Find Nodes?
    Voting Booth?
    What's your preferred 'use VERSION' for new CPAN modules in 2023?

    Results (35 votes). Check out past polls.