Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re: RFC: Sorting IPv4 addresses

by ELISHEVA (Prior)
on Oct 07, 2009 at 18:54 UTC ( [id://799781] : note . print w/replies, xml ) Need Help??


in reply to RFC: Sorting IPv4 addresses

Curious. On my machine (Debian Etch, Perl 5.8.8) the naive run wins every time looses in all cases. Here are the results:

Rate naive splitwise inet_orc split_int split_pack inet_g +rt inet_grt2 naive 12.9/s -- -79% -82% -86% -86% -9 +3% -93% splitwise 62.1/s 382% -- -11% -32% -33% -6 +5% -65% inet_orc 69.9/s 443% 13% -- -23% -24% -6 +1% -61% split_int 91.2/s 608% 47% 30% -- -1% -4 +9% -49% split_pack 92.2/s 617% 49% 32% 1% -- -4 +8% -48% inet_grt 178/s 1284% 187% 155% 95% 93% +-- -0% inet_grt2 179/s 1287% 188% 155% 96% 94% +0% --

Note:The version of the test I used includes split_pack, but not Sort::Key::IPv4 or Sort::Key::Radix and fixes a small typo in naive_sort (my @b = split /\./, $a; should be my @b = split /\./, $b;)

Still, this makes me wonder about the Schwartian transform. The argument for it is that during a naive sort we are likely to need to extract the sort key several times. The cumulative time for the repeat derivation of the sort key is more than the extra work of reconstructing the original value from the sort key and the extra copying and memory allocation needed by the three step process. Hence the Schwartzian transform is faster. But is this always true?

It would seem to depend on four factors:

  • the time needed by the rule for extracting the sort key from the original value
  • the time needed by the rule for reconstructing the original value from the sort key
  • how badly out of order the initial array is.
  • the size of your data set

Perl sort uses quicksort. If the array is completely ordered,then each sort key (other than the first and last) will be derived twice - once to compare N-1 to N and once to compare N to N+1. In the average case 2N becomes 2N*logN and only in the worst case it is 2*N^2.

Putting this all together, the Schwartzian transform is only faster if N*(extract_sortkey + reconstruct_value + 2*memcopy) < extract_sortKey*N*X where X is 2 in the best case, log N on average and N in the worst case. This simplifies to reconstruct_value + 2*memcopy < extract_sortkey*(X - 1). Thus we have:

  • Best case: reconstruct_key+2*memcopy < extract_key
  • On average: reconstruct_key+2*memcopy < extract_key*log N
  • Worst case: reconstruct_key+2*memcopy < extract_key*N

Note that the Schwartian side is a constant and the naive side scales with respect to N. Surely a process that scales by N will be slower than a constant process? Not necessarily. The problem here is that we are comparing apples and oranges. On one side of the equation is "reconstruct_key + 2*memcopy" and on the other is "extract key". These are very different operations and we may be comparing very slow operations to very fast operations. For instance, if the process of reconstructing the original value is 10X slower than extracting the key, we would need to be sorting a fairly large array of N=1024=2^10 values before the Schwartzian transfom would give us any speed advantage.

The moral of this story is that there is no automatically right way to sort - it all depends on the size of your data set, the likelihood that it will be badly ordered, and the difference between the time involved in extracting sort keys and the time needed to reconstruct original values.

Best, beth

Update:Fixed who "wins" - misunderstood output! Thanks to bv, ikegami below.

Replies are listed 'Best First'.
Re^2: RFC: Sorting IPv4 addresses
by ikegami (Patriarch) on Oct 07, 2009 at 19:09 UTC

    Curious. On my machine (Debian Etch, Perl 5.8.8) the naive run wins every time. Here are the results:

    Win at having the lowest rate? I wouldn't call that winning. Other factors being equal, I'd consider the fastest to be the winner, and 179/s (0.00559s) is 13x faster than 12.9/s (0.0775s).

Re^2: RFC: Sorting IPv4 addresses
by bv (Friar) on Oct 07, 2009 at 19:05 UTC

    Great information, but did you really mean wins? Your benchmark shows naive coming in dead last, since it's reporting iterations/second. From the documentation,

    The COUNT can be zero or negative: this means the minimum number of CPU seconds to run. A zero signifies the default of 3 seconds.

    [...]

    The benchmark output will, however, also tell the number of $code runs/second, which should be a more interesting number than the actually spent seconds.

    print pack("A25",pack("V*",map{1919242272+$_}(34481450,-49737472,6228,0,-285028276,6979,-1380265972)))