Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add bottom-up heapsort #23

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Maxime2
Copy link
Contributor

@Maxime2 Maxime2 commented Jul 27, 2015

I have added bottom-up heapsort module implementing a modified version of bottom-up heapsort sorting, see details here: http://blog.dataparksearch.org/397

It is unified with asort() to be typesafe and accept context pointer for comparison function.

Would it better to rename bottom_up_heapsort() into bsort() for function name simplicity?

@dgibson
Copy link
Collaborator

dgibson commented Jul 27, 2015

On Sun, Jul 26, 2015 at 11:47:42PM -0700, Maxim Zakharov wrote:

I have added bottom-up heapsort module implementing a modified version of bottom-up heapsort sorting, see details here: http://blog.dataparksearch.org/397

It is unified with asort() to be typesafe and accept context pointer for comparison function.

Would it better to rename bottom_up_heapsort() into bsort() for function name simplicity?
You can view, comment on, or merge this pull request online at:

"bottom_up_heapsort" is awkwardly long, but "bsort" seems a bit
non-specific.

How different is a bottom up heapsort from any other heapsort? Would
just "heapsort" be a reasonable name?

David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT the other
| way around!
http://www.ozlabs.org/~dgibson

@Maxime2
Copy link
Contributor Author

Maxime2 commented Jul 27, 2015

It is indeed a heapsort sorting, though not a classical implementation, and it can be dubbed just "heapsort" if you do not plan to include classical heapsort implementation into ccan.

@rustyrussell
Copy link
Owner

Maxim Zakharov [email protected] writes:

I have added bottom-up heapsort module implementing a modified version of bottom-up heapsort sorting, see details here: http://blog.dataparksearch.org/397

It is unified with asort() to be typesafe and accept context pointer for comparison function.

Would it better to rename bottom_up_heapsort() into bsort() for function name simplicity?

Hi Maxim!

    I like the module, unfortunately a rough benchmark shows it to

only to be faster than asort (ie. glibc's sort) between 256 and 4096
elements, which is not what I expected:

asort-2 = 0.000003
heapsort-2 = 0.000001
asort-4 = 0.000001
heapsort-4 = 0.000001
asort-8 = 0.000001
heapsort-8 = 0.000001
asort-16 = 0.000002
heapsort-16 = 0.000002
asort-32 = 0.000003
heapsort-32 = 0.000003
asort-64 = 0.000006
heapsort-64 = 0.000006
asort-128 = 0.000011
heapsort-128 = 0.000013
asort-256 = 0.000094
heapsort-256 = 0.000024
asort-512 = 0.000108
heapsort-512 = 0.000051
asort-1024 = 0.000159
heapsort-1024 = 0.000108
asort-2048 = 0.000272
heapsort-2048 = 0.000232
asort-4096 = 0.000543
heapsort-4096 = 0.000498
asort-8192 = 0.001055
heapsort-8192 = 0.001073
asort-16384 = 0.002155
heapsort-16384 = 0.002298
asort-32768 = 0.004945
heapsort-32768 = 0.004974
asort-65536 = 0.009596
heapsort-65536 = 0.010618
asort-131072 = 0.020460
heapsort-131072 = 0.022994
asort-262144 = 0.042670
heapsort-262144 = 0.049739
asort-524288 = 0.089806
heapsort-524288 = 0.107007
asort-1048576 = 0.187838
heapsort-1048576 = 0.235675
asort-2097152 = 0.392327
heapsort-2097152 = 0.544562
asort-4194304 = 0.820443
heapsort-4194304 = 1.255215
asort-8388608 = 1.718564
heapsort-8388608 = 2.751897
asort-16777216 = 3.566820
heapsort-16777216 = 6.017466
asort-33554432 = 7.521751
heapsort-33554432 = 13.410207
asort-67108864 = 15.426503
heapsort-67108864 = 28.932382

Thanks,
Rusty.

commit ad0a71e85907a50c0b2807b581142139dd937c15
Author: Rusty Russell [email protected]
Date: Tue Jul 28 10:23:46 2015 +0930

bottom_up_heapsort: add benchmark.

Signed-off-by: Rusty Russell <[email protected]>

diff --git a/ccan/bottom_up_heapsort/benchmark/Makefile b/ccan/bottom_up_heapsort/benchmark/Makefile
new file mode 100644
index 0000000..66b0cbd
--- /dev/null
+++ b/ccan/bottom_up_heapsort/benchmark/Makefile
@@ -0,0 +1,18 @@
+CCANDIR := ../../../
+CFLAGS := -Wall -I$(CCANDIR) -O3 -flto
+LDFLAGS := -O3 -flto
+
+benchmark: benchmark.o ccan-asort.o ccan-bottom_up_heapsort.o ccan-time.o ccan-err.o
+
+clean:

  • $(RM) -f *.o

+ccan-asort.o: $(CCANDIR)/ccan/asort/asort.c

  • $(CC) $(CFLAGS) -c -o $@ $&lt;
    +ccan-bottom_up_heapsort.o: $(CCANDIR)/ccan/bottom_up_heapsort/bottom_up_heapsort.c
  • $(CC) $(CFLAGS) -c -o $@ $&lt;
    +ccan-time.o: $(CCANDIR)/ccan/time/time.c
  • $(CC) $(CFLAGS) -c -o $@ $&lt;
    +ccan-err.o: $(CCANDIR)/ccan/err/err.c
  • $(CC) $(CFLAGS) -c -o $@ $&lt;

diff --git a/ccan/bottom_up_heapsort/benchmark/benchmark.c b/ccan/bottom_up_heapsort/benchmark/benchmark.c
new file mode 100644
index 0000000..42aea04
--- /dev/null
+++ b/ccan/bottom_up_heapsort/benchmark/benchmark.c
@@ -0,0 +1,50 @@
+#include <ccan/asort/asort.h>
+#include <ccan/time/time.h>
+#include <ccan/bottom_up_heapsort/bottom_up_heapsort.h>
+#include <ccan/err/err.h>
+#include <stdio.h>
+
+static int compar(const int *a, const int *b, void *unused)
+{

  • return *a - *b;
    +}

+int main(int argc, char *argv[])
+{

  • unsigned int i, num;
  • int *randoms;
  • struct timeabs start;
  • struct timerel asort_time, heapsort_time;
  • err_set_progname(argv[0]);
  • if (argc != 2 || (num = atoi(argv[1])) == 0)
  •   errx(1, "Usage: benchmark <nitems>");
    
  • randoms = calloc(sizeof(*randoms), num);
  • srandom(0);
  • for (i = 0; i < num; i++)
  •   randoms[i] = random();
    
  • start = time_now();
  • asort(randoms, num, compar, NULL);
  • asort_time = time_between(time_now(), start);
  • srandom(0);
  • for (i = 0; i < num; i++)
  •   randoms[i] = random();
    
  • start = time_now();
  • bottom_up_heapsort(randoms, num, compar, NULL);
  • heapsort_time = time_between(time_now(), start);
  • printf("asort-%u = %u.%06u\n"
  •      "heapsort-%u = %u.%06u\n",
    
  •      num,
    
  •      (unsigned int)asort_time.ts.tv_sec,
    
  •      (unsigned int)(asort_time.ts.tv_nsec / 1000),
    
  •      num,
    
  •      (unsigned int)heapsort_time.ts.tv_sec,
    
  •      (unsigned int)(heapsort_time.ts.tv_nsec / 1000));
    
  • return 0;
    +}

@Maxime2
Copy link
Contributor Author

Maxime2 commented Jul 28, 2015

Hi Rusty,

It is well known quicksort behaves well better heapsort on random data on average, however quicksort's worse case is much worse in comparison to heapsort. There is a quicksort killer method to build a sequence demonstrating it. See http://research.swtch.com/qsort

If I switch to ccan built-in implementation which is classical quicksort as I see it shows the following performance on killing quicksort sequence in comparison to bottom_up_heapsort:

$ make test
bash -c "p=2; for i in {1..14}; do let p=2*$p ; ./benchmark $p ; done"
asort-4 = 0.000000
heapsort-4 = 0.000001
asort-killer-4 = 0.000000
heapsort-killer-4 = 0.000000
asort-8 = 0.000000
heapsort-8 = 0.000001
asort-killer-8 = 0.000000
heapsort-killer-8 = 0.000000
asort-16 = 0.000000
heapsort-16 = 0.000001
asort-killer-16 = 0.000000
heapsort-killer-16 = 0.000000
asort-32 = 0.000001
heapsort-32 = 0.000002
asort-killer-32 = 0.000001
heapsort-killer-32 = 0.000001
asort-64 = 0.000003
heapsort-64 = 0.000004
asort-killer-64 = 0.000003
heapsort-killer-64 = 0.000003
asort-128 = 0.000007
heapsort-128 = 0.000007
asort-killer-128 = 0.000010
heapsort-killer-128 = 0.000006
asort-256 = 0.000014
heapsort-256 = 0.000016
asort-killer-256 = 0.000038
heapsort-killer-256 = 0.000014
asort-512 = 0.000030
heapsort-512 = 0.000039
asort-killer-512 = 0.000144
heapsort-killer-512 = 0.000029
asort-1024 = 0.000066
heapsort-1024 = 0.000072
asort-killer-1024 = 0.000566
heapsort-killer-1024 = 0.000062
asort-2048 = 0.000144
heapsort-2048 = 0.000202
asort-killer-2048 = 0.002202
heapsort-killer-2048 = 0.000131
asort-4096 = 0.000302
heapsort-4096 = 0.000331
asort-killer-4096 = 0.008749
heapsort-killer-4096 = 0.000273
asort-8192 = 0.000636
heapsort-8192 = 0.000711
asort-killer-8192 = 0.034996
heapsort-killer-8192 = 0.000574
asort-16384 = 0.001353
heapsort-16384 = 0.001521
asort-killer-16384 = 0.141857
heapsort-killer-16384 = 0.001218
asort-32768 = 0.002899
heapsort-32768 = 0.003311
asort-killer-32768 = 0.555359
heapsort-killer-32768 = 0.002528

diff --git a/ccan/asort/asort.c b/ccan/asort/asort.c
index e7eaf2c..66e234c 100644
--- a/ccan/asort/asort.c
+++ b/ccan/asort/asort.c
@@ -1,7 +1,7 @@
#include <ccan/asort/asort.h>
#include <stdlib.h>

-#if !HAVE_QSORT_R_PRIVATE_LAST
+#if 1 || !HAVE_QSORT_R_PRIVATE_LAST

/* Steal glibc's code. */

diff --git a/ccan/asort/asort.h b/ccan/asort/asort.h
index 43b0f89..f54160d 100644
--- a/ccan/asort/asort.h
+++ b/ccan/asort/asort.h
@@ -22,7 +22,7 @@
_asort((base), (num), sizeof(*(base)),
total_order_cast((cmp), *(base), (ctx)), (ctx))

-#if HAVE_QSORT_R_PRIVATE_LAST
+#if 0 && HAVE_QSORT_R_PRIVATE_LAST
#define _asort(b, n, s, cmp, ctx) qsort_r(b, n, s, cmp, ctx)
#else
void _asort(void *base, size_t nmemb, size_t size,
diff --git a/ccan/bottom_up_heapsort/benchmark/Makefile b/ccan/bottom_up_heapsort/benchmark/Makefile
index 68e1437..3e664c8 100644
--- a/ccan/bottom_up_heapsort/benchmark/Makefile
+++ b/ccan/bottom_up_heapsort/benchmark/Makefile
@@ -5,14 +5,17 @@ LDFLAGS := -O3 -flto
benchmark: benchmark.o ccan-asort.o ccan-bottom_up_heapsort.o ccan-time.o ccan-err.o

clean:

  •   $(RM) -f *.o
    
  • $(RM) -f *.o

+test: benchmark

  • bash -c "p=2; for i in {1..14}; do let p=2*$$p ; ./benchmark $$p ; done"

ccan-asort.o: $(CCANDIR)/ccan/asort/asort.c

  •   $(CC) $(CFLAGS) -c -o $@ $<
    
  • $(CC) $(CFLAGS) -c -o $@ $&lt;
    ccan-bottom_up_heapsort.o: $(CCANDIR)/ccan/bottom_up_heapsort/bottom_up_heapsort.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
  • $(CC) $(CFLAGS) -c -o $@ $&lt;
    ccan-time.o: $(CCANDIR)/ccan/time/time.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
  • $(CC) $(CFLAGS) -c -o $@ $&lt;
    ccan-err.o: $(CCANDIR)/ccan/err/err.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
  • $(CC) $(CFLAGS) -c -o $@ $&lt;

diff --git a/ccan/bottom_up_heapsort/benchmark/benchmark.c b/ccan/bottom_up_heapsort/benchmark/benchmark.c
index e69a1e6..1e3b95f 100644
--- a/ccan/bottom_up_heapsort/benchmark/benchmark.c
+++ b/ccan/bottom_up_heapsort/benchmark/benchmark.c
@@ -4,6 +4,143 @@
#include <ccan/err/err.h>
#include <stdio.h>

+/* Copyright 1998, M. Douglas McIlroy

  • Permission is granted to use or copy with this notice attached.
  • Aqsort is an antiquicksort. It will drive any qsort implementation
  • based on quicksort into quadratic behavior, provided the imlementation
  • has these properties:
    1. The implementation is single-threaded.
    1. The pivot-choosing phase uses O(1) comparisons.
    1. Partitioning is a contiguous phase of n-O(1) comparisons, all
  • against the same pivot value.
    1. No comparisons are made between items not found in the array.
  • Comparisons may, however, involve copies of those items.
  • Method
  • Values being sorted are dichotomized into "solid" values that are
  • known and fixed, and "gas" values that are unknown but distinct and
  • larger than solid values. Initially all values are gas. Comparisons
  • work as follows:
  • Solid-solid. Compare by value. Solid-gas. Solid compares low.
  • Gas-gas. Solidify one of the operands, with a value greater
  •  than any previously solidified value.  Compare afresh.
    
  • During partitioning, the gas values that remain after pivot-choosing
  • will compare high, provided the pivot is solid. Then qsort will go
  • quadratic. To force the pivot to be solid, a heuristic test
  • identifies pivot candidates to be solidified in gas-gas comparisons.
  • A pivot candidate is the gas item that most recently survived
  • a comparison. This heuristic assures that the pivot gets
  • solidified at or before the second gas-gas comparison during the
  • partitioning phase, so that n-O(1) gas values remain.
  • To allow for copying, we must be able to identify an operand even if
  • it was copied from an item that has since been solidified. Hence we
  • keep the data in fixed locations and sort pointers to them. Then
  • qsort can move or copy the pointers at will without disturbing the
  • underlying data.
  • int aqsort(int n, int *a); returns the count of comparisons qsort used
  • in sorting an array of n items and fills in array a with the
  • permutation of 0..n-1 that achieved the count.
    +_/

+#include <stdlib.h>
+#include <assert.h>
+int *val; /_ array, solidified on the fly /
+int ncmp; /
number of comparisons /
+int nsolid; /
number of solid items /
+int candidate; /
pivot candidate /
+int gas; /
gas value = highest sorted value /
+#define freeze(x) val[x] = nsolid++
+
+int
+cmp(const void *px, const void *py) /
per C standard */
+{

  • const int x = (const int)px;
  • const int y = (const int)py;
  • ncmp++;
  • if(val[x]==gas && val[y]==gas) {
  •   if(x == candidate)
    
  •       freeze(x);
    
  •   else
    
  •       freeze(y);
    
  • }
  • if(val[x] == gas)
  •   candidate = x;
    
  • else if(val[y] == gas)
  •   candidate = y;
    
  • return val[x] - val[y];
    +}

+int
+aqsort(int n, int *a)
+{

  • int i;
  • int _ptr = malloc(n_sizeof(*ptr));
  • val = a;
  • gas = n-1;
  • nsolid = ncmp = candidate = 0;
  • for(i=0; i<n; i++) {
  •   ptr[i] = i;
    
  •   val[i] = gas;
    
  • }
  • asort(ptr, n, cmp, NULL);
  • for(i=1;i<n;i++)
  •   assert(val[ptr[i]]==val[ptr[i-1]]+1);
    
  • free(ptr);
  • return ncmp;
    +}

+#if 0
+
+/* driver main program, to be linked with qsort

  • usage: aqsort [-p] n
  • constructs an adversarial input and reports the comparison
  • count for it. Option -p prints the adversarial input.
    +_/

+#include <stdio.h>
+#include <string.h>
+
+int pflag;
+
+main(int argc, char *_argv)
+{

  • int n, i;
  • int *b;
  • if(argc>1 && strcmp(argv[1],"-p") == 0) {
  •   pflag++;
    
  •   argc--;
    
  •   argv++;
    
  • }
  • if(argc != 2) {
  •   fprintf(stderr,"usage: aqsort [-p] n\n");
    
  •   exit(1);
    
  • }
  • n = atoi(argv[1]);
  • b = malloc(n*sizeof(int));
  • if(b == 0) {
  •   fprintf(stderr,"aqsort: out of space\n");
    
  •   exit(1);
    
  • }
  • i = aqsort(n, b);
  • printf("n=%d count=%d\n", n, i);
  • if(pflag)
  •   for(i=0; i<n; i++)
    
  •       printf("%d\n", b[i]);
    
  • exit(0);
    +}

+#endif
+
static int compar(const int *a, const int *b, void *unused)
{
return *a - *b;
@@ -15,6 +152,7 @@ int main(int argc, char *argv[])
int *randoms;
struct timeabs start;
struct timerel asort_time, heapsort_time;

  •   struct timerel asort_killer_time, heapsort_killer_time;
    
    err_set_progname(argv[0]);
    

@@ -38,13 +176,37 @@ int main(int argc, char *argv[])
bottom_up_heapsort(randoms, num, compar, NULL);
heapsort_time = time_between(time_now(), start);

  •   i = aqsort(num, randoms);
    
    +// for(i=0; i<num; i++)
    +// printf("%d\n", randoms[i]);
    +
  •   start = time_now();
    
  •   asort(randoms, num, compar, NULL);
    
  •   asort_killer_time = time_between(time_now(), start);
    
  •   i = aqsort(num, randoms);
    
  •   start = time_now();
    
  •   bottom_up_heapsort(randoms, num, compar, NULL);
    
  •   heapsort_killer_time = time_between(time_now(), start);
    
  • printf("asort-%u = %u.%06u\n"
    
  •          "heapsort-%u = %u.%06u\n",
    
  •          "heapsort-%u = %u.%06u\n"
    
  •     "asort-killer-%u = %u.%06u\n"
    
  •     "heapsort-killer-%u = %u.%06u\n",
           num,
           (unsigned int)asort_time.ts.tv_sec,
           (unsigned int)(asort_time.ts.tv_nsec / 1000),
           num,
           (unsigned int)heapsort_time.ts.tv_sec,
    
  •          (unsigned int)(heapsort_time.ts.tv_nsec / 1000));
    
  •          (unsigned int)(heapsort_time.ts.tv_nsec / 1000),
    
  •          num,
    
  •          (unsigned int)asort_killer_time.ts.tv_sec,
    
  •          (unsigned int)(asort_killer_time.ts.tv_nsec / 1000),
    
  •          num,
    
  •          (unsigned int)heapsort_killer_time.ts.tv_sec,
    
  •          (unsigned int)(heapsort_killer_time.ts.tv_nsec / 1000));
    return 0;
    
    }

@rustyrussell
Copy link
Owner

It look like my patch was garbled on github so I attach it here in e-mail
reply.

On 28 July 2015 at 11:06, Rusty Russell [email protected] wrote:

Maxim Zakharov [email protected] writes:

I have added bottom-up heapsort module implementing a modified version
of bottom-up heapsort sorting, see details here:
http://blog.dataparksearch.org/397

It is unified with asort() to be typesafe and accept context pointer for
comparison function.

Would it better to rename bottom_up_heapsort() into bsort() for function
name simplicity?

Hi Maxim!

    I like the module, unfortunately a rough benchmark shows it to

only to be faster than asort (ie. glibc's sort) between 256 and 4096
elements, which is not what I expected:

asort-2 = 0.000003
heapsort-2 = 0.000001
asort-4 = 0.000001
heapsort-4 = 0.000001
asort-8 = 0.000001
heapsort-8 = 0.000001
asort-16 = 0.000002
heapsort-16 = 0.000002
asort-32 = 0.000003
heapsort-32 = 0.000003
asort-64 = 0.000006
heapsort-64 = 0.000006
asort-128 = 0.000011
heapsort-128 = 0.000013
asort-256 = 0.000094
heapsort-256 = 0.000024
asort-512 = 0.000108
heapsort-512 = 0.000051
asort-1024 = 0.000159
heapsort-1024 = 0.000108
asort-2048 = 0.000272
heapsort-2048 = 0.000232
asort-4096 = 0.000543
heapsort-4096 = 0.000498
asort-8192 = 0.001055
heapsort-8192 = 0.001073
asort-16384 = 0.002155
heapsort-16384 = 0.002298
asort-32768 = 0.004945
heapsort-32768 = 0.004974
asort-65536 = 0.009596
heapsort-65536 = 0.010618
asort-131072 = 0.020460
heapsort-131072 = 0.022994
asort-262144 = 0.042670
heapsort-262144 = 0.049739
asort-524288 = 0.089806
heapsort-524288 = 0.107007
asort-1048576 = 0.187838
heapsort-1048576 = 0.235675
asort-2097152 = 0.392327
heapsort-2097152 = 0.544562
asort-4194304 = 0.820443
heapsort-4194304 = 1.255215
asort-8388608 = 1.718564
heapsort-8388608 = 2.751897
asort-16777216 = 3.566820
heapsort-16777216 = 6.017466
asort-33554432 = 7.521751
heapsort-33554432 = 13.410207
asort-67108864 = 15.426503
heapsort-67108864 = 28.932382

Thanks,
Rusty.

commit ad0a71e85907a50c0b2807b581142139dd937c15
Author: Rusty Russell [email protected]
Date: Tue Jul 28 10:23:46 2015 +0930

bottom_up_heapsort: add benchmark.

Signed-off-by: Rusty Russell <[email protected]>

diff --git a/ccan/bottom_up_heapsort/benchmark/Makefile
b/ccan/bottom_up_heapsort/benchmark/Makefile
new file mode 100644
index 0000000..66b0cbd
--- /dev/null
+++ b/ccan/bottom_up_heapsort/benchmark/Makefile
@@ -0,0 +1,18 @@
+CCANDIR := ../../../
+CFLAGS := -Wall -I$(CCANDIR) -O3 -flto
+LDFLAGS := -O3 -flto
+
+benchmark: benchmark.o ccan-asort.o ccan-bottom_up_heapsort.o ccan-time.o
ccan-err.o
+
+clean:

  •   $(RM) -f *.o
    
    +ccan-asort.o: $(CCANDIR)/ccan/asort/asort.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
    +ccan-bottom_up_heapsort.o:
    $(CCANDIR)/ccan/bottom_up_heapsort/bottom_up_heapsort.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
    +ccan-time.o: $(CCANDIR)/ccan/time/time.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
    +ccan-err.o: $(CCANDIR)/ccan/err/err.c
  •   $(CC) $(CFLAGS) -c -o $@ $<
    
    diff --git a/ccan/bottom_up_heapsort/benchmark/benchmark.c
    b/ccan/bottom_up_heapsort/benchmark/benchmark.c
    new file mode 100644
    index 0000000..42aea04
    --- /dev/null
    +++ b/ccan/bottom_up_heapsort/benchmark/benchmark.c
    @@ -0,0 +1,50 @@
    +#include <ccan/asort/asort.h>
    +#include <ccan/time/time.h>
    +#include <ccan/bottom_up_heapsort/bottom_up_heapsort.h>
    +#include <ccan/err/err.h>
    +#include <stdio.h>
    +
    +static int compar(const int *a, const int *b, void *unused)
    +{
  •   return *a - *b;
    
    +}
    +
    +int main(int argc, char *argv[])
    +{
  •   unsigned int i, num;
    
  •   int *randoms;
    
  •   struct timeabs start;
    
  •   struct timerel asort_time, heapsort_time;
    
  •   err_set_progname(argv[0]);
    
  •   if (argc != 2 || (num = atoi(argv[1])) == 0)
    
  •           errx(1, "Usage: benchmark <nitems>");
    
  •   randoms = calloc(sizeof(*randoms), num);
    
  •   srandom(0);
    
  •   for (i = 0; i < num; i++)
    
  •           randoms[i] = random();
    
  •   start = time_now();
    
  •   asort(randoms, num, compar, NULL);
    
  •   asort_time = time_between(time_now(), start);
    
  •   srandom(0);
    
  •   for (i = 0; i < num; i++)
    
  •           randoms[i] = random();
    
  •   start = time_now();
    
  •   bottom_up_heapsort(randoms, num, compar, NULL);
    
  •   heapsort_time = time_between(time_now(), start);
    
  •   printf("asort-%u = %u.%06u\n"
    
  •          "heapsort-%u = %u.%06u\n",
    
  •          num,
    
  •          (unsigned int)asort_time.ts.tv_sec,
    
  •          (unsigned int)(asort_time.ts.tv_nsec / 1000),
    
  •          num,
    
  •          (unsigned int)heapsort_time.ts.tv_sec,
    
  •          (unsigned int)(heapsort_time.ts.tv_nsec / 1000));
    
  •   return 0;
    
    +}
    ccan mailing list
    [email protected]
    https://lists.ozlabs.org/listinfo/ccan

http://au.linkedin.com/in/dpmaxime/

@dgibson
Copy link
Collaborator

dgibson commented Jul 29, 2015

On Mon, Jul 27, 2015 at 08:06:08AM -0700, Maxim Zakharov wrote:

It is indeed a heapsort sorting, though not a classical
implementation, and it can be dubbed just "heapsort" if you do not
plan to include classical heapsort implementation into ccan.

Well, ccan isn't exactly planned at all..

IIUC the bottom up heapsort and ordinary heapsort do share part of the
algorithm, so I'm wondering if it would be reasonable to call the
module "heapsort" and we can add the classical heapsort variant to the
same module later if we want it.

David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT the other
| way around!
http://www.ozlabs.org/~dgibson

@Maxime2
Copy link
Contributor Author

Maxime2 commented Jul 29, 2015

Current glib's qsort() implementation uses "Towers of Hanoi" Mergesort if it can allocate required additional memory without using swap space, otherwise it falls back to quicksort (see https://sourceware.org/ml/libc-alpha/2002-01/msg00218.html )

From performance perspective, asort() module should be better updated to the latest glibc's qsort() implementation with fall back to heapsort instead of quicksort which gives protection against quicksort's worst case and heapsort generally requires less additional memory than quicksort.

In general, ccan should not only check qsort_r() availability but also do benchmarking for its asort() against qsort_r() on the system and chose fastest one.

@rustyrussell
Copy link
Owner

Maxim Zakharov [email protected] writes:

Current glib's qsort() implementation uses "Towers of Hanoi" Mergesort if it can allocate required additional memory without using swap space, otherwise it falls back to quicksort (see https://sourceware.org/ml/libc-alpha/2002-01/msg00218.html )

From performance perspective, asort() module should be better updated to the latest glibc's qsort() implementation with fall back to heapsort instead of quicksort which gives protection against quicksort's worst case and heapsort generally requires less additional memory than quicksort.

Indeed.

In general, ccan should not only check qsort_r() availability but also
do benchmarking for its asort() against qsort_r() on the system and
chose fastest one.

That's not something you want to do at runtime, and I'm not sure how
useful that is at compile time. Unless there are specific known
problems, I prefer to rely on library versions.

I would suggest:

  1. Call your module "heapsort".
  2. Change asort to fall back to using it if !HAVE_QSORT_R_PRIVATE_LAST.

Cheers,
Rusty.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants