git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] mergesort: improve tests and performance
@ 2021-10-01  9:07 René Scharfe
  2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
                   ` (8 more replies)
  0 siblings, 9 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:07 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Our mergesort for linked lists doesn't allocate temporary memory, which
is nice.  It is traversing the list multiple times from start to finish,
though.  This series teaches it to avoid that, which speeds it up
considerably.  But first it improves the associated test helper to
exercise the code harder and to make the effect of the changes visible
in terms of saved operations.

  test-mergesort: use strbuf_getline()
  test-mergesort: add sort subcommand
  test-mergesort: add test subcommand
  test-mergesort: add generate subcommand
  test-mergesort: add unriffle mode
  test-mergesort: add unriffle_skewed mode
  p0071: measure sorting of already sorted and reversed files
  p0071: test performance of llist_mergesort()
  mergesort: use ranks stack

 mergesort.c               | 121 +++++++------
 t/helper/test-mergesort.c | 360 +++++++++++++++++++++++++++++++++++++-
 t/perf/p0071-sort.sh      |  40 ++++-
 t/t0071-sort.sh           |  11 ++
 4 files changed, 465 insertions(+), 67 deletions(-)
 create mode 100755 t/t0071-sort.sh

--
2.33.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/9] test-mergesort: use strbuf_getline()
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
@ 2021-10-01  9:10 ` René Scharfe
  2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
  2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:10 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Strip line ending characters to make sure empty lines are sorted like
sort(1) does.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index c5cffaa4b7..621e2a5197 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -28,9 +28,7 @@ int cmd__mergesort(int argc, const char **argv)
 	struct line *line, *p = NULL, *lines = NULL;
 	struct strbuf sb = STRBUF_INIT;

-	for (;;) {
-		if (strbuf_getwholeline(&sb, stdin, '\n'))
-			break;
+	while (!strbuf_getline(&sb, stdin)) {
 		line = xmalloc(sizeof(struct line));
 		line->text = strbuf_detach(&sb, NULL);
 		if (p) {
@@ -46,7 +44,7 @@ int cmd__mergesort(int argc, const char **argv)
 	lines = llist_mergesort(lines, get_next, set_next, compare_strings);

 	while (lines) {
-		printf("%s", lines->text);
+		puts(lines->text);
 		lines = lines->next;
 	}
 	return 0;
--
2.33.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/9] test-mergesort: add sort subcommand
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
  2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
@ 2021-10-01  9:11 ` René Scharfe
  2021-10-01 20:26   ` Junio C Hamano
  2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:11 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Give the code for sorting a text file its own sub-command.  This allows
extending the helper, which we'll do in the following patches.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 621e2a5197..05be0d067a 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -23,7 +23,7 @@ static int compare_strings(const void *a, const void *b)
 	return strcmp(x->text, y->text);
 }

-int cmd__mergesort(int argc, const char **argv)
+static int sort_stdin(void)
 {
 	struct line *line, *p = NULL, *lines = NULL;
 	struct strbuf sb = STRBUF_INIT;
@@ -49,3 +49,10 @@ int cmd__mergesort(int argc, const char **argv)
 	}
 	return 0;
 }
+
+int cmd__mergesort(int argc, const char **argv)
+{
+	if (argc == 2 && !strcmp(argv[1], "sort"))
+		return sort_stdin();
+	usage("test-tool mergesort sort");
+}
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
  2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
  2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
@ 2021-10-01  9:12 ` René Scharfe
  2021-10-01 20:26   ` Junio C Hamano
  2021-10-01  9:14 ` [PATCH 4/9] test-mergesort: add generate subcommand René Scharfe
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:12 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Adapt the qsort certification program from "Engineering a Sort Function"
by Bentley and McIlroy for testing our linked list sort function.  It
generates several lists with various distribution patterns and counts
the number of operations llist_mergesort() needs to order them.  It
compares the result to the output of a trusted sort function (qsort(1))
and also checks if the sort is stable.

Also add a test script that makes use of the new subcommand.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 232 +++++++++++++++++++++++++++++++++++++-
 t/t0071-sort.sh           |  11 ++
 2 files changed, 242 insertions(+), 1 deletion(-)
 create mode 100755 t/t0071-sort.sh

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 05be0d067a..8006be8bf8 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -50,9 +50,239 @@ static int sort_stdin(void)
 	return 0;
 }

+static void dist_sawtooth(int *arr, int n, int m)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		arr[i] = i % m;
+}
+
+static void dist_rand(int *arr, int n, int m)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		arr[i] = rand() % m;
+}
+
+static void dist_stagger(int *arr, int n, int m)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		arr[i] = (i * m + i) % n;
+}
+
+static void dist_plateau(int *arr, int n, int m)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		arr[i] = (i < m) ? i : m;
+}
+
+static void dist_shuffle(int *arr, int n, int m)
+{
+	int i, j, k;
+	for (i = j = 0, k = 1; i < n; i++)
+		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
+}
+
+#define DIST(name) { #name, dist_##name }
+
+static struct dist {
+	const char *name;
+	void (*fn)(int *arr, int n, int m);
+} dist[] = {
+	DIST(sawtooth),
+	DIST(rand),
+	DIST(stagger),
+	DIST(plateau),
+	DIST(shuffle),
+};
+
+static void mode_copy(int *arr, int n)
+{
+	/* nothing */
+}
+
+static void mode_reverse(int *arr, int n)
+{
+	int i, j;
+	for (i = 0, j = n - 1; i < j; i++, j--)
+		SWAP(arr[i], arr[j]);
+}
+
+static void mode_reverse_1st_half(int *arr, int n)
+{
+	mode_reverse(arr, n / 2);
+}
+
+static void mode_reverse_2nd_half(int *arr, int n)
+{
+	int half = n / 2;
+	mode_reverse(arr + half, n - half);
+}
+
+static int compare_ints(const void *av, const void *bv)
+{
+	const int *ap = av, *bp = bv;
+	int a = *ap, b = *bp;
+	return (a > b) - (a < b);
+}
+
+static void mode_sort(int *arr, int n)
+{
+	QSORT(arr, n, compare_ints);
+}
+
+static void mode_dither(int *arr, int n)
+{
+	int i;
+	for (i = 0; i < n; i++)
+		arr[i] += i % 5;
+}
+
+#define MODE(name) { #name, mode_##name }
+
+static struct mode {
+	const char *name;
+	void (*fn)(int *arr, int n);
+} mode[] = {
+	MODE(copy),
+	MODE(reverse),
+	MODE(reverse_1st_half),
+	MODE(reverse_2nd_half),
+	MODE(sort),
+	MODE(dither),
+};
+
+static struct stats {
+	int get_next, set_next, compare;
+} stats;
+
+struct number {
+	int value, rank;
+	struct number *next;
+};
+
+static void *get_next_number(const void *a)
+{
+	stats.get_next++;
+	return ((const struct number *)a)->next;
+}
+
+static void set_next_number(void *a, void *b)
+{
+	stats.set_next++;
+	((struct number *)a)->next = b;
+}
+
+static int compare_numbers(const void *av, const void *bv)
+{
+	const struct number *an = av, *bn = bv;
+	int a = an->value, b = bn->value;
+	stats.compare++;
+	return (a > b) - (a < b);
+}
+
+static void clear_numbers(struct number *list)
+{
+	while (list) {
+		struct number *next = list->next;
+		free(list);
+		list = next;
+	}
+}
+
+static int test(const struct dist *dist, const struct mode *mode, int n, int m)
+{
+	int *arr;
+	size_t i;
+	struct number *curr, *list, **tail;
+	int is_sorted = 1;
+	int is_stable = 1;
+	const char *verdict;
+	int result = -1;
+
+	ALLOC_ARRAY(arr, n);
+	dist->fn(arr, n, m);
+	mode->fn(arr, n);
+	for (i = 0, tail = &list; i < n; i++) {
+		curr = xmalloc(sizeof(*curr));
+		curr->value = arr[i];
+		curr->rank = i;
+		*tail = curr;
+		tail = &curr->next;
+	}
+	*tail = NULL;
+
+	stats.get_next = stats.set_next = stats.compare = 0;
+	list = llist_mergesort(list, get_next_number, set_next_number,
+			       compare_numbers);
+
+	QSORT(arr, n, compare_ints);
+	for (i = 0, curr = list; i < n && curr; i++, curr = curr->next) {
+		if (arr[i] != curr->value)
+			is_sorted = 0;
+		if (curr->next && curr->value == curr->next->value &&
+		    curr->rank >= curr->next->rank)
+			is_stable = 0;
+	}
+	if (i < n) {
+		verdict = "too short";
+	} else if (curr) {
+		verdict = "too long";
+	} else if (!is_sorted) {
+		verdict = "not sorted";
+	} else if (!is_stable) {
+		verdict = "unstable";
+	} else {
+		verdict = "OK";
+		result = 0;
+	}
+
+	printf("%-9s %-16s %8d %8d %8d %8d %8d %s\n",
+	       dist->name, mode->name, n, m, stats.get_next, stats.set_next,
+	       stats.compare, verdict);
+
+	clear_numbers(list);
+	free(arr);
+
+	return result;
+}
+
+/*
+ * A version of the qsort certification program from "Engineering a Sort
+ * Function" by Bentley and McIlroy, Software—Practice and Experience,
+ * Volume 23, Issue 11, 1249–1265 (November 1993).
+ */
+static int run_tests(int argc, const char **argv)
+{
+	const char *argv_default[] = { "100", "1023", "1024", "1025" };
+	if (!argc)
+		return run_tests(ARRAY_SIZE(argv_default), argv_default);
+	printf("%-9s %-16s %8s %8s %8s %8s %8s %s\n",
+	       "distribut", "mode", "n", "m", "get_next", "set_next",
+	       "compare", "verdict");
+	while (argc--) {
+		int i, j, m, n = strtol(*argv++, NULL, 10);
+		for (i = 0; i < ARRAY_SIZE(dist); i++) {
+			for (j = 0; j < ARRAY_SIZE(mode); j++) {
+				for (m = 1; m < 2 * n; m *= 2) {
+					if (test(&dist[i], &mode[j], n, m))
+						return 1;
+				}
+			}
+		}
+	}
+	return 0;
+}
+
 int cmd__mergesort(int argc, const char **argv)
 {
 	if (argc == 2 && !strcmp(argv[1], "sort"))
 		return sort_stdin();
-	usage("test-tool mergesort sort");
+	if (argc > 1 && !strcmp(argv[1], "test"))
+		return run_tests(argc - 2, argv + 2);
+	fprintf(stderr, "usage: test-tool mergesort sort\n");
+	fprintf(stderr, "   or: test-tool mergesort test [<n>...]\n");
+	return 129;
 }
diff --git a/t/t0071-sort.sh b/t/t0071-sort.sh
new file mode 100755
index 0000000000..a8ab174879
--- /dev/null
+++ b/t/t0071-sort.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+test_description='verify sort functions'
+
+. ./test-lib.sh
+
+test_expect_success 'llist_mergesort()' '
+	test-tool mergesort test
+'
+
+test_done
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/9] test-mergesort: add generate subcommand
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
                   ` (2 preceding siblings ...)
  2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
@ 2021-10-01  9:14 ` René Scharfe
  2021-10-01  9:16 ` [PATCH 5/9] test-mergesort: add unriffle mode René Scharfe
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:14 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Add a subcommand for printing test data.  It can be used to generate
special test cases and feed them into the sort subcommand or sort(1) for
performance measurements.  It may also be useful to illustrate the
effect of distributions, modes and their parameters.

It generates n integers with the specified distribution and its
distribution-specific parameter m.  E.g. m is the maximum value for
the plateau distribution and the length and height of individual teeth
of the sawtooth distribution.

The generated values are printed as zero-padded eight-digit hexadecimal
numbers to make sure alphabetic and numeric order are the same.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 60 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 8006be8bf8..27ba252d4a 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -98,6 +98,16 @@ static struct dist {
 	DIST(shuffle),
 };

+static const struct dist *get_dist_by_name(const char *name)
+{
+	int i;
+	for (i = 0; i < ARRAY_SIZE(dist); i++) {
+	       if (!strcmp(dist[i].name, name))
+		       return &dist[i];
+	}
+	return NULL;
+}
+
 static void mode_copy(int *arr, int n)
 {
 	/* nothing */
@@ -154,6 +164,41 @@ static struct mode {
 	MODE(dither),
 };

+static const struct mode *get_mode_by_name(const char *name)
+{
+	int i;
+	for (i = 0; i < ARRAY_SIZE(mode); i++) {
+	       if (!strcmp(mode[i].name, name))
+		       return &mode[i];
+	}
+	return NULL;
+}
+
+static int generate(int argc, const char **argv)
+{
+	const struct dist *dist = NULL;
+	const struct mode *mode = NULL;
+	int i, n, m, *arr;
+
+	if (argc != 4)
+		return 1;
+
+	dist = get_dist_by_name(argv[0]);
+	mode = get_mode_by_name(argv[1]);
+	n = strtol(argv[2], NULL, 10);
+	m = strtol(argv[3], NULL, 10);
+	if (!dist || !mode)
+		return 1;
+
+	ALLOC_ARRAY(arr, n);
+	dist->fn(arr, n, m);
+	mode->fn(arr, n);
+	for (i = 0; i < n; i++)
+		printf("%08x\n", arr[i]);
+	free(arr);
+	return 0;
+}
+
 static struct stats {
 	int get_next, set_next, compare;
 } stats;
@@ -278,11 +323,24 @@ static int run_tests(int argc, const char **argv)

 int cmd__mergesort(int argc, const char **argv)
 {
+	int i;
+	const char *sep;
+
+	if (argc == 6 && !strcmp(argv[1], "generate"))
+		return generate(argc - 2, argv + 2);
 	if (argc == 2 && !strcmp(argv[1], "sort"))
 		return sort_stdin();
 	if (argc > 1 && !strcmp(argv[1], "test"))
 		return run_tests(argc - 2, argv + 2);
-	fprintf(stderr, "usage: test-tool mergesort sort\n");
+	fprintf(stderr, "usage: test-tool mergesort generate <distribution> <mode> <n> <m>\n");
+	fprintf(stderr, "   or: test-tool mergesort sort\n");
 	fprintf(stderr, "   or: test-tool mergesort test [<n>...]\n");
+	fprintf(stderr, "\n");
+	for (i = 0, sep = "distributions: "; i < ARRAY_SIZE(dist); i++, sep = ", ")
+		fprintf(stderr, "%s%s", sep, dist[i].name);
+	fprintf(stderr, "\n");
+	for (i = 0, sep = "modes: "; i < ARRAY_SIZE(mode); i++, sep = ", ")
+		fprintf(stderr, "%s%s", sep, mode[i].name);
+	fprintf(stderr, "\n");
 	return 129;
 }
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 5/9] test-mergesort: add unriffle mode
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
                   ` (3 preceding siblings ...)
  2021-10-01  9:14 ` [PATCH 4/9] test-mergesort: add generate subcommand René Scharfe
@ 2021-10-01  9:16 ` René Scharfe
  2021-10-01  9:17 ` [PATCH 6/9] test-mergesort: add unriffle_skewed mode René Scharfe
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:16 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Add a mode that turns sorted items into adversarial input for mergesort.
Do that by running mergesort in reverse and rearranging the items in
such a way that each merge needs the maximum number of operations to
undo it.

To riffle is a card shuffling technique and involves splitting a deck
into two and then to interleave them.  A perfect riffle takes one card
from each half in turn.  That's similar to the most expensive merge,
which has to take one item from each sublist in turn, which requires the
maximum number of comparisons (n-1).

So unriffle does that in reverse, i.e. it generates the first sublist
out of the items at even indexes and the second sublist out of the items
at odd indexes, without changing their order in any other way.  Done
recursively until we reach the trivial sublist length of one, this
twists the list into an order that requires the maximum effort for
mergesort to untangle.

As a baseline, here are the rand distributions with the highest number
of comparisons from "test-tool mergesort test":

   $ t/helper/test-tool mergesort test | awk '
      NR > 1 && $1 != "rand" {next}
      $7 > max[$3] {max[$3] = $7; line[$3] = $0}
      END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
rand      copy                  100       32     1184      700      569 OK
rand      reverse_1st_half     1023      256    16373    10230     8976 OK
rand      reverse_1st_half     1024      512    16384    10240     8993 OK
rand      dither               1025       64    18454    11275     9970 OK

And here are the most expensive ones overall:

   $ t/helper/test-tool mergesort test | awk '
      $7 > max[$3] {max[$3] = $7; line[$3] = $0}
      END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
stagger   reverse               100       64     1184      700      580 OK
sawtooth  unriffle             1023     1024    16373    10230     9179 OK
sawtooth  unriffle             1024     1024    16384    10240     9217 OK
stagger   unriffle             1025     2048    18454    11275    10241 OK

The sawtooth distribution with m>=n generates a sorted list.  The
unriffle mode is designed to turn that into adversarial input for
mergesort, and that checks out for n=1023 and n=1024, where it produces
the list that requires the most comparisons.

Item counts that are not powers of two have other winners, and that's
because unriffle recursively splits lists into equal-sized halves, while
llist_mergesort() splits them into the biggest power of two smaller than
n and the rest, e.g. for n=1025 it sorts the first 1024 separately and
finally merges them to the last item.

So unriffle mode works as designed for the intended use case, but to
consistently generate adversarial input for unbalanced merges we need
something else.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 27ba252d4a..d71ef568f3 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -150,6 +150,34 @@ static void mode_dither(int *arr, int n)
 		arr[i] += i % 5;
 }

+static void unriffle(int *arr, int n, int *tmp)
+{
+	int i, j;
+	COPY_ARRAY(tmp, arr, n);
+	for (i = j = 0; i < n; i += 2)
+		arr[j++] = tmp[i];
+	for (i = 1; i < n; i += 2)
+		arr[j++] = tmp[i];
+}
+
+static void unriffle_recursively(int *arr, int n, int *tmp)
+{
+	if (n > 1) {
+		int half = n / 2;
+		unriffle(arr, n, tmp);
+		unriffle_recursively(arr, half, tmp);
+		unriffle_recursively(arr + half, n - half, tmp);
+	}
+}
+
+static void mode_unriffle(int *arr, int n)
+{
+	int *tmp;
+	ALLOC_ARRAY(tmp, n);
+	unriffle_recursively(arr, n, tmp);
+	free(tmp);
+}
+
 #define MODE(name) { #name, mode_##name }

 static struct mode {
@@ -162,6 +190,7 @@ static struct mode {
 	MODE(reverse_2nd_half),
 	MODE(sort),
 	MODE(dither),
+	MODE(unriffle),
 };

 static const struct mode *get_mode_by_name(const char *name)
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 6/9] test-mergesort: add unriffle_skewed mode
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
                   ` (4 preceding siblings ...)
  2021-10-01  9:16 ` [PATCH 5/9] test-mergesort: add unriffle mode René Scharfe
@ 2021-10-01  9:17 ` René Scharfe
  2021-10-01  9:19 ` [PATCH 7/9] p0071: measure sorting of already sorted and reversed files René Scharfe
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:17 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Add a mode that turns a sorted list into adversarial input for a
bottom-up mergesort implementation that doubles the length of sorted
sublists at each level -- like our llist_mergesort().

While unriffle mode splits the list in half at each recursion step,
unriffle_skewed splits it into 2^l items and the rest, with 2^l being
the highest power of two smaller than the number of items and thus
2^l >= rest.  The rest is unriffled with the tail of the first half to
require a merge to compare the maximum number of elements.

It complements the unriffle mode, which targets balanced merges.  If
the number of elements is a power of two then both actually produce the
same result, as 2^l == rest == n/2 at each recursion step in that case.

Here are the results:

   $ t/helper/test-tool mergesort test | awk '
      $7 > max[$3] {max[$3] = $7; line[$3] = $0}
      END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
sawtooth  unriffle_skewed       100      128     1184      700      589 OK
sawtooth  unriffle_skewed      1023     1024    16373    10230     9207 OK
sawtooth  unriffle             1024     1024    16384    10240     9217 OK
sawtooth  unriffle_skewed      1025     2048    18454    11275    10241 OK

The sawtooth distribution with m>=n produces a sorted list and
unriffle_skewed mode turns it into adversarial input for unbalanced
merges, which it wins in all cases except for n=1024 -- the resulting
list is the same, but unriffle is tested before unriffle_skewed, so its
result is selected by the AWK script.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index d71ef568f3..43ec74e2d3 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -178,6 +178,33 @@ static void mode_unriffle(int *arr, int n)
 	free(tmp);
 }

+static unsigned int prev_pow2(unsigned int n)
+{
+	unsigned int pow2 = 1;
+	while (pow2 * 2 < n)
+		pow2 *= 2;
+	return pow2;
+}
+
+static void unriffle_recursively_skewed(int *arr, int n, int *tmp)
+{
+	if (n > 1) {
+		int pow2 = prev_pow2(n);
+		int rest = n - pow2;
+		unriffle(arr + pow2 - rest, rest * 2, tmp);
+		unriffle_recursively_skewed(arr, pow2, tmp);
+		unriffle_recursively_skewed(arr + pow2, rest, tmp);
+	}
+}
+
+static void mode_unriffle_skewed(int *arr, int n)
+{
+	int *tmp;
+	ALLOC_ARRAY(tmp, n);
+	unriffle_recursively_skewed(arr, n, tmp);
+	free(tmp);
+}
+
 #define MODE(name) { #name, mode_##name }

 static struct mode {
@@ -191,6 +218,7 @@ static struct mode {
 	MODE(sort),
 	MODE(dither),
 	MODE(unriffle),
+	MODE(unriffle_skewed),
 };

 static const struct mode *get_mode_by_name(const char *name)
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 7/9] p0071: measure sorting of already sorted and reversed files
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
                   ` (5 preceding siblings ...)
  2021-10-01  9:17 ` [PATCH 6/9] test-mergesort: add unriffle_skewed mode René Scharfe
@ 2021-10-01  9:19 ` René Scharfe
  2021-10-01  9:19 ` [PATCH 8/9] p0071: test performance of llist_mergesort() René Scharfe
  2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
  8 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:19 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Check if sorting takes advantage of already sorted or reversed content,
or if that corner case actually decreases performance, like it would for
a simplistic quicksort implementation.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/perf/p0071-sort.sh | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/t/perf/p0071-sort.sh b/t/perf/p0071-sort.sh
index 6e924f5fa3..5b39b68f35 100755
--- a/t/perf/p0071-sort.sh
+++ b/t/perf/p0071-sort.sh
@@ -11,16 +11,31 @@ test_expect_success 'setup' '
 	git cat-file --batch >unsorted
 '

-test_perf 'sort(1)' '
-	sort <unsorted >expect
+test_perf 'sort(1) unsorted' '
+	sort <unsorted >sorted
 '

-test_perf 'string_list_sort()' '
-	test-tool string-list sort <unsorted >actual
+test_expect_success 'reverse' '
+	sort -r <unsorted >reversed
 '

-test_expect_success 'string_list_sort() sorts like sort(1)' '
-	test_cmp_bin expect actual
-'
+for file in sorted reversed
+do
+	test_perf "sort(1) $file" "
+		sort <$file >actual
+	"
+done
+
+for file in unsorted sorted reversed
+do
+
+	test_perf "string_list_sort() $file" "
+		test-tool string-list sort <$file >actual
+	"
+
+	test_expect_success "string_list_sort() $file sorts like sort(1)" "
+		test_cmp_bin sorted actual
+	"
+done

 test_done
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 8/9] p0071: test performance of llist_mergesort()
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
                   ` (6 preceding siblings ...)
  2021-10-01  9:19 ` [PATCH 7/9] p0071: measure sorting of already sorted and reversed files René Scharfe
@ 2021-10-01  9:19 ` René Scharfe
  2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
  8 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:19 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/perf/p0071-sort.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/perf/p0071-sort.sh b/t/perf/p0071-sort.sh
index 5b39b68f35..ed366e2e12 100755
--- a/t/perf/p0071-sort.sh
+++ b/t/perf/p0071-sort.sh
@@ -38,4 +38,15 @@ do
 	"
 done

+for file in unsorted sorted reversed
+do
+	test_perf "llist_mergesort() $file" "
+		test-tool mergesort sort <$file >actual
+	"
+
+	test_expect_success "llist_mergesort() $file sorts like sort(1)" "
+		test_cmp_bin sorted actual
+	"
+done
+
 test_done
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 9/9] mergesort: use ranks stack
  2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
                   ` (7 preceding siblings ...)
  2021-10-01  9:19 ` [PATCH 8/9] p0071: test performance of llist_mergesort() René Scharfe
@ 2021-10-01  9:22 ` René Scharfe
  2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
  8 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-01  9:22 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano

The bottom-up mergesort implementation needs to skip through sublists a
lot.  A recursive version could avoid that, but would require log2(n)
stack frames.  Explicitly manage a stack of sorted sublists of various
lengths instead to avoid fast-forwarding while also keeping a lid on
memory usage.

While this patch was developed independently, a ranks stack is also used
in https://github.com/mono/mono/blob/master/mono/eglib/sort.frag.h by
the Mono project.

The idea is to keep slots for log2(n_max) sorted sublists, one for each
power of 2.  Such a construct can accommodate lists of any length up to
n_max.  Since there is a known maximum number of items (effectively
SIZE_MAX), we can preallocate the whole rank stack.

We add items one by one, which is akin to incrementing a binary number.
Make use of that by keeping track of the number of items and check bits
in it instead of checking for NULL in the rank stack when checking if a
sublist of a certain rank exists, in order to avoid memory accesses.

The first item can go into the empty first slot as a sublist of length
2^0.  The second one needs to be merged with the previous sublist and
the result goes into the empty second slot as a sublist of length 2^1.
The third one goes into vacated first slot and so on.  At the end we
merge all the sublists to get the result.

The new version still performs a stable sort by making sure to put items
seen earlier first when the compare function indicates equality.  That's
done by preferring items from sublists with a higher rank.

The new merge function also tries to minimize the number of operations.
Like blame.c::blame_merge(), the function doesn't set the next pointer
if it already points to the right item, and it exits when it reaches the
end of one of the two sublists that it's given.  The old code couldn't
do the latter because it kept all items in a single list.

The number of comparisons stays the same, though.  Here's example output
of "test-tool mergesort test" for the rand distributions with the most
number of comparisons with the ranks stack:

   $ t/helper/test-tool mergesort test | awk '
       NR > 1 && $1 != "rand" {next}
       $7 > max[$3] {max[$3] = $7; line[$3] = $0}
       END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
rand      copy                  100       32      669      420      569 OK
rand      dither               1023       64     9997     5396     8974 OK
rand      dither               1024      512    10007     6159     8983 OK
rand      dither               1025      256    10993     5988     9968 OK

Here are the differences to the results without this patch:

distribut mode                    n        m get_next set_next  compare
rand      copy                  100       32     -515     -280        0
rand      dither               1023       64    -6376    -4834        0
rand      dither               1024      512    -6377    -4081        0
rand      dither               1025      256    -7461    -5287        0

The numbers of get_next and set_next calls are reduced significantly.

NB: These winners are different than the ones shown in the patch that
introduced the unriffle mode because the addition of the unriffle_skewed
mode in between changed the consumption of rand() values.

Here are the distributions with the most comparisons overall with the
ranks stack:

   $ t/helper/test-tool mergesort test | awk '
       $7 > max[$3] {max[$3] = $7; line[$3] = $0}
       END {for (n in line) print line[n]}
   '

distribut mode                    n        m get_next set_next  compare verdict
sawtooth  unriffle_skewed       100      128      689      632      589 OK
sawtooth  unriffle_skewed      1023     1024    10230    10220     9207 OK
sawtooth  unriffle             1024     1024    10241    10240     9217 OK
sawtooth  unriffle_skewed      1025     2048    11266    10242    10241 OK

And here the differences to before:

distribut mode                    n        m get_next set_next  compare
sawtooth  unriffle_skewed       100      128     -495      -68        0
sawtooth  unriffle_skewed      1023     1024    -6143      -10        0
sawtooth  unriffle             1024     1024    -6143        0        0
sawtooth  unriffle_skewed      1025     2048    -7188    -1033        0

We get a similar reduction of get_next calls here, but only a slight
reduction of set_next calls, if at all.

And here are the results of p0071-sort.sh before:

0071.12: llist_mergesort() unsorted    0.36(0.33+0.01)
0071.14: llist_mergesort() sorted      0.15(0.13+0.01)
0071.16: llist_mergesort() reversed    0.16(0.14+0.01)

... and here the ones with this patch:

0071.12: llist_mergesort() unsorted    0.24(0.22+0.01)
0071.14: llist_mergesort() sorted      0.12(0.10+0.01)
0071.16: llist_mergesort() reversed    0.12(0.10+0.01)

NB: We can't use t/perf/run to compare revisions in one run because it
uses the test-tool from the worktree, not from the revisions being
tested.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 mergesort.c | 121 ++++++++++++++++++++++++++++------------------------
 1 file changed, 66 insertions(+), 55 deletions(-)

diff --git a/mergesort.c b/mergesort.c
index e5fdf2ee4a..6216835566 100644
--- a/mergesort.c
+++ b/mergesort.c
@@ -1,73 +1,84 @@
 #include "cache.h"
 #include "mergesort.h"

-struct mergesort_sublist {
-	void *ptr;
-	unsigned long len;
-};
-
-static void *get_nth_next(void *list, unsigned long n,
-			  void *(*get_next_fn)(const void *))
+/* Combine two sorted lists.  Take from `list` on equality. */
+static void *llist_merge(void *list, void *other,
+			 void *(*get_next_fn)(const void *),
+			 void (*set_next_fn)(void *, void *),
+			 int (*compare_fn)(const void *, const void *))
 {
-	while (n-- && list)
-		list = get_next_fn(list);
-	return list;
-}
+	void *result = list, *tail;

-static void *pop_item(struct mergesort_sublist *l,
-		      void *(*get_next_fn)(const void *))
-{
-	void *p = l->ptr;
-	l->ptr = get_next_fn(l->ptr);
-	l->len = l->ptr ? (l->len - 1) : 0;
-	return p;
+	if (compare_fn(list, other) > 0) {
+		result = other;
+		goto other;
+	}
+	for (;;) {
+		do {
+			tail = list;
+			list = get_next_fn(list);
+			if (!list) {
+				set_next_fn(tail, other);
+				return result;
+			}
+		} while (compare_fn(list, other) <= 0);
+		set_next_fn(tail, other);
+	other:
+		do {
+			tail = other;
+			other = get_next_fn(other);
+			if (!other) {
+				set_next_fn(tail, list);
+				return result;
+			}
+		} while (compare_fn(list, other) > 0);
+		set_next_fn(tail, list);
+	}
 }

+/*
+ * Perform an iterative mergesort using an array of sublists.
+ *
+ * n is the number of items.
+ * ranks[i] is undefined if n & 2^i == 0, and assumed empty.
+ * ranks[i] contains a sublist of length 2^i otherwise.
+ *
+ * The number of bits in a void pointer limits the number of objects
+ * that can be created, and thus the number of array elements necessary
+ * to be able to sort any valid list.
+ *
+ * Adding an item to this array is like incrementing a binary number;
+ * positional values for set bits correspond to sublist lengths.
+ */
 void *llist_mergesort(void *list,
 		      void *(*get_next_fn)(const void *),
 		      void (*set_next_fn)(void *, void *),
 		      int (*compare_fn)(const void *, const void *))
 {
-	unsigned long l;
-
-	if (!list)
-		return NULL;
-	for (l = 1; ; l *= 2) {
-		void *curr;
-		struct mergesort_sublist p, q;
+	void *ranks[bitsizeof(void *)];
+	size_t n = 0;
+	int i;

-		p.ptr = list;
-		q.ptr = get_nth_next(p.ptr, l, get_next_fn);
-		if (!q.ptr)
-			break;
-		p.len = q.len = l;
+	while (list) {
+		void *next = get_next_fn(list);
+		if (next)
+			set_next_fn(list, NULL);
+		for (i = 0; n & (1 << i); i++)
+			list = llist_merge(ranks[i], list, get_next_fn,
+					   set_next_fn, compare_fn);
+		n++;
+		ranks[i] = list;
+		list = next;
+	}

-		if (compare_fn(p.ptr, q.ptr) > 0)
-			list = curr = pop_item(&q, get_next_fn);
+	for (i = 0; n; i++, n >>= 1) {
+		if (!(n & 1))
+			continue;
+		if (list)
+			list = llist_merge(ranks[i], list, get_next_fn,
+					   set_next_fn, compare_fn);
 		else
-			list = curr = pop_item(&p, get_next_fn);
-
-		while (p.ptr) {
-			while (p.len || q.len) {
-				void *prev = curr;
-
-				if (!p.len)
-					curr = pop_item(&q, get_next_fn);
-				else if (!q.len)
-					curr = pop_item(&p, get_next_fn);
-				else if (compare_fn(p.ptr, q.ptr) > 0)
-					curr = pop_item(&q, get_next_fn);
-				else
-					curr = pop_item(&p, get_next_fn);
-				set_next_fn(prev, curr);
-			}
-			p.ptr = q.ptr;
-			p.len = l;
-			q.ptr = get_nth_next(p.ptr, l, get_next_fn);
-			q.len = q.ptr ? l : 0;
-
-		}
-		set_next_fn(curr, NULL);
+			list = ranks[i];
 	}
 	return list;
 }
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
@ 2021-10-01 20:26   ` Junio C Hamano
  2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-10-01 20:26 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

> +static void dist_rand(int *arr, int n, int m)
> +{
> +	int i;
> +	for (i = 0; i < n; i++)
> +		arr[i] = rand() % m;
> +}
> ...
> +static void dist_shuffle(int *arr, int n, int m)
> +{
> +	int i, j, k;
> +	for (i = j = 0, k = 1; i < n; i++)
> +		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
> +}

I briefly wondered if we want to seed the rand() in some way to make
the tests reproducible, but we'd need to ship our own rand() if we
wanted to go that route, which would probably be too much.

> +static int test(const struct dist *dist, const struct mode *mode, int n, int m)
> +{
> +...
> +	for (i = 0, curr = list; i < n && curr; i++, curr = curr->next) {
> +		if (arr[i] != curr->value)
> +			is_sorted = 0;
> +		if (curr->next && curr->value == curr->next->value &&
> +		    curr->rank >= curr->next->rank)
> +			is_stable = 0;
> +	}
> +	if (i < n) {
> +		verdict = "too short";
> +	} else if (curr) {
> +		verdict = "too long";
> +	} else if (!is_sorted) {
> +		verdict = "not sorted";
> +	} else if (!is_stable) {
> +		verdict = "unstable";
> +	} else {
> +		verdict = "OK";
> +		result = 0;
> +	}
> +
> +	printf("%-9s %-16s %8d %8d %8d %8d %8d %s\n",
> +	       dist->name, mode->name, n, m, stats.get_next, stats.set_next,
> +	       stats.compare, verdict);
> +
> +	clear_numbers(list);
> +	free(arr);
> +
> +	return result;
> +}

Nice.

>  int cmd__mergesort(int argc, const char **argv)
>  {
>  	if (argc == 2 && !strcmp(argv[1], "sort"))
>  		return sort_stdin();
> -	usage("test-tool mergesort sort");
> +	if (argc > 1 && !strcmp(argv[1], "test"))
> +		return run_tests(argc - 2, argv + 2);
> +	fprintf(stderr, "usage: test-tool mergesort sort\n");
> +	fprintf(stderr, "   or: test-tool mergesort test [<n>...]\n");
> +	return 129;

If you can live with OPT_CMDMODE to implement sort/test subcommands,
you'd get to have parse_options() to do the usage for you, I think.
I am not sure if it is worth it, as t/helper/ is not end-user
facing.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/9] test-mergesort: add sort subcommand
  2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
@ 2021-10-01 20:26   ` Junio C Hamano
  0 siblings, 0 replies; 29+ messages in thread
From: Junio C Hamano @ 2021-10-01 20:26 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

René Scharfe <l.s.r@web.de> writes:

> Give the code for sorting a text file its own sub-command.  This allows
> extending the helper, which we'll do in the following patches.
> ...
> -int cmd__mergesort(int argc, const char **argv)
> +static int sort_stdin(void)
>  {
>  	struct line *line, *p = NULL, *lines = NULL;
>  	struct strbuf sb = STRBUF_INIT;
> @@ -49,3 +49,10 @@ int cmd__mergesort(int argc, const char **argv)
>  	}
>  	return 0;
>  }
> +
> +int cmd__mergesort(int argc, const char **argv)
> +{
> +	if (argc == 2 && !strcmp(argv[1], "sort"))
> +		return sort_stdin();
> +	usage("test-tool mergesort sort");
> +}

This smelled funny, as it would certainly have broken any existing
script in t/ that were using "test-tool mergesort <input".

But "git grep mergesort master -- t/" reveals that nobody uses it,
so we are safe ;-)


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-01 20:26   ` Junio C Hamano
@ 2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
  2021-10-03 10:15       ` René Scharfe
  2021-10-03 10:15       ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
  0 siblings, 2 replies; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-02  8:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, Git List


On Fri, Oct 01 2021, Junio C Hamano wrote:

> René Scharfe <l.s.r@web.de> writes:
>
>> +static void dist_rand(int *arr, int n, int m)
>> +{
>> +	int i;
>> +	for (i = 0; i < n; i++)
>> +		arr[i] = rand() % m;
>> +}
>> ...
>> +static void dist_shuffle(int *arr, int n, int m)
>> +{
>> +	int i, j, k;
>> +	for (i = j = 0, k = 1; i < n; i++)
>> +		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
>> +}
>
> I briefly wondered if we want to seed the rand() in some way to make
> the tests reproducible, but we'd need to ship our own rand() if we
> wanted to go that route, which would probably be too much.

Wouldn't calling srand() with some constant value suffice on most
platforms? I'm aware of it being a NOOP and rand() always being randomly
seeded on (IIRC) OpenBSD, but that should work on e.g. glibc.

>>  int cmd__mergesort(int argc, const char **argv)
>>  {
>>  	if (argc == 2 && !strcmp(argv[1], "sort"))
>>  		return sort_stdin();
>> -	usage("test-tool mergesort sort");
>> +	if (argc > 1 && !strcmp(argv[1], "test"))
>> +		return run_tests(argc - 2, argv + 2);
>> +	fprintf(stderr, "usage: test-tool mergesort sort\n");
>> +	fprintf(stderr, "   or: test-tool mergesort test [<n>...]\n");
>> +	return 129;
>
> If you can live with OPT_CMDMODE to implement sort/test subcommands,
> you'd get to have parse_options() to do the usage for you, I think.
> I am not sure if it is worth it, as t/helper/ is not end-user
> facing.

Yeah I think the one thing that could improve here is this custom
getopts handling, in particular the manual formatting of "usage" and
"or" is a bit ugly, considering that you'll get it for free with the
parse_options() "usage" array.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] test-mergesort: use strbuf_getline()
  2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
@ 2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
  2021-10-02 16:56     ` René Scharfe
  0 siblings, 1 reply; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-02  9:08 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Junio C Hamano


On Fri, Oct 01 2021, René Scharfe wrote:

>  	while (lines) {
> -		printf("%s", lines->text);
> +		puts(lines->text);
>  		lines = lines->next;

Aside: I wonder if we should have a coccicheck for that (not as part of
this series), but maybe it would generate too much noise.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/9] test-mergesort: use strbuf_getline()
  2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
@ 2021-10-02 16:56     ` René Scharfe
  0 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-02 16:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git List, Junio C Hamano

Am 02.10.21 um 11:08 schrieb Ævar Arnfjörð Bjarmason:
>
> On Fri, Oct 01 2021, René Scharfe wrote:
>
>>  	while (lines) {
>> -		printf("%s", lines->text);
>> +		puts(lines->text);
>>  		lines = lines->next;
>
> Aside: I wonder if we should have a coccicheck for that (not as part of
> this series), but maybe it would generate too much noise.

To replace printf("%s\n", s) with puts(s)?  I considered such a semantic
patch before as well.  It would effectively forbid that specific printf
usage.  And why shouldn't it?  puts(3) is easier to use and its call is
shorter.

But puts(3) is also confusing: It adds a trailing newline, but its
sibling fputs(3) doesn't.  And it would look weird in the middle of a
run of printf calls.

Clang already does the replacement with -O1 and GCC even with -O0, so
there is no further performance improvement or object text size
reduction to be gained.

And so I dropped the idea.

René

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
@ 2021-10-03 10:15       ` René Scharfe
  2021-10-03 17:33         ` Junio C Hamano
  2021-10-03 10:15       ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
  1 sibling, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-03 10:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Junio C Hamano; +Cc: Git List

Am 02.10.21 um 10:35 schrieb Ævar Arnfjörð Bjarmason:
>
> On Fri, Oct 01 2021, Junio C Hamano wrote:
>
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> +static void dist_rand(int *arr, int n, int m)
>>> +{
>>> +	int i;
>>> +	for (i = 0; i < n; i++)
>>> +		arr[i] = rand() % m;
>>> +}
>>> ...
>>> +static void dist_shuffle(int *arr, int n, int m)
>>> +{
>>> +	int i, j, k;
>>> +	for (i = j = 0, k = 1; i < n; i++)
>>> +		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
>>> +}
>>
>> I briefly wondered if we want to seed the rand() in some way to make
>> the tests reproducible, but we'd need to ship our own rand() if we
>> wanted to go that route, which would probably be too much.
>
> Wouldn't calling srand() with some constant value suffice on most
> platforms? I'm aware of it being a NOOP and rand() always being randomly
> seeded on (IIRC) OpenBSD, but that should work on e.g. glibc.

Right, so we'd need to ship our own random number generator.

Repeatable tests are not essential (the original paper didn't mention
seeding), but shouldn't be much trouble to implement and would simplify
comparisons across versions, systems and among testers.

The only downside I can think of is that it may perhaps also simplify
over-fitting, i.e. I might find micro-tweaks that only work for our
specific rand() sequence and then misinterpret them as general
improvements..

René

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
  2021-10-03 10:15       ` René Scharfe
@ 2021-10-03 10:15       ` René Scharfe
  1 sibling, 0 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-03 10:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Junio C Hamano; +Cc: Git List

Am 02.10.21 um 10:35 schrieb Ævar Arnfjörð Bjarmason:
>
> On Fri, Oct 01 2021, Junio C Hamano wrote:
>
>> René Scharfe <l.s.r@web.de> writes:
>>
>>>  int cmd__mergesort(int argc, const char **argv)
>>>  {
>>>  	if (argc == 2 && !strcmp(argv[1], "sort"))
>>>  		return sort_stdin();
>>> -	usage("test-tool mergesort sort");
>>> +	if (argc > 1 && !strcmp(argv[1], "test"))
>>> +		return run_tests(argc - 2, argv + 2);
>>> +	fprintf(stderr, "usage: test-tool mergesort sort\n");
>>> +	fprintf(stderr, "   or: test-tool mergesort test [<n>...]\n");
>>> +	return 129;
>>
>> If you can live with OPT_CMDMODE to implement sort/test subcommands,
>> you'd get to have parse_options() to do the usage for you, I think.
>> I am not sure if it is worth it, as t/helper/ is not end-user
>> facing.
>
> Yeah I think the one thing that could improve here is this custom
> getopts handling, in particular the manual formatting of "usage" and
> "or" is a bit ugly, considering that you'll get it for free with the
> parse_options() "usage" array.

I don't see how using parseopt would help here.  Maintaining the "usage"
and "or" strings manually is trivial.  The meaty part of the usage
string (e.g. "test [<n>...]") would not be generated, neither would the
repeated part ("test-tool mergesort").  OPT_CMDMODE would require
double dashes for no good reason.

PowerShell's param array allows specifying value types, positions and
group parameters into sets.  I think it's expressive enough to allow
declaring all three subcommands and their parameters, and then can
parse command lines and generate help text automatically.

Until parseopt gains similar capabilities I'd like to avoid that
dependency.

René

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-03 10:15       ` René Scharfe
@ 2021-10-03 17:33         ` Junio C Hamano
  2021-10-07 20:00           ` René Scharfe
  0 siblings, 1 reply; 29+ messages in thread
From: Junio C Hamano @ 2021-10-03 17:33 UTC (permalink / raw)
  To: René Scharfe; +Cc: Ævar Arnfjörð Bjarmason, Git List

René Scharfe <l.s.r@web.de> writes:

> Repeatable tests are not essential (the original paper didn't mention
> seeding), but shouldn't be much trouble to implement and would simplify
> comparisons across versions, systems and among testers.
>
> The only downside I can think of is that it may perhaps also simplify
> over-fitting, i.e. I might find micro-tweaks that only work for our
> specific rand() sequence and then misinterpret them as general
> improvements..

Yup, I think you summarized the pros-and-cons nicely.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/9] test-mergesort: add test subcommand
  2021-10-03 17:33         ` Junio C Hamano
@ 2021-10-07 20:00           ` René Scharfe
  2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
  0 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-07 20:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git List

Am 03.10.21 um 19:33 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> Repeatable tests are not essential (the original paper didn't mention
>> seeding), but shouldn't be much trouble to implement and would simplify
>> comparisons across versions, systems and among testers.
>>
>> The only downside I can think of is that it may perhaps also simplify
>> over-fitting, i.e. I might find micro-tweaks that only work for our
>> specific rand() sequence and then misinterpret them as general
>> improvements..
>
> Yup, I think you summarized the pros-and-cons nicely.

Seeing that the series is in next already, here's a bonus patch for
that.

--- >8 ---
Subject: [PATCH 10/9] test-mergesort: use repeatable random numbers

Use MINSTD to generate pseudo-random numbers consistently instead of
using rand(3), whose output can vary from system to system, and reset
its seed before filling in the test values.  This gives repeatable
results across versions and systems, which simplifies sharing and
comparing of results between developers.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/helper/test-mergesort.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 43ec74e2d3..8d128ae437 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -2,6 +2,12 @@
 #include "cache.h"
 #include "mergesort.h"

+static unsigned int minstd_rand(unsigned int *state)
+{
+	*state = (uint64_t)*state * 48271 % 2147483647;
+	return *state;
+}
+
 struct line {
 	char *text;
 	struct line *next;
@@ -60,8 +66,9 @@ static void dist_sawtooth(int *arr, int n, int m)
 static void dist_rand(int *arr, int n, int m)
 {
 	int i;
+	unsigned int seed = 1;
 	for (i = 0; i < n; i++)
-		arr[i] = rand() % m;
+		arr[i] = minstd_rand(&seed) % m;
 }

 static void dist_stagger(int *arr, int n, int m)
@@ -81,8 +88,9 @@ static void dist_plateau(int *arr, int n, int m)
 static void dist_shuffle(int *arr, int n, int m)
 {
 	int i, j, k;
+	unsigned int seed = 1;
 	for (i = j = 0, k = 1; i < n; i++)
-		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
+		arr[i] = minstd_rand(&seed) % m ? (j += 2) : (k += 2);
 }

 #define DIST(name) { #name, dist_##name }
--
2.33.0


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
  2021-10-07 20:00           ` René Scharfe
@ 2021-10-08  4:04             ` René Scharfe
  2021-10-08  4:17               ` Jeff King
  2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 29+ messages in thread
From: René Scharfe @ 2021-10-08  4:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git List

Use MINSTD to generate pseudo-random numbers consistently instead of
using rand(3), whose output can vary from system to system, and reset
its seed before filling in the test values.  This gives repeatable
results across versions and systems, which simplifies sharing and
comparing of results between developers.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
Change: Use uint32_t to avoid relying on unsigned int being exactly
4 bytes wide.  D'oh!

 t/helper/test-mergesort.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
index 29758cf89b..c6fa816be3 100644
--- a/t/helper/test-mergesort.c
+++ b/t/helper/test-mergesort.c
@@ -2,6 +2,12 @@
 #include "cache.h"
 #include "mergesort.h"

+static uint32_t minstd_rand(uint32_t *state)
+{
+	*state = (uint64_t)*state * 48271 % 2147483647;
+	return *state;
+}
+
 struct line {
 	char *text;
 	struct line *next;
@@ -60,8 +66,9 @@ static void dist_sawtooth(int *arr, int n, int m)
 static void dist_rand(int *arr, int n, int m)
 {
 	int i;
+	uint32_t seed = 1;
 	for (i = 0; i < n; i++)
-		arr[i] = rand() % m;
+		arr[i] = minstd_rand(&seed) % m;
 }

 static void dist_stagger(int *arr, int n, int m)
@@ -81,8 +88,9 @@ static void dist_plateau(int *arr, int n, int m)
 static void dist_shuffle(int *arr, int n, int m)
 {
 	int i, j, k;
+	uint32_t seed = 1;
 	for (i = j = 0, k = 1; i < n; i++)
-		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
+		arr[i] = minstd_rand(&seed) % m ? (j += 2) : (k += 2);
 }

 #define DIST(name) { #name, dist_##name }
--
2.33.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
  2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
@ 2021-10-08  4:17               ` Jeff King
  2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 29+ messages in thread
From: Jeff King @ 2021-10-08  4:17 UTC (permalink / raw)
  To: René Scharfe
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Git List

On Fri, Oct 08, 2021 at 06:04:42AM +0200, René Scharfe wrote:

> Use MINSTD to generate pseudo-random numbers consistently instead of
> using rand(3), whose output can vary from system to system, and reset
> its seed before filling in the test values.  This gives repeatable
> results across versions and systems, which simplifies sharing and
> comparing of results between developers.

Nice. As a bonus, I noticed that Coverity complains about the use of
rand() as a security red-flag (even though of course we don't care about
its quality here). This should get rid of it by hiding the same thing in
a custom implementation. ;)

We have a similar LCG in t/helper/test-genrandom.c, but I don't think
there's any reason this needs to be factored into a shared one. And in
particular, I'd be loathe to change the genrandom one, as it may create
small bugs in the test suite (cases where we rely on hashes of the data
having particular attributes).

-Peff

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
  2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
  2021-10-08  4:17               ` Jeff King
@ 2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason
  2021-10-08 17:30                 ` René Scharfe
  1 sibling, 1 reply; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08  7:23 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Git List, Jeff King


On Fri, Oct 08 2021, René Scharfe wrote:

> Use MINSTD to generate pseudo-random numbers consistently instead of
> using rand(3), whose output can vary from system to system, and reset
> its seed before filling in the test values.  This gives repeatable
> results across versions and systems, which simplifies sharing and
> comparing of results between developers.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Change: Use uint32_t to avoid relying on unsigned int being exactly
> 4 bytes wide.  D'oh!
>
>  t/helper/test-mergesort.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/t/helper/test-mergesort.c b/t/helper/test-mergesort.c
> index 29758cf89b..c6fa816be3 100644
> --- a/t/helper/test-mergesort.c
> +++ b/t/helper/test-mergesort.c
> @@ -2,6 +2,12 @@
>  #include "cache.h"
>  #include "mergesort.h"
>
> +static uint32_t minstd_rand(uint32_t *state)
> +{
> +	*state = (uint64_t)*state * 48271 % 2147483647;
> +	return *state;
> +}
> +
>  struct line {
>  	char *text;
>  	struct line *next;
> @@ -60,8 +66,9 @@ static void dist_sawtooth(int *arr, int n, int m)
>  static void dist_rand(int *arr, int n, int m)
>  {
>  	int i;
> +	uint32_t seed = 1;
>  	for (i = 0; i < n; i++)
> -		arr[i] = rand() % m;
> +		arr[i] = minstd_rand(&seed) % m;
>  }
>
>  static void dist_stagger(int *arr, int n, int m)
> @@ -81,8 +88,9 @@ static void dist_plateau(int *arr, int n, int m)
>  static void dist_shuffle(int *arr, int n, int m)
>  {
>  	int i, j, k;
> +	uint32_t seed = 1;
>  	for (i = j = 0, k = 1; i < n; i++)
> -		arr[i] = (rand() % m) ? (j += 2) : (k += 2);
> +		arr[i] = minstd_rand(&seed) % m ? (j += 2) : (k += 2);
>  }
>
>  #define DIST(name) { #name, dist_##name }

Just to your upthread:

    "Right, so we'd need to ship our own random number generator."

I don't really think this matters in either case here, and if anything a
flaky failure in this test would quickly point us in the right
direction, as opposed to say having the N test_expect_success being run
in rand() order or whatever.

If we'd like results we can compare across platforms we're surely better
of here running this in a loop with different per-platform srand()
values N times for some high value of N, than we are in picking one
"golden" distribution.

But just on srand() and rand() use more generally in the test suite: I
think it's fine to just assume that we can call srand()/rand() and get
"predictable" results, because what we're really after in most cases is
to avoid hard-to-diagnose flakyness. If as a result of random
distribution we'll get a consistent failure on one OS (or the flakyness
is just OpenBSD...).

Also generally: If you'd like "portable" rand() for a test just shell
out to perl. I ran this on various Perl versions (oldest 5.12) on Debian
Linux, OSX, Solaris & OpenBSD, all returned the same number for both:

    ruby -e 'srand(1); puts rand'; perl -E 'srand(1); say $^V; say rand'

Whereas a C program doing the same:
    
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void)
    {
            srand(1);
            printf("rand = %d\n", rand());
            return 0;
    }

Returns different numbers an all, and on OpenBSD the number is different
each time, per their well-known non-standard srand()/rand() behavior.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
  2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason
@ 2021-10-08 17:30                 ` René Scharfe
  2021-10-08 19:00                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2021-10-08 17:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Git List, Jeff King

Am 08.10.21 um 09:23 schrieb Ævar Arnfjörð Bjarmason:
>
> Just to your upthread:
>
>     "Right, so we'd need to ship our own random number generator."
>
> I don't really think this matters in either case here, and if anything a
> flaky failure in this test would quickly point us in the right
> direction, as opposed to say having the N test_expect_success being run
> in rand() order or whatever.
>
> If we'd like results we can compare across platforms we're surely better
> of here running this in a loop with different per-platform srand()
> values N times for some high value of N, than we are in picking one
> "golden" distribution.

A mergesort bug that only causes invalid results for certain RNG seeds
is not impossible, but unlikely.  Portability of results is more useful
for comparing the number of operations needed for different types of
input, i.e. for performance work, not so much for correctness checking.
(And those results need to be taken with enough salt to avoid micro-
optimizing for specific distributions.)

Adding more rand and shuffle distributions, parameterized with different
seeds, is certainly possible.  Not sure what it would prove, though.  We
would visit a bigger part of the permutation space, but that thing is so
huge (N!) that any reasonable sample is still small.  That's why I added
the unriffle modes, to find maxima.

> But just on srand() and rand() use more generally in the test suite: I
> think it's fine to just assume that we can call srand()/rand() and get
> "predictable" results, because what we're really after in most cases is
> to avoid hard-to-diagnose flakyness. If as a result of random
> distribution we'll get a consistent failure on one OS (or the flakyness
> is just OpenBSD...).

I can't find any current use of rand() in t/, except perhaps
t/helper/test-genrandom.c, which open-codes it to get reproducible
results.  I don't see how calling rand() instead would improve it.

> Also generally: If you'd like "portable" rand() for a test just shell
> out to perl. I ran this on various Perl versions (oldest 5.12) on Debian
> Linux, OSX, Solaris & OpenBSD, all returned the same number for both:
>
>     ruby -e 'srand(1); puts rand'; perl -E 'srand(1); say $^V; say rand'
>
> Whereas a C program doing the same:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>
>     int main(void)
>     {
>             srand(1);
>             printf("rand = %d\n", rand());
>             return 0;
>     }
>
> Returns different numbers an all, and on OpenBSD the number is different
> each time, per their well-known non-standard srand()/rand() behavior.

For test shell code that needs only a few random numbers this would
be fine.

For test-genrandom it would also work, but I don't see any benefit in
converting it to a scripting language.

Shelling out to a script to avoid a multiplication and a modulo in
test-mergesort is not interesting, to put it mildly.  A mode that sorts
input from stdin like the sort subcommand, but returns the operation
counts, might be useful if you want to test distributions generated by
a Perl script or other data source of your choice.

René

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 10/9 v2] test-mergesort: use repeatable random numbers
  2021-10-08 17:30                 ` René Scharfe
@ 2021-10-08 19:00                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-10-08 19:00 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Git List, Jeff King


On Fri, Oct 08 2021, René Scharfe wrote:

> Am 08.10.21 um 09:23 schrieb Ævar Arnfjörð Bjarmason:
>> Also generally: If you'd like "portable" rand() for a test just shell
>> out to perl. I ran this on various Perl versions (oldest 5.12) on Debian
>> Linux, OSX, Solaris & OpenBSD, all returned the same number for both:
>>
>>     ruby -e 'srand(1); puts rand'; perl -E 'srand(1); say $^V; say rand'
>>
>> Whereas a C program doing the same:
>>
>>     #include <stdio.h>
>>     #include <stdlib.h>
>>
>>     int main(void)
>>     {
>>             srand(1);
>>             printf("rand = %d\n", rand());
>>             return 0;
>>     }
>>
>> Returns different numbers an all, and on OpenBSD the number is different
>> each time, per their well-known non-standard srand()/rand() behavior.
>
> For test shell code that needs only a few random numbers this would
> be fine.
>
> For test-genrandom it would also work, but I don't see any benefit in
> converting it to a scripting language.
>
> Shelling out to a script to avoid a multiplication and a modulo in
> test-mergesort is not interesting, to put it mildly.  A mode that sorts
> input from stdin like the sort subcommand, but returns the operation
> counts, might be useful if you want to test distributions generated by
> a Perl script or other data source of your choice.

Yes, it has zero applicablility here.

It was just an aside/FYI since we were on the topic of the
cross-platformness of rand().

I.e. one might assume that for the general problem of seeding something
randomly cross-platform one had to ship a rand(), but usually at least
perl is there ahead of you, and since it has its own rand()...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 9/9] mergesort: use ranks stack
  2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
@ 2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
  2022-01-17 18:22     ` René Scharfe
  0 siblings, 1 reply; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-17 17:43 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Junio C Hamano


On Fri, Oct 01 2021, René Scharfe wrote:


> +/*
> + * Perform an iterative mergesort using an array of sublists.
> + *
> + * n is the number of items.
> + * ranks[i] is undefined if n & 2^i == 0, and assumed empty.
> + * ranks[i] contains a sublist of length 2^i otherwise.
> + *
> + * The number of bits in a void pointer limits the number of objects
> + * that can be created, and thus the number of array elements necessary
> + * to be able to sort any valid list.
> + *
> + * Adding an item to this array is like incrementing a binary number;
> + * positional values for set bits correspond to sublist lengths.
> + */
>  void *llist_mergesort(void *list,
>  		      void *(*get_next_fn)(const void *),
>  		      void (*set_next_fn)(void *, void *),
>  		      int (*compare_fn)(const void *, const void *))
>  {
> -	unsigned long l;
> -
> -	if (!list)
> -		return NULL;
> -	for (l = 1; ; l *= 2) {
> -		void *curr;
> -		struct mergesort_sublist p, q;
> +	void *ranks[bitsizeof(void *)];
> +	size_t n = 0;
> +	int i;
>
> -		p.ptr = list;
> -		q.ptr = get_nth_next(p.ptr, l, get_next_fn);
> -		if (!q.ptr)
> -			break;
> -		p.len = q.len = l;
> +	while (list) {
> +		void *next = get_next_fn(list);
> +		if (next)
> +			set_next_fn(list, NULL);
> +		for (i = 0; n & (1 << i); i++)
> +			list = llist_merge(ranks[i], list, get_next_fn,
> +					   set_next_fn, compare_fn);
> +		n++;
> +		ranks[i] = list;
> +		list = next;
> +	}

(Commenting on a commit integrated into v2.34.0)

The aCC compiler on HP/UX notes:

    "mergesort.c", line 67: warning #2549-D: variable "ranks" is used before its value is set
                        list = llist_merge(ranks[i], list, get_next_fn,

It's commenting on the ranks[i] within the for-loop-within-while-loop
above.

>
> -		if (compare_fn(p.ptr, q.ptr) > 0)
> -			list = curr = pop_item(&q, get_next_fn);
> +	for (i = 0; n; i++, n >>= 1) {
> +		if (!(n & 1))
> +			continue;
> +		if (list)
> +			list = llist_merge(ranks[i], list, get_next_fn,
> +					   set_next_fn, compare_fn);
>  		else
> -			list = curr = pop_item(&p, get_next_fn);
> -
> -		while (p.ptr) {
> -			while (p.len || q.len) {
> -				void *prev = curr;
> -
> -				if (!p.len)
> -					curr = pop_item(&q, get_next_fn);
> -				else if (!q.len)
> -					curr = pop_item(&p, get_next_fn);
> -				else if (compare_fn(p.ptr, q.ptr) > 0)
> -					curr = pop_item(&q, get_next_fn);
> -				else
> -					curr = pop_item(&p, get_next_fn);
> -				set_next_fn(prev, curr);
> -			}
> -			p.ptr = q.ptr;
> -			p.len = l;
> -			q.ptr = get_nth_next(p.ptr, l, get_next_fn);
> -			q.len = q.ptr ? l : 0;
> -
> -		}
> -		set_next_fn(curr, NULL);
> +			list = ranks[i];
>  	}
>  	return list;
>  }


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 9/9] mergesort: use ranks stack
  2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
@ 2022-01-17 18:22     ` René Scharfe
  2022-01-18  5:07       ` René Scharfe
  0 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2022-01-17 18:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git List, Junio C Hamano

Am 17.01.22 um 18:43 schrieb Ævar Arnfjörð Bjarmason:
>
> On Fri, Oct 01 2021, René Scharfe wrote:
>
>
>> +/*
>> + * Perform an iterative mergesort using an array of sublists.
>> + *
>> + * n is the number of items.
>> + * ranks[i] is undefined if n & 2^i == 0, and assumed empty.
>> + * ranks[i] contains a sublist of length 2^i otherwise.
>> + *
>> + * The number of bits in a void pointer limits the number of objects
>> + * that can be created, and thus the number of array elements necessary
>> + * to be able to sort any valid list.
>> + *
>> + * Adding an item to this array is like incrementing a binary number;
>> + * positional values for set bits correspond to sublist lengths.
>> + */
>>  void *llist_mergesort(void *list,
>>  		      void *(*get_next_fn)(const void *),
>>  		      void (*set_next_fn)(void *, void *),
>>  		      int (*compare_fn)(const void *, const void *))
>>  {
>> -	unsigned long l;
>> -
>> -	if (!list)
>> -		return NULL;
>> -	for (l = 1; ; l *= 2) {
>> -		void *curr;
>> -		struct mergesort_sublist p, q;
>> +	void *ranks[bitsizeof(void *)];
>> +	size_t n = 0;
>> +	int i;
>>
>> -		p.ptr = list;
>> -		q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>> -		if (!q.ptr)
>> -			break;
>> -		p.len = q.len = l;
>> +	while (list) {
>> +		void *next = get_next_fn(list);
>> +		if (next)
>> +			set_next_fn(list, NULL);
>> +		for (i = 0; n & (1 << i); i++)
>> +			list = llist_merge(ranks[i], list, get_next_fn,
>> +					   set_next_fn, compare_fn);
>> +		n++;
>> +		ranks[i] = list;
>> +		list = next;
>> +	}
>
> (Commenting on a commit integrated into v2.34.0)
>
> The aCC compiler on HP/UX notes:
>
>     "mergesort.c", line 67: warning #2549-D: variable "ranks" is used before its value is set
>                         list = llist_merge(ranks[i], list, get_next_fn,
>
> It's commenting on the ranks[i] within the for-loop-within-while-loop
> above.

That would be a bug.  I think none of the array elements are read before
they are written -- but of course I'm biased.  Can that compiler show a
sequence that would lead to reading uninitialized data?  Or anyone else?

Initializing the array would memset(3) 128 bytes on 32-bit systems and
512 bytes on 64-bit systems.  Doing that everywhere just to appease a
confused compiler on a dying platform would be merciful, but still sad.

>
>>
>> -		if (compare_fn(p.ptr, q.ptr) > 0)
>> -			list = curr = pop_item(&q, get_next_fn);
>> +	for (i = 0; n; i++, n >>= 1) {
>> +		if (!(n & 1))
>> +			continue;
>> +		if (list)
>> +			list = llist_merge(ranks[i], list, get_next_fn,
>> +					   set_next_fn, compare_fn);
>>  		else
>> -			list = curr = pop_item(&p, get_next_fn);
>> -
>> -		while (p.ptr) {
>> -			while (p.len || q.len) {
>> -				void *prev = curr;
>> -
>> -				if (!p.len)
>> -					curr = pop_item(&q, get_next_fn);
>> -				else if (!q.len)
>> -					curr = pop_item(&p, get_next_fn);
>> -				else if (compare_fn(p.ptr, q.ptr) > 0)
>> -					curr = pop_item(&q, get_next_fn);
>> -				else
>> -					curr = pop_item(&p, get_next_fn);
>> -				set_next_fn(prev, curr);
>> -			}
>> -			p.ptr = q.ptr;
>> -			p.len = l;
>> -			q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>> -			q.len = q.ptr ? l : 0;
>> -
>> -		}
>> -		set_next_fn(curr, NULL);
>> +			list = ranks[i];
>>  	}
>>  	return list;
>>  }
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 9/9] mergesort: use ranks stack
  2022-01-17 18:22     ` René Scharfe
@ 2022-01-18  5:07       ` René Scharfe
  2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 29+ messages in thread
From: René Scharfe @ 2022-01-18  5:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git List, Junio C Hamano



Am 17.01.22 um 19:22 schrieb René Scharfe:
> Am 17.01.22 um 18:43 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Fri, Oct 01 2021, René Scharfe wrote:
>>
>>
>>> +/*
>>> + * Perform an iterative mergesort using an array of sublists.
>>> + *
>>> + * n is the number of items.
>>> + * ranks[i] is undefined if n & 2^i == 0, and assumed empty.
>>> + * ranks[i] contains a sublist of length 2^i otherwise.
>>> + *
>>> + * The number of bits in a void pointer limits the number of objects
>>> + * that can be created, and thus the number of array elements necessary
>>> + * to be able to sort any valid list.
>>> + *
>>> + * Adding an item to this array is like incrementing a binary number;
>>> + * positional values for set bits correspond to sublist lengths.
>>> + */
>>>  void *llist_mergesort(void *list,
>>>  		      void *(*get_next_fn)(const void *),
>>>  		      void (*set_next_fn)(void *, void *),
>>>  		      int (*compare_fn)(const void *, const void *))
>>>  {
>>> -	unsigned long l;
>>> -
>>> -	if (!list)
>>> -		return NULL;
>>> -	for (l = 1; ; l *= 2) {
>>> -		void *curr;
>>> -		struct mergesort_sublist p, q;
>>> +	void *ranks[bitsizeof(void *)];
>>> +	size_t n = 0;
>>> +	int i;
>>>
>>> -		p.ptr = list;
>>> -		q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>>> -		if (!q.ptr)
>>> -			break;
>>> -		p.len = q.len = l;
>>> +	while (list) {
>>> +		void *next = get_next_fn(list);
>>> +		if (next)
>>> +			set_next_fn(list, NULL);
>>> +		for (i = 0; n & (1 << i); i++)
>>> +			list = llist_merge(ranks[i], list, get_next_fn,
>>> +					   set_next_fn, compare_fn);
>>> +		n++;
>>> +		ranks[i] = list;
>>> +		list = next;
>>> +	}
>>
>> (Commenting on a commit integrated into v2.34.0)
>>
>> The aCC compiler on HP/UX notes:
>>
>>     "mergesort.c", line 67: warning #2549-D: variable "ranks" is used before its value is set
>>                         list = llist_merge(ranks[i], list, get_next_fn,
>>
>> It's commenting on the ranks[i] within the for-loop-within-while-loop
>> above.
>
> That would be a bug.  I think none of the array elements are read before
> they are written -- but of course I'm biased.  Can that compiler show a
> sequence that would lead to reading uninitialized data?  Or anyone else?
>
> Initializing the array would memset(3) 128 bytes on 32-bit systems and
> 512 bytes on 64-bit systems.  Doing that everywhere just to appease a
> confused compiler on a dying platform would be merciful, but still sad.

Does the warning disappear if you add "ranks[0] = NULL;" before the while
loop?  And if it does, has adding "if (n & 1) ranks[0] = NULL;" instead
the same effect?

>
>>
>>>
>>> -		if (compare_fn(p.ptr, q.ptr) > 0)
>>> -			list = curr = pop_item(&q, get_next_fn);
>>> +	for (i = 0; n; i++, n >>= 1) {
>>> +		if (!(n & 1))
>>> +			continue;
>>> +		if (list)
>>> +			list = llist_merge(ranks[i], list, get_next_fn,
>>> +					   set_next_fn, compare_fn);
>>>  		else
>>> -			list = curr = pop_item(&p, get_next_fn);
>>> -
>>> -		while (p.ptr) {
>>> -			while (p.len || q.len) {
>>> -				void *prev = curr;
>>> -
>>> -				if (!p.len)
>>> -					curr = pop_item(&q, get_next_fn);
>>> -				else if (!q.len)
>>> -					curr = pop_item(&p, get_next_fn);
>>> -				else if (compare_fn(p.ptr, q.ptr) > 0)
>>> -					curr = pop_item(&q, get_next_fn);
>>> -				else
>>> -					curr = pop_item(&p, get_next_fn);
>>> -				set_next_fn(prev, curr);
>>> -			}
>>> -			p.ptr = q.ptr;
>>> -			p.len = l;
>>> -			q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>>> -			q.len = q.ptr ? l : 0;
>>> -
>>> -		}
>>> -		set_next_fn(curr, NULL);
>>> +			list = ranks[i];
>>>  	}
>>>  	return list;
>>>  }
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 9/9] mergesort: use ranks stack
  2022-01-18  5:07       ` René Scharfe
@ 2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
  2022-01-18 12:27           ` René Scharfe
  0 siblings, 1 reply; 29+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 10:40 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Junio C Hamano


On Tue, Jan 18 2022, René Scharfe wrote:

> Am 17.01.22 um 19:22 schrieb René Scharfe:
>> Am 17.01.22 um 18:43 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Fri, Oct 01 2021, René Scharfe wrote:
>>>
>>>
>>>> +/*
>>>> + * Perform an iterative mergesort using an array of sublists.
>>>> + *
>>>> + * n is the number of items.
>>>> + * ranks[i] is undefined if n & 2^i == 0, and assumed empty.
>>>> + * ranks[i] contains a sublist of length 2^i otherwise.
>>>> + *
>>>> + * The number of bits in a void pointer limits the number of objects
>>>> + * that can be created, and thus the number of array elements necessary
>>>> + * to be able to sort any valid list.
>>>> + *
>>>> + * Adding an item to this array is like incrementing a binary number;
>>>> + * positional values for set bits correspond to sublist lengths.
>>>> + */
>>>>  void *llist_mergesort(void *list,
>>>>  		      void *(*get_next_fn)(const void *),
>>>>  		      void (*set_next_fn)(void *, void *),
>>>>  		      int (*compare_fn)(const void *, const void *))
>>>>  {
>>>> -	unsigned long l;
>>>> -
>>>> -	if (!list)
>>>> -		return NULL;
>>>> -	for (l = 1; ; l *= 2) {
>>>> -		void *curr;
>>>> -		struct mergesort_sublist p, q;
>>>> +	void *ranks[bitsizeof(void *)];
>>>> +	size_t n = 0;
>>>> +	int i;
>>>>
>>>> -		p.ptr = list;
>>>> -		q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>>>> -		if (!q.ptr)
>>>> -			break;
>>>> -		p.len = q.len = l;
>>>> +	while (list) {
>>>> +		void *next = get_next_fn(list);
>>>> +		if (next)
>>>> +			set_next_fn(list, NULL);
>>>> +		for (i = 0; n & (1 << i); i++)
>>>> +			list = llist_merge(ranks[i], list, get_next_fn,
>>>> +					   set_next_fn, compare_fn);
>>>> +		n++;
>>>> +		ranks[i] = list;
>>>> +		list = next;
>>>> +	}
>>>
>>> (Commenting on a commit integrated into v2.34.0)
>>>
>>> The aCC compiler on HP/UX notes:
>>>
>>>     "mergesort.c", line 67: warning #2549-D: variable "ranks" is used before its value is set
>>>                         list = llist_merge(ranks[i], list, get_next_fn,
>>>
>>> It's commenting on the ranks[i] within the for-loop-within-while-loop
>>> above.
>>
>> That would be a bug.  I think none of the array elements are read before
>> they are written -- but of course I'm biased.  Can that compiler show a
>> sequence that would lead to reading uninitialized data?  Or anyone else?
>>
>> Initializing the array would memset(3) 128 bytes on 32-bit systems and
>> 512 bytes on 64-bit systems.  Doing that everywhere just to appease a
>> confused compiler on a dying platform would be merciful, but still sad.
>
> Does the warning disappear if you add "ranks[0] = NULL;" before the while
> loop?  And if it does, has adding "if (n & 1) ranks[0] = NULL;" instead
> the same effect?

Both of those make the warning go away.

Anyway, if you think the pre-image in master now is fine let's leave it
as it is. There's no point in just trying to appease aCC here.

I just thought I'd send a quick mail about it because I was looking at
its warning output, most of those warnings point to obviously harmless
issues, but I thought this one *might* point to an actual logic error
(but didn't look carefully enough myself), so I thought I'd send a quick
note about it.

If you think not it's probably best just to leave the code as-is.

>>
>>>
>>>>
>>>> -		if (compare_fn(p.ptr, q.ptr) > 0)
>>>> -			list = curr = pop_item(&q, get_next_fn);
>>>> +	for (i = 0; n; i++, n >>= 1) {
>>>> +		if (!(n & 1))
>>>> +			continue;
>>>> +		if (list)
>>>> +			list = llist_merge(ranks[i], list, get_next_fn,
>>>> +					   set_next_fn, compare_fn);
>>>>  		else
>>>> -			list = curr = pop_item(&p, get_next_fn);
>>>> -
>>>> -		while (p.ptr) {
>>>> -			while (p.len || q.len) {
>>>> -				void *prev = curr;
>>>> -
>>>> -				if (!p.len)
>>>> -					curr = pop_item(&q, get_next_fn);
>>>> -				else if (!q.len)
>>>> -					curr = pop_item(&p, get_next_fn);
>>>> -				else if (compare_fn(p.ptr, q.ptr) > 0)
>>>> -					curr = pop_item(&q, get_next_fn);
>>>> -				else
>>>> -					curr = pop_item(&p, get_next_fn);
>>>> -				set_next_fn(prev, curr);
>>>> -			}
>>>> -			p.ptr = q.ptr;
>>>> -			p.len = l;
>>>> -			q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>>>> -			q.len = q.ptr ? l : 0;
>>>> -
>>>> -		}
>>>> -		set_next_fn(curr, NULL);
>>>> +			list = ranks[i];
>>>>  	}
>>>>  	return list;
>>>>  }
>>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 9/9] mergesort: use ranks stack
  2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
@ 2022-01-18 12:27           ` René Scharfe
  0 siblings, 0 replies; 29+ messages in thread
From: René Scharfe @ 2022-01-18 12:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Git List, Junio C Hamano

Am 18.01.22 um 11:40 schrieb Ævar Arnfjörð Bjarmason:
>
> On Tue, Jan 18 2022, René Scharfe wrote:
>
>> Am 17.01.22 um 19:22 schrieb René Scharfe:
>>> Am 17.01.22 um 18:43 schrieb Ævar Arnfjörð Bjarmason:
>>>>
>>>> On Fri, Oct 01 2021, René Scharfe wrote:
>>>>
>>>>
>>>>> +/*
>>>>> + * Perform an iterative mergesort using an array of sublists.
>>>>> + *
>>>>> + * n is the number of items.
>>>>> + * ranks[i] is undefined if n & 2^i == 0, and assumed empty.
>>>>> + * ranks[i] contains a sublist of length 2^i otherwise.
>>>>> + *
>>>>> + * The number of bits in a void pointer limits the number of objects
>>>>> + * that can be created, and thus the number of array elements necessary
>>>>> + * to be able to sort any valid list.
>>>>> + *
>>>>> + * Adding an item to this array is like incrementing a binary number;
>>>>> + * positional values for set bits correspond to sublist lengths.
>>>>> + */
>>>>>  void *llist_mergesort(void *list,
>>>>>  		      void *(*get_next_fn)(const void *),
>>>>>  		      void (*set_next_fn)(void *, void *),
>>>>>  		      int (*compare_fn)(const void *, const void *))
>>>>>  {
>>>>> -	unsigned long l;
>>>>> -
>>>>> -	if (!list)
>>>>> -		return NULL;
>>>>> -	for (l = 1; ; l *= 2) {
>>>>> -		void *curr;
>>>>> -		struct mergesort_sublist p, q;
>>>>> +	void *ranks[bitsizeof(void *)];
>>>>> +	size_t n = 0;
>>>>> +	int i;
>>>>>
>>>>> -		p.ptr = list;
>>>>> -		q.ptr = get_nth_next(p.ptr, l, get_next_fn);
>>>>> -		if (!q.ptr)
>>>>> -			break;
>>>>> -		p.len = q.len = l;
>>>>> +	while (list) {
>>>>> +		void *next = get_next_fn(list);
>>>>> +		if (next)
>>>>> +			set_next_fn(list, NULL);
>>>>> +		for (i = 0; n & (1 << i); i++)
>>>>> +			list = llist_merge(ranks[i], list, get_next_fn,
>>>>> +					   set_next_fn, compare_fn);
>>>>> +		n++;
>>>>> +		ranks[i] = list;
>>>>> +		list = next;
>>>>> +	}
>>>>
>>>> (Commenting on a commit integrated into v2.34.0)
>>>>
>>>> The aCC compiler on HP/UX notes:
>>>>
>>>>     "mergesort.c", line 67: warning #2549-D: variable "ranks" is used before its value is set
>>>>                         list = llist_merge(ranks[i], list, get_next_fn,
>>>>
>>>> It's commenting on the ranks[i] within the for-loop-within-while-loop
>>>> above.
>>>
>>> That would be a bug.  I think none of the array elements are read before
>>> they are written -- but of course I'm biased.  Can that compiler show a
>>> sequence that would lead to reading uninitialized data?  Or anyone else?
>>>
>>> Initializing the array would memset(3) 128 bytes on 32-bit systems and
>>> 512 bytes on 64-bit systems.  Doing that everywhere just to appease a
>>> confused compiler on a dying platform would be merciful, but still sad.
>>
>> Does the warning disappear if you add "ranks[0] = NULL;" before the while
>> loop?  And if it does, has adding "if (n & 1) ranks[0] = NULL;" instead
>> the same effect?
>
> Both of those make the warning go away.

The second one is optimized out by GCC and Clang because n == 0 before
the while loop.  The data flow analysis in aCC that leads to the warning
must be taking some shortcuts if can be fooled by an inconsequential
expression.

> Anyway, if you think the pre-image in master now is fine let's leave it
> as it is. There's no point in just trying to appease aCC here.
>
> I just thought I'd send a quick mail about it because I was looking at
> its warning output, most of those warnings point to obviously harmless
> issues, but I thought this one *might* point to an actual logic error
> (but didn't look carefully enough myself), so I thought I'd send a quick
> note about it.

Sure, it would be a disaster if this loop somehow read uninitialized
data, and any hint needs towards that end must be taken seriously.  But
that warning from aCC seems to be a false positive.

Adding "if (n & 1) ranks[0] = NULL;" before the loop and a comment would
not change the code generated by  "normal" optimizing compilers, so we
could add this at the cost of slightly hurting readability, if necessary.

> If you think not it's probably best just to leave the code as-is.

That works for me as well. :)

René

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-01-18 12:27 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-01  9:07 [PATCH 0/9] mergesort: improve tests and performance René Scharfe
2021-10-01  9:10 ` [PATCH 1/9] test-mergesort: use strbuf_getline() René Scharfe
2021-10-02  9:08   ` Ævar Arnfjörð Bjarmason
2021-10-02 16:56     ` René Scharfe
2021-10-01  9:11 ` [PATCH 2/9] test-mergesort: add sort subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-01  9:12 ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01 20:26   ` Junio C Hamano
2021-10-02  8:35     ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` René Scharfe
2021-10-03 17:33         ` Junio C Hamano
2021-10-07 20:00           ` René Scharfe
2021-10-08  4:04             ` [PATCH 10/9 v2] test-mergesort: use repeatable random numbers René Scharfe
2021-10-08  4:17               ` Jeff King
2021-10-08  7:23               ` Ævar Arnfjörð Bjarmason
2021-10-08 17:30                 ` René Scharfe
2021-10-08 19:00                   ` Ævar Arnfjörð Bjarmason
2021-10-03 10:15       ` [PATCH 3/9] test-mergesort: add test subcommand René Scharfe
2021-10-01  9:14 ` [PATCH 4/9] test-mergesort: add generate subcommand René Scharfe
2021-10-01  9:16 ` [PATCH 5/9] test-mergesort: add unriffle mode René Scharfe
2021-10-01  9:17 ` [PATCH 6/9] test-mergesort: add unriffle_skewed mode René Scharfe
2021-10-01  9:19 ` [PATCH 7/9] p0071: measure sorting of already sorted and reversed files René Scharfe
2021-10-01  9:19 ` [PATCH 8/9] p0071: test performance of llist_mergesort() René Scharfe
2021-10-01  9:22 ` [PATCH 9/9] mergesort: use ranks stack René Scharfe
2022-01-17 17:43   ` Ævar Arnfjörð Bjarmason
2022-01-17 18:22     ` René Scharfe
2022-01-18  5:07       ` René Scharfe
2022-01-18 10:40         ` Ævar Arnfjörð Bjarmason
2022-01-18 12:27           ` René Scharfe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).