* [PATCH 0/8] use merge-base for tag --contains
@ 2014-06-25 23:34 Jeff King
  2014-06-25 23:35 ` [PATCH 1/8] tag: allow --sort with -n Jeff King
                   ` (7 more replies)
  0 siblings, 8 replies; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:34 UTC (permalink / raw)
  To: git

Once upon a time we checked "tag --contains" by doing N merge-base
traversals, one per tag. That turned out to be really slow.

Later, I added a single traversal in ffc4b80 (tag: speed up --contains
calculation, 2011-06-11) that works in a depth-first way. That's fast
for the common case of tags spread throughout history, but slow when all
of the tags are close to the searched-for commit (which would be more
likely with branches, since they advance). That, plus the general
hackiness of the implementation, prevented it from being used for "git
branch" or in other places.

Over a year ago, Junio and I worked on the commit-slab code. The
original point of it[1] was to be able to do a merge-base traversal like
this, where we keep one bit per tip in each commit (so that we know not
only what the merge base is, but _which_ tip hit each commit). So now I
finally got around to it. :)
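
For anyone who has not followed the commit-slab work, the core trick,
sketched here with the bitset helpers from patch 4, looks roughly like
this ("tip_bits", "nr_tips", and "i" are made-up names; the real code
is in patch 6):

  define_commit_slab(tip_bits, unsigned char);

  struct tip_bits bits;
  init_tip_bits_with_stride(&bits, bitset_sizeof(nr_tips));

  /* mark this commit as reached from tip i... */
  bitset_set(tip_bits_at(&bits, commit), i);

  /* ...and propagate its bits to a parent during the walk */
  bitset_or(tip_bits_at(&bits, parent),
            tip_bits_at(&bits, commit), nr_tips);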

Timings are in the final patch, but the short of it is: it's about as
fast as the depth-first code for the normal tag case (tags spread out
through history), but way faster for the branch-like case (tags close to
the commit).

This series stops short of moving "git branch" over to it.  My next goal
once this is solid is to factor the logic out so that "tag -l", "branch
-l", and "for-each-ref" all use the same code. I got stuck on that
earlier because I just couldn't justify sharing the tag-contains
implementation with the others.

  [1/8]: tag: allow --sort with -n
  [2/8]: tag: factor out decision to stream tags
  [3/8]: paint_down_to_common: use prio_queue
  [4/8]: add functions for memory-efficient bitmaps
  [5/8]: string-list: add pos to iterator callback
  [6/8]: commit: provide a fast multi-tip contains function
  [7/8]: tag: use commit_contains
  [8/8]: perf: add tests for tag --contains

-Peff

[1] http://article.gmane.org/gmane.comp.version-control.git/220545

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/8] tag: allow --sort with -n
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
@ 2014-06-25 23:35 ` Jeff King
  2014-06-25 23:35 ` [PATCH 2/8] tag: factor out decision to stream tags Jeff King
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:35 UTC (permalink / raw)
  To: git

When we are listing tags, we print each one as it is
processed by for_each_ref. We can't do that with --sort, of
course, as we need to see the whole list to sort. For the
--sort code path, we store each tag in a string_list, and
then print them all at the end.

This interacts badly with "-n", which needs not only the
name of the tag, but also the object itself. We simply
punted on handling this, and disallowed the combination.

This patch remedies that by storing the sha1 of each object
in the "util" field of the string list. We can then factor
out the printing to a helper function and call that function
either when we first see each tag, or after we have sorted.
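
Condensed from the diff below, the flow becomes ("it" is the
string_list_item we are printing):

  /* while collecting, remember the object for later */
  string_list_append(&filter->tags, refname)->util = hashdup(sha1);

  /* after sorting, each item still knows its object */
  print_tag(it->string, it->util, lines);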

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/tag.c  | 42 +++++++++++++++++++++++++-----------------
 cache.h        |  7 +++++++
 t/t7004-tag.sh | 18 ++++++++++++++++++
 3 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index c6e8a71..2adfc3d 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -213,6 +213,18 @@ free_return:
 	free(buf);
 }
 
+static void print_tag(const char *refname, const unsigned char *sha1,
+		      int lines)
+{
+	if (!lines)
+		printf("%s\n", refname);
+	else {
+		printf("%-15s ", refname);
+		show_tag_lines(sha1, lines);
+		putchar('\n');
+	}
+}
+
 static int show_reference(const char *refname, const unsigned char *sha1,
 			  int flag, void *cb_data)
 {
@@ -232,16 +244,10 @@ static int show_reference(const char *refname, const unsigned char *sha1,
 		if (points_at.nr && !match_points_at(refname, sha1))
 			return 0;
 
-		if (!filter->lines) {
-			if (filter->sort)
-				string_list_append(&filter->tags, refname);
-			else
-				printf("%s\n", refname);
-			return 0;
-		}
-		printf("%-15s ", refname);
-		show_tag_lines(sha1, filter->lines);
-		putchar('\n');
+		if (filter->sort)
+			string_list_append(&filter->tags, refname)->util = hashdup(sha1);
+		else
+			print_tag(refname, sha1, filter->lines);
 	}
 
 	return 0;
@@ -273,12 +279,16 @@ static int list_tags(const char **patterns, int lines,
 			qsort(filter.tags.items, filter.tags.nr,
 			      sizeof(struct string_list_item), sort_by_version);
 		if (sort & REVERSE_SORT)
-			for (i = filter.tags.nr - 1; i >= 0; i--)
-				printf("%s\n", filter.tags.items[i].string);
+			for (i = filter.tags.nr - 1; i >= 0; i--) {
+				struct string_list_item *it = &filter.tags.items[i];
+				print_tag(it->string, it->util, lines);
+			}
 		else
-			for (i = 0; i < filter.tags.nr; i++)
-				printf("%s\n", filter.tags.items[i].string);
-		string_list_clear(&filter.tags, 0);
+			for (i = 0; i < filter.tags.nr; i++) {
+				struct string_list_item *it = &filter.tags.items[i];
+				print_tag(it->string, it->util, lines);
+			}
+		string_list_clear(&filter.tags, 1);
 	}
 	return 0;
 }
@@ -634,8 +644,6 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
 			copts.padding = 2;
 			run_column_filter(colopts, &copts);
 		}
-		if (lines != -1 && sort)
-			die(_("--sort and -n are incompatible"));
 		ret = list_tags(argv, lines == -1 ? 0 : lines, with_commit, sort);
 		if (column_active(colopts))
 			stop_column_filter();
diff --git a/cache.h b/cache.h
index df65231..74bf163 100644
--- a/cache.h
+++ b/cache.h
@@ -738,6 +738,13 @@ static inline void hashclr(unsigned char *hash)
 	memset(hash, 0, 20);
 }
 
+static inline unsigned char *hashdup(const unsigned char *hash)
+{
+	unsigned char *ret = xmalloc(20);
+	hashcpy(ret, hash);
+	return ret;
+}
+
 #define EMPTY_TREE_SHA1_HEX \
 	"4b825dc642cb6eb9a060e54bf8d69288fbee4904"
 #define EMPTY_TREE_SHA1_BIN_LITERAL \
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index e4ab0f5..279b932 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1423,6 +1423,24 @@ EOF
 	test_cmp expect actual
 '
 
+test_expect_success 'sorting works with -n' '
+	cat >msg <<-\EOF &&
+	multiline
+	tag
+	message
+	EOF
+	git tag -F msg foo-long &&
+	git tag -l --sort=-refname -n2 "foo*" >actual &&
+	cat >expect <<-\EOF &&
+	foo1.6          Merge branch '\''master'\'' into stable
+	foo1.3          Merge branch '\''master'\'' into stable
+	foo1.10         Merge branch '\''master'\'' into stable
+	foo-long        multiline
+	    tag
+	EOF
+	test_cmp expect actual
+'
+
 run_with_limited_stack () {
 	(ulimit -s 64 && "$@")
 }
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/8] tag: factor out decision to stream tags
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
  2014-06-25 23:35 ` [PATCH 1/8] tag: allow --sort with -n Jeff King
@ 2014-06-25 23:35 ` Jeff King
  2014-06-25 23:39 ` [PATCH 3/8] paint_down_to_common: use prio_queue Jeff King
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:35 UTC (permalink / raw)
  To: git

Right now we stream tags if we are not sorting. If we are
sorting, we save them in a list and print them at the end.
Let's abstract this decision into a function to make it
easier to add more cases where we use the list.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/tag.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 2adfc3d..3ef2fab 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -225,6 +225,11 @@ static void print_tag(const char *refname, const unsigned char *sha1,
 	}
 }
 
+static int filter_can_stream(struct tag_filter *filter)
+{
+	return !filter->sort;
+}
+
 static int show_reference(const char *refname, const unsigned char *sha1,
 			  int flag, void *cb_data)
 {
@@ -244,7 +249,7 @@ static int show_reference(const char *refname, const unsigned char *sha1,
 		if (points_at.nr && !match_points_at(refname, sha1))
 			return 0;
 
-		if (filter->sort)
+		if (!filter_can_stream(filter))
 			string_list_append(&filter->tags, refname)->util = hashdup(sha1);
 		else
 			print_tag(refname, sha1, filter->lines);
@@ -273,11 +278,11 @@ static int list_tags(const char **patterns, int lines,
 	filter.tags.strdup_strings = 1;
 
 	for_each_tag_ref(show_reference, (void *) &filter);
-	if (sort) {
+	if ((sort & SORT_MASK) == VERCMP_SORT)
+		qsort(filter.tags.items, filter.tags.nr,
+		      sizeof(struct string_list_item), sort_by_version);
+	if (!filter_can_stream(&filter)) {
 		int i;
-		if ((sort & SORT_MASK) == VERCMP_SORT)
-			qsort(filter.tags.items, filter.tags.nr,
-			      sizeof(struct string_list_item), sort_by_version);
 		if (sort & REVERSE_SORT)
 			for (i = filter.tags.nr - 1; i >= 0; i--) {
 				struct string_list_item *it = &filter.tags.items[i];
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/8] paint_down_to_common: use prio_queue
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
  2014-06-25 23:35 ` [PATCH 1/8] tag: allow --sort with -n Jeff King
  2014-06-25 23:35 ` [PATCH 2/8] tag: factor out decision to stream tags Jeff King
@ 2014-06-25 23:39 ` Jeff King
  2014-07-01 16:23   ` Junio C Hamano
  2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:39 UTC (permalink / raw)
  To: git

When we are traversing to find merge bases, we keep our
usual commit_list of commits to process, sorted by their
commit timestamp. As we add each parent to the list, we have
to spend "O(width of history)" to do the insertion, where
the width of history is the number of simultaneous lines of
development.

If we instead use a heap-based priority queue, we can do
these insertions in "O(log width)" time. This provides minor
speedups to merge-base calculations (timings in linux.git,
warm cache, best-of-five):

  [before]
  $ git merge-base HEAD v2.6.12
  real    0m3.251s
  user    0m3.148s
  sys     0m0.104s

  [after]
  $ git merge-base HEAD v2.6.12
  real    0m3.234s
  user    0m3.108s
  sys     0m0.128s

That's only a 0.5% speedup, but it does help protect us
against pathological cases.
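
The conversion itself is mechanical; the prio-queue.h usage pattern is
roughly this (process() is a stand-in for the real per-commit work):

  struct prio_queue queue = { compare_commits_by_commit_date };
  struct commit *c;

  prio_queue_put(&queue, commit);          /* O(log width) insertion */
  while ((c = prio_queue_get(&queue)))     /* newest commit date first */
          process(c);
  clear_prio_queue(&queue);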

The downside is that our priority queue is not stable, which
means that commits with the same timestamp may not come out
in the order we put them in. You can see this in the test
update in t6024. That test does a recursive merge across a
set of commits that all have the same timestamp. For the
virtual ancestor, the test currently ends up with a blob like
this:

    <<<<<<< Temporary merge branch 1
    <<<<<<< Temporary merge branch 1
    C
    =======
    B
    >>>>>>> Temporary merge branch 2
    =======
    A
    >>>>>>> Temporary merge branch 2

but with this patch, the positions of B and A are swapped.
This is probably fine, as the order is an internal
implementation detail anyway (it would _not_ be fine if we
were using a priority queue for "git log" traversal, which
should show commits in parent order).

While we are munging the "interesting" function, we also
take the opportunity to give it a more descriptive name, and
convert the return value to an int (we returned the first
interesting commit, but nobody ever looked at it).

Signed-off-by: Jeff King <peff@peff.net>
---
This one is not strictly required for the series; it's just that I'm
adding what is essentially a clone of paint_down_to_common later in the
series. I wanted to use the priority queue there, too, so I looked into
using it here.

I'm slightly hesitant because of the stability thing mentioned above. I
_think_ it's probably fine. But we could also implement a
stable_prio_queue on top of the existing prio_queue if we're concerned
(and that may be something we want to do anyway, because "git log" would
want that if it switched to a priority queue).

I had no recollection while writing this patch, but after searching the
list for "stable priority queue", I realized that Junio and I discussed
it quite extensively a few years ago:

  http://thread.gmane.org/gmane.comp.version-control.git/204386/focus=204534

I think the conclusion there is that what this patch does is acceptable.

 commit.c                   | 42 +++++++++++++++++++-----------------------
 t/t6024-recursive-merge.sh |  2 +-
 2 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/commit.c b/commit.c
index 881be3b..445b679 100644
--- a/commit.c
+++ b/commit.c
@@ -729,45 +729,41 @@ void sort_in_topological_order(struct commit_list **list, enum rev_sort_order so
 
 static const unsigned all_flags = (PARENT1 | PARENT2 | STALE | RESULT);
 
-static struct commit *interesting(struct commit_list *list)
+static int queue_has_nonstale(struct prio_queue *queue)
 {
-	while (list) {
-		struct commit *commit = list->item;
-		list = list->next;
-		if (commit->object.flags & STALE)
-			continue;
-		return commit;
+	int i;
+	for (i = 0; i < queue->nr; i++) {
+		struct commit *commit = queue->array[i];
+		if (!(commit->object.flags & STALE))
+			return 1;
 	}
-	return NULL;
+	return 0;
 }
 
 /* all input commits in one and twos[] must have been parsed! */
 static struct commit_list *paint_down_to_common(struct commit *one, int n, struct commit **twos)
 {
-	struct commit_list *list = NULL;
+	struct prio_queue queue = { compare_commits_by_commit_date };
 	struct commit_list *result = NULL;
 	int i;
 
 	one->object.flags |= PARENT1;
-	commit_list_insert_by_date(one, &list);
-	if (!n)
-		return list;
+	if (!n) {
+		commit_list_append(one, &result);
+		return result;
+	}
+	prio_queue_put(&queue, one);
+
 	for (i = 0; i < n; i++) {
 		twos[i]->object.flags |= PARENT2;
-		commit_list_insert_by_date(twos[i], &list);
+		prio_queue_put(&queue, twos[i]);
 	}
 
-	while (interesting(list)) {
-		struct commit *commit;
+	while (queue_has_nonstale(&queue)) {
+		struct commit *commit = prio_queue_get(&queue);
 		struct commit_list *parents;
-		struct commit_list *next;
 		int flags;
 
-		commit = list->item;
-		next = list->next;
-		free(list);
-		list = next;
-
 		flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
 		if (flags == (PARENT1 | PARENT2)) {
 			if (!(commit->object.flags & RESULT)) {
@@ -786,11 +782,11 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc
 			if (parse_commit(p))
 				return NULL;
 			p->object.flags |= flags;
-			commit_list_insert_by_date(p, &list);
+			prio_queue_put(&queue, p);
 		}
 	}
 
-	free_commit_list(list);
+	clear_prio_queue(&queue);
 	return result;
 }
 
diff --git a/t/t6024-recursive-merge.sh b/t/t6024-recursive-merge.sh
index 755d30c..c3c634f 100755
--- a/t/t6024-recursive-merge.sh
+++ b/t/t6024-recursive-merge.sh
@@ -76,7 +76,7 @@ test_expect_success "result contains a conflict" "test_cmp expect a1"
 
 git ls-files --stage > out
 cat > expect << EOF
-100644 439cc46de773d8a83c77799b7cc9191c128bfcff 1	a1
+100644 222cbac87715ba85080f8b1463833bb3cfb3b9e0 1	a1
 100644 cf84443e49e1b366fac938711ddf4be2d4d1d9e9 2	a1
 100644 fd7923529855d0b274795ae3349c5e0438333979 3	a1
 EOF
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
                   ` (2 preceding siblings ...)
  2014-06-25 23:39 ` [PATCH 3/8] paint_down_to_common: use prio_queue Jeff King
@ 2014-06-25 23:40 ` Jeff King
  2014-06-26  3:15   ` Torsten Bögershausen
  2014-06-29  7:41   ` Eric Sunshine
  2014-06-25 23:42 ` [PATCH 5/8] string-list: add pos to iterator callback Jeff King
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:40 UTC (permalink / raw)
  To: git

We already have a nice-to-use bitmap implementation in
ewah/bitmap.c. It pretends to be infinitely long when you ask
for a bit (and just returns 0 for bits that haven't been
allocated or set), and dynamically resizes as appropriate
when you set bits.

The cost to this is that each bitmap must store its own
pointer and length, using up to 16 bytes per bitmap on top
of the actual bit storage. This is a lot of storage (not to
mention an extra level of pointer indirection) if you are
going to store one bitmap per commit in a traversal.

These functions provide an alternative bitmap implementation
that can be used when you have a large number of fixed-size
bitmaps. See the documentation in the header file for
details and examples.
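
Condensed from the header's documentation, a caller-managed (non-slab)
bitset of "nr" bits looks like:

  unsigned char *bits = xcalloc(1, bitset_sizeof(nr));

  bitset_set(bits, 3);
  if (bitset_get(bits, 3))
          printf("bit 3 is set\n");
  free(bits);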

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile |   1 +
 bitset.h | 113 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)
 create mode 100644 bitset.h

diff --git a/Makefile b/Makefile
index 07ea105..8158fd2 100644
--- a/Makefile
+++ b/Makefile
@@ -633,6 +633,7 @@ LIB_H += archive.h
 LIB_H += argv-array.h
 LIB_H += attr.h
 LIB_H += bisect.h
+LIB_H += bitset.h
 LIB_H += blob.h
 LIB_H += branch.h
 LIB_H += builtin.h
diff --git a/bitset.h b/bitset.h
new file mode 100644
index 0000000..5fbc956
--- /dev/null
+++ b/bitset.h
@@ -0,0 +1,113 @@
+#ifndef BITSET_H
+#define BITSET_H
+
+/*
+ * This header file provides functions for operating on an array of unsigned
+ * characters as a bitmap. There is zero per-bitset storage overhead beyond the
+ * actual number of stored bits (modulo some padding). This is efficient, but
+ * makes the API harder to use. In particular, each bitset does not know how
+ * long it is. It is the caller's responsibility to:
+ *
+ *   1. Never ask to get or set a bit outside of the allocated range.
+ *
+ *   2. Provide the allocated range to any functions which operate
+ *      on the whole bitset (e.g., bitset_or).
+ *
+ *   3. Always feed bitsets of the same size to functions which require it
+ *      (e.g., bitset_or).
+ *
+ * It is mostly intended to be used with commit-slabs to store N bits per
+ * commit. Here's an example:
+ *
+ *   define_commit_slab(bit_slab, unsigned char);
+ *
+ *   ... assume we want to store nr bits per commit ...
+ *   struct bit_slab bits;
+ *   init_bit_slab_with_stride(&bits, bitset_sizeof(nr));
+ *
+ *   ... set a bit (make sure 10 < nr!) ...
+ *   bitset_set(bit_slab_at(&bits, commit), 10);
+ *
+ *   ... or get a bit ...
+ *   if (bitset_get(bit_slab_at(&bits, commit), 5))
+ *
+ *   ... propagate bits to a parent commit ...
+ *   bitset_or(bit_slab_at(&bits, parent),
+ *	       bit_slab_at(&bits, commit),
+ *	       nr);
+ */
+
+/*
+ * Return the number of unsigned chars required to store num_bits bits.
+ *
+ * This is mostly used internally by the bitset functions, but you may need it
+ * when allocating the bitset. Example:
+ *
+ *   bits = xcalloc(1, bitset_sizeof(nr));
+ */
+static inline int bitset_sizeof(int num_bits)
+{
+	return (num_bits + CHAR_BIT - 1) / CHAR_BIT;
+}
+
+/*
+ * Set the bit at position "n". "n" is counted from zero, and must be
+ * smaller than the num_bits given to bitset_sizeof when allocating the bitset.
+ */
+static inline void bitset_set(unsigned char *bits, int n)
+{
+	bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
+}
+
+/*
+ * Return the bit at position "n" (see bitset_set for a description of "n").
+ */
+static inline int bitset_get(unsigned char *bits, int n)
+{
+	return !!(bits[n / CHAR_BIT] & (1 << (n % CHAR_BIT)));
+}
+
+/*
+ * Return true iff the bitsets contain the same bits. Each bitset should be the
+ * same size, and should have been allocated using bitset_sizeof(max).
+ *
+ * Note that it is not safe to check partial equality by providing a smaller
+ * "max" (we assume any bits beyond "max" up to the next CHAR_BIT boundary are
+ * zeroed padding).
+ */
+static inline int bitset_equal(unsigned char *a, unsigned char *b, int max)
+{
+	int i;
+	for (i = bitset_sizeof(max); i > 0; i--)
+		if (*a++ != *b++)
+			return 0;
+	return 1;
+}
+
+/*
+ * Bitwise-or the bitsets in "dst" and "src", and store the result in "dst".
+ *
+ * See bitset_equal for the definition of "max".
+ */
+static inline void bitset_or(unsigned char *dst, const unsigned char *src, int max)
+{
+	int i;
+	for (i = bitset_sizeof(max); i > 0; i--)
+		*dst++ |= *src++;
+}
+
+/*
+ * Returns true iff the bitset contains all zeroes.
+ *
+ * See bitset_equal for the definition of "max".
+ */
+static inline int bitset_empty(const unsigned char *bits, int max)
+{
+	int i;
+	for (i = bitset_sizeof(max); i > 0; i--, bits++)
+		if (*bits)
+			return 0;
+	return 1;
+}
+
+#endif /* BITSET_H */
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/8] string-list: add pos to iterator callback
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
                   ` (3 preceding siblings ...)
  2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
@ 2014-06-25 23:42 ` Jeff King
  2014-07-01 17:45   ` Junio C Hamano
  2014-06-25 23:47 ` [PATCH 6/8] commit: provide a fast multi-tip contains function Jeff King
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:42 UTC (permalink / raw)
  To: git

When we are running a string-list foreach or filter, the
callback function sees only the string_list_item, along with
a void* data pointer provided by the caller. This is
sufficient for most purposes.

However, it can also be useful to know the position of the
item within the list (for example, if the data pointer
points to a secondary array in which each element
corresponds to an item in the string list). We can help this
use case by providing the position to each callback.
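
As a sketch of the intended use (a hypothetical "matches_flag" callback;
patch 7 adds a real caller along these lines):

  static int matches_flag(struct string_list_item *item, int pos, void *data)
  {
          unsigned char *flags = data;   /* one entry per list item */
          return flags[pos];             /* keep the item iff its flag is set */
  }

  /* drop every item whose corresponding flag is zero */
  filter_string_list(&list, 0, matches_flag, flags);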

Signed-off-by: Jeff King <peff@peff.net>
---
The diff here is noisy, and I expect in the long run that the one caller
I add to builtin/tag.c later in the series will eventually stop using
string_list entirely (in favor of a custom struct), which may leave us
with no callers that actually use the new field.

I do think the logic above is sound, though, and it's a potentially
useful thing. There may be other sites that avoid the for_each wrapper
in favor of iterating themselves simply _because_ they needed to know
the position (I would just do the same here, except that my new caller
wants to use filter_string_list, which is a little more complicated).

 builtin/clone.c    |  2 +-
 builtin/remote.c   | 12 ++++++------
 notes.c            |  1 +
 setup.c            |  1 +
 string-list.c      |  6 +++---
 string-list.h      |  9 +++++++--
 test-path-utils.c  |  2 +-
 test-string-list.c |  2 +-
 8 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index b12989d..89d0709 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -227,7 +227,7 @@ static void strip_trailing_slashes(char *dir)
 	*end = '\0';
 }
 
-static int add_one_reference(struct string_list_item *item, void *cb_data)
+static int add_one_reference(struct string_list_item *item, int pos, void *cb_data)
 {
 	char *ref_git;
 	const char *repo;
diff --git a/builtin/remote.c b/builtin/remote.c
index c9102e8..bea0b58 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -925,7 +925,7 @@ struct show_info {
 	int any_rebase;
 };
 
-static int add_remote_to_show_info(struct string_list_item *item, void *cb_data)
+static int add_remote_to_show_info(struct string_list_item *item, int pos, void *cb_data)
 {
 	struct show_info *info = cb_data;
 	int n = strlen(item->string);
@@ -935,7 +935,7 @@ static int add_remote_to_show_info(struct string_list_item *item, void *cb_data)
 	return 0;
 }
 
-static int show_remote_info_item(struct string_list_item *item, void *cb_data)
+static int show_remote_info_item(struct string_list_item *item, int pos, void *cb_data)
 {
 	struct show_info *info = cb_data;
 	struct ref_states *states = info->states;
@@ -962,7 +962,7 @@ static int show_remote_info_item(struct string_list_item *item, void *cb_data)
 	return 0;
 }
 
-static int add_local_to_show_info(struct string_list_item *branch_item, void *cb_data)
+static int add_local_to_show_info(struct string_list_item *branch_item, int pos, void *cb_data)
 {
 	struct show_info *show_info = cb_data;
 	struct ref_states *states = show_info->states;
@@ -984,7 +984,7 @@ static int add_local_to_show_info(struct string_list_item *branch_item, void *cb
 	return 0;
 }
 
-static int show_local_info_item(struct string_list_item *item, void *cb_data)
+static int show_local_info_item(struct string_list_item *item, int pos, void *cb_data)
 {
 	struct show_info *show_info = cb_data;
 	struct branch_info *branch_info = item->util;
@@ -1016,7 +1016,7 @@ static int show_local_info_item(struct string_list_item *item, void *cb_data)
 	return 0;
 }
 
-static int add_push_to_show_info(struct string_list_item *push_item, void *cb_data)
+static int add_push_to_show_info(struct string_list_item *push_item, int pos, void *cb_data)
 {
 	struct show_info *show_info = cb_data;
 	struct push_info *push_info = push_item->util;
@@ -1045,7 +1045,7 @@ static int cmp_string_with_push(const void *va, const void *vb)
 	return cmp ? cmp : strcmp(a_push->dest, b_push->dest);
 }
 
-static int show_push_info_item(struct string_list_item *item, void *cb_data)
+static int show_push_info_item(struct string_list_item *item, int pos, void *cb_data)
 {
 	struct show_info *show_info = cb_data;
 	struct push_info *push_info = item->util;
diff --git a/notes.c b/notes.c
index 5fe691d..f7a84f9 100644
--- a/notes.c
+++ b/notes.c
@@ -881,6 +881,7 @@ static int string_list_add_note_lines(struct string_list *list,
 }
 
 static int string_list_join_lines_helper(struct string_list_item *item,
+					 int pos,
 					 void *cb_data)
 {
 	struct strbuf *buf = cb_data;
diff --git a/setup.c b/setup.c
index 0a22f8b..0fe92fe 100644
--- a/setup.c
+++ b/setup.c
@@ -586,6 +586,7 @@ static dev_t get_device_or_die(const char *path, const char *prefix, int prefix_
  * subsequent entries.
  */
 static int canonicalize_ceiling_entry(struct string_list_item *item,
+				      int pos,
 				      void *cb_data)
 {
 	int *empty_entry_found = cb_data;
diff --git a/string-list.c b/string-list.c
index aabb25e..85e1e41 100644
--- a/string-list.c
+++ b/string-list.c
@@ -116,7 +116,7 @@ int for_each_string_list(struct string_list *list,
 {
 	int i, ret = 0;
 	for (i = 0; i < list->nr; i++)
-		if ((ret = fn(&list->items[i], cb_data)))
+		if ((ret = fn(&list->items[i], i, cb_data)))
 			break;
 	return ret;
 }
@@ -126,7 +126,7 @@ void filter_string_list(struct string_list *list, int free_util,
 {
 	int src, dst = 0;
 	for (src = 0; src < list->nr; src++) {
-		if (want(&list->items[src], cb_data)) {
+		if (want(&list->items[src], src, cb_data)) {
 			list->items[dst++] = list->items[src];
 		} else {
 			if (list->strdup_strings)
@@ -138,7 +138,7 @@ void filter_string_list(struct string_list *list, int free_util,
 	list->nr = dst;
 }
 
-static int item_is_not_empty(struct string_list_item *item, void *unused)
+static int item_is_not_empty(struct string_list_item *item, int pos, void *unused)
 {
 	return *item->string != '\0';
 }
diff --git a/string-list.h b/string-list.h
index dd5e294..3318076 100644
--- a/string-list.h
+++ b/string-list.h
@@ -26,8 +26,13 @@ void string_list_clear(struct string_list *list, int free_util);
 typedef void (*string_list_clear_func_t)(void *p, const char *str);
 void string_list_clear_func(struct string_list *list, string_list_clear_func_t clearfunc);
 
-/* Use this function or the macro below to iterate over each item */
-typedef int (*string_list_each_func_t)(struct string_list_item *, void *);
+/*
+ * Use this function or the macro below to iterate over each item. Each item
+ * is passed as the first argument to an invocation of the callback. The second
+ * argument, "pos", is the numeric position of the item within the
+ * list (counting from zero).
+ */
+typedef int (*string_list_each_func_t)(struct string_list_item *, int pos, void *);
 int for_each_string_list(struct string_list *list,
 			 string_list_each_func_t, void *cb_data);
 #define for_each_string_list_item(item,list) \
diff --git a/test-path-utils.c b/test-path-utils.c
index 3dd3744..f3c584b 100644
--- a/test-path-utils.c
+++ b/test-path-utils.c
@@ -6,7 +6,7 @@
  * GIT_CEILING_DIRECTORIES.  If the path is unusable for some reason,
  * die with an explanation.
  */
-static int normalize_ceiling_entry(struct string_list_item *item, void *unused)
+static int normalize_ceiling_entry(struct string_list_item *item, int pos, void *unused)
 {
 	const char *ceil = item->string;
 	int len = strlen(ceil);
diff --git a/test-string-list.c b/test-string-list.c
index 14bdf9d..85d9893 100644
--- a/test-string-list.c
+++ b/test-string-list.c
@@ -35,7 +35,7 @@ static void write_list_compact(const struct string_list *list)
 	}
 }
 
-static int prefix_cb(struct string_list_item *item, void *cb_data)
+static int prefix_cb(struct string_list_item *item, int pos, void *cb_data)
 {
 	const char *prefix = (const char *)cb_data;
 	return starts_with(item->string, prefix);
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 6/8] commit: provide a fast multi-tip contains function
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
                   ` (4 preceding siblings ...)
  2014-06-25 23:42 ` [PATCH 5/8] string-list: add pos to iterator callback Jeff King
@ 2014-06-25 23:47 ` Jeff King
  2014-06-26 18:55   ` Junio C Hamano
  2014-06-25 23:49 ` [PATCH 7/8] tag: use commit_contains Jeff King
  2014-06-25 23:53 ` [PATCH 8/8] perf: add tests for tag --contains Jeff King
  7 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:47 UTC (permalink / raw)
  To: git

When commands like "git branch --contains" want to know
which branches contain a particular commit, they run a
series of merge-base calculations, one per branch. This can
be very slow if you have a large number of branches.

We made "tag --contains" faster in ffc4b80 (tag: speed up
--contains calculation, 2011-06-11) by switching to a
different algorithm that caches intermediate "contains"
information from each tag we check. The downside of the new
algorithm is that it moves depth-first through the graph. So
it tends to go all the way to the roots, even if the
contained commit is near the top of history. That works OK
for tags, because repositories tend to have tags near the
roots anyway (e.g., a v0.1 or similar). The number of
commits we look at increased a little bit, but since we
avoid traversing over the same parts of history repeatedly,
it was a huge net win.

For "branch --contains", it is less clear that this is a
win. Most branches stay up to date, so we can bound a search
for a recent commit when we hit the merge base between the
commit and the branches.

The ideal would be to use the merge-base-style breadth-first
traversal, but to perform a single traversal for all tips.
The problem is that we need one bit of storage per tip in
each commit, and "struct commit" has only a fixed number of
bits. We can solve that by using a process similar to
paint_down_to_common, but instead of storing PARENT1 and
PARENT2 flags, using a commit slab to store one bit per tip.

Signed-off-by: Jeff King <peff@peff.net>
---
This is the interesting commit, and I'd really love some eyes on the
logic. It's basically paint_down_to_common but with the PARENT1 and
PARENT2 flags replaced with larger bitfields.

I haven't quite convinced myself that the stale logic in the middle is
right. The original paint_down function checks "PARENT1 | PARENT2" to see
if we found a merge base (even though PARENT2 may represent many tips).
Here I check whether we have _any_ "left" parent flag and _any_ "right"
parent flag. I'm not sure if I actually need to be finding the merge
base of _all_ of the commits. I don't think so, and I can't find a case
where this doesn't work, but perhaps I am not being imaginative enough.
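
For reference, the calling convention boils down to this condensed
sketch (names illustrative; the full contract is in the commit.h
comment below):

  unsigned char *result = xmalloc(commit_list_count(tags));

  commit_contains(with, tags, NULL, result);
  /* result[i] is now non-zero iff the ith commit in "tags"
   * contains at least one commit from "with" */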

 commit.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 commit.h |  17 +++++++++++
 2 files changed, 119 insertions(+)

diff --git a/commit.c b/commit.c
index 445b679..66e0def 100644
--- a/commit.c
+++ b/commit.c
@@ -11,6 +11,7 @@
 #include "commit-slab.h"
 #include "prio-queue.h"
 #include "sha1-lookup.h"
+#include "bitset.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -1040,6 +1041,107 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 	return result;
 }
 
+define_commit_slab(bit_slab, unsigned char);
+
+static int init_contains_bits(const struct commit_list *commits,
+			      struct bit_slab *bits,
+			      struct prio_queue *queue)
+{
+	int i, nr = commit_list_count(commits);
+
+	init_bit_slab_with_stride(bits, bitset_sizeof(nr));
+	for (i = 0; i < nr; i++, commits = commits->next) {
+		struct commit *c = commits->item;
+
+		prio_queue_put(queue, c);
+		bitset_set(bit_slab_at(bits, c), i);
+	}
+
+	return nr;
+}
+
+static int queue_has_nonstale_bits(struct prio_queue *queue, struct bit_slab *stale)
+{
+	int i;
+	for (i = 0; i < queue->nr; i++) {
+		struct commit *commit = queue->array[i];
+		if (!*bit_slab_at(stale, commit))
+			return 1;
+	}
+	return 0;
+}
+
+static void fill_contains_result(unsigned char *result, int nr,
+				 struct bit_slab *bits,
+				 const struct commit_list *other_side)
+{
+	const struct commit_list *c;
+
+	memset(result, 0, nr);
+	for (c = other_side; c; c = c->next) {
+		unsigned char *c_bits = bit_slab_at(bits, c->item);
+		int i;
+
+		for (i = 0; i < nr; i++)
+			result[i] |= bitset_get(c_bits, i);
+	}
+}
+
+void commit_contains(const struct commit_list *left,
+		     const struct commit_list *right,
+		     unsigned char *left_contains,
+		     unsigned char *right_contains)
+{
+	struct prio_queue queue = { compare_commits_by_commit_date };
+	struct bit_slab left_bits, right_bits, stale_bits;
+	int left_nr, right_nr;
+
+	left_nr = init_contains_bits(left, &left_bits, &queue);
+	right_nr = init_contains_bits(right, &right_bits, &queue);
+	init_bit_slab(&stale_bits);
+
+	while (queue_has_nonstale_bits(&queue, &stale_bits)) {
+		struct commit *commit = prio_queue_get(&queue);
+		struct commit_list *parents;
+		unsigned char *c_left, *c_right, *c_stale;
+
+		c_left = bit_slab_at(&left_bits, commit);
+		c_right = bit_slab_at(&right_bits, commit);
+		c_stale = bit_slab_at(&stale_bits, commit);
+
+		if (!bitset_empty(c_left, left_nr) &&
+		    !bitset_empty(c_right, right_nr))
+			*c_stale = 1;
+
+		for (parents = commit->parents; parents; parents = parents->next) {
+			struct commit *p = parents->item;
+			unsigned char *p_left = bit_slab_at(&left_bits, p),
+				      *p_right = bit_slab_at(&right_bits, p);
+
+			if (bitset_equal(c_left, p_left, left_nr) &&
+			    bitset_equal(c_right, p_right, right_nr))
+				continue;
+			if (parse_commit(p))
+				die("unable to parse commit");
+			bitset_or(p_left, c_left, left_nr);
+			bitset_or(p_right, c_right, right_nr);
+			*bit_slab_at(&stale_bits, p) |= *c_stale;
+			prio_queue_put(&queue, p);
+		}
+	}
+
+	if (left_contains)
+		fill_contains_result(left_contains, left_nr, &left_bits, right);
+	if (right_contains)
+		fill_contains_result(right_contains, right_nr, &right_bits, left);
+
+	clear_prio_queue(&queue);
+	clear_bit_slab(&left_bits);
+	clear_bit_slab(&right_bits);
+	clear_bit_slab(&stale_bits);
+}
+
+
 static const char gpg_sig_header[] = "gpgsig";
 static const int gpg_sig_header_len = sizeof(gpg_sig_header) - 1;
 
diff --git a/commit.h b/commit.h
index a9f177b..d3a142a 100644
--- a/commit.h
+++ b/commit.h
@@ -307,4 +307,21 @@ int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused)
 LAST_ARG_MUST_BE_NULL
 extern int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
 
+/*
+ * Calculate which commits in left contain commits in right, and vice versa.
+ *
+ * The left_contains result, if non-NULL, must point to an array of unsigned
+ * char with as many elements as there are items in the "left" commit_list.
+ * When the function completes, the nth char in left_contains will be non-zero
+ * iff the nth commit in the "left" list contains at least one commit from the
+ * "right" list.
+ *
+ * The right_contains result works in the same way, but with left and right
+ * swapped.
+ */
+void commit_contains(const struct commit_list *left,
+		     const struct commit_list *right,
+		     unsigned char *left_contains,
+		     unsigned char *right_contains);
+
 #endif /* COMMIT_H */
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 7/8] tag: use commit_contains
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
                   ` (5 preceding siblings ...)
  2014-06-25 23:47 ` [PATCH 6/8] commit: provide a fast multi-tip contains function Jeff King
@ 2014-06-25 23:49 ` Jeff King
  2014-06-25 23:53 ` [PATCH 8/8] perf: add tests for tag --contains Jeff King
  7 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:49 UTC (permalink / raw)
  To: git

The newly added commit_contains function should do a better
job than our custom depth-first traversal. It should be the
same speed when going close to the roots, but much faster
when all tags are close to the searched-for commit (this
usually isn't the case, but could be if you limit the tags
with a pattern).

It also cleans up some of the more egregious pitfalls of the
original implementation, including an abuse of the
UNINTERESTING and TMP_MARK flags, an utterly confusing
calling convention (it silently caches the bits between
calls, with no checks that our "with_commit" was the same
for each call), and a failure to clean up after itself
(tainting any further traversals).

Signed-off-by: Jeff King <peff@peff.net>
---
The code to use the new contains function ends up disappointingly longer
than I would have hoped, but it has to massage our string_list of tag
names into a list of commits, and then massage the output back into a
filtered string list. It's not too bad, though. And as I mentioned, I
hope to eventually factor this out to share with for-each-ref and
branch.
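
Condensed, the massaging in limit_by_contains amounts to:

  /* string_list of tag names -> commit_list of tag objects */
  tail = commit_list_append(lookup_commit_reference_gently(it->util, 1), tail);

  /* one merge-base-style traversal for all of the tags at once */
  commit_contains(with, tag_commits, NULL, result);

  /* keep only the tags whose bit came back set */
  filter_string_list(tags, 1, matches_contains, result);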

 builtin/tag.c | 156 ++++++++++-----------------------------------------------
 1 file changed, 43 insertions(+), 113 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 3ef2fab..f17192c 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -72,108 +72,6 @@ static const unsigned char *match_points_at(const char *refname,
 	return NULL;
 }
 
-static int in_commit_list(const struct commit_list *want, struct commit *c)
-{
-	for (; want; want = want->next)
-		if (!hashcmp(want->item->object.sha1, c->object.sha1))
-			return 1;
-	return 0;
-}
-
-enum contains_result {
-	CONTAINS_UNKNOWN = -1,
-	CONTAINS_NO = 0,
-	CONTAINS_YES = 1,
-};
-
-/*
- * Test whether the candidate or one of its parents is contained in the list.
- * Do not recurse to find out, though, but return -1 if inconclusive.
- */
-static enum contains_result contains_test(struct commit *candidate,
-			    const struct commit_list *want)
-{
-	/* was it previously marked as containing a want commit? */
-	if (candidate->object.flags & TMP_MARK)
-		return 1;
-	/* or marked as not possibly containing a want commit? */
-	if (candidate->object.flags & UNINTERESTING)
-		return 0;
-	/* or are we it? */
-	if (in_commit_list(want, candidate)) {
-		candidate->object.flags |= TMP_MARK;
-		return 1;
-	}
-
-	if (parse_commit(candidate) < 0)
-		return 0;
-
-	return -1;
-}
-
-/*
- * Mimicking the real stack, this stack lives on the heap, avoiding stack
- * overflows.
- *
- * At each recursion step, the stack items points to the commits whose
- * ancestors are to be inspected.
- */
-struct stack {
-	int nr, alloc;
-	struct stack_entry {
-		struct commit *commit;
-		struct commit_list *parents;
-	} *stack;
-};
-
-static void push_to_stack(struct commit *candidate, struct stack *stack)
-{
-	int index = stack->nr++;
-	ALLOC_GROW(stack->stack, stack->nr, stack->alloc);
-	stack->stack[index].commit = candidate;
-	stack->stack[index].parents = candidate->parents;
-}
-
-static enum contains_result contains(struct commit *candidate,
-		const struct commit_list *want)
-{
-	struct stack stack = { 0, 0, NULL };
-	int result = contains_test(candidate, want);
-
-	if (result != CONTAINS_UNKNOWN)
-		return result;
-
-	push_to_stack(candidate, &stack);
-	while (stack.nr) {
-		struct stack_entry *entry = &stack.stack[stack.nr - 1];
-		struct commit *commit = entry->commit;
-		struct commit_list *parents = entry->parents;
-
-		if (!parents) {
-			commit->object.flags |= UNINTERESTING;
-			stack.nr--;
-		}
-		/*
-		 * If we just popped the stack, parents->item has been marked,
-		 * therefore contains_test will return a meaningful 0 or 1.
-		 */
-		else switch (contains_test(parents->item, want)) {
-		case CONTAINS_YES:
-			commit->object.flags |= TMP_MARK;
-			stack.nr--;
-			break;
-		case CONTAINS_NO:
-			entry->parents = parents->next;
-			break;
-		case CONTAINS_UNKNOWN:
-			push_to_stack(parents->item, &stack);
-			break;
-		}
-	}
-	free(stack.stack);
-	return contains_test(candidate, want);
-}
-
 static void show_tag_lines(const unsigned char *sha1, int lines)
 {
 	int i;
@@ -227,7 +125,7 @@ static void print_tag(const char *refname, const unsigned char *sha1,
 
 static int filter_can_stream(struct tag_filter *filter)
 {
-	return !filter->sort;
+	return !filter->sort && !filter->with_commit;
 }
 
 static int show_reference(const char *refname, const unsigned char *sha1,
@@ -236,16 +134,6 @@ static int show_reference(const char *refname, const unsigned char *sha1,
 	struct tag_filter *filter = cb_data;
 
 	if (match_pattern(filter->patterns, refname)) {
-		if (filter->with_commit) {
-			struct commit *commit;
-
-			commit = lookup_commit_reference_gently(sha1, 1);
-			if (!commit)
-				return 0;
-			if (!contains(commit, filter->with_commit))
-				return 0;
-		}
-
 		if (points_at.nr && !match_points_at(refname, sha1))
 			return 0;
 
@@ -258,6 +146,46 @@ static int show_reference(const char *refname, const unsigned char *sha1,
 	return 0;
 }
 
+static int util_is_not_null(struct string_list_item *it, int pos, void *data)
+{
+	return !!it->util;
+}
+
+static int matches_contains(struct string_list_item *it, int pos, void *data)
+{
+	unsigned char *contains = data;
+	return contains[pos];
+}
+
+static void limit_by_contains(struct string_list *tags, struct commit_list *with)
+{
+	struct commit_list *tag_commits = NULL, **tail = &tag_commits;
+	unsigned char *result;
+	int i;
+
+	for (i = 0; i < tags->nr; i++) {
+		struct string_list_item *it = &tags->items[i];
+		struct commit *c = lookup_commit_reference_gently(it->util, 1);
+		if (c)
+			tail = commit_list_append(c, tail);
+		else {
+			free(it->util);
+			it->util = NULL;
+		}
+	}
+	filter_string_list(tags, 0, util_is_not_null, NULL);
+
+	if (!tags->nr)
+		return;
+
+	result = xmalloc(tags->nr);
+	commit_contains(with, tag_commits, NULL, result);
+	filter_string_list(tags, 1, matches_contains, result);
+
+	free(result);
+	free_commit_list(tag_commits);
+}
+
 static int sort_by_version(const void *a_, const void *b_)
 {
 	const struct string_list_item *a = a_;
@@ -278,9 +206,11 @@ static int list_tags(const char **patterns, int lines,
 	filter.tags.strdup_strings = 1;
 
 	for_each_tag_ref(show_reference, (void *) &filter);
+	if (with_commit)
+		limit_by_contains(&filter.tags, with_commit);
 	if ((sort & SORT_MASK) == VERCMP_SORT)
 		qsort(filter.tags.items, filter.tags.nr,
 		      sizeof(struct string_list_item), sort_by_version);
 	if (!filter_can_stream(&filter)) {
 		int i;
 		if (sort & REVERSE_SORT)
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 8/8] perf: add tests for tag --contains
  2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
                   ` (6 preceding siblings ...)
  2014-06-25 23:49 ` [PATCH 7/8] tag: use commit_contains Jeff King
@ 2014-06-25 23:53 ` Jeff King
  2014-06-26  0:01   ` Jeff King
  7 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2014-06-25 23:53 UTC (permalink / raw)
  To: git

These tests can demonstrate the changes in "tag --contains"
speed over time. The interesting points in history are:

  - pre-ffc4b80, where we used a series of N merge-base
    traversals

  - ffc4b80 up to the current master, where we moved to a
    single depth-first traversal

  - the previous commit, where we moved from depth-first to
    a multi-tip merge-base

The interesting cases to measure are:

  - checking which tags contain a recent commit (we use
    HEAD~100 here)

  - checking which tags contain a very ancient commit (we
    use the last commit output by rev-list)

  - checking which tags contain a commit in the middle (we
    use HEAD~5000, which goes back 5 years in git.git)

  - all of the above, but instead of looking at all commits,
    considering only recent ones (we pick the most recent
    tag by its tagger date)

Here are the timings for git.git:

    Test                              ffc4b80^          origin/master             HEAD
    ----------------------------------------------------------------------------------------------------
    7000.3: contains recent/all       1.97(1.96+0.01)   0.26(0.25+0.00) -86.8%    0.27(0.26+0.00) -86.3%
    7000.4: contains recent/v2.0.1    0.08(0.08+0.00)   0.25(0.24+0.01) +212.5%   0.02(0.02+0.00) -75.0%
    7000.5: contains old/all          0.90(0.89+0.00)   0.18(0.17+0.00) -80.0%    0.27(0.26+0.00) -70.0%
    7000.6: contains old/v2.0.1       0.25(0.23+0.02)   0.03(0.03+0.00) -88.0%    0.25(0.24+0.00) +0.0%
    7000.7: contains ancient/all      1.98(1.97+0.01)   0.26(0.24+0.01) -86.9%    0.28(0.25+0.02) -85.9%
    7000.8: contains ancient/v2.0.1   1.95(1.94+0.00)   0.26(0.24+0.01) -86.7%    0.27(0.26+0.00) -86.2%

You can see that ffc4b80 vastly improved the normal case of
checking all tags. This is because we avoid walking over the
same parts of history over and over. However, when looking
only for a recent tag (v2.0.1 in these tests), it sometimes
performs much worse than the original. This is not
surprising. For a merge-base solution, we can quit when we
hit history shared between the contained commit and the tag.
For ffc4b80's depth-first approach, we typically go all the
way to the roots before backtracking. For the ancient/v2.0.1
case, that's not a big deal, because the merge base requires
us to do that anyway. But for recent/v2.0.1, the merge-base
answer should involve only recent history.

The new traversal code performs about as well as the
depth-first code in the normal case, but fixes the
regression in the recent/v2.0.1 case.

Signed-off-by: Jeff King <peff@peff.net>
---
There are still two things about the timings that puzzle me a bit.

One is that the old/all case gets slower moving from the depth-first
traversal to the merge-base one. I think this is simply because the
depth-first one may get "lucky" sometimes, and hit the commit we are
looking for on the way down. So its average case is somewhat better than
its worst case (and I would not be surprised if my choice of HEAD~5000
helps it, because it follows first parents first).

The second question is why ffc4b80^ is so much slower on the v2.0.1
tests than the new code. They should both be doing a single merge-base
traversal, and I'd expect them to take about the same amount of time
(for that matter, ancient/v2.0.1 should take the same amount of time as
the depth-first code, since they all basically have to read all of the
commits once). My guess is that there's some other speedup that has
happened in the years between ffc4b80 and now.

 t/perf/p7000-tag-contains.sh | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100755 t/perf/p7000-tag-contains.sh

diff --git a/t/perf/p7000-tag-contains.sh b/t/perf/p7000-tag-contains.sh
new file mode 100755
index 0000000..846f106
--- /dev/null
+++ b/t/perf/p7000-tag-contains.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+test_description='speed of tag --contains lookups'
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+test_expect_success 'find reference points' '
+	recent=$(git rev-parse HEAD~100) &&
+	old=$(git rev-parse HEAD~5000) &&
+	ancient=$(git rev-list | tail -n 1)
+'
+
+test_expect_success 'find most recent tag' '
+	tag=$(git for-each-ref --sort=-taggerdate \
+			       --format="%(refname:short)" \
+			       refs/tags |
+	      head -n 1)
+'
+
+for distance in recent old ancient; do
+	contains=$(eval echo \$$distance)
+	for match in "" "$tag"; do
+		test_perf "contains $distance/${match:-all}" "
+			git tag -l --contains $contains $match
+		"
+	done
+done
+
+test_done
-- 
2.0.0.566.gfe3e6b2

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/8] perf: add tests for tag --contains
  2014-06-25 23:53 ` [PATCH 8/8] perf: add tests for tag --contains Jeff King
@ 2014-06-26  0:01   ` Jeff King
  2014-06-26  0:04     ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2014-06-26  0:01 UTC (permalink / raw)
  To: git

On Wed, Jun 25, 2014 at 07:53:35PM -0400, Jeff King wrote:

> There are still two things about the timings that puzzle me a bit.

<forehead-palm>

This certainly isn't helping:

> +test_expect_success 'find reference points' '
> +	recent=$(git rev-parse HEAD~100) &&
> +	old=$(git rev-parse HEAD~5000) &&
> +	ancient=$(git rev-list | tail -n 1)
> +'

$ancient will always be empty, as rev-list needs a "HEAD" argument.

I also think HEAD~100 is probably _too_ recent, as it is not contained
in any tags at all right now.

So with this patch:

diff --git a/t/perf/p7000-tag-contains.sh b/t/perf/p7000-tag-contains.sh
index 846f106..8294d41 100755
--- a/t/perf/p7000-tag-contains.sh
+++ b/t/perf/p7000-tag-contains.sh
@@ -6,9 +6,9 @@ test_description='speed of tag --contains lookups'
 test_perf_default_repo
 
 test_expect_success 'find reference points' '
-	recent=$(git rev-parse HEAD~100) &&
+	recent=$(git rev-parse HEAD~200) &&
 	old=$(git rev-parse HEAD~5000) &&
-	ancient=$(git rev-list | tail -n 1)
+	ancient=$(git rev-list HEAD | tail -n 1)
 '
 
 test_expect_success 'find most recent tag' '

I get:

  Test                              ffc4b80^          origin/master             HEAD                  
  ----------------------------------------------------------------------------------------------------
  7000.3: contains recent/all       1.99(1.97+0.01)   0.25(0.24+0.00) -87.4%    0.27(0.26+0.00) -86.4%
  7000.4: contains recent/v2.0.1    0.03(0.03+0.00)   0.00(0.00+0.00) -100.0%   0.03(0.02+0.00) +0.0% 
  7000.5: contains old/all          0.90(0.89+0.00)   0.18(0.17+0.00) -80.0%    0.27(0.26+0.00) -70.0%
  7000.6: contains old/v2.0.1       0.25(0.24+0.00)   0.03(0.03+0.00) -88.0%    0.25(0.24+0.00) +0.0% 
  7000.7: contains ancient/all      0.82(0.80+0.01)   0.13(0.12+0.00) -84.1%    0.27(0.26+0.01) -67.1%
  7000.8: contains ancient/v2.0.1   0.26(0.26+0.00)   0.09(0.08+0.00) -65.4%    0.25(0.24+0.00) -3.8% 

which looks about right.

-Peff

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/8] perf: add tests for tag --contains
  2014-06-26  0:01   ` Jeff King
@ 2014-06-26  0:04     ` Jeff King
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-06-26  0:04 UTC (permalink / raw)
  To: git

On Wed, Jun 25, 2014 at 08:01:29PM -0400, Jeff King wrote:

> I get:
> 
>   Test                              ffc4b80^          origin/master             HEAD                  
>   ----------------------------------------------------------------------------------------------------
>   7000.3: contains recent/all       1.99(1.97+0.01)   0.25(0.24+0.00) -87.4%    0.27(0.26+0.00) -86.4%
>   7000.4: contains recent/v2.0.1    0.03(0.03+0.00)   0.00(0.00+0.00) -100.0%   0.03(0.02+0.00) +0.0% 
>   7000.5: contains old/all          0.90(0.89+0.00)   0.18(0.17+0.00) -80.0%    0.27(0.26+0.00) -70.0%
>   7000.6: contains old/v2.0.1       0.25(0.24+0.00)   0.03(0.03+0.00) -88.0%    0.25(0.24+0.00) +0.0% 
>   7000.7: contains ancient/all      0.82(0.80+0.01)   0.13(0.12+0.00) -84.1%    0.27(0.26+0.01) -67.1%
>   7000.8: contains ancient/v2.0.1   0.26(0.26+0.00)   0.09(0.08+0.00) -65.4%    0.25(0.24+0.00) -3.8% 
> 
> which looks about right.

Oh, hmph. The "bad" case in recent/v2.0.1 for origin/master goes away
then. Because we get "lucky" again, and the depth-first one does not
have to go all the way to the roots. So it's probably a better example
to use the original HEAD~100, which isn't actually in v2.0.1.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
@ 2014-06-26  3:15   ` Torsten Bögershausen
  2014-06-26 15:51     ` Jeff King
  2014-06-29  7:41   ` Eric Sunshine
  1 sibling, 1 reply; 26+ messages in thread
From: Torsten Bögershausen @ 2014-06-26  3:15 UTC (permalink / raw)
  To: Jeff King, git

On 2014-06-26 01.40, Jeff King wrote:
[]

> + */
> +static inline int bitset_sizeof(int num_bits)
> +{
> +	return (num_bits + CHAR_BIT - 1) / CHAR_BIT;
> +}
Just a general question about the usage of "int" here (and at other places):
Is there a special reason for new code to allow num_bits to be negative ?

To my knowledge all the size_t definitions these days are unsigned,
because a size cannot be negative.

As a reader of the code I always wonder if there is a special meaning for
negative values (as with read(), which returns -1 to indicate an error), but there isn't.

Should we use
"unsigned" here?
or "unsigned int"?
or "size_t" (which may use 64 bits, which feels like overkill)?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-06-26  3:15   ` Torsten Bögershausen
@ 2014-06-26 15:51     ` Jeff King
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-06-26 15:51 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

On Thu, Jun 26, 2014 at 05:15:05AM +0200, Torsten Bögershausen wrote:

> > + */
> > +static inline int bitset_sizeof(int num_bits)
> > +{
> > +	return (num_bits + CHAR_BIT - 1) / CHAR_BIT;
> > +}
> Just a general question about the usage of "int" here (and at other places):
> Is there a special reason for new code to allow num_bits to be negative ?

No. I usually choose "int" when the word size is not likely to matter
(i.e., we do not expect it to overflow a 32-bit integer, nor to have so
many that I need to be careful not to waste space).

Probably "unsigned int" would be a more descriptive choice.

It may also help the compiler optimize better. Assuming CHAR_BIT is 8
(i.e., most everywhere), we get:

  (num_bits + 7) / 8

Presumably the compiler implements the division with a right-shift.
Marking num_bits as unsigned should let us do just a logical shift,
without worrying about the sign. And indeed, here are the signed and
unsigned versions produced by "gcc -S -O2" (for an equivalent
non-inlined function):

  [signed]
        leal    14(%rdi), %edx
        movl    %edi, %eax
        addl    $7, %eax
        cmovs   %edx, %eax
        sarl    $3, %eax
        ret

  [unsigned]
        leal    7(%rdi), %eax
        shrl    $3, %eax
        ret

Much simpler, though see below for practical considerations.
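
For reference, an equivalent non-inlined pair would be something like
this (a reconstruction, not the exact source I compiled):

  int sizeof_signed(int num_bits)
  {
          return (num_bits + 7) / 8;
  }

  unsigned sizeof_unsigned(unsigned num_bits)
  {
          return (num_bits + 7) / 8;
  }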

> To my knowledge all the size_t definitions these days are positive,
> because a size can not be negative.

size_t is perhaps a reasonable choice for the return value, given the name
"sizeof". But if you really care about using the whole range of bits there, you
need a data type for num_bits that is CHAR_BIT times larger.

> Should we use
> "unsigned" here ?
> or "unsigned int" ?

Yes, I think so. Both are the same to the compiler. I have a vague
recollection that we prefer one over the other, but grepping seems to
find many examples of each in our code.

I'm squashing in the patch below. I couldn't measure any speed
improvement, presumably because the functions are all inlined, which
means we likely get away with calculating bitset_sizeof once, outside
of our loop. I think the result is still more obvious to read, though.

-Peff

---
diff --git a/bitset.h b/bitset.h
index 5fbc956..268545b 100644
--- a/bitset.h
+++ b/bitset.h
@@ -45,7 +45,7 @@
  *
  *   bits = xcalloc(1, bitset_sizeof(nr));
  */
-static inline int bitset_sizeof(int num_bits)
+static inline unsigned bitset_sizeof(unsigned num_bits)
 {
 	return (num_bits + CHAR_BIT - 1) / CHAR_BIT;
 }
@@ -54,7 +54,7 @@ static inline int bitset_sizeof(int num_bits)
  * Set the bit at position "n". "n" is counted from zero, and must be
  * smaller than the num_bits given to bitset_sizeof when allocating the bitset.
  */
-static inline void bitset_set(unsigned char *bits, int n)
+static inline void bitset_set(unsigned char *bits, unsigned n)
 {
 	bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
 }
@@ -62,7 +62,7 @@ static inline void bitset_set(unsigned char *bits, int n)
 /*
  * Return the bit at position "n" (see bitset_set for a description of "n").
  */
-static inline int bitset_get(unsigned char *bits, int n)
+static inline unsigned bitset_get(unsigned char *bits, unsigned n)
 {
 	return !!(bits[n / CHAR_BIT] & (1 << (n % CHAR_BIT)));
 }
@@ -75,9 +75,10 @@ static inline int bitset_get(unsigned char *bits, int n)
  * "max" (we assume any bits beyond "max" up to the next CHAR_BIT boundary are
  * zeroed padding).
  */
-static inline int bitset_equal(unsigned char *a, unsigned char *b, int max)
+static inline int bitset_equal(unsigned char *a, unsigned char *b,
+			       unsigned max)
 {
-	int i;
+	unsigned i;
 	for (i = bitset_sizeof(max); i > 0; i--)
 		if (*a++ != *b++)
 			return 0;
@@ -89,9 +90,10 @@ static inline int bitset_equal(unsigned char *a, unsigned char *b, int max)
  *
  * See bitset_equal for the definition of "max".
  */
-static inline void bitset_or(unsigned char *dst, const unsigned char *src, int max)
+static inline void bitset_or(unsigned char *dst, const unsigned char *src,
+			     unsigned max)
 {
-	int i;
+	unsigned i;
 	for (i = bitset_sizeof(max); i > 0; i--)
 		*dst++ |= *src++;
 }
@@ -101,9 +103,9 @@ static inline void bitset_or(unsigned char *dst, const unsigned char *src, int m
  *
  * See bitset_equal for the definition of "max".
  */
-static inline int bitset_empty(const unsigned char *bits, int max)
+static inline int bitset_empty(const unsigned char *bits, unsigned max)
 {
-	int i;
+	unsigned i;
 	for (i = bitset_sizeof(max); i > 0; i--, bits++)
 		if (*bits)
 			return 0;

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
  2014-06-25 23:47 ` [PATCH 6/8] commit: provide a fast multi-tip contains function Jeff King
@ 2014-06-26 18:55   ` Junio C Hamano
  2014-06-26 19:19     ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2014-06-26 18:55 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> I haven't quite convinced myself that the stale logic in the middle is
> right. The origin paint_down function checks "PARENT1 | PARENT2" to see
> if we found a merge base (even though PARENT2 may represent many tips).
> Here I check whether we have _any_ "left" parent flag and _any_ "right"
> parent flag. I'm not sure if I actually need to be finding the merge
> base of _all_ of the commits.

In the one-each-on-two-sides case (i.e. the original algorithm), the
reasoning is: "any commit we will encounter by digging further down
from a STALE one will be reachable from a newer merge base (e.g. the
commit that caused it to be marked as STALE originally).  It will
never be a useful merge base, so let's mark it as STALE.  Even if a
future traversal that comes from sideways (i.e. not passing the
commit that caused it to be marked as STALE) reaches this STALE one,
digging further from there won't give us anything new."

If you see that a commit can be reached from L1 and R2, the only
thing you know is that its parents can also be reached from L1 and
R2; it does not tell you whether the commit is reachable from other
tips, e.g. L2 or R1.  When a traversal reaches such a node from
sideways, trying to paint it with L2 for example, you do need to
continue digging.

I think the traversal optimization based on the STALE bit is merely
a halfway optimization to cull obvious duplicates.  See how
get_merge_bases_many() still needs to clean up redundant ones in the
result of merge_bases_many(), even though the latter does use the
STALE bit to remove obvious duplicates; that clean-up is what makes
sure the result won't include a commit that can be reached from
another commit in the result, without having to resort to the SLOP
trick.

Because your end-goal is to tell whether Ln is reachable from Rm
(for a number of n's and m's), coming up with the
independent/non-redundant set of merge-bases is not necessary, I
think.  I suspect that you do not need the STALE half-optimization
in the first place.

The only time you can say "Ah, we've seen this one and no need to
dig further" is when you are propagating a colour C and the parent
in question is already painted in C (it is OK to be painted as
reachable from more tips), I would think, so shouldn't the loop be
more like, after painting each tip and placing it in the queue:

 * Dequeue one, check the L/R bits in it and call that C

 * Iterate over its parents P:

   * check the L/R bits in P and call that Cp.

   * If Cp is already a superset of C, there is no point putting P
     to the queue to be processed.

   * If Cp is not a superset of C, then update L/R bits in P to mark
     it reachable from tips represented by C and put P to the queue.

until you run out of queue?
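
In code, using the bit_slab/bitset helpers from patches 4 and 6 plus
a hypothetical bitset_subset(a, b, nr) that returns true iff every
bit set in "a" is also set in "b", the loop might look roughly like:

  struct commit *commit;

  while ((commit = prio_queue_get(&queue))) {
          unsigned char *c = bit_slab_at(&bits, commit);  /* colour C */
          struct commit_list *p;

          for (p = commit->parents; p; p = p->next) {
                  unsigned char *cp = bit_slab_at(&bits, p->item);

                  if (bitset_subset(c, cp, nr))
                          continue;       /* Cp is already a superset of C */
                  bitset_or(cp, c, nr);   /* paint P with C as well */
                  prio_queue_put(&queue, p->item);
          }
  }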




> +void commit_contains(const struct commit_list *left,
> +		     const struct commit_list *right,
> +		     unsigned char *left_contains,
> +		     unsigned char *right_contains)
> +{
> +	struct prio_queue queue = { compare_commits_by_commit_date };
> +	struct bit_slab left_bits, right_bits, stale_bits;
> +	int left_nr, right_nr;
> +
> +	left_nr = init_contains_bits(left, &left_bits, &queue);
> +	right_nr = init_contains_bits(right, &right_bits, &queue);
> +	init_bit_slab(&stale_bits);
> +
> +	while (queue_has_nonstale_bits(&queue, &stale_bits)) {
> +		struct commit *commit = prio_queue_get(&queue);
> +		struct commit_list *parents;
> +		unsigned char *c_left, *c_right, *c_stale;
> +
> +		c_left = bit_slab_at(&left_bits, commit);
> +		c_right = bit_slab_at(&right_bits, commit);
> +		c_stale = bit_slab_at(&stale_bits, commit);
> +
> +		if (!bitset_empty(c_left, left_nr) &&
> +		    !bitset_empty(c_right, right_nr))
> +			*c_stale = 1;

Hmph, is this using a whole char for one bit?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
  2014-06-26 18:55   ` Junio C Hamano
@ 2014-06-26 19:19     ` Junio C Hamano
  2014-06-26 19:26       ` Junio C Hamano
  2014-07-01 18:16       ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2014-06-26 19:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> The only time you can say "Ah, we've seen this one and no need to
> dig further" is when you are propagating a colour C and the parent
> in question is already painted in C (it is OK to be painted as
> reachable from more tips), I would think, so shouldn't the loop be
> more like, after painting each tip and placing it in the queue:
>
>  * Dequeue one, check the L/R bits in it and call that C
>
>  * Iterate over its parents P:
>
>    * check the L/R bits in P and call that Cp.
>
>    * If Cp is already a superset of C, there is no point putting P
>      to the queue to be processed.
>
>    * If Cp is not a superset of C, then update L/R bits in P to mark
>      it reachable from tips represented by C and put P to the queue.
>
> until you run out of queue?

Actually that will cause you to dig down to the root, so it won't be
nice.

If you are trying to do "branch --with $commit", what you would want
is not exactly "paint-down-to-common(all branches, $commit)", but
something that paints down to $commit from all branches, with an
optimization that cuts off the traversal at a commit that is
reachable from $commit.  If a traversal from branch B reached such a
commit that is reachable from $commit, you can stop the traversal
because propagating the bit for B further down to its parents will
not reach the $commit you are interested in.

So the termination condition for this single-Left case (I'll use
Left for $commit and Right for "all branches") is "if a commit is
already painted as reachable from $commit, do not propagate bits
further down".

What does it mean to look for "branch --with $commit1 $commit2"
(i.e. more than one in the Left set)?  If we are trying to see which
branches reach _both_ of these commits, then replace the above with
"if a commit is already painted as reachable from both $commit{1,2},
then painting it with any branch bits is futile---its parents will
never reach either $commit1 or $commit2", so the additional
termination condition will be "if the Left bits are full, then stop".
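
In code, that extra cutoff might look like this (a sketch; the
bitset_full() helper is hypothetical and not in the posted series):

  static inline int bitset_full(unsigned char *bits, unsigned num_bits)
  {
          unsigned i;
          for (i = 0; i < num_bits; i++)
                  if (!bitset_get(bits, i))
                          return 0;
          return 1;
  }

  /* ... and in the traversal loop ... */
  if (bitset_full(c_left, left_nr))
          continue;       /* painting its parents further is futile */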

I do not know how you can optimize the traversal if you are trying
to see which branches reach at least one of these commits, though.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
  2014-06-26 19:19     ` Junio C Hamano
@ 2014-06-26 19:26       ` Junio C Hamano
  2014-07-01 18:16       ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2014-06-26 19:26 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> What does it mean to look for "branch --with $commit1 $commit2"
> (i.e. more than one in the Left set)?  If we are trying to see which
> branches reach _both_ of these commits, then replace the above with
> "if a commit is already painted as reachable from both $commit{1,2},
> then painting it with any branch bits is futile---its parents will
> never reach either $commit1 or $commit2", so the additional
> termination condition will be "if the Left bits are full, then stop".
>
> I do not know how you can optimize the traversal if you are trying
> to see which branches reach at least one of these commits, though.

By the way, don't we do 80% of this already in show-branch?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
  2014-06-26  3:15   ` Torsten Bögershausen
@ 2014-06-29  7:41   ` Eric Sunshine
  2014-06-30 17:07     ` Jeff King
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Sunshine @ 2014-06-29  7:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Wed, Jun 25, 2014 at 7:40 PM, Jeff King <peff@peff.net> wrote:
> We already have a nice-to-use bitmap implementation in
> ewah/bitmap.c. It pretends to be infinitely long when asking
> for a bit (and just returns 0 for bits that haven't been
> allocated or set), and dynamically resizes as appropriate
> when you set bits.
>
> The cost to this is that each bitmap must store its own
> pointer and length, using up to 16 bytes per bitmap on top
> of the actual bit storage. This is a lot of storage (not to
> mention an extra level of pointer indirection) if you are
> going to store one bitmap per commit in a traversal.
>
> These functions provide an alternative bitmap implementation
> that can be used when you have a large number of fixed-size
> bitmaps. See the documentation in the header file for
> details and examples.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/bitset.h b/bitset.h
> new file mode 100644
> index 0000000..5fbc956
> --- /dev/null
> +++ b/bitset.h
> @@ -0,0 +1,113 @@
> +#ifndef BITSET_H
> +#define BITSET_H
> +
> + * Return the number of unsigned chars required to store num_bits bits.
> + *
> + * This is mostly used internally by the bitset functions, but you may need it
> + * when allocating the bitset. Example:
> + *
> + *   bits = xcalloc(1, bitset_sizeof(nr));
> + */
> +static inline int bitset_sizeof(int num_bits)
> +{
> +       return (num_bits + CHAR_BIT - 1) / CHAR_BIT;
> +}
> +
> +/*
> + * Set the bit at position "n". "n" is counted from zero, and must be
> + * smaller than the num_bits given to bitset_sizeof when allocating the bitset.
> + */
> +static inline void bitset_set(unsigned char *bits, int n)
> +{
> +       bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
> +}

Is it intentional or an oversight that there is no way to clear a bit
in the set?

> +/*
> + * Return the bit at position "n" (see bitset_set for a description of "n").
> + */
> +static inline int bitset_get(unsigned char *bits, int n)
> +{
> +       return !!(bits[n / CHAR_BIT] & (1 << (n % CHAR_BIT)));
> +}
> +
> +/*
> + * Return true iff the bitsets contain the same bits. Each bitset should be the
> + * same size, and should have been allocated using bitset_sizeof(max).
> + *
> + * Note that it is not safe to check partial equality by providing a smaller
> + * "max" (we assume any bits beyond "max" up to the next CHAR_BIT boundary are
> + * zeroed padding).
> + */
> +static inline int bitset_equal(unsigned char *a, unsigned char *b, int max)
> +{
> +       int i;
> +       for (i = bitset_sizeof(max); i > 0; i--)
> +               if (*a++ != *b++)
> +                       return 0;
> +       return 1;
> +}
> +
> +/*
> + * Bitwise-or the bitsets in "dst" and "src", and store the result in "dst".
> + *
> + * See bitset_equal for the definition of "max".
> + */
> +static inline void bitset_or(unsigned char *dst, const unsigned char *src, int max)
> +{
> +       int i;
> +       for (i = bitset_sizeof(max); i > 0; i--)
> +               *dst++ |= *src++;
> +}
> +
> +/*
> + * Returns true iff the bitset contains all zeroes.
> + *
> + * See bitset_equal for the definition of "max".
> + */
> +static inline int bitset_empty(const unsigned char *bits, int max)
> +{
> +       int i;
> +       for (i = bitset_sizeof(max); i > 0; i--, bits++)
> +               if (*bits)
> +                       return 0;
> +       return 1;
> +}
> +
> +#endif /* BITSET_H */
> --
> 2.0.0.566.gfe3e6b2

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-06-29  7:41   ` Eric Sunshine
@ 2014-06-30 17:07     ` Jeff King
  2014-07-01 16:57       ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2014-06-30 17:07 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Sun, Jun 29, 2014 at 03:41:37AM -0400, Eric Sunshine wrote:

> > +static inline void bitset_set(unsigned char *bits, int n)
> > +{
> > +       bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
> > +}
> 
> Is it intentional or an oversight that there is no way to clear a bit
> in the set?

Intentional in the sense that I had no need for it in my series, and I
didn't think about it. I doubt many callers would want it, since commit
traversals tend to propagate bits through the graph, and then clean them
up all at once. And the right way to clean up slabbed data like this is
to just clear the slab.

Of course somebody may use the code for something besides commit
traversals. But I'd rather avoid adding dead code on the off chance that
somebody uses it later (and then gets to find out whether it even works
or not!).
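
For the record, the counterpart would be the obvious one-line mirror
of bitset_set (shown only as a sketch, since it is deliberately not
part of the series):

  static inline void bitset_clr(unsigned char *bits, unsigned n)
  {
          bits[n / CHAR_BIT] &= ~(1 << (n % CHAR_BIT));
  }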

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] paint_down_to_common: use prio_queue
  2014-06-25 23:39 ` [PATCH 3/8] paint_down_to_common: use prio_queue Jeff King
@ 2014-07-01 16:23   ` Junio C Hamano
  2014-07-01 17:10     ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2014-07-01 16:23 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> The downside is that our priority queue is not stable, which
> means that commits with the same timestamp may not come out
> in the order we put them in. You can see this in the test
> update in t6024. That test does a recursive merge across a
> set of commits that all have the same timestamp. For the
> virtual ancestor, the test currently ends up with blob like
> this:
>
>     <<<<<<< Temporary merge branch 1
>     <<<<<<< Temporary merge branch 1
>     C
>     =======
>     B
>     >>>>>>> Temporary merge branch 2
>     =======
>     A
>     >>>>>>> Temporary merge branch 2
>
> but with this patch, the positions of B and A are swapped.
> This is probably fine, as the order is an internal
> implementation detail anyway (it would _not_ be fine if we
> were using a priority queue for "git log" traversal, which
> should show commits in parent order).

Interesting that the queue is not "stable", but the test can still
rely on a fixed output.  I tend to agree that for the purpose of
this code path the order is an internal implementation detail, but
I wonder if it would benefit us to teach prio-queue to be optionally
"stable", which would allow us to use it in other code paths that
care.  If we really wanted to, I would imagine that we could keep an
"insertion counter" in the elements of the queue to make the result
stable (i.e. the "void **array" would become something like
"struct { int insertion_ctr; void *thing; } *array").
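
A sketch of that shape (none of this is code from the series; the
comparator is the one the series already feeds to prio_queue):

  struct stable_entry {
          unsigned insertion_ctr;         /* assigned when queued */
          void *thing;
  };

  static int stable_compare(const void *va, const void *vb, void *cb_data)
  {
          const struct stable_entry *a = va, *b = vb;
          int cmp = compare_commits_by_commit_date(a->thing, b->thing,
                                                   cb_data);
          if (cmp)
                  return cmp;
          /* tie-break on insertion order to keep equal keys FIFO */
          return a->insertion_ctr < b->insertion_ctr ? -1 : 1;
  }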

> I'm slightly hesitant because of the stability thing mentioned above. I
> _think_ it's probably fine. But we could also implement a
> stable_prio_queue on top of the existing prio_queue if we're concerned
> (and that may be something we want to do anyway, because "git log" would
> want that if it switched to a priority queue).

Heh, I should have read the below-three-dashes commentary before
commenting (I often start working from the commit messages in "git
log" and then go back to the original thread).

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-06-30 17:07     ` Jeff King
@ 2014-07-01 16:57       ` Junio C Hamano
  2014-07-01 17:18         ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2014-07-01 16:57 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Sunshine, Git List

Jeff King <peff@peff.net> writes:

> On Sun, Jun 29, 2014 at 03:41:37AM -0400, Eric Sunshine wrote:
>
>> > +static inline void bitset_set(unsigned char *bits, int n)
>> > +{
>> > +       bits[n / CHAR_BIT] |= 1 << (n % CHAR_BIT);
>> > +}
>> 
>> Is it intentional or an oversight that there is no way to clear a bit
>> in the set?
>
> Intentional in the sense that I had no need for it in my series, and I
> didn't think about it. I doubt many callers would want it, since commit
> traversals tend to propagate bits through the graph, and then clean them
> up all at once. And the right way to clean up slabbed data like this is
> to just clear the slab.
>
> Of course somebody may use the code for something besides commit
> traversals. But I'd rather avoid adding dead code on the off chance that
> somebody uses it later (and then gets to find out whether it even works
> or not!).

Another thing I noticed was that the definition of, and the
commentary on, bitset_equal() and bitset_empty() sounded somewhat
"undecided".  These functions take a "max" that is deliberately
named differently from "num_bits" (the width of the bitsets
involved), inviting callers to use them for testing only the
earlier bits in the bitset as long as they understand the caveat;
but the caveat requires that the partial bitset to test be
byte-aligned, which makes it not very useful in practice, so we
probably do not want them to be used for any "max" other than
"num_bits".

They probably would want to either:

 * be made to truly honor the max < num_bits case, by special-casing
   the last byte that holds the max-th bit, to officially allow them
   to be used for partial bitset tests (sketched below); or

 * take "num_bits", not "max", to clarify that callers must use them
   only on the full bitset.
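
For what it's worth, the first option for bitset_empty() might look
like this (a sketch, not tested):

  static inline int bitset_empty(const unsigned char *bits, unsigned max)
  {
          unsigned full = max / CHAR_BIT, rest = max % CHAR_BIT;
          unsigned i;

          for (i = 0; i < full; i++)
                  if (bits[i])
                          return 0;
          /* mask off the bits past "max" in the final partial byte */
          if (rest && (bits[full] & ((1U << rest) - 1)))
                  return 0;
          return 1;
  }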

In either case, there needs to be another item in the "caller's
responsibility" list at the beginning of bitset.h:

    4. Ensure that padding bits at the end of the bitset array are
       initialized to 0.

In the description of bitset_sizeof(), the comment hints at it by
using xcalloc() in the example, but a careless user may be tempted
to implement bitset_clr() and then do:

        int i;
        unsigned char *bits = malloc(bitset_sizeof(nr));
        for (i = 0; i < nr; i++)
                bitset_clr(bits, i);
        assert(bitset_empty(bits, nr));

and the implementation of bitset_empty(), even if we rename
s/max/num_bits/, will choke if (nr % CHAR_BIT) != 0 and malloc()
gave us non-zero bits in the padding.

For the sake of simplicity, I am inclined to vote for not allowing
their use on a partial sub-bitset.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/8] paint_down_to_common: use prio_queue
  2014-07-01 16:23   ` Junio C Hamano
@ 2014-07-01 17:10     ` Jeff King
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-07-01 17:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jul 01, 2014 at 09:23:21AM -0700, Junio C Hamano wrote:

> > but with this patch, the positions of B and A are swapped.
> > This is probably fine, as the order is an internal
> > implementation detail anyway (it would _not_ be fine if we
> > were using a priority queue for "git log" traversal, which
> > should show commits in parent order).
> 
> Interesting that the queue is not "stable", but the test can still
> rely on a fixed output.

I think it is deterministic for a particular sequence of inserts/pops,
but not stable with respect to insertion order.

> I tend to agree that for the purpose of this code path the order
> is an internal implementation detail, but I wonder if it would
> benefit us to teach prio-queue to be optionally "stable", which
> would allow us to use it in other code paths that care.  If we
> really wanted to, I would imagine that we could keep an "insertion
> counter" in the elements of the queue to make the result stable
> (i.e. the "void **array" would become something like
> "struct { int insertion_ctr; void *thing; } *array").

Yeah, I think the reasons to be stable are:

  1. To be on the safe side for operations like this where it
     _shouldn't_ matter, but perhaps there are hidden dependencies we
     don't know of.

  2. To make it easier for later callers to use prio-queue for cases
     where it does matter (and I think "git log" is one of these).

If we can do it without a big performance loss (and I don't see any
reason it should be any worse than a slight bump to the constant-factor
of the logarithmic operations), it probably makes sense to.

I'll take a look at it (in fact, I already implemented something like it
once long ago in the thread I linked to earlier). My sense of taste says
it should be a stable_prio_queue implemented on top of prio_queue (i.e.,
storing pointers to the struct you mention above). That means you can
still use the unstable one if you want the (presumably minor)
performance benefit, and it keeps the logic nice and tidy.

But given that we have implemented prio_queue using void pointers, I
think it would introduce an extra pointer per item and an extra layer of
indirection on each access.  So maybe it is better to just build it in.

The low-cost alternative is to implement prio_queue to hold items of
arbitrary size. I'm not sure if that is worth the complexity and
maintenance cost.

> Heh, I should have read the below-three-dashes commentary before
> commenting (I often start working from the commit messages in "git
> log" and then go back to the original thread).

I always wonder how people read those. I tend to write them as if people
have (just) read the commit message, but not yet read the patch.

-Peff

PS Thanks for your earlier comments on the actual commit-slab painting
   algorithm. Responding to those is taking more thinking, and I haven't
   gotten to it yet, but it's on my agenda.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/8] add functions for memory-efficient bitmaps
  2014-07-01 16:57       ` Junio C Hamano
@ 2014-07-01 17:18         ` Jeff King
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-07-01 17:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, Git List

On Tue, Jul 01, 2014 at 09:57:13AM -0700, Junio C Hamano wrote:

> Another thing I noticed was that the definition of, and the
> commentary on, bitset_equal() and bitset_empty() sounded somewhat
> "undecided".  These functions take a "max" that is deliberately
> named differently from "num_bits" (the width of the bitsets
> involved), inviting callers to use them for testing only the
> earlier bits in the bitset as long as they understand the caveat;
> but the caveat requires that the partial bitset to test be
> byte-aligned, which makes it not very useful in practice, so we
> probably do not want them to be used for any "max" other than
> "num_bits".

Yeah, I added that comment because I found "max" to be confusing, but
couldn't think of a better name. I'm not sure why "num_bits" did not
occur to me, as that makes it completely obvious.

>  * take "num_bits", not "max", to clarify that callers must use them
>    only on the full bitset.

This seems like the right solution to me. Handling partially aligned
bytes adds to the complexity and may hurt performance (in fact, I think
bitset_equal could actually just call memcmp, which I should fix).
That's fine if callers care about that feature, but I actually don't
anticipate any that do.
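
For the record, the memcmp version would be a one-liner along these
lines (a sketch, not part of the posted series):

  #include <string.h>

  static inline int bitset_equal(const unsigned char *a,
                                 const unsigned char *b,
                                 unsigned num_bits)
  {
          /* padding bits past num_bits must be zeroed, per the header */
          return !memcmp(a, b, bitset_sizeof(num_bits));
  }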

By the way, I chose "unsigned char" as the storage format somewhat
arbitrarily. Performance might be better with "unsigned int" or even
"unsigned long". It means potentially wasting more space, but not more
than one word (minus a byte) per commit (so about 3MB on linux.git).
I'll try to do some timings to see if it's worth doing.

> In either case, there needs to be another item in the "caller's
> responsibility" list at the beginning of bitset.h:
> 
>     4. Ensure that padding bits at the end of the bitset array are
>        initialized to 0.

Agreed. That is definitely a requirement I had in mind, but I didn't
think to write it down.

I'll fix both points in the re-roll.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/8] string-list: add pos to iterator callback
  2014-06-25 23:42 ` [PATCH 5/8] string-list: add pos to iterator callback Jeff King
@ 2014-07-01 17:45   ` Junio C Hamano
  2014-07-01 19:00     ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2014-07-01 17:45 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> When we are running a string-list foreach or filter, the
> callback function sees only the string_list_item, along with
> a void* data pointer provided by the caller. This is
> sufficient for most purposes.
>
> However, it can also be useful to know the position of the
> item within the string (for example, if the data pointer

s/the string/&-list/ (or s/&/&_list/).

> points to a secondary array in which each element
> corresponds to part of the string list). We can help this
> use case by providing the position to each callback.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> The diff here is noisy, and I expect in the long run that the one caller
> I add to builtin/tag.c later in the series will eventually stop using
> string_list entirely (in favor of a custom struct), which may leave us
> with no callers that actually use the new field.
>
> I do think the logic above is sound, though, and it's a potentially
> useful thing. There may be other sites that avoid the for_each wrapper
> in favor of iterating themselves simply _because_ they needed to know
> the position (I would just do the same here, except that my new caller
> wants to use filter_string_list, which is a little more complicated).

While I can buy that some callers would want to learn the pos in the
list, I am not sure if this is a good direction to go.  Primarily, I
am having trouble sifting between "want" and "need".

How often do callers want to do this?  If only narrow, specialized
callers want this, is it unreasonable to ask them to add an "int ctr"
in their cb_data and use "pos = ((struct theirs *)cb_data)->ctr++"
in their callback instead?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
  2014-06-26 19:19     ` Junio C Hamano
  2014-06-26 19:26       ` Junio C Hamano
@ 2014-07-01 18:16       ` Junio C Hamano
  2014-07-01 19:14         ` Junio C Hamano
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2014-07-01 18:16 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> If you are trying to do "branch --with $commit", what you would want
> is not exactly "paint-down-to-common(all branches, $commit)", but
> something that paints down to $commit from all branches, with an
> optimization that cuts off the traversal at a commit that is
> reachable from $commit.  If a traversal from branch B reached such a
> commit that is reachable from $commit, you can stop the traversal
> because propagating the bit for B further down to its parents will
> not reach the $commit you are interested in.

I forgot about the other direction, i.e. "branch --merged $commit".
Since I do "git branch --no-merged pu" fairly often, I care about
its performance ;-)

We paint $commit as Left, and the tips of all the branches as Right.
We try to paint down from $commit, but the traversal cannot terminate
when it reaches a commit that is reachable from one Right ref---we
may find that we can reach more Right refs by following its
ancestors.  We can stop once we have painted the Right bits fully,
of course, but I wonder if we can do better than that.

Suppose we just painted a commit with the L bit and an Rn bit (i.e.
the commit is a common ancestor of $commit and one branch).  We know
that traversing down from there will never reach the tip of the
branch whose colour is Rn (otherwise we would have a cycle from that
commit back to the tip of the branch), so if the reason we are
continuing the traversal is to prove that the tip of the branch Rn
is reachable (or unreachable) from $commit, there is no reason to
continue digging from there.  Can we exploit that to terminate the
traversal earlier?

When we encounter a new commit that is painted with the L bit and
some but not necessarily all R bits, we propagate the bits and check
the R bits.  Some of the commits in the Right set that correspond to
those R bits may have been painted in L (i.e. we found branches that
are merged to $commit), and digging further from this commit will
not give us any new information.  Other commits are not painted in L
(i.e. we do not yet know whether $commit merges these branches), so
we may need to keep digging.  So perhaps we can stop if a commit is
painted in L and also has all the R bits that correspond to Right
commits that are not yet proven reachable from $commit (i.e. not
painted in L)?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/8] string-list: add pos to iterator callback
  2014-07-01 17:45   ` Junio C Hamano
@ 2014-07-01 19:00     ` Jeff King
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2014-07-01 19:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jul 01, 2014 at 10:45:19AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > When we are running a string-list foreach or filter, the
> > callback function sees only the string_list_item, along with
> > a void* data pointer provided by the caller. This is
> > sufficient for most purposes.
> >
> > However, it can also be useful to know the position of the
> > item within the string (for example, if the data pointer
> 
> s/the string/&-list/ (or s/&/&_list/).

Thanks, yeah.

> While I can buy that some callers would want to learn the pos in the
> list, I am not sure if this is a good direction to go.  Primarily, I
> am having trouble sifting between "want" and "need".
> 
> How often do callers want to do this?  If only narrow, specialized
> callers want this, is it unreasonable to ask them to add an "int ctr"
> in their cb_data and use "pos = ((struct theirs *)cb_data)->ctr++"
> in their callback instead?

Not all that often, I suppose (I only add one caller in this series). I
just found it a little hack-ish to force callers to keep their own
counter when we already have it (especially because it is not too hard
to get it wrong, for example by forgetting to increment the counter in
one code path of the callback).

Here's what the caller would look like without "pos" (done as a patch on
top of the series, so pos is still there, but no longer used).

diff --git a/builtin/tag.c b/builtin/tag.c
index f17192c..dc6f105 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -151,14 +151,20 @@ static int util_is_not_null(struct string_list_item *it, int pos, void *data)
 	return !!it->util;
 }
 
-static int matches_contains(struct string_list_item *it, int pos, void *data)
+struct contains_callback_data {
+	int ctr;
+	unsigned char *contains;
+};
+
+static int matches_contains(struct string_list_item *it, int pos, void *vdata)
 {
-	unsigned char *contains = data;
-	return contains[pos];
+	struct contains_callback_data *data = vdata;
+	return data->contains[data->ctr++];
 }
 
 static void limit_by_contains(struct string_list *tags, struct commit_list *with)
 {
+	struct contains_callback_data cbdata;
 	struct commit_list *tag_commits = NULL, **tail = &tag_commits;
 	unsigned char *result;
 	int i;
@@ -180,7 +186,10 @@ static void limit_by_contains(struct string_list *tags, struct commit_list *with
 
 	result = xmalloc(tags->nr);
 	commit_contains(with, tag_commits, NULL, result);
-	filter_string_list(tags, 1, matches_contains, result);
+
+	cbdata.ctr = 0;
+	cbdata.contains = result;
+	filter_string_list(tags, 1, matches_contains, &cbdata);
 
 	free(result);
 	free_commit_list(tag_commits);

So I think it's a small change in a lot of places rather than a kind of
ugly change in one spot.

All that being said, I think the real issue here is that I want more
than a string list (I am already using the util field for the sha1, but
this code would be potentially simpler if I could also store the commit
object). In the long run I hope to factor out a ref-listing API that can
be used by tag, branch, and for-each-ref. And then this string-list code
would go away in favor of a more expansive struct. So it may not be
worth worrying about keeping it too clean.

-Peff

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
  2014-07-01 18:16       ` Junio C Hamano
@ 2014-07-01 19:14         ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2014-07-01 19:14 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> I forgot about the other direction, i.e. "branch --merged $commit".
> Since I do "git branch --no-merged pu" fairly often, I care about
> its performance ;-)
>
> We paint $commit as Left, and the tips of all the branches as Right.
> We try to paint down from $commit, but the traversal cannot terminate
> when it reaches a commit that is reachable from one Right ref---we
> may find that we can reach more Right refs by following its
> ancestors.  We can stop once we have painted the Right bits fully,
> of course, but I wonder if we can do better than that.
>
> Suppose we just painted a commit with the L bit and an Rn bit (i.e.
> the commit is a common ancestor of $commit and one branch).  We know
> that traversing down from there will never reach the tip of the
> branch whose colour is Rn (otherwise we would have a cycle from that
> commit back to the tip of the branch), so if the reason we are
> continuing the traversal is to prove that the tip of the branch Rn
> is reachable (or unreachable) from $commit, there is no reason to
> continue digging from there.  Can we exploit that to terminate the
> traversal earlier?

I forgot to mention this case because with "branch --no-merged pu"
it never happens, but another clue that lets us terminate the
traversal earlier is when $commit is found to be an ancestor of some
Right commits.  Then we can start ignoring the Rn bits for those
Right commits that can reach the Left one, as we know they can never
be reached from the Left.  That is, the last sentence in the
paragraph ...

> When we encounter a new commit that is painted with the L bit and
> some but not necessarily all R bits, we propagate the bits and check
> the R bits.  Some of the commits in the Right set that correspond to
> those R bits may have been painted in L (i.e. we found branches that
> are merged to $commit), and digging further from this commit will
> not give us any new information.  Other commits are not painted in L
> (i.e. we do not yet know whether $commit merges these branches), so
> we may need to keep digging.  So perhaps we can stop if a commit is
> painted in L and also has all the R bits that correspond to Right
> commits that are not yet proven reachable from $commit (i.e. not
> painted in L)?

... will be more like "ignore the Rn bits for Right commits that
are painted in L (i.e. they are reachable from L) or whose bits the
Left commit is painted with (i.e. they reach L)".
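
A sketch of that combined test, with everything about it (names
included) hypothetical:

  /* Stop digging below a commit if it is painted in L and already
   * carries every R bit that is still undecided. */
  static int can_stop_here(unsigned char *c_left, unsigned left_nr,
                           unsigned char *c_right,
                           unsigned char *undecided, unsigned right_nr)
  {
          unsigned i;

          if (bitset_empty(c_left, left_nr))
                  return 0;       /* not reachable from $commit at all */
          for (i = 0; i < right_nr; i++)
                  if (bitset_get(undecided, i) && !bitset_get(c_right, i))
                          return 0;
          return 1;
  }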

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-07-01 19:14 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-25 23:34 [PATCH 0/8] use merge-base for tag --contains Jeff King
2014-06-25 23:35 ` [PATCH 1/8] tag: allow --sort with -n Jeff King
2014-06-25 23:35 ` [PATCH 2/8] tag: factor out decision to stream tags Jeff King
2014-06-25 23:39 ` [PATCH 3/8] paint_down_to_common: use prio_queue Jeff King
2014-07-01 16:23   ` Junio C Hamano
2014-07-01 17:10     ` Jeff King
2014-06-25 23:40 ` [PATCH 4/8] add functions for memory-efficient bitmaps Jeff King
2014-06-26  3:15   ` Torsten Bögershausen
2014-06-26 15:51     ` Jeff King
2014-06-29  7:41   ` Eric Sunshine
2014-06-30 17:07     ` Jeff King
2014-07-01 16:57       ` Junio C Hamano
2014-07-01 17:18         ` Jeff King
2014-06-25 23:42 ` [PATCH 5/8] string-list: add pos to iterator callback Jeff King
2014-07-01 17:45   ` Junio C Hamano
2014-07-01 19:00     ` Jeff King
2014-06-25 23:47 ` [PATCH 6/8] commit: provide a fast multi-tip contains function Jeff King
2014-06-26 18:55   ` Junio C Hamano
2014-06-26 19:19     ` Junio C Hamano
2014-06-26 19:26       ` Junio C Hamano
2014-07-01 18:16       ` Junio C Hamano
2014-07-01 19:14         ` Junio C Hamano
2014-06-25 23:49 ` [PATCH 7/8] tag: use commit_contains Jeff King
2014-06-25 23:53 ` [PATCH 8/8] perf: add tests for tag --contains Jeff King
2014-06-26  0:01   ` Jeff King
2014-06-26  0:04     ` Jeff King
