All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/1] ls-refs.c: minimize number of refs visited
@ 2021-01-19 14:42 Jacob Vosmaer
  2021-01-19 14:42 ` [PATCH 1/1] " Jacob Vosmaer
  0 siblings, 1 reply; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-19 14:42 UTC (permalink / raw)
  To: git; +Cc: Jacob Vosmaer

The back story of this patch is in
https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746; it
comes down to: we have a repo with 500,000 refs of which only 50,000
are visible. Having ls-refs iterate through all the refs adds
considerable overhead.

This patch reduces the number of refs visited by replacing one ref
walk over refs/ with multiple ref walks over the prefixes the user
requested via ref-prefix. In the case the user did not use ref-prefix
we fall back to the walk over all of refs/.

Jacob Vosmaer (1):
  ls-refs.c: minimize number of refs visited

 ls-refs.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 14:42 [PATCH 0/1] ls-refs.c: minimize number of refs visited Jacob Vosmaer
@ 2021-01-19 14:42 ` Jacob Vosmaer
  2021-01-19 16:12   ` Taylor Blau
  2021-01-19 22:53   ` Jeff King
  0 siblings, 2 replies; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-19 14:42 UTC (permalink / raw)
  To: git; +Cc: Jacob Vosmaer

The previous implementation of ls-refs would perform exactly one ref
walk, matching each ref against the prefixes (if any) provided by the
user. This can be expensive if there are a lot of refs and the user
only cares about a small subset of them.

In this patch we analyze the prefixes provided by the user and build a
minimal set of disjoint prefixes that contains all of them. We then do
a ref walk for each of these minimal prefixes.

It is tempting to have just one strvec for the prefixes and use it
both for matching and for iterating. But every time I tried that, it
made things more complicated. I settled on leaving the existing ref
matching (using &data.prefixes) alone, and I added a new layer around
it for the ref walk optimization (using &iter_prefixes).

This commit also fixes a bug in ls-refs.c that was not triggered
before: we were using a strvec set to zero, which is not how you are
supposed to initialize a strvec. We now call strvec_init after zeroing.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
---
 ls-refs.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..6d5f0c769a 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -84,12 +84,44 @@ static int ls_refs_config(const char *var, const char *value, void *data)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int cmp_prefix(const void *a_, const void *b_){
+	const char *a = *(const char **)a_;
+	const char *b = *(const char **)b_;
+	return strcmp(a, b);
+}
+
+static void deduplicate_prefixes(struct strvec *prefixes) {
+	int i;
+
+	QSORT(prefixes->v, prefixes->nr, cmp_prefix);
+
+	for (i = 1; i < prefixes->nr;) {
+		const char *p = prefixes->v[i];
+
+		/*
+		 * If p is "refs/foobar" and its predecessor is "refs/foo" then we should
+		 * drop p, both to avoid sending duplicate refs to the user, and to avoid
+		 * doing unnecessary work.
+		 */
+		if (starts_with(p, prefixes->v[i - 1])) {
+			MOVE_ARRAY(&prefixes->v[i], &prefixes->v[i + 1], prefixes->nr - (i + 1));
+			prefixes->v[prefixes->nr - 1] = p;
+			strvec_pop(prefixes);
+		} else {
+			i++;
+		}
+	}
+}
+
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request)
 {
 	struct ls_refs_data data;
+	struct strvec iter_prefixes = STRVEC_INIT;
+	const char **p;
 
 	memset(&data, 0, sizeof(data));
+	strvec_init(&data.prefixes);
 
 	git_config(ls_refs_config, NULL);
 
@@ -109,8 +141,37 @@ int ls_refs(struct repository *r, struct strvec *keys,
 		die(_("expected flush after ls-refs arguments"));
 
 	head_ref_namespaced(send_ref, &data);
-	for_each_namespaced_ref(send_ref, &data);
+
+	/*
+	 * Try to create a minimal disjoint set of prefixes that
+	 * contains everything the user wants to see, but that lets us
+	 * avoid a full ref walk. If the repo has a lot of refs and
+	 * the user only wants to see refs/heads and refs/tags,
+	 * it is faster to do two small walks of refs/heads and
+	 * refs/tags instead of one big walk of all of refs/.
+	 */
+	for (p = data.prefixes.v; *p; p++) {
+		if (starts_with(*p, "refs/")) {
+			strvec_push(&iter_prefixes, *p);
+		} else { /* full ref walk in pathological cases like "" */
+			strvec_push(&iter_prefixes, "refs/");
+		}
+	}
+
+	if (!iter_prefixes.nr) /* full ref walk when there are no prefixes */
+		strvec_push(&iter_prefixes, "refs/");
+
+	deduplicate_prefixes(&iter_prefixes);
+
+	for (p = iter_prefixes.v; *p; p++) {
+		struct strbuf buf = STRBUF_INIT;
+		strbuf_addf(&buf, "%s%s", get_git_namespace(), *p);
+		for_each_fullref_in(buf.buf, send_ref, &data, 0);
+		strbuf_release(&buf);
+	}
+
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
+	strvec_clear(&iter_prefixes);
 	return 0;
 }
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 14:42 ` [PATCH 1/1] " Jacob Vosmaer
@ 2021-01-19 16:12   ` Taylor Blau
  2021-01-19 17:42     ` Jacob Vosmaer
  2021-01-19 22:53   ` Jeff King
  1 sibling, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 16:12 UTC (permalink / raw)
  To: Jacob Vosmaer; +Cc: git

Hi Jacob,

On Tue, Jan 19, 2021 at 03:42:51PM +0100, Jacob Vosmaer wrote:
> The previous implementation of ls-refs would perform exactly one ref
> walk, matching each ref against the prefixes (if any) provided by the
> user. This can be expensive if there are a lot of refs and the user
> only cares about a small subset of them.
>
> In this patch we analyze the prefixes provided by the user and build a
> minimal set of disjoint prefixes that contains all of them. We then do
> a ref walk for each of these minimal prefixes.

This reminds me of b31e2680c4 (ref-filter.c: find disjoint pattern
prefixes, 2019-06-26), where we solved a very similar problem for 'git
for-each-ref'. The difference here is that we are operating on a set of
prefixes, not a set of refs.

But, I think that we could get pretty far by treating the prefixes as
refs so that we can call ref-filter.c:find_longest_prefixes(). For its
purposes, it doesn't really care about whether or not the arguments
actually are references. It simply returns the longest common prefix
among all of its arguments (delimited by '/' characters).

> It is tempting to have just one strvec for the prefixes and use it
> both for matching and for iterating. But every time I tried that, it
> made things more complicated. I settled on leaving the existing ref
> matching (using &data.prefixes) alone, and I added a new layer around
> it for the ref walk optimization (using &iter_prefixes).

I think the implementation in b31e2680c4 deals with this nicely: it
takes a pointer to a strvec and dumps prefixes in there.

> This commit also fixes a bug in ls-refs.c that was not triggered
> before: we were using a strvec set to zero, which is not how you are
> supposed to initialize a strvec. We now call strvec_init after zeroing.

Good.

> Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
> ---
>  ls-refs.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..6d5f0c769a 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -84,12 +84,44 @@ static int ls_refs_config(const char *var, const char *value, void *data)
>  	return parse_hide_refs_config(var, value, "uploadpack");
>  }
>
> +static int cmp_prefix(const void *a_, const void *b_){
> +	const char *a = *(const char **)a_;
> +	const char *b = *(const char **)b_;
> +	return strcmp(a, b);
> +}
> +
> +static void deduplicate_prefixes(struct strvec *prefixes) {
> +	int i;
> +
> +	QSORT(prefixes->v, prefixes->nr, cmp_prefix);
> +
> +	for (i = 1; i < prefixes->nr;) {
> +		const char *p = prefixes->v[i];
> +
> +		/*
> +		 * If p is "refs/foobar" and its predecessor is "refs/foo" then we should
> +		 * drop p, both to avoid sending duplicate refs to the user, and to avoid
> +		 * doing unnecessary work.
> +		 */
> +		if (starts_with(p, prefixes->v[i - 1])) {
> +			MOVE_ARRAY(&prefixes->v[i], &prefixes->v[i + 1], prefixes->nr - (i + 1));
> +			prefixes->v[prefixes->nr - 1] = p;
> +			strvec_pop(prefixes);
> +		} else {
> +			i++;
> +		}
> +	}
> +}
> +

Indeed, this and the below code are very reminiscent of b31e2680c4. So,
I wonder if it's possible to use the existing implementation rather than
implement what is roughly the same thing twice.

Below is a completely untested patch to try and reuse the code from
b31e2680c4. (It compiles, but that's the extent of my guarantees about
it ;-).) It's all smashed into one huge patch, so if you're happy with
the direction I'll take care of cleaning it up. The new function in
ref-filter.h really belongs in refs.h, but I left the implementation in
ref-filter.c to avoid creating more noise in the diff.

Let me know what you think.

Thanks,
Taylor

--- >8 ---

Subject: [PATCH] ls-refs: iterate longest common refs prefix

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ls-refs.c    |  4 +++-
 ref-filter.c | 46 ++++++++++++++++++++++++++++++++--------------
 ref-filter.h | 10 ++++++++++
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..6a3e11d45c 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -6,6 +6,7 @@
 #include "ls-refs.h"
 #include "pkt-line.h"
 #include "config.h"
+#include "ref-filter.h"

 /*
  * Check if one of the prefixes is a prefix of the ref.
@@ -109,7 +110,8 @@ int ls_refs(struct repository *r, struct strvec *keys,
 		die(_("expected flush after ls-refs arguments"));

 	head_ref_namespaced(send_ref, &data);
-	for_each_namespaced_ref(send_ref, &data);
+	for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
+				     send_ref, &data, 0);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
diff --git a/ref-filter.c b/ref-filter.c
index aa260bfd09..c34bf34d06 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1987,6 +1987,36 @@ static void find_longest_prefixes(struct string_list *out,
 	strbuf_release(&prefix);
 }

+int for_each_fullref_in_prefixes(const char *namespace,
+				 const char **patterns,
+				 each_ref_fn cb,
+				 void *cb_data,
+				 int broken)
+{
+	struct string_list prefixes = STRING_LIST_INIT_DUP;
+	struct string_list_item *prefix;
+	struct strbuf buf = STRBUF_INIT;
+	int ret = 0, namespace_len;
+
+	find_longest_prefixes(&prefixes, patterns);
+
+	if (namespace)
+		strbuf_addstr(&buf, namespace);
+	namespace_len = buf.len;
+
+	for_each_string_list_item(prefix, &prefixes) {
+		strbuf_addf(&buf, prefix->string);
+		ret = for_each_fullref_in(buf.buf, cb, cb_data, broken);
+		if (ret)
+			break;
+		strbuf_setlen(&buf, namespace_len);
+	}
+
+	string_list_clear(&prefixes, 0);
+	strbuf_release(&buf);
+	return ret;
+}
+
 /*
  * This is the same as for_each_fullref_in(), but it tries to iterate
  * only over the patterns we'll care about. Note that it _doesn't_ do a full
@@ -1997,10 +2027,6 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       void *cb_data,
 				       int broken)
 {
-	struct string_list prefixes = STRING_LIST_INIT_DUP;
-	struct string_list_item *prefix;
-	int ret;
-
 	if (!filter->match_as_path) {
 		/*
 		 * in this case, the patterns are applied after
@@ -2024,16 +2050,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 		return for_each_fullref_in("", cb, cb_data, broken);
 	}

-	find_longest_prefixes(&prefixes, filter->name_patterns);
-
-	for_each_string_list_item(prefix, &prefixes) {
-		ret = for_each_fullref_in(prefix->string, cb, cb_data, broken);
-		if (ret)
-			break;
-	}
-
-	string_list_clear(&prefixes, 0);
-	return ret;
+	return for_each_fullref_in_prefixes(NULL, filter->name_patterns,
+					    cb, cb_data, broken);
 }

 /*
diff --git a/ref-filter.h b/ref-filter.h
index feaef4a8fd..f666a0fb49 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -146,4 +146,14 @@ struct ref_array_item *ref_array_push(struct ref_array *array,
 				      const char *refname,
 				      const struct object_id *oid);

+/**
+ * iterate all refs which descend from the longest common prefix among
+ * "patterns".
+ */
+int for_each_fullref_in_prefixes(const char *namespace,
+				 const char **patterns,
+				 each_ref_fn cb,
+				 void *cb_data,
+				 int broken);
+
 #endif /*  REF_FILTER_H  */
--
2.30.0.138.g6d7191ea01


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 16:12   ` Taylor Blau
@ 2021-01-19 17:42     ` Jacob Vosmaer
  2021-01-19 18:19       ` [PATCH 0/2] ls-refs: only traverse through longest common ref prefix Taylor Blau
  2021-01-19 19:09       ` [PATCH 1/1] ls-refs.c: minimize number of refs visited Taylor Blau
  0 siblings, 2 replies; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-19 17:42 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List

Hi Taylor,

Thanks for your reply. That sounds like a great idea!

On Tue, Jan 19, 2021 at 5:12 PM Taylor Blau <me@ttaylorr.com> wrote:
> But, I think that we could get pretty far by treating the prefixes as
> refs so that we can call ref-filter.c:find_longest_prefixes(). For its
> purposes, it doesn't really care about whether or not the arguments
> actually are references. It simply returns the longest common prefix
> among all of its arguments (delimited by '/' characters).

What does "delimited by /" mean?

Without really understanding the longest common prefix code in
ref-filter.c, my intuitive concern is that the specifics of glob
matching and special treatment of '/' may bite us. I suppose we'll be
fine because ls-refs has its own matching logic. So long as
for_each_fullref_in_prefixes yield enough prefixes, the end result
would remain the same.

The question is then, does for_each_fullref_in_prefixes yield
everything we need?

I think my approach would be to expose the new
for_each_fullref_in_prefixes iterator you propose through test-tool,
and unit test it so we can be sure it handles both contexts
(for-each-refs with globs and special '/', and ls-refs without any
special character behavior) correctly.

I may be overly cautious here, take this with a grain of salt because
I am not an experienced Git contributor. On that topic, apologies if
I'm botching my inline replies in this email.

Regarding your patch: it works correctly and as fast as expected for
my development "many refs" test case. Yay! It also segfaults and fails
some tests but see my comments below.

All in all: thanks, great idea, yes we should reuse, I only lack
confidence on correctness because I don't fully grasp your
longest-common-prefix algorithm yet.

Cheers, Jacob


> --- >8 ---
>
> Subject: [PATCH] ls-refs: iterate longest common refs prefix
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  ls-refs.c    |  4 +++-
>  ref-filter.c | 46 ++++++++++++++++++++++++++++++++--------------
>  ref-filter.h | 10 ++++++++++
>  3 files changed, 45 insertions(+), 15 deletions(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..6a3e11d45c 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -6,6 +6,7 @@
>  #include "ls-refs.h"
>  #include "pkt-line.h"
>  #include "config.h"
> +#include "ref-filter.h"
>
>  /*
>   * Check if one of the prefixes is a prefix of the ref.
> @@ -109,7 +110,8 @@ int ls_refs(struct repository *r, struct strvec *keys,
>                 die(_("expected flush after ls-refs arguments"));
>
>         head_ref_namespaced(send_ref, &data);
> -       for_each_namespaced_ref(send_ref, &data);
> +       for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
> +                                    send_ref, &data, 0);

Remember to add that strvec_init(&data.prefixes) I talked about in my
commit message, or you get a segfault.

You also need:  if (!data.prefixes.nr) strvec_push(&data.prefixes, "refs/");

Without it, the "ls-refs without ref-prefix" scenario fails.

>         packet_flush(1);
>         strvec_clear(&data.prefixes);
>         return 0;
> diff --git a/ref-filter.c b/ref-filter.c
> index aa260bfd09..c34bf34d06 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1987,6 +1987,36 @@ static void find_longest_prefixes(struct string_list *out,
>         strbuf_release(&prefix);
>  }
>
> +int for_each_fullref_in_prefixes(const char *namespace,
> +                                const char **patterns,
> +                                each_ref_fn cb,
> +                                void *cb_data,
> +                                int broken)
> +{
> +       struct string_list prefixes = STRING_LIST_INIT_DUP;
> +       struct string_list_item *prefix;
> +       struct strbuf buf = STRBUF_INIT;
> +       int ret = 0, namespace_len;
> +
> +       find_longest_prefixes(&prefixes, patterns);
> +
> +       if (namespace)
> +               strbuf_addstr(&buf, namespace);
> +       namespace_len = buf.len;
> +
> +       for_each_string_list_item(prefix, &prefixes) {
> +               strbuf_addf(&buf, prefix->string);

Missing format string "%s".


> +               ret = for_each_fullref_in(buf.buf, cb, cb_data, broken);
> +               if (ret)
> +                       break;
> +               strbuf_setlen(&buf, namespace_len);
> +       }
> +
> +       string_list_clear(&prefixes, 0);
> +       strbuf_release(&buf);
> +       return ret;
> +}
> +
>  /*
>   * This is the same as for_each_fullref_in(), but it tries to iterate
>   * only over the patterns we'll care about. Note that it _doesn't_ do a full
> @@ -1997,10 +2027,6 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
>                                        void *cb_data,
>                                        int broken)
>  {
> -       struct string_list prefixes = STRING_LIST_INIT_DUP;
> -       struct string_list_item *prefix;
> -       int ret;
> -
>         if (!filter->match_as_path) {
>                 /*
>                  * in this case, the patterns are applied after
> @@ -2024,16 +2050,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
>                 return for_each_fullref_in("", cb, cb_data, broken);
>         }
>
> -       find_longest_prefixes(&prefixes, filter->name_patterns);
> -
> -       for_each_string_list_item(prefix, &prefixes) {
> -               ret = for_each_fullref_in(prefix->string, cb, cb_data, broken);
> -               if (ret)
> -                       break;
> -       }
> -
> -       string_list_clear(&prefixes, 0);
> -       return ret;
> +       return for_each_fullref_in_prefixes(NULL, filter->name_patterns,
> +                                           cb, cb_data, broken);
>  }
>
>  /*
> diff --git a/ref-filter.h b/ref-filter.h
> index feaef4a8fd..f666a0fb49 100644
> --- a/ref-filter.h
> +++ b/ref-filter.h
> @@ -146,4 +146,14 @@ struct ref_array_item *ref_array_push(struct ref_array *array,
>                                       const char *refname,
>                                       const struct object_id *oid);
>
> +/**
> + * iterate all refs which descend from the longest common prefix among
> + * "patterns".
> + */
> +int for_each_fullref_in_prefixes(const char *namespace,
> +                                const char **patterns,
> +                                each_ref_fn cb,
> +                                void *cb_data,
> +                                int broken);
> +
>  #endif /*  REF_FILTER_H  */
> --
> 2.30.0.138.g6d7191ea01
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 0/2] ls-refs: only traverse through longest common ref prefix
  2021-01-19 17:42     ` Jacob Vosmaer
@ 2021-01-19 18:19       ` Taylor Blau
  2021-01-19 18:19         ` [PATCH 1/2] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
                           ` (2 more replies)
  2021-01-19 19:09       ` [PATCH 1/1] ls-refs.c: minimize number of refs visited Taylor Blau
  1 sibling, 3 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 18:19 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

Here is a series inspired by Jacob Vosmaer's recent patch to only traverse once
through the longest common prefix of given ref-prefixes in ls-refs.

Instead of a hand-rolled implementation, this uses the one written back in
b31e2680c4 (ref-filter.c: find disjoint pattern prefixes, 2019-06-26). The first
patch exposes that algorithm in refs.h, and the second patch uses the new
function.

If Jacob is happy with this direction, I'd suggest that we use this series as a
replacement for his patch.

Thanks in advance for your review.

Taylor Blau (2):
  refs: expose 'for_each_fullref_in_prefixes'
  ls-refs.c: traverse longest common ref prefix

 ls-refs.c    |  6 +++-
 ref-filter.c | 74 ++------------------------------------------
 refs.c       | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 refs.h       |  7 +++++
 4 files changed, 101 insertions(+), 73 deletions(-)

--
2.30.0.138.g6d7191ea01

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/2] refs: expose 'for_each_fullref_in_prefixes'
  2021-01-19 18:19       ` [PATCH 0/2] ls-refs: only traverse through longest common ref prefix Taylor Blau
@ 2021-01-19 18:19         ` Taylor Blau
  2021-01-19 18:19         ` [PATCH 2/2] ls-refs.c: traverse longest common ref prefix Taylor Blau
  2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
  2 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 18:19 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

This function was used in the ref-filter.c code to find the longest
common prefix of among a set of refspecs, and then to iterate all of the
references that descend from that prefix.

The subsequent patch will want to use that same code from ls-refs, so
prepare by exposing and moving it to refs.c. Since there is nothing
specific to the ref-filter code here (other than that it was previously
the only caller of this function), this really belongs in the more
generic refs.h header.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ref-filter.c | 74 ++------------------------------------------
 refs.c       | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 refs.h       |  7 +++++
 3 files changed, 96 insertions(+), 72 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index aa260bfd09..f918f00151 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1929,64 +1929,6 @@ static int filter_pattern_match(struct ref_filter *filter, const char *refname)
 	return match_pattern(filter, refname);
 }
 
-static int qsort_strcmp(const void *va, const void *vb)
-{
-	const char *a = *(const char **)va;
-	const char *b = *(const char **)vb;
-
-	return strcmp(a, b);
-}
-
-static void find_longest_prefixes_1(struct string_list *out,
-				  struct strbuf *prefix,
-				  const char **patterns, size_t nr)
-{
-	size_t i;
-
-	for (i = 0; i < nr; i++) {
-		char c = patterns[i][prefix->len];
-		if (!c || is_glob_special(c)) {
-			string_list_append(out, prefix->buf);
-			return;
-		}
-	}
-
-	i = 0;
-	while (i < nr) {
-		size_t end;
-
-		/*
-		* Set "end" to the index of the element _after_ the last one
-		* in our group.
-		*/
-		for (end = i + 1; end < nr; end++) {
-			if (patterns[i][prefix->len] != patterns[end][prefix->len])
-				break;
-		}
-
-		strbuf_addch(prefix, patterns[i][prefix->len]);
-		find_longest_prefixes_1(out, prefix, patterns + i, end - i);
-		strbuf_setlen(prefix, prefix->len - 1);
-
-		i = end;
-	}
-}
-
-static void find_longest_prefixes(struct string_list *out,
-				  const char **patterns)
-{
-	struct strvec sorted = STRVEC_INIT;
-	struct strbuf prefix = STRBUF_INIT;
-
-	strvec_pushv(&sorted, patterns);
-	QSORT(sorted.v, sorted.nr, qsort_strcmp);
-
-	find_longest_prefixes_1(out, &prefix, sorted.v, sorted.nr);
-
-	strvec_clear(&sorted);
-	strbuf_release(&prefix);
-}
-
 /*
  * This is the same as for_each_fullref_in(), but it tries to iterate
  * only over the patterns we'll care about. Note that it _doesn't_ do a full
@@ -1997,10 +1939,6 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       void *cb_data,
 				       int broken)
 {
-	struct string_list prefixes = STRING_LIST_INIT_DUP;
-	struct string_list_item *prefix;
-	int ret;
-
 	if (!filter->match_as_path) {
 		/*
 		 * in this case, the patterns are applied after
@@ -2024,16 +1962,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 		return for_each_fullref_in("", cb, cb_data, broken);
 	}
 
-	find_longest_prefixes(&prefixes, filter->name_patterns);
-
-	for_each_string_list_item(prefix, &prefixes) {
-		ret = for_each_fullref_in(prefix->string, cb, cb_data, broken);
-		if (ret)
-			break;
-	}
-
-	string_list_clear(&prefixes, 0);
-	return ret;
+	return for_each_fullref_in_prefixes(NULL, filter->name_patterns,
+					    cb, cb_data, broken);
 }
 
 /*
diff --git a/refs.c b/refs.c
index 13dc2c3291..0b5a68588f 100644
--- a/refs.c
+++ b/refs.c
@@ -1546,6 +1546,93 @@ int for_each_rawref(each_ref_fn fn, void *cb_data)
 	return refs_for_each_rawref(get_main_ref_store(the_repository), fn, cb_data);
 }
 
+static int qsort_strcmp(const void *va, const void *vb)
+{
+	const char *a = *(const char **)va;
+	const char *b = *(const char **)vb;
+
+	return strcmp(a, b);
+}
+
+static void find_longest_prefixes_1(struct string_list *out,
+				  struct strbuf *prefix,
+				  const char **patterns, size_t nr)
+{
+	size_t i;
+
+	for (i = 0; i < nr; i++) {
+		char c = patterns[i][prefix->len];
+		if (!c || is_glob_special(c)) {
+			string_list_append(out, prefix->buf);
+			return;
+		}
+	}
+
+	i = 0;
+	while (i < nr) {
+		size_t end;
+
+		/*
+		* Set "end" to the index of the element _after_ the last one
+		* in our group.
+		*/
+		for (end = i + 1; end < nr; end++) {
+			if (patterns[i][prefix->len] != patterns[end][prefix->len])
+				break;
+		}
+
+		strbuf_addch(prefix, patterns[i][prefix->len]);
+		find_longest_prefixes_1(out, prefix, patterns + i, end - i);
+		strbuf_setlen(prefix, prefix->len - 1);
+
+		i = end;
+	}
+}
+
+static void find_longest_prefixes(struct string_list *out,
+				  const char **patterns)
+{
+	struct strvec sorted = STRVEC_INIT;
+	struct strbuf prefix = STRBUF_INIT;
+
+	strvec_pushv(&sorted, patterns);
+	QSORT(sorted.v, sorted.nr, qsort_strcmp);
+
+	find_longest_prefixes_1(out, &prefix, sorted.v, sorted.nr);
+
+	strvec_clear(&sorted);
+	strbuf_release(&prefix);
+}
+
+int for_each_fullref_in_prefixes(const char *namespace,
+				 const char **patterns,
+				 each_ref_fn fn, void *cb_data,
+				 unsigned int broken)
+{
+	struct string_list prefixes = STRING_LIST_INIT_DUP;
+	struct string_list_item *prefix;
+	struct strbuf buf = STRBUF_INIT;
+	int ret = 0, namespace_len;
+
+	find_longest_prefixes(&prefixes, patterns);
+
+	if (namespace)
+		strbuf_addstr(&buf, namespace);
+	namespace_len = buf.len;
+
+	for_each_string_list_item(prefix, &prefixes) {
+		strbuf_addf(&buf, "%s", prefix->string);
+		ret = for_each_fullref_in(buf.buf, fn, cb_data, broken);
+		if (ret)
+			break;
+		strbuf_setlen(&buf, namespace_len);
+	}
+
+	string_list_clear(&prefixes, 0);
+	strbuf_release(&buf);
+	return ret;
+}
+
 static int refs_read_special_head(struct ref_store *ref_store,
 				  const char *refname, struct object_id *oid,
 				  struct strbuf *referent, unsigned int *type)
diff --git a/refs.h b/refs.h
index ff05d2e9fe..e88a794ee6 100644
--- a/refs.h
+++ b/refs.h
@@ -347,6 +347,13 @@ int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
 int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data,
 			unsigned int broken);
 
+/**
+ * iterate all refs which are descendent from the longest common prefix among
+ * the list "patterns".
+ */
+int for_each_fullref_in_prefixes(const char *namespace, const char **patterns,
+				 each_ref_fn fn, void *cb_data,
+				 unsigned int broken);
 /**
  * iterate refs from the respective area.
  */
-- 
2.30.0.138.g6d7191ea01


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/2] ls-refs.c: traverse longest common ref prefix
  2021-01-19 18:19       ` [PATCH 0/2] ls-refs: only traverse through longest common ref prefix Taylor Blau
  2021-01-19 18:19         ` [PATCH 1/2] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
@ 2021-01-19 18:19         ` Taylor Blau
  2021-01-19 23:09           ` Jeff King
  2021-01-20 11:00           ` Jacob Vosmaer
  2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
  2 siblings, 2 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 18:19 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

ls-refs performs a single revision walk over the whole ref namespace,
and sends ones that match with one of the given ref prefixes down to the
user.

This can be expensive if there are many refs overall, but the portion of
them covered by the given prefixes is small by comparison.

To attempt to reduce the difference between the number of refs
traversed, and the number of refs sent, only traverse references which
are in the longest common prefix of the given prefixes. This is very
reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
multi-patterned 'git for-each-ref' invocations.

The only difference here is that we are operating on ref prefixes, which
do not necessarily point to a single reference. That is just fine, since
all we care about is finding the longest common prefix among prefixes
which can be thought of as refspecs for our purposes here.

Similarly, for_each_fullref_in_prefixes may return more results than the
caller asked for (since the longest common prefix might match something
that a longer prefix in the same set wouldn't match) but
ls-refs.c:send_ref() discards such results.

The code introduced in b31e2680c4 is resilient to stop early (and
return a shorter prefix) when it encounters a metacharacter (as
mentioned in that patch, there is some opportunity to improve this, but
nobody has done it).

There are two remaining small items in this patch:

  - If no prefixes were provided, then implicitly add the empty string
    (which will match all references).

  - Since we are manually munging the prefixes, make sure that we
    initialize it ourselves (previously this wasn't necessary since the
    first strvec_push would do so).

Original-patch-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ls-refs.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..eaaa36d0df 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -90,6 +90,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 	struct ls_refs_data data;
 
 	memset(&data, 0, sizeof(data));
+	strvec_init(&data.prefixes);
 
 	git_config(ls_refs_config, NULL);
 
@@ -109,7 +110,10 @@ int ls_refs(struct repository *r, struct strvec *keys,
 		die(_("expected flush after ls-refs arguments"));
 
 	head_ref_namespaced(send_ref, &data);
-	for_each_namespaced_ref(send_ref, &data);
+	if (!data.prefixes.nr)
+		strvec_push(&data.prefixes, "");
+	for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
+				     send_ref, &data, 0);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
-- 
2.30.0.138.g6d7191ea01

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 17:42     ` Jacob Vosmaer
  2021-01-19 18:19       ` [PATCH 0/2] ls-refs: only traverse through longest common ref prefix Taylor Blau
@ 2021-01-19 19:09       ` Taylor Blau
  2021-01-19 21:59         ` Jeff King
  1 sibling, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 19:09 UTC (permalink / raw)
  To: Jacob Vosmaer; +Cc: Taylor Blau, Git Mailing List

On Tue, Jan 19, 2021 at 06:42:56PM +0100, Jacob Vosmaer wrote:
> Hi Taylor,
>
> Thanks for your reply. That sounds like a great idea!
>
> On Tue, Jan 19, 2021 at 5:12 PM Taylor Blau <me@ttaylorr.com> wrote:
> > But, I think that we could get pretty far by treating the prefixes as
> > refs so that we can call ref-filter.c:find_longest_prefixes(). For its
> > purposes, it doesn't really care about whether or not the arguments
> > actually are references. It simply returns the longest common prefix
> > among all of its arguments (delimited by '/' characters).
>
> What does "delimited by /" mean?

Ah, I just meant that it looks for the longest common prefix where it
will only split at '/' characters. But, that's not right at all:
find_longest_prefixes_1() will happily split anywhere there is a
difference.

> Without really understanding the longest common prefix code in
> ref-filter.c, my intuitive concern is that the specifics of glob
> matching and special treatment of '/' may bite us. I suppose we'll be
> fine because ls-refs has its own matching logic. So long as
> for_each_fullref_in_prefixes yield enough prefixes, the end result
> would remain the same.

Right. We can ignore the concern about '/' (seeing my comment above),
and note that find_longest_prefixes_1() breaks on glob metacharacters,
so we'll only match or overmatch the desired set (and we'll never
undermatch).

I made sure to write in the second patch downthread that
ls-refs.c:send_ref() correctly handles receiving too many refs (and it
discards ones that it doesn't want).

> The question is then, does for_each_fullref_in_prefixes yield
> everything we need?

For the reasons above, yes: it will.

> I think my approach would be to expose the new
> for_each_fullref_in_prefixes iterator you propose through test-tool,
> and unit test it so we can be sure it handles both contexts
> (for-each-refs with globs and special '/', and ls-refs without any
> special character behavior) correctly.
>
> I may be overly cautious here, take this with a grain of salt because
> I am not an experienced Git contributor. On that topic, apologies if
> I'm botching my inline replies in this email.

I do appreciate your caution, but I'm not sure exposing a test-tool is
necessary, since we already test this behavior extensively in t6300 (and
now t5701, t5702 and t5704, too).

> Regarding your patch: it works correctly and as fast as expected for
> my development "many refs" test case. Yay! It also segfaults and fails
> some tests but see my comments below.
>
> All in all: thanks, great idea, yes we should reuse, I only lack
> confidence on correctness because I don't fully grasp your
> longest-common-prefix algorithm yet.

:-). Thanks for the pointers on the spots that I had missed (as I
mentioned, I only compiled it before sending, so having an additional
set of more careful eyes was quite helpful).


Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 19:09       ` [PATCH 1/1] ls-refs.c: minimize number of refs visited Taylor Blau
@ 2021-01-19 21:59         ` Jeff King
  2021-01-19 22:15           ` Jeff King
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 21:59 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jacob Vosmaer, Git Mailing List

On Tue, Jan 19, 2021 at 02:09:04PM -0500, Taylor Blau wrote:

> > What does "delimited by /" mean?
> 
> Ah, I just meant that it looks for the longest common prefix where it
> will only split at '/' characters. But, that's not right at all:
> find_longest_prefixes_1() will happily split anywhere there is a
> difference.

Right. We thought in early revisions of the ref-filter work that we
might have to split on path components, but it turns out that the
underlying ref code is happy to take arbitrary prefixes. And it's that
code which defines our strategy; Even if the ls-refs code wanted to
allow only full path components, it should be using the limiting from
for_each_ref_in() only as an optimization, and applying its own
filtering to the output.

> > Without really understanding the longest common prefix code in
> > ref-filter.c, my intuitive concern is that the specifics of glob
> > matching and special treatment of '/' may bite us. I suppose we'll be
> > fine because ls-refs has its own matching logic. So long as
> > for_each_fullref_in_prefixes yield enough prefixes, the end result
> > would remain the same.
> 
> Right. We can ignore the concern about '/' (seeing my comment above),
> and note that find_longest_prefixes_1() breaks on glob metacharacters,
> so we'll only match or overmatch the desired set (and we'll never
> undermatch).
> 
> I made sure to write in the second patch downthread that
> ls-refs.c:send_ref() correctly handles receiving too many refs (and it
> discards ones that it doesn't want).

Yeah, I think the glob-handling is OK for that reason. We'd have a
smaller prefix, which means seeing more refs. So it's never a
correctness issue, but only a potential optimization one (we consider
more refs in ls-refs.c than we might otherwise). But I think even that s
impossible, because the glob characters are not valid in refnames. So we
would never see one at all in a string which is meant to be a pure
prefix.

> > I think my approach would be to expose the new
> > for_each_fullref_in_prefixes iterator you propose through test-tool,
> > and unit test it so we can be sure it handles both contexts
> > (for-each-refs with globs and special '/', and ls-refs without any
> > special character behavior) correctly.
> >
> > I may be overly cautious here, take this with a grain of salt because
> > I am not an experienced Git contributor. On that topic, apologies if
> > I'm botching my inline replies in this email.
> 
> I do appreciate your caution, but I'm not sure exposing a test-tool is
> necessary, since we already test this behavior extensively in t6300 (and
> now t5701, t5702 and t5704, too).

Agreed. test-tool is fine when we have no way to easily feed data to a
unit. But in this case, the prefix code is easy to test via the
for-each-ref plumbing. We'd want to make sure the new setting in ls-refs
is covered, too, but I agree that t5701 covers that well (regular
fetches make sure we don't send too few refs, but those ones check that
we limited the refs in the expected way).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 21:59         ` Jeff King
@ 2021-01-19 22:15           ` Jeff King
  2021-01-19 22:23             ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 22:15 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jacob Vosmaer, Git Mailing List

On Tue, Jan 19, 2021 at 04:59:18PM -0500, Jeff King wrote:

> > > What does "delimited by /" mean?
> > 
> > Ah, I just meant that it looks for the longest common prefix where it
> > will only split at '/' characters. But, that's not right at all:
> > find_longest_prefixes_1() will happily split anywhere there is a
> > difference.
> 
> Right. We thought in early revisions of the ref-filter work that we
> might have to split on path components, but it turns out that the
> underlying ref code is happy to take arbitrary prefixes. And it's that
> code which defines our strategy; Even if the ls-refs code wanted to
> allow only full path components, it should be using the limiting from
> for_each_ref_in() only as an optimization, and applying its own
> filtering to the output.

Having now looked carefully at the ls-refs code, it's a pure
prefix-match, too. So I think we _could_ rely on for_each_fullref_in()
returning us the correct full results, and not checking it further in
send_ref(). But I kind of like keeping it there as an extra check (and
one which could in theory grow more logic later).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 22:15           ` Jeff King
@ 2021-01-19 22:23             ` Taylor Blau
  2021-01-19 22:52               ` Jeff King
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 22:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Jacob Vosmaer, Git Mailing List

On Tue, Jan 19, 2021 at 05:15:30PM -0500, Jeff King wrote:
> On Tue, Jan 19, 2021 at 04:59:18PM -0500, Jeff King wrote:
>
> > > > What does "delimited by /" mean?
> > >
> > > Ah, I just meant that it looks for the longest common prefix where it
> > > will only split at '/' characters. But, that's not right at all:
> > > find_longest_prefixes_1() will happily split anywhere there is a
> > > difference.
> >
> > Right. We thought in early revisions of the ref-filter work that we
> > might have to split on path components, but it turns out that the
> > underlying ref code is happy to take arbitrary prefixes. And it's that
> > code which defines our strategy; Even if the ls-refs code wanted to
> > allow only full path components, it should be using the limiting from
> > for_each_ref_in() only as an optimization, and applying its own
> > filtering to the output.
>
> Having now looked carefully at the ls-refs code, it's a pure
> prefix-match, too. So I think we _could_ rely on for_each_fullref_in()
> returning us the correct full results, and not checking it further in
> send_ref(). But I kind of like keeping it there as an extra check (and
> one which could in theory grow more logic later).

Hmm. What if the caller asks for:

  ref-prefix refs/tags/a
  ref-prefix refs/tags/b

?

The LCP between those two is refs/tags, so send_ref() will presumably
get lots of reuslts that it doesn't care about (assuming there are tags
besides a and b).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 22:23             ` Taylor Blau
@ 2021-01-19 22:52               ` Jeff King
  2021-01-19 22:59                 ` Jeff King
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 22:52 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jacob Vosmaer, Git Mailing List

On Tue, Jan 19, 2021 at 05:23:36PM -0500, Taylor Blau wrote:

> > Having now looked carefully at the ls-refs code, it's a pure
> > prefix-match, too. So I think we _could_ rely on for_each_fullref_in()
> > returning us the correct full results, and not checking it further in
> > send_ref(). But I kind of like keeping it there as an extra check (and
> > one which could in theory grow more logic later).
> 
> Hmm. What if the caller asks for:
> 
>   ref-prefix refs/tags/a
>   ref-prefix refs/tags/b
> 
> ?
> 
> The LCP between those two is refs/tags, so send_ref() will presumably
> get lots of reuslts that it doesn't care about (assuming there are tags
> besides a and b).

Oh, you're right, of course. Ignore me. :)

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 14:42 ` [PATCH 1/1] " Jacob Vosmaer
  2021-01-19 16:12   ` Taylor Blau
@ 2021-01-19 22:53   ` Jeff King
  2021-01-19 23:00     ` Taylor Blau
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 22:53 UTC (permalink / raw)
  To: Jacob Vosmaer; +Cc: git

On Tue, Jan 19, 2021 at 03:42:51PM +0100, Jacob Vosmaer wrote:

> The previous implementation of ls-refs would perform exactly one ref
> walk, matching each ref against the prefixes (if any) provided by the
> user. This can be expensive if there are a lot of refs and the user
> only cares about a small subset of them.
> 
> In this patch we analyze the prefixes provided by the user and build a
> minimal set of disjoint prefixes that contains all of them. We then do
> a ref walk for each of these minimal prefixes.

Thanks for posting this. I have a vague recollection that we considered
this either when we did the for-each-ref prefixes, or when we added
ls-refs prefixes, but I can't seem to find either. At any rate, at
GitHub we haven't generally found it to be a problem because our
horrifically-large repos tend to be aggregated alternates repos, not the
ones people serve upload-pack out of (though I did just time it, and
some of our largest repos should save a few hundred milliseconds per
advertisement, which is certainly not nothing).

I do think we should reuse the code from ref-filter, as Taylor showed.

> This commit also fixes a bug in ls-refs.c that was not triggered
> before: we were using a strvec set to zero, which is not how you are
> supposed to initialize a strvec. We now call strvec_init after zeroing.

Good catch. It didn't matter until now because nobody relied on having a
NULL entry when no prefix had been added (instead, they always iterated
over prefixes->nr). IMHO that is worth fixing as a separate commit.

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 22:52               ` Jeff King
@ 2021-01-19 22:59                 ` Jeff King
  2021-01-19 23:02                   ` Taylor Blau
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 22:59 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jacob Vosmaer, Git Mailing List

On Tue, Jan 19, 2021 at 05:52:04PM -0500, Jeff King wrote:

> On Tue, Jan 19, 2021 at 05:23:36PM -0500, Taylor Blau wrote:
> 
> > > Having now looked carefully at the ls-refs code, it's a pure
> > > prefix-match, too. So I think we _could_ rely on for_each_fullref_in()
> > > returning us the correct full results, and not checking it further in
> > > send_ref(). But I kind of like keeping it there as an extra check (and
> > > one which could in theory grow more logic later).
> > 
> > Hmm. What if the caller asks for:
> > 
> >   ref-prefix refs/tags/a
> >   ref-prefix refs/tags/b
> > 
> > ?
> > 
> > The LCP between those two is refs/tags, so send_ref() will presumably
> > get lots of reuslts that it doesn't care about (assuming there are tags
> > besides a and b).
> 
> Oh, you're right, of course. Ignore me. :)

Actually, I am not sure that we would look for "refs/tags/" in that case
(I did a quick test and we do not seem to). Which makes sense, as it is
cheaper to find the "a" and "b" hierarchies separately if there is a
very big "refs/tags/c" hierarchy.

But I agree that this is a good reason that callers should consider it
as an optimization which could return more results than expected.

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 22:53   ` Jeff King
@ 2021-01-19 23:00     ` Taylor Blau
  2021-01-19 23:11       ` Jeff King
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 23:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Jacob Vosmaer, git

On Tue, Jan 19, 2021 at 05:53:56PM -0500, Jeff King wrote:
> Thanks for posting this. I have a vague recollection that we considered
> this either when we did the for-each-ref prefixes, or when we added
> ls-refs prefixes, but I can't seem to find either. At any rate, at
> GitHub we haven't generally found it to be a problem because our
> horrifically-large repos tend to be aggregated alternates repos, not the
> ones people serve upload-pack out of (though I did just time it, and
> some of our largest repos should save a few hundred milliseconds per
> advertisement, which is certainly not nothing).

Great on all counts!

> > This commit also fixes a bug in ls-refs.c that was not triggered
> > before: we were using a strvec set to zero, which is not how you are
> > supposed to initialize a strvec. We now call strvec_init after zeroing.
>
> Good catch. It didn't matter until now because nobody relied on having a
> NULL entry when no prefix had been added (instead, they always iterated
> over prefixes->nr). IMHO that is worth fixing as a separate commit.

Yeah. Even after calling it out as such myself, I promptly forgot it
when preparing the first patch I sent back to Jacob!

I didn't pull it out into its own patch, and rather folded it in to my
2/2. It has a small call-out of its own, but if you'd prefer it by
itself, I'm happy to resubmit it with that change included.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 22:59                 ` Jeff King
@ 2021-01-19 23:02                   ` Taylor Blau
  0 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 23:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Jacob Vosmaer, Git Mailing List

On Tue, Jan 19, 2021 at 05:59:34PM -0500, Jeff King wrote:
> Actually, I am not sure that we would look for "refs/tags/" in that case
> (I did a quick test and we do not seem to). Which makes sense, as it is
> cheaper to find the "a" and "b" hierarchies separately if there is a
> very big "refs/tags/c" hierarchy.

Ah, makes sense. Thanks for double checking.

> But I agree that this is a good reason that callers should consider it
> as an optimization which could return more results than expected.

Yep. Even though I couldn't quite remember when the algorithm would
split without looking more closely, I made sure to document that it
iterates *all* references that are descendent of the LCP of its
arguments.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2] ls-refs.c: traverse longest common ref prefix
  2021-01-19 18:19         ` [PATCH 2/2] ls-refs.c: traverse longest common ref prefix Taylor Blau
@ 2021-01-19 23:09           ` Jeff King
  2021-01-19 23:52             ` Taylor Blau
  2021-01-20 11:00           ` Jacob Vosmaer
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 23:09 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jacob

On Tue, Jan 19, 2021 at 01:19:17PM -0500, Taylor Blau wrote:

> To attempt to reduce the difference between the number of refs
> traversed, and the number of refs sent, only traverse references which
> are in the longest common prefix of the given prefixes. This is very
> reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
> disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
> multi-patterned 'git for-each-ref' invocations.
> 
> The only difference here is that we are operating on ref prefixes, which
> do not necessarily point to a single reference. That is just fine, since
> all we care about is finding the longest common prefix among prefixes
> which can be thought of as refspecs for our purposes here.

This second paragraph confused me. Aren't the inputs to for-each-ref
also prefixes?

I guess they require an explicit '*', but fundamentally it's the same
concept (and certainly they are not just single references).

> Similarly, for_each_fullref_in_prefixes may return more results than the
> caller asked for (since the longest common prefix might match something
> that a longer prefix in the same set wouldn't match) but
> ls-refs.c:send_ref() discards such results.

Based on my other poking, I'm not entirely sure that we can return too
many results. But I do think it's worth keeping the caller more careful.

> The code introduced in b31e2680c4 is resilient to stop early (and
> return a shorter prefix) when it encounters a metacharacter (as
> mentioned in that patch, there is some opportunity to improve this, but
> nobody has done it).

I agree that is how b31e2680c4 works, but we don't care about that
behavior here, since we have strict prefixes. Isn't the argument we need
to make the other way? I.e., that stopping early on a metacharacter will
not hurt us. Because at worst we return too many results (hey, there's a
case!) and because we would not expect metacharacters in the prefixes
anyway, as they are illegal in refnames.

> There are two remaining small items in this patch:
> 
>   - If no prefixes were provided, then implicitly add the empty string
>     (which will match all references).

I wonder if for_each_fullref_in_prefixes() should do that, since that is
exactly what the other caller does, too. OTOH, maybe it is better to
make the callers be more explicit. In which case should it maybe BUG()
on an empty set of prefixes, rather than letting the caller assume some
behavior?

>   - Since we are manually munging the prefixes, make sure that we
>     initialize it ourselves (previously this wasn't necessary since the
>     first strvec_push would do so).

I think this was an existing bug (that we were just lucky it was
impossible to trigger, because nobody looked for the NULL sentinel), and
would make more sense as a separate fix.

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 23:00     ` Taylor Blau
@ 2021-01-19 23:11       ` Jeff King
  2021-01-20 10:40         ` Jacob Vosmaer
  0 siblings, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-19 23:11 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Jacob Vosmaer, git

On Tue, Jan 19, 2021 at 06:00:19PM -0500, Taylor Blau wrote:

> > Good catch. It didn't matter until now because nobody relied on having a
> > NULL entry when no prefix had been added (instead, they always iterated
> > over prefixes->nr). IMHO that is worth fixing as a separate commit.
> 
> Yeah. Even after calling it out as such myself, I promptly forgot it
> when preparing the first patch I sent back to Jacob!
> 
> I didn't pull it out into its own patch, and rather folded it in to my
> 2/2. It has a small call-out of its own, but if you'd prefer it by
> itself, I'm happy to resubmit it with that change included.

It's not a _huge_ deal to me, but I think it is slightly nicer as a
separate patch. Plus it can easily be credited to Jacob, so at least he
gets some authorship credit out of this. :)

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2] ls-refs.c: traverse longest common ref prefix
  2021-01-19 23:09           ` Jeff King
@ 2021-01-19 23:52             ` Taylor Blau
  2021-01-20  0:08               ` Jeff King
  0 siblings, 1 reply; 35+ messages in thread
From: Taylor Blau @ 2021-01-19 23:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, jacob

On Tue, Jan 19, 2021 at 06:09:42PM -0500, Jeff King wrote:
> On Tue, Jan 19, 2021 at 01:19:17PM -0500, Taylor Blau wrote:
>
> > To attempt to reduce the difference between the number of refs
> > traversed, and the number of refs sent, only traverse references which
> > are in the longest common prefix of the given prefixes. This is very
> > reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
> > disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
> > multi-patterned 'git for-each-ref' invocations.
> >
> > The only difference here is that we are operating on ref prefixes, which
> > do not necessarily point to a single reference. That is just fine, since
> > all we care about is finding the longest common prefix among prefixes
> > which can be thought of as refspecs for our purposes here.
>
> This second paragraph confused me. Aren't the inputs to for-each-ref
> also prefixes?
>
> I guess they require an explicit '*', but fundamentally it's the same
> concept (and certainly they are not just single references).

Yeah, that is the point that I was trying to make. But re-reading this
patch after knowing that it confused you, I think the clearest way to
make that point is to drop that second paragraph entirely.

> > Similarly, for_each_fullref_in_prefixes may return more results than the
> > caller asked for (since the longest common prefix might match something
> > that a longer prefix in the same set wouldn't match) but
> > ls-refs.c:send_ref() discards such results.
>
> Based on my other poking, I'm not entirely sure that we can return too
> many results. But I do think it's worth keeping the caller more careful.

It can return more results, but I don't think that my writing in
b31e2680c4 is particularly clear. Here's an example, though. Say I ask
for `git for-each-refs 'refs/tags/a/*' 'refs/tags/a/b/c'`. The LCP of
that is definitely "refs/tags/a", which might traverse other stuff like
"refs/tags/a/b/d", which wouldn't get matched by either.

> > The code introduced in b31e2680c4 is resilient to stop early (and
> > return a shorter prefix) when it encounters a metacharacter (as
> > mentioned in that patch, there is some opportunity to improve this, but
> > nobody has done it).
>
> I agree that is how b31e2680c4 works, but we don't care about that
> behavior here, since we have strict prefixes. Isn't the argument we need
> to make the other way? I.e., that stopping early on a metacharacter will
> not hurt us. Because at worst we return too many results (hey, there's a
> case!) and because we would not expect metacharacters in the prefixes
> anyway, as they are illegal in refnames.

Yeah, thinking on it more I agree that's the argument we should be
making here. I updated the patch to reflect it.

> > There are two remaining small items in this patch:
> >
> >   - If no prefixes were provided, then implicitly add the empty string
> >     (which will match all references).
>
> I wonder if for_each_fullref_in_prefixes() should do that, since that is
> exactly what the other caller does, too. OTOH, maybe it is better to
> make the callers be more explicit. In which case should it maybe BUG()
> on an empty set of prefixes, rather than letting the caller assume some
> behavior?

Hmm. I don't feel strongly either way, but I think that the BUG is
probably the most sensible option.

> >   - Since we are manually munging the prefixes, make sure that we
> >     initialize it ourselves (previously this wasn't necessary since the
> >     first strvec_push would do so).
>
> I think this was an existing bug (that we were just lucky it was
> impossible to trigger, because nobody looked for the NULL sentinel), and
> would make more sense as a separate fix.

Right. I'll make sure to pull this one out into a separate patch and
credit Jacob with authorship.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2] ls-refs.c: traverse longest common ref prefix
  2021-01-19 23:52             ` Taylor Blau
@ 2021-01-20  0:08               ` Jeff King
  0 siblings, 0 replies; 35+ messages in thread
From: Jeff King @ 2021-01-20  0:08 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jacob

On Tue, Jan 19, 2021 at 06:52:31PM -0500, Taylor Blau wrote:

> > I guess they require an explicit '*', but fundamentally it's the same
> > concept (and certainly they are not just single references).
> 
> Yeah, that is the point that I was trying to make. But re-reading this
> patch after knowing that it confused you, I think the clearest way to
> make that point is to drop that second paragraph entirely.

Sounds good.

> > Based on my other poking, I'm not entirely sure that we can return too
> > many results. But I do think it's worth keeping the caller more careful.
> 
> It can return more results, but I don't think that my writing in
> b31e2680c4 is particularly clear. Here's an example, though. Say I ask
> for `git for-each-refs 'refs/tags/a/*' 'refs/tags/a/b/c'`. The LCP of
> that is definitely "refs/tags/a", which might traverse other stuff like
> "refs/tags/a/b/d", which wouldn't get matched by either.

I thought that would be matched by refs/tags/a/*, but it looks like
for-each-ref treats "*" as matching only a single path component. So
really just:

  git for-each-ref refs/tags/*

requires extra filtering already. But AFAICT none of that is true for
ls-refs, which is strictly prefix matching already.

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-19 23:11       ` Jeff King
@ 2021-01-20 10:40         ` Jacob Vosmaer
  2021-01-20 10:44           ` Jacob Vosmaer
  0 siblings, 1 reply; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-20 10:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Git Mailing List

On Wed, Jan 20, 2021 at 12:11 AM Jeff King <peff@peff.net> wrote:
> It's not a _huge_ deal to me, but I think it is slightly nicer as a
> separate patch. Plus it can easily be credited to Jacob, so at least he
> gets some authorship credit out of this. :)

Thanks, I'm fine either way. Just remember that the other changes in
ls-refs.c depend on that strvec_init, so if we split it out, we need
to remember / maintain that (order) dependency.

> Having now looked carefully at the ls-refs code, it's a pure
> prefix-match, too. So I think we _could_ rely on for_each_fullref_in()
> returning us the correct full results, and not checking it further in
> send_ref().

I also think we could. But as I alluded to in my original commit
message, I don't like how complicated that gets. I find it easier to
convince myself in the current form that the longest prefix code
selects _enough_ prefixes, which is a weaker property than "selects
exactly the right prefixes".

Jacob

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/1] ls-refs.c: minimize number of refs visited
  2021-01-20 10:40         ` Jacob Vosmaer
@ 2021-01-20 10:44           ` Jacob Vosmaer
  0 siblings, 0 replies; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-20 10:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, Git Mailing List

On Wed, Jan 20, 2021 at 11:40 AM Jacob Vosmaer <jacob@gitlab.com> wrote:
> I also think we could. But as I alluded to in my original commit
> message, I don't like how complicated that gets. I find it easier to
> convince myself in the current form that the longest prefix code
> selects _enough_ prefixes, which is a weaker property than "selects
> exactly the right prefixes".

By which I mean to say I think we should keep the current form, with
the final authoritative ref matching pass.

Jacob

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/2] ls-refs.c: traverse longest common ref prefix
  2021-01-19 18:19         ` [PATCH 2/2] ls-refs.c: traverse longest common ref prefix Taylor Blau
  2021-01-19 23:09           ` Jeff King
@ 2021-01-20 11:00           ` Jacob Vosmaer
  1 sibling, 0 replies; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-20 11:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King

On Tue, Jan 19, 2021 at 7:19 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> ls-refs performs a single revision walk over the whole ref namespace,
> and sends ones that match with one of the given ref prefixes down to the
> user.
>
> This can be expensive if there are many refs overall, but the portion of
> them covered by the given prefixes is small by comparison.
>
> To attempt to reduce the difference between the number of refs
> traversed, and the number of refs sent, only traverse references which
> are in the longest common prefix of the given prefixes. This is very
> reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
> disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
> multi-patterned 'git for-each-ref' invocations.
>
> The only difference here is that we are operating on ref prefixes, which
> do not necessarily point to a single reference. That is just fine, since
> all we care about is finding the longest common prefix among prefixes
> which can be thought of as refspecs for our purposes here.
>
> Similarly, for_each_fullref_in_prefixes may return more results than the
> caller asked for (since the longest common prefix might match something
> that a longer prefix in the same set wouldn't match) but
> ls-refs.c:send_ref() discards such results.
>
> The code introduced in b31e2680c4 is resilient to stop early (and
> return a shorter prefix) when it encounters a metacharacter (as
> mentioned in that patch, there is some opportunity to improve this, but
> nobody has done it).
>
> There are two remaining small items in this patch:
>
>   - If no prefixes were provided, then implicitly add the empty string
>     (which will match all references).
>
>   - Since we are manually munging the prefixes, make sure that we
>     initialize it ourselves (previously this wasn't necessary since the
>     first strvec_push would do so).
>
> Original-patch-by: Jacob Vosmaer <jacob@gitlab.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  ls-refs.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..eaaa36d0df 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -90,6 +90,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>         struct ls_refs_data data;
>
>         memset(&data, 0, sizeof(data));
> +       strvec_init(&data.prefixes);
>
>         git_config(ls_refs_config, NULL);
>
> @@ -109,7 +110,10 @@ int ls_refs(struct repository *r, struct strvec *keys,
>                 die(_("expected flush after ls-refs arguments"));
>
>         head_ref_namespaced(send_ref, &data);
> -       for_each_namespaced_ref(send_ref, &data);
> +       if (!data.prefixes.nr)
> +               strvec_push(&data.prefixes, "");

The old code, with for_each_namespaced_ref, would walk
"${NAMESPACE}refs/". The new code would walk "${NAMESPACE}" because
we're pushing "" onto data.prefixes. So if there is anything in the
namespace that does not start with "refs/" it will get yielded.

Does that matter?

> +       for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
> +                                    send_ref, &data, 0);
>         packet_flush(1);
>         strvec_clear(&data.prefixes);
>         return 0;
> --
> 2.30.0.138.g6d7191ea01

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets
  2021-01-19 18:19       ` [PATCH 0/2] ls-refs: only traverse through longest common ref prefix Taylor Blau
  2021-01-19 18:19         ` [PATCH 1/2] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
  2021-01-19 18:19         ` [PATCH 2/2] ls-refs.c: traverse longest common ref prefix Taylor Blau
@ 2021-01-20 16:04         ` Taylor Blau
  2021-01-20 16:04           ` [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
                             ` (3 more replies)
  2 siblings, 4 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-20 16:04 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

Here is a reroll of the series I sent yesterday to traverse the
longest-common prefixes of disjoint sets containing the ref-prefix
arguments to ls-refs.

Not much has changed last time, except some clarification about what
'for_each_fullref_in_prefixes()' can yield, and splitting out the last
patch into one from Jacob and one from me. I've forged his sign-off
since it contains his original code, but it includes a new patch
description from me.

Jacob Vosmaer (1):
  ls-refs.c: initialize 'prefixes' before using it

Taylor Blau (2):
  refs: expose 'for_each_fullref_in_prefixes'
  ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets

 ls-refs.c    |  6 +++-
 ref-filter.c | 74 ++------------------------------------------
 refs.c       | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 refs.h       |  9 ++++++
 4 files changed, 103 insertions(+), 73 deletions(-)

Range-diff against v1:
1:  d4c8d059f6 ! 1:  bda314fe7a refs: expose 'for_each_fullref_in_prefixes'
    @@ Commit message
         common prefix of among a set of refspecs, and then to iterate all of the
         references that descend from that prefix.

    -    The subsequent patch will want to use that same code from ls-refs, so
    +    A future patch will want to use that same code from ls-refs.c, so
         prepare by exposing and moving it to refs.c. Since there is nothing
         specific to the ref-filter code here (other than that it was previously
         the only caller of this function), this really belongs in the more
         generic refs.h header.

    +    The code moved in this patch is identical before and after, with the one
    +    exception of renaming some arguments to be consistent with other
    +    functions exposed in refs.h.
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>

      ## ref-filter.c ##
    @@ refs.h: int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
      			unsigned int broken);

     +/**
    -+ * iterate all refs which are descendent from the longest common prefix among
    -+ * the list "patterns".
    ++ * iterate all refs in "patterns" by partitioning patterns into disjoint sets
    ++ * and iterating the longest-common prefix of each set.
    ++ *
    ++ * callers should be prepared to ignore references that they did not ask for.
     + */
     +int for_each_fullref_in_prefixes(const char *namespace, const char **patterns,
     +				 each_ref_fn fn, void *cb_data,
-:  ---------- > 2:  5fc081b2d5 ls-refs.c: initialize 'prefixes' before using it
2:  fb8681d128 ! 3:  b97dfb706f ls-refs.c: traverse longest common ref prefix
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>

      ## Commit message ##
    -    ls-refs.c: traverse longest common ref prefix
    +    ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets

         ls-refs performs a single revision walk over the whole ref namespace,
         and sends ones that match with one of the given ref prefixes down to the
    @@ Commit message
         disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
         multi-patterned 'git for-each-ref' invocations.

    -    The only difference here is that we are operating on ref prefixes, which
    -    do not necessarily point to a single reference. That is just fine, since
    -    all we care about is finding the longest common prefix among prefixes
    -    which can be thought of as refspecs for our purposes here.
    -
    -    Similarly, for_each_fullref_in_prefixes may return more results than the
    -    caller asked for (since the longest common prefix might match something
    -    that a longer prefix in the same set wouldn't match) but
    -    ls-refs.c:send_ref() discards such results.
    -
    -    The code introduced in b31e2680c4 is resilient to stop early (and
    -    return a shorter prefix) when it encounters a metacharacter (as
    -    mentioned in that patch, there is some opportunity to improve this, but
    -    nobody has done it).
    -
    -    There are two remaining small items in this patch:
    -
    -      - If no prefixes were provided, then implicitly add the empty string
    -        (which will match all references).
    -
    -      - Since we are manually munging the prefixes, make sure that we
    -        initialize it ourselves (previously this wasn't necessary since the
    -        first strvec_push would do so).
    +    The callback 'send_ref' is resilient to ignore extra patterns by
    +    discarding any arguments which do not begin with at least one of the
    +    specified prefixes.
    +
    +    Similarly, the code introduced in b31e2680c4 is resilient to stop early
    +    at metacharacters, but we only pass strict prefixes here. At worst we
    +    would return too many results, but the double checking done by send_ref
    +    will throw away anything that doesn't start with something in the prefix
    +    list.
    +
    +    Finally, if no prefixes were provided, then implicitly add the empty
    +    string (which will match all references) since this matches the existing
    +    behavior (see the "no restrictions" comment in "ls-refs.c:ref_match()").

         Original-patch-by: Jacob Vosmaer <jacob@gitlab.com>
         Signed-off-by: Taylor Blau <me@ttaylorr.com>

      ## ls-refs.c ##
    -@@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
    - 	struct ls_refs_data data;
    -
    - 	memset(&data, 0, sizeof(data));
    -+	strvec_init(&data.prefixes);
    -
    - 	git_config(ls_refs_config, NULL);
    -
     @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
      		die(_("expected flush after ls-refs arguments"));

--
2.30.0.138.g6d7191ea01

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes'
  2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
@ 2021-01-20 16:04           ` Taylor Blau
  2021-01-20 19:56             ` Jeff King
  2021-01-23  2:59             ` Junio C Hamano
  2021-01-20 16:04           ` [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it Taylor Blau
                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-20 16:04 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

This function was used in the ref-filter.c code to find the longest
common prefix of among a set of refspecs, and then to iterate all of the
references that descend from that prefix.

A future patch will want to use that same code from ls-refs.c, so
prepare by exposing and moving it to refs.c. Since there is nothing
specific to the ref-filter code here (other than that it was previously
the only caller of this function), this really belongs in the more
generic refs.h header.

The code moved in this patch is identical before and after, with the one
exception of renaming some arguments to be consistent with other
functions exposed in refs.h.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ref-filter.c | 74 ++------------------------------------------
 refs.c       | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 refs.h       |  9 ++++++
 3 files changed, 98 insertions(+), 72 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index aa260bfd09..f918f00151 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1929,64 +1929,6 @@ static int filter_pattern_match(struct ref_filter *filter, const char *refname)
 	return match_pattern(filter, refname);
 }
 
-static int qsort_strcmp(const void *va, const void *vb)
-{
-	const char *a = *(const char **)va;
-	const char *b = *(const char **)vb;
-
-	return strcmp(a, b);
-}
-
-static void find_longest_prefixes_1(struct string_list *out,
-				  struct strbuf *prefix,
-				  const char **patterns, size_t nr)
-{
-	size_t i;
-
-	for (i = 0; i < nr; i++) {
-		char c = patterns[i][prefix->len];
-		if (!c || is_glob_special(c)) {
-			string_list_append(out, prefix->buf);
-			return;
-		}
-	}
-
-	i = 0;
-	while (i < nr) {
-		size_t end;
-
-		/*
-		* Set "end" to the index of the element _after_ the last one
-		* in our group.
-		*/
-		for (end = i + 1; end < nr; end++) {
-			if (patterns[i][prefix->len] != patterns[end][prefix->len])
-				break;
-		}
-
-		strbuf_addch(prefix, patterns[i][prefix->len]);
-		find_longest_prefixes_1(out, prefix, patterns + i, end - i);
-		strbuf_setlen(prefix, prefix->len - 1);
-
-		i = end;
-	}
-}
-
-static void find_longest_prefixes(struct string_list *out,
-				  const char **patterns)
-{
-	struct strvec sorted = STRVEC_INIT;
-	struct strbuf prefix = STRBUF_INIT;
-
-	strvec_pushv(&sorted, patterns);
-	QSORT(sorted.v, sorted.nr, qsort_strcmp);
-
-	find_longest_prefixes_1(out, &prefix, sorted.v, sorted.nr);
-
-	strvec_clear(&sorted);
-	strbuf_release(&prefix);
-}
-
 /*
  * This is the same as for_each_fullref_in(), but it tries to iterate
  * only over the patterns we'll care about. Note that it _doesn't_ do a full
@@ -1997,10 +1939,6 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 				       void *cb_data,
 				       int broken)
 {
-	struct string_list prefixes = STRING_LIST_INIT_DUP;
-	struct string_list_item *prefix;
-	int ret;
-
 	if (!filter->match_as_path) {
 		/*
 		 * in this case, the patterns are applied after
@@ -2024,16 +1962,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 		return for_each_fullref_in("", cb, cb_data, broken);
 	}
 
-	find_longest_prefixes(&prefixes, filter->name_patterns);
-
-	for_each_string_list_item(prefix, &prefixes) {
-		ret = for_each_fullref_in(prefix->string, cb, cb_data, broken);
-		if (ret)
-			break;
-	}
-
-	string_list_clear(&prefixes, 0);
-	return ret;
+	return for_each_fullref_in_prefixes(NULL, filter->name_patterns,
+					    cb, cb_data, broken);
 }
 
 /*
diff --git a/refs.c b/refs.c
index 13dc2c3291..0b5a68588f 100644
--- a/refs.c
+++ b/refs.c
@@ -1546,6 +1546,93 @@ int for_each_rawref(each_ref_fn fn, void *cb_data)
 	return refs_for_each_rawref(get_main_ref_store(the_repository), fn, cb_data);
 }
 
+static int qsort_strcmp(const void *va, const void *vb)
+{
+	const char *a = *(const char **)va;
+	const char *b = *(const char **)vb;
+
+	return strcmp(a, b);
+}
+
+static void find_longest_prefixes_1(struct string_list *out,
+				  struct strbuf *prefix,
+				  const char **patterns, size_t nr)
+{
+	size_t i;
+
+	for (i = 0; i < nr; i++) {
+		char c = patterns[i][prefix->len];
+		if (!c || is_glob_special(c)) {
+			string_list_append(out, prefix->buf);
+			return;
+		}
+	}
+
+	i = 0;
+	while (i < nr) {
+		size_t end;
+
+		/*
+		* Set "end" to the index of the element _after_ the last one
+		* in our group.
+		*/
+		for (end = i + 1; end < nr; end++) {
+			if (patterns[i][prefix->len] != patterns[end][prefix->len])
+				break;
+		}
+
+		strbuf_addch(prefix, patterns[i][prefix->len]);
+		find_longest_prefixes_1(out, prefix, patterns + i, end - i);
+		strbuf_setlen(prefix, prefix->len - 1);
+
+		i = end;
+	}
+}
+
+static void find_longest_prefixes(struct string_list *out,
+				  const char **patterns)
+{
+	struct strvec sorted = STRVEC_INIT;
+	struct strbuf prefix = STRBUF_INIT;
+
+	strvec_pushv(&sorted, patterns);
+	QSORT(sorted.v, sorted.nr, qsort_strcmp);
+
+	find_longest_prefixes_1(out, &prefix, sorted.v, sorted.nr);
+
+	strvec_clear(&sorted);
+	strbuf_release(&prefix);
+}
+
+int for_each_fullref_in_prefixes(const char *namespace,
+				 const char **patterns,
+				 each_ref_fn fn, void *cb_data,
+				 unsigned int broken)
+{
+	struct string_list prefixes = STRING_LIST_INIT_DUP;
+	struct string_list_item *prefix;
+	struct strbuf buf = STRBUF_INIT;
+	int ret = 0, namespace_len;
+
+	find_longest_prefixes(&prefixes, patterns);
+
+	if (namespace)
+		strbuf_addstr(&buf, namespace);
+	namespace_len = buf.len;
+
+	for_each_string_list_item(prefix, &prefixes) {
+		strbuf_addf(&buf, "%s", prefix->string);
+		ret = for_each_fullref_in(buf.buf, fn, cb_data, broken);
+		if (ret)
+			break;
+		strbuf_setlen(&buf, namespace_len);
+	}
+
+	string_list_clear(&prefixes, 0);
+	strbuf_release(&buf);
+	return ret;
+}
+
 static int refs_read_special_head(struct ref_store *ref_store,
 				  const char *refname, struct object_id *oid,
 				  struct strbuf *referent, unsigned int *type)
diff --git a/refs.h b/refs.h
index ff05d2e9fe..c5fd35487d 100644
--- a/refs.h
+++ b/refs.h
@@ -347,6 +347,15 @@ int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
 int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data,
 			unsigned int broken);
 
+/**
+ * iterate all refs in "patterns" by partitioning patterns into disjoint sets
+ * and iterating the longest-common prefix of each set.
+ *
+ * callers should be prepared to ignore references that they did not ask for.
+ */
+int for_each_fullref_in_prefixes(const char *namespace, const char **patterns,
+				 each_ref_fn fn, void *cb_data,
+				 unsigned int broken);
 /**
  * iterate refs from the respective area.
  */
-- 
2.30.0.138.g6d7191ea01


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it
  2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
  2021-01-20 16:04           ` [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
@ 2021-01-20 16:04           ` Taylor Blau
  2021-01-20 19:58             ` Jeff King
  2021-01-20 21:50             ` Jacob Vosmaer
  2021-01-20 16:04           ` [PATCH v2 3/3] ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
  2021-01-23 17:55           ` [PATCH v2 0/3] ls-refs: " Junio C Hamano
  3 siblings, 2 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-20 16:04 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

From: Jacob Vosmaer <jacob@gitlab.com>

Correctly initialize the "prefixes" strvec using strvec_init() instead
of simply zeroing it via the earlier memset().

There's no way to trigger a crash, since the first 'ref-prefix' command
will initialize the strvec via the 'ALLOC_GROW' in 'strvec_push_nodup()'
(the alloc and nr variables are already zero'd, so the call to
ALLOC_GROW is valid).

If no "ref-prefix" command was given, then the call to
'ls-refs.c:ref_match()' will abort early after it reads the zero in
'prefixes->nr'. Likewise, strvec_clear() will only call free() on the
array, which is NULL, so we're safe there, too.

But, all of this is dangerous and requires more reasoning than it would
if we simply called 'strvec_init()', so do that.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ls-refs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..367597d447 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -90,6 +90,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 	struct ls_refs_data data;
 
 	memset(&data, 0, sizeof(data));
+	strvec_init(&data.prefixes);
 
 	git_config(ls_refs_config, NULL);
 
-- 
2.30.0.138.g6d7191ea01


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v2 3/3] ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets
  2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
  2021-01-20 16:04           ` [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
  2021-01-20 16:04           ` [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it Taylor Blau
@ 2021-01-20 16:04           ` Taylor Blau
  2021-01-23 17:55           ` [PATCH v2 0/3] ls-refs: " Junio C Hamano
  3 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-20 16:04 UTC (permalink / raw)
  To: git; +Cc: jacob, peff

ls-refs performs a single revision walk over the whole ref namespace,
and sends ones that match with one of the given ref prefixes down to the
user.

This can be expensive if there are many refs overall, but the portion of
them covered by the given prefixes is small by comparison.

To attempt to reduce the difference between the number of refs
traversed, and the number of refs sent, only traverse references which
are in the longest common prefix of the given prefixes. This is very
reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find
disjoint pattern prefixes, 2019-06-26) which does an analogous thing for
multi-patterned 'git for-each-ref' invocations.

The callback 'send_ref' is resilient to ignore extra patterns by
discarding any arguments which do not begin with at least one of the
specified prefixes.

Similarly, the code introduced in b31e2680c4 is resilient to stop early
at metacharacters, but we only pass strict prefixes here. At worst we
would return too many results, but the double checking done by send_ref
will throw away anything that doesn't start with something in the prefix
list.

Finally, if no prefixes were provided, then implicitly add the empty
string (which will match all references) since this matches the existing
behavior (see the "no restrictions" comment in "ls-refs.c:ref_match()").

Original-patch-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ls-refs.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/ls-refs.c b/ls-refs.c
index 367597d447..eaaa36d0df 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -110,7 +110,10 @@ int ls_refs(struct repository *r, struct strvec *keys,
 		die(_("expected flush after ls-refs arguments"));
 
 	head_ref_namespaced(send_ref, &data);
-	for_each_namespaced_ref(send_ref, &data);
+	if (!data.prefixes.nr)
+		strvec_push(&data.prefixes, "");
+	for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
+				     send_ref, &data, 0);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
-- 
2.30.0.138.g6d7191ea01

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes'
  2021-01-20 16:04           ` [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
@ 2021-01-20 19:56             ` Jeff King
  2021-01-20 20:12               ` Taylor Blau
  2021-01-23  2:59             ` Junio C Hamano
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-20 19:56 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jacob

On Wed, Jan 20, 2021 at 11:04:21AM -0500, Taylor Blau wrote:

> This function was used in the ref-filter.c code to find the longest
> common prefix of among a set of refspecs, and then to iterate all of the
> references that descend from that prefix.
> 
> A future patch will want to use that same code from ls-refs.c, so
> prepare by exposing and moving it to refs.c. Since there is nothing
> specific to the ref-filter code here (other than that it was previously
> the only caller of this function), this really belongs in the more
> generic refs.h header.
> 
> The code moved in this patch is identical before and after, with the one
> exception of renaming some arguments to be consistent with other
> functions exposed in refs.h.

The other non-identical thing is that it handles a namespace parameter
(which is good, but an obvious non-movement).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it
  2021-01-20 16:04           ` [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it Taylor Blau
@ 2021-01-20 19:58             ` Jeff King
  2021-01-20 20:13               ` Taylor Blau
  2021-01-20 21:50             ` Jacob Vosmaer
  1 sibling, 1 reply; 35+ messages in thread
From: Jeff King @ 2021-01-20 19:58 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jacob

On Wed, Jan 20, 2021 at 11:04:25AM -0500, Taylor Blau wrote:

> From: Jacob Vosmaer <jacob@gitlab.com>
> 
> Correctly initialize the "prefixes" strvec using strvec_init() instead
> of simply zeroing it via the earlier memset().
> 
> There's no way to trigger a crash, since the first 'ref-prefix' command
> will initialize the strvec via the 'ALLOC_GROW' in 'strvec_push_nodup()'
> (the alloc and nr variables are already zero'd, so the call to
> ALLOC_GROW is valid).

For people not familiar with strvec, I think there's a missing bit of
explanation between those two paragraphs: an initialized strvec does not
point to NULL, but to a dummy array with a single NULL value in it. That
explains why it might crash. :)

That nit (and the one I mentioned in the previous patch) aside, these
patches look good to me (and I am OK with or without the nits
addressed).

-Peff

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes'
  2021-01-20 19:56             ` Jeff King
@ 2021-01-20 20:12               ` Taylor Blau
  0 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-20 20:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, jacob

On Wed, Jan 20, 2021 at 02:56:17PM -0500, Jeff King wrote:
> On Wed, Jan 20, 2021 at 11:04:21AM -0500, Taylor Blau wrote:
> > The code moved in this patch is identical before and after, with the one
> > exception of renaming some arguments to be consistent with other
> > functions exposed in refs.h.
>
> The other non-identical thing is that it handles a namespace parameter
> (which is good, but an obvious non-movement).

Ack; I knew that I forgot to mention something. Thanks for pointing it
out here. (Since you seem to be OK with this patch even without this
mention, I'll avoid sending another revision).

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it
  2021-01-20 19:58             ` Jeff King
@ 2021-01-20 20:13               ` Taylor Blau
  0 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-20 20:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, jacob

On Wed, Jan 20, 2021 at 02:58:21PM -0500, Jeff King wrote:
> That nit (and the one I mentioned in the previous patch) aside, these
> patches look good to me (and I am OK with or without the nits
> addressed).

Thanks, as always, for your review :-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it
  2021-01-20 16:04           ` [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it Taylor Blau
  2021-01-20 19:58             ` Jeff King
@ 2021-01-20 21:50             ` Jacob Vosmaer
  1 sibling, 0 replies; 35+ messages in thread
From: Jacob Vosmaer @ 2021-01-20 21:50 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git Mailing List, Jeff King

As the person whose name is on the "From:" line, I approve. And thanks!

On Wed, Jan 20, 2021 at 5:04 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> From: Jacob Vosmaer <jacob@gitlab.com>
>
> Correctly initialize the "prefixes" strvec using strvec_init() instead
> of simply zeroing it via the earlier memset().
>
> There's no way to trigger a crash, since the first 'ref-prefix' command
> will initialize the strvec via the 'ALLOC_GROW' in 'strvec_push_nodup()'
> (the alloc and nr variables are already zero'd, so the call to
> ALLOC_GROW is valid).
>
> If no "ref-prefix" command was given, then the call to
> 'ls-refs.c:ref_match()' will abort early after it reads the zero in
> 'prefixes->nr'. Likewise, strvec_clear() will only call free() on the
> array, which is NULL, so we're safe there, too.
>
> But, all of this is dangerous and requires more reasoning than it would
> if we simply called 'strvec_init()', so do that.
>
> Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  ls-refs.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..367597d447 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -90,6 +90,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>         struct ls_refs_data data;
>
>         memset(&data, 0, sizeof(data));
> +       strvec_init(&data.prefixes);
>
>         git_config(ls_refs_config, NULL);
>
> --
> 2.30.0.138.g6d7191ea01
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes'
  2021-01-20 16:04           ` [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
  2021-01-20 19:56             ` Jeff King
@ 2021-01-23  2:59             ` Junio C Hamano
  2021-01-25  1:35               ` Taylor Blau
  1 sibling, 1 reply; 35+ messages in thread
From: Junio C Hamano @ 2021-01-23  2:59 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jacob, peff

Taylor Blau <me@ttaylorr.com> writes:

> This function was used in the ref-filter.c code to find the longest
> common prefix of among a set of refspecs, and then to iterate all of the
> references that descend from that prefix.
>
> A future patch will want to use that same code from ls-refs.c, so
> prepare by exposing and moving it to refs.c. Since there is nothing
> specific to the ref-filter code here (other than that it was previously
> the only caller of this function), this really belongs in the more
> generic refs.h header.
>
> The code moved in this patch is identical before and after, with the one
> exception of renaming some arguments to be consistent with other
> functions exposed in refs.h.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  ref-filter.c | 74 ++------------------------------------------
>  refs.c       | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  refs.h       |  9 ++++++
>  3 files changed, 98 insertions(+), 72 deletions(-)

It is amusing that even to a change that is supposedly "expose
existing functionality by moving code around" and nothing else,
we can introduce new glitches.

> diff --git a/refs.c b/refs.c
> index 13dc2c3291..0b5a68588f 100644
> --- a/refs.c
> +++ b/refs.c
> ...
> +	for_each_string_list_item(prefix, &prefixes) {
> +		strbuf_addf(&buf, "%s", prefix->string);

		strbuf_addstr(&buf, prefix->string);

Caught by

https://github.com/git/git/runs/1752536671?check_suite_focus=true#step:4:63

I'll apply the fix suggested by Coccinelle on my end, so there is no
need to send an updated version just for this one.

Thanks.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets
  2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
                             ` (2 preceding siblings ...)
  2021-01-20 16:04           ` [PATCH v2 3/3] ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
@ 2021-01-23 17:55           ` Junio C Hamano
  3 siblings, 0 replies; 35+ messages in thread
From: Junio C Hamano @ 2021-01-23 17:55 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, jacob, peff

Taylor Blau <me@ttaylorr.com> writes:

> Here is a reroll of the series I sent yesterday to traverse the
> longest-common prefixes of disjoint sets containing the ref-prefix
> arguments to ls-refs.
>
> Not much has changed last time, except some clarification about what
> 'for_each_fullref_in_prefixes()' can yield, and splitting out the last
> patch into one from Jacob and one from me. I've forged his sign-off
> since it contains his original code, but it includes a new patch
> description from me.

Oh boy I feel behind, but finally I had managed to block time to
read through the two iterations of discussion on this topic without
leaving my desk once, and it was a pleasant read.

Thank you to all three of you.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes'
  2021-01-23  2:59             ` Junio C Hamano
@ 2021-01-25  1:35               ` Taylor Blau
  0 siblings, 0 replies; 35+ messages in thread
From: Taylor Blau @ 2021-01-25  1:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, jacob, peff

On Fri, Jan 22, 2021 at 06:59:30PM -0800, Junio C Hamano wrote:
> Caught by
>
> https://github.com/git/git/runs/1752536671?check_suite_focus=true#step:4:63
>
> I'll apply the fix suggested by Coccinelle on my end, so there is no
> need to send an updated version just for this one.

Oof. How embarrassing. I'm well aware of the existence of
strbuf_addstr() -- there's even a caller just below the line I changed!
-- but clearly wasn't thinking when I wrote this patch.

Thanks for cleaning it up.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2021-01-25  1:36 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-19 14:42 [PATCH 0/1] ls-refs.c: minimize number of refs visited Jacob Vosmaer
2021-01-19 14:42 ` [PATCH 1/1] " Jacob Vosmaer
2021-01-19 16:12   ` Taylor Blau
2021-01-19 17:42     ` Jacob Vosmaer
2021-01-19 18:19       ` [PATCH 0/2] ls-refs: only traverse through longest common ref prefix Taylor Blau
2021-01-19 18:19         ` [PATCH 1/2] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
2021-01-19 18:19         ` [PATCH 2/2] ls-refs.c: traverse longest common ref prefix Taylor Blau
2021-01-19 23:09           ` Jeff King
2021-01-19 23:52             ` Taylor Blau
2021-01-20  0:08               ` Jeff King
2021-01-20 11:00           ` Jacob Vosmaer
2021-01-20 16:04         ` [PATCH v2 0/3] ls-refs: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
2021-01-20 16:04           ` [PATCH v2 1/3] refs: expose 'for_each_fullref_in_prefixes' Taylor Blau
2021-01-20 19:56             ` Jeff King
2021-01-20 20:12               ` Taylor Blau
2021-01-23  2:59             ` Junio C Hamano
2021-01-25  1:35               ` Taylor Blau
2021-01-20 16:04           ` [PATCH v2 2/3] ls-refs.c: initialize 'prefixes' before using it Taylor Blau
2021-01-20 19:58             ` Jeff King
2021-01-20 20:13               ` Taylor Blau
2021-01-20 21:50             ` Jacob Vosmaer
2021-01-20 16:04           ` [PATCH v2 3/3] ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets Taylor Blau
2021-01-23 17:55           ` [PATCH v2 0/3] ls-refs: " Junio C Hamano
2021-01-19 19:09       ` [PATCH 1/1] ls-refs.c: minimize number of refs visited Taylor Blau
2021-01-19 21:59         ` Jeff King
2021-01-19 22:15           ` Jeff King
2021-01-19 22:23             ` Taylor Blau
2021-01-19 22:52               ` Jeff King
2021-01-19 22:59                 ` Jeff King
2021-01-19 23:02                   ` Taylor Blau
2021-01-19 22:53   ` Jeff King
2021-01-19 23:00     ` Taylor Blau
2021-01-19 23:11       ` Jeff King
2021-01-20 10:40         ` Jacob Vosmaer
2021-01-20 10:44           ` Jacob Vosmaer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.