All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Taylor Blau" <me@ttaylorr.com>,
	"Martin Ågren" <martin.agren@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v2 07/11] ls-refs: ignore very long ref-prefix counts
Date: Tue, 14 Sep 2021 19:51:35 -0400	[thread overview]
Message-ID: <YUE1hz/KKM0XebCP@coredump.intra.peff.net> (raw)
In-Reply-To: <YUE1alo58cGyTw6/@coredump.intra.peff.net>

Because each "ref-prefix" capability from the client comes in its own
pkt-line, there's no limit to the number of them that a misbehaving
client may send. We read them all into a strvec, which means the client
can waste arbitrary amounts of our memory by just sending us "ref-prefix
foo" over and over.

One possible solution is to just drop the connection when the limit is
reached. If we set it high enough, then only misbehaving or malicious
clients would hit it. But "high enough" is vague, and it's unfriendly if
we guess wrong and a legitimate client hits this.

But we can do better. Since supporting the ref-prefix capability is
optional anyway, the client has to further cull the response based on
their own patterns. So we can simply ignore the patterns once we cross a
certain threshold. Note that we have to ignore _all_ patterns, not just
the ones past our limit (since otherwise we'd send too little data).

The limit here is fairly arbitrary, and probably much higher than anyone
would need in practice. It might be worth limiting it further, if only
because we check it linearly (so with "m" local refs and "n" patterns,
we do "m * n" string comparisons). But if we care about optimizing this,
an even better solution may be a more advanced data structure anyway.

I didn't bother making the limit configurable, since it's so high and
since Git should behave correctly in either case. It wouldn't be too
hard to do, but it makes both the code and documentation more complex.

Signed-off-by: Jeff King <peff@peff.net>
---
 ls-refs.c            | 20 ++++++++++++++++++--
 t/t5701-git-serve.sh | 31 +++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/ls-refs.c b/ls-refs.c
index a1a0250607..18c4f41e87 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -40,6 +40,12 @@ static void ensure_config_read(void)
 	config_read = 1;
 }
 
+/*
+ * If we see this many or more "ref-prefix" lines from the client, we consider
+ * it "too many" and will avoid using the prefix feature entirely.
+ */
+#define TOO_MANY_PREFIXES 65536
+
 /*
  * Check if one of the prefixes is a prefix of the ref.
  * If no prefixes were provided, all refs match.
@@ -156,15 +162,25 @@ int ls_refs(struct repository *r, struct packet_reader *request)
 			data.peel = 1;
 		else if (!strcmp("symrefs", arg))
 			data.symrefs = 1;
-		else if (skip_prefix(arg, "ref-prefix ", &out))
-			strvec_push(&data.prefixes, out);
+		else if (skip_prefix(arg, "ref-prefix ", &out)) {
+			if (data.prefixes.nr < TOO_MANY_PREFIXES)
+				strvec_push(&data.prefixes, out);
+		}
 		else if (!strcmp("unborn", arg))
 			data.unborn = allow_unborn;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
+	/*
+	 * If we saw too many prefixes, we must avoid using them at all; as
+	 * soon as we have any prefix, they are meant to form a comprehensive
+	 * list.
+	 */
+	if (data.prefixes.nr >= TOO_MANY_PREFIXES)
+		strvec_clear(&data.prefixes);
+
 	send_possibly_unborn_head(&data);
 	if (!data.prefixes.nr)
 		strvec_push(&data.prefixes, "");
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 930721f053..3bc96ebcde 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -158,6 +158,37 @@ test_expect_success 'refs/heads prefix' '
 	test_cmp expect actual
 '
 
+test_expect_success 'ignore very large set of prefixes' '
+	# generate a large number of ref-prefixes that we expect
+	# to match nothing; the value here exceeds TOO_MANY_PREFIXES
+	# from ls-refs.c.
+	{
+		echo command=ls-refs &&
+		echo object-format=$(test_oid algo)
+		echo 0001 &&
+		perl -le "print \"ref-prefix refs/heads/\$_\" for (1..65536)" &&
+		echo 0000
+	} |
+	test-tool pkt-line pack >in &&
+
+	# and then confirm that we see unmatched prefixes anyway (i.e.,
+	# that the prefix was not applied).
+	cat >expect <<-EOF &&
+	$(git rev-parse HEAD) HEAD
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/main) refs/heads/main
+	$(git rev-parse refs/heads/release) refs/heads/release
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	test-tool serve-v2 --stateless-rpc <in >out &&
+	test-tool pkt-line unpack <out >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'peel parameter' '
 	test-tool pkt-line pack >in <<-EOF &&
 	command=ls-refs
-- 
2.33.0.917.gae6ecbedc7


  parent reply	other threads:[~2021-09-14 23:51 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-14 15:29 [PATCH 0/9] reducing memory allocations for v2 servers Jeff King
2021-09-14 15:30 ` [PATCH 1/9] serve: rename is_command() to parse_command() Jeff King
2021-09-14 15:30 ` [PATCH 2/9] serve: return capability "value" from get_capability() Jeff King
2021-09-14 15:31 ` [PATCH 3/9] serve: add "receive" method for v2 capabilities table Jeff King
2021-09-14 15:31 ` [PATCH 4/9] serve: provide "receive" function for object-format capability Jeff King
2021-09-14 18:59   ` Martin Ågren
2021-09-14 15:33 ` [PATCH 5/9] serve: provide "receive" function for session-id capability Jeff King
2021-09-14 16:55   ` Taylor Blau
2021-09-14 17:06     ` Jeff King
2021-09-14 17:12       ` Taylor Blau
2021-09-14 19:02   ` Martin Ågren
2021-09-14 19:14     ` Jeff King
2021-09-14 15:33 ` [PATCH 6/9] serve: drop "keys" strvec Jeff King
2021-09-14 16:59   ` Taylor Blau
2021-09-14 17:16     ` Jeff King
2021-09-14 15:37 ` [PATCH 7/9] ls-refs: ignore very long ref-prefix counts Jeff King
2021-09-14 17:18   ` Taylor Blau
2021-09-14 17:27     ` Jeff King
2021-09-14 17:23   ` Jeff King
2021-09-14 19:06   ` Martin Ågren
2021-09-14 19:22     ` Jeff King
2021-09-14 22:09   ` Jeff King
2021-09-14 22:11     ` Taylor Blau
2021-09-14 22:15       ` Jeff King
2021-09-14 15:37 ` [PATCH 8/9] serve: reject bogus v2 "command=ls-refs=foo" Jeff King
2021-09-14 17:21   ` Taylor Blau
2021-09-14 15:37 ` [PATCH 9/9] serve: reject commands used as capabilities Jeff King
2021-09-14 17:30 ` [PATCH 0/9] reducing memory allocations for v2 servers Taylor Blau
2021-09-14 18:00 ` Junio C Hamano
2021-09-14 18:38   ` Jeff King
2021-09-14 23:51 ` [PATCH v2 0/11] limit " Jeff King
2021-09-14 23:51   ` [PATCH v2 01/11] serve: rename is_command() to parse_command() Jeff King
2021-09-14 23:51   ` [PATCH v2 02/11] serve: return capability "value" from get_capability() Jeff King
2021-09-14 23:51   ` [PATCH v2 03/11] serve: add "receive" method for v2 capabilities table Jeff King
2021-09-15  0:31     ` Ævar Arnfjörð Bjarmason
2021-09-15 16:35       ` Jeff King
2021-09-15 16:41     ` Junio C Hamano
2021-09-15 16:57       ` Jeff King
2021-09-14 23:51   ` [PATCH v2 04/11] serve: provide "receive" function for object-format capability Jeff King
2021-09-15 16:54     ` Junio C Hamano
2021-09-14 23:51   ` [PATCH v2 05/11] serve: provide "receive" function for session-id capability Jeff King
2021-09-15 16:56     ` Junio C Hamano
2021-09-14 23:51   ` [PATCH v2 06/11] serve: drop "keys" strvec Jeff King
2021-09-15 17:01     ` Junio C Hamano
2021-09-14 23:51   ` Jeff King [this message]
2021-09-15  4:16     ` [PATCH v2 07/11] ls-refs: ignore very long ref-prefix counts Taylor Blau
2021-09-15 16:39       ` Jeff King
2021-09-15  5:00     ` Eric Sunshine
2021-09-15 16:40       ` Jeff King
2021-09-14 23:52   ` [PATCH v2 08/11] docs/protocol-v2: clarify some ls-refs ref-prefix details Jeff King
2021-09-14 23:52   ` [PATCH v2 09/11] serve: reject bogus v2 "command=ls-refs=foo" Jeff King
2021-09-15  0:27     ` Ævar Arnfjörð Bjarmason
2021-09-15 16:28       ` Jeff King
2021-09-15  5:09     ` Eric Sunshine
2021-09-15 16:32       ` Jeff King
2021-09-15 17:33     ` Junio C Hamano
2021-09-15 17:39       ` Jeff King
2021-09-14 23:52   ` [PATCH v2 10/11] serve: reject commands used as capabilities Jeff King
2021-09-14 23:54   ` [PATCH v2 11/11] ls-refs: reject unknown arguments Jeff King
2021-09-15  0:09     ` Ævar Arnfjörð Bjarmason
2021-09-15 16:25       ` Jeff King
2021-09-15  4:17   ` [PATCH v2 0/11] limit memory allocations for v2 servers Taylor Blau
2021-09-15 18:33   ` Jeff King
2021-09-15 18:34     ` [PATCH v3 " Jeff King
2021-09-15 18:35       ` [PATCH v3 01/11] serve: rename is_command() to parse_command() Jeff King
2021-09-15 18:35       ` [PATCH v3 02/11] serve: return capability "value" from get_capability() Jeff King
2021-09-15 18:35       ` [PATCH v3 03/11] serve: add "receive" method for v2 capabilities table Jeff King
2021-09-15 18:35       ` [PATCH v3 04/11] serve: provide "receive" function for object-format capability Jeff King
2021-09-15 18:35       ` [PATCH v3 05/11] serve: provide "receive" function for session-id capability Jeff King
2021-09-15 18:35       ` [PATCH v3 06/11] serve: drop "keys" strvec Jeff King
2021-09-15 18:35       ` [PATCH v3 07/11] ls-refs: ignore very long ref-prefix counts Jeff King
2021-09-15 18:35       ` [PATCH v3 08/11] docs/protocol-v2: clarify some ls-refs ref-prefix details Jeff King
2021-09-15 18:36       ` [PATCH v3 09/11] serve: reject bogus v2 "command=ls-refs=foo" Jeff King
2021-09-15 18:36       ` [PATCH v3 10/11] serve: reject commands used as capabilities Jeff King
2021-09-15 18:36       ` [PATCH v3 11/11] ls-refs: reject unknown arguments Jeff King
2021-09-15  0:25 ` [PATCH 0/9] reducing memory allocations for v2 servers Ævar Arnfjörð Bjarmason
2021-09-15 16:41   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YUE1hz/KKM0XebCP@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=martin.agren@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.