Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc
@ 2020-05-13 18:04 Denton Liu
  2020-05-13 18:04 ` [PATCH 1/6] remote-curl: fix typo Denton Liu
                   ` (6 more replies)
  0 siblings, 7 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King

The following command hangs forever:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

This occurs because the --shallow-since arg is incorrect and the server
dies early. However, remote-curl does not realise that the server
errored out and just faithfully forwards the packets to fetch-pack
before waiting on more input from fetch-pack. Meanwhile, fetch-pack
keeps reading as it still expects more input. As a result, the processes
deadlock. Original analysis by Peff:
https://lore.kernel.org/git/20200328154936.GA1217052@coredump.intra.peff.net/

This isn't a full fix as it may still be possible to deadlock, as
described in the last commit message. However, this patch is probably
better than nothing as it fixes the reported bug. I've been working on
the proper reframing fix on-and-off for the past while but it seems
considerably more complicated so it'll probably take a while for me to
get it to a ready state.

Denton Liu (6):
  remote-curl: fix typo
  remote-curl: remove label indentation
  transport: combine common cases with a fallthrough
  pkt-line: extern packet_length()
  remote-curl: error on incomplete packet
  remote-curl: ensure last packet is a flush

 pkt-line.c             |  2 +-
 pkt-line.h             |  5 ++++
 remote-curl.c          | 58 +++++++++++++++++++++++++++++++++++++++---
 t/t5702-protocol-v2.sh | 17 +++++++++++++
 transport.c            | 10 +++-----
 5 files changed, 81 insertions(+), 11 deletions(-)

-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/6] remote-curl: fix typo
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
@ 2020-05-13 18:04 ` Denton Liu
  2020-05-13 18:04 ` [PATCH 2/6] remote-curl: remove label indentation Denton Liu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 1c9aa3d0ab..6844708f38 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -643,7 +643,7 @@ static size_t rpc_out(void *ptr, size_t eltsize,
 			return 0;
 		}
 		/*
-		 * If avail is non-zerp, the line length for the flush still
+		 * If avail is non-zero, the line length for the flush still
 		 * hasn't been fully sent. Proceed with sending the line
 		 * length.
 		 */
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 2/6] remote-curl: remove label indentation
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
  2020-05-13 18:04 ` [PATCH 1/6] remote-curl: fix typo Denton Liu
@ 2020-05-13 18:04 ` Denton Liu
  2020-05-13 18:04 ` [PATCH 3/6] transport: combine common cases with a fallthrough Denton Liu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King

In the codebase, labels are aligned to the leftmost column. Remove the
space-indentation from `free_specs:` to conform to this.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 6844708f38..da3e07184a 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1276,7 +1276,7 @@ static void parse_push(struct strbuf *buf)
 	if (ret)
 		exit(128); /* error already reported */
 
- free_specs:
+free_specs:
 	argv_array_clear(&specs);
 }
 
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 3/6] transport: combine common cases with a fallthrough
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
  2020-05-13 18:04 ` [PATCH 1/6] remote-curl: fix typo Denton Liu
  2020-05-13 18:04 ` [PATCH 2/6] remote-curl: remove label indentation Denton Liu
@ 2020-05-13 18:04 ` Denton Liu
  2020-05-13 23:14   ` Eric Sunshine
  2020-05-13 18:04 ` [PATCH 4/6] pkt-line: extern packet_length() Denton Liu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King

In the switch statement, the difference between the `protocol_v2` and
`protocol_v{1,0}` arms is a prepatory call to die_if_server_options() in
the latter. The fetch_pack() call is identical in both arms. However,
since this fetch_pack() call has so many parameters, it is not
immediately obvious that the call is identical in both cases.

Rewrite the switch statement to fallthrough from the v{1,0} case to v2
so that they share a common fetch_pack() call. This reduces duplication
and makes the logic more clear for future readers.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 transport.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/transport.c b/transport.c
index 15f5ba4e8f..475f94564a 100644
--- a/transport.c
+++ b/transport.c
@@ -370,15 +370,11 @@ static int fetch_refs_via_pack(struct transport *transport,
 	}
 
 	switch (data->version) {
-	case protocol_v2:
-		refs = fetch_pack(&args, data->fd,
-				  refs_tmp ? refs_tmp : transport->remote_refs,
-				  to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile, data->version);
-		break;
-	case protocol_v1:
 	case protocol_v0:
+	case protocol_v1:
 		die_if_server_options(transport);
+		/* fallthrough */
+	case protocol_v2:
 		refs = fetch_pack(&args, data->fd,
 				  refs_tmp ? refs_tmp : transport->remote_refs,
 				  to_fetch, nr_heads, &data->shallow,
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 4/6] pkt-line: extern packet_length()
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
                   ` (2 preceding siblings ...)
  2020-05-13 18:04 ` [PATCH 3/6] transport: combine common cases with a fallthrough Denton Liu
@ 2020-05-13 18:04 ` Denton Liu
  2020-05-13 23:23   ` Eric Sunshine
  2020-05-15 20:56   ` Jeff King
  2020-05-13 18:04 ` [PATCH 5/6] remote-curl: error on incomplete packet Denton Liu
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King

In a future commit, we will be manually processing packets and we will
need to access the length header. In order to simplify this, extern
packet_length() so that the logic can be reused.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 pkt-line.c | 2 +-
 pkt-line.h | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/pkt-line.c b/pkt-line.c
index a0e87b1e81..6b60886770 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -306,7 +306,7 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 	return ret;
 }
 
-static int packet_length(const char *linelen)
+int packet_length(const char *linelen)
 {
 	int val = hex2chr(linelen);
 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
diff --git a/pkt-line.h b/pkt-line.h
index fef3a0d792..f443185f8f 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -74,6 +74,11 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Reads a packetized line and returns the length header of the packet.
+ */
+int packet_length(const char *linelen);
+
 /*
  * Read a packetized line into a buffer like the 'packet_read()' function but
  * returns an 'enum packet_read_status' which indicates the status of the read.
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 5/6] remote-curl: error on incomplete packet
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
                   ` (3 preceding siblings ...)
  2020-05-13 18:04 ` [PATCH 4/6] pkt-line: extern packet_length() Denton Liu
@ 2020-05-13 18:04 ` Denton Liu
  2020-05-15 21:38   ` Jeff King
  2020-05-13 18:04 ` [PATCH 6/6] remote-curl: ensure last packet is a flush Denton Liu
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
  6 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King

Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.

Instead of blindly forwarding packets, inspect each packet to ensure
that it is a full packet, erroring out if a partial packet is sent.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---

Notes:
    Unfortunately, I'm not really sure how to test this.

 remote-curl.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index da3e07184a..8b740354e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -682,6 +682,8 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
 struct rpc_in_data {
 	struct rpc_state *rpc;
 	struct active_request_slot *slot;
+	struct strbuf len_buf;
+	int remaining;
 };
 
 /*
@@ -692,6 +694,7 @@ static size_t rpc_in(char *ptr, size_t eltsize,
 		size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
+	size_t unwritten = size;
 	struct rpc_in_data *data = buffer_;
 	long response_code;
 
@@ -702,7 +705,42 @@ static size_t rpc_in(char *ptr, size_t eltsize,
 		return size;
 	if (size)
 		data->rpc->any_written = 1;
-	write_or_die(data->rpc->in, ptr, size);
+
+	while (unwritten) {
+		if (!data->remaining) {
+			int digits_remaining = 4 - data->len_buf.len;
+			if (digits_remaining > unwritten)
+				digits_remaining = unwritten;
+			strbuf_add(&data->len_buf, ptr, digits_remaining);
+			ptr += digits_remaining;
+			unwritten -= digits_remaining;
+
+			if (data->len_buf.len == 4) {
+				data->remaining = packet_length(data->len_buf.buf);
+				if (data->remaining < 0) {
+					die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf);
+				} else if (data->remaining <= 1) {
+					data->remaining = 0;
+				} else if (data->remaining < 4) {
+					die(_("remote-curl: bad line length %d"), data->remaining);
+				} else {
+					data->remaining -= 4;
+				}
+				write_or_die(data->rpc->in, data->len_buf.buf, 4);
+				strbuf_reset(&data->len_buf);
+			}
+		}
+
+		if (data->remaining) {
+			int remaining = data->remaining;
+			if (remaining > unwritten)
+				remaining = unwritten;
+			write_or_die(data->rpc->in, ptr, remaining);
+			ptr += remaining;
+			unwritten -= remaining;
+			data->remaining -= remaining;
+		}
+	}
 	return size;
 }
 
@@ -920,6 +958,8 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
+	strbuf_init(&rpc_in_data.len_buf, 4);
+	rpc_in_data.remaining = 0;
 	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -936,6 +976,9 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	if (!rpc->any_written)
 		err = -1;
 
+	if (rpc_in_data.remaining)
+		err = error(_("%d bytes are still expected"), rpc_in_data.remaining);
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 6/6] remote-curl: ensure last packet is a flush
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
                   ` (4 preceding siblings ...)
  2020-05-13 18:04 ` [PATCH 5/6] remote-curl: error on incomplete packet Denton Liu
@ 2020-05-13 18:04 ` Denton Liu
  2020-05-15 21:02   ` Denton Liu
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
  6 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-13 18:04 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Force Charlie

Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.

This can be seen in the following command which does not terminate:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

whereas the v1 version does terminate as expected:

	$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...
	fatal: the remote end hung up unexpectedly

Instead of blindly forwarding packets, raise a flag when a flush packet
is encountered. Ensure that the last packet sent is a flush packet
otherwise error out, breaking the deadlock.

This is not a complete solution to the problem, however. It is possible
that a flush packet could be sent in the middle of a message and the
connection could die immediately after. Then, remote-curl would not
error out and fetch-pack would still be in the middle of a transaction
and they would enter deadlock. A complete solution would involve
reframing the stateless-connect protocol, possibly by introducing
another control packet ("0002"?) as a stateless request separator
packet which is always sent at the end of post_rpc().

Although this is not a complete solution, it is better than nothing and
it resolves the reported issue for now.

Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---

Notes:
    I wish there were some way to insert a timeout on the test case so that
    we don't block forever in case we regress.

 remote-curl.c          |  9 +++++++++
 t/t5702-protocol-v2.sh | 17 +++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/remote-curl.c b/remote-curl.c
index 8b740354e5..aab17851be 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -684,6 +684,7 @@ struct rpc_in_data {
 	struct active_request_slot *slot;
 	struct strbuf len_buf;
 	int remaining;
+	int last_flush;
 };
 
 /*
@@ -707,6 +708,8 @@ static size_t rpc_in(char *ptr, size_t eltsize,
 		data->rpc->any_written = 1;
 
 	while (unwritten) {
+		data->last_flush = 0;
+
 		if (!data->remaining) {
 			int digits_remaining = 4 - data->len_buf.len;
 			if (digits_remaining > unwritten)
@@ -720,6 +723,8 @@ static size_t rpc_in(char *ptr, size_t eltsize,
 				if (data->remaining < 0) {
 					die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf);
 				} else if (data->remaining <= 1) {
+					if (!data->remaining)
+						data->last_flush = 1;
 					data->remaining = 0;
 				} else if (data->remaining < 4) {
 					die(_("remote-curl: bad line length %d"), data->remaining);
@@ -960,6 +965,7 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	rpc_in_data.slot = slot;
 	strbuf_init(&rpc_in_data.len_buf, 4);
 	rpc_in_data.remaining = 0;
+	rpc_in_data.last_flush = 0;
 	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -979,6 +985,9 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	if (rpc_in_data.remaining)
 		err = error(_("%d bytes are still expected"), rpc_in_data.remaining);
 
+	if (!rpc_in_data.last_flush)
+		err = error(_("last packet was not a flush"));
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 5039e66dc4..4570d0746c 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -586,6 +586,23 @@ test_expect_success 'clone with http:// using protocol v2' '
 	! grep "Send header: Transfer-Encoding: chunked" log
 '
 
+test_expect_success 'clone with http:// using protocol v2 and invalid parameters' '
+	test_when_finished "rm -f log" &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" \
+		git -c protocol.version=2 \
+		clone --shallow-since=20151012 "$HTTPD_URL/smart/http_parent" http_child_invalid 2>err &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+	# Verify that we errored out in the expected way
+	test_i18ngrep "last packet was not a flush" err &&
+	# Verify that the chunked encoding sending codepath is NOT exercised
+	! grep "Send header: Transfer-Encoding: chunked" log
+'
+
 test_expect_success 'clone big repository with http:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/6] transport: combine common cases with a fallthrough
  2020-05-13 18:04 ` [PATCH 3/6] transport: combine common cases with a fallthrough Denton Liu
@ 2020-05-13 23:14   ` Eric Sunshine
  2020-05-18  9:18     ` Denton Liu
  0 siblings, 1 reply; 62+ messages in thread
From: Eric Sunshine @ 2020-05-13 23:14 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Jeff King

On Wed, May 13, 2020 at 2:07 PM Denton Liu <liu.denton@gmail.com> wrote:
> In the switch statement, the difference between the `protocol_v2` and
> `protocol_v{1,0}` arms is a prepatory call to die_if_server_options() in

What is "prepatory"?

More below...

> the latter. The fetch_pack() call is identical in both arms. However,
> since this fetch_pack() call has so many parameters, it is not
> immediately obvious that the call is identical in both cases.
>
> Rewrite the switch statement to fallthrough from the v{1,0} case to v2
> so that they share a common fetch_pack() call. This reduces duplication
> and makes the logic more clear for future readers.
>
> Signed-off-by: Denton Liu <liu.denton@gmail.com>
> ---
> diff --git a/transport.c b/transport.c
> @@ -370,15 +370,11 @@ static int fetch_refs_via_pack(struct transport *transport,
>         switch (data->version) {
> -       case protocol_v2:
> -               refs = fetch_pack(&args, data->fd,
> -                                 refs_tmp ? refs_tmp : transport->remote_refs,
> -                                 to_fetch, nr_heads, &data->shallow,
> -                                 &transport->pack_lockfile, data->version);
> -               break;
> -       case protocol_v1:
>         case protocol_v0:
> +       case protocol_v1:
>                 die_if_server_options(transport);
> +               /* fallthrough */
> +       case protocol_v2:
>                 refs = fetch_pack(&args, data->fd,
>                                   refs_tmp ? refs_tmp : transport->remote_refs,
>                                   to_fetch, nr_heads, &data->shallow,

I can't say that I'm a fan of this change. While it may make it clear
that the two calls are identical, it makes it more difficult to reason
about the distinct v0, v1, and v2 cases. It also makes it more
difficult to extend or make changes to one or another case, should
that become necessary in the future.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 4/6] pkt-line: extern packet_length()
  2020-05-13 18:04 ` [PATCH 4/6] pkt-line: extern packet_length() Denton Liu
@ 2020-05-13 23:23   ` Eric Sunshine
  2020-05-15 20:56   ` Jeff King
  1 sibling, 0 replies; 62+ messages in thread
From: Eric Sunshine @ 2020-05-13 23:23 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Jeff King

On Wed, May 13, 2020 at 2:07 PM Denton Liu <liu.denton@gmail.com> wrote:
> In a future commit, we will be manually processing packets and we will
> need to access the length header. In order to simplify this, extern
> packet_length() so that the logic can be reused.
>
> Signed-off-by: Denton Liu <liu.denton@gmail.com>
> ---
> diff --git a/pkt-line.c b/pkt-line.c
> @@ -306,7 +306,7 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size,
> +/*
> + * Reads a packetized line and returns the length header of the packet.
> + */
> +int packet_length(const char *linelen);

The function comment seems rather gobbledy-gooky to me. Perhaps it
could be clearer by saying something along the lines of the input
being a hexadecimal 2-digit representation of a packet length and that
the function converts it to a numeric value (between 0 and 255).

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 4/6] pkt-line: extern packet_length()
  2020-05-13 18:04 ` [PATCH 4/6] pkt-line: extern packet_length() Denton Liu
  2020-05-13 23:23   ` Eric Sunshine
@ 2020-05-15 20:56   ` Jeff King
  2020-05-15 20:57     ` Jeff King
  1 sibling, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-15 20:56 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List

On Wed, May 13, 2020 at 02:04:56PM -0400, Denton Liu wrote:

> +/*
> + * Reads a packetized line and returns the length header of the packet.
> + */
> +int packet_length(const char *linelen);

If this is becoming public, I think we need to talk a bit more about:

  - what is linelen; it's not a NUL-terminated string, which I would
    have expected from the prototype. It must be at least 4 chars, and
    doesn't need terminated.

  - what happens on error; it looks like we return -1

I think the prototype would be more clear to me as:

  int packet_length(const char *line, size_t len)
  {
	if (len < 4)
		return -1;
  }

which makes it clear that this is a sized buffer. But since nobody
should ever be passing anything except "4", it may be overkill. I'd be
happy enough with a comment.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 4/6] pkt-line: extern packet_length()
  2020-05-15 20:56   ` Jeff King
@ 2020-05-15 20:57     ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-15 20:57 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List

On Fri, May 15, 2020 at 04:56:32PM -0400, Jeff King wrote:

> On Wed, May 13, 2020 at 02:04:56PM -0400, Denton Liu wrote:
> 
> > +/*
> > + * Reads a packetized line and returns the length header of the packet.
> > + */
> > +int packet_length(const char *linelen);
> 
> If this is becoming public, I think we need to talk a bit more about:
> 
>   - what is linelen; it's not a NUL-terminated string, which I would
>     have expected from the prototype. It must be at least 4 chars, and
>     doesn't need terminated.
> 
>   - what happens on error; it looks like we return -1
> 
> I think the prototype would be more clear to me as:
> 
>   int packet_length(const char *line, size_t len)
>   {
> 	if (len < 4)
> 		return -1;
>   }
> 
> which makes it clear that this is a sized buffer. But since nobody
> should ever be passing anything except "4", it may be overkill. I'd be
> happy enough with a comment.

Oh, and obviously:

  int packet_length(const char linelen[4]);

would be descriptive, but I think falls afoul of C pointer/array
weirdness.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 6/6] remote-curl: ensure last packet is a flush
  2020-05-13 18:04 ` [PATCH 6/6] remote-curl: ensure last packet is a flush Denton Liu
@ 2020-05-15 21:02   ` Denton Liu
  2020-05-15 21:41     ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-15 21:02 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Force Charlie

On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
> This is not a complete solution to the problem, however. It is possible
> that a flush packet could be sent in the middle of a message and the
> connection could die immediately after. Then, remote-curl would not
> error out and fetch-pack would still be in the middle of a transaction
> and they would enter deadlock. A complete solution would involve
> reframing the stateless-connect protocol, possibly by introducing
> another control packet ("0002"?) as a stateless request separator
> packet which is always sent at the end of post_rpc().
> 
> Although this is not a complete solution, it is better than nothing and
> it resolves the reported issue for now.

I managed to get the implementation of the control packet working. As a
result, I will be dropping this patch in the next reroll and replacing
it with the more complete solution. For anyone reviewing, feel free to
skip this patch.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/6] remote-curl: error on incomplete packet
  2020-05-13 18:04 ` [PATCH 5/6] remote-curl: error on incomplete packet Denton Liu
@ 2020-05-15 21:38   ` Jeff King
  2020-05-18  9:08     ` Denton Liu
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-15 21:38 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List

On Wed, May 13, 2020 at 02:04:57PM -0400, Denton Liu wrote:

> Currently, remote-curl acts as a proxy and blindly forwards packets
> between an HTTP server and fetch-pack. In the case of a stateless RPC
> connection where the connection is terminated with a partially written
> packet, remote-curl will blindly send the partially written packet
> before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
> read the partial packet and continue reading, expecting more input. This
> results in a deadlock between the two processes.
> 
> Instead of blindly forwarding packets, inspect each packet to ensure
> that it is a full packet, erroring out if a partial packet is sent.

Hmm. Right now there's no assumption in rpc_in that we're writing
pktlines. Will this always be the case?

I think the answer is probably yes. If there's a case where it isn't, it
might be something like v0 git-over-http against a server which doesn't
have the sideband capability.

> diff --git a/remote-curl.c b/remote-curl.c
> index da3e07184a..8b740354e5 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -682,6 +682,8 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
>  struct rpc_in_data {
>  	struct rpc_state *rpc;
>  	struct active_request_slot *slot;
> +	struct strbuf len_buf;
> +	int remaining;

This remaining is "remaining in the current packet", I assume? An "int"
should be OK for that.

Using a strbuf feels a bit like overkill, because we have to remember to
free it (which I think doesn't actually happen in your patch). Could we
just use a "char len_buf[4]" (you'd need an extra int to count how many
bytes you've received there).

> @@ -702,7 +705,42 @@ static size_t rpc_in(char *ptr, size_t eltsize,
>  		return size;
>  	if (size)
>  		data->rpc->any_written = 1;
> -	write_or_die(data->rpc->in, ptr, size);
> +
> +	while (unwritten) {
> +		if (!data->remaining) {
> +			int digits_remaining = 4 - data->len_buf.len;
> +			if (digits_remaining > unwritten)
> +				digits_remaining = unwritten;
> +			strbuf_add(&data->len_buf, ptr, digits_remaining);
> +			ptr += digits_remaining;
> +			unwritten -= digits_remaining;

So we actually save up the 4 bytes, not sending them at all until we get
them. I wonder if this might be easier to follow if our count was simply
advisory. I.e., continue to write data as we get it, but keep a small
state machine tracking pktlines (which could even be done in its own
separate struct/function pair).

I dunno. It might be about the same level of confusing, but it makes it
easy to keep the logic separate from rpc_in, and it lets us put all of
the policy bits at the end, after we know we've received all of the
data (in post_rpc):

  if (data->len_buf.len < 4)
          die("incomplete packet header");
  if (data->remaining)
          die("incomplete packet");
  /* imagine we kept the actual pktlen value, instead of just counting
   * down remaining */
  if (data->pktlen)
          die("did not end in flush");

Notably, I'm not sure if your code will complain if the connection dies
will reading the 4-byte header (remaining would still be 0). That won't
leave the caller trying to read the packet (we never sent them the
header), but they may still be waiting for a response.

> +			if (data->len_buf.len == 4) {
> +				data->remaining = packet_length(data->len_buf.buf);
> +				if (data->remaining < 0) {
> +					die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf);
> +				} else if (data->remaining <= 1) {
> +					data->remaining = 0;

This treats 0001 delimiters the same as a 0000 flush. Expecting 0 more
bytes is the right thing, but would we later want to differentiate in
post_rpc()? I.e., is it ever correct for the server data to end with a
delim?

> +				} else if (data->remaining < 4) {
> +					die(_("remote-curl: bad line length %d"), data->remaining);

We don't use an 0002 or 0003 packet for anything yet, but this would
need to be updated if we ever did. I wonder if we should treat them also
as zero-length packets and quietly pass through, which is likely the
right thing to do. OTOH, this should complain loudly enough that
somebody would presumably notice as soon as they tried to use those
packets.

Regarding testing, I think the ideal thing would be a proxy layer we
could insert that simply eats all data after N bytes. That would be easy
to do if we could use --upload-pack='git-upload-pack | head -c 500' or
something. But we need it to happen between curl and the server side of
Git. Possibly we could insert something between apache and
git-http-backend (simialr to how we handle broken-smart-http.sh.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 6/6] remote-curl: ensure last packet is a flush
  2020-05-15 21:02   ` Denton Liu
@ 2020-05-15 21:41     ` Jeff King
  2020-05-18 16:34       ` Junio C Hamano
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-15 21:41 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Force Charlie

On Fri, May 15, 2020 at 05:02:45PM -0400, Denton Liu wrote:

> On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
> > This is not a complete solution to the problem, however. It is possible
> > that a flush packet could be sent in the middle of a message and the
> > connection could die immediately after. Then, remote-curl would not
> > error out and fetch-pack would still be in the middle of a transaction
> > and they would enter deadlock. A complete solution would involve
> > reframing the stateless-connect protocol, possibly by introducing
> > another control packet ("0002"?) as a stateless request separator
> > packet which is always sent at the end of post_rpc().
> > 
> > Although this is not a complete solution, it is better than nothing and
> > it resolves the reported issue for now.
> 
> I managed to get the implementation of the control packet working. As a
> result, I will be dropping this patch in the next reroll and replacing
> it with the more complete solution. For anyone reviewing, feel free to
> skip this patch.

OK. I'm less concerned about a flush packet in the middle of the
response fooling us into thinking things are done, and more that there
may be responses which _don't_ end in a flush. But maybe they all do.

This (and the previous patch) are definitely adding an extra layer of
assumptions about what the protocol going over the rpc channel looks
like. That seems like it will introduce more fragility.

I do kind of like the idea of a stateless separator packet, if I
understand your meaning correctly. I'll wait to see what the patches
look like. :)

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/6] remote-curl: error on incomplete packet
  2020-05-15 21:38   ` Jeff King
@ 2020-05-18  9:08     ` Denton Liu
  2020-05-18 15:49       ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18  9:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

Hi Peff,

On Fri, May 15, 2020 at 05:38:44PM -0400, Jeff King wrote:
> On Wed, May 13, 2020 at 02:04:57PM -0400, Denton Liu wrote:
> 
> > Currently, remote-curl acts as a proxy and blindly forwards packets
> > between an HTTP server and fetch-pack. In the case of a stateless RPC
> > connection where the connection is terminated with a partially written
> > packet, remote-curl will blindly send the partially written packet
> > before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
> > read the partial packet and continue reading, expecting more input. This
> > results in a deadlock between the two processes.
> > 
> > Instead of blindly forwarding packets, inspect each packet to ensure
> > that it is a full packet, erroring out if a partial packet is sent.
> 
> Hmm. Right now there's no assumption in rpc_in that we're writing
> pktlines. Will this always be the case?
> 
> I think the answer is probably yes. If there's a case where it isn't, it
> might be something like v0 git-over-http against a server which doesn't
> have the sideband capability.

As far as I can tell from skimming the code, v0 uses always pktlines
although I'm far from being an expert on Git's networking stuff. Perhaps
we could implement this such that the line-length checking only happens
for stateless_connect()?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/6] transport: combine common cases with a fallthrough
  2020-05-13 23:14   ` Eric Sunshine
@ 2020-05-18  9:18     ` Denton Liu
  2020-05-18 17:43       ` Eric Sunshine
  0 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18  9:18 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git Mailing List, Jeff King

Hi Eric,

On Wed, May 13, 2020 at 07:14:35PM -0400, Eric Sunshine wrote:
> > diff --git a/transport.c b/transport.c
> > @@ -370,15 +370,11 @@ static int fetch_refs_via_pack(struct transport *transport,
> >         switch (data->version) {
> > -       case protocol_v2:
> > -               refs = fetch_pack(&args, data->fd,
> > -                                 refs_tmp ? refs_tmp : transport->remote_refs,
> > -                                 to_fetch, nr_heads, &data->shallow,
> > -                                 &transport->pack_lockfile, data->version);
> > -               break;
> > -       case protocol_v1:
> >         case protocol_v0:
> > +       case protocol_v1:
> >                 die_if_server_options(transport);
> > +               /* fallthrough */
> > +       case protocol_v2:
> >                 refs = fetch_pack(&args, data->fd,
> >                                   refs_tmp ? refs_tmp : transport->remote_refs,
> >                                   to_fetch, nr_heads, &data->shallow,
> 
> I can't say that I'm a fan of this change. While it may make it clear
> that the two calls are identical, it makes it more difficult to reason
> about the distinct v0, v1, and v2 cases.

I would argue that it would make it easier to reason about the distinct
cases. From the rewritten version, it's more obvious that code used in
the v2 case is a subset of the code used in v0 and v1.

> It also makes it more
> difficult to extend or make changes to one or another case, should
> that become necessary in the future.

I would say that's the point of this change ;) When I was looking over
this code initially, I was scratching my head over the difference
between the two cases because I expected them to be different in the two
cases. If a change is necessary in the future, then it'll make sense to
write it as two separate calls to fetch_pack() or whatever but until
that happens, I think it makes more sense remove the duplication.

Actually, thinking about it some more, do you think it would make more
sense to just pull the fetch_pack() call out of the switch statement
entirely? We could entirely eliminate the fallthrough if we do this.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
                   ` (5 preceding siblings ...)
  2020-05-13 18:04 ` [PATCH 6/6] remote-curl: ensure last packet is a flush Denton Liu
@ 2020-05-18 15:47 ` Denton Liu
  2020-05-18 15:47   ` [PATCH v2 1/7] remote-curl: fix typo Denton Liu
                     ` (8 more replies)
  6 siblings, 9 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

The following command hangs forever:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

This occurs because the --shallow-since arg is incorrect and the server
dies early. However, remote-curl does not realise that the server
errored out and just faithfully forwards the packets to fetch-pack
before waiting on more input from fetch-pack. Meanwhile, fetch-pack
keeps reading as it still expects more input. As a result, the processes
deadlock. Original analysis by Peff:
https://lore.kernel.org/git/20200328154936.GA1217052@coredump.intra.peff.net/

Changes since v1:

* Remove fallthrough in switch in favour of just extracting the common
  call out of the switch in patch 3

* Add more detail in function comment and use `const char linelen[4]` in
  patch 4

* Implement most of Peff's suggestions[0] in patch 5

* Only operate on stateless_connect() in patch 5

* Add tests in patch 5

* Drop "remote-curl: ensure last packet is a flush" in favour of
  "stateless-connect: send response end packet"

[0]: https://lore.kernel.org/git/20200515213844.GD115445@coredump.intra.peff.net/

Denton Liu (7):
  remote-curl: fix typo
  remote-curl: remove label indentation
  transport: extract common fetch_pack() call
  pkt-line: extern packet_length()
  remote-curl: error on incomplete packet
  pkt-line: PACKET_READ_RESPONSE_END
  stateless-connect: send response end packet

 Documentation/gitremote-helpers.txt           |  4 +-
 Documentation/technical/protocol-v2.txt       |  2 +
 builtin/fetch-pack.c                          |  2 +-
 connect.c                                     | 12 +++-
 fetch-pack.c                                  | 12 ++++
 pkt-line.c                                    | 13 +++-
 pkt-line.h                                    | 11 +++
 remote-curl.c                                 | 72 +++++++++++++++++--
 remote.h                                      |  3 +-
 serve.c                                       |  2 +
 t/helper/test-pkt-line.c                      |  4 ++
 t/lib-httpd.sh                                |  2 +
 t/lib-httpd/apache.conf                       |  8 +++
 .../incomplete-body-upload-pack-v2-http.sh    |  3 +
 .../incomplete-length-upload-pack-v2-http.sh  |  3 +
 t/t5702-protocol-v2.sh                        | 47 ++++++++++++
 transport.c                                   | 16 ++---
 17 files changed, 197 insertions(+), 19 deletions(-)
 create mode 100644 t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
 create mode 100644 t/lib-httpd/incomplete-length-upload-pack-v2-http.sh

Range-diff against v1:
1:  b390875f87 = 1:  b390875f87 remote-curl: fix typo
2:  a2b28c0b28 = 2:  a2b28c0b28 remote-curl: remove label indentation
3:  c89c184100 < -:  ---------- transport: combine common cases with a fallthrough
-:  ---------- > 3:  3a42575bd5 transport: extract common fetch_pack() call
4:  891a39c853 ! 4:  c2b9d033bb pkt-line: extern packet_length()
    @@ Commit message
         need to access the length header. In order to simplify this, extern
         packet_length() so that the logic can be reused.
     
    +    Change the function parameter from a `const char *` to
    +    `const char linelen[4]`. Even though these two types behave identically
    +    as function parameters, use the array notation to semantically indicate
    +    exactly what this function is expecting as an argument.
    +
      ## pkt-line.c ##
     @@ pkt-line.c: static int get_packet_data(int fd, char **src_buf, size_t *src_size,
      	return ret;
      }
      
     -static int packet_length(const char *linelen)
    -+int packet_length(const char *linelen)
    ++int packet_length(const char linelen[4])
      {
      	int val = hex2chr(linelen);
      	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
    @@ pkt-line.h: int write_packetized_from_buf(const char *src_in, size_t len, int fd
      		*buffer, unsigned size, int options);
      
     +/*
    -+ * Reads a packetized line and returns the length header of the packet.
    ++ * Convert a four hex digit packet line length header into its numeric
    ++ * representation. linelen should not be null-terminated.
    ++ *
    ++ * If linelen contains non-hex characters, return -1. Otherwise, return the
    ++ * numeric value of the length header.
     + */
    -+int packet_length(const char *linelen);
    ++int packet_length(const char linelen[4]);
     +
      /*
       * Read a packetized line into a buffer like the 'packet_read()' function but
5:  3ed7cf87aa < -:  ---------- remote-curl: error on incomplete packet
6:  7a689da2bb < -:  ---------- remote-curl: ensure last packet is a flush
-:  ---------- > 5:  52ce5fdffd remote-curl: error on incomplete packet
-:  ---------- > 6:  744b078324 pkt-line: PACKET_READ_RESPONSE_END
-:  ---------- > 7:  4b079bcd83 stateless-connect: send response end packet
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 1/7] remote-curl: fix typo
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 15:47   ` [PATCH v2 2/7] remote-curl: remove label indentation Denton Liu
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 1c9aa3d0ab..6844708f38 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -643,7 +643,7 @@ static size_t rpc_out(void *ptr, size_t eltsize,
 			return 0;
 		}
 		/*
-		 * If avail is non-zerp, the line length for the flush still
+		 * If avail is non-zero, the line length for the flush still
 		 * hasn't been fully sent. Proceed with sending the line
 		 * length.
 		 */
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 2/7] remote-curl: remove label indentation
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
  2020-05-18 15:47   ` [PATCH v2 1/7] remote-curl: fix typo Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 18:37     ` Junio C Hamano
  2020-05-18 15:47   ` [PATCH v2 3/7] transport: extract common fetch_pack() call Denton Liu
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

In the codebase, labels are aligned to the leftmost column. Remove the
space-indentation from `free_specs:` to conform to this.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 6844708f38..da3e07184a 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1276,7 +1276,7 @@ static void parse_push(struct strbuf *buf)
 	if (ret)
 		exit(128); /* error already reported */
 
- free_specs:
+free_specs:
 	argv_array_clear(&specs);
 }
 
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 3/7] transport: extract common fetch_pack() call
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
  2020-05-18 15:47   ` [PATCH v2 1/7] remote-curl: fix typo Denton Liu
  2020-05-18 15:47   ` [PATCH v2 2/7] remote-curl: remove label indentation Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 18:40     ` Junio C Hamano
  2020-05-18 15:47   ` [PATCH v2 4/7] pkt-line: extern packet_length() Denton Liu
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

In the switch statement, the difference between the `protocol_v2` and
`protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in
the latter. The fetch_pack() call is identical in both arms. However,
since this fetch_pack() call has so many parameters, it is not
immediately obvious that the call is identical in both cases.

Extract the common fetch_pack() call out of the switch statement so that
code duplication is reduced and the logic is more clear for future
readers.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 transport.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/transport.c b/transport.c
index 15f5ba4e8f..a6002e502f 100644
--- a/transport.c
+++ b/transport.c
@@ -371,22 +371,19 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	switch (data->version) {
 	case protocol_v2:
-		refs = fetch_pack(&args, data->fd,
-				  refs_tmp ? refs_tmp : transport->remote_refs,
-				  to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile, data->version);
+		/* do nothing */
 		break;
 	case protocol_v1:
 	case protocol_v0:
 		die_if_server_options(transport);
-		refs = fetch_pack(&args, data->fd,
-				  refs_tmp ? refs_tmp : transport->remote_refs,
-				  to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
 	}
+	refs = fetch_pack(&args, data->fd,
+			  refs_tmp ? refs_tmp : transport->remote_refs,
+			  to_fetch, nr_heads, &data->shallow,
+			  &transport->pack_lockfile, data->version);
 
 	close(data->fd[0]);
 	close(data->fd[1]);
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 4/7] pkt-line: extern packet_length()
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
                     ` (2 preceding siblings ...)
  2020-05-18 15:47   ` [PATCH v2 3/7] transport: extract common fetch_pack() call Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 16:04     ` Jeff King
  2020-05-18 15:47   ` [PATCH v2 5/7] remote-curl: error on incomplete packet Denton Liu
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

In a future commit, we will be manually processing packets and we will
need to access the length header. In order to simplify this, extern
packet_length() so that the logic can be reused.

Change the function parameter from a `const char *` to
`const char linelen[4]`. Even though these two types behave identically
as function parameters, use the array notation to semantically indicate
exactly what this function is expecting as an argument.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 pkt-line.c | 2 +-
 pkt-line.h | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/pkt-line.c b/pkt-line.c
index a0e87b1e81..5c3b4539b5 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -306,7 +306,7 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 	return ret;
 }
 
-static int packet_length(const char *linelen)
+int packet_length(const char linelen[4])
 {
 	int val = hex2chr(linelen);
 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
diff --git a/pkt-line.h b/pkt-line.h
index fef3a0d792..de81776a7a 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -74,6 +74,15 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Convert a four hex digit packet line length header into its numeric
+ * representation. linelen should not be null-terminated.
+ *
+ * If linelen contains non-hex characters, return -1. Otherwise, return the
+ * numeric value of the length header.
+ */
+int packet_length(const char linelen[4]);
+
 /*
  * Read a packetized line into a buffer like the 'packet_read()' function but
  * returns an 'enum packet_read_status' which indicates the status of the read.
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 5/7] remote-curl: error on incomplete packet
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
                     ` (3 preceding siblings ...)
  2020-05-18 15:47   ` [PATCH v2 4/7] pkt-line: extern packet_length() Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 16:22     ` Jeff King
  2020-05-18 15:47   ` [PATCH v2 6/7] pkt-line: PACKET_READ_RESPONSE_END Denton Liu
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.

For a stateless connection, inspect packets before sending them and
error out if a a packet line packet is incomplete.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c                                 | 59 ++++++++++++++++++-
 t/lib-httpd.sh                                |  2 +
 t/lib-httpd/apache.conf                       |  8 +++
 .../incomplete-body-upload-pack-v2-http.sh    |  3 +
 .../incomplete-length-upload-pack-v2-http.sh  |  3 +
 t/t5702-protocol-v2.sh                        | 34 +++++++++++
 6 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
 create mode 100644 t/lib-httpd/incomplete-length-upload-pack-v2-http.sh

diff --git a/remote-curl.c b/remote-curl.c
index da3e07184a..e020140092 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -679,9 +679,53 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
 }
 #endif
 
+struct check_pktline_state {
+	char len_buf[4];
+	int len_filled;
+	int remaining;
+};
+
+static void check_pktline(struct check_pktline_state *state, const char *ptr, size_t size)
+{
+	while (size) {
+		if (!state->remaining) {
+			int digits_remaining = 4 - state->len_filled;
+			if (digits_remaining > size)
+				digits_remaining = size;
+			memcpy(&state->len_buf[state->len_filled], ptr, digits_remaining);
+			state->len_filled += digits_remaining;
+			ptr += digits_remaining;
+			size -= digits_remaining;
+
+			if (state->len_filled == 4) {
+				state->remaining = packet_length(state->len_buf);
+				if (state->remaining < 0) {
+					die(_("remote-curl: bad line length character: %.4s"), state->len_buf);
+				} else if (state->remaining < 4) {
+					state->remaining = 0;
+				} else {
+					state->remaining -= 4;
+				}
+				state->len_filled = 0;
+			}
+		}
+
+		if (state->remaining) {
+			int remaining = state->remaining;
+			if (remaining > size)
+				remaining = size;
+			ptr += remaining;
+			size -= remaining;
+			state->remaining -= remaining;
+		}
+	}
+}
+
 struct rpc_in_data {
 	struct rpc_state *rpc;
 	struct active_request_slot *slot;
+	int check_pktline;
+	struct check_pktline_state pktline_state;
 };
 
 /*
@@ -702,6 +746,8 @@ static size_t rpc_in(char *ptr, size_t eltsize,
 		return size;
 	if (size)
 		data->rpc->any_written = 1;
+	if (data->check_pktline)
+		check_pktline(&data->pktline_state, ptr, size);
 	write_or_die(data->rpc->in, ptr, size);
 	return size;
 }
@@ -778,7 +824,7 @@ static curl_off_t xcurl_off_t(size_t len)
  * If flush_received is true, do not attempt to read any more; just use what's
  * in rpc->buf.
  */
-static int post_rpc(struct rpc_state *rpc, int flush_received)
+static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_received)
 {
 	struct active_request_slot *slot;
 	struct curl_slist *headers = http_copy_default_headers();
@@ -920,6 +966,8 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
+	rpc_in_data.check_pktline = stateless_connect;
+	memset(&rpc_in_data.pktline_state, 0, sizeof(rpc_in_data.pktline_state));
 	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -936,6 +984,11 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	if (!rpc->any_written)
 		err = -1;
 
+	if (rpc_in_data.pktline_state.len_filled)
+		err = error(_("%d bytes of length header were received"), rpc_in_data.pktline_state.len_filled);
+	if (rpc_in_data.pktline_state.remaining)
+		err = error(_("%d bytes of body are still expected"), rpc_in_data.pktline_state.remaining);
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
@@ -985,7 +1038,7 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads,
 			break;
 		rpc->pos = 0;
 		rpc->len = n;
-		err |= post_rpc(rpc, 0);
+		err |= post_rpc(rpc, 0, 0);
 	}
 
 	close(client.in);
@@ -1342,7 +1395,7 @@ static int stateless_connect(const char *service_name)
 			BUG("The entire rpc->buf should be larger than LARGE_PACKET_MAX");
 		if (status == PACKET_READ_EOF)
 			break;
-		if (post_rpc(&rpc, status == PACKET_READ_FLUSH))
+		if (post_rpc(&rpc, 1, status == PACKET_READ_FLUSH))
 			/* We would have an err here */
 			break;
 		/* Reset the buffer for next request */
diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 1449ee95e9..d2edfa4c50 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -129,6 +129,8 @@ install_script () {
 prepare_httpd() {
 	mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH"
 	cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
+	install_script incomplete-length-upload-pack-v2-http.sh
+	install_script incomplete-body-upload-pack-v2-http.sh
 	install_script broken-smart-http.sh
 	install_script error-smart-http.sh
 	install_script error.sh
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 994e5290d6..afa91e38b0 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -117,6 +117,8 @@ Alias /auth/dumb/ www/auth/dumb/
 	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
 	SetEnv GIT_HTTP_EXPORT_ALL
 </LocationMatch>
+ScriptAlias /smart/incomplete_length/git-upload-pack incomplete-length-upload-pack-v2-http.sh/
+ScriptAlias /smart/incomplete_body/git-upload-pack incomplete-body-upload-pack-v2-http.sh/
 ScriptAliasMatch /error_git_upload_pack/(.*)/git-upload-pack error.sh/
 ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
 ScriptAlias /broken_smart/ broken-smart-http.sh/
@@ -126,6 +128,12 @@ ScriptAliasMatch /one_time_perl/(.*) apply-one-time-perl.sh/$1
 <Directory ${GIT_EXEC_PATH}>
 	Options FollowSymlinks
 </Directory>
+<Files incomplete-length-upload-pack-v2-http.sh>
+	Options ExecCGI
+</Files>
+<Files incomplete-body-upload-pack-v2-http.sh>
+	Options ExecCGI
+</Files>
 <Files broken-smart-http.sh>
 	Options ExecCGI
 </Files>
diff --git a/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
new file mode 100644
index 0000000000..2f5ed9fcf6
--- /dev/null
+++ b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
@@ -0,0 +1,3 @@
+printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
+echo
+printf "%s%s\n" "0079" "45"
diff --git a/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
new file mode 100644
index 0000000000..86c6e648c9
--- /dev/null
+++ b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
@@ -0,0 +1,3 @@
+printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
+echo
+printf "%s\n" "00"
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 5039e66dc4..4eb81ba2d4 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -586,6 +586,40 @@ test_expect_success 'clone with http:// using protocol v2' '
 	! grep "Send header: Transfer-Encoding: chunked" log
 '
 
+test_expect_success 'clone repository with http:// using protocol v2 with incomplete pktline length' '
+	test_when_finished "rm -f log" &&
+
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_length" &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_length" file &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/incomplete_length" incomplete_length_child 2>err &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+	# Client reported appropriate failure
+	test_i18ngrep "bytes of length header were received" err
+'
+
+test_expect_success 'clone repository with http:// using protocol v2 with incomplete pktline body' '
+	test_when_finished "rm -f log" &&
+
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_body" &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_body" file &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/incomplete_body" incomplete_body_child 2>err &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+	# Client reported appropriate failure
+	test_i18ngrep "bytes of body are still expected" err
+'
+
 test_expect_success 'clone big repository with http:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 6/7] pkt-line: PACKET_READ_RESPONSE_END
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
                     ` (4 preceding siblings ...)
  2020-05-18 15:47   ` [PATCH v2 5/7] remote-curl: error on incomplete packet Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 15:47   ` [PATCH v2 7/7] stateless-connect: send response end packet Denton Liu
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine

In a future commit, we will use PACKET_READ_RESPONSE_END to separate
messages proxied by remote-curl. To prepare for this, add the
PACKET_READ_RESPONSE_END enum value.

In switch statements that need a case added, die() or BUG() when a
PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how
PACKET_READ_DELIM is implemented (especially in cases where packets are
being forwarded).

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 connect.c                |  2 ++
 pkt-line.c               | 11 +++++++++++
 pkt-line.h               |  2 ++
 remote-curl.c            |  2 ++
 serve.c                  |  2 ++
 t/helper/test-pkt-line.c |  4 ++++
 6 files changed, 23 insertions(+)

diff --git a/connect.c b/connect.c
index 23013c6344..11c6ec70a0 100644
--- a/connect.c
+++ b/connect.c
@@ -127,6 +127,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 		die_initial_contact(0);
 	case PACKET_READ_FLUSH:
 	case PACKET_READ_DELIM:
+	case PACKET_READ_RESPONSE_END:
 		version = protocol_v0;
 		break;
 	case PACKET_READ_NORMAL:
@@ -310,6 +311,7 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 			state = EXPECTING_DONE;
 			break;
 		case PACKET_READ_DELIM:
+		case PACKET_READ_RESPONSE_END:
 			die(_("invalid packet"));
 		}
 
diff --git a/pkt-line.c b/pkt-line.c
index 5c3b4539b5..fdd6b90e68 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -99,6 +99,13 @@ void packet_delim(int fd)
 		die_errno(_("unable to write delim packet"));
 }
 
+void packet_response_end(int fd)
+{
+	packet_trace("0002", 4, 1);
+	if (write_in_full(fd, "0002", 4) < 0)
+		die_errno(_("unable to write stateless separator packet"));
+}
+
 int packet_flush_gently(int fd)
 {
 	packet_trace("0000", 4, 1);
@@ -337,6 +344,10 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		packet_trace("0001", 4, 0);
 		*pktlen = 0;
 		return PACKET_READ_DELIM;
+	} else if (len == 2) {
+		packet_trace("0002", 4, 0);
+		*pktlen = 0;
+		return PACKET_READ_RESPONSE_END;
 	} else if (len < 4) {
 		die(_("protocol error: bad line length %d"), len);
 	}
diff --git a/pkt-line.h b/pkt-line.h
index de81776a7a..8156445873 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
  */
 void packet_flush(int fd);
 void packet_delim(int fd);
+void packet_response_end(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
@@ -94,6 +95,7 @@ enum packet_read_status {
 	PACKET_READ_NORMAL,
 	PACKET_READ_FLUSH,
 	PACKET_READ_DELIM,
+	PACKET_READ_RESPONSE_END,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
diff --git a/remote-curl.c b/remote-curl.c
index e020140092..d02cb547e9 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -601,6 +601,8 @@ static int rpc_read_from_out(struct rpc_state *rpc, int options,
 		case PACKET_READ_FLUSH:
 			memcpy(buf - 4, "0000", 4);
 			break;
+		case PACKET_READ_RESPONSE_END:
+			die(_("remote server sent stateless separator"));
 		}
 	}
 
diff --git a/serve.c b/serve.c
index 317256c1a4..c046926ba1 100644
--- a/serve.c
+++ b/serve.c
@@ -217,6 +217,8 @@ static int process_request(void)
 
 			state = PROCESS_REQUEST_DONE;
 			break;
+		case PACKET_READ_RESPONSE_END:
+			BUG("unexpected stateless separator packet");
 		}
 	}
 
diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
index 12ca698e17..69152958e5 100644
--- a/t/helper/test-pkt-line.c
+++ b/t/helper/test-pkt-line.c
@@ -46,6 +46,9 @@ static void unpack(void)
 		case PACKET_READ_DELIM:
 			printf("0001\n");
 			break;
+		case PACKET_READ_RESPONSE_END:
+			printf("0002\n");
+			break;
 		}
 	}
 }
@@ -75,6 +78,7 @@ static void unpack_sideband(void)
 		case PACKET_READ_FLUSH:
 			return;
 		case PACKET_READ_DELIM:
+		case PACKET_READ_RESPONSE_END:
 			break;
 		}
 	}
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2 7/7] stateless-connect: send response end packet
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
                     ` (5 preceding siblings ...)
  2020-05-18 15:47   ` [PATCH v2 6/7] pkt-line: PACKET_READ_RESPONSE_END Denton Liu
@ 2020-05-18 15:47   ` Denton Liu
  2020-05-18 16:43     ` Jeff King
  2020-05-18 16:50   ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
  8 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 15:47 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Force Charlie

Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.

This can be seen in the following command which does not terminate:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

whereas the v1 version does terminate as expected:

	$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...
	fatal: the remote end hung up unexpectedly

Instead of blindly forwarding packets, make remote-curl insert response
end and flush packets after proxying the responses from the remote
server when using stateless_connect(). On the RPC client side, ensure
that each response ends as described.

A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.

Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 Documentation/gitremote-helpers.txt     |  4 +++-
 Documentation/technical/protocol-v2.txt |  2 ++
 builtin/fetch-pack.c                    |  2 +-
 connect.c                               | 10 +++++++++-
 fetch-pack.c                            | 12 ++++++++++++
 remote-curl.c                           |  7 +++++++
 remote.h                                |  3 ++-
 t/t5702-protocol-v2.sh                  | 13 +++++++++++++
 transport.c                             |  3 ++-
 9 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index f48a031dc3..84f8e92b23 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -405,7 +405,9 @@ Supported if the helper has the "connect" capability.
 	trying to fall back).  After line feed terminating the positive
 	(empty) response, the output of the service starts.  Messages
 	(both request and response) must consist of zero or more
-	PKT-LINEs, terminating in a flush packet. The client must not
+	PKT-LINEs, terminating in a flush packet. Response messages will
+	have a response end packet before the flush packet to indicate
+	the end of a response.  The client must not
 	expect the server to store any state in between request-response
 	pairs.  After the connection ends, the remote helper exits.
 +
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 7e3766cafb..3996d70891 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -33,6 +33,8 @@ In protocol v2 these special packets will have the following semantics:
 
   * '0000' Flush Packet (flush-pkt) - indicates the end of a message
   * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+  * '0002' Message Packet (response-end-pkt) - indicates the end of a response
+    for stateless connections
 
 Initial Client Request
 ----------------------
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 4771100072..94b0c89b82 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -224,7 +224,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/connect.c b/connect.c
index 11c6ec70a0..12b57f5c0a 100644
--- a/connect.c
+++ b/connect.c
@@ -409,7 +409,8 @@ static int process_ref_v2(const char *line, struct ref ***list)
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct argv_array *ref_prefixes,
-			     const struct string_list *server_options)
+			     const struct string_list *server_options,
+			     int stateless_rpc)
 {
 	int i;
 	*list = NULL;
@@ -446,6 +447,13 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (reader->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ref listing"));
 
+	if (stateless_rpc) {
+		if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
+			die(_("expected response end packet after ref listing"));
+		if (packet_reader_read(reader) != PACKET_READ_FLUSH)
+			die(_("expected flush packet after response end"));
+	}
+
 	return list;
 }
 
diff --git a/fetch-pack.c b/fetch-pack.c
index f73a2ce6cb..bcbbb7e2fb 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1468,6 +1468,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	struct fetch_negotiator negotiator_alloc;
 	struct fetch_negotiator *negotiator;
 	int seen_ack = 0;
+	int check_http_delimiter;
 
 	if (args->no_dependents) {
 		negotiator = NULL;
@@ -1486,6 +1487,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	}
 
 	while (state != FETCH_DONE) {
+		check_http_delimiter = 0;
+
 		switch (state) {
 		case FETCH_CHECK_LOCAL:
 			sort_ref_list(&ref, ref_compare_name);
@@ -1542,6 +1545,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				/* fallthrough */
 			case NO_COMMON_FOUND:
 				state = FETCH_SEND_REQUEST;
+				check_http_delimiter = 1;
 				break;
 			}
 			break;
@@ -1562,10 +1566,18 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				die(_("git fetch-pack: fetch failed."));
 
 			state = FETCH_DONE;
+			check_http_delimiter = 1;
 			break;
 		case FETCH_DONE:
 			continue;
 		}
+
+		if (args->stateless_rpc && check_http_delimiter) {
+			if (packet_reader_read(&reader) != PACKET_READ_RESPONSE_END)
+				die(_("git fetch-pack: expected response end packet"));
+			if (packet_reader_read(&reader) != PACKET_READ_FLUSH)
+				die(_("git fetch-pack: expected flush packet"));
+		}
 	}
 
 	if (negotiator)
diff --git a/remote-curl.c b/remote-curl.c
index d02cb547e9..8a72b5ee7a 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -703,6 +703,8 @@ static void check_pktline(struct check_pktline_state *state, const char *ptr, si
 				state->remaining = packet_length(state->len_buf);
 				if (state->remaining < 0) {
 					die(_("remote-curl: bad line length character: %.4s"), state->len_buf);
+				} else if (state->remaining == 2) {
+					die(_("remote-curl: unexpected response end packet"));
 				} else if (state->remaining < 4) {
 					state->remaining = 0;
 				} else {
@@ -991,6 +993,11 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece
 	if (rpc_in_data.pktline_state.remaining)
 		err = error(_("%d bytes of body are still expected"), rpc_in_data.pktline_state.remaining);
 
+	if (stateless_connect) {
+		packet_response_end(rpc->in);
+		packet_flush(rpc->in);
+	}
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
diff --git a/remote.h b/remote.h
index 11d8719b58..5cc26c1b3b 100644
--- a/remote.h
+++ b/remote.h
@@ -179,7 +179,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct argv_array *ref_prefixes,
-			     const struct string_list *server_options);
+			     const struct string_list *server_options,
+			     int stateless_rpc);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 4eb81ba2d4..8da65e60de 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -620,6 +620,19 @@ test_expect_success 'clone repository with http:// using protocol v2 with incomp
 	test_i18ngrep "bytes of body are still expected" err
 '
 
+test_expect_success 'clone with http:// using protocol v2 and invalid parameters' '
+	test_when_finished "rm -f log" &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" \
+		git -c protocol.version=2 \
+		clone --shallow-since=20151012 "$HTTPD_URL/smart/http_parent" http_child_invalid &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
 test_expect_success 'clone big repository with http:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.c b/transport.c
index a6002e502f..182978f4be 100644
--- a/transport.c
+++ b/transport.c
@@ -297,7 +297,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
 		if (must_list_refs)
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
-					transport->server_options);
+					transport->server_options,
+					transport->stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 5/6] remote-curl: error on incomplete packet
  2020-05-18  9:08     ` Denton Liu
@ 2020-05-18 15:49       ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-18 15:49 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List

On Mon, May 18, 2020 at 05:08:57AM -0400, Denton Liu wrote:

> > Hmm. Right now there's no assumption in rpc_in that we're writing
> > pktlines. Will this always be the case?
> > 
> > I think the answer is probably yes. If there's a case where it isn't, it
> > might be something like v0 git-over-http against a server which doesn't
> > have the sideband capability.
> 
> As far as I can tell from skimming the code, v0 uses always pktlines
> although I'm far from being an expert on Git's networking stuff. Perhaps
> we could implement this such that the line-length checking only happens
> for stateless_connect()?

Yeah, that would certainly limit the possibility of unintended side
effects (and I don't think there's any benefit to this patch for the
non-stateless-connect cases).

We'd still be locking stateless-connect into always using pktlines, but
I think that's probably OK in practice. AFAICT it's the case now, and I
doubt we'd have any desire to change it in the future (and if we do,
this is all local-ish code that we could modify).

One unfortunate thing about the protocol (but not new to your patch) is
that this will be a problem for _any_ remote-helper which claims to
support stateless-connect. So they'd all need to carry similar code to
deal with this issue. Right now remote-curl is the only one, but
probably git-remote-ext and git-remote-fd could support this, too. Those
are pretty exotic, though (I don't think anyone has even bothered to
make them support v2 yet).

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 4/7] pkt-line: extern packet_length()
  2020-05-18 15:47   ` [PATCH v2 4/7] pkt-line: extern packet_length() Denton Liu
@ 2020-05-18 16:04     ` Jeff King
  2020-05-18 17:50       ` Eric Sunshine
  2020-05-18 18:44       ` Junio C Hamano
  0 siblings, 2 replies; 62+ messages in thread
From: Jeff King @ 2020-05-18 16:04 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine

On Mon, May 18, 2020 at 11:47:21AM -0400, Denton Liu wrote:

> In a future commit, we will be manually processing packets and we will
> need to access the length header. In order to simplify this, extern
> packet_length() so that the logic can be reused.
> 
> Change the function parameter from a `const char *` to
> `const char linelen[4]`. Even though these two types behave identically
> as function parameters, use the array notation to semantically indicate
> exactly what this function is expecting as an argument.

OK. I double-checked that we will not run into any problems with passing
pointers or arrays of other sizes (sadly compilers would not tell us if
we passed a too-small array, but at least it's documented for humans).

> +/*
> + * Convert a four hex digit packet line length header into its numeric
> + * representation. linelen should not be null-terminated.

Minor nit, but it is perfectly fine if there is a NUL. Maybe "linelen
does not need to be..."?

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 5/7] remote-curl: error on incomplete packet
  2020-05-18 15:47   ` [PATCH v2 5/7] remote-curl: error on incomplete packet Denton Liu
@ 2020-05-18 16:22     ` Jeff King
  2020-05-18 16:51       ` Denton Liu
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-18 16:22 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine

On Mon, May 18, 2020 at 11:47:22AM -0400, Denton Liu wrote:

> +struct check_pktline_state {
> +	char len_buf[4];
> +	int len_filled;
> +	int remaining;
> +};
> +
> +static void check_pktline(struct check_pktline_state *state, const char *ptr, size_t size)

Thanks for converting this. I think having this broken out makes it a
bit easier to reason about, and it should be much easier to reuse if we
need it elsewhere.

> +{
> +	while (size) {
> +		if (!state->remaining) {
> +			int digits_remaining = 4 - state->len_filled;
> +			if (digits_remaining > size)
> +				digits_remaining = size;
> +			memcpy(&state->len_buf[state->len_filled], ptr, digits_remaining);
> +			state->len_filled += digits_remaining;
> +			ptr += digits_remaining;
> +			size -= digits_remaining;

Having personally written and screwed up this kind of parsing state
machine before, I read over this logic quite carefully. ;) I believe
it's correct.

Another way would be to loop by single characters:

  while (state->len_filled < 4 && size) {
          state->len_buf[state->len_filled++] = *ptr;
	  ptr++;
	  size--;
  }

which I _think_ is equivalent, and is a bit shorter. I'm OK with either
(see below, especially).

I'm not sure if it's worth replacing "4" with ARRAY_SIZE(state->len_buf).
I generally try to avoid magic numbers, but it's certainly not like one
could change the size of len_buf and this code would still be useful. :)

> +			if (state->len_filled == 4) {
> +				state->remaining = packet_length(state->len_buf);
> +				if (state->remaining < 0) {
> +					die(_("remote-curl: bad line length character: %.4s"), state->len_buf);
> +				} else if (state->remaining < 4) {
> +					state->remaining = 0;
> +				} else {
> +					state->remaining -= 4;
> +				}
> +				state->len_filled = 0;
> +			}
> +		}

This part makes sense. We'll either leave len_filled as 1-3 (incomplete),
or we'll read a whole packet (for a flush), or we'll be waiting to read
the rest of the packet.

> +		if (state->remaining) {
> +			int remaining = state->remaining;
> +			if (remaining > size)
> +				remaining = size;
> +			ptr += remaining;
> +			size -= remaining;
> +			state->remaining -= remaining;
> +		}
> +	}
> +}

And here we most certainly don't want to read character-by-character,
because we're not doing anything with each one, and we expect there to be
a lot more of them. Having the earlier loop match the form of this one is
perhaps a good reason to leave it alone.

> [...]
> @@ -936,6 +984,11 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
>  	if (!rpc->any_written)
>  		err = -1;
>  
> +	if (rpc_in_data.pktline_state.len_filled)
> +		err = error(_("%d bytes of length header were received"), rpc_in_data.pktline_state.len_filled);
> +	if (rpc_in_data.pktline_state.remaining)
> +		err = error(_("%d bytes of body are still expected"), rpc_in_data.pktline_state.remaining);

And here's the payoff for all of the state machine checks. Makes sense.

> @@ -702,6 +746,8 @@ static size_t rpc_in(char *ptr, size_t eltsize,
>  		return size;
>  	if (size)
>  		data->rpc->any_written = 1;
> +	if (data->check_pktline)
> +		check_pktline(&data->pktline_state, ptr, size);
>  	write_or_die(data->rpc->in, ptr, size);
>  	return size;
>  }

And this is now conditional. Good...

> @@ -920,6 +966,8 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
>  	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
>  	rpc_in_data.rpc = rpc;
>  	rpc_in_data.slot = slot;
> +	rpc_in_data.check_pktline = stateless_connect;
> +	memset(&rpc_in_data.pktline_state, 0, sizeof(rpc_in_data.pktline_state));

And we enable it only for stateless-connect. Makes perfect sense.

> --- /dev/null
> +++ b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
> @@ -0,0 +1,3 @@
> +printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
> +echo
> +printf "%s%s\n" "0079" "45"

Nice. Just having a deterministic half-written packet is way easier to
reason about than my "truncating proxy" suggestion.

> --- /dev/null
> +++ b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
> @@ -0,0 +1,3 @@
> +printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
> +echo
> +printf "%s\n" "00"

Thanks for covering this case, too.

I confirmed (as I'm sure you did) they both cause git-remote-http to hang
without the fix in this patch.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 6/6] remote-curl: ensure last packet is a flush
  2020-05-15 21:41     ` Jeff King
@ 2020-05-18 16:34       ` Junio C Hamano
  2020-05-18 16:52         ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Junio C Hamano @ 2020-05-18 16:34 UTC (permalink / raw)
  To: Jeff King
  Cc: Denton Liu, Git Mailing List, Force Charlie, Jonathan Tan,
	Jonathan Nieder

Jeff King <peff@peff.net> writes:

> On Fri, May 15, 2020 at 05:02:45PM -0400, Denton Liu wrote:
>
>> On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
>> > This is not a complete solution to the problem, however. It is possible
>> > that a flush packet could be sent in the middle of a message and the
>> > connection could die immediately after. Then, remote-curl would not
>> > error out and fetch-pack would still be in the middle of a transaction
>> > and they would enter deadlock. A complete solution would involve
>> > reframing the stateless-connect protocol, possibly by introducing
>> > another control packet ("0002"?) as a stateless request separator
>> > packet which is always sent at the end of post_rpc().
>> > ...
> I do kind of like the idea of a stateless separator packet, if I
> understand your meaning correctly. I'll wait to see what the patches
> look like. :)

I vaguely recall talking with somebody (perhaps it was Shawn Pearce)
about my long-time complaint on the on-the-wire protocol, in that
the protocol sometimes uses flush to merely mean "I've finished
speaking one section of my speech" and sometimes "I've done talking
for now and it's your turn to speak to me" (the receiving end needs
to know which one a particular flush means without knowing the
context in which it is issued---which makes it impossible to write a
protocol agnostic proxy.  

If the above proposal means that we'll have an explicit way to say
"Not just I am done with one section of my speech, but it is your
turn to speak and I am going to listen", that would be wonderful.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 7/7] stateless-connect: send response end packet
  2020-05-18 15:47   ` [PATCH v2 7/7] stateless-connect: send response end packet Denton Liu
@ 2020-05-18 16:43     ` Jeff King
  2020-05-18 17:12       ` Denton Liu
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-18 16:43 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine, Force Charlie

On Mon, May 18, 2020 at 11:47:24AM -0400, Denton Liu wrote:

> A separate control packet is chosen because we need to be able to
> differentiate between what the remote server sends and remote-curl's
> control packets. By ensuring in the remote-curl code that a server
> cannot send response end packets, we prevent a malicious server from
> being able to perform a denial of service attack in which they spoof a
> response end packet and cause the described deadlock to happen.

Thanks for thinking about this. I was wondering what mischief a server
might be able to cause by sending such a packet.

> @@ -446,6 +447,13 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  	if (reader->status != PACKET_READ_FLUSH)
>  		die(_("expected flush after ref listing"));
>  
> +	if (stateless_rpc) {
> +		if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
> +			die(_("expected response end packet after ref listing"));
> +		if (packet_reader_read(reader) != PACKET_READ_FLUSH)
> +			die(_("expected flush packet after response end"));
> +	}

Having two packets here surprised me. We'd have already received the
actual flush from the server (as you can see in the context above).
Wouldn't a single RESPONSE_END be enough to signal the reader?

> diff --git a/fetch-pack.c b/fetch-pack.c
> index f73a2ce6cb..bcbbb7e2fb 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1468,6 +1468,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
>  	struct fetch_negotiator negotiator_alloc;
>  	struct fetch_negotiator *negotiator;
>  	int seen_ack = 0;
> +	int check_http_delimiter;

This flag was more complicated than I expected, and I'm not sure how we
can easily be certain that all necessary paths are covered.

E.g., in FETCH_PROCESS_ACKS, we'll always be reading a response via
process_acks(). Why do COMMON_FOUND and NO_COMMON_FOUND check for
end-of-response, but READY doesn't? I think the answer is that we'd
continue to read the same response via FETCH_GET_PACK in this instance.

I just wonder if there is a better place to put this logic that would be
more certain to catch every place we'd expect to read to the end of a
response. But I suppose not. We could push it down into process_acks(),
but it would have the same READY logic that's here. I'll admit part of
my complaint is that the existing do_fetch_pack_v2 state-machine switch
is kind of hard to follow, but that's not your fault.

Two things that might help:

  - put a comment in READY to indicate why we don't need to check
    end-of-response (but do for its sibling cases)

  - instead of setting and clearing a variable and then using it to
    check at the end of the loop, call check_http_delimiter(&reader,
    args->stateless_rpc) at the moment we're done with the packet. The
    actual executed code would have the same behavior, I think, but it
    might be easier to understand that we're trying to hit the end of
    each response.

Also, it probably should be check_stateless_delimiter() or something.
There could be other helpers that support stateless-connect besides
http.

Speaking of which: this is a change to the remote-helper protocol, since
we're now expecting stateless-connect helpers to produce these delim
packets (and failing if they don't). I kind of doubt that anybody but
remote-curl has implemented v2 stateless-connect, but should we be
marking this with an extra capability to be on the safe side?

> +		if (args->stateless_rpc && check_http_delimiter) {
> +			if (packet_reader_read(&reader) != PACKET_READ_RESPONSE_END)
> +				die(_("git fetch-pack: expected response end packet"));
> +			if (packet_reader_read(&reader) != PACKET_READ_FLUSH)
> +				die(_("git fetch-pack: expected flush packet"));
> +		}

I wondered also if stateless_rpc might catch cases besides
stateless-connect. But I guess we're in the v2 fetch-pack path, and that
_only_ understands stateless-connect.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
                     ` (6 preceding siblings ...)
  2020-05-18 15:47   ` [PATCH v2 7/7] stateless-connect: send response end packet Denton Liu
@ 2020-05-18 16:50   ` Jeff King
  2020-05-18 17:36     ` Denton Liu
  2020-05-18 19:36     ` Junio C Hamano
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
  8 siblings, 2 replies; 62+ messages in thread
From: Jeff King @ 2020-05-18 16:50 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine

On Mon, May 18, 2020 at 11:47:17AM -0400, Denton Liu wrote:

> Changes since v1:
> 
> * Remove fallthrough in switch in favour of just extracting the common
>   call out of the switch in patch 3
> 
> * Add more detail in function comment and use `const char linelen[4]` in
>   patch 4
> 
> * Implement most of Peff's suggestions[0] in patch 5
> 
> * Only operate on stateless_connect() in patch 5
> 
> * Add tests in patch 5
> 
> * Drop "remote-curl: ensure last packet is a flush" in favour of
>   "stateless-connect: send response end packet"

Overall this looks pretty cleanly done. I left a few minor comments
throughout, but the real question is whether we prefer the "0002" packet
in the last one, or if we instead insist that the response end in a
flush.

At first glance, the "0002" seems like it's more flexible, because we're
making fewer assumptions about what's being transferred over the
stateless-connect channel. But in reality it still has to be pktlines
(because we're checking them for incomplete or invalid packets already).
So all it really buys us is that the server response doesn't have to end
with a flush packet.

So I dunno. The "0002" solution is slightly more flexible, but I'm not
sure it helps in practice. And it does eat up one of our two remaining
special packet markers.

There is another solution, which would allow arbitrary data over
stateless-connect: adding an extra level of pktline framing between the
helper and the parent process. But that's rather ugly (inner pktlines
may be split across outer ones, so you have to do a bunch of buffer
reassembly). I think that's actually how v0 http works, IIRC.
IMHO it probably isn't worth pursuing. That extra layer wrecks the
elegance to the v2 stateless-connect approach, and we really do expect
only pktlines to go over it.

So I think either of your solutions (enforcing a final flush, or the
0002 packet) is preferable. I'm on the fence between them.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 5/7] remote-curl: error on incomplete packet
  2020-05-18 16:22     ` Jeff King
@ 2020-05-18 16:51       ` Denton Liu
  0 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-18 16:51 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Eric Sunshine

Hi Peff,

Thanks for reading over the patch carefully. It is much appreciated :)

On Mon, May 18, 2020 at 12:22:25PM -0400, Jeff King wrote:
> I'm not sure if it's worth replacing "4" with ARRAY_SIZE(state->len_buf).
> I generally try to avoid magic numbers, but it's certainly not like one
> could change the size of len_buf and this code would still be useful. :)

Yeah, I'm a bit sad about all the magic 4's that are strewn in the code
too. I was thinking about consolidating all of the magic 4's into a
constant in pkt-line as a preparatory patch but I didn't want to risk
holding up this series. I think once the dust settles here, I'll go
ahead and do that. For now, though, I think I'll leave it as is.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 6/6] remote-curl: ensure last packet is a flush
  2020-05-18 16:34       ` Junio C Hamano
@ 2020-05-18 16:52         ` Jeff King
  2020-05-18 21:00           ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-18 16:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Denton Liu, Git Mailing List, Force Charlie, Jonathan Tan,
	Jonathan Nieder

On Mon, May 18, 2020 at 09:34:42AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Fri, May 15, 2020 at 05:02:45PM -0400, Denton Liu wrote:
> >
> >> On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
> >> > This is not a complete solution to the problem, however. It is possible
> >> > that a flush packet could be sent in the middle of a message and the
> >> > connection could die immediately after. Then, remote-curl would not
> >> > error out and fetch-pack would still be in the middle of a transaction
> >> > and they would enter deadlock. A complete solution would involve
> >> > reframing the stateless-connect protocol, possibly by introducing
> >> > another control packet ("0002"?) as a stateless request separator
> >> > packet which is always sent at the end of post_rpc().
> >> > ...
> > I do kind of like the idea of a stateless separator packet, if I
> > understand your meaning correctly. I'll wait to see what the patches
> > look like. :)
> 
> I vaguely recall talking with somebody (perhaps it was Shawn Pearce)
> about my long-time complaint on the on-the-wire protocol, in that
> the protocol sometimes uses flush to merely mean "I've finished
> speaking one section of my speech" and sometimes "I've done talking
> for now and it's your turn to speak to me" (the receiving end needs
> to know which one a particular flush means without knowing the
> context in which it is issued---which makes it impossible to write a
> protocol agnostic proxy.  
> 
> If the above proposal means that we'll have an explicit way to say
> "Not just I am done with one section of my speech, but it is your
> turn to speak and I am going to listen", that would be wonderful.

I think we already have that now in v2 because of the "0001" delim
packet. All of the flushes are (I think) really "this is the end of my
speech", and any inner "my list is ending, but I have more to say"
delimiters are "0001".

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 7/7] stateless-connect: send response end packet
  2020-05-18 16:43     ` Jeff King
@ 2020-05-18 17:12       ` Denton Liu
  2020-05-18 17:26         ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 17:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Eric Sunshine, Force Charlie

Hi Peff,

On Mon, May 18, 2020 at 12:43:08PM -0400, Jeff King wrote:
> On Mon, May 18, 2020 at 11:47:24AM -0400, Denton Liu wrote:

[...]

> > @@ -446,6 +447,13 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> >  	if (reader->status != PACKET_READ_FLUSH)
> >  		die(_("expected flush after ref listing"));
> >  
> > +	if (stateless_rpc) {
> > +		if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
> > +			die(_("expected response end packet after ref listing"));
> > +		if (packet_reader_read(reader) != PACKET_READ_FLUSH)
> > +			die(_("expected flush packet after response end"));
> > +	}
> 
> Having two packets here surprised me. We'd have already received the
> actual flush from the server (as you can see in the context above).
> Wouldn't a single RESPONSE_END be enough to signal the reader?

Yes, having a single RESPONSE_END is enough. I sent a flush packet
because it seemed appropriate when I was writing the patch to end all
messages with a flush but I suppose that's just cargo culting. I'll
remove it in the next iteration.

> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index f73a2ce6cb..bcbbb7e2fb 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1468,6 +1468,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> >  	struct fetch_negotiator negotiator_alloc;
> >  	struct fetch_negotiator *negotiator;
> >  	int seen_ack = 0;
> > +	int check_http_delimiter;
> 
> This flag was more complicated than I expected, and I'm not sure how we
> can easily be certain that all necessary paths are covered.

Essentially, any time we're in a path where state is set to
FETCH_SEND_REQUEST or FETCH_DONE, we know that we've finished processing
the current request and we can check for response end packets. Of course,
the one exception to this is the first iteration of the loop when we're
transitioning from FETCH_CHECK_LOCAL to FETCH_SEND_REQUEST where we
can't check for response end packets since nothing has been requested
yet.

> E.g., in FETCH_PROCESS_ACKS, we'll always be reading a response via
> process_acks(). Why do COMMON_FOUND and NO_COMMON_FOUND check for
> end-of-response, but READY doesn't? I think the answer is that we'd
> continue to read the same response via FETCH_GET_PACK in this instance.
> 
> I just wonder if there is a better place to put this logic that would be
> more certain to catch every place we'd expect to read to the end of a
> response. But I suppose not. We could push it down into process_acks(),
> but it would have the same READY logic that's here. I'll admit part of
> my complaint is that the existing do_fetch_pack_v2 state-machine switch
> is kind of hard to follow, but that's not your fault.

I debated between the current implementation and doing something like

	int first_iteration = 1;

	...

	while (state != FETCH_DONE) {
		switch (...) {
			...
		}

		if (args->stateless_rpc && !first_iteration && (state == FETCH_SEND_REQUEST || state == FETCH_DONE)) {
			if (packet_reader_read(&reader) != PACKET_READ_RESPONSE_END)
				die(_("git fetch-pack: expected response end packet"));
			if (packet_reader_read(&reader) != PACKET_READ_FLUSH)
				die(_("git fetch-pack: expected flush packet"));
		}
		first_iteration = 0;
	}

I think that this catches _all_ the cases without fiddling with any of
the state machine logic.

[...]

> Also, it probably should be check_stateless_delimiter() or something.
> There could be other helpers that support stateless-connect besides
> http.

Yeah... I forgot to change the check_http_delimiter name when I was
doing cleanup. Whoops.

> Speaking of which: this is a change to the remote-helper protocol, since
> we're now expecting stateless-connect helpers to produce these delim
> packets (and failing if they don't). I kind of doubt that anybody but
> remote-curl has implemented v2 stateless-connect, but should we be
> marking this with an extra capability to be on the safe side?

I think that we're probably safe from breaking anything external.
According to the gitremote-helpers documentation, 

	'stateless-connect'::
		Experimental; for internal use only.

so I think that gives us a bit of leeway in terms of the changes we can
make to the stateless-connect protocol. They've been warned ;)

Thanks,

Denton

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 7/7] stateless-connect: send response end packet
  2020-05-18 17:12       ` Denton Liu
@ 2020-05-18 17:26         ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-18 17:26 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine, Force Charlie

On Mon, May 18, 2020 at 01:12:49PM -0400, Denton Liu wrote:

> > I just wonder if there is a better place to put this logic that would be
> > more certain to catch every place we'd expect to read to the end of a
> > response. But I suppose not. We could push it down into process_acks(),
> > but it would have the same READY logic that's here. I'll admit part of
> > my complaint is that the existing do_fetch_pack_v2 state-machine switch
> > is kind of hard to follow, but that's not your fault.
> 
> I debated between the current implementation and doing something like
> 
> 	int first_iteration = 1;
> 
> 	...
> 
> 	while (state != FETCH_DONE) {
> 		switch (...) {
> 			...
> 		}
> 
> 		if (args->stateless_rpc && !first_iteration && (state == FETCH_SEND_REQUEST || state == FETCH_DONE)) {
> 			if (packet_reader_read(&reader) != PACKET_READ_RESPONSE_END)
> 				die(_("git fetch-pack: expected response end packet"));
> 			if (packet_reader_read(&reader) != PACKET_READ_FLUSH)
> 				die(_("git fetch-pack: expected flush packet"));
> 		}
> 		first_iteration = 0;
> 	}
> 
> I think that this catches _all_ the cases without fiddling with any of
> the state machine logic.

I think you're right that it does, but TBH I find it harder to follow,
because now we're even further away from the actual response reads. What
I most want to verify is:

  - there is no response-reading code path that does not check the
    delimiters

  - there is no code path where we check delimiters but did not actually
    read a response

So to me, putting the check as close to the response-reads as possible
makes sense. I think the patch below is equivalent and easier to verify,
but I admit it's somewhat subjective:

diff --git a/fetch-pack.c b/fetch-pack.c
index 5325649eda..0bf1e5760c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1443,6 +1443,17 @@ static void receive_wanted_refs(struct packet_reader *reader,
 		die(_("error processing wanted refs: %d"), reader->status);
 }
 
+static void check_stateless_delimiter(struct fetch_pack_args *args,
+				      struct packet_reader *reader)
+{
+	if (!args->stateless_rpc)
+		return; /* not in stateless mode, no delimiter expected */
+	if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
+		die(_("git fetch-pack: expected response end packet"));
+	if (packet_reader_read(reader) != PACKET_READ_FLUSH)
+		die(_("git fetch-pack: expected flush packet"));
+}
+
 enum fetch_state {
 	FETCH_CHECK_LOCAL = 0,
 	FETCH_SEND_REQUEST,
@@ -1469,7 +1480,6 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	struct fetch_negotiator negotiator_alloc;
 	struct fetch_negotiator *negotiator;
 	int seen_ack = 0;
-	int check_http_delimiter;
 
 	if (args->no_dependents) {
 		negotiator = NULL;
@@ -1488,8 +1498,6 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 	}
 
 	while (state != FETCH_DONE) {
-		check_http_delimiter = 0;
-
 		switch (state) {
 		case FETCH_CHECK_LOCAL:
 			sort_ref_list(&ref, ref_compare_name);
@@ -1538,15 +1546,19 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			/* Process ACKs/NAKs */
 			switch (process_acks(negotiator, &reader, &common)) {
 			case READY:
+				/*
+				 * Don't check for response delimiter; get_pack() will
+				 * read the rest of this response.
+				 */
 				state = FETCH_GET_PACK;
 				break;
 			case COMMON_FOUND:
 				in_vain = 0;
 				seen_ack = 1;
 				/* fallthrough */
 			case NO_COMMON_FOUND:
 				state = FETCH_SEND_REQUEST;
-				check_http_delimiter = 1;
+				check_stateless_delimiter(args, &reader);
 				break;
 			}
 			break;
@@ -1565,20 +1577,13 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile, sought, nr_sought))
 				die(_("git fetch-pack: fetch failed."));
+			check_stateless_delimiter(args, &reader);
 
 			state = FETCH_DONE;
-			check_http_delimiter = 1;
 			break;
 		case FETCH_DONE:
 			continue;
 		}
-
-		if (args->stateless_rpc && check_http_delimiter) {
-			if (packet_reader_read(&reader) != PACKET_READ_RESPONSE_END)
-				die(_("git fetch-pack: expected response end packet"));
-			if (packet_reader_read(&reader) != PACKET_READ_FLUSH)
-				die(_("git fetch-pack: expected flush packet"));
-		}
 	}
 
 	if (negotiator)

> > Speaking of which: this is a change to the remote-helper protocol, since
> > we're now expecting stateless-connect helpers to produce these delim
> > packets (and failing if they don't). I kind of doubt that anybody but
> > remote-curl has implemented v2 stateless-connect, but should we be
> > marking this with an extra capability to be on the safe side?
> 
> I think that we're probably safe from breaking anything external.
> According to the gitremote-helpers documentation, 
> 
> 	'stateless-connect'::
> 		Experimental; for internal use only.
> 
> so I think that gives us a bit of leeway in terms of the changes we can
> make to the stateless-connect protocol. They've been warned ;)

OK, I didn't realize we had explicitly marked it as experimental. I
agree we're probably fine here, then (now that v2 is starting to settle
a bit more, we might think about lifting that warning, but this patch
series shows we're not quite there yet :) ).

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 16:50   ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
@ 2020-05-18 17:36     ` Denton Liu
  2020-05-18 20:58       ` Jeff King
  2020-05-18 19:36     ` Junio C Hamano
  1 sibling, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-18 17:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Eric Sunshine

Hi Peff,

On Mon, May 18, 2020 at 12:50:56PM -0400, Jeff King wrote:
> On Mon, May 18, 2020 at 11:47:17AM -0400, Denton Liu wrote:
> 
> > Changes since v1:
> > 
> > * Remove fallthrough in switch in favour of just extracting the common
> >   call out of the switch in patch 3
> > 
> > * Add more detail in function comment and use `const char linelen[4]` in
> >   patch 4
> > 
> > * Implement most of Peff's suggestions[0] in patch 5
> > 
> > * Only operate on stateless_connect() in patch 5
> > 
> > * Add tests in patch 5
> > 
> > * Drop "remote-curl: ensure last packet is a flush" in favour of
> >   "stateless-connect: send response end packet"
> 
> Overall this looks pretty cleanly done. I left a few minor comments
> throughout, but the real question is whether we prefer the "0002" packet
> in the last one, or if we instead insist that the response end in a
> flush.

Thanks for the prompt review!

> At first glance, the "0002" seems like it's more flexible, because we're
> making fewer assumptions about what's being transferred over the
> stateless-connect channel. But in reality it still has to be pktlines
> (because we're checking them for incomplete or invalid packets already).
> So all it really buys us is that the server response doesn't have to end
> with a flush packet.
> 
> So I dunno. The "0002" solution is slightly more flexible, but I'm not
> sure it helps in practice. And it does eat up one of our two remaining
> special packet markers.

Yeah, I was worried about consuming a special packet. One alternative
that I considered but is kind of gross is sending something like
"0028gitremote-helper: response complete\n" instead of "0002". Then,
instead of "0002" checks, we can check for that special string instead.
I don't _think_ that stateless-connect currently allows for completely
arbitrary data but I might be mistaken.

> There is another solution, which would allow arbitrary data over
> stateless-connect: adding an extra level of pktline framing between the
> helper and the parent process. But that's rather ugly (inner pktlines
> may be split across outer ones, so you have to do a bunch of buffer
> reassembly). I think that's actually how v0 http works, IIRC.
> IMHO it probably isn't worth pursuing. That extra layer wrecks the
> elegance to the v2 stateless-connect approach, and we really do expect
> only pktlines to go over it.

This was the original approach that I was working on but I found it to
be much too invasive for my liking. (Also, I never actually managed to
get it working ;) ) I think I gave up when I realised I had to insert
reframing logic into index-pack and unpack-objects.

> So I think either of your solutions (enforcing a final flush, or the
> 0002 packet) is preferable. I'm on the fence between them.

I'm mostly on the fence too. One advantage of 0002, however, is that a
malicious server can't end a request with 0002 as that's explicitly
prevented. If a malicious server closes a connection after sending a
0000, I think that they could cause a deadlock to happen if there are
multiple flush packets expected in a response. You mentioned in a
sibling email that this currently doesn't happen wrt stateless-connect
although I'm not sure if in the future, this is something that might
change. I dunno.

> -Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 3/6] transport: combine common cases with a fallthrough
  2020-05-18  9:18     ` Denton Liu
@ 2020-05-18 17:43       ` Eric Sunshine
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Sunshine @ 2020-05-18 17:43 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Jeff King

On Mon, May 18, 2020 at 5:18 AM Denton Liu <liu.denton@gmail.com> wrote:
> On Wed, May 13, 2020 at 07:14:35PM -0400, Eric Sunshine wrote:
> > > +       case protocol_v1:
> > >                 die_if_server_options(transport);
> > > +               /* fallthrough */
> > > +       case protocol_v2:
> > >                 refs = fetch_pack(&args, data->fd,
> > >                                   refs_tmp ? refs_tmp : transport->remote_refs,
> > >                                   to_fetch, nr_heads, &data->shallow,
> >
> > I can't say that I'm a fan of this change. While it may make it clear
> > that the two calls are identical, it makes it more difficult to reason
> > about the distinct v0, v1, and v2 cases.
>
> Actually, thinking about it some more, do you think it would make more
> sense to just pull the fetch_pack() call out of the switch statement
> entirely? We could entirely eliminate the fallthrough if we do this.

Yes, I think that works better and is cleaner and easier to understand
than the "fallthrough" in v1.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 4/7] pkt-line: extern packet_length()
  2020-05-18 16:04     ` Jeff King
@ 2020-05-18 17:50       ` Eric Sunshine
  2020-05-18 20:08         ` Jeff King
  2020-05-18 18:44       ` Junio C Hamano
  1 sibling, 1 reply; 62+ messages in thread
From: Eric Sunshine @ 2020-05-18 17:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Denton Liu, Git Mailing List

On Mon, May 18, 2020 at 12:04 PM Jeff King <peff@peff.net> wrote:
> On Mon, May 18, 2020 at 11:47:21AM -0400, Denton Liu wrote:
> > Change the function parameter from a `const char *` to
> > `const char linelen[4]`. Even though these two types behave identically
> > as function parameters, use the array notation to semantically indicate
> > exactly what this function is expecting as an argument.
>
> > +/*
> > + * Convert a four hex digit packet line length header into its numeric
> > + * representation. linelen should not be null-terminated.
>
> Minor nit, but it is perfectly fine if there is a NUL. Maybe "linelen
> does not need to be..."?

I had the exact same reaction when reading "...should not be
null-terminated", however, I'd prefer to drop mention of
NUL-termination altogether since talking about it merely confuses the
issue since it is quite clear both from the declaration (const
char[4]) and the documentation "four hex digits" that 'linelen' is
expected to contain exactly four hex digits and no NUL character(s).

By the way, I find the argument name "linelen" highly confusing; every
time I read it, I think it is an integral type, not a string or
character array. I'd very much prefer to see it renamed to "s" (or
something) or dropped altogether:

    int packet_length(const char[4]);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 2/7] remote-curl: remove label indentation
  2020-05-18 15:47   ` [PATCH v2 2/7] remote-curl: remove label indentation Denton Liu
@ 2020-05-18 18:37     ` Junio C Hamano
  0 siblings, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2020-05-18 18:37 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Jeff King, Eric Sunshine

Denton Liu <liu.denton@gmail.com> writes:

> In the codebase, labels are aligned to the leftmost column. Remove the
> space-indentation from `free_specs:` to conform to this.

Even though there are 50+ remaining labels that tells us otherwise,
we have 300+ labels that support the position.

;-)

>
> Signed-off-by: Denton Liu <liu.denton@gmail.com>
> ---
>  remote-curl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/remote-curl.c b/remote-curl.c
> index 6844708f38..da3e07184a 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -1276,7 +1276,7 @@ static void parse_push(struct strbuf *buf)
>  	if (ret)
>  		exit(128); /* error already reported */
>  
> - free_specs:
> +free_specs:
>  	argv_array_clear(&specs);
>  }

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 3/7] transport: extract common fetch_pack() call
  2020-05-18 15:47   ` [PATCH v2 3/7] transport: extract common fetch_pack() call Denton Liu
@ 2020-05-18 18:40     ` Junio C Hamano
  0 siblings, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2020-05-18 18:40 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Jeff King, Eric Sunshine

Denton Liu <liu.denton@gmail.com> writes:

> In the switch statement, the difference between the `protocol_v2` and
> `protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in
> the latter. The fetch_pack() call is identical in both arms. However,
> since this fetch_pack() call has so many parameters, it is not
> immediately obvious that the call is identical in both cases.

Sure.

	if (data->version < protocol_v2)
		die_if_server_options(transport);
	else if (data->version == protocol_unknown_version)
		BUG("...");

	refs = fetch_pack(...);

might have been even easier to follow, but that's OK.

>  	switch (data->version) {
>  	case protocol_v2:
> -		refs = fetch_pack(&args, data->fd,
> -				  refs_tmp ? refs_tmp : transport->remote_refs,
> -				  to_fetch, nr_heads, &data->shallow,
> -				  &transport->pack_lockfile, data->version);
> +		/* do nothing */
>  		break;
>  	case protocol_v1:
>  	case protocol_v0:
>  		die_if_server_options(transport);
> -		refs = fetch_pack(&args, data->fd,
> -				  refs_tmp ? refs_tmp : transport->remote_refs,
> -				  to_fetch, nr_heads, &data->shallow,
> -				  &transport->pack_lockfile, data->version);
>  		break;
>  	case protocol_unknown_version:
>  		BUG("unknown protocol version");
>  	}
> +	refs = fetch_pack(&args, data->fd,
> +			  refs_tmp ? refs_tmp : transport->remote_refs,
> +			  to_fetch, nr_heads, &data->shallow,
> +			  &transport->pack_lockfile, data->version);
>  
>  	close(data->fd[0]);
>  	close(data->fd[1]);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 4/7] pkt-line: extern packet_length()
  2020-05-18 16:04     ` Jeff King
  2020-05-18 17:50       ` Eric Sunshine
@ 2020-05-18 18:44       ` Junio C Hamano
  1 sibling, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2020-05-18 18:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Denton Liu, Git Mailing List, Eric Sunshine

Jeff King <peff@peff.net> writes:

>> +/*
>> + * Convert a four hex digit packet line length header into its numeric
>> + * representation. linelen should not be null-terminated.
>
> Minor nit, but it is perfectly fine if there is a NUL. Maybe "linelen
> does not need to be..."?

Yup, I was wondering about the same thing.  I actually would go
stronger than "does not need to be", as the byte after these four
would never be taken as "the terminator for the linelen bytes", even
it happens to be '\0'.

Just getting rid of the extra sentence would suffice for that, I
think.  The first sentence makes it clear that it is about
interpreting the 4 bytes we are given, and those 4 bytes come from
the 'packet line length header' the caller has.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 16:50   ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
  2020-05-18 17:36     ` Denton Liu
@ 2020-05-18 19:36     ` Junio C Hamano
  1 sibling, 0 replies; 62+ messages in thread
From: Junio C Hamano @ 2020-05-18 19:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Denton Liu, Git Mailing List, Eric Sunshine

Jeff King <peff@peff.net> writes:

> Overall this looks pretty cleanly done. I left a few minor comments
> throughout, but the real question is whether we prefer the "0002" packet
> in the last one, or if we instead insist that the response end in a
> flush.

Thanks for a review.  I was reading the series through and I found
it a reasonably pleasant read.

> So I think either of your solutions (enforcing a final flush, or the
> 0002 packet) is preferable. I'm on the fence between them.

I am undecided, too X-<.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 4/7] pkt-line: extern packet_length()
  2020-05-18 17:50       ` Eric Sunshine
@ 2020-05-18 20:08         ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-18 20:08 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Denton Liu, Git Mailing List

On Mon, May 18, 2020 at 01:50:52PM -0400, Eric Sunshine wrote:

> On Mon, May 18, 2020 at 12:04 PM Jeff King <peff@peff.net> wrote:
> > On Mon, May 18, 2020 at 11:47:21AM -0400, Denton Liu wrote:
> > > Change the function parameter from a `const char *` to
> > > `const char linelen[4]`. Even though these two types behave identically
> > > as function parameters, use the array notation to semantically indicate
> > > exactly what this function is expecting as an argument.
> >
> > > +/*
> > > + * Convert a four hex digit packet line length header into its numeric
> > > + * representation. linelen should not be null-terminated.
> >
> > Minor nit, but it is perfectly fine if there is a NUL. Maybe "linelen
> > does not need to be..."?
> 
> I had the exact same reaction when reading "...should not be
> null-terminated", however, I'd prefer to drop mention of
> NUL-termination altogether since talking about it merely confuses the
> issue since it is quite clear both from the declaration (const
> char[4]) and the documentation "four hex digits" that 'linelen' is
> expected to contain exactly four hex digits and no NUL character(s).

Yeah, I think I have a slight preference for that, too.

> By the way, I find the argument name "linelen" highly confusing; every
> time I read it, I think it is an integral type, not a string or
> character array. I'd very much prefer to see it renamed to "s" (or
> something) or dropped altogether:
> 
>     int packet_length(const char[4]);

I think giving it _some_ name is useful. Maybe "const char
lenbuf_hex[4]".

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 17:36     ` Denton Liu
@ 2020-05-18 20:58       ` Jeff King
  2020-05-18 22:52         ` Junio C Hamano
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-18 20:58 UTC (permalink / raw)
  To: Denton Liu; +Cc: Junio C Hamano, Git Mailing List, Eric Sunshine

On Mon, May 18, 2020 at 01:36:52PM -0400, Denton Liu wrote:

> > So I dunno. The "0002" solution is slightly more flexible, but I'm not
> > sure it helps in practice. And it does eat up one of our two remaining
> > special packet markers.
> 
> Yeah, I was worried about consuming a special packet. One alternative
> that I considered but is kind of gross is sending something like
> "0028gitremote-helper: response complete\n" instead of "0002". Then,
> instead of "0002" checks, we can check for that special string instead.
> I don't _think_ that stateless-connect currently allows for completely
> arbitrary data but I might be mistaken.

Yeah, I think we should avoid using any packet that could be confused
with regular output. A well-functioning server should have sent a flush
already, and so the worst case would probably be that we would mistake
their "0028" packet for ours if the network happens to cut out right at
that moment (and even that is obviously exceedingly unlikely). But it
just seems easier to reason about if we know we can't get confused.

> This was the original approach that I was working on but I found it to
> be much too invasive for my liking. (Also, I never actually managed to
> get it working ;) ) I think I gave up when I realised I had to insert
> reframing logic into index-pack and unpack-objects.

Yuck. :) I think v0 does it by unwrapping the extra layer in fetch-pack,
but I'd have to double check. But I'm not going to, because I think we
both agree that your other approaches are way nicer.

> > So I think either of your solutions (enforcing a final flush, or the
> > 0002 packet) is preferable. I'm on the fence between them.
> 
> I'm mostly on the fence too. One advantage of 0002, however, is that a
> malicious server can't end a request with 0002 as that's explicitly
> prevented. If a malicious server closes a connection after sending a
> 0000, I think that they could cause a deadlock to happen if there are
> multiple flush packets expected in a response. You mentioned in a
> sibling email that this currently doesn't happen wrt stateless-connect
> although I'm not sure if in the future, this is something that might
> change. I dunno.

Yeah, agreed that the efficacy of the "must end in flush" strategy
depends on there not being internal flushes. At least against a
determined attacker. But it may also be less random than you might
think, if the pattern is to send a flush, then do a bunch of work, etc.
A "random" hangup or error might be more likely to happen at the flush
points then (from the client's perspective).

So let's see if we can answer that question with less hand-waving.

The outer v2 capabilities conversation only writes out the capabilities
list, followed by a single flush (the packet_flush() call in
advertise_capabilities()). So far so good.

The only two v2 commands defined in serve.c are "ls-refs" and "fetch".

For "ls-refs", we end up in ls_refs(). That writes only regular packets
via send_ref(), and then concludes with a single flush (we don't even
send a delim; the client does that to specify options). Good.

For "fetch", we end up in upload_pack_v2(). We write via
process_haves_and_send_acks(), which will either:

  1. Write nothing and return 1.

  2. Write a delim and return 1.

  3. Write a flush packet and return 0.

In the final case, we jump to FETCH_DONE. So this really is the final
thing we say. Good.

In the first two cases, we jump to FETCH_SEND_PACK. We may write out
wanted-ref info, followed by a delim. OK. Then shallow info, followed by
a delim. OK. Then the actual packfile via create_pack_file(). There our
write behavior depends on use_sideband. If not set, we just dump data
directly. If set, then we packetize. And it will always be set for v2
(we do it unconditionally in the beginning of upload_pack_v2()).

So we only need to care about use_sideband cases. And there we may send
keepalives, but those are "0005\1" empty data packets. OK. We may send
data via send_client_data(), but those will be real data packets. And
then finally we give a single packet_flush(). Good.

So I think we can say that yes, the protocol as designed will send a
flush at the end of every response, and will not ever send another
flush.

_But_ that's if all goes well. If upload-pack sees an error, it may:

  - write an ERR packet in the earlier non-sideband steps, and then die

  - write a sideband 3 packet in later steps, and then die

  - just die() in some cases (e.g., pack-objects failed to start)

Obviously these are all an error on the client. And that third one
especially is probably a hanging case that we'd like to fix with this
series. But in the other two, we'd definitely want to make sure that
remote-curl doesn't complain too early or too loudly, and that the error
is shown to the user.

I think if remote-curl is just asking "did we see a flush packet at the
end" then it is likely to complain unnecessarily in that case. And I
don't think it should be inspecting packets for ERR or sideband-3. It
doesn't know enough about the rest of the conversation to know which is
correct. Though I guess we'd really only have to inspect the final
packet.

I really wish we had converted all ERRs into packet 0002 as part of the
v2 conversion. Then somebody reading only at the pktline level could
truly understand or proxy a full conversation without understanding
anything about what's in the pktlines. But it's too late for that now.

So I think our options are probably:

  1. detect flush packets in remote-curl, and either:

     a. don't print an error, just hang up. That prevents a hang in the
	caller and produces no extra message on a real error. It may be
	less informative than it could be if the connection hangs up
	(though we may print a curl error message, and the caller will
	at least say "the helper hung up")

     b. like (a), but always print an error; this is your original
	patch, but I _suspect_ (but didn't test) that it would produce
	extra useless messages for errors the server reports

     c. between the two: inspect the final packet data for evidence of
        ERR/sideband 3 and suppress any message if found

  2. helper signals end-of-response to caller (then it never produces a
     message itself; only the caller does, and it would abort on an ERR
     packet before then)

     a. using a special pktline (your "0002" patch)

     b. some other out-of-band mechanism (e.g., could be another fd)

I think this is pushing me towards 2a, your "0002" patch. It sidesteps
the error-message questions entirely (and I think 2b is too convoluted
to be worth pursuing, especially on Windows where setting up extra pipes
is tricky). But I'd also be OK with 1a or 1c.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 6/6] remote-curl: ensure last packet is a flush
  2020-05-18 16:52         ` Jeff King
@ 2020-05-18 21:00           ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-18 21:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Denton Liu, Git Mailing List, Force Charlie, Jonathan Tan,
	Jonathan Nieder

On Mon, May 18, 2020 at 12:52:07PM -0400, Jeff King wrote:

> > I vaguely recall talking with somebody (perhaps it was Shawn Pearce)
> > about my long-time complaint on the on-the-wire protocol, in that
> > the protocol sometimes uses flush to merely mean "I've finished
> > speaking one section of my speech" and sometimes "I've done talking
> > for now and it's your turn to speak to me" (the receiving end needs
> > to know which one a particular flush means without knowing the
> > context in which it is issued---which makes it impossible to write a
> > protocol agnostic proxy.  
> > 
> > If the above proposal means that we'll have an explicit way to say
> > "Not just I am done with one section of my speech, but it is your
> > turn to speak and I am going to listen", that would be wonderful.
> 
> I think we already have that now in v2 because of the "0001" delim
> packet. All of the flushes are (I think) really "this is the end of my
> speech", and any inner "my list is ending, but I have more to say"
> delimiters are "0001".

Sadly, this is only _mostly_ true. See my response in:

  https://lore.kernel.org/git/20200518205854.GB63978@coredump.intra.peff.net/

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 20:58       ` Jeff King
@ 2020-05-18 22:52         ` Junio C Hamano
  2020-05-19  2:38           ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Junio C Hamano @ 2020-05-18 22:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Denton Liu, Git Mailing List, Eric Sunshine

Jeff King <peff@peff.net> writes:

> So I think our options are probably:
>
>   1. detect flush packets in remote-curl, and either:
>
>      a. don't print an error, just hang up. That prevents a hang in the
> 	caller and produces no extra message on a real error. It may be
> 	less informative than it could be if the connection hangs up
> 	(though we may print a curl error message, and the caller will
> 	at least say "the helper hung up")
>
>      b. like (a), but always print an error; this is your original
> 	patch, but I _suspect_ (but didn't test) that it would produce
> 	extra useless messages for errors the server reports
>
>      c. between the two: inspect the final packet data for evidence of
>         ERR/sideband 3 and suppress any message if found
>
>   2. helper signals end-of-response to caller (then it never produces a
>      message itself; only the caller does, and it would abort on an ERR
>      packet before then)
>
>      a. using a special pktline (your "0002" patch)
>
>      b. some other out-of-band mechanism (e.g., could be another fd)
>
> I think this is pushing me towards 2a, your "0002" patch. It sidesteps
> the error-message questions entirely (and I think 2b is too convoluted
> to be worth pursuing, especially on Windows where setting up extra pipes
> is tricky). But I'd also be OK with 1a or 1c.

Thanks for a detailed analysis.  I guess we'd take 0002, then?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 22:52         ` Junio C Hamano
@ 2020-05-19  2:38           ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-19  2:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Denton Liu, Git Mailing List, Eric Sunshine

On Mon, May 18, 2020 at 03:52:42PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > So I think our options are probably:
> >
> >   1. detect flush packets in remote-curl, and either:
> >
> >      a. don't print an error, just hang up. That prevents a hang in the
> > 	caller and produces no extra message on a real error. It may be
> > 	less informative than it could be if the connection hangs up
> > 	(though we may print a curl error message, and the caller will
> > 	at least say "the helper hung up")
> >
> >      b. like (a), but always print an error; this is your original
> > 	patch, but I _suspect_ (but didn't test) that it would produce
> > 	extra useless messages for errors the server reports
> >
> >      c. between the two: inspect the final packet data for evidence of
> >         ERR/sideband 3 and suppress any message if found
> >
> >   2. helper signals end-of-response to caller (then it never produces a
> >      message itself; only the caller does, and it would abort on an ERR
> >      packet before then)
> >
> >      a. using a special pktline (your "0002" patch)
> >
> >      b. some other out-of-band mechanism (e.g., could be another fd)
> >
> > I think this is pushing me towards 2a, your "0002" patch. It sidesteps
> > the error-message questions entirely (and I think 2b is too convoluted
> > to be worth pursuing, especially on Windows where setting up extra pipes
> > is tricky). But I'd also be OK with 1a or 1c.
> 
> Thanks for a detailed analysis.  I guess we'd take 0002, then?

Yeah, that's how I lean. I think I did have a few comments on the patch
that I'd like Denton to consider, so hopefully we'll see a re-roll.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
                     ` (7 preceding siblings ...)
  2020-05-18 16:50   ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
@ 2020-05-19 10:53   ` Denton Liu
  2020-05-19 10:53     ` [PATCH v3 1/7] remote-curl: fix typo Denton Liu
                       ` (9 more replies)
  8 siblings, 10 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

The following command hangs forever:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

This occurs because the --shallow-since arg is incorrect and the server
dies early. However, remote-curl does not realise that the server
errored out and just faithfully forwards the packets to fetch-pack
before waiting on more input from fetch-pack. Meanwhile, fetch-pack
keeps reading as it still expects more input. As a result, the processes
deadlock. Original analysis by Peff:
https://lore.kernel.org/git/20200328154936.GA1217052@coredump.intra.peff.net/

Changes since v2:

* Use if-else tower in "transport: extract common fetch_pack() call"

* Rename to `lenbuf_hex` and remove null-terminator sentence in
  "pkt-line: extern packet_length()"

* Fix "a a" typo in "remote-curl: error on incomplete packet"

* Don't send flush packet after response end packet

* Move stateless delimiter checks closer to where message processing
  happens in do_fetch_pack_v2()

Changes since v1:

* Remove fallthrough in switch in favour of just extracting the common
  call out of the switch in patch 3

* Add more detail in function comment and use `const char linelen[4]` in
  patch 4

* Implement most of Peff's suggestions[0] in patch 5

* Only operate on stateless_connect() in patch 5

* Add tests in patch 5

* Drop "remote-curl: ensure last packet is a flush" in favour of
  "stateless-connect: send response end packet"

[0]: https://lore.kernel.org/git/20200515213844.GD115445@coredump.intra.peff.net/

Denton Liu (7):
  remote-curl: fix typo
  remote-curl: remove label indentation
  transport: extract common fetch_pack() call
  pkt-line: extern packet_length()
  remote-curl: error on incomplete packet
  pkt-line: define PACKET_READ_RESPONSE_END
  stateless-connect: send response end packet

 Documentation/gitremote-helpers.txt           |  4 +-
 Documentation/technical/protocol-v2.txt       |  2 +
 builtin/fetch-pack.c                          |  2 +-
 connect.c                                     | 18 ++++-
 connect.h                                     |  4 ++
 fetch-pack.c                                  | 13 ++++
 pkt-line.c                                    | 17 ++++-
 pkt-line.h                                    | 11 +++
 remote-curl.c                                 | 70 +++++++++++++++++--
 remote.h                                      |  3 +-
 serve.c                                       |  2 +
 t/helper/test-pkt-line.c                      |  4 ++
 t/lib-httpd.sh                                |  2 +
 t/lib-httpd/apache.conf                       |  8 +++
 .../incomplete-body-upload-pack-v2-http.sh    |  3 +
 .../incomplete-length-upload-pack-v2-http.sh  |  3 +
 t/t5702-protocol-v2.sh                        | 47 +++++++++++++
 transport.c                                   | 28 +++-----
 18 files changed, 211 insertions(+), 30 deletions(-)
 create mode 100644 t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
 create mode 100644 t/lib-httpd/incomplete-length-upload-pack-v2-http.sh

Range-diff against v2:
1:  b390875f87 = 1:  b390875f87 remote-curl: fix typo
2:  a2b28c0b28 = 2:  a2b28c0b28 remote-curl: remove label indentation
3:  3a42575bd5 < -:  ---------- transport: extract common fetch_pack() call
-:  ---------- > 3:  c118baa5a2 transport: extract common fetch_pack() call
4:  c2b9d033bb ! 4:  36885943b2 pkt-line: extern packet_length()
    @@ Commit message
         need to access the length header. In order to simplify this, extern
         packet_length() so that the logic can be reused.
     
    -    Change the function parameter from a `const char *` to
    -    `const char linelen[4]`. Even though these two types behave identically
    -    as function parameters, use the array notation to semantically indicate
    -    exactly what this function is expecting as an argument.
    +    Change the function parameter from `const char *linelen` to
    +    `const char lenbuf_hex[4]`. Even though these two types behave
    +    identically as function parameters, use the array notation to
    +    semantically indicate exactly what this function is expecting as an
    +    argument. Also, rename it from linelen to lenbuf_hex as the former
    +    sounds like it should be an integral type which is misleading.
     
      ## pkt-line.c ##
     @@ pkt-line.c: static int get_packet_data(int fd, char **src_buf, size_t *src_size,
    @@ pkt-line.c: static int get_packet_data(int fd, char **src_buf, size_t *src_size,
      }
      
     -static int packet_length(const char *linelen)
    -+int packet_length(const char linelen[4])
    ++int packet_length(const char lenbuf_hex[4])
      {
    - 	int val = hex2chr(linelen);
    - 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
    +-	int val = hex2chr(linelen);
    +-	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
    ++	int val = hex2chr(lenbuf_hex);
    ++	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
    + }
    + 
    + enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
     
      ## pkt-line.h ##
     @@ pkt-line.h: int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
    @@ pkt-line.h: int write_packetized_from_buf(const char *src_in, size_t len, int fd
      
     +/*
     + * Convert a four hex digit packet line length header into its numeric
    -+ * representation. linelen should not be null-terminated.
    ++ * representation.
     + *
     + * If linelen contains non-hex characters, return -1. Otherwise, return the
     + * numeric value of the length header.
     + */
    -+int packet_length(const char linelen[4]);
    ++int packet_length(const char lenbuf_hex[4]);
     +
      /*
       * Read a packetized line into a buffer like the 'packet_read()' function but
5:  52ce5fdffd ! 5:  91d330620a remote-curl: error on incomplete packet
    @@ Commit message
         results in a deadlock between the two processes.
     
         For a stateless connection, inspect packets before sending them and
    -    error out if a a packet line packet is incomplete.
    +    error out if a packet line packet is incomplete.
     
         Helped-by: Jeff King <peff@peff.net>
     
6:  744b078324 ! 6:  ff83344e9e pkt-line: PACKET_READ_RESPONSE_END
    @@ Metadata
     Author: Denton Liu <liu.denton@gmail.com>
     
      ## Commit message ##
    -    pkt-line: PACKET_READ_RESPONSE_END
    +    pkt-line: define PACKET_READ_RESPONSE_END
     
         In a future commit, we will use PACKET_READ_RESPONSE_END to separate
         messages proxied by remote-curl. To prepare for this, add the
7:  4b079bcd83 ! 7:  c26e160fbc stateless-connect: send response end packet
    @@ Commit message
                 Cloning into 'git'...
                 fatal: the remote end hung up unexpectedly
     
    -    Instead of blindly forwarding packets, make remote-curl insert response
    -    end and flush packets after proxying the responses from the remote
    -    server when using stateless_connect(). On the RPC client side, ensure
    -    that each response ends as described.
    +    Instead of blindly forwarding packets, make remote-curl insert a
    +    response end packet after proxying the responses from the remote server
    +    when using stateless_connect(). On the RPC client side, ensure that each
    +    response ends as described.
     
         A separate control packet is chosen because we need to be able to
         differentiate between what the remote server sends and remote-curl's
    @@ Documentation/gitremote-helpers.txt: Supported if the helper has the "connect" c
      	(both request and response) must consist of zero or more
     -	PKT-LINEs, terminating in a flush packet. The client must not
     +	PKT-LINEs, terminating in a flush packet. Response messages will
    -+	have a response end packet before the flush packet to indicate
    -+	the end of a response.  The client must not
    ++	then have a response end packet after the flush packet to
    ++	indicate the end of a response.  The client must not
      	expect the server to store any state in between request-response
      	pairs.  After the connection ends, the remote helper exits.
      +
    @@ builtin/fetch-pack.c: int cmd_fetch_pack(int argc, const char **argv, const char
     
      ## connect.c ##
     @@ connect.c: static int process_ref_v2(const char *line, struct ref ***list)
    + 	return ret;
    + }
    + 
    ++void check_stateless_delimiter(int stateless_rpc,
    ++			      struct packet_reader *reader,
    ++			      const char *error)
    ++{
    ++	if (!stateless_rpc)
    ++		return; /* not in stateless mode, no delimiter expected */
    ++	if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
    ++		die("%s", error);
    ++}
    ++
      struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
      			     struct ref **list, int for_push,
      			     const struct argv_array *ref_prefixes,
    @@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader
      	if (reader->status != PACKET_READ_FLUSH)
      		die(_("expected flush after ref listing"));
      
    -+	if (stateless_rpc) {
    -+		if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
    -+			die(_("expected response end packet after ref listing"));
    -+		if (packet_reader_read(reader) != PACKET_READ_FLUSH)
    -+			die(_("expected flush packet after response end"));
    -+	}
    ++	check_stateless_delimiter(stateless_rpc, reader,
    ++				  _("expected response end packet after ref listing"));
     +
      	return list;
      }
      
     
    - ## fetch-pack.c ##
    -@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
    - 	struct fetch_negotiator negotiator_alloc;
    - 	struct fetch_negotiator *negotiator;
    - 	int seen_ack = 0;
    -+	int check_http_delimiter;
    + ## connect.h ##
    +@@ connect.h: int server_supports_v2(const char *c, int die_on_error);
    + int server_supports_feature(const char *c, const char *feature,
    + 			    int die_on_error);
      
    - 	if (args->no_dependents) {
    - 		negotiator = NULL;
    -@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
    - 	}
    - 
    - 	while (state != FETCH_DONE) {
    -+		check_http_delimiter = 0;
    ++void check_stateless_delimiter(int stateless_rpc,
    ++			       struct packet_reader *reader,
    ++			       const char *error);
     +
    - 		switch (state) {
    - 		case FETCH_CHECK_LOCAL:
    - 			sort_ref_list(&ref, ref_compare_name);
    + #endif
    +
    + ## fetch-pack.c ##
    +@@ fetch-pack.c: enum fetch_state {
    + 	FETCH_DONE,
    + };
    + 
    ++static void do_check_stateless_delimiter(const struct fetch_pack_args *args,
    ++					 struct packet_reader *reader)
    ++{
    ++	check_stateless_delimiter(args->stateless_rpc, reader,
    ++				  _("git fetch-pack: expected response end packet"));
    ++}
    ++
    + static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
    + 				    int fd[2],
    + 				    const struct ref *orig_ref,
     @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
    + 			/* Process ACKs/NAKs */
    + 			switch (process_acks(negotiator, &reader, &common)) {
    + 			case READY:
    ++				/*
    ++				 * Don't check for response delimiter; get_pack() will
    ++				 * read the rest of this response.
    ++				 */
    + 				state = FETCH_GET_PACK;
    + 				break;
    + 			case COMMON_FOUND:
    +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
    + 				seen_ack = 1;
      				/* fallthrough */
      			case NO_COMMON_FOUND:
    ++				do_check_stateless_delimiter(args, &reader);
      				state = FETCH_SEND_REQUEST;
    -+				check_http_delimiter = 1;
      				break;
      			}
    - 			break;
     @@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
    + 			process_section_header(&reader, "packfile", 0);
    + 			if (get_pack(args, fd, pack_lockfile, sought, nr_sought))
      				die(_("git fetch-pack: fetch failed."));
    ++			do_check_stateless_delimiter(args, &reader);
      
      			state = FETCH_DONE;
    -+			check_http_delimiter = 1;
      			break;
    - 		case FETCH_DONE:
    - 			continue;
    - 		}
    -+
    -+		if (args->stateless_rpc && check_http_delimiter) {
    -+			if (packet_reader_read(&reader) != PACKET_READ_RESPONSE_END)
    -+				die(_("git fetch-pack: expected response end packet"));
    -+			if (packet_reader_read(&reader) != PACKET_READ_FLUSH)
    -+				die(_("git fetch-pack: expected flush packet"));
    -+		}
    - 	}
    - 
    - 	if (negotiator)
     
      ## remote-curl.c ##
     @@ remote-curl.c: static void check_pktline(struct check_pktline_state *state, const char *ptr, si
    @@ remote-curl.c: static int post_rpc(struct rpc_state *rpc, int stateless_connect,
      	if (rpc_in_data.pktline_state.remaining)
      		err = error(_("%d bytes of body are still expected"), rpc_in_data.pktline_state.remaining);
      
    -+	if (stateless_connect) {
    ++	if (stateless_connect)
     +		packet_response_end(rpc->in);
    -+		packet_flush(rpc->in);
    -+	}
     +
      	curl_slist_free_all(headers);
      	free(gzip_body);
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 1/7] remote-curl: fix typo
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
@ 2020-05-19 10:53     ` Denton Liu
  2020-05-19 10:53     ` [PATCH v3 2/7] remote-curl: remove label indentation Denton Liu
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 1c9aa3d0ab..6844708f38 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -643,7 +643,7 @@ static size_t rpc_out(void *ptr, size_t eltsize,
 			return 0;
 		}
 		/*
-		 * If avail is non-zerp, the line length for the flush still
+		 * If avail is non-zero, the line length for the flush still
 		 * hasn't been fully sent. Proceed with sending the line
 		 * length.
 		 */
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 2/7] remote-curl: remove label indentation
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
  2020-05-19 10:53     ` [PATCH v3 1/7] remote-curl: fix typo Denton Liu
@ 2020-05-19 10:53     ` Denton Liu
  2020-05-19 10:53     ` [PATCH v3 3/7] transport: extract common fetch_pack() call Denton Liu
                       ` (7 subsequent siblings)
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

In the codebase, labels are aligned to the leftmost column. Remove the
space-indentation from `free_specs:` to conform to this.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 6844708f38..da3e07184a 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1276,7 +1276,7 @@ static void parse_push(struct strbuf *buf)
 	if (ret)
 		exit(128); /* error already reported */
 
- free_specs:
+free_specs:
 	argv_array_clear(&specs);
 }
 
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 3/7] transport: extract common fetch_pack() call
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
  2020-05-19 10:53     ` [PATCH v3 1/7] remote-curl: fix typo Denton Liu
  2020-05-19 10:53     ` [PATCH v3 2/7] remote-curl: remove label indentation Denton Liu
@ 2020-05-19 10:53     ` Denton Liu
  2020-05-19 10:53     ` [PATCH v3 4/7] pkt-line: extern packet_length() Denton Liu
                       ` (6 subsequent siblings)
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

In the switch statement, the difference between the `protocol_v2` and
`protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in
the latter. The fetch_pack() call is identical in both arms. However,
since this fetch_pack() call has so many parameters, it is not
immediately obvious that the call is identical in both cases.

Extract the common fetch_pack() call out of the switch statement so that
code duplication is reduced and the logic is more clear for future
readers. While we're at it, rewrite the switch statement as an if-else
tower for increased clarity.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 transport.c | 25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/transport.c b/transport.c
index 15f5ba4e8f..431a93caef 100644
--- a/transport.c
+++ b/transport.c
@@ -369,24 +369,15 @@ static int fetch_refs_via_pack(struct transport *transport,
 		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
 	}
 
-	switch (data->version) {
-	case protocol_v2:
-		refs = fetch_pack(&args, data->fd,
-				  refs_tmp ? refs_tmp : transport->remote_refs,
-				  to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile, data->version);
-		break;
-	case protocol_v1:
-	case protocol_v0:
-		die_if_server_options(transport);
-		refs = fetch_pack(&args, data->fd,
-				  refs_tmp ? refs_tmp : transport->remote_refs,
-				  to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile, data->version);
-		break;
-	case protocol_unknown_version:
+	if (data->version == protocol_unknown_version)
 		BUG("unknown protocol version");
-	}
+	else if (data->version <= protocol_v1)
+		die_if_server_options(transport);
+
+	refs = fetch_pack(&args, data->fd,
+			  refs_tmp ? refs_tmp : transport->remote_refs,
+			  to_fetch, nr_heads, &data->shallow,
+			  &transport->pack_lockfile, data->version);
 
 	close(data->fd[0]);
 	close(data->fd[1]);
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 4/7] pkt-line: extern packet_length()
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (2 preceding siblings ...)
  2020-05-19 10:53     ` [PATCH v3 3/7] transport: extract common fetch_pack() call Denton Liu
@ 2020-05-19 10:53     ` Denton Liu
  2020-05-19 16:23       ` Eric Sunshine
  2020-05-19 10:53     ` [PATCH v3 5/7] remote-curl: error on incomplete packet Denton Liu
                       ` (5 subsequent siblings)
  9 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

In a future commit, we will be manually processing packets and we will
need to access the length header. In order to simplify this, extern
packet_length() so that the logic can be reused.

Change the function parameter from `const char *linelen` to
`const char lenbuf_hex[4]`. Even though these two types behave
identically as function parameters, use the array notation to
semantically indicate exactly what this function is expecting as an
argument. Also, rename it from linelen to lenbuf_hex as the former
sounds like it should be an integral type which is misleading.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 pkt-line.c | 6 +++---
 pkt-line.h | 9 +++++++++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index a0e87b1e81..3beab1dc6b 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -306,10 +306,10 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 	return ret;
 }
 
-static int packet_length(const char *linelen)
+int packet_length(const char lenbuf_hex[4])
 {
-	int val = hex2chr(linelen);
-	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
+	int val = hex2chr(lenbuf_hex);
+	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
 }
 
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
diff --git a/pkt-line.h b/pkt-line.h
index fef3a0d792..0d92c5e17f 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -74,6 +74,15 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Convert a four hex digit packet line length header into its numeric
+ * representation.
+ *
+ * If linelen contains non-hex characters, return -1. Otherwise, return the
+ * numeric value of the length header.
+ */
+int packet_length(const char lenbuf_hex[4]);
+
 /*
  * Read a packetized line into a buffer like the 'packet_read()' function but
  * returns an 'enum packet_read_status' which indicates the status of the read.
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 5/7] remote-curl: error on incomplete packet
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (3 preceding siblings ...)
  2020-05-19 10:53     ` [PATCH v3 4/7] pkt-line: extern packet_length() Denton Liu
@ 2020-05-19 10:53     ` Denton Liu
  2020-05-19 10:53     ` [PATCH v3 6/7] pkt-line: define PACKET_READ_RESPONSE_END Denton Liu
                       ` (4 subsequent siblings)
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.

For a stateless connection, inspect packets before sending them and
error out if a packet line packet is incomplete.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 remote-curl.c                                 | 59 ++++++++++++++++++-
 t/lib-httpd.sh                                |  2 +
 t/lib-httpd/apache.conf                       |  8 +++
 .../incomplete-body-upload-pack-v2-http.sh    |  3 +
 .../incomplete-length-upload-pack-v2-http.sh  |  3 +
 t/t5702-protocol-v2.sh                        | 34 +++++++++++
 6 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
 create mode 100644 t/lib-httpd/incomplete-length-upload-pack-v2-http.sh

diff --git a/remote-curl.c b/remote-curl.c
index da3e07184a..e020140092 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -679,9 +679,53 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
 }
 #endif
 
+struct check_pktline_state {
+	char len_buf[4];
+	int len_filled;
+	int remaining;
+};
+
+static void check_pktline(struct check_pktline_state *state, const char *ptr, size_t size)
+{
+	while (size) {
+		if (!state->remaining) {
+			int digits_remaining = 4 - state->len_filled;
+			if (digits_remaining > size)
+				digits_remaining = size;
+			memcpy(&state->len_buf[state->len_filled], ptr, digits_remaining);
+			state->len_filled += digits_remaining;
+			ptr += digits_remaining;
+			size -= digits_remaining;
+
+			if (state->len_filled == 4) {
+				state->remaining = packet_length(state->len_buf);
+				if (state->remaining < 0) {
+					die(_("remote-curl: bad line length character: %.4s"), state->len_buf);
+				} else if (state->remaining < 4) {
+					state->remaining = 0;
+				} else {
+					state->remaining -= 4;
+				}
+				state->len_filled = 0;
+			}
+		}
+
+		if (state->remaining) {
+			int remaining = state->remaining;
+			if (remaining > size)
+				remaining = size;
+			ptr += remaining;
+			size -= remaining;
+			state->remaining -= remaining;
+		}
+	}
+}
+
 struct rpc_in_data {
 	struct rpc_state *rpc;
 	struct active_request_slot *slot;
+	int check_pktline;
+	struct check_pktline_state pktline_state;
 };
 
 /*
@@ -702,6 +746,8 @@ static size_t rpc_in(char *ptr, size_t eltsize,
 		return size;
 	if (size)
 		data->rpc->any_written = 1;
+	if (data->check_pktline)
+		check_pktline(&data->pktline_state, ptr, size);
 	write_or_die(data->rpc->in, ptr, size);
 	return size;
 }
@@ -778,7 +824,7 @@ static curl_off_t xcurl_off_t(size_t len)
  * If flush_received is true, do not attempt to read any more; just use what's
  * in rpc->buf.
  */
-static int post_rpc(struct rpc_state *rpc, int flush_received)
+static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_received)
 {
 	struct active_request_slot *slot;
 	struct curl_slist *headers = http_copy_default_headers();
@@ -920,6 +966,8 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
+	rpc_in_data.check_pktline = stateless_connect;
+	memset(&rpc_in_data.pktline_state, 0, sizeof(rpc_in_data.pktline_state));
 	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -936,6 +984,11 @@ static int post_rpc(struct rpc_state *rpc, int flush_received)
 	if (!rpc->any_written)
 		err = -1;
 
+	if (rpc_in_data.pktline_state.len_filled)
+		err = error(_("%d bytes of length header were received"), rpc_in_data.pktline_state.len_filled);
+	if (rpc_in_data.pktline_state.remaining)
+		err = error(_("%d bytes of body are still expected"), rpc_in_data.pktline_state.remaining);
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
@@ -985,7 +1038,7 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads,
 			break;
 		rpc->pos = 0;
 		rpc->len = n;
-		err |= post_rpc(rpc, 0);
+		err |= post_rpc(rpc, 0, 0);
 	}
 
 	close(client.in);
@@ -1342,7 +1395,7 @@ static int stateless_connect(const char *service_name)
 			BUG("The entire rpc->buf should be larger than LARGE_PACKET_MAX");
 		if (status == PACKET_READ_EOF)
 			break;
-		if (post_rpc(&rpc, status == PACKET_READ_FLUSH))
+		if (post_rpc(&rpc, 1, status == PACKET_READ_FLUSH))
 			/* We would have an err here */
 			break;
 		/* Reset the buffer for next request */
diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 1449ee95e9..d2edfa4c50 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -129,6 +129,8 @@ install_script () {
 prepare_httpd() {
 	mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH"
 	cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
+	install_script incomplete-length-upload-pack-v2-http.sh
+	install_script incomplete-body-upload-pack-v2-http.sh
 	install_script broken-smart-http.sh
 	install_script error-smart-http.sh
 	install_script error.sh
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 994e5290d6..afa91e38b0 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -117,6 +117,8 @@ Alias /auth/dumb/ www/auth/dumb/
 	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
 	SetEnv GIT_HTTP_EXPORT_ALL
 </LocationMatch>
+ScriptAlias /smart/incomplete_length/git-upload-pack incomplete-length-upload-pack-v2-http.sh/
+ScriptAlias /smart/incomplete_body/git-upload-pack incomplete-body-upload-pack-v2-http.sh/
 ScriptAliasMatch /error_git_upload_pack/(.*)/git-upload-pack error.sh/
 ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
 ScriptAlias /broken_smart/ broken-smart-http.sh/
@@ -126,6 +128,12 @@ ScriptAliasMatch /one_time_perl/(.*) apply-one-time-perl.sh/$1
 <Directory ${GIT_EXEC_PATH}>
 	Options FollowSymlinks
 </Directory>
+<Files incomplete-length-upload-pack-v2-http.sh>
+	Options ExecCGI
+</Files>
+<Files incomplete-body-upload-pack-v2-http.sh>
+	Options ExecCGI
+</Files>
 <Files broken-smart-http.sh>
 	Options ExecCGI
 </Files>
diff --git a/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
new file mode 100644
index 0000000000..2f5ed9fcf6
--- /dev/null
+++ b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
@@ -0,0 +1,3 @@
+printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
+echo
+printf "%s%s\n" "0079" "45"
diff --git a/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
new file mode 100644
index 0000000000..86c6e648c9
--- /dev/null
+++ b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
@@ -0,0 +1,3 @@
+printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
+echo
+printf "%s\n" "00"
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 5039e66dc4..4eb81ba2d4 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -586,6 +586,40 @@ test_expect_success 'clone with http:// using protocol v2' '
 	! grep "Send header: Transfer-Encoding: chunked" log
 '
 
+test_expect_success 'clone repository with http:// using protocol v2 with incomplete pktline length' '
+	test_when_finished "rm -f log" &&
+
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_length" &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_length" file &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/incomplete_length" incomplete_length_child 2>err &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+	# Client reported appropriate failure
+	test_i18ngrep "bytes of length header were received" err
+'
+
+test_expect_success 'clone repository with http:// using protocol v2 with incomplete pktline body' '
+	test_when_finished "rm -f log" &&
+
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_body" &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/incomplete_body" file &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/incomplete_body" incomplete_body_child 2>err &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+	# Client reported appropriate failure
+	test_i18ngrep "bytes of body are still expected" err
+'
+
 test_expect_success 'clone big repository with http:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 6/7] pkt-line: define PACKET_READ_RESPONSE_END
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (4 preceding siblings ...)
  2020-05-19 10:53     ` [PATCH v3 5/7] remote-curl: error on incomplete packet Denton Liu
@ 2020-05-19 10:53     ` Denton Liu
  2020-05-19 10:54     ` [PATCH v3 7/7] stateless-connect: send response end packet Denton Liu
                       ` (3 subsequent siblings)
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:53 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

In a future commit, we will use PACKET_READ_RESPONSE_END to separate
messages proxied by remote-curl. To prepare for this, add the
PACKET_READ_RESPONSE_END enum value.

In switch statements that need a case added, die() or BUG() when a
PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how
PACKET_READ_DELIM is implemented (especially in cases where packets are
being forwarded).

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 connect.c                |  2 ++
 pkt-line.c               | 11 +++++++++++
 pkt-line.h               |  2 ++
 remote-curl.c            |  2 ++
 serve.c                  |  2 ++
 t/helper/test-pkt-line.c |  4 ++++
 6 files changed, 23 insertions(+)

diff --git a/connect.c b/connect.c
index 23013c6344..11c6ec70a0 100644
--- a/connect.c
+++ b/connect.c
@@ -127,6 +127,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 		die_initial_contact(0);
 	case PACKET_READ_FLUSH:
 	case PACKET_READ_DELIM:
+	case PACKET_READ_RESPONSE_END:
 		version = protocol_v0;
 		break;
 	case PACKET_READ_NORMAL:
@@ -310,6 +311,7 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 			state = EXPECTING_DONE;
 			break;
 		case PACKET_READ_DELIM:
+		case PACKET_READ_RESPONSE_END:
 			die(_("invalid packet"));
 		}
 
diff --git a/pkt-line.c b/pkt-line.c
index 3beab1dc6b..8f9bc68ee2 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -99,6 +99,13 @@ void packet_delim(int fd)
 		die_errno(_("unable to write delim packet"));
 }
 
+void packet_response_end(int fd)
+{
+	packet_trace("0002", 4, 1);
+	if (write_in_full(fd, "0002", 4) < 0)
+		die_errno(_("unable to write stateless separator packet"));
+}
+
 int packet_flush_gently(int fd)
 {
 	packet_trace("0000", 4, 1);
@@ -337,6 +344,10 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 		packet_trace("0001", 4, 0);
 		*pktlen = 0;
 		return PACKET_READ_DELIM;
+	} else if (len == 2) {
+		packet_trace("0002", 4, 0);
+		*pktlen = 0;
+		return PACKET_READ_RESPONSE_END;
 	} else if (len < 4) {
 		die(_("protocol error: bad line length %d"), len);
 	}
diff --git a/pkt-line.h b/pkt-line.h
index 0d92c5e17f..6cb92d7a5d 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
  */
 void packet_flush(int fd);
 void packet_delim(int fd);
+void packet_response_end(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
@@ -94,6 +95,7 @@ enum packet_read_status {
 	PACKET_READ_NORMAL,
 	PACKET_READ_FLUSH,
 	PACKET_READ_DELIM,
+	PACKET_READ_RESPONSE_END,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
diff --git a/remote-curl.c b/remote-curl.c
index e020140092..d02cb547e9 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -601,6 +601,8 @@ static int rpc_read_from_out(struct rpc_state *rpc, int options,
 		case PACKET_READ_FLUSH:
 			memcpy(buf - 4, "0000", 4);
 			break;
+		case PACKET_READ_RESPONSE_END:
+			die(_("remote server sent stateless separator"));
 		}
 	}
 
diff --git a/serve.c b/serve.c
index 317256c1a4..c046926ba1 100644
--- a/serve.c
+++ b/serve.c
@@ -217,6 +217,8 @@ static int process_request(void)
 
 			state = PROCESS_REQUEST_DONE;
 			break;
+		case PACKET_READ_RESPONSE_END:
+			BUG("unexpected stateless separator packet");
 		}
 	}
 
diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
index 12ca698e17..69152958e5 100644
--- a/t/helper/test-pkt-line.c
+++ b/t/helper/test-pkt-line.c
@@ -46,6 +46,9 @@ static void unpack(void)
 		case PACKET_READ_DELIM:
 			printf("0001\n");
 			break;
+		case PACKET_READ_RESPONSE_END:
+			printf("0002\n");
+			break;
 		}
 	}
 }
@@ -75,6 +78,7 @@ static void unpack_sideband(void)
 		case PACKET_READ_FLUSH:
 			return;
 		case PACKET_READ_DELIM:
+		case PACKET_READ_RESPONSE_END:
 			break;
 		}
 	}
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 7/7] stateless-connect: send response end packet
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (5 preceding siblings ...)
  2020-05-19 10:53     ` [PATCH v3 6/7] pkt-line: define PACKET_READ_RESPONSE_END Denton Liu
@ 2020-05-19 10:54     ` Denton Liu
  2020-05-19 18:40     ` [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 10:54 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano, Force Charlie

Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.

This can be seen in the following command which does not terminate:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

whereas the v1 version does terminate as expected:

	$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...
	fatal: the remote end hung up unexpectedly

Instead of blindly forwarding packets, make remote-curl insert a
response end packet after proxying the responses from the remote server
when using stateless_connect(). On the RPC client side, ensure that each
response ends as described.

A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.

Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 Documentation/gitremote-helpers.txt     |  4 +++-
 Documentation/technical/protocol-v2.txt |  2 ++
 builtin/fetch-pack.c                    |  2 +-
 connect.c                               | 16 +++++++++++++++-
 connect.h                               |  4 ++++
 fetch-pack.c                            | 13 +++++++++++++
 remote-curl.c                           |  5 +++++
 remote.h                                |  3 ++-
 t/t5702-protocol-v2.sh                  | 13 +++++++++++++
 transport.c                             |  3 ++-
 10 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index f48a031dc3..93baeeb029 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -405,7 +405,9 @@ Supported if the helper has the "connect" capability.
 	trying to fall back).  After line feed terminating the positive
 	(empty) response, the output of the service starts.  Messages
 	(both request and response) must consist of zero or more
-	PKT-LINEs, terminating in a flush packet. The client must not
+	PKT-LINEs, terminating in a flush packet. Response messages will
+	then have a response end packet after the flush packet to
+	indicate the end of a response.  The client must not
 	expect the server to store any state in between request-response
 	pairs.  After the connection ends, the remote helper exits.
 +
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 7e3766cafb..3996d70891 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -33,6 +33,8 @@ In protocol v2 these special packets will have the following semantics:
 
   * '0000' Flush Packet (flush-pkt) - indicates the end of a message
   * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+  * '0002' Message Packet (response-end-pkt) - indicates the end of a response
+    for stateless connections
 
 Initial Client Request
 ----------------------
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 4771100072..94b0c89b82 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -224,7 +224,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/connect.c b/connect.c
index 11c6ec70a0..0df45a1108 100644
--- a/connect.c
+++ b/connect.c
@@ -406,10 +406,21 @@ static int process_ref_v2(const char *line, struct ref ***list)
 	return ret;
 }
 
+void check_stateless_delimiter(int stateless_rpc,
+			      struct packet_reader *reader,
+			      const char *error)
+{
+	if (!stateless_rpc)
+		return; /* not in stateless mode, no delimiter expected */
+	if (packet_reader_read(reader) != PACKET_READ_RESPONSE_END)
+		die("%s", error);
+}
+
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct argv_array *ref_prefixes,
-			     const struct string_list *server_options)
+			     const struct string_list *server_options,
+			     int stateless_rpc)
 {
 	int i;
 	*list = NULL;
@@ -446,6 +457,9 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (reader->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ref listing"));
 
+	check_stateless_delimiter(stateless_rpc, reader,
+				  _("expected response end packet after ref listing"));
+
 	return list;
 }
 
diff --git a/connect.h b/connect.h
index 5f2382e018..235bc66254 100644
--- a/connect.h
+++ b/connect.h
@@ -22,4 +22,8 @@ int server_supports_v2(const char *c, int die_on_error);
 int server_supports_feature(const char *c, const char *feature,
 			    int die_on_error);
 
+void check_stateless_delimiter(int stateless_rpc,
+			       struct packet_reader *reader,
+			       const char *error);
+
 #endif
diff --git a/fetch-pack.c b/fetch-pack.c
index f73a2ce6cb..f096442d4d 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1450,6 +1450,13 @@ enum fetch_state {
 	FETCH_DONE,
 };
 
+static void do_check_stateless_delimiter(const struct fetch_pack_args *args,
+					 struct packet_reader *reader)
+{
+	check_stateless_delimiter(args->stateless_rpc, reader,
+				  _("git fetch-pack: expected response end packet"));
+}
+
 static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				    int fd[2],
 				    const struct ref *orig_ref,
@@ -1534,6 +1541,10 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			/* Process ACKs/NAKs */
 			switch (process_acks(negotiator, &reader, &common)) {
 			case READY:
+				/*
+				 * Don't check for response delimiter; get_pack() will
+				 * read the rest of this response.
+				 */
 				state = FETCH_GET_PACK;
 				break;
 			case COMMON_FOUND:
@@ -1541,6 +1552,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 				seen_ack = 1;
 				/* fallthrough */
 			case NO_COMMON_FOUND:
+				do_check_stateless_delimiter(args, &reader);
 				state = FETCH_SEND_REQUEST;
 				break;
 			}
@@ -1560,6 +1572,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile, sought, nr_sought))
 				die(_("git fetch-pack: fetch failed."));
+			do_check_stateless_delimiter(args, &reader);
 
 			state = FETCH_DONE;
 			break;
diff --git a/remote-curl.c b/remote-curl.c
index d02cb547e9..75532a8bae 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -703,6 +703,8 @@ static void check_pktline(struct check_pktline_state *state, const char *ptr, si
 				state->remaining = packet_length(state->len_buf);
 				if (state->remaining < 0) {
 					die(_("remote-curl: bad line length character: %.4s"), state->len_buf);
+				} else if (state->remaining == 2) {
+					die(_("remote-curl: unexpected response end packet"));
 				} else if (state->remaining < 4) {
 					state->remaining = 0;
 				} else {
@@ -991,6 +993,9 @@ static int post_rpc(struct rpc_state *rpc, int stateless_connect, int flush_rece
 	if (rpc_in_data.pktline_state.remaining)
 		err = error(_("%d bytes of body are still expected"), rpc_in_data.pktline_state.remaining);
 
+	if (stateless_connect)
+		packet_response_end(rpc->in);
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
diff --git a/remote.h b/remote.h
index 11d8719b58..5cc26c1b3b 100644
--- a/remote.h
+++ b/remote.h
@@ -179,7 +179,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct argv_array *ref_prefixes,
-			     const struct string_list *server_options);
+			     const struct string_list *server_options,
+			     int stateless_rpc);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 4eb81ba2d4..8da65e60de 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -620,6 +620,19 @@ test_expect_success 'clone repository with http:// using protocol v2 with incomp
 	test_i18ngrep "bytes of body are still expected" err
 '
 
+test_expect_success 'clone with http:// using protocol v2 and invalid parameters' '
+	test_when_finished "rm -f log" &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" \
+		git -c protocol.version=2 \
+		clone --shallow-since=20151012 "$HTTPD_URL/smart/http_parent" http_child_invalid &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
 test_expect_success 'clone big repository with http:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.c b/transport.c
index 431a93caef..7d50c502ad 100644
--- a/transport.c
+++ b/transport.c
@@ -297,7 +297,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
 		if (must_list_refs)
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
-					transport->server_options);
+					transport->server_options,
+					transport->stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
-- 
2.26.2.706.g87896c9627


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 4/7] pkt-line: extern packet_length()
  2020-05-19 10:53     ` [PATCH v3 4/7] pkt-line: extern packet_length() Denton Liu
@ 2020-05-19 16:23       ` Eric Sunshine
  0 siblings, 0 replies; 62+ messages in thread
From: Eric Sunshine @ 2020-05-19 16:23 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Jeff King, Junio C Hamano

On Tue, May 19, 2020 at 6:54 AM Denton Liu <liu.denton@gmail.com> wrote:
> [...]
> Change the function parameter from `const char *linelen` to
> `const char lenbuf_hex[4]`. Even though these two types behave
> identically as function parameters, use the array notation to
> semantically indicate exactly what this function is expecting as an
> argument. Also, rename it from linelen to lenbuf_hex as the former
> sounds like it should be an integral type which is misleading.
>
> Signed-off-by: Denton Liu <liu.denton@gmail.com>
> ---
> diff --git a/pkt-line.h b/pkt-line.h
> @@ -74,6 +74,15 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
> +/*
> + * Convert a four hex digit packet line length header into its numeric
> + * representation.
> + *
> + * If linelen contains non-hex characters, return -1. Otherwise, return the

s/linelen/lenbuf_hex/
...or...
s/lenbuf_hex/input argument/

> + * numeric value of the length header.
> + */
> +int packet_length(const char lenbuf_hex[4]);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (6 preceding siblings ...)
  2020-05-19 10:54     ` [PATCH v3 7/7] stateless-connect: send response end packet Denton Liu
@ 2020-05-19 18:40     ` Jeff King
  2020-05-19 21:14       ` Denton Liu
  2020-05-19 20:51     ` [PATCH v3 8/7] fixup! pkt-line: extern packet_length() Denton Liu
  2020-05-22 13:33     ` [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet Denton Liu
  9 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-19 18:40 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine, Junio C Hamano

On Tue, May 19, 2020 at 06:53:53AM -0400, Denton Liu wrote:

> Changes since v2:
> 
> * Use if-else tower in "transport: extract common fetch_pack() call"
> 
> * Rename to `lenbuf_hex` and remove null-terminator sentence in
>   "pkt-line: extern packet_length()"
> 
> * Fix "a a" typo in "remote-curl: error on incomplete packet"
> 
> * Don't send flush packet after response end packet
> 
> * Move stateless delimiter checks closer to where message processing
>   happens in do_fetch_pack_v2()

This looks good to me, modulo the minor comment fixup from Eric. I did
have one final question, though. Our discussion focused a lot on
checking the 0002 packets in the success case. But we didn't discuss how
fetch-pack would deal with an unexpected 0002 packet (i.e., the case
that the server response is truncated, but then remote-curl tacks on its
0002).

It _seems_ to work, because that's the case your invalid-shallow test is
covering. I'm just not sure if it works consistently, or what error we
might produce in some cases (e.g., saying "woah, what's the weird 0002
packet" instead of "the server response ended unexpectedly" or
something).

I suspect any remaining issues there are cosmetic, and it might be OK to
just discover them organically through use. But if you happen to have
done some poking around there, it would be nice to hear it.

Thanks for working on this. When we had the initial discussion, I was
really worried the solution was going to be quite nasty, but I think it
turned out to be reasonably nice.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 8/7] fixup! pkt-line: extern packet_length()
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (7 preceding siblings ...)
  2020-05-19 18:40     ` [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
@ 2020-05-19 20:51     ` Denton Liu
  2020-05-22 13:33     ` [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet Denton Liu
  9 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 20:51 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 pkt-line.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pkt-line.h b/pkt-line.h
index 6cb92d7a5d..5b373fe4cd 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -79,7 +79,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len, char
  * Convert a four hex digit packet line length header into its numeric
  * representation.
  *
- * If linelen contains non-hex characters, return -1. Otherwise, return the
+ * If lenbuf_hex contains non-hex characters, return -1. Otherwise, return the
  * numeric value of the length header.
  */
 int packet_length(const char lenbuf_hex[4]);
-- 
2.27.0.rc0.114.g141fe7d276


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects
  2020-05-19 18:40     ` [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
@ 2020-05-19 21:14       ` Denton Liu
  0 siblings, 0 replies; 62+ messages in thread
From: Denton Liu @ 2020-05-19 21:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Eric Sunshine, Junio C Hamano

Hi Peff,

On Tue, May 19, 2020 at 02:40:04PM -0400, Jeff King wrote:
> It _seems_ to work, because that's the case your invalid-shallow test is
> covering. I'm just not sure if it works consistently, or what error we
> might produce in some cases (e.g., saying "woah, what's the weird 0002
> packet" instead of "the server response ended unexpectedly" or
> something).
> 
> I suspect any remaining issues there are cosmetic, and it might be OK to
> just discover them organically through use. But if you happen to have
> done some poking around there, it would be nice to hear it.

From what I can tell, every time a packet is read using the reader in
do_fetch_pack_v2(), we're careful about checking the packet type so we
should be safe there. Also, when piping stuff over to index-pack and
unpack-objects, it looks like the resulting call to recv_sideband()
treats any control packets as flush packets so it should handle the 0002
fine.

I could have missed checking some spots, though. But as far as I can
tell, if it can't handle the 0002 properly, it was already buggy to
begin with. I agree that we can let any remaining issues be shaken out
through use.

> Thanks for working on this. When we had the initial discussion, I was
> really worried the solution was going to be quite nasty, but I think it
> turned out to be reasonably nice.

Thanks,

Denton

> -Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet
  2020-05-19 10:53   ` [PATCH v3 " Denton Liu
                       ` (8 preceding siblings ...)
  2020-05-19 20:51     ` [PATCH v3 8/7] fixup! pkt-line: extern packet_length() Denton Liu
@ 2020-05-22 13:33     ` Denton Liu
  2020-05-22 15:54       ` Jeff King
  9 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-22 13:33 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Jeff King, Eric Sunshine, Junio C Hamano

In the CGI scripts which emulate a connection being prematurely
terminated, it doesn't make sense for there to be a trailing newline
after the simulated connection cut. Remove these newlines.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 t/lib-httpd/incomplete-body-upload-pack-v2-http.sh   | 2 +-
 t/lib-httpd/incomplete-length-upload-pack-v2-http.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
index 2f5ed9fcf6..90e73ef8d5 100644
--- a/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
+++ b/t/lib-httpd/incomplete-body-upload-pack-v2-http.sh
@@ -1,3 +1,3 @@
 printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
 echo
-printf "%s%s\n" "0079" "45"
+printf "%s%s" "0079" "45"
diff --git a/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
index 86c6e648c9..dce552e348 100644
--- a/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
+++ b/t/lib-httpd/incomplete-length-upload-pack-v2-http.sh
@@ -1,3 +1,3 @@
 printf "Content-Type: text/%s\n" "application/x-git-upload-pack-result"
 echo
-printf "%s\n" "00"
+printf "%s" "00"
-- 
2.27.0.rc1.69.g3229fa6e15


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet
  2020-05-22 13:33     ` [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet Denton Liu
@ 2020-05-22 15:54       ` Jeff King
  2020-05-22 16:05         ` Denton Liu
  0 siblings, 1 reply; 62+ messages in thread
From: Jeff King @ 2020-05-22 15:54 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine, Junio C Hamano

On Fri, May 22, 2020 at 09:33:47AM -0400, Denton Liu wrote:

> In the CGI scripts which emulate a connection being prematurely
> terminated, it doesn't make sense for there to be a trailing newline
> after the simulated connection cut. Remove these newlines.

Ah, good catch. I think in the first one it doesn't matter:

> -printf "%s%s\n" "0079" "45"
> +printf "%s%s" "0079" "45"

because we have a too-short packet, so the fact that it is 3 bytes and
not 2 does not change that.

I agree it's clearer without the newline, though. I wonder if:

  printf "0079" "fewer than 0x79 bytes"

would make it even more self-documenting. :)

> -printf "%s\n" "00"
> +printf "%s" "00"

This one is a behavior improvement: we were probably hitting "oops,
newline isn't a valid line-length character" before, and now we're
really hitting the truncated packet.

I don't know if it's worth adding an extra test with a bogus
line-length. I'm OK with with it either way.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet
  2020-05-22 15:54       ` Jeff King
@ 2020-05-22 16:05         ` Denton Liu
  2020-05-22 16:31           ` Jeff King
  0 siblings, 1 reply; 62+ messages in thread
From: Denton Liu @ 2020-05-22 16:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Eric Sunshine, Junio C Hamano

Hi Peff,

On Fri, May 22, 2020 at 11:54:10AM -0400, Jeff King wrote:
> > -printf "%s\n" "00"
> > +printf "%s" "00"
> 
> This one is a behavior improvement: we were probably hitting "oops,
> newline isn't a valid line-length character" before, and now we're
> really hitting the truncated packet.

The test currently actually greps for the "incomplete length" error
message and it passes so the behaviour remains the same. We just got
lucky that we send "00" instead of "000" beacuse "000\n" would've
otherwise given us a full length header.

> I don't know if it's worth adding an extra test with a bogus
> line-length. I'm OK with with it either way.

I think I'll leave this unless anyone really wants this to be tested.

Thanks,

Denton

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet
  2020-05-22 16:05         ` Denton Liu
@ 2020-05-22 16:31           ` Jeff King
  0 siblings, 0 replies; 62+ messages in thread
From: Jeff King @ 2020-05-22 16:31 UTC (permalink / raw)
  To: Denton Liu; +Cc: Git Mailing List, Eric Sunshine, Junio C Hamano

On Fri, May 22, 2020 at 12:05:52PM -0400, Denton Liu wrote:

> Hi Peff,
> 
> On Fri, May 22, 2020 at 11:54:10AM -0400, Jeff King wrote:
> > > -printf "%s\n" "00"
> > > +printf "%s" "00"
> > 
> > This one is a behavior improvement: we were probably hitting "oops,
> > newline isn't a valid line-length character" before, and now we're
> > really hitting the truncated packet.
> 
> The test currently actually greps for the "incomplete length" error
> message and it passes so the behaviour remains the same. We just got
> lucky that we send "00" instead of "000" beacuse "000\n" would've
> otherwise given us a full length header.

Ah, OK. I was surprised it didn't hit your new code in check_pktline()
that complains about non-hex chars. But the reason is that we don't
check digit-by-digit as we read them. We wait until we have 4 chars, and
then we feed the whole thing to packet_length(). But since we only
receive 3, the state machine doesn't move to that step. So that makes
sense.

> > I don't know if it's worth adding an extra test with a bogus
> > line-length. I'm OK with with it either way.
> 
> I think I'll leave this unless anyone really wants this to be tested.

Sounds good.

-Peff

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, back to index

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-13 18:04 [PATCH 0/6] remote-curl: partial fix for a deadlock with stateless rpc Denton Liu
2020-05-13 18:04 ` [PATCH 1/6] remote-curl: fix typo Denton Liu
2020-05-13 18:04 ` [PATCH 2/6] remote-curl: remove label indentation Denton Liu
2020-05-13 18:04 ` [PATCH 3/6] transport: combine common cases with a fallthrough Denton Liu
2020-05-13 23:14   ` Eric Sunshine
2020-05-18  9:18     ` Denton Liu
2020-05-18 17:43       ` Eric Sunshine
2020-05-13 18:04 ` [PATCH 4/6] pkt-line: extern packet_length() Denton Liu
2020-05-13 23:23   ` Eric Sunshine
2020-05-15 20:56   ` Jeff King
2020-05-15 20:57     ` Jeff King
2020-05-13 18:04 ` [PATCH 5/6] remote-curl: error on incomplete packet Denton Liu
2020-05-15 21:38   ` Jeff King
2020-05-18  9:08     ` Denton Liu
2020-05-18 15:49       ` Jeff King
2020-05-13 18:04 ` [PATCH 6/6] remote-curl: ensure last packet is a flush Denton Liu
2020-05-15 21:02   ` Denton Liu
2020-05-15 21:41     ` Jeff King
2020-05-18 16:34       ` Junio C Hamano
2020-05-18 16:52         ` Jeff King
2020-05-18 21:00           ` Jeff King
2020-05-18 15:47 ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Denton Liu
2020-05-18 15:47   ` [PATCH v2 1/7] remote-curl: fix typo Denton Liu
2020-05-18 15:47   ` [PATCH v2 2/7] remote-curl: remove label indentation Denton Liu
2020-05-18 18:37     ` Junio C Hamano
2020-05-18 15:47   ` [PATCH v2 3/7] transport: extract common fetch_pack() call Denton Liu
2020-05-18 18:40     ` Junio C Hamano
2020-05-18 15:47   ` [PATCH v2 4/7] pkt-line: extern packet_length() Denton Liu
2020-05-18 16:04     ` Jeff King
2020-05-18 17:50       ` Eric Sunshine
2020-05-18 20:08         ` Jeff King
2020-05-18 18:44       ` Junio C Hamano
2020-05-18 15:47   ` [PATCH v2 5/7] remote-curl: error on incomplete packet Denton Liu
2020-05-18 16:22     ` Jeff King
2020-05-18 16:51       ` Denton Liu
2020-05-18 15:47   ` [PATCH v2 6/7] pkt-line: PACKET_READ_RESPONSE_END Denton Liu
2020-05-18 15:47   ` [PATCH v2 7/7] stateless-connect: send response end packet Denton Liu
2020-05-18 16:43     ` Jeff King
2020-05-18 17:12       ` Denton Liu
2020-05-18 17:26         ` Jeff King
2020-05-18 16:50   ` [PATCH v2 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
2020-05-18 17:36     ` Denton Liu
2020-05-18 20:58       ` Jeff King
2020-05-18 22:52         ` Junio C Hamano
2020-05-19  2:38           ` Jeff King
2020-05-18 19:36     ` Junio C Hamano
2020-05-19 10:53   ` [PATCH v3 " Denton Liu
2020-05-19 10:53     ` [PATCH v3 1/7] remote-curl: fix typo Denton Liu
2020-05-19 10:53     ` [PATCH v3 2/7] remote-curl: remove label indentation Denton Liu
2020-05-19 10:53     ` [PATCH v3 3/7] transport: extract common fetch_pack() call Denton Liu
2020-05-19 10:53     ` [PATCH v3 4/7] pkt-line: extern packet_length() Denton Liu
2020-05-19 16:23       ` Eric Sunshine
2020-05-19 10:53     ` [PATCH v3 5/7] remote-curl: error on incomplete packet Denton Liu
2020-05-19 10:53     ` [PATCH v3 6/7] pkt-line: define PACKET_READ_RESPONSE_END Denton Liu
2020-05-19 10:54     ` [PATCH v3 7/7] stateless-connect: send response end packet Denton Liu
2020-05-19 18:40     ` [PATCH v3 0/7] remote-curl: fix deadlocks when remote server disconnects Jeff King
2020-05-19 21:14       ` Denton Liu
2020-05-19 20:51     ` [PATCH v3 8/7] fixup! pkt-line: extern packet_length() Denton Liu
2020-05-22 13:33     ` [PATCH v3 9/9] fixup! remote-curl: error on incomplete packet Denton Liu
2020-05-22 15:54       ` Jeff King
2020-05-22 16:05         ` Denton Liu
2020-05-22 16:31           ` Jeff King

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git