All of lore.kernel.org
 help / color / mirror / Atom feed
* git fetch (http) hanging/failing on one specific repository, http only
@ 2014-10-20 14:29 Dennis Kaarsemaker
  2014-10-21  9:48 ` Bug in fetch-pack causing ever-growing http requests. (Was: git fetch (http) hanging/failing on one specific repository, http only) Dennis Kaarsemaker
  0 siblings, 1 reply; 10+ messages in thread
From: Dennis Kaarsemaker @ 2014-10-20 14:29 UTC (permalink / raw)
  To: git

Since a few days, one of our repos is causing problems during git fetch,
basically git fetch over http hangs during find_common. When using ssh,
this does not happen. One thing to noe about this repo is that it has
53000+ refs, though I'm not convinced that that is the cause of my
ptoblem.

Client and server are both 2.1.2, the 'dirty' is from some debugging
printf's I threw in. I also tested with 1.8.4, which also exhibits this
behaviour:

GIT_CURL_VERBOSE=1 /home/dkaarsemaker/git/git fetch -vvv

* Couldn't find host xxx.booking.com in the .netrc file; using defaults
* Expire cleared
* About to connect() to xxx.booking.com port 80 (#0)
*   Trying 10.196.70.243... >
GET /git/main.git/info/refs?service=git-upload-pack HTTP/1.1
User-Agent: git/2.1.2.dirty
Host: xxx.booking.com
Accept: */*
Accept-Encoding: gzip
Pragma: no-cache

< HTTP/1.1 200 OK
< Date: Mon, 20 Oct 2014 13:41:00 GMT
< Server: Apache
< Expires: Fri, 01 Jan 1980 00:00:00 GMT
< Pragma: no-cache
< Cache-Control: no-cache, max-age=0, must-revalidate
< Transfer-Encoding: chunked
< Content-Type: application/x-git-upload-pack-advertisement
< 
* Expire cleared
* Connection #0 to host xxx.booking.com left intact
HEADServer supports multi_ack_detailed
Server supports no-done
Server supports side-band-64k
Server supports ofs-delta
Server version is git/2.1.2
want dfb112668697c44d3221ce1e603af1ee46fff41e (refs/heads/xxx)
want e884b8d79a935fdc9291c2424372cd372f0b7666 (refs/heads/xxx)
want d83ee364a2be10ea5d9fa9c19a64d8680f7f2327 (refs/heads/xxx)
want 3cf9d21630bb5e380be3537628cd026ddfee8a11 (refs/heads/xxx)
want 0ecb911a9062d2a94c5381efab8b10d76acfa487 (refs/heads/xxx)
want 53dca992d6162fb08a1c722317621866ade339ca (refs/heads/xxx)
want 49de5539e17ed53706159f4fdda2bdacac18f9d6 (refs/heads/xxx)
want 6fb579642139bff5e28f313012dbe7ffe4f3aae5 (refs/heads/xxx)
want 62bf71564cc70c3d9a5cd8046b09e136f9e27928 (refs/heads/xxx)
want c1f98f24174113f901950506ab6a08c54c88e867 (refs/heads/xxx)
want 830ee444f8e9f3127ee7c8ac054690581e6bea53 (refs/heads/xxx)
want dd7a12a3f7344e78136fe8ea1d8bd5b8a7592bda (refs/heads/xxx)
want acc5936de82f6f31a50a3c68dc1fe22bfe5818ef (refs/heads/xxx)
want 88c5f7c05e55a83826112c4cc19e2e9fc53bbdfb (refs/heads/xxx)
want 639339def49ba7f3d469413e7d26e0d205fa23b7 (refs/heads/xxx)
want 65a12becb29987726864709429052f8b63a9e468 (refs/heads/xxx)
want 69375797ac43b2427d6c88a07202cb986eb0e91e (refs/heads/xxx)
want c32e7a453643f43ad3a840d8972a591345eb4ce6 (refs/heads/xxx)
want 024a9c8904934522a8a625145997b118cb07ea85 (refs/heads/xxx)
want 7f3b10df5ddb7eb4d8b310812e7406515ff9c439 (refs/heads/xxx)
want 762cf79100279f14c1d6089daacd7a800d4574d9 (refs/heads/xxx)
want 8b74fe415472720cc72c761af8025e730f8378ce (refs/heads/xxx)
want 811b5bea55533bcc158da1a568b92a070a5aff46 (refs/heads/xxx)
want 8a46fb3dedb55e63fa580541c9f7633d425a8ea2 (refs/heads/xxx)
want d81598d11edcbda0e020b63c3ae6084029b0714c (refs/heads/xxx)
want 0641ea1ab05775a73f7e99342859ee638b534159 (refs/heads/xxx)
want 837466f5fbd51cf21c93e5d44124dce42deac469 (refs/heads/trunk)
want 8225bf4201c14518eef7c81c8964b5f11b12c0eb (refs/heads/xxx)
want d366df049d50d879cc3dfaced812ebac91a987ba (refs/heads/xxx)
want c47ff2f1b8155dd8a9e0113fd92c3a109dce72b7 (refs/heads/xxx)
want dd136cdae6ef772323b922c15b6166666ad987de (refs/tags/admin-20141016-163909)
<skip 150-odd tags with the same format

<skip lots of back anf forth have/ack, here are the apache logs for them>
10.189.174.96 - - [20/Oct/2014:15:41:04 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 904
10.189.174.96 - - [20/Oct/2014:15:41:04 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 1576
10.189.174.96 - - [20/Oct/2014:15:41:04 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 3312
10.189.174.96 - - [20/Oct/2014:15:41:04 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 6784
10.189.174.96 - - [20/Oct/2014:15:41:04 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 13728
10.189.174.96 - - [20/Oct/2014:15:41:05 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 27840
10.189.174.96 - - [20/Oct/2014:15:41:05 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 55392
10.189.174.96 - - [20/Oct/2014:15:41:05 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 112456
10.189.174.96 - - [20/Oct/2014:15:41:06 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 169800
10.189.174.96 - - [20/Oct/2014:15:41:06 +0200] "POST /git/main.git/git-upload-pack HTTP/1.1" 200 227144

<skip lots of acks>
got ack 3 87f1981596402d37d905dfb5356be7aa99020053
got ack 3 5ed75db2add2ae8abfb888fac1502ed3fac0221e
got ack 3 ee08fd65a137548c6769bf703f4da07307c3023d
<skip lots of haves>
have d6d890f55dcf83d1f6a6328c07a57b0c60f28e30
have bd94ba6d06011aaaf620abcea37e756a590bebd5
have 3946073390c2548a3f065c88f41a4b8964b9a2dd
have 34f881dbb4af0e3b090a93b17fba9c31b7771f61
have a7c18a679ce9e6eefe7930556735b29ecfc33952
POST git-upload-pack (gzip 262103 to 130185 bytes)
* Couldn't find host xxx.booking.com in the .netrc file; using defaults
* Re-using existing connection! (#0) with host xxx.booking.com
* Connected to xxx.booking.com (10.196.70.243) port 80 (#0)
> POST /git/main.git/git-upload-pack HTTP/1.1
User-Agent: git/2.1.2.dirty
Host: xxx.booking.com
Accept-Encoding: gzip
Content-Type: application/x-git-upload-pack-request
Accept: application/x-git-upload-pack-result
Content-Encoding: gzip
Content-Length: 130185

And here it hangs. The request does not show up in the apache
access/error log. The upload-pack process does get started though, but
git-upload-pack and git-http-backend seem deadlocked. strace shows:

sudo strace -p 3094 -p 3095 -p 3096
Process 3094 attached - interrupt to quit
Process 3095 attached - interrupt to quit
Process 3096 attached - interrupt to quit
[pid  3095] wait4(3096,  <unfinished ...>
[pid  3096] write(1, "0038ACK f3e8514d6e1dd402d27fecd9"..., 56 <unfinished ...>
[pid  3094] write(4, "32have e099999f7a64a93bcf8dd7f59"..., 8192 <unfinished ...>


3094, 3095, 3096 are git-http-backend, git upload pack and
git-upload-pack respectively.

Backtrace for the hanging git-fetch-pack (not suer if relevant):
#0  0x00007feb7abbf530 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x000000000051531c in xread (fd=0, buf=0x7fff9fc04a60, len=4) at wrapper.c:152
#2  0x000000000051539b in read_in_full (fd=0, buf=<value optimized out>, count=4) at wrapper.c:201
#3  0x00000000004cea39 in get_packet_data (fd=<value optimized out>, src_buf=0x0, src_size=0x0, 
    dst=<value optimized out>, size=4, options=2) at pkt-line.c:122
#4  0x00000000004ceced in packet_read (fd=<value optimized out>, src_buf=0x0, src_len=0x0, 
    buffer=0x7d7560 "ACK c6597d5122bf0a5c831b4585cd2ad92d953cbdc9 common", size=65520, options=2) at pkt-line.c:169
#5  0x00000000004cee7e in packet_read_line_generic (fd=<value optimized out>, len_p=<value optimized out>)
    at pkt-line.c:199
#6  packet_read_line (fd=<value optimized out>, len_p=<value optimized out>) at pkt-line.c:209
#7  0x00000000004a6504 in get_ack (fd=0, 
    result_sha1=0x7fff9fc04ba0 "\306Y}Q\"\277\n\\\203\033E\205\315*\331-\225<\275", <incomplete sequence \311>)
    at fetch-pack.c:194
#8  0x00000000004a8062 in find_common (args=<value optimized out>, fd=0x7fff9fc04ce0, conn=<value optimized out>, 
    ref=<value optimized out>, dest=<value optimized out>, sought=0xe17610, nr_sought=160, shallow=0x7fff9fc04ca0, 
    pack_lockfile=0x7fff9fc04cd8) at fetch-pack.c:391
#9  do_fetch_pack (args=<value optimized out>, fd=0x7fff9fc04ce0, conn=<value optimized out>, 
    ref=<value optimized out>, dest=<value optimized out>, sought=0xe17610, nr_sought=160, shallow=0x7fff9fc04ca0, 
    pack_lockfile=0x7fff9fc04cd8) at fetch-pack.c:856
#10 fetch_pack (args=<value optimized out>, fd=0x7fff9fc04ce0, conn=<value optimized out>, ref=<value optimized out>, 
    dest=<value optimized out>, sought=0xe17610, nr_sought=160, shallow=0x7fff9fc04ca0, pack_lockfile=0x7fff9fc04cd8)
    at fetch-pack.c:1058
#11 0x000000000042baf3 in cmd_fetch_pack (argc=<value optimized out>, argv=<value optimized out>, 
    prefix=<value optimized out>) at builtin/fetch-pack.c:180
#12 0x0000000000404dc5 in run_builtin (argc=9, argv=0x7fff9fc04f90) at git.c:351
#13 handle_builtin (argc=9, argv=0x7fff9fc04f90) at git.c:529
#14 0x0000000000405001 in run_argv (argc=9, av=<value optimized out>) at git.c:575
#15 main (argc=9, av=<value optimized out>) at git.c:662

And for the hanging git-upload-pack:
#0  0x00007f7c8034b4d0 in __write_nocancel () from /lib64/libpthread.so.0
#1  0x000000000043c9dc in xwrite (fd=1, buf=0x6c70e0, len=56) at wrapper.c:170
#2  0x000000000043ca5b in write_in_full (fd=1, buf=<value optimized out>, count=56) at wrapper.c:220
#3  0x000000000043d019 in write_or_die (fd=<value optimized out>, buf=<value optimized out>, 
    count=<value optimized out>) at write_or_die.c:61
#4  0x00000000004131fa in packet_write (fd=1, fmt=<value optimized out>) at pkt-line.c:93
#5  0x000000000040538c in get_common_commits (argc=<value optimized out>, argv=0x7fff00000001) at upload-pack.c:420
#6  upload_pack (argc=<value optimized out>, argv=0x7fff00000001) at upload-pack.c:778
#7  main (argc=<value optimized out>, argv=0x7fff00000001) at upload-pack.c:846

And the hanging git-http-backend:
#0  0x00007f4c9553d4d0 in __write_nocancel () from /lib64/libpthread.so.0
#1  0x000000000042d31c in xwrite (fd=4, buf=0x7fff394ea570, len=8192) at wrapper.c:170
#2  0x000000000042d39b in write_in_full (fd=4, buf=<value optimized out>, count=8192) at wrapper.c:220
#3  0x0000000000403e5d in inflate_request (prog_name=0x490d98 "upload-pack", out=4) at http-backend.c:305
#4  0x000000000040405d in run_service (argv=0x7fff394ee6d0) at http-backend.c:355
#5  0x00000000004041d2 in service_rpc (service_name=<value optimized out>) at http-backend.c:508
#6  0x0000000000404b35 in main (argc=<value optimized out>, argv=<value optimized out>) at http-backend.c:631

Both git-http-backend and git-upload-pack are trying to write at the
same time. I'm *guessing* I've hit some buffer limit here, given that
the have/ack exchanges are increasing in size and suddenly this one is
misbehaving. However I have no idea where to look next and would really
appreciate some help.

-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Bug in fetch-pack causing ever-growing http requests. (Was: git fetch (http) hanging/failing on one specific repository, http only)
  2014-10-20 14:29 git fetch (http) hanging/failing on one specific repository, http only Dennis Kaarsemaker
@ 2014-10-21  9:48 ` Dennis Kaarsemaker
  2014-10-21 14:49   ` [PATCH] fetch-pack: don't resend known-common refs in find_common Dennis Kaarsemaker
  0 siblings, 1 reply; 10+ messages in thread
From: Dennis Kaarsemaker @ 2014-10-21  9:48 UTC (permalink / raw)
  To: git

On ma, 2014-10-20 at 16:29 +0200, Dennis Kaarsemaker wrote:
> Since a few days, one of our repos is causing problems during git fetch,
> basically git fetch over http hangs during find_common. When using ssh,
> this does not happen. 

<snip things that may not be relevant>.

> And for the hanging git-upload-pack:
> #0  0x00007f7c8034b4d0 in __write_nocancel () from /lib64/libpthread.so.0
> #1  0x000000000043c9dc in xwrite (fd=1, buf=0x6c70e0, len=56) at wrapper.c:170
> #2  0x000000000043ca5b in write_in_full (fd=1, buf=<value optimized out>, count=56) at wrapper.c:220
> #3  0x000000000043d019 in write_or_die (fd=<value optimized out>, buf=<value optimized out>, 
>     count=<value optimized out>) at write_or_die.c:61
> #4  0x00000000004131fa in packet_write (fd=1, fmt=<value optimized out>) at pkt-line.c:93
> #5  0x000000000040538c in get_common_commits (argc=<value optimized out>, argv=0x7fff00000001) at upload-pack.c:420
> #6  upload_pack (argc=<value optimized out>, argv=0x7fff00000001) at upload-pack.c:778
> #7  main (argc=<value optimized out>, argv=0x7fff00000001) at upload-pack.c:846
> 
> And the hanging git-http-backend:
> #0  0x00007f4c9553d4d0 in __write_nocancel () from /lib64/libpthread.so.0
> #1  0x000000000042d31c in xwrite (fd=4, buf=0x7fff394ea570, len=8192) at wrapper.c:170
> #2  0x000000000042d39b in write_in_full (fd=4, buf=<value optimized out>, count=8192) at wrapper.c:220
> #3  0x0000000000403e5d in inflate_request (prog_name=0x490d98 "upload-pack", out=4) at http-backend.c:305
> #4  0x000000000040405d in run_service (argv=0x7fff394ee6d0) at http-backend.c:355
> #5  0x00000000004041d2 in service_rpc (service_name=<value optimized out>) at http-backend.c:508
> #6  0x0000000000404b35 in main (argc=<value optimized out>, argv=<value optimized out>) at http-backend.c:631
> 
> Both git-http-backend and git-upload-pack are trying to write at the
> same time. I'm *guessing* I've hit some buffer limit here, given that
> the have/ack exchanges are increasing in size and suddenly this one is
> misbehaving. However I have no idea where to look next and would really
> appreciate some help.

I think the reasoning in 44d8dc54e73e8010c4bdf57a422fc8d5ce709029 is
incomplete: there's still a pipe involved in the case of gzip'ed request
bodies, and it's here that it hangs. However, I now do think that this
is merely a symptom, because after inspecting the output a bit further I
noticed that all reponses start with the same lines:

got ack 3 a669f13aab3a2c192c15574ead70f92b303e8aee
got ack 3 360530ff695a4deb01575e85976060a083e17245
got ack 3 bab20d62a5a4c34885cf2acbf83aca91908f9af8

In fact, response N, is the same as response N-1 plus acks for the
commits in the 'have' lines of the debug output for the next request. So
it looks like every request sends all common commits again, which seems
wrong but does explain the ever-growing request size. After commenting
out line 413 in fetch-pack.c (state_len = req_buf.len) the requests and
responses don't increase in size, and fetch completes, though the
received pack seems too large (http response is 400MB), which makes me
think it's not actually ack'ing. Subsequent HTTP fetches don't get a big
pack in response though, so maybe the pack is the right size. THis is a
*very* busy repo with thousands of commits between the last succesful
fetch 5 days ago and the first succesfil fetch after my hack.

In any case, I think there's a bug here, but I don't know nearly enough
about the protocol to judge if my "fix" is even close to correct. I've
also not tested my "fix" with any other protocol yet.

-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-21  9:48 ` Bug in fetch-pack causing ever-growing http requests. (Was: git fetch (http) hanging/failing on one specific repository, http only) Dennis Kaarsemaker
@ 2014-10-21 14:49   ` Dennis Kaarsemaker
  2014-10-21 17:56     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Dennis Kaarsemaker @ 2014-10-21 14:49 UTC (permalink / raw)
  To: git

By not clearing the request buffer in stateless-rpc mode, fetch-pack
would keep sending already known-common commits, leading to ever bigger
http requests, eventually getting too large for git-http-backend to
handle properly without filling up the pipe buffer in inflate_request.
---
I'm still not quite sure whether this is the right thing to do, but make
test still passes :) The new testcase demonstrates the problem, when
running t5551 with EXPENSIVE, this test will hang without the patch to
fetch-pack.c and succeed otherwise.

 fetch-pack.c                |  1 -
 t/t5551-http-fetch-smart.sh | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 655ee64..258245c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -410,7 +410,6 @@ static int find_common(struct fetch_pack_args *args,
 						 */
 						const char *hex = sha1_to_hex(result_sha1);
 						packet_buf_write(&req_buf, "have %s\n", hex);
-						state_len = req_buf.len;
 					}
 					mark_common(commit, 0, 1);
 					retval = 0;
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 6cbc12d..2aac237 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -245,5 +245,37 @@ test_expect_success EXPENSIVE 'clone the 50,000 tag repo to check OS command lin
 	)
 '
 
+test_expect_success EXPENSIVE 'create 50,000 more tags' '
+	(
+	cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
+	for i in `test_seq 50001 100000`
+	do
+		echo "commit refs/heads/too-many-refs-again"
+		echo "mark :$i"
+		echo "committer git <git@example.com> $i +0000"
+		echo "data 0"
+		echo "M 644 inline bla.txt"
+		echo "data 4"
+		echo "bla"
+		# make every commit dangling by always
+		# rewinding the branch after each commit
+		echo "reset refs/heads/too-many-refs-again"
+		echo "from :50001"
+	done | git fast-import --export-marks=marks &&
+
+	# now assign tags to all the dangling commits we created above
+	tag=$(perl -e "print \"bla\" x 30") &&
+	sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
+	)
+'
+
+test_expect_success EXPENSIVE 'fetch the new tags' '
+	(
+		cd too-many-refs &&
+		git fetch --tags &&
+		test $(git for-each-ref refs/tags | wc -l) = 100000
+	)
+'
+
 stop_httpd
 test_done
-- 
2.1.0-245-g26e60d4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-21 14:49   ` [PATCH] fetch-pack: don't resend known-common refs in find_common Dennis Kaarsemaker
@ 2014-10-21 17:56     ` Junio C Hamano
  2014-10-22  7:41       ` Dennis Kaarsemaker
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2014-10-21 17:56 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git, Dennis Kaarsemaker

Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:

> By not clearing the request buffer in stateless-rpc mode, fetch-pack
> would keep sending already known-common commits, leading to ever bigger
> http requests, eventually getting too large for git-http-backend to
> handle properly without filling up the pipe buffer in inflate_request.
> ---
> I'm still not quite sure whether this is the right thing to do, but make
> test still passes :) The new testcase demonstrates the problem, when
> running t5551 with EXPENSIVE, this test will hang without the patch to
> fetch-pack.c and succeed otherwise.

IIUC, because "stateless" is just that, i.e. the server-end does not
keep track of what is already known, not telling what is known to be
common in each request would fundamentally break the protocol.  Am I
mistaken?


>  fetch-pack.c                |  1 -
>  t/t5551-http-fetch-smart.sh | 32 ++++++++++++++++++++++++++++++++
>  2 files changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 655ee64..258245c 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -410,7 +410,6 @@ static int find_common(struct fetch_pack_args *args,
>  						 */
>  						const char *hex = sha1_to_hex(result_sha1);
>  						packet_buf_write(&req_buf, "have %s\n", hex);
> -						state_len = req_buf.len;
>  					}
>  					mark_common(commit, 0, 1);
>  					retval = 0;
> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index 6cbc12d..2aac237 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -245,5 +245,37 @@ test_expect_success EXPENSIVE 'clone the 50,000 tag repo to check OS command lin
>  	)
>  '
>  
> +test_expect_success EXPENSIVE 'create 50,000 more tags' '
> +	(
> +	cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
> +	for i in `test_seq 50001 100000`
> +	do
> +		echo "commit refs/heads/too-many-refs-again"
> +		echo "mark :$i"
> +		echo "committer git <git@example.com> $i +0000"
> +		echo "data 0"
> +		echo "M 644 inline bla.txt"
> +		echo "data 4"
> +		echo "bla"
> +		# make every commit dangling by always
> +		# rewinding the branch after each commit
> +		echo "reset refs/heads/too-many-refs-again"
> +		echo "from :50001"
> +	done | git fast-import --export-marks=marks &&
> +
> +	# now assign tags to all the dangling commits we created above
> +	tag=$(perl -e "print \"bla\" x 30") &&
> +	sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
> +	)
> +'
> +
> +test_expect_success EXPENSIVE 'fetch the new tags' '
> +	(
> +		cd too-many-refs &&
> +		git fetch --tags &&
> +		test $(git for-each-ref refs/tags | wc -l) = 100000
> +	)
> +'
> +
>  stop_httpd
>  test_done

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-21 17:56     ` Junio C Hamano
@ 2014-10-22  7:41       ` Dennis Kaarsemaker
  2014-10-22 10:07         ` Duy Nguyen
  2014-10-22 17:11         ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Dennis Kaarsemaker @ 2014-10-22  7:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, git

On di, 2014-10-21 at 10:56 -0700, Junio C Hamano wrote:
> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> 
> > By not clearing the request buffer in stateless-rpc mode, fetch-pack
> > would keep sending already known-common commits, leading to ever bigger
> > http requests, eventually getting too large for git-http-backend to
> > handle properly without filling up the pipe buffer in inflate_request.
> > ---
> > I'm still not quite sure whether this is the right thing to do, but make
> > test still passes :) The new testcase demonstrates the problem, when
> > running t5551 with EXPENSIVE, this test will hang without the patch to
> > fetch-pack.c and succeed otherwise.
> 
> IIUC, because "stateless" is just that, i.e. the server-end does not
> keep track of what is already known, not telling what is known to be
> common in each request would fundamentally break the protocol.  Am I
> mistaken?

That sounds plausible, but why then does the fetch complete with this
line removed, and why does 'make test' still pass? I tried to understand
the protocol, but the documentation has TODO's in some critical
places :)

And if that's true, it means the inflate_request / upload-pack
interaction should be fixed, so more than 64k (current linux pipe buffer
size) of uncompressed data is supported. I see two options:

* Turning that interaction into a more cooperative process, with a
  select/poll loop
* Make upload-pack buffer its entire response when run in stateless_rpc
  mode until it has consumed all of the request

The latter sounds easier to do, but not being very familiar with the
protocol, I may have missed something obvious.

-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-22  7:41       ` Dennis Kaarsemaker
@ 2014-10-22 10:07         ` Duy Nguyen
  2014-10-26 15:42           ` Dennis Kaarsemaker
  2014-10-22 17:11         ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2014-10-22 10:07 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Junio C Hamano, Shawn Pearce, Git Mailing List

On Wed, Oct 22, 2014 at 2:41 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> I see two options:
>
> * Turning that interaction into a more cooperative process, with a
>   select/poll loop
> * Make upload-pack buffer its entire response when run in stateless_rpc
>   mode until it has consumed all of the request

Or add a helper daemon and support stateful smart http. Or maybe
that's what you meant in the first option.
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-22  7:41       ` Dennis Kaarsemaker
  2014-10-22 10:07         ` Duy Nguyen
@ 2014-10-22 17:11         ` Junio C Hamano
  2014-10-26 15:39           ` Dennis Kaarsemaker
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2014-10-22 17:11 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Shawn Pearce, git

Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:

> On di, 2014-10-21 at 10:56 -0700, Junio C Hamano wrote:
>> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
>> 
>> > By not clearing the request buffer in stateless-rpc mode, fetch-pack
>> > would keep sending already known-common commits, leading to ever bigger
>> > http requests, eventually getting too large for git-http-backend to
>> > handle properly without filling up the pipe buffer in inflate_request.
>> > ---
>> > I'm still not quite sure whether this is the right thing to do, but make
>> > test still passes :) The new testcase demonstrates the problem, when
>> > running t5551 with EXPENSIVE, this test will hang without the patch to
>> > fetch-pack.c and succeed otherwise.
>> 
>> IIUC, because "stateless" is just that, i.e. the server-end does not
>> keep track of what is already known, not telling what is known to be
>> common in each request would fundamentally break the protocol.  Am I
>> mistaken?
>
> That sounds plausible, but why then does the fetch complete with this
> line removed, and why does 'make test' still pass?

The fetch-pack program tries to help the upload-pack program(s)
running on the other end find what nodes in the graph both
repositories have in common by sending what the repository on its
end has.  Some commits may not be known by the other side (e.g. your
new commits that haven't been pushed there that are made on a branch
forked from the common history), and some others may be known
(i.e. you drilled down the history from the tips of your refs and
reached a commit that you fetched from the common history
previously).  The latter are ACKed by the upload-pack process and
are remembered to be re-sent to the _next_ incarnation of the
upload-pack process when stateless RPC is in use.

With your patch, you stop telling the upload-pack process what these
two programs already found to be common in their exchange.  After
the first exchange, fetch-pack and upload-pack may have noticed that
both ends have version 2.0, but because you do not convey that fact
to the other side, the new incarnation of upload-pack may end up
deciding that the version 1.9 is the newest common commit between
the two, and sending commits between 1.9 and 2.0.

If you imagine an extreme case, it would be easy to see why "the
fetch completes" and "make test passes" are not sufficient to say
anything about this change.  Even if you break the protocol in in a
way different from your patch, by not sending any "have", such a
butchered "fetch-pack" will become "fetch everything from scratch",
aka "clone".  The end result will still have correct history and
"fetch completes" would be true.

But I'd prefer deferring a more detailed analysis/explanation to
Shawn, as stateless RPC is his creation.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-22 17:11         ` Junio C Hamano
@ 2014-10-26 15:39           ` Dennis Kaarsemaker
  2014-12-06  0:48             ` Shawn Pearce
  0 siblings, 1 reply; 10+ messages in thread
From: Dennis Kaarsemaker @ 2014-10-26 15:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, git

On Wed, Oct 22, 2014 at 10:11:40AM -0700, Junio C Hamano wrote:
> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> 
> > On di, 2014-10-21 at 10:56 -0700, Junio C Hamano wrote:
> >> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> >> 
> >> > By not clearing the request buffer in stateless-rpc mode, fetch-pack
> >> > would keep sending already known-common commits, leading to ever bigger
> >> > http requests, eventually getting too large for git-http-backend to
> >> > handle properly without filling up the pipe buffer in inflate_request.
> >> > ---
> >> > I'm still not quite sure whether this is the right thing to do, but make
> >> > test still passes :) The new testcase demonstrates the problem, when
> >> > running t5551 with EXPENSIVE, this test will hang without the patch to
> >> > fetch-pack.c and succeed otherwise.
> >> 
> >> IIUC, because "stateless" is just that, i.e. the server-end does not
> >> keep track of what is already known, not telling what is known to be
> >> common in each request would fundamentally break the protocol.  Am I
> >> mistaken?
> >
> > That sounds plausible, but why then does the fetch complete with this
> > line removed, and why does 'make test' still pass?
> 
> The fetch-pack program tries to help the upload-pack program(s)
> running on the other end find what nodes in the graph both
> repositories have in common by sending what the repository on its
> end has.  Some commits may not be known by the other side (e.g. your
> new commits that haven't been pushed there that are made on a branch
> forked from the common history), and some others may be known
> (i.e. you drilled down the history from the tips of your refs and
> reached a commit that you fetched from the common history
> previously).  The latter are ACKed by the upload-pack process and
> are remembered to be re-sent to the _next_ incarnation of the
> upload-pack process when stateless RPC is in use.
> 
> With your patch, you stop telling the upload-pack process what these
> two programs already found to be common in their exchange.  After
> the first exchange, fetch-pack and upload-pack may have noticed that
> both ends have version 2.0, but because you do not convey that fact
> to the other side, the new incarnation of upload-pack may end up
> deciding that the version 1.9 is the newest common commit between
> the two, and sending commits between 1.9 and 2.0.
> 
> If you imagine an extreme case, it would be easy to see why "the
> fetch completes" and "make test passes" are not sufficient to say
> anything about this change.  Even if you break the protocol in in a
> way different from your patch, by not sending any "have", such a
> butchered "fetch-pack" will become "fetch everything from scratch",
> aka "clone".  The end result will still have correct history and
> "fetch completes" would be true.
> 
> But I'd prefer deferring a more detailed analysis/explanation to
> Shawn, as stateless RPC is his creation.

Thanks for the explanation, that makes it quite clear that this approach
is wrong. The patch below (apologies for the formatting, I'm not quite
sure how to use format-patch in this situation) does it differently: by
buffering upload-pack's output until it has read all the input, the new
test still succeeds and again 'make test' passes. 

---
 t/t5551-http-fetch-smart.sh | 32 ++++++++++++++++++++++++++++++++
 upload-pack.c               | 29 +++++++++++++++++++----------
 2 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 6cbc12d..2aac237 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -245,5 +245,37 @@ test_expect_success EXPENSIVE 'clone the 50,000 tag repo to check OS command lin
 	)
 '
 
+test_expect_success EXPENSIVE 'create 50,000 more tags' '
+	(
+	cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
+	for i in `test_seq 50001 100000`
+	do
+		echo "commit refs/heads/too-many-refs-again"
+		echo "mark :$i"
+		echo "committer git <git@example.com> $i +0000"
+		echo "data 0"
+		echo "M 644 inline bla.txt"
+		echo "data 4"
+		echo "bla"
+		# make every commit dangling by always
+		# rewinding the branch after each commit
+		echo "reset refs/heads/too-many-refs-again"
+		echo "from :50001"
+	done | git fast-import --export-marks=marks &&
+
+	# now assign tags to all the dangling commits we created above
+	tag=$(perl -e "print \"bla\" x 30") &&
+	sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
+	)
+'
+
+test_expect_success EXPENSIVE 'fetch the new tags' '
+	(
+		cd too-many-refs &&
+		git fetch --tags &&
+		test $(git for-each-ref refs/tags | wc -l) = 100000
+	)
+'
+
 stop_httpd
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index ac9ac15..3d76750 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -373,6 +373,7 @@ static int get_common_commits(void)
 	int got_common = 0;
 	int got_other = 0;
 	int sent_ready = 0;
+	struct strbuf resp_buf = STRBUF_INIT;
 
 	save_commit_buffer = 0;
 
@@ -384,15 +385,19 @@ static int get_common_commits(void)
 			if (multi_ack == 2 && got_common
 			    && !got_other && ok_to_give_up()) {
 				sent_ready = 1;
-				packet_write(1, "ACK %s ready\n", last_hex);
+				packet_buf_write(&resp_buf, "ACK %s ready\n", last_hex);
 			}
 			if (have_obj.nr == 0 || multi_ack)
-				packet_write(1, "NAK\n");
+				packet_buf_write(&resp_buf, "NAK\n");
 
 			if (no_done && sent_ready) {
-				packet_write(1, "ACK %s\n", last_hex);
+				packet_buf_write(&resp_buf, "ACK %s\n", last_hex);
+				write_or_die(1, resp_buf.buf, resp_buf.len);
+				strbuf_release(&resp_buf);
 				return 0;
 			}
+			write_or_die(1, resp_buf.buf, resp_buf.len);
+			strbuf_release(&resp_buf);
 			if (stateless_rpc)
 				exit(0);
 			got_common = 0;
@@ -407,20 +412,20 @@ static int get_common_commits(void)
 					const char *hex = sha1_to_hex(sha1);
 					if (multi_ack == 2) {
 						sent_ready = 1;
-						packet_write(1, "ACK %s ready\n", hex);
+						packet_buf_write(&resp_buf, "ACK %s ready\n", hex);
 					} else
-						packet_write(1, "ACK %s continue\n", hex);
+						packet_buf_write(&resp_buf, "ACK %s continue\n", hex);
 				}
 				break;
 			default:
 				got_common = 1;
 				memcpy(last_hex, sha1_to_hex(sha1), 41);
 				if (multi_ack == 2)
-					packet_write(1, "ACK %s common\n", last_hex);
+					packet_buf_write(&resp_buf, "ACK %s common\n", last_hex);
 				else if (multi_ack)
-					packet_write(1, "ACK %s continue\n", last_hex);
+					packet_buf_write(&resp_buf, "ACK %s continue\n", last_hex);
 				else if (have_obj.nr == 1)
-					packet_write(1, "ACK %s\n", last_hex);
+					packet_buf_write(&resp_buf, "ACK %s\n", last_hex);
 				break;
 			}
 			continue;
@@ -428,10 +433,14 @@ static int get_common_commits(void)
 		if (!strcmp(line, "done")) {
 			if (have_obj.nr > 0) {
 				if (multi_ack)
-					packet_write(1, "ACK %s\n", last_hex);
+					packet_buf_write(&resp_buf, "ACK %s\n", last_hex);
+				write_or_die(1, resp_buf.buf, resp_buf.len);
+				strbuf_release(&resp_buf);
 				return 0;
 			}
-			packet_write(1, "NAK\n");
+			packet_buf_write(&resp_buf, "NAK\n");
+			write_or_die(1, resp_buf.buf, resp_buf.len);
+			strbuf_release(&resp_buf);
 			return -1;
 		}
 		die("git upload-pack: expected SHA1 list, got '%s'", line);
-- 
2.1.0-245-g26e60d4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-22 10:07         ` Duy Nguyen
@ 2014-10-26 15:42           ` Dennis Kaarsemaker
  0 siblings, 0 replies; 10+ messages in thread
From: Dennis Kaarsemaker @ 2014-10-26 15:42 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Junio C Hamano, Shawn Pearce, Git Mailing List

On Wed, Oct 22, 2014 at 05:07:31PM +0700, Duy Nguyen wrote:
> On Wed, Oct 22, 2014 at 2:41 PM, Dennis Kaarsemaker
> <dennis@kaarsemaker.net> wrote:
> > I see two options:
> >
> > * Turning that interaction into a more cooperative process, with a
> >   select/poll loop
> > * Make upload-pack buffer its entire response when run in stateless_rpc
> >   mode until it has consumed all of the request
> 
> Or add a helper daemon and support stateful smart http. Or maybe
> that's what you meant in the first option.

No, I meant that get-http-backend should have a select/poll loop that
can read from and write to git-upload-pack at the same time, but option
two was easier to implement :)

Stateful smart http seems to be against the design goals of smart http if I
interpret Documentation/technical/http-protocol.txt correctly though, so
that doesn't seem to be the right approach.
-- 
Dennis Kaarsemaker <dennis@kaarsemaker.net>
http://twitter.com/seveas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch-pack: don't resend known-common refs in find_common
  2014-10-26 15:39           ` Dennis Kaarsemaker
@ 2014-12-06  0:48             ` Shawn Pearce
  0 siblings, 0 replies; 10+ messages in thread
From: Shawn Pearce @ 2014-12-06  0:48 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Junio C Hamano, git

On Sun, Oct 26, 2014 at 8:39 AM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> On Wed, Oct 22, 2014 at 10:11:40AM -0700, Junio C Hamano wrote:
>> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
>>
>> > On di, 2014-10-21 at 10:56 -0700, Junio C Hamano wrote:
>> >> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
>> >>
>> >> > By not clearing the request buffer in stateless-rpc mode, fetch-pack
>> >> > would keep sending already known-common commits, leading to ever bigger
>> >> > http requests, eventually getting too large for git-http-backend to
>> >> > handle properly without filling up the pipe buffer in inflate_request.
>> >> > ---
>> >> > I'm still not quite sure whether this is the right thing to do, but make
>> >> > test still passes :) The new testcase demonstrates the problem, when
>> >> > running t5551 with EXPENSIVE, this test will hang without the patch to
>> >> > fetch-pack.c and succeed otherwise.
>> >>
>> >> IIUC, because "stateless" is just that, i.e. the server-end does not
>> >> keep track of what is already known, not telling what is known to be
>> >> common in each request would fundamentally break the protocol.  Am I
>> >> mistaken?
>> >
>> > That sounds plausible, but why then does the fetch complete with this
>> > line removed, and why does 'make test' still pass?
>>
>> The fetch-pack program tries to help the upload-pack program(s)
>> running on the other end find what nodes in the graph both
>> repositories have in common by sending what the repository on its
>> end has.  Some commits may not be known by the other side (e.g. your
>> new commits that haven't been pushed there that are made on a branch
>> forked from the common history), and some others may be known
>> (i.e. you drilled down the history from the tips of your refs and
>> reached a commit that you fetched from the common history
>> previously).  The latter are ACKed by the upload-pack process and
>> are remembered to be re-sent to the _next_ incarnation of the
>> upload-pack process when stateless RPC is in use.
>>
>> With your patch, you stop telling the upload-pack process what these
>> two programs already found to be common in their exchange.  After
>> the first exchange, fetch-pack and upload-pack may have noticed that
>> both ends have version 2.0, but because you do not convey that fact
>> to the other side, the new incarnation of upload-pack may end up
>> deciding that the version 1.9 is the newest common commit between
>> the two, and sending commits between 1.9 and 2.0.
>>
>> If you imagine an extreme case, it would be easy to see why "the
>> fetch completes" and "make test passes" are not sufficient to say
>> anything about this change.  Even if you break the protocol in in a
>> way different from your patch, by not sending any "have", such a
>> butchered "fetch-pack" will become "fetch everything from scratch",
>> aka "clone".  The end result will still have correct history and
>> "fetch completes" would be true.
>>
>> But I'd prefer deferring a more detailed analysis/explanation to
>> Shawn, as stateless RPC is his creation.

Junio's statement above about the world is correct.

> Thanks for the explanation, that makes it quite clear that this approach
> is wrong. The patch below (apologies for the formatting, I'm not quite
> sure how to use format-patch in this situation) does it differently: by
> buffering upload-pack's output until it has read all the input, the new
> test still succeeds and again 'make test' passes.

This probably introduces latency into the traditional bidirectional
multi_ack protocol.

> @@ -384,15 +385,19 @@ static int get_common_commits(void)
>                         if (multi_ack == 2 && got_common
>                             && !got_other && ok_to_give_up()) {
>                                 sent_ready = 1;
> -                               packet_write(1, "ACK %s ready\n", last_hex);
> +                               packet_buf_write(&resp_buf, "ACK %s ready\n", last_hex);
>                         }
>                         if (have_obj.nr == 0 || multi_ack)
> -                               packet_write(1, "NAK\n");
> +                               packet_buf_write(&resp_buf, "NAK\n");

By buffering and delaying these when !stateless_rpc we defer telling
the peer in a multi_ack exchange. That delays the peer from cutting
off its communication by about 1RTT.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-12-06  0:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-20 14:29 git fetch (http) hanging/failing on one specific repository, http only Dennis Kaarsemaker
2014-10-21  9:48 ` Bug in fetch-pack causing ever-growing http requests. (Was: git fetch (http) hanging/failing on one specific repository, http only) Dennis Kaarsemaker
2014-10-21 14:49   ` [PATCH] fetch-pack: don't resend known-common refs in find_common Dennis Kaarsemaker
2014-10-21 17:56     ` Junio C Hamano
2014-10-22  7:41       ` Dennis Kaarsemaker
2014-10-22 10:07         ` Duy Nguyen
2014-10-26 15:42           ` Dennis Kaarsemaker
2014-10-22 17:11         ` Junio C Hamano
2014-10-26 15:39           ` Dennis Kaarsemaker
2014-12-06  0:48             ` Shawn Pearce

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.