* Deadlock between git-remote-http and git fetch-pack
@ 2017-01-27 22:31 tsuna
2017-01-27 23:19 ` Jonathan Tan
2017-01-27 23:34 ` Junio C Hamano
0 siblings, 2 replies; 3+ messages in thread
From: tsuna @ 2017-01-27 22:31 UTC
To: git
Hi there,
While investigating a hung job in our CI system today, I think I found
a deadlock in git-remote-http.
Git version: 2.9.3
Linux (amd64) kernel 4.9.0
Excerpt from the process list:
jenkins 27316 0.0 0.0 18508 6024 ? S 19:30 0:00 | \_ git -C ../../../arista fetch --unshallow
jenkins 27317 0.0 0.0 169608 10916 ? S 19:30 0:00 | \_ git-remote-http origin http://gerrit/arista
jenkins 27319 0.0 0.0 24160 8260 ? S 19:30 0:00 | \_ git fetch-pack --stateless-rpc --stdin --lock-pack --include-tag --thin --no-progress --depth=2147483647 http://gerrit/arista/
Here PID 27319 (git fetch-pack) is stuck reading from stdin, while its
parent, PID 27317 (git-remote-http), is stuck reading from its child’s
stdout. Nothing has moved for about 2 hours; it’s deadlocked.
> strace -fp 27319
strace: Process 27319 attached
read(0,
Here FD 0 is a pipe:
~ @8a33a534e2f7> lsof -np 27319 | grep 0r
git 27319 jenkins 0r FIFO 0,10 0t0 354519158 pipe
The writing end of which is owned by the parent process:
~ @8a33a534e2f7> lsof -n 2>/dev/null | fgrep 354519158
git-remot 27317 jenkins 4w FIFO 0,10 0t0 354519158 pipe
git 27319 jenkins 0r FIFO 0,10 0t0 354519158 pipe
And the parent process (git-remote-http) is stuck reading from another FD:
> strace -fp 27317
strace: Process 27317 attached
read(5,
And here FD 5 is another pipe:
~ @8a33a534e2f7> lsof -np 27317 | grep 5r
git-remot 27317 jenkins 5r FIFO 0,10 0t0 354519159 pipe
Which is the child’s stdout:
> lsof -n 2>/dev/null | fgrep 354519159
git-remot 27317 jenkins 5r FIFO 0,10 0t0 354519159 pipe
git 27319 jenkins 1w FIFO 0,10 0t0 354519159 pipe
Hence the deadlock.
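For readers who want to see the shape of this hang in isolation, here is a minimal sketch in Python. It is not git's code: the child script, the `parent_waits_first` helper, and the `want`/`ack` strings are all made up for illustration. The child only replies after it has read a line from stdin, and the parent waits for a reply before sending anything, so neither side can make progress. The sketch assumes a POSIX system (it calls `select()` on a pipe) and uses a timeout so that it terminates instead of hanging.

```python
import select
import subprocess
import sys

# Hypothetical child: it blocks on stdin and writes to stdout only after
# it has received a full line of input.
CHILD = (
    "import sys\n"
    "line = sys.stdin.readline()\n"
    "sys.stdout.write('ack ' + line)\n"
    "sys.stdout.flush()\n"
)


def parent_waits_first(timeout=1.0):
    """Return (stuck, reply).

    stuck is True when the child's stdout produced nothing within `timeout`
    seconds, i.e. a plain blocking read() here would have hung.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        # The parent wants a reply before it sends anything; the child wants
        # input before it replies. select() lets us observe that without
        # blocking forever.
        readable, _, _ = select.select([child.stdout], [], [], timeout)
        stuck = not readable

        # Break the cycle by sending the line the child was waiting for.
        child.stdin.write(b"want\n")
        child.stdin.close()
        reply = child.stdout.read()
        return stuck, reply
    finally:
        child.stdout.close()
        child.wait()


if __name__ == "__main__":
    stuck, reply = parent_waits_first()
    print("parent read would have blocked:", stuck)
    print("reply after unblocking the child:", reply)
```

Replacing the `select()` call with a direct `child.stdout.read()` reproduces the situation from the traces above: each process sits in `read()` on a pipe whose only writer is the other process.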
Stack trace in git-remote-http:
(gdb) bt
#0 0x00007f04f1e1363d in read () from target:/lib64/libpthread.so.0
#1 0x0000562417472d73 in xread ()
#2 0x0000562417472f2b in read_in_full ()
#3 0x0000562417438a6e in get_packet_data ()
#4 0x0000562417439129 in packet_read ()
#5 0x00005624174245e0 in rpc_service ()
#6 0x0000562417424f10 in fetch_git ()
#7 0x00005624174233fd in main ()
Stack trace in git fetch-pack:
(gdb) bt
#0 0x00007fb3ab478620 in __read_nocancel () from target:/lib64/libpthread.so.0
#1 0x000055f688827283 in xread ()
#2 0x000055f68882743b in read_in_full ()
#3 0x000055f6887ce35e in get_packet_data ()
#4 0x000055f6887cea19 in packet_read ()
#5 0x000055f6887ceb90 in packet_read_line ()
#6 0x000055f68879dd05 in get_ack ()
#7 0x000055f68879f6b4 in fetch_pack ()
#8 0x000055f688710619 in cmd_fetch_pack ()
#9 0x000055f6886dff7b in handle_builtin ()
#10 0x000055f6886df026 in main ()
I looked at the diff between v2.9.3 and HEAD on fetch-pack.c and
remote-curl.c and didn’t see anything noteworthy in that area of the
code, so I presume the bug is still there in master.
--
Benoit "tsuna" Sigoure
* Re: Deadlock between git-remote-http and git fetch-pack
From: Jonathan Tan @ 2017-01-27 23:19 UTC
To: tsuna, git
On 01/27/2017 02:31 PM, tsuna wrote:
> While investigating a hung job in our CI system today, I think I found
> a deadlock in git-remote-http
> ...
I haven't looked into this in detail, but this might be related to
something I discovered while writing my patch set. I noticed that
upload-pack (the process on the "other side" of fetch-pack) can die
without first writing any notification, causing fetch-pack to block
forever on a read. A fix would probably look like that patch [1].
[1] <afe5d7d3f876893fdad318665805df1e056717c6.1485381677.git.jonathantanmy@google.com>
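Both stack traces in the report end in packet_read(), which reads git's pkt-line framing: four hex digits giving the packet length (including those four bytes), followed by the payload, with "0000" as a flush packet. As a rough illustration of why a peer that goes away silently is a problem, here is a small Python sketch of a pkt-line reader that reports a truncated stream as an error. This is not git's C implementation and not the patch referenced in [1]; the names `read_exact`, `read_pkt_line` and `UnexpectedEOF` are invented for this example, and it only handles data and flush packets.

```python
import io


class UnexpectedEOF(Exception):
    """The stream ended in the middle of a packet or before one started."""


def read_exact(stream, n):
    """Read exactly n bytes from stream, or raise UnexpectedEOF."""
    data = b""
    while len(data) < n:
        chunk = stream.read(n - len(data))
        if not chunk:
            raise UnexpectedEOF(f"wanted {n} bytes, got {len(data)}")
        data += chunk
    return data


def read_pkt_line(stream):
    """Return the payload of the next pkt-line, or None for a flush packet."""
    header = read_exact(stream, 4)
    length = int(header.decode("ascii"), 16)
    if length == 0:
        return None  # "0000" is the flush packet
    if length < 4:
        # Other short lengths have special meanings in newer protocol
        # versions; this sketch does not handle them.
        raise ValueError(f"unsupported pkt-line length {length}")
    return read_exact(stream, length - 4)


if __name__ == "__main__":
    stream = io.BytesIO(b"0008ACK\n0000")
    print(read_pkt_line(stream))  # payload of the first packet
    print(read_pkt_line(stream))  # flush packet
    try:
        read_pkt_line(stream)
    except UnexpectedEOF as err:
        print("peer went away:", err)
```

Note that an error like this can only be raised if the reader actually sees end-of-file. On a pipe that requires every write end to be closed, and in the report above the write end of fetch-pack's stdin is still held open by git-remote-http, so the read blocks instead of returning.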
* Re: Deadlock between git-remote-http and git fetch-pack
From: Junio C Hamano @ 2017-01-27 23:34 UTC
To: tsuna; +Cc: git
tsuna <tsunanet@gmail.com> writes:
> While investigating a hung job in our CI system today, I think I found
> a deadlock in git-remote-http
> ...
> Here PID 27319 (git fetch-pack) is stuck reading on stdin, while its
> parent, PID 27317 (git-remote-http) is stuck reading on its child’s
> stdout. Nothing has moved for like 2h, it’s deadlocked.
Hmph, would this be related to 296b847c0d ("remote-curl: don't hang
when a server dies before any output", 2016-11-18) I wonder...