All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] t5516 "75 - deny fetch  unreachable SHA1, allowtipsha1inwant=true" flaky?
@ 2015-10-24 13:08 Lars Schneider
  2015-10-25  7:18 ` Fredrik Medley
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Schneider @ 2015-10-24 13:08 UTC (permalink / raw)
  To: Git Users; +Cc: fredrik.medley

Hi,

while working on the Git CI integration I noticed that t5516 "75 - deny fetch 
unreachable SHA1, allowtipsha1inwant=true" (introduced in 68ee628) seems to be 
flaky on TravisCI. I get the following output in verbose mode:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
expecting success: 
        mk_empty testrepo &&
        (
            cd testrepo &&
            git config uploadpack.allowtipsha1inwant $configallowtipsha1inwant &&
            git commit --allow-empty -m foo &&
            git commit --allow-empty -m bar &&
            git commit --allow-empty -m xyz
        ) &&
        SHA1_1=$(git --git-dir=testrepo/.git rev-parse HEAD^^) &&
        SHA1_2=$(git --git-dir=testrepo/.git rev-parse HEAD^) &&
        SHA1_3=$(git --git-dir=testrepo/.git rev-parse HEAD) &&
        (
            cd testrepo &&
            git reset --hard $SHA1_2 &&
            git cat-file commit $SHA1_1 &&
            git cat-file commit $SHA1_3
        ) &&
        mk_empty shallow &&
        (
            cd shallow &&
            test_must_fail git fetch ../testrepo/.git $SHA1_3 &&
            test_must_fail git fetch ../testrepo/.git $SHA1_1 &&
            git --git-dir=../testrepo/.git config uploadpack.allowreachablesha1inwant true &&
            git fetch ../testrepo/.git $SHA1_1 &&
            git cat-file commit $SHA1_1 &&
            test_must_fail git cat-file commit $SHA1_2 &&
            git fetch ../testrepo/.git $SHA1_2 &&
            git cat-file commit $SHA1_2 &&
            test_must_fail git fetch ../testrepo/.git $SHA1_3
        )
    
Initialized empty Git repository in /home/travis/build/larsxschneider/git/t/trash directory.t5516-fetch-push/testrepo/.git/
[master (root-commit) a6b22ca] foo
 Author: A U Thor <author@example.com>
[master 5e26403] bar
 Author: A U Thor <author@example.com>
[master 64ea4c1] xyz
 Author: A U Thor <author@example.com>
HEAD is now at 5e26403 bar
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author A U Thor <author@example.com> 1112912053 -0700
committer C O Mitter <committer@example.com> 1112912053 -0700
foo
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 5e26403b4485d7a44fd8b65dc3f71e21c0dd6fad
author A U Thor <author@example.com> 1112912053 -0700
committer C O Mitter <committer@example.com> 1112912053 -0700
xyz
Initialized empty Git repository in /home/travis/build/larsxschneider/git/t/trash directory.t5516-fetch-push/shallow/.git/
fatal: git upload-pack: not our ref 64ea4c133d59fa98e86a771eda009872d6ab2886
test_must_fail: died by signal: git fetch ../testrepo/.git 64ea4c133d59fa98e86a771eda009872d6ab2886
not ok 75 - deny fetch unreachable SHA1, allowtipsha1inwant=true 
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

"git fetch ../testrepo/.git $SHA1_3" seems to die as follows:
1. fetch-pack.c:408 goto done;
2. fetch-pack.c:447 done:
3. fetch-pack.c:229 static void send_request... (send "0009done\n" --> 9 bytes)
4. write_or_die.c:74 write_or_die
5. wrapper.c:331 write_in_full
6. wrapper.c:281 xwrite
7. wrapper.c:287 write --> just dies with exit code 141?!
(use 4688abf to match the line numbers)

You can find the full logs about the CI run here:
https://travis-ci.org/larsxschneider/git/jobs/86984110

I repeatedly executed this test to demonstrate the flakiness:
failed after 100 attempts: https://travis-ci.org/larsxschneider/git/jobs/87181753
failed after 1876 attempts: https://travis-ci.org/larsxschneider/git/jobs/87181754
(all executed on bbd2929 from https://github.com/larsxschneider/git)

Unfortunately I was not able to make it fail on my local machine (OS X 10.9) 
nor on my VM (Ubuntu 14.10). Travis CI uses Ubuntu 12.04 LTS Server Edition 64 
bit with a AUFS file system.

Has anyone an idea what might causes these failures or can give me pointers how
to investigate it?

Thank you,
Lars

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" flaky?
  2015-10-24 13:08 [RFC] t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" flaky? Lars Schneider
@ 2015-10-25  7:18 ` Fredrik Medley
  2015-10-25 13:47   ` Lars Schneider
  0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Medley @ 2015-10-25  7:18 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Users

I think the following happens:
1. The remote upload-pack finds out "not our ref"
2. The remote send a response and close the pipe
3. fetch-pack still tries to write commands to the remote upload-pack
4. Because the connection has already been closed, writing will fail
with EPIPE which is detected by write_or_die() -> check_pipe()
resulting in die(141)

The reason for the test to succeed, i.e. git-fetch fails with 128 (or
1?), in most cases must be because it manages to write everything
before the context switch to the remote upload-pack occurs.

What is actually the wanted outcome? Should git-fetch try to continue
to see if the already received response is enough to continue as
normal?

Best regards,
Fredrik

2015-10-24 15:08 GMT+02:00 Lars Schneider <larsxschneider@gmail.com>:
> Hi,
>
> while working on the Git CI integration I noticed that t5516 "75 - deny fetch
> unreachable SHA1, allowtipsha1inwant=true" (introduced in 68ee628) seems to be
> flaky on TravisCI. I get the following output in verbose mode:
>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> expecting success:
>         mk_empty testrepo &&
>         (
>             cd testrepo &&
>             git config uploadpack.allowtipsha1inwant $configallowtipsha1inwant &&
>             git commit --allow-empty -m foo &&
>             git commit --allow-empty -m bar &&
>             git commit --allow-empty -m xyz
>         ) &&
>         SHA1_1=$(git --git-dir=testrepo/.git rev-parse HEAD^^) &&
>         SHA1_2=$(git --git-dir=testrepo/.git rev-parse HEAD^) &&
>         SHA1_3=$(git --git-dir=testrepo/.git rev-parse HEAD) &&
>         (
>             cd testrepo &&
>             git reset --hard $SHA1_2 &&
>             git cat-file commit $SHA1_1 &&
>             git cat-file commit $SHA1_3
>         ) &&
>         mk_empty shallow &&
>         (
>             cd shallow &&
>             test_must_fail git fetch ../testrepo/.git $SHA1_3 &&
>             test_must_fail git fetch ../testrepo/.git $SHA1_1 &&
>             git --git-dir=../testrepo/.git config uploadpack.allowreachablesha1inwant true &&
>             git fetch ../testrepo/.git $SHA1_1 &&
>             git cat-file commit $SHA1_1 &&
>             test_must_fail git cat-file commit $SHA1_2 &&
>             git fetch ../testrepo/.git $SHA1_2 &&
>             git cat-file commit $SHA1_2 &&
>             test_must_fail git fetch ../testrepo/.git $SHA1_3
>         )
>
> Initialized empty Git repository in /home/travis/build/larsxschneider/git/t/trash directory.t5516-fetch-push/testrepo/.git/
> [master (root-commit) a6b22ca] foo
>  Author: A U Thor <author@example.com>
> [master 5e26403] bar
>  Author: A U Thor <author@example.com>
> [master 64ea4c1] xyz
>  Author: A U Thor <author@example.com>
> HEAD is now at 5e26403 bar
> tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
> author A U Thor <author@example.com> 1112912053 -0700
> committer C O Mitter <committer@example.com> 1112912053 -0700
> foo
> tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
> parent 5e26403b4485d7a44fd8b65dc3f71e21c0dd6fad
> author A U Thor <author@example.com> 1112912053 -0700
> committer C O Mitter <committer@example.com> 1112912053 -0700
> xyz
> Initialized empty Git repository in /home/travis/build/larsxschneider/git/t/trash directory.t5516-fetch-push/shallow/.git/
> fatal: git upload-pack: not our ref 64ea4c133d59fa98e86a771eda009872d6ab2886
> test_must_fail: died by signal: git fetch ../testrepo/.git 64ea4c133d59fa98e86a771eda009872d6ab2886
> not ok 75 - deny fetch unreachable SHA1, allowtipsha1inwant=true
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>
> "git fetch ../testrepo/.git $SHA1_3" seems to die as follows:
> 1. fetch-pack.c:408 goto done;
> 2. fetch-pack.c:447 done:
> 3. fetch-pack.c:229 static void send_request... (send "0009done\n" --> 9 bytes)
> 4. write_or_die.c:74 write_or_die
> 5. wrapper.c:331 write_in_full
> 6. wrapper.c:281 xwrite
> 7. wrapper.c:287 write --> just dies with exit code 141?!
> (use 4688abf to match the line numbers)
>
> You can find the full logs about the CI run here:
> https://travis-ci.org/larsxschneider/git/jobs/86984110
>
> I repeatedly executed this test to demonstrate the flakiness:
> failed after 100 attempts: https://travis-ci.org/larsxschneider/git/jobs/87181753
> failed after 1876 attempts: https://travis-ci.org/larsxschneider/git/jobs/87181754
> (all executed on bbd2929 from https://github.com/larsxschneider/git)
>
> Unfortunately I was not able to make it fail on my local machine (OS X 10.9)
> nor on my VM (Ubuntu 14.10). Travis CI uses Ubuntu 12.04 LTS Server Edition 64
> bit with a AUFS file system.
>
> Has anyone an idea what might causes these failures or can give me pointers how
> to investigate it?
>
> Thank you,
> Lars

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" flaky?
  2015-10-25  7:18 ` Fredrik Medley
@ 2015-10-25 13:47   ` Lars Schneider
  2015-10-25 17:38     ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Schneider @ 2015-10-25 13:47 UTC (permalink / raw)
  To: Fredrik Medley, peff, patrick.reynolds; +Cc: Git Users

Thanks for your explanation Fredrik! However, I believe your 4. step is not what happens in the current implementation as the write call in wrapper.c dies directly. I see three ways to fix the problem:

(1) Make upload-pack wait for a response (with timeout) before it closes the pipe. However, I believe this would not be in line with the general Git philosophy stated in "git.c" (added in 7559a1be):
"Many parts of Git have subprograms communicate via pipe, expect the upstream of a pipe to die with SIGPIPE when the downstream of a pipe does not need to read all that is written."

(2) Ignore SIGPIPE errors when "fetch-pack" sends a "done" using "sigchain_push(SIGPIPE, SIG_IGN)" / "sigchain_pop(SIGPIPE)". However, then the check_pipe function (added in 756e676c) kicks in and we would need to work around that as well.

(3) Add a method "test_must_fail_or_die" to "test-lib-functions.sh". This method accepts exit codes 129<x<192, too. Use the new method in t5516.

All this code is fairly new to me and I don't understand all the ramifications. That being said I prefer solution (3) as it is the simplest solution.

Thanks,
Lars

On 25 Oct 2015, at 08:18, Fredrik Medley <fredrik.medley@gmail.com> wrote:

> I think the following happens:
> 1. The remote upload-pack finds out "not our ref"
> 2. The remote send a response and close the pipe
> 3. fetch-pack still tries to write commands to the remote upload-pack
> 4. Because the connection has already been closed, writing will fail
> with EPIPE which is detected by write_or_die() -> check_pipe()
> resulting in die(141)
> 
> The reason for the test to succeed, i.e. git-fetch fails with 128 (or
> 1?), in most cases must be because it manages to write everything
> before the context switch to the remote upload-pack occurs.
> 
> What is actually the wanted outcome? Should git-fetch try to continue
> to see if the already received response is enough to continue as
> normal?
> 
> Best regards,
> Fredrik
> 
> 2015-10-24 15:08 GMT+02:00 Lars Schneider <larsxschneider@gmail.com>:
>> Hi,
>> 
>> while working on the Git CI integration I noticed that t5516 "75 - deny fetch
>> unreachable SHA1, allowtipsha1inwant=true" (introduced in 68ee628) seems to be
>> flaky on TravisCI. I get the following output in verbose mode:
>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> expecting success:
>>        mk_empty testrepo &&
>>        (
>>            cd testrepo &&
>>            git config uploadpack.allowtipsha1inwant $configallowtipsha1inwant &&
>>            git commit --allow-empty -m foo &&
>>            git commit --allow-empty -m bar &&
>>            git commit --allow-empty -m xyz
>>        ) &&
>>        SHA1_1=$(git --git-dir=testrepo/.git rev-parse HEAD^^) &&
>>        SHA1_2=$(git --git-dir=testrepo/.git rev-parse HEAD^) &&
>>        SHA1_3=$(git --git-dir=testrepo/.git rev-parse HEAD) &&
>>        (
>>            cd testrepo &&
>>            git reset --hard $SHA1_2 &&
>>            git cat-file commit $SHA1_1 &&
>>            git cat-file commit $SHA1_3
>>        ) &&
>>        mk_empty shallow &&
>>        (
>>            cd shallow &&
>>            test_must_fail git fetch ../testrepo/.git $SHA1_3 &&
>>            test_must_fail git fetch ../testrepo/.git $SHA1_1 &&
>>            git --git-dir=../testrepo/.git config uploadpack.allowreachablesha1inwant true &&
>>            git fetch ../testrepo/.git $SHA1_1 &&
>>            git cat-file commit $SHA1_1 &&
>>            test_must_fail git cat-file commit $SHA1_2 &&
>>            git fetch ../testrepo/.git $SHA1_2 &&
>>            git cat-file commit $SHA1_2 &&
>>            test_must_fail git fetch ../testrepo/.git $SHA1_3
>>        )
>> 
>> Initialized empty Git repository in /home/travis/build/larsxschneider/git/t/trash directory.t5516-fetch-push/testrepo/.git/
>> [master (root-commit) a6b22ca] foo
>> Author: A U Thor <author@example.com>
>> [master 5e26403] bar
>> Author: A U Thor <author@example.com>
>> [master 64ea4c1] xyz
>> Author: A U Thor <author@example.com>
>> HEAD is now at 5e26403 bar
>> tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>> author A U Thor <author@example.com> 1112912053 -0700
>> committer C O Mitter <committer@example.com> 1112912053 -0700
>> foo
>> tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>> parent 5e26403b4485d7a44fd8b65dc3f71e21c0dd6fad
>> author A U Thor <author@example.com> 1112912053 -0700
>> committer C O Mitter <committer@example.com> 1112912053 -0700
>> xyz
>> Initialized empty Git repository in /home/travis/build/larsxschneider/git/t/trash directory.t5516-fetch-push/shallow/.git/
>> fatal: git upload-pack: not our ref 64ea4c133d59fa98e86a771eda009872d6ab2886
>> test_must_fail: died by signal: git fetch ../testrepo/.git 64ea4c133d59fa98e86a771eda009872d6ab2886
>> not ok 75 - deny fetch unreachable SHA1, allowtipsha1inwant=true
>> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>> 
>> "git fetch ../testrepo/.git $SHA1_3" seems to die as follows:
>> 1. fetch-pack.c:408 goto done;
>> 2. fetch-pack.c:447 done:
>> 3. fetch-pack.c:229 static void send_request... (send "0009done\n" --> 9 bytes)
>> 4. write_or_die.c:74 write_or_die
>> 5. wrapper.c:331 write_in_full
>> 6. wrapper.c:281 xwrite
>> 7. wrapper.c:287 write --> just dies with exit code 141?!
>> (use 4688abf to match the line numbers)
>> 
>> You can find the full logs about the CI run here:
>> https://travis-ci.org/larsxschneider/git/jobs/86984110
>> 
>> I repeatedly executed this test to demonstrate the flakiness:
>> failed after 100 attempts: https://travis-ci.org/larsxschneider/git/jobs/87181753
>> failed after 1876 attempts: https://travis-ci.org/larsxschneider/git/jobs/87181754
>> (all executed on bbd2929 from https://github.com/larsxschneider/git)
>> 
>> Unfortunately I was not able to make it fail on my local machine (OS X 10.9)
>> nor on my VM (Ubuntu 14.10). Travis CI uses Ubuntu 12.04 LTS Server Edition 64
>> bit with a AUFS file system.
>> 
>> Has anyone an idea what might causes these failures or can give me pointers how
>> to investigate it?
>> 
>> Thank you,
>> Lars

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" flaky?
  2015-10-25 13:47   ` Lars Schneider
@ 2015-10-25 17:38     ` Junio C Hamano
  2015-10-30 21:00       ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2015-10-25 17:38 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Fredrik Medley, peff, patrick.reynolds, Git Users

Lars Schneider <larsxschneider@gmail.com> writes:

> (1) Make upload-pack wait for a response (with timeout) before it
> closes the pipe. However, I believe this would not be in line with
> the general Git philosophy stated in "git.c" (added in 7559a1be):
> "Many parts of Git have subprograms communicate via pipe, expect
> the upstream of a pipe to die with SIGPIPE when the downstream of
> a pipe does not need to read all that is written."

While I think "waiting" with "timeout" is undesirable for the
upload-pack / fetch-pack communication, you are rejecting it for a
wrong reason.

What you quoted is not even a Git philosophy.  Yes, many parts do
want to behave like that, and the "restore" needs to make sure that
such a behaviour happens by fixing the environment given to us by
the process spawning us.

That does not mean some parts of the system, if they want to run
SIGPIPE ignored, must not do so because it goes "against the
philosophy".  

It only means that these parts of the system, if they choose to,
must arrange to do so themselves.  That "restore" thing is merely to
give us a known good state to start from.

> (2) Ignore SIGPIPE errors when "fetch-pack" sends a "done" using
> "sigchain_push(SIGPIPE, SIG_IGN)" /
> "sigchain_pop(SIGPIPE)". However, then the check_pipe function
> (added in 756e676c) kicks in and we would need to work around that
> as well.

I am not sure if we are looking at the issue the right way once we
start saying 'do this _when_ fetch sends ...'.

The communication between fetch and upload goes bidirectionally in
an overlapping fashion, and at any point in the communication, not
limited to "after fetch sends 'done'", the other side can decide to
disconnect while one side is sending.  If you are lucky, you will
finish sending the current batch and try to read from the other side
and notice "the remote end hung up unexpectedly", and if you are
unlucky, you find your write cannot go through due to broken pipe.

Dying with SIGPIPE would not give us a chance to clean things up
(i.e. react to the "not our ref" error in a more application
specific way), so the current behaviour has a room for improvement,
but I suspect that changes required for "dying more nicely" would be
rather large [*1*]; I am not convinced if it is worth the effort.

That leaves us to something along this line...

> (3) Add a method "test_must_fail_or_die" to
> "test-lib-functions.sh". This method accepts exit codes 129<x<192,
> too. Use the new method in t5516.

... but I have to wonder if 129<x<192 is loosening too much, given
that the only error we expect, other than the orderly shutdown, is
reception of sigpipe.


[Footnote]

*1* We'd need to update fetch-pack.c:send_request() so that it and
    its helpers use a variant of write() that does not die with
    SIGPIPE and instead returns an error status to the caller, and
    update all the callers to react to an error, which may involve
    jumping to the receive codepath with the hope that some early
    response is already waiting for us to read and gives us an error
    message from the remote side, or something along that line.  And
    we'd probably need to update the other direction, i.e. push and
    receive, for symmetry.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" flaky?
  2015-10-25 17:38     ` Junio C Hamano
@ 2015-10-30 21:00       ` Junio C Hamano
  2015-10-30 21:22         ` [PATCH] test: accept death by SIGPIPE as a valid failure mode Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2015-10-30 21:00 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Fredrik Medley, peff, patrick.reynolds, Git Users

Junio C Hamano <gitster@pobox.com> writes:

> That leaves us to something along this line...
>
>> (3) Add a method "test_must_fail_or_die" to
>> "test-lib-functions.sh". This method accepts exit codes 129<x<192,
>> too. Use the new method in t5516.
>
> ... but I have to wonder if 129<x<192 is loosening too much, given
> that the only error we expect, other than the orderly shutdown, is
> reception of sigpipe.

So, how about doing something like this as a starter.  All of the
object transport codepath share "we may notice that the other end
disconnected, or that the other end explicitly told us it found an
error, and die, or the other end may have died, perhaps after giving
a human-readable error message, and we end up dying when we try to
tell them more with SIGPIPE", and that by itself is not a bug in the
real life---we will exit with non-zero status and that is a sign
enough for the user to know that the command has failed.



 t/test-lib-functions.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 6dffb8b..b1f950d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -579,6 +579,9 @@ test_must_fail () {
 	if test $exit_code = 0; then
 		echo >&2 "test_must_fail: command succeeded: $*"
 		return 1
+	elif test $exit_code -eq 141; then
+		echo >&2 "test_must_fail: died with sigpipe: $*"
+		return 0
 	elif test $exit_code -gt 129 && test $exit_code -le 192; then
 		echo >&2 "test_must_fail: died by signal: $*"
 		return 1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] test: accept death by SIGPIPE as a valid failure mode
  2015-10-30 21:00       ` Junio C Hamano
@ 2015-10-30 21:22         ` Junio C Hamano
  2015-11-05  7:47           ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2015-10-30 21:22 UTC (permalink / raw)
  To: Git Users; +Cc: Fredrik Medley, peff, patrick.reynolds, Lars Schneider

On a local host, the object/history transport code often talks over
pipe with the other side.  The other side may notice some (expected)
failure, send the error message either to our process or to the
standard error and hung up.  In such codepaths, if timing were not
unfortunate, our side would receive the report of (expected) failure
from the other side over the pipe and die().  Otherwise, our side
may still be trying to talk to it and would die with a SIGPIPE.

This was observed as an intermittent breakage in t5516 by a few
people.

In the real-life scenario, either mode of death exits with a
non-zero status, and the user would learn that the command failed.
The test_must_fail helper should also know that dying with SIGPIPE
is one of the valid failure modes when we are expecting the tested
operation to notice problem and fail.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This time with a real commit log message.  As the goal is to
   treat death-by-sigpipe just like a call to die(), this version
   silently lets the test pass instead of giving a message to the
   standard error stream.

 t/test-lib-functions.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 8f8858a..4f40ffe 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -556,6 +556,9 @@ test_must_fail () {
 	if test $exit_code = 0; then
 		echo >&2 "test_must_fail: command succeeded: $*"
 		return 1
+	elif test $exit_code -eq 141; then
+		# died with sigpipe
+		return 0
 	elif test $exit_code -gt 129 && test $exit_code -le 192; then
 		echo >&2 "test_must_fail: died by signal: $*"
 		return 1
-- 
2.6.2-456-gc12a6fe

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] test: accept death by SIGPIPE as a valid failure mode
  2015-10-30 21:22         ` [PATCH] test: accept death by SIGPIPE as a valid failure mode Junio C Hamano
@ 2015-11-05  7:47           ` Jeff King
  2015-11-05  9:34             ` Lars Schneider
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff King @ 2015-11-05  7:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Users, Fredrik Medley, patrick.reynolds, Lars Schneider

On Fri, Oct 30, 2015 at 02:22:14PM -0700, Junio C Hamano wrote:

> On a local host, the object/history transport code often talks over
> pipe with the other side.  The other side may notice some (expected)
> failure, send the error message either to our process or to the
> standard error and hung up.  In such codepaths, if timing were not
> unfortunate, our side would receive the report of (expected) failure
> from the other side over the pipe and die().  Otherwise, our side
> may still be trying to talk to it and would die with a SIGPIPE.
> 
> This was observed as an intermittent breakage in t5516 by a few
> people.
> 
> In the real-life scenario, either mode of death exits with a
> non-zero status, and the user would learn that the command failed.
> The test_must_fail helper should also know that dying with SIGPIPE
> is one of the valid failure modes when we are expecting the tested
> operation to notice problem and fail.

Sorry for the slow review; before commenting I wanted to dig into
whether this SIGPIPE ambiguity was avoidable in the first place.

I think the answer is "probably not". We do call write_or_die() pretty
consistently in the network-aware programs. So we could ignore SIGPIPE,
and then we would catch EPIPE (of course, we convert that into SIGPIPE
in many places, but we do not have to do so). But since the SIGPIPE
behavior is global, that carries the risk of us failing to check a write
against some other descriptor. It's probably not worth it.

Teaching the tests to handle both cases seems like a reasonable
workaround. Changing test_must_fail covers a lot of cases; I wondered if
there are other tests that would not want to silently cover up a SIGPIPE
death. But I could not really think of a plausible reason.

So I think your patch is the best thing to do.

-Peff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] test: accept death by SIGPIPE as a valid failure mode
  2015-11-05  7:47           ` Jeff King
@ 2015-11-05  9:34             ` Lars Schneider
  2015-11-05 16:21               ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Schneider @ 2015-11-05  9:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Users, Fredrik Medley, patrick.reynolds


> On 05 Nov 2015, at 08:47, Jeff King <peff@peff.net> wrote:
> 
> On Fri, Oct 30, 2015 at 02:22:14PM -0700, Junio C Hamano wrote:
> 
>> On a local host, the object/history transport code often talks over
>> pipe with the other side.  The other side may notice some (expected)
>> failure, send the error message either to our process or to the
>> standard error and hung up.  In such codepaths, if timing were not
>> unfortunate, our side would receive the report of (expected) failure
>> from the other side over the pipe and die().  Otherwise, our side
>> may still be trying to talk to it and would die with a SIGPIPE.
>> 
>> This was observed as an intermittent breakage in t5516 by a few
>> people.
>> 
>> In the real-life scenario, either mode of death exits with a
>> non-zero status, and the user would learn that the command failed.
>> The test_must_fail helper should also know that dying with SIGPIPE
>> is one of the valid failure modes when we are expecting the tested
>> operation to notice problem and fail.
> 
> Sorry for the slow review; before commenting I wanted to dig into
> whether this SIGPIPE ambiguity was avoidable in the first place.
> 
> I think the answer is "probably not". We do call write_or_die() pretty
> consistently in the network-aware programs. So we could ignore SIGPIPE,
> and then we would catch EPIPE (of course, we convert that into SIGPIPE
> in many places, but we do not have to do so). But since the SIGPIPE
> behavior is global, that carries the risk of us failing to check a write
> against some other descriptor. It's probably not worth it.
> 
> Teaching the tests to handle both cases seems like a reasonable
> workaround. Changing test_must_fail covers a lot of cases; I wondered if
> there are other tests that would not want to silently cover up a SIGPIPE
> death. But I could not really think of a plausible reason.
> 
> So I think your patch is the best thing to do.
> 
> -Peff

Oh, I missed this email thread. I am still working on a stable Travis-CI integration and I ran into this issue a few times. I fixed it in my (not yet published) patch with an additional function "test_must_fail_or_sigpipe" that I've used for all tests affected by this issue. Modifying the "test_must_fail" function seemed too risky for me as I don't understand all possible implications. However, if you don't see a problem then this is fine with me.

- Lars

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] test: accept death by SIGPIPE as a valid failure mode
  2015-11-05  9:34             ` Lars Schneider
@ 2015-11-05 16:21               ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2015-11-05 16:21 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Jeff King, Git Users, Fredrik Medley, patrick.reynolds

Lars Schneider <larsxschneider@gmail.com> writes:

> Oh, I missed this email thread. I am still working on a stable
> Travis-CI integration and I ran into this issue a few times. I
> fixed it in my (not yet published) patch with an additional
> function "test_must_fail_or_sigpipe" that I've used for all tests
> affected by this issue. Modifying the "test_must_fail" function
> seemed too risky for me as I don't understand all possible
> implications. However, if you don't see a problem then this is
> fine with me.

It's not that I don't see a problem at all.  You constructed a good
summary of the issues in three bullet points, that lead me to think
that it is the right approach to tweak the way the tests evaluate
the outcome, but then nothing came out of the discussion, so I sent
out a "how about doing it this way" to make sure this topic will not
be forgotten.  There is nothing more to it, and "how about..." is in
no way final.

There obviously are pros and cons between introducing your new
helper to mark the ones that are allowed to catch SIGPIPE and
changing all occurrences of test_must_fail.  I do not have a strong
opinion yet, but it needs to be discussed and decided.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-11-05 16:22 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-24 13:08 [RFC] t5516 "75 - deny fetch unreachable SHA1, allowtipsha1inwant=true" flaky? Lars Schneider
2015-10-25  7:18 ` Fredrik Medley
2015-10-25 13:47   ` Lars Schneider
2015-10-25 17:38     ` Junio C Hamano
2015-10-30 21:00       ` Junio C Hamano
2015-10-30 21:22         ` [PATCH] test: accept death by SIGPIPE as a valid failure mode Junio C Hamano
2015-11-05  7:47           ` Jeff King
2015-11-05  9:34             ` Lars Schneider
2015-11-05 16:21               ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.