git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1]  Fix t5516 flakiness in Visual Studio builds
@ 2019-10-28 15:00 Johannes Schindelin via GitGitGadget
  2019-10-28 15:00 ` [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one Johannes Schindelin via GitGitGadget
  2019-10-29 13:37 ` [PATCH v2 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  0 siblings, 2 replies; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-28 15:00 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Alexandr Miloslavskiy, Johannes Schindelin, Junio C Hamano

Among the flaky tests, it seems that the Azure Pipeline suffers relatively
frequently from t5516 failing with the Visual Studio builds. Essentially, we
grep for an error message, but that error message is produced twice, once by
a fetch and once by the upload-pack spawned from it, and those error
messages are usually interleaved because of MSVC runtime fprintf() 
idiosyncracies. 

The commit message of this patch is based, in part, on 
https://github.com/gitgitgadget/git/pull/407. The patch itself is a much
more minimal alternative (using xwrite() instead of fprintf()) to the code
of https://github.com/gitgitgadget/git/pull/407, avoiding the complexity of
the part of the code that allows for unlimited messages.

While it would seem theoretically more elegant to allow for unlimited
messages, in practice too-long messages cause more problems than they solve,
and therefore we already clip them, and this patch does not change that
behavior.

This fixes https://github.com/gitgitgadget/git/issues/240.

Johannes Schindelin (1):
  vreportf(): avoid buffered write in favor of unbuffered one

 usage.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)


base-commit: 566a1439f6f56c2171b8853ddbca0ad3f5098770
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-428%2Fdscho%2Ffix-t5516-flakiness-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-428/dscho/fix-t5516-flakiness-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/428
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-28 15:00 [PATCH 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
@ 2019-10-28 15:00 ` Johannes Schindelin via GitGitGadget
  2019-10-29  3:18   ` Junio C Hamano
  2019-10-29 10:38   ` SZEDER Gábor
  2019-10-29 13:37 ` [PATCH v2 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  1 sibling, 2 replies; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-28 15:00 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Alexandr Miloslavskiy, Johannes Schindelin,
	Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The MSVC runtime behavior differs from glibc's with respect to
`fprintf(stderr, ...)` in that the former writes out the message
character by character.

In t5516, this leads to a funny problem where a `git fetch` process as
well as the `git upload-pack` process spawned by it _both_ call `die()`
at the same time. The output can look like this:

	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
	8e86a771eda009872d6ab2886

Let's avoid this predicament altogether by rendering the entire message,
including the prefix and the trailing newline, into the buffer we
already have (and which is still fixed size) and then write it out via
`write_in_full()`.

The history of `vreportf()` with regard to this issue includes the
following commits:

d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
                        chars before sending to stderr
9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
                        This is seemingly related to d048a96e.
137a0d0e (2007-11-19) - Addresses out-of-order for display()
34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
                        to support UTF-8 emulation
eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
                        so it's safe to use xwrite() again
5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again

Note that we need to be careful to handle the return value of
`vsnprintf()` that indicates the _desired_ byte count.

Also please note that we `fflush(stderr)` here to help when running in a
Git Bash on Windows: in this case, `stderr` is not actually truly
unbuffered, and needs the extra help.

Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 usage.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/usage.c b/usage.c
index 2fdb20086b..4328894dce 100644
--- a/usage.c
+++ b/usage.c
@@ -10,13 +10,19 @@ void vreportf(const char *prefix, const char *err, va_list params)
 {
 	char msg[4096];
 	char *p;
-
-	vsnprintf(msg, sizeof(msg), err, params);
+	size_t off = strlcpy(msg, prefix, sizeof(msg));
+	int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
 	for (p = msg; *p; p++) {
 		if (iscntrl(*p) && *p != '\t' && *p != '\n')
 			*p = '?';
 	}
-	fprintf(stderr, "%s%s\n", prefix, msg);
+	if (ret > 0) {
+		if (off + ret > sizeof(msg) - 1)
+			ret = sizeof(msg) - 1 - off;
+		msg[off + ret] = '\n'; /* we no longer need a NUL */
+		fflush(stderr);
+		xwrite(2, msg, off + ret + 1);
+	}
 }
 
 static NORETURN void usage_builtin(const char *err, va_list params)
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-28 15:00 ` [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one Johannes Schindelin via GitGitGadget
@ 2019-10-29  3:18   ` Junio C Hamano
  2019-10-29 12:30     ` Johannes Schindelin
  2019-10-29 10:38   ` SZEDER Gábor
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2019-10-29  3:18 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Jeff King, Alexandr Miloslavskiy, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Also please note that we `fflush(stderr)` here to help when running in a
> Git Bash on Windows: in this case, `stderr` is not actually truly
> unbuffered, and needs the extra help.

Yuck.  So on all systems, vreportf() now totally bypasses stdio?

Also, this is only to help output from us that goes via vreportf()
and other codepaths in us that use stdio to write to the standard
error stream can still get mixed on Windows (I think the answer is
yes, because we wouldn't need fflush() in this patch if we are
covering all writes to the standard error stream)?

By the way, I'd retitle the patch to highlight that we are still
doing buffered write, if I were doing this topic.  We are just
avoiding some implementations of stdio that do not give us buffering
and doing the buffering ourselves.

    Subject: vreportf(): don't rely on stdio buffering

or something like that.

> Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  usage.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/usage.c b/usage.c
> index 2fdb20086b..4328894dce 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -10,13 +10,19 @@ void vreportf(const char *prefix, const char *err, va_list params)
>  {
>  	char msg[4096];
>  	char *p;
> -
> -	vsnprintf(msg, sizeof(msg), err, params);
> +	size_t off = strlcpy(msg, prefix, sizeof(msg));

Like snprintf(3) the strlcpy() and strlcat() functions return the
total length of the string they tried to create.  For strlcpy() that
means the length of src.

So "off" would be strlen(prefix), which could be longer than
sizeof(msg)?  Then what happens to this vsnprintf() we see below?

> +	int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
>  	for (p = msg; *p; p++) {
>  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
>  			*p = '?';
>  	}
> -	fprintf(stderr, "%s%s\n", prefix, msg);

Strictly speaking this is a breaking change in that control
sequences in prefix are now clobbered.  Does any caller call this
function with prefix like "^M\033[K<some string>" to overwrite the
last output line with the new message?  If not, then probably we do
not have to worry about it (and reusing msg[] does feel attractive).

> +	if (ret > 0) {
> +		if (off + ret > sizeof(msg) - 1)
> +			ret = sizeof(msg) - 1 - off;
> +		msg[off + ret] = '\n'; /* we no longer need a NUL */
> +		fflush(stderr);
> +		xwrite(2, msg, off + ret + 1);
> +	}
>  }
>  
>  static NORETURN void usage_builtin(const char *err, va_list params)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-28 15:00 ` [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one Johannes Schindelin via GitGitGadget
  2019-10-29  3:18   ` Junio C Hamano
@ 2019-10-29 10:38   ` SZEDER Gábor
  2019-10-29 12:38     ` Johannes Schindelin
  1 sibling, 1 reply; 34+ messages in thread
From: SZEDER Gábor @ 2019-10-29 10:38 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Jeff King, Alexandr Miloslavskiy, Johannes Schindelin,
	Junio C Hamano

On Mon, Oct 28, 2019 at 03:00:59PM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> The MSVC runtime behavior differs from glibc's with respect to
> `fprintf(stderr, ...)` in that the former writes out the message
> character by character.
> 
> In t5516, this leads to a funny problem where a `git fetch` process as
> well as the `git upload-pack` process spawned by it _both_ call `die()`
> at the same time. The output can look like this:
> 
> 	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
> 	8e86a771eda009872d6ab2886

Heh.

> Let's avoid this predicament altogether by rendering the entire message,
> including the prefix and the trailing newline, into the buffer we
> already have (and which is still fixed size) and then write it out via
> `write_in_full()`.

s/write_in_full/xwrite/ perhaps?  Both the cover letter and the patch
below use xwrite().

> The history of `vreportf()` with regard to this issue includes the
> following commits:
> 
> d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
> 389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
> 625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
> f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
> b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
>                         chars before sending to stderr
> 9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
>                         This is seemingly related to d048a96e.
> 137a0d0e (2007-11-19) - Addresses out-of-order for display()
> 34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
>                         to support UTF-8 emulation
> eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
>                         so it's safe to use xwrite() again
> 5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again
> 
> Note that we need to be careful to handle the return value of
> `vsnprintf()` that indicates the _desired_ byte count.
> 
> Also please note that we `fflush(stderr)` here to help when running in a
> Git Bash on Windows: in this case, `stderr` is not actually truly
> unbuffered, and needs the extra help.
> 
> Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  usage.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/usage.c b/usage.c
> index 2fdb20086b..4328894dce 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -10,13 +10,19 @@ void vreportf(const char *prefix, const char *err, va_list params)
>  {
>  	char msg[4096];
>  	char *p;
> -
> -	vsnprintf(msg, sizeof(msg), err, params);
> +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> +	int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
>  	for (p = msg; *p; p++) {
>  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
>  			*p = '?';
>  	}
> -	fprintf(stderr, "%s%s\n", prefix, msg);
> +	if (ret > 0) {
> +		if (off + ret > sizeof(msg) - 1)
> +			ret = sizeof(msg) - 1 - off;
> +		msg[off + ret] = '\n'; /* we no longer need a NUL */
> +		fflush(stderr);
> +		xwrite(2, msg, off + ret + 1);
> +	}
>  }
>  
>  static NORETURN void usage_builtin(const char *err, va_list params)
> -- 
> gitgitgadget

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29  3:18   ` Junio C Hamano
@ 2019-10-29 12:30     ` Johannes Schindelin
  2019-10-29 13:49       ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-29 12:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff King,
	Alexandr Miloslavskiy

Hi Junio,

On Tue, 29 Oct 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > Also please note that we `fflush(stderr)` here to help when running in a
> > Git Bash on Windows: in this case, `stderr` is not actually truly
> > unbuffered, and needs the extra help.
>
> Yuck.  So on all systems, vreportf() now totally bypasses stdio?

Yep ;-)

> Also, this is only to help output from us that goes via vreportf() and
> other codepaths in us that use stdio to write to the standard error
> stream can still get mixed on Windows (I think the answer is yes,
> because we wouldn't need fflush() in this patch if we are covering all
> writes to the standard error stream)?

Yes, `write()` can get interrupted, so there is still a slight chance of
interleaving.

However, with `fprintf()`, apparently the MSVC runtime essentially
writes and flushes one character at a time, which will make it _much_
more likely that two competing processes write interleaved messages to
`stderr`.

> By the way, I'd retitle the patch to highlight that we are still doing
> buffered write, if I were doing this topic.  We are just avoiding some
> implementations of stdio that do not give us buffering and doing the
> buffering ourselves.
>
>     Subject: vreportf(): don't rely on stdio buffering
>
> or something like that.

Good idea.

> > Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  usage.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/usage.c b/usage.c
> > index 2fdb20086b..4328894dce 100644
> > --- a/usage.c
> > +++ b/usage.c
> > @@ -10,13 +10,19 @@ void vreportf(const char *prefix, const char *err, va_list params)
> >  {
> >  	char msg[4096];
> >  	char *p;
> > -
> > -	vsnprintf(msg, sizeof(msg), err, params);
> > +	size_t off = strlcpy(msg, prefix, sizeof(msg));
>
> Like snprintf(3) the strlcpy() and strlcat() functions return the
> total length of the string they tried to create.  For strlcpy() that
> means the length of src.

True (I misread `compat/strlcpy.c` and forgot to consult the
documentation). This length can be longer than `msg`, of course.

> So "off" would be strlen(prefix), which could be longer than
> sizeof(msg)?  Then what happens to this vsnprintf() we see below?

A problem.

I `git grep`ed and saw that only very short `prefix`es are hard-coded.
So that is a hypothetical concern.

However, Alex also indicated his discomfort with this, so I will change
the code to account for a `prefix` that is too long (the entire error
message will be clipped away in that case, which is unfortunate, but to
be expected).

> > +	int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
> >  	for (p = msg; *p; p++) {
> >  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
> >  			*p = '?';
> >  	}
> > -	fprintf(stderr, "%s%s\n", prefix, msg);
>
> Strictly speaking this is a breaking change in that control
> sequences in prefix are now clobbered.  Does any caller call this
> function with prefix like "^M\033[K<some string>" to overwrite the
> last output line with the new message?  If not, then probably we do
> not have to worry about it (and reusing msg[] does feel attractive).

Such a sequence would not exactly be a prefix, but okay, I changed the
code to replace only characters in the non-prefix part. For good
measure, I also detect `NUL`s in that part and shorten `ret` in that
case (think `die("This was an\0unintentional NUL")`).

Thanks for the review!
Dscho

> > +	if (ret > 0) {
> > +		if (off + ret > sizeof(msg) - 1)
> > +			ret = sizeof(msg) - 1 - off;
> > +		msg[off + ret] = '\n'; /* we no longer need a NUL */
> > +		fflush(stderr);
> > +		xwrite(2, msg, off + ret + 1);
> > +	}
> >  }
> >
> >  static NORETURN void usage_builtin(const char *err, va_list params)
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 10:38   ` SZEDER Gábor
@ 2019-10-29 12:38     ` Johannes Schindelin
  2019-10-29 13:52       ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-29 12:38 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff King,
	Alexandr Miloslavskiy, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 3946 bytes --]

Hi Gábor,

On Tue, 29 Oct 2019, SZEDER Gábor wrote:

> On Mon, Oct 28, 2019 at 03:00:59PM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > The MSVC runtime behavior differs from glibc's with respect to
> > `fprintf(stderr, ...)` in that the former writes out the message
> > character by character.
> >
> > In t5516, this leads to a funny problem where a `git fetch` process as
> > well as the `git upload-pack` process spawned by it _both_ call `die()`
> > at the same time. The output can look like this:
> >
> > 	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
> > 	8e86a771eda009872d6ab2886
>
> Heh.

Yes ;-)

> > Let's avoid this predicament altogether by rendering the entire message,
> > including the prefix and the trailing newline, into the buffer we
> > already have (and which is still fixed size) and then write it out via
> > `write_in_full()`.
>
> s/write_in_full/xwrite/ perhaps?  Both the cover letter and the patch
> below use xwrite().

Excellent eyes! I had originally used `write_in_full()` before realizing
that `xwrite()` would be more appropriate.

Fixed.

Thank you for helping me improve the patch!
Dscho

> > The history of `vreportf()` with regard to this issue includes the
> > following commits:
> >
> > d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
> > 389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
> > 625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
> > f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
> > b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
> >                         chars before sending to stderr
> > 9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
> >                         This is seemingly related to d048a96e.
> > 137a0d0e (2007-11-19) - Addresses out-of-order for display()
> > 34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
> >                         to support UTF-8 emulation
> > eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
> >                         so it's safe to use xwrite() again
> > 5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again
> >
> > Note that we need to be careful to handle the return value of
> > `vsnprintf()` that indicates the _desired_ byte count.
> >
> > Also please note that we `fflush(stderr)` here to help when running in a
> > Git Bash on Windows: in this case, `stderr` is not actually truly
> > unbuffered, and needs the extra help.
> >
> > Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  usage.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/usage.c b/usage.c
> > index 2fdb20086b..4328894dce 100644
> > --- a/usage.c
> > +++ b/usage.c
> > @@ -10,13 +10,19 @@ void vreportf(const char *prefix, const char *err, va_list params)
> >  {
> >  	char msg[4096];
> >  	char *p;
> > -
> > -	vsnprintf(msg, sizeof(msg), err, params);
> > +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> > +	int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
> >  	for (p = msg; *p; p++) {
> >  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
> >  			*p = '?';
> >  	}
> > -	fprintf(stderr, "%s%s\n", prefix, msg);
> > +	if (ret > 0) {
> > +		if (off + ret > sizeof(msg) - 1)
> > +			ret = sizeof(msg) - 1 - off;
> > +		msg[off + ret] = '\n'; /* we no longer need a NUL */
> > +		fflush(stderr);
> > +		xwrite(2, msg, off + ret + 1);
> > +	}
> >  }
> >
> >  static NORETURN void usage_builtin(const char *err, va_list params)
> > --
> > gitgitgadget
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 0/1]  Fix t5516 flakiness in Visual Studio builds
  2019-10-28 15:00 [PATCH 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  2019-10-28 15:00 ` [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one Johannes Schindelin via GitGitGadget
@ 2019-10-29 13:37 ` Johannes Schindelin via GitGitGadget
  2019-10-29 13:37   ` [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
  2019-10-29 20:01   ` [PATCH v3 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  1 sibling, 2 replies; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-29 13:37 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Alexandr Miloslavskiy, Johannes Schindelin, Junio C Hamano

Among the flaky tests, it seems that the Azure Pipeline suffers relatively
frequently from t5516 failing with the Visual Studio builds. Essentially, we
grep for an error message, but that error message is produced twice, once by
a fetch and once by the upload-pack spawned from it, and those error
messages are usually interleaved because of MSVC runtime fprintf() 
idiosyncracies. 

The commit message of this patch is based, in part, on 
https://github.com/gitgitgadget/git/pull/407. The patch itself is a much
more minimal alternative (using xwrite() instead of fprintf()) to the code
of https://github.com/gitgitgadget/git/pull/407, avoiding the complexity of
the part of the code that allows for unlimited messages.

While it would seem theoretically more elegant to allow for unlimited
messages, in practice too-long messages cause more problems than they solve,
and therefore we already clip them, and this patch does not change that
behavior.

This fixes https://github.com/gitgitgadget/git/issues/240.

Changes since v1:

 * Changed the oneline to be more accurate (thanks Junio).
 * Improved the commit message (e.g. talking about the xwrite() function
   this patch uses, rather than the write_in_full() function used by an
   earlier iteration, thanks Gábor).
 * Revamped the actual code to account for insanely long prefixes (thanks
   for the advice, Junio).

Johannes Schindelin (1):
  vreportf(): avoid relying on stdio buffering

 usage.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)


base-commit: 566a1439f6f56c2171b8853ddbca0ad3f5098770
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-428%2Fdscho%2Ffix-t5516-flakiness-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-428/dscho/fix-t5516-flakiness-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/428

Range-diff vs v1:

 1:  455026ce3e ! 1:  e426627e14 vreportf(): avoid buffered write in favor of unbuffered one
     @@ -1,6 +1,6 @@
      Author: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     -    vreportf(): avoid buffered write in favor of unbuffered one
     +    vreportf(): avoid relying on stdio buffering
      
          The MSVC runtime behavior differs from glibc's with respect to
          `fprintf(stderr, ...)` in that the former writes out the message
     @@ -16,7 +16,9 @@
          Let's avoid this predicament altogether by rendering the entire message,
          including the prefix and the trailing newline, into the buffer we
          already have (and which is still fixed size) and then write it out via
     -    `write_in_full()`.
     +    `xwrite()`.
     +
     +    We still clip the message to at most 4095 characters.
      
          The history of `vreportf()` with regard to this issue includes the
          following commits:
     @@ -36,39 +38,45 @@
                                  so it's safe to use xwrite() again
          5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again
      
     -    Note that we need to be careful to handle the return value of
     -    `vsnprintf()` that indicates the _desired_ byte count.
     +    Note that we print nothing if the `vsnprintf()` call failed to render
     +    the error message; There is little we can do in that case, and it should
     +    not happen anyway.
      
          Also please note that we `fflush(stderr)` here to help when running in a
          Git Bash on Windows: in this case, `stderr` is not actually truly
          unbuffered, and needs the extra help.
      
     -    Co-authored-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
     +    Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
     +    Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
     +    Helped-by: Junio C Hamano <gitster@pobox.com>
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       diff --git a/usage.c b/usage.c
       --- a/usage.c
       +++ b/usage.c
      @@
     + void vreportf(const char *prefix, const char *err, va_list params)
       {
       	char msg[4096];
     - 	char *p;
     --
     --	vsnprintf(msg, sizeof(msg), err, params);
     +-	char *p;
      +	size_t off = strlcpy(msg, prefix, sizeof(msg));
     -+	int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
     - 	for (p = msg; *p; p++) {
     ++	char *p, *pend = msg + sizeof(msg);
     + 
     +-	vsnprintf(msg, sizeof(msg), err, params);
     +-	for (p = msg; *p; p++) {
     ++	p = msg + off < pend ? msg + off : pend - 1;
     ++	if (vsnprintf(p, pend - p, err, params) < 0)
     ++		return; /* vsnprintf() failed, there is nothing we can do */
     ++
     ++	for (; p != pend - 1 && *p; p++) {
       		if (iscntrl(*p) && *p != '\t' && *p != '\n')
       			*p = '?';
       	}
      -	fprintf(stderr, "%s%s\n", prefix, msg);
     -+	if (ret > 0) {
     -+		if (off + ret > sizeof(msg) - 1)
     -+			ret = sizeof(msg) - 1 - off;
     -+		msg[off + ret] = '\n'; /* we no longer need a NUL */
     -+		fflush(stderr);
     -+		xwrite(2, msg, off + ret + 1);
     -+	}
     ++
     ++	*(p++) = '\n'; /* we no longer need a NUL */
     ++	fflush(stderr);
     ++	xwrite(2, msg, p - msg);
       }
       
       static NORETURN void usage_builtin(const char *err, va_list params)

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 13:37 ` [PATCH v2 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
@ 2019-10-29 13:37   ` Johannes Schindelin via GitGitGadget
  2019-10-29 14:21     ` Alexandr Miloslavskiy
  2019-10-29 20:01   ` [PATCH v3 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  1 sibling, 1 reply; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-29 13:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Alexandr Miloslavskiy, Johannes Schindelin,
	Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The MSVC runtime behavior differs from glibc's with respect to
`fprintf(stderr, ...)` in that the former writes out the message
character by character.

In t5516, this leads to a funny problem where a `git fetch` process as
well as the `git upload-pack` process spawned by it _both_ call `die()`
at the same time. The output can look like this:

	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
	8e86a771eda009872d6ab2886

Let's avoid this predicament altogether by rendering the entire message,
including the prefix and the trailing newline, into the buffer we
already have (and which is still fixed size) and then write it out via
`xwrite()`.

We still clip the message to at most 4095 characters.

The history of `vreportf()` with regard to this issue includes the
following commits:

d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
                        chars before sending to stderr
9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
                        This is seemingly related to d048a96e.
137a0d0e (2007-11-19) - Addresses out-of-order for display()
34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
                        to support UTF-8 emulation
eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
                        so it's safe to use xwrite() again
5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again

Note that we print nothing if the `vsnprintf()` call failed to render
the error message; There is little we can do in that case, and it should
not happen anyway.

Also please note that we `fflush(stderr)` here to help when running in a
Git Bash on Windows: in this case, `stderr` is not actually truly
unbuffered, and needs the extra help.

Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 usage.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/usage.c b/usage.c
index 2fdb20086b..5dfd38acb8 100644
--- a/usage.c
+++ b/usage.c
@@ -9,14 +9,21 @@
 void vreportf(const char *prefix, const char *err, va_list params)
 {
 	char msg[4096];
-	char *p;
+	size_t off = strlcpy(msg, prefix, sizeof(msg));
+	char *p, *pend = msg + sizeof(msg);
 
-	vsnprintf(msg, sizeof(msg), err, params);
-	for (p = msg; *p; p++) {
+	p = msg + off < pend ? msg + off : pend - 1;
+	if (vsnprintf(p, pend - p, err, params) < 0)
+		return; /* vsnprintf() failed, there is nothing we can do */
+
+	for (; p != pend - 1 && *p; p++) {
 		if (iscntrl(*p) && *p != '\t' && *p != '\n')
 			*p = '?';
 	}
-	fprintf(stderr, "%s%s\n", prefix, msg);
+
+	*(p++) = '\n'; /* we no longer need a NUL */
+	fflush(stderr);
+	xwrite(2, msg, p - msg);
 }
 
 static NORETURN void usage_builtin(const char *err, va_list params)
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 12:30     ` Johannes Schindelin
@ 2019-10-29 13:49       ` Jeff King
  2019-10-29 14:13         ` Johannes Schindelin
  2019-10-29 16:44         ` Junio C Hamano
  0 siblings, 2 replies; 34+ messages in thread
From: Jeff King @ 2019-10-29 13:49 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy

On Tue, Oct 29, 2019 at 01:30:51PM +0100, Johannes Schindelin wrote:

> > Also, this is only to help output from us that goes via vreportf() and
> > other codepaths in us that use stdio to write to the standard error
> > stream can still get mixed on Windows (I think the answer is yes,
> > because we wouldn't need fflush() in this patch if we are covering all
> > writes to the standard error stream)?
> 
> Yes, `write()` can get interrupted, so there is still a slight chance of
> interleaving.
> 
> However, with `fprintf()`, apparently the MSVC runtime essentially
> writes and flushes one character at a time, which will make it _much_
> more likely that two competing processes write interleaved messages to
> `stderr`.

Wow, they have truly taken "unbuffered" to a whole new level.

I don't mind seeing this for all platforms, though. I can't think of any
downside, and having one less moving part to contend with in our
error-reporting code seems like a good thing.

> > > -	vsnprintf(msg, sizeof(msg), err, params);
> > > +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> >
> > Like snprintf(3) the strlcpy() and strlcat() functions return the
> > total length of the string they tried to create.  For strlcpy() that
> > means the length of src.
> 
> True (I misread `compat/strlcpy.c` and forgot to consult the
> documentation). This length can be longer than `msg`, of course.

I'd recommend xsnprintf() here. If we have a prefix longer than our
vreportf() buffer, I think a BUG() is the right outcome.

> I `git grep`ed and saw that only very short `prefix`es are hard-coded.
> So that is a hypothetical concern.
> 
> However, Alex also indicated his discomfort with this, so I will change
> the code to account for a `prefix` that is too long (the entire error
> message will be clipped away in that case, which is unfortunate, but to
> be expected).

I'd disagree here. Any caller sending an arbitrarily-large prefix is
holding it wrong, and we'd probably want to know as soon as possible
(and a BUG() is our best bet there).

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 12:38     ` Johannes Schindelin
@ 2019-10-29 13:52       ` Jeff King
  2019-10-29 14:18         ` Johannes Schindelin
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2019-10-29 13:52 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: SZEDER Gábor, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy, Junio C Hamano

On Tue, Oct 29, 2019 at 01:38:49PM +0100, Johannes Schindelin wrote:

> > > Let's avoid this predicament altogether by rendering the entire message,
> > > including the prefix and the trailing newline, into the buffer we
> > > already have (and which is still fixed size) and then write it out via
> > > `write_in_full()`.
> >
> > s/write_in_full/xwrite/ perhaps?  Both the cover letter and the patch
> > below use xwrite().
> 
> Excellent eyes! I had originally used `write_in_full()` before realizing
> that `xwrite()` would be more appropriate.

What's your reasoning there? We wouldn't expect xwrite() to ever return
with a partial write. But if it did for whatever reason, surely we'd
prefer to keep trying to print the rest of the error rather than leave
it truncated? We might see an error on the subsequent write(), but it's
worth calling it to find out, I'd think.

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 13:49       ` Jeff King
@ 2019-10-29 14:13         ` Johannes Schindelin
  2019-10-29 14:32           ` Jeff King
  2019-10-29 16:44         ` Junio C Hamano
  1 sibling, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-29 14:13 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy

Hi Peff,

On Tue, 29 Oct 2019, Jeff King wrote:

> On Tue, Oct 29, 2019 at 01:30:51PM +0100, Johannes Schindelin wrote:
>
> > > Also, this is only to help output from us that goes via vreportf() and
> > > other codepaths in us that use stdio to write to the standard error
> > > stream can still get mixed on Windows (I think the answer is yes,
> > > because we wouldn't need fflush() in this patch if we are covering all
> > > writes to the standard error stream)?
> >
> > Yes, `write()` can get interrupted, so there is still a slight chance of
> > interleaving.
> >
> > However, with `fprintf()`, apparently the MSVC runtime essentially
> > writes and flushes one character at a time, which will make it _much_
> > more likely that two competing processes write interleaved messages to
> > `stderr`.
>
> Wow, they have truly taken "unbuffered" to a whole new level.
>
> I don't mind seeing this for all platforms, though. I can't think of any
> downside, and having one less moving part to contend with in our
> error-reporting code seems like a good thing.
>
> > > > -	vsnprintf(msg, sizeof(msg), err, params);
> > > > +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> > >
> > > Like snprintf(3) the strlcpy() and strlcat() functions return the
> > > total length of the string they tried to create.  For strlcpy() that
> > > means the length of src.
> >
> > True (I misread `compat/strlcpy.c` and forgot to consult the
> > documentation). This length can be longer than `msg`, of course.
>
> I'd recommend xsnprintf() here. If we have a prefix longer than our
> vreportf() buffer, I think a BUG() is the right outcome.

But BUG_fl() calls vreportf(). I am worried about an infinite
recursion...

>
> > I `git grep`ed and saw that only very short `prefix`es are hard-coded.
> > So that is a hypothetical concern.
> >
> > However, Alex also indicated his discomfort with this, so I will change
> > the code to account for a `prefix` that is too long (the entire error
> > message will be clipped away in that case, which is unfortunate, but to
> > be expected).
>
> I'd disagree here. Any caller sending an arbitrarily-large prefix is
> holding it wrong, and we'd probably want to know as soon as possible
> (and a BUG() is our best bet there).

How about truncating already inside the prefix, then? It would miss the
entire error message... but at least it would print _something_...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 13:52       ` Jeff King
@ 2019-10-29 14:18         ` Johannes Schindelin
  0 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-29 14:18 UTC (permalink / raw)
  To: Jeff King
  Cc: SZEDER Gábor, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy, Junio C Hamano

Hi Peff,

On Tue, 29 Oct 2019, Jeff King wrote:

> On Tue, Oct 29, 2019 at 01:38:49PM +0100, Johannes Schindelin wrote:
>
> > > > Let's avoid this predicament altogether by rendering the entire message,
> > > > including the prefix and the trailing newline, into the buffer we
> > > > already have (and which is still fixed size) and then write it out via
> > > > `write_in_full()`.
> > >
> > > s/write_in_full/xwrite/ perhaps?  Both the cover letter and the patch
> > > below use xwrite().
> >
> > Excellent eyes! I had originally used `write_in_full()` before realizing
> > that `xwrite()` would be more appropriate.
>
> What's your reasoning there? We wouldn't expect xwrite() to ever return
> with a partial write. But if it did for whatever reason, surely we'd
> prefer to keep trying to print the rest of the error rather than leave
> it truncated? We might see an error on the subsequent write(), but it's
> worth calling it to find out, I'd think.

Oh, `xwrite()` does not write in full, eh? But at least it continues on
`EINTR`...

Will fix.
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 13:37   ` [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
@ 2019-10-29 14:21     ` Alexandr Miloslavskiy
  2019-10-29 19:57       ` Johannes Schindelin
  0 siblings, 1 reply; 34+ messages in thread
From: Alexandr Miloslavskiy @ 2019-10-29 14:21 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget, git
  Cc: Jeff King, Johannes Schindelin, Junio C Hamano

On 29.10.2019 14:37, Johannes Schindelin via GitGitGadget wrote:

 > -	vsnprintf(msg, sizeof(msg), err, params);
 > -	for (p = msg; *p; p++) {
 > +	p = msg + off < pend ? msg + off : pend - 1;

According to my understanding, this is undefined behavior:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf
... Unless both pointers point to elements of the same array object, or 
one past the last element of the array object, the behavior is undefined.

 > The MSVC runtime behavior differs from glibc's with respect to
 > `fprintf(stderr, ...)` in that the former writes out the message
 > character by character.

Can you please preserve the research text about fprintf() behavior on 
different platforms that I provided before? Change formatting to what 
you think fits best.

fprintf() has problems with any buffering settings
--------------------------------------------------
1) If `stderr` is fully buffered:
    the ordering of `stdout` and `stderr` messages could be wrong,
    because `stderr` output waits for the buffer to become full.
2) If `stderr` has any type of buffering:
    buffer has fixed size, which could lead to interleaved buffer blocks
    when two threads/processes write at the same time.
3) If `stderr` is not buffered:
    Some implementations, such as VC++ and MinGW, literally disable
    buffering and `fprintf()` will output char-by-char, which leads to
    unreadable char-interleaved writes if two processes write to
    `stderr` at the same time (threads are OK because `fprintf()`
    usually locks `FILE*` in current process).
4) If stderr is line buffered: MinGW/VC++ will enable full buffering
    instead. See MSDN for `setvbuf()`.

fprintf() behavior in git, per platform
---------------------------------------
1) libc - large outputs can be block-interleaved
    fprintf() enables temporary stream buffering.
    Code references:
        buffered_vfprintf()
2) VC++ - char-interleaved
    fprintf() enables temporary stream buffering, but only if stream was
    not set to no buffering. This has no effect, because stderr is not
    buffered by default, and git takes an extra step to ensure that in
    `swap_osfhnd()`.
    Code references:
        _iob[_IOB_ENTRIES]
        __acrt_stdio_temporary_buffering_guard
        has_any_buffer()
3) MinGW - char-interleaved (console), full buffering (file)
    `fprintf()` obeys `stderr` buffering. But it uses old MSVCRT.DLL,
    which eventually calls `_flsbuf()`, which enables buffering unless
    `isatty(stderr)` or buffering is disabled. Buffering is not disabled
    by default for `stderr`. Therefore, buffering is enabled only for
    file-redirected `stderr`.
    Code references:
        __mingw_vfprintf()
        __pformat_wcputs()
        _fputc_nolock()
        _flsbuf()
        _iob[_IOB_ENTRIES]


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 14:13         ` Johannes Schindelin
@ 2019-10-29 14:32           ` Jeff King
  2019-10-29 20:09             ` Johannes Schindelin
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2019-10-29 14:32 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy

On Tue, Oct 29, 2019 at 03:13:39PM +0100, Johannes Schindelin wrote:

> > I'd recommend xsnprintf() here. If we have a prefix longer than our
> > vreportf() buffer, I think a BUG() is the right outcome.
> 
> But BUG_fl() calls vreportf(). I am worried about an infinite
> recursion...

Good point. I think it would be OK in practice (BUG() gives its own
length-limited prefix), but it's probably a good idea to err on the side
of caution inside vreportf.

> > I'd disagree here. Any caller sending an arbitrarily-large prefix is
> > holding it wrong, and we'd probably want to know as soon as possible
> > (and a BUG() is our best bet there).
> 
> How about truncating already inside the prefix, then? It would miss the
> entire error message... but at least it would print _something_...

Yeah, that might be OK. Hopefully missing the whole rest of the error
message would cause some tests to fail.

You could also abort() after having written if we want to be more
BUG()-like.

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 13:49       ` Jeff King
  2019-10-29 14:13         ` Johannes Schindelin
@ 2019-10-29 16:44         ` Junio C Hamano
  1 sibling, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2019-10-29 16:44 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy

Jeff King <peff@peff.net> writes:

> Wow, they have truly taken "unbuffered" to a whole new level.
>
> I don't mind seeing this for all platforms, though. I can't think of any
> downside, and having one less moving part to contend with in our
> error-reporting code seems like a good thing.

FWIW, it was my mistake to sound as if I was unhappy to see this
done on all platforms.  As long as the end result is to reasonably
buffer, I do not mind us (as opposed to the system library we
sometimes cannot trust) being the one who implements the buffering.

I very much prefer to see the same code run everywhere than #ifdef
in this case.

> I'd recommend xsnprintf() here. If we have a prefix longer than our
> vreportf() buffer, I think a BUG() is the right outcome.
> ...
> I'd disagree here. Any caller sending an arbitrarily-large prefix is
> holding it wrong, and we'd probably want to know as soon as possible
> (and a BUG() is our best bet there).

Yes.  We are not using an unvetted input as the prefix here, if
somebody needs to use a larger prefix somewhere, that patch needs to
argue the pros-and-cons between making the buffer larger or
shortening that larger prefix.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 14:21     ` Alexandr Miloslavskiy
@ 2019-10-29 19:57       ` Johannes Schindelin
  2019-10-29 20:09         ` Jeff King
  2019-10-29 20:11         ` Alexandr Miloslavskiy
  0 siblings, 2 replies; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-29 19:57 UTC (permalink / raw)
  To: Alexandr Miloslavskiy
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff King, Junio C Hamano

Hi Alex,

On Tue, 29 Oct 2019, Alexandr Miloslavskiy wrote:

> On 29.10.2019 14:37, Johannes Schindelin via GitGitGadget wrote:
>
> > -	vsnprintf(msg, sizeof(msg), err, params);
> > -	for (p = msg; *p; p++) {
> > +	p = msg + off < pend ? msg + off : pend - 1;
>
> According to my understanding, this is undefined behavior:

It is not entirely obvious to me what exactly you mean by "this",
assuming that you refer to comparing two pointers via `<`, I agree that
this is not the best idea, I changed it to `off < pend - msg`. If my
assumption is correct, however, we are still talking about C, so I
wonder how this C++ document you linked could bear any relevance:

> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf
> ... Unless both pointers point to elements of the same array object, or one
> past the last element of the array object, the behavior is undefined.
>
> > The MSVC runtime behavior differs from glibc's with respect to
> > `fprintf(stderr, ...)` in that the former writes out the message
> > character by character.
>
> Can you please preserve the research text about fprintf() behavior on
> different platforms that I provided before? Change formatting to what you
> think fits best.

Quite honestly, I purposefully refrained from copying that information.
While even the following paragraphs are incomplete by necessity (there
are many more standard C libraries we try to work against, e.g. musl and
newlibc, and it would be insanity to try to cover them all in an
analysis of stdio buffering), the only finding that is relevant to the
patch under discussion is that MSVC's runtime outputs to `stderr` byte
by byte (or more correctly, character by character, I guess), and I did
mention this in the commit message (in the part you quoted).

Thanks,
Dscho

>
> fprintf() has problems with any buffering settings
> --------------------------------------------------
> 1) If `stderr` is fully buffered:
>    the ordering of `stdout` and `stderr` messages could be wrong,
>    because `stderr` output waits for the buffer to become full.
> 2) If `stderr` has any type of buffering:
>    buffer has fixed size, which could lead to interleaved buffer blocks
>    when two threads/processes write at the same time.
> 3) If `stderr` is not buffered:
>    Some implementations, such as VC++ and MinGW, literally disable
>    buffering and `fprintf()` will output char-by-char, which leads to
>    unreadable char-interleaved writes if two processes write to
>    `stderr` at the same time (threads are OK because `fprintf()`
>    usually locks `FILE*` in current process).
> 4) If stderr is line buffered: MinGW/VC++ will enable full buffering
>    instead. See MSDN for `setvbuf()`.
>
> fprintf() behavior in git, per platform
> ---------------------------------------
> 1) libc - large outputs can be block-interleaved
>    fprintf() enables temporary stream buffering.
>    Code references:
>        buffered_vfprintf()
> 2) VC++ - char-interleaved
>    fprintf() enables temporary stream buffering, but only if stream was
>    not set to no buffering. This has no effect, because stderr is not
>    buffered by default, and git takes an extra step to ensure that in
>    `swap_osfhnd()`.
>    Code references:
>        _iob[_IOB_ENTRIES]
>        __acrt_stdio_temporary_buffering_guard
>        has_any_buffer()
> 3) MinGW - char-interleaved (console), full buffering (file)
>    `fprintf()` obeys `stderr` buffering. But it uses old MSVCRT.DLL,
>    which eventually calls `_flsbuf()`, which enables buffering unless
>    `isatty(stderr)` or buffering is disabled. Buffering is not disabled
>    by default for `stderr`. Therefore, buffering is enabled only for
>    file-redirected `stderr`.
>    Code references:
>        __mingw_vfprintf()
>        __pformat_wcputs()
>        _fputc_nolock()
>        _flsbuf()
>        _iob[_IOB_ENTRIES]
>
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v3 0/1]  Fix t5516 flakiness in Visual Studio builds
  2019-10-29 13:37 ` [PATCH v2 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  2019-10-29 13:37   ` [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
@ 2019-10-29 20:01   ` Johannes Schindelin via GitGitGadget
  2019-10-29 20:01     ` [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
  2019-10-30 10:44     ` [PATCH v4 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  1 sibling, 2 replies; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-29 20:01 UTC (permalink / raw)
  To: git
  Cc: Jeff King, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin, Junio C Hamano

Among the flaky tests, it seems that the Azure Pipeline suffers relatively
frequently from t5516 failing with the Visual Studio builds. Essentially, we
grep for an error message, but that error message is produced twice, once by
a fetch and once by the upload-pack spawned from it, and those error
messages are usually interleaved because of MSVC runtime fprintf() 
idiosyncracies.

As a consequence, I have to re-run about half a dozen failed builds a day,
which I would like to avoid. My plan is therefore to merge this patch into
Git for Windows v2.24.0-rc2.

The commit message of this patch is based, in part, on 
https://github.com/gitgitgadget/git/pull/407. 

This fixes https://github.com/gitgitgadget/git/issues/240.

Changes since v2:

 * Using write_in_full() instead of xwrite() again (to make sure that the
   entire message is printed).
 * When vsnprintf() fails, now we at least print the prefix.
 * The code to check whether prefix was too long no longer tests an
   inequality between pointers, but between sizes.

Changes since v1:

 * Changed the oneline to be more accurate (thanks Junio).
 * Improved the commit message (e.g. talking about the xwrite() function
   this patch uses, rather than the write_in_full() function used by an
   earlier iteration, thanks Gábor).
 * Revamped the actual code to account for insanely long prefixes (thanks
   for the advice, Junio).

Johannes Schindelin (1):
  vreportf(): avoid relying on stdio buffering

 usage.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)


base-commit: 566a1439f6f56c2171b8853ddbca0ad3f5098770
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-428%2Fdscho%2Ffix-t5516-flakiness-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-428/dscho/fix-t5516-flakiness-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/428

Range-diff vs v2:

 1:  e426627e14 ! 1:  fce0894ee4 vreportf(): avoid relying on stdio buffering
     @@ -16,7 +16,7 @@
          Let's avoid this predicament altogether by rendering the entire message,
          including the prefix and the trailing newline, into the buffer we
          already have (and which is still fixed size) and then write it out via
     -    `xwrite()`.
     +    `write_in_full()`.
      
          We still clip the message to at most 4095 characters.
      
     @@ -46,6 +46,7 @@
          Git Bash on Windows: in this case, `stderr` is not actually truly
          unbuffered, and needs the extra help.
      
     +    Helped-by: Jeff King <peff@peff.net>
          Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
          Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
          Helped-by: Junio C Hamano <gitster@pobox.com>
     @@ -64,9 +65,9 @@
       
      -	vsnprintf(msg, sizeof(msg), err, params);
      -	for (p = msg; *p; p++) {
     -+	p = msg + off < pend ? msg + off : pend - 1;
     ++	p = off < pend - msg ? msg + off : pend - 1;
      +	if (vsnprintf(p, pend - p, err, params) < 0)
     -+		return; /* vsnprintf() failed, there is nothing we can do */
     ++		*p = '\0'; /* vsnprintf() failed, clip at prefix */
      +
      +	for (; p != pend - 1 && *p; p++) {
       		if (iscntrl(*p) && *p != '\t' && *p != '\n')
     @@ -76,7 +77,7 @@
      +
      +	*(p++) = '\n'; /* we no longer need a NUL */
      +	fflush(stderr);
     -+	xwrite(2, msg, p - msg);
     ++	write_in_full(2, msg, p - msg);
       }
       
       static NORETURN void usage_builtin(const char *err, va_list params)

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 20:01   ` [PATCH v3 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
@ 2019-10-29 20:01     ` Johannes Schindelin via GitGitGadget
  2019-10-29 20:32       ` Jeff King
  2019-10-30  2:01       ` Junio C Hamano
  2019-10-30 10:44     ` [PATCH v4 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  1 sibling, 2 replies; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-29 20:01 UTC (permalink / raw)
  To: git
  Cc: Jeff King, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin, Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The MSVC runtime behavior differs from glibc's with respect to
`fprintf(stderr, ...)` in that the former writes out the message
character by character.

In t5516, this leads to a funny problem where a `git fetch` process as
well as the `git upload-pack` process spawned by it _both_ call `die()`
at the same time. The output can look like this:

	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
	8e86a771eda009872d6ab2886

Let's avoid this predicament altogether by rendering the entire message,
including the prefix and the trailing newline, into the buffer we
already have (and which is still fixed size) and then write it out via
`write_in_full()`.

We still clip the message to at most 4095 characters.

The history of `vreportf()` with regard to this issue includes the
following commits:

d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
                        chars before sending to stderr
9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
                        This is seemingly related to d048a96e.
137a0d0e (2007-11-19) - Addresses out-of-order for display()
34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
                        to support UTF-8 emulation
eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
                        so it's safe to use xwrite() again
5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again

Note that we print nothing if the `vsnprintf()` call failed to render
the error message; There is little we can do in that case, and it should
not happen anyway.

Also please note that we `fflush(stderr)` here to help when running in a
Git Bash on Windows: in this case, `stderr` is not actually truly
unbuffered, and needs the extra help.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 usage.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/usage.c b/usage.c
index 2fdb20086b..471efb2de9 100644
--- a/usage.c
+++ b/usage.c
@@ -9,14 +9,21 @@
 void vreportf(const char *prefix, const char *err, va_list params)
 {
 	char msg[4096];
-	char *p;
+	size_t off = strlcpy(msg, prefix, sizeof(msg));
+	char *p, *pend = msg + sizeof(msg);
 
-	vsnprintf(msg, sizeof(msg), err, params);
-	for (p = msg; *p; p++) {
+	p = off < pend - msg ? msg + off : pend - 1;
+	if (vsnprintf(p, pend - p, err, params) < 0)
+		*p = '\0'; /* vsnprintf() failed, clip at prefix */
+
+	for (; p != pend - 1 && *p; p++) {
 		if (iscntrl(*p) && *p != '\t' && *p != '\n')
 			*p = '?';
 	}
-	fprintf(stderr, "%s%s\n", prefix, msg);
+
+	*(p++) = '\n'; /* we no longer need a NUL */
+	fflush(stderr);
+	write_in_full(2, msg, p - msg);
 }
 
 static NORETURN void usage_builtin(const char *err, va_list params)
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 14:32           ` Jeff King
@ 2019-10-29 20:09             ` Johannes Schindelin
  2019-10-30  1:43               ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-29 20:09 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy

Hi Peff,

On Tue, 29 Oct 2019, Jeff King wrote:

> On Tue, Oct 29, 2019 at 03:13:39PM +0100, Johannes Schindelin wrote:
> >
> > > I'd disagree here. Any caller sending an arbitrarily-large prefix
> > > is holding it wrong, and we'd probably want to know as soon as
> > > possible (and a BUG() is our best bet there).
> >
> > How about truncating already inside the prefix, then? It would miss
> > the entire error message... but at least it would print
> > _something_...
>
> Yeah, that might be OK. Hopefully missing the whole rest of the error
> message would cause some tests to fail.

I am not really worried about that. So far, we only have prefixes like
`"fatal: "` or `"error: "`. The longest prefix is in `BUG_fl()` itself,
when we have access to the file name and the line number of the
triggering line, in which case it is `"BUG: <file>:<line>: "`. So as
long as nobody builds a custom version of Git that includes a file whose
path is insanely long, they should be fine. I am certain that Junio will
never accept such a file into git.git.

Oh, and even if anybody would introduce insanely long paths into their
fork of git.git, they would _still_ be safe because `BUG_fl()` limits
the prefix to 256 bytes, including `NUL`.

> You could also abort() after having written if we want to be more
> BUG()-like.

As I said, this is really a part of the patch I am not at all concerned
about. The prefixes are well under control, and I only addressed that
potential future breakage (that I believe falls squarely into YAGNI
territory) only because two reviewers seemed to be concerned.

I really believe that the care I put into the patch to safeguard against
overly long prefixes is seriously overkill.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 19:57       ` Johannes Schindelin
@ 2019-10-29 20:09         ` Jeff King
  2019-10-29 20:24           ` Alexandr Miloslavskiy
  2019-10-29 20:11         ` Alexandr Miloslavskiy
  1 sibling, 1 reply; 34+ messages in thread
From: Jeff King @ 2019-10-29 20:09 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Alexandr Miloslavskiy, Johannes Schindelin via GitGitGadget, git,
	Junio C Hamano

On Tue, Oct 29, 2019 at 08:57:33PM +0100, Johannes Schindelin wrote:

> > On 29.10.2019 14:37, Johannes Schindelin via GitGitGadget wrote:
> >
> > > -	vsnprintf(msg, sizeof(msg), err, params);
> > > -	for (p = msg; *p; p++) {
> > > +	p = msg + off < pend ? msg + off : pend - 1;
> >
> > According to my understanding, this is undefined behavior:
> 
> It is not entirely obvious to me what exactly you mean by "this",
> assuming that you refer to comparing two pointers via `<`, I agree that
> this is not the best idea, I changed it to `off < pend - msg`. If my
> assumption is correct, however, we are still talking about C, so I
> wonder how this C++ document you linked could bear any relevance:

I think the issue is not the comparison, but rather that forming the
pointer "msg + off" is undefined, since its point outside of any object
(including the "one past" rule). And this is illegal in both C and C++,
though of course it works fine in practice most of the time.

Doing "off < pend - msg" is legal. Or note that "pend" is just
"msg + sizeof(msg)", subtract out the common term and end up with
"off < sizeof(msg)". :)

> > Can you please preserve the research text about fprintf() behavior on
> > different platforms that I provided before? Change formatting to what you
> > think fits best.
> 
> Quite honestly, I purposefully refrained from copying that information.
> While even the following paragraphs are incomplete by necessity (there
> are many more standard C libraries we try to work against, e.g. musl and
> newlibc, and it would be insanity to try to cover them all in an
> analysis of stdio buffering), the only finding that is relevant to the
> patch under discussion is that MSVC's runtime outputs to `stderr` byte
> by byte (or more correctly, character by character, I guess), and I did
> mention this in the commit message (in the part you quoted).

One other case is impacted, which is:

> > 1) If `stderr` is fully buffered:
> >    the ordering of `stdout` and `stderr` messages could be wrong,
> >    because `stderr` output waits for the buffer to become full.

We'll now output the stderr message closer to its time-of-printing,
which is a good thing (it may still be in a weird place with respect to
a buffered stdout, but getting the error message out ASAP is the best we
can do there).

I agree that the rest of the research is not especially relevant to the
code change (though I'm glad it is available in the list archive).

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 19:57       ` Johannes Schindelin
  2019-10-29 20:09         ` Jeff King
@ 2019-10-29 20:11         ` Alexandr Miloslavskiy
  1 sibling, 0 replies; 34+ messages in thread
From: Alexandr Miloslavskiy @ 2019-10-29 20:11 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff King, Junio C Hamano

On 29.10.2019 20:57, Johannes Schindelin wrote:

> It is not entirely obvious to me what exactly you mean by "this",
> assuming that you refer to comparing two pointers via `<`, I agree that
> this is not the best idea, I changed it to `off < pend - msg`. If my
> assumption is correct, however, we are still talking about C, so I
> wonder how this C++ document you linked could bear any relevance:

The linked portion of text speaks about adding to pointer. The paragraph 
is a bit long and my quote was indeed hard to grasp. And yes, sorry, C++ 
habits. However, the same thing applies to C, let's have this link instead:

https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf

Hopefully a better quote this time (c/c++ standard are hard to read for 
mortals like me):

When an expression that has integer type is added to or subtracted from 
a pointer <...> If both the pointer operand and the result point
to elements of the same array object, or one past the last element of 
the array object, <...> otherwise, the behavior is undefined.

> Quite honestly, I purposefully refrained from copying that information.

That's a pity, I've put quite an effort into researching all that.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 20:09         ` Jeff King
@ 2019-10-29 20:24           ` Alexandr Miloslavskiy
  0 siblings, 0 replies; 34+ messages in thread
From: Alexandr Miloslavskiy @ 2019-10-29 20:24 UTC (permalink / raw)
  To: Jeff King, Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Junio C Hamano

On 29.10.2019 21:09, Jeff King wrote:

> I think the issue is not the comparison, but rather that forming the
> pointer "msg + off" is undefined, since its point outside of any object
> (including the "one past" rule). And this is illegal in both C and C++,

Yes, thanks for clarifying.

 > +    p = msg + off < pend ? msg + off : pend - 1;

 > though of course it works fine in practice most of the time.

The easiest thing that an optimizer can do (and that's actually quite 
likely in my understanding) is to conclude that 'msg + off' is the only 
non-UB branch here, therefore 'msg + off < pend' can be taken as always 
true, and then discard 'pend - 1' branch. Afterall, UB serve the biggest 
role in optimizations.

That's even more true now that all parts of equation are local and easy 
to grasp for static analysis.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 20:01     ` [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
@ 2019-10-29 20:32       ` Jeff King
  2019-10-30  8:54         ` Johannes Schindelin
  2019-10-30  2:01       ` Junio C Hamano
  1 sibling, 1 reply; 34+ messages in thread
From: Jeff King @ 2019-10-29 20:32 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin, Junio C Hamano

On Tue, Oct 29, 2019 at 08:01:20PM +0000, Johannes Schindelin via GitGitGadget wrote:

> diff --git a/usage.c b/usage.c
> index 2fdb20086b..471efb2de9 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -9,14 +9,21 @@
>  void vreportf(const char *prefix, const char *err, va_list params)
>  {
>  	char msg[4096];
> -	char *p;
> +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> +	char *p, *pend = msg + sizeof(msg);
>  
> -	vsnprintf(msg, sizeof(msg), err, params);
> -	for (p = msg; *p; p++) {
> +	p = off < pend - msg ? msg + off : pend - 1;
> +	if (vsnprintf(p, pend - p, err, params) < 0)
> +		*p = '\0'; /* vsnprintf() failed, clip at prefix */
> +
> +	for (; p != pend - 1 && *p; p++) {
>  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
>  			*p = '?';
>  	}

This version looks OK to me. Some bikeshedding:

  - I suspect it may be more readable to just stick to offsets instead
    of pointers, since that's what strlcpy() and vsnprintf() give us.

  - I don't think "p == pend - 1" can ever trigger, since either
    vsnprintf() or we will have just written a NUL.

  - Do we need to contend with vsnprintf() return a negative value in
    general in our codebase? We already BUG() on it elsewhere. Yes, that
    BUG() would try to write via this code path, but it implies to me
    that we've already dealt with any such broken vsnprintf()
    implementations (via compat/snprintf.c).

If you're sick of bikeshedding, though, I can live without any of those
being addressed.

> +	*(p++) = '\n'; /* we no longer need a NUL */
> +	fflush(stderr);
> +	write_in_full(2, msg, p - msg);

One non-bikeshed question: would fprintf() on some platforms have sent
"\r\n", which is no longer happening with our write()? Do we need to
care about that?

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one
  2019-10-29 20:09             ` Johannes Schindelin
@ 2019-10-30  1:43               ` Junio C Hamano
  0 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2019-10-30  1:43 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Johannes Schindelin via GitGitGadget, git,
	Alexandr Miloslavskiy

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I really believe that the care I put into the patch to safeguard against
> overly long prefixes is seriously overkill.

If it was buggy, the care given to it probably was insufficient, not
overkill ;-)

As long as we know there won't be an overflow, and our compilers
won't complain about possible overflow, simpler code that is easier
to reason about is better.  I tend to think that Peff's
"consistently count based on offsets, instead of pointer arithmetic
that you can easily get wrong" would give us an easier-to-read
result in this case.

Thanks for simplifying the original "overkill" version.  Except for
the bug resulting from misunderstanding of strlcpy() return value,
the version I saw was a quite pleasant read.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 20:01     ` [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
  2019-10-29 20:32       ` Jeff King
@ 2019-10-30  2:01       ` Junio C Hamano
  2019-10-30  9:13         ` Johannes Schindelin
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2019-10-30  2:01 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Jeff King, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Subject: Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering

perfect ;-)

> Let's avoid this predicament altogether by rendering the entire message,
> including the prefix and the trailing newline, into the buffer we
> already have (and which is still fixed size) and then write it out via
> `write_in_full()`.
>
> We still clip the message to at most 4095 characters.

Good.

> Also please note that we `fflush(stderr)` here to help when running in a
> Git Bash on Windows: in this case, `stderr` is not actually truly
> unbuffered, and needs the extra help.

I do not think you have to single out Windows in this case; if we
are writing directly to file descriptor #2, as long as there have
been something written by the process to the standard error stream
before our caller called us, we must flush it to make sure what
we are going to write out will come after, no matter what platform
we are on.

    The process may have written to the standard error stream and
    there may be something left in the buffer kept in the stdio
    layer.  Call fflush(stderr) before writing the message we
    prepare in this function.

or something along that line would be sufficient.

> diff --git a/usage.c b/usage.c
> index 2fdb20086b..471efb2de9 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -9,14 +9,21 @@
>  void vreportf(const char *prefix, const char *err, va_list params)
>  {
>  	char msg[4096];
> -	char *p;
> +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> +	char *p, *pend = msg + sizeof(msg);
>  
> -	vsnprintf(msg, sizeof(msg), err, params);
> -	for (p = msg; *p; p++) {
> +	p = off < pend - msg ? msg + off : pend - 1;
> +	if (vsnprintf(p, pend - p, err, params) < 0)
> +		*p = '\0'; /* vsnprintf() failed, clip at prefix */
> +
> +	for (; p != pend - 1 && *p; p++) {

It may make the result much simpler to start with something like:

	size_t prefix_len = strlen(prefix);

	if (sizeof(msg) <= prefix_len) {
 		fprintf(stderr, "BUG!!! too long a prefix '%s'\n", prefix);
		abort();
	}
	memcpy(msg, prefix, prefix_len);
	p = msg + prefix_len;

as we agreed that we won't have prefix that will fill or overflow
msg[] based on your earlier reading of the callers like BUG-fl.
That way, we probably may be able to even lose the pend pointer and
arithmetic around it.

>  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
>  			*p = '?';
>  	}

Thanks.  With this flow it is crystal clear that the prefix is shown
as-is, while the payload is sanitized.

> -	fprintf(stderr, "%s%s\n", prefix, msg);
> +
> +	*(p++) = '\n'; /* we no longer need a NUL */
> +	fflush(stderr);
> +	write_in_full(2, msg, p - msg);
>  }
>  
>  static NORETURN void usage_builtin(const char *err, va_list params)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-29 20:32       ` Jeff King
@ 2019-10-30  8:54         ` Johannes Schindelin
  2019-10-31  6:24           ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-30  8:54 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin via GitGitGadget, git, SZEDER Gábor,
	Alexandr Miloslavskiy, Junio C Hamano

Hi Peff,

On Tue, 29 Oct 2019, Jeff King wrote:

> On Tue, Oct 29, 2019 at 08:01:20PM +0000, Johannes Schindelin via GitGitGadget wrote:
>
> > diff --git a/usage.c b/usage.c
> > index 2fdb20086b..471efb2de9 100644
> > --- a/usage.c
> > +++ b/usage.c
> > @@ -9,14 +9,21 @@
> >  void vreportf(const char *prefix, const char *err, va_list params)
> >  {
> >  	char msg[4096];
> > -	char *p;
> > +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> > +	char *p, *pend = msg + sizeof(msg);
> >
> > -	vsnprintf(msg, sizeof(msg), err, params);
> > -	for (p = msg; *p; p++) {
> > +	p = off < pend - msg ? msg + off : pend - 1;
> > +	if (vsnprintf(p, pend - p, err, params) < 0)
> > +		*p = '\0'; /* vsnprintf() failed, clip at prefix */
> > +
> > +	for (; p != pend - 1 && *p; p++) {
> >  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
> >  			*p = '?';
> >  	}
>
> This version looks OK to me. Some bikeshedding:
>
>   - I suspect it may be more readable to just stick to offsets instead
>     of pointers, since that's what strlcpy() and vsnprintf() give us.

An earlier (unsent) iteration did exactly that, but it was quite a bit
more unreadable because of the required arithmetics with `sizeof(msg)`.

>   - I don't think "p == pend - 1" can ever trigger, since either
>     vsnprintf() or we will have just written a NUL.

You are right, but I wanted to make extra sure that this code is robust
even (or: especially) in the presence of buggy libc functions.

It's not even expensive, I don't think.

>   - Do we need to contend with vsnprintf() return a negative value in
>     general in our codebase? We already BUG() on it elsewhere. Yes, that
>     BUG() would try to write via this code path, but it implies to me
>     that we've already dealt with any such broken vsnprintf()
>     implementations (via compat/snprintf.c).

It is true that the test suite bails out with a `BUG()` when
`vsnprintf()` is broken. But I figured that we want to be safe rather
than sorry. Besides, I have no full picture about what potential reasons
could make `vsnprintf()` return a negative value: for what I know, an
invalid format string could trigger that. And I _really_ want this code
path to be as robust as I can make it.

> If you're sick of bikeshedding, though, I can live without any of those
> being addressed.

Oh, that's okay, and I would not even call it bikeshedding, I think you
raised valid concerns.

> > +	*(p++) = '\n'; /* we no longer need a NUL */
> > +	fflush(stderr);
> > +	write_in_full(2, msg, p - msg);
>
> One non-bikeshed question: would fprintf() on some platforms have sent
> "\r\n", which is no longer happening with our write()? Do we need to
> care about that?

I am not aware of any platform where `fprintf()` would automatically
transform `\n` to `\r\n`. Not unless the `FILE *` in question has been
opened with the `t` flag. And I am rather certain that `stderr` is not
opened with that flag. And if it was, I would force it off in Git for
Windows.

Thanks,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-30  2:01       ` Junio C Hamano
@ 2019-10-30  9:13         ` Johannes Schindelin
  0 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-30  9:13 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff King,
	SZEDER Gábor, Alexandr Miloslavskiy

Hi Junio,

On Wed, 30 Oct 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > Also please note that we `fflush(stderr)` here to help when running in a
> > Git Bash on Windows: in this case, `stderr` is not actually truly
> > unbuffered, and needs the extra help.
>
> I do not think you have to single out Windows in this case; if we
> are writing directly to file descriptor #2, as long as there have
> been something written by the process to the standard error stream
> before our caller called us, we must flush it to make sure what
> we are going to write out will come after, no matter what platform
> we are on.
>
>     The process may have written to the standard error stream and
>     there may be something left in the buffer kept in the stdio
>     layer.  Call fflush(stderr) before writing the message we
>     prepare in this function.
>
> or something along that line would be sufficient.

I replaced the paragraph with a slightly edited version of this. Thanks.

> > diff --git a/usage.c b/usage.c
> > index 2fdb20086b..471efb2de9 100644
> > --- a/usage.c
> > +++ b/usage.c
> > @@ -9,14 +9,21 @@
> >  void vreportf(const char *prefix, const char *err, va_list params)
> >  {
> >  	char msg[4096];
> > -	char *p;
> > +	size_t off = strlcpy(msg, prefix, sizeof(msg));
> > +	char *p, *pend = msg + sizeof(msg);
> >
> > -	vsnprintf(msg, sizeof(msg), err, params);
> > -	for (p = msg; *p; p++) {
> > +	p = off < pend - msg ? msg + off : pend - 1;
> > +	if (vsnprintf(p, pend - p, err, params) < 0)
> > +		*p = '\0'; /* vsnprintf() failed, clip at prefix */
> > +
> > +	for (; p != pend - 1 && *p; p++) {
>
> It may make the result much simpler to start with something like:
>
> 	size_t prefix_len = strlen(prefix);
>
> 	if (sizeof(msg) <= prefix_len) {
>  		fprintf(stderr, "BUG!!! too long a prefix '%s'\n", prefix);
> 		abort();
> 	}
> 	memcpy(msg, prefix, prefix_len);
> 	p = msg + prefix_len;
>
> as we agreed that we won't have prefix that will fill or overflow
> msg[] based on your earlier reading of the callers like BUG-fl.

Makes sense. I was a bit blinded by my endeavor to keep the diffstat
small.

> That way, we probably may be able to even lose the pend pointer and
> arithmetic around it.

I tried this, and it still looks more obvious to me with `pend`.

> >  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
> >  			*p = '?';
> >  	}
>
> Thanks.  With this flow it is crystal clear that the prefix is shown
> as-is, while the payload is sanitized.

Yes ;-)

Thanks,
Dscho

> > -	fprintf(stderr, "%s%s\n", prefix, msg);
> > +
> > +	*(p++) = '\n'; /* we no longer need a NUL */
> > +	fflush(stderr);
> > +	write_in_full(2, msg, p - msg);
> >  }
> >
> >  static NORETURN void usage_builtin(const char *err, va_list params)
>
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 0/1]  Fix t5516 flakiness in Visual Studio builds
  2019-10-29 20:01   ` [PATCH v3 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
  2019-10-29 20:01     ` [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
@ 2019-10-30 10:44     ` Johannes Schindelin via GitGitGadget
  2019-10-30 10:44       ` [PATCH v4 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
  1 sibling, 1 reply; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-30 10:44 UTC (permalink / raw)
  To: git
  Cc: Jeff King, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin, Junio C Hamano

Among the flaky tests, it seems that the Azure Pipeline suffers relatively
frequently from t5516 failing with the Visual Studio builds. Essentially, we
grep for an error message, but that error message is produced twice, once by
a fetch and once by the upload-pack spawned from it, and those error
messages are usually interleaved because of MSVC runtime fprintf() 
idiosyncracies.

As a consequence, I have to re-run about half a dozen failed builds a day,
which I would like to avoid. My plan is therefore to merge this patch into
Git for Windows v2.24.0-rc2.

The commit message of this patch is based, in part, on 
https://github.com/gitgitgadget/git/pull/407. 

This fixes https://github.com/gitgitgadget/git/issues/240.

Changes since v3:

 * Reworded the part of the commit message that talks about the need for 
   fflush(stderr); for accuracy, as suggested by Junio.
 * Simplified the flow of the prefix handling as well as providing a proper 
   BUG message and abort()ing when the prefix is too long. Thanks Junio!

Changes since v2:

 * Using write_in_full() instead of xwrite() again (to make sure that the
   entire message is printed).
 * When vsnprintf() fails, now we at least print the prefix.
 * The code to check whether prefix was too long no longer tests an
   inequality between pointers, but between sizes.

Changes since v1:

 * Changed the oneline to be more accurate (thanks Junio).
 * Improved the commit message (e.g. talking about the xwrite() function
   this patch uses, rather than the write_in_full() function used by an
   earlier iteration, thanks Gábor).
 * Revamped the actual code to account for insanely long prefixes (thanks
   for the advice, Junio).

Johannes Schindelin (1):
  vreportf(): avoid relying on stdio buffering

 usage.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)


base-commit: 566a1439f6f56c2171b8853ddbca0ad3f5098770
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-428%2Fdscho%2Ffix-t5516-flakiness-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-428/dscho/fix-t5516-flakiness-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/428

Range-diff vs v3:

 1:  fce0894ee4 ! 1:  f6d6c8122a vreportf(): avoid relying on stdio buffering
     @@ -42,9 +42,9 @@
          the error message; There is little we can do in that case, and it should
          not happen anyway.
      
     -    Also please note that we `fflush(stderr)` here to help when running in a
     -    Git Bash on Windows: in this case, `stderr` is not actually truly
     -    unbuffered, and needs the extra help.
     +    The process may have written to `stderr` and there may be something left
     +    in the buffer kept in the stdio layer. Call `fflush(stderr)` before
     +    writing the message we prepare in this function.
      
          Helped-by: Jeff King <peff@peff.net>
          Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
     @@ -60,12 +60,17 @@
       {
       	char msg[4096];
      -	char *p;
     -+	size_t off = strlcpy(msg, prefix, sizeof(msg));
      +	char *p, *pend = msg + sizeof(msg);
     ++	size_t prefix_len = strlen(prefix);
       
      -	vsnprintf(msg, sizeof(msg), err, params);
      -	for (p = msg; *p; p++) {
     -+	p = off < pend - msg ? msg + off : pend - 1;
     ++	if (sizeof(msg) <= prefix_len) {
     ++		fprintf(stderr, "BUG!!! too long a prefix '%s'\n", prefix);
     ++		abort();
     ++	}
     ++	memcpy(msg, prefix, prefix_len);
     ++	p = msg + prefix_len;
      +	if (vsnprintf(p, pend - p, err, params) < 0)
      +		*p = '\0'; /* vsnprintf() failed, clip at prefix */
      +

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-30 10:44     ` [PATCH v4 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
@ 2019-10-30 10:44       ` Johannes Schindelin via GitGitGadget
  2019-11-02  4:05         ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-30 10:44 UTC (permalink / raw)
  To: git
  Cc: Jeff King, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin, Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The MSVC runtime behavior differs from glibc's with respect to
`fprintf(stderr, ...)` in that the former writes out the message
character by character.

In t5516, this leads to a funny problem where a `git fetch` process as
well as the `git upload-pack` process spawned by it _both_ call `die()`
at the same time. The output can look like this:

	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
	8e86a771eda009872d6ab2886

Let's avoid this predicament altogether by rendering the entire message,
including the prefix and the trailing newline, into the buffer we
already have (and which is still fixed size) and then write it out via
`write_in_full()`.

We still clip the message to at most 4095 characters.

The history of `vreportf()` with regard to this issue includes the
following commits:

d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
                        chars before sending to stderr
9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
                        This is seemingly related to d048a96e.
137a0d0e (2007-11-19) - Addresses out-of-order for display()
34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
                        to support UTF-8 emulation
eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
                        so it's safe to use xwrite() again
5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again

Note that we print nothing if the `vsnprintf()` call failed to render
the error message; There is little we can do in that case, and it should
not happen anyway.

The process may have written to `stderr` and there may be something left
in the buffer kept in the stdio layer. Call `fflush(stderr)` before
writing the message we prepare in this function.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 usage.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/usage.c b/usage.c
index 2fdb20086b..58fb5fff5f 100644
--- a/usage.c
+++ b/usage.c
@@ -9,14 +9,26 @@
 void vreportf(const char *prefix, const char *err, va_list params)
 {
 	char msg[4096];
-	char *p;
+	char *p, *pend = msg + sizeof(msg);
+	size_t prefix_len = strlen(prefix);
 
-	vsnprintf(msg, sizeof(msg), err, params);
-	for (p = msg; *p; p++) {
+	if (sizeof(msg) <= prefix_len) {
+		fprintf(stderr, "BUG!!! too long a prefix '%s'\n", prefix);
+		abort();
+	}
+	memcpy(msg, prefix, prefix_len);
+	p = msg + prefix_len;
+	if (vsnprintf(p, pend - p, err, params) < 0)
+		*p = '\0'; /* vsnprintf() failed, clip at prefix */
+
+	for (; p != pend - 1 && *p; p++) {
 		if (iscntrl(*p) && *p != '\t' && *p != '\n')
 			*p = '?';
 	}
-	fprintf(stderr, "%s%s\n", prefix, msg);
+
+	*(p++) = '\n'; /* we no longer need a NUL */
+	fflush(stderr);
+	write_in_full(2, msg, p - msg);
 }
 
 static NORETURN void usage_builtin(const char *err, va_list params)
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-30  8:54         ` Johannes Schindelin
@ 2019-10-31  6:24           ` Jeff King
  2019-10-31 10:26             ` Johannes Schindelin
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2019-10-31  6:24 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, SZEDER Gábor,
	Alexandr Miloslavskiy, Junio C Hamano

On Wed, Oct 30, 2019 at 09:54:52AM +0100, Johannes Schindelin wrote:

> > One non-bikeshed question: would fprintf() on some platforms have sent
> > "\r\n", which is no longer happening with our write()? Do we need to
> > care about that?
> 
> I am not aware of any platform where `fprintf()` would automatically
> transform `\n` to `\r\n`. Not unless the `FILE *` in question has been
> opened with the `t` flag. And I am rather certain that `stderr` is not
> opened with that flag. And if it was, I would force it off in Git for
> Windows.

OK, thanks. You guessed the platform I was thinking of. :)

Another more far-fetched one: IIRC our stdio wrappers on Windows do some
magic to convert ANSI color codes into actual terminal codes. Could that
be a problem here? I think we'd kill off any color codes in the actual
message due to the control-code replacement. In theory the prefix could
have them. I don't think any code does now, but the PUSH_COLOR_ERROR
stuff in builtin/push.c is getting close. I wouldn't be surprised for
that to eventually get folded into error().

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-31  6:24           ` Jeff King
@ 2019-10-31 10:26             ` Johannes Schindelin
  2019-10-31 15:48               ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Schindelin @ 2019-10-31 10:26 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin via GitGitGadget, git, SZEDER Gábor,
	Alexandr Miloslavskiy, Junio C Hamano

Hi Peff,

On Thu, 31 Oct 2019, Jeff King wrote:

> On Wed, Oct 30, 2019 at 09:54:52AM +0100, Johannes Schindelin wrote:
>
> > > One non-bikeshed question: would fprintf() on some platforms have sent
> > > "\r\n", which is no longer happening with our write()? Do we need to
> > > care about that?
> >
> > I am not aware of any platform where `fprintf()` would automatically
> > transform `\n` to `\r\n`. Not unless the `FILE *` in question has been
> > opened with the `t` flag. And I am rather certain that `stderr` is not
> > opened with that flag. And if it was, I would force it off in Git for
> > Windows.
>
> OK, thanks. You guessed the platform I was thinking of. :)
>
> Another more far-fetched one: IIRC our stdio wrappers on Windows do some
> magic to convert ANSI color codes into actual terminal codes. Could that
> be a problem here? I think we'd kill off any color codes in the actual
> message due to the control-code replacement. In theory the prefix could
> have them. I don't think any code does now, but the PUSH_COLOR_ERROR
> stuff in builtin/push.c is getting close. I wouldn't be surprised for
> that to eventually get folded into error().

A valid concern!

As per eac14f8909d (Win32: Thread-safe windows console output,
2012-01-14), `write()` _implicitly_ gets the ANSI emulation of
`winansi.c`.

So I think we're good.

Thanks for thinking of all kinds of angles,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-31 10:26             ` Johannes Schindelin
@ 2019-10-31 15:48               ` Jeff King
  2019-11-01 18:41                 ` Johannes Schindelin
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2019-10-31 15:48 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, SZEDER Gábor,
	Alexandr Miloslavskiy, Junio C Hamano

On Thu, Oct 31, 2019 at 11:26:30AM +0100, Johannes Schindelin wrote:

> > Another more far-fetched one: IIRC our stdio wrappers on Windows do some
> > magic to convert ANSI color codes into actual terminal codes. Could that
> > be a problem here? I think we'd kill off any color codes in the actual
> > message due to the control-code replacement. In theory the prefix could
> > have them. I don't think any code does now, but the PUSH_COLOR_ERROR
> > stuff in builtin/push.c is getting close. I wouldn't be surprised for
> > that to eventually get folded into error().
> 
> A valid concern!
> 
> As per eac14f8909d (Win32: Thread-safe windows console output,
> 2012-01-14), `write()` _implicitly_ gets the ANSI emulation of
> `winansi.c`.
> 
> So I think we're good.
> 
> Thanks for thinking of all kinds of angles,

Good. Thanks for humoring my wild speculation. :)

I remember being confused by this not too long ago, so I did a quick dig
in the archive. And indeed, there was a confusing comment which caught
me in 2016, and which you removed back then (via 3d0a83382f26). I even
reviewed the patch. Maybe this time I will remember the outcome!

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-31 15:48               ` Jeff King
@ 2019-11-01 18:41                 ` Johannes Schindelin
  0 siblings, 0 replies; 34+ messages in thread
From: Johannes Schindelin @ 2019-11-01 18:41 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin via GitGitGadget, git, SZEDER Gábor,
	Alexandr Miloslavskiy, Junio C Hamano

Hi,

On Thu, 31 Oct 2019, Jeff King wrote:

> On Thu, Oct 31, 2019 at 11:26:30AM +0100, Johannes Schindelin wrote:
>
> > > Another more far-fetched one: IIRC our stdio wrappers on Windows do some
> > > magic to convert ANSI color codes into actual terminal codes. Could that
> > > be a problem here? I think we'd kill off any color codes in the actual
> > > message due to the control-code replacement. In theory the prefix could
> > > have them. I don't think any code does now, but the PUSH_COLOR_ERROR
> > > stuff in builtin/push.c is getting close. I wouldn't be surprised for
> > > that to eventually get folded into error().
> >
> > A valid concern!
> >
> > As per eac14f8909d (Win32: Thread-safe windows console output,
> > 2012-01-14), `write()` _implicitly_ gets the ANSI emulation of
> > `winansi.c`.
> >
> > So I think we're good.
> >
> > Thanks for thinking of all kinds of angles,
>
> Good. Thanks for humoring my wild speculation. :)

Of course!

> I remember being confused by this not too long ago, so I did a quick dig
> in the archive. And indeed, there was a confusing comment which caught
> me in 2016, and which you removed back then (via 3d0a83382f26). I even
> reviewed the patch. Maybe this time I will remember the outcome!

Heh, I won't fault you because I totally forgot about it myself.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 1/1] vreportf(): avoid relying on stdio buffering
  2019-10-30 10:44       ` [PATCH v4 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
@ 2019-11-02  4:05         ` Junio C Hamano
  0 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2019-11-02  4:05 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Jeff King, SZEDER Gábor, Alexandr Miloslavskiy,
	Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The MSVC runtime behavior differs from glibc's with respect to
> `fprintf(stderr, ...)` in that the former writes out the message
> character by character.
> ...
> Let's avoid this predicament altogether by rendering the entire message,
> including the prefix and the trailing newline, into the buffer we
> already have (and which is still fixed size) and then write it out via
> `write_in_full()`.
> ...
> The process may have written to `stderr` and there may be something left
> in the buffer kept in the stdio layer. Call `fflush(stderr)` before
> writing the message we prepare in this function.

Thanks.

This is for future reference and not a comment for you alone, but we
probably do not want to single out glibc like this proposed log
message does in its first paragraph, as if in the author's mind, the
world has only two systems.  It invites questions like "what about
various BSD variants?  how does musl behave?" that may lead to a
suggestion to further update it to "differs from most everybody
else", or we may end up saying "MSVC, X and Y share this problem
unlike all others".  Either way, at that point, the original
singling out of glibc becomes meaningless.

"Platform X has this issue and here is a way to avoid that on
everybody" is a good description of the motivation, and the mention
of the behaviour of MSVC runtime is what we want to see there.

        The MSVC runtime writes out the message character by
        character given `fprintf(stderr, ...)`.

It is not necessary to add "There may be other platforms that share
the same issue, but MSVC alone is already important enough so we do
not list them here".  It is trivially true and obvious ;-)

As I said, the above is for future reference for everybody; it's
cutting close to the final, so let's queue this (and the other one)
as-is.

Thanks for working on this fix.

> Helped-by: Jeff King <peff@peff.net>
> Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
> Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  usage.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/usage.c b/usage.c
> index 2fdb20086b..58fb5fff5f 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -9,14 +9,26 @@
>  void vreportf(const char *prefix, const char *err, va_list params)
>  {
>  	char msg[4096];
> -	char *p;
> +	char *p, *pend = msg + sizeof(msg);
> +	size_t prefix_len = strlen(prefix);
>  
> -	vsnprintf(msg, sizeof(msg), err, params);
> -	for (p = msg; *p; p++) {
> +	if (sizeof(msg) <= prefix_len) {
> +		fprintf(stderr, "BUG!!! too long a prefix '%s'\n", prefix);
> +		abort();
> +	}
> +	memcpy(msg, prefix, prefix_len);
> +	p = msg + prefix_len;
> +	if (vsnprintf(p, pend - p, err, params) < 0)
> +		*p = '\0'; /* vsnprintf() failed, clip at prefix */
> +
> +	for (; p != pend - 1 && *p; p++) {
>  		if (iscntrl(*p) && *p != '\t' && *p != '\n')
>  			*p = '?';
>  	}
> -	fprintf(stderr, "%s%s\n", prefix, msg);
> +
> +	*(p++) = '\n'; /* we no longer need a NUL */
> +	fflush(stderr);
> +	write_in_full(2, msg, p - msg);
>  }
>  
>  static NORETURN void usage_builtin(const char *err, va_list params)

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2019-11-02  4:06 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-28 15:00 [PATCH 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
2019-10-28 15:00 ` [PATCH 1/1] vreportf(): avoid buffered write in favor of unbuffered one Johannes Schindelin via GitGitGadget
2019-10-29  3:18   ` Junio C Hamano
2019-10-29 12:30     ` Johannes Schindelin
2019-10-29 13:49       ` Jeff King
2019-10-29 14:13         ` Johannes Schindelin
2019-10-29 14:32           ` Jeff King
2019-10-29 20:09             ` Johannes Schindelin
2019-10-30  1:43               ` Junio C Hamano
2019-10-29 16:44         ` Junio C Hamano
2019-10-29 10:38   ` SZEDER Gábor
2019-10-29 12:38     ` Johannes Schindelin
2019-10-29 13:52       ` Jeff King
2019-10-29 14:18         ` Johannes Schindelin
2019-10-29 13:37 ` [PATCH v2 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
2019-10-29 13:37   ` [PATCH v2 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
2019-10-29 14:21     ` Alexandr Miloslavskiy
2019-10-29 19:57       ` Johannes Schindelin
2019-10-29 20:09         ` Jeff King
2019-10-29 20:24           ` Alexandr Miloslavskiy
2019-10-29 20:11         ` Alexandr Miloslavskiy
2019-10-29 20:01   ` [PATCH v3 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
2019-10-29 20:01     ` [PATCH v3 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
2019-10-29 20:32       ` Jeff King
2019-10-30  8:54         ` Johannes Schindelin
2019-10-31  6:24           ` Jeff King
2019-10-31 10:26             ` Johannes Schindelin
2019-10-31 15:48               ` Jeff King
2019-11-01 18:41                 ` Johannes Schindelin
2019-10-30  2:01       ` Junio C Hamano
2019-10-30  9:13         ` Johannes Schindelin
2019-10-30 10:44     ` [PATCH v4 0/1] Fix t5516 flakiness in Visual Studio builds Johannes Schindelin via GitGitGadget
2019-10-30 10:44       ` [PATCH v4 1/1] vreportf(): avoid relying on stdio buffering Johannes Schindelin via GitGitGadget
2019-11-02  4:05         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).