All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Jeff King <peff@peff.net>
Cc: Johannes Sixt <j6t@kdbg.org>, Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: What's cooking in git.git (Jan 2017, #02; Sun, 15)
Date: Mon, 16 Jan 2017 18:06:35 +0100 (CET)	[thread overview]
Message-ID: <alpine.DEB.2.20.1701161746200.3469@virtualbox> (raw)
In-Reply-To: <20170116160456.ltbb7ofe47xos7xo@sigill.intra.peff.net>

Hi Peff,

On Mon, 16 Jan 2017, Jeff King wrote:

> On Mon, Jan 16, 2017 at 01:52:39PM +0100, Johannes Schindelin wrote:
> 
> > > >  An error message with an ASCII control character like '\r' in it
> > > >  can alter the message to hide its early part, which is problematic
> > > >  when a remote side gives such an error message that the local side
> > > >  will relay with a "remote: " prefix.
> > > >
> > > >  Will merge to 'next'.
> > > 
> > > Please be not too hasty with advancing this topic to master. I could imagine
> > > that there is some unwanted fallout on systems where the end-of-line marker
> > > CRLF is common. Though, I haven't tested the topic myself, yet, nor do I
> > > expect regressions in *my* typical workflow.
> > 
> > Thank you for being so attentive.
> > 
> > This topic branch would indeed have caused problems. Worse: it would have
> > caused problems that are not covered by our test suite, as Git for
> > Windows' own utilities do not generate CR/LF line endings. So this
> > regression would have bit our users. Nasty.
> 
> Hmm.  I am not sure to what degree CRLFs are actually a problem here.
> Keep in mind these are error messages generated via error(), and so not
> processing arbitrary data. I can imagine that CRs might come from:

Please note the regression test I added. It uses rev-parse's --abbrev-ref
option which quotes the argument when erroring out. This argument then
gets munged.

So error() (or in this case, die()) *very much* processes arbitrary data.

I *know* that rev-parse --abbrev-ref is an artificial example, it is
highly unlikely that anybody will use

	git rev-parse --abbrev-ref "$(<call an external program here that
		generates CR/LF line endings>)"

However, there are plenty other cases in regular Git usage where arguments
are generated by external programs to which we have no business dictating a
specific line ending style.

If you absolutely insist, I will spend time to find a plausible example
and use that in the regression test. However, that would not really
improve anything, as the purpose of the regression test is simply to
demonstrate that a user-provided argument's CR/LF does not get munged
incorrectly. And the test I provided serves that purpose perfectly.

>   1. A parameter like a filename or revision name. If I ask for a
>      rev-parse of "foo\r\n", then it's probably useful to mention the
>      "\r" in the error message, rather than ignoring it (or converting
>      it to a line-feed).
> 
>      And I think that would apply to any input parameter we show via
>      error(), etc, if it is connected to a newline (ideally we would
>      show newlines as "?", too, but we cannot tell the difference
>      between ones from parameters, and ones that are part of the error
>      message).

I think it is doing users a really great disservice to munge up CR or LF
into question marks. I *guarantee* you that it confuses users. And not
because they are dumb, but because the code violates the Law of Least
Surprise.

> I am certainly open to loosening the sanitizing for CR to make things
> work seamlessly on Windows. But I am having trouble imagining a case
> that is actually negatively impacted.

Git accepts all kinds of arguments, not just file names. It is totally
legitimate, and you probably can show use cases of that yourself, to
generate those arguments by other programs.

These programs can generate CR/LF line endings.

While well-intentioned, your changes could make things even unreadable.
Certainly confusing.

> > -- snipsnap --
> > Subject: [PATCH] fixup! vreport: sanitize ASCII control chars
> 
> Given the subtlety here, I'd much rather have a patch on top.

Fine.

> > The original patch is incorrect, as it turns carriage returns into
> > question marks.
> > 
> > However, carriage returns should be left alone when preceding a line feed,
> > and simply turned into line feeds otherwise.
> 
> The question of whether to leave CRLFs alone is addressed above. But I
> do not understand why you'd want a lone CR to be converted to a line
> feed. Running:
> 
>   git rev-parse --abbrev-ref "$(printf "foo\r")"
> 
> with my patch gets you:
> 
>   fatal: ambiguous argument 'foo?': unknown revision or path not in the working tree.
> 
> But with yours:
> 
>   fatal: ambiguous argument 'foo
>   ': unknown revision or path not in the working tree.
> 
> Obviously the "?" thing is a lossy transformation,

s/lossy/confusing/

Probably not to you, because you came up with the transformation, but to
literally everybody else.

> but I do not see how a newline is any less confusing (but see below for
> some thoughts).

The more common bug report to be expected would show that multi-line
arguments are converted to ending in question marks. That is not the user
experience for which I am aiming.

> > To make the end result more readable, the logic is inverted so that the
> > common case (no substitution) is handled first.
> > 
> > While at it, let's lose the unnecessary curly braces.
> 
> Please don't. Obviously C treats the "if/else" as a single unit, but
> IMHO it's less error-prone to include the braces any time there are
> multiple visual lines. E.g., something like:
> 
>   while (foo)
> 	if (bar)
> 		one();
> 	else
> 		two();
> 	three();
> 
> is much easier to spot as wrong when you would require braces either
> way (and not relevant here, but I'd say that even an inner block with a
> comment deserves braces for the same reason).

There is no documentation about the preferred coding style.

For years, I saw patches, and provided patches myself, that *avoided*
curly braces when not necessary (in addition to aiming for shorter arms to
come before longer arms in conditionals).

Now all of a sudden Junio *and* you suggest unnecessary curly braces to be
added.

That is inconsistent at best, and confusing. Maybe you two gentle people
can make up your mind and document the final verdict, and for extra
brownie points automate the formatting so that patch reviews are not
dominated by coding style comments that would be better addressed by
machines than by humans.

> > We also add a regression test that verifies that carriage returns are
> > handled properly. And as we expect CR/LF to be handled properly also on
> > platforms other than Windows, this test case is not guarded by the MINGW
> > prerequisite.
> 
> I am not sure what "properly" means here. In your test:
> 
> > +# This test verifies that the error reporting functions handle CR correctly.
> > +# --abbrev-ref is simply used out of convenience, as it reports an error including
> > +# a user-specified argument.
> > +test_expect_success 'error messages treat CR/LF correctly' '
> > +	test_must_fail git rev-parse --abbrev-ref "$(printf "CR/LF\\r\\n")" 2>err &&
> > +	grep "^fatal: .*CR/LF$" err
> > +'
> 
> The "\n" is eaten by the shell, and git sees only "CR/LF\r". So we are
> not testing the CRLF case in vreportf() at all.

True. Should be easy to fix, starting from my patch.

> We do end up with "CR/LF\n" in vreportf(), which is presumably converted
> by fprintf() to "CR/LF\r\n" on Windows. And so perhaps that is why you
> are doing the "convert \r to \n" thing above.
> 
> But I still think it's not doing the right thing. Git _didn't_ see CRLF,
> it saw CR.

You know, I don't care. As long as carriage returns are either left alone
(which conflicts specifically with your stated goal) or at least are
handled gracefully when coming before line feeds.

Converting stray carriage returns to question marks is confusing, and of
course I would take the brunt of the bug reports, so I do not look
favorably on that change.

My patch was just a starter, to help with fixing the patch series before
it hits `next`.

Ciao,
Johannes

  reply	other threads:[~2017-01-16 17:07 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-16  1:51 What's cooking in git.git (Jan 2017, #02; Sun, 15) Junio C Hamano
2017-01-16  6:56 ` Johannes Sixt
2017-01-16  7:48   ` Junio C Hamano
2017-01-16 12:52   ` Johannes Schindelin
2017-01-16 16:04     ` Jeff King
2017-01-16 17:06       ` Johannes Schindelin [this message]
2017-01-16 20:33         ` Johannes Sixt
2017-01-16 21:44           ` Jeff King
2017-01-17 19:17             ` Junio C Hamano
2017-01-18  6:50               ` Johannes Sixt
2017-01-18 19:12                 ` Junio C Hamano
2017-01-19 16:12                   ` Johannes Schindelin
2017-01-16 22:00         ` Jeff King
2017-01-16 22:08           ` [PATCH] CodingGuidelines: clarify multi-line brace style Jeff King
2017-01-17 19:39             ` Junio C Hamano
2017-01-17 20:05               ` Jeff King
2017-01-17 21:41                 ` Junio C Hamano
2017-01-17  1:33           ` What's cooking in git.git (Jan 2017, #02; Sun, 15) Jacob Keller
2017-01-17  7:52             ` Jeff King
2017-01-17 19:20           ` Junio C Hamano
2017-01-17 19:36             ` Jeff King
2017-01-17 20:32               ` Junio C Hamano
2017-01-17 20:51                 ` Jeff King
2017-01-16 11:28 ` Johannes Schindelin
2017-01-17 19:45   ` Junio C Hamano
2017-01-18 13:54     ` Johannes Schindelin
2017-01-16 11:40 ` Johannes Schindelin
2017-01-18  1:05 ` [PATCHv3 4/4] unpack-trees: support super-prefix option Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.20.1701161746200.3469@virtualbox \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.