git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: George Dunlap <george.dunlap@citrix.com>
Cc: <git@vger.kernel.org>
Subject: Re: git-am doesn't strip CRLF line endings when the mbox is base64-encoded
Date: Mon, 06 Jan 2020 09:07:58 -0800	[thread overview]
Message-ID: <xmqqpnfwa4y9.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <0c18df58-9d1d-550f-d69e-d6ffe6c01858@citrix.com> (George Dunlap's message of "Mon, 6 Jan 2020 11:58:48 +0000")

George Dunlap <george.dunlap@citrix.com> writes:

> On 12/18/19 12:15 PM, George Dunlap wrote:
>> On 12/18/19 11:42 AM, George Dunlap wrote:
>>> Using git 2.24.0 from Debian testing.
>>>
>>> It seems that git-am will strip CRLF endings from mails before applying
>>> patches when the mail isn't encoded in any way.  It will also decode
>>> base64-encoded mails.  But it won't strip CRLF endings from
>>> base64-encoded mails.
>>>
>>> Attached are two mbox files for two different recent series.
>>> plainenc.am applies cleanly with `git am`, while base64enc.am doesn't.
>>>
>>> Poking around the man pages, it looks like part of the issue might be
>>> that the CRLF stripping is done in `git mailsplit`, before the base64
>>> encoding, rather than after.
>> 
>> Poking around -- it looks like the CRLF stripping would be better done
>> in `git mailinfo` after the decoding.
>
> Anyone want to take this up?  I mean, I could try to send a patch, but
> since I've never looked at the git source code before, I'm sure it would
> take me about 10x as much effort for me to do it as for someone already
> familiar with the codebase.

Even before writing a patch, somebody needs to come up with a
sensible design first.  --[no-]keep-cr is about "because transfer of
e-mail messages between MTAs and to the receiving MUA is defined in
terms of CRLF delimited lines per RFC, Git cannot tell if the CRLF
in the input was meant to be part of the patch (i.e. the diff is
describing a change between preimage and postimage of a file that
uses CRLF line endings) or they are cruft added during transit.  By
default we favor LF endings so we will strip, but we leave an option
to keep CRs at the end of lines".  

What you are asking for is quite different, isn't it?  "We know the
CRLF in the payload is from the original because they were protected
from getting munged during the transfer by being MIME-encased.
Please tell Git to preprocess that payload to convert CRLF to LF
before treating it as a patch".

So, if you are imagining to change the meaning of --[no-]keep-cr, I
do not think it will fly (that is why I said that we need a sensible
design before a patch).

And by stepping back a bit like so, and once we start viewing this
as "after receiving a piece of e-mail from MUA (where --[no-]keep-cr
may affect the outermost CRLF line endings) and unwrapping possible
MIME-encasing, we can optionally tell Git to pass the payload
further through a preprocess filter", we'd realize that this does
not have to be limited to just running dos2unix (you may want to run
iconv to fix encodings, for example), which would mean that the new
flag may not just want to be --strip-cr, which is too limiting, but
rather want to be --filter-message=<how> where <how> could be one of
the canned preprocess filter (among which your dos2unix may exist)
or an external script.

I am not saying that "--filter-message=<how>" must be the "sensible
design" I mentioned at the beginning of this message---the above is
to illustrate what kind of thought needs to go in before even the
first line of the patch gets written.

Thanks.

  reply	other threads:[~2020-01-06 17:08 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-18 11:42 git-am doesn't strip CRLF line endings when the mbox is base64-encoded George Dunlap
2019-12-18 12:15 ` George Dunlap
2019-12-18 19:41   ` Todd Zullinger
2020-01-06 11:58   ` George Dunlap
2020-01-06 17:07     ` Junio C Hamano [this message]
2020-01-06 18:10       ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqpnfwa4y9.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=george.dunlap@citrix.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).