From: Paul Eggert <eggert@CS.UCLA.EDU>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Junio C Hamano <junkio@cox.net>,
	Robert Fitzsimons <robfitz@273k.net>,
	Alex Riesen <raa.lkml@gmail.com>,
	git@vger.kernel.org, Kai Ruemmler <kai.ruemmler@gmx.net>
Subject: Re: [PATCH] Try URI quoting for embedded TAB and LF in pathnames
Date: Thu, 13 Oct 2005 17:16:57 -0700
Message-ID: <87irw1q7eu.fsf@penguin.cs.ucla.edu>
In-Reply-To: <Pine.LNX.4.64.0510121411550.15297@g5.osdl.org> (Linus Torvalds's message of "Wed, 12 Oct 2005 14:24:31 -0700 (PDT)")

Linus Torvalds <torvalds@osdl.org> writes:

> So I repeat: 
>  - escape as little as possible
>  - make the _viewer_ decide how to view it.

Under my most recent proposal, the only bytes one must escape are ",
\, and LF.  Doesn't that satisfy these two main criteria?
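
As an illustration only: the message names the three bytes that must be escaped but does not spell out the escape syntax, so the sketch below assumes C-style backslash escapes (`\"`, `\\`, `\n`) inside double quotes. The function names `quote_path` and `unquote_path` are hypothetical and are not taken from git or GNU diff.

```python
def quote_path(path: bytes) -> bytes:
    """Quote a pathname, escaping only '"', backslash, and LF.

    Assumption: C-style escapes inside double quotes. Every other byte,
    including TAB and non-ASCII bytes, passes through unchanged.
    """
    escapes = {0x22: b'\\"', 0x5C: b"\\\\", 0x0A: b"\\n"}
    out = bytearray(b'"')
    for byte in path:
        out += escapes.get(byte, bytes([byte]))
    out += b'"'
    return bytes(out)


def unquote_path(quoted: bytes) -> bytes:
    """Reverse quote_path; reject anything that is not well-formed."""
    if len(quoted) < 2 or quoted[:1] != b'"' or quoted[-1:] != b'"':
        raise ValueError("not a quoted pathname")
    unescapes = {0x22: 0x22, 0x5C: 0x5C, ord("n"): 0x0A}
    body = quoted[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        byte = body[i]
        if byte == 0x5C:
            # A backslash must be followed by one of the three known escapes.
            i += 1
            if i >= len(body) or body[i] not in unescapes:
                raise ValueError("bad escape sequence")
            out.append(unescapes[body[i]])
        else:
            out.append(byte)
        i += 1
    return bytes(out)
```

Under these assumptions a quoted name never contains a raw LF, so it always fits on one line of diff output, while the viewer still sees every other byte exactly as stored.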


> If GNU emacs does locale translations rather than just do a binary
> transfer of the data, then that's a sign that GNU emacs is being
> really stupid.

Perhaps so, but it has a lot of company.  I have even worse problems
with Mozilla Thunderbird.  And as we observed, Pine also has problems
sending properly-formatted email containing arbitrary binary data.

I suspect the vast majority of email clients will screw up in
relatively common cases involving unusual characters in file names.
Using attachments avoids many of the problems, but lots of patches are
emailed inline and I'd rather not force people to use attachments to
send diffs.


> I find that email is very robust - it's basically 8-bit clean. No 
> character encoding, no crap. Just a byte stream. It really _is_ the most 
> reliable format.

Hmm.  To test that theory, I just now sent plain-text email to myself,
containing a carriage-return (CR) byte in the middle of a line.

The CR byte was transliterated into a LF.  Oops.

This was the very first (and only) test I tried, which isn't a good
sign for reliability.  If you're curious, I tracked the problem down
to Exim, a popular mail transfer agent that is running on my personal
Debian GNU/Linux (stable) box.  As to why Exim munges email, please see
<http://www.exim.org/exim-html-4.40/doc/html/spec_44.html#SECT44.1>.
(And I didn't know about the Exim glitch before trying my test.
I'm normally a Sendmail man myself.)

More generally, I suspect inline patches with weird bytes will suffer
greatly from encoding and recoding by mail agents.
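
To make the failure mode concrete, here is a small simulation. It is not Exim's code: `munge_bare_cr` is a hypothetical stand-in for any agent that rewrites a bare CR as LF, and the sample patch line is invented for illustration.

```python
import re


def munge_bare_cr(message: bytes) -> bytes:
    """Hypothetical mail agent: turn each CR not followed by LF into LF."""
    return re.sub(rb"\r(?!\n)", b"\n", message)


# One added line of an inline patch whose content contains a CR byte.
sent = b"+file name with\rcarriage return\n"
received = munge_bare_cr(sent)

# The byte stream is no longer the one that was sent, and what was a
# single patch line now arrives as two.
changed = received != sent
lines_sent = sent.count(b"\n")
lines_received = received.count(b"\n")
```

Once the line count of a hunk changes like this, the receiver no longer has the bytes the sender intended, regardless of how the viewer chooses to display them.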


> What matters is not what it looks like, but what it _saves_ as. If
> you save the email message, it should come out as the same reliable
> 8-bit byte stream

Unfortunately this isn't true for Emacs, and I suspect other mailers
will have similar problems.  For example, with Emacs I can easily save
either the exact byte-for-byte message body that my mail transfer
agent gave me; or I can have Emacs decode the message into its
constituent characters, reencode the result as UTF-8, and put that
into a file.  In neither case, though, am I saving the original byte
stream that you presented to your mail user agent.  Even if I save the
byte-for-byte message body, it is often in quoted-printable format so
I'll have to decode strings like "=EF" to recover the original bytes.
This is doable, yes, but it's inconvenient in practice, at least with
the mail user agents I'm familiar with.  And even if I do it, I don't
necessarily have the same byte stream you gave your mail user agent; I
merely have the byte stream that your MUA gave to your MTA, and these
may not be the same thing (they certainly aren't always the same thing
with Emacs).
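
For reference, the quoted-printable decoding step mentioned above can be sketched with Python's standard `quopri` module. The sample body is made up; only the "=EF" sequence comes from the text above.

```python
import quopri

# A saved message body in quoted-printable form (hypothetical sample).
saved_body = b"rename na=EFve.txt\n"

# Decoding turns "=EF" back into the single byte 0xEF.
decoded = quopri.decodestring(saved_body)

# Encoding and decoding again round-trips the bytes that were encoded.
round_trip = quopri.decodestring(quopri.encodestring(decoded))
```

This recovers only the bytes that were quoted-printable encoded, that is, what the sending MUA handed to its MTA. If the MUA recoded the text before that point, decoding cannot undo it.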


The simplest fix for git may be to say "Don't use inline patches; use
attachments if you must email anything with strange characters in it."
That's fine.  But I prefer a format that also allows GNU diff, if it
chooses, to generate output that resists common inline-email botches.
