git.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@osdl.org>
To: Paul Eggert <eggert@CS.UCLA.EDU>
Cc: Junio C Hamano <junkio@cox.net>,
	Robert Fitzsimons <robfitz@273k.net>,
	Alex Riesen <raa.lkml@gmail.com>,
	git@vger.kernel.org, Kai Ruemmler <kai.ruemmler@gmx.net>
Subject: Re: [PATCH] Try URI quoting for embedded TAB and LF in pathnames
Date: Tue, 11 Oct 2005 08:17:25 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0510110802470.14597@g5.osdl.org>
In-Reply-To: <87mzlgh8xa.fsf@penguin.cs.ucla.edu>



On Mon, 10 Oct 2005, Paul Eggert wrote:
> 
> An issue I hadn't really had time to think about is the character
> encoding of file names.

Please don't. Use filenames as if they are just binary blobs of data, 
that's the only thing that has a high chance of success. Yes, it too can 
break in the presence of something _else_ doing character translation 
and/or people moving a patch from one encoding to another, but that's 
just true of anything.

Eventually everybody will hopefully use UTF-8, and nothing else really 
matters, but the thing is, if you see filenames as just blobs of data, 
that works with UTF-8 too, so it's not "wrong" even in the long run. And 
until everybody has one single encoding, you simply won't be able to tell, 
and the likelihood that you'd screw up is pretty high.

The happy part of the "binary blob" approach is that users _understand_ 
it. People who actively use different encoding formats are (painfully) 
aware of conversions, and they may curse you for not doing the random 
encoding format of the day, but they will be able to handle it.

In contrast, if you start doing conversions, I guarantee you that people 
will _not_ be able to handle it when you do something strange - you've 
changed the data.

Personally, I'd like the normal C quoting the best. Leave space as-is, and 
quote TAB/NL as \t and \n respectively. It's pretty universally understood 
in programming circles even outside of C, and it's not like a very 
uncommon patch format like that really needs to be well-understood outside 
of those circles.

It also has a very obvious and ASCII-safe format for other characters (ie 
just the normal octal escapes: \377 etc).
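
To make the quoting rules above concrete, here is a minimal sketch in Python. It assumes the filename is handled as raw bytes and never decoded. The helper name `c_quote` is hypothetical, and escaping the backslash itself is an assumption added so the output stays unambiguous; none of this is taken from git's actual code.

```python
def c_quote(name: bytes) -> str:
    """Quote a filename, given as raw bytes, with C-style escapes.

    Hypothetical helper for illustration only.
    """
    out = []
    for b in name:
        if b == 0x09:
            out.append("\\t")         # TAB becomes \t
        elif b == 0x0A:
            out.append("\\n")         # NL becomes \n
        elif b == 0x5C:
            out.append("\\\\")        # assumption: escape the backslash too
        elif 0x20 <= b <= 0x7E:
            out.append(chr(b))        # printable ASCII, including space, as-is
        else:
            out.append("\\%03o" % b)  # everything else as a 3-digit octal escape
    return "".join(out)


print(c_quote(b"my file\twith\nodd\xffbytes"))
# prints: my file\twith\nodd\377bytes
```

Because the function only ever looks at byte values, it gives the same answer whether the name happens to be UTF-8, Latin-1, or something else, and the output is plain ASCII in every case.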

That said, I personally don't think it's necessarily even worth it. If 
somebody wants to use names with tabs and newlines, is he really going to 
work with diffs? Or is it just a driver error?

			Linus


Thread overview: 33+ messages
2005-10-07 19:35 [RFC] embedded TAB and LF in pathnames Junio C Hamano
2005-10-07 23:29 ` Alex Riesen
2005-10-07 23:44   ` Junio C Hamano
2005-10-08  6:45     ` Alex Riesen
2005-10-08  9:10       ` Junio C Hamano
2005-10-08 13:30         ` [PATCH] Try URI quoting for " Robert Fitzsimons
2005-10-08 18:30           ` Junio C Hamano
2005-10-08 20:19             ` Junio C Hamano
2005-10-11  6:20               ` Paul Eggert
2005-10-11  7:37                 ` Junio C Hamano
2005-10-11 15:17                 ` Linus Torvalds [this message]
2005-10-11 18:03                   ` Paul Eggert
2005-10-11 18:37                     ` Linus Torvalds
2005-10-11 19:42                       ` Paul Eggert
2005-10-11 20:56                         ` Linus Torvalds
2005-10-12  6:51                           ` Paul Eggert
2005-10-12 14:59                             ` Linus Torvalds
2005-10-12 19:07                               ` Daniel Barkalow
2005-10-12 19:52                                 ` Linus Torvalds
2005-10-12 20:21                                   ` H. Peter Anvin
     [not found]                               ` <87vf02qy79.fsf@penguin.cs.ucla.edu>
2005-10-12 21:02                                 ` Junio C Hamano
2005-10-12 21:05                                 ` Linus Torvalds
2005-10-12 21:09                                   ` H. Peter Anvin
2005-10-12 21:15                                   ` Johannes Schindelin
2005-10-12 21:33                                   ` Junio C Hamano
2005-10-14  0:57                                   ` Paul Eggert
2005-10-14  5:43                                     ` Linus Torvalds
2005-10-12 21:24                                 ` Linus Torvalds
2005-10-14  0:16                                   ` Paul Eggert
2005-10-14  5:20                                     ` Linus Torvalds
2005-10-14 17:18                                       ` H. Peter Anvin
2005-10-14  6:59                                 ` Junio C Hamano
2005-10-09 10:42           ` Junio C Hamano
