From: Junio C Hamano <firstname.lastname@example.org>
To: Paul Eggert <eggert@CS.UCLA.EDU>
Cc: email@example.com
Subject: Re: [PATCH] Try URI quoting for embedded TAB and LF in pathnames
Date: Tue, 11 Oct 2005 00:37:57 -0700
Message-ID: <firstname.lastname@example.org>
In-Reply-To: <email@example.com> (Paul Eggert's message of "Mon, 10 Oct 2005 23:20:01 -0700")

Paul Eggert <eggert@CS.UCLA.EDU> writes:

> The convention I had been thinking of adding is to have GNU diff
> use shell-quoting style, e.g.,
>
>     'three
>     o'\''clock'
>
> to represent a file name with a newline and an apostrophe in it.
> This sort of file name can be cut and pasted into the shell.
> The quoting could be used with any file name containing a
> troublesome character.
>
> Perhaps another quoting style would be better.

A patch header (both "diff --git" line and ---/+++ lines) I've
been considering, and have in the proposed updates branch, looks
something like this:

    diff --git a/def\nghi/pqr b/dee/pqr
    similarity index 72%
    rename from def\nghi/pqr
    rename to dee/pqr
    index 9ee055c..243fbbc 100644
    --- a/def\nghi/pqr
    +++ b/dee/pqr
    @@ -1 +1,3 @@
     Fri Oct 7 23:19:04 PDT 2005
    +foo
    +foo

If we can keep things on one line, that would make parsing very
simple, but more importantly, it is easier to see what's
happening.  The pattern is the same whether you have funny
pathnames or not, and that helps the human consumer.

Adjusting the "git diff" output to the style GNU diff would use
with your shell quoting would produce something like this:

    diff --git 'a/def
    ghi/pqr' b/dee/pqr
    similarity index 72%
    rename from 'def
    ghi/pqr'
    rename to dee/pqr
    index 9ee055c..243fbbc 100644
    --- 'a/def
    ghi/pqr'
    +++ b/dee/pqr
    @@ -1 +1,3 @@
     Fri Oct 7 23:19:04 PDT 2005
    +foo
    +foo

Which, while it is possible to make tools parse it, is very
distracting for humans to read and review.  Yes, LF is quoted,
but it still breaks the line, disrupting the pattern we are used
to seeing.
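
[Editorial illustration, not part of the original message.]  The
difference between the two header styles above can be sketched in
Python.  This is a rough sketch under assumptions: `c_style_quote` is
a hypothetical helper that only approximates the backslash convention
shown in the first header, and `shlex.quote` merely stands in for
shell quoting (it spells an embedded apostrophe differently from the
GNU example quoted above).

```python
import shlex


def c_style_quote(path: str) -> str:
    """Hypothetical helper: escape backslash, TAB and LF so the path stays on one line."""
    return (
        path.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
    )


funny = "a/def\nghi/pqr"

shell_header = "diff --git " + shlex.quote(funny) + " b/dee/pqr"
escaped_header = "diff --git " + c_style_quote(funny) + " b/dee/pqr"

# Shell quoting wraps the name in single quotes but keeps the raw LF,
# so the header is spread over two physical lines.
print(shell_header.count("\n"))    # 1 embedded line break

# Backslash escaping leaves no raw LF, so the header is one line.
print(escaped_header.count("\n"))  # 0 embedded line breaks
```

A line-oriented search such as "/^diff --git .*" sees the whole
escaped header, but only the first half of the shell-quoted one.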

If you are talking about a funny file, whose name is
"a\ndiff --git a/b/c", your diff would look like this:

    diff --git 'a/
    diff --git a/b/c' 'b/
    diff --git a/b/c'
    index 9ee055c..243fbbc 100644
    --- 'a/
    diff --git a/b/c'
    +++ 'b/
    diff --git a/b/c'
    @@ -1 +1,3 @@
     Fri Oct 7 23:19:04 PDT 2005
    +foo
    +foo

We are used to telling the "less" command to do "/^diff --git .*"
while reviewing patches.  The shell quoting, while I admit I
learned its beauty from you, is a disaster for human consumption.

For diff output quoting purposes, LF is the only thing that
matters, as you mentioned in another message to me.  Our parsing
side ("GNU patch" counterpart) checks two pathnames on the
"diff --git" line and makes sure what follows a/ and b/ are
consistent (that is, they should be identical, or each is the
same as "rename from" and "rename to"), so there is no ambiguity.

But again for human consumption purposes, we cannot easily tell
SP and TAB apart by just reading, and a TAB is such an unusual
character to have in a pathname (as opposed to SP, which is not
that uncommon) that we may be better off making it visible.

Quoting TAB incidentally has an added benefit, which you as a GNU
diff/patch person would probably not care too much about.  Our
other tools sometimes need to show two paths in one record, and
TAB is used as the field separator between two paths (LF is the
record separator).  The tools do have a '-z' mode to let us use
anything but NUL in the pathname, and carefully written scripts
tend to run them with the '-z' flag and use Perl or Python to
parse paths out, but it would be nicer if we did not always have
to.

For example, the 'git commit' command prepares the log editor
with the status information about changes being committed, and
needs to mention paths.  This is purely for human consumption,
and showing something like:

    # Type commit message to this file.  Lines that start
    # with '#' are ignored.
    #
    # Updated but not checked in:
    #   (will commit)
    #
    #	new file: ab\n\tc/mno
    #	modified: abc/mno
    #	renamed: def\nghi/pqr -> dee/pqr
    ...

is perfectly readable for human users, and can be done without
running the tool in '-z' mode, if the tool output is quoted with
the '\n' and '\t' convention -- the parsing and formatting side
can just split the fields at TAB and show them, without worrying
about an embedded LF making the rest of the pathname spill over
to the next line.

And once we start teaching the user that we represent funny
characters in their paths this way, it becomes nicer to be
consistent in the diff output as well.
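
[Editorial illustration, not part of the original message.]  The
"split the fields at TAB" idea above might look roughly like this in
Python.  It is a sketch under assumptions, not how any git tool is
implemented: all four function names are hypothetical, and escaping
the backslash itself is an added assumption (the message only
mentions '\n' and '\t') so that quoting can be reversed
unambiguously.

```python
def quote_path(path: str) -> str:
    """Escape backslash first, then TAB and LF (assumed convention)."""
    return path.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


def unquote_path(text: str) -> str:
    """Reverse quote_path by scanning left to right."""
    escapes = {"n": "\n", "t": "\t", "\\": "\\"}
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in escapes:
            out.append(escapes[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def format_record(src: str, dst: str) -> str:
    """One record: two quoted paths separated by TAB, terminated by LF."""
    return quote_path(src) + "\t" + quote_path(dst) + "\n"


def parse_records(stream: str):
    """Split records at LF and fields at TAB, then unquote each field."""
    for line in stream.split("\n"):
        if line:
            src, dst = line.split("\t")
            yield unquote_path(src), unquote_path(dst)


paths = [("def\nghi/pqr", "dee/pqr"), ("ab\n\tc/mno", "abc/mno")]
stream = "".join(format_record(s, d) for s, d in paths)
assert list(parse_records(stream)) == paths
```

Because no quoted field contains a raw TAB or LF, the plain split is
safe even for the funny pathnames, without resorting to a NUL-based
'-z' style format.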