All of lore.kernel.org
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, me@ttaylorr.com, newren@gmail.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
Date: Fri, 24 Jan 2020 16:10:21 -0500	[thread overview]
Message-ID: <d4fe0b4a-9e3f-6669-9068-d7b4b8934d1b@gmail.com> (raw)
In-Reply-To: <20200114224818.GD3957260@coredump.intra.peff.net>

On 1/14/2020 5:48 PM, Jeff King wrote:
> On Tue, Jan 14, 2020 at 05:11:03PM -0500, Derrick Stolee wrote:
> 
>>> Do we need to document these rules somewhere? Naively I'd expect
>>> "--stdin" to take in literal pathnames. But of course it can't represent
>>> a path with a newline. So perhaps it makes sense to take quoted names by
>>> default, and allow literal NUL-separated input with "-z" if anybody
>>> wants it.
>>
>> This is worth thinking about the right way to describe the rules:
>>
>> 1. You don't _need_ quotes. They happen to come along for the ride in
>>   'git ls-tree' so it doesn't mess up shell scripts that iterate on
>>   those entries. At least, that's why I think they are quoted.
> 
> It's not just shell scripts. Without quoting, the syntax becomes
> ambiguous (e.g., imagine a file with a newline in it). So most Git
> output that shows a filename will quote it if necessary, unless
> NUL separators are being used.

Good to know.

>> 2. If you use quotes, the first layer of quotes will be removed.
> 
> I take this to mean that anything starting with a double-quote will have
> the outer layer removed, and backslash escapes inside expanded. And
> anything without a starting double quote (even if it has internal
> backslash escapes!) will be taken literally.

Hm. Perhaps you are right! The ls-tree output for the test example
is:

	deep
	folder1
	folder2
	"zbad\\dir"
	zdoes*exist

so the "zdoes*exist" value is not escaped. I believe the current
logic does something extra: consider supplying this input to
'git sparse-checkout set --stdin':

	deep
	folder1
	folder2
	"zbad\\dir"
	zdoes\*exist

then should we un-escape "\*" to "*"? Or is this not a valid input
because a backslash should have been quoted into C-style quotes?

The behavior in the current series allows this output that would
never be written by "git ls-tree".

> That would match how things like "update-index --index-info" work.
> 
> As far as implementation, I know you're trying to keep some of the
> escaping, but I think it might make more sense to do use
> unquote_c_style() to parse the input (see update-index's use for some
> prior art), and then re-quote as necessary to put things into the
> sparse-checkout file (I guess quoting more than just quote_c_style()
> would do, since you need to quote glob metacharacters like '*' and
> probably "!"). But as much as possible, I think you'd want literal
> strings inside the program, and just quoting/unquoting at the edges.

I was playing around with this, and I think that quote_c_style() is
necessary for the output, but we have a strange in-memory situation
for the other escaping: we both fill the hashsets with the un-escaped
data and fill the pattern list with the escaped patterns.

I'll add a commit with the quote_c_style() calls during the 'list'
subcommand.

>> How much of this needs to be documented explicitly, or how much should
>> we say "The input format matches what we would expect from 'git ls-tree
>> --name-only'"?
> 
> I think it's fine to say that, and maybe call attention to the quoting.
> Like:
> 
>   The input format matches the output of `git ls-tree --name-only`. This
>   includes interpreting pathnames that begin with a double quote (") as
>   C-style quoted strings.
> 
> Disappointingly, update-index does not seem to explain the rules
> anywhere. fast-import does cover it. Maybe it's something that ought to
> be hoisted out into gitcli(7) or similar (or maybe it has been and I
> just can't find it).

I'll start the process by using your recommended language. I noticed
also that the 'set' command doesn't actually document what happens
when in cone mode, so I will correct that, too.

Thanks,
-Stolee


  reply	other threads:[~2020-01-24 21:10 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
2020-01-14 19:25 ` [PATCH 1/8] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-16 21:40   ` Junio C Hamano
2020-01-14 19:25 ` [PATCH 2/8] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-16 21:46   ` Junio C Hamano
2020-01-14 19:25 ` [PATCH 3/8] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-14 19:30   ` Taylor Blau
2020-01-14 19:25 ` [PATCH 4/8] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-14 21:16   ` Jeff King
2020-01-14 19:25 ` [PATCH 5/8] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-14 19:26 ` [PATCH 6/8] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
2020-01-14 19:26 ` [PATCH 7/8] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-14 21:21   ` Jeff King
2020-01-14 22:08     ` Derrick Stolee
2020-01-14 19:26 ` [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-14 21:25   ` Jeff King
2020-01-14 22:11     ` Derrick Stolee
2020-01-14 22:48       ` Jeff King
2020-01-24 21:10         ` Derrick Stolee [this message]
2020-01-24 21:42           ` Jeff King
2020-01-28 15:03             ` Derrick Stolee
2020-01-14 19:34 ` [PATCH 0/8] Harden the sparse-checkout builtin Taylor Blau
2020-01-14 19:44   ` Derrick Stolee
2020-01-14 21:31     ` Jeff King
2020-01-15 19:16 ` Junio C Hamano
2020-01-15 20:32   ` Derrick Stolee
2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-29 10:03       ` Jeff King
2020-01-29 13:58         ` Derrick Stolee
2020-01-29 14:04           ` Derrick Stolee
2020-01-28 18:26     ` [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-29 10:17       ` Jeff King
2020-01-29 10:33         ` Jeff King
2020-01-29 14:16           ` Derrick Stolee
2020-01-29 14:39             ` Derrick Stolee
2020-01-30  7:29             ` Jeff King
2020-01-30 15:01               ` Derrick Stolee
2020-01-28 18:26     ` [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
2020-01-29 10:23       ` Jeff King
2020-01-28 18:26     ` [PATCH v3 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 01/15] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 02/15] t1091: improve here-docs Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 03/15] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 04/15] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 05/15] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 06/15] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 07/15] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 08/15] sparse-checkout: warn on globs in cone patterns Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 09/15] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 10/15] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 11/15] sparse-checkout: unquote C-style strings over --stdin Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 12/15] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 13/15] sparse-checkout: escape all glob characters on write Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 14/15] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 15/15] sparse-checkout: fix cone mode behavior mismatch Derrick Stolee via GitGitGadget
2020-01-31 20:36       ` [PATCH v4 00/15] Harden the sparse-checkout builtin Elijah Newren
2020-02-03 14:09         ` Derrick Stolee
2020-02-08 23:32           ` Taylor Blau
2020-02-09 17:27             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d4fe0b4a-9e3f-6669-9068-d7b4b8934d1b@gmail.com \
    --to=stolee@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.