All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] cat-file: support NUL-delimited input with `-z`
@ 2022-07-22 23:28 Taylor Blau
  2022-07-22 23:29 ` [PATCH 1/2] t1006: extract --batch-command inputs to variables Taylor Blau
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Taylor Blau @ 2022-07-22 23:28 UTC (permalink / raw)
  To: git; +Cc: gitster

This short patch series implements a `-z` mode that is compatible with
the `--batch`, `--batch-check`, and new `--batch-command` options in
`cat-file`.

This came out of a request from a colleague to support `cat-file`
invocations that refer to a tree entry whose name includes a newline
character.

The implementation is mostly straightforward, though the second patch
(which contains the main substance of this change) has a few additional
thoughts on areas for future cleanup. The first patch is preparatory, but
could easily be squashed, too.

Thanks in advance for your review!

Taylor Blau (2):
  t1006: extract --batch-command inputs to variables
  builtin/cat-file.c: support NUL-delimited input with `-z`

 Documentation/git-cat-file.txt |  7 +++-
 builtin/cat-file.c             | 28 +++++++++++--
 t/t1006-cat-file.sh            | 72 +++++++++++++++++++++++++++-------
 3 files changed, 88 insertions(+), 19 deletions(-)

-- 
2.37.0.1.g1379af2e9d

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] t1006: extract --batch-command inputs to variables
  2022-07-22 23:28 [PATCH 0/2] cat-file: support NUL-delimited input with `-z` Taylor Blau
@ 2022-07-22 23:29 ` Taylor Blau
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
  2022-07-23  4:44 ` [PATCH 0/2] cat-file: " Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Taylor Blau @ 2022-07-22 23:29 UTC (permalink / raw)
  To: git; +Cc: gitster

A future commit will want to ensure that various `--batch`-related
options produce the same output whether their input is newline
terminated, or NUL terminated (and a to-be-implemented `-z` option
exists).

To prepare for this, extract the given input(s) into separate variables
to that their LF characters can easily be converted into NUL bytes when
testing the new `-z` mode.

This is consistent with other tests in t1006 (which these days is no
longer a shining example of our CodingGuidelines).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t1006-cat-file.sh | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index dadf3b1458..01c0535765 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -418,6 +418,12 @@ test_expect_success "--batch-check with multiple sha1s gives correct format" '
     "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)"
 '
 
+batch_command_multiple_info="info $hello_sha1
+info $tree_sha1
+info $commit_sha1
+info $tag_sha1
+info deadbeef"
+
 test_expect_success '--batch-command with multiple info calls gives correct format' '
 	cat >expect <<-EOF &&
 	$hello_sha1 blob $hello_size
@@ -427,17 +433,18 @@ test_expect_success '--batch-command with multiple info calls gives correct form
 	deadbeef missing
 	EOF
 
-	git cat-file --batch-command --buffer >actual <<-EOF &&
-	info $hello_sha1
-	info $tree_sha1
-	info $commit_sha1
-	info $tag_sha1
-	info deadbeef
-	EOF
+	echo "$batch_command_multiple_info" >in &&
+	git cat-file --batch-command --buffer <in >actual &&
 
 	test_cmp expect actual
 '
 
+batch_command_multiple_contents="contents $hello_sha1
+contents $commit_sha1
+contents $tag_sha1
+contents deadbeef
+flush"
+
 test_expect_success '--batch-command with multiple command calls gives correct format' '
 	remove_timestamp >expect <<-EOF &&
 	$hello_sha1 blob $hello_size
@@ -449,13 +456,8 @@ test_expect_success '--batch-command with multiple command calls gives correct f
 	deadbeef missing
 	EOF
 
-	git cat-file --batch-command --buffer >actual_raw <<-EOF &&
-	contents $hello_sha1
-	contents $commit_sha1
-	contents $tag_sha1
-	contents deadbeef
-	flush
-	EOF
+	echo "$batch_command_multiple_contents" >in &&
+	git cat-file --batch-command --buffer <in >actual_raw &&
 
 	remove_timestamp <actual_raw >actual &&
 	test_cmp expect actual
-- 
2.37.0.1.g1379af2e9d


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:28 [PATCH 0/2] cat-file: support NUL-delimited input with `-z` Taylor Blau
  2022-07-22 23:29 ` [PATCH 1/2] t1006: extract --batch-command inputs to variables Taylor Blau
@ 2022-07-22 23:29 ` Taylor Blau
  2022-07-22 23:41   ` Chris Torek
                     ` (4 more replies)
  2022-07-23  4:44 ` [PATCH 0/2] cat-file: " Junio C Hamano
  2 siblings, 5 replies; 15+ messages in thread
From: Taylor Blau @ 2022-07-22 23:29 UTC (permalink / raw)
  To: git; +Cc: gitster

When callers are using `cat-file` via one of the stdin-driven `--batch`
modes, all input is newline-delimited. This presents a problem when
callers wish to ask about, e.g. tree-entries that have a newline
character present in their filename.

To support this niche scenario, introduce a new `-z` mode to the
`--batch`, `--batch-check`, and `--batch-command` suite of options that
instructs `cat-file` to treat its input as NUL-delimited, allowing the
individual commands themselves to have newlines present.

The refactoring here is slightly unfortunate, since we turn loops like:

    while (strbuf_getline(&buf, stdin) != EOF)

into:

    while (1) {
        int ret;
        if (opt->nul_terminated)
            ret = strbuf_getline_nul(&input, stdin);
        else
            ret = strbuf_getline(&input, stdin);

        if (ret == EOF)
            break;
    }

It's tempting to think that we could use `strbuf_getwholeline()` and
specify either `\n` or `\0` as the terminating character. But for input
on platforms that include a CR character preceeding the LF, this
wouldn't quite be the same, since `strbuf_getline(...)` will trim any
trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.

In the future, we could clean this up further by introducing a variant
of `strbuf_getwholeline()` that addresses the aforementioned gap, but
that approach felt too heavy-handed for this pair of uses.

Some tests are added in t1006 to ensure that `cat-file` produces the
same output in `--batch`, `--batch-check`, and `--batch-command` modes
with and without the new `-z` option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/git-cat-file.txt |  7 +++++-
 builtin/cat-file.c             | 28 ++++++++++++++++++++---
 t/t1006-cat-file.sh            | 42 +++++++++++++++++++++++++++++++++-
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 24a811f0ef..3515350ed6 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -14,7 +14,7 @@ SYNOPSIS
 'git cat-file' (-t | -s) [--allow-unknown-type] <object>
 'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
 	     [--buffer] [--follow-symlinks] [--unordered]
-	     [--textconv | --filters]
+	     [--textconv | --filters] [-z]
 'git cat-file' (--textconv | --filters)
 	     [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
 
@@ -207,6 +207,11 @@ respectively print:
 	/etc/passwd
 --
 
+-z::
+	Only meaningful with `--batch`, `--batch-check`, or
+	`--batch-command`; input is NUL-delimited instead of
+	newline-delimited.
+
 
 OUTPUT
 ------
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index f42782e955..c3602d15df 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -31,6 +31,7 @@ struct batch_options {
 	int all_objects;
 	int unordered;
 	int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
+	int nul_terminated;
 	const char *format;
 };
 
@@ -614,12 +615,20 @@ static void batch_objects_command(struct batch_options *opt,
 	struct queued_cmd *queued_cmd = NULL;
 	size_t alloc = 0, nr = 0;
 
-	while (!strbuf_getline(&input, stdin)) {
-		int i;
+	while (1) {
+		int i, ret;
 		const struct parse_cmd *cmd = NULL;
 		const char *p = NULL, *cmd_end;
 		struct queued_cmd call = {0};
 
+		if (opt->nul_terminated)
+			ret = strbuf_getline_nul(&input, stdin);
+		else
+			ret = strbuf_getline(&input, stdin);
+
+		if (ret)
+			break;
+
 		if (!input.len)
 			die(_("empty command in input"));
 		if (isspace(*input.buf))
@@ -763,7 +772,16 @@ static int batch_objects(struct batch_options *opt)
 		goto cleanup;
 	}
 
-	while (strbuf_getline(&input, stdin) != EOF) {
+	while (1) {
+		int ret;
+		if (opt->nul_terminated)
+			ret = strbuf_getline_nul(&input, stdin);
+		else
+			ret = strbuf_getline(&input, stdin);
+
+		if (ret == EOF)
+			break;
+
 		if (data.split_on_whitespace) {
 			/*
 			 * Split at first whitespace, tying off the beginning
@@ -866,6 +884,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 			N_("like --batch, but don't emit <contents>"),
 			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
 			batch_option_callback),
+		OPT_BOOL('z', NULL, &batch.nul_terminated, N_("stdin is NUL-terminated")),
 		OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"),
 			N_("read commands from stdin"),
 			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
@@ -921,6 +940,9 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
 	else if (batch.all_objects)
 		usage_msg_optf(_("'%s' requires a batch mode"), usage, options,
 			       "--batch-all-objects");
+	else if (batch.nul_terminated)
+		usage_msg_optf(_("'%s' requires a batch mode"), usage, options,
+			       "-z");
 
 	/* Batch defaults */
 	if (batch.buffer_output < 0)
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 01c0535765..23b8942edb 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -88,7 +88,8 @@ done
 
 for opt in --buffer \
 	--follow-symlinks \
-	--batch-all-objects
+	--batch-all-objects \
+	-z
 do
 	test_expect_success "usage: bad option combination: $opt without batch mode" '
 		test_incompatible_usage git cat-file $opt &&
@@ -100,6 +101,10 @@ echo_without_newline () {
     printf '%s' "$*"
 }
 
+echo_without_newline_nul () {
+	echo_without_newline "$@" | tr '\n' '\0'
+}
+
 strlen () {
     echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
 }
@@ -398,6 +403,12 @@ test_expect_success '--batch with multiple sha1s gives correct format' '
 	test "$(maybe_remove_timestamp "$batch_output" 1)" = "$(maybe_remove_timestamp "$(echo_without_newline "$batch_input" | git cat-file --batch)" 1)"
 '
 
+test_expect_success '--batch, -z with multiple sha1s gives correct format' '
+	echo_without_newline_nul "$batch_input" >in &&
+	test "$(maybe_remove_timestamp "$batch_output" 1)" = \
+	"$(maybe_remove_timestamp "$(git cat-file --batch -z <in)" 1)"
+'
+
 batch_check_input="$hello_sha1
 $tree_sha1
 $commit_sha1
@@ -418,6 +429,24 @@ test_expect_success "--batch-check with multiple sha1s gives correct format" '
     "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)"
 '
 
+test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
+    echo_without_newline_nul "$batch_check_input" >in &&
+    test "$batch_check_output" = "$(git cat-file --batch-check -z <in)"
+'
+
+test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
+	touch -- "newline${LF}embedded" &&
+	git add -- "newline${LF}embedded" &&
+	git commit -m "file with newline embedded" &&
+	test_tick &&
+
+	printf "HEAD:newline${LF}embedded" >in &&
+	git cat-file --batch-check -z <in >actual &&
+
+	echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect &&
+	test_cmp expect actual
+'
+
 batch_command_multiple_info="info $hello_sha1
 info $tree_sha1
 info $commit_sha1
@@ -436,6 +465,11 @@ test_expect_success '--batch-command with multiple info calls gives correct form
 	echo "$batch_command_multiple_info" >in &&
 	git cat-file --batch-command --buffer <in >actual &&
 
+	test_cmp expect actual &&
+
+	echo "$batch_command_multiple_info" | tr "\n" "\0" >in &&
+	git cat-file --batch-command --buffer -z <in >actual &&
+
 	test_cmp expect actual
 '
 
@@ -459,6 +493,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f
 	echo "$batch_command_multiple_contents" >in &&
 	git cat-file --batch-command --buffer <in >actual_raw &&
 
+	remove_timestamp <actual_raw >actual &&
+	test_cmp expect actual &&
+
+	echo "$batch_command_multiple_contents" | tr "\n" "\0" >in &&
+	git cat-file --batch-command --buffer -z <in >actual_raw &&
+
 	remove_timestamp <actual_raw >actual &&
 	test_cmp expect actual
 '
-- 
2.37.0.1.g1379af2e9d

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
@ 2022-07-22 23:41   ` Chris Torek
  2022-07-25 23:39     ` Taylor Blau
  2022-07-23  5:17   ` Ævar Arnfjörð Bjarmason
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Chris Torek @ 2022-07-22 23:41 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Git List, Junio C Hamano

On Fri, Jul 22, 2022 at 4:34 PM Taylor Blau <me@ttaylorr.com> wrote:
> The refactoring here is slightly unfortunate, since we turn loops like:
>
>     while (strbuf_getline(&buf, stdin) != EOF)
>
> into:
>
>     while (1) {
>         int ret;
>         if (opt->nul_terminated)
>             ret = strbuf_getline_nul(&input, stdin);
>         else
>             ret = strbuf_getline(&input, stdin);
>
>         if (ret == EOF)
>             break;
>     }
>
> It's tempting to think that we could use `strbuf_getwholeline()` and
> specify either `\n` or `\0` as the terminating character. ...

How about:

    int (*get)(struct strbuf *, FILE *);
    get = opt->nul_terminated ? strbuf_getline_nul : strbuf_getline;
    while (get(&buf, stdin) != EOF)

or similar?

Chris

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] cat-file: support NUL-delimited input with `-z`
  2022-07-22 23:28 [PATCH 0/2] cat-file: support NUL-delimited input with `-z` Taylor Blau
  2022-07-22 23:29 ` [PATCH 1/2] t1006: extract --batch-command inputs to variables Taylor Blau
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
@ 2022-07-23  4:44 ` Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2022-07-23  4:44 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

Taylor Blau <me@ttaylorr.com> writes:

> This short patch series implements a `-z` mode that is compatible with
> the `--batch`, `--batch-check`, and new `--batch-command` options in
> `cat-file`.

It's a shame that we forgot that people use strange "object name"
like "<tree>:<path>", when we originally intended to only take
simple object names, like fully-spelled hexadecimal.  And if we were
prepared to take <path>, we should have make sure we use the usual
cquote convention.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
  2022-07-22 23:41   ` Chris Torek
@ 2022-07-23  5:17   ` Ævar Arnfjörð Bjarmason
  2022-07-23 17:45     ` Junio C Hamano
  2022-07-23  5:35   ` Junio C Hamano
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-23  5:17 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster, John Cai


On Fri, Jul 22 2022, Taylor Blau wrote:

[I think John Cai would appreciate being CC'd on this]

> It's tempting to think that we could use `strbuf_getwholeline()` and
> specify either `\n` or `\0` as the terminating character. But for input
> on platforms that include a CR character preceeding the LF, this
> wouldn't quite be the same, since `strbuf_getline(...)` will trim any
> trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.

I commend the effort to maintain bug-for-bug compatibility, but do we
care about this distinction, or is it just an accident that we support
\r\n here in the first place?

This doesn't apply to the rest of cat-file directly, but the origin of
the recent --batch-command mode cdoe was to lift the same-ish code from
builtin/update-ref.c, whose \n or \0 mode does exactly that:

	while (!strbuf_getwholeline(&input, stdin, line_termination)) {

I.e. it doesn't support \r\n, just \n or \0.

Isn't that fine? I may be missing something, but why isn't it OK even on
platforms that use \r\n for their normal *.txt endings to only use \n or
\0 for this bit of protocol?

For the command mode at least this passes our tests:
	
	diff --git a/builtin/cat-file.c b/builtin/cat-file.c
	index f42782e955f..8646059472d 100644
	--- a/builtin/cat-file.c
	+++ b/builtin/cat-file.c
	@@ -614,12 +614,16 @@ static void batch_objects_command(struct batch_options *opt,
	 	struct queued_cmd *queued_cmd = NULL;
	 	size_t alloc = 0, nr = 0;
	 
	-	while (!strbuf_getline(&input, stdin)) {
	+	while (!strbuf_getwholeline(&input, stdin, '\n')) {
	 		int i;
	 		const struct parse_cmd *cmd = NULL;
	 		const char *p = NULL, *cmd_end;
	 		struct queued_cmd call = {0};
	 
	+		if (input.len > 0 && input.buf[input.len - 1] == '\n')
	+			--input.len;
	+		input.buf[input.len] = '\0';
	+
	 		if (!input.len)
	 			die(_("empty command in input"));
	 		if (isspace(*input.buf))

So maybe we should just do something like that instead? I.e. declare
that a mistake.

As for the rest of cat-file 05d5667fec9 (git-cat-file: Add --batch-check
option, 2008-04-23) documents that it's LF, not CR LF, ditto
git-cat-file.txt.

So isn't this just an accident in of us having used the strbuf_getline()
function to mean "\n", but actually it also does "\r\n".

Which is a really unfortunately named function b.t.w., since it sneaks
this bit of Windows portability into places that may not want it in the
first place.

>  strlen () {
>      echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
>  }
> @@ -398,6 +403,12 @@ test_expect_success '--batch with multiple sha1s gives correct format' '
>  	test "$(maybe_remove_timestamp "$batch_output" 1)" = "$(maybe_remove_timestamp "$(echo_without_newline "$batch_input" | git cat-file --batch)" 1)"
>  '
>  
> +test_expect_success '--batch, -z with multiple sha1s gives correct format' '
> +	echo_without_newline_nul "$batch_input" >in &&
> +	test "$(maybe_remove_timestamp "$batch_output" 1)" = \
> +	"$(maybe_remove_timestamp "$(git cat-file --batch -z <in)" 1)"

This...

> +'
> +
>  batch_check_input="$hello_sha1
>  $tree_sha1
>  $commit_sha1
> @@ -418,6 +429,24 @@ test_expect_success "--batch-check with multiple sha1s gives correct format" '
>      "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)"
>  '
>  
> +test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
> +    echo_without_newline_nul "$batch_check_input" >in &&
> +    test "$batch_check_output" = "$(git cat-file --batch-check -z <in)"

This....

> +'
> +
> +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
> +	touch -- "newline${LF}embedded" &&
> +	git add -- "newline${LF}embedded" &&
> +	git commit -m "file with newline embedded" &&
> +	test_tick &&
> +
> +	printf "HEAD:newline${LF}embedded" >in &&
> +	git cat-file --batch-check -z <in >actual &&
> +
> +	echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect &&

..and this hides git's exit code, better to pipe to a file, use test_cmp
etc. etc.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
  2022-07-22 23:41   ` Chris Torek
  2022-07-23  5:17   ` Ævar Arnfjörð Bjarmason
@ 2022-07-23  5:35   ` Junio C Hamano
  2022-07-25 23:50     ` Taylor Blau
  2022-07-31 15:50   ` Phillip Wood
  2022-08-11 11:52   ` Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2022-07-23  5:35 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

Taylor Blau <me@ttaylorr.com> writes:

> @@ -14,7 +14,7 @@ SYNOPSIS
>  'git cat-file' (-t | -s) [--allow-unknown-type] <object>
>  'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
>  	     [--buffer] [--follow-symlinks] [--unordered]
> -	     [--textconv | --filters]
> +	     [--textconv | --filters] [-z]

Is "-z" useful with any other option, or is it useful only in
combination with one of the three --batch-*?  The above suggests the
former.

> +test_expect_success '--batch, -z with multiple sha1s gives correct format' '
> +	echo_without_newline_nul "$batch_input" >in &&

I I recall [1/2] correctly, the input lacked the LF at the end.  In
the original "LF terminated" use converted to use these variables,
because $batch_*_input is "echo"ed to create the file "in", the lack
of LF at the end is a GOOD thing.

But here, echo_without_newline_nul is just a glorified "printf %s"
piped into tr to turn LF into NUL.  What is fed by printf into the
pipe lacks LF at the end, so the output from tr will not have NUL at
the end, either.

That might happen to work (because the EOF may be enough to signal
the end of the entire input, thus the last input item), but it does
not make the test case for "-z" exactly parallel to the line oriented
input.

> +test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
> +    echo_without_newline_nul "$batch_check_input" >in &&
> +    test "$batch_check_output" = "$(git cat-file --batch-check -z <in)"
> +'
> +
> +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
> +	touch -- "newline${LF}embedded" &&
> +	git add -- "newline${LF}embedded" &&
> +	git commit -m "file with newline embedded" &&
> +	test_tick &&
> +
> +	printf "HEAD:newline${LF}embedded" >in &&
> +	git cat-file --batch-check -z <in >actual &&

As I already said, I suspect that new users who know how our path
quoting works would expect c-quoted path would work just fine
without using "-z".  It is not a reason to refuse "-z" to exist,
though.

> @@ -436,6 +465,11 @@ test_expect_success '--batch-command with multiple info calls gives correct form
>  	echo "$batch_command_multiple_info" >in &&
>  	git cat-file --batch-command --buffer <in >actual &&
>  
> +	test_cmp expect actual &&
> +
> +	echo "$batch_command_multiple_info" | tr "\n" "\0" >in &&

This is what I would expect.  The _info variable lacks final LF,
which is supplied by "echo", so output from tr ends with NUL, which
mirrors the line-oriented input we used above.

> +	git cat-file --batch-command --buffer -z <in >actual &&
> +
>  	test_cmp expect actual
>  '
>  
> @@ -459,6 +493,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f
>  	echo "$batch_command_multiple_contents" >in &&
>  	git cat-file --batch-command --buffer <in >actual_raw &&
>  
> +	remove_timestamp <actual_raw >actual &&
> +	test_cmp expect actual &&
> +
> +	echo "$batch_command_multiple_contents" | tr "\n" "\0" >in &&
> +	git cat-file --batch-command --buffer -z <in >actual_raw &&
> +

Likewise.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-23  5:17   ` Ævar Arnfjörð Bjarmason
@ 2022-07-23 17:45     ` Junio C Hamano
  2022-07-25 23:44       ` Taylor Blau
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2022-07-23 17:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Taylor Blau, git, John Cai

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> This doesn't apply to the rest of cat-file directly, but the origin of
> the recent --batch-command mode cdoe was to lift the same-ish code from
> builtin/update-ref.c, whose \n or \0 mode does exactly that:
>
> 	while (!strbuf_getwholeline(&input, stdin, line_termination)) {
>
> I.e. it doesn't support \r\n, just \n or \0.
>
> Isn't that fine? I may be missing something, but why isn't it OK even on
> platforms that use \r\n for their normal *.txt endings to only use \n or
> \0 for this bit of protocol?
>
> So maybe we should just do something like that instead? I.e. declare
> that a mistake.

Good point.

> So isn't this just an accident in of us having used the strbuf_getline()
> function to mean "\n", but actually it also does "\r\n".
>
> Which is a really unfortunately named function b.t.w., since it sneaks
> this bit of Windows portability into places that may not want it in the
> first place.

getline() is to read a single "logical" line, so it is fine for it
to strip CRLF on platforms with CRLF, and to leave CR at the end of
line on a LF platform.  If the "protocol" is defined to use LF on
any platform (and allow a payload that ends with CR to be passed),
you can argue that it is a wrong helper to call.

But does that result in a sensible behaviour?  I am not sure.  Some
editors that are popular on LF platforms can produce CRLF files when
the user asks (either on purpose or by mistake), and when not using
the "-z" mode of input, I suspect that most users would expect CR at
the end of the "line" (terminated with LF on their platform of
choice) would be thrown away even on their LF platform, simply
because it is unlikely that the kind of input they are preparing can
usefully and legitimately end with CR, as their colleagues on CRLF
platforms may also have to produce similar input.  IOW, the presence
of CRLF platforms makes text lines that end with CR much less useful
everywhere.

And from that point of view, "getline() or getwholeline(NUL)" may be
a pattern that is practically more useful than the old "use always
getwholeline(NUL or LF)" convention that I invented more than 10
years ago.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:41   ` Chris Torek
@ 2022-07-25 23:39     ` Taylor Blau
  0 siblings, 0 replies; 15+ messages in thread
From: Taylor Blau @ 2022-07-25 23:39 UTC (permalink / raw)
  To: Chris Torek; +Cc: Git List, Junio C Hamano

On Fri, Jul 22, 2022 at 04:41:54PM -0700, Chris Torek wrote:
> > It's tempting to think that we could use `strbuf_getwholeline()` and
> > specify either `\n` or `\0` as the terminating character. ...
>
> How about:
>
>     int (*get)(struct strbuf *, FILE *);
>     get = opt->nul_terminated ? strbuf_getline_nul : strbuf_getline;
>     while (get(&buf, stdin) != EOF)
>
> or similar?

I thought about it, but it seemed cleaner to lift the ternary out to
capture the result, too, leading to the if/else-statement I sent in the
patch above.

If we wanted to change it, I'd probably suggest a more general-purpose
strbuf function that implemented this behavior through a single call.
But it sounds like supporting the carriage return character was a
mistake from the beginning, which simplifies this direction quite a bit,
anyways.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-23 17:45     ` Junio C Hamano
@ 2022-07-25 23:44       ` Taylor Blau
  2022-07-27 14:10         ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2022-07-25 23:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, git, John Cai

Hi Junio,

On Sat, Jul 23, 2022 at 10:45:05AM -0700, Junio C Hamano wrote:
> > So isn't this just an accident in of us having used the strbuf_getline()
> > function to mean "\n", but actually it also does "\r\n".
> >
> > Which is a really unfortunately named function b.t.w., since it sneaks
> > this bit of Windows portability into places that may not want it in the
> > first place.
>
> getline() is to read a single "logical" line, so it is fine for it
> to strip CRLF on platforms with CRLF, and to leave CR at the end of
> line on a LF platform.  If the "protocol" is defined to use LF on
> any platform (and allow a payload that ends with CR to be passed),
> you can argue that it is a wrong helper to call.

Reading your email up to this point makes me think that we should ignore
any CR bytes we see next to a LF. So by this point I think that we
should take Ævar's suggestion and call:

    strbuf_getwholeline(&buf, stdin, opt->nul_terminated ? '\0' : '\n');

But...

> But does that result in a sensible behaviour?  I am not sure.  Some
> editors that are popular on LF platforms can produce CRLF files when
> the user asks (either on purpose or by mistake), and when not using
> the "-z" mode of input, I suspect that most users would expect CR at
> the end of the "line" (terminated with LF on their platform of
> choice) would be thrown away even on their LF platform, simply
> because it is unlikely that the kind of input they are preparing can
> usefully and legitimately end with CR, as their colleagues on CRLF
> platforms may also have to produce similar input.  IOW, the presence
> of CRLF platforms makes text lines that end with CR much less useful
> everywhere.
>
> And from that point of view, "getline() or getwholeline(NUL)" may be
> a pattern that is practically more useful than the old "use always
> getwholeline(NUL or LF)" convention that I invented more than 10
> years ago.

This makes me think that we should retain support for dropping the CR
preceeeding a LF and treat it as a historical wart that we are stuck
supporting.

Do you have a preference? I am fine with either approach, FWIW.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-23  5:35   ` Junio C Hamano
@ 2022-07-25 23:50     ` Taylor Blau
  2022-07-27 14:20       ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Taylor Blau @ 2022-07-25 23:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Jul 22, 2022 at 10:35:43PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > @@ -14,7 +14,7 @@ SYNOPSIS
> >  'git cat-file' (-t | -s) [--allow-unknown-type] <object>
> >  'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
> >  	     [--buffer] [--follow-symlinks] [--unordered]
> > -	     [--textconv | --filters]
> > +	     [--textconv | --filters] [-z]
>
> Is "-z" useful with any other option, or is it useful only in
> combination with one of the three --batch-*?  The above suggests the
> former.

It only makes sense with `--batch`-related options. But doesn't the
above suggest the latter, not the former? That synopsis line begins
with:

    'git cat-file' (--batch | --batch-check | --batch-command) ...

which made me think that this was the invocation for batch-related
options, and only listed options that made sense with a `--batch` mode
of one kind or another.

> > +test_expect_success '--batch, -z with multiple sha1s gives correct format' '
> > +	echo_without_newline_nul "$batch_input" >in &&
>
> I I recall [1/2] correctly, the input lacked the LF at the end.  In
> the original "LF terminated" use converted to use these variables,
> because $batch_*_input is "echo"ed to create the file "in", the lack
> of LF at the end is a GOOD thing.
>
> But here, echo_without_newline_nul is just a glorified "printf %s"
> piped into tr to turn LF into NUL.  What is fed by printf into the
> pipe lacks LF at the end, so the output from tr will not have NUL at
> the end, either.
>
> That might happen to work (because the EOF may be enough to signal
> the end of the entire input, thus the last input item), but it does
> not make the test case for "-z" exactly parallel to the line oriented
> input.

I see what you're saying. And, yeah, I think it happens to work since we
treat EOF as marking the end of the last input element, regardless of
whether or not we saw a NUL byte or a LF (depending on whether or not we
passed `-z`).

I think the helper should probably be something more like:

    echo_with_nul () {
        echo "$@" | tr '\n' '\0'
    }

or similar. But as you note below, this is probably not even worth
extracting to a helper function.

'
> > +test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
> > +    echo_without_newline_nul "$batch_check_input" >in &&
> > +    test "$batch_check_output" = "$(git cat-file --batch-check -z <in)"
> > +'
> > +
> > +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
> > +	touch -- "newline${LF}embedded" &&
> > +	git add -- "newline${LF}embedded" &&
> > +	git commit -m "file with newline embedded" &&
> > +	test_tick &&
> > +
> > +	printf "HEAD:newline${LF}embedded" >in &&
> > +	git cat-file --batch-check -z <in >actual &&
>
> As I already said, I suspect that new users who know how our path
> quoting works would expect c-quoted path would work just fine
> without using "-z".  It is not a reason to refuse "-z" to exist,
> though.

Yeah. I think we can do both, if there is a need. I suspect that just
`-z` support would be sufficient for now, but I agree that one doesn't
need to tie up the other.

> > @@ -436,6 +465,11 @@ test_expect_success '--batch-command with multiple info calls gives correct form
> >  	echo "$batch_command_multiple_info" >in &&
> >  	git cat-file --batch-command --buffer <in >actual &&
> >
> > +	test_cmp expect actual &&
> > +
> > +	echo "$batch_command_multiple_info" | tr "\n" "\0" >in &&
>
> This is what I would expect.  The _info variable lacks final LF,
> which is supplied by "echo", so output from tr ends with NUL, which
> mirrors the line-oriented input we used above.

Yep.

> > +	git cat-file --batch-command --buffer -z <in >actual &&
> > +
> >  	test_cmp expect actual
> >  '
> >
> > @@ -459,6 +493,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f
> >  	echo "$batch_command_multiple_contents" >in &&
> >  	git cat-file --batch-command --buffer <in >actual_raw &&
> >
> > +	remove_timestamp <actual_raw >actual &&
> > +	test_cmp expect actual &&
> > +
> > +	echo "$batch_command_multiple_contents" | tr "\n" "\0" >in &&
> > +	git cat-file --batch-command --buffer -z <in >actual_raw &&
> > +
>
> Likewise.

Ditto, thanks.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-25 23:44       ` Taylor Blau
@ 2022-07-27 14:10         ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2022-07-27 14:10 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Ævar Arnfjörð Bjarmason, git, John Cai

Taylor Blau <me@ttaylorr.com> writes:

> Do you have a preference? I am fine with either approach, FWIW.

I do not have a strong preference, but in "text" mode, I suspect
that the users may find it more convenient if we discarded CR at the
end of the line even on LF platforms.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-25 23:50     ` Taylor Blau
@ 2022-07-27 14:20       ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2022-07-27 14:20 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

Taylor Blau <me@ttaylorr.com> writes:

> On Fri, Jul 22, 2022 at 10:35:43PM -0700, Junio C Hamano wrote:
>> Taylor Blau <me@ttaylorr.com> writes:
>>
>> > @@ -14,7 +14,7 @@ SYNOPSIS
>> >  'git cat-file' (-t | -s) [--allow-unknown-type] <object>
>> >  'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
>> >  	     [--buffer] [--follow-symlinks] [--unordered]
>> > -	     [--textconv | --filters]
>> > +	     [--textconv | --filters] [-z]
>>
>> Is "-z" useful with any other option, or is it useful only in
>> combination with one of the three --batch-*?  The above suggests the
>> former.
>
> It only makes sense with `--batch`-related options. But doesn't the
> above suggest the latter, not the former? That synopsis line begins
> with:
>
>     'git cat-file' (--batch | --batch-check | --batch-command) ...
>
> which made me think that this was the invocation for batch-related
> options, and only listed options that made sense with a `--batch` mode
> of one kind or another.

Ah, yes you are absolutely right.  I misread the synopsis.

> I think the helper should probably be something more like:
>
>     echo_with_nul () {
>         echo "$@" | tr '\n' '\0'
>     }

I think you meant

	printf "%s\n" "$@" | tr ..

but then I wonder if

	printf "%s\000" "$@"

would work better?

>> > +	printf "HEAD:newline${LF}embedded" >in &&
>> > +	git cat-file --batch-check -z <in >actual &&
>>
>> As I already said, I suspect that new users who know how our path
>> quoting works would expect c-quoted path would work just fine
>> without using "-z".  It is not a reason to refuse "-z" to exist,
>> though.
>
> Yeah. I think we can do both, if there is a need.

Even though we know it is needed already, it is unfortunately way
too late now X-<, because existing scripts know that paths are not
taken as c-quoted, even though people would naturally expect that
paths are, and satisfying the latter expectation now means breaking
existing scripts.

In any case, as long as "-z" is designed right from the day one, it
would be fine ;-)

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
                     ` (2 preceding siblings ...)
  2022-07-23  5:35   ` Junio C Hamano
@ 2022-07-31 15:50   ` Phillip Wood
  2022-08-11 11:52   ` Ævar Arnfjörð Bjarmason
  4 siblings, 0 replies; 15+ messages in thread
From: Phillip Wood @ 2022-07-31 15:50 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: gitster, Ævar Arnfjörð Bjarmason

Hi Taylor

Thanks for working on this, I've been annoyed by the lack of a "-z" 
option but never got round to doing anything about it.

On 23/07/2022 00:29, Taylor Blau wrote:
> When callers are using `cat-file` via one of the stdin-driven `--batch`
> modes, all input is newline-delimited. This presents a problem when
> callers wish to ask about, e.g. tree-entries that have a newline
> character present in their filename.
> 
> To support this niche scenario, introduce a new `-z` mode to the
> `--batch`, `--batch-check`, and `--batch-command` suite of options that
> instructs `cat-file` to treat its input as NUL-delimited, allowing the
> individual commands themselves to have newlines present.

I wonder if this should also change the output delimiter from NL to NUL.

printf 'HEAD:does\nnot\nexist\000' | bin-wrappers/git cat-file 
--batch-check -z

prints

HEAD:does
not
exist missing

If it was terminated by a NUL rather than a newline it would be much 
easier to parse.

Best Wishes

Phillip

> The refactoring here is slightly unfortunate, since we turn loops like:
> 
>      while (strbuf_getline(&buf, stdin) != EOF)
> 
> into:
> 
>      while (1) {
>          int ret;
>          if (opt->nul_terminated)
>              ret = strbuf_getline_nul(&input, stdin);
>          else
>              ret = strbuf_getline(&input, stdin);
> 
>          if (ret == EOF)
>              break;
>      }
> 
> It's tempting to think that we could use `strbuf_getwholeline()` and
> specify either `\n` or `\0` as the terminating character. But for input
> on platforms that include a CR character preceeding the LF, this
> wouldn't quite be the same, since `strbuf_getline(...)` will trim any
> trailing CR, while `strbuf_getwholeline(&buf, stdin, '\n')` will not.
> 
> In the future, we could clean this up further by introducing a variant
> of `strbuf_getwholeline()` that addresses the aforementioned gap, but
> that approach felt too heavy-handed for this pair of uses.
> 
> Some tests are added in t1006 to ensure that `cat-file` produces the
> same output in `--batch`, `--batch-check`, and `--batch-command` modes
> with and without the new `-z` option.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>   Documentation/git-cat-file.txt |  7 +++++-
>   builtin/cat-file.c             | 28 ++++++++++++++++++++---
>   t/t1006-cat-file.sh            | 42 +++++++++++++++++++++++++++++++++-
>   3 files changed, 72 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 24a811f0ef..3515350ed6 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -14,7 +14,7 @@ SYNOPSIS
>   'git cat-file' (-t | -s) [--allow-unknown-type] <object>
>   'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
>   	     [--buffer] [--follow-symlinks] [--unordered]
> -	     [--textconv | --filters]
> +	     [--textconv | --filters] [-z]
>   'git cat-file' (--textconv | --filters)
>   	     [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
>   
> @@ -207,6 +207,11 @@ respectively print:
>   	/etc/passwd
>   --
>   
> +-z::
> +	Only meaningful with `--batch`, `--batch-check`, or
> +	`--batch-command`; input is NUL-delimited instead of
> +	newline-delimited.
> +
>   
>   OUTPUT
>   ------
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index f42782e955..c3602d15df 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -31,6 +31,7 @@ struct batch_options {
>   	int all_objects;
>   	int unordered;
>   	int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
> +	int nul_terminated;
>   	const char *format;
>   };
>   
> @@ -614,12 +615,20 @@ static void batch_objects_command(struct batch_options *opt,
>   	struct queued_cmd *queued_cmd = NULL;
>   	size_t alloc = 0, nr = 0;
>   
> -	while (!strbuf_getline(&input, stdin)) {
> -		int i;
> +	while (1) {
> +		int i, ret;
>   		const struct parse_cmd *cmd = NULL;
>   		const char *p = NULL, *cmd_end;
>   		struct queued_cmd call = {0};
>   
> +		if (opt->nul_terminated)
> +			ret = strbuf_getline_nul(&input, stdin);
> +		else
> +			ret = strbuf_getline(&input, stdin);
> +
> +		if (ret)
> +			break;
> +
>   		if (!input.len)
>   			die(_("empty command in input"));
>   		if (isspace(*input.buf))
> @@ -763,7 +772,16 @@ static int batch_objects(struct batch_options *opt)
>   		goto cleanup;
>   	}
>   
> -	while (strbuf_getline(&input, stdin) != EOF) {
> +	while (1) {
> +		int ret;
> +		if (opt->nul_terminated)
> +			ret = strbuf_getline_nul(&input, stdin);
> +		else
> +			ret = strbuf_getline(&input, stdin);
> +
> +		if (ret == EOF)
> +			break;
> +
>   		if (data.split_on_whitespace) {
>   			/*
>   			 * Split at first whitespace, tying off the beginning
> @@ -866,6 +884,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
>   			N_("like --batch, but don't emit <contents>"),
>   			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
>   			batch_option_callback),
> +		OPT_BOOL('z', NULL, &batch.nul_terminated, N_("stdin is NUL-terminated")),
>   		OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"),
>   			N_("read commands from stdin"),
>   			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> @@ -921,6 +940,9 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
>   	else if (batch.all_objects)
>   		usage_msg_optf(_("'%s' requires a batch mode"), usage, options,
>   			       "--batch-all-objects");
> +	else if (batch.nul_terminated)
> +		usage_msg_optf(_("'%s' requires a batch mode"), usage, options,
> +			       "-z");
>   
>   	/* Batch defaults */
>   	if (batch.buffer_output < 0)
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index 01c0535765..23b8942edb 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -88,7 +88,8 @@ done
>   
>   for opt in --buffer \
>   	--follow-symlinks \
> -	--batch-all-objects
> +	--batch-all-objects \
> +	-z
>   do
>   	test_expect_success "usage: bad option combination: $opt without batch mode" '
>   		test_incompatible_usage git cat-file $opt &&
> @@ -100,6 +101,10 @@ echo_without_newline () {
>       printf '%s' "$*"
>   }
>   
> +echo_without_newline_nul () {
> +	echo_without_newline "$@" | tr '\n' '\0'
> +}
> +
>   strlen () {
>       echo_without_newline "$1" | wc -c | sed -e 's/^ *//'
>   }
> @@ -398,6 +403,12 @@ test_expect_success '--batch with multiple sha1s gives correct format' '
>   	test "$(maybe_remove_timestamp "$batch_output" 1)" = "$(maybe_remove_timestamp "$(echo_without_newline "$batch_input" | git cat-file --batch)" 1)"
>   '
>   
> +test_expect_success '--batch, -z with multiple sha1s gives correct format' '
> +	echo_without_newline_nul "$batch_input" >in &&
> +	test "$(maybe_remove_timestamp "$batch_output" 1)" = \
> +	"$(maybe_remove_timestamp "$(git cat-file --batch -z <in)" 1)"
> +'
> +
>   batch_check_input="$hello_sha1
>   $tree_sha1
>   $commit_sha1
> @@ -418,6 +429,24 @@ test_expect_success "--batch-check with multiple sha1s gives correct format" '
>       "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)"
>   '
>   
> +test_expect_success "--batch-check, -z with multiple sha1s gives correct format" '
> +    echo_without_newline_nul "$batch_check_input" >in &&
> +    test "$batch_check_output" = "$(git cat-file --batch-check -z <in)"
> +'
> +
> +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' '
> +	touch -- "newline${LF}embedded" &&
> +	git add -- "newline${LF}embedded" &&
> +	git commit -m "file with newline embedded" &&
> +	test_tick &&
> +
> +	printf "HEAD:newline${LF}embedded" >in &&
> +	git cat-file --batch-check -z <in >actual &&
> +
> +	echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect &&
> +	test_cmp expect actual
> +'
> +
>   batch_command_multiple_info="info $hello_sha1
>   info $tree_sha1
>   info $commit_sha1
> @@ -436,6 +465,11 @@ test_expect_success '--batch-command with multiple info calls gives correct form
>   	echo "$batch_command_multiple_info" >in &&
>   	git cat-file --batch-command --buffer <in >actual &&
>   
> +	test_cmp expect actual &&
> +
> +	echo "$batch_command_multiple_info" | tr "\n" "\0" >in &&
> +	git cat-file --batch-command --buffer -z <in >actual &&
> +
>   	test_cmp expect actual
>   '
>   
> @@ -459,6 +493,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f
>   	echo "$batch_command_multiple_contents" >in &&
>   	git cat-file --batch-command --buffer <in >actual_raw &&
>   
> +	remove_timestamp <actual_raw >actual &&
> +	test_cmp expect actual &&
> +
> +	echo "$batch_command_multiple_contents" | tr "\n" "\0" >in &&
> +	git cat-file --batch-command --buffer -z <in >actual_raw &&
> +
>   	remove_timestamp <actual_raw >actual &&
>   	test_cmp expect actual
>   '

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`
  2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
                     ` (3 preceding siblings ...)
  2022-07-31 15:50   ` Phillip Wood
@ 2022-08-11 11:52   ` Ævar Arnfjörð Bjarmason
  4 siblings, 0 replies; 15+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-11 11:52 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, gitster


On Fri, Jul 22 2022, Taylor Blau wrote:

This change:

> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 24a811f0ef..3515350ed6 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -14,7 +14,7 @@ SYNOPSIS
>  'git cat-file' (-t | -s) [--allow-unknown-type] <object>
>  'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects]
>  	     [--buffer] [--follow-symlinks] [--unordered]
> -	     [--textconv | --filters]
> +	     [--textconv | --filters] [-z]
>  'git cat-file' (--textconv | --filters)
>  	     [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]
>  
> [...]
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index f42782e955..c3602d15df 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -31,6 +31,7 @@ struct batch_options {
>  	int all_objects;
>  	int unordered;
>  	int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */
> +	int nul_terminated;
>  	const char *format;
>  };

Is missing a corresponding change to the -h output here. Before this
they were the same, but now:
	
	+ diff -u txt help
	--- txt 2022-08-11 11:53:00.235221628 +0000
	+++ help        2022-08-11 11:53:00.239221594 +0000
	@@ -3,6 +3,6 @@
	 git cat-file (-t | -s) [--allow-unknown-type] <object>
	 git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]
	              [--buffer] [--follow-symlinks] [--unordered]
	-             [--textconv | --filters] [-z]
	+             [--textconv | --filters]
	 git cat-file (--textconv | --filters)
	              [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]

(Spotted with some local patches of mine which automatically check this)

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-08-11 11:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-22 23:28 [PATCH 0/2] cat-file: support NUL-delimited input with `-z` Taylor Blau
2022-07-22 23:29 ` [PATCH 1/2] t1006: extract --batch-command inputs to variables Taylor Blau
2022-07-22 23:29 ` [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z` Taylor Blau
2022-07-22 23:41   ` Chris Torek
2022-07-25 23:39     ` Taylor Blau
2022-07-23  5:17   ` Ævar Arnfjörð Bjarmason
2022-07-23 17:45     ` Junio C Hamano
2022-07-25 23:44       ` Taylor Blau
2022-07-27 14:10         ` Junio C Hamano
2022-07-23  5:35   ` Junio C Hamano
2022-07-25 23:50     ` Taylor Blau
2022-07-27 14:20       ` Junio C Hamano
2022-07-31 15:50   ` Phillip Wood
2022-08-11 11:52   ` Ævar Arnfjörð Bjarmason
2022-07-23  4:44 ` [PATCH 0/2] cat-file: " Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.