All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: phillip.wood@dunelm.org.uk
Cc: git@vger.kernel.org, Taylor Blau <me@ttaylorr.com>,
	Toon Claes <toon@iotcl.com>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters
Date: Tue, 6 Jun 2023 07:00:00 +0200	[thread overview]
Message-ID: <ZH69UBT_H0OtGR6l@ncase> (raw)
In-Reply-To: <9900512f-b0da-2e47-f1ab-ed51ec2c78ff@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4261 bytes --]

On Mon, Jun 05, 2023 at 04:47:14PM +0100, Phillip Wood wrote:
> > @@ -384,6 +390,11 @@ notdir SP <size> LF
> >   is printed when, during symlink resolution, a file is used as a
> >   directory name.
> >   
> > +Alternatively, when `-Z` is passed, the line feeds in any of the above examples
> > +are replaced with NUL terminators. This ensures that output will be parsable if
> > +the output itself would contain a linefeed and is thus recommended for
> > +scripting purposes.
> > +
> >   CAVEATS
> >   -------
> >   
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 001dcb24d6..90ef407d30 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -492,17 +494,18 @@ static void batch_object_write(const char *obj_name,
> >   	strbuf_reset(scratch);
> >   
> >   	if (!opt->format) {
> > -		print_default_format(scratch, data);
> > +		print_default_format(scratch, data, opt);
> >   	} else {
> >   		strbuf_expand(scratch, opt->format, expand_format, data);
> > -		strbuf_addch(scratch, '\n');
> > +		strbuf_addch(scratch, opt->output_delim);
> >   	}
> >   
> >   	batch_write(opt, scratch->buf, scratch->len);
> >   
> >   	if (opt->batch_mode == BATCH_MODE_CONTENTS) {
> > +		char buf[] = {opt->output_delim};
> 
> I found this a bit confusing, I think it would be clearer just to do
> 
> 	batch_write(opt, &opt->output_delim, 1);

Agreed, that's cleaner.

> >   		print_object_or_die(opt, data);
> > -		batch_write(opt, "\n", 1);
> > +		batch_write(opt, buf, 1);
> >   	}
> >   }
> 
> > @@ -920,7 +927,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> >   		N_("git cat-file (-t | -s) [--allow-unknown-type] <object>"),
> >   		N_("git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]\n"
> >   		   "             [--buffer] [--follow-symlinks] [--unordered]\n"
> > -		   "             [--textconv | --filters] [-z]"),
> > +		   "             [--textconv | --filters] [-z] [-Z]"),
> 
> If we're recommending that people don't use '-z' then maybe we should 
> remove it from the synopsis and add OPT_HIDDEN to it below.

I might still change this depending on the conclusion Junio and I will
arrive at, but for now I agree that this makes sense.

> >   		N_("git cat-file (--textconv | --filters)\n"
> >   		   "             [<rev>:<path|tree-ish> | --path=<path|tree-ish> <rev>]"),
> >   		NULL
> > @@ -950,6 +957,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> >   			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
> >   			batch_option_callback),
> >   		OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")),
> > +		OPT_BOOL('Z', NULL, &nul_terminated, N_("stdin and stdout is NUL-terminated")),
> 
> > diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> > index 7b985cfded..d73a0be1b9 100755
> > --- a/t/t1006-cat-file.sh
> > +++ b/t/t1006-cat-file.sh
> > @@ -392,17 +393,18 @@ deadbeef
> >   
> >   "
> >   
> > -batch_output="$hello_sha1 blob $hello_size
> > -$hello_content
> > -$commit_sha1 commit $commit_size
> > -$commit_content
> > -$tag_sha1 tag $tag_size
> > -$tag_content
> > -deadbeef missing
> > - missing"
> > +printf "%s\0" \
> > +	"$hello_sha1 blob $hello_size" \
> > +	"$hello_content" \
> > +	"$commit_sha1 commit $commit_size" \
> > +	"$commit_content" \
> > +	"$tag_sha1 tag $tag_size" \
> > +	"$tag_content" \
> > +	"deadbeef missing" \
> > +	" missing" >batch_output
> 
> I think writing the expected output to a file is a good change as we 
> always use it with test_cmp. As "-z" is deprecated I think it makes 
> sense to model the expected output for "-Z" and use tr for the "-z" 
> tests as you have done here. It looks like we have good coverage of the 
> new option.

It's actually also required in order to not have to specify the expected
output twice. While we could leave this as-is, translating it to be NUL
terminated via `tr \n \0` doesn't work as the output contains newlines
in places where we don't want to translate them to NUL delimiters. And
storing the NUL-delimited string in a variable doesn't work either as
shells will truncate the C strings.

Thanks for your review!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2023-06-06  5:00 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 13:02 [PATCH 0/5] cat-file: introduce NUL-terminated output format Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters Patrick Steinhardt
2023-06-05 15:47   ` Phillip Wood
2023-06-05 23:54     ` Junio C Hamano
2023-06-06  4:52       ` Patrick Steinhardt
2023-06-06  5:22         ` Junio C Hamano
2023-06-06  5:31           ` Patrick Steinhardt
2023-06-12 19:12             ` Junio C Hamano
2023-06-06  5:00     ` Patrick Steinhardt [this message]
2023-06-06  1:23   ` Junio C Hamano
2023-06-03  1:44 ` [PATCH 0/5] cat-file: introduce NUL-terminated output format Junio C Hamano
2023-06-06  5:19 ` [PATCH v2 0/5] catfile: " Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 5/5] cat-file: introduce option to delimit input and output with NUL Patrick Steinhardt
2023-06-12 20:43   ` [PATCH v2 0/5] catfile: introduce NUL-terminated output format Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZH69UBT_H0OtGR6l@ncase \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.