All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Sunshine <sunshine@sunshineco.com>
To: John Cai via GitGitGadget <gitgitgadget@gmail.com>
Cc: "Git List" <git@vger.kernel.org>, "Taylor Blau" <me@ttaylorr.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Eric Wong" <e@80x24.org>, "Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"John Cai" <johncai86@gmail.com>
Subject: Re: [PATCH 2/2] catfile.c: add --batch-command mode
Date: Fri, 4 Feb 2022 01:45:52 -0500	[thread overview]
Message-ID: <CAPig+cQHcstj28F-==oCakp3iMLHtd1SVQV+snzewYihHxNG6g@mail.gmail.com> (raw)
In-Reply-To: <ebd2a1356017905245744babf32c5b78a0af1639.1643915286.git.gitgitgadget@gmail.com>

()()On Thu, Feb 3, 2022 at 7:20 PM John Cai via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> Add new flag --batch-command that will accept commands and arguments
> from stdin, similar to git-update-ref --stdin.
>
> contents <object> NL
> info <object> NL
>
> With the following commands, process (A) can begin a "session" and
> send a list of object names over stdin. When "get contents" or "get info"
> is issued, this list of object names will be fed into batch_one_object()
> to retrieve either info or contents. Finally an fflush() will be called
> to end the session.
>
> begin
> <sha1>
> get info
>
> begin
> <sha1>
> get contents

I had the same reaction to this new command set as Junio expressed
upstream[1], and was prepared to suggest an alternative but Junio said
everything I wanted to say, so I won't say anything more about it
here.

That aside, the implementation presented here is overly loose about
malformed input and accepts all sorts of bogus invocations which it
should be reporting as errors. For instance, a lone:

    get info

without a preceding `begin` should be an error but such bogus input is
not diagnosed. `get contents` is likewise affected. Another example:

    begin
    <oid>
    <EOF>

which lacks a closing `get info` or `get contents` is silently
ignored. Similarly, malformed:

    begin
    begin
    ...

should be reported as an error but is not.

There is also a bug in which the accumulated list of OID's is never
cleared. Thus:

    begin
    <oid>
    get info
    get info

emits info about <oid> twice, once for each `get info` invocation. Similarly:

    begin
    <oid1>
    get info
    <oid2>
    get info

emits information about <oid1> twice and <oid2> once. Invoking `begin`
between the `get info` commands doesn't help because `begin` is a
no-op. Thus:

    begin
    <oid1>
    get info
    begin
    <oid2>
    get info

likewise emits information about <oid1> twice and <oid2> once.

It also incorrectly accepts non-"session" commands in the middle of a
session. For instance:

    begin
    <oid1>
    info <oid2>
    <oid3>

immediately emits information about <oid2> and then bombs out claiming
that <oid3> is an "unknown command" because the `info` command --
which should not have been allowed within a session -- prematurely
ended the "session".

The `info` and `contents` commands neglect to do any sort of
validation of their arguments, thus any and all bogus invocations are
accepted. Thus:

    info
    info <arg1> <arg2>
    info <non-oid>

are all accepted as valid invocations, misleadingly producing "<foo>
missing" messages, rather than erroring out as they should. The "<oid>
missing" message should be reserved for the case when the lone
argument to `info` or `contents` is something which looks like a
legitimate OID.

The above is a long-winded way of saying that it is important not only
to check the obvious "expect success" cases when implementing a new
feature, but it's also important to add checks for the "expect
failure" cases, such as all the above malformed inputs.

It's subjective, but it feels like this implementation is trying to be
too clever by handling all these cases via a single strbuf_getline()
loop in batch_objects_command(). It would be easier to reason about,
and would have avoided some of the above problems, for instance, if
handling of `begin` was split out to its own function which looped
over strbuf_getline() itself, thus could easily have detected
premature EOF (lacking `get info` or `get contents`) and likewise
would not have allowed `info` or `contents` commands to be executed
within a "session".

Similarly (again subjective), the generic command dispatcher seems
over-engineered and perhaps contributed to the oversight of `info` and
`contents` failing to perform validation of their arguments. A simpler
hand-rolled command-response loop is more common on this project and
often easier to reason about. Perhaps something like this
(pseudo-code):

    while (strbuf_getline(&input, stdin)) {
        char *end_cmd = strchrnul(&input.buf, ' ');
        const char *argstart = *end_cmd ? end_cmd + 1 : end_cmd;
        *end_cmd = '\0';
        if (strcmp(input.buf, "info"))
            show_info(argstart);
        else if (strcmp(input.buf, "contents"))
            show_contents(argstart);
        else if (strcmp(input.buf, "begin"))
            begin_session(argstart);
        else
            die(...);
    }

and each of the command-handler functions would perform its own
argument validation (including the case when no argument should be
present).

[1]: https://lore.kernel.org/git/xmqqo83nsoxs.fsf@gitster.g/

> +static void parse_cmd_begin(struct batch_options *opt,
> +                          const char *line,
> +                          struct strbuf *output,
> +                          struct expand_data *data,
> +                          struct string_list revs)
> +{
> +       /* nothing needs to be done here */
> +}

Either this function should be clearing the list of collected OID's or...

> +static void parse_cmd_get(struct batch_options *opt,
> +                          const char *line,
> +                          struct strbuf *output,
> +                          struct expand_data *data,
> +                          struct string_list revs)
> +{
> +       struct string_list_item *item;
> +       for_each_string_list_item(item, &revs) {
> +               batch_one_object(item->string, output, opt, data);
> +       }
> +       if (opt->buffer_output)
> +               fflush(stdout);

... this function should do so after it's done processing the OID list.

> +}
> +static void parse_cmd_get_info(struct batch_options *opt,
> +                          const char *line,
> +                          struct strbuf *output,
> +                          struct expand_data *data,
> +                          struct string_list revs)

Missing blank line before the function definition.

> +static void parse_cmd_get_objects(struct batch_options *opt,
> +                          const char *line,
> +                          struct strbuf *output,
> +                          struct expand_data *data,
> +                          struct string_list revs)
> +{
> +       opt->print_contents = 1;
> +       parse_cmd_get(opt, line, output, data, revs);
> +       if (opt->buffer_output)
> +               fflush(stdout);
> +}

Why does this function duplicate the flushing logic of parse_cmd_get()
immediately after calling parse_cmd_get()?

> +static void batch_objects_command(struct batch_options *opt,
> +                                   struct strbuf *output,
> +                                   struct expand_data *data)
> +{
> +       /* Read each line dispatch its command */

Missing "and".

> +       while (!strbuf_getline(&input, stdin)) {
> +               if (state == BATCH_STATE_COMMAND) {
> +                       if (*input.buf == '\n')
> +                               die("empty command in input");

This never triggers because strbuf_getline() removes the line
terminator before returning.

> +                       else if (isspace(*input.buf))
> +                               die("whitespace before command: %s", input.buf);
> +               }
> +
> +               for (i = 0; i < ARRAY_SIZE(commands); i++) {
> +                       if (!skip_prefix(input.buf, prefix, &cmd_end))
> +                               continue;
> +                       /*
> +                        * If the command has arguments, verify that it's
> +                        * followed by a space. Otherwise, it shall be followed
> +                        * by a line terminator.
> +                        */
> +                       c = commands[i].takes_args ? ' ' : '\n';
> +                       if (*cmd_end && *cmd_end != c)
> +                               die("arguments invalid for command: %s", commands[i].prefix);

Ditto regarding strbuf_getline() removing line-terminator.

> +                       cmd = &commands[i];
> +                       if (cmd->takes_args)
> +                               p = cmd_end + 1;
> +                       break;
> +               }
> +
> +               if (input.buf[input.len - 1] == '\n')
> +                       input.buf[--input.len] = '\0';

Ditto again. This is especially scary since it's accessing element
`input.len-1` without even checking if `input.len` is greater than
zero.

Also, it's better to use published strbuf API rather than mucking with
the internals:

    strbuf_setlen(&input, input.len - 1);

> +               if (state == BATCH_STATE_INPUT && !cmd){
> +                       string_list_append(&revs, input.buf);
> +                       continue;
> +               }
> +
> +               if (!cmd)
> +                       die("unknown command: %s", input.buf);
> +
> +               state = cmd->next_state;
> +               cmd->fn(opt, p, output, data, revs);
> +       }
> +       strbuf_release(&input);
> +       string_list_clear(&revs, 0);
> +}

  parent reply	other threads:[~2022-02-04  6:46 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-03 19:08 [PATCH 0/2] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-03 19:08 ` [PATCH 1/2] cat-file.c: rename cmdmode to mode John Cai via GitGitGadget
2022-02-03 19:28   ` Junio C Hamano
2022-02-04 12:10   ` Ævar Arnfjörð Bjarmason
2022-02-03 19:08 ` [PATCH 2/2] catfile.c: add --batch-command mode John Cai via GitGitGadget
2022-02-03 19:57   ` Junio C Hamano
2022-02-04  4:11     ` John Cai
2022-02-04 16:46       ` Phillip Wood
2022-02-04  6:45   ` Eric Sunshine [this message]
2022-02-04 21:41     ` John Cai
2022-02-05  6:52       ` Eric Sunshine
2022-02-04 12:11   ` Ævar Arnfjörð Bjarmason
2022-02-04 16:51     ` Phillip Wood
2022-02-07 16:33 ` [PATCH v2 0/2] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-07 16:33   ` [PATCH v2 1/2] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-07 23:58     ` Junio C Hamano
2022-02-07 16:33   ` [PATCH v2 2/2] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-07 23:34     ` Jonathan Tan
2022-02-08 11:00       ` Phillip Wood
2022-02-08 17:56         ` Jonathan Tan
2022-02-08 18:09           ` Junio C Hamano
2022-02-09  0:11             ` Jonathan Tan
2022-02-08  0:49     ` Junio C Hamano
2022-02-08 11:06     ` Phillip Wood
2022-02-08 20:58   ` [PATCH v3 0/3] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-08 20:58     ` [PATCH v3 1/3] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-08 20:58     ` [PATCH v3 2/3] cat-file: introduce batch_command enum to replace print_contents John Cai via GitGitGadget
2022-02-08 23:43       ` Junio C Hamano
2022-02-08 20:58     ` [PATCH v3 3/3] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-08 23:59       ` Junio C Hamano
2022-02-09 21:40     ` [PATCH v3 0/3] Add cat-file --batch-command flag Junio C Hamano
2022-02-09 22:22       ` John Cai
2022-02-09 23:10         ` John Cai
2022-02-10  4:01     ` [PATCH v4 " John Cai via GitGitGadget
2022-02-10  4:01       ` [PATCH v4 1/3] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-10  4:01       ` [PATCH v4 2/3] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-10 10:10         ` Christian Couder
2022-02-10  4:01       ` [PATCH v4 3/3] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-10 10:57         ` Phillip Wood
2022-02-10 17:05           ` Junio C Hamano
2022-02-11 17:45             ` John Cai
2022-02-11 20:07               ` Junio C Hamano
2022-02-11 21:30                 ` John Cai
2022-02-10 18:55           ` John Cai
2022-02-10 22:46         ` Eric Sunshine
2022-02-10 20:30       ` [PATCH v4 0/3] Add cat-file --batch-command flag Junio C Hamano
2022-02-11 20:01       ` [PATCH v5 " John Cai via GitGitGadget
2022-02-11 20:01         ` [PATCH v5 1/3] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-11 20:01         ` [PATCH v5 2/3] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-11 20:01         ` [PATCH v5 3/3] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-14 13:59           ` Phillip Wood
2022-02-14 16:19             ` John Cai
2022-02-14 18:23         ` [PATCH v6 0/4] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-14 18:23           ` [PATCH v6 1/4] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-14 18:23           ` [PATCH v6 2/4] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-14 18:23           ` [PATCH v6 3/4] cat-file: add remove_timestamp helper John Cai via GitGitGadget
2022-02-14 18:23           ` [PATCH v6 4/4] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-15 19:39             ` Eric Sunshine
2022-02-15 22:58               ` John Cai
2022-02-15 23:20                 ` Eric Sunshine
2022-02-16  0:53           ` [PATCH v7 0/4] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-16  0:53             ` [PATCH v7 1/4] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-16  0:53             ` [PATCH v7 2/4] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-16  0:53             ` [PATCH v7 3/4] cat-file: add remove_timestamp helper John Cai via GitGitGadget
2022-02-16  0:53             ` [PATCH v7 4/4] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-16  1:28               ` Junio C Hamano
2022-02-16  2:48                 ` John Cai
2022-02-16  3:00                   ` Junio C Hamano
2022-02-16  3:17                     ` Eric Sunshine
2022-02-16  3:01                   ` Eric Sunshine
2022-02-16 15:02             ` [PATCH v8 0/4] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-16 15:02               ` [PATCH v8 1/4] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-16 15:02               ` [PATCH v8 2/4] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-16 15:02               ` [PATCH v8 3/4] cat-file: add remove_timestamp helper John Cai via GitGitGadget
2022-02-16 15:02               ` [PATCH v8 4/4] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-16 17:15                 ` Junio C Hamano
2022-02-16 17:25                   ` Eric Sunshine
2022-02-16 20:30                     ` John Cai
2022-02-16 20:59               ` [PATCH v9 0/4] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-16 20:59                 ` [PATCH v9 1/4] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-16 20:59                 ` [PATCH v9 2/4] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-16 20:59                 ` [PATCH v9 3/4] cat-file: add remove_timestamp helper John Cai via GitGitGadget
2022-02-16 20:59                 ` [PATCH v9 4/4] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-18 11:26                   ` Phillip Wood
2022-02-18 16:53                     ` John Cai
2022-02-18 17:32                       ` Junio C Hamano
2022-02-18 17:23                     ` Junio C Hamano
2022-02-18 18:23                 ` [PATCH v10 0/4] Add cat-file --batch-command flag John Cai via GitGitGadget
2022-02-18 18:23                   ` [PATCH v10 1/4] cat-file: rename cmdmode to transform_mode John Cai via GitGitGadget
2022-02-18 18:23                   ` [PATCH v10 2/4] cat-file: introduce batch_mode enum to replace print_contents John Cai via GitGitGadget
2022-02-18 18:23                   ` [PATCH v10 3/4] cat-file: add remove_timestamp helper John Cai via GitGitGadget
2022-02-19  6:33                     ` Ævar Arnfjörð Bjarmason
2022-02-22  3:31                       ` John Cai
2022-02-18 18:23                   ` [PATCH v10 4/4] cat-file: add --batch-command mode John Cai via GitGitGadget
2022-02-19  6:35                     ` Ævar Arnfjörð Bjarmason
2022-02-18 19:38                   ` [PATCH v10 0/4] Add cat-file --batch-command flag Junio C Hamano
2022-02-22 11:07                   ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPig+cQHcstj28F-==oCakp3iMLHtd1SVQV+snzewYihHxNG6g@mail.gmail.com' \
    --to=sunshine@sunshineco.com \
    --cc=avarab@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=phillip.wood123@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.