All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Christian Couder <christian.couder@gmail.com>,
	Hariom Verma <hariom18599@gmail.com>,
	Felipe Contreras <felipe.contreras@gmail.com>,
	ZheNing Hu <adlternative@gmail.com>
Subject: Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
Date: Tue, 1 Jun 2021 11:52:40 -0400	[thread overview]
Message-ID: <YLZXyBJ5YgGfmkKv@coredump.intra.peff.net> (raw)
In-Reply-To: <495cd90dbaf43e957d03edd2fdc7449b39eee53a.1622558157.git.gitgitgadget@gmail.com>

On Tue, Jun 01, 2021 at 02:35:56PM +0000, ZheNing Hu via GitGitGadget wrote:

> From: ZheNing Hu <adlternative@gmail.com>
> 
> When `--batch` used with `--batch-all-objects`,
> with some format atoms like %(objectname), %(rest)
> or even no atoms may cause Git exit and report
> "object xxx changed type!?".
> 
> E.g. `git cat-file --batch="batman" --batch-all-objects`
> 
> This is because we did not get the object type through
> oid_object_info_extended(), it's composed of two
> situations:
> 
> 1. Since object_info is empty, skip_object_info is
> set to true, We skipped collecting the object type.
> 
> 2. The formatting atom like %(objectname) does not require
> oid_object_info_extended() to collect object types.
> 
> The correct way to deal with it is to swap the order
> of setting skip_object_info and setting typep. This
> will ensure that we must get the type of the object
> when using --batch.

Thanks, this explanation and the patch make sense, and I'd be happy if
we take it as-is. But in the name of GSoC, let me offer a few polishing
tips.

The commit message hints at the root of the problem, but doesn't say it
explicitly. Which is: because setting skip_object_info depends on seeing
that the object_info is empty, we can't add items to it after setting
that flag. And the code path for --batch does that, hence re-ordering
them is the solution.

It might also be worth noting that the bug was present from when the
skip_object_info code was initially added in 845de33a5b (cat-file: avoid
noop calls to sha1_object_info_extended, 2016-05-18).

> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index 5d2dc99b74ad..1502a27142ba 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -586,4 +586,23 @@ test_expect_success 'cat-file --unordered works' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
> +	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects &&
> +	git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
> +	git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
> +	cmp expect actual
> +'

OK, we're checking the output of --batch-all-objects versus taking the
object list from the input. We can depend on the ordering being
identical between the two because --batch-all-objects sorts. Good.

> +test_expect_success 'cat-file --batch="%(rest)" with --batch-all-objects will work' '
> +	git -C all-two cat-file --batch="%(rest)" <objects >expect &&
> +	git -C all-two cat-file --batch-all-objects --batch="%(rest)" >actual &&
> +	cmp expect actual
> +'

This one is rather curious. It definitely shows the bug, but I can't
imagine why %(rest) would be useful with --batch-all-objects, since its
purpose is to show the rest of the input line (and there are no input
lines!).

That might be a problem later if we change the behavior (e.g., to say
"%(rest) does not make sense with --batch-all-objects"). But I'm also OK
leaving it for now; somebody later can dig up this commit via git-blame.

> +test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' '
> +	git -C all-two cat-file --batch="batman" <objects >expect &&
> +	git -C all-two cat-file --batch-all-objects --batch="batman" >actual &&
> +	cmp expect actual
> +'

And this one is a more extreme version of the "%(objectname)" one. It's
useful as a regression test because we might later change the
optimization so that %(objectname) ends up filling in the other object
info.

There's a subtle dependency here on the "objects" file from the earlier
test. In such a case, we'll often split that out as a separate setup
step like:

  test_expect_success 'set up object list for --batch-all-objects tests' '
	git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects
  '

That makes it more clear that all three of the other tests are doing the
same thing (just with different formats), and can be reordered, removed,
etc, later if we wanted to. So not a correctness thing, but rather just
communicating the structure of the tests to later readers.

-Peff

  reply	other threads:[~2021-06-01 15:52 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 14:35 [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
2021-06-01 15:52   ` Jeff King [this message]
2021-06-02 13:15     ` ZheNing Hu
2021-06-02 20:00       ` Jeff King
2021-06-03  6:28         ` ZheNing Hu
2021-06-03  7:18         ` ZheNing Hu
2021-06-03 19:28           ` Jeff King
2021-06-01 14:35 ` [PATCH 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-06-01 15:55   ` Jeff King
2021-06-02 13:27     ` ZheNing Hu
2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
2021-06-03 16:29   ` [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
2021-06-03 16:29   ` [PATCH v2 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-06-03 19:29   ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug Jeff King
2021-06-03 22:51   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YLZXyBJ5YgGfmkKv@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=adlternative@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=felipe.contreras@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.