git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ZheNing Hu <adlternative@gmail.com>
To: Jeff King <peff@peff.net>
Cc: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
	Git List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Christian Couder <christian.couder@gmail.com>,
	Hariom Verma <hariom18599@gmail.com>,
	Felipe Contreras <felipe.contreras@gmail.com>
Subject: Re: [PATCH 1/2] [GSOC] cat-file: fix --batch report changed-type bug
Date: Wed, 2 Jun 2021 21:15:45 +0800	[thread overview]
Message-ID: <CAOLTT8SCeKy74cVO3K5zJ5n=0s=o9zk2ipV5wM6CHQPzRoMi5Q@mail.gmail.com> (raw)
In-Reply-To: <YLZXyBJ5YgGfmkKv@coredump.intra.peff.net>

Jeff King <peff@peff.net> 于2021年6月1日周二 下午11:52写道:
>
> On Tue, Jun 01, 2021 at 02:35:56PM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > When `--batch` used with `--batch-all-objects`,
> > with some format atoms like %(objectname), %(rest)
> > or even no atoms may cause Git exit and report
> > "object xxx changed type!?".
> >
> > E.g. `git cat-file --batch="batman" --batch-all-objects`
> >
> > This is because we did not get the object type through
> > oid_object_info_extended(), it's composed of two
> > situations:
> >
> > 1. Since object_info is empty, skip_object_info is
> > set to true, We skipped collecting the object type.
> >
> > 2. The formatting atom like %(objectname) does not require
> > oid_object_info_extended() to collect object types.
> >
> > The correct way to deal with it is to swap the order
> > of setting skip_object_info and setting typep. This
> > will ensure that we must get the type of the object
> > when using --batch.
>
> Thanks, this explanation and the patch make sense, and I'd be happy if
> we take it as-is. But in the name of GSoC, let me offer a few polishing
> tips.
>

Thanks, as a GSOC student, your reminders will be very helpful to me.

> The commit message hints at the root of the problem, but doesn't say it
> explicitly. Which is: because setting skip_object_info depends on seeing
> that the object_info is empty, we can't add items to it after setting
> that flag. And the code path for --batch does that, hence re-ordering
> them is the solution.
>

Um, let's rewrite the commit message, I don't know if this is accurate:

[GSOC] cat-file: fix --batch report changed-type bug

When `--batch` used with `--batch-all-objects`,
with some format atoms like %(objectname), %(rest)
or even no atoms may cause Git exit and report
"object xxx changed type!?".

E.g. `git cat-file --batch="batman" --batch-all-objects`

The bug was present from when the skip_object_info code
was initially added in 845de33a5b (cat-file: avoid
noop calls to sha1_object_info_extended, 2016-05-18).

This is because we did not get the object type through
oid_object_info_extended(), it's composed of two
situations:

1. all_objects will be set to true when we use
`--batch-all-objects`, seeing that object_info
is empty, skip_object_info will be to true,
`oid_object_info_extended()` will not get the
type of the object.

2. The formatting atom like %(objectname) does
not require oid_object_info_extended() to collect
object types.

print_contents will be set to true when we use
`--batch`, which can make object_info non-empty,
so the solution is to swap the code order of it
and checking if object_info is empty, which will
ensure that we must get the type of the object
when using --batch.

> It might also be worth noting that the bug was present from when the
> skip_object_info code was initially added in 845de33a5b (cat-file: avoid
> noop calls to sha1_object_info_extended, 2016-05-18).
>

OK.

> > +test_expect_success 'cat-file --batch="%(objectname)" with --batch-all-objects will work' '
> > +     git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects &&
> > +     git -C all-two cat-file --batch="%(objectname)" <objects >expect &&
> > +     git -C all-two cat-file --batch-all-objects --batch="%(objectname)" >actual &&
> > +     cmp expect actual
> > +'
>
> OK, we're checking the output of --batch-all-objects versus taking the
> object list from the input. We can depend on the ordering being
> identical between the two because --batch-all-objects sorts. Good.
>
> > +test_expect_success 'cat-file --batch="%(rest)" with --batch-all-objects will work' '
> > +     git -C all-two cat-file --batch="%(rest)" <objects >expect &&
> > +     git -C all-two cat-file --batch-all-objects --batch="%(rest)" >actual &&
> > +     cmp expect actual
> > +'
>
> This one is rather curious. It definitely shows the bug, but I can't
> imagine why %(rest) would be useful with --batch-all-objects, since its
> purpose is to show the rest of the input line (and there are no input
> lines!).
>

I wanted to argue that print_object_or_die() will use data->rest originally.
But --batch-all-objects and --textconv and --filter are incompatible.
So your idea is reasonable: %(rest) with --batch-all-objects is useless.

> That might be a problem later if we change the behavior (e.g., to say
> "%(rest) does not make sense with --batch-all-objects"). But I'm also OK
> leaving it for now; somebody later can dig up this commit via git-blame.
>

Yes, I think this feature can be completed later.(not in this patch)

> > +test_expect_success 'cat-file --batch="batman" with --batch-all-objects will work' '
> > +     git -C all-two cat-file --batch="batman" <objects >expect &&
> > +     git -C all-two cat-file --batch-all-objects --batch="batman" >actual &&
> > +     cmp expect actual
> > +'
>
> And this one is a more extreme version of the "%(objectname)" one. It's
> useful as a regression test because we might later change the
> optimization so that %(objectname) ends up filling in the other object
> info.
>

Yes, it does not use atoms, so it is the most special.

> There's a subtle dependency here on the "objects" file from the earlier
> test. In such a case, we'll often split that out as a separate setup
> step like:
>
>   test_expect_success 'set up object list for --batch-all-objects tests' '
>         git -C all-two cat-file --batch-all-objects --batch-check="%(objectname)" >objects
>   '
>
> That makes it more clear that all three of the other tests are doing the
> same thing (just with different formats), and can be reordered, removed,
> etc, later if we wanted to. So not a correctness thing, but rather just
> communicating the structure of the tests to later readers.
>

Makes sense.

> -Peff

Thanks.
--
ZheNing Hu

  reply	other threads:[~2021-06-02 13:16 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 14:35 [PATCH 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
2021-06-01 14:35 ` [PATCH 1/2] " ZheNing Hu via GitGitGadget
2021-06-01 15:52   ` Jeff King
2021-06-02 13:15     ` ZheNing Hu [this message]
2021-06-02 20:00       ` Jeff King
2021-06-03  6:28         ` ZheNing Hu
2021-06-03  7:18         ` ZheNing Hu
2021-06-03 19:28           ` Jeff King
2021-06-01 14:35 ` [PATCH 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-06-01 15:55   ` Jeff King
2021-06-02 13:27     ` ZheNing Hu
2021-06-03 16:29 ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug ZheNing Hu via GitGitGadget
2021-06-03 16:29   ` [PATCH v2 1/2] [GSOC] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
2021-06-03 16:29   ` [PATCH v2 2/2] [GSOC] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-06-03 19:29   ` [PATCH v2 0/2] [GSOC] cat-file: fix --batch report changed-type bug Jeff King
2021-06-03 22:51   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOLTT8SCeKy74cVO3K5zJ5n=0s=o9zk2ipV5wM6CHQPzRoMi5Q@mail.gmail.com' \
    --to=adlternative@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=felipe.contreras@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).