git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee <derrickstolee@github.com>
Cc: git@vger.kernel.org, gitster@pobox.com,
	Johannes.Schindelin@gmx.de, chakrabortyabhradeep79@gmail.com,
	jonathantanmy@google.com, kaartic.sivaraam@gmail.com
Subject: Re: [PATCH 6/6] midx.c: include preferred pack correctly with existing MIDX
Date: Mon, 22 Aug 2022 14:14:31 -0400	[thread overview]
Message-ID: <YwPHh/EDIS0S4uoj@nand.local> (raw)
In-Reply-To: <3cecd058-aec2-d5f9-ef79-58cc10ce14fb@github.com>

On Mon, Aug 22, 2022 at 01:03:11PM -0400, Derrick Stolee wrote:
> On 8/19/2022 5:30 PM, Taylor Blau wrote:
>
> > Resolve this by adding all objects from the preferred pack separately
> > when it appears in the existing MIDX (if one was present). This will
> > duplicate objects from that pack that *did* appear in the MIDX, but this
> > is fine, since get_sorted_entries() already handles duplicates. (A
> > future optimization in this area could avoid adding copies of objects
> > that we know already existing in the MIDX.)
>
> ...
>
> > This resolves the bug described in the first patch of this series.
>
> Thinking ahead to when this is a commit, perhaps this could instead
> refer to the 'preferred pack change with existing MIDX bitmap' test
> case?

Good idea, thanks.

> > @@ -610,10 +609,7 @@ static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
> >  		nth_midxed_pack_midx_entry(m,
> >  					   &fanout->entries[fanout->nr],
> >  					   cur_object);
> > -		if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack)
> > -			fanout->entries[fanout->nr].preferred = 1;
> > -		else
> > -			fanout->entries[fanout->nr].preferred = 0;
> > +		fanout->entries[fanout->nr].preferred = 0;
> >  		fanout->nr++;
>
> Here, we have lost the ability to set the 'preferred' bit from the
> previous MIDX. Good.

Yep, we don't want to propagate any of these bits forward when reusing
an existing MIDX. Thinking on it more, I think this is the only
legitimate use of MIDX reuse in the "I'm about to write bitmaps"
context.

I mentioned before the idea that we could use `--stdin-packs` now when
writing a MIDX bitmap where before it wasn't implemented (likely due to
problems caused by this bug). But the whole premise doesn't make a ton
of sense:

  - Every pack that's in the include_packs list would need to be handled
    separately.

  - And every pack that *isn't* in that list would be skipped.

Which means that it wouldn't help at all to reuse an existing MIDX. The
reason that we'd need to handle all included packs separately is subtle
and a little different from what's going on here, though. The problem
there is that if you have two packs, say P1 and P2, and P1 is in the
include list but P2 is not, then any objects duplicated between the two
and selected from P2 will disappear when writing the new MIDX.

Since the set of packs that are going into the new MIDX are precisely
equal to the set of packs that we'd need to handle separately, it
probably makes sense to continue to avoid using the existing MIDX when
writing a bitmap with the `--stdin-packs` option.

> > @@ -694,6 +689,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
> >  						    preferred, cur_fanout);
> >  		}
> >
> > +		if (-1 < preferred_pack && preferred_pack < start_pack)
> > +			midx_fanout_add_pack_fanout(&fanout, info,
> > +						    preferred_pack, 1,
> > +						    cur_fanout);
> > +
>
> And here, when there is a preferred pack _in the previous MIDX_,
> we add its objects a second time, but now with the preferred bit
> on. If the preferred pack is _not_ in the previous MIDX, then the
> 'preferred_pack < start_pack' condition will fail and the bits
> would have been set within the for loop.

Exactly!

> > @@ -346,7 +346,7 @@ test_expect_success 'preferred pack change with existing MIDX bitmap' '
> >  		test_path_is_file $midx &&
> >  		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
> >
> > -		test_must_fail git clone --no-local . clone2
> > +		git clone --no-local . clone2
>
> I mentioned in patch 1 that this test could use some comments about
> what is unexpected and what _is_ expected. I think this comment needs
> an update in this patch:
>
> 	# Generate a new MIDX which changes the preferred pack to a pack
> 	# contained in the existing MIDX, such that not all objects from
> 	# p2 that appear in the MIDX had their copy selected from p2.

Good eyes, thanks for spotting. I updated the comment below, too (which
doesn't exist in this version of the patch, but you suggested adding to
the first patch in this series).

Thanks,
Taylor

  reply	other threads:[~2022-08-22 18:14 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-19 21:30 [PATCH 0/6] midx: permit changing the preferred pack when reusing the MIDX Taylor Blau
2022-08-19 21:30 ` [PATCH 1/6] t5326: demonstrate potential bitmap corruption Taylor Blau
2022-08-22 16:09   ` Derrick Stolee
2022-08-22 17:57     ` Taylor Blau
2022-08-22 19:31       ` Junio C Hamano
2022-08-22 19:41         ` Taylor Blau
2022-08-19 21:30 ` [PATCH 2/6] t/lib-bitmap.sh: avoid silencing stderr Taylor Blau
2022-08-20 16:44   ` Abhradeep Chakraborty
2022-08-22 17:58     ` Taylor Blau
2022-08-19 21:30 ` [PATCH 3/6] midx.c: extract `struct midx_fanout` Taylor Blau
2022-08-19 21:30 ` [PATCH 4/6] midx.c: extract `midx_fanout_add_midx_fanout()` Taylor Blau
2022-08-19 21:30 ` [PATCH 5/6] midx.c: extract `midx_fanout_add_pack_fanout()` Taylor Blau
2022-08-19 21:30 ` [PATCH 6/6] midx.c: include preferred pack correctly with existing MIDX Taylor Blau
2022-08-20 18:40   ` Abhradeep Chakraborty
2022-08-22 18:08     ` Taylor Blau
2022-08-22 17:03   ` Derrick Stolee
2022-08-22 18:14     ` Taylor Blau [this message]
2022-08-22 17:04 ` [PATCH 0/6] midx: permit changing the preferred pack when reusing the MIDX Derrick Stolee
2022-08-22 19:44   ` Taylor Blau
2022-08-22 19:50 ` [PATCH v2 0/7] " Taylor Blau
2022-08-22 19:50   ` [PATCH v2 1/7] t5326: demonstrate potential bitmap corruption Taylor Blau
2022-08-22 19:50   ` [PATCH v2 2/7] t/lib-bitmap.sh: avoid silencing stderr Taylor Blau
2022-08-22 19:50   ` [PATCH v2 3/7] midx.c: extract `struct midx_fanout` Taylor Blau
2022-08-22 19:50   ` [PATCH v2 4/7] midx.c: extract `midx_fanout_add_midx_fanout()` Taylor Blau
2022-08-22 19:50   ` [PATCH v2 5/7] midx.c: extract `midx_fanout_add_pack_fanout()` Taylor Blau
2022-08-22 19:50   ` [PATCH v2 6/7] midx.c: include preferred pack correctly with existing MIDX Taylor Blau
2022-08-22 19:50   ` [PATCH v2 7/7] midx.c: avoid adding preferred objects twice Taylor Blau
2022-08-23 16:22     ` Derrick Stolee
2022-08-23 16:23   ` [PATCH v2 0/7] midx: permit changing the preferred pack when reusing the MIDX Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YwPHh/EDIS0S4uoj@nand.local \
    --to=me@ttaylorr.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=chakrabortyabhradeep79@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=kaartic.sivaraam@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).