git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: gitster@pobox.com, Johannes.Schindelin@gmx.de,
	chakrabortyabhradeep79@gmail.com, derrickstolee@github.com,
	jonathantanmy@google.com, kaartic.sivaraam@gmail.com
Subject: [PATCH v2 1/7] t5326: demonstrate potential bitmap corruption
Date: Mon, 22 Aug 2022 15:50:32 -0400	[thread overview]
Message-ID: <6b38bfcd2c2bced350cea198f7d576ffb81f3481.1661197803.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1661197803.git.me@ttaylorr.com>

It is possible to generate a corrupt MIDX bitmap when certain conditions
are met. This happens when the preferred pack "P" changes to one (say,
"Q") that:

  - "Q" has objects included in an existing MIDX,
  - but "Q" is different than "P",
  - and "Q" and "P" have some objects in common

When this is the case, not all objects from "Q" will be selected from
"Q" (ie., the generated MIDX will represent them as coming from a
different pack), despite "Q" being preferred.

This is an invariant violation, since all objects contained in the
MIDX's preferred pack are supposed to originate from the preferred pack.
In other words, all duplicate objects are resolved in favor of the copy
that comes from the MIDX's preferred pack, if any.

This violation results in a corrupt object order, which cannot be
interpreted by the pack-bitmap code, leading to broken clones and other
defects.

This test demonstrates the above problem by constructing a minimal
reproduction, and showing that the final `git clone` invocation fails.

The reproduction is mostly straightforward, except that the new pack
generated between MIDX writes (which is necessary in order to prevent
that operation from being a noop) must sort ahead of all existing packs
in order to prevent a different pack (neither "P" nor "Q") from
appearing as preferred (meaning all its objects appear in order at the
beginning of the pseudo-pack order).

Subsequent commits will first refactor the midx.c::get_sorted_entries()
function, and then fix this bug.

Reported-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5326-multi-pack-bitmaps.sh | 47 +++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
index 4fe57414c1..c364677ae8 100755
--- a/t/t5326-multi-pack-bitmaps.sh
+++ b/t/t5326-multi-pack-bitmaps.sh
@@ -307,4 +307,51 @@ test_expect_success 'graceful fallback when missing reverse index' '
 	)
 '
 
+test_expect_success 'preferred pack change with existing MIDX bitmap' '
+	git init preferred-pack-with-existing &&
+	(
+		cd preferred-pack-with-existing &&
+
+		test_commit base &&
+		test_commit other &&
+
+		git rev-list --objects --no-object-names base >p1.objects &&
+		git rev-list --objects --no-object-names other >p2.objects &&
+
+		p1="$(git pack-objects "$objdir/pack/pack" \
+			--delta-base-offset <p1.objects)" &&
+		p2="$(git pack-objects "$objdir/pack/pack" \
+			--delta-base-offset <p2.objects)" &&
+
+		# Generate a MIDX containing the first two packs,
+		# marking p1 as preferred, and ensure that it can be
+		# successfully cloned.
+		git multi-pack-index write --bitmap \
+			--preferred-pack="pack-$p1.pack" &&
+		test_path_is_file $midx &&
+		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
+		git clone --no-local . clone1 &&
+
+		# Then generate a new pack which sorts ahead of any
+		# existing pack (by tweaking the pack prefix).
+		test_commit foo &&
+		git pack-objects --all --unpacked $objdir/pack/pack0 &&
+
+		# Generate a new MIDX which changes the preferred pack
+		# to a pack contained in the existing MIDX, such that
+		# not all objects from p2 that appear in the MIDX had
+		# their copy selected from p2.
+		git multi-pack-index write --bitmap \
+			--preferred-pack="pack-$p2.pack" &&
+		test_path_is_file $midx &&
+		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
+
+		# When the above circumstances are met, an existing bug
+		# in the MIDX machinery will cause the reverse index to
+		# be read incorrectly, resulting in failed clones (among
+		# other things).
+		test_must_fail git clone --no-local . clone2
+	)
+'
+
 test_done
-- 
2.37.0.1.g1379af2e9d


  reply	other threads:[~2022-08-22 19:50 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-19 21:30 [PATCH 0/6] midx: permit changing the preferred pack when reusing the MIDX Taylor Blau
2022-08-19 21:30 ` [PATCH 1/6] t5326: demonstrate potential bitmap corruption Taylor Blau
2022-08-22 16:09   ` Derrick Stolee
2022-08-22 17:57     ` Taylor Blau
2022-08-22 19:31       ` Junio C Hamano
2022-08-22 19:41         ` Taylor Blau
2022-08-19 21:30 ` [PATCH 2/6] t/lib-bitmap.sh: avoid silencing stderr Taylor Blau
2022-08-20 16:44   ` Abhradeep Chakraborty
2022-08-22 17:58     ` Taylor Blau
2022-08-19 21:30 ` [PATCH 3/6] midx.c: extract `struct midx_fanout` Taylor Blau
2022-08-19 21:30 ` [PATCH 4/6] midx.c: extract `midx_fanout_add_midx_fanout()` Taylor Blau
2022-08-19 21:30 ` [PATCH 5/6] midx.c: extract `midx_fanout_add_pack_fanout()` Taylor Blau
2022-08-19 21:30 ` [PATCH 6/6] midx.c: include preferred pack correctly with existing MIDX Taylor Blau
2022-08-20 18:40   ` Abhradeep Chakraborty
2022-08-22 18:08     ` Taylor Blau
2022-08-22 17:03   ` Derrick Stolee
2022-08-22 18:14     ` Taylor Blau
2022-08-22 17:04 ` [PATCH 0/6] midx: permit changing the preferred pack when reusing the MIDX Derrick Stolee
2022-08-22 19:44   ` Taylor Blau
2022-08-22 19:50 ` [PATCH v2 0/7] " Taylor Blau
2022-08-22 19:50   ` Taylor Blau [this message]
2022-08-22 19:50   ` [PATCH v2 2/7] t/lib-bitmap.sh: avoid silencing stderr Taylor Blau
2022-08-22 19:50   ` [PATCH v2 3/7] midx.c: extract `struct midx_fanout` Taylor Blau
2022-08-22 19:50   ` [PATCH v2 4/7] midx.c: extract `midx_fanout_add_midx_fanout()` Taylor Blau
2022-08-22 19:50   ` [PATCH v2 5/7] midx.c: extract `midx_fanout_add_pack_fanout()` Taylor Blau
2022-08-22 19:50   ` [PATCH v2 6/7] midx.c: include preferred pack correctly with existing MIDX Taylor Blau
2022-08-22 19:50   ` [PATCH v2 7/7] midx.c: avoid adding preferred objects twice Taylor Blau
2022-08-23 16:22     ` Derrick Stolee
2022-08-23 16:23   ` [PATCH v2 0/7] midx: permit changing the preferred pack when reusing the MIDX Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6b38bfcd2c2bced350cea198f7d576ffb81f3481.1661197803.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=chakrabortyabhradeep79@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=kaartic.sivaraam@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).