All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, dstolee@microsoft.com,
	jeffhost@microsoft.com, peff@peff.net
Subject: [PATCH 2/4] midx.c: lookup MIDX by object directory during expire
Date: Fri, 8 Oct 2021 17:46:32 -0400	[thread overview]
Message-ID: <84e95aacbdfb092082d0ca467892552982134774.1633729502.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1633729502.git.me@ttaylorr.com>

Before a new MIDX can be written, expire_midx_packs() first loads the
existing MIDX, figures out which packs can be expired, and then writes a
new MIDX based on that information.

In order to load the existing MIDX, it uses load_multi_pack_index(),
which mmaps the multi-pack-index file, but does not store the resulting
`struct multi_pack_index *` in the object store.

write_midx_internal() also needs to open the existing MIDX, and it does
so by iterating the results of get_multi_pack_index(), so that it reuses
the same pointer held by the object store. But before it can move the
new MIDX into place, it close_object_store() to munmap() the
multi-pack-index file to accommodate platforms like Windows which don't
allow overwriting files which are memory mapped.

That's where things get weird. Since expire_midx_packs has its own
*separate* memory mapped copy of the MIDX, the MIDX file is still memory
mapped! Interestingly, this doesn't seem to cause a problem in our
tests. (I believe that this has much more to do with my own lack of
familiarity with Windows than it does a lack of coverage in our tests).

In any case, we can side-step the whole issue by teaching
expire_midx_packs() to use the `struct multi_pack_index` pointer it
found via the object store instead of maintain its own copy. That way,
when write_midx_internal() calls `close_object_store()`, we know that
there are no memory mapped copies of the MIDX laying around.

A couple of other small notes about this patch:

  - As far as I can tell, passing `local == 1` to the call to
    load_multi_pack_index() was an error, since object_dir could be an
    alternate. But it doesn't matter, since even though we write
    `m->local = 1`, we never read that field back later on.

  - Setting `m = NULL` after write_midx_internal() was likely to prevent
    a double-free back from when that function took a `struct
    multi_pack_index *` that it called close_midx() on itself. We can
    rely on write_midx_internal() to call that for us now.

Finally, this enforces the same "the value of --object-dir must be the
local object store, or an alternate" rule from f57a739691 (midx: avoid
opening multiple MIDXs when writing, 2021-09-01) to the `expire`
sub-command, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
This does leak the MIDX write_midx_internal returns before calling
close_object_store(). We can't just blindly call close_object_store()
here, either, since it's susceptible to double-frees. I'll think about
improving this in a separate series.

 midx.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/midx.c b/midx.c
index b66b75a3cd..7f1addf4b6 100644
--- a/midx.c
+++ b/midx.c
@@ -1707,7 +1707,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
 {
 	uint32_t i, *count, result = 0;
 	struct string_list packs_to_drop = STRING_LIST_INIT_DUP;
-	struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
+	struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir);
 	struct progress *progress = NULL;

 	if (!m)
@@ -1752,12 +1752,11 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla

 	free(count);

-	if (packs_to_drop.nr) {
+	if (packs_to_drop.nr)
 		result = write_midx_internal(object_dir, NULL, &packs_to_drop, NULL, NULL, flags);
-		m = NULL;
-	}

 	string_list_clear(&packs_to_drop, 0);
+
 	return result;
 }

--
2.33.0.96.g73915697e6


  parent reply	other threads:[~2021-10-08 21:46 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-08 21:46 [PATCH 0/4] midx: avoid potential rename-while-mapped on Windows Taylor Blau
2021-10-08 21:46 ` [PATCH 1/4] midx.c: extract MIDX lookup by object_dir Taylor Blau
2021-10-08 21:46 ` Taylor Blau [this message]
2021-10-13 13:28   ` [PATCH 2/4] midx.c: lookup MIDX by object directory during expire Derrick Stolee
2021-10-08 21:46 ` [PATCH 3/4] midx.c: lookup MIDX by object directory during repack Taylor Blau
2021-10-08 21:46 ` [PATCH 4/4] midx.c: guard against commit_lock_file() failures Taylor Blau
2021-10-09  2:07   ` Ævar Arnfjörð Bjarmason
2021-10-13 13:29 ` [PATCH 0/4] midx: avoid potential rename-while-mapped on Windows Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=84e95aacbdfb092082d0ca467892552982134774.1633729502.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.