git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] midx:  reduce memory pressure while writing bitmaps
@ 2022-07-18 20:36 Derrick Stolee via GitGitGadget
  2022-07-18 21:47 ` Ævar Arnfjörð Bjarmason
  2022-07-19 15:26 ` [PATCH v2 0/3] " Derrick Stolee via GitGitGadget
  0 siblings, 2 replies; 8+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-07-18 20:36 UTC (permalink / raw)
  To: git
  Cc: gitster, me, vdye, chakrabortyabhradeep79, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

We noticed that some 'git multi-pack-index write --bitmap' processes
were running with very high memory. It turns out that a lot of this
memory is required to store a list of every object in the written
multi-pack-index, with a second copy that has additional information
used for the bitmap writing logic.

Using 'valgrind --tool=massif' before this change, the following chart
shows how memory load increased and was maintained throughout the
process:

    GB
4.102^                                                             ::
     |              @  @::@@::@@::::::::@::::::@@:#:::::::::::::@@:: :
     |         :::::@@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |      :::: :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    :::: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
   0 +--------------------------------------------------------------->

It turns out that the 'struct write_midx_context' data is persisting
through the life of the process, including the 'entries' array. This
array is used last inside find_commits_for_midx_bitmap() within
write_midx_bitmap(). If we free (and nullify) the array at that point,
we can free a decent chunk of memory before the bitmap logic adds more
to the memory footprint.

Here is the massif memory load chart after this change:

    GB
3.111^      #
     |      #                              :::::::::::@::::::::::::::@
     |      #        ::::::::::::::::::::::::: : :: : @:: ::::: :: ::@
     |     @#  :::::::::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :::@#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
   0 +--------------------------------------------------------------->

It is unfortunate that the lifetime of the 'entries' array is less
clear. To make this simpler, I added a few things to try and prevent an
accidental reference:

 1. Using FREE_AND_NULL() we will at least get a segfault from reading a
    NULL pointer instead of a use-after-free.

 2. 'entries_nr' is also set to zero to make any loop that would iterate
    over the entries be trivial.

 3. Set the 'ctx' pointer to NULL within write_midx_bitmap() so it does
    not get another reference later. This requires adding a local copy
    of 'pack_order' giving us a reference that we can use later in the
    method.

 4. Add significant comments in write_midx_bitmap() and
    write_midx_internal() to add warnings for future authors who might
    accidentally add references to this cleared memory.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
    midx: reduce memory pressure while writing bitmaps
    
    The thing I'm most worried about with this patch is whether there is
    enough (or too much) defensive programming.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1292%2Fderrickstolee%2Fbitmap-memory-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1292/derrickstolee/bitmap-memory-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1292

 midx.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 5f0dd386b02..cc31d803a5f 100644
--- a/midx.c
+++ b/midx.c
@@ -1063,6 +1063,7 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
 	struct commit **commits = NULL;
 	uint32_t i, commits_nr;
 	uint16_t options = 0;
+	uint32_t *pack_order;
 	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash));
 	int ret;
 
@@ -1076,6 +1077,15 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
 
 	commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx);
 
+	/*
+	 * Remove the ctx.entries to reduce memory pressure.
+	 * Nullify 'ctx' to help avoid adding new references to ctx->entries.
+	 */
+	FREE_AND_NULL(ctx->entries);
+	ctx->entries_nr = 0;
+	pack_order = ctx->pack_order;
+	ctx = NULL;
+
 	/*
 	 * Build the MIDX-order index based on pdata.objects (which is already
 	 * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of
@@ -1102,7 +1112,7 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
 	 * bitmap_writer_finish().
 	 */
 	for (i = 0; i < pdata.nr_objects; i++)
-		index[ctx->pack_order[i]] = &pdata.objects[i].idx;
+		index[pack_order[i]] = &pdata.objects[i].idx;
 
 	bitmap_writer_select_commits(commits, commits_nr, -1);
 	ret = bitmap_writer_build(&pdata);
@@ -1443,6 +1453,11 @@ static int write_midx_internal(const char *object_dir,
 	if (flags & MIDX_WRITE_REV_INDEX &&
 	    git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0))
 		write_midx_reverse_index(midx_name.buf, midx_hash, &ctx);
+
+	/*
+	 * Writing the bitmap must be last, as it will free ctx.entries
+	 * to reduce memory pressure during the bitmap write.
+	 */
 	if (flags & MIDX_WRITE_BITMAP) {
 		if (write_midx_bitmap(midx_name.buf, midx_hash, &ctx,
 				      refs_snapshot, flags) < 0) {

base-commit: 9dd64cb4d310986dd7b8ca7fff92f9b61e0bd21a
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] midx:  reduce memory pressure while writing bitmaps
  2022-07-18 20:36 [PATCH] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
@ 2022-07-18 21:47 ` Ævar Arnfjörð Bjarmason
  2022-07-19 13:50   ` Derrick Stolee
  2022-07-19 15:26 ` [PATCH v2 0/3] " Derrick Stolee via GitGitGadget
  1 sibling, 1 reply; 8+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-18 21:47 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, vdye, chakrabortyabhradeep79, Derrick Stolee


On Mon, Jul 18 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
> [...]
> It is unfortunate that the lifetime of the 'entries' array is less
> clear. To make this simpler, I added a few things to try and prevent an
> accidental reference:
>
>  1. Using FREE_AND_NULL() we will at least get a segfault from reading a
>     NULL pointer instead of a use-after-free.
>
>  2. 'entries_nr' is also set to zero to make any loop that would iterate
>     over the entries be trivial.
>
>  3. Set the 'ctx' pointer to NULL within write_midx_bitmap() so it does
>     not get another reference later. This requires adding a local copy
>     of 'pack_order' giving us a reference that we can use later in the
>     method.
>
>  4. Add significant comments in write_midx_bitmap() and
>     write_midx_internal() to add warnings for future authors who might
>     accidentally add references to this cleared memory.
> [...]
> +	/*
> +	 * Remove the ctx.entries to reduce memory pressure.
> +	 * Nullify 'ctx' to help avoid adding new references to ctx->entries.
> +	 */
> +	FREE_AND_NULL(ctx->entries);
> +	ctx->entries_nr = 0;
> +	pack_order = ctx->pack_order;
> +	ctx = NULL;

After this change this is a ~70 line function, but only 3 lines at the
top actually use ctx for anything:
    
	/* the bug check for ctx.nr... */
	prepare_midx_packing_data(&pdata, ctx);
	commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx);

Did you consider just splitting it up so that that there's a "prepare
write" function? Then you don't need to worry about the scoping of ctx.

I'd think that would be better, then you also wouldn't need to implement
your own free-ing, nothing after this seems to use ctx->entries_nr (but
I just skimmed it), so it could just fall through to the free() at the
end of write_midx_internal() (the only caller), couldn't it?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] midx: reduce memory pressure while writing bitmaps
  2022-07-18 21:47 ` Ævar Arnfjörð Bjarmason
@ 2022-07-19 13:50   ` Derrick Stolee
  0 siblings, 0 replies; 8+ messages in thread
From: Derrick Stolee @ 2022-07-19 13:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Derrick Stolee via GitGitGadget
  Cc: git, gitster, me, vdye, chakrabortyabhradeep79

On 7/18/2022 5:47 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Jul 18 2022, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>> [...]
>> It is unfortunate that the lifetime of the 'entries' array is less
>> clear. To make this simpler, I added a few things to try and prevent an
>> accidental reference:
>>
>>  1. Using FREE_AND_NULL() we will at least get a segfault from reading a
>>     NULL pointer instead of a use-after-free.
>>
>>  2. 'entries_nr' is also set to zero to make any loop that would iterate
>>     over the entries be trivial.
>>
>>  3. Set the 'ctx' pointer to NULL within write_midx_bitmap() so it does
>>     not get another reference later. This requires adding a local copy
>>     of 'pack_order' giving us a reference that we can use later in the
>>     method.
>>
>>  4. Add significant comments in write_midx_bitmap() and
>>     write_midx_internal() to add warnings for future authors who might
>>     accidentally add references to this cleared memory.
>> [...]
>> +	/*
>> +	 * Remove the ctx.entries to reduce memory pressure.
>> +	 * Nullify 'ctx' to help avoid adding new references to ctx->entries.
>> +	 */
>> +	FREE_AND_NULL(ctx->entries);
>> +	ctx->entries_nr = 0;
>> +	pack_order = ctx->pack_order;
>> +	ctx = NULL;
> 
> After this change this is a ~70 line function, but only 3 lines at the
> top actually use ctx for anything:
>     
> 	/* the bug check for ctx.nr... */
> 	prepare_midx_packing_data(&pdata, ctx);
> 	commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx);
> 
> Did you consider just splitting it up so that that there's a "prepare
> write" function? Then you don't need to worry about the scoping of ctx.

I did not, and that's a good suggestion. Extracting these prepare steps
into write_midx_internal() works to reduce the complexity and make the
early free()ing more clear.
 
> I'd think that would be better, then you also wouldn't need to implement
> your own free-ing, nothing after this seems to use ctx->entries_nr (but
> I just skimmed it), so it could just fall through to the free() at the
> end of write_midx_internal() (the only caller), couldn't it?

I think this paragraph misunderstands the point. The bitmaps are being
computed and written before the MIDX lock file completes, so the free()
of the entries array is after the bitmaps are computed. To reduce the
memory pressure (by ~25%) by freeing early is the point of this patch.

We still want that free(ctx.entries) after the cleanup: label for the
error cases, but for the "happy path" we can free early.

By doing the refactoring, this point of having an earlier free() makes
things more clear.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 0/3] midx: reduce memory pressure while writing bitmaps
  2022-07-18 20:36 [PATCH] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
  2022-07-18 21:47 ` Ævar Arnfjörð Bjarmason
@ 2022-07-19 15:26 ` Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 1/3] pack-bitmap-write: use const for hashes Derrick Stolee via GitGitGadget
                     ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-07-19 15:26 UTC (permalink / raw)
  To: git
  Cc: gitster, me, vdye, chakrabortyabhradeep79,
	Ævar Arnfjörð Bjarmason, Derrick Stolee

We noticed an instance where writing multi-pack-index bitmaps was taking a
lot of memory. This small change can reduce the memory pressure slightly
(~25%), but more will be needed to significantly reduce the memory pressure.
Such a change would require updating the bitmap writing code to use on-disk
data structures instead. This is particularly tricky when the
multi-pack-index has not been fully written, because we don't want a point
in time where the object store has a new multi-pack-index without a
reachability bitmap.


Updates in v2
=============

To reduce confusion on the lifetime of 'ctx.entries', some refactoring
patches are inserted to first extract the use of 'ctx' out of
write_midx_bitmap() and into write_midx_internal(). This makes the
FREE_AND_NULL() stand out more clearly.

Thanks, -Stolee

Derrick Stolee (3):
  pack-bitmap-write: use const for hashes
  midx: extract bitmap write setup
  midx: reduce memory pressure while writing bitmaps

 midx.c              | 69 +++++++++++++++++++++++++++++----------------
 pack-bitmap-write.c |  2 +-
 pack-bitmap.h       |  2 +-
 3 files changed, 47 insertions(+), 26 deletions(-)


base-commit: 9dd64cb4d310986dd7b8ca7fff92f9b61e0bd21a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1292%2Fderrickstolee%2Fbitmap-memory-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1292/derrickstolee/bitmap-memory-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1292

Range-diff vs v1:

 -:  ----------- > 1:  a09fdbb8f3e pack-bitmap-write: use const for hashes
 1:  9104bc55795 ! 2:  4dfb7ae5112 midx:  reduce memory pressure while writing bitmaps
     @@ Metadata
      Author: Derrick Stolee <derrickstolee@github.com>
      
       ## Commit message ##
     -    midx:  reduce memory pressure while writing bitmaps
     +    midx: extract bitmap write setup
      
     -    We noticed that some 'git multi-pack-index write --bitmap' processes
     -    were running with very high memory. It turns out that a lot of this
     -    memory is required to store a list of every object in the written
     -    multi-pack-index, with a second copy that has additional information
     -    used for the bitmap writing logic.
     +    The write_midx_bitmap() method is a long method that does a lot of
     +    steps. It requires the write_midx_context struct for use in
     +    prepare_midx_packing_data() and find_commits_for_midx_bitmap(), but
     +    after that only needs the pack_order array.
      
     -    Using 'valgrind --tool=massif' before this change, the following chart
     -    shows how memory load increased and was maintained throughout the
     -    process:
     -
     -        GB
     -    4.102^                                                             ::
     -         |              @  @::@@::@@::::::::@::::::@@:#:::::::::::::@@:: :
     -         |         :::::@@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |      :::: :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |    :::: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -         | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     -       0 +--------------------------------------------------------------->
     -
     -    It turns out that the 'struct write_midx_context' data is persisting
     -    through the life of the process, including the 'entries' array. This
     -    array is used last inside find_commits_for_midx_bitmap() within
     -    write_midx_bitmap(). If we free (and nullify) the array at that point,
     -    we can free a decent chunk of memory before the bitmap logic adds more
     -    to the memory footprint.
     -
     -    Here is the massif memory load chart after this change:
     -
     -        GB
     -    3.111^      #
     -         |      #                              :::::::::::@::::::::::::::@
     -         |      #        ::::::::::::::::::::::::: : :: : @:: ::::: :: ::@
     -         |     @#  :::::::::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |  :::@#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -         |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     -       0 +--------------------------------------------------------------->
     -
     -    It is unfortunate that the lifetime of the 'entries' array is less
     -    clear. To make this simpler, I added a few things to try and prevent an
     -    accidental reference:
     -
     -     1. Using FREE_AND_NULL() we will at least get a segfault from reading a
     -        NULL pointer instead of a use-after-free.
     -
     -     2. 'entries_nr' is also set to zero to make any loop that would iterate
     -        over the entries be trivial.
     -
     -     3. Set the 'ctx' pointer to NULL within write_midx_bitmap() so it does
     -        not get another reference later. This requires adding a local copy
     -        of 'pack_order' giving us a reference that we can use later in the
     -        method.
     -
     -     4. Add significant comments in write_midx_bitmap() and
     -        write_midx_internal() to add warnings for future authors who might
     -        accidentally add references to this cleared memory.
     +    This is a messy, but completely non-functional refactoring. The code is
     +    only being moved around to reduce visibility of the write_midx_context
     +    during the longest part of computing reachability bitmaps.
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
       ## midx.c ##
     -@@ midx.c: static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
     - 	struct commit **commits = NULL;
     - 	uint32_t i, commits_nr;
     - 	uint16_t options = 0;
     -+	uint32_t *pack_order;
     - 	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash));
     - 	int ret;
     +@@ midx.c: static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr
     + 	return cb.commits;
     + }
       
     -@@ midx.c: static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
     +-static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
     +-			     struct write_midx_context *ctx,
     ++static int write_midx_bitmap(const char *midx_name,
     ++			     const unsigned char *midx_hash,
     ++			     struct packing_data *pdata,
     ++			     struct commit **commits,
     ++			     uint32_t commits_nr,
     ++			     uint32_t *pack_order,
     + 			     const char *refs_snapshot,
     + 			     unsigned flags)
     + {
     +-	struct packing_data pdata;
     +-	struct pack_idx_entry **index;
     +-	struct commit **commits = NULL;
     +-	uint32_t i, commits_nr;
     ++	int ret, i;
     + 	uint16_t options = 0;
     +-	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash));
     +-	int ret;
     +-
     +-	if (!ctx->entries_nr)
     +-		BUG("cannot write a bitmap without any objects");
     ++	struct pack_idx_entry **index;
     ++	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name,
     ++					hash_to_hex(midx_hash));
       
     - 	commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx);
     + 	if (flags & MIDX_WRITE_BITMAP_HASH_CACHE)
     + 		options |= BITMAP_OPT_HASH_CACHE;
       
     -+	/*
     -+	 * Remove the ctx.entries to reduce memory pressure.
     -+	 * Nullify 'ctx' to help avoid adding new references to ctx->entries.
     -+	 */
     -+	FREE_AND_NULL(ctx->entries);
     -+	ctx->entries_nr = 0;
     -+	pack_order = ctx->pack_order;
     -+	ctx = NULL;
     -+
     +-	prepare_midx_packing_data(&pdata, ctx);
     +-
     +-	commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx);
     +-
       	/*
       	 * Build the MIDX-order index based on pdata.objects (which is already
       	 * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of
     + 	 * this order).
     + 	 */
     +-	ALLOC_ARRAY(index, pdata.nr_objects);
     +-	for (i = 0; i < pdata.nr_objects; i++)
     +-		index[i] = &pdata.objects[i].idx;
     ++	ALLOC_ARRAY(index, pdata->nr_objects);
     ++	for (i = 0; i < pdata->nr_objects; i++)
     ++		index[i] = &pdata->objects[i].idx;
     + 
     + 	bitmap_writer_show_progress(flags & MIDX_PROGRESS);
     +-	bitmap_writer_build_type_index(&pdata, index, pdata.nr_objects);
     ++	bitmap_writer_build_type_index(pdata, index, pdata->nr_objects);
     + 
     + 	/*
     + 	 * bitmap_writer_finish expects objects in lex order, but pack_order
      @@ midx.c: static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
     + 	 * happens between bitmap_writer_build_type_index() and
       	 * bitmap_writer_finish().
       	 */
     - 	for (i = 0; i < pdata.nr_objects; i++)
     +-	for (i = 0; i < pdata.nr_objects; i++)
      -		index[ctx->pack_order[i]] = &pdata.objects[i].idx;
     -+		index[pack_order[i]] = &pdata.objects[i].idx;
     ++	for (i = 0; i < pdata->nr_objects; i++)
     ++		index[pack_order[i]] = &pdata->objects[i].idx;
       
       	bitmap_writer_select_commits(commits, commits_nr, -1);
     - 	ret = bitmap_writer_build(&pdata);
     +-	ret = bitmap_writer_build(&pdata);
     ++	ret = bitmap_writer_build(pdata);
     + 	if (ret < 0)
     + 		goto cleanup;
     + 
     + 	bitmap_writer_set_checksum(midx_hash);
     +-	bitmap_writer_finish(index, pdata.nr_objects, bitmap_name, options);
     ++	bitmap_writer_finish(index, pdata->nr_objects, bitmap_name, options);
     + 
     + cleanup:
     + 	free(index);
      @@ midx.c: static int write_midx_internal(const char *object_dir,
       	if (flags & MIDX_WRITE_REV_INDEX &&
       	    git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0))
       		write_midx_reverse_index(midx_name.buf, midx_hash, &ctx);
      +
     -+	/*
     -+	 * Writing the bitmap must be last, as it will free ctx.entries
     -+	 * to reduce memory pressure during the bitmap write.
     -+	 */
       	if (flags & MIDX_WRITE_BITMAP) {
     - 		if (write_midx_bitmap(midx_name.buf, midx_hash, &ctx,
     +-		if (write_midx_bitmap(midx_name.buf, midx_hash, &ctx,
     ++		struct packing_data pdata;
     ++		struct commit **commits;
     ++		uint32_t commits_nr;
     ++
     ++		if (!ctx.entries_nr)
     ++			BUG("cannot write a bitmap without any objects");
     ++
     ++		prepare_midx_packing_data(&pdata, &ctx);
     ++
     ++		commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx);
     ++
     ++		if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
     ++				      commits, commits_nr, ctx.pack_order,
       				      refs_snapshot, flags) < 0) {
     + 			error(_("could not write multi-pack bitmap"));
     + 			result = 1;
 -:  ----------- > 3:  98e72f71b6b midx: reduce memory pressure while writing bitmaps

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] pack-bitmap-write: use const for hashes
  2022-07-19 15:26 ` [PATCH v2 0/3] " Derrick Stolee via GitGitGadget
@ 2022-07-19 15:26   ` Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 2/3] midx: extract bitmap write setup Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 3/3] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
  2 siblings, 0 replies; 8+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-07-19 15:26 UTC (permalink / raw)
  To: git
  Cc: gitster, me, vdye, chakrabortyabhradeep79,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The next change will use a const array when calling this method. There
is no need for the non-const version, so let's do this cleanup quickly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 pack-bitmap-write.c | 2 +-
 pack-bitmap.h       | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index c43375bd344..4fcfaed428f 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -683,7 +683,7 @@ static void write_hash_cache(struct hashfile *f,
 	}
 }
 
-void bitmap_writer_set_checksum(unsigned char *sha1)
+void bitmap_writer_set_checksum(const unsigned char *sha1)
 {
 	hashcpy(writer.pack_checksum, sha1);
 }
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3d3ddd77345..f3a57ca065f 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -75,7 +75,7 @@ int bitmap_has_oid_in_uninteresting(struct bitmap_index *, const struct object_i
 off_t get_disk_usage_from_bitmap(struct bitmap_index *, struct rev_info *);
 
 void bitmap_writer_show_progress(int show);
-void bitmap_writer_set_checksum(unsigned char *sha1);
+void bitmap_writer_set_checksum(const unsigned char *sha1);
 void bitmap_writer_build_type_index(struct packing_data *to_pack,
 				    struct pack_idx_entry **index,
 				    uint32_t index_nr);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] midx: extract bitmap write setup
  2022-07-19 15:26 ` [PATCH v2 0/3] " Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 1/3] pack-bitmap-write: use const for hashes Derrick Stolee via GitGitGadget
@ 2022-07-19 15:26   ` Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 3/3] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
  2 siblings, 0 replies; 8+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-07-19 15:26 UTC (permalink / raw)
  To: git
  Cc: gitster, me, vdye, chakrabortyabhradeep79,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

The write_midx_bitmap() method is a long method that does a lot of
steps. It requires the write_midx_context struct for use in
prepare_midx_packing_data() and find_commits_for_midx_bitmap(), but
after that only needs the pack_order array.

This is a messy, but completely non-functional refactoring. The code is
only being moved around to reduce visibility of the write_midx_context
during the longest part of computing reachability bitmaps.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 midx.c | 56 ++++++++++++++++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/midx.c b/midx.c
index 5f0dd386b02..e2dd808b35d 100644
--- a/midx.c
+++ b/midx.c
@@ -1053,40 +1053,35 @@ static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr
 	return cb.commits;
 }
 
-static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
-			     struct write_midx_context *ctx,
+static int write_midx_bitmap(const char *midx_name,
+			     const unsigned char *midx_hash,
+			     struct packing_data *pdata,
+			     struct commit **commits,
+			     uint32_t commits_nr,
+			     uint32_t *pack_order,
 			     const char *refs_snapshot,
 			     unsigned flags)
 {
-	struct packing_data pdata;
-	struct pack_idx_entry **index;
-	struct commit **commits = NULL;
-	uint32_t i, commits_nr;
+	int ret, i;
 	uint16_t options = 0;
-	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash));
-	int ret;
-
-	if (!ctx->entries_nr)
-		BUG("cannot write a bitmap without any objects");
+	struct pack_idx_entry **index;
+	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name,
+					hash_to_hex(midx_hash));
 
 	if (flags & MIDX_WRITE_BITMAP_HASH_CACHE)
 		options |= BITMAP_OPT_HASH_CACHE;
 
-	prepare_midx_packing_data(&pdata, ctx);
-
-	commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx);
-
 	/*
 	 * Build the MIDX-order index based on pdata.objects (which is already
 	 * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of
 	 * this order).
 	 */
-	ALLOC_ARRAY(index, pdata.nr_objects);
-	for (i = 0; i < pdata.nr_objects; i++)
-		index[i] = &pdata.objects[i].idx;
+	ALLOC_ARRAY(index, pdata->nr_objects);
+	for (i = 0; i < pdata->nr_objects; i++)
+		index[i] = &pdata->objects[i].idx;
 
 	bitmap_writer_show_progress(flags & MIDX_PROGRESS);
-	bitmap_writer_build_type_index(&pdata, index, pdata.nr_objects);
+	bitmap_writer_build_type_index(pdata, index, pdata->nr_objects);
 
 	/*
 	 * bitmap_writer_finish expects objects in lex order, but pack_order
@@ -1101,16 +1096,16 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
 	 * happens between bitmap_writer_build_type_index() and
 	 * bitmap_writer_finish().
 	 */
-	for (i = 0; i < pdata.nr_objects; i++)
-		index[ctx->pack_order[i]] = &pdata.objects[i].idx;
+	for (i = 0; i < pdata->nr_objects; i++)
+		index[pack_order[i]] = &pdata->objects[i].idx;
 
 	bitmap_writer_select_commits(commits, commits_nr, -1);
-	ret = bitmap_writer_build(&pdata);
+	ret = bitmap_writer_build(pdata);
 	if (ret < 0)
 		goto cleanup;
 
 	bitmap_writer_set_checksum(midx_hash);
-	bitmap_writer_finish(index, pdata.nr_objects, bitmap_name, options);
+	bitmap_writer_finish(index, pdata->nr_objects, bitmap_name, options);
 
 cleanup:
 	free(index);
@@ -1443,8 +1438,21 @@ static int write_midx_internal(const char *object_dir,
 	if (flags & MIDX_WRITE_REV_INDEX &&
 	    git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0))
 		write_midx_reverse_index(midx_name.buf, midx_hash, &ctx);
+
 	if (flags & MIDX_WRITE_BITMAP) {
-		if (write_midx_bitmap(midx_name.buf, midx_hash, &ctx,
+		struct packing_data pdata;
+		struct commit **commits;
+		uint32_t commits_nr;
+
+		if (!ctx.entries_nr)
+			BUG("cannot write a bitmap without any objects");
+
+		prepare_midx_packing_data(&pdata, &ctx);
+
+		commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx);
+
+		if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
+				      commits, commits_nr, ctx.pack_order,
 				      refs_snapshot, flags) < 0) {
 			error(_("could not write multi-pack bitmap"));
 			result = 1;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] midx: reduce memory pressure while writing bitmaps
  2022-07-19 15:26 ` [PATCH v2 0/3] " Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 1/3] pack-bitmap-write: use const for hashes Derrick Stolee via GitGitGadget
  2022-07-19 15:26   ` [PATCH v2 2/3] midx: extract bitmap write setup Derrick Stolee via GitGitGadget
@ 2022-07-19 15:26   ` Derrick Stolee via GitGitGadget
  2022-07-19 15:59     ` Junio C Hamano
  2 siblings, 1 reply; 8+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2022-07-19 15:26 UTC (permalink / raw)
  To: git
  Cc: gitster, me, vdye, chakrabortyabhradeep79,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Derrick Stolee

From: Derrick Stolee <derrickstolee@github.com>

We noticed that some 'git multi-pack-index write --bitmap' processes
were running with very high memory. It turns out that a lot of this
memory is required to store a list of every object in the written
multi-pack-index, with a second copy that has additional information
used for the bitmap writing logic.

Using 'valgrind --tool=massif' before this change, the following chart
shows how memory load increased and was maintained throughout the
process:

    GB
4.102^                                                             ::
     |              @  @::@@::@@::::::::@::::::@@:#:::::::::::::@@:: :
     |         :::::@@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |      :::: :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    :::: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |    : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     |   :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
     | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: :
   0 +--------------------------------------------------------------->

It turns out that the 'struct write_midx_context' data is persisting
through the life of the process, including the 'entries' array. This
array is used last inside find_commits_for_midx_bitmap() within
write_midx_bitmap(). If we free (and nullify) the array at that point,
we can free a decent chunk of memory before the bitmap logic adds more
to the memory footprint.

Here is the massif memory load chart after this change:

    GB
3.111^      #
     |      #                              :::::::::::@::::::::::::::@
     |      #        ::::::::::::::::::::::::: : :: : @:: ::::: :: ::@
     |     @#  :::::::::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |     @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :::@#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
     |  :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@
   0 +--------------------------------------------------------------->

The previous change introduced a refactoring of write_midx_bitmap() to
make it more clear how much of the 'struct write_midx_context' instance
is needed at different parts of the process. In addition, the following
defensive programming measures were put in place:

 1. Using FREE_AND_NULL() we will at least get a segfault from reading a
    NULL pointer instead of a use-after-free.

 2. 'entries_nr' is also set to zero to make any loop that would iterate
    over the entries be trivial.

 3. Add significant comments in write_midx_internal() to add warnings
    for future authors who might accidentally add references to this
    cleared memory.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 midx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/midx.c b/midx.c
index e2dd808b35d..772ab7d2944 100644
--- a/midx.c
+++ b/midx.c
@@ -1451,6 +1451,15 @@ static int write_midx_internal(const char *object_dir,
 
 		commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx);
 
+		/*
+		 * The previous steps translated the information from
+		 * 'entries' into information suitable for constructing
+		 * bitmaps. We no longer need that array, so clear it to
+		 * reduce memory pressure.
+		 */
+		FREE_AND_NULL(ctx.entries);
+		ctx.entries_nr = 0;
+
 		if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
 				      commits, commits_nr, ctx.pack_order,
 				      refs_snapshot, flags) < 0) {
@@ -1459,6 +1468,10 @@ static int write_midx_internal(const char *object_dir,
 			goto cleanup;
 		}
 	}
+	/*
+	 * NOTE: Do not use ctx.entries beyond this point, since it might
+	 * have been freed in the previous if block.
+	 */
 
 	if (ctx.m)
 		close_object_store(the_repository->objects);
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/3] midx: reduce memory pressure while writing bitmaps
  2022-07-19 15:26   ` [PATCH v2 3/3] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
@ 2022-07-19 15:59     ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2022-07-19 15:59 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, me, vdye, chakrabortyabhradeep79,
	Ævar Arnfjörð Bjarmason, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/midx.c b/midx.c
> index e2dd808b35d..772ab7d2944 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1451,6 +1451,15 @@ static int write_midx_internal(const char *object_dir,
>  
>  		commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx);
>  
> +		/*
> +		 * The previous steps translated the information from
> +		 * 'entries' into information suitable for constructing
> +		 * bitmaps. We no longer need that array, so clear it to
> +		 * reduce memory pressure.
> +		 */
> +		FREE_AND_NULL(ctx.entries);
> +		ctx.entries_nr = 0;
> +
>  		if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
>  				      commits, commits_nr, ctx.pack_order,
>  				      refs_snapshot, flags) < 0) {

As the reduced helper, thanks to step [1/3], only takes the
pack_order[] array, without being even aware of other members in the
ctx struct, it is immediately obvious that this early freeing is
safe for this call.  It is a bit messy.

I've been staring at the code and was wondering if we can just get
rid of pack_order member from the context, and make pack_order a
separate local variable that belong to this function.  The separate
variable needs to be packaged together with ctx back to please the
chunk-format API, so it may require more boilerplate code and may
not be an overall win.

> @@ -1459,6 +1468,10 @@ static int write_midx_internal(const char *object_dir,
>  			goto cleanup;
>  		}
>  	}
> +	/*
> +	 * NOTE: Do not use ctx.entries beyond this point, since it might
> +	 * have been freed in the previous if block.
> +	 */

OK.

>  	if (ctx.m)
>  		close_object_store(the_repository->objects);


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-07-19 15:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-18 20:36 [PATCH] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
2022-07-18 21:47 ` Ævar Arnfjörð Bjarmason
2022-07-19 13:50   ` Derrick Stolee
2022-07-19 15:26 ` [PATCH v2 0/3] " Derrick Stolee via GitGitGadget
2022-07-19 15:26   ` [PATCH v2 1/3] pack-bitmap-write: use const for hashes Derrick Stolee via GitGitGadget
2022-07-19 15:26   ` [PATCH v2 2/3] midx: extract bitmap write setup Derrick Stolee via GitGitGadget
2022-07-19 15:26   ` [PATCH v2 3/3] midx: reduce memory pressure while writing bitmaps Derrick Stolee via GitGitGadget
2022-07-19 15:59     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).