All of lore.kernel.org
 help / color / mirror / Atom feed
From: Son Luong Ngoc <sluongng@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Son Luong Ngoc via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Subject: Re: [PATCH] midx: apply gitconfig to midx repack
Date: Tue, 5 May 2020 18:03:33 +0200	[thread overview]
Message-ID: <74A7FE73-6B5F-4DCF-9A57-AD11306CFAF8@gmail.com> (raw)
In-Reply-To: <8bd91a14-75dc-76e2-31b4-54eff5bea8dd@gmail.com>

Hi Derrick,

Thanks for a swift and comprehensive review.

> On May 5, 2020, at 15:50, Derrick Stolee <stolee@gmail.com> wrote:
> 
> In the scenario where there is a .keep pack _and_ it is small enough to get
> picked up by the batch size, the 'git multi-pack-index repack' command will
> create a new pack containing its objects (and objects from other packs) but
> the 'git multi-pack-index expire' command will not delete the pack with .keep.
> 
> The good news is that after the first repack, the objects in the pack are
> in a newer pack, so the multi-pack-index will not repack those objects from
> that pack multiple times. However, this may be unintended behavior for the
> user that specified the .keep pack.

Yup I experienced exactly this when trying to test midx repack/expire
with biggest pack file marked with `.keep`.
Luckily the storage size bump for duplicated objects was not noticeable in my case.
You worded the situation precisely.

> I think the right thing to do to respect "repack.packKeptObjects = false" is
> to ignore the packs when selecting the batch of objects. Instead of asking
> you to do this, I added a patch below. Please take it into your v2, if you
> don't mind.

Gladly.
This should help me a lot for re-rolling V2.

>> +static int delta_base_offset = 1;
>> +static int write_bitmaps = -1;
>> +static int use_delta_islands;
>> +
> 
> Why not make these local to the midx_repack method?

No practical reason except me shamelessly lifted those from builtin/repack.c.
I was a bit confused how `git repack` houses these logic in the builtin file,
while midx was having these logic in the midx.c instead of builtin/multi-pack-index.c.

I make them local in V2.

>> int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags)
>> {
>> 	int result = 0;
>> @@ -1381,12 +1385,25 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
>> 	} else if (fill_included_packs_all(m, include_pack))
>> 		goto cleanup;
>> 
>> +  git_config_get_bool("repack.usedeltabaseoffset", &delta_base_offset);
>> +  git_config_get_bool("repack.writebitmaps", &write_bitmaps);
>> +  git_config_get_bool("repack.usedeltaislands", &use_delta_islands);
>> +
> 
> It looks like you have some spacing issues here. Perhaps use tabs?

Rookie mistake on my part. Will fix it in V2

>> +	if (write_bitmaps > 0)
>> +		argv_array_push(&cmd.args, "--write-bitmap-index");
>> +	else if (write_bitmaps < 0)
>> +		argv_array_push(&cmd.args, "--write-bitmap-index-quiet");
> 
> These make less sense. Unless --batch-size=0 and there are no .keep
> packs (with the patch below) I'm not sure we _can_ write bitmap indexes
> here. The pack-file is not necessarily closed under reachability. Or,
> will supplying these arguments to 'git pack-objects' actually do that
> closure?
> 
> I would be happy to special-case these options to the "--batch-size=0"
> situation and otherwise ignore them. This then gets into enough
> complication that we should update the documentation as in the patch
> below.

You make a great point here. 
I completely missed this as I have been largely testing with repacking only 2 packs,
effectively with --batch-size=0.

I think having the bitmaps index is highly desirable in `--batch-size=0` case.
I will try to include that in V2 (with Documentation).

> At minimum, it would be good to have some tests that exercise these
> code paths so we know they are behaving correctly.

I will do some readings with the current tests for repack and midx.
Hopefully I will have something for V2. (^_^ !)

> Thanks,
> -Stolee

Cheers,
Son Luong


  reply	other threads:[~2020-05-05 16:03 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-05 13:06 [PATCH] midx: apply gitconfig to midx repack Son Luong Ngoc via GitGitGadget
2020-05-05 13:50 ` Derrick Stolee
2020-05-05 16:03   ` Son Luong Ngoc [this message]
2020-05-06  8:56     ` Son Luong Ngoc
2020-05-06  9:43 ` [PATCH v2 0/2] " Son Luong Ngoc via GitGitGadget
2020-05-06  9:43   ` [PATCH v2 1/2] " Son Luong Ngoc via GitGitGadget
2020-05-06 12:03     ` Derrick Stolee
2020-05-06 17:03     ` Junio C Hamano
2020-05-07  7:29       ` Son Luong Ngoc
2020-05-06  9:43   ` [PATCH v2 2/2] multi-pack-index: respect repack.packKeptObjects=false Derrick Stolee via GitGitGadget
2020-05-06 16:18     ` Eric Sunshine
2020-05-06 16:36       ` Derrick Stolee
2020-05-09 14:24   ` [PATCH v3 0/3] midx: apply gitconfig to midx repack Son Luong Ngoc via GitGitGadget
2020-05-09 14:24     ` [PATCH v3 1/3] midx: teach "git multi-pack-index repack" honor "git repack" configurations Son Luong Ngoc via GitGitGadget
2020-05-09 16:51       ` Junio C Hamano
2020-05-10 14:27         ` Son Luong Ngoc
2020-05-09 14:24     ` [PATCH v3 2/3] multi-pack-index: respect repack.packKeptObjects=false Derrick Stolee via GitGitGadget
2020-05-09 16:11       ` Đoàn Trần Công Danh
2020-05-09 17:33         ` Junio C Hamano
2020-05-10  6:38           ` Đoàn Trần Công Danh
2020-05-10 15:52             ` Son Luong Ngoc
2020-05-09 14:24     ` [PATCH v3 3/3] Ensured t5319 follows arith expansion guideline Son Luong Ngoc via GitGitGadget
2020-05-09 16:55       ` Junio C Hamano
2020-05-10 16:07     ` [PATCH v4 0/2] midx: apply gitconfig to midx repack Son Luong Ngoc via GitGitGadget
2020-05-10 16:07       ` [PATCH v4 1/2] midx: teach "git multi-pack-index repack" honor "git repack" configurations Son Luong Ngoc via GitGitGadget
2020-05-10 16:07       ` [PATCH v4 2/2] multi-pack-index: respect repack.packKeptObjects=false Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=74A7FE73-6B5F-4DCF-9A57-AD11306CFAF8@gmail.com \
    --to=sluongng@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.