All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Victoria Dye <vdye@github.com>
Subject: Re: [PATCH] repack: respect --keep-pack with geometric repack
Date: Fri, 20 May 2022 19:26:18 -0400	[thread overview]
Message-ID: <Yogjmn9BcOBc/5ed@nand.local> (raw)
In-Reply-To: <xmqqr14oey2a.fsf@gitster.g>

On Fri, May 20, 2022 at 01:41:49PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >>  	*geometry_p = xcalloc(1, sizeof(struct pack_geometry));
> >>  	geometry = *geometry_p;
> >>
> >> +	string_list_sort(existing_kept_packs);
> >
> > Would it be worth sorting this as early as in collect_pack_filenames()?
> > For our purposes in this patch, this works as-is, but it may be
> > defensive to try and minimize the time that list has unsorted contents.
>
> It does not matter too much but after reading the latest one before
> coming back to this original thread, I actually thought that sorting
> it too early was questionable.  Nobody other than this code cares
> about the list being sorted, and if it turns out that the look-up in
> the loop need to be optimized, it is plausible that we do not need
> this list sorted when we populate a hashmap out of its contents, for
> example.

My thinking was simply that sorting it as often as possible (by doing so
immediately after appending all of our newly written packs into it) was
the most defensive approach that we could take. At least in the sense of
another caller which _does_ depend on `names` sorted, and makes the same
mistake of forgetting to call `string_list_sort()` itself.

I suspect that this is mostly academic anyway, and that either would be
fine, so if others feel strongly I'm happy to send an updated version
(though TBH I would be just as happy with what we already have in the
series I posted here).

> >>  	for (p = get_all_packs(the_repository); p; p = p->next) {
> >> -		if (!pack_kept_objects && p->pack_keep)
> >> -			continue;
> >> +		if (!pack_kept_objects) {
> >> +			if (p->pack_keep)
> >> +				continue;
> >
> > (You mentioned this to me off-list, but I'll repeat it here since it
> > wasn't obvious to me on first read): this check for `p->pack_keep` isn't
> > strictly necessary, since any packs that have their `pack_keep` bit set
> > will appear in the `existing_kept_packs` list.
> >
> > But it does give us a fast path to avoid having to check that list, so
> > it's worth checking that bit to avoid a slightly more expensive check
> > where possible.
>
> That invites a related but different question.  Doesn't p->pack_keep
> and string_list_has_string(existing_kept_packs, name_of_pack(p))
> mirror each other?  Can a pack appear on the existing_kept_packs
> list without having .pack_keep bit set and under what condition?

Unfortunately not. `repack` is much more keen to manipulate the results
of calling readdir() over $GIT_DIR/objects/pack than it is to, say, call
`get_all_packs()` and mark `p->pack_keep_in_core`.

In a future where `repack` does not interact with the packdir at such a
low level, I don't think we would have a use for a function like
`collect_pack_filenames()` at all. But that is a much larger change than
this one ;).

> It is answered by the comment below, I presume?
>
> >> +			/*
> >> +			 * The pack may be kept via the --keep-pack option;
> >> +			 * check 'existing_kept_packs' to determine whether to
> >> +			 * ignore it.
> >> +			 */
>
> OK.  So there are two classes of packs we want to exclude from the
> geometry repacking.  Those that already have .pack_keep bit set, and
> those that are _are_ newly making into kept packs that do not yet
> have .pack_keep bit set.  Makes sense.

Yep (and also "yep" to your note below about adding cruft packs into the
mix).

Thanks,
Taylor

      parent reply	other threads:[~2022-05-20 23:26 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-20 16:36 [PATCH] repack: respect --keep-pack with geometric repack Victoria Dye via GitGitGadget
2022-05-20 17:00 ` Taylor Blau
2022-05-20 17:27   ` Victoria Dye
2022-05-20 19:05     ` Taylor Blau
2022-05-20 20:41   ` Junio C Hamano
2022-05-20 22:12     ` Junio C Hamano
2022-05-20 22:20       ` Taylor Blau
2022-05-20 23:26     ` Taylor Blau [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yogjmn9BcOBc/5ed@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.