All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, dstolee@microsoft.com, gitster@pobox.com
Subject: Re: [PATCH v2 2/8] revision: learn '--no-kept-objects'
Date: Wed, 17 Feb 2021 13:35:48 -0500	[thread overview]
Message-ID: <YC1iBEFyQnv2URSV@nand.local> (raw)
In-Reply-To: <YCxSlNK4Ug2exDbv@coredump.intra.peff.net>

On Tue, Feb 16, 2021 at 06:17:40PM -0500, Jeff King wrote:
> On Wed, Feb 03, 2021 at 10:58:57PM -0500, Taylor Blau wrote:
>
> > @@ -3797,6 +3807,11 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
> >  		return commit_ignore;
> >  	if (revs->unpacked && has_object_pack(&commit->object.oid))
> >  		return commit_ignore;
> > +	if (revs->no_kept_objects) {
> > +		if (has_object_kept_pack(&commit->object.oid,
> > +					 revs->keep_pack_cache_flags))
> > +			return commit_ignore;
> > +	}
>
> OK, so this has the same "problems" as --unpacked, which is that we can
> miss some objects (i.e., things that are reachable but not-kept may not
> be reported). But it should be OK in this version of the series, because
> we will not be relying on it for selection of objects, but only to fill
> in ordering / namehash fields.
>
> Should we warn people about that, either as a comment or in the commit
> message?

Yeah, let's warn about it in the commit message. We could put it in the
documentation, but...

> > +--no-kept-objects[=<kind>]::
> > +	Halts the traversal as soon as an object in a kept pack is
> > +	found. If `<kind>` is `on-disk`, only packs with a corresponding
> > +	`*.keep` file are ignored. If `<kind>` is `in-core`, only packs
> > +	with their in-core kept state set are ignored. Otherwise, both
> > +	kinds of kept packs are ignored.
>
> Likewise, I wonder whether we need to expose this mode to users.
> Normally I'm a fan of doing so, because it allows scripted callers
> access to more of the internals, but:
>
>   - the semantics are kind of weird about where we draw the line between
>     performance and absolute correctness
>
>   - the "in-core" thing is a bit weird for callers of rev-list; how do I
>     as a caller mark a pack as kept-in-core? I think it's only an
>     internal pack-objects thing.
>
> Once we support this in rev-list, we'll have to do it forever (or deal
> with deprecation, etc). If we just need it internally, maybe it's wise
> to leave it as a something you ask for by manipulating rev_info
> directly. Or perhaps leave it as an undocumented interface we use for
> testing, and not something we promise to keep working.

I think that you raise a good point about not advertising this option,
since doing so paints us into a corner that we have to keep it working
and behaving consistently forever.

I'm not opposed to the idea that we may eventually want to do so, but I
think that this is too early for that. As you note, we *could* just
expose it in rev_info flags, but that makes it much more difficult to
test some of the tricky cases that are added in t6114, so I think a
middle ground of having an undocumented option satisfies both of our
wants.

> > --- a/list-objects.c
> > +++ b/list-objects.c
> > @@ -338,6 +338,13 @@ static void traverse_trees_and_blobs(struct traversal_context *ctx,
> >  			ctx->show_object(obj, name, ctx->show_data);
> >  			continue;
> >  		}
> > +		if (ctx->revs->no_kept_objects) {
> > +			struct pack_entry e;
> > +			if (find_kept_pack_entry(ctx->revs->repo, &obj->oid,
> > +						 ctx->revs->keep_pack_cache_flags,
> > +						 &e))
> > +				continue;
> > +		}
>
> This hunk is interesting.
>
> There is no similar check for revs->unpacked in list-objects.c to cut
> off the traversal. And indeed, running "rev-list --unpacked" will
> generally look at the _whole_ tree for a commit that is unpacked, even
> if all of the tree entries are packed. That's something we might
> consider changing in the name of performance (though it does increase
> the number of cases where --unpacked will fail to find an unpacked but
> reachable object).
>
> But this is a funny place to put it. If I understand it correctly, it is
> cutting off the traversal at the very top of the tree. I.e., if we had a
> commit that is not-kept, we'd queue it's root tree. And then we might
> find that the root tree is kept, and avoid traversing it. But if we _do_
> traverse it, we would look at every subtree it contains, even if they
> are kept! That's because we recurse the tree via the recursive
> process_tree(), not by queueing more objects in the pending array here.
>
> So this check seems to exist in a funny middle ground. I think it's
> unlikely to catch anything useful (usually commits have a unique root
> tree; it's all of the untouched parts of the subtrees that will be in
> the kept packs). IMHO we should either drop it (and act like
> "--unpacked", accepting that we may traverse some extra tree objects),
> or we should go all-in on performance and cut it off in the top of
> process_tree().

Agreed. Let's drop it.

> -Peff

Thanks,
Taylor

  reply	other threads:[~2021-02-17 18:36 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-19 23:23 [PATCH 00/10] repack: support repacking into a geometric sequence Taylor Blau
2021-01-19 23:24 ` [PATCH 01/10] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-01-20 13:40   ` Derrick Stolee
2021-01-20 14:38     ` Taylor Blau
2021-01-29  2:33   ` Junio C Hamano
2021-01-29 18:38     ` Taylor Blau
2021-01-29 19:31     ` Jeff King
2021-01-29 20:20       ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 02/10] revision: learn '--no-kept-objects' Taylor Blau
2021-01-29  3:10   ` Junio C Hamano
2021-01-29 19:13     ` Taylor Blau
2021-01-19 23:24 ` [PATCH 03/10] builtin/pack-objects.c: learn '--assume-kept-packs-closed' Taylor Blau
2021-01-29  3:21   ` Junio C Hamano
2021-01-29 19:19     ` Jeff King
2021-01-29 20:01       ` Taylor Blau
2021-01-29 20:25         ` Jeff King
2021-01-29 22:10           ` Taylor Blau
2021-01-29 22:57             ` Jeff King
2021-01-29 23:03             ` Junio C Hamano
2021-01-29 23:28               ` Taylor Blau
2021-02-02  3:04                 ` Taylor Blau
2021-01-29 23:31               ` Jeff King
2021-01-29 22:13           ` Junio C Hamano
2021-01-29 20:30       ` Junio C Hamano
2021-01-29 22:43         ` Jeff King
2021-01-29 22:53           ` Taylor Blau
2021-01-29 23:00             ` Jeff King
2021-01-29 23:10             ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 04/10] p5303: add missing &&-chains Taylor Blau
2021-01-19 23:24 ` [PATCH 05/10] p5303: measure time to repack with keep Taylor Blau
2021-01-29  3:40   ` Junio C Hamano
2021-01-29 19:32     ` Jeff King
2021-01-29 20:04       ` [PATCH] p5303: avoid sed GNU-ism Jeff King
2021-01-29 20:19         ` Eric Sunshine
2021-01-29 20:27           ` Jeff King
2021-01-29 20:36             ` Eric Sunshine
2021-01-29 22:11               ` Taylor Blau
2021-01-29 20:38       ` [PATCH 05/10] p5303: measure time to repack with keep Junio C Hamano
2021-01-29 22:10         ` Jeff King
2021-01-29 23:12           ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 06/10] pack-objects: rewrite honor-pack-keep logic Taylor Blau
2021-01-19 23:24 ` [PATCH 07/10] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-01-19 23:24 ` [PATCH 08/10] builtin/pack-objects.c: teach '--keep-pack-stdin' Taylor Blau
2021-01-19 23:24 ` [PATCH 09/10] builtin/repack.c: extract loose object handling Taylor Blau
2021-01-20 13:59   ` Derrick Stolee
2021-01-20 14:34     ` Taylor Blau
2021-01-20 15:51       ` Derrick Stolee
2021-01-21  3:45     ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 10/10] builtin/repack.c: add '--geometric' option Taylor Blau
2021-01-20 14:05 ` [PATCH 00/10] repack: support repacking into a geometric sequence Derrick Stolee
2021-02-04  3:58 ` [PATCH v2 0/8] " Taylor Blau
2021-02-04  3:58   ` [PATCH v2 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-16 21:42     ` Jeff King
2021-02-16 21:48       ` Taylor Blau
2021-02-04  3:58   ` [PATCH v2 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-16 23:17     ` Jeff King
2021-02-17 18:35       ` Taylor Blau [this message]
2021-02-04  3:59   ` [PATCH v2 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-16 23:46     ` Jeff King
2021-02-17 18:59       ` Taylor Blau
2021-02-17 19:21         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-04  3:59   ` [PATCH v2 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-16 23:58     ` Jeff King
2021-02-17  0:02       ` Jeff King
2021-02-17 19:13       ` Taylor Blau
2021-02-17 19:25         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-17 16:05     ` Jeff King
2021-02-17 19:23       ` Taylor Blau
2021-02-17 19:29         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-17 17:11     ` Jeff King
2021-02-17 19:54       ` Taylor Blau
2021-02-17 20:25         ` Jeff King
2021-02-17 20:29           ` Taylor Blau
2021-02-17 21:43             ` Jeff King
2021-02-04  3:59   ` [PATCH v2 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-17 18:17     ` Jeff King
2021-02-17 20:01       ` Taylor Blau
2021-02-17  0:01   ` [PATCH v2 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-17 18:18     ` Jeff King
2021-02-18  3:14 ` [PATCH v3 " Taylor Blau
2021-02-18  3:14   ` [PATCH v3 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-18  3:14   ` [PATCH v3 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-18  3:14   ` [PATCH v3 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-18  3:14   ` [PATCH v3 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-18  3:14   ` [PATCH v3 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-18  3:14   ` [PATCH v3 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-18  3:14   ` [PATCH v3 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-18  3:14   ` [PATCH v3 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-23  0:31   ` [PATCH v3 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-23  1:06     ` Taylor Blau
2021-02-23  1:42       ` Jeff King
2021-02-23  2:24 ` [PATCH v4 " Taylor Blau
2021-02-23  2:25   ` [PATCH v4 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-23  2:25   ` [PATCH v4 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-23  2:25   ` [PATCH v4 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-23  8:07     ` Junio C Hamano
2021-02-23 18:51       ` Jeff King
2021-02-23  2:25   ` [PATCH v4 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-23  2:25   ` [PATCH v4 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-23  2:25   ` [PATCH v4 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-23  2:25   ` [PATCH v4 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-23  2:25   ` [PATCH v4 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-24 23:19     ` Junio C Hamano
2021-02-24 23:43       ` Junio C Hamano
2021-03-04 21:40         ` Taylor Blau
2021-03-04 21:55       ` Taylor Blau
2021-02-23  3:39   ` [PATCH v4 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-23  7:43   ` Junio C Hamano
2021-02-23 18:44     ` Jeff King
2021-02-23 19:54       ` Martin Fick
2021-02-23 20:06         ` Taylor Blau
2021-02-23 21:57           ` Martin Fick
2021-02-23 20:15         ` Jeff King
2021-02-23 21:41           ` Martin Fick
2021-02-23 21:53             ` Jeff King
2021-02-24 18:13               ` Martin Fick
2021-02-26  6:23                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YC1iBEFyQnv2URSV@nand.local \
    --to=me@ttaylorr.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.