git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, dstolee@microsoft.com, gitster@pobox.com
Subject: Re: [PATCH v2 2/8] revision: learn '--no-kept-objects'
Date: Wed, 17 Feb 2021 13:35:48 -0500	[thread overview]
Message-ID: <YC1iBEFyQnv2URSV@nand.local> (raw)
In-Reply-To: <YCxSlNK4Ug2exDbv@coredump.intra.peff.net>

On Tue, Feb 16, 2021 at 06:17:40PM -0500, Jeff King wrote:
> On Wed, Feb 03, 2021 at 10:58:57PM -0500, Taylor Blau wrote:
>
> > @@ -3797,6 +3807,11 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi
> >  		return commit_ignore;
> >  	if (revs->unpacked && has_object_pack(&commit->object.oid))
> >  		return commit_ignore;
> > +	if (revs->no_kept_objects) {
> > +		if (has_object_kept_pack(&commit->object.oid,
> > +					 revs->keep_pack_cache_flags))
> > +			return commit_ignore;
> > +	}
>
> OK, so this has the same "problems" as --unpacked, which is that we can
> miss some objects (i.e., things that are reachable but not-kept may not
> be reported). But it should be OK in this version of the series, because
> we will not be relying on it for selection of objects, but only to fill
> in ordering / namehash fields.
>
> Should we warn people about that, either as a comment or in the commit
> message?

Yeah, let's warn about it in the commit message. We could put it in the
documentation, but...

> > +--no-kept-objects[=<kind>]::
> > +	Halts the traversal as soon as an object in a kept pack is
> > +	found. If `<kind>` is `on-disk`, only packs with a corresponding
> > +	`*.keep` file are ignored. If `<kind>` is `in-core`, only packs
> > +	with their in-core kept state set are ignored. Otherwise, both
> > +	kinds of kept packs are ignored.
>
> Likewise, I wonder whether we need to expose this mode to users.
> Normally I'm a fan of doing so, because it allows scripted callers
> access to more of the internals, but:
>
>   - the semantics are kind of weird about where we draw the line between
>     performance and absolute correctness
>
>   - the "in-core" thing is a bit weird for callers of rev-list; how do I
>     as a caller mark a pack as kept-in-core? I think it's only an
>     internal pack-objects thing.
>
> Once we support this in rev-list, we'll have to do it forever (or deal
> with deprecation, etc). If we just need it internally, maybe it's wise
> to leave it as a something you ask for by manipulating rev_info
> directly. Or perhaps leave it as an undocumented interface we use for
> testing, and not something we promise to keep working.

I think that you raise a good point about not advertising this option,
since doing so paints us into a corner that we have to keep it working
and behaving consistently forever.

I'm not opposed to the idea that we may eventually want to do so, but I
think that this is too early for that. As you note, we *could* just
expose it in rev_info flags, but that makes it much more difficult to
test some of the tricky cases that are added in t6114, so I think a
middle ground of having an undocumented option satisfies both of our
wants.

> > --- a/list-objects.c
> > +++ b/list-objects.c
> > @@ -338,6 +338,13 @@ static void traverse_trees_and_blobs(struct traversal_context *ctx,
> >  			ctx->show_object(obj, name, ctx->show_data);
> >  			continue;
> >  		}
> > +		if (ctx->revs->no_kept_objects) {
> > +			struct pack_entry e;
> > +			if (find_kept_pack_entry(ctx->revs->repo, &obj->oid,
> > +						 ctx->revs->keep_pack_cache_flags,
> > +						 &e))
> > +				continue;
> > +		}
>
> This hunk is interesting.
>
> There is no similar check for revs->unpacked in list-objects.c to cut
> off the traversal. And indeed, running "rev-list --unpacked" will
> generally look at the _whole_ tree for a commit that is unpacked, even
> if all of the tree entries are packed. That's something we might
> consider changing in the name of performance (though it does increase
> the number of cases where --unpacked will fail to find an unpacked but
> reachable object).
>
> But this is a funny place to put it. If I understand it correctly, it is
> cutting off the traversal at the very top of the tree. I.e., if we had a
> commit that is not-kept, we'd queue it's root tree. And then we might
> find that the root tree is kept, and avoid traversing it. But if we _do_
> traverse it, we would look at every subtree it contains, even if they
> are kept! That's because we recurse the tree via the recursive
> process_tree(), not by queueing more objects in the pending array here.
>
> So this check seems to exist in a funny middle ground. I think it's
> unlikely to catch anything useful (usually commits have a unique root
> tree; it's all of the untouched parts of the subtrees that will be in
> the kept packs). IMHO we should either drop it (and act like
> "--unpacked", accepting that we may traverse some extra tree objects),
> or we should go all-in on performance and cut it off in the top of
> process_tree().

Agreed. Let's drop it.

> -Peff

Thanks,
Taylor

  reply	other threads:[~2021-02-17 18:36 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-19 23:23 [PATCH 00/10] repack: support repacking into a geometric sequence Taylor Blau
2021-01-19 23:24 ` [PATCH 01/10] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-01-20 13:40   ` Derrick Stolee
2021-01-20 14:38     ` Taylor Blau
2021-01-29  2:33   ` Junio C Hamano
2021-01-29 18:38     ` Taylor Blau
2021-01-29 19:31     ` Jeff King
2021-01-29 20:20       ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 02/10] revision: learn '--no-kept-objects' Taylor Blau
2021-01-29  3:10   ` Junio C Hamano
2021-01-29 19:13     ` Taylor Blau
2021-01-19 23:24 ` [PATCH 03/10] builtin/pack-objects.c: learn '--assume-kept-packs-closed' Taylor Blau
2021-01-29  3:21   ` Junio C Hamano
2021-01-29 19:19     ` Jeff King
2021-01-29 20:01       ` Taylor Blau
2021-01-29 20:25         ` Jeff King
2021-01-29 22:10           ` Taylor Blau
2021-01-29 22:57             ` Jeff King
2021-01-29 23:03             ` Junio C Hamano
2021-01-29 23:28               ` Taylor Blau
2021-02-02  3:04                 ` Taylor Blau
2021-01-29 23:31               ` Jeff King
2021-01-29 22:13           ` Junio C Hamano
2021-01-29 20:30       ` Junio C Hamano
2021-01-29 22:43         ` Jeff King
2021-01-29 22:53           ` Taylor Blau
2021-01-29 23:00             ` Jeff King
2021-01-29 23:10             ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 04/10] p5303: add missing &&-chains Taylor Blau
2021-01-19 23:24 ` [PATCH 05/10] p5303: measure time to repack with keep Taylor Blau
2021-01-29  3:40   ` Junio C Hamano
2021-01-29 19:32     ` Jeff King
2021-01-29 20:04       ` [PATCH] p5303: avoid sed GNU-ism Jeff King
2021-01-29 20:19         ` Eric Sunshine
2021-01-29 20:27           ` Jeff King
2021-01-29 20:36             ` Eric Sunshine
2021-01-29 22:11               ` Taylor Blau
2021-01-29 20:38       ` [PATCH 05/10] p5303: measure time to repack with keep Junio C Hamano
2021-01-29 22:10         ` Jeff King
2021-01-29 23:12           ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 06/10] pack-objects: rewrite honor-pack-keep logic Taylor Blau
2021-01-19 23:24 ` [PATCH 07/10] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-01-19 23:24 ` [PATCH 08/10] builtin/pack-objects.c: teach '--keep-pack-stdin' Taylor Blau
2021-01-19 23:24 ` [PATCH 09/10] builtin/repack.c: extract loose object handling Taylor Blau
2021-01-20 13:59   ` Derrick Stolee
2021-01-20 14:34     ` Taylor Blau
2021-01-20 15:51       ` Derrick Stolee
2021-01-21  3:45     ` Junio C Hamano
2021-01-19 23:24 ` [PATCH 10/10] builtin/repack.c: add '--geometric' option Taylor Blau
2021-01-20 14:05 ` [PATCH 00/10] repack: support repacking into a geometric sequence Derrick Stolee
2021-02-04  3:58 ` [PATCH v2 0/8] " Taylor Blau
2021-02-04  3:58   ` [PATCH v2 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-16 21:42     ` Jeff King
2021-02-16 21:48       ` Taylor Blau
2021-02-04  3:58   ` [PATCH v2 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-16 23:17     ` Jeff King
2021-02-17 18:35       ` Taylor Blau [this message]
2021-02-04  3:59   ` [PATCH v2 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-16 23:46     ` Jeff King
2021-02-17 18:59       ` Taylor Blau
2021-02-17 19:21         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-04  3:59   ` [PATCH v2 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-16 23:58     ` Jeff King
2021-02-17  0:02       ` Jeff King
2021-02-17 19:13       ` Taylor Blau
2021-02-17 19:25         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-17 16:05     ` Jeff King
2021-02-17 19:23       ` Taylor Blau
2021-02-17 19:29         ` Jeff King
2021-02-04  3:59   ` [PATCH v2 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-17 17:11     ` Jeff King
2021-02-17 19:54       ` Taylor Blau
2021-02-17 20:25         ` Jeff King
2021-02-17 20:29           ` Taylor Blau
2021-02-17 21:43             ` Jeff King
2021-02-04  3:59   ` [PATCH v2 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-17 18:17     ` Jeff King
2021-02-17 20:01       ` Taylor Blau
2021-02-17  0:01   ` [PATCH v2 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-17 18:18     ` Jeff King
2021-02-18  3:14 ` [PATCH v3 " Taylor Blau
2021-02-18  3:14   ` [PATCH v3 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-18  3:14   ` [PATCH v3 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-18  3:14   ` [PATCH v3 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-18  3:14   ` [PATCH v3 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-18  3:14   ` [PATCH v3 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-18  3:14   ` [PATCH v3 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-18  3:14   ` [PATCH v3 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-18  3:14   ` [PATCH v3 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-23  0:31   ` [PATCH v3 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-23  1:06     ` Taylor Blau
2021-02-23  1:42       ` Jeff King
2021-02-23  2:24 ` [PATCH v4 " Taylor Blau
2021-02-23  2:25   ` [PATCH v4 1/8] packfile: introduce 'find_kept_pack_entry()' Taylor Blau
2021-02-23  2:25   ` [PATCH v4 2/8] revision: learn '--no-kept-objects' Taylor Blau
2021-02-23  2:25   ` [PATCH v4 3/8] builtin/pack-objects.c: add '--stdin-packs' option Taylor Blau
2021-02-23  8:07     ` Junio C Hamano
2021-02-23 18:51       ` Jeff King
2021-02-23  2:25   ` [PATCH v4 4/8] p5303: add missing &&-chains Taylor Blau
2021-02-23  2:25   ` [PATCH v4 5/8] p5303: measure time to repack with keep Taylor Blau
2021-02-23  2:25   ` [PATCH v4 6/8] builtin/pack-objects.c: rewrite honor-pack-keep logic Taylor Blau
2021-02-23  2:25   ` [PATCH v4 7/8] packfile: add kept-pack cache for find_kept_pack_entry() Taylor Blau
2021-02-23  2:25   ` [PATCH v4 8/8] builtin/repack.c: add '--geometric' option Taylor Blau
2021-02-24 23:19     ` Junio C Hamano
2021-02-24 23:43       ` Junio C Hamano
2021-03-04 21:40         ` Taylor Blau
2021-03-04 21:55       ` Taylor Blau
2021-02-23  3:39   ` [PATCH v4 0/8] repack: support repacking into a geometric sequence Jeff King
2021-02-23  7:43   ` Junio C Hamano
2021-02-23 18:44     ` Jeff King
2021-02-23 19:54       ` Martin Fick
2021-02-23 20:06         ` Taylor Blau
2021-02-23 21:57           ` Martin Fick
2021-02-23 20:15         ` Jeff King
2021-02-23 21:41           ` Martin Fick
2021-02-23 21:53             ` Jeff King
2021-02-24 18:13               ` Martin Fick
2021-02-26  6:23                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YC1iBEFyQnv2URSV@nand.local \
    --to=me@ttaylorr.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).