From: Jeff King <peff@peff.net>
To: Josh Steadmon <steadmon@google.com>
Cc: git@vger.kernel.org, jonathantanmy@google.com, jrnieder@gmail.com
Subject: Re: [PATCH v2] rev-list: exclude promisor objects at walk time
Date: Thu, 4 Apr 2019 20:00:01 -0400
Message-ID: <20190405000001.GA20793@sigill.intra.peff.net>
In-Reply-To: <20190404234726.GG60888@google.com>

On Thu, Apr 04, 2019 at 04:47:26PM -0700, Josh Steadmon wrote:

> > Did you (or anybody else) have any thoughts on the case where a given
> > object is referred to both by a promisor and a non-promisor (and we
> > don't have it)? That's the "shortcut" I think we're taking here: we
> > would no longer realize that it's available via the promisor when we
> > traverse to it from the non-promisor. I'm just not clear on whether that
> > can ever happen.
> 
> I am not sure either. In process_blob() and process_tree() there are
> additional checks for whether missing blobs/trees are promisor objects
> using is_promisor_object()...  but if we call that we undo the
> performance gains from this change.

Hmm. That might be a good outcome, though. If it never happens, we're
fast. If it does happen, then our worst case is that we fall back to the
current slower-but-more-thorough check. (And I think that happens with
your patch, without us having to do anything further).
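
To make the fast-path/fallback split concrete, here is a toy model of
it. Everything in it is an assumption made for illustration: the
toy_object struct, toy_visit() and toy_is_promisor_object() are
hypothetical names, not code from the patch, and the last one only
stands in for the thorough is_promisor_object() check discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an object as seen by the walk; not git's struct object. */
struct toy_object {
	bool present;          /* do we have the object locally? */
	bool in_promisor_pack; /* if present, is it stored in a promisor pack? */
	bool promised;         /* is it referenced from some promisor object? */
};

enum toy_action { TOY_SKIP, TOY_SHOW, TOY_MISSING };

/* Counts how often the expensive check runs. */
static int slow_checks;

/* Stand-in for the slower-but-more-thorough check. */
static bool toy_is_promisor_object(const struct toy_object *o)
{
	slow_checks++;
	return o->in_promisor_pack || o->promised;
}

enum toy_action toy_visit(const struct toy_object *o)
{
	assert(o != NULL);

	/* Fast path: a local object in a promisor pack is excluded right away. */
	if (o->present && o->in_promisor_pack)
		return TOY_SKIP;
	if (o->present)
		return TOY_SHOW;

	/* Missing object: only now do we pay for the thorough check. */
	return toy_is_promisor_object(o) ? TOY_SKIP : TOY_MISSING;
}
```

In this model the expensive check runs only for objects we do not have,
so a walk that never hits a missing object never pays for it.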

> > One other possible small optimization: we don't look up the object
> > unless the caller asked to exclude promisors, which is good. But we
> > could also keep a single flag for "is there a promisor pack at all?".
> > When there isn't, we know there's no point in looking for the object.
> [...]
> I'm not necessarily opposed, but I'm leaning towards the "won't matter
> much" side.
> 
> Where would such a flag live, in this case, and who would be responsible
> for initializing it? I guess it would only matter for rev-list, so we
> could initialize it in cmd_rev_list() if --exclude-promisor-objects is
> passed?

The check is really something like:

  int have_promisor_pack(void)
  {
	struct packed_git *p;

	/* "packed_git" here stands for the head of the list of known packs */
	for (p = packed_git; p; p = p->next) {
		if (p->pack_promisor)
			return 1;
	}
	return 0;
  }

That could be lazily cached as a single bit, but it would need to be
reset whenever we call reprepare_packed_git().
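
A minimal sketch of that lazy caching, assuming a simplified pack list.
The toy_pack struct, have_promisor_pack_cached() and
toy_reprepare_packs() are hypothetical names for illustration;
toy_reprepare_packs() only marks the spot where a real
reprepare_packed_git() would have to invalidate the cached value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct packed_git; only the fields used here. */
struct toy_pack {
	struct toy_pack *next;
	unsigned pack_promisor : 1;
};

static struct toy_pack *toy_packs; /* head of the pack list */
static int promisor_cache = -1;    /* -1 = unknown, 0 = none, 1 = at least one */
static int scan_count;             /* how many times the list was walked */

/* The uncached check: walk every pack looking for a promisor pack. */
static int scan_for_promisor_pack(void)
{
	struct toy_pack *p;

	scan_count++;
	for (p = toy_packs; p; p = p->next)
		if (p->pack_promisor)
			return 1;
	return 0;
}

/* Compute the answer on first use, then reuse it until invalidated. */
int have_promisor_pack_cached(void)
{
	if (promisor_cache < 0)
		promisor_cache = scan_for_promisor_pack();
	assert(promisor_cache == 0 || promisor_cache == 1);
	return promisor_cache;
}

/* The pack list changed, so the cached answer may be stale: forget it. */
void toy_reprepare_packs(struct toy_pack *new_head)
{
	toy_packs = new_head;
	promisor_cache = -1;
}
```

The tri-state int keeps "not computed yet" distinct from "no promisor
pack", which a bare single bit could not do without a second flag.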

Let's just punt on it for now. I'm not convinced it would actually yield
any benefit, unless we have a partial-clone repo that doesn't have any
promisor packs (but then, I suspect whatever un-partial'd it should
probably be resetting the partial flag in the config).

> > I didn't see any tweaks to the callers, which makes sense; we're already
> > passing --exclude-promisor-objects as necessary. Which means by itself,
> > this patch should be making things faster, right? Do you have timings to
> > show that off?
> 
> Yeah, for a partial clone of a large-ish Android repo [1], we see the
> connectivity check go from >180s to ~7s.

Those are nice numbers. :) Worth mentioning in the commit message, I
think. How does it compare to your earlier patch? I'd hope they're about
the same.

-Peff
