git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Subject: [PATCH] fetch-pack: speed up loading of refs via commit graph
Date: Wed, 4 Aug 2021 15:56:11 +0200	[thread overview]
Message-ID: <08519b8ab6f395cffbcd5e530bfba6aaf64241a2.1628085347.git.ps@pks.im> (raw)

[-- Attachment #1: Type: text/plain, Size: 2195 bytes --]

When doing reference negotiation, git-fetch-pack(1) is loading all refs
from disk in order to determine which commits it has in common with the
remote repository. This can be quite expensive in repositories with many
references though: in a real-world repository with around 2.2 million
refs, fetching a single commit by its ID takes around 44 seconds.

Dominating the loading time is decompression and parsing of the objects
which are referenced by commits. Given the fact that we only care about
commits (or tags which can be peeled to one) in this context, there is
thus an easy performance win by switching the parsing logic to make use
of the commit graph in case we have one available. Like this, we avoid
hitting the object database to parse these commits but instead only load
them from the commit-graph. This results in a significant performance
boost when executing git-fetch in said repository with 2.2 million refs:

    Benchmark #1: HEAD~: git fetch $remote $commit
      Time (mean ± σ):     44.168 s ±  0.341 s    [User: 42.985 s, System: 1.106 s]
      Range (min … max):   43.565 s … 44.577 s    10 runs

    Benchmark #2: HEAD: git fetch $remote $commit
      Time (mean ± σ):     19.498 s ±  0.724 s    [User: 18.751 s, System: 0.690 s]
      Range (min … max):   18.629 s … 20.454 s    10 runs

    Summary
      'HEAD: git fetch $remote $commit' ran
        2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 fetch-pack.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index b0c7be717c..0bf7ed7e47 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -137,8 +137,14 @@ static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
 			break;
 		}
 	}
-	if (type == OBJ_COMMIT)
-		return (struct commit *) parse_object(the_repository, oid);
+
+	if (type == OBJ_COMMIT) {
+		struct commit *commit = lookup_commit(the_repository, oid);
+		if (!commit || repo_parse_commit(the_repository, commit))
+			return NULL;
+		return commit;
+	}
+
 	return NULL;
 }
 
-- 
2.32.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

             reply	other threads:[~2021-08-04 13:56 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-04 13:56 Patrick Steinhardt [this message]
2021-08-04 14:55 ` [PATCH] fetch-pack: speed up loading of refs via commit graph Derrick Stolee
2021-08-04 17:45 ` Junio C Hamano
2021-08-04 20:59 ` Jeff King
2021-08-04 21:32   ` Junio C Hamano
2021-08-05  6:04   ` Patrick Steinhardt
2021-08-05 11:53     ` Patrick Steinhardt
2021-08-05 16:26       ` Junio C Hamano
2021-08-05 20:42       ` Jeff King
2021-08-05 20:40     ` Jeff King
2021-08-05 19:05   ` Ævar Arnfjörð Bjarmason
2021-08-05 20:29     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=08519b8ab6f395cffbcd5e530bfba6aaf64241a2.1628085347.git.ps@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).