From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>
Subject: [PATCH 2/2] cache-tree: prefetch in partial clone read-tree
Date: Fri, 23 Jul 2021 11:52:23 -0700
Message-ID: <f4881b7455b9d33c8a53a91eda7fbdfc5d11382c.1627066238.git.jonathantanmy@google.com>
In-Reply-To: <cover.1627066238.git.jonathantanmy@google.com>

"git read-tree" checks the existence of the blobs referenced by the
given tree, but does not bulk prefetch them. Add a bulk prefetch.

The lack of prefetch here was noticed at $DAYJOB during a merge
involving some specific commits, but I couldn't construct a minimal
merge that didn't also trigger the prefetch in check_updates() in
unpack-trees.c. In every such case, the lack of prefetch in
cache-tree.c didn't matter, because all the relevant blobs had already
been prefetched by then. This is why the test uses read-tree to
exercise this code path.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
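
For illustration only (not part of the patch): a minimal, self-contained
sketch of the batching pattern described above, namely collecting every
entry that must exist locally but is missing, then requesting them all
in one fetch. Everything here (struct entry, fetch_batch(), must_check(),
prefetch_entries(), MAX_WANT) is a hypothetical stand-in, not Git's real
types or API; the real prefetch_cache_entries() is not shown in this
message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WANT 16	/* arbitrary cap for this sketch */

/* Hypothetical stand-in for struct cache_entry; not Git's real type. */
struct entry {
	const char *oid;
	int skip_worktree;	/* entry is excluded from the working tree */
	int present;		/* object already exists locally */
};

static int fetch_count;		/* number of simulated fetch round trips */

/* Hypothetical stand-in for one fetch of many objects from a remote. */
static void fetch_batch(struct entry **want, size_t nr)
{
	size_t i;

	fetch_count++;
	for (i = 0; i < nr; i++)
		want[i]->present = 1;
}

/* Simplified analogue of must_check_existence(): skip-worktree may be missing. */
static int must_check(const struct entry *e)
{
	return !e->skip_worktree;
}

/* Gather all missing entries selected by pred, then fetch them in one batch. */
static size_t prefetch_entries(struct entry *entries, size_t nr,
			       int (*pred)(const struct entry *))
{
	struct entry *want[MAX_WANT];
	size_t i, nr_want = 0;

	assert(nr <= MAX_WANT);
	for (i = 0; i < nr; i++)
		if (pred(&entries[i]) && !entries[i].present)
			want[nr_want++] = &entries[i];
	if (nr_want)
		fetch_batch(want, nr_want);
	return nr_want;
}
```

With three missing entries of which one is skip-worktree, a single call
to prefetch_entries() issues one simulated fetch for the other two and
leaves the skip-worktree entry alone, which is the "one batch" property
the new test checks for via the packet trace.
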
 cache-tree.c                       | 11 ++++++++--
 t/t1022-read-tree-partial-clone.sh | 33 ++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100755 t/t1022-read-tree-partial-clone.sh

diff --git a/cache-tree.c b/cache-tree.c
index 45e58666af..9ba2c7c6b2 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -237,6 +237,11 @@ int cache_tree_fully_valid(struct cache_tree *it)
 	return 1;
 }
 
+static int must_check_existence(const struct cache_entry *ce)
+{
+	return !(has_promisor_remote() && ce_skip_worktree(ce));
+}
+
 static int update_one(struct cache_tree *it,
 		      struct cache_entry **cache,
 		      int entries,
@@ -378,8 +383,7 @@ static int update_one(struct cache_tree *it,
 		}
 
 		ce_missing_ok = mode == S_IFGITLINK || missing_ok ||
-			(has_promisor_remote() &&
-			 ce_skip_worktree(ce));
+			!must_check_existence(ce);
 		if (is_null_oid(oid) ||
 		    (!ce_missing_ok && !has_object_file(oid))) {
 			strbuf_release(&buffer);
@@ -466,6 +470,9 @@ int cache_tree_update(struct index_state *istate, int flags)
 	if (!istate->cache_tree)
 		istate->cache_tree = cache_tree();
 
+	if (!(flags & WRITE_TREE_MISSING_OK) && has_promisor_remote())
+		prefetch_cache_entries(istate, must_check_existence);
+
 	trace_performance_enter();
 	trace2_region_enter("cache_tree", "update", the_repository);
 	i = update_one(istate->cache_tree, istate->cache, istate->cache_nr,
diff --git a/t/t1022-read-tree-partial-clone.sh b/t/t1022-read-tree-partial-clone.sh
new file mode 100755
index 0000000000..a763e27c7d
--- /dev/null
+++ b/t/t1022-read-tree-partial-clone.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description='git read-tree in partial clones'
+
+TEST_NO_CREATE_REPO=1
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+test_expect_success 'read-tree in partial clone prefetches in one batch' '
+	test_when_finished "rm -rf server client trace" &&
+
+	git init server &&
+	echo foo >server/one &&
+	echo bar >server/two &&
+	git -C server add one two &&
+	git -C server commit -m "initial commit" &&
+	TREE=$(git -C server rev-parse HEAD^{tree}) &&
+
+	git -C server config uploadpack.allowfilter 1 &&
+	git -C server config uploadpack.allowanysha1inwant 1 &&
+	git clone --bare --filter=blob:none "file://$(pwd)/server" client &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client read-tree $TREE &&
+
+	# "done" marks the end of negotiation (once per fetch). Expect that
+	# only one fetch occurs.
+	grep "fetch> done" trace >donelines &&
+	test_line_count = 1 donelines
+'
+
+test_done
-- 
2.32.0.432.gabb21c7263-goog


