git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] unpack-trees: allow missing CE_SKIP_WORKTREE objs
@ 2018-10-08 21:48 Jonathan Tan
  2018-10-09  9:27 ` Junio C Hamano
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Jonathan Tan @ 2018-10-08 21:48 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Whenever a sparse checkout occurs, the existence of all blobs in the
index is verified, whether or not they are included or excluded by the
.git/info/sparse-checkout specification. This degrades performance,
significantly in the case of a partial clone, because a lazy fetch
occurs whenever the existence of a missing blob is checked.

At the point of invoking cache_tree_update() in unpack_trees(),
CE_SKIP_WORKTREE is already set on all excluded blobs
(mark_new_skip_worktree() is called twice to set CE_NEW_SKIP_WORKTREE,
then apply_sparse_checkout() is called which copies over
CE_NEW_SKIP_WORKTREE to CE_SKIP_WORKTREE). cache_tree_update() can use
this information to know which blobs are excluded, and thus skip the
checking of these.

Because cache_tree_update() is used from multiple places, this new
behavior is guarded by a new flag WRITE_TREE_SKIP_WORKTREE_MISSING_OK.
Implement this new flag, and teach unpack_trees() to invoke
cache_tree_update() with this new flag.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache-tree.c                     |  6 +++++-
 cache-tree.h                     |  4 ++++
 t/t1090-sparse-checkout-scope.sh | 33 ++++++++++++++++++++++++++++++++
 unpack-trees.c                   |  1 +
 4 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 5ce51468f0..340caf2d34 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -246,6 +246,8 @@ static int update_one(struct cache_tree *it,
 	int missing_ok = flags & WRITE_TREE_MISSING_OK;
 	int dryrun = flags & WRITE_TREE_DRY_RUN;
 	int repair = flags & WRITE_TREE_REPAIR;
+	int skip_worktree_missing_ok =
+		flags & WRITE_TREE_SKIP_WORKTREE_MISSING_OK;
 	int to_invalidate = 0;
 	int i;
 
@@ -356,7 +358,9 @@ static int update_one(struct cache_tree *it,
 		}
 
 		if (is_null_oid(oid) ||
-		    (mode != S_IFGITLINK && !missing_ok && !has_object_file(oid))) {
+		    (mode != S_IFGITLINK && !missing_ok &&
+		     !(skip_worktree_missing_ok && ce_skip_worktree(ce)) &&
+		     !has_object_file(oid))) {
 			strbuf_release(&buffer);
 			if (expected_missing)
 				return -1;
diff --git a/cache-tree.h b/cache-tree.h
index 0ab6784ffe..655d487619 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -40,6 +40,10 @@ void cache_tree_verify(struct index_state *);
 #define WRITE_TREE_DRY_RUN 4
 #define WRITE_TREE_SILENT 8
 #define WRITE_TREE_REPAIR 16
+/*
+ * Do not check for the presence of cache entry objects with CE_SKIP_WORKTREE.
+ */
+#define WRITE_TREE_SKIP_WORKTREE_MISSING_OK 32
 
 /* error return codes */
 #define WRITE_TREE_UNREADABLE_INDEX (-1)
diff --git a/t/t1090-sparse-checkout-scope.sh b/t/t1090-sparse-checkout-scope.sh
index 25d7c700f6..090b7fc3d3 100755
--- a/t/t1090-sparse-checkout-scope.sh
+++ b/t/t1090-sparse-checkout-scope.sh
@@ -63,4 +63,37 @@ test_expect_success 'return to full checkout of master' '
 	test "$(cat b)" = "modified"
 '
 
+test_expect_success 'in partial clone, sparse checkout only fetches needed blobs' '
+	test_create_repo server &&
+	git clone "file://$(pwd)/server" client &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	echo a >server/a &&
+	echo bb >server/b &&
+	mkdir server/c &&
+	echo ccc >server/c/c &&
+	git -C server add a b c/c &&
+	git -C server commit -m message &&
+
+	test_config -C client core.sparsecheckout 1 &&
+	test_config -C client extensions.partialclone origin &&
+	echo "!/*" >client/.git/info/sparse-checkout &&
+	echo "/a" >>client/.git/info/sparse-checkout &&
+	git -C client fetch --filter=blob:none origin &&
+	git -C client checkout FETCH_HEAD &&
+
+	git -C client rev-list HEAD \
+		--quiet --objects --missing=print >unsorted_actual &&
+	(
+		printf "?" &&
+		git hash-object server/b &&
+		printf "?" &&
+		git hash-object server/c/c
+	) >unsorted_expect &&
+	sort unsorted_actual >actual &&
+	sort unsorted_expect >expect &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 51bfac6aa0..39e0e7a6c7 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1635,6 +1635,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 				o->result.cache_tree = cache_tree();
 			if (!cache_tree_fully_valid(o->result.cache_tree))
 				cache_tree_update(&o->result,
+						  WRITE_TREE_SKIP_WORKTREE_MISSING_OK |
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
-- 
2.19.0.271.gfe8321ec05.dirty


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-10-10  1:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-08 21:48 [PATCH] unpack-trees: allow missing CE_SKIP_WORKTREE objs Jonathan Tan
2018-10-09  9:27 ` Junio C Hamano
2018-10-09  9:30 ` Junio C Hamano
2018-10-09 14:49   ` Ben Peart
2018-10-09 14:48 ` Ben Peart
2018-10-09 18:40 ` [PATCH v2] cache-tree: skip some blob checks in partial clone Jonathan Tan
2018-10-10  1:19   ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).