All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
	jrnieder@gmail.com, gitster@pobox.com, sunshine@sunshineco.com
Subject: [PATCH v2] fetch: remove fetch_if_missing=0
Date: Tue,  5 Nov 2019 10:56:19 -0800	[thread overview]
Message-ID: <20191105185619.207173-1-jonathantanmy@google.com> (raw)
In-Reply-To: <20191101203830.231676-1-jonathantanmy@google.com>

In fetch_pack() (and all functions it calls), pass
OBJECT_INFO_SKIP_FETCH_OBJECT whenever we query an object that could be
a tree or blob that we do not want to be lazy-fetched even if it is
absent. Thus, the only lazy-fetches occurring for trees and blobs are
when resolving deltas.

Thus, we can remove fetch_if_missing=0 from builtin/fetch.c. Remove
this, and also add a test ensuring that such objects are not
lazy-fetched. (We might be able to remove fetch_if_missing=0 from other
places too, but I have limited myself to builtin/fetch.c in this commit
because I have not written tests for the other commands yet.)

Note that commits and tags may still be lazy-fetched. I limited myself
to objects that could be trees or blobs here because Git does not
support creating such commit- and tag-excluding clones yet, and even if
such a clone were manually created, Git does not have good support for
fetching a single commit (when fetching a commit, it and all its
ancestors would be sent).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Changes from v1:
- Added NEEDSWORK in test, suggested by Jonathan Nieder
- Used printf in test, suggested by Eric Sunshine

Range-diff against v1:
1:  f52c8d4c11 ! 1:  8e93cb8b02 fetch: remove fetch_if_missing=0
    @@ Commit message
         ancestors would be sent).
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
      ## builtin/fetch.c ##
     @@ builtin/fetch.c: static int check_exist_and_connected(struct ref *ref_map)
    @@ t/t5616-partial-clone.sh: test_expect_success 'partial clone with unresolvable s
     +setup_triangle () {
     +	rm -rf big-blob.txt server client promisor-remote &&
     +
    -+	touch big-blob.txt &&
    -+	for i in $(seq 1 100)
    -+	do
    -+		echo line $i >>big-blob.txt
    -+	done &&
    ++	printf "line %d\n" $(test_seq 1 100) >big-blob.txt
     +
     +	# Create a server with 2 commits: a commit with a big blob and a child
     +	# commit with an incremental change. Also, create a partial clone
    @@ t/t5616-partial-clone.sh: test_expect_success 'partial clone with unresolvable s
     +	git -C client remote set-url origin "file://$(pwd)/promisor-remote"
     +}
     +
    ++# NEEDSWORK: The tests beginning with "fetch lazy-fetches" below only
    ++# test that "fetch" avoid fetching trees and blobs, but not commits or
    ++# tags. Revisit this if Git is ever taught to support partial clones
    ++# with commits and/or tags filtered out.
    ++
     +test_expect_success 'fetch lazy-fetches only to resolve deltas' '
     +	setup_triangle &&
     +

 builtin/fetch.c          |  5 ++-
 fetch-pack.c             |  3 +-
 t/t5616-partial-clone.sh | 70 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 0c345b5dfe..5ff7367dd7 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1074,7 +1074,8 @@ static int check_exist_and_connected(struct ref *ref_map)
 	 * we need all direct targets to exist.
 	 */
 	for (r = rm; r; r = r->next) {
-		if (!has_object_file(&r->old_oid))
+		if (!has_object_file_with_flags(&r->old_oid,
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
 			return -1;
 	}
 
@@ -1755,8 +1756,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	packet_trace_identity("fetch");
 
-	fetch_if_missing = 0;
-
 	/* Record the command line for the reflog */
 	strbuf_addstr(&default_rla, "fetch");
 	for (i = 1; i < argc; i++)
diff --git a/fetch-pack.c b/fetch-pack.c
index 0130b44112..37178e2d34 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -673,7 +673,8 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 		struct object *o;
 
 		if (!has_object_file_with_flags(&ref->old_oid,
-						OBJECT_INFO_QUICK))
+						OBJECT_INFO_QUICK |
+							OBJECT_INFO_SKIP_FETCH_OBJECT))
 			continue;
 		o = parse_object(the_repository, &ref->old_oid);
 		if (!o)
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 79f7b65f8c..e785fba7ed 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -296,6 +296,76 @@ test_expect_success 'partial clone with unresolvable sparse filter fails cleanly
 	test_i18ngrep "unable to parse sparse filter data in" err
 '
 
+setup_triangle () {
+	rm -rf big-blob.txt server client promisor-remote &&
+
+	printf "line %d\n" $(test_seq 1 100) >big-blob.txt
+
+	# Create a server with 2 commits: a commit with a big blob and a child
+	# commit with an incremental change. Also, create a partial clone
+	# client that only contains the first commit.
+	git init server &&
+	git -C server config --local uploadpack.allowfilter 1 &&
+	cp big-blob.txt server &&
+	git -C server add big-blob.txt &&
+	git -C server commit -m "initial" &&
+	git clone --bare --filter=tree:0 "file://$(pwd)/server" client &&
+	echo another line >>server/big-blob.txt &&
+	git -C server commit -am "append line to big blob" &&
+
+	# Create a promisor remote that only contains the blob from the first
+	# commit, and set it as the promisor remote of client. Thus, whenever
+	# the client lazy fetches, the lazy fetch will succeed only if it is
+	# for this blob.
+	git init promisor-remote &&
+	test_commit -C promisor-remote one && # so that ref advertisement is not empty
+	git -C promisor-remote config --local uploadpack.allowanysha1inwant 1 &&
+	git -C promisor-remote hash-object -w --stdin <big-blob.txt &&
+	git -C client remote set-url origin "file://$(pwd)/promisor-remote"
+}
+
+# NEEDSWORK: The tests beginning with "fetch lazy-fetches" below only
+# test that "fetch" avoid fetching trees and blobs, but not commits or
+# tags. Revisit this if Git is ever taught to support partial clones
+# with commits and/or tags filtered out.
+
+test_expect_success 'fetch lazy-fetches only to resolve deltas' '
+	setup_triangle &&
+
+	# Exercise to make sure it works. Git will not fetch anything from the
+	# promisor remote other than for the big blob (because it needs to
+	# resolve the delta).
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client \
+		fetch "file://$(pwd)/server" master &&
+
+	# Verify the assumption that the client needed to fetch the delta base
+	# to resolve the delta.
+	git hash-object big-blob.txt >hash &&
+	grep "want $(cat hash)" trace
+'
+
+test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
+	setup_triangle &&
+
+	git -C server config --local protocol.version 2 &&
+	git -C client config --local protocol.version 2 &&
+	git -C promisor-remote config --local protocol.version 2 &&
+
+	# Exercise to make sure it works. Git will not fetch anything from the
+	# promisor remote other than for the big blob (because it needs to
+	# resolve the delta).
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client \
+		fetch "file://$(pwd)/server" master &&
+
+	# Verify that protocol version 2 was used.
+	grep "fetch< version 2" trace &&
+
+	# Verify the assumption that the client needed to fetch the delta base
+	# to resolve the delta.
+	git hash-object big-blob.txt >hash &&
+	grep "want $(cat hash)" trace
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.24.0.rc1.363.gb1bccd3e3d-goog


  parent reply	other threads:[~2019-11-05 18:56 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-01 20:38 [PATCH] fetch: remove fetch_if_missing=0 Jonathan Tan
2019-11-01 22:05 ` Jonathan Nieder
2019-11-02  5:55   ` Junio C Hamano
2019-11-02  6:11     ` Eric Sunshine
2019-11-02  5:59   ` Junio C Hamano
2019-11-05 18:53   ` Jonathan Tan
2019-11-05 18:58     ` Jonathan Nieder
2019-11-05 18:56 ` Jonathan Tan [this message]
2019-11-05 20:06   ` [PATCH v2] " Eric Sunshine
2019-11-06  1:45   ` Junio C Hamano
2019-11-08  6:33   ` Junio C Hamano
2019-11-08  7:40     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191105185619.207173-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.