All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>
Subject: [RFC PATCH] diff: only prefetch for certain output formats
Date: Tue, 28 Jan 2020 13:35:08 -0800	[thread overview]
Message-ID: <20200128213508.31661-1-jonathantanmy@google.com> (raw)

Running "git status" on a partial clone that was cloned with
"--no-checkout", like this:

  git clone --no-checkout --filter=blob:none <url> foo
  git -C foo status

results in an unnecessary lazy fetch. This is because of commit
7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08), which
optimized "diff" by prefetching, but inadvertently affected at least one
command ("status") that does not need prefetching. These 2 cases can be
distinguished by looking at output_format; the former uses
DIFF_FORMAT_PATCH and the latter uses DIFF_FORMAT_CALLBACK.

Therefore, restrict prefetching only to output_format values like the
one used by "diff".

(Note that other callers that use DIFF_FORMAT_CALLBACK will also lose
prefetching. I haven't investigated enough to see if this is a net
benefit or drawback, but I think this will need to be done on a
caller-by-caller basis and can be done in the future.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Points of discussion I can think of:

 - Is the whitelist of output_format constants the best?
 - Should we just have the callers pass a prefetch boolean arg instead
   of trying to guess it ourselves? I'm leaning towards no since I think
   we should avoid options unless they are necessary.

Perhaps there are others.
---
 diff.c                   | 10 +++++++++-
 t/t0410-partial-clone.sh | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 8e2914c031..8263491783 100644
--- a/diff.c
+++ b/diff.c
@@ -6504,7 +6504,15 @@ static void add_if_missing(struct repository *r,
 
 void diffcore_std(struct diff_options *options)
 {
-	if (options->repo == the_repository && has_promisor_remote()) {
+	int output_formats_to_prefetch = DIFF_FORMAT_RAW |
+		DIFF_FORMAT_DIFFSTAT |
+		DIFF_FORMAT_NUMSTAT |
+		DIFF_FORMAT_SUMMARY |
+		DIFF_FORMAT_PATCH |
+		DIFF_FORMAT_SHORTSTAT |
+		DIFF_FORMAT_DIRSTAT;
+	if ((options->output_format & output_formats_to_prefetch) &&
+	    options->repo == the_repository && has_promisor_remote()) {
 		/*
 		 * Prefetch the diff pairs that are about to be flushed.
 		 */
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index a3988bd4b8..d72631a3a0 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -567,6 +567,19 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
 	git -C repo cherry-pick side1
 '
 
+test_expect_success 'do not fetch when running status against --no-checkout partial clone' '
+	rm -rf server client trace &&
+	test_create_repo server &&
+	test_commit -C server one &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	git clone --no-checkout --filter=blob:none "file://$(pwd)/server" client &&
+
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client status &&
+	! test_path_exists trace
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.25.0.341.g760bfbb309-goog


             reply	other threads:[~2020-01-28 21:35 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-28 21:35 Jonathan Tan [this message]
2020-01-29  5:09 ` [RFC PATCH] diff: only prefetch for certain output formats Jeff King
2020-01-30  1:39   ` Jonathan Tan
2020-01-30  5:51     ` Jeff King
2020-01-30 23:20       ` Jonathan Tan
2020-01-31  0:14         ` Jeff King
2020-01-31 18:08           ` Junio C Hamano
2020-02-01 11:29             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200128213508.31661-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.