From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
stolee@gmail.com, gitster@pobox.com
Subject: [PATCH v3 0/7] Lazy fetch with subprocess
Date: Mon, 17 Aug 2020 21:01:30 -0700 [thread overview]
Message-ID: <cover.1597722941.git.jonathantanmy@google.com> (raw)
In-Reply-To: <20200724223844.2723397-1-jonathantanmy@google.com>
These patches are based on jc/no-update-fetch-head, once again.
I've addressed most reviewer comments by expanding commit messages and
fixing the cascading "if" issue in patch 3.
> Would "git fetch" be the only user potentially benefit from being
> able to dereference a tag that already exists locally and ignore
> missing ones? I wonder if teaching deref_tag() a new flag would be
> a better alternative to keep the separation of concerns at different
> layers.
Right now, I think so - "fetch" is a special case of having to avoid
infinite loops when lazy fetching, and of being able to tolerate missing
ref targets. I don't think most other commands are like that.
Jonathan Tan (7):
negotiator/noop: add noop fetch negotiator
fetch: allow refspecs specified through stdin
fetch: avoid reading submodule config until needed
fetch: only populate existing_refs if needed
fetch-pack: do not lazy-fetch during ref iteration
promisor-remote: lazy-fetch objects in subprocess
fetch-pack: remove no_dependents code
Documentation/config/fetch.txt | 5 +-
Documentation/git-fetch.txt | 4 +
Documentation/technical/partial-clone.txt | 13 +-
Makefile | 1 +
builtin/fetch-pack.c | 4 -
builtin/fetch.c | 42 ++++-
fetch-negotiator.c | 5 +
fetch-pack.c | 189 +++++++++-------------
fetch-pack.h | 14 --
negotiator/noop.c | 44 +++++
negotiator/noop.h | 8 +
promisor-remote.c | 46 +++---
remote-curl.c | 6 -
repo-settings.c | 2 +
repository.h | 1 +
submodule-config.c | 8 +-
t/t0410-partial-clone.sh | 2 +-
t/t4067-diff-partial-clone.sh | 8 +-
t/t5554-noop-fetch-negotiator.sh | 22 +++
t/t5601-clone.sh | 2 +-
t/t5616-partial-clone.sh | 20 +++
transport.c | 4 -
transport.h | 7 -
23 files changed, 255 insertions(+), 202 deletions(-)
create mode 100644 negotiator/noop.c
create mode 100644 negotiator/noop.h
create mode 100755 t/t5554-noop-fetch-negotiator.sh
Range-diff against v2:
1: 35bdd372ae ! 1: 5b3b49d123 negotiator/null: add null fetch negotiator
@@ Metadata
Author: Jonathan Tan <jonathantanmy@google.com>
## Commit message ##
- negotiator/null: add null fetch negotiator
+ negotiator/noop: add noop fetch negotiator
- Add a null fetch negotiator. This is introduced to allow partial clones
+ Add a noop fetch negotiator. This is introduced to allow partial clones
to skip the unneeded negotiation step when fetching missing objects
using a "git fetch" subprocess. (The implementation of spawning a "git
fetch" subprocess will be done in a subsequent patch.) But this can also
@@ Documentation/config/fetch.txt: fetch.negotiationAlgorithm::
server. Set to "skipping" to use an algorithm that skips commits in an
effort to converge faster, but may result in a larger-than-necessary
- packfile; The default is "default" which instructs Git to use the default algorithm
-+ packfile; or set to "null" to not send any information at all, which
++ packfile; or set to "noop" to not send any information at all, which
+ will almost certainly result in a larger-than-necessary packfile, but
+ will skip the negotiation step.
+ The default is "default" which instructs Git to use the default algorithm
@@ Makefile: LIB_OBJS += mergesort.o
LIB_OBJS += midx.o
LIB_OBJS += name-hash.o
LIB_OBJS += negotiator/default.o
-+LIB_OBJS += negotiator/null.o
++LIB_OBJS += negotiator/noop.o
LIB_OBJS += negotiator/skipping.o
LIB_OBJS += notes-cache.o
LIB_OBJS += notes-merge.o
@@ fetch-negotiator.c
#include "fetch-negotiator.h"
#include "negotiator/default.h"
#include "negotiator/skipping.h"
-+#include "negotiator/null.h"
++#include "negotiator/noop.h"
#include "repository.h"
void fetch_negotiator_init(struct repository *r,
@@ fetch-negotiator.c: void fetch_negotiator_init(struct repository *r,
skipping_negotiator_init(negotiator);
return;
-+ case FETCH_NEGOTIATION_NULL:
-+ null_negotiator_init(negotiator);
++ case FETCH_NEGOTIATION_NOOP:
++ noop_negotiator_init(negotiator);
+ return;
+
case FETCH_NEGOTIATION_DEFAULT:
default:
default_negotiator_init(negotiator);
- ## negotiator/null.c (new) ##
+ ## negotiator/noop.c (new) ##
@@
+#include "cache.h"
-+#include "null.h"
++#include "noop.h"
+#include "../commit.h"
+#include "../fetch-negotiator.h"
+
@@ negotiator/null.c (new)
+ * This negotiator does not emit any commits, so there is no commit to
+ * be acknowledged. If there is any ack, there is a bug.
+ */
-+ BUG("ack with null negotiator, which does not emit any commits");
++ BUG("ack with noop negotiator, which does not emit any commits");
+ return 0;
+}
+
@@ negotiator/null.c (new)
+ /* nothing to release */
+}
+
-+void null_negotiator_init(struct fetch_negotiator *negotiator)
++void noop_negotiator_init(struct fetch_negotiator *negotiator)
+{
+ negotiator->known_common = known_common;
+ negotiator->add_tip = add_tip;
@@ negotiator/null.c (new)
+ negotiator->data = NULL;
+}
- ## negotiator/null.h (new) ##
+ ## negotiator/noop.h (new) ##
@@
-+#ifndef NEGOTIATOR_NULL_H
-+#define NEGOTIATOR_NULL_H
++#ifndef NEGOTIATOR_NOOP_H
++#define NEGOTIATOR_NOOP_H
+
+struct fetch_negotiator;
+
-+void null_negotiator_init(struct fetch_negotiator *negotiator);
++void noop_negotiator_init(struct fetch_negotiator *negotiator);
+
+#endif
@@ repo-settings.c: void prepare_repo_settings(struct repository *r)
if (!repo_config_get_string(r, "fetch.negotiationalgorithm", &strval)) {
if (!strcasecmp(strval, "skipping"))
r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
-+ else if (!strcasecmp(strval, "null"))
-+ r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NULL;
++ else if (!strcasecmp(strval, "noop"))
++ r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NOOP;
else
r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
}
@@ repository.h: enum fetch_negotiation_setting {
FETCH_NEGOTIATION_NONE = 0,
FETCH_NEGOTIATION_DEFAULT = 1,
FETCH_NEGOTIATION_SKIPPING = 2,
-+ FETCH_NEGOTIATION_NULL = 3,
++ FETCH_NEGOTIATION_NOOP = 3,
};
struct repo_settings {
- ## t/t5554-null-fetch-negotiator.sh (new) ##
+ ## t/t5554-noop-fetch-negotiator.sh (new) ##
@@
+#!/bin/sh
+
-+test_description='test null fetch negotiator'
++test_description='test noop fetch negotiator'
+. ./test-lib.sh
+
-+test_expect_success 'null negotiator does not emit any "have"' '
++test_expect_success 'noop negotiator does not emit any "have"' '
+ rm -f trace &&
+
+ test_create_repo server &&
@@ t/t5554-null-fetch-negotiator.sh (new)
+ test_create_repo client &&
+ test_commit -C client we_have &&
+
-+ test_config -C client fetch.negotiationalgorithm null &&
++ test_config -C client fetch.negotiationalgorithm noop &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -C client fetch "$(pwd)/server" &&
+
+ ! grep "fetch> have" trace &&
2: 00ad7dd875 = 2: 9f277f1631 fetch: allow refspecs specified through stdin
3: 8b4a522a13 ! 3: fda9f834f6 fetch: avoid reading submodule config until needed
@@ Metadata
## Commit message ##
fetch: avoid reading submodule config until needed
- Teach "git fetch" to avoid reading the submodule config until necessary.
- This allows users to avoid the lazy-fetching of this potentially missing
- config file by specifying the --recurse-submodules=no command line
- option.
+ In "fetch", there are two parameters submodule_fetch_jobs_config and
+ recurse_submodules that can be set in a variety of ways: through
+ .gitmodules, through .git/config, and through the command line.
+ Currently "fetch" handles this by first reading .gitmodules, then
+ reading .git/config (allowing it to overwrite existing values), then
+ reading the command line (allowing it to overwrite existing values).
+
+ Notice that we can avoid reading .gitmodules if .git/config and/or the
+ command line already provides us with what we need. In addition, if
+ recurse_submodules is found to be "no", we do not need the value of
+ submodule_fetch_jobs_config.
+
+ Avoiding reading .gitmodules is especially important when we use "git
+ fetch" to perform lazy fetches in a partial clone because the
+ .gitmodules file itself might need to be lazy fetched (and otherwise
+ causing an infinite loop).
+
+ In light of all this, avoid reading .gitmodules until necessary. When
+ reading it, we may only need one of the two parameters it provides, so
+ teach fetch_config_from_gitmodules() to support NULL arguments. With
+ this patch, users (including Git itself when invoking "git fetch" to
+ lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
+ passing --recurse-submodules=no.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
@@ builtin/fetch.c: int cmd_fetch(int argc, const char **argv, const char *prefix)
if (deepen_relative < 0)
## submodule-config.c ##
-@@ submodule-config.c: struct fetch_config {
- static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
+@@ submodule-config.c: static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
{
struct fetch_config *config = cb;
-- if (!strcmp(var, "submodule.fetchjobs")) {
-+ if (!strcmp(var, "submodule.fetchjobs") && config->max_children) {
- *(config->max_children) = parse_submodule_fetchjobs(var, value);
+ if (!strcmp(var, "submodule.fetchjobs")) {
+- *(config->max_children) = parse_submodule_fetchjobs(var, value);
++ if (config->max_children)
++ *(config->max_children) =
++ parse_submodule_fetchjobs(var, value);
return 0;
-- } else if (!strcmp(var, "fetch.recursesubmodules")) {
-+ } else if (!strcmp(var, "fetch.recursesubmodules") &&
-+ config->recurse_submodules) {
- *(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
+ } else if (!strcmp(var, "fetch.recursesubmodules")) {
+- *(config->recurse_submodules) = parse_fetch_recurse_submodules_arg(var, value);
++ if (config->recurse_submodules)
++ *(config->recurse_submodules) =
++ parse_fetch_recurse_submodules_arg(var, value);
return 0;
}
+
4: 77bc83e7f2 ! 4: a5554cd27f fetch: only populate existing_refs if needed
@@ Metadata
## Commit message ##
fetch: only populate existing_refs if needed
- When fetching tags, Git only writes tags that do not already exist in
- the client repository. This necessitates an iteration over all the refs,
- but fetch performs this iteration even if no tags are fetched.
+ In "fetch", get_ref_map() iterates over all refs to populate
+ "existing_refs" in order to populate peer_ref->old_oid in the returned
+ refmap, even if the refmap has no peer_ref set - which is the case when
+ only literal hashes (i.e. no refs by name) are fetched.
- This issue is more severe in a partial clone because the iteration over
- refs also checks that the targets of those refs are present,
- necessitating a lazy fetch if the target is missing.
+ Iterating over refs causes the targets of those refs to be checked for
+ existence. Avoiding this is especially important when we use "git fetch"
+ to perform lazy fetches in a partial clone because a target of such a
+ ref may need to be itself lazy-fetched (and otherwise causing an
+ infinite loop).
- Therefore, iterate over the refs only when necessary. The user can avoid
- this iteration by refraining from fetching tags, for example, by passing
- --no-tags as an argument. A subsequent patch will also teach Git to use
- "git fetch" to lazy-fetch missing objects in a partial clone, thus also
- making use of this change.
+ Therefore, avoid populating "existing_refs" until necessary. With this
+ patch, because Git lazy-fetches objects by literal hashes (to be done in
+ a subsequent commit), it will then be able to guarantee avoiding reading
+ targets of refs.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
5: d42e98ff07 ! 5: f56c24dd1b fetch-pack: do not lazy-fetch during ref iteration
@@ Metadata
## Commit message ##
fetch-pack: do not lazy-fetch during ref iteration
- Teach "fetch-pack" not to lazy fetch whenever iterating over refs. This
- is done by using the raw form of ref iteration and by dereferencing tags
- ourselves.
+ In order to determine negotiation tips, "fetch-pack" iterates over all
+ refs and dereferences all annotated tags found. This causes the
+ existence of targets of refs and annotated tags to be checked. Avoiding
+ this is especially important when we use "git fetch" (which invokes
+ "fetch-pack") to perform lazy fetches in a partial clone because a
+ target of such a ref or annotated tag may need to be itself lazy-fetched
+ (and otherwise causing an infinite loop).
+
+ Therefore, teach "fetch-pack" not to lazy fetch whenever iterating over
+ refs. This is done by using the raw form of ref iteration and by
+ dereferencing tags ourselves.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
6: 55d2e5a4cc = 6: 4e72ca6258 promisor-remote: lazy-fetch objects in subprocess
7: e8f16d6908 = 7: 3ff9d034e9 fetch-pack: remove no_dependents code
--
2.28.0.220.ged08abb693-goog
next prev parent reply other threads:[~2020-08-18 4:02 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-24 22:38 [RFC PATCH] Modify fetch-pack to no longer die on error? Jonathan Tan
2020-07-24 23:07 ` Junio C Hamano
2020-07-24 23:11 ` Junio C Hamano
2020-07-25 21:41 ` Jeff King
2020-07-25 23:01 ` Junio C Hamano
2020-07-27 17:11 ` Jeff King
2020-07-28 19:23 ` Jonathan Tan
2020-07-28 20:08 ` Jeff King
2020-07-29 18:53 ` Jonathan Tan
2020-07-29 19:29 ` Jeff King
2020-07-29 19:02 ` Junio C Hamano
2020-07-29 22:55 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 0/7] Lazy fetch with subprocess Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 1/7] fetch-pack: allow NULL negotiator->add_tip Jonathan Tan
2020-08-05 19:53 ` Junio C Hamano
2020-08-07 20:53 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 2/7] fetch-pack: allow NULL negotiator->known_common Jonathan Tan
2020-08-05 20:08 ` Junio C Hamano
2020-08-05 22:11 ` Junio C Hamano
2020-08-07 20:59 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 3/7] negotiator/null: add null fetch negotiator Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 4/7] fetch: --stdin Jonathan Tan
2020-08-05 20:33 ` Junio C Hamano
2020-08-07 21:10 ` Jonathan Tan
2020-08-07 21:58 ` Junio C Hamano
2020-08-07 21:10 ` Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 5/7] fetch: submodule config Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 6/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-05 1:20 ` [RFC PATCH 7/7] promisor-remote: use subprocess to fetch Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 0/7] Lazy fetch with subprocess Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 1/7] negotiator/null: add null fetch negotiator Jonathan Tan
2020-08-12 12:55 ` Derrick Stolee
2020-08-12 16:44 ` Junio C Hamano
2020-08-12 17:29 ` Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
2020-08-11 22:52 ` [PATCH v2 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
2020-08-12 17:34 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 4/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-12 18:06 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
2020-08-12 18:25 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-12 18:28 ` Junio C Hamano
2020-08-11 22:52 ` [PATCH v2 7/7] fetch-pack: remove no_dependents code Jonathan Tan
2020-08-12 12:51 ` [PATCH v2 0/7] Lazy fetch with subprocess Derrick Stolee
2020-08-18 4:01 ` Jonathan Tan [this message]
2020-08-18 4:01 ` [PATCH v3 1/7] negotiator/noop: add noop fetch negotiator Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 2/7] fetch: allow refspecs specified through stdin Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 3/7] fetch: avoid reading submodule config until needed Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 4/7] fetch: only populate existing_refs if needed Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 5/7] fetch-pack: do not lazy-fetch during ref iteration Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 6/7] promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-18 4:01 ` [PATCH v3 7/7] fetch-pack: remove no_dependents code Jonathan Tan
2020-08-18 19:56 ` [PATCH v3 0/7] Lazy fetch with subprocess Junio C Hamano
2020-08-18 22:32 ` Junio C Hamano
2020-08-18 23:36 ` [PATCH] fixup! promisor-remote: lazy-fetch objects in subprocess Jonathan Tan
2020-08-18 23:57 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1597722941.git.jonathantanmy@google.com \
--to=jonathantanmy@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).