All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
	matheus.bernardino@usp.br, gitster@pobox.com
Subject: [PATCH v3 1/8] submodule: lazily add submodule ODBs as alternates
Date: Mon, 16 Aug 2021 14:09:51 -0700	[thread overview]
Message-ID: <5994a517e8afc345e8f649b2368756e22b0e9ebe.1629148153.git.jonathantanmy@google.com> (raw)
In-Reply-To: <cover.1629148153.git.jonathantanmy@google.com>

Teach Git to add submodule ODBs as alternates to the object store of
the_repository only upon the first access of an object not in
the_repository, and not when add_submodule_odb() is called.

This provides a means of gradually migrating from accessing a
submodule's object through alternates to accessing a submodule's object
by explicitly passing its repository object. Any Git command can declare
that it might access submodule objects by calling add_submodule_odb()
(as they do now), but the submodule ODBs themselves will not be added
until needed, so individual commands and/or combinations of arguments
can be migrated one by one.

[The advantage of explicit repository-object passing is code clarity (it
is clear which repository an object read is from), performance (there is
no need to linearly search through all submodule ODBs whenever an object
is accessed from any repository, whether superproject or submodule), and
the possibility of future features like partial clone submodules (which
right now is not possible because if an object is missing, we do not
know which repository to lazy-fetch into).]

This commit also introduces an environment variable that a test may set
to make the actual registration of alternates fatal, in order to
demonstrate that its codepaths do not need this registration.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 object-file.c |  5 +++++
 submodule.c   | 20 +++++++++++++++++++-
 submodule.h   |  7 +++++++
 t/README      | 10 ++++++++++
 4 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 3d27dc8dea..621b121bcb 100644
--- a/object-file.c
+++ b/object-file.c
@@ -32,6 +32,7 @@
 #include "packfile.h"
 #include "object-store.h"
 #include "promisor-remote.h"
+#include "submodule.h"
 
 /* The maximum size for an object header. */
 #define MAX_HEADER_LEN 32
@@ -1592,6 +1593,10 @@ static int do_oid_object_info_extended(struct repository *r,
 				break;
 		}
 
+		if (register_all_submodule_odb_as_alternates())
+			/* We added some alternates; retry */
+			continue;
+
 		/* Check if it is a missing object */
 		if (fetch_if_missing && repo_has_promisor_remote(r) &&
 		    !already_retried &&
diff --git a/submodule.c b/submodule.c
index 8e611fe1db..8fde90e906 100644
--- a/submodule.c
+++ b/submodule.c
@@ -165,6 +165,8 @@ void stage_updated_gitmodules(struct index_state *istate)
 		die(_("staging updated .gitmodules failed"));
 }
 
+static struct string_list added_submodule_odb_paths = STRING_LIST_INIT_NODUP;
+
 /* TODO: remove this function, use repo_submodule_init instead. */
 int add_submodule_odb(const char *path)
 {
@@ -178,12 +180,28 @@ int add_submodule_odb(const char *path)
 		ret = -1;
 		goto done;
 	}
-	add_to_alternates_memory(objects_directory.buf);
+	string_list_insert(&added_submodule_odb_paths,
+			   strbuf_detach(&objects_directory, NULL));
 done:
 	strbuf_release(&objects_directory);
 	return ret;
 }
 
+int register_all_submodule_odb_as_alternates(void)
+{
+	int i;
+	int ret = added_submodule_odb_paths.nr;
+
+	for (i = 0; i < added_submodule_odb_paths.nr; i++)
+		add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
+	if (ret) {
+		string_list_clear(&added_submodule_odb_paths, 0);
+		if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
+			BUG("register_all_submodule_odb_as_alternates() called");
+	}
+	return ret;
+}
+
 void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 					     const char *path)
 {
diff --git a/submodule.h b/submodule.h
index 84640c49c1..c252784bc2 100644
--- a/submodule.h
+++ b/submodule.h
@@ -97,7 +97,14 @@ int submodule_uses_gitfile(const char *path);
 #define SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED (1<<2)
 int bad_to_remove_submodule(const char *path, unsigned flags);
 
+/*
+ * Call add_submodule_odb() to add the submodule at the given path to a list.
+ * When register_all_submodule_odb_as_alternates() is called, the object stores
+ * of all submodules in that list will be added as alternates in
+ * the_repository.
+ */
 int add_submodule_odb(const char *path);
+int register_all_submodule_odb_as_alternates(void);
 
 /*
  * Checks if there are submodule changes in a..b. If a is the null OID,
diff --git a/t/README b/t/README
index 9e70122302..8b67b4f00b 100644
--- a/t/README
+++ b/t/README
@@ -448,6 +448,16 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
 to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
 execution of the parallel-checkout code.
 
+GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=<boolean>, when true, makes
+registering submodule ODBs as alternates a fatal action. Support for
+this environment variable can be removed once the migration to
+explicitly providing repositories when accessing submodule objects is
+complete (in which case we might want to replace this with a trace2
+call so that users can make it visible if accessing submodule objects
+without an explicit repository still happens) or needs to be abandoned
+for whatever reason (in which case the migrated codepaths still retain
+their performance benefits).
+
 Naming Tests
 ------------
 
-- 
2.33.0.rc1.237.g0d66db33f3-goog


  reply	other threads:[~2021-08-16 21:10 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-10 18:28 [PATCH 0/7] In grep, no adding submodule ODB as alternates Jonathan Tan
2021-08-10 18:28 ` [PATCH 1/7] submodule: lazily add submodule ODBs " Jonathan Tan
2021-08-10 21:13   ` Junio C Hamano
2021-08-13 16:53     ` Jonathan Tan
2021-08-11 21:33   ` Emily Shaffer
2021-08-13 16:23     ` Jonathan Tan
2021-08-10 18:28 ` [PATCH 2/7] grep: use submodule-ODB-as-alternate lazy-addition Jonathan Tan
2021-08-11 21:36   ` Emily Shaffer
2021-08-13 16:31     ` Jonathan Tan
2021-08-10 18:28 ` [PATCH 3/7] grep: typesafe versions of grep_source_init Jonathan Tan
2021-08-10 21:38   ` Junio C Hamano
2021-08-11 21:42   ` Emily Shaffer
2021-08-11 23:07     ` Ramsay Jones
2021-08-13 16:32       ` Jonathan Tan
2021-08-11 22:45   ` Matheus Tavares Bernardino
2021-08-12 16:49     ` Junio C Hamano
2021-08-13 16:33       ` Jonathan Tan
2021-08-10 18:28 ` [PATCH 4/7] grep: read submodule entry with explicit repo Jonathan Tan
2021-08-11 21:44   ` Emily Shaffer
2021-08-13 16:39     ` Jonathan Tan
2021-08-10 18:28 ` [PATCH 5/7] grep: allocate subrepos on heap Jonathan Tan
2021-08-11 21:50   ` Emily Shaffer
2021-08-13 16:42     ` Jonathan Tan
2021-08-10 18:28 ` [PATCH 6/7] grep: add repository to OID grep sources Jonathan Tan
2021-08-11 21:52   ` Emily Shaffer
2021-08-13 16:44     ` Jonathan Tan
2021-08-11 23:28   ` Matheus Tavares Bernardino
2021-08-13 16:47     ` Jonathan Tan
2021-08-10 18:28 ` [PATCH 7/7] t7814: show lack of alternate ODB-adding Jonathan Tan
2021-08-11 21:55   ` Emily Shaffer
2021-08-11 22:22     ` Matheus Tavares Bernardino
2021-08-13 16:50       ` Jonathan Tan
2021-08-11 21:29 ` [PATCH 0/7] In grep, no adding submodule ODB as alternates Emily Shaffer
2021-08-11 22:49 ` Josh Steadmon
2021-08-13 21:05 ` [PATCH v2 0/8] " Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 1/8] submodule: lazily add submodule ODBs " Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 2/8] grep: use submodule-ODB-as-alternate lazy-addition Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 3/8] grep: typesafe versions of grep_source_init Jonathan Tan
2021-08-16 15:06     ` Matheus Tavares Bernardino
2021-08-13 21:05   ` [PATCH v2 4/8] grep: read submodule entry with explicit repo Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 5/8] grep: allocate subrepos on heap Jonathan Tan
2021-08-13 21:44     ` Junio C Hamano
2021-08-16 19:42       ` Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 6/8] grep: add repository to OID grep sources Jonathan Tan
2021-08-16 14:48     ` Matheus Tavares Bernardino
2021-08-16 19:44       ` Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 7/8] submodule-config: pass repo upon blob config read Jonathan Tan
2021-08-16 14:32     ` Matheus Tavares Bernardino
2021-08-16 19:57       ` Matheus Tavares Bernardino
2021-08-16 20:02       ` Jonathan Tan
2021-08-16 15:48     ` Matheus Tavares Bernardino
2021-08-16 20:09       ` Jonathan Tan
2021-08-16 20:57         ` Jonathan Tan
2021-08-13 21:05   ` [PATCH v2 8/8] t7814: show lack of alternate ODB-adding Jonathan Tan
2021-08-16 15:14   ` [PATCH v2 0/8] In grep, no adding submodule ODB as alternates Matheus Tavares Bernardino
2021-08-16 21:09 ` [PATCH v3 " Jonathan Tan
2021-08-16 21:09   ` Jonathan Tan [this message]
2021-08-16 21:09   ` [PATCH v3 2/8] grep: use submodule-ODB-as-alternate lazy-addition Jonathan Tan
2021-08-16 21:09   ` [PATCH v3 3/8] grep: typesafe versions of grep_source_init Jonathan Tan
2021-08-16 21:09   ` [PATCH v3 4/8] grep: read submodule entry with explicit repo Jonathan Tan
2021-08-16 21:09   ` [PATCH v3 5/8] grep: allocate subrepos on heap Jonathan Tan
2021-08-16 21:09   ` [PATCH v3 6/8] grep: add repository to OID grep sources Jonathan Tan
2021-09-27 12:08     ` Ævar Arnfjörð Bjarmason
2021-09-27 16:45       ` [RFC PATCH 0/3] grep: don'\''t add subrepos to in-memory alternates Matheus Tavares
2021-09-27 17:30         ` Ævar Arnfjörð Bjarmason
2021-08-16 21:09   ` [PATCH v3 7/8] submodule-config: pass repo upon blob config read Jonathan Tan
2021-08-16 21:09   ` [PATCH v3 8/8] t7814: show lack of alternate ODB-adding Jonathan Tan
2021-08-17 19:29   ` [PATCH v3 0/8] In grep, no adding submodule ODB as alternates Matheus Tavares Bernardino
2021-09-08  0:26   ` Junio C Hamano
2021-09-08 15:31     ` Matheus Tavares Bernardino
2021-09-08 18:45       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5994a517e8afc345e8f649b2368756e22b0e9ebe.1629148153.git.jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=matheus.bernardino@usp.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.