git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Emily Shaffer <emilyshaffer@google.com>
To: git@vger.kernel.org
Cc: "Emily Shaffer" <emilyshaffer@google.com>,
	"Albert Cui" <albertcui@google.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Matheus Tavares Bernardino" <matheus.bernardino@usp.br>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Jacob Keller" <jacob.keller@gmail.com>,
	"Atharva Raykar" <raykar.ath@gmail.com>
Subject: [RFC PATCH 0/4] cache parent project's gitdir in submodules
Date: Fri, 11 Jun 2021 15:54:24 -0700	[thread overview]
Message-ID: <20210611225428.1208973-1-emilyshaffer@google.com> (raw)

It's necessary for a superproject to know which submodules it contains.
However, historically submodules do not know they are anything but a
normal single-repo Git project (or a superproject, in nested-submodules
cases). This decision does help prevent us from having to support
divergent behaviors in submodule projects vs. superprojects, which makes
sure Git is (somewhat) less confusing for the reader, and helps simplify
our code.

One could imagine, though, some conveniences we could gain from
submodules learning added behavior (though not necessarily *different*
behavior) to provide more context about the state of the project as a
whole, and to make large submodule-based projects easier to work with.
One example is a series[1] I sent some time ago, adding a config to be
shared between the superproject and all its submodules. The RFC[2] I
sent around the same time mentions a few other examples, such as "git
status" run in the submodule noticing whether the superproject has a
commit referencing the submodule's current HEAD.

It's expensive and non-definitive to try and guess whether or not the
current repo is a submodule. submodule.c:get_superproject_working_tree()
does so by essentially running 'git -C .. ls-files -- <own-path>',
invoking an additional process. get_superproject_working_tree() is not
called often, so that's mostly fine. However, [1] attempted to include
an additional config located in the superproject's gitdir by running
'git -C .. rev-parse --git-dir' during startup - a little expensive in
the best case, because it's an extra process, but extremely expensive in
the case when the current repo is *not* a submodule, because we hunt all
the way up the filesystem looking for a '.git'. Adding that cost to
every startup is infeasible.

To that end, in this series I propose caching a path to the
superproject's gitdir - by having the superproject write that relative
path to the submodule's config on creation or update. The goal here is
*not* to say "If I am a submodule, I must have
submodule.superprojectGitDir set" - but instead to say "If I have
submodule.superprojectGitDir set, then I must be a submodule." That is,
I expect we will find edge cases where a submodule was introduced in
some interesting way that bypassed any of the patches below, and
therefore doesn't have the superproject's gitdir cached.

The combination of these two rules:
 - Anything relying on submodule.superprojectGitDir must be nice to
   have, but not essential, because
 - It's possible for a submodule to be valid without having
   submodule.superprojectGitDir set
makes me feel more comfortable with the idea of submodules learning
additional behavior based on this config. I feel pretty unconfident in
our ability to ensure that *every* submodule has this config set.

The series covers a few paths for introducing that config, which I'm
hoping covers most cases.
 - "git submodule update" (which seems to be part of the "git submodule
   init" flow)
 - "git submodule absorbgitdir" to convert a "git init"'d repo into a
   submodule

Notably, we can only really set this config when 'the_repository' is the
superproject - that appears to be the only time when we know the gitdirs
of both the superproject and the submodule.

I'm expecting folks may have a lot to say about this, so I look forward
to discussion :)

 - Emily

1: https://lore.kernel.org/git/20210423001539.4059524-1-emilyshaffer@google.com
2: https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com

Emily Shaffer (4):
  t7400-submodule-basic: modernize inspect() helper
  introduce submodule.superprojectGitDir cache
  submodule: cache superproject gitdir during absorbgitdirs
  submodule: cache superproject gitdir during 'update'

 builtin/submodule--helper.c        |  4 +++
 git-submodule.sh                   |  9 ++++++
 submodule.c                        | 10 ++++++
 t/t7400-submodule-basic.sh         | 49 ++++++++++++++----------------
 t/t7406-submodule-update.sh        | 10 ++++++
 t/t7412-submodule-absorbgitdirs.sh |  1 +
 6 files changed, 57 insertions(+), 26 deletions(-)

-- 
2.32.0.272.g935e593368-goog


             reply	other threads:[~2021-06-11 22:54 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-11 22:54 Emily Shaffer [this message]
2021-06-11 22:54 ` [RFC PATCH 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-06-14  4:52   ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-14  5:09   ` Junio C Hamano
2021-06-15 22:00     ` Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-14  6:18   ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-06-14  6:22   ` Junio C Hamano
2021-06-15 21:27     ` Emily Shaffer
2021-06-12 20:12 ` [RFC PATCH 0/4] cache parent project's gitdir in submodules Jacob Keller
2021-06-14  7:26 ` Junio C Hamano
2021-06-15 21:18   ` Emily Shaffer
2021-06-16  0:45 ` [PATCH v2 " Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-07-27 17:12     ` Jonathan Tan
2021-08-19 17:46       ` Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-16  4:40     ` Junio C Hamano
2021-06-16  4:43       ` Junio C Hamano
2021-06-18  0:03         ` Emily Shaffer
2021-06-18  0:00       ` Emily Shaffer
2021-07-27 17:46     ` Jonathan Tan
2021-08-19 17:53       ` Emily Shaffer
2021-10-14 19:25     ` Ævar Arnfjörð Bjarmason
2021-06-16  0:45   ` [PATCH v2 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-07-27 17:51     ` Jonathan Tan
2021-08-19 18:02       ` Emily Shaffer
2021-08-19 20:09   ` [PATCH v3 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-08-20  0:38       ` Derrick Stolee
2021-10-13 19:36         ` Emily Shaffer
2021-09-04 17:20       ` Matheus Tavares
2021-10-13 19:39         ` Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-08-20  0:50       ` Derrick Stolee
2021-10-13 19:42         ` Emily Shaffer
2021-09-04 17:27       ` Matheus Tavares
2021-10-14 18:40         ` Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-08-20  0:59       ` Derrick Stolee
2021-10-14 18:45         ` Emily Shaffer
2021-08-19 21:56     ` [PATCH v3 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-08-20  1:09     ` Derrick Stolee
2021-10-13 18:51       ` Emily Shaffer
2021-10-14 17:12         ` Derrick Stolee
2021-10-14 18:52           ` Emily Shaffer
2021-09-04 17:50     ` Matheus Tavares Bernardino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210611225428.1208973-1-emilyshaffer@google.com \
    --to=emilyshaffer@google.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=albertcui@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=matheus.bernardino@usp.br \
    --cc=phillip.wood123@gmail.com \
    --cc=raykar.ath@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).