All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jacob Keller <jacob.keller@gmail.com>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: "Git mailing list" <git@vger.kernel.org>,
	"Albert Cui" <albertcui@google.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Matheus Tavares Bernardino" <matheus.bernardino@usp.br>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Atharva Raykar" <raykar.ath@gmail.com>
Subject: Re: [RFC PATCH 0/4] cache parent project's gitdir in submodules
Date: Sat, 12 Jun 2021 13:12:58 -0700	[thread overview]
Message-ID: <CA+P7+xp5NNb6zz1LkQrv_A3Ny41WE=-h2-9VswyL_qzwFdyifw@mail.gmail.com> (raw)
In-Reply-To: <20210611225428.1208973-1-emilyshaffer@google.com>

On Fri, Jun 11, 2021 at 3:54 PM Emily Shaffer <emilyshaffer@google.com> wrote:
>
> It's necessary for a superproject to know which submodules it contains.
> However, historically submodules do not know they are anything but a
> normal single-repo Git project (or a superproject, in nested-submodules
> cases). This decision does help prevent us from having to support
> divergent behaviors in submodule projects vs. superprojects, which makes
> sure Git is (somewhat) less confusing for the reader, and helps simplify
> our code.
>
> One could imagine, though, some conveniences we could gain from
> submodules learning added behavior (though not necessarily *different*
> behavior) to provide more context about the state of the project as a
> whole, and to make large submodule-based projects easier to work with.
> One example is a series[1] I sent some time ago, adding a config to be
> shared between the superproject and all its submodules. The RFC[2] I
> sent around the same time mentions a few other examples, such as "git
> status" run in the submodule noticing whether the superproject has a
> commit referencing the submodule's current HEAD.
>
> It's expensive and non-definitive to try and guess whether or not the
> current repo is a submodule. submodule.c:get_superproject_working_tree()
> does so by essentially running 'git -C .. ls-files -- <own-path>',
> invoking an additional process. get_superproject_working_tree() is not
> called often, so that's mostly fine. However, [1] attempted to include
> an additional config located in the superproject's gitdir by running
> 'git -C .. rev-parse --git-dir' during startup - a little expensive in
> the best case, because it's an extra process, but extremely expensive in
> the case when the current repo is *not* a submodule, because we hunt all
> the way up the filesystem looking for a '.git'. Adding that cost to
> every startup is infeasible.
>

It also adds cost for no benefit to the normal case where it's not a
submodule. The cost added for non-submodules ought to be as near-zero
as possible.

> To that end, in this series I propose caching a path to the
> superproject's gitdir - by having the superproject write that relative
> path to the submodule's config on creation or update. The goal here is
> *not* to say "If I am a submodule, I must have
> submodule.superprojectGitDir set" - but instead to say "If I have
> submodule.superprojectGitDir set, then I must be a submodule." That is,
> I expect we will find edge cases where a submodule was introduced in
> some interesting way that bypassed any of the patches below, and
> therefore doesn't have the superproject's gitdir cached.
>
> The combination of these two rules:
>  - Anything relying on submodule.superprojectGitDir must be nice to
>    have, but not essential, because
>  - It's possible for a submodule to be valid without having
>    submodule.superprojectGitDir set
> makes me feel more comfortable with the idea of submodules learning
> additional behavior based on this config. I feel pretty unconfident in
> our ability to ensure that *every* submodule has this config set.
>

I think this is a good direction. I do think having some information
about being a submodule could be very useful for tools such as git
status, and making it more of a "if we know for sure, we get some
small benefit, but if we don't know then it's no harm" is a good
direction.

> The series covers a few paths for introducing that config, which I'm
> hoping covers most cases.
>  - "git submodule update" (which seems to be part of the "git submodule
>    init" flow)
>  - "git submodule absorbgitdir" to convert a "git init"'d repo into a
>    submodule
>
> Notably, we can only really set this config when 'the_repository' is the
> superproject - that appears to be the only time when we know the gitdirs
> of both the superproject and the submodule.

We could also presumably add a new command for setting this?

>
> I'm expecting folks may have a lot to say about this, so I look forward
> to discussion :)
>
>  - Emily
>
> 1: https://lore.kernel.org/git/20210423001539.4059524-1-emilyshaffer@google.com
> 2: https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com
>
> Emily Shaffer (4):
>   t7400-submodule-basic: modernize inspect() helper
>   introduce submodule.superprojectGitDir cache
>   submodule: cache superproject gitdir during absorbgitdirs
>   submodule: cache superproject gitdir during 'update'
>
>  builtin/submodule--helper.c        |  4 +++
>  git-submodule.sh                   |  9 ++++++
>  submodule.c                        | 10 ++++++
>  t/t7400-submodule-basic.sh         | 49 ++++++++++++++----------------
>  t/t7406-submodule-update.sh        | 10 ++++++
>  t/t7412-submodule-absorbgitdirs.sh |  1 +
>  6 files changed, 57 insertions(+), 26 deletions(-)
>
> --
> 2.32.0.272.g935e593368-goog
>

  parent reply	other threads:[~2021-06-12 20:14 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-11 22:54 [RFC PATCH 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-06-14  4:52   ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-14  5:09   ` Junio C Hamano
2021-06-15 22:00     ` Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-14  6:18   ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-06-14  6:22   ` Junio C Hamano
2021-06-15 21:27     ` Emily Shaffer
2021-06-12 20:12 ` Jacob Keller [this message]
2021-06-14  7:26 ` [RFC PATCH 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-06-15 21:18   ` Emily Shaffer
2021-06-16  0:45 ` [PATCH v2 " Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-07-27 17:12     ` Jonathan Tan
2021-08-19 17:46       ` Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-16  4:40     ` Junio C Hamano
2021-06-16  4:43       ` Junio C Hamano
2021-06-18  0:03         ` Emily Shaffer
2021-06-18  0:00       ` Emily Shaffer
2021-07-27 17:46     ` Jonathan Tan
2021-08-19 17:53       ` Emily Shaffer
2021-10-14 19:25     ` Ævar Arnfjörð Bjarmason
2021-06-16  0:45   ` [PATCH v2 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-07-27 17:51     ` Jonathan Tan
2021-08-19 18:02       ` Emily Shaffer
2021-08-19 20:09   ` [PATCH v3 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-08-20  0:38       ` Derrick Stolee
2021-10-13 19:36         ` Emily Shaffer
2021-09-04 17:20       ` Matheus Tavares
2021-10-13 19:39         ` Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-08-20  0:50       ` Derrick Stolee
2021-10-13 19:42         ` Emily Shaffer
2021-09-04 17:27       ` Matheus Tavares
2021-10-14 18:40         ` Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-08-20  0:59       ` Derrick Stolee
2021-10-14 18:45         ` Emily Shaffer
2021-08-19 21:56     ` [PATCH v3 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-08-20  1:09     ` Derrick Stolee
2021-10-13 18:51       ` Emily Shaffer
2021-10-14 17:12         ` Derrick Stolee
2021-10-14 18:52           ` Emily Shaffer
2021-09-04 17:50     ` Matheus Tavares Bernardino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+P7+xp5NNb6zz1LkQrv_A3Ny41WE=-h2-9VswyL_qzwFdyifw@mail.gmail.com' \
    --to=jacob.keller@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=albertcui@google.com \
    --cc=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=matheus.bernardino@usp.br \
    --cc=phillip.wood123@gmail.com \
    --cc=raykar.ath@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.