git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Glen Choo <chooglen@google.com>
To: Philippe Blain <levraiphilippeblain@gmail.com>,
	Huang Zou <huang.zou@schrodinger.com>,
	git@vger.kernel.org
Subject: Re: Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs
Date: Fri, 06 May 2022 14:50:28 -0700	[thread overview]
Message-ID: <kl6lbkwa8h5n.fsf@chooglen-macbookpro.roam.corp.google.com> (raw)
In-Reply-To: <fc492627-c552-10ec-b30c-820299241278@gmail.com>

Philippe Blain <levraiphilippeblain@gmail.com> writes:

> Hi Huang,
>
> Le 2022-05-02 à 10:42, Huang Zou a écrit :
>> Thank you for filling out a Git bug report!
>> Please answer the following questions to help us understand your issue.
>> 
>> What did you do before the bug happened? (Steps to reproduce your issue)
>> 1) Set the following configs:
>>  git config submodule.recurse true
>>  git config fetch.recurseSubmodules false
>> 2) On a repo with submodules, run:
>> git pull
>> 
>> What did you expect to happen? (Expected behavior)
>> git pull doesn't recursively fetch submodules
>> 
>> What happened instead? (Actual behavior)
>> Submodules are fetched recursively
>> 
>> What's different between what you expected and what actually happened?
>> Submodules are fetched recursively
>> 
>> Anything else you want to add:
>> git fetch works as intended. The documentation for fetch.recurseSubmodules
>> states that "This option controls whether git fetch (and the underlying
>> fetch in git pull)" so I would naturally expect git pull to behave the same
>> as git fetch
>
> I did not try to reproduce, but I took a look at the code and I think I understand
> what happens. 
>
> When 'git pull' invokes 'git fetch', it does so by specifically using the '--recurse-submodules'
> flag, see [1]. It sends either 'yes', 'no' or 'on-demand' as value, depending on the value
> of the local variable 'recurse_submodules'. This variable is initialized to the config value
> of 'submodule.recurse' in 'git_pull_config' [2], called at [3], and then overwritten by the value given
> explicitely on the command line [4], parsed at [5].
>
> So when 'git fetch' runs when called by 'git pull', it always receive the 
> '--recurse-submodules' flag, and thus any config for fetch.recurseSubmodules is ignored
> (since explicit command line flags always have precedence over config values).

Thanks for looking into this! This seems to agree with my reading of the
code. I haven't tried to reproduce it either, but the code looks
obviously incorrect.

> So one way to fix this would be to also parse 'fetch.recurseSubmodules' in 'git_pull_config',
> and send the correct value to the 'git fetch' invocation... Or simpler, call 'git fetch' with
> '--recurse-submodules-default' [9] instead...

Despite having touched this code fairly recently, I had to do quite a
rereading to refresh myself on how this works. If I _am_ reading this
correctly, then I think we actually want to set `--recurse-submodules`
and not `--recurse-submodules-default`.

The short story is that the two are not equivalent - when figuring out
_which_ submodules to fetch (we determine on a submodule-by-submodule
basis; we don't just say "should we fetch all submodules?"),
`--recurse-submodules-default` gets overrided by config values, but
`--recurse-submodules` does not.

The longer story (which I think is quite difficult to explain, I am also
a little confused) is that in a recursive fetch,
`--recurse-submodules-default` is the value of the parent's (we'll call
it P) `--recurse-submodules`. This only matters when a process, C1, has
to pass a value for `--recurse-submodules` to its child, C2. The
precedence order is:

- C1's --recurse-submodules | fetch.recurseSubmodules |
  submodule.recurse
- C2's submodule entry in C1's .git/config
- C2's entry in C1's .gitmodules
- C1's --recurse-submodules-default (aka P's --recurse-submodules)

Specifically, in code:

  static int get_fetch_recurse_config(const struct submodule *submodule,
              struct submodule_parallel_fetch *spf)
  {
    // Glen: passed in from builtin/fetch, which parses
    //  --recurse-submodules, fetch.recurseSubmodules, submodule.recurse
    if (spf->command_line_option != RECURSE_SUBMODULES_DEFAULT)
      return spf->command_line_option;

    if (submodule) {
      // ...
      // Glen: fetch configuration from .gitmodules
      int fetch_recurse = submodule->fetch_recurse;

      key = xstrfmt("submodule.%s.fetchRecurseSubmodules", submodule->name);
      if (!repo_config_get_string_tmp(spf->r, key, &value)) {
        // Glen: fetch configuration from .git/config
        fetch_recurse = parse_fetch_recurse_submodules_arg(key, value);
      }
      // ...
    }

    // Glen: --recurse-submodules-default
    return spf->default_option;
  }

So `--recurse-submodules-default` really wasn't meant for anything other
than "fetch" invoking itself in a superproject-submodule setting.

Of course, I could be entirely wrong and I should just write up a test
case :). I hope to send one soon.

> [sidenote]
> I'm thought for a while that it was maybe not a good idea to change the behaviour
> in your specific situation. If you have 'submodule.recurse'
> set to true and 'fetch.recurseSubmodules' set to false, and if the code is changed so that indeed
> 'git pull' does not fetch recursively, then the code will still try to update the submodule working
> trees after the end of the operation (merge or rebase), see the end of 'cmd_pull' [6], [7]. This  is
> OK, because if there are new submodule commits referenced by the superproject and they were not fetched because the 
> fetch was not recursive, then the call to 'git submodule update' in update_submodules/rebase_submodule
> should fetch them, so no problem there.
> [/sidenote]

I think the bigger question to ask is "what is the intended effect of
'submodule.recurse = true' and 'fetch.recurseSubmodules = false'?". I
think it is not surprising to think that recursive worktree operations
might fail if the submodule commits weren't fetched.

Perhaps this is just a performance optimization? i.e. after fetching in
the superproject, the user wants to skip the rev walk that discovers new
submodule commits.

  reply	other threads:[~2022-05-06 21:50 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-02 14:42 Bug Report: fetch.recurseSubmodules doesn't affect git pull as otherwise stated in the git config docs Huang Zou
2022-05-05 23:57 ` Philippe Blain
2022-05-06 21:50   ` Glen Choo [this message]
2022-05-06 22:24     ` Philippe Blain
2022-05-09 16:24       ` Huang Zou
2022-05-09 23:32 ` Glen Choo
2022-05-10  0:57   ` Philippe Blain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=kl6lbkwa8h5n.fsf@chooglen-macbookpro.roam.corp.google.com \
    --to=chooglen@google.com \
    --cc=git@vger.kernel.org \
    --cc=huang.zou@schrodinger.com \
    --cc=levraiphilippeblain@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).