git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* git checkout --recurse-submodules doesn't like checking out an older commit after a submodule was removed
@ 2020-05-01  0:54 Mike Hommey
  2020-05-01 14:40 ` Philippe Blain
  0 siblings, 1 reply; 2+ messages in thread
From: Mike Hommey @ 2020-05-01  0:54 UTC (permalink / raw)
  To: git

Hi,

Doing the following fails:

$ git clone --recurse-submodules https://github.com/trailofbits/winchecksec/
$ cd winchecksec
$ git checkout --recurse-submodules 93ffe67dbfc757bf6f440d80b8acf88e652ed60a
fatal: not a git repository: ../.git/modules/pe-parse
fatal: could not reset submodule index

The main reason is that the pe-parse directory/submodule was removed in
the commit that follows 93ffe67dbfc757bf6f440d80b8acf88e652ed60a (which
is current master).

This also leaves the tree in a bad state, and it's hard to fix anything
from that state.

Mike

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: git checkout --recurse-submodules doesn't like checking out an older commit after a submodule was removed
  2020-05-01  0:54 git checkout --recurse-submodules doesn't like checking out an older commit after a submodule was removed Mike Hommey
@ 2020-05-01 14:40 ` Philippe Blain
  0 siblings, 0 replies; 2+ messages in thread
From: Philippe Blain @ 2020-05-01 14:40 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git, Jonathan Tan, Emily Shaffer

Hi Mike,

> Le 30 avr. 2020 à 20:54, Mike Hommey <mh@glandium.org> a écrit :
> 
> Hi,
> 
> Doing the following fails:
> 
> $ git clone --recurse-submodules https://github.com/trailofbits/winchecksec/
> $ cd winchecksec
> $ git checkout --recurse-submodules 93ffe67dbfc757bf6f440d80b8acf88e652ed60a
> fatal: not a git repository: ../.git/modules/pe-parse
> fatal: could not reset submodule index

Thanks for the report, I can reproduce that with Git 2.26.0.

> The main reason is that the pe-parse directory/submodule was removed in
> the commit that follows 93ffe67dbfc757bf6f440d80b8acf88e652ed60a (which
> is current master).

Yes, indeed. If you look at the code for `git clone`, you will see that the `--recurse-submodules`
flag causes `git submodule update --init` to be called in the "checkout" phase of the clone [1].
This means that only submodules that are present in the HEAD commit of the default branch (or 
the branch to be checked out if using the `--branch` flag of `git clone`) will be cloned.

This also means that they won't be cloned if using `git clone --no-checkout`, and similarly they won't be 
cloned if they are present in a different branch then the one being checked out by `git clone`.

However, the `submodule.active=.` config value will still be written to the local configuration [2].
This is why `git checkout --recurse-submodules` tries to checkout the submodule when checking
out commit 93ffe67db.

> This also leaves the tree in a bad state, and it's hard to fix anything
> from that state.

Indeed, the files are written to disk as in commit 93ffe67dbfc757bf6f440d80b8acf88e652ed60a, 
but HEAD is not updated, and so files appear modified and .gitmodules appears untracked.
For me,

    git reset --hard && git clean -df

returned me to the initial state (master checked out) without errors.

However, trying to get to the intended state (93ffe67dbfc757bf6f440d80b8acf88e652ed60a
checked out with its submodule) is indeed pretty hard. The first thing I tried 
(after the reset and clean above) was to checkout 93ffe67 without the `--recurse-submodules` flag. 

    git checkout 93ffe67

This succeeds. However, note that if  `git clean -f` is used instead of `git clean -df` above, 
then the pe-parse directory (with only its gitfile inside) is left untouched and the checkout fails with 

    fatal: not a git repository: pe-parse/../.git/modules/pe-parse

and HEAD stays at master. A subsequent `git status` fails the same way. However 
`git status --ignore-submodules` does not error (and shows that the files are *staged*
as in 93ffe67, which is very confusing!)

Anyway assuming we used `git clean -df`, then the checkout of 93ffe67 succeeds, but
the submodule is not checked out. Trying `git submodule update` (with or without `--init) fails:

    fatal: not a git repository: /private/var/folders/lr/r6n2057j0dzd4gdb614fp0740000gp/T/winchecksec/pe-parse/../.git/modules/pe-parse
    Failed to clone 'pe-parse'. Retry scheduled
    BUG: submodule considered for cloning, doesn't need cloning any more?
    fatal: could not get a repository handle for submodule 'pe-parse'

I noticed here that the submodule config file `.git/modules/pe-parse/config` was written by the 
first failed recursive checkout with this content:

    $ cat .git/modules/pe-parse/config
    [core]
        worktree = ../../../pe-parse

I next tried `git sub deinit --all -f` and retried `git submodule update --init`, but still got the
same error. This is in part because the deinit leaves and empty file `.git/modules/pe-parse/config`.
Removing this file and retrying gives the same error. Removing the directory  `.git/modules/pe-parse/` 
and retrying finally succeeds in cloning the submodule.

Afterwards, alternating `git checkout --r master` and `git checkout --r 93ffe67` both succeeds.

In my opinion there are several things that are going wrong here:

1. The most user-friendly approach, I think, would be that  `git clone --recurse-submodules` would
look at the history of all branches and clone any submodules it finds. This would require the most
code modification, and might not be desired in some cases, so maybe it should be opt-in 
(i.e. adding supplementary flag).

2. Maybe it would be safer to *not* write `submodule.active=.` to the config if no submodules 
are present in the HEAD commit of the branch checked out by `git clone`. This would alleviate 
the situation you got (the recursive checkout would succeed, the submodule would not be 
checked out, but `git submodule update --init` would clone it and check it out.)

3. I think `git checkout --recurse-submodules` should not leave the tree and the submodule config file
in a hard-to-recover-from state if it can't correctly checkout the submodules. It should abort completely 
and cleanly and say it can't find the submodule repository, and tell the user how to clone it.

4. I think `git submodule update --init` should succeed in cloning the submodule if the 
`.git/modules/pe-parse/` directory exists but is empty. I also think it should succeeds 
if the only file in it is an empty `config` file. Probably it should also succeed if
the config file contains the `core.worktree` config as above.

5. I think `git submodule deinit --all -f` should not let an empty config file in the submodule repository.

I'm CC-ing some people that I *think* are working on some aspects of submodules 
(Jonathan and Emily, forgive me if that's not the case!) in case they want to add to the analysis above.

Cheers,

Philippe.

[1] https://github.com/git/git/blob/d61d20c9b413225793f8a0b491bbbec61c184e26/builtin/clone.c#L821-L847
[2] https://github.com/git/git/blob/d61d20c9b413225793f8a0b491bbbec61c184e26/builtin/clone.c#L1087-L1092

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-05-01 14:40 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-01  0:54 git checkout --recurse-submodules doesn't like checking out an older commit after a submodule was removed Mike Hommey
2020-05-01 14:40 ` Philippe Blain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).