git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] Test case for checkout of added/deleted submodules in clones
@ 2020-09-21  8:15 Luke Diamand
  2020-09-21  8:15 ` [PATCH 1/1] " Luke Diamand
  0 siblings, 1 reply; 4+ messages in thread
From: Luke Diamand @ 2020-09-21  8:15 UTC (permalink / raw)
  To: git; +Cc: kartic Kaartic Sivaraam, Luke Diamand

I hit some quite confusing (to me anyway) corner cases in `git
submodule` handling of checking out revisions where submodules have been
added or removed.

If you clone a repo with an "old" submodule which is missing from HEAD,
and then try to checkout an older revision, you can sometimes get:

    fatal: not a git repository: ../.git/modules/old
    fatal: could not reset submodule index

If you clone a repo with a "new" submodule, and then checkout an older
revision where it is deleted, you can sometimes find that new submodule
is left lying around.

Luke Diamand (1):
  Test case for checkout of added/deleted submodules in clones

 t/t5619-submodules-missing.sh | 104 ++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100755 t/t5619-submodules-missing.sh

-- 
2.28.0.762.g324f61785e


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/1] Test case for checkout of added/deleted submodules in clones
  2020-09-21  8:15 [PATCH 0/1] Test case for checkout of added/deleted submodules in clones Luke Diamand
@ 2020-09-21  8:15 ` Luke Diamand
  2020-09-22  9:35   ` Kaartic Sivaraam
  2020-09-22 14:04   ` Philippe Blain
  0 siblings, 2 replies; 4+ messages in thread
From: Luke Diamand @ 2020-09-21  8:15 UTC (permalink / raw)
  To: git; +Cc: kartic Kaartic Sivaraam, Luke Diamand

Test case for various cases of deleted submodules:

1. Clone a repo where an `old` submodule has been removed in HEAD.
Try to checkout a revision from before `old` was deleted.

This can fail in a rather ugly way depending on how the `git checkout`
and `git submodule` commands are used.

2. Clone a repo where a `new` submodule has been added. Try to
checkout a revision from before `new` was added.

This can leave `new` lying around in some circumstances, and not in
others, in a way which is confusing (at least to me).

Signed-off-by: Luke Diamand <luke@diamand.org>
---
 t/t5619-submodules-missing.sh | 104 ++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100755 t/t5619-submodules-missing.sh

diff --git a/t/t5619-submodules-missing.sh b/t/t5619-submodules-missing.sh
new file mode 100755
index 0000000000..d7878c52fc
--- /dev/null
+++ b/t/t5619-submodules-missing.sh
@@ -0,0 +1,104 @@
+#!/bin/sh
+
+test_description='Clone a repo containing submodules. Sync to a revision where the submodule is missing or added'
+
+. ./test-lib.sh
+
+pwd=$(pwd)
+
+# Setup a super project with a submodule called `old`, which gets deleted, and
+# a submodule `new` which is added later on.
+
+test_expect_success 'setup' '
+	mkdir super old new &&
+	git -C old init &&
+	test_commit -C old commit_old &&
+	(cd super &&
+		git init . &&
+		git submodule add ../old old &&
+		git commit -m "adding submodule old" &&
+		test_commit commit2 &&
+		git tag OLD &&
+		test_path_is_file old/commit_old.t &&
+		git rm old &&
+		git commit -m "Remove old submodule" &&
+		test_commit commit3
+	) &&
+	git -C new init &&
+	test_commit -C new commit_new &&
+	(cd super &&
+		git tag BEFORE_NEW &&
+		git submodule add ../new new &&
+		git commit -m "adding submodule new" &&
+		test_commit commit4
+	)
+'
+
+# Checkout the OLD tag inside the original repo. This works fine since all of
+# the submodules are present in .git/modules.
+test_expect_success 'checkout old inside original repo' '
+	(cd super &&
+		git config advice.detachedHead false &&
+		git tag LATEST &&
+		git checkout --recurse-submodules OLD &&
+		git submodule update --checkout --remote --force &&
+		test_path_is_file old/commit_old.t &&
+		test_path_is_missing new/commit_new.t &&
+		git checkout --recurse-submodules LATEST &&
+		test_path_is_file new/commit_new.t
+	)
+'
+
+# Clone the repo, and then checkout the OLD tag inside the clone.
+# The `old` submodule does not get updated. Instead we get:
+#
+# fatal: not a git repository: ../.git/modules/old
+# fatal: could not reset submodule index
+#
+# That's because `old` is missing from .git/modules since it
+# was not cloned originally and `checkout` does not know how to
+# fetch the remote submodules, whereas `submodule update --remote` does.
+
+test_expect_failure 'checkout old with --recurse-submodules' '
+	test_when_finished "rm -fr super-clone" &&
+	git clone --recurse-submodules super super-clone &&
+	(cd super-clone &&
+		git config advice.detachedHead false &&
+		test_path_is_file commit3.t &&
+		test_path_is_file commit2.t &&
+		test_path_is_missing old &&
+		test_path_is_file new/commit_new.t &&
+		git checkout --recurse-submodules OLD &&
+		git submodule update --checkout --remote --force &&
+		test_path_is_file commit2.t &&
+		test_path_is_missing commit3.t &&
+		test_path_is_dir old &&
+		test_path_is_file old/commit_old.t
+	)
+'
+
+# As above, but this time, instead of using "checkout --recurse-submodules" we just
+# use "checkout" to avoid the missing submodule error.
+#
+# The checkout of `old` now works fine, but instead `new` is left lying
+# around with seemingly no way to clean it up. Even a later invocation of
+# `git checkout --recurse-submodules` does not get rid of it.
+
+test_expect_failure 'checkout old without --recurse-submodules' '
+	test_when_finished "rm -fr super-clone" &&
+	git clone --recurse-submodules super super-clone &&
+	(cd super-clone &&
+		git config advice.detachedHead false &&
+		test_path_is_file new/commit_new.t &&
+		git checkout OLD &&
+		git submodule update --checkout --remote --force &&
+		git checkout --recurse-submodules OLD &&
+		test_path_is_file commit2.t &&
+		test_path_is_missing commit3.t &&
+		test_path_is_dir old &&
+		test_path_is_file old/commit_old.t &&
+		test_path_is_missing new/commit_new.t
+	)
+'
+
+test_done
-- 
2.28.0.762.g324f61785e


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] Test case for checkout of added/deleted submodules in clones
  2020-09-21  8:15 ` [PATCH 1/1] " Luke Diamand
@ 2020-09-22  9:35   ` Kaartic Sivaraam
  2020-09-22 14:04   ` Philippe Blain
  1 sibling, 0 replies; 4+ messages in thread
From: Kaartic Sivaraam @ 2020-09-22  9:35 UTC (permalink / raw)
  To: Luke Diamand; +Cc: Philippe Blain, git

On 21-09-2020 13:45, Luke Diamand wrote:
> +
> +# Checkout the OLD tag inside the original repo. This works fine since all of
> +# the submodules are present in .git/modules.
> +test_expect_success 'checkout old inside original repo' '
> +	(cd super &&
> +		git config advice.detachedHead false &&
> +		git tag LATEST &&
> +		git checkout --recurse-submodules OLD &&
> +		git submodule update --checkout --remote --force &&
> +		test_path_is_file old/commit_old.t &&
> +		test_path_is_missing new/commit_new.t &&
> +		git checkout --recurse-submodules LATEST &&
> +		test_path_is_file new/commit_new.t
> +	)
> +'
> +

Philippe pointed out the reason behind this behavriour [1] for this in
another thread.

> +# Clone the repo, and then checkout the OLD tag inside the clone.
> +# The `old` submodule does not get updated. Instead we get:
> +#
> +# fatal: not a git repository: ../.git/modules/old
> +# fatal: could not reset submodule index
> +#
> +# That's because `old` is missing from .git/modules since it
> +# was not cloned originally and `checkout` does not know how to
> +# fetch the remote submodules, whereas `submodule update --remote` does.
> +
> +test_expect_failure 'checkout old with --recurse-submodules' '
> +	test_when_finished "rm -fr super-clone" &&
> +	git clone --recurse-submodules super super-clone &&
> +	(cd super-clone &&
> +		git config advice.detachedHead false &&
> +		test_path_is_file commit3.t &&
> +		test_path_is_file commit2.t &&
> +		test_path_is_missing old &&
> +		test_path_is_file new/commit_new.t &&
> +		git checkout --recurse-submodules OLD &&
> +		git submodule update --checkout --remote --force &&
> +		test_path_is_file commit2.t &&
> +		test_path_is_missing commit3.t &&
> +		test_path_is_dir old &&
> +		test_path_is_file old/commit_old.t
> +	)
> +'
> +
> +# As above, but this time, instead of using "checkout --recurse-submodules" we just
> +# use "checkout" to avoid the missing submodule error.
> +#
> +# The checkout of `old` now works fine, but instead `new` is left lying
> +# around with seemingly no way to clean it up. Even a later invocation of
> +# `git checkout --recurse-submodules` does not get rid of it.
> +
> +test_expect_failure 'checkout old without --recurse-submodules' '
> +	test_when_finished "rm -fr super-clone" &&
> +	git clone --recurse-submodules super super-clone &&
> +	(cd super-clone &&
> +		git config advice.detachedHead false &&
> +		test_path_is_file new/commit_new.t &&
> +		git checkout OLD &&
> +		git submodule update --checkout --remote --force &&
> +		git checkout --recurse-submodules OLD &&
> +		test_path_is_file commit2.t &&
> +		test_path_is_missing commit3.t &&
> +		test_path_is_dir old &&
> +		test_path_is_file old/commit_old.t &&
> +		test_path_is_missing new/commit_new.t
> +	)
> +'
> +
> +test_done
> 

I think this isn't the complete story. The submodule removal is done
properly if `git checkout --recurse-submodules` is called when the
revision is not checked out (modulo the checkout failing issue detailed
in the previous test case). For some reason, it doesn't work when the
revisions is already checked out. i.e., the following passes

  test_expect_success 'checkout old without --recurse-submodules' '
        test_when_finished "rm -fr super-clone" &&
        git clone --recurse-submodules super super-clone &&
        (cd super-clone &&
                git config advice.detachedHead false &&
                test_path_is_file new/commit_new.t &&
                git checkout OLD &&
                git submodule update --checkout --remote --force &&
                git checkout --recurse-submodules LATEST &&
                test_path_is_file commit2.t &&
                test_path_is_file commit3.t &&
                test_path_is_missing old &&
                test_path_is_missing old/commit_old.t &&
                test_path_is_file new/commit_new.t
        )
  '

Also, this is the exact case that seems to be explained in commit
bbad9f9314 (rm: better document side effects when removing a submodule,
2014-01-07) which adds the following BUGS part to the documentation.

    BUGS
    ----
    Each time a superproject update removes a populated submodule
    e.g. when switching between commits before and after the removal) a
    stale submodule checkout will remain in the old location. Removing
    the old directory is only safe when it uses a gitfile, as otherwise
    the history of the submodule will be deleted too. This step will be
    obsolete when recursive submodule update has been implemented.

As Phillipe points out in [2], I do wonder if this part has now become
stale.

[ References ]

[1]:
https://lore.kernel.org/git/20200501005432.h62dnpkx7feb7rto@glandium.org/T/#u

[2]:
https://public-inbox.org/git/0B191753-C1AD-499C-B8B2-122F49CF6F14@gmail.com/

-- 
Sivaraam

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] Test case for checkout of added/deleted submodules in clones
  2020-09-21  8:15 ` [PATCH 1/1] " Luke Diamand
  2020-09-22  9:35   ` Kaartic Sivaraam
@ 2020-09-22 14:04   ` Philippe Blain
  1 sibling, 0 replies; 4+ messages in thread
From: Philippe Blain @ 2020-09-22 14:04 UTC (permalink / raw)
  To: Luke Diamand; +Cc: Git mailing list, kartic Kaartic Sivaraam

Hi Luke,

> Le 21 sept. 2020 à 04:15, Luke Diamand <luke@diamand.org> a écrit :
> 
> Test case for various cases of deleted submodules:
> 
> 1. Clone a repo where an `old` submodule has been removed in HEAD.
> Try to checkout a revision from before `old` was deleted.
> 
> This can fail in a rather ugly way depending on how the `git checkout`
> and `git submodule` commands are used.

As I wrote in my previous email [1], this fails if '--recurse-submodules' 
was passed to 'git clone', which writes 'submodule.active=.' to the config.
So I would phrase it as:

This fails if the 'checkout' uses '--recurse-submodules' and 
a pathspec in the configuration variable 'submodule.active' matches
the name of the submodule, which is the case for the match-all value '.' written to 
this variable by 'git clone --recurse-submodules'.

> 2. Clone a repo where a `new` submodule has been added. Try to
> checkout a revision from before `new` was added.
> 
> This can leave `new` lying around in some circumstances, and not in
> others, in a way which is confusing (at least to me).
> 
> Signed-off-by: Luke Diamand <luke@diamand.org>
> ---
> t/t5619-submodules-missing.sh | 104 ++++++++++++++++++++++++++++++++++

I'm a little torn about where to put this test case. The heart of the matter
is that clone should not write '.' to 'submodule.active' if it does not clone all submodules,
even deleted ones, since 'checkout' cant (yet?) cope with an active submodule for which
it does not find the Git repository. So the "bug" is indeed in clone, thus t56** is a good place.
However, I would say that from a user experience perspective,
this test case should be in t2013-checkout-submodules, because if someone clones
with '--recurse-submodules', and then checks out the old revision with '--recurse-submodules',
then clearly their expectation is that the checkout should work.

In any case, if we leave it in t56** I think the name of the test file should maybe contain "clone", 
since that seems to be the case for almost all test files in this series.

> 1 file changed, 104 insertions(+)
> create mode 100755 t/t5619-submodules-missing.sh
> 
> diff --git a/t/t5619-submodules-missing.sh b/t/t5619-submodules-missing.sh
> new file mode 100755
> index 0000000000..d7878c52fc
> --- /dev/null
> +++ b/t/t5619-submodules-missing.sh
> @@ -0,0 +1,104 @@
> +#!/bin/sh
> +
> +test_description='Clone a repo containing submodules. Sync to a revision where the submodule is missing or added'

"Sync" is confusing here ('git submodule sync" exists and does something completely different). 
What this test is doing is a (recursive) *checkout* of a revision where the submodule exists,
starting from a revision where it's absent, with 'submodule.active' set to a pathspec that matches
that submodule.

> +
> +. ./test-lib.sh
> +
> +pwd=$(pwd)
> +
> +# Setup a super project with a submodule called `old`, which gets deleted, and
> +# a submodule `new` which is added later on.

As I showed in [1], we don't need the 'new' submodule to demonstrate the faulty
behaviour, so I would not include it in this test case, 
it's adding unnecessary noise in my opinion.

> +
> +test_expect_success 'setup' '
> +	mkdir super old new &&
> +	git -C old init &&
> +	test_commit -C old commit_old &&
> +	(cd super &&
> +		git init . &&
> +		git submodule add ../old old &&
> +		git commit -m "adding submodule old" &&
> +		test_commit commit2 &&
> +		git tag OLD &&
> +		test_path_is_file old/commit_old.t &&
> +		git rm old &&
> +		git commit -m "Remove old submodule" &&
> +		test_commit commit3
> +	) &&
> +	git -C new init &&
> +	test_commit -C new commit_new &&
> +	(cd super &&
> +		git tag BEFORE_NEW &&
> +		git submodule add ../new new &&
> +		git commit -m "adding submodule new" &&
> +		test_commit commit4
> +	)
> +'
> +
> +# Checkout the OLD tag inside the original repo. This works fine since all of
> +# the submodules are present in .git/modules.
> +test_expect_success 'checkout old inside original repo' '
> +	(cd super &&
> +		git config advice.detachedHead false &&
> +		git tag LATEST &&

minor point: this 'tag' command should be in the "setup" test above.

> +		git checkout --recurse-submodules OLD &&
> +		git submodule update --checkout --remote --force &&

This invocation of 'git submodule update' does nothing here (the 
submodule is already correctly checked out at the revision
registered in the superproject as a result of `git checkout --recurse-submodules OLD`)
so I would remove it.

> +
> +# Clone the repo, and then checkout the OLD tag inside the clone.
> +# The `old` submodule does not get updated

Minor point, but I would write "checked out" instead of "updated".

> . Instead we get:
> +#
> +# fatal: not a git repository: ../.git/modules/old
> +# fatal: could not reset submodule index
> +#
> +# That's because `old` is missing from .git/modules since it
> +# was not cloned originally

I would add a little bit of info here:

...since it was not clone originally, 'checkout' wants to recurse 
into it because 'submodule.active' was set to '.' by 'clone --recurse-submodules',
and 'checkout' does not know how to....

> and `checkout` does not know how to
> +# fetch the remote submodules,

I think "clone" would be more appropriate then 'fetch" here. 

> whereas `submodule update --remote` does.

I don't think you want to add '--remote' here.
What we want is to checkout the submodule at the revision
specified by the superproject at tag OLD, which is what a 
plain 'git submodule update' should do. Adding '--remote'
would fetch the *submodule's remote* and checkout 
the submodule at the commit at the tip of the HEAD branch of that
remote [2] (which in this specific case would be the same commit, 
since no commit were done in the 'old' repo after it was added 
as a submodule to 'super', but in general this does not have to 
be the case).

> +
> +test_expect_failure 'checkout old with --recurse-submodules' '
> +	test_when_finished "rm -fr super-clone" &&
> +	git clone --recurse-submodules super super-clone &&
> +	(cd super-clone &&
> +		git config advice.detachedHead false &&
> +		test_path_is_file commit3.t &&
> +		test_path_is_file commit2.t &&
> +		test_path_is_missing old &&
> +		test_path_is_file new/commit_new.t &&
> +		git checkout --recurse-submodules OLD &&

The test case will fail here as the exit code of 'checkout' is 128,
so any further command don't affect the outcome of the test.
I think it's a good reflex to try 'git submodule update' next at this point, 
and I agree that in an ideal world it should be able to help and clone
the missing submodule, but I think it should be tested in a new test,
 following this one.

> +		git submodule update --checkout --remote --force &&

As I wrote above '--remote' is not what we want here, and '--checkout'
is the default behaviour so unnecessary. Also, '--force' should not be necessary 
to clone the submodule here since there is no revision checked out in it yet (see [3], [4]).

As I detailed in [5], "git submodule update" can't clone the submodule
until the whole .git/modules/old directory, which is written by the failing
'checkout --recurse-submodules', is manually deleted, so this would be 
an additional 'test_expect_failure' case.

> +# As above, but this time, instead of using "checkout --recurse-submodules" we just
> +# use "checkout" to avoid the missing submodule error.
> +#
> +# The checkout of `old` now works fine, but instead `new` is left lying
> +# around with seemingly no way to clean it up.

`git clean -dff` cleans it up.

> Even a later invocation of
> +# `git checkout --recurse-submodules` does not get rid of it.
> +
> +test_expect_failure 'checkout old without --recurse-submodules' '
> +	test_when_finished "rm -fr super-clone" &&
> +	git clone --recurse-submodules super super-clone &&
> +	(cd super-clone &&
> +		git config advice.detachedHead false &&
> +		test_path_is_file new/commit_new.t &&
> +		git checkout OLD &&
> +		git submodule update --checkout --remote --force &&
> +		git checkout --recurse-submodules OLD &&
> +		test_path_is_file commit2.t &&
> +		test_path_is_missing commit3.t &&
> +		test_path_is_dir old &&
> +		test_path_is_file old/commit_old.t &&
> +		test_path_is_missing new/commit_new.t
> +	)

I would simply remove this test case, it does not add new information.
After 'git checkout OLD', since '--recurse-submodules was not used,
the "new" submodules appears as "untracked",
so the command to remove it is "git clean", and since it contains a '.git'
gitfile, two '-f' are needed [6]. Since it is untracked, it makes sense that no
further Git command should touch it.

Thanks for working on this!

Philippe.

[1] https://lore.kernel.org/git/0B191753-C1AD-499C-B8B2-122F49CF6F14@gmail.com/T/#m85fe0b90231033c96d3d75bac6e8ea9b2ae6d467
[2] https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt---remote
[3] https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt--f
[4] https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-update--init--remote-N--no-fetch--no-recommend-shallow-f--force--checkout--rebase--merge--referenceltrepositorygt--depthltdepthgt--recursive--jobsltngt--no-single-branch--ltpathgt82308203
[5] https://lore.kernel.org/git/20200501005432.h62dnpkx7feb7rto@glandium.org/T/#u
[6] https://git-scm.com/docs/git-clean#Documentation/git-clean.txt--f

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-09-22 14:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-21  8:15 [PATCH 0/1] Test case for checkout of added/deleted submodules in clones Luke Diamand
2020-09-21  8:15 ` [PATCH 1/1] " Luke Diamand
2020-09-22  9:35   ` Kaartic Sivaraam
2020-09-22 14:04   ` Philippe Blain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).