All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: Uma Srinivasan <usrinivasan@twitter.com>,
	Jacob Keller <jacob.keller@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jens Lehmann <Jens.Lehmann@web.de>,
	Heiko Voigt <hvoigt@hvoigt.net>
Subject: Re: git submodules implementation question
Date: Thu, 01 Sep 2016 13:21:02 -0700	[thread overview]
Message-ID: <xmqq4m5zwevl.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <xmqqa8frwhpr.fsf@gitster.mtv.corp.google.com> (Junio C. Hamano's message of "Thu, 01 Sep 2016 12:19:44 -0700")

Junio C Hamano <gitster@pobox.com> writes:

> Stefan Beller <sbeller@google.com> writes:
>
>>> The final version needs to be accompanied with tests to show the
>>> effect of this change for callers.  A test would set up a top-level
>>> and submodule, deliberately break submodule/.git/ repository and
>>> show what breaks and how without this change.
>>
>> Tests are really good at providing this context as well, or to communicate
>> the actual underlying problem, which is not quite clear to me.
>> That is why I refrained from jumping into the discussion as I think the
>> first few emails were dropped from the mailing list and I am missing context.
>
> I do not know where you started reading, but the gist of it is that
> submodule.c spawns subprocess to run in the submodule's context by
> assuming that chdir'ing into the <path> of the submodule and running
> it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
> sufficient.  When <path>/.git (either it is a directory itself or it
> points at a directory in .git/module/<name> in the superproject) is
> a corrupt repository, running "git -C <path> command" would try to
> auto-detect the repository, because it thinks <path>/.git is not a
> repository and it thinks it is not at the top-level of the working
> tree, and instead finds the repository of the top-level, which is
> almost never what we want.

This is with a test that covers the call in get_next_submodule() for
the parallel fetch callback.  I think many of the codepaths will end
up recursing forever the same way without the fix in a submodule
repository that is broken in a similar way, but I didn't check, so
I do not consider this to be completed.

-- >8 --
Subject: submodule: avoid auto-discovery in prepare_submodule_repo_env()

The function is used to set up the environment variable used in a
subprocess we spawn in a submodule directory.  The callers set up a
child_process structure, find the working tree path of one submodule 
and set .dir field to it, and then use start_command() API to spawn
the subprocess like "status", "fetch", etc.

When this happens, we expect that the ".git" (either a directory or
a gitfile that points at the real location) in the current working
directory of the subprocess MUST be the repository for the submodule.

If this ".git" thing is a corrupt repository, however, because
prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
subprocess will see ".git", thinks it is not a repository, and
attempt to find one by going up, likely to end up in finding the
repository of the superproject.  In some codepaths, this will cause
a command run with the "--recurse-submodules" option to recurse
forever.

By exporting GIT_DIR=.git, disable the auto-discovery logic in the
subprocess, which would instead stop it and report an error.

Not-signed-off-yet.
---
 submodule.c                 |  1 +
 t/t5526-fetch-submodules.sh | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/submodule.c b/submodule.c
index 1b5cdfb..e8258f0 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
+	argv_array_push(out, "GIT_DIR=.git");
 }
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 954d0e4..b2dee30 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,4 +485,33 @@ test_expect_success 'fetching submodules respects parallel settings' '
 	)
 '
 
+test_expect_success 'fetching submodule into a broken repository' '
+	# Prepare src and src/sub nested in it
+	git init src &&
+	(
+		cd src &&
+		git init sub &&
+		git -C sub commit --allow-empty -m "initial in sub" &&
+		git submodule add -- ./sub sub &&
+		git commit -m "initial in top"
+	) &&
+
+	# Clone the old-fashoned way
+	git clone src dst &&
+	git -C dst clone ../src/sub sub &&
+
+	# Make sure that old-fashoned layout is still supported
+	git -C dst status &&
+
+	# Recursive-fetch works fine
+	git -C dst fetch --recurse-submodules &&
+
+	# Break the receiving submodule
+	rm -f dst/sub/.git/HEAD &&
+
+	# Recursive-fetch must terminate
+	# NOTE: without fix this will recurse forever!
+	test_must_fail git -C dst fetch --recurse-submodules
+'
+
 test_done



  parent reply	other threads:[~2016-09-01 21:00 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-28 23:24 git submodules implementation question Uma Srinivasan
2016-08-29 20:03 ` Junio C Hamano
2016-08-29 21:03   ` Uma Srinivasan
2016-08-29 21:09     ` Junio C Hamano
2016-08-29 21:13       ` Uma Srinivasan
2016-08-29 23:04         ` Uma Srinivasan
2016-08-29 23:15           ` Junio C Hamano
2016-08-29 23:34             ` Uma Srinivasan
2016-08-30  0:02             ` Jacob Keller
2016-08-30  0:12               ` Uma Srinivasan
2016-08-30  6:09                 ` Jacob Keller
2016-08-30  6:23                   ` Jacob Keller
2016-08-30 17:40                     ` Uma Srinivasan
2016-08-30 17:53                       ` Junio C Hamano
2016-08-31  2:54                         ` Uma Srinivasan
2016-08-31 16:42                           ` Junio C Hamano
2016-08-31 18:40                             ` Uma Srinivasan
2016-08-31 18:44                               ` Junio C Hamano
2016-08-31 18:58                                 ` Uma Srinivasan
2016-09-01  1:04                                   ` Uma Srinivasan
2016-09-01  4:09                             ` Junio C Hamano
2016-09-01 16:05                               ` Uma Srinivasan
2016-09-01 18:32                                 ` Junio C Hamano
2016-09-01 18:37                                   ` Stefan Beller
2016-09-01 19:19                                     ` Junio C Hamano
2016-09-01 19:56                                       ` Uma Srinivasan
2016-09-01 20:29                                         ` Junio C Hamano
2016-09-01 20:21                                       ` Junio C Hamano [this message]
2016-09-01 21:02                                         ` Junio C Hamano
2016-09-01 21:04                                         ` Stefan Beller
2016-09-01 21:12                                           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq4m5zwevl.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=Jens.Lehmann@web.de \
    --cc=git@vger.kernel.org \
    --cc=hvoigt@hvoigt.net \
    --cc=jacob.keller@gmail.com \
    --cc=sbeller@google.com \
    --cc=usrinivasan@twitter.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.