All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: "Antonio Ospite" <ao2@ao2.it>,
	git@vger.kernel.org, "Daniel Graña" <dangra@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Richard Hartmann" <richih.mailinglist@gmail.com>,
	"Stefan Beller" <sbeller@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
Date: Wed, 19 Sep 2018 12:24:30 -0700	[thread overview]
Message-ID: <xmqqfty547f5.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <20180918171257.GC27036@localhost> ("SZEDER =?utf-8?Q?G=C3=A1?= =?utf-8?Q?bor=22's?= message of "Tue, 18 Sep 2018 19:12:57 +0200")

SZEDER Gábor <szeder.dev@gmail.com> writes:

> it appears that this patch (and its previous versions as well) is
> responsible for triggering occasional test failures in
> 't7814-grep-recurse-submodules.sh', more frequently, about once in
> every ten runs, on macOS on Travis CI, less frequently, about once in
> a couple of hundred runs on Linux on my machine.

I see that among Cc'ed are people who are more familiar with the
submodule code and where it wants to go.  Thanks for a report and
analysis.

> The reason for the failure is memory corruption manifesting in various
> ways: segfault, malloc() or use after free() errors from libc, corrupt
> loose object, invalid ref, bogus output, etc.
> 
> Applying the following patch makes t7814 fail almost every time,
> though sometimes that loop has to iterate over 1000 times until that
> 'git grep' finally fails...  so good luck with debugging ;)
>
> diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> index 7184113b9b..93ae2e8e7c 100755
> --- a/t/t7814-grep-recurse-submodules.sh
> +++ b/t/t7814-grep-recurse-submodules.sh
> @@ -337,6 +337,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
>  	test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&
>  
>  	# Basic
> +	for i in $(seq 0 2000)
> +	do
> +		git grep --recurse-submodules 1 >/dev/null || return 1
> +	done &&
>  	git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
>  	cat >expect <<-\EOF &&
>  	a:(1|2)d(3|4)
>
> On first look I didn't notice anything that is obviously wrong in this
> patch and could be responsible for the memory corruption, but there is
> one thing I found strange, though:
>
>
> On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
>> When the .gitmodules file is not available in the working tree, try
>> using the content from the index and from the current branch.
>
> "from the index and from the current branch" of which repository?
>
>> This
>> covers the case when the file is part of the repository but for some
>> reason it is not checked out, for example because of a sparse checkout.
>> 
>> This makes it possible to use at least the 'git submodule' commands
>> which *read* the gitmodules configuration file without fully populating
>> the working tree.
>> 
>> Writing to .gitmodules will still require that the file is checked out,
>> so check for that before calling config_set_in_gitmodules_file_gently.
>> 
>> Add a similar check also in git-submodule.sh::cmd_add() to anticipate
>> the eventual failure of the "git submodule add" command when .gitmodules
>> is not safely writeable; this prevents the command from leaving the
>> repository in a spurious state (e.g. the submodule repository was cloned
>> but .gitmodules was not updated because
>> config_set_in_gitmodules_file_gently failed).
>> 
>> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
>> from .gitmodules succeeds and that writing to it fails when the file is
>> not checked out.
>> 
>> Signed-off-by: Antonio Ospite <ao2@ao2.it>
>> ---
>
>> diff --git a/submodule-config.c b/submodule-config.c
>> index 61a555e920..bdb1d0e2c9 100644
>> --- a/submodule-config.c
>> +++ b/submodule-config.c
>
>> @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
>>  static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
>>  {
>>  	if (repo->worktree) {
>> -		char *file = repo_worktree_path(repo, GITMODULES_FILE);
>> -		git_config_from_file(fn, file, data);
>> +		struct git_config_source config_source = { 0 };
>> +		const struct config_options opts = { 0 };
>> +		struct object_id oid;
>> +		char *file;
>> +
>> +		file = repo_worktree_path(repo, GITMODULES_FILE);
>> +		if (file_exists(file))
>> +			config_source.file = file;
>> +		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
>> +			config_source.blob = GITMODULES_INDEX;
>
> The repository used in t7814 contains nested submodules, which means
> that config_from_gitmodules() is invoked three times.
>
> Now, the first two of those calls look at the superproject and at
> 'submodule', and find the existing files '.../trash
> directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
> directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
> respectively.  So far so good.
>
> The third call, however, looks at the nested submodule at
> 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> function goes on with the second condition and calls
> get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
> in the _superproject's_ index.
>
> I'm no expert on submodules, but my gut feeling says that this can't
> be right.  But if it _is_ right, then I would say that the commit
> message should explain in detail, why it is right.
>
> Anyway, even if it is indeed wrong, I'm not sure whether this is the
> root cause of the memory corruption.
>
>
>> +		else if (get_oid(GITMODULES_HEAD, &oid) >= 0)
>> +			config_source.blob = GITMODULES_HEAD;
>> +
>> +		config_with_options(fn, data, &config_source, &opts);
>> +
>>  		free(file);
>>  	}
>>  }

  reply	other threads:[~2018-09-19 19:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
2018-09-24 10:25   ` Antonio Ospite
2018-09-24 23:06     ` Stefan Beller
2018-09-17 14:09 ` [PATCH v5 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 3/9] t7411: merge tests 5 and 6 Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 4/9] t7411: be nicer to future tests and really clean things up Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 5/9] submodule--helper: add a new 'config' subcommand Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 6/9] submodule: use the 'submodule--helper config' command Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 7/9] t7506: clean up .gitmodules properly before setting up new scenario Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 8/9] submodule: add a helper to check if it is safe to write to .gitmodules Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree Antonio Ospite
2018-09-18 17:12   ` SZEDER Gábor
2018-09-19 19:24     ` Junio C Hamano [this message]
2018-09-20 15:35     ` Antonio Ospite
2018-09-21 16:19       ` Junio C Hamano
2018-09-27 14:49         ` Antonio Ospite
2018-09-24 10:20     ` Antonio Ospite
2018-09-24 21:00       ` Stefan Beller
2018-09-27 14:44         ` Antonio Ospite
2018-09-27 18:00           ` Stefan Beller
2018-10-01 15:45             ` Antonio Ospite
2018-10-01 19:42               ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqfty547f5.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=ao2@ao2.it \
    --cc=avarab@gmail.com \
    --cc=dangra@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=richih.mailinglist@gmail.com \
    --cc=sbeller@google.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.