git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: Glen Choo <chooglen@google.com>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 7/9] fetch: fetch unpopulated, changed submodules
Date: Tue, 15 Feb 2022 14:02:29 -0800	[thread overview]
Message-ID: <20220215220229.1633486-1-jonathantanmy@google.com> (raw)
In-Reply-To: <20220215172318.73533-8-chooglen@google.com>

Patches 4-6 look good.

Glen Choo <chooglen@google.com> writes:
> Teach "git fetch" to fetch cloned, changed submodules regardless of
> whether they are populated (this is in addition to the current behavior
> of fetching populated submodules).

I think add a note that the current behavior is regardless of what is
being fetched. So, maybe something like:

  Teach "git fetch" to fetch cloned, changed submodules regardless of
  whether they are populated. This is in addition to the current behavior
  of fetching populated submodules (which happens regardless of what was
  fetched in the superproject, or even if nothing was fetched in the
  superproject).

> As I mentioned in the cover letter, I'm not entirely happy with the
> name repo_has_absorbed_submodules() - it's not a standardized term AFAIK
> and it's a little clunky.
> 
> "absorbed submodule" is just a stand-in for "submodule in .git/modules",
> so if we have a better term for "submodule in .git/modules", let's use
> that instead.

I think that this is OK if the doc comment is updated. I'll make the
suggestion in the appropriate place below.

> @@ -1248,14 +1261,28 @@ void check_for_new_submodule_commits(struct object_id *oid)
>  	oid_array_append(&ref_tips_after_fetch, oid);
>  }
>  
> +/*
> + * Returns 1 if the repo has absorbed submodule gitdirs, and 0
> + * otherwise. Like submodule_name_to_gitdir(), this checks
> + * $GIT_DIR/modules, not $GIT_COMMON_DIR.
> + */
> +static int repo_has_absorbed_submodules(struct repository *r)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	strbuf_repo_git_path(&buf, r, "modules/");
> +	return file_exists(buf.buf) && !is_empty_dir(buf.buf);
> +}

I think that if you replace the doc comment with something like:

  Returns 1 if there is at least one submodule gitdir in
  $GIT_DIR/modules, and 0 otherwise. (End users can move submodule
  gitdirs into $GIT_DIR/modules by running "git submodule
  absorbgitdirs".) Like submodule_name_to_gitdir()...

then it would be fine.

> @@ -1338,6 +1366,7 @@ struct submodule_parallel_fetch {
>  	int result;
>  
>  	struct string_list changed_submodule_names;
> +	struct string_list seen_submodule_names;

Could we unify the 2 lists instead of having 2 separate ones?

> @@ -1529,12 +1569,77 @@ get_fetch_task(struct submodule_parallel_fetch *spf,
>  	return NULL;
>  }
>  
> +static struct fetch_task *
> +get_fetch_task_from_changed(struct submodule_parallel_fetch *spf,
> +			    const char **default_argv, struct strbuf *err)
> +{
> +	for (; spf->changed_count < spf->changed_submodule_names.nr;
> +	     spf->changed_count++) {
> +		struct string_list_item item =
> +			spf->changed_submodule_names.items[spf->changed_count];
> +		struct changed_submodule_data *cs_data = item.util;
> +		struct fetch_task *task;
> +
> +		/*
> +		 * We might have already considered this submodule
> +		 * because we saw it in the index.
> +		 */
> +		if (string_list_lookup(&spf->seen_submodule_names, item.string))
> +			continue;
> +
> +		task = fetch_task_create(spf->r, cs_data->path,
> +					 cs_data->super_oid);
> +		if (!task)
> +			continue;
> +
> +		switch (get_fetch_recurse_config(task->sub, spf)) {
> +		default:
> +		case RECURSE_SUBMODULES_DEFAULT:
> +		case RECURSE_SUBMODULES_ON_DEMAND:
> +			*default_argv = "on-demand";
> +			break;
> +		case RECURSE_SUBMODULES_ON:
> +			*default_argv = "yes";
> +			break;
> +		case RECURSE_SUBMODULES_OFF:
> +			continue;
> +		}
> +
> +		task->repo = get_submodule_repo_for(spf->r, task->sub->path,
> +						    cs_data->super_oid);
> +		if (!task->repo) {
> +			fetch_task_release(task);
> +			free(task);
> +
> +			strbuf_addf(err, _("Could not access submodule '%s'\n"),
> +				    cs_data->path);
> +			continue;
> +		}
> +		if (!is_tree_submodule_active(spf->r, cs_data->super_oid,
> +					      task->sub->path))
> +			continue;
> +
> +		if (!spf->quiet)
> +			strbuf_addf(err,
> +				    _("Fetching submodule %s%s at commit %s\n"),
> +				    spf->prefix, task->sub->path,
> +				    find_unique_abbrev(cs_data->super_oid,
> +						       DEFAULT_ABBREV));
> +		spf->changed_count++;
> +		return task;
> +	}
> +	return NULL;
> +}

This is very similar to get_fetch_task_from_index(). Both:
 1. loop over something
 2. exit early if the submodule name is seen
 3. create the fetch task
 4. set the "recurse config"
 5. get the submodule repo
 6. if success, increment a counter
 7. if failure, check for some conditions and maybe append to err

Could a function be refactored that does 2-5?

> +test_expect_success "'--recurse-submodules=on-demand' should fetch submodule commits if the submodule is changed but the index has no submodules" '
> +	git -C downstream fetch --recurse-submodules &&

First of all, thanks for updating the test - it is much easier to
understand now.

About this line, we shouldn't use the code being tested to set up (we're
testing "fetch --recurse-submodules", so we shouldn't use the same
command to set up). Also, if we don't have confidence in the starting
state, it may be a sign to write it out more explicitly instead of
relying on a complicated command to do the right thing.

However, in this case, I don't think we need this. All we need is to see
that the test contains a new superproject commit that points to a new
submodule commit (and recursively). So we don't need this line.

> +	# Create new superproject commit with updated submodules
> +	add_upstream_commit &&
> +	(
> +		cd submodule &&
> +		(
> +			cd subdir/deepsubmodule &&
> +			git fetch &&
> +			git checkout -q FETCH_HEAD
> +		) &&
> +		git add subdir/deepsubmodule &&
> +		git commit -m "new deep submodule"
> +	) &&
> +	git add submodule &&
> +	git commit -m "new submodule" &&

I thought add_upstream_commit() would do this, but apparently it just
adds commits to the submodules (which works for the earlier tests that
just tested that the submodules were fetched, but not for this one). I
think that all this should go into its own function.

Also, I understand that "git fetch" is there to pick up the commit we
made in add_upstream_commit, but this indirection is unnecessary in a
test, I think. Can we not use add_upstream_commit and just create our
own in subdir/deepsubmodule? (This might conflict with subsequent tests
that use the old scheme, but I think that it should be fine.)

> +	# Fetch the new superproject commit
> +	(
> +		cd downstream &&
> +		git switch --recurse-submodules no-submodules &&
> +		git fetch --recurse-submodules=on-demand >../actual.out 2>../actual.err
> +	) &&
> +	test_must_be_empty actual.out &&
> +	git rev-parse --short HEAD >superhead &&
> +	git -C submodule rev-parse --short HEAD >subhead &&
> +	git -C deepsubmodule rev-parse --short HEAD >deephead &&

These >superhead lines would not be necessary if we had our own
function.

> +# In downstream, init "submodule2", but do not check it out while
> +# fetching. This lets us assert that unpopulated submodules can be
> +# fetched.

Firstly, "In upstream", I think? You want to fetch from it, so it has to
be upstream.

Secondly, is this test needed? I thought the case in which the worktree
has no submodules would be sufficient to test that unpopulated
submodules can be fetched.

> +test_expect_success "'--recurse-submodules' should fetch submodule commits in changed submodules and the index" '

[snip]

> +	git checkout super &&
> +	(
> +		cd downstream &&
> +		git fetch --recurse-submodules >../actual.out 2>../actual.err
> +	) &&
> +	test_must_be_empty actual.out &&
> +
> +	# Assert that the submodules in the super branch are fetched
> +	git rev-parse --short HEAD >superhead &&
> +	git -C submodule rev-parse --short HEAD >subhead &&
> +	git -C deepsubmodule rev-parse --short HEAD >deephead &&
> +	verify_fetch_result actual.err &&
> +	# grep for the exact line to check that the submodule is read
> +	# from the index, not from a commit
> +	grep "^Fetching submodule submodule\$" actual.err &&

Instead of a grep, I think this should be done by precisely specifying
what to fetch in the "git fetch" invocation, and then checking that the
submodule has commits that it didn't have before.

In addition, I think the following cases also need to be tested:
 - two fetched commits have submodules of the same name but different
   URL
 - a fetched commit and a commit in the index have submodules of the
   same name but different URL

  reply	other threads:[~2022-02-15 22:02 UTC|newest]

Thread overview: 149+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-10  4:41 [PATCH 0/8] fetch --recurse-submodules: fetch unpopulated submodules Glen Choo
2022-02-10  4:41 ` [PATCH 1/8] submodule: inline submodule_commits() into caller Glen Choo
2022-02-10  4:41 ` [PATCH 2/8] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-02-10 19:00   ` Jonathan Tan
2022-02-10 22:05     ` Junio C Hamano
2022-02-10  4:41 ` [PATCH 3/8] submodule: make static functions read submodules from commits Glen Choo
2022-02-10 19:15   ` Jonathan Tan
2022-02-11 10:07     ` Glen Choo
2022-02-11 10:09     ` Glen Choo
2022-02-10  4:41 ` [PATCH 4/8] t5526: introduce test helper to assert on fetches Glen Choo
2022-02-10  4:41 ` [PATCH 5/8] t5526: use grep " Glen Choo
2022-02-10 19:17   ` Jonathan Tan
2022-02-10  4:41 ` [PATCH 6/8] submodule: extract get_fetch_task() Glen Choo
2022-02-10 19:33   ` Jonathan Tan
2022-02-10  4:41 ` [PATCH 7/8] fetch: fetch unpopulated, changed submodules Glen Choo
2022-02-10 22:49   ` Junio C Hamano
2022-02-11  7:15     ` Glen Choo
2022-02-11 17:07       ` Junio C Hamano
2022-02-10 22:51   ` Jonathan Tan
2022-02-14  4:24     ` Glen Choo
2022-02-14 18:04     ` Glen Choo
2022-02-14 10:17   ` Glen Choo
2022-02-10  4:41 ` [PATCH 8/8] submodule: fix bug and remove add_submodule_odb() Glen Choo
2022-02-10 22:54   ` Junio C Hamano
2022-02-11  3:13     ` Glen Choo
2022-02-10 23:04   ` Jonathan Tan
2022-02-11  3:18     ` Glen Choo
2022-02-11 17:19     ` Junio C Hamano
2022-02-14  2:52       ` Glen Choo
2022-02-10  7:07 ` [PATCH 0/8] fetch --recurse-submodules: fetch unpopulated submodules Junio C Hamano
2022-02-10  8:51   ` Glen Choo
2022-02-10 17:40     ` Junio C Hamano
2022-02-11  2:39       ` Glen Choo
2022-02-15 17:23 ` [PATCH v2 0/9] " Glen Choo
2022-02-15 17:23   ` [PATCH v2 1/9] t5526: introduce test helper to assert on fetches Glen Choo
2022-02-15 21:37     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 2/9] t5526: use grep " Glen Choo
2022-02-15 21:53     ` Ævar Arnfjörð Bjarmason
2022-02-16  3:09       ` Glen Choo
2022-02-16 10:02         ` Ævar Arnfjörð Bjarmason
2022-02-17  4:04           ` Glen Choo
2022-02-17  9:25             ` Ævar Arnfjörð Bjarmason
2022-02-17 16:16               ` Glen Choo
2022-02-15 17:23   ` [PATCH v2 3/9] submodule: make static functions read submodules from commits Glen Choo
2022-02-15 21:18     ` Jonathan Tan
2022-02-16  6:59       ` Glen Choo
2022-02-15 22:00     ` Ævar Arnfjörð Bjarmason
2022-02-16  7:08       ` Glen Choo
2022-02-15 17:23   ` [PATCH v2 4/9] submodule: inline submodule_commits() into caller Glen Choo
2022-02-15 22:02     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 5/9] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-02-15 21:33     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 6/9] submodule: extract get_fetch_task() Glen Choo
2022-02-15 17:23   ` [PATCH v2 7/9] fetch: fetch unpopulated, changed submodules Glen Choo
2022-02-15 22:02     ` Jonathan Tan [this message]
2022-02-16  5:46       ` Glen Choo
2022-02-16  9:11         ` Glen Choo
2022-02-16  9:39           ` Ævar Arnfjörð Bjarmason
2022-02-16 17:33             ` Glen Choo
2022-02-15 22:06     ` Ævar Arnfjörð Bjarmason
2022-02-15 17:23   ` [PATCH v2 8/9] submodule: read shallows when finding " Glen Choo
2022-02-15 22:03     ` Jonathan Tan
2022-02-15 17:23   ` [PATCH v2 9/9] submodule: fix latent check_has_commit() bug Glen Choo
2022-02-15 22:04     ` Jonathan Tan
2022-02-24 10:08   ` [PATCH v3 00/10] fetch --recurse-submodules: fetch unpopulated submodules Glen Choo
2022-02-24 10:08     ` [PATCH v3 01/10] t5526: introduce test helper to assert on fetches Glen Choo
2022-02-25  0:34       ` Junio C Hamano
2022-02-24 10:08     ` [PATCH v3 02/10] t5526: stop asserting on stderr literally Glen Choo
2022-02-24 11:52       ` Ævar Arnfjörð Bjarmason
2022-02-24 16:15         ` Glen Choo
2022-02-24 18:13           ` Eric Sunshine
2022-02-24 23:05       ` Jonathan Tan
2022-02-25  2:26         ` Glen Choo
2022-02-24 10:08     ` [PATCH v3 03/10] t5526: create superproject commits with test helper Glen Choo
2022-02-24 23:14       ` Jonathan Tan
2022-02-25  2:52         ` Glen Choo
2022-02-25 11:42           ` Ævar Arnfjörð Bjarmason
2022-02-28 18:11             ` Glen Choo
2022-02-24 10:08     ` [PATCH v3 04/10] submodule: make static functions read submodules from commits Glen Choo
2022-02-24 10:08     ` [PATCH v3 05/10] submodule: inline submodule_commits() into caller Glen Choo
2022-02-24 10:08     ` [PATCH v3 06/10] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-02-24 10:08     ` [PATCH v3 07/10] submodule: extract get_fetch_task() Glen Choo
2022-02-24 23:26       ` Jonathan Tan
2022-02-24 10:08     ` [PATCH v3 08/10] submodule: move logic into fetch_task_create() Glen Choo
2022-02-24 23:36       ` Jonathan Tan
2022-02-24 10:08     ` [PATCH v3 09/10] fetch: fetch unpopulated, changed submodules Glen Choo
2022-02-24 21:30       ` Junio C Hamano
2022-02-25  3:04         ` Glen Choo
2022-02-25  0:33       ` Junio C Hamano
2022-02-25  3:07         ` Glen Choo
2022-02-25  0:39       ` Jonathan Tan
2022-02-25  3:46         ` Glen Choo
2022-03-04 23:46           ` Jonathan Tan
2022-03-05  0:22             ` Glen Choo
2022-03-04 23:53           ` Jonathan Tan
2022-02-26 18:53       ` Junio C Hamano
2022-03-01 20:24         ` Johannes Schindelin
2022-03-01 20:33           ` Junio C Hamano
2022-03-02 23:25             ` Glen Choo
2022-03-01 20:32         ` Junio C Hamano
2022-02-24 10:08     ` [PATCH v3 10/10] submodule: fix latent check_has_commit() bug Glen Choo
2022-03-04  0:57     ` [PATCH v4 00/10] fetch --recurse-submodules: fetch unpopulated submodules Glen Choo
2022-03-04  0:57       ` [PATCH v4 01/10] t5526: introduce test helper to assert on fetches Glen Choo
2022-03-04  2:06         ` Junio C Hamano
2022-03-04 22:11           ` Glen Choo
2022-03-04  0:57       ` [PATCH v4 02/10] t5526: stop asserting on stderr literally Glen Choo
2022-03-04  2:12         ` Junio C Hamano
2022-03-04 22:41         ` Jonathan Tan
2022-03-04 23:48           ` Junio C Hamano
2022-03-05  0:25             ` Glen Choo
2022-03-04  0:57       ` [PATCH v4 03/10] t5526: create superproject commits with test helper Glen Choo
2022-03-04 22:59         ` Jonathan Tan
2022-03-04  0:57       ` [PATCH v4 04/10] submodule: make static functions read submodules from commits Glen Choo
2022-03-04  0:57       ` [PATCH v4 05/10] submodule: inline submodule_commits() into caller Glen Choo
2022-03-04  0:57       ` [PATCH v4 06/10] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-03-04  0:57       ` [PATCH v4 07/10] submodule: extract get_fetch_task() Glen Choo
2022-03-04  0:57       ` [PATCH v4 08/10] submodule: move logic into fetch_task_create() Glen Choo
2022-03-04  0:57       ` [PATCH v4 09/10] fetch: fetch unpopulated, changed submodules Glen Choo
2022-03-04  2:37         ` Junio C Hamano
2022-03-04 22:59           ` Glen Choo
2022-03-05  0:13             ` Junio C Hamano
2022-03-05  0:37               ` Glen Choo
2022-03-08  0:11                 ` Junio C Hamano
2022-03-04 23:56         ` Jonathan Tan
2022-03-04  0:57       ` [PATCH v4 10/10] submodule: fix latent check_has_commit() bug Glen Choo
2022-03-04  2:17         ` Junio C Hamano
2022-03-04  2:22       ` [PATCH v4 00/10] fetch --recurse-submodules: fetch unpopulated submodules Junio C Hamano
2022-03-08  0:14       ` [PATCH v5 " Glen Choo
2022-03-08  0:14         ` [PATCH v5 01/10] t5526: introduce test helper to assert on fetches Glen Choo
2022-03-08  0:14         ` [PATCH v5 02/10] t5526: stop asserting on stderr literally Glen Choo
2022-03-08  0:14         ` [PATCH v5 03/10] t5526: create superproject commits with test helper Glen Choo
2022-03-08  0:14         ` [PATCH v5 04/10] submodule: make static functions read submodules from commits Glen Choo
2022-03-08  0:14         ` [PATCH v5 05/10] submodule: inline submodule_commits() into caller Glen Choo
2022-03-08  0:14         ` [PATCH v5 06/10] submodule: store new submodule commits oid_array in a struct Glen Choo
2022-03-08  0:14         ` [PATCH v5 07/10] submodule: extract get_fetch_task() Glen Choo
2022-03-08  0:14         ` [PATCH v5 08/10] submodule: move logic into fetch_task_create() Glen Choo
2022-03-08  0:14         ` [PATCH v5 09/10] fetch: fetch unpopulated, changed submodules Glen Choo
2022-03-08  0:14         ` [PATCH v5 10/10] submodule: fix latent check_has_commit() bug Glen Choo
2022-03-08  0:50         ` [PATCH v5 00/10] fetch --recurse-submodules: fetch unpopulated submodules Junio C Hamano
2022-03-08 18:24           ` Glen Choo
2022-03-09 19:13             ` Junio C Hamano
2022-03-09 19:49               ` Glen Choo
2022-03-09 20:22                 ` Junio C Hamano
2022-03-09 22:11                   ` Glen Choo
2022-03-16 21:58                     ` Glen Choo
2022-03-16 22:06                       ` Junio C Hamano
2022-03-16 22:37                         ` Glen Choo
2022-03-16 23:08                           ` Junio C Hamano
2022-03-08 21:42         ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220215220229.1633486-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=chooglen@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).