From: Derrick Stolee <derrickstolee@github.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, vdye@github.com, shaoxuan.yuan02@gmail.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v2 08/10] sparse-index: complete partial expansion
Date: Wed, 25 May 2022 10:26:41 -0400
Message-ID: <f0327fd6-f2b3-64b6-3a34-2a7bbad20a01@github.com>
In-Reply-To: <xmqqv8tvlvao.fsf@gitster.g>

On 5/23/2022 6:48 PM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>>> I suspect that is a situation that is not so uncommon.  Working
>>> inside a narrow cone of a wide tree, performing a merge would
>>> hopefully allow many subtrees that are outside of the cones of our
>>> interest merged without getting expanded at all (e.g. only the other
>>> side touched these subtrees we are not working on, so their version
>>> will become the merge result), while changes to some paths in the
>>> cone of our interest may result in true conflicts represented as
>>> cache entries at higher stages, needing conflict resolution
>>> concluded with "git add".  Having to expand these subtrees that we
>>> managed to merge while still collapsed, only because we have
>>> conflicts in some other parts of the tree, feels somewhat sad.
>>
>> You are correct that conflicts outside of the sparse-checkout cone will
>> cause index expansion. That happens during the 'git merge' command, but
>> the index will continue to fail to collapse as long as those conflicts
>> still exist in the index.
>>
>> When there are conflicts like this during the merge, then the index
>> expansion is not as large of a portion of the command as normal, because
>> the conflict resolution also takes some time to compute. The commands
>> afterwards do take longer purely because of the expanded index.
> 
> I was imagining a situation more like "tech-writers only have
> Documentation/ inside the cone of interest, attempt a pull from
> somebody else, have conflicts inside Documentation/, but everything
> else could be resolved cleanly without expanding the index".  If the
> puller's tree is based on the pristine upstream release tag, and the
> pullee's tree is based on a slightly newer version of upstream
> snapshot, everything that happened outside Documentation/ in their
> trees would fast-forward, so such a merge wouldn't have to expand
> directories like "builtin/" or "contrib/" in the index and instead
> can merge at the tree level, right?
> 
> On the other hand, ...
> 
>> However, this state is also not as common as you might think. If a user
>> has a sparse-checkout cone specified, then they are unlikely to change
>> files outside of the sparse-checkout cone. They would not be the reason
>> that those files have a conflict. The conflicts would exist only if they
>> are merging branches that had conflicts outside of the cone. Typically,
>> any merge of external changes like this are of the form of "git pull" or
>> "git rebase", in which case the conflicts are still "local" to the
>> developer's changes.
> 
> ... you seem to be talking about the opposite case (e.g. in the
> above paragraph), where a conflict happens outside the cone of
> interest of the person who is making a merge.  So, I am a bit
> puzzled.

Hm. We must be getting mixed up with each other. Let me try again
from the beginning.

When the conflict happens inside of the sparse-checkout cone,
then the sparse index is not expanded. This is checked by the
test 'sparse-index is not expanded: merge conflict in cone' in
t1092.

Most merges get to "fast forward" changes that are outside of
the sparse-checkout cone because the only interesting changes
that could lead to a conflict are those created by the current
user (and hence within the sparse-checkout cone).

So, the typical case of a tech writer only editing "Documentation/"
should only see conflicts within "Documentation/".
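
To make the "fast forward" intuition concrete, here is a minimal sketch
of a three-way rule applied to a whole directory at once. It is a toy
model and not Git's merge code: the function name is hypothetical, and
tree object IDs are represented as plain strings for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Toy three-way rule for one directory, keyed by tree object IDs
 * (plain strings here). Returns true and sets *result when the
 * directory can be merged without looking inside it.
 */
bool toy_merge_tree_level(const char *base, const char *ours,
			  const char *theirs, const char **result)
{
	assert(base && ours && theirs && result);

	if (!strcmp(ours, theirs)) {
		*result = ours;   /* both sides agree */
		return true;
	}
	if (!strcmp(base, ours)) {
		*result = theirs; /* only their side changed: take theirs */
		return true;
	}
	if (!strcmp(base, theirs)) {
		*result = ours;   /* only our side changed: keep ours */
		return true;
	}
	return false; /* both sides changed: must descend into the tree */
}
```

In this model, a directory outside the cone that only the other side
touched resolves in the second branch, so it never needs to be expanded
into individual file entries.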

We would see conflicts outside of the cone in cases such as
long-lived branches being merged by someone with
a small cone. I can imagine an automated process using an empty
sparse-checkout cone to occasionally merge a "deployed" and a
"develop" branch, and it gets conflicts when someone ships a
hotfix directly to "deployed" without first going through "develop".
Any conflict is likely to cause index expansion in this case.
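
As a rough illustration of that rule, the sketch below checks whether
any conflicted path falls outside the cone. Everything in it is a
simplified assumption: the function names are hypothetical, cone
directories are given as prefixes ending in '/', and the real cone-mode
rule that also includes files directly inside parent directories of a
cone directory is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model (not Git code): a path is "in the cone" if it is a
 * root-level file or lives under one of the cone directories.
 */
bool toy_path_in_cone(const char *path, const char *const *cone, size_t nr)
{
	size_t i;

	assert(path);
	if (!strchr(path, '/'))
		return true; /* root-level files are always in the cone */

	for (i = 0; i < nr; i++) {
		/* cone[i] is expected to end with '/', e.g. "Documentation/" */
		size_t len = strlen(cone[i]);

		if (!strncmp(path, cone[i], len))
			return true;
	}
	return false;
}

/* Returns true if any conflicted path would force index expansion. */
bool toy_conflicts_force_expansion(const char *const *conflicts,
				   size_t nr_conflicts,
				   const char *const *cone, size_t nr_cone)
{
	size_t i;

	for (i = 0; i < nr_conflicts; i++)
		if (!toy_path_in_cone(conflicts[i], cone, nr_cone))
			return true;
	return false;
}
```

With an empty cone (nr_cone of 0), any conflict below the root
directory counts as outside the cone, which matches the automated
"deployed"/"develop" merge scenario.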

Let's re-introduce the patch section we are talking about:

+	if (pl && !pl->use_cone_patterns) {
 		pl = NULL;
+	} else {
+		/*
+		 * We might contract file entries into sparse-directory
+		 * entries, and for that we will need the cache tree to
+		 * be recomputed.
+		 */
+		cache_tree_free(&istate->cache_tree);
+
+		/*
+		 * If there is a problem creating the cache tree, then we
+		 * need to expand to a full index since we cannot satisfy
+		 * the current request as a sparse index.
+		 */
+		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
+			pl = NULL;
+	}

If a user is in a conflict state and modifies their sparse-checkout
cone, then we will hit this "recompute the cache-tree" state, fail,
and cause full index expansion. I think that combination (have a
conflict AND modify sparse-checkout cone) is rare enough to not
optimize for (yet).
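
The control flow of that hunk can be modeled in isolation. The sketch
below is not the patch itself: the toy_* structs and functions are
hypothetical stand-ins, and the assumption that the cache-tree update
fails exactly when unmerged entries exist is a simplification of the
behavior described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state the hunk inspects. */
struct toy_patterns {
	bool use_cone_patterns;
};

struct toy_index {
	bool has_unmerged_entries; /* conflict stages are present */
	bool cache_tree_valid;
};

/*
 * Stand-in for cache_tree_update(): returns nonzero on failure.
 * Assumption: the tree cannot be computed while conflicts exist.
 */
int toy_cache_tree_update(struct toy_index *istate)
{
	assert(istate);
	if (istate->has_unmerged_entries)
		return -1;
	istate->cache_tree_valid = true;
	return 0;
}

/*
 * Mirrors the shape of the quoted hunk: returns the pattern list to
 * expand to, or NULL meaning "expand to a full index".
 */
const struct toy_patterns *toy_choose_expansion(struct toy_index *istate,
						 const struct toy_patterns *pl)
{
	if (pl && !pl->use_cone_patterns) {
		pl = NULL;
	} else {
		istate->cache_tree_valid = false; /* like cache_tree_free() */
		if (toy_cache_tree_update(istate))
			pl = NULL;
	}
	return pl;
}
```

In this model, a clean index with cone patterns keeps its pattern list
and stays sparse, while a conflicted index falls through to the NULL
result and is expanded fully.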

Does that make the situation clearer?

Thanks,
-Stolee
