git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert
@ 2021-08-17 17:08 Derrick Stolee via GitGitGadget
  2021-08-17 17:08 ` [PATCH 1/6] t1092: use ORT merge strategy Derrick Stolee via GitGitGadget
                   ` (6 more replies)
  0 siblings, 7 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee

This series integrates the sparse index with commands that perform merges
such as 'git merge', 'git cherry-pick', 'git revert' (free with
cherry-pick), and 'git rebase'.

When the ORT merge strategy is enabled, this allows most merges to succeed
without expanding the sparse index, leading to significant performance
gains. I tested these changes against an internal monorepo with over 2
million paths at HEAD but with a sparse-checkout that only has ~60,000 files
within the sparse-checkout cone. 'git merge' commands went from 5-6 seconds
to 0.750-1.250s.

In the case of the recursive merge strategy, the sparse index is expanded
before the recursive algorithm proceeds. We expect that this is as good as
we can get with that strategy. When the strategy shifts to ORT as the
default, then this will not be a problem except for users who decide to
change the behavior.

Most of the hard work was done by previous series, such as
ds/sparse-index-ignored-files (which this series is based on).

Thanks, -Stolee

Derrick Stolee (6):
  t1092: use ORT merge strategy
  diff: ignore sparse paths in diffstat
  merge: make sparse-aware with ORT
  merge-ort: expand only for out-of-cone conflicts
  t1092: add cherry-pick, rebase tests
  sparse-index: integrate with cherry-pick and rebase

 builtin/merge.c                          |  3 +
 builtin/rebase.c                         |  6 ++
 builtin/revert.c                         |  3 +
 diff.c                                   |  8 ++
 merge-ort.c                              | 16 ++++
 merge-recursive.c                        |  3 +
 t/t1092-sparse-checkout-compatibility.sh | 97 +++++++++++++++++++++---
 7 files changed, 124 insertions(+), 12 deletions(-)


base-commit: febef675f051eb08896751bb5661b6deb5579ead
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1019%2Fderrickstolee%2Fsparse-index%2Fmerge-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1019/derrickstolee/sparse-index/merge-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1019
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
@ 2021-08-17 17:08 ` Derrick Stolee via GitGitGadget
  2021-08-18 17:16   ` Taylor Blau
  2021-08-18 18:10   ` Junio C Hamano
  2021-08-17 17:08 ` [PATCH 2/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The sparse index will be compatible with the ORT merge strategy, so
let's use it explicitly in our tests.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ddc86bb4152..3e01e70fa0b 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -7,6 +7,11 @@ GIT_TEST_SPARSE_INDEX=
 
 . ./test-lib.sh
 
+# Force the use of the ORT merge algorithm until testing with the
+# recursive strategy. We expect ORT to be used with sparse-index.
+GIT_TEST_MERGE_ALGORITHM=ort
+export GIT_TEST_MERGE_ALGORITHM
+
 test_expect_success 'setup' '
 	git init initial-repo &&
 	(
@@ -501,7 +506,7 @@ test_expect_success 'merge with conflict outside cone' '
 
 	test_all_match git checkout -b merge-tip merge-left &&
 	test_all_match git status --porcelain=v2 &&
-	test_all_match test_must_fail git merge -m merge merge-right &&
+	test_all_match test_must_fail git merge -sort -m merge merge-right &&
 	test_all_match git status --porcelain=v2 &&
 
 	# Resolve the conflict in different ways:
@@ -531,7 +536,7 @@ test_expect_success 'merge with outside renames' '
 	do
 		test_all_match git reset --hard &&
 		test_all_match git checkout -f -b merge-$type update-deep &&
-		test_all_match git merge -m "$type" rename-$type &&
+		test_all_match git merge -sort -m "$type" rename-$type &&
 		test_all_match git rev-parse HEAD^{tree} || return 1
 	done
 '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/6] diff: ignore sparse paths in diffstat
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
  2021-08-17 17:08 ` [PATCH 1/6] t1092: use ORT merge strategy Derrick Stolee via GitGitGadget
@ 2021-08-17 17:08 ` Derrick Stolee via GitGitGadget
  2021-08-20 21:32   ` Elijah Newren
  2021-08-17 17:08 ` [PATCH 3/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete, especially when a conflict appears. In
order to avoid expanding a sparse index, the reuse_worktree_file() needs
to be adapted to ignore files that are outside of the sparse-checkout
cone. The file names and OIDs used for this check come from the merged
tree in the case of the ORT strategy, not the index, hence the ability
to look into these paths without having already expanded the index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 diff.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/diff.c b/diff.c
index a8113f17070..c457cfa0e59 100644
--- a/diff.c
+++ b/diff.c
@@ -26,6 +26,7 @@
 #include "parse-options.h"
 #include "help.h"
 #include "promisor-remote.h"
+#include "dir.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3900,6 +3901,13 @@ static int reuse_worktree_file(struct index_state *istate,
 	if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
 		return 0;
 
+	/*
+	 * If this path does not match our sparse-checkout definition,
+	 * then the file will not be in the working directory.
+	 */
+	if (!path_in_sparse_checkout(name, istate))
+		return 0;
+
 	/*
 	 * Similarly, if we'd have to convert the file contents anyway, that
 	 * makes the optimization not worthwhile.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 3/6] merge: make sparse-aware with ORT
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
  2021-08-17 17:08 ` [PATCH 1/6] t1092: use ORT merge strategy Derrick Stolee via GitGitGadget
  2021-08-17 17:08 ` [PATCH 2/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
@ 2021-08-17 17:08 ` Derrick Stolee via GitGitGadget
  2021-08-20 21:40   ` Elijah Newren
  2021-08-17 17:08 ` [PATCH 4/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:

1. If the merge strategy is 'recursive', then we enable
   command_requires_full_index at the start of the merge_recursive()
   method. We expect sparse-index users to also have the 'ort' strategy
   enabled.

2. If the merge results in a conflicted file, then we expand the index
   before updating the working tree. The loop that iterates over the
   worktree replaces index entries and tracks 'origintal_cache_nr' which
   can become completely wrong if the index expands in the middle of the
   operation. This safety valve is important before that loop starts. A
   later change will focus this to only expand if we indeed have a
   conflict outside of the sparse-checkout cone.

Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/merge.c                          | 3 +++
 merge-ort.c                              | 8 ++++++++
 merge-recursive.c                        | 3 +++
 t/t1092-sparse-checkout-compatibility.sh | 8 ++++++--
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 22f23990b37..926de328fbb 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_merge_usage, builtin_merge_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/*
 	 * Check if we are _not_ on a detached HEAD, i.e. if there is a
 	 * current branch.
diff --git a/merge-ort.c b/merge-ort.c
index 6eb910d6f0c..8e754b769e1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4058,6 +4058,14 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 	if (strmap_empty(&opt->priv->conflicted))
 		return 0;
 
+	/*
+	 * We are in a conflicted state. These conflicts might be inside
+	 * sparse-directory entries, so expand the index preemtively.
+	 * Also, we set original_cache_nr below, but that might change if
+	 * index_name_pos() calls ask for paths within sparse directories.
+	 */
+	ensure_full_index(index);
+
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
 	state.quiet = 1;
diff --git a/merge-recursive.c b/merge-recursive.c
index 3355d50e8ad..1f563cd6874 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
 	assert(opt->ancestor == NULL ||
 	       !strcmp(opt->ancestor, "constructed merge base"));
 
+	prepare_repo_settings(opt->repo);
+	opt->repo->settings.command_requires_full_index = 1;
+
 	if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
 		return -1;
 	clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 3e01e70fa0b..781ebd9a656 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -52,7 +52,7 @@ test_expect_success 'setup' '
 		git checkout -b base &&
 		for dir in folder1 folder2 deep
 		do
-			git checkout -b update-$dir &&
+			git checkout -b update-$dir base &&
 			echo "updated $dir" >$dir/a &&
 			git commit -a -m "update $dir" || return 1
 		done &&
@@ -652,7 +652,11 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/extra.txt &&
 	ensure_not_expanded add extra.txt &&
 	echo >>sparse-index/untracked.txt &&
-	ensure_not_expanded add .
+	ensure_not_expanded add . &&
+
+	ensure_not_expanded checkout -f update-deep &&
+	ensure_not_expanded merge -s ort -m merge update-folder1 &&
+	ensure_not_expanded merge -s ort -m merge update-folder2
 '
 
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 4/6] merge-ort: expand only for out-of-cone conflicts
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-08-17 17:08 ` [PATCH 3/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
@ 2021-08-17 17:08 ` Derrick Stolee via GitGitGadget
  2021-08-20 21:53   ` Elijah Newren
  2021-08-17 17:08 ` [PATCH 5/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Merge conflicts happen often enough to want to avoid expanding a sparse
index when they happen, as long as those conflicts are within the
sparse-checkout cone. If a conflict exists outside of the
sparse-checkout cone, then we still need to expand before iterating over
the index entries. This is critical to do in advance because of how the
original_cache_nr is tracked to allow inserting and replacing cache
entries.

Iterate over the conflicted files and check if any paths are outside of
the sparse-checkout cone. If so, then expand the full index.

Add a test that demonstrates that we do not expand the index, even when
we hit a conflict within the sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 merge-ort.c                              | 14 ++++++++---
 t/t1092-sparse-checkout-compatibility.sh | 30 ++++++++++++++++++++++--
 2 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 8e754b769e1..590e52058cf 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -36,6 +36,7 @@
 #include "tree.h"
 #include "unpack-trees.h"
 #include "xdiff-interface.h"
+#include "dir.h"
 
 /*
  * We have many arrays of size 3.  Whenever we have such an array, the
@@ -4060,11 +4061,18 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 
 	/*
 	 * We are in a conflicted state. These conflicts might be inside
-	 * sparse-directory entries, so expand the index preemtively.
-	 * Also, we set original_cache_nr below, but that might change if
+	 * sparse-directory entries, so check if any entries are outside
+	 * of the sparse-checkout cone preemptively.
+	 *
+	 * We set original_cache_nr below, but that might change if
 	 * index_name_pos() calls ask for paths within sparse directories.
 	 */
-	ensure_full_index(index);
+	strmap_for_each_entry(&opt->priv->conflicted, &iter, e) {
+		if (!path_in_sparse_checkout(e->key, index)) {
+			ensure_full_index(index);
+			break;
+		}
+	}
 
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 781ebd9a656..a0ed2bec574 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -622,8 +622,21 @@ test_expect_success 'sparse-index is expanded and converted back' '
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-		git -C sparse-index "$@" &&
+
+	if test "$1" = "!"
+	then
+		shift &&
+		(
+			GIT_TRACE2_EVENT="$(pwd)/trace2.txt" &&
+			GIT_TRACE2_EVENT_NESTING=10 &&
+			export GIT_TRACE2_EVENT &&
+			export GIT_TRACE2_EVENT_NESTING &&
+			test_must_fail git -C sparse-index "$@" || return 1
+		)
+	else
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index "$@" || return 1
+	fi &&
 	test_region ! index ensure_full_index trace2.txt
 }
 
@@ -659,6 +672,19 @@ test_expect_success 'sparse-index is not expanded' '
 	ensure_not_expanded merge -s ort -m merge update-folder2
 '
 
+test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
+	init_repos &&
+
+	for side in right left
+	do
+		git -C sparse-index checkout -b expand-$side base &&
+		echo $side >sparse-index/deep/a &&
+		git -C sparse-index commit -a -m "$side" || return 1
+	done &&
+
+	ensure_not_expanded ! merge -m merged expand-right
+'
+
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 # in this scenario, but it shouldn't.
 test_expect_success 'reset mixed and checkout orphan' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 5/6] t1092: add cherry-pick, rebase tests
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-08-17 17:08 ` [PATCH 4/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
@ 2021-08-17 17:08 ` Derrick Stolee via GitGitGadget
  2021-08-20 21:58   ` Elijah Newren
  2021-08-17 17:08 ` [PATCH 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
  6 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add tests to check that cherry-pick and rebase behave the same in the
sparse-index case as in the full index cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index a0ed2bec574..a52d2edda54 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -486,14 +486,17 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 	test_sparse_match git reset update-folder2
 '
 
-test_expect_success 'merge' '
+test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	test_all_match git checkout -b merge update-deep &&
-	test_all_match git merge -m "folder1" update-folder1 &&
-	test_all_match git rev-parse HEAD^{tree} &&
-	test_all_match git merge -m "folder2" update-folder2 &&
-	test_all_match git rev-parse HEAD^{tree}
+	for OPERATION in "merge -s ort -m merge" cherry-pick rebase
+	do
+		test_all_match git checkout -B temp update-deep &&
+		test_all_match git $OPERATION update-folder1 &&
+		test_all_match git rev-parse HEAD^{tree} &&
+		test_all_match git $OPERATION update-folder2 &&
+		test_all_match git rev-parse HEAD^{tree} || return 1
+	done
 '
 
 # NEEDSWORK: This test is documenting current behavior, but that
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 6/6] sparse-index: integrate with cherry-pick and rebase
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-08-17 17:08 ` [PATCH 5/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
@ 2021-08-17 17:08 ` Derrick Stolee via GitGitGadget
  2021-08-21  0:12   ` Elijah Newren
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
  6 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-17 17:08 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The hard work was already done with 'git merge' and the ORT strategy.
Just add extra tests to see that we get the expected results in the
non-conflict cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/rebase.c                         |  6 ++++
 builtin/revert.c                         |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 41 ++++++++++++++++++++++--
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 33e09619005..27433d7c5a2 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -559,6 +559,9 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options,
 			builtin_rebase_interactive_usage, PARSE_OPT_KEEP_ARGV0);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	if (!is_null_oid(&squash_onto))
 		opts.squash_onto = &squash_onto;
 
@@ -1430,6 +1433,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		usage_with_options(builtin_rebase_usage,
 				   builtin_rebase_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	options.allow_empty_message = 1;
 	git_config(rebase_config, &options);
 	/* options.gpg_sign_opt will be either "-S" or NULL */
diff --git a/builtin/revert.c b/builtin/revert.c
index 237f2f18d4c..6c4c22691bd 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -136,6 +136,9 @@ static int run_sequencer(int argc, const char **argv, struct replay_opts *opts)
 			PARSE_OPT_KEEP_ARGV0 |
 			PARSE_OPT_KEEP_UNKNOWN);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/* implies allow_empty */
 	if (opts->keep_redundant_commits)
 		opts->allow_empty = 1;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index a52d2edda54..c047b95b121 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -532,6 +532,38 @@ test_expect_success 'merge with conflict outside cone' '
 	test_all_match git rev-parse HEAD^{tree}
 '
 
+test_expect_success 'cherry-pick/rebase with conflict outside cone' '
+	init_repos &&
+
+	for OPERATION in cherry-pick rebase
+	do
+		test_all_match git checkout -B tip &&
+		test_all_match git reset --hard merge-left &&
+		test_all_match git status --porcelain=v2 &&
+		test_all_match test_must_fail git $OPERATION merge-right &&
+		test_all_match git status --porcelain=v2 &&
+
+		# Resolve the conflict in different ways:
+		# 1. Revert to the base
+		test_all_match git checkout base -- deep/deeper2/a &&
+		test_all_match git status --porcelain=v2 &&
+
+		# 2. Add the file with conflict markers
+		test_all_match git add folder1/a &&
+		test_all_match git status --porcelain=v2 &&
+
+		# 3. Rename the file to another sparse filename and
+		#    accept conflict markers as resolved content.
+		run_on_all mv folder2/a folder2/z &&
+		test_all_match git add folder2 &&
+		test_all_match git status --porcelain=v2 &&
+
+		test_all_match git $OPERATION --continue &&
+		test_all_match git status --porcelain=v2 &&
+		test_all_match git rev-parse HEAD^{tree} || return 1
+	done
+'
+
 test_expect_success 'merge with outside renames' '
 	init_repos &&
 
@@ -670,9 +702,12 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/untracked.txt &&
 	ensure_not_expanded add . &&
 
-	ensure_not_expanded checkout -f update-deep &&
-	ensure_not_expanded merge -s ort -m merge update-folder1 &&
-	ensure_not_expanded merge -s ort -m merge update-folder2
+	for OPERATION in "merge -s ort -m merge" cherry-pick rebase
+	do
+		ensure_not_expanded checkout -f -B temp update-deep &&
+		ensure_not_expanded $OPERATION update-folder1 &&
+		ensure_not_expanded $OPERATION update-folder2 || return 1
+	done
 '
 
 test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-17 17:08 ` [PATCH 1/6] t1092: use ORT merge strategy Derrick Stolee via GitGitGadget
@ 2021-08-18 17:16   ` Taylor Blau
  2021-08-18 18:10   ` Junio C Hamano
  1 sibling, 0 replies; 45+ messages in thread
From: Taylor Blau @ 2021-08-18 17:16 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, newren, stolee, gitster, Derrick Stolee, Derrick Stolee

On Tue, Aug 17, 2021 at 05:08:39PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The sparse index will be compatible with the ORT merge strategy, so
> let's use it explicitly in our tests.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index ddc86bb4152..3e01e70fa0b 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -7,6 +7,11 @@ GIT_TEST_SPARSE_INDEX=
>
>  . ./test-lib.sh
>
> +# Force the use of the ORT merge algorithm until testing with the
> +# recursive strategy. We expect ORT to be used with sparse-index.
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM
> +

Looks good, but are the lower hunks which set `-s ort` necessary, too? I
applied this series into my tree and t1092 still passes after reverting
the next two hunks.

Not worth a reroll on its own, just curious.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-17 17:08 ` [PATCH 1/6] t1092: use ORT merge strategy Derrick Stolee via GitGitGadget
  2021-08-18 17:16   ` Taylor Blau
@ 2021-08-18 18:10   ` Junio C Hamano
  2021-08-18 18:42     ` Derrick Stolee
  1 sibling, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2021-08-18 18:10 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, newren, stolee, Derrick Stolee, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> The sparse index will be compatible with the ORT merge strategy, so
> let's use it explicitly in our tests.

Unless you mean that the sparse index will only be compatible with
ort, but will never be with recursive, I do not quite see why this
is taking us into a good direction.  Is this because we want to gain
test coverage for ort early, before we flip the default to ort [*1*]?



[Footnote]

*1* If the answer is "no, it is because sparse index will not work
    with recursive", the please disregard the rest, but just in
    case it is not...

    It seems to me that it would let us live in the future in a more
    comprehensive way if we tweaked merge_recursive() and/or
    merge_recursive_generic() so that all internal callers, not just
    builtin/merge.c, would redirect to the ort machinery when say
    GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
    doing it that way we do not need to sprinkle "-srecursive" and
    "-sort" everywhere in our tests at randomly chosen places to
    give test coverage to both strategies.


> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index ddc86bb4152..3e01e70fa0b 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -7,6 +7,11 @@ GIT_TEST_SPARSE_INDEX=
>  
>  . ./test-lib.sh
>  
> +# Force the use of the ORT merge algorithm until testing with the
> +# recursive strategy. We expect ORT to be used with sparse-index.
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM
> +
>  test_expect_success 'setup' '
>  	git init initial-repo &&
>  	(
> @@ -501,7 +506,7 @@ test_expect_success 'merge with conflict outside cone' '
>  
>  	test_all_match git checkout -b merge-tip merge-left &&
>  	test_all_match git status --porcelain=v2 &&
> -	test_all_match test_must_fail git merge -m merge merge-right &&
> +	test_all_match test_must_fail git merge -sort -m merge merge-right &&
>  	test_all_match git status --porcelain=v2 &&
>  
>  	# Resolve the conflict in different ways:
> @@ -531,7 +536,7 @@ test_expect_success 'merge with outside renames' '
>  	do
>  		test_all_match git reset --hard &&
>  		test_all_match git checkout -f -b merge-$type update-deep &&
> -		test_all_match git merge -m "$type" rename-$type &&
> +		test_all_match git merge -sort -m "$type" rename-$type &&
>  		test_all_match git rev-parse HEAD^{tree} || return 1
>  	done
>  '

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-18 18:10   ` Junio C Hamano
@ 2021-08-18 18:42     ` Derrick Stolee
  2021-08-18 22:22       ` Junio C Hamano
  2021-08-20 21:23       ` Elijah Newren
  0 siblings, 2 replies; 45+ messages in thread
From: Derrick Stolee @ 2021-08-18 18:42 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget
  Cc: git, newren, Derrick Stolee, Derrick Stolee

On 8/18/2021 2:10 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The sparse index will be compatible with the ORT merge strategy, so
>> let's use it explicitly in our tests.
> 
> Unless you mean that the sparse index will only be compatible with
> ort, but will never be with recursive, I do not quite see why this
> is taking us into a good direction.  Is this because we want to gain
> test coverage for ort early, before we flip the default to ort [*1*]?

The sparse index will _work_ with the recursive merge strategy, it will
just continue to be slow, and likely slower than if we had a full index.
This is because the recursive merge algorithm will expand a sparse index
into a full one before doing any of its logic (hence my confidence that
it will work).

The main point why ORT is the focus is that the ORT strategy is required
so the sparse index can get the intended performance gains (i.e. it does
not expand in most cases). The ORT algorithm can resolve conflicts
outside the sparse-checkout cone without needing the index as a data
structure and instead the resulting tree is recorded in the correct
sparse directory entry.

> [Footnote]
> 
> *1* If the answer is "no, it is because sparse index will not work
>     with recursive", the please disregard the rest, but just in
>     case it is not...
> 
>     It seems to me that it would let us live in the future in a more
>     comprehensive way if we tweaked merge_recursive() and/or
>     merge_recursive_generic() so that all internal callers, not just
>     builtin/merge.c, would redirect to the ort machinery when say
>     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
>     doing it that way we do not need to sprinkle "-srecursive" and
>     "-sort" everywhere in our tests at randomly chosen places to
>     give test coverage to both strategies.

I could also change this patch to stop using ORT _all the time_ and
instead let the GIT_TEST_MERGE_ALGORITHM decide which is tested.

That is, except for the final tests that check that the index is not
expanded. Those tests must specify the ORT strategy explicitly.

I think I started playing with the GIT_TEST_MERGE_ALGORITHM because
it appears to override the command-line option, but I will need to
double-check that.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-18 18:42     ` Derrick Stolee
@ 2021-08-18 22:22       ` Junio C Hamano
  2021-08-20 21:23       ` Elijah Newren
  1 sibling, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2021-08-18 22:22 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, newren, Derrick Stolee,
	Derrick Stolee

Derrick Stolee <stolee@gmail.com> writes:

> On 8/18/2021 2:10 PM, Junio C Hamano wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> 
>>> From: Derrick Stolee <dstolee@microsoft.com>
>>>
>>> The sparse index will be compatible with the ORT merge strategy, so
>>> let's use it explicitly in our tests.
>> 
>> Unless you mean that the sparse index will only be compatible with
>> ort, but will never be with recursive, I do not quite see why this
>> is taking us into a good direction.  Is this because we want to gain
>> test coverage for ort early, before we flip the default to ort [*1*]?
>
> The sparse index will _work_ with the recursive merge strategy, it will
> just continue to be slow, and likely slower than if we had a full index.
> This is because the recursive merge algorithm will expand a sparse index
> into a full one before doing any of its logic (hence my confidence that
> it will work).

Ah, thanks for explanation.  The tests being touched are not about
correctness of the merge results but the damage the operation would
make to the sparseness of the index, and because in the longer term
the recursive backend is on its way out, we do want to focus on
testing how ORT performs.

> I could also change this patch to stop using ORT _all the time_ and
> instead let the GIT_TEST_MERGE_ALGORITHM decide which is tested.

No, that's OK.

It was unclear from the proposed log message (hence my questions),
but if it is made clear that we have a good reason why we want to
see how the sparse-index works with ORT, explicitly testing with ORT
like your patch does is the right thing to do, I would think.

With GIT_TEST_MERGE_ALGORITHM set to ort and exported, it is
puzzling why some "git merge" invocations gets "-s ort" in the same
patch, though.

Thanks.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-18 18:42     ` Derrick Stolee
  2021-08-18 22:22       ` Junio C Hamano
@ 2021-08-20 21:23       ` Elijah Newren
  2021-08-20 23:32         ` Junio C Hamano
  1 sibling, 1 reply; 45+ messages in thread
From: Elijah Newren @ 2021-08-20 21:23 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Junio C Hamano, Derrick Stolee via GitGitGadget,
	Git Mailing List, Derrick Stolee, Derrick Stolee

On Wed, Aug 18, 2021 at 11:42 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> >     It seems to me that it would let us live in the future in a more
> >     comprehensive way if we tweaked merge_recursive() and/or
> >     merge_recursive_generic() so that all internal callers, not just
> >     builtin/merge.c, would redirect to the ort machinery when say
> >     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
> >     doing it that way we do not need to sprinkle "-srecursive" and
> >     "-sort" everywhere in our tests at randomly chosen places to
> >     give test coverage to both strategies.

GIT_TEST_MERGE_ALGORITHM already does this; the testsuite already had
`-s recursive` sprinkled everywhere (due to contrast with `-s
resolve`), but since I wanted to use all existing recursive tests as
ort tests, then rather than tweaking all the test files and copying
tests and whatnot, we decided to just have GIT_TEST_MERGE_ALGORITHM
reinterpret "recursive" to whatever GIT_TEST_MERGE_ALGORITHM says.

> I could also change this patch to stop using ORT _all the time_ and
> instead let the GIT_TEST_MERGE_ALGORITHM decide which is tested.
>
> That is, except for the final tests that check that the index is not
> expanded. Those tests must specify the ORT strategy explicitly.
>
> I think I started playing with the GIT_TEST_MERGE_ALGORITHM because
> it appears to override the command-line option, but I will need to
> double-check that.

Yes, GIT_TEST_MERGE_ALGORITHM=ort reinterprets "recursive" to mean "ort".

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/6] diff: ignore sparse paths in diffstat
  2021-08-17 17:08 ` [PATCH 2/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
@ 2021-08-20 21:32   ` Elijah Newren
  2021-08-24 18:30     ` Derrick Stolee
  0 siblings, 1 reply; 45+ messages in thread
From: Elijah Newren @ 2021-08-20 21:32 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Derrick Stolee,
	Derrick Stolee

On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The diff_populate_filespec() method is used to describe the diff after a
> merge operation is complete, especially when a conflict appears. In
> order to avoid expanding a sparse index, the reuse_worktree_file() needs
> to be adapted to ignore files that are outside of the sparse-checkout
> cone. The file names and OIDs used for this check come from the merged
> tree in the case of the ORT strategy, not the index, hence the ability
> to look into these paths without having already expanded the index.

I'm confused; I thought the diffstat was only shown if the merge was
successful, in which case there would be no conflicts appearing.

Also, I'm not that familiar with the general diff machinery (just the
rename detection parts), but...if the diffstat only shows when the
merge is successful, wouldn't we be comparing two REVS (ORIG_HEAD to
HEAD)?  Why would we make use of the working tree at all in such a
case?  And, wouldn't using the working tree be dangerous...what if
there was a merge performed with a dirty working tree?

On a bit of a tangent, I know diffcore-rename.c calls into
diff_populate_filespec() as well, and I have some code doing merges in
a bare repository (where there obviously is no index).  It seemed to
be working, but given this commit message, now I'm wondering if I've
missed something fundamental either in that implementation or there's
something amiss in this patch.  Or both.  Maybe I need to dig into
diff_populate_filespec() more, but it seems you already have.  Any
pointers to orient me on why your changes are right here (and, if you
know, why diffcore-rename.c should or shouldn't be using
diff_populate_filespec() in a bare repo)?


> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  diff.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/diff.c b/diff.c
> index a8113f17070..c457cfa0e59 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -26,6 +26,7 @@
>  #include "parse-options.h"
>  #include "help.h"
>  #include "promisor-remote.h"
> +#include "dir.h"
>
>  #ifdef NO_FAST_WORKING_DIRECTORY
>  #define FAST_WORKING_DIRECTORY 0
> @@ -3900,6 +3901,13 @@ static int reuse_worktree_file(struct index_state *istate,
>         if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
>                 return 0;
>
> +       /*
> +        * If this path does not match our sparse-checkout definition,
> +        * then the file will not be in the working directory.
> +        */
> +       if (!path_in_sparse_checkout(name, istate))
> +               return 0;
> +
>         /*
>          * Similarly, if we'd have to convert the file contents anyway, that
>          * makes the optimization not worthwhile.
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 3/6] merge: make sparse-aware with ORT
  2021-08-17 17:08 ` [PATCH 3/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
@ 2021-08-20 21:40   ` Elijah Newren
  0 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-08-20 21:40 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Derrick Stolee,
	Derrick Stolee

On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Allow 'git merge' to operate without expanding a sparse index, at least
> not immediately. The index still will be expanded in a few cases:
>
> 1. If the merge strategy is 'recursive', then we enable
>    command_requires_full_index at the start of the merge_recursive()
>    method. We expect sparse-index users to also have the 'ort' strategy
>    enabled.

What about `resolve`, `octopus`, `subtree` (which technically could be
implemented via either recursive or ort, such fun...) or a
user-defined strategy?

`resolve` and `octopus` would absolutely need a full index.  `subtree`
would if implemented via merge-recursive, and not if implemented via
merge-ort.

I'm not sure what to assume about user-defined strategies; I guess for
safety reasons and backward compatibility, we should always expand?
Or maybe there are no backward compatibility concerns since no one who
uses a sparse-index will attempt to use any pre-existing external
merge strategies (are there even any known ones in the wild or is this
still a theoretical capability?), and we can assume they will only use
ones written in the future?  Hmm...

> 2. If the merge results in a conflicted file, then we expand the index
>    before updating the working tree. The loop that iterates over the
>    worktree replaces index entries and tracks 'origintal_cache_nr' which
>    can become completely wrong if the index expands in the middle of the
>    operation. This safety valve is important before that loop starts. A
>    later change will focus this to only expand if we indeed have a
>    conflict outside of the sparse-checkout cone.

From reading the patch below, this is specific to ort, but that wasn't
clear on reading the commit message.

> Some test updates are required, including a mistaken 'git checkout -b'
> that did not specify the base branch, causing merges to be fast-forward
> merges.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/merge.c                          | 3 +++
>  merge-ort.c                              | 8 ++++++++
>  merge-recursive.c                        | 3 +++
>  t/t1092-sparse-checkout-compatibility.sh | 8 ++++++--
>  4 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 22f23990b37..926de328fbb 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>         if (argc == 2 && !strcmp(argv[1], "-h"))
>                 usage_with_options(builtin_merge_usage, builtin_merge_options);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +
>         /*
>          * Check if we are _not_ on a detached HEAD, i.e. if there is a
>          * current branch.
> diff --git a/merge-ort.c b/merge-ort.c
> index 6eb910d6f0c..8e754b769e1 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4058,6 +4058,14 @@ static int record_conflicted_index_entries(struct merge_options *opt)
>         if (strmap_empty(&opt->priv->conflicted))
>                 return 0;
>
> +       /*
> +        * We are in a conflicted state. These conflicts might be inside
> +        * sparse-directory entries, so expand the index preemtively.

s/preemtively/preemptively/

> +        * Also, we set original_cache_nr below, but that might change if
> +        * index_name_pos() calls ask for paths within sparse directories.
> +        */
> +       ensure_full_index(index);
> +

This seems somewhat pessimistic; what if all the conflicts are within
the sparse-checkout?  Having conflicts contains within the
sparse-checkout seems likely, since we'd only get conflicts for files
modified by both sides of history, and sparse-checkouts are used when
users aren't going to modify files outside the sparse-checkout.

>         /* If any entries have skip_worktree set, we'll have to check 'em out */
>         state.force = 1;
>         state.quiet = 1;
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 3355d50e8ad..1f563cd6874 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
>         assert(opt->ancestor == NULL ||
>                !strcmp(opt->ancestor, "constructed merge base"));
>
> +       prepare_repo_settings(opt->repo);
> +       opt->repo->settings.command_requires_full_index = 1;
> +
>         if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
>                 return -1;
>         clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 3e01e70fa0b..781ebd9a656 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -52,7 +52,7 @@ test_expect_success 'setup' '
>                 git checkout -b base &&
>                 for dir in folder1 folder2 deep
>                 do
> -                       git checkout -b update-$dir &&
> +                       git checkout -b update-$dir base &&
>                         echo "updated $dir" >$dir/a &&
>                         git commit -a -m "update $dir" || return 1
>                 done &&
> @@ -652,7 +652,11 @@ test_expect_success 'sparse-index is not expanded' '
>         echo >>sparse-index/extra.txt &&
>         ensure_not_expanded add extra.txt &&
>         echo >>sparse-index/untracked.txt &&
> -       ensure_not_expanded add .
> +       ensure_not_expanded add . &&
> +
> +       ensure_not_expanded checkout -f update-deep &&
> +       ensure_not_expanded merge -s ort -m merge update-folder1 &&
> +       ensure_not_expanded merge -s ort -m merge update-folder2

Can we just set GIT_TEST_MERGE_ALGORITHM=ort at the beginning of the
test file and then avoid repeating `-s ort`?


>  '
>
>  # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/6] merge-ort: expand only for out-of-cone conflicts
  2021-08-17 17:08 ` [PATCH 4/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
@ 2021-08-20 21:53   ` Elijah Newren
  0 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-08-20 21:53 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Derrick Stolee,
	Derrick Stolee

On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Merge conflicts happen often enough to want to avoid expanding a sparse
> index when they happen, as long as those conflicts are within the
> sparse-checkout cone. If a conflict exists outside of the
> sparse-checkout cone, then we still need to expand before iterating over
> the index entries. This is critical to do in advance because of how the
> original_cache_nr is tracked to allow inserting and replacing cache
> entries.
>
> Iterate over the conflicted files and check if any paths are outside of
> the sparse-checkout cone. If so, then expand the full index.
>
> Add a test that demonstrates that we do not expand the index, even when
> we hit a conflict within the sparse-checkout cone.

If I had read ahead a little bit instead of responding as I went....  :-)

Awesome to see this.

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  merge-ort.c                              | 14 ++++++++---
>  t/t1092-sparse-checkout-compatibility.sh | 30 ++++++++++++++++++++++--
>  2 files changed, 39 insertions(+), 5 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 8e754b769e1..590e52058cf 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -36,6 +36,7 @@
>  #include "tree.h"
>  #include "unpack-trees.h"
>  #include "xdiff-interface.h"
> +#include "dir.h"

dir.h was already included from merge-ort.c; no need to include it
again.  (Plus, the original line keeps the nice ordering of the
include list...)

>
>  /*
>   * We have many arrays of size 3.  Whenever we have such an array, the
> @@ -4060,11 +4061,18 @@ static int record_conflicted_index_entries(struct merge_options *opt)
>
>         /*
>          * We are in a conflicted state. These conflicts might be inside
> -        * sparse-directory entries, so expand the index preemtively.
> -        * Also, we set original_cache_nr below, but that might change if
> +        * sparse-directory entries, so check if any entries are outside
> +        * of the sparse-checkout cone preemptively.
> +        *
> +        * We set original_cache_nr below, but that might change if
>          * index_name_pos() calls ask for paths within sparse directories.
>          */
> -       ensure_full_index(index);
> +       strmap_for_each_entry(&opt->priv->conflicted, &iter, e) {
> +               if (!path_in_sparse_checkout(e->key, index)) {
> +                       ensure_full_index(index);
> +                       break;
> +               }
> +       }

Sweet, awesome that it was so simple to implement.

>
>         /* If any entries have skip_worktree set, we'll have to check 'em out */
>         state.force = 1;
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 781ebd9a656..a0ed2bec574 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -622,8 +622,21 @@ test_expect_success 'sparse-index is expanded and converted back' '
>  ensure_not_expanded () {
>         rm -f trace2.txt &&
>         echo >>sparse-index/untracked.txt &&
> -       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> -               git -C sparse-index "$@" &&
> +
> +       if test "$1" = "!"
> +       then
> +               shift &&
> +               (
> +                       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" &&
> +                       GIT_TRACE2_EVENT_NESTING=10 &&
> +                       export GIT_TRACE2_EVENT &&
> +                       export GIT_TRACE2_EVENT_NESTING &&
> +                       test_must_fail git -C sparse-index "$@" || return 1
> +               )

Could this be simplified to
          test_must_fail env \
                  GIT_TRACE2_EVENT="$(pwd)/trace2.txt"
GIT_TRACE2_EVENT_NESTING=10 \
                          git -C sparse-index "$@" || return 1
?
Especially if the lines can be indented to make it clear that the only
difference between this block and the one below is the "test_must_fail
env" being added at the front.



> +       else
> +               GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> +                       git -C sparse-index "$@" || return 1
> +       fi &&
>         test_region ! index ensure_full_index trace2.txt
>  }
>
> @@ -659,6 +672,19 @@ test_expect_success 'sparse-index is not expanded' '
>         ensure_not_expanded merge -s ort -m merge update-folder2
>  '
>
> +test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
> +       init_repos &&
> +
> +       for side in right left
> +       do
> +               git -C sparse-index checkout -b expand-$side base &&
> +               echo $side >sparse-index/deep/a &&
> +               git -C sparse-index commit -a -m "$side" || return 1
> +       done &&
> +
> +       ensure_not_expanded ! merge -m merged expand-right
> +'
> +
>  # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
>  # in this scenario, but it shouldn't.
>  test_expect_success 'reset mixed and checkout orphan' '
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 5/6] t1092: add cherry-pick, rebase tests
  2021-08-17 17:08 ` [PATCH 5/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
@ 2021-08-20 21:58   ` Elijah Newren
  0 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-08-20 21:58 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Derrick Stolee,
	Derrick Stolee

On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Add tests to check that cherry-pick and rebase behave the same in the
> sparse-index case as in the full index cases.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index a0ed2bec574..a52d2edda54 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -486,14 +486,17 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>         test_sparse_match git reset update-folder2
>  '
>
> -test_expect_success 'merge' '
> +test_expect_success 'merge, cherry-pick, and rebase' '
>         init_repos &&
>
> -       test_all_match git checkout -b merge update-deep &&
> -       test_all_match git merge -m "folder1" update-folder1 &&
> -       test_all_match git rev-parse HEAD^{tree} &&
> -       test_all_match git merge -m "folder2" update-folder2 &&
> -       test_all_match git rev-parse HEAD^{tree}
> +       for OPERATION in "merge -s ort -m merge" cherry-pick rebase

You're explicitly testing the ort strategy with merge, but relying on
GIT_TEST_MERGE_ALGORITHM for cherry-pick and rebase?

It'd probably be better to set GIT_TEST_MERGE_ALGORITHM=ort at the
beginning of the file and leave out the `-s ort` references.

Or, if you really wanted to test both algorithms, then in addition to
leaving out the `-s ort`, don't bother setting
GIT_TEST_MERGE_ALGORITHM.  (That works because automated test suites
will set GIT_TEST_MERGE_ALGORITHM=recursive on linux-gcc, and
GIT_TEST_MERGE_ALGORITHM=ort elsewhere).


> +       do
> +               test_all_match git checkout -B temp update-deep &&
> +               test_all_match git $OPERATION update-folder1 &&
> +               test_all_match git rev-parse HEAD^{tree} &&
> +               test_all_match git $OPERATION update-folder2 &&
> +               test_all_match git rev-parse HEAD^{tree} || return 1
> +       done


>  '
>
>  # NEEDSWORK: This test is documenting current behavior, but that
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-20 21:23       ` Elijah Newren
@ 2021-08-20 23:32         ` Junio C Hamano
  2021-08-21  0:20           ` Elijah Newren
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2021-08-20 23:32 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget,
	Git Mailing List, Derrick Stolee, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> On Wed, Aug 18, 2021 at 11:42 AM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> >     It seems to me that it would let us live in the future in a more
>> >     comprehensive way if we tweaked merge_recursive() and/or
>> >     merge_recursive_generic() so that all internal callers, not just
>> >     builtin/merge.c, would redirect to the ort machinery when say
>> >     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
>> >     doing it that way we do not need to sprinkle "-srecursive" and
>> >     "-sort" everywhere in our tests at randomly chosen places to
>> >     give test coverage to both strategies.
>
> GIT_TEST_MERGE_ALGORITHM already does this; the testsuite already had
> `-s recursive` sprinkled everywhere (due to contrast with `-s
> resolve`), but since I wanted to use all existing recursive tests as
> ort tests, then rather than tweaking all the test files and copying
> tests and whatnot, we decided to just have GIT_TEST_MERGE_ALGORITHM
> reinterpret "recursive" to whatever GIT_TEST_MERGE_ALGORITHM says.

I somehow thought that direct calls to merge_recursive() and
merge_recursive_generic() are not affected with that environment
variable, so you cannot influence what happens during "git am -3"
and "git stash apply" with that, but perhaps I was not reading the
code correctly.

It seems that merge_recursive() and merge_ort_recursive() are
interface compatible and the latter can serve as a drop-in
replacement for the former?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 6/6] sparse-index: integrate with cherry-pick and rebase
  2021-08-17 17:08 ` [PATCH 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
@ 2021-08-21  0:12   ` Elijah Newren
  0 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-08-21  0:12 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Derrick Stolee,
	Derrick Stolee

On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The hard work was already done with 'git merge' and the ORT strategy.
> Just add extra tests to see that we get the expected results in the
> non-conflict cases.

:-)

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/rebase.c                         |  6 ++++
>  builtin/revert.c                         |  3 ++
>  t/t1092-sparse-checkout-compatibility.sh | 41 ++++++++++++++++++++++--
>  3 files changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 33e09619005..27433d7c5a2 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -559,6 +559,9 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
>         argc = parse_options(argc, argv, prefix, options,
>                         builtin_rebase_interactive_usage, PARSE_OPT_KEEP_ARGV0);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +
>         if (!is_null_oid(&squash_onto))
>                 opts.squash_onto = &squash_onto;
>
> @@ -1430,6 +1433,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>                 usage_with_options(builtin_rebase_usage,
>                                    builtin_rebase_options);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +

While rebase can use either merge-recursive or merge-ort under the
covers, this change is okay because merge-recursive.c now has code to
expand the index unconditionally, and merge-ort does so only
conditionally in the cases needed.  Right?  (Same for change below to
revert.c)

>         options.allow_empty_message = 1;
>         git_config(rebase_config, &options);
>         /* options.gpg_sign_opt will be either "-S" or NULL */
> diff --git a/builtin/revert.c b/builtin/revert.c
> index 237f2f18d4c..6c4c22691bd 100644
> --- a/builtin/revert.c
> +++ b/builtin/revert.c
> @@ -136,6 +136,9 @@ static int run_sequencer(int argc, const char **argv, struct replay_opts *opts)
>                         PARSE_OPT_KEEP_ARGV0 |
>                         PARSE_OPT_KEEP_UNKNOWN);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +
>         /* implies allow_empty */
>         if (opts->keep_redundant_commits)
>                 opts->allow_empty = 1;
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index a52d2edda54..c047b95b121 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -532,6 +532,38 @@ test_expect_success 'merge with conflict outside cone' '
>         test_all_match git rev-parse HEAD^{tree}
>  '
>
> +test_expect_success 'cherry-pick/rebase with conflict outside cone' '
> +       init_repos &&
> +
> +       for OPERATION in cherry-pick rebase
> +       do
> +               test_all_match git checkout -B tip &&
> +               test_all_match git reset --hard merge-left &&
> +               test_all_match git status --porcelain=v2 &&
> +               test_all_match test_must_fail git $OPERATION merge-right &&
> +               test_all_match git status --porcelain=v2 &&
> +
> +               # Resolve the conflict in different ways:
> +               # 1. Revert to the base
> +               test_all_match git checkout base -- deep/deeper2/a &&
> +               test_all_match git status --porcelain=v2 &&
> +
> +               # 2. Add the file with conflict markers
> +               test_all_match git add folder1/a &&
> +               test_all_match git status --porcelain=v2 &&
> +
> +               # 3. Rename the file to another sparse filename and
> +               #    accept conflict markers as resolved content.
> +               run_on_all mv folder2/a folder2/z &&
> +               test_all_match git add folder2 &&
> +               test_all_match git status --porcelain=v2 &&
> +
> +               test_all_match git $OPERATION --continue &&
> +               test_all_match git status --porcelain=v2 &&
> +               test_all_match git rev-parse HEAD^{tree} || return 1
> +       done
> +'
> +
>  test_expect_success 'merge with outside renames' '
>         init_repos &&
>
> @@ -670,9 +702,12 @@ test_expect_success 'sparse-index is not expanded' '
>         echo >>sparse-index/untracked.txt &&
>         ensure_not_expanded add . &&
>
> -       ensure_not_expanded checkout -f update-deep &&
> -       ensure_not_expanded merge -s ort -m merge update-folder1 &&
> -       ensure_not_expanded merge -s ort -m merge update-folder2
> +       for OPERATION in "merge -s ort -m merge" cherry-pick rebase
> +       do
> +               ensure_not_expanded checkout -f -B temp update-deep &&
> +               ensure_not_expanded $OPERATION update-folder1 &&
> +               ensure_not_expanded $OPERATION update-folder2 || return 1
> +       done

Won't this fail on linux-gcc in the automated testing, since it uses
GIT_TEST_MERGE_ALGORITHM=recursive and you didn't override the merge
strategy for cherry-pick and rebase?

To fix, you'd either need to put a GIT_TEST_MERGE_ALGORITHM=ort at the
beginning of the file, or add `--strategy ort` options to the
cherry-pick and rebase commands.


>  '
>
>  test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-20 23:32         ` Junio C Hamano
@ 2021-08-21  0:20           ` Elijah Newren
  2021-08-21  4:20             ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Elijah Newren @ 2021-08-21  0:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget,
	Git Mailing List, Derrick Stolee, Derrick Stolee

On Fri, Aug 20, 2021 at 4:32 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Wed, Aug 18, 2021 at 11:42 AM Derrick Stolee <stolee@gmail.com> wrote:
> >>
> >> >     It seems to me that it would let us live in the future in a more
> >> >     comprehensive way if we tweaked merge_recursive() and/or
> >> >     merge_recursive_generic() so that all internal callers, not just
> >> >     builtin/merge.c, would redirect to the ort machinery when say
> >> >     GIT_TEST_REPLACE_RECURSIVE_WITH_ORT environment exists, and
> >> >     doing it that way we do not need to sprinkle "-srecursive" and
> >> >     "-sort" everywhere in our tests at randomly chosen places to
> >> >     give test coverage to both strategies.
> >
> > GIT_TEST_MERGE_ALGORITHM already does this; the testsuite already had
> > `-s recursive` sprinkled everywhere (due to contrast with `-s
> > resolve`), but since I wanted to use all existing recursive tests as
> > ort tests, then rather than tweaking all the test files and copying
> > tests and whatnot, we decided to just have GIT_TEST_MERGE_ALGORITHM
> > reinterpret "recursive" to whatever GIT_TEST_MERGE_ALGORITHM says.
>
> I somehow thought that direct calls to merge_recursive() and
> merge_recursive_generic() are not affected with that environment
> variable, so you cannot influence what happens during "git am -3"
> and "git stash apply" with that, but perhaps I was not reading the
> code correctly.

Sorry for being unclear.  I was responding to the "sprinkling" portion
of the quote; GIT_TEST_MERGE_ALGORITHM allows us to avoid sprinkling
-srecursive and -sort in various places.

You are correct that merge_recursive() and merge_recursive_generic()
are unaffected by the environment variable; the environment variable
operates at a higher level in the code to choose whether to call e.g.
merge_recursive() vs. merge_incore_recursive().

> It seems that merge_recursive() and merge_ort_recursive() are
> interface compatible and the latter can serve as a drop-in
> replacement for the former?

Yes, merge_ort_recursive() and merge_ort_nonrecursive() were meant as
interface compatible drop-in replacements for merge_recursive() and
merge_trees(), to make it easy to switch callers over.

There is no such replacement for merge_recursive_generic(), though,
and builtin/{am, merge-recursive, stash}.c will all need to be
modified to work with merge-ort.  IIRC, when last we discussed that
interface, we realized that the three were using it a bit differently
and it had some hardcoded am-specific assumptions that were not
appropriate for the other two, so it's not clear to me we should port
that interface.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-21  0:20           ` Elijah Newren
@ 2021-08-21  4:20             ` Junio C Hamano
  2021-08-21 23:48               ` Elijah Newren
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2021-08-21  4:20 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget,
	Git Mailing List, Derrick Stolee, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

>> It seems that merge_recursive() and merge_ort_recursive() are
>> interface compatible and the latter can serve as a drop-in
>> replacement for the former?
>
> Yes, merge_ort_recursive() and merge_ort_nonrecursive() were meant as
> interface compatible drop-in replacements for merge_recursive() and
> merge_trees(), to make it easy to switch callers over.
>
> There is no such replacement for merge_recursive_generic(), though,
> and builtin/{am, merge-recursive, stash}.c will all need to be
> modified to work with merge-ort.

But merge_recursive_generic() eveantually calls into merge_recursive();
as long as you hook into the latter, you're covered, no?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/6] t1092: use ORT merge strategy
  2021-08-21  4:20             ` Junio C Hamano
@ 2021-08-21 23:48               ` Elijah Newren
  0 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-08-21 23:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Derrick Stolee via GitGitGadget,
	Git Mailing List, Derrick Stolee, Derrick Stolee

On Fri, Aug 20, 2021 at 9:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> It seems that merge_recursive() and merge_ort_recursive() are
> >> interface compatible and the latter can serve as a drop-in
> >> replacement for the former?
> >
> > Yes, merge_ort_recursive() and merge_ort_nonrecursive() were meant as
> > interface compatible drop-in replacements for merge_recursive() and
> > merge_trees(), to make it easy to switch callers over.
> >
> > There is no such replacement for merge_recursive_generic(), though,
> > and builtin/{am, merge-recursive, stash}.c will all need to be
> > modified to work with merge-ort.
>
> But merge_recursive_generic() eventually calls into merge_recursive();
> as long as you hook into the latter, you're covered, no?

Okay, you caught me.  merge_ort_recursive() has one API difference
from merge_recursive(), namely, it does not allow opt->ancestor to be
anything other than NULL, whereas merge_recursive() does.  The only
caller where that distinction matters is
merge_recursive_generic()...but that does prevent
merge_recursive_generic() from just calling merge_ort_recursive().

The original patches I sent in for merge_incore_recursive() would have
provided for this same ability (and made merge_ort_recursive()
actually be a drop in replacement), but you rightfully pointed out
that the relevant opt->ancestor portion of the patches made no sense.
At the time, I responded (at
https://lore.kernel.org/git/CABPp-BFr=1iVY739cfh-1Hp82x-Mny-k4y0f3zZ_YuP3PxiGfQ@mail.gmail.com/):

"""
Yeah, you're making me lean towards thinking that
merge_recursive_generic() is just a broken API that I shouldn't port
over (even as a wrapper), and that I further shouldn't support using
the merge_ort_recursive() API in a fashion wanted by that function.
"""

The problem with opt->ancestor in merge_recursive_generic() is as follows:

  * When there is only one merge base (and opt->ancestor is not set),
merge_ort_recursive() and merge_recursive() will both set
opt->ancestor to the hash of the merge base commit.

  * The hash of the merge base is great when the merge base is a real
commit, less so when fake commits are generated (as
merge_recursive_generic() does) to pass along.

  * Because of the above, merge_recursive_generic() overrides
opt->ancestor with the value "constructed merge base" when there is
exactly one merge base tree.  That was done with git-am in mind, and
makes sense for am because it does create a fake or constructed merge
base.  It does not make sense to me for stash, which has a real
commit.  It also seems suboptimal or wrong for
builtin/merge-recursive.c as well -- it's hard to be sure since
builtin/merge-recursive simply takes an OID rather than actual branch
names and thus those OIDs could correspond to fake or constructed
commits, but builtin/merge-recursive does have the
better_branch_name() function it uses for o.branch1 and o.branch2 and
it seems like it should do the same on the merge base when it's
unique.


Rather than porting this bug (these bugs?) over from merge-recursive,
I'll just convert the merge_recursive_generic() callers over to the
new API.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/6] diff: ignore sparse paths in diffstat
  2021-08-20 21:32   ` Elijah Newren
@ 2021-08-24 18:30     ` Derrick Stolee
  2021-08-27 22:27       ` Elijah Newren
  0 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee @ 2021-08-24 18:30 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Derrick Stolee, Derrick Stolee

On 8/20/2021 5:32 PM, Elijah Newren wrote:
> On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The diff_populate_filespec() method is used to describe the diff after a
>> merge operation is complete, especially when a conflict appears. In
>> order to avoid expanding a sparse index, the reuse_worktree_file() needs
>> to be adapted to ignore files that are outside of the sparse-checkout
>> cone. The file names and OIDs used for this check come from the merged
>> tree in the case of the ORT strategy, not the index, hence the ability
>> to look into these paths without having already expanded the index.
> 
> I'm confused; I thought the diffstat was only shown if the merge was
> successful, in which case there would be no conflicts appearing.

That's my mistake. I'll edit the message accordingly.
 
> Also, I'm not that familiar with the general diff machinery (just the
> rename detection parts), but...if the diffstat only shows when the
> merge is successful, wouldn't we be comparing two REVS (ORIG_HEAD to
> HEAD)?  Why would we make use of the working tree at all in such a
> case?  And, wouldn't using the working tree be dangerous...what if
> there was a merge performed with a dirty working tree?
> 
> On a bit of a tangent, I know diffcore-rename.c calls into
> diff_populate_filespec() as well, and I have some code doing merges in
> a bare repository (where there obviously is no index).  It seemed to
> be working, but given this commit message, now I'm wondering if I've
> missed something fundamental either in that implementation or there's
> something amiss in this patch.  Or both.  Maybe I need to dig into
> diff_populate_filespec() more, but it seems you already have.  Any
> pointers to orient me on why your changes are right here (and, if you
> know, why diffcore-rename.c should or shouldn't be using
> diff_populate_filespec() in a bare repo)?

I think the cases you are thinking about are covered by this
condition before the one I'm inserting:

	/* We want to avoid the working directory if our caller
	 * doesn't need the data in a normal file, this system
	 * is rather slow with its stat/open/mmap/close syscalls,
	 * and the object is contained in a pack file.  The pack
	 * is probably already open and will be faster to obtain
	 * the data through than the working directory.  Loose
	 * objects however would tend to be slower as they need
	 * to be individually opened and inflated.
	 */
	if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
		return 0;

or after:

	/*
	 * Similarly, if we'd have to convert the file contents anyway, that
	 * makes the optimization not worthwhile.
	 */
	if (!want_file && would_convert_to_git(istate, name))
		return 0;

(This makes me think that I should move my new condition further down
so these two can be linked by context.)

Sounds like this is just an optimization, so it is fine to opt out of it
if we think the optimization isn't necessary. Outside of the sparse-checkout
cone qualifies.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert
  2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                   ` (5 preceding siblings ...)
  2021-08-17 17:08 ` [PATCH 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52 ` Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
                     ` (7 more replies)
  6 siblings, 8 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee

This series integrates the sparse index with commands that perform merges
such as 'git merge', 'git cherry-pick', 'git revert' (free with
cherry-pick), and 'git rebase'.

When the ORT merge strategy is enabled, this allows most merges to succeed
without expanding the sparse index, leading to significant performance
gains. I tested these changes against an internal monorepo with over 2
million paths at HEAD but with a sparse-checkout that only has ~60,000 files
within the sparse-checkout cone. 'git merge' commands went from 5-6 seconds
to 0.750-1.250s.

In the case of the recursive merge strategy, the sparse index is expanded
before the recursive algorithm proceeds. We expect that this is as good as
we can get with that strategy. When the strategy shifts to ORT as the
default, then this will not be a problem except for users who decide to
change the behavior.

Most of the hard work was done by previous series, such as
ds/sparse-index-ignored-files (which this series is based on).


Updates in V2
=============

 * The tests no longer specify GIT_TEST_MERGE_ALGORITHM or directly
   reference "-s ort". By relaxing this condition, I found an issue with
   'git cherry-pick' and 'git rebase' when using the 'recursive' algorithm
   which is fixed in a new patch.

 * Use the pul.twohead config to specify the ORT merge algorithm to avoid
   expanding the sparse index when that is what we are testing.

 * Corrected some misstatements in my commit messages.

Thanks, -Stolee

Derrick Stolee (6):
  diff: ignore sparse paths in diffstat
  merge: make sparse-aware with ORT
  merge-ort: expand only for out-of-cone conflicts
  t1092: add cherry-pick, rebase tests
  sequencer: ensure full index if not ORT strategy
  sparse-index: integrate with cherry-pick and rebase

 builtin/merge.c                          |  3 +
 builtin/rebase.c                         |  6 ++
 builtin/revert.c                         |  3 +
 diff.c                                   |  8 +++
 merge-ort.c                              | 15 ++++
 merge-recursive.c                        |  3 +
 sequencer.c                              |  9 +++
 t/t1092-sparse-checkout-compatibility.sh | 92 +++++++++++++++++++++---
 8 files changed, 129 insertions(+), 10 deletions(-)


base-commit: 8d55a6ba2fdf64cee4eb51f3cb6f9808bd0b7505
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1019%2Fderrickstolee%2Fsparse-index%2Fmerge-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1019/derrickstolee/sparse-index/merge-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1019

Range-diff vs v1:

 1:  7cad9eee90b < -:  ----------- t1092: use ORT merge strategy
 2:  9f50f11d394 ! 1:  c5ae705648c diff: ignore sparse paths in diffstat
     @@ Commit message
          diff: ignore sparse paths in diffstat
      
          The diff_populate_filespec() method is used to describe the diff after a
     -    merge operation is complete, especially when a conflict appears. In
     -    order to avoid expanding a sparse index, the reuse_worktree_file() needs
     -    to be adapted to ignore files that are outside of the sparse-checkout
     -    cone. The file names and OIDs used for this check come from the merged
     -    tree in the case of the ORT strategy, not the index, hence the ability
     -    to look into these paths without having already expanded the index.
     +    merge operation is complete. In order to avoid expanding a sparse index,
     +    the reuse_worktree_file() needs to be adapted to ignore files that are
     +    outside of the sparse-checkout cone. The file names and OIDs used for
     +    this check come from the merged tree in the case of the ORT strategy,
     +    not the index, hence the ability to look into these paths without having
     +    already expanded the index.
     +
     +    The work done by reuse_worktree_file() is only an optimization, and
     +    requires the file being on disk for it to be of any value. Thus, it is
     +    safe to exit the method early if we do not expect the file on disk.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
     @@ diff.c
       #ifdef NO_FAST_WORKING_DIRECTORY
       #define FAST_WORKING_DIRECTORY 0
      @@ diff.c: static int reuse_worktree_file(struct index_state *istate,
     - 	if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
     + 	if (!want_file && would_convert_to_git(istate, name))
       		return 0;
       
      +	/*
     @@ diff.c: static int reuse_worktree_file(struct index_state *istate,
      +	if (!path_in_sparse_checkout(name, istate))
      +		return 0;
      +
     - 	/*
     - 	 * Similarly, if we'd have to convert the file contents anyway, that
     - 	 * makes the optimization not worthwhile.
     + 	len = strlen(name);
     + 	pos = index_name_pos(istate, name, len);
     + 	if (pos < 0)
 3:  4c1104a0dd3 ! 2:  bb150483bcf merge: make sparse-aware with ORT
     @@ Commit message
             method. We expect sparse-index users to also have the 'ort' strategy
             enabled.
      
     -    2. If the merge results in a conflicted file, then we expand the index
     -       before updating the working tree. The loop that iterates over the
     -       worktree replaces index entries and tracks 'origintal_cache_nr' which
     -       can become completely wrong if the index expands in the middle of the
     -       operation. This safety valve is important before that loop starts. A
     -       later change will focus this to only expand if we indeed have a
     -       conflict outside of the sparse-checkout cone.
     +    2. With the 'ort' strategy, if the merge results in a conflicted file,
     +       then we expand the index before updating the working tree. The loop
     +       that iterates over the worktree replaces index entries and tracks
     +       'origintal_cache_nr' which can become completely wrong if the index
     +       expands in the middle of the operation. This safety valve is
     +       important before that loop starts. A later change will focus this
     +       to only expand if we indeed have a conflict outside of the
     +       sparse-checkout cone.
     +
     +    3. Other merge strategies are executed as a 'git merge-X' subcommand,
     +       and those strategies are currently protected with the
     +       'command_requires_full_index' guard.
      
          Some test updates are required, including a mistaken 'git checkout -b'
          that did not specify the base branch, causing merges to be fast-forward
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +	ensure_not_expanded add . &&
      +
      +	ensure_not_expanded checkout -f update-deep &&
     -+	ensure_not_expanded merge -s ort -m merge update-folder1 &&
     -+	ensure_not_expanded merge -s ort -m merge update-folder2
     ++	(
     ++		sane_unset GIT_TEST_MERGE_ALGORITHM &&
     ++		git -C sparse-index config pull.twohead ort &&
     ++		ensure_not_expanded merge -m merge update-folder1 &&
     ++		ensure_not_expanded merge -m merge update-folder2
     ++	)
       '
       
       # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 4:  e47b15554e3 ! 3:  815b1b1cfbf merge-ort: expand only for out-of-cone conflicts
     @@ Commit message
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       ## merge-ort.c ##
     -@@
     - #include "tree.h"
     - #include "unpack-trees.h"
     - #include "xdiff-interface.h"
     -+#include "dir.h"
     - 
     - /*
     -  * We have many arrays of size 3.  Whenever we have such an array, the
      @@ merge-ort.c: static int record_conflicted_index_entries(struct merge_options *opt)
       
       	/*
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is e
      +	if test "$1" = "!"
      +	then
      +		shift &&
     -+		(
     -+			GIT_TRACE2_EVENT="$(pwd)/trace2.txt" &&
     -+			GIT_TRACE2_EVENT_NESTING=10 &&
     -+			export GIT_TRACE2_EVENT &&
     -+			export GIT_TRACE2_EVENT_NESTING &&
     -+			test_must_fail git -C sparse-index "$@" || return 1
     -+		)
     ++		test_must_fail env \
     ++			GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
     ++			git -C sparse-index "$@" || return 1
      +	else
      +		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
      +			git -C sparse-index "$@" || return 1
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is e
       }
       
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded' '
     - 	ensure_not_expanded merge -s ort -m merge update-folder2
     + 	)
       '
       
      +test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +		git -C sparse-index commit -a -m "$side" || return 1
      +	done &&
      +
     -+	ensure_not_expanded ! merge -m merged expand-right
     ++	(
     ++		sane_unset GIT_TEST_MERGE_ALGORITHM &&
     ++		git -C sparse-index config pull.twohead ort &&
     ++		ensure_not_expanded ! merge -m merged expand-right
     ++	)
      +'
      +
       # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 5:  ca23bf38bd9 ! 4:  8032154bc8a t1092: add cherry-pick, rebase tests
     @@ Commit message
          t1092: add cherry-pick, rebase tests
      
          Add tests to check that cherry-pick and rebase behave the same in the
     -    sparse-index case as in the full index cases.
     +    sparse-index case as in the full index cases. These tests are agnostic
     +    to GIT_TEST_MERGE_ALGORITHM, so a full CI test suite will check both the
     +    'ort' and 'recursive' strategies on this test.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'checkout and rese
      -	test_all_match git rev-parse HEAD^{tree} &&
      -	test_all_match git merge -m "folder2" update-folder2 &&
      -	test_all_match git rev-parse HEAD^{tree}
     -+	for OPERATION in "merge -s ort -m merge" cherry-pick rebase
     ++	for OPERATION in "merge -m merge" cherry-pick rebase
      +	do
      +		test_all_match git checkout -B temp update-deep &&
      +		test_all_match git $OPERATION update-folder1 &&
 -:  ----------- > 5:  90ac85500b8 sequencer: ensure full index if not ORT strategy
 6:  350ed86a453 ! 6:  df4bbec744f sparse-index: integrate with cherry-pick and rebase
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'merge with confli
       	init_repos &&
       
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded' '
     - 	echo >>sparse-index/untracked.txt &&
     - 	ensure_not_expanded add . &&
     - 
     --	ensure_not_expanded checkout -f update-deep &&
     --	ensure_not_expanded merge -s ort -m merge update-folder1 &&
     --	ensure_not_expanded merge -s ort -m merge update-folder2
     -+	for OPERATION in "merge -s ort -m merge" cherry-pick rebase
     -+	do
     -+		ensure_not_expanded checkout -f -B temp update-deep &&
     -+		ensure_not_expanded $OPERATION update-folder1 &&
     -+		ensure_not_expanded $OPERATION update-folder2 || return 1
     -+	done
     + 	(
     + 		sane_unset GIT_TEST_MERGE_ALGORITHM &&
     + 		git -C sparse-index config pull.twohead ort &&
     +-		ensure_not_expanded merge -m merge update-folder1 &&
     +-		ensure_not_expanded merge -m merge update-folder2
     ++		for OPERATION in "merge -m merge" cherry-pick rebase
     ++		do
     ++			ensure_not_expanded merge -m merge update-folder1 &&
     ++			ensure_not_expanded merge -m merge update-folder2 || return 1
     ++		done
     + 	)
       '
       
     - test_expect_success 'sparse-index is not expanded: merge conflict in cone' '

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 1/6] diff: ignore sparse paths in diffstat
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52   ` Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete. In order to avoid expanding a sparse index,
the reuse_worktree_file() needs to be adapted to ignore files that are
outside of the sparse-checkout cone. The file names and OIDs used for
this check come from the merged tree in the case of the ORT strategy,
not the index, hence the ability to look into these paths without having
already expanded the index.

The work done by reuse_worktree_file() is only an optimization, and
requires the file being on disk for it to be of any value. Thus, it is
safe to exit the method early if we do not expect the file on disk.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 diff.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/diff.c b/diff.c
index a8113f17070..c8f530ffdbe 100644
--- a/diff.c
+++ b/diff.c
@@ -26,6 +26,7 @@
 #include "parse-options.h"
 #include "help.h"
 #include "promisor-remote.h"
+#include "dir.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3907,6 +3908,13 @@ static int reuse_worktree_file(struct index_state *istate,
 	if (!want_file && would_convert_to_git(istate, name))
 		return 0;
 
+	/*
+	 * If this path does not match our sparse-checkout definition,
+	 * then the file will not be in the working directory.
+	 */
+	if (!path_in_sparse_checkout(name, istate))
+		return 0;
+
 	len = strlen(name);
 	pos = index_name_pos(istate, name, len);
 	if (pos < 0)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 2/6] merge: make sparse-aware with ORT
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52   ` Derrick Stolee via GitGitGadget
  2021-08-27 22:43     ` Elijah Newren
  2021-08-24 21:52   ` [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:

1. If the merge strategy is 'recursive', then we enable
   command_requires_full_index at the start of the merge_recursive()
   method. We expect sparse-index users to also have the 'ort' strategy
   enabled.

2. With the 'ort' strategy, if the merge results in a conflicted file,
   then we expand the index before updating the working tree. The loop
   that iterates over the worktree replaces index entries and tracks
   'origintal_cache_nr' which can become completely wrong if the index
   expands in the middle of the operation. This safety valve is
   important before that loop starts. A later change will focus this
   to only expand if we indeed have a conflict outside of the
   sparse-checkout cone.

3. Other merge strategies are executed as a 'git merge-X' subcommand,
   and those strategies are currently protected with the
   'command_requires_full_index' guard.

Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/merge.c                          |  3 +++
 merge-ort.c                              |  8 ++++++++
 merge-recursive.c                        |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 12 ++++++++++--
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 22f23990b37..926de328fbb 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_merge_usage, builtin_merge_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/*
 	 * Check if we are _not_ on a detached HEAD, i.e. if there is a
 	 * current branch.
diff --git a/merge-ort.c b/merge-ort.c
index 6eb910d6f0c..8e754b769e1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4058,6 +4058,14 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 	if (strmap_empty(&opt->priv->conflicted))
 		return 0;
 
+	/*
+	 * We are in a conflicted state. These conflicts might be inside
+	 * sparse-directory entries, so expand the index preemtively.
+	 * Also, we set original_cache_nr below, but that might change if
+	 * index_name_pos() calls ask for paths within sparse directories.
+	 */
+	ensure_full_index(index);
+
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
 	state.quiet = 1;
diff --git a/merge-recursive.c b/merge-recursive.c
index 3355d50e8ad..1f563cd6874 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
 	assert(opt->ancestor == NULL ||
 	       !strcmp(opt->ancestor, "constructed merge base"));
 
+	prepare_repo_settings(opt->repo);
+	opt->repo->settings.command_requires_full_index = 1;
+
 	if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
 		return -1;
 	clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ddc86bb4152..dc56252865c 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -47,7 +47,7 @@ test_expect_success 'setup' '
 		git checkout -b base &&
 		for dir in folder1 folder2 deep
 		do
-			git checkout -b update-$dir &&
+			git checkout -b update-$dir base &&
 			echo "updated $dir" >$dir/a &&
 			git commit -a -m "update $dir" || return 1
 		done &&
@@ -647,7 +647,15 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/extra.txt &&
 	ensure_not_expanded add extra.txt &&
 	echo >>sparse-index/untracked.txt &&
-	ensure_not_expanded add .
+	ensure_not_expanded add . &&
+
+	ensure_not_expanded checkout -f update-deep &&
+	(
+		sane_unset GIT_TEST_MERGE_ALGORITHM &&
+		git -C sparse-index config pull.twohead ort &&
+		ensure_not_expanded merge -m merge update-folder1 &&
+		ensure_not_expanded merge -m merge update-folder2
+	)
 '
 
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52   ` Derrick Stolee via GitGitGadget
  2021-08-27 22:47     ` Elijah Newren
  2021-08-24 21:52   ` [PATCH v2 4/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Merge conflicts happen often enough to want to avoid expanding a sparse
index when they happen, as long as those conflicts are within the
sparse-checkout cone. If a conflict exists outside of the
sparse-checkout cone, then we still need to expand before iterating over
the index entries. This is critical to do in advance because of how the
original_cache_nr is tracked to allow inserting and replacing cache
entries.

Iterate over the conflicted files and check if any paths are outside of
the sparse-checkout cone. If so, then expand the full index.

Add a test that demonstrates that we do not expand the index, even when
we hit a conflict within the sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 merge-ort.c                              | 13 +++++++---
 t/t1092-sparse-checkout-compatibility.sh | 30 ++++++++++++++++++++++--
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 8e754b769e1..805f7c41397 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4060,11 +4060,18 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 
 	/*
 	 * We are in a conflicted state. These conflicts might be inside
-	 * sparse-directory entries, so expand the index preemtively.
-	 * Also, we set original_cache_nr below, but that might change if
+	 * sparse-directory entries, so check if any entries are outside
+	 * of the sparse-checkout cone preemptively.
+	 *
+	 * We set original_cache_nr below, but that might change if
 	 * index_name_pos() calls ask for paths within sparse directories.
 	 */
-	ensure_full_index(index);
+	strmap_for_each_entry(&opt->priv->conflicted, &iter, e) {
+		if (!path_in_sparse_checkout(e->key, index)) {
+			ensure_full_index(index);
+			break;
+		}
+	}
 
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index dc56252865c..38afdf689a2 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -617,8 +617,17 @@ test_expect_success 'sparse-index is expanded and converted back' '
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-		git -C sparse-index "$@" &&
+
+	if test "$1" = "!"
+	then
+		shift &&
+		test_must_fail env \
+			GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index "$@" || return 1
+	else
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index "$@" || return 1
+	fi &&
 	test_region ! index ensure_full_index trace2.txt
 }
 
@@ -658,6 +667,23 @@ test_expect_success 'sparse-index is not expanded' '
 	)
 '
 
+test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
+	init_repos &&
+
+	for side in right left
+	do
+		git -C sparse-index checkout -b expand-$side base &&
+		echo $side >sparse-index/deep/a &&
+		git -C sparse-index commit -a -m "$side" || return 1
+	done &&
+
+	(
+		sane_unset GIT_TEST_MERGE_ALGORITHM &&
+		git -C sparse-index config pull.twohead ort &&
+		ensure_not_expanded ! merge -m merged expand-right
+	)
+'
+
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 # in this scenario, but it shouldn't.
 test_expect_success 'reset mixed and checkout orphan' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 4/6] t1092: add cherry-pick, rebase tests
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-08-24 21:52   ` [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52   ` Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 5/6] sequencer: ensure full index if not ORT strategy Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add tests to check that cherry-pick and rebase behave the same in the
sparse-index case as in the full index cases. These tests are agnostic
to GIT_TEST_MERGE_ALGORITHM, so a full CI test suite will check both the
'ort' and 'recursive' strategies on this test.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 38afdf689a2..60d7400d014 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -481,14 +481,17 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 	test_sparse_match git reset update-folder2
 '
 
-test_expect_success 'merge' '
+test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	test_all_match git checkout -b merge update-deep &&
-	test_all_match git merge -m "folder1" update-folder1 &&
-	test_all_match git rev-parse HEAD^{tree} &&
-	test_all_match git merge -m "folder2" update-folder2 &&
-	test_all_match git rev-parse HEAD^{tree}
+	for OPERATION in "merge -m merge" cherry-pick rebase
+	do
+		test_all_match git checkout -B temp update-deep &&
+		test_all_match git $OPERATION update-folder1 &&
+		test_all_match git rev-parse HEAD^{tree} &&
+		test_all_match git $OPERATION update-folder2 &&
+		test_all_match git rev-parse HEAD^{tree} || return 1
+	done
 '
 
 # NEEDSWORK: This test is documenting current behavior, but that
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 5/6] sequencer: ensure full index if not ORT strategy
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-08-24 21:52   ` [PATCH v2 4/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52   ` Derrick Stolee via GitGitGadget
  2021-08-24 21:52   ` [PATCH v2 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The sequencer is used by 'cherry-pick' and 'rebase' to sequence a list
of operations that modify the index. Since we intend to remove the need
for 'command_requires_full_index', we need to ensure the sparse index is
expanded every time it is written to disk between these steps. That is,
unless the merge strategy is 'ort' where the index can remain sparse
throughout.

There are two main places to be extra careful about a full index:

1. Right before calling merge_trees(), ensure the index is full. This
   happens within an 'else' where the 'if' block checks if the 'ort'
   strategy is selected.

2. During read_and_refresh_cache(), the index might be written to disk
   and converted to sparse in the process. Ensure it expands back to
   full afterwards by checking if the strategy is _not_ 'ort'. This
   'if' statement is the logical negation of the 'if' in item (1).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sequencer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 7f07cd00f3f..228bc089d22 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -652,6 +652,7 @@ static int do_recursive_merge(struct repository *r,
 		merge_switch_to_result(&o, head_tree, &result, 1, show_output);
 		clean = result.clean;
 	} else {
+		ensure_full_index(r->index);
 		clean = merge_trees(&o, head_tree, next_tree, base_tree);
 		if (is_rebase_i(opts) && clean <= 0)
 			fputs(o.obuf.buf, stdout);
@@ -2346,6 +2347,7 @@ static int read_and_refresh_cache(struct repository *r,
 			_(action_name(opts)));
 	}
 	refresh_index(r->index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, NULL);
+
 	if (index_fd >= 0) {
 		if (write_locked_index(r->index, &index_lock,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED)) {
@@ -2353,6 +2355,13 @@ static int read_and_refresh_cache(struct repository *r,
 				_(action_name(opts)));
 		}
 	}
+
+	/*
+	 * If we are resolving merges in any way other than "ort", then
+	 * expand the sparse index.
+	 */
+	if (opts->strategy && strcmp(opts->strategy, "ort"))
+		ensure_full_index(r->index);
 	return 0;
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 6/6] sparse-index: integrate with cherry-pick and rebase
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-08-24 21:52   ` [PATCH v2 5/6] sequencer: ensure full index if not ORT strategy Derrick Stolee via GitGitGadget
@ 2021-08-24 21:52   ` Derrick Stolee via GitGitGadget
  2021-08-27 22:56   ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
  7 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-08-24 21:52 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The hard work was already done with 'git merge' and the ORT strategy.
Just add extra tests to see that we get the expected results in the
non-conflict cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/rebase.c                         |  6 ++++
 builtin/revert.c                         |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 39 ++++++++++++++++++++++--
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 33e09619005..27433d7c5a2 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -559,6 +559,9 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options,
 			builtin_rebase_interactive_usage, PARSE_OPT_KEEP_ARGV0);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	if (!is_null_oid(&squash_onto))
 		opts.squash_onto = &squash_onto;
 
@@ -1430,6 +1433,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		usage_with_options(builtin_rebase_usage,
 				   builtin_rebase_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	options.allow_empty_message = 1;
 	git_config(rebase_config, &options);
 	/* options.gpg_sign_opt will be either "-S" or NULL */
diff --git a/builtin/revert.c b/builtin/revert.c
index 237f2f18d4c..6c4c22691bd 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -136,6 +136,9 @@ static int run_sequencer(int argc, const char **argv, struct replay_opts *opts)
 			PARSE_OPT_KEEP_ARGV0 |
 			PARSE_OPT_KEEP_UNKNOWN);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/* implies allow_empty */
 	if (opts->keep_redundant_commits)
 		opts->allow_empty = 1;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 60d7400d014..bbc6de712c4 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -527,6 +527,38 @@ test_expect_success 'merge with conflict outside cone' '
 	test_all_match git rev-parse HEAD^{tree}
 '
 
+test_expect_success 'cherry-pick/rebase with conflict outside cone' '
+	init_repos &&
+
+	for OPERATION in cherry-pick rebase
+	do
+		test_all_match git checkout -B tip &&
+		test_all_match git reset --hard merge-left &&
+		test_all_match git status --porcelain=v2 &&
+		test_all_match test_must_fail git $OPERATION merge-right &&
+		test_all_match git status --porcelain=v2 &&
+
+		# Resolve the conflict in different ways:
+		# 1. Revert to the base
+		test_all_match git checkout base -- deep/deeper2/a &&
+		test_all_match git status --porcelain=v2 &&
+
+		# 2. Add the file with conflict markers
+		test_all_match git add folder1/a &&
+		test_all_match git status --porcelain=v2 &&
+
+		# 3. Rename the file to another sparse filename and
+		#    accept conflict markers as resolved content.
+		run_on_all mv folder2/a folder2/z &&
+		test_all_match git add folder2 &&
+		test_all_match git status --porcelain=v2 &&
+
+		test_all_match git $OPERATION --continue &&
+		test_all_match git status --porcelain=v2 &&
+		test_all_match git rev-parse HEAD^{tree} || return 1
+	done
+'
+
 test_expect_success 'merge with outside renames' '
 	init_repos &&
 
@@ -665,8 +697,11 @@ test_expect_success 'sparse-index is not expanded' '
 	(
 		sane_unset GIT_TEST_MERGE_ALGORITHM &&
 		git -C sparse-index config pull.twohead ort &&
-		ensure_not_expanded merge -m merge update-folder1 &&
-		ensure_not_expanded merge -m merge update-folder2
+		for OPERATION in "merge -m merge" cherry-pick rebase
+		do
+			ensure_not_expanded merge -m merge update-folder1 &&
+			ensure_not_expanded merge -m merge update-folder2 || return 1
+		done
 	)
 '
 
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/6] diff: ignore sparse paths in diffstat
  2021-08-24 18:30     ` Derrick Stolee
@ 2021-08-27 22:27       ` Elijah Newren
  0 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-08-27 22:27 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, Git Mailing List,
	Junio C Hamano, Derrick Stolee, Derrick Stolee

On Tue, Aug 24, 2021 at 11:30 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 8/20/2021 5:32 PM, Elijah Newren wrote:
> > On Tue, Aug 17, 2021 at 10:08 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Derrick Stolee <dstolee@microsoft.com>
> >>
> >> The diff_populate_filespec() method is used to describe the diff after a
> >> merge operation is complete, especially when a conflict appears. In
> >> order to avoid expanding a sparse index, the reuse_worktree_file() needs
> >> to be adapted to ignore files that are outside of the sparse-checkout
> >> cone. The file names and OIDs used for this check come from the merged
> >> tree in the case of the ORT strategy, not the index, hence the ability
> >> to look into these paths without having already expanded the index.
> >
> > I'm confused; I thought the diffstat was only shown if the merge was
> > successful, in which case there would be no conflicts appearing.
>
> That's my mistake. I'll edit the message accordingly.
>
> > Also, I'm not that familiar with the general diff machinery (just the
> > rename detection parts), but...if the diffstat only shows when the
> > merge is successful, wouldn't we be comparing two REVS (ORIG_HEAD to
> > HEAD)?  Why would we make use of the working tree at all in such a
> > case?  And, wouldn't using the working tree be dangerous...what if
> > there was a merge performed with a dirty working tree?
> >
> > On a bit of a tangent, I know diffcore-rename.c calls into
> > diff_populate_filespec() as well, and I have some code doing merges in
> > a bare repository (where there obviously is no index).  It seemed to
> > be working, but given this commit message, now I'm wondering if I've
> > missed something fundamental either in that implementation or there's
> > something amiss in this patch.  Or both.  Maybe I need to dig into
> > diff_populate_filespec() more, but it seems you already have.  Any
> > pointers to orient me on why your changes are right here (and, if you
> > know, why diffcore-rename.c should or shouldn't be using
> > diff_populate_filespec() in a bare repo)?
>
> I think the cases you are thinking about are covered by this
> condition before the one I'm inserting:
>
>         /* We want to avoid the working directory if our caller
>          * doesn't need the data in a normal file, this system
>          * is rather slow with its stat/open/mmap/close syscalls,
>          * and the object is contained in a pack file.  The pack
>          * is probably already open and will be faster to obtain
>          * the data through than the working directory.  Loose
>          * objects however would tend to be slower as they need
>          * to be individually opened and inflated.
>          */
>         if (!FAST_WORKING_DIRECTORY && !want_file && has_object_pack(oid))
>                 return 0;
>
> or after:
>
>         /*
>          * Similarly, if we'd have to convert the file contents anyway, that
>          * makes the optimization not worthwhile.
>          */
>         if (!want_file && would_convert_to_git(istate, name))
>                 return 0;
>
> (This makes me think that I should move my new condition further down
> so these two can be linked by context.)
>
> Sounds like this is just an optimization, so it is fine to opt out of it
> if we think the optimization isn't necessary. Outside of the sparse-checkout
> cone qualifies.

Ah, this is very helpful.  Thanks for digging up these details.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/6] merge: make sparse-aware with ORT
  2021-08-24 21:52   ` [PATCH v2 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
@ 2021-08-27 22:43     ` Elijah Newren
  2021-08-30 17:18       ` Derrick Stolee
  0 siblings, 1 reply; 45+ messages in thread
From: Elijah Newren @ 2021-08-27 22:43 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Taylor Blau,
	Derrick Stolee, Derrick Stolee

On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Allow 'git merge' to operate without expanding a sparse index, at least
> not immediately. The index still will be expanded in a few cases:
>
> 1. If the merge strategy is 'recursive', then we enable
>    command_requires_full_index at the start of the merge_recursive()
>    method. We expect sparse-index users to also have the 'ort' strategy
>    enabled.
>
> 2. With the 'ort' strategy, if the merge results in a conflicted file,
>    then we expand the index before updating the working tree. The loop
>    that iterates over the worktree replaces index entries and tracks
>    'origintal_cache_nr' which can become completely wrong if the index
>    expands in the middle of the operation. This safety valve is
>    important before that loop starts. A later change will focus this
>    to only expand if we indeed have a conflict outside of the
>    sparse-checkout cone.
>
> 3. Other merge strategies are executed as a 'git merge-X' subcommand,
>    and those strategies are currently protected with the
>    'command_requires_full_index' guard.

Oh, indeed, it appears ag/merge-strategies-in-c didn't complete but
was discarded, as per the July 6 "What's cooking in git.git" email.
Well, that certainly makes things easier for you; thanks for
mentioning them in this item #3.

>
> Some test updates are required, including a mistaken 'git checkout -b'
> that did not specify the base branch, causing merges to be fast-forward
> merges.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/merge.c                          |  3 +++
>  merge-ort.c                              |  8 ++++++++
>  merge-recursive.c                        |  3 +++
>  t/t1092-sparse-checkout-compatibility.sh | 12 ++++++++++--
>  4 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 22f23990b37..926de328fbb 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>         if (argc == 2 && !strcmp(argv[1], "-h"))
>                 usage_with_options(builtin_merge_usage, builtin_merge_options);
>
> +       prepare_repo_settings(the_repository);
> +       the_repository->settings.command_requires_full_index = 0;
> +
>         /*
>          * Check if we are _not_ on a detached HEAD, i.e. if there is a
>          * current branch.
> diff --git a/merge-ort.c b/merge-ort.c
> index 6eb910d6f0c..8e754b769e1 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4058,6 +4058,14 @@ static int record_conflicted_index_entries(struct merge_options *opt)
>         if (strmap_empty(&opt->priv->conflicted))
>                 return 0;
>
> +       /*
> +        * We are in a conflicted state. These conflicts might be inside
> +        * sparse-directory entries, so expand the index preemtively.

Same typo I pointed out in v1.

> +        * Also, we set original_cache_nr below, but that might change if
> +        * index_name_pos() calls ask for paths within sparse directories.
> +        */
> +       ensure_full_index(index);
> +
>         /* If any entries have skip_worktree set, we'll have to check 'em out */
>         state.force = 1;
>         state.quiet = 1;
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 3355d50e8ad..1f563cd6874 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
>         assert(opt->ancestor == NULL ||
>                !strcmp(opt->ancestor, "constructed merge base"));
>
> +       prepare_repo_settings(opt->repo);
> +       opt->repo->settings.command_requires_full_index = 1;
> +
>         if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
>                 return -1;
>         clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index ddc86bb4152..dc56252865c 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -47,7 +47,7 @@ test_expect_success 'setup' '
>                 git checkout -b base &&
>                 for dir in folder1 folder2 deep
>                 do
> -                       git checkout -b update-$dir &&
> +                       git checkout -b update-$dir base &&
>                         echo "updated $dir" >$dir/a &&
>                         git commit -a -m "update $dir" || return 1
>                 done &&
> @@ -647,7 +647,15 @@ test_expect_success 'sparse-index is not expanded' '
>         echo >>sparse-index/extra.txt &&
>         ensure_not_expanded add extra.txt &&
>         echo >>sparse-index/untracked.txt &&
> -       ensure_not_expanded add .
> +       ensure_not_expanded add . &&
> +
> +       ensure_not_expanded checkout -f update-deep &&
> +       (
> +               sane_unset GIT_TEST_MERGE_ALGORITHM &&
> +               git -C sparse-index config pull.twohead ort &&
> +               ensure_not_expanded merge -m merge update-folder1 &&
> +               ensure_not_expanded merge -m merge update-folder2
> +       )
>  '

Should you use test_config rather than git config here?

More importantly, why the subshell and unsetting of
GIT_TEST_MERGE_ALGORITHM and the special worrying about pull.twohead?
Wouldn't it be simpler to just set GIT_TEST_MERGE_ALGORITHM=ort,
perhaps at the beginning of the file?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts
  2021-08-24 21:52   ` [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
@ 2021-08-27 22:47     ` Elijah Newren
  2021-08-30 17:21       ` Derrick Stolee
  0 siblings, 1 reply; 45+ messages in thread
From: Elijah Newren @ 2021-08-27 22:47 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Taylor Blau,
	Derrick Stolee, Derrick Stolee

On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Merge conflicts happen often enough to want to avoid expanding a sparse
> index when they happen, as long as those conflicts are within the
> sparse-checkout cone. If a conflict exists outside of the
> sparse-checkout cone, then we still need to expand before iterating over
> the index entries. This is critical to do in advance because of how the
> original_cache_nr is tracked to allow inserting and replacing cache
> entries.
>
> Iterate over the conflicted files and check if any paths are outside of
> the sparse-checkout cone. If so, then expand the full index.
>
> Add a test that demonstrates that we do not expand the index, even when
> we hit a conflict within the sparse-checkout cone.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  merge-ort.c                              | 13 +++++++---
>  t/t1092-sparse-checkout-compatibility.sh | 30 ++++++++++++++++++++++--
>  2 files changed, 38 insertions(+), 5 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 8e754b769e1..805f7c41397 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4060,11 +4060,18 @@ static int record_conflicted_index_entries(struct merge_options *opt)
>
>         /*
>          * We are in a conflicted state. These conflicts might be inside
> -        * sparse-directory entries, so expand the index preemtively.
> -        * Also, we set original_cache_nr below, but that might change if
> +        * sparse-directory entries, so check if any entries are outside
> +        * of the sparse-checkout cone preemptively.
> +        *
> +        * We set original_cache_nr below, but that might change if
>          * index_name_pos() calls ask for paths within sparse directories.
>          */
> -       ensure_full_index(index);
> +       strmap_for_each_entry(&opt->priv->conflicted, &iter, e) {
> +               if (!path_in_sparse_checkout(e->key, index)) {
> +                       ensure_full_index(index);
> +                       break;
> +               }
> +       }
>
>         /* If any entries have skip_worktree set, we'll have to check 'em out */
>         state.force = 1;
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index dc56252865c..38afdf689a2 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -617,8 +617,17 @@ test_expect_success 'sparse-index is expanded and converted back' '
>  ensure_not_expanded () {
>         rm -f trace2.txt &&
>         echo >>sparse-index/untracked.txt &&
> -       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> -               git -C sparse-index "$@" &&
> +
> +       if test "$1" = "!"
> +       then
> +               shift &&
> +               test_must_fail env \
> +                       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> +                       git -C sparse-index "$@" || return 1
> +       else
> +               GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> +                       git -C sparse-index "$@" || return 1
> +       fi &&
>         test_region ! index ensure_full_index trace2.txt
>  }
>
> @@ -658,6 +667,23 @@ test_expect_success 'sparse-index is not expanded' '
>         )
>  '
>
> +test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
> +       init_repos &&
> +
> +       for side in right left
> +       do
> +               git -C sparse-index checkout -b expand-$side base &&
> +               echo $side >sparse-index/deep/a &&
> +               git -C sparse-index commit -a -m "$side" || return 1
> +       done &&
> +
> +       (
> +               sane_unset GIT_TEST_MERGE_ALGORITHM &&
> +               git -C sparse-index config pull.twohead ort &&
> +               ensure_not_expanded ! merge -m merged expand-right
> +       )

These last five lines could just be replaced with the fourth, if you
just set GIT_TEST_MERGE_ALGORITHM=ort at the beginning of the file.

Are you worrying about testing with recursive in some of the testcases?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                     ` (5 preceding siblings ...)
  2021-08-24 21:52   ` [PATCH v2 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
@ 2021-08-27 22:56   ` Elijah Newren
  2021-08-30 17:26     ` Derrick Stolee
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
  7 siblings, 1 reply; 45+ messages in thread
From: Elijah Newren @ 2021-08-27 22:56 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Taylor Blau,
	Derrick Stolee

On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series integrates the sparse index with commands that perform merges
> such as 'git merge', 'git cherry-pick', 'git revert' (free with
> cherry-pick), and 'git rebase'.
>
> When the ORT merge strategy is enabled, this allows most merges to succeed
> without expanding the sparse index, leading to significant performance
> gains. I tested these changes against an internal monorepo with over 2
> million paths at HEAD but with a sparse-checkout that only has ~60,000 files
> within the sparse-checkout cone. 'git merge' commands went from 5-6 seconds
> to 0.750-1.250s.
>
> In the case of the recursive merge strategy, the sparse index is expanded
> before the recursive algorithm proceeds. We expect that this is as good as
> we can get with that strategy. When the strategy shifts to ORT as the
> default, then this will not be a problem except for users who decide to
> change the behavior.
>
> Most of the hard work was done by previous series, such as
> ds/sparse-index-ignored-files (which this series is based on).
>
>
> Updates in V2
> =============
>
>  * The tests no longer specify GIT_TEST_MERGE_ALGORITHM or directly
>    reference "-s ort". By relaxing this condition, I found an issue with
>    'git cherry-pick' and 'git rebase' when using the 'recursive' algorithm
>    which is fixed in a new patch.
>
>  * Use the pul.twohead config to specify the ORT merge algorithm to avoid
>    expanding the sparse index when that is what we are testing.

pull.twohead, not pul.twohead.

I'm curious, though, why use it instead of just setting
GIT_TEST_MERGE_ALGORITHM=ort?  That'd seem to avoid the need for the
explicit subshells and the sane_unset calls.

>
>  * Corrected some misstatements in my commit messages.

I read over v2.  Other than some minor questions about whether using
GIT_TEST_MERGE_ALGORITHM=ort would be easier, and a typo still present
from v1, the series looks good to me.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/6] merge: make sparse-aware with ORT
  2021-08-27 22:43     ` Elijah Newren
@ 2021-08-30 17:18       ` Derrick Stolee
  2021-09-08  1:49         ` Derrick Stolee
  0 siblings, 1 reply; 45+ messages in thread
From: Derrick Stolee @ 2021-08-30 17:18 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Derrick Stolee,
	Derrick Stolee

On 8/27/2021 6:43 PM, Elijah Newren wrote:
> On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
...
>> +       /*
>> +        * We are in a conflicted state. These conflicts might be inside
>> +        * sparse-directory entries, so expand the index preemtively.
> 
> Same typo I pointed out in v1.

Sorry, this comment is edited in a later patch and I fixed it there. Will
fix here, too.

>> +       ensure_not_expanded checkout -f update-deep &&
>> +       (
>> +               sane_unset GIT_TEST_MERGE_ALGORITHM &&
>> +               git -C sparse-index config pull.twohead ort &&
>> +               ensure_not_expanded merge -m merge update-folder1 &&
>> +               ensure_not_expanded merge -m merge update-folder2
>> +       )
>>  '
> 
> Should you use test_config rather than git config here?

That's a better pattern. It's not technically _required_ for these
tests because the repositories are completely rewritten at the start
of each new test, but it's best to be a good example.

> More importantly, why the subshell and unsetting of
> GIT_TEST_MERGE_ALGORITHM and the special worrying about pull.twohead?
> Wouldn't it be simpler to just set GIT_TEST_MERGE_ALGORITHM=ort,
> perhaps at the beginning of the file?
 
I don't set GIT_TEST_MERGE_ALGORITHM at the beginning of the file so
the rest of the tests are covered with both 'ort' and 'recursive' in
the CI test suite.

Using the config more carefully matches how I expect the 'ort'
strategy to be specified in practice (very temporarily, as it will
soon be the default).

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts
  2021-08-27 22:47     ` Elijah Newren
@ 2021-08-30 17:21       ` Derrick Stolee
  0 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee @ 2021-08-30 17:21 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Derrick Stolee,
	Derrick Stolee

On 8/27/2021 6:47 PM, Elijah Newren wrote:
> On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
...
>> +       (
>> +               sane_unset GIT_TEST_MERGE_ALGORITHM &&
>> +               git -C sparse-index config pull.twohead ort &&
>> +               ensure_not_expanded ! merge -m merged expand-right
>> +       )
> 
> These last five lines could just be replaced with the fourth, if you
> just set GIT_TEST_MERGE_ALGORITHM=ort at the beginning of the file.
> 
> Are you worrying about testing with recursive in some of the testcases?

Yes. In fact, since I _stopped_ setting GIT_TEST_MERGE_ALGORITHM at the
start of the file since the previous version, I found and fixed a bug
which forms the new Patch 5 (sequencer: ensure full index if not ORT
strategy).

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert
  2021-08-27 22:56   ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren
@ 2021-08-30 17:26     ` Derrick Stolee
  0 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee @ 2021-08-30 17:26 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Derrick Stolee

On 8/27/2021 6:56 PM, Elijah Newren wrote:
> On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
...
>> Updates in V2
>> =============
>>
>>  * The tests no longer specify GIT_TEST_MERGE_ALGORITHM or directly
>>    reference "-s ort". By relaxing this condition, I found an issue with
>>    'git cherry-pick' and 'git rebase' when using the 'recursive' algorithm
>>    which is fixed in a new patch.

This describes why the tests no longer use GIT_TEST_MERGE_ALGORITHM at
the top. It improves coverage in case users opt-out of ORT. Instead,

>>  * Use the pul.twohead config to specify the ORT merge algorithm to avoid
>>    expanding the sparse index when that is what we are testing.
> 
> pull.twohead, not pul.twohead.

We use this config option to specify when we _really care_ about the ORT
strategy as it is necessary to avoid expanding the full index.

> I'm curious, though, why use it instead of just setting
> GIT_TEST_MERGE_ALGORITHM=ort?  That'd seem to avoid the need for the
> explicit subshells and the sane_unset calls.
> 
>>
>>  * Corrected some misstatements in my commit messages.
> 
> I read over v2.  Other than some minor questions about whether using
> GIT_TEST_MERGE_ALGORITHM=ort would be easier, and a typo still present
> from v1, the series looks good to me.

Sorry about the typo, as I fixed it in Patch 3 but not Patch 2. I will
fix these in v3 after I send a v5 of the ignored files series.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/6] merge: make sparse-aware with ORT
  2021-08-30 17:18       ` Derrick Stolee
@ 2021-09-08  1:49         ` Derrick Stolee
  0 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee @ 2021-09-08  1:49 UTC (permalink / raw)
  To: Elijah Newren, Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Derrick Stolee,
	Derrick Stolee

On 8/30/2021 1:18 PM, Derrick Stolee wrote:
> On 8/27/2021 6:43 PM, Elijah Newren wrote:
>> On Tue, Aug 24, 2021 at 2:52 PM Derrick Stolee via GitGitGadget
>>> +       ensure_not_expanded checkout -f update-deep &&
>>> +       (
>>> +               sane_unset GIT_TEST_MERGE_ALGORITHM &&
>>> +               git -C sparse-index config pull.twohead ort &&
>>> +               ensure_not_expanded merge -m merge update-folder1 &&
>>> +               ensure_not_expanded merge -m merge update-folder2
>>> +       )
>>>  '
>>
>> Should you use test_config rather than git config here?
> 
> That's a better pattern. It's not technically _required_ for these
> tests because the repositories are completely rewritten at the start
> of each new test, but it's best to be a good example.

Actually, test_config runs test_when_finished, and that results in
the following message and failure on macOS and Windows:

	BUG 'test_when_finished does nothing in a subshell'

So, I'll leave this as-is.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v3 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert
  2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
                     ` (6 preceding siblings ...)
  2021-08-27 22:56   ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren
@ 2021-09-08 11:23   ` Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
                       ` (6 more replies)
  7 siblings, 7 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:23 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee

This series integrates the sparse index with commands that perform merges
such as 'git merge', 'git cherry-pick', 'git revert' (free with
cherry-pick), and 'git rebase'.

When the ORT merge strategy is enabled, this allows most merges to succeed
without expanding the sparse index, leading to significant performance
gains. I tested these changes against an internal monorepo with over 2
million paths at HEAD but with a sparse-checkout that only has ~60,000 files
within the sparse-checkout cone. 'git merge' commands went from 5-6 seconds
to 0.750-1.250s.

In the case of the recursive merge strategy, the sparse index is expanded
before the recursive algorithm proceeds. We expect that this is as good as
we can get with that strategy. When the strategy shifts to ORT as the
default, then this will not be a problem except for users who decide to
change the behavior.

Most of the hard work was done by previous series, such as
ds/sparse-index-ignored-files (which this series is based on).


Updates in V3
=============

 * Fixed a typo in patch 2 (it is then moved in patch 3, affecting the
   range-diff)

 * There was a recommendation to use test_config over git config, but that
   is not possible in a subshell. So, it got moved outside of the subshell
   and that works just fine.

 * The other comments were about the use of GIT_TEST_MERGE_ALGORITHM in the
   test script, but the tests that isolate that environment variable are
   only for the 'ensure_not_expanded' tests, not the rest of the tests that
   already exist and are beneficial to cover the 'recursive' mode.


Updates in V2
=============

 * The tests no longer specify GIT_TEST_MERGE_ALGORITHM or directly
   reference "-s ort". By relaxing this condition, I found an issue with
   'git cherry-pick' and 'git rebase' when using the 'recursive' algorithm
   which is fixed in a new patch.

 * Use the pull.twohead config to specify the ORT merge algorithm to avoid
   expanding the sparse index when that is what we are testing.

 * Corrected some misstatements in my commit messages.

Thanks, -Stolee

Derrick Stolee (6):
  diff: ignore sparse paths in diffstat
  merge: make sparse-aware with ORT
  merge-ort: expand only for out-of-cone conflicts
  t1092: add cherry-pick, rebase tests
  sequencer: ensure full index if not ORT strategy
  sparse-index: integrate with cherry-pick and rebase

 builtin/merge.c                          |  3 +
 builtin/rebase.c                         |  6 ++
 builtin/revert.c                         |  3 +
 diff.c                                   |  8 +++
 merge-ort.c                              | 15 ++++
 merge-recursive.c                        |  3 +
 sequencer.c                              |  9 +++
 t/t1092-sparse-checkout-compatibility.sh | 92 +++++++++++++++++++++---
 8 files changed, 129 insertions(+), 10 deletions(-)


base-commit: 91b53f20109fe55635b1815f87afd5d5da68a182
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1019%2Fderrickstolee%2Fsparse-index%2Fmerge-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1019/derrickstolee/sparse-index/merge-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1019

Range-diff vs v2:

 1:  c5ae705648c = 1:  a6963182fe0 diff: ignore sparse paths in diffstat
 2:  bb150483bcf ! 2:  141f7fb26d6 merge: make sparse-aware with ORT
     @@ merge-ort.c: static int record_conflicted_index_entries(struct merge_options *op
       
      +	/*
      +	 * We are in a conflicted state. These conflicts might be inside
     -+	 * sparse-directory entries, so expand the index preemtively.
     ++	 * sparse-directory entries, so expand the index preemptively.
      +	 * Also, we set original_cache_nr below, but that might change if
      +	 * index_name_pos() calls ask for paths within sparse directories.
      +	 */
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
      +	ensure_not_expanded add . &&
      +
      +	ensure_not_expanded checkout -f update-deep &&
     ++	test_config -C sparse-index pull.twohead ort &&
      +	(
      +		sane_unset GIT_TEST_MERGE_ALGORITHM &&
     -+		git -C sparse-index config pull.twohead ort &&
      +		ensure_not_expanded merge -m merge update-folder1 &&
      +		ensure_not_expanded merge -m merge update-folder2
      +	)
 3:  815b1b1cfbf ! 3:  c3c9ffd855c merge-ort: expand only for out-of-cone conflicts
     @@ merge-ort.c: static int record_conflicted_index_entries(struct merge_options *op
       
       	/*
       	 * We are in a conflicted state. These conflicts might be inside
     --	 * sparse-directory entries, so expand the index preemtively.
     +-	 * sparse-directory entries, so expand the index preemptively.
      -	 * Also, we set original_cache_nr below, but that might change if
      +	 * sparse-directory entries, so check if any entries are outside
      +	 * of the sparse-checkout cone preemptively.
 4:  8032154bc8a = 4:  7aae5727fb7 t1092: add cherry-pick, rebase tests
 5:  90ac85500b8 = 5:  20f5bbae546 sequencer: ensure full index if not ORT strategy
 6:  df4bbec744f ! 6:  36cecb22330 sparse-index: integrate with cherry-pick and rebase
     @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'merge with confli
       	init_repos &&
       
      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded' '
     + 	test_config -C sparse-index pull.twohead ort &&
       	(
       		sane_unset GIT_TEST_MERGE_ALGORITHM &&
     - 		git -C sparse-index config pull.twohead ort &&
      -		ensure_not_expanded merge -m merge update-folder1 &&
      -		ensure_not_expanded merge -m merge update-folder2
      +		for OPERATION in "merge -m merge" cherry-pick rebase

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v3 1/6] diff: ignore sparse paths in diffstat
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
@ 2021-09-08 11:23     ` Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:23 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The diff_populate_filespec() method is used to describe the diff after a
merge operation is complete. In order to avoid expanding a sparse index,
the reuse_worktree_file() needs to be adapted to ignore files that are
outside of the sparse-checkout cone. The file names and OIDs used for
this check come from the merged tree in the case of the ORT strategy,
not the index, hence the ability to look into these paths without having
already expanded the index.

The work done by reuse_worktree_file() is only an optimization, and
requires the file being on disk for it to be of any value. Thus, it is
safe to exit the method early if we do not expect the file on disk.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 diff.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/diff.c b/diff.c
index a8113f17070..c8f530ffdbe 100644
--- a/diff.c
+++ b/diff.c
@@ -26,6 +26,7 @@
 #include "parse-options.h"
 #include "help.h"
 #include "promisor-remote.h"
+#include "dir.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3907,6 +3908,13 @@ static int reuse_worktree_file(struct index_state *istate,
 	if (!want_file && would_convert_to_git(istate, name))
 		return 0;
 
+	/*
+	 * If this path does not match our sparse-checkout definition,
+	 * then the file will not be in the working directory.
+	 */
+	if (!path_in_sparse_checkout(name, istate))
+		return 0;
+
 	len = strlen(name);
 	pos = index_name_pos(istate, name, len);
 	if (pos < 0)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 2/6] merge: make sparse-aware with ORT
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
@ 2021-09-08 11:23     ` Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:23 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Allow 'git merge' to operate without expanding a sparse index, at least
not immediately. The index still will be expanded in a few cases:

1. If the merge strategy is 'recursive', then we enable
   command_requires_full_index at the start of the merge_recursive()
   method. We expect sparse-index users to also have the 'ort' strategy
   enabled.

2. With the 'ort' strategy, if the merge results in a conflicted file,
   then we expand the index before updating the working tree. The loop
   that iterates over the worktree replaces index entries and tracks
   'origintal_cache_nr' which can become completely wrong if the index
   expands in the middle of the operation. This safety valve is
   important before that loop starts. A later change will focus this
   to only expand if we indeed have a conflict outside of the
   sparse-checkout cone.

3. Other merge strategies are executed as a 'git merge-X' subcommand,
   and those strategies are currently protected with the
   'command_requires_full_index' guard.

Some test updates are required, including a mistaken 'git checkout -b'
that did not specify the base branch, causing merges to be fast-forward
merges.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/merge.c                          |  3 +++
 merge-ort.c                              |  8 ++++++++
 merge-recursive.c                        |  3 +++
 t/t1092-sparse-checkout-compatibility.sh | 12 ++++++++++--
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 22f23990b37..926de328fbb 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1276,6 +1276,9 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_merge_usage, builtin_merge_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/*
 	 * Check if we are _not_ on a detached HEAD, i.e. if there is a
 	 * current branch.
diff --git a/merge-ort.c b/merge-ort.c
index 6eb910d6f0c..1531b4c94c2 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4058,6 +4058,14 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 	if (strmap_empty(&opt->priv->conflicted))
 		return 0;
 
+	/*
+	 * We are in a conflicted state. These conflicts might be inside
+	 * sparse-directory entries, so expand the index preemptively.
+	 * Also, we set original_cache_nr below, but that might change if
+	 * index_name_pos() calls ask for paths within sparse directories.
+	 */
+	ensure_full_index(index);
+
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
 	state.quiet = 1;
diff --git a/merge-recursive.c b/merge-recursive.c
index 3355d50e8ad..1f563cd6874 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3750,6 +3750,9 @@ int merge_recursive(struct merge_options *opt,
 	assert(opt->ancestor == NULL ||
 	       !strcmp(opt->ancestor, "constructed merge base"));
 
+	prepare_repo_settings(opt->repo);
+	opt->repo->settings.command_requires_full_index = 1;
+
 	if (merge_start(opt, repo_get_commit_tree(opt->repo, h1)))
 		return -1;
 	clean = merge_recursive_internal(opt, h1, h2, merge_bases, result);
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index ddc86bb4152..3b331a6bfe7 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -47,7 +47,7 @@ test_expect_success 'setup' '
 		git checkout -b base &&
 		for dir in folder1 folder2 deep
 		do
-			git checkout -b update-$dir &&
+			git checkout -b update-$dir base &&
 			echo "updated $dir" >$dir/a &&
 			git commit -a -m "update $dir" || return 1
 		done &&
@@ -647,7 +647,15 @@ test_expect_success 'sparse-index is not expanded' '
 	echo >>sparse-index/extra.txt &&
 	ensure_not_expanded add extra.txt &&
 	echo >>sparse-index/untracked.txt &&
-	ensure_not_expanded add .
+	ensure_not_expanded add . &&
+
+	ensure_not_expanded checkout -f update-deep &&
+	test_config -C sparse-index pull.twohead ort &&
+	(
+		sane_unset GIT_TEST_MERGE_ALGORITHM &&
+		ensure_not_expanded merge -m merge update-folder1 &&
+		ensure_not_expanded merge -m merge update-folder2
+	)
 '
 
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 3/6] merge-ort: expand only for out-of-cone conflicts
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
@ 2021-09-08 11:23     ` Derrick Stolee via GitGitGadget
  2021-09-08 11:23     ` [PATCH v3 4/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:23 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Merge conflicts happen often enough to want to avoid expanding a sparse
index when they happen, as long as those conflicts are within the
sparse-checkout cone. If a conflict exists outside of the
sparse-checkout cone, then we still need to expand before iterating over
the index entries. This is critical to do in advance because of how the
original_cache_nr is tracked to allow inserting and replacing cache
entries.

Iterate over the conflicted files and check if any paths are outside of
the sparse-checkout cone. If so, then expand the full index.

Add a test that demonstrates that we do not expand the index, even when
we hit a conflict within the sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 merge-ort.c                              | 13 +++++++---
 t/t1092-sparse-checkout-compatibility.sh | 30 ++++++++++++++++++++++--
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 1531b4c94c2..805f7c41397 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4060,11 +4060,18 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 
 	/*
 	 * We are in a conflicted state. These conflicts might be inside
-	 * sparse-directory entries, so expand the index preemptively.
-	 * Also, we set original_cache_nr below, but that might change if
+	 * sparse-directory entries, so check if any entries are outside
+	 * of the sparse-checkout cone preemptively.
+	 *
+	 * We set original_cache_nr below, but that might change if
 	 * index_name_pos() calls ask for paths within sparse directories.
 	 */
-	ensure_full_index(index);
+	strmap_for_each_entry(&opt->priv->conflicted, &iter, e) {
+		if (!path_in_sparse_checkout(e->key, index)) {
+			ensure_full_index(index);
+			break;
+		}
+	}
 
 	/* If any entries have skip_worktree set, we'll have to check 'em out */
 	state.force = 1;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 3b331a6bfe7..36964f52f2f 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -617,8 +617,17 @@ test_expect_success 'sparse-index is expanded and converted back' '
 ensure_not_expanded () {
 	rm -f trace2.txt &&
 	echo >>sparse-index/untracked.txt &&
-	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-		git -C sparse-index "$@" &&
+
+	if test "$1" = "!"
+	then
+		shift &&
+		test_must_fail env \
+			GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index "$@" || return 1
+	else
+		GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
+			git -C sparse-index "$@" || return 1
+	fi &&
 	test_region ! index ensure_full_index trace2.txt
 }
 
@@ -658,6 +667,23 @@ test_expect_success 'sparse-index is not expanded' '
 	)
 '
 
+test_expect_success 'sparse-index is not expanded: merge conflict in cone' '
+	init_repos &&
+
+	for side in right left
+	do
+		git -C sparse-index checkout -b expand-$side base &&
+		echo $side >sparse-index/deep/a &&
+		git -C sparse-index commit -a -m "$side" || return 1
+	done &&
+
+	(
+		sane_unset GIT_TEST_MERGE_ALGORITHM &&
+		git -C sparse-index config pull.twohead ort &&
+		ensure_not_expanded ! merge -m merged expand-right
+	)
+'
+
 # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
 # in this scenario, but it shouldn't.
 test_expect_success 'reset mixed and checkout orphan' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 4/6] t1092: add cherry-pick, rebase tests
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-09-08 11:23     ` [PATCH v3 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
@ 2021-09-08 11:23     ` Derrick Stolee via GitGitGadget
  2021-09-08 11:24     ` [PATCH v3 5/6] sequencer: ensure full index if not ORT strategy Derrick Stolee via GitGitGadget
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:23 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

Add tests to check that cherry-pick and rebase behave the same in the
sparse-index case as in the full index cases. These tests are agnostic
to GIT_TEST_MERGE_ALGORITHM, so a full CI test suite will check both the
'ort' and 'recursive' strategies on this test.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 36964f52f2f..d9424ed6427 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -481,14 +481,17 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
 	test_sparse_match git reset update-folder2
 '
 
-test_expect_success 'merge' '
+test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	test_all_match git checkout -b merge update-deep &&
-	test_all_match git merge -m "folder1" update-folder1 &&
-	test_all_match git rev-parse HEAD^{tree} &&
-	test_all_match git merge -m "folder2" update-folder2 &&
-	test_all_match git rev-parse HEAD^{tree}
+	for OPERATION in "merge -m merge" cherry-pick rebase
+	do
+		test_all_match git checkout -B temp update-deep &&
+		test_all_match git $OPERATION update-folder1 &&
+		test_all_match git rev-parse HEAD^{tree} &&
+		test_all_match git $OPERATION update-folder2 &&
+		test_all_match git rev-parse HEAD^{tree} || return 1
+	done
 '
 
 # NEEDSWORK: This test is documenting current behavior, but that
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 5/6] sequencer: ensure full index if not ORT strategy
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-09-08 11:23     ` [PATCH v3 4/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
@ 2021-09-08 11:24     ` Derrick Stolee via GitGitGadget
  2021-09-08 11:24     ` [PATCH v3 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
  2021-09-09 14:16     ` [PATCH v3 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren
  6 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:24 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The sequencer is used by 'cherry-pick' and 'rebase' to sequence a list
of operations that modify the index. Since we intend to remove the need
for 'command_requires_full_index', we need to ensure the sparse index is
expanded every time it is written to disk between these steps. That is,
unless the merge strategy is 'ort' where the index can remain sparse
throughout.

There are two main places to be extra careful about a full index:

1. Right before calling merge_trees(), ensure the index is full. This
   happens within an 'else' where the 'if' block checks if the 'ort'
   strategy is selected.

2. During read_and_refresh_cache(), the index might be written to disk
   and converted to sparse in the process. Ensure it expands back to
   full afterwards by checking if the strategy is _not_ 'ort'. This
   'if' statement is the logical negation of the 'if' in item (1).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sequencer.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 7f07cd00f3f..228bc089d22 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -652,6 +652,7 @@ static int do_recursive_merge(struct repository *r,
 		merge_switch_to_result(&o, head_tree, &result, 1, show_output);
 		clean = result.clean;
 	} else {
+		ensure_full_index(r->index);
 		clean = merge_trees(&o, head_tree, next_tree, base_tree);
 		if (is_rebase_i(opts) && clean <= 0)
 			fputs(o.obuf.buf, stdout);
@@ -2346,6 +2347,7 @@ static int read_and_refresh_cache(struct repository *r,
 			_(action_name(opts)));
 	}
 	refresh_index(r->index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, NULL);
+
 	if (index_fd >= 0) {
 		if (write_locked_index(r->index, &index_lock,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED)) {
@@ -2353,6 +2355,13 @@ static int read_and_refresh_cache(struct repository *r,
 				_(action_name(opts)));
 		}
 	}
+
+	/*
+	 * If we are resolving merges in any way other than "ort", then
+	 * expand the sparse index.
+	 */
+	if (opts->strategy && strcmp(opts->strategy, "ort"))
+		ensure_full_index(r->index);
 	return 0;
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 6/6] sparse-index: integrate with cherry-pick and rebase
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
                       ` (4 preceding siblings ...)
  2021-09-08 11:24     ` [PATCH v3 5/6] sequencer: ensure full index if not ORT strategy Derrick Stolee via GitGitGadget
@ 2021-09-08 11:24     ` Derrick Stolee via GitGitGadget
  2021-09-09 14:16     ` [PATCH v3 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren
  6 siblings, 0 replies; 45+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2021-09-08 11:24 UTC (permalink / raw)
  To: git; +Cc: newren, stolee, gitster, Taylor Blau, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The hard work was already done with 'git merge' and the ORT strategy.
Just add extra tests to see that we get the expected results in the
non-conflict cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/rebase.c                         |  6 ++++
 builtin/revert.c                         |  3 ++
 t/t1092-sparse-checkout-compatibility.sh | 39 ++++++++++++++++++++++--
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 33e09619005..27433d7c5a2 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -559,6 +559,9 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options,
 			builtin_rebase_interactive_usage, PARSE_OPT_KEEP_ARGV0);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	if (!is_null_oid(&squash_onto))
 		opts.squash_onto = &squash_onto;
 
@@ -1430,6 +1433,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 		usage_with_options(builtin_rebase_usage,
 				   builtin_rebase_options);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	options.allow_empty_message = 1;
 	git_config(rebase_config, &options);
 	/* options.gpg_sign_opt will be either "-S" or NULL */
diff --git a/builtin/revert.c b/builtin/revert.c
index 237f2f18d4c..6c4c22691bd 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -136,6 +136,9 @@ static int run_sequencer(int argc, const char **argv, struct replay_opts *opts)
 			PARSE_OPT_KEEP_ARGV0 |
 			PARSE_OPT_KEEP_UNKNOWN);
 
+	prepare_repo_settings(the_repository);
+	the_repository->settings.command_requires_full_index = 0;
+
 	/* implies allow_empty */
 	if (opts->keep_redundant_commits)
 		opts->allow_empty = 1;
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index d9424ed6427..886e78715fe 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -527,6 +527,38 @@ test_expect_success 'merge with conflict outside cone' '
 	test_all_match git rev-parse HEAD^{tree}
 '
 
+test_expect_success 'cherry-pick/rebase with conflict outside cone' '
+	init_repos &&
+
+	for OPERATION in cherry-pick rebase
+	do
+		test_all_match git checkout -B tip &&
+		test_all_match git reset --hard merge-left &&
+		test_all_match git status --porcelain=v2 &&
+		test_all_match test_must_fail git $OPERATION merge-right &&
+		test_all_match git status --porcelain=v2 &&
+
+		# Resolve the conflict in different ways:
+		# 1. Revert to the base
+		test_all_match git checkout base -- deep/deeper2/a &&
+		test_all_match git status --porcelain=v2 &&
+
+		# 2. Add the file with conflict markers
+		test_all_match git add folder1/a &&
+		test_all_match git status --porcelain=v2 &&
+
+		# 3. Rename the file to another sparse filename and
+		#    accept conflict markers as resolved content.
+		run_on_all mv folder2/a folder2/z &&
+		test_all_match git add folder2 &&
+		test_all_match git status --porcelain=v2 &&
+
+		test_all_match git $OPERATION --continue &&
+		test_all_match git status --porcelain=v2 &&
+		test_all_match git rev-parse HEAD^{tree} || return 1
+	done
+'
+
 test_expect_success 'merge with outside renames' '
 	init_repos &&
 
@@ -665,8 +697,11 @@ test_expect_success 'sparse-index is not expanded' '
 	test_config -C sparse-index pull.twohead ort &&
 	(
 		sane_unset GIT_TEST_MERGE_ALGORITHM &&
-		ensure_not_expanded merge -m merge update-folder1 &&
-		ensure_not_expanded merge -m merge update-folder2
+		for OPERATION in "merge -m merge" cherry-pick rebase
+		do
+			ensure_not_expanded merge -m merge update-folder1 &&
+			ensure_not_expanded merge -m merge update-folder2 || return 1
+		done
 	)
 '
 
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert
  2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
                       ` (5 preceding siblings ...)
  2021-09-08 11:24     ` [PATCH v3 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
@ 2021-09-09 14:16     ` Elijah Newren
  6 siblings, 0 replies; 45+ messages in thread
From: Elijah Newren @ 2021-09-09 14:16 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Derrick Stolee, Junio C Hamano, Taylor Blau,
	Derrick Stolee

On Wed, Sep 8, 2021 at 4:24 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series integrates the sparse index with commands that perform merges
> such as 'git merge', 'git cherry-pick', 'git revert' (free with
> cherry-pick), and 'git rebase'.
>
> When the ORT merge strategy is enabled, this allows most merges to succeed
> without expanding the sparse index, leading to significant performance
> gains. I tested these changes against an internal monorepo with over 2
> million paths at HEAD but with a sparse-checkout that only has ~60,000 files
> within the sparse-checkout cone. 'git merge' commands went from 5-6 seconds
> to 0.750-1.250s.
>
> In the case of the recursive merge strategy, the sparse index is expanded
> before the recursive algorithm proceeds. We expect that this is as good as
> we can get with that strategy. When the strategy shifts to ORT as the
> default, then this will not be a problem except for users who decide to
> change the behavior.
>
> Most of the hard work was done by previous series, such as
> ds/sparse-index-ignored-files (which this series is based on).
>
>
> Updates in V3
> =============
>
>  * Fixed a typo in patch 2 (it is then moved in patch 3, affecting the
>    range-diff)
>
>  * There was a recommendation to use test_config over git config, but that
>    is not possible in a subshell. So, it got moved outside of the subshell
>    and that works just fine.
>
>  * The other comments were about the use of GIT_TEST_MERGE_ALGORITHM in the
>    test script, but the tests that isolate that environment variable are
>    only for the 'ensure_not_expanded' tests, not the rest of the tests that
>    already exist and are beneficial to cover the 'recursive' mode.

This round addresses all my feedback and looks good to me:

Reviewed-by: Elijah Newren <newren@gmail.com>

>
>
> Updates in V2
> =============
>
>  * The tests no longer specify GIT_TEST_MERGE_ALGORITHM or directly
>    reference "-s ort". By relaxing this condition, I found an issue with
>    'git cherry-pick' and 'git rebase' when using the 'recursive' algorithm
>    which is fixed in a new patch.
>
>  * Use the pull.twohead config to specify the ORT merge algorithm to avoid
>    expanding the sparse index when that is what we are testing.
>
>  * Corrected some misstatements in my commit messages.
>
> Thanks, -Stolee
>
> Derrick Stolee (6):
>   diff: ignore sparse paths in diffstat
>   merge: make sparse-aware with ORT
>   merge-ort: expand only for out-of-cone conflicts
>   t1092: add cherry-pick, rebase tests
>   sequencer: ensure full index if not ORT strategy
>   sparse-index: integrate with cherry-pick and rebase
>
>  builtin/merge.c                          |  3 +
>  builtin/rebase.c                         |  6 ++
>  builtin/revert.c                         |  3 +
>  diff.c                                   |  8 +++
>  merge-ort.c                              | 15 ++++
>  merge-recursive.c                        |  3 +
>  sequencer.c                              |  9 +++
>  t/t1092-sparse-checkout-compatibility.sh | 92 +++++++++++++++++++++---
>  8 files changed, 129 insertions(+), 10 deletions(-)
>
>
> base-commit: 91b53f20109fe55635b1815f87afd5d5da68a182
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1019%2Fderrickstolee%2Fsparse-index%2Fmerge-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1019/derrickstolee/sparse-index/merge-v3
> Pull-Request: https://github.com/gitgitgadget/git/pull/1019
>
> Range-diff vs v2:
>
>  1:  c5ae705648c = 1:  a6963182fe0 diff: ignore sparse paths in diffstat
>  2:  bb150483bcf ! 2:  141f7fb26d6 merge: make sparse-aware with ORT
>      @@ merge-ort.c: static int record_conflicted_index_entries(struct merge_options *op
>
>       + /*
>       +  * We are in a conflicted state. These conflicts might be inside
>      -+  * sparse-directory entries, so expand the index preemtively.
>      ++  * sparse-directory entries, so expand the index preemptively.
>       +  * Also, we set original_cache_nr below, but that might change if
>       +  * index_name_pos() calls ask for paths within sparse directories.
>       +  */
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is n
>       + ensure_not_expanded add . &&
>       +
>       + ensure_not_expanded checkout -f update-deep &&
>      ++ test_config -C sparse-index pull.twohead ort &&
>       + (
>       +         sane_unset GIT_TEST_MERGE_ALGORITHM &&
>      -+         git -C sparse-index config pull.twohead ort &&
>       +         ensure_not_expanded merge -m merge update-folder1 &&
>       +         ensure_not_expanded merge -m merge update-folder2
>       + )
>  3:  815b1b1cfbf ! 3:  c3c9ffd855c merge-ort: expand only for out-of-cone conflicts
>      @@ merge-ort.c: static int record_conflicted_index_entries(struct merge_options *op
>
>         /*
>          * We are in a conflicted state. These conflicts might be inside
>      --  * sparse-directory entries, so expand the index preemtively.
>      +-  * sparse-directory entries, so expand the index preemptively.
>       -  * Also, we set original_cache_nr below, but that might change if
>       +  * sparse-directory entries, so check if any entries are outside
>       +  * of the sparse-checkout cone preemptively.
>  4:  8032154bc8a = 4:  7aae5727fb7 t1092: add cherry-pick, rebase tests
>  5:  90ac85500b8 = 5:  20f5bbae546 sequencer: ensure full index if not ORT strategy
>  6:  df4bbec744f ! 6:  36cecb22330 sparse-index: integrate with cherry-pick and rebase
>      @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'merge with confli
>         init_repos &&
>
>       @@ t/t1092-sparse-checkout-compatibility.sh: test_expect_success 'sparse-index is not expanded' '
>      +  test_config -C sparse-index pull.twohead ort &&
>         (
>                 sane_unset GIT_TEST_MERGE_ALGORITHM &&
>      -          git -C sparse-index config pull.twohead ort &&
>       -         ensure_not_expanded merge -m merge update-folder1 &&
>       -         ensure_not_expanded merge -m merge update-folder2
>       +         for OPERATION in "merge -m merge" cherry-pick rebase
>
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2021-09-09 14:42 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-17 17:08 [PATCH 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
2021-08-17 17:08 ` [PATCH 1/6] t1092: use ORT merge strategy Derrick Stolee via GitGitGadget
2021-08-18 17:16   ` Taylor Blau
2021-08-18 18:10   ` Junio C Hamano
2021-08-18 18:42     ` Derrick Stolee
2021-08-18 22:22       ` Junio C Hamano
2021-08-20 21:23       ` Elijah Newren
2021-08-20 23:32         ` Junio C Hamano
2021-08-21  0:20           ` Elijah Newren
2021-08-21  4:20             ` Junio C Hamano
2021-08-21 23:48               ` Elijah Newren
2021-08-17 17:08 ` [PATCH 2/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
2021-08-20 21:32   ` Elijah Newren
2021-08-24 18:30     ` Derrick Stolee
2021-08-27 22:27       ` Elijah Newren
2021-08-17 17:08 ` [PATCH 3/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
2021-08-20 21:40   ` Elijah Newren
2021-08-17 17:08 ` [PATCH 4/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
2021-08-20 21:53   ` Elijah Newren
2021-08-17 17:08 ` [PATCH 5/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
2021-08-20 21:58   ` Elijah Newren
2021-08-17 17:08 ` [PATCH 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
2021-08-21  0:12   ` Elijah Newren
2021-08-24 21:52 ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Derrick Stolee via GitGitGadget
2021-08-24 21:52   ` [PATCH v2 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
2021-08-24 21:52   ` [PATCH v2 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
2021-08-27 22:43     ` Elijah Newren
2021-08-30 17:18       ` Derrick Stolee
2021-09-08  1:49         ` Derrick Stolee
2021-08-24 21:52   ` [PATCH v2 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
2021-08-27 22:47     ` Elijah Newren
2021-08-30 17:21       ` Derrick Stolee
2021-08-24 21:52   ` [PATCH v2 4/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
2021-08-24 21:52   ` [PATCH v2 5/6] sequencer: ensure full index if not ORT strategy Derrick Stolee via GitGitGadget
2021-08-24 21:52   ` [PATCH v2 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
2021-08-27 22:56   ` [PATCH v2 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren
2021-08-30 17:26     ` Derrick Stolee
2021-09-08 11:23   ` [PATCH v3 " Derrick Stolee via GitGitGadget
2021-09-08 11:23     ` [PATCH v3 1/6] diff: ignore sparse paths in diffstat Derrick Stolee via GitGitGadget
2021-09-08 11:23     ` [PATCH v3 2/6] merge: make sparse-aware with ORT Derrick Stolee via GitGitGadget
2021-09-08 11:23     ` [PATCH v3 3/6] merge-ort: expand only for out-of-cone conflicts Derrick Stolee via GitGitGadget
2021-09-08 11:23     ` [PATCH v3 4/6] t1092: add cherry-pick, rebase tests Derrick Stolee via GitGitGadget
2021-09-08 11:24     ` [PATCH v3 5/6] sequencer: ensure full index if not ORT strategy Derrick Stolee via GitGitGadget
2021-09-08 11:24     ` [PATCH v3 6/6] sparse-index: integrate with cherry-pick and rebase Derrick Stolee via GitGitGadget
2021-09-09 14:16     ` [PATCH v3 0/6] Sparse Index: Integrate with merge, cherry-pick, rebase, and revert Elijah Newren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).