All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: derrickstolee@github.com, newren@gmail.com, gitster@pobox.com,
	Victoria Dye <vdye@github.com>
Subject: [PATCH v3 0/6] Sparse index: integrate with 'git stash'
Date: Tue, 10 May 2022 23:32:26 +0000	[thread overview]
Message-ID: <pull.1171.v3.git.1652225552.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1171.v2.git.1651083378.gitgitgadget@gmail.com>

This series, in combination with the sparse index integrations of reset [1],
update-index [2], checkout-index [2], clean [2], and read-tree [3], allows
most subcommands of 'git stash' to use the sparse index end-to-end without
index expansion.

Like the earlier series, this series starts with new tests ensuring
compatibility of the sparse index with non-sparse index full and sparse
checkouts [1/6]. Next, sparse index is trivially enabled [2/6].
Functionally, sparse index-enabled sparse-checkouts remain compatible with
non-sparse index sparse-checkouts, but there are still some cases where the
index (or a temporary index) is expanded unnecessarily. These cases are
fixed in three parts:

 * First, 'git stash -u' is made sparse index-compatible by ensuring the
   "temporary" index holding the stashed, untracked files is created as a
   sparse index whenever possible (per repo settings &
   'is_sparse_index_allowed()'). Patch [3/6] exposes
   'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then
   patch [4/6] uses that function to mark the temporary index sparse when
   appropriate.
 * Next, 'git stash (apply|pop)' are made sparse index-compatible by
   changing their internal merge function from 'merge_recursive_generic()'
   (which constructs "fake" commits from the working tree, stash, and stash
   base) to 'merge_ort_nonrecursive()' (which operates on the trees
   directly) in patch [5/6]. The use of the non-recursive 'merge-ort' helps
   us to avoid sparse index expansion associated with 'merge-recursive', as
   well as avoid the unused/unnecessary complexity of a recursive merge.
 * Finally, while patch 5 skips index expansion for most cases of 'git stash
   (apply|pop)', applying a stash that includes untracked files still
   expands the index. This is a result of an internal 'read-tree' execution
   (specifically in its 'unpack_trees' call) creating a result index that is
   never sparse in-core, thus forcing the index to be unnecessarily
   collapsed and re-expanded in 'do_write_locked_index()'. In patch [6/6],
   'unpack_trees' is updated to set the default sparsity of the resultant
   index to "sparse" if allowed by repo settings and
   'is_sparse_index_allowed()' (similar to the change in patch 4).

Performance results (from the 'p2000' tests):

(git stash &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.07(2.42+1.34)   3.98(2.42+1.32) -2.2%
full-v4                      4.05(2.46+1.31)   4.00(2.49+1.29) -1.2%
sparse-v3                    7.48(4.81+2.57)   1.53(0.26+1.61) -79.5%
sparse-v4                    7.35(4.74+2.54)   1.59(0.27+1.63) -78.4%

(echo >>new &&
 git stash -u &&
 git stash pop)              master            this series
---------------------------------------------------------------------
full-v3                      4.21(2.62+1.45)   4.11(2.55+1.44) -2.4%
full-v4                      4.11(2.51+1.41)   4.02(2.49+1.41) -2.2%
sparse-v3                    7.35(4.64+2.66)   1.70(0.32+1.64) -76.9%
sparse-v4                    7.74(4.87+2.83)   1.70(0.32+1.66) -78.0%



Changes since V2
================

 * Replaced use of 'merge_recursive_generic' with 'merge_ort_nonrecursive'
   in 'do_apply_stash()'
 * Rebased on top of 'master'


Changes since V1
================

 * Added quotes to the "$WITHOUT_UNTRACKED_TXT" when testing for it in
   'ensure_not_expanded' (in 't/t1092-sparse-checkout-compatibility.sh')
 * Moved the 'stash' test in 't1092' elsewhere in the file, so that it
   doesn't conflict (even trivially) with the also-in-flight 'git show'
   integration
 * Moved the 'ensure_not_expended' tests for 'checkout-index' back to
   original location

[1]
https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/
[2]
https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/
[3]
https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/

Thanks!

-Victoria

Victoria Dye (6):
  stash: expand sparse-checkout compatibility testing
  stash: integrate with sparse index
  sparse-index: expose 'is_sparse_index_allowed()'
  read-cache: set sparsity when index is new
  stash: apply stash using 'merge_ort_nonrecursive()'
  unpack-trees: preserve index sparsity

 builtin/stash.c                          | 33 ++++++++--
 read-cache.c                             | 18 +++++-
 sparse-index.c                           |  2 +-
 sparse-index.h                           |  1 +
 t/perf/p2000-sparse-operations.sh        |  2 +
 t/t1092-sparse-checkout-compatibility.sh | 78 +++++++++++++++++++++++-
 unpack-trees.c                           |  6 ++
 7 files changed, 131 insertions(+), 9 deletions(-)


base-commit: 0f828332d5ac36fc63b7d8202652efa152809856
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1171%2Fvdye%2Fsparse%2Fstash-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1171/vdye/sparse/stash-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1171

Range-diff vs v2:

 1:  8ea986cb249 = 1:  4e0a09f7a3c stash: expand sparse-checkout compatibility testing
 2:  b3e3f0298fb = 2:  7be484a8c0b stash: integrate with sparse index
 3:  73f04e95400 = 3:  6f00fca9267 sparse-index: expose 'is_sparse_index_allowed()'
 4:  42550f39a75 = 4:  bb092c075f4 read-cache: set sparsity when index is new
 5:  4537d473b93 < -:  ----------- merge-recursive: add merge function arg to 'merge_recursive_generic'
 6:  22fee0732ad ! 5:  e733c2fd9f4 stash: merge applied stash with merge-ort
     @@ Metadata
      Author: Victoria Dye <vdye@github.com>
      
       ## Commit message ##
     -    stash: merge applied stash with merge-ort
     +    stash: apply stash using 'merge_ort_nonrecursive()'
      
     -    Change the merge function used in 'do_apply_stash()' from 'merge_recursive'
     -    to 'merge_ort_recursive'. In addition to aligning with the default merge
     -    strategy used by 'git merge' (6a5fb96672 (Change default merge backend from
     -    recursive to ort, 2021-08-04)), this allows 'git stash <apply|pop>' to
     -    operate without expanding the index by default. Update tests in 't1092'
     -    verifying index expansion for 'git stash' accordingly.
     +    Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
     +    current working tree. When 'git stash apply' was converted from its shell
     +    script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
     +    builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
     +    into the working tree as part of 'git stash (apply|pop)'. However, with the
     +    single merge base used in 'do_apply_stash()', the commit wrapping done by
     +    'merge_recursive_generic()' is not only unnecessary, but misleading (the
     +    *real* merge base is labeled "constructed merge base"). Therefore, a
     +    non-recursive merge of the working tree, stashed tree, and stash base tree
     +    is more appropriate.
     +
     +    There are two options for a non-recursive merge-then-update-worktree
     +    function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
     +    'merge_ort_nonrecursive()' to align with the default merge strategy used by
     +    'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
     +    2021-08-04)) and, because merge-ort does not operate in-place on the index,
     +    avoid unnecessary index expansion. Update tests in 't1092' verifying index
     +    expansion for 'git stash' accordingly.
      
          Signed-off-by: Victoria Dye <vdye@github.com>
      
     @@ builtin/stash.c
       #include "strvec.h"
       #include "run-command.h"
       #include "dir.h"
     +@@ builtin/stash.c: static void unstage_changes_unless_new(struct object_id *orig_tree)
     + static int do_apply_stash(const char *prefix, struct stash_info *info,
     + 			  int index, int quiet)
     + {
     +-	int ret;
     ++	int clean, ret;
     + 	int has_index = index;
     + 	struct merge_options o;
     + 	struct object_id c_tree;
     + 	struct object_id index_tree;
     +-	struct commit *result;
     +-	const struct object_id *bases[1];
     ++	struct tree *head, *merge, *merge_base;
     ++	struct lock_file lock = LOCK_INIT;
     + 
     + 	read_cache_preload(NULL);
     + 	if (refresh_and_write_cache(REFRESH_QUIET, 0, 0))
     +@@ builtin/stash.c: static int do_apply_stash(const char *prefix, struct stash_info *info,
     + 
     + 	o.branch1 = "Updated upstream";
     + 	o.branch2 = "Stashed changes";
     ++	o.ancestor = "Stash base";
     + 
     + 	if (oideq(&info->b_tree, &c_tree))
     + 		o.branch1 = "Version stash was based on";
      @@ builtin/stash.c: static int do_apply_stash(const char *prefix, struct stash_info *info,
     - 	bases[0] = &info->b_tree;
     + 	if (o.verbosity >= 3)
     + 		printf_ln(_("Merging %s with %s"), o.branch1, o.branch2);
     + 
     +-	bases[0] = &info->b_tree;
     ++	head = lookup_tree(o.repo, &c_tree);
     ++	merge = lookup_tree(o.repo, &info->w_tree);
     ++	merge_base = lookup_tree(o.repo, &info->b_tree);
     ++
     ++	repo_hold_locked_index(o.repo, &lock, LOCK_DIE_ON_ERROR);
     ++	clean = merge_ort_nonrecursive(&o, head, merge, merge_base);
     ++
     ++	/*
     ++	 * If 'clean' >= 0, reverse the value for 'ret' so 'ret' is 0 when the
     ++	 * merge was clean, and nonzero if the merge was unclean or encountered
     ++	 * an error.
     ++	 */
     ++	ret = clean >= 0 ? !clean : clean;
     ++
     ++	if (ret < 0)
     ++		rollback_lock_file(&lock);
     ++	else if (write_locked_index(o.repo->index, &lock,
     ++				      COMMIT_LOCK | SKIP_IF_UNCHANGED))
     ++		ret = error(_("could not write index"));
       
     - 	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
     --				      merge_recursive, &result);
     -+				      merge_ort_recursive, &result);
     +-	ret = merge_recursive_generic(&o, &c_tree, &info->w_tree, 1, bases,
     +-				      &result);
       	if (ret) {
       		rerere(0);
       
 7:  3179018a8cb = 6:  4b4c38fcc03 unpack-trees: preserve index sparsity

-- 
gitgitgadget

  parent reply	other threads:[~2022-05-10 23:32 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-25 17:49 [PATCH 0/7] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-04-25 21:34   ` Junio C Hamano
2022-04-26 12:53   ` Derrick Stolee
2022-04-26 15:26     ` Victoria Dye
2022-04-26 16:21     ` Junio C Hamano
2022-04-25 17:49 ` [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-04-25 21:35   ` Junio C Hamano
2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
2022-04-25 21:38   ` Junio C Hamano
2022-04-26 12:57     ` Derrick Stolee
2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
2022-04-26 13:02   ` Derrick Stolee
2022-04-25 17:49 ` [PATCH 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee
2022-04-26 13:09   ` Derrick Stolee
2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget
2022-05-06  7:23     ` Elijah Newren
2022-05-09 19:24       ` Victoria Dye
2022-05-10  7:06         ` Elijah Newren
2022-04-27 18:16   ` [PATCH v2 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget
2022-04-27 18:16   ` [PATCH v2 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
2022-05-06  7:46   ` [PATCH v2 0/7] Sparse index: integrate with 'git stash' Elijah Newren
2022-05-10 23:32   ` Victoria Dye via GitGitGadget [this message]
2022-05-10 23:32     ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 2/6] stash: integrate with sparse index Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 4/6] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget
2022-05-10 23:32     ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget
2022-05-11  0:26       ` Junio C Hamano
2022-05-12  1:01       ` Jonathan Tan
2022-05-12 14:52         ` Elijah Newren
2022-05-12 16:55           ` Jonathan Tan
2022-05-12 14:51       ` Elijah Newren
2022-05-10 23:32     ` [PATCH v3 6/6] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1171.v3.git.1652225552.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.