From: Elijah Newren <newren@gmail.com> To: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com> Cc: Git Mailing List <git@vger.kernel.org>, Derrick Stolee <derrickstolee@github.com>, Junio C Hamano <gitster@pobox.com>, Victoria Dye <vdye@github.com> Subject: Re: [PATCH v2 0/7] Sparse index: integrate with 'git stash' Date: Fri, 6 May 2022 00:46:49 -0700 [thread overview] Message-ID: <CABPp-BFANwZn73w8zrVySB7mh0bQQBdGJjBuSJy50UOeyYT6aA@mail.gmail.com> (raw) In-Reply-To: <pull.1171.v2.git.1651083378.gitgitgadget@gmail.com> On Wed, Apr 27, 2022 at 11:16 AM Victoria Dye via GitGitGadget <gitgitgadget@gmail.com> wrote: > > This series, in combination with the sparse index integrations of reset [1], > update-index [2], checkout-index [2], clean [2], and read-tree [3], allows > most subcommands of 'git stash' to use the sparse index end-to-end without > index expansion. I've read through the series. Like Stolee, I'm pleased with how simple some of the code changes are (much due to prior hard work in the area) and how nicely you describe the changes. At first when I was reading through, I was again a bit disappointed that we're just converting the subcommands stash uses instead of fixing stash to stop forking subprocesses, but...then I realized this served as a forcing function of sorts to make more of Git sparse-index compatible, so I think it's actually the better route. We can always make the orthogonal fixes to stash's implementation design later. :-) However, there is one small issue around the way merging is handled... > Like the earlier series, this series starts with new tests ensuring > compatibility of the sparse index with non-sparse index full and sparse > checkouts [1/7]. Next, sparse index is trivially enabled [2/7]. > Functionally, sparse index-enabled sparse-checkouts remain compatible with > non-sparse index sparse-checkouts, but there are still some cases where the > index (or a temporary index) is expanded unnecessarily. These cases are > fixed in three parts: > > * First, 'git stash -u' is made sparse index-compatible by ensuring the > "temporary" index holding the stashed, untracked files is created as a > sparse index whenever possible (per repo settings & > 'is_sparse_index_allowed()'). Patch [3/7] exposes > 'is_sparse_index_allowed()' to files outside of 'sparse-index.c', then > patch [4/7] uses that function to mark the temporary index sparse when > appropriate. > * Next, 'git stash (apply|pop)' are made sparse index-compatible by > changing their internal "merge" function (executed via > 'merge_recursive_generic()') from 'merge_recursive()' to > 'merge_ort_recursive()'. This requires first allowing > 'merge_recursive_generic()' to accept a merge function as an input > (rather than hardcoding use of 'merge_recursive()') in patch [5/7], then > changing the call in 'stash.c' to specify 'merge_ort_recursive()' in > patch [6/7]. See note [4] for possible alternate implementations. > * Finally, while patches 5 & 6 avoid index expansion for most cases of 'git > stash (apply|pop)', applying a stash that includes untracked files still > expands the index. This is a result of an internal 'read-tree' execution > (specifically in its 'unpack_trees' call) creating a result index that is > never sparse in-core, thus forcing the index to be unnecessarily > collapsed and re-expanded in 'do_write_locked_index()'. In patch [7/7], > 'unpack_trees' is updated to set the default sparsity of the resultant > index to "sparse" if allowed by repo settings and > 'is_sparse_index_allowed()' (similar to the change in patch 4). > > Performance results (from the 'p2000' tests): > > (git stash && > git stash pop) master this series > --------------------------------------------------------------------- > full-v3 4.07(2.42+1.34) 3.98(2.42+1.32) -2.2% > full-v4 4.05(2.46+1.31) 4.00(2.49+1.29) -1.2% > sparse-v3 7.48(4.81+2.57) 1.53(0.26+1.61) -79.5% > sparse-v4 7.35(4.74+2.54) 1.59(0.27+1.63) -78.4% > > (echo >>new && > git stash -u && > git stash pop) master this series > --------------------------------------------------------------------- > full-v3 4.21(2.62+1.45) 4.11(2.55+1.44) -2.4% > full-v4 4.11(2.51+1.41) 4.02(2.49+1.41) -2.2% > sparse-v3 7.35(4.64+2.66) 1.70(0.32+1.64) -76.9% > sparse-v4 7.74(4.87+2.83) 1.70(0.32+1.66) -78.0% > > > > Changes since V1 > ================ > > * Added quotes to the "$WITHOUT_UNTRACKED_TXT" when testing for it in > 'ensure_not_expanded' (in 't/t1092-sparse-checkout-compatibility.sh') > * Moved the 'stash' test in 't1092' elsewhere in the file, so that it > doesn't conflict (even trivially) with the also-in-flight 'git show' > integration > * Moved the 'ensure_not_expended' tests for 'checkout-index' back to > original location > > [1] > https://lore.kernel.org/git/pull.1048.v6.git.1638201164.gitgitgadget@gmail.com/ > [2] > https://lore.kernel.org/git/pull.1109.v2.git.1641924306.gitgitgadget@gmail.com/ > [3] > https://lore.kernel.org/git/pull.1157.v3.git.1646166271.gitgitgadget@gmail.com/ > [4] I went with changing 'stash' to always use 'merge-ort' in > 'merge_recursive_generic()' as a sort of "middle ground" between "replace > 'merge_recursive()' with 'merge_ort_recursive()' in all of its hardcoded > internal usage" and "only use 'merge-ort' if using a sparse index in 'git > stash', otherwise 'merge-recursive'". The former would extend the use of > 'merge-ort' to 'git am' and 'git merge-recursive', whereas the latter is a > more cautious/narrowly-focused option. If anyone has any other thoughts, I'm > interested in hearing them. I understand that you'd want to modify and use merge_recursive_generic() in order to make the smallest code change possible, but merge_recursive_generic() has no equivalent in ort because during the review of ort, we discovered that porting over merge_recursive_generic() meant porting bugs. We need to fix those bugs. However, in the case of stash, I think we can just sidestep those bugs. stash probably only uses merge_recursive_generic() because it was the easiest transliteration of shell (which originally invoked git-merge-recursive), as opposed to the best conversion. The best conversion at the time would have been to call merge_trees() instead (stash doesn't need recursiveness, and tends to have commits available rather than just trees). I'd like to see us modify stash to call merge_incore_nonrecursive() directly (and/or merge_trees() if we're supporting using both merge-ort and merge-recursive backends). No need to worry about git-am or git-merge-recursive for now; we can replace/fix/update their calls to merge_recursive_generic() later.
next prev parent reply other threads:[~2022-05-06 7:47 UTC|newest] Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-04-25 17:49 [PATCH " Victoria Dye via GitGitGadget 2022-04-25 17:49 ` [PATCH 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget 2022-04-25 17:49 ` [PATCH 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget 2022-04-25 21:34 ` Junio C Hamano 2022-04-26 12:53 ` Derrick Stolee 2022-04-26 15:26 ` Victoria Dye 2022-04-26 16:21 ` Junio C Hamano 2022-04-25 17:49 ` [PATCH 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget 2022-04-25 17:49 ` [PATCH 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget 2022-04-25 21:35 ` Junio C Hamano 2022-04-25 17:49 ` [PATCH 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget 2022-04-25 21:38 ` Junio C Hamano 2022-04-26 12:57 ` Derrick Stolee 2022-04-25 17:49 ` [PATCH 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget 2022-04-26 13:02 ` Derrick Stolee 2022-04-25 17:49 ` [PATCH 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget 2022-04-26 12:49 ` [PATCH 0/7] Sparse index: integrate with 'git stash' Derrick Stolee 2022-04-26 13:09 ` Derrick Stolee 2022-04-27 18:16 ` [PATCH v2 " Victoria Dye via GitGitGadget 2022-04-27 18:16 ` [PATCH v2 1/7] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget 2022-04-27 18:16 ` [PATCH v2 2/7] stash: integrate with sparse index Victoria Dye via GitGitGadget 2022-04-27 18:16 ` [PATCH v2 3/7] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget 2022-04-27 18:16 ` [PATCH v2 4/7] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget 2022-04-27 18:16 ` [PATCH v2 5/7] merge-recursive: add merge function arg to 'merge_recursive_generic' Victoria Dye via GitGitGadget 2022-05-06 7:23 ` Elijah Newren 2022-05-09 19:24 ` Victoria Dye 2022-05-10 7:06 ` Elijah Newren 2022-04-27 18:16 ` [PATCH v2 6/7] stash: merge applied stash with merge-ort Victoria Dye via GitGitGadget 2022-04-27 18:16 ` [PATCH v2 7/7] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget 2022-05-06 7:46 ` Elijah Newren [this message] 2022-05-10 23:32 ` [PATCH v3 0/6] Sparse index: integrate with 'git stash' Victoria Dye via GitGitGadget 2022-05-10 23:32 ` [PATCH v3 1/6] stash: expand sparse-checkout compatibility testing Victoria Dye via GitGitGadget 2022-05-10 23:32 ` [PATCH v3 2/6] stash: integrate with sparse index Victoria Dye via GitGitGadget 2022-05-10 23:32 ` [PATCH v3 3/6] sparse-index: expose 'is_sparse_index_allowed()' Victoria Dye via GitGitGadget 2022-05-10 23:32 ` [PATCH v3 4/6] read-cache: set sparsity when index is new Victoria Dye via GitGitGadget 2022-05-10 23:32 ` [PATCH v3 5/6] stash: apply stash using 'merge_ort_nonrecursive()' Victoria Dye via GitGitGadget 2022-05-11 0:26 ` Junio C Hamano 2022-05-12 1:01 ` Jonathan Tan 2022-05-12 14:52 ` Elijah Newren 2022-05-12 16:55 ` Jonathan Tan 2022-05-12 14:51 ` Elijah Newren 2022-05-10 23:32 ` [PATCH v3 6/6] unpack-trees: preserve index sparsity Victoria Dye via GitGitGadget
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=CABPp-BFANwZn73w8zrVySB7mh0bQQBdGJjBuSJy50UOeyYT6aA@mail.gmail.com \ --to=newren@gmail.com \ --cc=derrickstolee@github.com \ --cc=git@vger.kernel.org \ --cc=gitgitgadget@gmail.com \ --cc=gitster@pobox.com \ --cc=vdye@github.com \ --subject='Re: [PATCH v2 0/7] Sparse index: integrate with '\''git stash'\''' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).