All of lore.kernel.org
 help / color / mirror / Atom feed
From: Victoria Dye <vdye@github.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: gitster@pobox.com, shaoxuan.yuan02@gmail.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 8/8] sparse-checkout: integrate with sparse index
Date: Mon, 16 May 2022 13:38:28 -0700	[thread overview]
Message-ID: <9191a98c-087a-39b9-cff3-eb7eccd198ea@github.com> (raw)
In-Reply-To: <b8a349c6deeb4b970075629d0c292b2ae9f7d0d3.1652724693.git.gitgitgadget@gmail.com>

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> When modifying the sparse-checkout definition, the sparse-checkout
> builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
> cache entries in the index. Before, we needed the index to be fully
> expanded in order to ensure we had the full list of files necessary that
> match the new patterns.
> 
> Insert a call to reset_sparse_directories() that expands sparse
> directories that are within the new pattern list, but only far enough
> that every necessary file path now exists as a cache entry. The
> remaining logic within update_sparsity() will modify the SKIP_WORKTREE
> bits appropriately.
> 
> This allows us to disable command_requires_full_index within the
> sparse-checkout builtin. Add tests that demonstrate that we are not
> expanding to a full index unnecessarily.
> 
> We can see the improved performance in the p2000 test script:
> 
> Test                           HEAD~1            HEAD
> ------------------------------------------------------------------------
> 2000.24: git ... (sparse-v3)   2.14(1.55+0.58)   1.57(1.03+0.53) -26.6%
> 2000.25: git ... (sparse-v4)   2.20(1.62+0.57)   1.58(0.98+0.59) -28.2%
> 
> These reductions of 26-28% are small compared to most examples, but the
> time is dominated by writing a new copy of the base repository to the
> worktree and then deleting it again. The fact that the previous index
> expansion was such a large portion of the time is telling how important
> it is to complete this sparse index integration.
> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/sparse-checkout.c                |  3 +++
>  t/t1092-sparse-checkout-compatibility.sh | 25 ++++++++++++++++++++++++
>  unpack-trees.c                           |  4 ++++
>  3 files changed, 32 insertions(+)
> 
> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> index cbff6ad00b0..0157b292b36 100644
> --- a/builtin/sparse-checkout.c
> +++ b/builtin/sparse-checkout.c
> @@ -937,6 +937,9 @@ int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
>  
>  	git_config(git_default_config, NULL);
>  
> +	prepare_repo_settings(the_repository);
> +	the_repository->settings.command_requires_full_index = 0;
> +
>  	if (argc > 0) {
>  		if (!strcmp(argv[0], "list"))
>  			return sparse_checkout_list(argc, argv);
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 93bcfd20bbc..614357fc48c 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -1552,6 +1552,31 @@ test_expect_success 'ls-files' '
>  	ensure_not_expanded ls-files --sparse
>  '
>  
> +test_expect_success 'sparse index is not expanded: sparse-checkout' '
> +	init_repos &&
> +
> +	ensure_not_expanded sparse-checkout set deep/deeper2 &&
> +	ensure_not_expanded sparse-checkout set deep/deeper1 &&
> +	ensure_not_expanded sparse-checkout set deep &&
> +	ensure_not_expanded sparse-checkout add folder1 &&
> +	ensure_not_expanded sparse-checkout set deep/deeper1 &&
> +	ensure_not_expanded sparse-checkout set folder2 &&
> +
> +	# Demonstrate that the checks that "folder1/a" is a file
> +	# do not cause a sparse-index expansion (since it is in the
> +	# sparse-checkout cone).
> +	echo >>sparse-index/folder2/a &&
> +	git -C sparse-index add folder2/a &&
> +
> +	ensure_not_expanded sparse-checkout add folder1 &&
> +
> +	# Skip checks here, since deep/deeper1 is inside a sparse directory
> +	# that must be expanded to check whether `deep/deeper1` is a file
> +	# or not.
> +	ensure_not_expanded sparse-checkout set --skip-checks deep/deeper1 &&
> +	ensure_not_expanded sparse-checkout set
> +'
> +

These tests look good for ensuring sparsity is preserved, but it'd be nice
to also have some "stress tests" of 'sparse-checkout (add|set)'. The purpose
would be to make sure the index has the right contents for various types of
pattern changes, e.g. running 'sparse-checkout (add|set) <path>', then
verifying index contents with 'ls-files --sparse'. Paths might be:

- in vs. out of (current) cone
- match an existing vs. nonexistent directory

etc.

>  # NEEDSWORK: a sparse-checkout behaves differently from a full checkout
>  # in this scenario, but it shouldn't.
>  test_expect_success 'reset mixed and checkout orphan' '
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 7f528d35cc2..9745e0dfc34 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -18,6 +18,7 @@
>  #include "promisor-remote.h"
>  #include "entry.h"
>  #include "parallel-checkout.h"
> +#include "sparse-index.h"
>  
>  /*
>   * Error messages expected by scripts out of plumbing commands such as
> @@ -2018,6 +2019,9 @@ enum update_sparsity_result update_sparsity(struct unpack_trees_options *o)
>  			goto skip_sparse_checkout;
>  	}
>  
> +	/* Expand sparse directories as needed */
> +	expand_to_pattern_list(o->src_index, o->pl);
> +
>  	/* Set NEW_SKIP_WORKTREE on existing entries. */
>  	mark_all_ce_unused(o->src_index);
>  	mark_new_skip_worktree(o->pl, o->src_index, 0,


  reply	other threads:[~2022-05-16 21:01 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-16 18:11 [PATCH 0/8] Sparse index: integrate with sparse-checkout Derrick Stolee via GitGitGadget
2022-05-16 18:11 ` [PATCH 1/8] sparse-index: create expand_to_pattern_list() Derrick Stolee via GitGitGadget
2022-05-16 20:36   ` Victoria Dye
2022-05-16 20:49     ` Derrick Stolee
2022-05-16 18:11 ` [PATCH 2/8] sparse-index: introduce partially-sparse indexes Derrick Stolee via GitGitGadget
2022-05-16 18:11 ` [PATCH 3/8] cache-tree: implement cache_tree_find_path() Derrick Stolee via GitGitGadget
2022-05-16 18:11 ` [PATCH 4/8] sparse-checkout: --no-sparse-index needs a full index Derrick Stolee via GitGitGadget
2022-05-16 18:11 ` [PATCH 5/8] sparse-index: partially expand directories Derrick Stolee via GitGitGadget
2022-05-16 20:36   ` Victoria Dye
2022-05-16 18:11 ` [PATCH 6/8] sparse-index: complete partial expansion Derrick Stolee via GitGitGadget
2022-05-16 20:38   ` Victoria Dye
2022-05-17 13:23     ` Derrick Stolee
2022-05-16 18:11 ` [PATCH 7/8] p2000: add test for 'git sparse-checkout [add|set]' Derrick Stolee via GitGitGadget
2022-05-16 18:11 ` [PATCH 8/8] sparse-checkout: integrate with sparse index Derrick Stolee via GitGitGadget
2022-05-16 20:38   ` Victoria Dye [this message]
2022-05-17 13:28     ` Derrick Stolee
2022-05-19 17:52 ` [PATCH v2 00/10] Sparse index: integrate with sparse-checkout Derrick Stolee via GitGitGadget
2022-05-19 17:52   ` [PATCH v2 01/10] t1092: refactor 'sparse-index contents' test Derrick Stolee via GitGitGadget
2022-05-19 17:52   ` [PATCH v2 02/10] t1092: stress test 'git sparse-checkout set' Derrick Stolee via GitGitGadget
2022-05-19 17:52   ` [PATCH v2 03/10] sparse-index: create expand_to_pattern_list() Derrick Stolee via GitGitGadget
2022-05-19 19:50     ` Junio C Hamano
2022-05-20 18:01       ` Derrick Stolee
2022-05-19 17:52   ` [PATCH v2 04/10] sparse-index: introduce partially-sparse indexes Derrick Stolee via GitGitGadget
2022-05-19 20:05     ` Junio C Hamano
2022-05-20 18:05       ` Derrick Stolee
2022-05-20 18:23         ` Junio C Hamano
2022-05-19 17:52   ` [PATCH v2 05/10] cache-tree: implement cache_tree_find_path() Derrick Stolee via GitGitGadget
2022-05-19 20:14     ` Junio C Hamano
2022-05-20 18:13       ` Derrick Stolee
2022-05-19 17:52   ` [PATCH v2 06/10] sparse-checkout: --no-sparse-index needs a full index Derrick Stolee via GitGitGadget
2022-05-19 20:19     ` Junio C Hamano
2022-05-19 17:52   ` [PATCH v2 07/10] sparse-index: partially expand directories Derrick Stolee via GitGitGadget
2022-05-20 18:17     ` Junio C Hamano
2022-05-20 18:33       ` Derrick Stolee
2022-05-19 17:52   ` [PATCH v2 08/10] sparse-index: complete partial expansion Derrick Stolee via GitGitGadget
2022-05-21  7:45     ` Junio C Hamano
2022-05-23 13:13       ` Derrick Stolee
2022-05-23 13:18         ` Derrick Stolee
2022-05-23 18:01           ` Junio C Hamano
2022-05-23 22:48         ` Junio C Hamano
2022-05-25 14:26           ` Derrick Stolee
2022-05-25 16:32             ` Junio C Hamano
2022-05-19 17:52   ` [PATCH v2 09/10] p2000: add test for 'git sparse-checkout [add|set]' Derrick Stolee via GitGitGadget
2022-05-19 17:52   ` [PATCH v2 10/10] sparse-checkout: integrate with sparse index Derrick Stolee via GitGitGadget
2022-05-23 13:48   ` [PATCH v3 00/10] Sparse index: integrate with sparse-checkout Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 01/10] t1092: refactor 'sparse-index contents' test Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 02/10] t1092: stress test 'git sparse-checkout set' Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 03/10] sparse-index: create expand_index() Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 04/10] sparse-index: introduce partially-sparse indexes Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 05/10] cache-tree: implement cache_tree_find_path() Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 06/10] sparse-checkout: --no-sparse-index needs a full index Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 07/10] sparse-index: partially expand directories Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 08/10] sparse-index: complete partial expansion Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 09/10] p2000: add test for 'git sparse-checkout [add|set]' Derrick Stolee via GitGitGadget
2022-05-23 13:48     ` [PATCH v3 10/10] sparse-checkout: integrate with sparse index Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9191a98c-087a-39b9-cff3-eb7eccd198ea@github.com \
    --to=vdye@github.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=shaoxuan.yuan02@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.