From: Victoria Dye <vdye@github.com>
To: Shuqi Liang <cheskaqiqi@gmail.com>, git@vger.kernel.org
Cc: gitster@pobox.com, derrickstolee@github.com
Subject: Re: [PATCH v2] write-tree: integrate with sparse index
Date: Wed, 5 Apr 2023 10:31:36 -0700
Message-ID: <9d0309bd-943c-dd51-97cf-59721eda78f7@github.com>
In-Reply-To: <20230404003539.1578245-1-cheskaqiqi@gmail.com>

Shuqi Liang wrote:
> Update 'git write-tree' to allow using the sparse-index in memory
> without expanding to a full one.
> 
> The recursive algorithm for update_one() was already updated in 2de37c5
> (cache-tree: integrate with sparse directory entries, 2021-03-03) to
> handle sparse directory entries in the index. Hence we can just set the
> requires-full-index to false for "write-tree".
> 
> The `p2000` tests demonstrate a ~96% execution time reduction for 'git
> write-tree' using a sparse index:
> 
> Test                                           before  after
> -----------------------------------------------------------------
> 2000.78: git write-tree (full-v3)              0.34    0.33 -2.9%
> 2000.79: git write-tree (full-v4)              0.32    0.30 -6.3%
> 2000.80: git write-tree (sparse-v3)            0.47    0.02 -95.8%
> 2000.81: git write-tree (sparse-v4)            0.45    0.02 -95.6%
> 
> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
> ---
> 
> * change the position of "settings.command_requires_full_index = 0"

Could you describe why you made this change? You don't need to re-roll, but
in the future please make sure to describe the reasoning for changes like
this in these version notes if the context can't be gathered from other
discussions in the thread. 

> 
> Range-diff against v1:
> 1:  d8a9ccd0b3 ! 1:  8873c79759 write-tree: integrate with sparse index
>     @@ Commit message
>      
>       ## builtin/write-tree.c ##
>      @@ builtin/write-tree.c: int cmd_write_tree(int argc, const char **argv, const char *cmd_prefix)
>     - 	};
>     - 
>     - 	git_config(git_default_config, NULL);
>     -+	
>     -+	prepare_repo_settings(the_repository);
>     -+	the_repository->settings.command_requires_full_index = 0;
>     -+
>       	argc = parse_options(argc, argv, cmd_prefix, write_tree_options,
>       			     write_tree_usage, 0);
>       
>     ++	prepare_repo_settings(the_repository);
>     ++	the_repository->settings.command_requires_full_index = 0;
>     ++	
>     + 	ret = write_cache_as_tree(&oid, flags, tree_prefix);
>     + 	switch (ret) {
>     + 	case 0:
>      
>       ## t/perf/p2000-sparse-operations.sh ##
>      @@ t/perf/p2000-sparse-operations.sh: test_perf_on_all git checkout-index -f --all
> 
> 
>  builtin/write-tree.c                     |  3 +++
>  t/perf/p2000-sparse-operations.sh        |  1 +
>  t/t1092-sparse-checkout-compatibility.sh | 28 ++++++++++++++++++++++++
>  3 files changed, 32 insertions(+)
> 
> diff --git a/builtin/write-tree.c b/builtin/write-tree.c
> index 45d61707e7..4492da0912 100644
> --- a/builtin/write-tree.c
> +++ b/builtin/write-tree.c
> @@ -38,6 +38,9 @@ int cmd_write_tree(int argc, const char **argv, const char *cmd_prefix)
>  	argc = parse_options(argc, argv, cmd_prefix, write_tree_options,
>  			     write_tree_usage, 0);
>  
> +	prepare_repo_settings(the_repository);
> +	the_repository->settings.command_requires_full_index = 0;
> +	
>  	ret = write_cache_as_tree(&oid, flags, tree_prefix);
>  	switch (ret) {
>  	case 0:
> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
> index 3242cfe91a..9924adfc26 100755
> --- a/t/perf/p2000-sparse-operations.sh
> +++ b/t/perf/p2000-sparse-operations.sh
> @@ -125,5 +125,6 @@ test_perf_on_all git checkout-index -f --all
>  test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
>  test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
>  test_perf_on_all git grep --cached --sparse bogus -- "f2/f1/f1/*"
> +test_perf_on_all git write-tree 
>  
>  test_done
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 801919009e..3b8191b390 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -2055,4 +2055,32 @@ test_expect_success 'grep sparse directory within submodules' '
>  	test_cmp actual expect
>  '
>  
> +test_expect_success 'write-tree on all' '

It's not clear what "on all" means in this context. If it's "write-tree with
changes both inside and outside the cone", then please either make that
explicit in the test name or simplify the name to just 'write-tree' (like
'clean').

> +	init_repos &&

It would be nice to have a baseline 'test_all_match git write-tree' before
making any changes to the index (as you do in the 'sparse-index is not
expanded: write-tree' test). 

> +
> +	write_script edit-contents <<-\EOF &&
> +	echo text >>"$1"
> +	EOF
> +
> +	run_on_all ../edit-contents deep/a &&
> +	run_on_all git update-index deep/a &&
> +	test_all_match git write-tree &&

First you make a change inside the sparse cone and 'write-tree'...

> +
> +	run_on_all mkdir -p folder1 &&
> +	run_on_all cp a folder1/a &&
> +	run_on_all ../edit-contents folder1/a &&
> +	run_on_all git update-index folder1/a &&
> +	test_all_match git write-tree

...then make a change outside the cone and 'write-tree' again. Makes sense.

However, there isn't any test of the working tree after 'write-tree' exits.
For example, I'd be interested in seeing a comparison of the output of 'git
status --porcelain=v2', as well as ensuring that SKIP_WORKTREE files weren't
materialized on disk in 'sparse-checkout' and 'sparse-index' (e.g.,
'folder2/a' shouldn't exist).

It also wouldn't hurt to 'test_all_match' on the 'git update-index' calls,
but I don't feel too strongly either way.

> +'
> +
> +test_expect_success 'sparse-index is not expanded: write-tree' '
> +	init_repos &&
> +
> +	ensure_not_expanded write-tree &&
> +
> +	echo "test1" >>sparse-index/a &&
> +	git -C sparse-index update-index a &&
> +	ensure_not_expanded write-tree 

This also looks good. 

> +'
> +
>  test_done

