git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <derrickstolee@github.com>
To: Junio C Hamano <gitster@pobox.com>,
	Raghul Nanth A via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Raghul Nanth A <nanth.raghul@gmail.com>
Subject: Re: [PATCH] describe: enable sparse index for describe
Date: Tue, 28 Mar 2023 15:46:38 -0400	[thread overview]
Message-ID: <ff521177-b0ad-c567-c51a-a6c191584d7c@github.com> (raw)
In-Reply-To: <xmqqjzz29hkw.fsf@gitster.g>

On 3/27/23 2:26 PM, Junio C Hamano wrote:
> "Raghul Nanth A via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>  builtin/describe.c                       |   2 +
>>  t/perf/p2000-sparse-operations.sh        |  14 +-
>>  t/t1092-sparse-checkout-compatibility.sh |  10 +
>>  t/t6121-describe-sparse.sh               | 675 +++++++++++++++++++++++
>>  4 files changed, 697 insertions(+), 4 deletions(-)
>>  create mode 100755 t/t6121-describe-sparse.sh
> 
> This copying of a file with 600+ lines only to touch up a handful
> lines (like a 20+ lines patch) is almost criminal.  Imagine the
> effort to keep them in sync over time, when "describe" itself may
> learn new features and improved output, independent from the
> sparse-index compatibility.
> 
> Can't we do better than this with a bit of refactoring?
> 
>> diff --git a/builtin/describe.c b/builtin/describe.c
>> index 5b5930f5c8c..7ff9b5e4b20 100644
>> --- a/builtin/describe.c
>> +++ b/builtin/describe.c
>> @@ -654,6 +654,8 @@ int cmd_describe(int argc, const char **argv, const char *prefix)
>>  			int fd, result;
>>  
>>  			setup_work_tree();
>> +			prepare_repo_settings(the_repository);
>> +			the_repository->settings.command_requires_full_index = 0;
> 
> Offhand, the only case I know that "describe" even _needs_ a working
> tree or the index is when asked to do the "--dirty" thing.  To
> figure out if the working tree files are modified, the code calls
> into run_diff_index(), but has that codepath been made sparse-index
> aware already?

It seems that this is a case where we can rely on the existing
changes around run_diff_index(), which is nice. We get a very
easy win for a narrow case.

And I agree about the test case situation. It would suffice to
show some checks that the result is the same across all cases
in t1092 for 'git describe --dirty'. Those should be the only
new correctness tests necessary for this change.

>> diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh

>> @@ -86,7 +89,8 @@ test_expect_success 'setup repo and indexes' '
>>  		git sparse-checkout set $SPARSE_CONE &&
>>  		git config index.version 4 &&
>>  		git update-index --index-version=4 &&
>> -		git checkout HEAD~4
>> +		git checkout HEAD~4 &&
>> +		git tag -a v1.0 -m "Final"
>>  	)
>>  '
> 
> It is unclear from the proposed commit log what the relevance of
> adding a step to create an annotated tag to these tests.  It is not
> like any later step uses that tag to figure out anything.  There may
> be good reasons to add these tags (otherwise you would not be adding
> them to these tests), but please explain why in the proposed log
> message so that future readers of the "git log -p" do not have to
> ask this question.

I imagine that 'git describe' reports something better when a tag
is reachable from HEAD. Would be good to make that clear.

Indeed, when removing these lines and running the test on a repo
without any tags, the test fails with this message:

  fatal: No names found, cannot describe anything.

These tags could be added earlier in the test, in one step:

diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index 3242cfe91a0..ba13317c942 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -45,6 +45,7 @@ test_expect_success 'setup repo and indexes' '
 	git sparse-checkout init --cone &&
 	git sparse-checkout set $SPARSE_CONE &&
 	git checkout -b wide $OLD_COMMIT &&
+	git tag -a v1.0 -m "final" &&
 
 	for l2 in f1 f2 f3 f4
 	do

The tests then run on the four examples cloned from this copy.

>> @@ -125,5 +129,7 @@ test_perf_on_all git checkout-index -f --all
>>  test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
>>  test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
>>  test_perf_on_all git grep --cached --sparse bogus -- "f2/f1/f1/*"
>> +test_perf_on_all git describe --dirty
>> +test_perf_on_all 'echo >> new && git describe --dirty'
>>  
>>  test_done
> 
> Just like '>', '>>' is a rediraction operator and should have SP
> before it (you got it right) and no SP between it and its operand.
> I.e.
> 
> 	echo >>new && git describe --dirty
> 
> You have the same in t1092, I think.

Also, since you are adding these performance tests, it would be
nice to see their results in the commit message. You can get
values without and with this change using (from t/perf/):

GIT_PERF_REPEAT_COUNT=10 ./run HEAD~1 HEAD -- p2000-sparse-operations.sh

For example, I ran this on my machine (after deleting the other tests
so it ran faster) and got these results:

Test                                                      HEAD~1            HEAD                  
--------------------------------------------------------------------------------------------------
2000.2: git describe --dirty (full-v3)                    0.36(0.07+0.32)   0.45(0.08+0.37) +25.0%
2000.3: git describe --dirty (full-v4)                    0.39(0.08+0.32)   0.42(0.08+0.35) +7.7% 
2000.4: git describe --dirty (sparse-v3)                  1.49(0.91+0.58)   0.33(0.04+0.59) -77.9%
2000.5: git describe --dirty (sparse-v4)                  1.48(0.92+0.57)   0.34(0.04+0.60) -77.0%
2000.6: echo >> new && git describe --dirty (full-v3)     0.37(0.07+0.32)   0.44(0.08+0.36) +18.9%
2000.7: echo >> new && git describe --dirty (full-v4)     0.40(0.08+0.32)   0.42(0.08+0.36) +5.0% 
2000.8: echo >> new && git describe --dirty (sparse-v3)   1.59(0.97+0.62)   0.33(0.04+0.57) -79.2%
2000.9: echo >> new && git describe --dirty (sparse-v4)   1.64(0.98+0.64)   0.31(0.03+0.54) -81.1%

Thanks,
-Stolee

  reply	other threads:[~2023-03-28 19:46 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-27 14:20 [PATCH] describe: enable sparse index for describe Raghul Nanth A via GitGitGadget
2023-03-27 18:26 ` Junio C Hamano
2023-03-28 19:46   ` Derrick Stolee [this message]
2023-03-28 20:24     ` Junio C Hamano
2023-03-28 20:35       ` Derrick Stolee
2023-03-29 16:25 ` [PATCH v2] " Raghul Nanth A via GitGitGadget
2023-03-29 17:00   ` Junio C Hamano
2023-03-29 17:49   ` Victoria Dye
2023-03-29 18:27     ` Junio C Hamano
2023-03-30 16:10     ` Raghul Nanth
2023-04-03 16:37       ` Victoria Dye
2023-03-30  5:59   ` [PATCH v3] " Raghul Nanth A via GitGitGadget
2023-03-30 14:57     ` Junio C Hamano
2023-03-30 15:13       ` Junio C Hamano
2023-03-30 16:23     ` Victoria Dye
2023-03-31 15:43       ` [GSOC][PATCH] " Raghul Nanth A
2023-03-31 16:34         ` Junio C Hamano
2023-03-31 18:20     ` [GSOC][PATCH v4] " Raghul Nanth A
2023-04-03 16:34       ` Victoria Dye
2023-04-03 16:47       ` [GSOC][PATCH v5] " Raghul Nanth A
2023-04-03  7:35     ` [PATCH v4] " Raghul Nanth A

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ff521177-b0ad-c567-c51a-a6c191584d7c@github.com \
    --to=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=nanth.raghul@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).