From: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
To: git@vger.kernel.org
Cc: derrickstolee@github.com, vdye@github.com,
Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Subject: [PATCH v2 2/2] builtin/grep.c: integrate with sparse index
Date: Mon, 29 Aug 2022 16:28:43 -0700
Message-ID: <20220829232843.183711-3-shaoxuan.yuan02@gmail.com>
In-Reply-To: <20220829232843.183711-1-shaoxuan.yuan02@gmail.com>
Turn on sparse index for `git grep` and make the ensure_full_index()
call conditional, so that the index is only expanded when using --sparse.
With a sparse index, the index read time of `git grep --cached` drops
by ~99.4% in the p2000 test repos (the section after the table explains
how these numbers were collected).
Test                                  Before   After
-----------------------------------------------------------------------------
git grep --cached bogus (full-v3)     0.019    0.018  (-5.2%)
git grep --cached bogus (full-v4)     0.017    0.016  (-5.8%)
git grep --cached bogus (sparse-v3)   0.29     0.0015 (-99.4%)
git grep --cached bogus (sparse-v4)   0.30     0.0018 (-99.4%)
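The percentages in the table are relative changes, that is
(After - Before) / Before. A minimal sketch of that arithmetic follows;
the `pct_change` helper name is hypothetical and is not part of this
patch or of the perf suite. Because the table shows rounded timings,
recomputing from them may differ from the printed percentage in the
last digit.

```shell
#!/bin/sh
# pct_change BEFORE AFTER
# Hypothetical helper: prints the relative change from BEFORE to AFTER
# as a percentage with one decimal place (negative means faster).
pct_change () {
	awk -v b="$1" -v a="$2" \
		'BEGIN { printf "%.1f%%\n", (a - b) / b * 100 }'
}
```

For example, `pct_change 0.30 0.0018` recomputes the sparse-v4 row.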
Optional reading about performance test results
-----------------------------------------------
Note that because `git grep` needs to parse the blobs referenced by the
index, the index reading time is minuscule compared to the object
parsing time. Because of this, the p2000 test results cannot clearly
reflect the speedup for index reading: once the object parsing time is
included, the aggregate times of HEAD~1 and HEAD are extremely close.
Hence, the results presented here are not taken directly from the
p2000 test results. Instead, to make the performance difference more
visible, the test command was run manually with GIT_TRACE2_PERF in the
four repos (full-v3, sparse-v3, full-v4, sparse-v4). The numbers here
were then extracted from the time difference between the "region_enter"
and "region_leave" events with the label "do_read_index".
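The extraction step described above could look roughly like the sketch
below. It is not the script used to produce the table. It assumes the
pipe-separated GIT_TRACE2_PERF layout in which the elapsed time of a
region is printed three columns after the "region_leave" event name;
the `read_index_times` function name is hypothetical.

```shell
#!/bin/sh
# read_index_times TRACE_FILE
# Hypothetical helper: prints the elapsed time of every "do_read_index"
# region recorded in a GIT_TRACE2_PERF log, one value per line.
# Assumption: columns are separated by "|" and the relative (elapsed)
# time sits three columns after the "region_leave" event column.
read_index_times () {
	awk -F'|' '
		/label:do_read_index/ {
			for (i = 1; i <= NF; i++) {
				f = $i
				gsub(/^ +| +$/, "", f)   # trim padding around the field
				if (f == "region_leave") {
					t = $(i + 3)
					gsub(/ /, "", t)
					print t
				}
			}
		}
	' "$1"
}
```

A possible invocation, run inside one of the test repos, is
`GIT_TRACE2_PERF=/tmp/grep.perf git grep --cached bogus` followed by
`read_index_times /tmp/grep.perf`; the trace path is only an example
and must be absolute.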
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
builtin/grep.c | 10 ++++++++--
t/perf/p2000-sparse-operations.sh | 1 +
t/t1092-sparse-checkout-compatibility.sh | 18 ++++++++++++++++++
3 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/builtin/grep.c b/builtin/grep.c
index 12abd832fa..a0b4dbc1dc 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -522,8 +522,9 @@ static int grep_cache(struct grep_opt *opt,
if (repo_read_index(repo) < 0)
die(_("index file corrupt"));
- /* TODO: audit for interaction with sparse-index. */
- ensure_full_index(repo->index);
+ if (grep_sparse)
+ ensure_full_index(repo->index);
+
for (nr = 0; nr < repo->index->cache_nr; nr++) {
const struct cache_entry *ce = repo->index->cache[nr];
@@ -992,6 +993,11 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
PARSE_OPT_KEEP_DASHDASH |
PARSE_OPT_STOP_AT_NON_OPTION);
+ if (the_repository->gitdir) {
+ prepare_repo_settings(the_repository);
+ the_repository->settings.command_requires_full_index = 0;
+ }
+
if (use_index && !startup_info->have_repository) {
int fallback = 0;
git_config_get_bool("grep.fallbacktonoindex", &fallback);
diff --git a/t/perf/p2000-sparse-operations.sh b/t/perf/p2000-sparse-operations.sh
index fce8151d41..9a466fcbbe 100755
--- a/t/perf/p2000-sparse-operations.sh
+++ b/t/perf/p2000-sparse-operations.sh
@@ -124,5 +124,6 @@ test_perf_on_all git read-tree -mu HEAD
test_perf_on_all git checkout-index -f --all
test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
+test_perf_on_all git grep --cached bogus
test_done
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index a6a14c8a21..270b47840b 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -1972,4 +1972,22 @@ test_expect_success 'sparse index is not expanded: rm' '
ensure_not_expanded rm -r deep
'
+test_expect_success 'grep with --sparse and --cached' '
+ init_repos &&
+
+ test_all_match git grep --sparse --cached a &&
+ test_all_match git grep --sparse --cached a -- "folder1/*"
+'
+
+test_expect_success 'grep is not expanded' '
+ init_repos &&
+
+ ensure_not_expanded grep a &&
+ ensure_not_expanded grep a -- deep/* &&
+
+ # All files within the folder1/* pathspec are sparse,
+ # so this command does not find any matches
+ ensure_not_expanded ! grep a -- folder1/*
+'
+
test_done
--
2.37.0