Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat
@ 2020-10-17 21:04 Nipunn Koorapati via GitGitGadget
  2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
                   ` (4 more replies)
  0 siblings, 5 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati

Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx.
I've rebased it and am submitting it now.

With fsmonitor enabled, git diff currently lstats every file in the repo
This makes use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

I was able to do some testing with/without this change in a large in-house
repo (~ 400k files)

-----------------------------------------
(1) With fsmonitor enabled - on master of git (2.29.0)
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.64    4.358994          10    446257         3 lstat
  0.12    0.005353           7       764       360 open

(A subsequent call)
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.84    4.380955          10    444904         3 lstat
  0.06    0.002564         135        19           munmap
...

-----------------------------------------
(2) With fsmonitor enabled - with my patch
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.72    0.003090         163        19           munmap
 19.63    0.001196         598         2           futex
...
  0.00    0.000000           0         4         3 lstat


-----------------------------------------
(3) With fsmonitor disabled entirely
-----------------------------------------

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.52    0.277085       92362         3           futex
  0.27    0.000752           4       191        63 open
...
  0.14    0.000397           3       158         3 lstat

I encoded this into a perf test with results as follow:

On master (2.29)

Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.36(0.73+0.62)
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)           0.85(0.30+0.54)
7519.7: status (fsmonitor=)                                      0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=)                                1.53(0.93+1.32)
7519.10: diff (fsmonitor=)                                       0.34(0.26+0.81)

With this patch

Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.84(1.70+1.76)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.35(0.81+0.53)
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)           0.15(0.11+0.05)
7519.7: status (fsmonitor=)                                      0.71(0.54+0.90)
7519.8: status -uno (fsmonitor=)                                 0.38(0.30+0.81)
7519.9: status -uall (fsmonitor=)                                1.55(0.93+1.34)
7519.10: diff (fsmonitor=)                                       0.35(0.32+0.76)

Alex Vandiver (1):
  fsmonitor: use fsmonitor data in `git diff`

Nipunn Koorapati (3):
  t/perf/README: elaborate on output format
  t/perf/p7519-fsmonitor.sh: warm cache on first git status
  t/perf: add fsmonitor perf test for git diff

 diff-lib.c                | 17 +++++++++++++++--
 t/perf/README             |  2 ++
 t/perf/p7519-fsmonitor.sh | 19 ++++++++++++++++++-
 3 files changed, 35 insertions(+), 3 deletions(-)


base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/756
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
@ 2020-10-17 21:04 ` Alex Vandiver via GitGitGadget
  2020-10-17 22:25   ` Junio C Hamano
  2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Alex Vandiver via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Alex Vandiver

From: Alex Vandiver <alexmv@dropbox.com>

With fsmonitor enabled, the first call to match_stat_with_submodule
calls refresh_fsmonitor, incurring the overhead of reading the list of
updated files -- but run_diff_files does not respect the
CE_FSMONITOR_VALID flag.

Make use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

Notably, this change improves performance of the git shell prompt when
GIT_PS1_SHOWDIRTYSTATE is set.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 diff-lib.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index f95c6de75f..b7ee1b89ef 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 	diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/");
 
+	refresh_fsmonitor(istate);
+
 	if (diff_unmerged_stage < 0)
 		diff_unmerged_stage = 2;
 	entries = istate->cache_nr;
@@ -197,8 +199,19 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		if (ce_uptodate(ce) || ce_skip_worktree(ce))
 			continue;
 
-		/* If CE_VALID is set, don't look at workdir for file removal */
-		if (ce->ce_flags & CE_VALID) {
+		/*
+		 * If CE_VALID is set, the user has promised us that the workdir
+		 * hasn't changed compared to index, so don't stat workdir
+		 * for file removal
+		 *  eg - via git udpate-index --assume-unchanged
+		 *  eg - via core.ignorestat=true
+		 *
+		 * When using FSMONITOR:
+		 * If CE_FSMONITOR_VALID is set, then we know the metadata on disk
+		 * has not changed since the last refresh, and we can skip the
+		 * file-removal checks without doing the stat in check_removed.
+		 */
+		if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) {
 			changed = 0;
 			newmode = ce->ce_mode;
 		} else {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 2/4] t/perf/README: elaborate on output format
  2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
@ 2020-10-17 21:04 ` Nipunn Koorapati via GitGitGadget
  2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/README | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/perf/README b/t/perf/README
index bd649afa97..fb9127a66f 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -28,6 +28,8 @@ the tests on the current git repository.
     7810.3: grep --cached, cheap regex       3.07(3.02+0.25)
     7810.4: grep --cached, expensive regex   9.39(30.57+0.24)
 
+Output format is in seconds "Elapsed(User + System)"
+
 You can compare multiple repositories and even git revisions with the
 'run' script:
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status
  2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
  2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
@ 2020-10-17 21:04 ` Nipunn Koorapati via GitGitGadget
  2020-10-18  4:22   ` Taylor Blau
  2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  4 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

The first git status would be inflated due to warming of
filesystem cache. This makes the results comparable.

Before
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.36(0.73+0.62)
7519.7: status (fsmonitor=)                                      0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=)                                1.53(0.93+1.32)

After
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index def7ecdbc7..9313d4a51d 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" '
 	fi &&
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
-	git update-index --fsmonitor
+	git update-index --fsmonitor &&
+	git status  # Warm caches
 '
 
 if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
@ 2020-10-17 21:04 ` Nipunn Koorapati via GitGitGadget
  2020-10-17 22:28   ` Junio C Hamano
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  4 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Results for the git-diff fsmonitor optimization
in patch in the parent-rev (using a 400k file repo to test)

As you can see here - git diff with fsmonitor running is
significantly better with this patch series (80% faster on my
workload)!

On master (2.29)

Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)           0.82(0.24+0.58)
7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)
7519.10: diff (fsmonitor=)                                       0.34(0.35+0.72)

With this patch series

Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.07)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.12+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.35(0.73+0.61)
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)           0.14(0.10+0.05)
7519.7: status (fsmonitor=)                                      0.70(0.56+0.87)
7519.8: status -uno (fsmonitor=)                                 0.37(0.31+0.79)
7519.9: status -uall (fsmonitor=)                                1.54(0.97+1.29)
7519.10: diff (fsmonitor=)                                       0.34(0.28+0.79)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 9313d4a51d..80d0148557 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -142,6 +142,14 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
 test_expect_success "setup without fsmonitor" '
 	unset INTEGRATION_SCRIPT &&
 	git config --unset core.fsmonitor &&
@@ -172,6 +180,14 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
 if test_have_prereq WATCHMAN
 then
 	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
@ 2020-10-17 22:25   ` Junio C Hamano
  2020-10-18  0:54     ` Nipunn Koorapati
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2020-10-17 22:25 UTC (permalink / raw)
  To: Alex Vandiver via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Alex Vandiver

"Alex Vandiver via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Alex Vandiver <alexmv@dropbox.com>
>
> With fsmonitor enabled, the first call to match_stat_with_submodule
> calls refresh_fsmonitor, incurring the overhead of reading the list of
> updated files -- but run_diff_files does not respect the
> CE_FSMONITOR_VALID flag.

run_diff_files() is used not just by "git diff" but other things
like "git add", so if we get an overall speed-up without having to
pay undue cost, that would be a very good news.

> diff --git a/diff-lib.c b/diff-lib.c
> index f95c6de75f..b7ee1b89ef 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  
>  	diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/");
>  
> +	refresh_fsmonitor(istate);
> +

"git diff" and friends are often run with pathspec, but the API into
the fsmonitor, refresh_fsmonitor() call, has no way to say "I only
am interested in the status of this directory and everything else
does not matter".  How expensive would this call to accept fsmonitor
data for the entire tree be, and would there eventually be a point
where the number of paths we are interested in checking (i.e. the
paths that would match the pathspec) is so small that we would be
better off not making this call?  E.g. if we are checking more than
20% of the working tree, running refresh_fsmonitor() for the entire
working tree is still a win, but if we are only checking less than
that, we are better off without fsmonitor, or does a tradeoff like
that exist?

> @@ -197,8 +199,19 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  		if (ce_uptodate(ce) || ce_skip_worktree(ce))
>  			continue;
>  
> -		/* If CE_VALID is set, don't look at workdir for file removal */
> -		if (ce->ce_flags & CE_VALID) {
> +		/*
> +		 * If CE_VALID is set, the user has promised us that the workdir
> +		 * hasn't changed compared to index, so don't stat workdir
> +		 * for file removal

The above seems to be an attempt to elaborate on the existing
comment, but ...

> +		 *  eg - via git udpate-index --assume-unchanged
> +		 *  eg - via core.ignorestat=true

... what are these two lines doing here?  It makes no sense to say
"Don't stat workdir for file removal by doing 'git update-index' or
by seetting core.ignorestat", but the placement of these two lines
makes it look as if that is what you are saying.  Perhaps

	When CE_VALID is set (via "update-index --assume-unchanged"
	or via adding paths while core.ignorestat is set to true),
	the user has promised ..., so don't stat workdir for removed
	files.

would probably be what you meant bo say.

> +		 * When using FSMONITOR:
> +		 * If CE_FSMONITOR_VALID is set, then we know the metadata on disk
> +		 * has not changed since the last refresh, and we can skip the
> +		 * file-removal checks without doing the stat in check_removed.

An iffy description.  You skip all the file-removal check by not
calling check_removed() as a whole.

This is not the fault of this patch, but in any case, the
description places too much stress on "removal" when in reality,
removal is not all that special in this codepath.  The check_removed
call also contributes to noticiing modified (not removed) files.  If
we are updating the comment here, we should correct that too,
perhaps

	When CE_VALID is set (via "update-index --assume-unchanged"
	or via adding paths while core.ignorestat is set to true),
	the user has promised that the working tree file for that
	path will not be modified.  When CE_FSMONITOR_VALID is true,
	the fsmonitor knows that the path hasn't been modified since
	we refreshed the cached stat information.  In either case,
	we do not have to stat to see if the path has been removed
	or modified.

or something like that, perhaps.

> +		 */
> +		if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) {

Would it become easier to read, if written like this instead?

		if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {

That reflects what the suggested comment says better.

>  			changed = 0;
>  			newmode = ce->ce_mode;
>  		} else {

Thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
@ 2020-10-17 22:28   ` Junio C Hamano
  0 siblings, 0 replies; 52+ messages in thread
From: Junio C Hamano @ 2020-10-17 22:28 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

"Nipunn Koorapati via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
> +	git diff

This is a whole-tree diff.  It would be interesting to also see if a
meaningful tradeoff exists if a test is run with a tree with say 100
top-level subdirectories but with just one of them covered by a
pathspec, with many modified paths sprinkled all over.

> +'
> +
>  if test_have_prereq WATCHMAN
>  then
>  	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-17 22:25   ` Junio C Hamano
@ 2020-10-18  0:54     ` Nipunn Koorapati
  2020-10-18  4:17       ` Taylor Blau
  0 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-18  0:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah,
	Alex Vandiver

> run_diff_files() is used not just by "git diff" but other things
> like "git add", so if we get an overall speed-up without having to
> pay undue cost, that would be a very good news.

Agreed! I may be able to write perf benchmark tests to highlight
benefits to git add as well.

> 20% of the working tree, running refresh_fsmonitor() for the entire
> working tree is still a win, but if we are only checking less than
> that, we are better off without fsmonitor, or does a tradeoff like
> that exist?

My understanding is that refresh_fsmonitor is
O(delta_since_last_refresh) - so for developers
with large repositories - this cost will amortize out over subsequent
commands, so I don't
think it's worth investigating this tradeoff here.
As a user of large repositories, I expect that my major source of
fsmonitor activity to be user
intent (eg git pull, or intentionally copying/editing a large number
of files). After such a command,
I expect my next git command to be slower - that would be unsurprising.

I think the tradeoff could be made for small diff requests, but I
don't think it's worth adding complexity here -
as that user will just have to pay the cost on their next git command.

> > +              *  eg - via git udpate-index --assume-unchanged
> > +              *  eg - via core.ignorestat=true
>
> ... what are these two lines doing here?

Intended to indicate potential ways that CE_VALID might be set. When I
was reading the source
here, it was pretty difficult to determine how this would be set.
Agree that I picked unfortunate wording.
Thanks for the suggestions. Will update in the next iteration.

>
> would probably be what you meant bo say.
>
>         When CE_VALID is set (via "update-index --assume-unchanged"
>         or via adding paths while core.ignorestat is set to true),
>         the user has promised that the working tree file for that
>         path will not be modified.  When CE_FSMONITOR_VALID is true,
>         the fsmonitor knows that the path hasn't been modified since
>         we refreshed the cached stat information.  In either case,
>         we do not have to stat to see if the path has been removed
>         or modified.
>
> or something like that, perhaps.

Sounds good. Will clarify. I like your comment better as well.

>
> > +              */
> > +             if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) {
>
> Would it become easier to read, if written like this instead?
>
>                 if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {

I personally find this more confusing because it involves multiple
bitwise ops, but this
is potentially due to me having more mental practice thinking about
boolean operators vs bitwise operators.
I'm more than happy to align with the common pattern of the repo. I'll
change this.

>
> Thanks.

Thank you for the thorough review!

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-18  0:54     ` Nipunn Koorapati
@ 2020-10-18  4:17       ` Taylor Blau
  2020-10-18  5:02         ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Taylor Blau @ 2020-10-18  4:17 UTC (permalink / raw)
  To: Nipunn Koorapati
  Cc: Junio C Hamano, Alex Vandiver via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Alex Vandiver

On Sun, Oct 18, 2020 at 01:54:44AM +0100, Nipunn Koorapati wrote:
> > 20% of the working tree, running refresh_fsmonitor() for the entire
> > working tree is still a win, but if we are only checking less than
> > that, we are better off without fsmonitor, or does a tradeoff like
> > that exist?
>
> My understanding is that refresh_fsmonitor is
> O(delta_since_last_refresh) - so for developers
> with large repositories - this cost will amortize out over subsequent
> commands, so I don't
> think it's worth investigating this tradeoff here.
> As a user of large repositories, I expect that my major source of
> fsmonitor activity to be user
> intent (eg git pull, or intentionally copying/editing a large number
> of files). After such a command,
> I expect my next git command to be slower - that would be unsurprising.
>
> I think the tradeoff could be made for small diff requests, but I
> don't think it's worth adding complexity here -
> as that user will just have to pay the cost on their next git command.

Hmm. I do agree that I'd like to stay out of the business of trying to
figure out exactly what that trade-off is (although I'm sure that it
exists), only because it seems likely to vary to a large extent from
repository to repository. (That is, 20% may be a good number for some
repository, but a terrible choice for another).

But, I think that we can invoke watchman better here; the
fsmonitor-watchman hook has no notion of a "pathspec", so every query
just asks for everything that isn't in '$GIT_DIR'. Is there anything
preventing us from taking an optional pathspec and building up a more
targeted query?

There is some overhead to invoke the hook and talk to watchman, but
I'd expect that to be dwarfed by not having to issue O(# files)
syscalls.

> >
> > > +              */
> > > +             if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) {
> >
> > Would it become easier to read, if written like this instead?
> >
> >                 if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {
>
> I personally find this more confusing because it involves multiple
> bitwise ops, but this
> is potentially due to me having more mental practice thinking about
> boolean operators vs bitwise operators.
> I'm more than happy to align with the common pattern of the repo. I'll
> change this.

I don't have an opinion, nor do I think that git.git has an established
practice of doing one over the other. For what it's worth, my two-cents
is that Junio's suggestion is easier to read.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status
  2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
@ 2020-10-18  4:22   ` Taylor Blau
  0 siblings, 0 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-18  4:22 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

On Sat, Oct 17, 2020 at 09:04:35PM +0000, Nipunn Koorapati via GitGitGadget wrote:
> From: Nipunn Koorapati <nipunn@dropbox.com>
>
> The first git status would be inflated due to warming of
> filesystem cache. This makes the results comparable.
>
> Before
> Test                                                             this tree
> --------------------------------------------------------------------------------
> 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.52(1.59+1.56)
> 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.12+0.06)
> 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.36(0.73+0.62)
> 7519.7: status (fsmonitor=)                                      0.69(0.52+0.90)
> 7519.8: status -uno (fsmonitor=)                                 0.37(0.28+0.81)
> 7519.9: status -uall (fsmonitor=)                                1.53(0.93+1.32)
>
> After
> Test                                                             this tree
> --------------------------------------------------------------------------------
> 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
> 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
> 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
> 7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
> 7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
> 7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)

Note that you can directly compare results with the perf suite's "run"
script by passing two revisions in addition to the test that you want to
run and have the results aggregated side-by-side.

In your case, you'd want something like (within the t/perf directory):

  $ ./run HEAD . p7519-*.sh

where this patch is the uncommitted state (alternatively you could
compare the two revisions directly in the case that you have already
committed).

> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index def7ecdbc7..9313d4a51d 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" '
>  	fi &&
>
>  	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
> -	git update-index --fsmonitor
> +	git update-index --fsmonitor &&
> +	git status  # Warm caches

Seems reasonable, and the comment is much appreciated :-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-18  4:17       ` Taylor Blau
@ 2020-10-18  5:02         ` Junio C Hamano
  2020-10-18 23:43           ` Taylor Blau
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2020-10-18  5:02 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati, Alex Vandiver via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Alex Vandiver

Taylor Blau <me@ttaylorr.com> writes:

> Hmm. I do agree that I'd like to stay out of the business of trying to
> figure out exactly what that trade-off is (although I'm sure that it
> exists), only because it seems likely to vary to a large extent from
> repository to repository. (That is, 20% may be a good number for some
> repository, but a terrible choice for another).

I think both of you misunderstood me.  

My question was a simple yes/no "does there a trade off exist?"
question and the sentences with 20% in it were mere example of
possible trade-off I had in mind that _could_ exist.  I wasn't even
suggesting to figure out what the optimum cut-off heuristics would
be (e.g. solving "when more than N% paths are subject to diff
fsmonitor is faster" for N).

I was hoping that we can show that even having to lstat just a
single path is expensive enough---IOW, "there is no trade-off worth
worrying about, because talking to fsmonitor is so cheap compared to
the cost of even a single lstst" would have been a valid and happy
answer.  With such a number, there is no risk of introducing an
unwarranted performance regression to use cases that we did not
anticipate by adding an unconditional call to refresh_fsmonitor().

But without any rationale, the performance implication of adding an
unconditional call to refresh_fsmonitor() would become much muddier.

> But, I think that we can invoke watchman better here; the
> fsmonitor-watchman hook has no notion of a "pathspec", so every query
> just asks for everything that isn't in '$GIT_DIR'. Is there anything
> preventing us from taking an optional pathspec and building up a more
> targeted query?

Yup, it is what I had in mind when I brought up the pathspec.  It
may be something worth pursuing longer term, but not within the
scope of this patch.

> There is some overhead to invoke the hook and talk to watchman, but
> I'd expect that to be dwarfed by not having to issue O(# files)
> syscalls.

"invoke the hook"---is that a pipe+fork+exec, or something else that
is far lighter-weight?

n

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-18  5:02         ` Junio C Hamano
@ 2020-10-18 23:43           ` Taylor Blau
  2020-10-19 17:23             ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Taylor Blau @ 2020-10-18 23:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, Nipunn Koorapati, Alex Vandiver via GitGitGadget,
	git, Derrick Stolee, Utsav Shah, Alex Vandiver

On Sat, Oct 17, 2020 at 10:02:04PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Hmm. I do agree that I'd like to stay out of the business of trying to
> > figure out exactly what that trade-off is (although I'm sure that it
> > exists), only because it seems likely to vary to a large extent from
> > repository to repository. (That is, 20% may be a good number for some
> > repository, but a terrible choice for another).
>
> I think both of you misunderstood me.
>
> My question was a simple yes/no "does there a trade off exist?"
> question and the sentences with 20% in it were mere example of
> possible trade-off I had in mind that _could_ exist.  I wasn't even
> suggesting to figure out what the optimum cut-off heuristics would
> be (e.g. solving "when more than N% paths are subject to diff
> fsmonitor is faster" for N).
>
> I was hoping that we can show that even having to lstat just a
> single path is expensive enough---IOW, "there is no trade-off worth
> worrying about, because talking to fsmonitor is so cheap compared to
> the cost of even a single lstst" would have been a valid and happy
> answer.  With such a number, there is no risk of introducing an
> unwarranted performance regression to use cases that we did not
> anticipate by adding an unconditional call to refresh_fsmonitor().
>
> But without any rationale, the performance implication of adding an
> unconditional call to refresh_fsmonitor() would become much muddier.

Aha; thanks for clarifying. I'm glad we agree that finding 'N' would not
be worth it, or at least that showing that talking to fsmonitor is
cheaper than a single lstat would be more worthwhile.

Nipunn - I don't have fsmonitor/watchman setup on my workstation, but if
you do, some numbers (or an interpretation of the numbers you already
provided) on this would be really useful. If you don't have it set up,
or don't have time to measure it, let me know, and I'd be happy to take
a look.

> > But, I think that we can invoke watchman better here; the
> > fsmonitor-watchman hook has no notion of a "pathspec", so every query
> > just asks for everything that isn't in '$GIT_DIR'. Is there anything
> > preventing us from taking an optional pathspec and building up a more
> > targeted query?
>
> Yup, it is what I had in mind when I brought up the pathspec.  It
> may be something worth pursuing longer term, but not within the
> scope of this patch.
>
> > There is some overhead to invoke the hook and talk to watchman, but
> > I'd expect that to be dwarfed by not having to issue O(# files)
> > syscalls.
>
> "invoke the hook"---is that a pipe+fork+exec, or something else that
> is far lighter-weight?

The former; see 'fsmonitor.c:query_fsmonitor()'.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-18 23:43           ` Taylor Blau
@ 2020-10-19 17:23             ` Junio C Hamano
  2020-10-19 17:37               ` Taylor Blau
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2020-10-19 17:23 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati, Alex Vandiver via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Alex Vandiver

Taylor Blau <me@ttaylorr.com> writes:

>> > There is some overhead to invoke the hook and talk to watchman, but
>> > I'd expect that to be dwarfed by not having to issue O(# files)
>> > syscalls.
>>
>> "invoke the hook"---is that a pipe+fork+exec, or something else that
>> is far lighter-weight?
>
> The former; see 'fsmonitor.c:query_fsmonitor()'.

It brings us back to the "overhead of how many lstat(2) takes us
closer to the overhead of a single pipe+fork+exec plus reading from
the pipe", doesn't it?


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-19 17:23             ` Junio C Hamano
@ 2020-10-19 17:37               ` Taylor Blau
  2020-10-19 18:07                 ` Nipunn Koorapati
  0 siblings, 1 reply; 52+ messages in thread
From: Taylor Blau @ 2020-10-19 17:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, Nipunn Koorapati, Alex Vandiver via GitGitGadget,
	git, Derrick Stolee, Utsav Shah, Alex Vandiver

On Mon, Oct 19, 2020 at 10:23:26AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >> > There is some overhead to invoke the hook and talk to watchman, but
> >> > I'd expect that to be dwarfed by not having to issue O(# files)
> >> > syscalls.
> >>
> >> "invoke the hook"---is that a pipe+fork+exec, or something else that
> >> is far lighter-weight?
> >
> > The former; see 'fsmonitor.c:query_fsmonitor()'.
>
> It brings us back to the "overhead of how many lstat(2) takes us
> closer to the overhead of a single pipe+fork+exec plus reading from
> the pipe", doesn't it?

Somewhat unfortunately, yes. Hopefully any user that cares to use
fsmonitor has enough files in their repository that a pipe+fork+exec is
still faster than however many lstats they would have needed otherwise.

Of course, finding out what that number is is still interesting...

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-19 17:37               ` Taylor Blau
@ 2020-10-19 18:07                 ` Nipunn Koorapati
  0 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-19 18:07 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Junio C Hamano, Alex Vandiver via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Alex Vandiver

> It brings us back to the "overhead of how many lstat(2) takes us
> closer to the overhead of a single pipe+fork+exec plus reading from
> the pipe", doesn't it?
>

I will add a benchmark for a `git diff -- <pathspec>`

> Somewhat unfortunately, yes. Hopefully any user that cares to use
> fsmonitor has enough files in their repository that a pipe+fork+exec is
> still faster than however many lstats they would have needed otherwise.
>
> Of course, finding out what that number is is still interesting...

I can try to do some manual testing to figure this out. Doesn't seem like the
type of thing we'd want to add to the benchmark, as it would involve running
git diff on a variety of pathspec workloads

--Nipunn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat
  2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
@ 2020-10-19 21:35 ` Nipunn Koorapati via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
                     ` (4 more replies)
  4 siblings, 5 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati

Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx.
I've rebased it and am submitting it now.

With fsmonitor enabled, git diff currently lstats every file in the repo
This makes use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

I was able to do some testing with/without this change in a large in-house
repo (~ 400k files).

-----------------------------------------
(1) With fsmonitor enabled - on master of git (2.29.0)
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.64    4.358994          10    446257         3 lstat
  0.12    0.005353           7       764       360 open

(A subsequent call)
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.84    4.380955          10    444904         3 lstat
  0.06    0.002564         135        19           munmap
...

-----------------------------------------
(2) With fsmonitor enabled - with my patch
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.72    0.003090         163        19           munmap
 19.63    0.001196         598         2           futex
...
  0.00    0.000000           0         4         3 lstat


-----------------------------------------
(3) With fsmonitor disabled entirely
-----------------------------------------

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.52    0.277085       92362         3           futex
  0.27    0.000752           4       191        63 open
...
  0.14    0.000397           3       158         3 lstat

I was able to encode this into a perf test in one of the commits.

Changes since Patch Series V1

 * Add git diff -- <pathspec> to perf tests
 * improve readability of bitwise ops

Alex Vandiver (1):
  fsmonitor: use fsmonitor data in `git diff`

Nipunn Koorapati (3):
  t/perf/README: elaborate on output format
  t/perf/p7519-fsmonitor.sh: warm cache on first git status
  t/perf: add fsmonitor perf test for git diff

 diff-lib.c                | 15 ++++++--
 t/perf/README             |  2 ++
 t/perf/p7519-fsmonitor.sh | 74 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 88 insertions(+), 3 deletions(-)


base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/756

Range-diff vs v1:

 1:  13fd992a37 ! 1:  cba03dd40b fsmonitor: use fsmonitor data in `git diff`
     @@ diff-lib.c: int run_diff_files(struct rev_info *revs, unsigned int option)
      -		/* If CE_VALID is set, don't look at workdir for file removal */
      -		if (ce->ce_flags & CE_VALID) {
      +		/*
     -+		 * If CE_VALID is set, the user has promised us that the workdir
     -+		 * hasn't changed compared to index, so don't stat workdir
     -+		 * for file removal
     -+		 *  eg - via git udpate-index --assume-unchanged
     -+		 *  eg - via core.ignorestat=true
     -+		 *
     -+		 * When using FSMONITOR:
     -+		 * If CE_FSMONITOR_VALID is set, then we know the metadata on disk
     -+		 * has not changed since the last refresh, and we can skip the
     -+		 * file-removal checks without doing the stat in check_removed.
     ++		 * When CE_VALID is set (via "update-index --assume-unchanged"
     ++		 * or via adding paths while core.ignorestat is set to true),
     ++		 * the user has promised that the working tree file for that
     ++		 * path will not be modified.  When CE_FSMONITOR_VALID is true,
     ++		 * the fsmonitor knows that the path hasn't been modified since
     ++		 * we refreshed the cached stat information.  In either case,
     ++		 * we do not have to stat to see if the path has been removed
     ++		 * or modified.
      +		 */
     -+		if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) {
     ++		if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {
       			changed = 0;
       			newmode = ce->ce_mode;
       		} else {
 2:  024cd07965 = 2:  1c7876166f t/perf/README: elaborate on output format
 3:  6482e372bc = 3:  401f696c81 t/perf/p7519-fsmonitor.sh: warm cache on first git status
 4:  0613b07676 ! 4:  f572e226bb t/perf: add fsmonitor perf test for git diff
     @@ Commit message
          significantly better with this patch series (80% faster on my
          workload)!
      
     -    On master (2.29)
     +    GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh
      
     -    Test                                                             this tree
     -    --------------------------------------------------------------------------------
     -    7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
     -    7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
     -    7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
     -    7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)           0.82(0.24+0.58)
     -    7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
     -    7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
     -    7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)
     -    7519.10: diff (fsmonitor=)                                       0.34(0.35+0.72)
     +    Test                                                                     v2.29.0-rc1       this tree
     +    -----------------------------------------------------------------------------------------------------------------
     +    7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.46(0.82+0.64)   1.47(0.83+0.62) +0.7%
     +    7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.12+0.04)   0.17(0.12+0.05) +6.3%
     +    7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.73+0.62)   1.37(0.76+0.60) +0.7%
     +    7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.85(0.22+0.63)   0.14(0.10+0.05) -83.5%
     +    7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.08+0.05)   0.13(0.11+0.02) +8.3%
     +    7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.08+0.04)   0.13(0.09+0.04) +8.3%
     +    7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.07+0.05)   0.13(0.07+0.06) +8.3%
     +    7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.09+0.04)   0.13(0.08+0.05) +8.3%
     +    7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.09+0.05)   0.13(0.10+0.03) -7.1%
     +    7519.12: status (fsmonitor=)                                             1.67(0.93+1.49)   1.67(0.99+1.42) +0.0%
     +    7519.13: status -uno (fsmonitor=)                                        0.37(0.30+0.82)   0.37(0.33+0.79) +0.0%
     +    7519.14: status -uall (fsmonitor=)                                       1.58(0.97+1.35)   1.57(0.86+1.45) -0.6%
     +    7519.15: diff (fsmonitor=)                                               0.34(0.28+0.83)   0.34(0.27+0.83) +0.0%
     +    7519.16: diff -- 0_files (fsmonitor=)                                    0.09(0.06+0.04)   0.09(0.08+0.02) +0.0%
     +    7519.17: diff -- 10_files (fsmonitor=)                                   0.09(0.07+0.03)   0.09(0.06+0.05) +0.0%
     +    7519.18: diff -- 100_files (fsmonitor=)                                  0.09(0.06+0.04)   0.09(0.06+0.04) +0.0%
     +    7519.19: diff -- 1000_files (fsmonitor=)                                 0.09(0.06+0.04)   0.09(0.05+0.05) +0.0%
     +    7519.20: diff -- 10000_files (fsmonitor=)                                0.10(0.08+0.04)   0.10(0.06+0.05) +0.0%
      
     -    With this patch series
     +    I also added a benchmark for a tiny git diff workload w/ a pathspec.
     +    I see an approximately .02 second overhead added w/ and w/o fsmonitor
      
     -    Test                                                             this tree
     -    --------------------------------------------------------------------------------
     -    7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.07)
     -    7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.12+0.05)
     -    7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.35(0.73+0.61)
     -    7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)           0.14(0.10+0.05)
     -    7519.7: status (fsmonitor=)                                      0.70(0.56+0.87)
     -    7519.8: status -uno (fsmonitor=)                                 0.37(0.31+0.79)
     -    7519.9: status -uall (fsmonitor=)                                1.54(0.97+1.29)
     -    7519.10: diff (fsmonitor=)                                       0.34(0.28+0.79)
     +    From looking at these results, I suspected that refresh_fsmonitor
     +    is already happening during git diff - independent of this patch
     +    series' optimization. Confirmed that suspicion by breaking on
     +    refresh_fsmonitor.
     +
     +    (gdb) bt  [simplified]
     +    0  refresh_fsmonitor  at fsmonitor.c:176
     +    1  ie_match_stat  at read-cache.c:375
     +    2  match_stat_with_submodule at diff-lib.c:237
     +    4  builtin_diff_files  at builtin/diff.c:260
     +    5  cmd_diff  at builtin/diff.c:541
     +    6  run_builtin  at git.c:450
     +    7  handle_builtin  at git.c:700
     +    8  run_argv  at git.c:767
     +    9  cmd_main  at git.c:898
     +    10 main  at common-main.c:52
      
          Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
      
       ## t/perf/p7519-fsmonitor.sh ##
     +@@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup for fsmonitor" '
     + 
     + 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
     + 	git update-index --fsmonitor &&
     ++	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
     ++	for i in `seq 1 10`; do touch 10_files/$i; done &&
     ++	for i in `seq 1 100`; do touch 100_files/$i; done &&
     ++	for i in `seq 1 1000`; do touch 1000_files/$i; done &&
     ++	for i in `seq 1 10000`; do touch 10000_files/$i; done &&
     ++	git add 1_file 10_files 100_files 1000_files 10000_files &&
     ++	git commit -m "Add files" &&
     + 	git status  # Warm caches
     + '
     + 
      @@ t/perf/p7519-fsmonitor.sh: test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
       	git status -uall
       '
     @@ t/perf/p7519-fsmonitor.sh: test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIP
      +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
      +	git diff
      +'
     ++
     ++if test -n "$GIT_PERF_7519_DROP_CACHE"; then
     ++	test-tool drop-caches
     ++fi
     ++
     ++test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 1_file
     ++'
     ++
     ++test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 10_files
     ++'
     ++
     ++test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 100_files
     ++'
     ++
     ++test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 1000_files
     ++'
     ++
     ++test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 10000_files
     ++'
      +
       test_expect_success "setup without fsmonitor" '
       	unset INTEGRATION_SCRIPT &&
     @@ t/perf/p7519-fsmonitor.sh: test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIP
      +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
      +	git diff
      +'
     ++
     ++if test -n "$GIT_PERF_7519_DROP_CACHE"; then
     ++	test-tool drop-caches
     ++fi
     ++
     ++test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 1_file
     ++'
     ++
     ++test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 10_files
     ++'
     ++
     ++test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 100_files
     ++'
     ++
     ++test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 1000_files
     ++'
     ++
     ++test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
     ++	git diff -- 10000_files
     ++'
      +
       if test_have_prereq WATCHMAN
       then

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff`
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
@ 2020-10-19 21:35   ` Alex Vandiver via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 52+ messages in thread
From: Alex Vandiver via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Alex Vandiver

From: Alex Vandiver <alexmv@dropbox.com>

With fsmonitor enabled, the first call to match_stat_with_submodule
calls refresh_fsmonitor, incurring the overhead of reading the list of
updated files -- but run_diff_files does not respect the
CE_FSMONITOR_VALID flag.

Make use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

Notably, this change improves performance of the git shell prompt when
GIT_PS1_SHOWDIRTYSTATE is set.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 diff-lib.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index f95c6de75f..d2d31b9f82 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 	diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/");
 
+	refresh_fsmonitor(istate);
+
 	if (diff_unmerged_stage < 0)
 		diff_unmerged_stage = 2;
 	entries = istate->cache_nr;
@@ -197,8 +199,17 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		if (ce_uptodate(ce) || ce_skip_worktree(ce))
 			continue;
 
-		/* If CE_VALID is set, don't look at workdir for file removal */
-		if (ce->ce_flags & CE_VALID) {
+		/*
+		 * When CE_VALID is set (via "update-index --assume-unchanged"
+		 * or via adding paths while core.ignorestat is set to true),
+		 * the user has promised that the working tree file for that
+		 * path will not be modified.  When CE_FSMONITOR_VALID is true,
+		 * the fsmonitor knows that the path hasn't been modified since
+		 * we refreshed the cached stat information.  In either case,
+		 * we do not have to stat to see if the path has been removed
+		 * or modified.
+		 */
+		if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {
 			changed = 0;
 			newmode = ce->ce_mode;
 		} else {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v2 2/4] t/perf/README: elaborate on output format
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
@ 2020-10-19 21:35   ` Nipunn Koorapati via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/README | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/perf/README b/t/perf/README
index bd649afa97..fb9127a66f 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -28,6 +28,8 @@ the tests on the current git repository.
     7810.3: grep --cached, cheap regex       3.07(3.02+0.25)
     7810.4: grep --cached, expensive regex   9.39(30.57+0.24)
 
+Output format is in seconds "Elapsed(User + System)"
+
 You can compare multiple repositories and even git revisions with the
 'run' script:
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
@ 2020-10-19 21:35   ` Nipunn Koorapati via GitGitGadget
  2020-10-19 21:35   ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  4 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

The first git status would be inflated due to warming of
filesystem cache. This makes the results comparable.

Before
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.36(0.73+0.62)
7519.7: status (fsmonitor=)                                      0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=)                                1.53(0.93+1.32)

After
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index def7ecdbc7..9313d4a51d 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" '
 	fi &&
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
-	git update-index --fsmonitor
+	git update-index --fsmonitor &&
+	git status  # Warm caches
 '
 
 if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-10-19 21:35   ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
@ 2020-10-19 21:35   ` Nipunn Koorapati via GitGitGadget
  2020-10-19 21:43     ` Taylor Blau
  2020-10-19 21:54     ` Taylor Blau
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  4 siblings, 2 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Results for the git-diff fsmonitor optimization
in patch in the parent-rev (using a 400k file repo to test)

As you can see here - git diff with fsmonitor running is
significantly better with this patch series (80% faster on my
workload)!

GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh

Test                                                                     v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.46(0.82+0.64)   1.47(0.83+0.62) +0.7%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.12+0.04)   0.17(0.12+0.05) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.73+0.62)   1.37(0.76+0.60) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.85(0.22+0.63)   0.14(0.10+0.05) -83.5%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.08+0.05)   0.13(0.11+0.02) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.08+0.04)   0.13(0.09+0.04) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.07+0.05)   0.13(0.07+0.06) +8.3%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.09+0.04)   0.13(0.08+0.05) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.09+0.05)   0.13(0.10+0.03) -7.1%
7519.12: status (fsmonitor=)                                             1.67(0.93+1.49)   1.67(0.99+1.42) +0.0%
7519.13: status -uno (fsmonitor=)                                        0.37(0.30+0.82)   0.37(0.33+0.79) +0.0%
7519.14: status -uall (fsmonitor=)                                       1.58(0.97+1.35)   1.57(0.86+1.45) -0.6%
7519.15: diff (fsmonitor=)                                               0.34(0.28+0.83)   0.34(0.27+0.83) +0.0%
7519.16: diff -- 0_files (fsmonitor=)                                    0.09(0.06+0.04)   0.09(0.08+0.02) +0.0%
7519.17: diff -- 10_files (fsmonitor=)                                   0.09(0.07+0.03)   0.09(0.06+0.05) +0.0%
7519.18: diff -- 100_files (fsmonitor=)                                  0.09(0.06+0.04)   0.09(0.06+0.04) +0.0%
7519.19: diff -- 1000_files (fsmonitor=)                                 0.09(0.06+0.04)   0.09(0.05+0.05) +0.0%
7519.20: diff -- 10000_files (fsmonitor=)                                0.10(0.08+0.04)   0.10(0.06+0.05) +0.0%

I also added a benchmark for a tiny git diff workload w/ a pathspec.
I see an approximately .02 second overhead added w/ and w/o fsmonitor

From looking at these results, I suspected that refresh_fsmonitor
is already happening during git diff - independent of this patch
series' optimization. Confirmed that suspicion by breaking on
refresh_fsmonitor.

(gdb) bt  [simplified]
0  refresh_fsmonitor  at fsmonitor.c:176
1  ie_match_stat  at read-cache.c:375
2  match_stat_with_submodule at diff-lib.c:237
4  builtin_diff_files  at builtin/diff.c:260
5  cmd_diff  at builtin/diff.c:541
6  run_builtin  at git.c:450
7  handle_builtin  at git.c:700
8  run_argv  at git.c:767
9  cmd_main  at git.c:898
10 main  at common-main.c:52

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 9313d4a51d..2b4803707f 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" '
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
 	git update-index --fsmonitor &&
+	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
+	for i in `seq 1 10`; do touch 10_files/$i; done &&
+	for i in `seq 1 100`; do touch 100_files/$i; done &&
+	for i in `seq 1 1000`; do touch 1000_files/$i; done &&
+	for i in `seq 1 10000`; do touch 10000_files/$i; done &&
+	git add 1_file 10_files 100_files 1000_files 10000_files &&
+	git commit -m "Add files" &&
 	git status  # Warm caches
 '
 
@@ -142,6 +149,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1_file
+'
+
+test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10_files
+'
+
+test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 100_files
+'
+
+test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1000_files
+'
+
+test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10000_files
+'
+
 test_expect_success "setup without fsmonitor" '
 	unset INTEGRATION_SCRIPT &&
 	git config --unset core.fsmonitor &&
@@ -172,6 +211,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1_file
+'
+
+test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10_files
+'
+
+test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 100_files
+'
+
+test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1000_files
+'
+
+test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10000_files
+'
+
 if test_have_prereq WATCHMAN
 then
 	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-19 21:35   ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
@ 2020-10-19 21:43     ` Taylor Blau
  2020-10-19 21:54     ` Taylor Blau
  1 sibling, 0 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-19 21:43 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati,
	Nipunn Koorapati, Taylor Blau

On Mon, Oct 19, 2020 at 09:35:15PM +0000, Nipunn Koorapati via GitGitGadget wrote:
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 9313d4a51d..2b4803707f 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" '
>
>  	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
>  	git update-index --fsmonitor &&
> +	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
> +	for i in `seq 1 10`; do touch 10_files/$i; done &&
> +	for i in `seq 1 100`; do touch 100_files/$i; done &&
> +	for i in `seq 1 1000`; do touch 1000_files/$i; done &&
> +	for i in `seq 1 10000`; do touch 10000_files/$i; done &&

I just happened to notice these while reading your range diff; git
discourages the use of seq in test, instead preferring our own
works-everywhere 'test_seq()'.

I was wondering how this slipped through since it should be checked
automatically by t/check-non-portable-shell.pl, but that is only run
from t/Makefile, not t/perf/Makefile. That probably explains how a few
raw `seq`'s made it into t/perf.

In either case, test_seq() is preferred here.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-19 21:35   ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
  2020-10-19 21:43     ` Taylor Blau
@ 2020-10-19 21:54     ` Taylor Blau
  2020-10-19 22:00       ` Nipunn Koorapati
  2020-10-19 22:25       ` Nipunn Koorapati
  1 sibling, 2 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-19 21:54 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati,
	Nipunn Koorapati, Taylor Blau

On Mon, Oct 19, 2020 at 09:35:15PM +0000, Nipunn Koorapati via GitGitGadget wrote:
> From: Nipunn Koorapati <nipunn@dropbox.com>
>
> Results for the git-diff fsmonitor optimization
> in patch in the parent-rev (using a 400k file repo to test)
>
> As you can see here - git diff with fsmonitor running is
> significantly better with this patch series (80% faster on my
> workload)!

These t/perf numbers are very helpful, at least to me.

> GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh
>
> Test                                                                     v2.29.0-rc1       this tree
> -----------------------------------------------------------------------------------------------------------------
> 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.46(0.82+0.64)   1.47(0.83+0.62) +0.7%
> 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.12+0.04)   0.17(0.12+0.05) +6.3%
> 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.73+0.62)   1.37(0.76+0.60) +0.7%

Looks like about 0.01sec of overhead, which seems like an acceptable
trade-off for when the user has at least 10,000 files.

This reminds me; did you look at the 'git add' performance change? I
recall Junio mentioning that 'git add' takes the same paths in the code.

> 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.85(0.22+0.63)   0.14(0.10+0.05) -83.5%
> 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.08+0.05)   0.13(0.11+0.02) +8.3%
> 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.08+0.04)   0.13(0.09+0.04) +8.3%
> 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.07+0.05)   0.13(0.07+0.06) +8.3%
> 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.09+0.04)   0.13(0.08+0.05) +8.3%
> 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.09+0.05)   0.13(0.10+0.03) -7.1%

OK... so having fsmonitor turned on adds an imperceptible amount of
slow-down to cases where there are [0, 10000) files. But, in exchange,
you get much-improved whole-tree performance, as well as single-tree
performance when that tree contains at least 10,000 files.

I was going to say that this has little downside, because turning on
fsmonitor is probably a good indicator that you don't have any fewer
than 10,000 files in your repository, but I think that's missing the
point. Likely true, but that doesn't exclude the possibility of having
sub-10,000 file directories, which users may very well still be
diff-ing.

So, there's a slow-down, but it's hard to complain when you consider
what we get in exchange.

> 7519.12: status (fsmonitor=)                                             1.67(0.93+1.49)   1.67(0.99+1.42) +0.0%
> 7519.13: status -uno (fsmonitor=)                                        0.37(0.30+0.82)   0.37(0.33+0.79) +0.0%
> 7519.14: status -uall (fsmonitor=)                                       1.58(0.97+1.35)   1.57(0.86+1.45) -0.6%
> 7519.15: diff (fsmonitor=)                                               0.34(0.28+0.83)   0.34(0.27+0.83) +0.0%
> 7519.16: diff -- 0_files (fsmonitor=)                                    0.09(0.06+0.04)   0.09(0.08+0.02) +0.0%
> 7519.17: diff -- 10_files (fsmonitor=)                                   0.09(0.07+0.03)   0.09(0.06+0.05) +0.0%
> 7519.18: diff -- 100_files (fsmonitor=)                                  0.09(0.06+0.04)   0.09(0.06+0.04) +0.0%
> 7519.19: diff -- 1000_files (fsmonitor=)                                 0.09(0.06+0.04)   0.09(0.05+0.05) +0.0%
> 7519.20: diff -- 10000_files (fsmonitor=)                                0.10(0.08+0.04)   0.10(0.06+0.05) +0.0%

Great! No slow-down without fsmonitor enabled, as expected. Fantastic.

> I also added a benchmark for a tiny git diff workload w/ a pathspec.
> I see an approximately .02 second overhead added w/ and w/o fsmonitor
>
> From looking at these results, I suspected that refresh_fsmonitor
> is already happening during git diff - independent of this patch
> series' optimization. Confirmed that suspicion by breaking on
> refresh_fsmonitor.

So, the overhead that we're paying is purely the pipe+fork+exec? I.e.,
that watchman has already computed an answer in the earlier call, and we
just have to read it again (or find out that the last results were
unchanged)?

> (gdb) bt  [simplified]
> 0  refresh_fsmonitor  at fsmonitor.c:176
> 1  ie_match_stat  at read-cache.c:375
> 2  match_stat_with_submodule at diff-lib.c:237
> 4  builtin_diff_files  at builtin/diff.c:260
> 5  cmd_diff  at builtin/diff.c:541
> 6  run_builtin  at git.c:450
> 7  handle_builtin  at git.c:700
> 8  run_argv  at git.c:767
> 9  cmd_main  at git.c:898
> 10 main  at common-main.c:52

:-).

> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
> ---
>  t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 71 insertions(+)
>
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 9313d4a51d..2b4803707f 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" '

Everything in here looks very reasonable to me, except for the seq vs.
test_seq() issue that I pointed out in another email in this thread.

It's too bad that we have to write these twice, but that's not the fault
of your patch.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-19 21:54     ` Taylor Blau
@ 2020-10-19 22:00       ` Nipunn Koorapati
  2020-10-19 22:02         ` Taylor Blau
  2020-10-19 22:25       ` Nipunn Koorapati
  1 sibling, 1 reply; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-19 22:00 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee,
	Utsav Shah, Nipunn Koorapati

> I was wondering how this slipped through since it should be checked
> automatically by t/check-non-portable-shell.pl, but that is only run
> from t/Makefile, not t/perf/Makefile. That probably explains how a few
> raw `seq`'s made it into t/perf.

Makefile issue is easy enough to fix - will fix in another commit

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-19 22:00       ` Nipunn Koorapati
@ 2020-10-19 22:02         ` Taylor Blau
  0 siblings, 0 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-19 22:02 UTC (permalink / raw)
  To: Nipunn Koorapati
  Cc: Taylor Blau, Nipunn Koorapati via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Nipunn Koorapati

On Mon, Oct 19, 2020 at 11:00:05PM +0100, Nipunn Koorapati wrote:
> > I was wondering how this slipped through since it should be checked
> > automatically by t/check-non-portable-shell.pl, but that is only run
> > from t/Makefile, not t/perf/Makefile. That probably explains how a few
> > raw `seq`'s made it into t/perf.
>
> Makefile issue is easy enough to fix - will fix in another commit

Thanks; I don't think that you need to worry about touching up t/perf's
Makefile (although I'd be very grateful if you did!), but rather just
swapping new invocations of seq to use 'test_seq()' would be sufficient
in and of itself.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff
  2020-10-19 21:54     ` Taylor Blau
  2020-10-19 22:00       ` Nipunn Koorapati
@ 2020-10-19 22:25       ` Nipunn Koorapati
  1 sibling, 0 replies; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-19 22:25 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee,
	Utsav Shah, Nipunn Koorapati

> This reminds me; did you look at the 'git add' performance change? I
> recall Junio mentioning that 'git add' takes the same paths in the code.

I did look into it and didn't see a big perf change - and didn't dig into why.
I'll leave a perf test in the next roll of this patch series so you can see the
numbers.

--Nipunn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat
  2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                     ` (3 preceding siblings ...)
  2020-10-19 21:35   ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47   ` Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
                       ` (7 more replies)
  4 siblings, 8 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati

Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx.
I've rebased it and am submitting it now.

With fsmonitor enabled, git diff currently lstats every file in the repo
This makes use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

I was able to do some testing with/without this change in a large in-house
repo (~ 400k files).

-----------------------------------------
(1) With fsmonitor enabled - on master of git (2.29.0)
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.64    4.358994          10    446257         3 lstat
  0.12    0.005353           7       764       360 open

(A subsequent call)
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.84    4.380955          10    444904         3 lstat
  0.06    0.002564         135        19           munmap
...

-----------------------------------------
(2) With fsmonitor enabled - with my patch
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.72    0.003090         163        19           munmap
 19.63    0.001196         598         2           futex
...
  0.00    0.000000           0         4         3 lstat


-----------------------------------------
(3) With fsmonitor disabled entirely
-----------------------------------------

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.52    0.277085       92362         3           futex
  0.27    0.000752           4       191        63 open
...
  0.14    0.000397           3       158         3 lstat

I was able to encode this into a perf test in one of the commits.

Changes since Patch Series V1

 * Add git diff -- <pathspec> to perf tests
 * improve readability of bitwise ops

Alex Vandiver (1):
  fsmonitor: use fsmonitor data in `git diff`

Nipunn Koorapati (6):
  t/perf/README: elaborate on output format
  t/perf/p7519-fsmonitor.sh: warm cache on first git status
  t/perf: add fsmonitor perf test for git diff
  perf lint: check test-lint-shell-syntax in perf tests
  p7519-fsmonitor: refactor to avoid code duplication
  p7519-fsmonitor: add a git add benchmark

 diff-lib.c                | 15 +++++-
 t/Makefile                |  3 +-
 t/perf/README             |  2 +
 t/perf/p3400-rebase.sh    |  6 +--
 t/perf/p7519-fsmonitor.sh | 96 ++++++++++++++++++++++-----------------
 5 files changed, 75 insertions(+), 47 deletions(-)


base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/756

Range-diff vs v2:

 1:  cba03dd40b = 1:  cba03dd40b fsmonitor: use fsmonitor data in `git diff`
 2:  1c7876166f = 2:  1c7876166f t/perf/README: elaborate on output format
 3:  401f696c81 = 3:  401f696c81 t/perf/p7519-fsmonitor.sh: warm cache on first git status
 4:  f572e226bb ! 4:  b3ad8faac4 t/perf: add fsmonitor perf test for git diff
     @@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup for fsmonitor" '
       	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
       	git update-index --fsmonitor &&
      +	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
     -+	for i in `seq 1 10`; do touch 10_files/$i; done &&
     -+	for i in `seq 1 100`; do touch 100_files/$i; done &&
     -+	for i in `seq 1 1000`; do touch 1000_files/$i; done &&
     -+	for i in `seq 1 10000`; do touch 10000_files/$i; done &&
     ++	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
     ++	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
     ++	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
     ++	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
      +	git add 1_file 10_files 100_files 1000_files 10000_files &&
      +	git commit -m "Add files" &&
       	git status  # Warm caches
 -:  ---------- > 5:  28c1e488bf perf lint: check test-lint-shell-syntax in perf tests
 -:  ---------- > 6:  b38f2984f9 p7519-fsmonitor: refactor to avoid code duplication
 -:  ---------- > 7:  d392a523f2 p7519-fsmonitor: add a git add benchmark

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff`
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47     ` Alex Vandiver via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Alex Vandiver via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Alex Vandiver

From: Alex Vandiver <alexmv@dropbox.com>

With fsmonitor enabled, the first call to match_stat_with_submodule
calls refresh_fsmonitor, incurring the overhead of reading the list of
updated files -- but run_diff_files does not respect the
CE_FSMONITOR_VALID flag.

Make use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

Notably, this change improves performance of the git shell prompt when
GIT_PS1_SHOWDIRTYSTATE is set.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 diff-lib.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index f95c6de75f..d2d31b9f82 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 	diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/");
 
+	refresh_fsmonitor(istate);
+
 	if (diff_unmerged_stage < 0)
 		diff_unmerged_stage = 2;
 	entries = istate->cache_nr;
@@ -197,8 +199,17 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		if (ce_uptodate(ce) || ce_skip_worktree(ce))
 			continue;
 
-		/* If CE_VALID is set, don't look at workdir for file removal */
-		if (ce->ce_flags & CE_VALID) {
+		/*
+		 * When CE_VALID is set (via "update-index --assume-unchanged"
+		 * or via adding paths while core.ignorestat is set to true),
+		 * the user has promised that the working tree file for that
+		 * path will not be modified.  When CE_FSMONITOR_VALID is true,
+		 * the fsmonitor knows that the path hasn't been modified since
+		 * we refreshed the cached stat information.  In either case,
+		 * we do not have to stat to see if the path has been removed
+		 * or modified.
+		 */
+		if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {
 			changed = 0;
 			newmode = ce->ce_mode;
 		} else {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 2/7] t/perf/README: elaborate on output format
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
@ 2020-10-19 22:47     ` Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/README | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/perf/README b/t/perf/README
index bd649afa97..fb9127a66f 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -28,6 +28,8 @@ the tests on the current git repository.
     7810.3: grep --cached, cheap regex       3.07(3.02+0.25)
     7810.4: grep --cached, expensive regex   9.39(30.57+0.24)
 
+Output format is in seconds "Elapsed(User + System)"
+
 You can compare multiple repositories and even git revisions with the
 'run' script:
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47     ` Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

The first git status would be inflated due to warming of
filesystem cache. This makes the results comparable.

Before
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.36(0.73+0.62)
7519.7: status (fsmonitor=)                                      0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=)                                1.53(0.93+1.32)

After
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index def7ecdbc7..9313d4a51d 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" '
 	fi &&
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
-	git update-index --fsmonitor
+	git update-index --fsmonitor &&
+	git status  # Warm caches
 '
 
 if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-10-19 22:47     ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47     ` Nipunn Koorapati via GitGitGadget
  2020-10-19 22:47     ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Results for the git-diff fsmonitor optimization
in patch in the parent-rev (using a 400k file repo to test)

As you can see here - git diff with fsmonitor running is
significantly better with this patch series (80% faster on my
workload)!

GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh

Test                                                                     v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.46(0.82+0.64)   1.47(0.83+0.62) +0.7%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.12+0.04)   0.17(0.12+0.05) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.73+0.62)   1.37(0.76+0.60) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.85(0.22+0.63)   0.14(0.10+0.05) -83.5%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.08+0.05)   0.13(0.11+0.02) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.08+0.04)   0.13(0.09+0.04) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.07+0.05)   0.13(0.07+0.06) +8.3%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.09+0.04)   0.13(0.08+0.05) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.09+0.05)   0.13(0.10+0.03) -7.1%
7519.12: status (fsmonitor=)                                             1.67(0.93+1.49)   1.67(0.99+1.42) +0.0%
7519.13: status -uno (fsmonitor=)                                        0.37(0.30+0.82)   0.37(0.33+0.79) +0.0%
7519.14: status -uall (fsmonitor=)                                       1.58(0.97+1.35)   1.57(0.86+1.45) -0.6%
7519.15: diff (fsmonitor=)                                               0.34(0.28+0.83)   0.34(0.27+0.83) +0.0%
7519.16: diff -- 0_files (fsmonitor=)                                    0.09(0.06+0.04)   0.09(0.08+0.02) +0.0%
7519.17: diff -- 10_files (fsmonitor=)                                   0.09(0.07+0.03)   0.09(0.06+0.05) +0.0%
7519.18: diff -- 100_files (fsmonitor=)                                  0.09(0.06+0.04)   0.09(0.06+0.04) +0.0%
7519.19: diff -- 1000_files (fsmonitor=)                                 0.09(0.06+0.04)   0.09(0.05+0.05) +0.0%
7519.20: diff -- 10000_files (fsmonitor=)                                0.10(0.08+0.04)   0.10(0.06+0.05) +0.0%

I also added a benchmark for a tiny git diff workload w/ a pathspec.
I see an approximately .02 second overhead added w/ and w/o fsmonitor

From looking at these results, I suspected that refresh_fsmonitor
is already happening during git diff - independent of this patch
series' optimization. Confirmed that suspicion by breaking on
refresh_fsmonitor.

(gdb) bt  [simplified]
0  refresh_fsmonitor  at fsmonitor.c:176
1  ie_match_stat  at read-cache.c:375
2  match_stat_with_submodule at diff-lib.c:237
4  builtin_diff_files  at builtin/diff.c:260
5  cmd_diff  at builtin/diff.c:541
6  run_builtin  at git.c:450
7  handle_builtin  at git.c:700
8  run_argv  at git.c:767
9  cmd_main  at git.c:898
10 main  at common-main.c:52

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 9313d4a51d..ef4c3c8c5c 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" '
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
 	git update-index --fsmonitor &&
+	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
+	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
+	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
+	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
+	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
+	git add 1_file 10_files 100_files 1000_files 10000_files &&
+	git commit -m "Add files" &&
 	git status  # Warm caches
 '
 
@@ -142,6 +149,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1_file
+'
+
+test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10_files
+'
+
+test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 100_files
+'
+
+test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1000_files
+'
+
+test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10000_files
+'
+
 test_expect_success "setup without fsmonitor" '
 	unset INTEGRATION_SCRIPT &&
 	git config --unset core.fsmonitor &&
@@ -172,6 +211,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1_file
+'
+
+test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10_files
+'
+
+test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 100_files
+'
+
+test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1000_files
+'
+
+test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10000_files
+'
+
 if test_have_prereq WATCHMAN
 then
 	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                       ` (3 preceding siblings ...)
  2020-10-19 22:47     ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47     ` Nipunn Koorapati via GitGitGadget
  2020-10-20  2:38       ` Taylor Blau
  2020-10-19 22:47     ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
                       ` (2 subsequent siblings)
  7 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Perf tests have some seq instead of test_seq. This
runs the existing tests on the perf tests as well.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/Makefile             | 3 ++-
 t/perf/p3400-rebase.sh | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/t/Makefile b/t/Makefile
index c83fd18861..74b53af2bd 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -34,6 +34,7 @@ CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP))
 T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh))
 TGITWEB = $(sort $(wildcard t95[0-9][0-9]-*.sh))
 THELPERS = $(sort $(filter-out $(T),$(wildcard *.sh)))
+TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
 CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
 CHAINLINT = sed -f chainlint.sed
 
@@ -91,7 +92,7 @@ test-lint-executable:
 		echo >&2 "non-executable tests:" $$bad; exit 1; }
 
 test-lint-shell-syntax:
-	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS)
+	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) $(TPERF)
 
 test-lint-filenames:
 	@# We do *not* pass a glob to ls-files but use grep instead, to catch
diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh
index d202aaed06..7a0bb29448 100755
--- a/t/perf/p3400-rebase.sh
+++ b/t/perf/p3400-rebase.sh
@@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' '
 	git checkout -f -B base &&
 	git checkout -B to-rebase &&
 	git checkout -B upstream &&
-	for i in $(seq 100)
+	for i in $(test_seq 100)
 	do
 		# simulate huge diffs
 		echo change$i >unrelated-file$i &&
-		seq 1000 >>unrelated-file$i &&
+		test_seq 1000 >>unrelated-file$i &&
 		git add unrelated-file$i &&
 		test_tick &&
 		git commit -m commit$i unrelated-file$i &&
 		echo change$i >unrelated-file$i &&
-		seq 1000 | tac >>unrelated-file$i &&
+		test_seq 1000 | tac >>unrelated-file$i &&
 		git add unrelated-file$i &&
 		test_tick &&
 		git commit -m commit$i-reverse unrelated-file$i ||
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                       ` (4 preceding siblings ...)
  2020-10-19 22:47     ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47     ` Nipunn Koorapati via GitGitGadget
  2020-10-20  2:43       ` Taylor Blau
  2020-10-19 22:47     ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  7 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Much of the benchmark code is redundant. This is
easier to understand and edit.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 136 +++++++++++---------------------------
 1 file changed, 37 insertions(+), 99 deletions(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index ef4c3c8c5c..75a0cef01d 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -125,61 +125,53 @@ test_expect_success "setup for fsmonitor" '
 	git status  # Warm caches
 '
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+test_perf_w_drop_caches () {
+	if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+		test-tool drop-caches
+	fi
 
-test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status
-'
+	test_perf "$@"
+}
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+test_fsmonitor_suite() {
+	test_perf_w_drop_caches "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git status
+	'
 
-test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uno
-'
+	test_perf_w_drop_caches "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git status -uno
+	'
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+	test_perf_w_drop_caches "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git status -uall
+	'
 
-test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uall
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff
-'
+	test_perf_w_drop_caches "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff
+	'
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+	test_perf_w_drop_caches "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 1_file
+	'
 
-test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1_file
-'
+	test_perf_w_drop_caches "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 10_files
+	'
 
-test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10_files
-'
+	test_perf_w_drop_caches "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 100_files
+	'
 
-test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 100_files
-'
+	test_perf_w_drop_caches "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 1000_files
+	'
 
-test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1000_files
-'
+	test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 10000_files
+	'
+}
 
-test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10000_files
-'
+test_fsmonitor_suite
 
 test_expect_success "setup without fsmonitor" '
 	unset INTEGRATION_SCRIPT &&
@@ -187,61 +179,7 @@ test_expect_success "setup without fsmonitor" '
 	git update-index --no-fsmonitor
 '
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uno
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uall
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1_file
-'
-
-test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10_files
-'
-
-test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 100_files
-'
-
-test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1000_files
-'
-
-test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10000_files
-'
+test_fsmonitor_suite
 
 if test_have_prereq WATCHMAN
 then
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                       ` (5 preceding siblings ...)
  2020-10-19 22:47     ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
@ 2020-10-19 22:47     ` Nipunn Koorapati via GitGitGadget
  2020-10-19 23:02       ` Nipunn Koorapati
  2020-10-20  2:40       ` Taylor Blau
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  7 siblings, 2 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Test                                                                     v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.48(0.79+0.67)   1.48(0.79+0.67) +0.0%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.11+0.05)   0.17(0.13+0.04) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.77+0.58)   1.37(0.72+0.63) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.84(0.21+0.63)   0.14(0.11+0.03) -83.3%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.07+0.05)   0.13(0.09+0.04) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.09+0.04)   0.13(0.07+0.06) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.08+0.05)   0.12(0.08+0.05) +0.0%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.08+0.05)   0.13(0.09+0.04) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.08+0.06)   0.13(0.07+0.06) -7.1%
7519.11: add (fsmonitor=.git/hooks/fsmonitor-watchman)                   2.75(1.41+1.27)   2.03(1.26+0.70) -26.2%
7519.13: status (fsmonitor=)                                             1.38(1.03+1.04)   1.37(1.04+1.04) -0.7%
7519.14: status -uno (fsmonitor=)                                        1.11(0.83+0.98)   1.10(0.89+0.90) -0.9%
7519.15: status -uall (fsmonitor=)                                       2.30(1.57+1.42)   2.31(1.49+1.50) +0.4%
7519.16: diff (fsmonitor=)                                               1.43(1.13+1.76)   1.46(1.19+1.72) +2.1%
7519.17: diff -- 0_files (fsmonitor=)                                    0.10(0.08+0.04)   0.11(0.08+0.04) +10.0%
7519.18: diff -- 10_files (fsmonitor=)                                   0.10(0.07+0.05)   0.11(0.08+0.04) +10.0%
7519.19: diff -- 100_files (fsmonitor=)                                  0.10(0.07+0.04)   0.11(0.07+0.05) +10.0%
7519.20: diff -- 1000_files (fsmonitor=)                                 0.10(0.08+0.03)   0.11(0.08+0.04) +10.0%
7519.21: diff -- 10000_files (fsmonitor=)                                0.11(0.08+0.05)   0.12(0.07+0.06) +9.1%
7519.22: add (fsmonitor=)                                                2.26(1.46+1.49)   2.27(1.42+1.55) +0.4%

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 75a0cef01d..fb20fe0937 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -169,6 +169,10 @@ test_fsmonitor_suite() {
 	test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
 		git diff -- 10000_files
 	'
+
+	test_perf_w_drop_caches "add (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git add  --all
+	'
 }
 
 test_fsmonitor_suite
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark
  2020-10-19 22:47     ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
@ 2020-10-19 23:02       ` Nipunn Koorapati
  2020-10-20  2:40       ` Taylor Blau
  1 sibling, 0 replies; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-19 23:02 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Taylor Blau

Actually found a 25% improvement here on git add with this patch series

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests
  2020-10-19 22:47     ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget
@ 2020-10-20  2:38       ` Taylor Blau
  2020-10-20  3:10         ` Junio C Hamano
  2020-10-20 10:09         ` Nipunn Koorapati
  0 siblings, 2 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-20  2:38 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati,
	Nipunn Koorapati, Taylor Blau

On Mon, Oct 19, 2020 at 10:47:35PM +0000, Nipunn Koorapati via GitGitGadget wrote:
>  test-lint-shell-syntax:
> -	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS)
> +	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) $(TPERF)

I really appreciate your initiative to modify t/Makefile to start
linting t/perf/p????-*.sh files, too. Could I bother you to elaborate a
little bit on why you chose to modify a recipe in t/Makefile instead of
t/perf/Makefile?

I'm not necessarily opposed, but having this in t/perf/Makefile would
allow me to just run 'make' in 't/perf' and still have the scripts
linted there without having to involve a 'make' in 't'.

For what it's worth, I suspect that this is because 't/Makefile' already
has a 'test-lint-shell-syntax' target, and 't/perf/Makefile' does not. I
think it would be OK to add it there, too, and move this change into
t/perf.

> diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh
> index d202aaed06..7a0bb29448 100755
> --- a/t/perf/p3400-rebase.sh
> +++ b/t/perf/p3400-rebase.sh
> @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' '
>  	git checkout -f -B base &&
>  	git checkout -B to-rebase &&
>  	git checkout -B upstream &&
> -	for i in $(seq 100)
> +	for i in $(test_seq 100)
>  	do
>  		# simulate huge diffs
>  		echo change$i >unrelated-file$i &&
> -		seq 1000 >>unrelated-file$i &&
> +		test_seq 1000 >>unrelated-file$i &&
>  		git add unrelated-file$i &&
>  		test_tick &&
>  		git commit -m commit$i unrelated-file$i &&
>  		echo change$i >unrelated-file$i &&
> -		seq 1000 | tac >>unrelated-file$i &&
> +		test_seq 1000 | tac >>unrelated-file$i &&

Makes sense. I wouldn't be opposed to breaking this out into an earlier
change (e.g., "it's about to become not OK to use seq in t/perf, so
prepare for that by replacing any invocations with test_seq()"), but I
think it's probably not worth it, since this patch is small as it is.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark
  2020-10-19 22:47     ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
  2020-10-19 23:02       ` Nipunn Koorapati
@ 2020-10-20  2:40       ` Taylor Blau
  1 sibling, 0 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-20  2:40 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati,
	Nipunn Koorapati, Taylor Blau

On Mon, Oct 19, 2020 at 10:47:37PM +0000, Nipunn Koorapati via GitGitGadget wrote:
> From: Nipunn Koorapati <nipunn@dropbox.com>
>
> Test                                                                     v2.29.0-rc1       this tree
> -----------------------------------------------------------------------------------------------------------------
> [...]
> 7519.22: add (fsmonitor=)                                                2.26(1.46+1.49)   2.27(1.42+1.55) +0.4%

Good; no huge slow-down here. Thanks for checking!

> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
> ---
>  t/perf/p7519-fsmonitor.sh | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 75a0cef01d..fb20fe0937 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -169,6 +169,10 @@ test_fsmonitor_suite() {
>  	test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
>  		git diff -- 10000_files
>  	'
> +
> +	test_perf_w_drop_caches "add (fsmonitor=$INTEGRATION_SCRIPT)" '
> +		git add  --all
> +	'
>  }
>
>  test_fsmonitor_suite
> --
> gitgitgadget

  Acked-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication
  2020-10-19 22:47     ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
@ 2020-10-20  2:43       ` Taylor Blau
  0 siblings, 0 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-20  2:43 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati,
	Nipunn Koorapati, Taylor Blau

On Mon, Oct 19, 2020 at 10:47:36PM +0000, Nipunn Koorapati via GitGitGadget wrote:
> From: Nipunn Koorapati <nipunn@dropbox.com>
>
> Much of the benchmark code is redundant. This is
> easier to understand and edit.
>
> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>

Much easier to read, thank you for taking the time to simplify the code.

I know that this is maybe more review than you were hoping for, but I
think it's been worth it and the series is in an even better state than
when you started.

  Acked-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests
  2020-10-20  2:38       ` Taylor Blau
@ 2020-10-20  3:10         ` Junio C Hamano
  2020-10-20  3:15           ` Taylor Blau
  2020-10-20 10:09         ` Nipunn Koorapati
  1 sibling, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2020-10-20  3:10 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee,
	Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

Taylor Blau <me@ttaylorr.com> writes:

>>  		echo change$i >unrelated-file$i &&
>> -		seq 1000 | tac >>unrelated-file$i &&
>> +		test_seq 1000 | tac >>unrelated-file$i &&
>
> Makes sense. I wouldn't be opposed to breaking this out into an earlier
> change (e.g., "it's about to become not OK to use seq in t/perf, so
> prepare for that by replacing any invocations with test_seq()"), but I
> think it's probably not worth it, since this patch is small as it is.

test_seq is fine, but I do not think tac is portable (only saved by
the fact that not many people, especially on exotic platforms, run
perf scripts).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests
  2020-10-20  3:10         ` Junio C Hamano
@ 2020-10-20  3:15           ` Taylor Blau
  2020-10-20 10:16             ` Nipunn Koorapati
  0 siblings, 1 reply; 52+ messages in thread
From: Taylor Blau @ 2020-10-20  3:15 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, Nipunn Koorapati via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati

On Mon, Oct 19, 2020 at 08:10:56PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >>  		echo change$i >unrelated-file$i &&
> >> -		seq 1000 | tac >>unrelated-file$i &&
> >> +		test_seq 1000 | tac >>unrelated-file$i &&
> >
> > Makes sense. I wouldn't be opposed to breaking this out into an earlier
> > change (e.g., "it's about to become not OK to use seq in t/perf, so
> > prepare for that by replacing any invocations with test_seq()"), but I
> > think it's probably not worth it, since this patch is small as it is.
>
> test_seq is fine, but I do not think tac is portable (only saved by
> the fact that not many people, especially on exotic platforms, run
> perf scripts).

Serves me right for reading while I'm tired! I glazed right over 'tac'.
If you need a truly unrelated file, you could write random data into it
(there are some examples in t/test-lib-functions.sh), but I'd just write
'test_seq 1001'.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests
  2020-10-20  2:38       ` Taylor Blau
  2020-10-20  3:10         ` Junio C Hamano
@ 2020-10-20 10:09         ` Nipunn Koorapati
  1 sibling, 0 replies; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-20 10:09 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee,
	Utsav Shah, Nipunn Koorapati

On Tue, Oct 20, 2020 at 3:39 AM Taylor Blau <me@ttaylorr.com> wrote:
>
> I'm not necessarily opposed, but having this in t/perf/Makefile would
> allow me to just run 'make' in 't/perf' and still have the scripts
> linted there without having to involve a 'make' in 't'.
>
> For what it's worth, I suspect that this is because 't/Makefile' already
> has a 'test-lint-shell-syntax' target, and 't/perf/Makefile' does not. I
> think it would be OK to add it there, too, and move this change into
> t/perf.

Looked at doing this and noticed that there are several targets in test-lint
in t/Makefile. This would involve duplicating them into t/perf/Makefile which
seems like it would be poor form, especially given their complexity.
Perhaps t/perf/Makefile could have a target which calls t/Makefile's test-lint
target instead. Will play around with it.

>
> Makes sense. I wouldn't be opposed to breaking this out into an earlier
> change (e.g., "it's about to become not OK to use seq in t/perf, so
> prepare for that by replacing any invocations with test_seq()"), but I
> think it's probably not worth it, since this patch is small as it is.
>

Yeah - I see the point, but I agree that since the patch is small,
it's ok this way.
If the patch grows significantly, I can make it into two patches

--Nipunn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests
  2020-10-20  3:15           ` Taylor Blau
@ 2020-10-20 10:16             ` Nipunn Koorapati
  0 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-20 10:16 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Junio C Hamano, Nipunn Koorapati via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Nipunn Koorapati

> > test_seq is fine, but I do not think tac is portable (only saved by
> > the fact that not many people, especially on exotic platforms, run
> > perf scripts).
>
> Serves me right for reading while I'm tired! I glazed right over 'tac'.
> If you need a truly unrelated file, you could write random data into it
> (there are some examples in t/test-lib-functions.sh), but I'd just write
> 'test_seq 1001'.

Seems like there might be some value to adding `tac` to the perl script
check-non-portable-shell.pl - though I'm not sure what we'd use as an
alternative. I'll leave this here for now for someone else to handle in
a follow up patch series

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat
  2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                       ` (6 preceding siblings ...)
  2020-10-19 22:47     ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:40     ` Nipunn Koorapati via GitGitGadget
  2020-10-20 13:40       ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
                         ` (6 more replies)
  7 siblings, 7 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:40 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati

Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx.
I've rebased it and am submitting it now.

With fsmonitor enabled, git diff currently lstats every file in the repo
This makes use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

I was able to do some testing with/without this change in a large in-house
repo (~ 400k files).

-----------------------------------------
(1) With fsmonitor enabled - on master of git (2.29.0)
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.64    4.358994          10    446257         3 lstat
  0.12    0.005353           7       764       360 open

(A subsequent call)
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.84    4.380955          10    444904         3 lstat
  0.06    0.002564         135        19           munmap
...

-----------------------------------------
(2) With fsmonitor enabled - with my patch
-----------------------------------------
../git/bin-wrappers/git checkout HEAD~200
strace -c ../git/bin-wrappers/git diff

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.72    0.003090         163        19           munmap
 19.63    0.001196         598         2           futex
...
  0.00    0.000000           0         4         3 lstat


-----------------------------------------
(3) With fsmonitor disabled entirely
-----------------------------------------

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.52    0.277085       92362         3           futex
  0.27    0.000752           4       191        63 open
...
  0.14    0.000397           3       158         3 lstat

I was able to encode this into a perf test in one of the commits.

Changes since Patch Series V1

 * Add git diff -- <pathspec> to perf tests
 * improve readability of bitwise ops

Changes since Patch Series V2

 * Add git add to perf tests
 * Refactor perf fsmonitor to simplify / remove redundancy
 * Add linting to perf tests
 * Added git diff -- <pathspec> for various sized pathspecs
 * Confirmed that refresh_fsmonitor was always being called / added to
   commit message

Changes since Patch Series V3

 * Move perf test linting to Makefile in perf/ directory

Alex Vandiver (1):
  fsmonitor: use fsmonitor data in `git diff`

Nipunn Koorapati (6):
  t/perf/README: elaborate on output format
  t/perf/p7519-fsmonitor.sh: warm cache on first git status
  t/perf: add fsmonitor perf test for git diff
  perf lint: add make test-lint to perf tests
  p7519-fsmonitor: refactor to avoid code duplication
  p7519-fsmonitor: add a git add benchmark

 diff-lib.c                | 15 +++++-
 t/Makefile                |  7 +--
 t/perf/Makefile           |  5 +-
 t/perf/README             |  2 +
 t/perf/p3400-rebase.sh    |  6 +--
 t/perf/p7519-fsmonitor.sh | 96 ++++++++++++++++++++++-----------------
 6 files changed, 81 insertions(+), 50 deletions(-)


base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/756

Range-diff vs v3:

 1:  cba03dd40b = 1:  cba03dd40b fsmonitor: use fsmonitor data in `git diff`
 2:  1c7876166f = 2:  1c7876166f t/perf/README: elaborate on output format
 3:  401f696c81 = 3:  401f696c81 t/perf/p7519-fsmonitor.sh: warm cache on first git status
 4:  b3ad8faac4 = 4:  b3ad8faac4 t/perf: add fsmonitor perf test for git diff
 5:  28c1e488bf ! 5:  b534cd137a perf lint: check test-lint-shell-syntax in perf tests
     @@ Metadata
      Author: Nipunn Koorapati <nipunn@dropbox.com>
      
       ## Commit message ##
     -    perf lint: check test-lint-shell-syntax in perf tests
     +    perf lint: add make test-lint to perf tests
      
     -    Perf tests have some seq instead of test_seq. This
     -    runs the existing tests on the perf tests as well.
     +    Perf tests have not been linted for some time.
     +    They've grown some seq instead of test_seq. This
     +    runs the existing lints on the perf tests as well.
      
          Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
      
     @@ t/Makefile: CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP))
       CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
       CHAINLINT = sed -f chainlint.sed
       
     -@@ t/Makefile: test-lint-executable:
     +@@ t/Makefile: test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \
     + 	test-lint-filenames
     + 
     + test-lint-duplicates:
     +-	@dups=`echo $(T) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \
     ++	@dups=`echo $(T) $(TPERF) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \
     + 		test -z "$$dups" || { \
     + 		echo >&2 "duplicate test numbers:" $$dups; exit 1; }
     + 
     + test-lint-executable:
     +-	@bad=`for i in $(T); do test -x "$$i" || echo $$i; done` && \
     ++	@bad=`for i in $(T) $(TPERF); do test -x "$$i" || echo $$i; done` && \
     + 		test -z "$$bad" || { \
       		echo >&2 "non-executable tests:" $$bad; exit 1; }
       
       test-lint-shell-syntax:
     @@ t/Makefile: test-lint-executable:
       test-lint-filenames:
       	@# We do *not* pass a glob to ls-files but use grep instead, to catch
      
     + ## t/perf/Makefile ##
     +@@
     + -include ../../config.mak
     + export GIT_TEST_OPTIONS
     + 
     +-all: perf
     ++all: test-lint perf
     + 
     + perf: pre-clean
     + 	./run
     +@@ t/perf/Makefile: pre-clean:
     + clean:
     + 	rm -rf build "trash directory".* test-results
     + 
     ++test-lint:
     ++	$(MAKE) -C .. test-lint
     ++
     + .PHONY: all perf pre-clean clean
     +
       ## t/perf/p3400-rebase.sh ##
      @@ t/perf/p3400-rebase.sh: test_expect_success 'setup rebasing on top of a lot of changes' '
       	git checkout -f -B base &&
 6:  b38f2984f9 = 6:  3b20f4c76e p7519-fsmonitor: refactor to avoid code duplication
 7:  d392a523f2 = 7:  6f97439936 p7519-fsmonitor: add a git add benchmark

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff`
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:40       ` Alex Vandiver via GitGitGadget
  2020-10-20 13:40       ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 52+ messages in thread
From: Alex Vandiver via GitGitGadget @ 2020-10-20 13:40 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Alex Vandiver

From: Alex Vandiver <alexmv@dropbox.com>

With fsmonitor enabled, the first call to match_stat_with_submodule
calls refresh_fsmonitor, incurring the overhead of reading the list of
updated files -- but run_diff_files does not respect the
CE_FSMONITOR_VALID flag.

Make use of the fsmonitor extension to skip lstat() calls on files
that fsmonitor judged as unmodified.

Notably, this change improves performance of the git shell prompt when
GIT_PS1_SHOWDIRTYSTATE is set.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 diff-lib.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index f95c6de75f..d2d31b9f82 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 
 	diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/");
 
+	refresh_fsmonitor(istate);
+
 	if (diff_unmerged_stage < 0)
 		diff_unmerged_stage = 2;
 	entries = istate->cache_nr;
@@ -197,8 +199,17 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		if (ce_uptodate(ce) || ce_skip_worktree(ce))
 			continue;
 
-		/* If CE_VALID is set, don't look at workdir for file removal */
-		if (ce->ce_flags & CE_VALID) {
+		/*
+		 * When CE_VALID is set (via "update-index --assume-unchanged"
+		 * or via adding paths while core.ignorestat is set to true),
+		 * the user has promised that the working tree file for that
+		 * path will not be modified.  When CE_FSMONITOR_VALID is true,
+		 * the fsmonitor knows that the path hasn't been modified since
+		 * we refreshed the cached stat information.  In either case,
+		 * we do not have to stat to see if the path has been removed
+		 * or modified.
+		 */
+		if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {
 			changed = 0;
 			newmode = ce->ce_mode;
 		} else {
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 2/7] t/perf/README: elaborate on output format
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-20 13:40       ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
@ 2020-10-20 13:40       ` Nipunn Koorapati via GitGitGadget
  2020-10-20 13:41       ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:40 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/README | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/perf/README b/t/perf/README
index bd649afa97..fb9127a66f 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -28,6 +28,8 @@ the tests on the current git repository.
     7810.3: grep --cached, cheap regex       3.07(3.02+0.25)
     7810.4: grep --cached, expensive regex   9.39(30.57+0.24)
 
+Output format is in seconds "Elapsed(User + System)"
+
 You can compare multiple repositories and even git revisions with the
 'run' script:
 
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
  2020-10-20 13:40       ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
  2020-10-20 13:40       ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:41       ` Nipunn Koorapati via GitGitGadget
  2020-10-20 13:41       ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
                         ` (3 subsequent siblings)
  6 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

The first git status would be inflated due to warming of
filesystem cache. This makes the results comparable.

Before
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.36(0.73+0.62)
7519.7: status (fsmonitor=)                                      0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=)                                1.53(0.93+1.32)

After
Test                                                             this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)         0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)    0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)   1.34(0.77+0.56)
7519.7: status (fsmonitor=)                                      0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=)                                 0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=)                                1.55(1.01+1.25)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index def7ecdbc7..9313d4a51d 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" '
 	fi &&
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
-	git update-index --fsmonitor
+	git update-index --fsmonitor &&
+	git status  # Warm caches
 '
 
 if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                         ` (2 preceding siblings ...)
  2020-10-20 13:41       ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:41       ` Nipunn Koorapati via GitGitGadget
  2020-10-20 13:41       ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Results for the git-diff fsmonitor optimization
in patch in the parent-rev (using a 400k file repo to test)

As you can see here - git diff with fsmonitor running is
significantly better with this patch series (80% faster on my
workload)!

GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh

Test                                                                     v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.46(0.82+0.64)   1.47(0.83+0.62) +0.7%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.12+0.04)   0.17(0.12+0.05) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.73+0.62)   1.37(0.76+0.60) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.85(0.22+0.63)   0.14(0.10+0.05) -83.5%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.08+0.05)   0.13(0.11+0.02) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.08+0.04)   0.13(0.09+0.04) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.07+0.05)   0.13(0.07+0.06) +8.3%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.09+0.04)   0.13(0.08+0.05) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.09+0.05)   0.13(0.10+0.03) -7.1%
7519.12: status (fsmonitor=)                                             1.67(0.93+1.49)   1.67(0.99+1.42) +0.0%
7519.13: status -uno (fsmonitor=)                                        0.37(0.30+0.82)   0.37(0.33+0.79) +0.0%
7519.14: status -uall (fsmonitor=)                                       1.58(0.97+1.35)   1.57(0.86+1.45) -0.6%
7519.15: diff (fsmonitor=)                                               0.34(0.28+0.83)   0.34(0.27+0.83) +0.0%
7519.16: diff -- 0_files (fsmonitor=)                                    0.09(0.06+0.04)   0.09(0.08+0.02) +0.0%
7519.17: diff -- 10_files (fsmonitor=)                                   0.09(0.07+0.03)   0.09(0.06+0.05) +0.0%
7519.18: diff -- 100_files (fsmonitor=)                                  0.09(0.06+0.04)   0.09(0.06+0.04) +0.0%
7519.19: diff -- 1000_files (fsmonitor=)                                 0.09(0.06+0.04)   0.09(0.05+0.05) +0.0%
7519.20: diff -- 10000_files (fsmonitor=)                                0.10(0.08+0.04)   0.10(0.06+0.05) +0.0%

I also added a benchmark for a tiny git diff workload w/ a pathspec.
I see an approximately .02 second overhead added w/ and w/o fsmonitor

From looking at these results, I suspected that refresh_fsmonitor
is already happening during git diff - independent of this patch
series' optimization. Confirmed that suspicion by breaking on
refresh_fsmonitor.

(gdb) bt  [simplified]
0  refresh_fsmonitor  at fsmonitor.c:176
1  ie_match_stat  at read-cache.c:375
2  match_stat_with_submodule at diff-lib.c:237
4  builtin_diff_files  at builtin/diff.c:260
5  cmd_diff  at builtin/diff.c:541
6  run_builtin  at git.c:450
7  handle_builtin  at git.c:700
8  run_argv  at git.c:767
9  cmd_main  at git.c:898
10 main  at common-main.c:52

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 9313d4a51d..ef4c3c8c5c 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" '
 
 	git config core.fsmonitor "$INTEGRATION_SCRIPT" &&
 	git update-index --fsmonitor &&
+	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
+	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
+	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
+	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
+	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
+	git add 1_file 10_files 100_files 1000_files 10000_files &&
+	git commit -m "Add files" &&
 	git status  # Warm caches
 '
 
@@ -142,6 +149,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1_file
+'
+
+test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10_files
+'
+
+test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 100_files
+'
+
+test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1000_files
+'
+
+test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10000_files
+'
+
 test_expect_success "setup without fsmonitor" '
 	unset INTEGRATION_SCRIPT &&
 	git config --unset core.fsmonitor &&
@@ -172,6 +211,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
 	git status -uall
 '
 
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff
+'
+
+if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+	test-tool drop-caches
+fi
+
+test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1_file
+'
+
+test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10_files
+'
+
+test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 100_files
+'
+
+test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 1000_files
+'
+
+test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+	git diff -- 10000_files
+'
+
 if test_have_prereq WATCHMAN
 then
 	watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 &&
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 5/7] perf lint: add make test-lint to perf tests
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                         ` (3 preceding siblings ...)
  2020-10-20 13:41       ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:41       ` Nipunn Koorapati via GitGitGadget
  2020-10-20 22:06         ` Taylor Blau
  2020-10-20 13:41       ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
  2020-10-20 13:41       ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
  6 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Perf tests have not been linted for some time.
They've grown some seq instead of test_seq. This
runs the existing lints on the perf tests as well.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/Makefile             | 7 ++++---
 t/perf/Makefile        | 5 ++++-
 t/perf/p3400-rebase.sh | 6 +++---
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/t/Makefile b/t/Makefile
index c83fd18861..882d26eee3 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -34,6 +34,7 @@ CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP))
 T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh))
 TGITWEB = $(sort $(wildcard t95[0-9][0-9]-*.sh))
 THELPERS = $(sort $(filter-out $(T),$(wildcard *.sh)))
+TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
 CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
 CHAINLINT = sed -f chainlint.sed
 
@@ -81,17 +82,17 @@ test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \
 	test-lint-filenames
 
 test-lint-duplicates:
-	@dups=`echo $(T) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \
+	@dups=`echo $(T) $(TPERF) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \
 		test -z "$$dups" || { \
 		echo >&2 "duplicate test numbers:" $$dups; exit 1; }
 
 test-lint-executable:
-	@bad=`for i in $(T); do test -x "$$i" || echo $$i; done` && \
+	@bad=`for i in $(T) $(TPERF); do test -x "$$i" || echo $$i; done` && \
 		test -z "$$bad" || { \
 		echo >&2 "non-executable tests:" $$bad; exit 1; }
 
 test-lint-shell-syntax:
-	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS)
+	@'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) $(TPERF)
 
 test-lint-filenames:
 	@# We do *not* pass a glob to ls-files but use grep instead, to catch
diff --git a/t/perf/Makefile b/t/perf/Makefile
index 8c47155a7c..fcb0e8865e 100644
--- a/t/perf/Makefile
+++ b/t/perf/Makefile
@@ -1,7 +1,7 @@
 -include ../../config.mak
 export GIT_TEST_OPTIONS
 
-all: perf
+all: test-lint perf
 
 perf: pre-clean
 	./run
@@ -12,4 +12,7 @@ pre-clean:
 clean:
 	rm -rf build "trash directory".* test-results
 
+test-lint:
+	$(MAKE) -C .. test-lint
+
 .PHONY: all perf pre-clean clean
diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh
index d202aaed06..7a0bb29448 100755
--- a/t/perf/p3400-rebase.sh
+++ b/t/perf/p3400-rebase.sh
@@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' '
 	git checkout -f -B base &&
 	git checkout -B to-rebase &&
 	git checkout -B upstream &&
-	for i in $(seq 100)
+	for i in $(test_seq 100)
 	do
 		# simulate huge diffs
 		echo change$i >unrelated-file$i &&
-		seq 1000 >>unrelated-file$i &&
+		test_seq 1000 >>unrelated-file$i &&
 		git add unrelated-file$i &&
 		test_tick &&
 		git commit -m commit$i unrelated-file$i &&
 		echo change$i >unrelated-file$i &&
-		seq 1000 | tac >>unrelated-file$i &&
+		test_seq 1000 | tac >>unrelated-file$i &&
 		git add unrelated-file$i &&
 		test_tick &&
 		git commit -m commit$i-reverse unrelated-file$i ||
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                         ` (4 preceding siblings ...)
  2020-10-20 13:41       ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:41       ` Nipunn Koorapati via GitGitGadget
  2020-10-20 13:41       ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
  6 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Much of the benchmark code is redundant. This is
easier to understand and edit.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 136 +++++++++++---------------------------
 1 file changed, 37 insertions(+), 99 deletions(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index ef4c3c8c5c..75a0cef01d 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -125,61 +125,53 @@ test_expect_success "setup for fsmonitor" '
 	git status  # Warm caches
 '
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+test_perf_w_drop_caches () {
+	if test -n "$GIT_PERF_7519_DROP_CACHE"; then
+		test-tool drop-caches
+	fi
 
-test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status
-'
+	test_perf "$@"
+}
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+test_fsmonitor_suite() {
+	test_perf_w_drop_caches "status (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git status
+	'
 
-test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uno
-'
+	test_perf_w_drop_caches "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git status -uno
+	'
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+	test_perf_w_drop_caches "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git status -uall
+	'
 
-test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uall
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff
-'
+	test_perf_w_drop_caches "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff
+	'
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
+	test_perf_w_drop_caches "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 1_file
+	'
 
-test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1_file
-'
+	test_perf_w_drop_caches "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 10_files
+	'
 
-test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10_files
-'
+	test_perf_w_drop_caches "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 100_files
+	'
 
-test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 100_files
-'
+	test_perf_w_drop_caches "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 1000_files
+	'
 
-test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1000_files
-'
+	test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git diff -- 10000_files
+	'
+}
 
-test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10000_files
-'
+test_fsmonitor_suite
 
 test_expect_success "setup without fsmonitor" '
 	unset INTEGRATION_SCRIPT &&
@@ -187,61 +179,7 @@ test_expect_success "setup without fsmonitor" '
 	git update-index --no-fsmonitor
 '
 
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uno
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git status -uall
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff
-'
-
-if test -n "$GIT_PERF_7519_DROP_CACHE"; then
-	test-tool drop-caches
-fi
-
-test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1_file
-'
-
-test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10_files
-'
-
-test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 100_files
-'
-
-test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 1000_files
-'
-
-test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
-	git diff -- 10000_files
-'
+test_fsmonitor_suite
 
 if test_have_prereq WATCHMAN
 then
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark
  2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
                         ` (5 preceding siblings ...)
  2020-10-20 13:41       ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
@ 2020-10-20 13:41       ` Nipunn Koorapati via GitGitGadget
  6 siblings, 0 replies; 52+ messages in thread
From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati,
	Taylor Blau, Nipunn Koorapati, Nipunn Koorapati

From: Nipunn Koorapati <nipunn@dropbox.com>

Test                                                                     v2.29.0-rc1       this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman)                 1.48(0.79+0.67)   1.48(0.79+0.67) +0.0%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman)            0.16(0.11+0.05)   0.17(0.13+0.04) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman)           1.36(0.77+0.58)   1.37(0.72+0.63) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman)                   0.84(0.21+0.63)   0.14(0.11+0.03) -83.3%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman)        0.12(0.07+0.05)   0.13(0.09+0.04) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman)       0.12(0.09+0.04)   0.13(0.07+0.06) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman)      0.12(0.08+0.05)   0.12(0.08+0.05) +0.0%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman)     0.12(0.08+0.05)   0.13(0.09+0.04) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman)   0.14(0.08+0.06)   0.13(0.07+0.06) -7.1%
7519.11: add (fsmonitor=.git/hooks/fsmonitor-watchman)                   2.75(1.41+1.27)   2.03(1.26+0.70) -26.2%
7519.13: status (fsmonitor=)                                             1.38(1.03+1.04)   1.37(1.04+1.04) -0.7%
7519.14: status -uno (fsmonitor=)                                        1.11(0.83+0.98)   1.10(0.89+0.90) -0.9%
7519.15: status -uall (fsmonitor=)                                       2.30(1.57+1.42)   2.31(1.49+1.50) +0.4%
7519.16: diff (fsmonitor=)                                               1.43(1.13+1.76)   1.46(1.19+1.72) +2.1%
7519.17: diff -- 0_files (fsmonitor=)                                    0.10(0.08+0.04)   0.11(0.08+0.04) +10.0%
7519.18: diff -- 10_files (fsmonitor=)                                   0.10(0.07+0.05)   0.11(0.08+0.04) +10.0%
7519.19: diff -- 100_files (fsmonitor=)                                  0.10(0.07+0.04)   0.11(0.07+0.05) +10.0%
7519.20: diff -- 1000_files (fsmonitor=)                                 0.10(0.08+0.03)   0.11(0.08+0.04) +10.0%
7519.21: diff -- 10000_files (fsmonitor=)                                0.11(0.08+0.05)   0.12(0.07+0.06) +9.1%
7519.22: add (fsmonitor=)                                                2.26(1.46+1.49)   2.27(1.42+1.55) +0.4%

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
---
 t/perf/p7519-fsmonitor.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 75a0cef01d..fb20fe0937 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -169,6 +169,10 @@ test_fsmonitor_suite() {
 	test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" '
 		git diff -- 10000_files
 	'
+
+	test_perf_w_drop_caches "add (fsmonitor=$INTEGRATION_SCRIPT)" '
+		git add  --all
+	'
 }
 
 test_fsmonitor_suite
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 5/7] perf lint: add make test-lint to perf tests
  2020-10-20 13:41       ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget
@ 2020-10-20 22:06         ` Taylor Blau
  2020-10-20 22:17           ` Nipunn Koorapati
  0 siblings, 1 reply; 52+ messages in thread
From: Taylor Blau @ 2020-10-20 22:06 UTC (permalink / raw)
  To: Nipunn Koorapati via GitGitGadget
  Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati,
	Nipunn Koorapati, Taylor Blau

On Tue, Oct 20, 2020 at 01:41:02PM +0000, Nipunn Koorapati via GitGitGadget wrote:
> diff --git a/t/perf/Makefile b/t/perf/Makefile
> index 8c47155a7c..fcb0e8865e 100644
> --- a/t/perf/Makefile
> +++ b/t/perf/Makefile
> @@ -1,7 +1,7 @@
>  -include ../../config.mak
>  export GIT_TEST_OPTIONS
>
> -all: perf
> +all: test-lint perf
>
>  perf: pre-clean
>  	./run
> @@ -12,4 +12,7 @@ pre-clean:
>  clean:
>  	rm -rf build "trash directory".* test-results
>
> +test-lint:
> +	$(MAKE) -C .. test-lint
> +

Great; it sounds like adding a complete definition here was too much
effort to be worth it, but that adding a '$(MAKE) -C ..' is just right.
We can still run 'make test-lint' from within 't/perf', but there isn't
a bunch of clutter in this series to make that happen. Thanks.

>  .PHONY: all perf pre-clean clean
> diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh
> index d202aaed06..7a0bb29448 100755
> --- a/t/perf/p3400-rebase.sh
> +++ b/t/perf/p3400-rebase.sh
> @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' '
>  	git checkout -f -B base &&
>  	git checkout -B to-rebase &&
>  	git checkout -B upstream &&
> -	for i in $(seq 100)
> +	for i in $(test_seq 100)
>  	do
>  		# simulate huge diffs
>  		echo change$i >unrelated-file$i &&
> -		seq 1000 >>unrelated-file$i &&
> +		test_seq 1000 >>unrelated-file$i &&
>  		git add unrelated-file$i &&
>  		test_tick &&
>  		git commit -m commit$i unrelated-file$i &&
>  		echo change$i >unrelated-file$i &&
> -		seq 1000 | tac >>unrelated-file$i &&
> +		test_seq 1000 | tac >>unrelated-file$i &&

The rest of this all looks good, but I think adding 'tac' here is still
wrong; this isn't available everywhere, so we would want to find an
alternative before going further. Is there a reason that you couldn't
use a different 'N' in 'test_seq N' here?

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 5/7] perf lint: add make test-lint to perf tests
  2020-10-20 22:06         ` Taylor Blau
@ 2020-10-20 22:17           ` Nipunn Koorapati
  2020-10-20 22:19             ` Taylor Blau
  0 siblings, 1 reply; 52+ messages in thread
From: Nipunn Koorapati @ 2020-10-20 22:17 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee,
	Utsav Shah, Nipunn Koorapati

> > --- a/t/perf/p3400-rebase.sh
> > +++ b/t/perf/p3400-rebase.sh
> > @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' '
> >       git checkout -f -B base &&
> >       git checkout -B to-rebase &&
> >       git checkout -B upstream &&
> > -     for i in $(seq 100)
> > +     for i in $(test_seq 100)
> >       do
> >               # simulate huge diffs
> >               echo change$i >unrelated-file$i &&
> > -             seq 1000 >>unrelated-file$i &&
> > +             test_seq 1000 >>unrelated-file$i &&
> >               git add unrelated-file$i &&
> >               test_tick &&
> >               git commit -m commit$i unrelated-file$i &&
> >               echo change$i >unrelated-file$i &&
> > -             seq 1000 | tac >>unrelated-file$i &&
> > +             test_seq 1000 | tac >>unrelated-file$i &&
>
> The rest of this all looks good, but I think adding 'tac' here is still
> wrong; this isn't available everywhere, so we would want to find an
> alternative before going further. Is there a reason that you couldn't
> use a different 'N' in 'test_seq N' here?

Hey. I think there's some confusion. I didn't add `tac`. It was
already here. I didn't even notice it until Junio mentioned it.

--Nipunn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 5/7] perf lint: add make test-lint to perf tests
  2020-10-20 22:17           ` Nipunn Koorapati
@ 2020-10-20 22:19             ` Taylor Blau
  0 siblings, 0 replies; 52+ messages in thread
From: Taylor Blau @ 2020-10-20 22:19 UTC (permalink / raw)
  To: Nipunn Koorapati
  Cc: Taylor Blau, Nipunn Koorapati via GitGitGadget, git,
	Derrick Stolee, Utsav Shah, Nipunn Koorapati

On Tue, Oct 20, 2020 at 11:17:23PM +0100, Nipunn Koorapati wrote:
> > > --- a/t/perf/p3400-rebase.sh
> > > +++ b/t/perf/p3400-rebase.sh
> > > @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' '
> > >       git checkout -f -B base &&
> > >       git checkout -B to-rebase &&
> > >       git checkout -B upstream &&
> > > -     for i in $(seq 100)
> > > +     for i in $(test_seq 100)
> > >       do
> > >               # simulate huge diffs
> > >               echo change$i >unrelated-file$i &&
> > > -             seq 1000 >>unrelated-file$i &&
> > > +             test_seq 1000 >>unrelated-file$i &&
> > >               git add unrelated-file$i &&
> > >               test_tick &&
> > >               git commit -m commit$i unrelated-file$i &&
> > >               echo change$i >unrelated-file$i &&
> > > -             seq 1000 | tac >>unrelated-file$i &&
> > > +             test_seq 1000 | tac >>unrelated-file$i &&
> >
> > The rest of this all looks good, but I think adding 'tac' here is still
> > wrong; this isn't available everywhere, so we would want to find an
> > alternative before going further. Is there a reason that you couldn't
> > use a different 'N' in 'test_seq N' here?
>
> Hey. I think there's some confusion. I didn't add `tac`. It was
> already here. I didn't even notice it until Junio mentioned it.

You're right, sorry; I just saw a line beginning with '+' that contained
'tac' and thought that it was new in this patch. What you have is OK,
then, since it's not a new problem with your patch.

It couldn't hurt to have the linting phase catch that, but let's leave
that for another day, since I think what you have in this version looks
good to me.

Thanks for listening to all of my feedback :).

> --Nipunn

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, back to index

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-17 22:25   ` Junio C Hamano
2020-10-18  0:54     ` Nipunn Koorapati
2020-10-18  4:17       ` Taylor Blau
2020-10-18  5:02         ` Junio C Hamano
2020-10-18 23:43           ` Taylor Blau
2020-10-19 17:23             ` Junio C Hamano
2020-10-19 17:37               ` Taylor Blau
2020-10-19 18:07                 ` Nipunn Koorapati
2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-18  4:22   ` Taylor Blau
2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-17 22:28   ` Junio C Hamano
2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-19 21:43     ` Taylor Blau
2020-10-19 21:54     ` Taylor Blau
2020-10-19 22:00       ` Nipunn Koorapati
2020-10-19 22:02         ` Taylor Blau
2020-10-19 22:25       ` Nipunn Koorapati
2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget
2020-10-20  2:38       ` Taylor Blau
2020-10-20  3:10         ` Junio C Hamano
2020-10-20  3:15           ` Taylor Blau
2020-10-20 10:16             ` Nipunn Koorapati
2020-10-20 10:09         ` Nipunn Koorapati
2020-10-19 22:47     ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
2020-10-20  2:43       ` Taylor Blau
2020-10-19 22:47     ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
2020-10-19 23:02       ` Nipunn Koorapati
2020-10-20  2:40       ` Taylor Blau
2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-20 13:40       ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-20 13:40       ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget
2020-10-20 22:06         ` Taylor Blau
2020-10-20 22:17           ` Nipunn Koorapati
2020-10-20 22:19             ` Taylor Blau
2020-10-20 13:41       ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git