* [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat @ 2020-10-17 21:04 Nipunn Koorapati via GitGitGadget 2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget ` (4 more replies) 0 siblings, 5 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw) To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx. I've rebased it and am submitting it now. With fsmonitor enabled, git diff currently lstats every file in the repo This makes use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. I was able to do some testing with/without this change in a large in-house repo (~ 400k files) ----------------------------------------- (1) With fsmonitor enabled - on master of git (2.29.0) ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.64 4.358994 10 446257 3 lstat 0.12 0.005353 7 764 360 open (A subsequent call) strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.84 4.380955 10 444904 3 lstat 0.06 0.002564 135 19 munmap ... ----------------------------------------- (2) With fsmonitor enabled - with my patch ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 50.72 0.003090 163 19 munmap 19.63 0.001196 598 2 futex ... 0.00 0.000000 0 4 3 lstat ----------------------------------------- (3) With fsmonitor disabled entirely ----------------------------------------- % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 98.52 0.277085 92362 3 futex 0.27 0.000752 4 191 63 open ... 0.14 0.000397 3 158 3 lstat I encoded this into a perf test with results as follow: On master (2.29) Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.30+0.54) 7519.7: status (fsmonitor=) 0.69(0.52+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) 7519.10: diff (fsmonitor=) 0.34(0.26+0.81) With this patch Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.84(1.70+1.76) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.35(0.81+0.53) 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.15(0.11+0.05) 7519.7: status (fsmonitor=) 0.71(0.54+0.90) 7519.8: status -uno (fsmonitor=) 0.38(0.30+0.81) 7519.9: status -uall (fsmonitor=) 1.55(0.93+1.34) 7519.10: diff (fsmonitor=) 0.35(0.32+0.76) Alex Vandiver (1): fsmonitor: use fsmonitor data in `git diff` Nipunn Koorapati (3): t/perf/README: elaborate on output format t/perf/p7519-fsmonitor.sh: warm cache on first git status t/perf: add fsmonitor perf test for git diff diff-lib.c | 17 +++++++++++++++-- t/perf/README | 2 ++ t/perf/p7519-fsmonitor.sh | 19 ++++++++++++++++++- 3 files changed, 35 insertions(+), 3 deletions(-) base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/756 -- gitgitgadget ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 ` Alex Vandiver via GitGitGadget 2020-10-17 22:25 ` Junio C Hamano 2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget ` (3 subsequent siblings) 4 siblings, 1 reply; 52+ messages in thread From: Alex Vandiver via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw) To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Alex Vandiver From: Alex Vandiver <alexmv@dropbox.com> With fsmonitor enabled, the first call to match_stat_with_submodule calls refresh_fsmonitor, incurring the overhead of reading the list of updated files -- but run_diff_files does not respect the CE_FSMONITOR_VALID flag. Make use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. Notably, this change improves performance of the git shell prompt when GIT_PS1_SHOWDIRTYSTATE is set. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- diff-lib.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/diff-lib.c b/diff-lib.c index f95c6de75f..b7ee1b89ef 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/"); + refresh_fsmonitor(istate); + if (diff_unmerged_stage < 0) diff_unmerged_stage = 2; entries = istate->cache_nr; @@ -197,8 +199,19 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (ce_uptodate(ce) || ce_skip_worktree(ce)) continue; - /* If CE_VALID is set, don't look at workdir for file removal */ - if (ce->ce_flags & CE_VALID) { + /* + * If CE_VALID is set, the user has promised us that the workdir + * hasn't changed compared to index, so don't stat workdir + * for file removal + * eg - via git udpate-index --assume-unchanged + * eg - via core.ignorestat=true + * + * When using FSMONITOR: + * If CE_FSMONITOR_VALID is set, then we know the metadata on disk + * has not changed since the last refresh, and we can skip the + * file-removal checks without doing the stat in check_removed. + */ + if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) { changed = 0; newmode = ce->ce_mode; } else { -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget @ 2020-10-17 22:25 ` Junio C Hamano 2020-10-18 0:54 ` Nipunn Koorapati 0 siblings, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2020-10-17 22:25 UTC (permalink / raw) To: Alex Vandiver via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Alex Vandiver "Alex Vandiver via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Alex Vandiver <alexmv@dropbox.com> > > With fsmonitor enabled, the first call to match_stat_with_submodule > calls refresh_fsmonitor, incurring the overhead of reading the list of > updated files -- but run_diff_files does not respect the > CE_FSMONITOR_VALID flag. run_diff_files() is used not just by "git diff" but other things like "git add", so if we get an overall speed-up without having to pay undue cost, that would be a very good news. > diff --git a/diff-lib.c b/diff-lib.c > index f95c6de75f..b7ee1b89ef 100644 > --- a/diff-lib.c > +++ b/diff-lib.c > @@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > > diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/"); > > + refresh_fsmonitor(istate); > + "git diff" and friends are often run with pathspec, but the API into the fsmonitor, refresh_fsmonitor() call, has no way to say "I only am interested in the status of this directory and everything else does not matter". How expensive would this call to accept fsmonitor data for the entire tree be, and would there eventually be a point where the number of paths we are interested in checking (i.e. the paths that would match the pathspec) is so small that we would be better off not making this call? E.g. if we are checking more than 20% of the working tree, running refresh_fsmonitor() for the entire working tree is still a win, but if we are only checking less than that, we are better off without fsmonitor, or does a tradeoff like that exist? > @@ -197,8 +199,19 @@ int run_diff_files(struct rev_info *revs, unsigned int option) > if (ce_uptodate(ce) || ce_skip_worktree(ce)) > continue; > > - /* If CE_VALID is set, don't look at workdir for file removal */ > - if (ce->ce_flags & CE_VALID) { > + /* > + * If CE_VALID is set, the user has promised us that the workdir > + * hasn't changed compared to index, so don't stat workdir > + * for file removal The above seems to be an attempt to elaborate on the existing comment, but ... > + * eg - via git udpate-index --assume-unchanged > + * eg - via core.ignorestat=true ... what are these two lines doing here? It makes no sense to say "Don't stat workdir for file removal by doing 'git update-index' or by seetting core.ignorestat", but the placement of these two lines makes it look as if that is what you are saying. Perhaps When CE_VALID is set (via "update-index --assume-unchanged" or via adding paths while core.ignorestat is set to true), the user has promised ..., so don't stat workdir for removed files. would probably be what you meant bo say. > + * When using FSMONITOR: > + * If CE_FSMONITOR_VALID is set, then we know the metadata on disk > + * has not changed since the last refresh, and we can skip the > + * file-removal checks without doing the stat in check_removed. An iffy description. You skip all the file-removal check by not calling check_removed() as a whole. This is not the fault of this patch, but in any case, the description places too much stress on "removal" when in reality, removal is not all that special in this codepath. The check_removed call also contributes to noticiing modified (not removed) files. If we are updating the comment here, we should correct that too, perhaps When CE_VALID is set (via "update-index --assume-unchanged" or via adding paths while core.ignorestat is set to true), the user has promised that the working tree file for that path will not be modified. When CE_FSMONITOR_VALID is true, the fsmonitor knows that the path hasn't been modified since we refreshed the cached stat information. In either case, we do not have to stat to see if the path has been removed or modified. or something like that, perhaps. > + */ > + if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) { Would it become easier to read, if written like this instead? if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { That reflects what the suggested comment says better. > changed = 0; > newmode = ce->ce_mode; > } else { Thanks. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-17 22:25 ` Junio C Hamano @ 2020-10-18 0:54 ` Nipunn Koorapati 2020-10-18 4:17 ` Taylor Blau 0 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-18 0:54 UTC (permalink / raw) To: Junio C Hamano Cc: Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver > run_diff_files() is used not just by "git diff" but other things > like "git add", so if we get an overall speed-up without having to > pay undue cost, that would be a very good news. Agreed! I may be able to write perf benchmark tests to highlight benefits to git add as well. > 20% of the working tree, running refresh_fsmonitor() for the entire > working tree is still a win, but if we are only checking less than > that, we are better off without fsmonitor, or does a tradeoff like > that exist? My understanding is that refresh_fsmonitor is O(delta_since_last_refresh) - so for developers with large repositories - this cost will amortize out over subsequent commands, so I don't think it's worth investigating this tradeoff here. As a user of large repositories, I expect that my major source of fsmonitor activity to be user intent (eg git pull, or intentionally copying/editing a large number of files). After such a command, I expect my next git command to be slower - that would be unsurprising. I think the tradeoff could be made for small diff requests, but I don't think it's worth adding complexity here - as that user will just have to pay the cost on their next git command. > > + * eg - via git udpate-index --assume-unchanged > > + * eg - via core.ignorestat=true > > ... what are these two lines doing here? Intended to indicate potential ways that CE_VALID might be set. When I was reading the source here, it was pretty difficult to determine how this would be set. Agree that I picked unfortunate wording. Thanks for the suggestions. Will update in the next iteration. > > would probably be what you meant bo say. > > When CE_VALID is set (via "update-index --assume-unchanged" > or via adding paths while core.ignorestat is set to true), > the user has promised that the working tree file for that > path will not be modified. When CE_FSMONITOR_VALID is true, > the fsmonitor knows that the path hasn't been modified since > we refreshed the cached stat information. In either case, > we do not have to stat to see if the path has been removed > or modified. > > or something like that, perhaps. Sounds good. Will clarify. I like your comment better as well. > > > + */ > > + if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) { > > Would it become easier to read, if written like this instead? > > if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { I personally find this more confusing because it involves multiple bitwise ops, but this is potentially due to me having more mental practice thinking about boolean operators vs bitwise operators. I'm more than happy to align with the common pattern of the repo. I'll change this. > > Thanks. Thank you for the thorough review! ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-18 0:54 ` Nipunn Koorapati @ 2020-10-18 4:17 ` Taylor Blau 2020-10-18 5:02 ` Junio C Hamano 0 siblings, 1 reply; 52+ messages in thread From: Taylor Blau @ 2020-10-18 4:17 UTC (permalink / raw) To: Nipunn Koorapati Cc: Junio C Hamano, Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver On Sun, Oct 18, 2020 at 01:54:44AM +0100, Nipunn Koorapati wrote: > > 20% of the working tree, running refresh_fsmonitor() for the entire > > working tree is still a win, but if we are only checking less than > > that, we are better off without fsmonitor, or does a tradeoff like > > that exist? > > My understanding is that refresh_fsmonitor is > O(delta_since_last_refresh) - so for developers > with large repositories - this cost will amortize out over subsequent > commands, so I don't > think it's worth investigating this tradeoff here. > As a user of large repositories, I expect that my major source of > fsmonitor activity to be user > intent (eg git pull, or intentionally copying/editing a large number > of files). After such a command, > I expect my next git command to be slower - that would be unsurprising. > > I think the tradeoff could be made for small diff requests, but I > don't think it's worth adding complexity here - > as that user will just have to pay the cost on their next git command. Hmm. I do agree that I'd like to stay out of the business of trying to figure out exactly what that trade-off is (although I'm sure that it exists), only because it seems likely to vary to a large extent from repository to repository. (That is, 20% may be a good number for some repository, but a terrible choice for another). But, I think that we can invoke watchman better here; the fsmonitor-watchman hook has no notion of a "pathspec", so every query just asks for everything that isn't in '$GIT_DIR'. Is there anything preventing us from taking an optional pathspec and building up a more targeted query? There is some overhead to invoke the hook and talk to watchman, but I'd expect that to be dwarfed by not having to issue O(# files) syscalls. > > > > > + */ > > > + if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) { > > > > Would it become easier to read, if written like this instead? > > > > if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { > > I personally find this more confusing because it involves multiple > bitwise ops, but this > is potentially due to me having more mental practice thinking about > boolean operators vs bitwise operators. > I'm more than happy to align with the common pattern of the repo. I'll > change this. I don't have an opinion, nor do I think that git.git has an established practice of doing one over the other. For what it's worth, my two-cents is that Junio's suggestion is easier to read. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-18 4:17 ` Taylor Blau @ 2020-10-18 5:02 ` Junio C Hamano 2020-10-18 23:43 ` Taylor Blau 0 siblings, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2020-10-18 5:02 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati, Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver Taylor Blau <me@ttaylorr.com> writes: > Hmm. I do agree that I'd like to stay out of the business of trying to > figure out exactly what that trade-off is (although I'm sure that it > exists), only because it seems likely to vary to a large extent from > repository to repository. (That is, 20% may be a good number for some > repository, but a terrible choice for another). I think both of you misunderstood me. My question was a simple yes/no "does there a trade off exist?" question and the sentences with 20% in it were mere example of possible trade-off I had in mind that _could_ exist. I wasn't even suggesting to figure out what the optimum cut-off heuristics would be (e.g. solving "when more than N% paths are subject to diff fsmonitor is faster" for N). I was hoping that we can show that even having to lstat just a single path is expensive enough---IOW, "there is no trade-off worth worrying about, because talking to fsmonitor is so cheap compared to the cost of even a single lstst" would have been a valid and happy answer. With such a number, there is no risk of introducing an unwarranted performance regression to use cases that we did not anticipate by adding an unconditional call to refresh_fsmonitor(). But without any rationale, the performance implication of adding an unconditional call to refresh_fsmonitor() would become much muddier. > But, I think that we can invoke watchman better here; the > fsmonitor-watchman hook has no notion of a "pathspec", so every query > just asks for everything that isn't in '$GIT_DIR'. Is there anything > preventing us from taking an optional pathspec and building up a more > targeted query? Yup, it is what I had in mind when I brought up the pathspec. It may be something worth pursuing longer term, but not within the scope of this patch. > There is some overhead to invoke the hook and talk to watchman, but > I'd expect that to be dwarfed by not having to issue O(# files) > syscalls. "invoke the hook"---is that a pipe+fork+exec, or something else that is far lighter-weight? n ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-18 5:02 ` Junio C Hamano @ 2020-10-18 23:43 ` Taylor Blau 2020-10-19 17:23 ` Junio C Hamano 0 siblings, 1 reply; 52+ messages in thread From: Taylor Blau @ 2020-10-18 23:43 UTC (permalink / raw) To: Junio C Hamano Cc: Taylor Blau, Nipunn Koorapati, Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver On Sat, Oct 17, 2020 at 10:02:04PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > Hmm. I do agree that I'd like to stay out of the business of trying to > > figure out exactly what that trade-off is (although I'm sure that it > > exists), only because it seems likely to vary to a large extent from > > repository to repository. (That is, 20% may be a good number for some > > repository, but a terrible choice for another). > > I think both of you misunderstood me. > > My question was a simple yes/no "does there a trade off exist?" > question and the sentences with 20% in it were mere example of > possible trade-off I had in mind that _could_ exist. I wasn't even > suggesting to figure out what the optimum cut-off heuristics would > be (e.g. solving "when more than N% paths are subject to diff > fsmonitor is faster" for N). > > I was hoping that we can show that even having to lstat just a > single path is expensive enough---IOW, "there is no trade-off worth > worrying about, because talking to fsmonitor is so cheap compared to > the cost of even a single lstst" would have been a valid and happy > answer. With such a number, there is no risk of introducing an > unwarranted performance regression to use cases that we did not > anticipate by adding an unconditional call to refresh_fsmonitor(). > > But without any rationale, the performance implication of adding an > unconditional call to refresh_fsmonitor() would become much muddier. Aha; thanks for clarifying. I'm glad we agree that finding 'N' would not be worth it, or at least that showing that talking to fsmonitor is cheaper than a single lstat would be more worthwhile. Nipunn - I don't have fsmonitor/watchman setup on my workstation, but if you do, some numbers (or an interpretation of the numbers you already provided) on this would be really useful. If you don't have it set up, or don't have time to measure it, let me know, and I'd be happy to take a look. > > But, I think that we can invoke watchman better here; the > > fsmonitor-watchman hook has no notion of a "pathspec", so every query > > just asks for everything that isn't in '$GIT_DIR'. Is there anything > > preventing us from taking an optional pathspec and building up a more > > targeted query? > > Yup, it is what I had in mind when I brought up the pathspec. It > may be something worth pursuing longer term, but not within the > scope of this patch. > > > There is some overhead to invoke the hook and talk to watchman, but > > I'd expect that to be dwarfed by not having to issue O(# files) > > syscalls. > > "invoke the hook"---is that a pipe+fork+exec, or something else that > is far lighter-weight? The former; see 'fsmonitor.c:query_fsmonitor()'. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-18 23:43 ` Taylor Blau @ 2020-10-19 17:23 ` Junio C Hamano 2020-10-19 17:37 ` Taylor Blau 0 siblings, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2020-10-19 17:23 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati, Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver Taylor Blau <me@ttaylorr.com> writes: >> > There is some overhead to invoke the hook and talk to watchman, but >> > I'd expect that to be dwarfed by not having to issue O(# files) >> > syscalls. >> >> "invoke the hook"---is that a pipe+fork+exec, or something else that >> is far lighter-weight? > > The former; see 'fsmonitor.c:query_fsmonitor()'. It brings us back to the "overhead of how many lstat(2) takes us closer to the overhead of a single pipe+fork+exec plus reading from the pipe", doesn't it? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-19 17:23 ` Junio C Hamano @ 2020-10-19 17:37 ` Taylor Blau 2020-10-19 18:07 ` Nipunn Koorapati 0 siblings, 1 reply; 52+ messages in thread From: Taylor Blau @ 2020-10-19 17:37 UTC (permalink / raw) To: Junio C Hamano Cc: Taylor Blau, Nipunn Koorapati, Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver On Mon, Oct 19, 2020 at 10:23:26AM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > >> > There is some overhead to invoke the hook and talk to watchman, but > >> > I'd expect that to be dwarfed by not having to issue O(# files) > >> > syscalls. > >> > >> "invoke the hook"---is that a pipe+fork+exec, or something else that > >> is far lighter-weight? > > > > The former; see 'fsmonitor.c:query_fsmonitor()'. > > It brings us back to the "overhead of how many lstat(2) takes us > closer to the overhead of a single pipe+fork+exec plus reading from > the pipe", doesn't it? Somewhat unfortunately, yes. Hopefully any user that cares to use fsmonitor has enough files in their repository that a pipe+fork+exec is still faster than however many lstats they would have needed otherwise. Of course, finding out what that number is is still interesting... Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-19 17:37 ` Taylor Blau @ 2020-10-19 18:07 ` Nipunn Koorapati 0 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-19 18:07 UTC (permalink / raw) To: Taylor Blau Cc: Junio C Hamano, Alex Vandiver via GitGitGadget, git, Derrick Stolee, Utsav Shah, Alex Vandiver > It brings us back to the "overhead of how many lstat(2) takes us > closer to the overhead of a single pipe+fork+exec plus reading from > the pipe", doesn't it? > I will add a benchmark for a `git diff -- <pathspec>` > Somewhat unfortunately, yes. Hopefully any user that cares to use > fsmonitor has enough files in their repository that a pipe+fork+exec is > still faster than however many lstats they would have needed otherwise. > > Of course, finding out what that number is is still interesting... I can try to do some manual testing to figure this out. Doesn't seem like the type of thing we'd want to add to the benchmark, as it would involve running git diff on a variety of pathspec workloads --Nipunn ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 2/4] t/perf/README: elaborate on output format 2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget @ 2020-10-17 21:04 ` Nipunn Koorapati via GitGitGadget 2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget ` (2 subsequent siblings) 4 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw) To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/README | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/perf/README b/t/perf/README index bd649afa97..fb9127a66f 100644 --- a/t/perf/README +++ b/t/perf/README @@ -28,6 +28,8 @@ the tests on the current git repository. 7810.3: grep --cached, cheap regex 3.07(3.02+0.25) 7810.4: grep --cached, expensive regex 9.39(30.57+0.24) +Output format is in seconds "Elapsed(User + System)" + You can compare multiple repositories and even git revisions with the 'run' script: -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status 2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 ` Nipunn Koorapati via GitGitGadget 2020-10-18 4:22 ` Taylor Blau 2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 4 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw) To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> The first git status would be inflated due to warming of filesystem cache. This makes the results comparable. Before Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 7519.7: status (fsmonitor=) 0.69(0.52+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) After Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) 7519.7: status (fsmonitor=) 0.70(0.53+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index def7ecdbc7..9313d4a51d 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" ' fi && git config core.fsmonitor "$INTEGRATION_SCRIPT" && - git update-index --fsmonitor + git update-index --fsmonitor && + git status # Warm caches ' if test -n "$GIT_PERF_7519_DROP_CACHE"; then -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status 2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget @ 2020-10-18 4:22 ` Taylor Blau 0 siblings, 0 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-18 4:22 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati On Sat, Oct 17, 2020 at 09:04:35PM +0000, Nipunn Koorapati via GitGitGadget wrote: > From: Nipunn Koorapati <nipunn@dropbox.com> > > The first git status would be inflated due to warming of > filesystem cache. This makes the results comparable. > > Before > Test this tree > -------------------------------------------------------------------------------- > 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) > 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) > 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) > 7519.7: status (fsmonitor=) 0.69(0.52+0.90) > 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) > 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) > > After > Test this tree > -------------------------------------------------------------------------------- > 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) > 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) > 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) > 7519.7: status (fsmonitor=) 0.70(0.53+0.90) > 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) > 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) Note that you can directly compare results with the perf suite's "run" script by passing two revisions in addition to the test that you want to run and have the results aggregated side-by-side. In your case, you'd want something like (within the t/perf directory): $ ./run HEAD . p7519-*.sh where this patch is the uncommitted state (alternatively you could compare the two revisions directly in the case that you have already committed). > diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh > index def7ecdbc7..9313d4a51d 100755 > --- a/t/perf/p7519-fsmonitor.sh > +++ b/t/perf/p7519-fsmonitor.sh > @@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" ' > fi && > > git config core.fsmonitor "$INTEGRATION_SCRIPT" && > - git update-index --fsmonitor > + git update-index --fsmonitor && > + git status # Warm caches Seems reasonable, and the comment is much appreciated :-). Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (2 preceding siblings ...) 2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 ` Nipunn Koorapati via GitGitGadget 2020-10-17 22:28 ` Junio C Hamano 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 4 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-17 21:04 UTC (permalink / raw) To: git; +Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Results for the git-diff fsmonitor optimization in patch in the parent-rev (using a 400k file repo to test) As you can see here - git diff with fsmonitor running is significantly better with this patch series (80% faster on my workload)! On master (2.29) Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.82(0.24+0.58) 7519.7: status (fsmonitor=) 0.70(0.53+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) 7519.10: diff (fsmonitor=) 0.34(0.35+0.72) With this patch series Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.07) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.12+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.35(0.73+0.61) 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.10+0.05) 7519.7: status (fsmonitor=) 0.70(0.56+0.87) 7519.8: status -uno (fsmonitor=) 0.37(0.31+0.79) 7519.9: status -uall (fsmonitor=) 1.54(0.97+1.29) 7519.10: diff (fsmonitor=) 0.34(0.28+0.79) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index 9313d4a51d..80d0148557 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -142,6 +142,14 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && git config --unset core.fsmonitor && @@ -172,6 +180,14 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + if test_have_prereq WATCHMAN then watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 && -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget @ 2020-10-17 22:28 ` Junio C Hamano 0 siblings, 0 replies; 52+ messages in thread From: Junio C Hamano @ 2020-10-17 22:28 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati "Nipunn Koorapati via GitGitGadget" <gitgitgadget@gmail.com> writes: > +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' > + git diff This is a whole-tree diff. It would be interesting to also see if a meaningful tradeoff exists if a test is run with a tree with say 100 top-level subdirectories but with just one of them covered by a pathspec, with many modified paths sprinkled all over. > +' > + > if test_have_prereq WATCHMAN > then > watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 && ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat 2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (3 preceding siblings ...) 2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 ` Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget ` (4 more replies) 4 siblings, 5 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx. I've rebased it and am submitting it now. With fsmonitor enabled, git diff currently lstats every file in the repo This makes use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. I was able to do some testing with/without this change in a large in-house repo (~ 400k files). ----------------------------------------- (1) With fsmonitor enabled - on master of git (2.29.0) ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.64 4.358994 10 446257 3 lstat 0.12 0.005353 7 764 360 open (A subsequent call) strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.84 4.380955 10 444904 3 lstat 0.06 0.002564 135 19 munmap ... ----------------------------------------- (2) With fsmonitor enabled - with my patch ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 50.72 0.003090 163 19 munmap 19.63 0.001196 598 2 futex ... 0.00 0.000000 0 4 3 lstat ----------------------------------------- (3) With fsmonitor disabled entirely ----------------------------------------- % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 98.52 0.277085 92362 3 futex 0.27 0.000752 4 191 63 open ... 0.14 0.000397 3 158 3 lstat I was able to encode this into a perf test in one of the commits. Changes since Patch Series V1 * Add git diff -- <pathspec> to perf tests * improve readability of bitwise ops Alex Vandiver (1): fsmonitor: use fsmonitor data in `git diff` Nipunn Koorapati (3): t/perf/README: elaborate on output format t/perf/p7519-fsmonitor.sh: warm cache on first git status t/perf: add fsmonitor perf test for git diff diff-lib.c | 15 ++++++-- t/perf/README | 2 ++ t/perf/p7519-fsmonitor.sh | 74 ++++++++++++++++++++++++++++++++++++++- 3 files changed, 88 insertions(+), 3 deletions(-) base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/756 Range-diff vs v1: 1: 13fd992a37 ! 1: cba03dd40b fsmonitor: use fsmonitor data in `git diff` @@ diff-lib.c: int run_diff_files(struct rev_info *revs, unsigned int option) - /* If CE_VALID is set, don't look at workdir for file removal */ - if (ce->ce_flags & CE_VALID) { + /* -+ * If CE_VALID is set, the user has promised us that the workdir -+ * hasn't changed compared to index, so don't stat workdir -+ * for file removal -+ * eg - via git udpate-index --assume-unchanged -+ * eg - via core.ignorestat=true -+ * -+ * When using FSMONITOR: -+ * If CE_FSMONITOR_VALID is set, then we know the metadata on disk -+ * has not changed since the last refresh, and we can skip the -+ * file-removal checks without doing the stat in check_removed. ++ * When CE_VALID is set (via "update-index --assume-unchanged" ++ * or via adding paths while core.ignorestat is set to true), ++ * the user has promised that the working tree file for that ++ * path will not be modified. When CE_FSMONITOR_VALID is true, ++ * the fsmonitor knows that the path hasn't been modified since ++ * we refreshed the cached stat information. In either case, ++ * we do not have to stat to see if the path has been removed ++ * or modified. + */ -+ if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) { ++ if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { changed = 0; newmode = ce->ce_mode; } else { 2: 024cd07965 = 2: 1c7876166f t/perf/README: elaborate on output format 3: 6482e372bc = 3: 401f696c81 t/perf/p7519-fsmonitor.sh: warm cache on first git status 4: 0613b07676 ! 4: f572e226bb t/perf: add fsmonitor perf test for git diff @@ Commit message significantly better with this patch series (80% faster on my workload)! - On master (2.29) + GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh - Test this tree - -------------------------------------------------------------------------------- - 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) - 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) - 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) - 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.82(0.24+0.58) - 7519.7: status (fsmonitor=) 0.70(0.53+0.90) - 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) - 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) - 7519.10: diff (fsmonitor=) 0.34(0.35+0.72) + Test v2.29.0-rc1 this tree + ----------------------------------------------------------------------------------------------------------------- + 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7% + 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3% + 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7% + 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5% + 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3% + 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3% + 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3% + 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3% + 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1% + 7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0% + 7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0% + 7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6% + 7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0% + 7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0% + 7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0% + 7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0% + 7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0% + 7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0% - With this patch series + I also added a benchmark for a tiny git diff workload w/ a pathspec. + I see an approximately .02 second overhead added w/ and w/o fsmonitor - Test this tree - -------------------------------------------------------------------------------- - 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.07) - 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.12+0.05) - 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.35(0.73+0.61) - 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.10+0.05) - 7519.7: status (fsmonitor=) 0.70(0.56+0.87) - 7519.8: status -uno (fsmonitor=) 0.37(0.31+0.79) - 7519.9: status -uall (fsmonitor=) 1.54(0.97+1.29) - 7519.10: diff (fsmonitor=) 0.34(0.28+0.79) + From looking at these results, I suspected that refresh_fsmonitor + is already happening during git diff - independent of this patch + series' optimization. Confirmed that suspicion by breaking on + refresh_fsmonitor. + + (gdb) bt [simplified] + 0 refresh_fsmonitor at fsmonitor.c:176 + 1 ie_match_stat at read-cache.c:375 + 2 match_stat_with_submodule at diff-lib.c:237 + 4 builtin_diff_files at builtin/diff.c:260 + 5 cmd_diff at builtin/diff.c:541 + 6 run_builtin at git.c:450 + 7 handle_builtin at git.c:700 + 8 run_argv at git.c:767 + 9 cmd_main at git.c:898 + 10 main at common-main.c:52 Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> ## t/perf/p7519-fsmonitor.sh ## +@@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup for fsmonitor" ' + + git config core.fsmonitor "$INTEGRATION_SCRIPT" && + git update-index --fsmonitor && ++ mkdir 1_file 10_files 100_files 1000_files 10000_files && ++ for i in `seq 1 10`; do touch 10_files/$i; done && ++ for i in `seq 1 100`; do touch 100_files/$i; done && ++ for i in `seq 1 1000`; do touch 1000_files/$i; done && ++ for i in `seq 1 10000`; do touch 10000_files/$i; done && ++ git add 1_file 10_files 100_files 1000_files 10000_files && ++ git commit -m "Add files" && + git status # Warm caches + ' + @@ t/perf/p7519-fsmonitor.sh: test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' @@ t/perf/p7519-fsmonitor.sh: test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIP +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' ++ ++if test -n "$GIT_PERF_7519_DROP_CACHE"; then ++ test-tool drop-caches ++fi ++ ++test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 1_file ++' ++ ++test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 10_files ++' ++ ++test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 100_files ++' ++ ++test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 1000_files ++' ++ ++test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 10000_files ++' + test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && @@ t/perf/p7519-fsmonitor.sh: test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIP +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' ++ ++if test -n "$GIT_PERF_7519_DROP_CACHE"; then ++ test-tool drop-caches ++fi ++ ++test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 1_file ++' ++ ++test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 10_files ++' ++ ++test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 100_files ++' ++ ++test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 1000_files ++' ++ ++test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' ++ git diff -- 10000_files ++' + if test_have_prereq WATCHMAN then -- gitgitgadget ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 ` Alex Vandiver via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget ` (3 subsequent siblings) 4 siblings, 0 replies; 52+ messages in thread From: Alex Vandiver via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Alex Vandiver From: Alex Vandiver <alexmv@dropbox.com> With fsmonitor enabled, the first call to match_stat_with_submodule calls refresh_fsmonitor, incurring the overhead of reading the list of updated files -- but run_diff_files does not respect the CE_FSMONITOR_VALID flag. Make use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. Notably, this change improves performance of the git shell prompt when GIT_PS1_SHOWDIRTYSTATE is set. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- diff-lib.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/diff-lib.c b/diff-lib.c index f95c6de75f..d2d31b9f82 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/"); + refresh_fsmonitor(istate); + if (diff_unmerged_stage < 0) diff_unmerged_stage = 2; entries = istate->cache_nr; @@ -197,8 +199,17 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (ce_uptodate(ce) || ce_skip_worktree(ce)) continue; - /* If CE_VALID is set, don't look at workdir for file removal */ - if (ce->ce_flags & CE_VALID) { + /* + * When CE_VALID is set (via "update-index --assume-unchanged" + * or via adding paths while core.ignorestat is set to true), + * the user has promised that the working tree file for that + * path will not be modified. When CE_FSMONITOR_VALID is true, + * the fsmonitor knows that the path hasn't been modified since + * we refreshed the cached stat information. In either case, + * we do not have to stat to see if the path has been removed + * or modified. + */ + if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { changed = 0; newmode = ce->ce_mode; } else { -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v2 2/4] t/perf/README: elaborate on output format 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget @ 2020-10-19 21:35 ` Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget ` (2 subsequent siblings) 4 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/README | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/perf/README b/t/perf/README index bd649afa97..fb9127a66f 100644 --- a/t/perf/README +++ b/t/perf/README @@ -28,6 +28,8 @@ the tests on the current git repository. 7810.3: grep --cached, cheap regex 3.07(3.02+0.25) 7810.4: grep --cached, expensive regex 9.39(30.57+0.24) +Output format is in seconds "Elapsed(User + System)" + You can compare multiple repositories and even git revisions with the 'run' script: -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 ` Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 4 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> The first git status would be inflated due to warming of filesystem cache. This makes the results comparable. Before Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 7519.7: status (fsmonitor=) 0.69(0.52+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) After Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) 7519.7: status (fsmonitor=) 0.70(0.53+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index def7ecdbc7..9313d4a51d 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" ' fi && git config core.fsmonitor "$INTEGRATION_SCRIPT" && - git update-index --fsmonitor + git update-index --fsmonitor && + git status # Warm caches ' if test -n "$GIT_PERF_7519_DROP_CACHE"; then -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (2 preceding siblings ...) 2020-10-19 21:35 ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 ` Nipunn Koorapati via GitGitGadget 2020-10-19 21:43 ` Taylor Blau 2020-10-19 21:54 ` Taylor Blau 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 4 siblings, 2 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:35 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Results for the git-diff fsmonitor optimization in patch in the parent-rev (using a 400k file repo to test) As you can see here - git diff with fsmonitor running is significantly better with this patch series (80% faster on my workload)! GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1% 7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0% 7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0% 7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6% 7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0% 7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0% 7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0% 7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0% 7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0% 7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0% I also added a benchmark for a tiny git diff workload w/ a pathspec. I see an approximately .02 second overhead added w/ and w/o fsmonitor From looking at these results, I suspected that refresh_fsmonitor is already happening during git diff - independent of this patch series' optimization. Confirmed that suspicion by breaking on refresh_fsmonitor. (gdb) bt [simplified] 0 refresh_fsmonitor at fsmonitor.c:176 1 ie_match_stat at read-cache.c:375 2 match_stat_with_submodule at diff-lib.c:237 4 builtin_diff_files at builtin/diff.c:260 5 cmd_diff at builtin/diff.c:541 6 run_builtin at git.c:450 7 handle_builtin at git.c:700 8 run_argv at git.c:767 9 cmd_main at git.c:898 10 main at common-main.c:52 Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index 9313d4a51d..2b4803707f 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" ' git config core.fsmonitor "$INTEGRATION_SCRIPT" && git update-index --fsmonitor && + mkdir 1_file 10_files 100_files 1000_files 10000_files && + for i in `seq 1 10`; do touch 10_files/$i; done && + for i in `seq 1 100`; do touch 100_files/$i; done && + for i in `seq 1 1000`; do touch 1000_files/$i; done && + for i in `seq 1 10000`; do touch 10000_files/$i; done && + git add 1_file 10_files 100_files 1000_files 10000_files && + git commit -m "Add files" && git status # Warm caches ' @@ -142,6 +149,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file +' + +test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files +' + +test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files +' + +test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files +' + +test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files +' + test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && git config --unset core.fsmonitor && @@ -172,6 +211,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file +' + +test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files +' + +test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files +' + +test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files +' + +test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files +' + if test_have_prereq WATCHMAN then watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 && -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-19 21:35 ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget @ 2020-10-19 21:43 ` Taylor Blau 2020-10-19 21:54 ` Taylor Blau 1 sibling, 0 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-19 21:43 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau On Mon, Oct 19, 2020 at 09:35:15PM +0000, Nipunn Koorapati via GitGitGadget wrote: > diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh > index 9313d4a51d..2b4803707f 100755 > --- a/t/perf/p7519-fsmonitor.sh > +++ b/t/perf/p7519-fsmonitor.sh > @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" ' > > git config core.fsmonitor "$INTEGRATION_SCRIPT" && > git update-index --fsmonitor && > + mkdir 1_file 10_files 100_files 1000_files 10000_files && > + for i in `seq 1 10`; do touch 10_files/$i; done && > + for i in `seq 1 100`; do touch 100_files/$i; done && > + for i in `seq 1 1000`; do touch 1000_files/$i; done && > + for i in `seq 1 10000`; do touch 10000_files/$i; done && I just happened to notice these while reading your range diff; git discourages the use of seq in test, instead preferring our own works-everywhere 'test_seq()'. I was wondering how this slipped through since it should be checked automatically by t/check-non-portable-shell.pl, but that is only run from t/Makefile, not t/perf/Makefile. That probably explains how a few raw `seq`'s made it into t/perf. In either case, test_seq() is preferred here. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-19 21:35 ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-19 21:43 ` Taylor Blau @ 2020-10-19 21:54 ` Taylor Blau 2020-10-19 22:00 ` Nipunn Koorapati 2020-10-19 22:25 ` Nipunn Koorapati 1 sibling, 2 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-19 21:54 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau On Mon, Oct 19, 2020 at 09:35:15PM +0000, Nipunn Koorapati via GitGitGadget wrote: > From: Nipunn Koorapati <nipunn@dropbox.com> > > Results for the git-diff fsmonitor optimization > in patch in the parent-rev (using a 400k file repo to test) > > As you can see here - git diff with fsmonitor running is > significantly better with this patch series (80% faster on my > workload)! These t/perf numbers are very helpful, at least to me. > GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh > > Test v2.29.0-rc1 this tree > ----------------------------------------------------------------------------------------------------------------- > 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7% > 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3% > 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7% Looks like about 0.01sec of overhead, which seems like an acceptable trade-off for when the user has at least 10,000 files. This reminds me; did you look at the 'git add' performance change? I recall Junio mentioning that 'git add' takes the same paths in the code. > 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5% > 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3% > 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3% > 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3% > 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3% > 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1% OK... so having fsmonitor turned on adds an imperceptible amount of slow-down to cases where there are [0, 10000) files. But, in exchange, you get much-improved whole-tree performance, as well as single-tree performance when that tree contains at least 10,000 files. I was going to say that this has little downside, because turning on fsmonitor is probably a good indicator that you don't have any fewer than 10,000 files in your repository, but I think that's missing the point. Likely true, but that doesn't exclude the possibility of having sub-10,000 file directories, which users may very well still be diff-ing. So, there's a slow-down, but it's hard to complain when you consider what we get in exchange. > 7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0% > 7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0% > 7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6% > 7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0% > 7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0% > 7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0% > 7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0% > 7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0% > 7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0% Great! No slow-down without fsmonitor enabled, as expected. Fantastic. > I also added a benchmark for a tiny git diff workload w/ a pathspec. > I see an approximately .02 second overhead added w/ and w/o fsmonitor > > From looking at these results, I suspected that refresh_fsmonitor > is already happening during git diff - independent of this patch > series' optimization. Confirmed that suspicion by breaking on > refresh_fsmonitor. So, the overhead that we're paying is purely the pipe+fork+exec? I.e., that watchman has already computed an answer in the earlier call, and we just have to read it again (or find out that the last results were unchanged)? > (gdb) bt [simplified] > 0 refresh_fsmonitor at fsmonitor.c:176 > 1 ie_match_stat at read-cache.c:375 > 2 match_stat_with_submodule at diff-lib.c:237 > 4 builtin_diff_files at builtin/diff.c:260 > 5 cmd_diff at builtin/diff.c:541 > 6 run_builtin at git.c:450 > 7 handle_builtin at git.c:700 > 8 run_argv at git.c:767 > 9 cmd_main at git.c:898 > 10 main at common-main.c:52 :-). > Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> > --- > t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++ > 1 file changed, 71 insertions(+) > > diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh > index 9313d4a51d..2b4803707f 100755 > --- a/t/perf/p7519-fsmonitor.sh > +++ b/t/perf/p7519-fsmonitor.sh > @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" ' Everything in here looks very reasonable to me, except for the seq vs. test_seq() issue that I pointed out in another email in this thread. It's too bad that we have to write these twice, but that's not the fault of your patch. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-19 21:54 ` Taylor Blau @ 2020-10-19 22:00 ` Nipunn Koorapati 2020-10-19 22:02 ` Taylor Blau 2020-10-19 22:25 ` Nipunn Koorapati 1 sibling, 1 reply; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-19 22:00 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati > I was wondering how this slipped through since it should be checked > automatically by t/check-non-portable-shell.pl, but that is only run > from t/Makefile, not t/perf/Makefile. That probably explains how a few > raw `seq`'s made it into t/perf. Makefile issue is easy enough to fix - will fix in another commit ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-19 22:00 ` Nipunn Koorapati @ 2020-10-19 22:02 ` Taylor Blau 0 siblings, 0 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-19 22:02 UTC (permalink / raw) To: Nipunn Koorapati Cc: Taylor Blau, Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati On Mon, Oct 19, 2020 at 11:00:05PM +0100, Nipunn Koorapati wrote: > > I was wondering how this slipped through since it should be checked > > automatically by t/check-non-portable-shell.pl, but that is only run > > from t/Makefile, not t/perf/Makefile. That probably explains how a few > > raw `seq`'s made it into t/perf. > > Makefile issue is easy enough to fix - will fix in another commit Thanks; I don't think that you need to worry about touching up t/perf's Makefile (although I'd be very grateful if you did!), but rather just swapping new invocations of seq to use 'test_seq()' would be sufficient in and of itself. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff 2020-10-19 21:54 ` Taylor Blau 2020-10-19 22:00 ` Nipunn Koorapati @ 2020-10-19 22:25 ` Nipunn Koorapati 1 sibling, 0 replies; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-19 22:25 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati > This reminds me; did you look at the 'git add' performance change? I > recall Junio mentioning that 'git add' takes the same paths in the code. I did look into it and didn't see a big perf change - and didn't dig into why. I'll leave a perf test in the next roll of this patch series so you can see the numbers. --Nipunn ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (3 preceding siblings ...) 2020-10-19 21:35 ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget ` (7 more replies) 4 siblings, 8 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx. I've rebased it and am submitting it now. With fsmonitor enabled, git diff currently lstats every file in the repo This makes use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. I was able to do some testing with/without this change in a large in-house repo (~ 400k files). ----------------------------------------- (1) With fsmonitor enabled - on master of git (2.29.0) ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.64 4.358994 10 446257 3 lstat 0.12 0.005353 7 764 360 open (A subsequent call) strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.84 4.380955 10 444904 3 lstat 0.06 0.002564 135 19 munmap ... ----------------------------------------- (2) With fsmonitor enabled - with my patch ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 50.72 0.003090 163 19 munmap 19.63 0.001196 598 2 futex ... 0.00 0.000000 0 4 3 lstat ----------------------------------------- (3) With fsmonitor disabled entirely ----------------------------------------- % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 98.52 0.277085 92362 3 futex 0.27 0.000752 4 191 63 open ... 0.14 0.000397 3 158 3 lstat I was able to encode this into a perf test in one of the commits. Changes since Patch Series V1 * Add git diff -- <pathspec> to perf tests * improve readability of bitwise ops Alex Vandiver (1): fsmonitor: use fsmonitor data in `git diff` Nipunn Koorapati (6): t/perf/README: elaborate on output format t/perf/p7519-fsmonitor.sh: warm cache on first git status t/perf: add fsmonitor perf test for git diff perf lint: check test-lint-shell-syntax in perf tests p7519-fsmonitor: refactor to avoid code duplication p7519-fsmonitor: add a git add benchmark diff-lib.c | 15 +++++- t/Makefile | 3 +- t/perf/README | 2 + t/perf/p3400-rebase.sh | 6 +-- t/perf/p7519-fsmonitor.sh | 96 ++++++++++++++++++++++----------------- 5 files changed, 75 insertions(+), 47 deletions(-) base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/756 Range-diff vs v2: 1: cba03dd40b = 1: cba03dd40b fsmonitor: use fsmonitor data in `git diff` 2: 1c7876166f = 2: 1c7876166f t/perf/README: elaborate on output format 3: 401f696c81 = 3: 401f696c81 t/perf/p7519-fsmonitor.sh: warm cache on first git status 4: f572e226bb ! 4: b3ad8faac4 t/perf: add fsmonitor perf test for git diff @@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup for fsmonitor" ' git config core.fsmonitor "$INTEGRATION_SCRIPT" && git update-index --fsmonitor && + mkdir 1_file 10_files 100_files 1000_files 10000_files && -+ for i in `seq 1 10`; do touch 10_files/$i; done && -+ for i in `seq 1 100`; do touch 100_files/$i; done && -+ for i in `seq 1 1000`; do touch 1000_files/$i; done && -+ for i in `seq 1 10000`; do touch 10000_files/$i; done && ++ for i in $(test_seq 1 10); do touch 10_files/$i; done && ++ for i in $(test_seq 1 100); do touch 100_files/$i; done && ++ for i in $(test_seq 1 1000); do touch 1000_files/$i; done && ++ for i in $(test_seq 1 10000); do touch 10000_files/$i; done && + git add 1_file 10_files 100_files 1000_files 10000_files && + git commit -m "Add files" && git status # Warm caches -: ---------- > 5: 28c1e488bf perf lint: check test-lint-shell-syntax in perf tests -: ---------- > 6: b38f2984f9 p7519-fsmonitor: refactor to avoid code duplication -: ---------- > 7: d392a523f2 p7519-fsmonitor: add a git add benchmark -- gitgitgadget ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Alex Vandiver via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget ` (6 subsequent siblings) 7 siblings, 0 replies; 52+ messages in thread From: Alex Vandiver via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Alex Vandiver From: Alex Vandiver <alexmv@dropbox.com> With fsmonitor enabled, the first call to match_stat_with_submodule calls refresh_fsmonitor, incurring the overhead of reading the list of updated files -- but run_diff_files does not respect the CE_FSMONITOR_VALID flag. Make use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. Notably, this change improves performance of the git shell prompt when GIT_PS1_SHOWDIRTYSTATE is set. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- diff-lib.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/diff-lib.c b/diff-lib.c index f95c6de75f..d2d31b9f82 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/"); + refresh_fsmonitor(istate); + if (diff_unmerged_stage < 0) diff_unmerged_stage = 2; entries = istate->cache_nr; @@ -197,8 +199,17 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (ce_uptodate(ce) || ce_skip_worktree(ce)) continue; - /* If CE_VALID is set, don't look at workdir for file removal */ - if (ce->ce_flags & CE_VALID) { + /* + * When CE_VALID is set (via "update-index --assume-unchanged" + * or via adding paths while core.ignorestat is set to true), + * the user has promised that the working tree file for that + * path will not be modified. When CE_FSMONITOR_VALID is true, + * the fsmonitor knows that the path hasn't been modified since + * we refreshed the cached stat information. In either case, + * we do not have to stat to see if the path has been removed + * or modified. + */ + if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { changed = 0; newmode = ce->ce_mode; } else { -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v3 2/7] t/perf/README: elaborate on output format 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget ` (5 subsequent siblings) 7 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/README | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/perf/README b/t/perf/README index bd649afa97..fb9127a66f 100644 --- a/t/perf/README +++ b/t/perf/README @@ -28,6 +28,8 @@ the tests on the current git repository. 7810.3: grep --cached, cheap regex 3.07(3.02+0.25) 7810.4: grep --cached, expensive regex 9.39(30.57+0.24) +Output format is in seconds "Elapsed(User + System)" + You can compare multiple repositories and even git revisions with the 'run' script: -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget ` (4 subsequent siblings) 7 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> The first git status would be inflated due to warming of filesystem cache. This makes the results comparable. Before Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 7519.7: status (fsmonitor=) 0.69(0.52+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) After Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) 7519.7: status (fsmonitor=) 0.70(0.53+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index def7ecdbc7..9313d4a51d 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" ' fi && git config core.fsmonitor "$INTEGRATION_SCRIPT" && - git update-index --fsmonitor + git update-index --fsmonitor && + git status # Warm caches ' if test -n "$GIT_PERF_7519_DROP_CACHE"; then -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (2 preceding siblings ...) 2020-10-19 22:47 ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget ` (3 subsequent siblings) 7 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Results for the git-diff fsmonitor optimization in patch in the parent-rev (using a 400k file repo to test) As you can see here - git diff with fsmonitor running is significantly better with this patch series (80% faster on my workload)! GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1% 7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0% 7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0% 7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6% 7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0% 7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0% 7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0% 7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0% 7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0% 7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0% I also added a benchmark for a tiny git diff workload w/ a pathspec. I see an approximately .02 second overhead added w/ and w/o fsmonitor From looking at these results, I suspected that refresh_fsmonitor is already happening during git diff - independent of this patch series' optimization. Confirmed that suspicion by breaking on refresh_fsmonitor. (gdb) bt [simplified] 0 refresh_fsmonitor at fsmonitor.c:176 1 ie_match_stat at read-cache.c:375 2 match_stat_with_submodule at diff-lib.c:237 4 builtin_diff_files at builtin/diff.c:260 5 cmd_diff at builtin/diff.c:541 6 run_builtin at git.c:450 7 handle_builtin at git.c:700 8 run_argv at git.c:767 9 cmd_main at git.c:898 10 main at common-main.c:52 Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index 9313d4a51d..ef4c3c8c5c 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" ' git config core.fsmonitor "$INTEGRATION_SCRIPT" && git update-index --fsmonitor && + mkdir 1_file 10_files 100_files 1000_files 10000_files && + for i in $(test_seq 1 10); do touch 10_files/$i; done && + for i in $(test_seq 1 100); do touch 100_files/$i; done && + for i in $(test_seq 1 1000); do touch 1000_files/$i; done && + for i in $(test_seq 1 10000); do touch 10000_files/$i; done && + git add 1_file 10_files 100_files 1000_files 10000_files && + git commit -m "Add files" && git status # Warm caches ' @@ -142,6 +149,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file +' + +test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files +' + +test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files +' + +test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files +' + +test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files +' + test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && git config --unset core.fsmonitor && @@ -172,6 +211,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file +' + +test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files +' + +test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files +' + +test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files +' + +test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files +' + if test_have_prereq WATCHMAN then watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 && -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (3 preceding siblings ...) 2020-10-19 22:47 ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-20 2:38 ` Taylor Blau 2020-10-19 22:47 ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget ` (2 subsequent siblings) 7 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Perf tests have some seq instead of test_seq. This runs the existing tests on the perf tests as well. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/Makefile | 3 ++- t/perf/p3400-rebase.sh | 6 +++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/t/Makefile b/t/Makefile index c83fd18861..74b53af2bd 100644 --- a/t/Makefile +++ b/t/Makefile @@ -34,6 +34,7 @@ CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP)) T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh)) TGITWEB = $(sort $(wildcard t95[0-9][0-9]-*.sh)) THELPERS = $(sort $(filter-out $(T),$(wildcard *.sh))) +TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh)) CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test))) CHAINLINT = sed -f chainlint.sed @@ -91,7 +92,7 @@ test-lint-executable: echo >&2 "non-executable tests:" $$bad; exit 1; } test-lint-shell-syntax: - @'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) + @'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) $(TPERF) test-lint-filenames: @# We do *not* pass a glob to ls-files but use grep instead, to catch diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh index d202aaed06..7a0bb29448 100755 --- a/t/perf/p3400-rebase.sh +++ b/t/perf/p3400-rebase.sh @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' ' git checkout -f -B base && git checkout -B to-rebase && git checkout -B upstream && - for i in $(seq 100) + for i in $(test_seq 100) do # simulate huge diffs echo change$i >unrelated-file$i && - seq 1000 >>unrelated-file$i && + test_seq 1000 >>unrelated-file$i && git add unrelated-file$i && test_tick && git commit -m commit$i unrelated-file$i && echo change$i >unrelated-file$i && - seq 1000 | tac >>unrelated-file$i && + test_seq 1000 | tac >>unrelated-file$i && git add unrelated-file$i && test_tick && git commit -m commit$i-reverse unrelated-file$i || -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests 2020-10-19 22:47 ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget @ 2020-10-20 2:38 ` Taylor Blau 2020-10-20 3:10 ` Junio C Hamano 2020-10-20 10:09 ` Nipunn Koorapati 0 siblings, 2 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-20 2:38 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau On Mon, Oct 19, 2020 at 10:47:35PM +0000, Nipunn Koorapati via GitGitGadget wrote: > test-lint-shell-syntax: > - @'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) > + @'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) $(TPERF) I really appreciate your initiative to modify t/Makefile to start linting t/perf/p????-*.sh files, too. Could I bother you to elaborate a little bit on why you chose to modify a recipe in t/Makefile instead of t/perf/Makefile? I'm not necessarily opposed, but having this in t/perf/Makefile would allow me to just run 'make' in 't/perf' and still have the scripts linted there without having to involve a 'make' in 't'. For what it's worth, I suspect that this is because 't/Makefile' already has a 'test-lint-shell-syntax' target, and 't/perf/Makefile' does not. I think it would be OK to add it there, too, and move this change into t/perf. > diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh > index d202aaed06..7a0bb29448 100755 > --- a/t/perf/p3400-rebase.sh > +++ b/t/perf/p3400-rebase.sh > @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' ' > git checkout -f -B base && > git checkout -B to-rebase && > git checkout -B upstream && > - for i in $(seq 100) > + for i in $(test_seq 100) > do > # simulate huge diffs > echo change$i >unrelated-file$i && > - seq 1000 >>unrelated-file$i && > + test_seq 1000 >>unrelated-file$i && > git add unrelated-file$i && > test_tick && > git commit -m commit$i unrelated-file$i && > echo change$i >unrelated-file$i && > - seq 1000 | tac >>unrelated-file$i && > + test_seq 1000 | tac >>unrelated-file$i && Makes sense. I wouldn't be opposed to breaking this out into an earlier change (e.g., "it's about to become not OK to use seq in t/perf, so prepare for that by replacing any invocations with test_seq()"), but I think it's probably not worth it, since this patch is small as it is. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests 2020-10-20 2:38 ` Taylor Blau @ 2020-10-20 3:10 ` Junio C Hamano 2020-10-20 3:15 ` Taylor Blau 2020-10-20 10:09 ` Nipunn Koorapati 1 sibling, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2020-10-20 3:10 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati Taylor Blau <me@ttaylorr.com> writes: >> echo change$i >unrelated-file$i && >> - seq 1000 | tac >>unrelated-file$i && >> + test_seq 1000 | tac >>unrelated-file$i && > > Makes sense. I wouldn't be opposed to breaking this out into an earlier > change (e.g., "it's about to become not OK to use seq in t/perf, so > prepare for that by replacing any invocations with test_seq()"), but I > think it's probably not worth it, since this patch is small as it is. test_seq is fine, but I do not think tac is portable (only saved by the fact that not many people, especially on exotic platforms, run perf scripts). ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests 2020-10-20 3:10 ` Junio C Hamano @ 2020-10-20 3:15 ` Taylor Blau 2020-10-20 10:16 ` Nipunn Koorapati 0 siblings, 1 reply; 52+ messages in thread From: Taylor Blau @ 2020-10-20 3:15 UTC (permalink / raw) To: Junio C Hamano Cc: Taylor Blau, Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati On Mon, Oct 19, 2020 at 08:10:56PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > >> echo change$i >unrelated-file$i && > >> - seq 1000 | tac >>unrelated-file$i && > >> + test_seq 1000 | tac >>unrelated-file$i && > > > > Makes sense. I wouldn't be opposed to breaking this out into an earlier > > change (e.g., "it's about to become not OK to use seq in t/perf, so > > prepare for that by replacing any invocations with test_seq()"), but I > > think it's probably not worth it, since this patch is small as it is. > > test_seq is fine, but I do not think tac is portable (only saved by > the fact that not many people, especially on exotic platforms, run > perf scripts). Serves me right for reading while I'm tired! I glazed right over 'tac'. If you need a truly unrelated file, you could write random data into it (there are some examples in t/test-lib-functions.sh), but I'd just write 'test_seq 1001'. Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests 2020-10-20 3:15 ` Taylor Blau @ 2020-10-20 10:16 ` Nipunn Koorapati 0 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-20 10:16 UTC (permalink / raw) To: Taylor Blau Cc: Junio C Hamano, Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati > > test_seq is fine, but I do not think tac is portable (only saved by > > the fact that not many people, especially on exotic platforms, run > > perf scripts). > > Serves me right for reading while I'm tired! I glazed right over 'tac'. > If you need a truly unrelated file, you could write random data into it > (there are some examples in t/test-lib-functions.sh), but I'd just write > 'test_seq 1001'. Seems like there might be some value to adding `tac` to the perl script check-non-portable-shell.pl - though I'm not sure what we'd use as an alternative. I'll leave this here for now for someone else to handle in a follow up patch series ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests 2020-10-20 2:38 ` Taylor Blau 2020-10-20 3:10 ` Junio C Hamano @ 2020-10-20 10:09 ` Nipunn Koorapati 1 sibling, 0 replies; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-20 10:09 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati On Tue, Oct 20, 2020 at 3:39 AM Taylor Blau <me@ttaylorr.com> wrote: > > I'm not necessarily opposed, but having this in t/perf/Makefile would > allow me to just run 'make' in 't/perf' and still have the scripts > linted there without having to involve a 'make' in 't'. > > For what it's worth, I suspect that this is because 't/Makefile' already > has a 'test-lint-shell-syntax' target, and 't/perf/Makefile' does not. I > think it would be OK to add it there, too, and move this change into > t/perf. Looked at doing this and noticed that there are several targets in test-lint in t/Makefile. This would involve duplicating them into t/perf/Makefile which seems like it would be poor form, especially given their complexity. Perhaps t/perf/Makefile could have a target which calls t/Makefile's test-lint target instead. Will play around with it. > > Makes sense. I wouldn't be opposed to breaking this out into an earlier > change (e.g., "it's about to become not OK to use seq in t/perf, so > prepare for that by replacing any invocations with test_seq()"), but I > think it's probably not worth it, since this patch is small as it is. > Yeah - I see the point, but I agree that since the patch is small, it's ok this way. If the patch grows significantly, I can make it into two patches --Nipunn ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (4 preceding siblings ...) 2020-10-19 22:47 ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-20 2:43 ` Taylor Blau 2020-10-19 22:47 ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 7 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Much of the benchmark code is redundant. This is easier to understand and edit. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 136 +++++++++++--------------------------- 1 file changed, 37 insertions(+), 99 deletions(-) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index ef4c3c8c5c..75a0cef01d 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -125,61 +125,53 @@ test_expect_success "setup for fsmonitor" ' git status # Warm caches ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi +test_perf_w_drop_caches () { + if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches + fi -test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -' + test_perf "$@" +} -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi +test_fsmonitor_suite() { + test_perf_w_drop_caches "status (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status + ' -test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uno -' + test_perf_w_drop_caches "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uno + ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi + test_perf_w_drop_caches "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uall + ' -test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uall -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -' + test_perf_w_drop_caches "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff + ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi + test_perf_w_drop_caches "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file + ' -test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1_file -' + test_perf_w_drop_caches "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files + ' -test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10_files -' + test_perf_w_drop_caches "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files + ' -test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 100_files -' + test_perf_w_drop_caches "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files + ' -test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1000_files -' + test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files + ' +} -test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10000_files -' +test_fsmonitor_suite test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && @@ -187,61 +179,7 @@ test_expect_success "setup without fsmonitor" ' git update-index --no-fsmonitor ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uno -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uall -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1_file -' - -test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10_files -' - -test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 100_files -' - -test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1000_files -' - -test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10000_files -' +test_fsmonitor_suite if test_have_prereq WATCHMAN then -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication 2020-10-19 22:47 ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget @ 2020-10-20 2:43 ` Taylor Blau 0 siblings, 0 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-20 2:43 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau On Mon, Oct 19, 2020 at 10:47:36PM +0000, Nipunn Koorapati via GitGitGadget wrote: > From: Nipunn Koorapati <nipunn@dropbox.com> > > Much of the benchmark code is redundant. This is > easier to understand and edit. > > Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Much easier to read, thank you for taking the time to simplify the code. I know that this is maybe more review than you were hoping for, but I think it's been worth it and the series is in an even better state than when you started. Acked-by: Taylor Blau <me@ttaylorr.com> Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (5 preceding siblings ...) 2020-10-19 22:47 ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 ` Nipunn Koorapati via GitGitGadget 2020-10-19 23:02 ` Nipunn Koorapati 2020-10-20 2:40 ` Taylor Blau 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 7 siblings, 2 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-19 22:47 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.48(0.79+0.67) 1.48(0.79+0.67) +0.0% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.11+0.05) 0.17(0.13+0.04) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.77+0.58) 1.37(0.72+0.63) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.84(0.21+0.63) 0.14(0.11+0.03) -83.3% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.09+0.04) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.07+0.06) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.12(0.08+0.05) +0.0% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.09+0.04) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.08+0.06) 0.13(0.07+0.06) -7.1% 7519.11: add (fsmonitor=.git/hooks/fsmonitor-watchman) 2.75(1.41+1.27) 2.03(1.26+0.70) -26.2% 7519.13: status (fsmonitor=) 1.38(1.03+1.04) 1.37(1.04+1.04) -0.7% 7519.14: status -uno (fsmonitor=) 1.11(0.83+0.98) 1.10(0.89+0.90) -0.9% 7519.15: status -uall (fsmonitor=) 2.30(1.57+1.42) 2.31(1.49+1.50) +0.4% 7519.16: diff (fsmonitor=) 1.43(1.13+1.76) 1.46(1.19+1.72) +2.1% 7519.17: diff -- 0_files (fsmonitor=) 0.10(0.08+0.04) 0.11(0.08+0.04) +10.0% 7519.18: diff -- 10_files (fsmonitor=) 0.10(0.07+0.05) 0.11(0.08+0.04) +10.0% 7519.19: diff -- 100_files (fsmonitor=) 0.10(0.07+0.04) 0.11(0.07+0.05) +10.0% 7519.20: diff -- 1000_files (fsmonitor=) 0.10(0.08+0.03) 0.11(0.08+0.04) +10.0% 7519.21: diff -- 10000_files (fsmonitor=) 0.11(0.08+0.05) 0.12(0.07+0.06) +9.1% 7519.22: add (fsmonitor=) 2.26(1.46+1.49) 2.27(1.42+1.55) +0.4% Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index 75a0cef01d..fb20fe0937 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -169,6 +169,10 @@ test_fsmonitor_suite() { test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' git diff -- 10000_files ' + + test_perf_w_drop_caches "add (fsmonitor=$INTEGRATION_SCRIPT)" ' + git add --all + ' } test_fsmonitor_suite -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark 2020-10-19 22:47 ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget @ 2020-10-19 23:02 ` Nipunn Koorapati 2020-10-20 2:40 ` Taylor Blau 1 sibling, 0 replies; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-19 23:02 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Taylor Blau Actually found a 25% improvement here on git add with this patch series ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark 2020-10-19 22:47 ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget 2020-10-19 23:02 ` Nipunn Koorapati @ 2020-10-20 2:40 ` Taylor Blau 1 sibling, 0 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-20 2:40 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau On Mon, Oct 19, 2020 at 10:47:37PM +0000, Nipunn Koorapati via GitGitGadget wrote: > From: Nipunn Koorapati <nipunn@dropbox.com> > > Test v2.29.0-rc1 this tree > ----------------------------------------------------------------------------------------------------------------- > [...] > 7519.22: add (fsmonitor=) 2.26(1.46+1.49) 2.27(1.42+1.55) +0.4% Good; no huge slow-down here. Thanks for checking! > Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> > --- > t/perf/p7519-fsmonitor.sh | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh > index 75a0cef01d..fb20fe0937 100755 > --- a/t/perf/p7519-fsmonitor.sh > +++ b/t/perf/p7519-fsmonitor.sh > @@ -169,6 +169,10 @@ test_fsmonitor_suite() { > test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' > git diff -- 10000_files > ' > + > + test_perf_w_drop_caches "add (fsmonitor=$INTEGRATION_SCRIPT)" ' > + git add --all > + ' > } > > test_fsmonitor_suite > -- > gitgitgadget Acked-by: Taylor Blau <me@ttaylorr.com> Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (6 preceding siblings ...) 2020-10-19 22:47 ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:40 ` Nipunn Koorapati via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget ` (6 more replies) 7 siblings, 7 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:40 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati Credit to alexmv who made this commit back in Dec, 2017 when he was at dbx. I've rebased it and am submitting it now. With fsmonitor enabled, git diff currently lstats every file in the repo This makes use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. I was able to do some testing with/without this change in a large in-house repo (~ 400k files). ----------------------------------------- (1) With fsmonitor enabled - on master of git (2.29.0) ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.64 4.358994 10 446257 3 lstat 0.12 0.005353 7 764 360 open (A subsequent call) strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 99.84 4.380955 10 444904 3 lstat 0.06 0.002564 135 19 munmap ... ----------------------------------------- (2) With fsmonitor enabled - with my patch ----------------------------------------- ../git/bin-wrappers/git checkout HEAD~200 strace -c ../git/bin-wrappers/git diff % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 50.72 0.003090 163 19 munmap 19.63 0.001196 598 2 futex ... 0.00 0.000000 0 4 3 lstat ----------------------------------------- (3) With fsmonitor disabled entirely ----------------------------------------- % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 98.52 0.277085 92362 3 futex 0.27 0.000752 4 191 63 open ... 0.14 0.000397 3 158 3 lstat I was able to encode this into a perf test in one of the commits. Changes since Patch Series V1 * Add git diff -- <pathspec> to perf tests * improve readability of bitwise ops Changes since Patch Series V2 * Add git add to perf tests * Refactor perf fsmonitor to simplify / remove redundancy * Add linting to perf tests * Added git diff -- <pathspec> for various sized pathspecs * Confirmed that refresh_fsmonitor was always being called / added to commit message Changes since Patch Series V3 * Move perf test linting to Makefile in perf/ directory Alex Vandiver (1): fsmonitor: use fsmonitor data in `git diff` Nipunn Koorapati (6): t/perf/README: elaborate on output format t/perf/p7519-fsmonitor.sh: warm cache on first git status t/perf: add fsmonitor perf test for git diff perf lint: add make test-lint to perf tests p7519-fsmonitor: refactor to avoid code duplication p7519-fsmonitor: add a git add benchmark diff-lib.c | 15 +++++- t/Makefile | 7 +-- t/perf/Makefile | 5 +- t/perf/README | 2 + t/perf/p3400-rebase.sh | 6 +-- t/perf/p7519-fsmonitor.sh | 96 ++++++++++++++++++++++----------------- 6 files changed, 81 insertions(+), 50 deletions(-) base-commit: d4a392452e292ff924e79ec8458611c0f679d6d4 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-756%2Fnipunn1313%2Fdiff_fsmon-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-756/nipunn1313/diff_fsmon-v4 Pull-Request: https://github.com/gitgitgadget/git/pull/756 Range-diff vs v3: 1: cba03dd40b = 1: cba03dd40b fsmonitor: use fsmonitor data in `git diff` 2: 1c7876166f = 2: 1c7876166f t/perf/README: elaborate on output format 3: 401f696c81 = 3: 401f696c81 t/perf/p7519-fsmonitor.sh: warm cache on first git status 4: b3ad8faac4 = 4: b3ad8faac4 t/perf: add fsmonitor perf test for git diff 5: 28c1e488bf ! 5: b534cd137a perf lint: check test-lint-shell-syntax in perf tests @@ Metadata Author: Nipunn Koorapati <nipunn@dropbox.com> ## Commit message ## - perf lint: check test-lint-shell-syntax in perf tests + perf lint: add make test-lint to perf tests - Perf tests have some seq instead of test_seq. This - runs the existing tests on the perf tests as well. + Perf tests have not been linted for some time. + They've grown some seq instead of test_seq. This + runs the existing lints on the perf tests as well. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> @@ t/Makefile: CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP)) CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test))) CHAINLINT = sed -f chainlint.sed -@@ t/Makefile: test-lint-executable: +@@ t/Makefile: test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \ + test-lint-filenames + + test-lint-duplicates: +- @dups=`echo $(T) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \ ++ @dups=`echo $(T) $(TPERF) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \ + test -z "$$dups" || { \ + echo >&2 "duplicate test numbers:" $$dups; exit 1; } + + test-lint-executable: +- @bad=`for i in $(T); do test -x "$$i" || echo $$i; done` && \ ++ @bad=`for i in $(T) $(TPERF); do test -x "$$i" || echo $$i; done` && \ + test -z "$$bad" || { \ echo >&2 "non-executable tests:" $$bad; exit 1; } test-lint-shell-syntax: @@ t/Makefile: test-lint-executable: test-lint-filenames: @# We do *not* pass a glob to ls-files but use grep instead, to catch + ## t/perf/Makefile ## +@@ + -include ../../config.mak + export GIT_TEST_OPTIONS + +-all: perf ++all: test-lint perf + + perf: pre-clean + ./run +@@ t/perf/Makefile: pre-clean: + clean: + rm -rf build "trash directory".* test-results + ++test-lint: ++ $(MAKE) -C .. test-lint ++ + .PHONY: all perf pre-clean clean + ## t/perf/p3400-rebase.sh ## @@ t/perf/p3400-rebase.sh: test_expect_success 'setup rebasing on top of a lot of changes' ' git checkout -f -B base && 6: b38f2984f9 = 6: 3b20f4c76e p7519-fsmonitor: refactor to avoid code duplication 7: d392a523f2 = 7: 6f97439936 p7519-fsmonitor: add a git add benchmark -- gitgitgadget ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:40 ` Alex Vandiver via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget ` (5 subsequent siblings) 6 siblings, 0 replies; 52+ messages in thread From: Alex Vandiver via GitGitGadget @ 2020-10-20 13:40 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Alex Vandiver From: Alex Vandiver <alexmv@dropbox.com> With fsmonitor enabled, the first call to match_stat_with_submodule calls refresh_fsmonitor, incurring the overhead of reading the list of updated files -- but run_diff_files does not respect the CE_FSMONITOR_VALID flag. Make use of the fsmonitor extension to skip lstat() calls on files that fsmonitor judged as unmodified. Notably, this change improves performance of the git shell prompt when GIT_PS1_SHOWDIRTYSTATE is set. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- diff-lib.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/diff-lib.c b/diff-lib.c index f95c6de75f..d2d31b9f82 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -97,6 +97,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option) diff_set_mnemonic_prefix(&revs->diffopt, "i/", "w/"); + refresh_fsmonitor(istate); + if (diff_unmerged_stage < 0) diff_unmerged_stage = 2; entries = istate->cache_nr; @@ -197,8 +199,17 @@ int run_diff_files(struct rev_info *revs, unsigned int option) if (ce_uptodate(ce) || ce_skip_worktree(ce)) continue; - /* If CE_VALID is set, don't look at workdir for file removal */ - if (ce->ce_flags & CE_VALID) { + /* + * When CE_VALID is set (via "update-index --assume-unchanged" + * or via adding paths while core.ignorestat is set to true), + * the user has promised that the working tree file for that + * path will not be modified. When CE_FSMONITOR_VALID is true, + * the fsmonitor knows that the path hasn't been modified since + * we refreshed the cached stat information. In either case, + * we do not have to stat to see if the path has been removed + * or modified. + */ + if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) { changed = 0; newmode = ce->ce_mode; } else { -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v4 2/7] t/perf/README: elaborate on output format 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget @ 2020-10-20 13:40 ` Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget ` (4 subsequent siblings) 6 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:40 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/README | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/perf/README b/t/perf/README index bd649afa97..fb9127a66f 100644 --- a/t/perf/README +++ b/t/perf/README @@ -28,6 +28,8 @@ the tests on the current git repository. 7810.3: grep --cached, cheap regex 3.07(3.02+0.25) 7810.4: grep --cached, expensive regex 9.39(30.57+0.24) +Output format is in seconds "Elapsed(User + System)" + You can compare multiple repositories and even git revisions with the 'run' script: -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 ` Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget ` (3 subsequent siblings) 6 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> The first git status would be inflated due to warming of filesystem cache. This makes the results comparable. Before Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 7519.7: status (fsmonitor=) 0.69(0.52+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) After Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) 7519.7: status (fsmonitor=) 0.70(0.53+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index def7ecdbc7..9313d4a51d 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -114,7 +114,8 @@ test_expect_success "setup for fsmonitor" ' fi && git config core.fsmonitor "$INTEGRATION_SCRIPT" && - git update-index --fsmonitor + git update-index --fsmonitor && + git status # Warm caches ' if test -n "$GIT_PERF_7519_DROP_CACHE"; then -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (2 preceding siblings ...) 2020-10-20 13:41 ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 ` Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget ` (2 subsequent siblings) 6 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Results for the git-diff fsmonitor optimization in patch in the parent-rev (using a 400k file repo to test) As you can see here - git diff with fsmonitor running is significantly better with this patch series (80% faster on my workload)! GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1% 7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0% 7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0% 7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6% 7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0% 7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0% 7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0% 7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0% 7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0% 7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0% I also added a benchmark for a tiny git diff workload w/ a pathspec. I see an approximately .02 second overhead added w/ and w/o fsmonitor From looking at these results, I suspected that refresh_fsmonitor is already happening during git diff - independent of this patch series' optimization. Confirmed that suspicion by breaking on refresh_fsmonitor. (gdb) bt [simplified] 0 refresh_fsmonitor at fsmonitor.c:176 1 ie_match_stat at read-cache.c:375 2 match_stat_with_submodule at diff-lib.c:237 4 builtin_diff_files at builtin/diff.c:260 5 cmd_diff at builtin/diff.c:541 6 run_builtin at git.c:450 7 handle_builtin at git.c:700 8 run_argv at git.c:767 9 cmd_main at git.c:898 10 main at common-main.c:52 Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 71 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index 9313d4a51d..ef4c3c8c5c 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -115,6 +115,13 @@ test_expect_success "setup for fsmonitor" ' git config core.fsmonitor "$INTEGRATION_SCRIPT" && git update-index --fsmonitor && + mkdir 1_file 10_files 100_files 1000_files 10000_files && + for i in $(test_seq 1 10); do touch 10_files/$i; done && + for i in $(test_seq 1 100); do touch 100_files/$i; done && + for i in $(test_seq 1 1000); do touch 1000_files/$i; done && + for i in $(test_seq 1 10000); do touch 10000_files/$i; done && + git add 1_file 10_files 100_files 1000_files 10000_files && + git commit -m "Add files" && git status # Warm caches ' @@ -142,6 +149,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file +' + +test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files +' + +test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files +' + +test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files +' + +test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files +' + test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && git config --unset core.fsmonitor && @@ -172,6 +211,38 @@ test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' git status -uall ' +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff +' + +if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches +fi + +test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file +' + +test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files +' + +test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files +' + +test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files +' + +test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files +' + if test_have_prereq WATCHMAN then watchman watch-del "$GIT_WORK_TREE" >/dev/null 2>&1 && -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v4 5/7] perf lint: add make test-lint to perf tests 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (3 preceding siblings ...) 2020-10-20 13:41 ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 ` Nipunn Koorapati via GitGitGadget 2020-10-20 22:06 ` Taylor Blau 2020-10-20 13:41 ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget 6 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Perf tests have not been linted for some time. They've grown some seq instead of test_seq. This runs the existing lints on the perf tests as well. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/Makefile | 7 ++++--- t/perf/Makefile | 5 ++++- t/perf/p3400-rebase.sh | 6 +++--- 3 files changed, 11 insertions(+), 7 deletions(-) diff --git a/t/Makefile b/t/Makefile index c83fd18861..882d26eee3 100644 --- a/t/Makefile +++ b/t/Makefile @@ -34,6 +34,7 @@ CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP)) T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh)) TGITWEB = $(sort $(wildcard t95[0-9][0-9]-*.sh)) THELPERS = $(sort $(filter-out $(T),$(wildcard *.sh))) +TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh)) CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test))) CHAINLINT = sed -f chainlint.sed @@ -81,17 +82,17 @@ test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \ test-lint-filenames test-lint-duplicates: - @dups=`echo $(T) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \ + @dups=`echo $(T) $(TPERF) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \ test -z "$$dups" || { \ echo >&2 "duplicate test numbers:" $$dups; exit 1; } test-lint-executable: - @bad=`for i in $(T); do test -x "$$i" || echo $$i; done` && \ + @bad=`for i in $(T) $(TPERF); do test -x "$$i" || echo $$i; done` && \ test -z "$$bad" || { \ echo >&2 "non-executable tests:" $$bad; exit 1; } test-lint-shell-syntax: - @'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) + @'$(PERL_PATH_SQ)' check-non-portable-shell.pl $(T) $(THELPERS) $(TPERF) test-lint-filenames: @# We do *not* pass a glob to ls-files but use grep instead, to catch diff --git a/t/perf/Makefile b/t/perf/Makefile index 8c47155a7c..fcb0e8865e 100644 --- a/t/perf/Makefile +++ b/t/perf/Makefile @@ -1,7 +1,7 @@ -include ../../config.mak export GIT_TEST_OPTIONS -all: perf +all: test-lint perf perf: pre-clean ./run @@ -12,4 +12,7 @@ pre-clean: clean: rm -rf build "trash directory".* test-results +test-lint: + $(MAKE) -C .. test-lint + .PHONY: all perf pre-clean clean diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh index d202aaed06..7a0bb29448 100755 --- a/t/perf/p3400-rebase.sh +++ b/t/perf/p3400-rebase.sh @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' ' git checkout -f -B base && git checkout -B to-rebase && git checkout -B upstream && - for i in $(seq 100) + for i in $(test_seq 100) do # simulate huge diffs echo change$i >unrelated-file$i && - seq 1000 >>unrelated-file$i && + test_seq 1000 >>unrelated-file$i && git add unrelated-file$i && test_tick && git commit -m commit$i unrelated-file$i && echo change$i >unrelated-file$i && - seq 1000 | tac >>unrelated-file$i && + test_seq 1000 | tac >>unrelated-file$i && git add unrelated-file$i && test_tick && git commit -m commit$i-reverse unrelated-file$i || -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH v4 5/7] perf lint: add make test-lint to perf tests 2020-10-20 13:41 ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget @ 2020-10-20 22:06 ` Taylor Blau 2020-10-20 22:17 ` Nipunn Koorapati 0 siblings, 1 reply; 52+ messages in thread From: Taylor Blau @ 2020-10-20 22:06 UTC (permalink / raw) To: Nipunn Koorapati via GitGitGadget Cc: git, Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau On Tue, Oct 20, 2020 at 01:41:02PM +0000, Nipunn Koorapati via GitGitGadget wrote: > diff --git a/t/perf/Makefile b/t/perf/Makefile > index 8c47155a7c..fcb0e8865e 100644 > --- a/t/perf/Makefile > +++ b/t/perf/Makefile > @@ -1,7 +1,7 @@ > -include ../../config.mak > export GIT_TEST_OPTIONS > > -all: perf > +all: test-lint perf > > perf: pre-clean > ./run > @@ -12,4 +12,7 @@ pre-clean: > clean: > rm -rf build "trash directory".* test-results > > +test-lint: > + $(MAKE) -C .. test-lint > + Great; it sounds like adding a complete definition here was too much effort to be worth it, but that adding a '$(MAKE) -C ..' is just right. We can still run 'make test-lint' from within 't/perf', but there isn't a bunch of clutter in this series to make that happen. Thanks. > .PHONY: all perf pre-clean clean > diff --git a/t/perf/p3400-rebase.sh b/t/perf/p3400-rebase.sh > index d202aaed06..7a0bb29448 100755 > --- a/t/perf/p3400-rebase.sh > +++ b/t/perf/p3400-rebase.sh > @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' ' > git checkout -f -B base && > git checkout -B to-rebase && > git checkout -B upstream && > - for i in $(seq 100) > + for i in $(test_seq 100) > do > # simulate huge diffs > echo change$i >unrelated-file$i && > - seq 1000 >>unrelated-file$i && > + test_seq 1000 >>unrelated-file$i && > git add unrelated-file$i && > test_tick && > git commit -m commit$i unrelated-file$i && > echo change$i >unrelated-file$i && > - seq 1000 | tac >>unrelated-file$i && > + test_seq 1000 | tac >>unrelated-file$i && The rest of this all looks good, but I think adding 'tac' here is still wrong; this isn't available everywhere, so we would want to find an alternative before going further. Is there a reason that you couldn't use a different 'N' in 'test_seq N' here? Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v4 5/7] perf lint: add make test-lint to perf tests 2020-10-20 22:06 ` Taylor Blau @ 2020-10-20 22:17 ` Nipunn Koorapati 2020-10-20 22:19 ` Taylor Blau 0 siblings, 1 reply; 52+ messages in thread From: Nipunn Koorapati @ 2020-10-20 22:17 UTC (permalink / raw) To: Taylor Blau Cc: Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati > > --- a/t/perf/p3400-rebase.sh > > +++ b/t/perf/p3400-rebase.sh > > @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' ' > > git checkout -f -B base && > > git checkout -B to-rebase && > > git checkout -B upstream && > > - for i in $(seq 100) > > + for i in $(test_seq 100) > > do > > # simulate huge diffs > > echo change$i >unrelated-file$i && > > - seq 1000 >>unrelated-file$i && > > + test_seq 1000 >>unrelated-file$i && > > git add unrelated-file$i && > > test_tick && > > git commit -m commit$i unrelated-file$i && > > echo change$i >unrelated-file$i && > > - seq 1000 | tac >>unrelated-file$i && > > + test_seq 1000 | tac >>unrelated-file$i && > > The rest of this all looks good, but I think adding 'tac' here is still > wrong; this isn't available everywhere, so we would want to find an > alternative before going further. Is there a reason that you couldn't > use a different 'N' in 'test_seq N' here? Hey. I think there's some confusion. I didn't add `tac`. It was already here. I didn't even notice it until Junio mentioned it. --Nipunn ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH v4 5/7] perf lint: add make test-lint to perf tests 2020-10-20 22:17 ` Nipunn Koorapati @ 2020-10-20 22:19 ` Taylor Blau 0 siblings, 0 replies; 52+ messages in thread From: Taylor Blau @ 2020-10-20 22:19 UTC (permalink / raw) To: Nipunn Koorapati Cc: Taylor Blau, Nipunn Koorapati via GitGitGadget, git, Derrick Stolee, Utsav Shah, Nipunn Koorapati On Tue, Oct 20, 2020 at 11:17:23PM +0100, Nipunn Koorapati wrote: > > > --- a/t/perf/p3400-rebase.sh > > > +++ b/t/perf/p3400-rebase.sh > > > @@ -9,16 +9,16 @@ test_expect_success 'setup rebasing on top of a lot of changes' ' > > > git checkout -f -B base && > > > git checkout -B to-rebase && > > > git checkout -B upstream && > > > - for i in $(seq 100) > > > + for i in $(test_seq 100) > > > do > > > # simulate huge diffs > > > echo change$i >unrelated-file$i && > > > - seq 1000 >>unrelated-file$i && > > > + test_seq 1000 >>unrelated-file$i && > > > git add unrelated-file$i && > > > test_tick && > > > git commit -m commit$i unrelated-file$i && > > > echo change$i >unrelated-file$i && > > > - seq 1000 | tac >>unrelated-file$i && > > > + test_seq 1000 | tac >>unrelated-file$i && > > > > The rest of this all looks good, but I think adding 'tac' here is still > > wrong; this isn't available everywhere, so we would want to find an > > alternative before going further. Is there a reason that you couldn't > > use a different 'N' in 'test_seq N' here? > > Hey. I think there's some confusion. I didn't add `tac`. It was > already here. I didn't even notice it until Junio mentioned it. You're right, sorry; I just saw a line beginning with '+' that contained 'tac' and thought that it was new in this patch. What you have is OK, then, since it's not a new problem with your patch. It couldn't hurt to have the linting phase catch that, but let's leave that for another day, since I think what you have in this version looks good to me. Thanks for listening to all of my feedback :). > --Nipunn Thanks, Taylor ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (4 preceding siblings ...) 2020-10-20 13:41 ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 ` Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget 6 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Much of the benchmark code is redundant. This is easier to understand and edit. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 136 +++++++++++--------------------------- 1 file changed, 37 insertions(+), 99 deletions(-) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index ef4c3c8c5c..75a0cef01d 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -125,61 +125,53 @@ test_expect_success "setup for fsmonitor" ' git status # Warm caches ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi +test_perf_w_drop_caches () { + if test -n "$GIT_PERF_7519_DROP_CACHE"; then + test-tool drop-caches + fi -test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -' + test_perf "$@" +} -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi +test_fsmonitor_suite() { + test_perf_w_drop_caches "status (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status + ' -test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uno -' + test_perf_w_drop_caches "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uno + ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi + test_perf_w_drop_caches "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' + git status -uall + ' -test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uall -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -' + test_perf_w_drop_caches "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff + ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi + test_perf_w_drop_caches "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1_file + ' -test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1_file -' + test_perf_w_drop_caches "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10_files + ' -test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10_files -' + test_perf_w_drop_caches "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 100_files + ' -test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 100_files -' + test_perf_w_drop_caches "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 1000_files + ' -test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1000_files -' + test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' + git diff -- 10000_files + ' +} -test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10000_files -' +test_fsmonitor_suite test_expect_success "setup without fsmonitor" ' unset INTEGRATION_SCRIPT && @@ -187,61 +179,7 @@ test_expect_success "setup without fsmonitor" ' git update-index --no-fsmonitor ' -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "status (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "status -uno (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uno -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "status -uall (fsmonitor=$INTEGRATION_SCRIPT)" ' - git status -uall -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "diff (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -' - -if test -n "$GIT_PERF_7519_DROP_CACHE"; then - test-tool drop-caches -fi - -test_perf "diff -- 0_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1_file -' - -test_perf "diff -- 10_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10_files -' - -test_perf "diff -- 100_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 100_files -' - -test_perf "diff -- 1000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 1000_files -' - -test_perf "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' - git diff -- 10000_files -' +test_fsmonitor_suite if test_have_prereq WATCHMAN then -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
* [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget ` (5 preceding siblings ...) 2020-10-20 13:41 ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 ` Nipunn Koorapati via GitGitGadget 6 siblings, 0 replies; 52+ messages in thread From: Nipunn Koorapati via GitGitGadget @ 2020-10-20 13:41 UTC (permalink / raw) To: git Cc: Derrick Stolee, Utsav Shah, Nipunn Koorapati, Nipunn Koorapati, Taylor Blau, Nipunn Koorapati, Nipunn Koorapati From: Nipunn Koorapati <nipunn@dropbox.com> Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.48(0.79+0.67) 1.48(0.79+0.67) +0.0% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.11+0.05) 0.17(0.13+0.04) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.77+0.58) 1.37(0.72+0.63) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.84(0.21+0.63) 0.14(0.11+0.03) -83.3% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.09+0.04) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.07+0.06) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.12(0.08+0.05) +0.0% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.09+0.04) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.08+0.06) 0.13(0.07+0.06) -7.1% 7519.11: add (fsmonitor=.git/hooks/fsmonitor-watchman) 2.75(1.41+1.27) 2.03(1.26+0.70) -26.2% 7519.13: status (fsmonitor=) 1.38(1.03+1.04) 1.37(1.04+1.04) -0.7% 7519.14: status -uno (fsmonitor=) 1.11(0.83+0.98) 1.10(0.89+0.90) -0.9% 7519.15: status -uall (fsmonitor=) 2.30(1.57+1.42) 2.31(1.49+1.50) +0.4% 7519.16: diff (fsmonitor=) 1.43(1.13+1.76) 1.46(1.19+1.72) +2.1% 7519.17: diff -- 0_files (fsmonitor=) 0.10(0.08+0.04) 0.11(0.08+0.04) +10.0% 7519.18: diff -- 10_files (fsmonitor=) 0.10(0.07+0.05) 0.11(0.08+0.04) +10.0% 7519.19: diff -- 100_files (fsmonitor=) 0.10(0.07+0.04) 0.11(0.07+0.05) +10.0% 7519.20: diff -- 1000_files (fsmonitor=) 0.10(0.08+0.03) 0.11(0.08+0.04) +10.0% 7519.21: diff -- 10000_files (fsmonitor=) 0.11(0.08+0.05) 0.12(0.07+0.06) +9.1% 7519.22: add (fsmonitor=) 2.26(1.46+1.49) 2.27(1.42+1.55) +0.4% Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> --- t/perf/p7519-fsmonitor.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh index 75a0cef01d..fb20fe0937 100755 --- a/t/perf/p7519-fsmonitor.sh +++ b/t/perf/p7519-fsmonitor.sh @@ -169,6 +169,10 @@ test_fsmonitor_suite() { test_perf_w_drop_caches "diff -- 10000_files (fsmonitor=$INTEGRATION_SCRIPT)" ' git diff -- 10000_files ' + + test_perf_w_drop_caches "add (fsmonitor=$INTEGRATION_SCRIPT)" ' + git add --all + ' } test_fsmonitor_suite -- gitgitgadget ^ permalink raw reply related [flat|nested] 52+ messages in thread
end of thread, other threads:[~2020-10-20 22:19 UTC | newest] Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-17 22:25 ` Junio C Hamano 2020-10-18 0:54 ` Nipunn Koorapati 2020-10-18 4:17 ` Taylor Blau 2020-10-18 5:02 ` Junio C Hamano 2020-10-18 23:43 ` Taylor Blau 2020-10-19 17:23 ` Junio C Hamano 2020-10-19 17:37 ` Taylor Blau 2020-10-19 18:07 ` Nipunn Koorapati 2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget 2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget 2020-10-18 4:22 ` Taylor Blau 2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-17 22:28 ` Junio C Hamano 2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget 2020-10-19 21:35 ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-19 21:43 ` Taylor Blau 2020-10-19 21:54 ` Taylor Blau 2020-10-19 22:00 ` Nipunn Koorapati 2020-10-19 22:02 ` Taylor Blau 2020-10-19 22:25 ` Nipunn Koorapati 2020-10-19 22:47 ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-19 22:47 ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget 2020-10-20 2:38 ` Taylor Blau 2020-10-20 3:10 ` Junio C Hamano 2020-10-20 3:15 ` Taylor Blau 2020-10-20 10:16 ` Nipunn Koorapati 2020-10-20 10:09 ` Nipunn Koorapati 2020-10-19 22:47 ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget 2020-10-20 2:43 ` Taylor Blau 2020-10-19 22:47 ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget 2020-10-19 23:02 ` Nipunn Koorapati 2020-10-20 2:40 ` Taylor Blau 2020-10-20 13:40 ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget 2020-10-20 13:40 ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget 2020-10-20 22:06 ` Taylor Blau 2020-10-20 22:17 ` Nipunn Koorapati 2020-10-20 22:19 ` Taylor Blau 2020-10-20 13:41 ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget 2020-10-20 13:41 ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).