* [PATCH 0/3] Empty untracked cache performance issue @ 2021-06-24 18:29 Tao Klerks via GitGitGadget 2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget ` (3 more replies) 0 siblings, 4 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2021-06-24 18:29 UTC (permalink / raw) To: git; +Cc: Tao Klerks This patchset addresses a performance issue with untracked cache. When a new untracked cache structure is added to the index but remains empty, subsequent "git status" calls populate it but do not write the index - so they perform as though the index were disabled. This situation can be caused in several different ways: * Running "git update-index --untracked-cache" on a repo that did not have the untracked cache * Modifying the git configuration to enable untracked cache, but then immediately running "git status -uall", causing the untracked cache to be created, but not used/populated (and the index written). * (likely others) The patchset includes fixes to t7519, which otherwise starts failing because it wasn't testing what it intended to. Tao Klerks (3): Add a second's delay to t7519 for untracked cache In t7519, populate untracked cache before test Write index when populating empty untracked cache dir.c | 14 +++++++++++--- t/t7519-status-fsmonitor.sh | 6 ++++++ 2 files changed, 17 insertions(+), 3 deletions(-) base-commit: ebf3c04b262aa27fbb97f8a0156c2347fecafafb Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-986%2FTaoK%2Ftaok-empty-untracked-cache-bug-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-986/TaoK/taok-empty-untracked-cache-bug-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/986 -- gitgitgadget ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/3] Add a second's delay to t7519 for untracked cache 2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget @ 2021-06-24 18:29 ` Tao Klerks via GitGitGadget 2021-06-29 4:22 ` Junio C Hamano 2021-06-24 18:30 ` [PATCH 2/3] In t7519, populate untracked cache before test Tao Klerks via GitGitGadget ` (2 subsequent siblings) 3 siblings, 1 reply; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2021-06-24 18:29 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> In t7519 there is a test that writes files to disk, and immediately writes the untracked index. Because of mtime-comparison logic that uses a 1-second resolution, this means the cached entries are not trusted/used under some circumstances (see read-cache.c#is_racy_stat()). Untracked cache tests in t7063 use a 1-second delay to avoid this issue. We should do the same here. This change doesn't actually affect the outcome of the test, but does enhance its validity, and becomes relevant after later changes Signed-off-by: Tao Klerks <tao@klerks.biz> --- t/t7519-status-fsmonitor.sh | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index 637391c6ce4..1209fa93499 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -13,6 +13,10 @@ clean_repo () { git clean -fd } +avoid_racy() { + sleep 1 +} + dirty_repo () { : >untracked && : >dir1/untracked && @@ -332,6 +336,7 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' : >dir2/modified && write_integration_script && git config core.fsmonitor .git/hooks/fsmonitor-test && + avoid_racy && git update-index --untracked-cache && git update-index --fsmonitor && GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \ -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] Add a second's delay to t7519 for untracked cache 2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget @ 2021-06-29 4:22 ` Junio C Hamano 0 siblings, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2021-06-29 4:22 UTC (permalink / raw) To: Tao Klerks via GitGitGadget; +Cc: git, Tao Klerks "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Tao Klerks <tao@klerks.biz> > > In t7519 there is a test that writes files to disk, and immediately > writes the untracked index. Because of mtime-comparison logic > that uses a 1-second resolution, this means the cached entries > are not trusted/used under some circumstances (see > read-cache.c#is_racy_stat()). > > Untracked cache tests in t7063 use a 1-second delay to avoid > this issue. We should do the same here. We shouldn't waste wallclock time by slowing down tests, which already take way too much time. If you can use "test-tool chmtime" to pretend as if the index file was written earlier than it actually is, that would be a better solution. > +avoid_racy() { Style (see below). > + sleep 1 > +} > + > dirty_repo () { ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 2/3] In t7519, populate untracked cache before test 2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget @ 2021-06-24 18:30 ` Tao Klerks via GitGitGadget 2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 3 siblings, 0 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2021-06-24 18:30 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> In its current state, the t7519 test dealing with untracked cache assumes that "git update-index --untracked-cache" will *populate* the untracked cache. This is not correct - it will only add an empty untracked cache structure to the index. If we're going to compare two git status runs with something interesting happening in-between, we need to ensure that the index is in a stable/steady state *before* that first run. We achieve this by adding another prior "git status" run. At this stage this change does nothing, because there is a bug, addressed in the next patch. whereby once the empty untracked cache structure is added by the update-index invocation, the untracked cache gets updated in every subsequent "git status" call, but the index with these updates does not get written down. That bug actually invalidates this entire test case - but we're fixing that next. Signed-off-by: Tao Klerks <tao@klerks.biz> --- t/t7519-status-fsmonitor.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index 1209fa93499..ef75c548d90 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -339,6 +339,7 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' avoid_racy && git update-index --untracked-cache && git update-index --fsmonitor && + git status && GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \ git status && test-tool dump-untracked-cache >../before -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/3] Write index when populating empty untracked cache 2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget 2021-06-24 18:30 ` [PATCH 2/3] In t7519, populate untracked cache before test Tao Klerks via GitGitGadget @ 2021-06-24 18:30 ` Tao Klerks via GitGitGadget 2021-06-29 4:42 ` Junio C Hamano 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 3 siblings, 1 reply; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2021-06-24 18:30 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> It is expected that an empty/unpopulated untracked cache structure can be written to the index - by update- index, or by a "git status" call that sees the untracked cache should be enabled, but is running with options that make it non-applicable in that run. Currently, if that happens, then subsequent "git status" calls end up populating the untracked cache, but not writing the index (not saving their work) - so the performance outcome is almost identical to the cache being altogether disabled. This continues until the index gets written with the cache populated, for some *other* reason. In this change, we detect the "existing cache is empty and it looks like we are using it" condition, and queues an index write when this happens. This change depends on previous fixes to t7519 for the "ignore .git changes when invalidating UNTR" test case to pass - before this fix, the test never actually did anything as it was not set up correctly. Signed-off-by: Tao Klerks <tao@klerks.biz> --- dir.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/dir.c b/dir.c index ebe5ec046e0..a326e40e1c1 100644 --- a/dir.c +++ b/dir.c @@ -2703,7 +2703,8 @@ void remove_untracked_cache(struct index_state *istate) static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, int base_len, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct index_state *istate) { struct untracked_cache_dir *root; static int untracked_cache_disabled = -1; @@ -2767,8 +2768,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d return NULL; } - if (!dir->untracked->root) + if (!dir->untracked->root) { FLEX_ALLOC_STR(dir->untracked->root, name, ""); + /* + * If we've had to initialize the root, then what we had was an + * empty uninitialized untracked cache structure. We will be + * populating it now, so we should trigger an index write. + */ + istate->cache_changed |= UNTRACKED_CHANGED; + } /* Validate $GIT_DIR/info/exclude and core.excludesfile */ root = dir->untracked->root; @@ -2838,7 +2846,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, return dir->nr; } - untracked = validate_untracked_cache(dir, len, pathspec); + untracked = validate_untracked_cache(dir, len, pathspec, istate); if (!untracked) /* * make sure untracked cache code path is disabled, -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] Write index when populating empty untracked cache 2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget @ 2021-06-29 4:42 ` Junio C Hamano 2022-02-24 17:52 ` Tao Klerks 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2021-06-29 4:42 UTC (permalink / raw) To: Tao Klerks via GitGitGadget; +Cc: git, Tao Klerks "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes: > Subject: Re: [PATCH 3/3] Write index when populating empty untracked cache Common to all three patches. Look at "git shortlog --no-merges -200 master" and observe the pattern, i.e. our commit title are <area> prefix, a colon and then a summary. Try to make your commits blend in. > From: Tao Klerks <tao@klerks.biz> > > It is expected that an empty/unpopulated untracked > cache structure can be written to the index - by update- > index, or by a "git status" call that sees the untracked cache > should be enabled, but is running with options that make > it non-applicable in that run. Would an example be helpful? What requests the untracked cache to be created, and what options (or perhaps the lack of what options) prevent the untracked cache to be "non-applicable"? > Currently, if that happens, then subsequent "git status" > calls end up populating the untracked cache, but not > writing the index (not saving their work) - so the > performance outcome is almost identical to the cache > being altogether disabled. > > This continues until the index gets written with the cache > populated, for some *other* reason. Here (and only here), the word "cache" is used instead of "untracked cache"---but I suspect that "the cache" here is the same "untracked cache". If that is the case, being consistent and spelling it out would reduce the risk of confusing readers. > In this change, we detect the "existing cache is empty > and it looks like we are using it" condition, and queues > an index write when this happens. "we detect ... and queues" sounds like a grammo. In this project, we describe the change being proposed as if we are giving an order to the codebase to "become like so". So, perhaps Detect the condition where an empty untracked cache exists in the index and we collect the list of untracked paths, and queue an index write under that condition, so that the collected untracked paths can be written out to the untracked cache extension in the index. or something along the line. > This change depends on previous fixes to t7519 for the > "ignore .git changes when invalidating UNTR" test case to > pass - before this fix, the test never actually did anything > as it was not set up correctly. > > Signed-off-by: Tao Klerks <tao@klerks.biz> > --- > dir.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/dir.c b/dir.c > index ebe5ec046e0..a326e40e1c1 100644 > --- a/dir.c > +++ b/dir.c > @@ -2703,7 +2703,8 @@ void remove_untracked_cache(struct index_state *istate) > > static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, > int base_len, > - const struct pathspec *pathspec) > + const struct pathspec *pathspec, > + struct index_state *istate) > { > struct untracked_cache_dir *root; > static int untracked_cache_disabled = -1; > @@ -2767,8 +2768,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > return NULL; > } > > - if (!dir->untracked->root) > + if (!dir->untracked->root) { > FLEX_ALLOC_STR(dir->untracked->root, name, ""); > + /* > + * If we've had to initialize the root, then what we had was an > + * empty uninitialized untracked cache structure. We will be > + * populating it now, so we should trigger an index write. > + */ > + istate->cache_changed |= UNTRACKED_CHANGED; > + } The logic sounds fairly straight-forward. The fact that we came this far in the helper function means that we know we want to use (i.e. untracked cache is not disabled, and various other "we shouldn't be contaminating the cache with the one-shot information" cases did not return from the helper) untracked cache, and root being NULL originally means the untracked cache wasn't populated. > /* Validate $GIT_DIR/info/exclude and core.excludesfile */ > root = dir->untracked->root; > @@ -2838,7 +2846,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, > return dir->nr; > } > > - untracked = validate_untracked_cache(dir, len, pathspec); > + untracked = validate_untracked_cache(dir, len, pathspec, istate); > if (!untracked) > /* > * make sure untracked cache code path is disabled, ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] Write index when populating empty untracked cache 2021-06-29 4:42 ` Junio C Hamano @ 2022-02-24 17:52 ` Tao Klerks 2022-02-24 20:35 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Tao Klerks @ 2022-02-24 17:52 UTC (permalink / raw) To: Junio C Hamano; +Cc: Tao Klerks via GitGitGadget, git Hi Junio, Thanks so much for reviewing this, and my apologies for not reacting earlier, after you took the time. > our commit title are <area> prefix, a colon and then a > summary. Try to make your commits blend in. Noted, will fix, thx. I believe I did this better in an unrelated patch series today, but I guess I will see what people say :) > What requests the untracked cache to > be created, and what options (or perhaps the lack of what options) > prevent the untracked cache to be "non-applicable"? Will try to clarify / be more specific. > I suspect that "the cache" here is the same "untracked > cache" Will fix. > "we detect ... and queues" sounds like a grammo. Yep, dreaded perspective shift, will fix. > perhaps [proposed text] or something along the line. Will incorporate, thx. > The logic sounds fairly straight-forward. I didn't understand here whether you were confirming that the change seems to make sense (yay!), or commenting that the extra comment block is redundant, stating something obvious, and should better be removed. Could you confirm please? In the meantime, I'll "re-roll" (?) with the proposed changes. Thanks again, Tao On Tue, Jun 29, 2021 at 6:42 AM Junio C Hamano <gitster@pobox.com> wrote: > > "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > Subject: Re: [PATCH 3/3] Write index when populating empty untracked cache > > Common to all three patches. > > Look at "git shortlog --no-merges -200 master" and observe the > pattern, i.e. our commit title are <area> prefix, a colon and then a > summary. Try to make your commits blend in. > > > From: Tao Klerks <tao@klerks.biz> > > > > It is expected that an empty/unpopulated untracked > > cache structure can be written to the index - by update- > > index, or by a "git status" call that sees the untracked cache > > should be enabled, but is running with options that make > > it non-applicable in that run. > > Would an example be helpful? What requests the untracked cache to > be created, and what options (or perhaps the lack of what options) > prevent the untracked cache to be "non-applicable"? > > > Currently, if that happens, then subsequent "git status" > > calls end up populating the untracked cache, but not > > writing the index (not saving their work) - so the > > performance outcome is almost identical to the cache > > being altogether disabled. > > > > This continues until the index gets written with the cache > > populated, for some *other* reason. > > Here (and only here), the word "cache" is used instead of "untracked > cache"---but I suspect that "the cache" here is the same "untracked > cache". If that is the case, being consistent and spelling it out > would reduce the risk of confusing readers. > > > In this change, we detect the "existing cache is empty > > and it looks like we are using it" condition, and queues > > an index write when this happens. > > "we detect ... and queues" sounds like a grammo. > > In this project, we describe the change being proposed as if we are > giving an order to the codebase to "become like so". So, perhaps > > Detect the condition where an empty untracked cache exists in > the index and we collect the list of untracked paths, and queue > an index write under that condition, so that the collected > untracked paths can be written out to the untracked cache > extension in the index. > > or something along the line. > > > This change depends on previous fixes to t7519 for the > > "ignore .git changes when invalidating UNTR" test case to > > pass - before this fix, the test never actually did anything > > as it was not set up correctly. > > > > Signed-off-by: Tao Klerks <tao@klerks.biz> > > --- > > dir.c | 14 +++++++++++--- > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > diff --git a/dir.c b/dir.c > > index ebe5ec046e0..a326e40e1c1 100644 > > --- a/dir.c > > +++ b/dir.c > > @@ -2703,7 +2703,8 @@ void remove_untracked_cache(struct index_state *istate) > > > > static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, > > int base_len, > > - const struct pathspec *pathspec) > > + const struct pathspec *pathspec, > > + struct index_state *istate) > > { > > struct untracked_cache_dir *root; > > static int untracked_cache_disabled = -1; > > @@ -2767,8 +2768,15 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > > return NULL; > > } > > > > - if (!dir->untracked->root) > > + if (!dir->untracked->root) { > > FLEX_ALLOC_STR(dir->untracked->root, name, ""); > > + /* > > + * If we've had to initialize the root, then what we had was an > > + * empty uninitialized untracked cache structure. We will be > > + * populating it now, so we should trigger an index write. > > + */ > > + istate->cache_changed |= UNTRACKED_CHANGED; > > + } > > The logic sounds fairly straight-forward. The fact that we came > this far in the helper function means that we know we want to use > (i.e. untracked cache is not disabled, and various other "we > shouldn't be contaminating the cache with the one-shot information" > cases did not return from the helper) untracked cache, and root > being NULL originally means the untracked cache wasn't populated. > > > /* Validate $GIT_DIR/info/exclude and core.excludesfile */ > > root = dir->untracked->root; > > @@ -2838,7 +2846,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, > > return dir->nr; > > } > > > > - untracked = validate_untracked_cache(dir, len, pathspec); > > + untracked = validate_untracked_cache(dir, len, pathspec, istate); > > if (!untracked) > > /* > > * make sure untracked cache code path is disabled, ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] Write index when populating empty untracked cache 2022-02-24 17:52 ` Tao Klerks @ 2022-02-24 20:35 ` Junio C Hamano 0 siblings, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2022-02-24 20:35 UTC (permalink / raw) To: Tao Klerks; +Cc: Tao Klerks via GitGitGadget, git Tao Klerks <tao@klerks.biz> writes: >> The logic sounds fairly straight-forward. > > I didn't understand here whether you were confirming that the change > seems to make sense (yay!), or commenting that the extra comment block > is redundant, stating something obvious, and should better be removed. > Could you confirm please? I meant the former when I wrote it. But now you made me re-read the patch, I am becoming slightly sympathetic to the "do we even need to comment?" interpretation, too ;-) The question is if the comment to these two statements is redundant. if (!dir->untracked->root) { /* * If we've had to initialize the root, then what we had was an * empty uninitialized untracked cache structure. We will be * populating it now, so we should trigger an index write. */ FLEX_ALLOC_STR(dir->untracked->root, name, ""); istate->cache_changed |= UNTRACKED_CHANGED; } I can be pursuaded either way. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 0/3] Empty untracked cache performance issue 2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget ` (2 preceding siblings ...) 2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget @ 2022-02-25 17:10 ` Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget ` (3 more replies) 3 siblings, 4 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-25 17:10 UTC (permalink / raw) To: git; +Cc: Tao Klerks This patchset addresses a performance issue with untracked cache. When a new untracked cache structure is added to the index but remains empty, subsequent "git status" calls populate it but do not write the index - so they perform as though the index were disabled. This situation can be caused in several different ways: * Running "git update-index --untracked-cache" on a repo that did not have the untracked cache * Modifying the git configuration to enable untracked cache, but then immediately running "git status -uall", causing the untracked cache to be created, but not used/populated (and the index written). * (likely others) The patchset includes fixes to t7519, which otherwise starts failing because it wasn't testing what it intended to. Tao Klerks (3): t7519: avoid file to index mtime race for untracked cache t7519: populate untracked cache before test untracked-cache: write index when populating empty untracked cache dir.c | 10 +++++++--- t/t7519-status-fsmonitor.sh | 7 +++++++ 2 files changed, 14 insertions(+), 3 deletions(-) base-commit: 4c53a8c20f8984adb226293a3ffd7b88c3f4ac1a Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-986%2FTaoK%2Ftaok-empty-untracked-cache-bug-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-986/TaoK/taok-empty-untracked-cache-bug-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/986 Range-diff vs v1: 1: c4d2a3cab4b ! 1: 9421b71540d Add a second's delay to t7519 for untracked cache @@ Metadata Author: Tao Klerks <tao@klerks.biz> ## Commit message ## - Add a second's delay to t7519 for untracked cache + t7519: avoid file to index mtime race for untracked cache In t7519 there is a test that writes files to disk, and immediately - writes the untracked index. Because of mtime-comparison logic - that uses a 1-second resolution, this means the cached entries - are not trusted/used under some circumstances (see - read-cache.c#is_racy_stat()). + writes the untracked index. Because of mtime-comparison logic that + uses a 1-second resolution, this means the cached entries are not + trusted/used under some circumstances + (see read-cache.c#is_racy_stat()). - Untracked cache tests in t7063 use a 1-second delay to avoid - this issue. We should do the same here. + Untracked cache tests in t7063 use a 1-second delay to avoid this + issue, but we don't want to introduce arbitrary slowdowns, so instead + use test-tool chmtime to backdate the files slightly. - This change doesn't actually affect the outcome of the test, - but does enhance its validity, and becomes relevant after - later changes + This change doesn't actually affect the outcome of the test, but does + enhance its validity, and becomes relevant after later changes. Signed-off-by: Tao Klerks <tao@klerks.biz> ## t/t7519-status-fsmonitor.sh ## -@@ t/t7519-status-fsmonitor.sh: clean_repo () { - git clean -fd - } - -+avoid_racy() { -+ sleep 1 -+} -+ - dirty_repo () { - : >untracked && - : >dir1/untracked && @@ t/t7519-status-fsmonitor.sh: test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' + cd dot-git && + mkdir -p .git/hooks && + : >tracked && ++ test-tool chmtime =-60 tracked && + : >modified && ++ test-tool chmtime =-60 modified && + mkdir dir1 && + : >dir1/tracked && ++ test-tool chmtime =-60 dir1/tracked && + : >dir1/modified && ++ test-tool chmtime =-60 dir1/modified && + mkdir dir2 && + : >dir2/tracked && ++ test-tool chmtime =-60 dir2/tracked && : >dir2/modified && ++ test-tool chmtime =-60 dir2/modified && write_integration_script && git config core.fsmonitor .git/hooks/fsmonitor-test && -+ avoid_racy && git update-index --untracked-cache && - git update-index --fsmonitor && - GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \ 2: d63faad03a4 ! 2: d29a68e65a0 In t7519, populate untracked cache before test @@ Metadata Author: Tao Klerks <tao@klerks.biz> ## Commit message ## - In t7519, populate untracked cache before test + t7519: populate untracked cache before test - In its current state, the t7519 test dealing with untracked - cache assumes that - "git update-index --untracked-cache" will *populate* - the untracked cache. This is not correct - it will only add - an empty untracked cache structure to the index. + In its current state, the t7519 test dealing with untracked cache + assumes that "git update-index --untracked-cache" will *populate* the + untracked cache. This is not correct - it will only add an empty + untracked cache structure to the index. - If we're going to compare two git status runs with - something interesting happening in-between, we - need to ensure that the index is in a stable/steady - state *before* that first run. + If we're going to compare two git status runs with something + interesting happening in-between, we need to ensure that the index is + in a stable/steady state *before* that first run. - We achieve this by adding another prior "git status" - run. + Achieve this by adding another prior "git status" run. - At this stage this change does nothing, because there - is a bug, addressed in the next patch. whereby once - the empty untracked cache structure is added by the - update-index invocation, the untracked cache gets - updated in every subsequent "git status" call, but the - index with these updates does not get written down. + At this stage this change does nothing, because there is a bug, + addressed in the next patch, whereby once the empty untracked cache + structure is added by the update-index invocation, the untracked cache + gets updated in every subsequent "git status" call, but the index with + these updates does not get written down. - That bug actually invalidates this entire test case - - but we're fixing that next. + That bug actually invalidates this entire test case - but we're fixing + that next. Signed-off-by: Tao Klerks <tao@klerks.biz> ## t/t7519-status-fsmonitor.sh ## @@ t/t7519-status-fsmonitor.sh: test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' - avoid_racy && + git config core.fsmonitor .git/hooks/fsmonitor-test && git update-index --untracked-cache && git update-index --fsmonitor && + git status && 3: 627f1952fd8 ! 3: 190b27e518a Write index when populating empty untracked cache @@ Metadata Author: Tao Klerks <tao@klerks.biz> ## Commit message ## - Write index when populating empty untracked cache + untracked-cache: write index when populating empty untracked cache - It is expected that an empty/unpopulated untracked - cache structure can be written to the index - by update- - index, or by a "git status" call that sees the untracked cache - should be enabled, but is running with options that make - it non-applicable in that run. + It is expected that an empty/unpopulated untracked cache structure can + be written to the index - by update-index, or by a "git status" call + that sees the untracked cache should be enabled and is not, but is + running with options that make the untracked cache non-applicable in + that run (eg a pathspec). - Currently, if that happens, then subsequent "git status" - calls end up populating the untracked cache, but not - writing the index (not saving their work) - so the - performance outcome is almost identical to the cache - being altogether disabled. + Currently, if that happens, then subsequent "git status" calls end up + populating the untracked cache, but not writing the index (not saving + their work) - so the performance outcome is almost identical to the + cache being altogether disabled. - This continues until the index gets written with the cache - populated, for some *other* reason. + This continues until the index gets written with the untracked cache + populated, for some *other* reason, such as a working tree change. - In this change, we detect the "existing cache is empty - and it looks like we are using it" condition, and queues - an index write when this happens. + Detect the condition where an empty untracked cache exists in the + index and we will collect the list of untracked paths, and queue an + index write under that condition, so that the collected untracked + paths can be written out to the untracked cache extension in the + index. - This change depends on previous fixes to t7519 for the - "ignore .git changes when invalidating UNTR" test case to - pass - before this fix, the test never actually did anything - as it was not set up correctly. + This change depends on previous fixes to t7519 for the "ignore .git + changes when invalidating UNTR" test case to pass - before this fix, + the test never actually did anything as it was not set up correctly. Signed-off-by: Tao Klerks <tao@klerks.biz> @@ dir.c: static struct untracked_cache_dir *validate_untracked_cache(struct dir_st - if (!dir->untracked->root) + if (!dir->untracked->root) { ++ /* Untracked cache existed but is not initialized; fix that */ FLEX_ALLOC_STR(dir->untracked->root, name, ""); -+ /* -+ * If we've had to initialize the root, then what we had was an -+ * empty uninitialized untracked cache structure. We will be -+ * populating it now, so we should trigger an index write. -+ */ + istate->cache_changed |= UNTRACKED_CHANGED; + } -- gitgitgadget ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget @ 2022-02-25 17:10 ` Tao Klerks via GitGitGadget 2022-02-25 19:07 ` Junio C Hamano 2022-02-25 17:10 ` [PATCH v2 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget ` (2 subsequent siblings) 3 siblings, 1 reply; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-25 17:10 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> In t7519 there is a test that writes files to disk, and immediately writes the untracked index. Because of mtime-comparison logic that uses a 1-second resolution, this means the cached entries are not trusted/used under some circumstances (see read-cache.c#is_racy_stat()). Untracked cache tests in t7063 use a 1-second delay to avoid this issue, but we don't want to introduce arbitrary slowdowns, so instead use test-tool chmtime to backdate the files slightly. This change doesn't actually affect the outcome of the test, but does enhance its validity, and becomes relevant after later changes. Signed-off-by: Tao Klerks <tao@klerks.biz> --- t/t7519-status-fsmonitor.sh | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index a6308acf006..3f984136ea9 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -324,13 +324,19 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' cd dot-git && mkdir -p .git/hooks && : >tracked && + test-tool chmtime =-60 tracked && : >modified && + test-tool chmtime =-60 modified && mkdir dir1 && : >dir1/tracked && + test-tool chmtime =-60 dir1/tracked && : >dir1/modified && + test-tool chmtime =-60 dir1/modified && mkdir dir2 && : >dir2/tracked && + test-tool chmtime =-60 dir2/tracked && : >dir2/modified && + test-tool chmtime =-60 dir2/modified && write_integration_script && git config core.fsmonitor .git/hooks/fsmonitor-test && git update-index --untracked-cache && -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache 2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget @ 2022-02-25 19:07 ` Junio C Hamano 2022-02-27 22:12 ` Tao Klerks 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2022-02-25 19:07 UTC (permalink / raw) To: Tao Klerks via GitGitGadget; +Cc: git, Tao Klerks "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Tao Klerks <tao@klerks.biz> > > In t7519 there is a test that writes files to disk, and immediately > writes the untracked index. Because of mtime-comparison logic that "untracked cache", I think. > uses a 1-second resolution, this means the cached entries are not > trusted/used under some circumstances > (see read-cache.c#is_racy_stat()). > > Untracked cache tests in t7063 use a 1-second delay to avoid this > issue, but we don't want to introduce arbitrary slowdowns, so instead > use test-tool chmtime to backdate the files slightly. Good approach. Perhaps fixing 7063 can be marked as #leftoverbit? > This change doesn't actually affect the outcome of the test, but does > enhance its validity, and becomes relevant after later changes. > > Signed-off-by: Tao Klerks <tao@klerks.biz> > --- > t/t7519-status-fsmonitor.sh | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh > index a6308acf006..3f984136ea9 100755 > --- a/t/t7519-status-fsmonitor.sh > +++ b/t/t7519-status-fsmonitor.sh > @@ -324,13 +324,19 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' > cd dot-git && > mkdir -p .git/hooks && > : >tracked && > + test-tool chmtime =-60 tracked && > : >modified && > + test-tool chmtime =-60 modified && > mkdir dir1 && > : >dir1/tracked && > + test-tool chmtime =-60 dir1/tracked && > : >dir1/modified && > + test-tool chmtime =-60 dir1/modified && > mkdir dir2 && > : >dir2/tracked && > + test-tool chmtime =-60 dir2/tracked && > : >dir2/modified && > + test-tool chmtime =-60 dir2/modified && > write_integration_script && > git config core.fsmonitor .git/hooks/fsmonitor-test && > git update-index --untracked-cache && ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache 2022-02-25 19:07 ` Junio C Hamano @ 2022-02-27 22:12 ` Tao Klerks 0 siblings, 0 replies; 19+ messages in thread From: Tao Klerks @ 2022-02-27 22:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: Tao Klerks via GitGitGadget, git On Fri, Feb 25, 2022 at 8:07 PM Junio C Hamano <gitster@pobox.com> wrote: > > "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > In t7519 there is a test that writes files to disk, and immediately > > writes the untracked index. Because of mtime-comparison logic that > > "untracked cache", I think. > Yep, fixed, thx. > > uses a 1-second resolution, this means the cached entries are not > > trusted/used under some circumstances > > (see read-cache.c#is_racy_stat()). > > > > Untracked cache tests in t7063 use a 1-second delay to avoid this > > issue, but we don't want to introduce arbitrary slowdowns, so instead > > use test-tool chmtime to backdate the files slightly. > > Good approach. Perhaps fixing 7063 can be marked as #leftoverbit? > Sure. Updated the commit msg. I also started another thread with what that looks like - it's not very pretty in the end because of Windows compatibility issues, and still incomplete because (as far as I can tell) the sleep is used for a couple of different purposes, and only one of them lends itself to chmtime manipulation. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 2/3] t7519: populate untracked cache before test 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget @ 2022-02-25 17:10 ` Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget 2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 3 siblings, 0 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-25 17:10 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> In its current state, the t7519 test dealing with untracked cache assumes that "git update-index --untracked-cache" will *populate* the untracked cache. This is not correct - it will only add an empty untracked cache structure to the index. If we're going to compare two git status runs with something interesting happening in-between, we need to ensure that the index is in a stable/steady state *before* that first run. Achieve this by adding another prior "git status" run. At this stage this change does nothing, because there is a bug, addressed in the next patch, whereby once the empty untracked cache structure is added by the update-index invocation, the untracked cache gets updated in every subsequent "git status" call, but the index with these updates does not get written down. That bug actually invalidates this entire test case - but we're fixing that next. Signed-off-by: Tao Klerks <tao@klerks.biz> --- t/t7519-status-fsmonitor.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index 3f984136ea9..fffc57120d6 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -341,6 +341,7 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' git config core.fsmonitor .git/hooks/fsmonitor-test && git update-index --untracked-cache && git update-index --fsmonitor && + git status && GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \ git status && test-tool dump-untracked-cache >../before -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget @ 2022-02-25 17:10 ` Tao Klerks via GitGitGadget 2022-02-25 19:12 ` Junio C Hamano 2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 3 siblings, 1 reply; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-25 17:10 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> It is expected that an empty/unpopulated untracked cache structure can be written to the index - by update-index, or by a "git status" call that sees the untracked cache should be enabled and is not, but is running with options that make the untracked cache non-applicable in that run (eg a pathspec). Currently, if that happens, then subsequent "git status" calls end up populating the untracked cache, but not writing the index (not saving their work) - so the performance outcome is almost identical to the cache being altogether disabled. This continues until the index gets written with the untracked cache populated, for some *other* reason, such as a working tree change. Detect the condition where an empty untracked cache exists in the index and we will collect the list of untracked paths, and queue an index write under that condition, so that the collected untracked paths can be written out to the untracked cache extension in the index. This change depends on previous fixes to t7519 for the "ignore .git changes when invalidating UNTR" test case to pass - before this fix, the test never actually did anything as it was not set up correctly. Signed-off-by: Tao Klerks <tao@klerks.biz> --- dir.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/dir.c b/dir.c index d91295f2bcd..4eee45dec91 100644 --- a/dir.c +++ b/dir.c @@ -2781,7 +2781,8 @@ void remove_untracked_cache(struct index_state *istate) static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, int base_len, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct index_state *istate) { struct untracked_cache_dir *root; static int untracked_cache_disabled = -1; @@ -2845,8 +2846,11 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d return NULL; } - if (!dir->untracked->root) + if (!dir->untracked->root) { + /* Untracked cache existed but is not initialized; fix that */ FLEX_ALLOC_STR(dir->untracked->root, name, ""); + istate->cache_changed |= UNTRACKED_CHANGED; + } /* Validate $GIT_DIR/info/exclude and core.excludesfile */ root = dir->untracked->root; @@ -2916,7 +2920,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, return dir->nr; } - untracked = validate_untracked_cache(dir, len, pathspec); + untracked = validate_untracked_cache(dir, len, pathspec, istate); if (!untracked) /* * make sure untracked cache code path is disabled, -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache 2022-02-25 17:10 ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget @ 2022-02-25 19:12 ` Junio C Hamano 0 siblings, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2022-02-25 19:12 UTC (permalink / raw) To: Tao Klerks via GitGitGadget; +Cc: git, Tao Klerks "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes: > diff --git a/dir.c b/dir.c > index d91295f2bcd..4eee45dec91 100644 > --- a/dir.c > +++ b/dir.c > @@ -2781,7 +2781,8 @@ void remove_untracked_cache(struct index_state *istate) > > static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, > int base_len, > - const struct pathspec *pathspec) > + const struct pathspec *pathspec, > + struct index_state *istate) > { > struct untracked_cache_dir *root; > static int untracked_cache_disabled = -1; > @@ -2845,8 +2846,11 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d > return NULL; > } > > - if (!dir->untracked->root) > + if (!dir->untracked->root) { > + /* Untracked cache existed but is not initialized; fix that */ > FLEX_ALLOC_STR(dir->untracked->root, name, ""); > + istate->cache_changed |= UNTRACKED_CHANGED; > + } > > /* Validate $GIT_DIR/info/exclude and core.excludesfile */ > root = dir->untracked->root; > @@ -2916,7 +2920,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, > return dir->nr; > } > > - untracked = validate_untracked_cache(dir, len, pathspec); > + untracked = validate_untracked_cache(dir, len, pathspec, istate); > if (!untracked) > /* > * make sure untracked cache code path is disabled, Makes sense. Will queue. Thanks. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v3 0/3] Empty untracked cache performance issue 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget ` (2 preceding siblings ...) 2022-02-25 17:10 ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget @ 2022-02-27 21:56 ` Tao Klerks via GitGitGadget 2022-02-27 21:56 ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget ` (2 more replies) 3 siblings, 3 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-27 21:56 UTC (permalink / raw) To: git; +Cc: Tao Klerks This patchset addresses a performance issue with untracked cache. When a new untracked cache structure is added to the index but remains empty, subsequent "git status" calls populate it but do not write the index - so they perform as though the index were disabled. This situation can be caused in several different ways: * Running "git update-index --untracked-cache" on a repo that did not have the untracked cache * Modifying the git configuration to enable untracked cache, but then immediately running "git status -uall", causing the untracked cache to be created, but not used/populated (and the index written). * (likely others) The patchset includes fixes to t7519, which otherwise starts failing because it wasn't testing what it intended to. Tao Klerks (3): t7519: avoid file to index mtime race for untracked cache t7519: populate untracked cache before test untracked-cache: write index when populating empty untracked cache dir.c | 10 +++++++--- t/t7519-status-fsmonitor.sh | 7 +++++++ 2 files changed, 14 insertions(+), 3 deletions(-) base-commit: 4c53a8c20f8984adb226293a3ffd7b88c3f4ac1a Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-986%2FTaoK%2Ftaok-empty-untracked-cache-bug-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-986/TaoK/taok-empty-untracked-cache-bug-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/986 Range-diff vs v2: 1: 9421b71540d ! 1: 8b1f89c259e t7519: avoid file to index mtime race for untracked cache @@ Commit message t7519: avoid file to index mtime race for untracked cache In t7519 there is a test that writes files to disk, and immediately - writes the untracked index. Because of mtime-comparison logic that - uses a 1-second resolution, this means the cached entries are not - trusted/used under some circumstances + writes the index with the untracked cache. Because of + mtime-comparison logic that uses a 1-second resolution, this means + the cached entries are not trusted/used under some circumstances (see read-cache.c#is_racy_stat()). Untracked cache tests in t7063 use a 1-second delay to avoid this issue, but we don't want to introduce arbitrary slowdowns, so instead - use test-tool chmtime to backdate the files slightly. + use test-tool chmtime to backdate the files slightly. The t7063 + delays are a #leftoverbit, to be worked on in a separate series. This change doesn't actually affect the outcome of the test, but does enhance its validity, and becomes relevant after later changes. 2: d29a68e65a0 = 2: c901f9d96ca t7519: populate untracked cache before test 3: 190b27e518a = 3: 9795a08414a untracked-cache: write index when populating empty untracked cache -- gitgitgadget ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache 2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget @ 2022-02-27 21:56 ` Tao Klerks via GitGitGadget 2022-02-27 21:57 ` [PATCH v3 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget 2022-02-27 21:57 ` [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget 2 siblings, 0 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-27 21:56 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> In t7519 there is a test that writes files to disk, and immediately writes the index with the untracked cache. Because of mtime-comparison logic that uses a 1-second resolution, this means the cached entries are not trusted/used under some circumstances (see read-cache.c#is_racy_stat()). Untracked cache tests in t7063 use a 1-second delay to avoid this issue, but we don't want to introduce arbitrary slowdowns, so instead use test-tool chmtime to backdate the files slightly. The t7063 delays are a #leftoverbit, to be worked on in a separate series. This change doesn't actually affect the outcome of the test, but does enhance its validity, and becomes relevant after later changes. Signed-off-by: Tao Klerks <tao@klerks.biz> --- t/t7519-status-fsmonitor.sh | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index a6308acf006..3f984136ea9 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -324,13 +324,19 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' cd dot-git && mkdir -p .git/hooks && : >tracked && + test-tool chmtime =-60 tracked && : >modified && + test-tool chmtime =-60 modified && mkdir dir1 && : >dir1/tracked && + test-tool chmtime =-60 dir1/tracked && : >dir1/modified && + test-tool chmtime =-60 dir1/modified && mkdir dir2 && : >dir2/tracked && + test-tool chmtime =-60 dir2/tracked && : >dir2/modified && + test-tool chmtime =-60 dir2/modified && write_integration_script && git config core.fsmonitor .git/hooks/fsmonitor-test && git update-index --untracked-cache && -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 2/3] t7519: populate untracked cache before test 2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2022-02-27 21:56 ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget @ 2022-02-27 21:57 ` Tao Klerks via GitGitGadget 2022-02-27 21:57 ` [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget 2 siblings, 0 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-27 21:57 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> In its current state, the t7519 test dealing with untracked cache assumes that "git update-index --untracked-cache" will *populate* the untracked cache. This is not correct - it will only add an empty untracked cache structure to the index. If we're going to compare two git status runs with something interesting happening in-between, we need to ensure that the index is in a stable/steady state *before* that first run. Achieve this by adding another prior "git status" run. At this stage this change does nothing, because there is a bug, addressed in the next patch, whereby once the empty untracked cache structure is added by the update-index invocation, the untracked cache gets updated in every subsequent "git status" call, but the index with these updates does not get written down. That bug actually invalidates this entire test case - but we're fixing that next. Signed-off-by: Tao Klerks <tao@klerks.biz> --- t/t7519-status-fsmonitor.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh index 3f984136ea9..fffc57120d6 100755 --- a/t/t7519-status-fsmonitor.sh +++ b/t/t7519-status-fsmonitor.sh @@ -341,6 +341,7 @@ test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR' git config core.fsmonitor .git/hooks/fsmonitor-test && git update-index --untracked-cache && git update-index --fsmonitor && + git status && GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \ git status && test-tool dump-untracked-cache >../before -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache 2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2022-02-27 21:56 ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget 2022-02-27 21:57 ` [PATCH v3 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget @ 2022-02-27 21:57 ` Tao Klerks via GitGitGadget 2 siblings, 0 replies; 19+ messages in thread From: Tao Klerks via GitGitGadget @ 2022-02-27 21:57 UTC (permalink / raw) To: git; +Cc: Tao Klerks, Tao Klerks From: Tao Klerks <tao@klerks.biz> It is expected that an empty/unpopulated untracked cache structure can be written to the index - by update-index, or by a "git status" call that sees the untracked cache should be enabled and is not, but is running with options that make the untracked cache non-applicable in that run (eg a pathspec). Currently, if that happens, then subsequent "git status" calls end up populating the untracked cache, but not writing the index (not saving their work) - so the performance outcome is almost identical to the cache being altogether disabled. This continues until the index gets written with the untracked cache populated, for some *other* reason, such as a working tree change. Detect the condition where an empty untracked cache exists in the index and we will collect the list of untracked paths, and queue an index write under that condition, so that the collected untracked paths can be written out to the untracked cache extension in the index. This change depends on previous fixes to t7519 for the "ignore .git changes when invalidating UNTR" test case to pass - before this fix, the test never actually did anything as it was not set up correctly. Signed-off-by: Tao Klerks <tao@klerks.biz> --- dir.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/dir.c b/dir.c index d91295f2bcd..4eee45dec91 100644 --- a/dir.c +++ b/dir.c @@ -2781,7 +2781,8 @@ void remove_untracked_cache(struct index_state *istate) static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, int base_len, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct index_state *istate) { struct untracked_cache_dir *root; static int untracked_cache_disabled = -1; @@ -2845,8 +2846,11 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d return NULL; } - if (!dir->untracked->root) + if (!dir->untracked->root) { + /* Untracked cache existed but is not initialized; fix that */ FLEX_ALLOC_STR(dir->untracked->root, name, ""); + istate->cache_changed |= UNTRACKED_CHANGED; + } /* Validate $GIT_DIR/info/exclude and core.excludesfile */ root = dir->untracked->root; @@ -2916,7 +2920,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, return dir->nr; } - untracked = validate_untracked_cache(dir, len, pathspec); + untracked = validate_untracked_cache(dir, len, pathspec, istate); if (!untracked) /* * make sure untracked cache code path is disabled, -- gitgitgadget ^ permalink raw reply related [flat|nested] 19+ messages in thread
end of thread, other threads:[~2022-02-27 22:12 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget 2021-06-29 4:22 ` Junio C Hamano 2021-06-24 18:30 ` [PATCH 2/3] In t7519, populate untracked cache before test Tao Klerks via GitGitGadget 2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget 2021-06-29 4:42 ` Junio C Hamano 2022-02-24 17:52 ` Tao Klerks 2022-02-24 20:35 ` Junio C Hamano 2022-02-25 17:10 ` [PATCH v2 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget 2022-02-25 19:07 ` Junio C Hamano 2022-02-27 22:12 ` Tao Klerks 2022-02-25 17:10 ` [PATCH v2 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget 2022-02-25 17:10 ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget 2022-02-25 19:12 ` Junio C Hamano 2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget 2022-02-27 21:56 ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget 2022-02-27 21:57 ` [PATCH v3 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget 2022-02-27 21:57 ` [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.