* Untracked Cache and --untracked-files=all
@ 2021-06-22  9:25 Tao Klerks
  2021-06-22 16:06 ` Elijah Newren
From: Tao Klerks @ 2021-06-22  9:25 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Hi folks,

I'm hoping for a little help understanding *intent* around a
particular code comment in dir.c, and reaching out to the whole list
because someone (Junio?) said they consider any mail that *doesn't*
copy the list to be spam anyway, and the original author of the
comment in question (Duy Nguyen) is no longer active on git.

Context:
I am trying to explore how to get "--untracked-files=all" to play nice
with the untracked cache, so that Windows users using tooling that
sets "--untracked-files=all" can benefit from the same
orders-of-magnitude git status performance improvements as
command-line users.

There is a "naive" approach to this (store the untracked cache in the
index file with whatever dir flags were specified/used in the
recursive walk, and ignore/rewrite that cached data every time the
flags change), and there is presumably a more "comprehensive" approach
(store all the information required in the untracked cache to be able
to satisfy requests with either set of flags - even if this is a
little more expensive on first run).

The main disadvantage of the "naive" approach is that every time
you flip-flop between "git status" and "git status -u", the untracked
cache is ignored, a recursive directory walk ensues, and the untracked
cache is rewritten to the index file for the next time you rerun,
hopefully with the same flags. However, I would think in most
situations flip-flopping will be less common - more commonly you're
using a tool or workflow that ends up running the same command(s)
repeatedly... At least, that's my thesis. I would put this "store the
untracked cache every time dir flags change" behavior behind a config
switch, anyway.
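
To make the flip-flop cost concrete, here is a minimal sketch in C. It
is not git code: count_full_walks() and the plain unsigned "flags"
values are hypothetical stand-ins for the real dir flags, and the model
assumes the only reason to discard the cache is a change of flags.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not git code: count how many full directory
 * walks the "naive" approach performs over a sequence of status runs.
 * flags_per_run[i] stands in for the dir flags used by run i.
 */
size_t count_full_walks(const unsigned *flags_per_run, size_t n)
{
	size_t walks = 0;
	unsigned cached_flags = 0;
	int have_cache = 0;

	assert(flags_per_run != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!have_cache || cached_flags != flags_per_run[i]) {
			/* No cache, or built with other flags: walk and rewrite. */
			walks++;
			cached_flags = flags_per_run[i];
			have_cache = 1;
		}
		/* Otherwise the cached data is reused and no walk is needed. */
	}
	return walks;
}
```

In this model, four runs with the same flags cost one walk, while four
runs that alternate between two flag values cost four walks, which is
the worst case described above.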

This "naive" approach *is* rather easy to achieve - you just need to
recreate a new "untracked" structure inside
dir.c#validate_untracked_cache() if you find a mismatch of flags (and
make other small fixes to store the correct flags in that new
"untracked" structure).

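The shape of that change can be sketched roughly as follows. This is an
illustration only, not the actual dir.c code: struct uc_sketch and
validate_sketch() are hypothetical names, and the real "untracked"
structure carries far more state than the two fields assumed here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cached "untracked" data; not git's struct. */
struct uc_sketch {
	unsigned dir_flags; /* flags the cached walk was performed with */
	int populated;      /* non-zero once a walk has filled the cache */
};

/*
 * Return a cache usable for wanted_flags. On a flag mismatch the old
 * cache is discarded and a fresh, empty one recording the new flags is
 * returned; the caller is then expected to repopulate it with a full
 * directory walk.
 */
struct uc_sketch *validate_sketch(struct uc_sketch *uc, unsigned wanted_flags)
{
	if (uc && uc->dir_flags == wanted_flags)
		return uc; /* flags match: reuse the cached data */

	free(uc); /* free(NULL) is a no-op */
	uc = calloc(1, sizeof(*uc));
	assert(uc != NULL);
	uc->dir_flags = wanted_flags; /* store the flags actually in use */
	return uc;
}
```

The point of the sketch is that a mismatch replaces the structure
instead of disabling the cache, so the next run with the same flags can
reuse what this run stores.
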
The one thing that sticks out after making these changes is a code
comment first introduced in 2015 by Duy Nguyen in ccad261f, explaining
*why* we refuse to use the untracked cache with "-uall":
> * See treat_directory(), case index_nonexistent. Without
> * this flag, we may need to also cache .git file content
> * for the resolve_gitlink_ref() call, which we don't.

I've seen this comment many times over the past few months, and I've
previously always interpreted it as a "correctness" concern.

Looking at the original code in question, however (as of ccad261f,
dir.c#treat_directory(), "resolve_gitlink_ref" call), I don't
understand how correctness could be impacted.

Fast-forward 6 years and all this code has been substantially
overhauled by several folks (most recently, and most extensively, by
Elijah Newren), and the "resolve_gitlink_ref()" call is long gone.

Does this look familiar to anyone? Is there any remaining obvious
reason to be wary of storing the untracked cache structure produced
with '-uall'?

Thanks,
Tao
