git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tao Klerks <tao@klerks.biz>
To: git <git@vger.kernel.org>
Cc: Elijah Newren <newren@gmail.com>
Subject: Untracked Cache and --untracked-files=all
Date: Tue, 22 Jun 2021 11:25:36 +0200	[thread overview]
Message-ID: <CAPMMpohp6+jW2C0ewfYEp3rrwbKSqGVa94LRgQDcKJvYmiANuA@mail.gmail.com> (raw)

Hi folks,

I'm hoping for a little help understanding *intent* around a
particular code comment in dir.c, and reaching out to the whole list
because someone (Junio?) said they consider any mail that *doesn't*
copy the list to be spam anyway, and the original author of the
comment in question (Duy Nguyen) is no longer active on git.

Context:
I am trying to explore how to get "--untracked-files=all" to play nice
with the untracked cache, so that windows users using tooling that
sets "--untracked-files=all" can benefit from the same
orders-of-magnitude git status performance improvements as commandline
users.

There is a "naive" approach to this (store the untracked cache in the
index file with whatever dir flags were specified/used in the
recursive walk, and ignore/rewrite that cached data every time the
flags change), and there is presumably a more "comprehensive" approach
(store all the information required in the untracked cache to be able
to satisfy requests with either set of flags - even if this is a
little more expensive on first run).

The main disadvantage of the "naive" approach is that is every time
you flip-flop between "git status" and "git status -u", the untracked
cache is ignored, a recursive directory walk ensues, and the untracked
cache is rewritten to the index file for the next time you rerun,
hopefully with the same flags. However, I would think in most
situations flip-flopping will be less common - more commonly you're
using a tool or workflow that ends up running the same command(s)
repeatedly... At least, that's my thesis. I would put this "store the
untracked cache every time dir flags change" behavior behind a config
switch, anyway.

This "naive" approach *is* rather easy to achieve - you just need to
recreate a new "untracked" structure inside
dir.c#validate_untracked_cache() if you find a mismatch of flags (and
make other small fixes to store the correct flags in that new
"untracked" structure).

The one thing that sticks out after making these changes is a code
comment first introduced in 2015 by Duy Nguyen in ccad261f, explaining
*why* we refuse to use the untracked cache with "-uall":
> * See treat_directory(), case index_nonexistent. Without
> * this flag, we may need to also cache .git file content
> * for the resolve_gitlink_ref() call, which we don't.

I've seen this comment many times over the past few months, and I've
previously always interpreted it as a "correctness" concern.

Looking at the original code in question, however (as of ccad261f,
dir.c#treat_directory(), "resolve_gitlink_ref" call), I don't
understand how correctness could be impacted.

Fast-forward 6 years and all this code has been substantially
overhauled by several folks over the years (most recently and majorly
Elijah Newren), and the "resolve_gitlink_ref()" call is long-gone.

Does this look familiar to anyone? Is there any remaining obvious
reason to be wary of storing the untracked cache structure produced
with '-uall'?

Thanks,
Tao

             reply	other threads:[~2021-06-22  9:25 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22  9:25 Tao Klerks [this message]
2021-06-22 16:06 ` Untracked Cache and --untracked-files=all Elijah Newren
2021-06-23  6:42   ` Tao Klerks

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPMMpohp6+jW2C0ewfYEp3rrwbKSqGVa94LRgQDcKJvYmiANuA@mail.gmail.com \
    --to=tao@klerks.biz \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).