git.vger.kernel.org archive mirror
* Question about fsmonitor and --untracked-files=all
@ 2020-09-22 11:35 Tao Klerks
  2020-09-23 10:40 ` Johannes Schindelin
  0 siblings, 1 reply; 3+ messages in thread
From: Tao Klerks @ 2020-09-22 11:35 UTC
  To: git

Hi folks,

I've got a couple of questions about the "fsmonitor" functionality,
untracked files, and multithreading.

Background:

In a repo with:
 * A couple hundred thousand tracked files, and a couple hundred
thousand .gitignored files, across a few thousand directories
 * The --untracked-cache setting, tested and working
 * core.fsmonitor set up with watchman (with the sample integration
script from January)
 * Git version 2.27.0.windows.1
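
For anyone wanting to reproduce a setup like this, a minimal sketch
follows. It is not the exact configuration used above: the function
name is made up for illustration, and it assumes the sample hook that
ships with git is still present in .git/hooks and that watchman is
already installed and on the PATH.

```shell
# Sketch: enable the untracked cache and the watchman fsmonitor hook.
# Run from the top of the work tree; assumes a standard .git directory.
# The function name is hypothetical, chosen for this example only.
enable_fsmonitor_watchman() {
    # Ask git to check that the filesystem behaves well enough for the
    # untracked cache (exits non-zero if a check fails).
    git update-index --test-untracked-cache || return 1
    git config core.untrackedCache true

    # Assumption: the sample hook shipped with git is present and unmodified.
    cp .git/hooks/fsmonitor-watchman.sample .git/hooks/fsmonitor-watchman || return 1
    git config core.fsmonitor .git/hooks/fsmonitor-watchman

    # Print the resulting values for a quick sanity check.
    git config core.untrackedCache
    git config core.fsmonitor
}
```

Calling enable_fsmonitor_watchman inside the repository applies the
settings; nothing is changed until the function is invoked.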

"git status" takes about 2s
"git status --untracked-files=all" takes about 20s

When I turn off "core.fsmonitor", the numbers change to something like:
"git status": 8s
"git status --untracked-files=all": 9s
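
A rough way to collect these four numbers in one go is sketched below.
The helper names are made up for illustration, and the sketch assumes
that passing an empty value with "-c core.fsmonitor=" disables the hook
for a single invocation, which is worth double-checking on your git
version. Timings have one-second resolution, and the first run in a
cold repository will be slower than later ones.

```shell
# Print the wall-clock seconds taken by a command; its output is discarded.
# Uses `date +%s`, so resolution is one second (fine for multi-second runs).
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Time "git status" with and without the fsmonitor hook, for both the
# default untracked mode ("normal") and "all". Run inside the work tree.
compare_status() {
    for uf in normal all; do
        # Uses whatever core.fsmonitor is configured in the repository.
        printf 'fsmonitor on,  --untracked-files=%s: %ss\n' "$uf" \
            "$(time_cmd git status --untracked-files="$uf")"
        # Assumption: an empty core.fsmonitor turns the hook off for this run.
        printf 'fsmonitor off, --untracked-files=%s: %ss\n' "$uf" \
            "$(time_cmd git -c core.fsmonitor= status --untracked-files="$uf")"
    done
}
```

Running compare_status a few times and discarding the first pass gives
steadier figures, since the hook and the OS file cache both need to
warm up.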

Using Windows' "procmon" to observe git.exe's behavior from outside, I
think I've understood a couple of things that surprise me:
1. when you specify "--untracked-files=all", git scans the entire
folder tree regardless of the "fsmonitor" hook
2. when you specify the "fsmonitor" hook, git does any
filesystem-scanning in a single-threaded fashion (as opposed to
multi-threaded without "fsmonitor" / normally)

These two things combine so that with "fsmonitor" set, normal
command-line git status performance is great, but performance in
tools that eagerly look for untracked files (like "Git Extensions" on
Windows) actually suffers: it takes about twice as long (compare the
20s and 9s figures above) to run the 'git -c
diff.ignoreSubModules=none status --porcelain=2 -z
--untracked-files=all' command that this UI wants (and blocks on, when
you go to a commit dialog).

Questions:

1. Is there a reason "--untracked-files=all" causes a full directory
tree scan even with the "fsmonitor" hook active, or is this
accidental?
2. Assuming that the full directory tree scan is indeed necessary even
with "fsmonitor" (when requesting all untracked files), could it be
made multithreaded?

(my apologies for the simplistic "outside-in" observations; I don't
feel qualified to attempt to understand the git source code)

Thanks for any help understanding the optimization opportunities here!

Tao Klerks


Thread overview: 3+ messages
2020-09-22 11:35 Question about fsmonitor and --untracked-files=all Tao Klerks
2020-09-23 10:40 ` Johannes Schindelin
2020-09-24 12:14   ` Tao Klerks
