git.vger.kernel.org archive mirror
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Tao Klerks <tao@klerks.biz>
Cc: git@vger.kernel.org
Subject: Re: Question about fsmonitor and --untracked-files=all
Date: Wed, 23 Sep 2020 12:40:09 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.2009231238560.5061@tvgsbejvaqbjf.bet>
In-Reply-To: <CAPMMpoj+UhKCW_k34-cGkiWFghOOu13GhPgA0V-y4ZpLVppuiA@mail.gmail.com>

Hi Tao,

On Tue, 22 Sep 2020, Tao Klerks wrote:

> I've got a couple questions about the "fsmonitor" functionality,
> untracked files, and multithreading.
>
> Background:
>
> In a repo with:
>  * A couple hundred thousand tracked files, and a couple hundred
> thousand .gitignored files, across a few thousand directories
>  * The --untracked-cache setting, tested and working
>  * core.fsmonitor set up with watchman (with the sample integration
> script from January)
>  * Git version 2.27.0.windows.1
>
> "git status" takes about 2s
> "git status --untracked-files=all" takes about 20s
>
> When I turn off "core.fsmonitor", the numbers change to something like:
> "git status": 8s
> "git status --untracked-files=all": 9s
>
> Using Windows' "procmon" to observe git.exe's behavior from outside, I
> think I've understood a couple things that surprise me:
> 1. when you specify "--untracked-files=all", git scans the entire
> folder tree regardless of the "fsmonitor" hook
> 2. when you specify the "fsmonitor" hook, git does any
> filesystem-scanning in a single-threaded fashion (as opposed to
> multi-threaded without "fsmonitor" / normally)
>
> These two things combine so that with "fsmonitor" set, normal
> command-line git status performance is great, but the performance in
> tools that eagerly look for untracked files (like "Git Extensions" on
> Windows) actually suffers - it takes twice as long to run the 'git -c
> diff.ignoreSubModules=none status --porcelain=2 -z
> --untracked-files=all' command that this UI wants (and blocks on, when
> you go to a commit dialog).
>
> Questions:
>
> 1. Is there a reason "--untracked-files=all" causes a full directory
> tree scan even with the "fsmonitor" hook active, or is this
> accidental?

I have a hunch that this might be related to a performance hack we have in
Git for Windows: did you enable FSCache perchance?

If so, I _suspect_ that turning it off would accelerate `git status
--untracked-files=all`.
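
A quick way to test that hunch could look like the sketch below. It
assumes the setting is exposed as `core.fscache` (the Git for Windows
config key; other Git builds ignore it), and the helper names
`time_status` and `compare_fscache` are made up for illustration. The
`-c` option overrides the value for a single invocation, so the stored
configuration is left untouched.

```shell
# Sketch: time the slow command once with FSCache on and once with it off.
# Assumption: the Git for Windows setting is named core.fscache; Git builds
# without FSCache ignore the key, so both runs then behave identically.
time_status() {
    # $1 = true or false; -c overrides the setting for this invocation only
    start=$(date +%s)
    git -c core.fscache="$1" status --porcelain=2 --untracked-files=all \
        >/dev/null || return 1
    end=$(date +%s)
    # Whole-second resolution is coarse, but enough for multi-second runs
    echo "core.fscache=$1: $((end - start))s"
}

# Hypothetical wrapper: run both variants back to back for comparison.
compare_fscache() {
    time_status true && time_status false
}
```

Run `compare_fscache` from the top of the work tree, ideally a couple
of times so that a cold disk cache does not skew the first run.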

Ciao,
Johannes

> 2. Assuming that the full directory tree scan is indeed necessary even
> with "fsmonitor" (when requesting all untracked files), could it be
> made multithreaded?
>
> (my apologies for the simplistic "outside-in" observations; I don't
> feel qualified to attempt to understand the git source code)
>
> Thanks for any help understanding the optimization opportunities here!
>
> Tao Klerks
>


Thread overview: 3+ messages
2020-09-22 11:35 Question about fsmonitor and --untracked-files=all Tao Klerks
2020-09-23 10:40 ` Johannes Schindelin [this message]
2020-09-24 12:14   ` Tao Klerks
