Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Nipunn Koorapati <nipunn1313@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Alex Vandiver via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>,
	Utsav Shah <utsav@dropbox.com>,
	Alex Vandiver <alexmv@dropbox.com>
Subject: Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
Date: Sun, 18 Oct 2020 00:17:47 -0400
Message-ID: <20201018041642.GB2262492@nand.local> (raw)
In-Reply-To: <CAN8Z4-W=+D-P_qCYijGMnStY-EGwKFx-+AYzjACDPAXnLRAA8A@mail.gmail.com>

On Sun, Oct 18, 2020 at 01:54:44AM +0100, Nipunn Koorapati wrote:
> > 20% of the working tree, running refresh_fsmonitor() for the entire
> > working tree is still a win, but if we are only checking less than
> > that, we are better off without fsmonitor, or does a tradeoff like
> > that exist?
>
> My understanding is that refresh_fsmonitor is
> O(delta_since_last_refresh) - so for developers
> with large repositories - this cost will amortize out over subsequent
> commands, so I don't
> think it's worth investigating this tradeoff here.
> As a user of large repositories, I expect that my major source of
> fsmonitor activity to be user
> intent (eg git pull, or intentionally copying/editing a large number
> of files). After such a command,
> I expect my next git command to be slower - that would be unsurprising.
>
> I think the tradeoff could be made for small diff requests, but I
> don't think it's worth adding complexity here -
> as that user will just have to pay the cost on their next git command.

Hmm. I do agree that I'd like to stay out of the business of trying to
figure out exactly what that trade-off is (although I'm sure that it
exists), only because it seems likely to vary to a large extent from
repository to repository. (That is, 20% may be a good number for some
repository, but a terrible choice for another).

But, I think that we can invoke watchman better here; the
fsmonitor-watchman hook has no notion of a "pathspec", so every query
just asks for everything that isn't in '$GIT_DIR'. Is there anything
preventing us from taking an optional pathspec and building up a more
targeted query?

There is some overhead to invoke the hook and talk to watchman, but
I'd expect that to be dwarfed by not having to issue O(# files)
syscalls.

> >
> > > +              */
> > > +             if (ce->ce_flags & CE_VALID || ce->ce_flags & CE_FSMONITOR_VALID) {
> >
> > Would it become easier to read, if written like this instead?
> >
> >                 if (ce->ce_flags & (CE_VALID | CE_FSMONITOR_VALID)) {
>
> I personally find this more confusing because it involves multiple
> bitwise ops, but this
> is potentially due to me having more mental practice thinking about
> boolean operators vs bitwise operators.
> I'm more than happy to align with the common pattern of the repo. I'll
> change this.

I don't have an opinion, nor do I think that git.git has an established
practice of doing one over the other. For what it's worth, my two-cents
is that Junio's suggestion is easier to read.

Thanks,
Taylor

  reply index

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-17 21:04 [PATCH 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-17 21:04 ` [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-17 22:25   ` Junio C Hamano
2020-10-18  0:54     ` Nipunn Koorapati
2020-10-18  4:17       ` Taylor Blau [this message]
2020-10-18  5:02         ` Junio C Hamano
2020-10-18 23:43           ` Taylor Blau
2020-10-19 17:23             ` Junio C Hamano
2020-10-19 17:37               ` Taylor Blau
2020-10-19 18:07                 ` Nipunn Koorapati
2020-10-17 21:04 ` [PATCH 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-17 21:04 ` [PATCH 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-18  4:22   ` Taylor Blau
2020-10-17 21:04 ` [PATCH 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-17 22:28   ` Junio C Hamano
2020-10-19 21:35 ` [PATCH v2 0/4] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 1/4] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 2/4] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 3/4] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-19 21:35   ` [PATCH v2 4/4] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-19 21:43     ` Taylor Blau
2020-10-19 21:54     ` Taylor Blau
2020-10-19 22:00       ` Nipunn Koorapati
2020-10-19 22:02         ` Taylor Blau
2020-10-19 22:25       ` Nipunn Koorapati
2020-10-19 22:47   ` [PATCH v3 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-19 22:47     ` [PATCH v3 5/7] perf lint: check test-lint-shell-syntax in perf tests Nipunn Koorapati via GitGitGadget
2020-10-20  2:38       ` Taylor Blau
2020-10-20  3:10         ` Junio C Hamano
2020-10-20  3:15           ` Taylor Blau
2020-10-20 10:16             ` Nipunn Koorapati
2020-10-20 10:09         ` Nipunn Koorapati
2020-10-19 22:47     ` [PATCH v3 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
2020-10-20  2:43       ` Taylor Blau
2020-10-19 22:47     ` [PATCH v3 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget
2020-10-19 23:02       ` Nipunn Koorapati
2020-10-20  2:40       ` Taylor Blau
2020-10-20 13:40     ` [PATCH v4 0/7] use fsmonitor data in git diff eliminating O(num_files) calls to lstat Nipunn Koorapati via GitGitGadget
2020-10-20 13:40       ` [PATCH v4 1/7] fsmonitor: use fsmonitor data in `git diff` Alex Vandiver via GitGitGadget
2020-10-20 13:40       ` [PATCH v4 2/7] t/perf/README: elaborate on output format Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 3/7] t/perf/p7519-fsmonitor.sh: warm cache on first git status Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 4/7] t/perf: add fsmonitor perf test for git diff Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 5/7] perf lint: add make test-lint to perf tests Nipunn Koorapati via GitGitGadget
2020-10-20 22:06         ` Taylor Blau
2020-10-20 22:17           ` Nipunn Koorapati
2020-10-20 22:19             ` Taylor Blau
2020-10-20 13:41       ` [PATCH v4 6/7] p7519-fsmonitor: refactor to avoid code duplication Nipunn Koorapati via GitGitGadget
2020-10-20 13:41       ` [PATCH v4 7/7] p7519-fsmonitor: add a git add benchmark Nipunn Koorapati via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201018041642.GB2262492@nand.local \
    --to=me@ttaylorr.com \
    --cc=alexmv@dropbox.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=nipunn1313@gmail.com \
    --cc=stolee@gmail.com \
    --cc=utsav@dropbox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git