Git Mailing List Archive on lore.kernel.org
From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	Nipunn Koorapati <nipunn1313@gmail.com>,
	Alex Vandiver via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Derrick Stolee <stolee@gmail.com>,
	Utsav Shah <utsav@dropbox.com>,
	Alex Vandiver <alexmv@dropbox.com>
Subject: Re: [PATCH 1/4] fsmonitor: use fsmonitor data in `git diff`
Date: Sun, 18 Oct 2020 19:43:44 -0400
Message-ID: <20201018234344.GC4204@nand.local>
In-Reply-To: <xmqq1rhw86ur.fsf@gitster.c.googlers.com>

On Sat, Oct 17, 2020 at 10:02:04PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Hmm. I do agree that I'd like to stay out of the business of trying to
> > figure out exactly what that trade-off is (although I'm sure that it
> > exists), only because it seems likely to vary to a large extent from
> > repository to repository. (That is, 20% may be a good number for some
> > repository, but a terrible choice for another).
>
> I think both of you misunderstood me.
>
> My question was a simple yes/no "does a trade-off exist?"
> question, and the sentences with 20% in them were a mere example of
> a possible trade-off I had in mind that _could_ exist.  I wasn't even
> suggesting to figure out what the optimum cut-off heuristics would
> be (e.g. solving "when more than N% paths are subject to diff
> fsmonitor is faster" for N).
>
> I was hoping that we can show that even having to lstat just a
> single path is expensive enough---IOW, "there is no trade-off worth
> worrying about, because talking to fsmonitor is so cheap compared to
> the cost of even a single lstat" would have been a valid and happy
> answer.  With such a number, there is no risk of introducing an
> unwarranted performance regression to use cases that we did not
> anticipate by adding an unconditional call to refresh_fsmonitor().
>
> But without any rationale, the performance implication of adding an
> unconditional call to refresh_fsmonitor() would become much muddier.

Aha; thanks for clarifying. I'm glad we agree that finding 'N' would not
be worth it, or at least that showing that talking to fsmonitor is
cheaper than a single lstat would be more worthwhile.

Nipunn - I don't have fsmonitor/watchman set up on my workstation, but if
you do, some numbers (or an interpretation of the numbers you already
provided) on this would be really useful. If you don't have it set up,
or don't have time to measure it, let me know, and I'd be happy to take
a look.
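
For a rough sense of the two costs being compared, a harness along the
following lines could time one lstat() call against spawning a child
process and reading its output. This is only a sketch under stated
assumptions: the helper names `time_lstat` and `time_spawn` are made up
for illustration, and the child here is a Python interpreter that exits
immediately, not the fsmonitor-watchman hook, so it says nothing about
the cost of the watchman query itself (and interpreter start-up inflates
the spawn figure).

```python
import os
import subprocess
import sys
import tempfile
import time


def time_lstat(path, repeats=1000):
    """Return the mean wall-clock seconds for one os.lstat() call on path."""
    start = time.perf_counter()
    for _ in range(repeats):
        os.lstat(path)
    return (time.perf_counter() - start) / repeats


def time_spawn(argv, repeats=20):
    """Return the mean wall-clock seconds to spawn argv and capture stdout."""
    start = time.perf_counter()
    for _ in range(repeats):
        subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        lstat_cost = time_lstat(tmp.name)
    # Hypothetical stand-in for the hook: a child that does nothing.
    spawn_cost = time_spawn([sys.executable, "-c", "pass"])
    print(f"lstat: {lstat_cost * 1e6:.2f} us per call")
    print(f"spawn: {spawn_cost * 1e6:.2f} us per call")
```

Numbers from a real repository with the actual hook configured would
still be needed; the stat cache is warm here, and a cold cache or a
network filesystem could change the picture considerably.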

> > But, I think that we can invoke watchman better here; the
> > fsmonitor-watchman hook has no notion of a "pathspec", so every query
> > just asks for everything that isn't in '$GIT_DIR'. Is there anything
> > preventing us from taking an optional pathspec and building up a more
> > targeted query?
>
> Yup, it is what I had in mind when I brought up the pathspec.  It
> may be something worth pursuing longer term, but not within the
> scope of this patch.
>
> > There is some overhead to invoke the hook and talk to watchman, but
> > I'd expect that to be dwarfed by not having to issue O(# files)
> > syscalls.
>
> "invoke the hook"---is that a pipe+fork+exec, or something else that
> is far lighter-weight?

The former; see 'fsmonitor.c:query_fsmonitor()'.
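
As a generic illustration of what "pipe+fork+exec" involves, here is a
minimal sketch of running a child and capturing its output. It is not
the code in fsmonitor.c; the function name `run_via_pipe_fork_exec` is
hypothetical, and it assumes a POSIX system.

```python
import os
import sys


def run_via_pipe_fork_exec(argv):
    """Run argv in a child process; return (stdout bytes, exit code)."""
    read_fd, write_fd = os.pipe()      # pipe: channel from child to parent
    pid = os.fork()                    # fork: duplicate the current process
    if pid == 0:
        # Child: send stdout into the pipe, then replace the process image.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execvp(argv[0], argv)   # exec: only returns (raises) on failure
        finally:
            os._exit(127)
    # Parent: close the write end so read() reports EOF once the child is done.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return b"".join(chunks), code


if __name__ == "__main__":
    out, code = run_via_pipe_fork_exec([sys.executable, "-c", "print('changed.c')"])
    print(out, code)
```

Each invocation pays for a pipe, a new process, and an exec of the hook
before any data comes back, which is the overhead in question.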

Thanks,
Taylor
