git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: Watchman support for git
Date: Sat, 10 May 2014 15:16:49 +0700	[thread overview]
Message-ID: <CACsJy8DVVpfYEmE8pSZNyXy1m5WRkdm08deW3EXgAy_0Gn72zw@mail.gmail.com> (raw)
In-Reply-To: <1399072451-15561-1-git-send-email-dturner@twopensource.com>

On Sat, May 3, 2014 at 6:14 AM,  <dturner@twopensource.com> wrote:
> The most sigificant patch uses Facebook's watchman daemon[1] to monitor
> the repository work tree for changes.  This makes allows git status
> to avoid traversing the entire work tree to find changes.

Some comments on this series. I still haven't been able to run it with
watchman so there are many guesses from my side.

First of all, when I set USE_WATCHMAN=Yes in config.mak I expect it to
work out of the box, provided that all dependencies are installed. I
still need to set WATCHMAN_LIBS for it to build because you only set
it with configure script. Would be nice to have a default value for
non-configure people too.

I'm not so happy that git now needs to link to libjansson.so and
libwatchman.so. I would load libwatchman.so at runtime only when
core.usewatchman is on, but this is more of personal taste.

I still prefer my old tracking model, where the majority of lstat() is
done by refresh operation and we only need to optimize those lstat
calls, not every single lstat statement in the code base. With that in
mind, I think you don't need to keep a fs cache on disk at all. All
you need to store (in the index) is the clock value from watchman.
After we parse the index, we perform a "since" query to get path names
(and perhaps "exists" and "mode" for later). Then we set CE_VALID bit
on entries that are _not_ in the query result. The remaining items
will be lstat'd by git (see [1] and read-cache.c changes onthe next
few patches). Assuming the number of updated files is reasonably
small, we won't  be punished by lookup time. To avoid write time cost
due to possible mass CE_VALID change, assuming split-index is already
used, we could store this bit separately as an extension and apply it
back at read time.

Your changes in dir.c. If I didn't misread it, you still do
last_exclude_matching when using fs-cache. That call is where all your
CPU is spent and probably explains why you didn't see big perf gain
with watchman. I think my untracked cache has dealt correctly with
caching in this area. So when you do the watchman query earlier, you
also check if any of the directories are updated and force using
untracked cache for the rest of the directories.

Hope it helps,
Duy

[1] http://thread.gmane.org/gmane.comp.version-control.git/240339

  parent reply	other threads:[~2014-05-10  8:17 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-02 23:14 Watchman support for git dturner
2014-05-02 23:14 ` [PATCH 1/3] After chdir to run grep, return to old directory dturner
2014-05-06 22:24   ` Junio C Hamano
2014-05-07  0:06     ` David Turner
2014-05-07  3:00       ` Jeff King
2014-05-07  3:33         ` David Turner
2014-05-07 17:42           ` Junio C Hamano
2014-05-07 20:57             ` David Turner
2014-05-02 23:14 ` [PATCH 3/3] Watchman support dturner
2014-05-02 23:20 ` Watchman support for git Felipe Contreras
2014-05-03  2:24   ` David Turner
2014-05-03  3:40     ` Felipe Contreras
2014-05-05 18:08       ` David Turner
2014-05-05 18:14         ` Felipe Contreras
2014-05-08 19:17       ` Sebastian Schuberth
2014-05-09  7:08         ` David Lang
2014-05-09 17:17           ` David Turner
2014-05-09 18:08             ` David Lang
2014-05-09 18:17               ` David Turner
2014-05-09 18:27                 ` David Lang
2014-05-09 18:47                   ` David Turner
2014-05-03  0:52 ` Duy Nguyen
2014-05-03  4:39   ` David Turner
2014-05-03  8:49     ` Duy Nguyen
2014-05-03 20:49       ` David Turner
2014-05-04  0:15         ` Duy Nguyen
2014-05-06  3:13           ` David Turner
2014-05-06  0:26   ` Duy Nguyen
2014-05-06  0:30     ` Duy Nguyen
2014-05-10  5:26 ` Duy Nguyen
2014-05-10 18:38   ` David Turner
2014-05-11  0:21     ` Duy Nguyen
2014-05-11 22:56       ` David Turner
2014-05-12 10:45         ` Duy Nguyen
2014-05-13 22:38           ` David Turner
2014-05-13 22:54             ` Duy Nguyen
2014-05-13 23:19               ` David Turner
2014-05-10  8:16 ` Duy Nguyen [this message]
2014-05-13 23:44   ` David Turner
2014-05-14 10:36     ` Duy Nguyen
2014-05-14 10:52       ` Duy Nguyen
2014-05-15 19:42       ` David Turner
2014-05-19 10:10         ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACsJy8DVVpfYEmE8pSZNyXy1m5WRkdm08deW3EXgAy_0Gn72zw@mail.gmail.com \
    --to=pclouds@gmail.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).