git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: Watchman support for git
Date: Tue, 13 May 2014 19:44:51 -0400	[thread overview]
Message-ID: <1400024691.14179.40.camel@stross> (raw)
In-Reply-To: <CACsJy8DVVpfYEmE8pSZNyXy1m5WRkdm08deW3EXgAy_0Gn72zw@mail.gmail.com>

On Sat, 2014-05-10 at 15:16 +0700, Duy Nguyen wrote:
> On Sat, May 3, 2014 at 6:14 AM,  <dturner@twopensource.com> wrote:
> > The most sigificant patch uses Facebook's watchman daemon[1] to monitor
> > the repository work tree for changes.  This makes allows git status
> > to avoid traversing the entire work tree to find changes.
> 
> Some comments on this series. I still haven't been able to run it with
> watchman so there are many guesses from my side.
> 
> First of all, when I set USE_WATCHMAN=Yes in config.mak I expect it to
> work out of the box, provided that all dependencies are installed. I
> still need to set WATCHMAN_LIBS for it to build because you only set
> it with configure script. Would be nice to have a default value for
> non-configure people too.

I'll fix that, thanks.

> I'm not so happy that git now needs to link to libjansson.so and
> libwatchman.so. I would load libwatchman.so at runtime only when
> core.usewatchman is on, but this is more of personal taste.

I assume you mean something with dlopen? I don't really like that because
(a) nothing else in git works that way and
(b) you would still need the libwatchman headers to build git (or the
structs could be copied, but that is ugly).

> I still prefer my old tracking model, where the majority of lstat() is
> done by refresh operation and we only need to optimize those lstat
> calls, not every single lstat statement in the code base.

The lstat calls are only one of the problems.  The others are
opendir/readdir and open(.gitignore). 

>  With that in
> mind, I think you don't need to keep a fs cache on disk at all. All
> you need to store (in the index) is the clock value from watchman.
> After we parse the index, we perform a "since" query to get path names
> (and perhaps "exists" and "mode" for later). Then we set CE_VALID bit
> on entries that are _not_ in the query result. The remaining items
> will be lstat'd by git (see [1] and read-cache.c changes on the next
> few patches). Assuming the number of updated files is reasonably
> small, we won't  be punished by lookup time. 

I considered this, but rejected it for a few reasons:
1. We still need to know about the untracked files.  I guess we could 
do an index extension for just that, like your untracked cache.

2. That doesn't eliminate opendir/readdir, I think.  Those are a major 
cost. I did not thoroughly review your patches from January/February, 
but I didn't notice anything in there about this part of dir.c.  
Also the cost of open(nonexistent .gitignore) to do ignore processing.

3. Changing a file in the index (e.g. git add) requires clearing
the CE_VALID bit; this means that third-party libraries (e.g. jgit) 
that change the git repo need to understand this extension for correct
operation.  Maybe that's just the nature of extensions, but it's not
something that my present patch set requires.

  reply	other threads:[~2014-05-13 23:45 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-02 23:14 Watchman support for git dturner
2014-05-02 23:14 ` [PATCH 1/3] After chdir to run grep, return to old directory dturner
2014-05-06 22:24   ` Junio C Hamano
2014-05-07  0:06     ` David Turner
2014-05-07  3:00       ` Jeff King
2014-05-07  3:33         ` David Turner
2014-05-07 17:42           ` Junio C Hamano
2014-05-07 20:57             ` David Turner
2014-05-02 23:14 ` [PATCH 3/3] Watchman support dturner
2014-05-02 23:20 ` Watchman support for git Felipe Contreras
2014-05-03  2:24   ` David Turner
2014-05-03  3:40     ` Felipe Contreras
2014-05-05 18:08       ` David Turner
2014-05-05 18:14         ` Felipe Contreras
2014-05-08 19:17       ` Sebastian Schuberth
2014-05-09  7:08         ` David Lang
2014-05-09 17:17           ` David Turner
2014-05-09 18:08             ` David Lang
2014-05-09 18:17               ` David Turner
2014-05-09 18:27                 ` David Lang
2014-05-09 18:47                   ` David Turner
2014-05-03  0:52 ` Duy Nguyen
2014-05-03  4:39   ` David Turner
2014-05-03  8:49     ` Duy Nguyen
2014-05-03 20:49       ` David Turner
2014-05-04  0:15         ` Duy Nguyen
2014-05-06  3:13           ` David Turner
2014-05-06  0:26   ` Duy Nguyen
2014-05-06  0:30     ` Duy Nguyen
2014-05-10  5:26 ` Duy Nguyen
2014-05-10 18:38   ` David Turner
2014-05-11  0:21     ` Duy Nguyen
2014-05-11 22:56       ` David Turner
2014-05-12 10:45         ` Duy Nguyen
2014-05-13 22:38           ` David Turner
2014-05-13 22:54             ` Duy Nguyen
2014-05-13 23:19               ` David Turner
2014-05-10  8:16 ` Duy Nguyen
2014-05-13 23:44   ` David Turner [this message]
2014-05-14 10:36     ` Duy Nguyen
2014-05-14 10:52       ` Duy Nguyen
2014-05-15 19:42       ` David Turner
2014-05-19 10:10         ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1400024691.14179.40.camel@stross \
    --to=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).