git.vger.kernel.org archive mirror
From: Duy Nguyen <pclouds@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: git mailing list <git@vger.kernel.org>
Subject: Re: Watchman support for git
Date: Sat, 3 May 2014 15:49:23 +0700
Message-ID: <CACsJy8B1Q3WEPT+nzDDwS5f7Wx+u5CHfN9JppRHv5VEx5NTxSw@mail.gmail.com>
In-Reply-To: <1399091986.5310.20.camel@stross>

On Sat, May 3, 2014 at 11:39 AM, David Turner <dturner@twopensource.com> wrote:
>> Index v4 and split index (and the following read-cache daemon,
>> hopefully)
>
> Looking at some of the archives for read-cache daemon, it seems to be
> somewhat similar to watchman, right?  But I only saw inotify code; what
> about Mac OS?  Or am I misunderstanding what it is?

It's described in the second paragraph of [1]. Its main purpose is to
hide the index read I/O cost and the SHA-1 hashing cost by doing that
work in the background. In theory it should work on any platform that
supports multiple processes and efficient IPC. It can also help load
the watchman file cache faster.

>> The last line could be a competition between watchman and my coming
>> "untracked cache" series. I expect to cut the number in that line at
>> least in half without external dependency.
>
> I hadn't seen the "untracked cached" work (I actually finished these
> patches a month or so ago but have been waiting for some internal
> reviews before sending them out).  Looks interesting.  It seems we use a
> similar strategy for handling ignores.

Yep, mostly the same at the core, except that I exploit directory
mtime while you use inotify. Each approach has its own pros and cons,
I think. Both should face the same traps in caching (e.g. if you "git
rm --cached" a file, that file could become either untracked or
ignored).
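
To make the directory-mtime idea concrete, here is a rough sketch. It
is not the code from the untracked cache series; DirCache, list_dir
and the rescans counter are names made up for illustration. The only
point is that a directory listing is reused as long as the directory's
mtime has not changed.

```python
import os


class DirCache:
    """Toy per-directory cache keyed on directory mtime (hypothetical)."""

    def __init__(self):
        self.entries = {}   # path -> (mtime_ns, sorted entry names)
        self.rescans = 0    # how often we had to call listdir again

    def list_dir(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self.entries.get(path)
        if cached is not None and cached[0] == mtime:
            # mtime unchanged: assume nothing was added or deleted.
            return cached[1]
        # mtime changed (or first visit): read the directory again.
        self.rescans += 1
        names = sorted(os.listdir(path))
        self.entries[path] = (mtime, names)
        return names
```

Calling list_dir() twice on an unchanged directory reads it only once.
A real implementation also has to cope with coarse timestamp
granularity, which this sketch ignores.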

>> Patch 2/3 did not seem to make it to the list by the way..
>
> Thanks for your comments.  I just tried again to send patch 2/3.  I do
> actually see the CC of it in my @twitter.com mailbox, but I don't see it
> in the archives on the web.  Do you know if there is a reason the
> mailing list would reject it?

Probably its size: 131K. That is also a sign that it (and the third
patch) should be split into smaller patches if you want this feature
merged into master eventually.

>   At any rate, the contents may be found
> at
> https://github.com/dturner-tw/git/commit/cf587d54fc72d82a23267348afa2c4b60f14ce51.diff

Good enough for me :)

>
>> initial
>> reaction is storing the list of all paths seems too much, but I'll
>> need to play with it a bit to understand it.
>
> I wonder if it would make sense to use the untracked cache as the
> storage strategy, but use watchman as the update strategy.

I'm afraid not. If a directory's mtime has changed, which means
files/dirs have been added or deleted, the untracked code falls back
to the opendir/readdir/is_excluded dance again on that directory. If
we naively do the same using watchman, we lose its advantage: it knows
exactly which files/dirs were added or removed. That knowledge could
speed up the dance, but the untracked cache has nowhere to store it.
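
A toy comparison of the two update strategies, to show where the
difference comes from. All names here (is_excluded, rescan,
apply_delta) are invented for the example, and the "*.o" pattern is
only a stand-in for real exclude rules. Tracked files are left out to
keep it short.

```python
import fnmatch


def is_excluded(name):
    # Stand-in for the real exclude check; assumes one "*.o" pattern.
    return fnmatch.fnmatch(name, "*.o")


def rescan(listing):
    """mtime-style fallback: classify every entry of the directory."""
    checks = 0
    untracked = set()
    for name in listing:
        checks += 1
        if not is_excluded(name):
            untracked.add(name)
    return untracked, checks


def apply_delta(untracked, added, removed):
    """Watcher-style update: look only at the names reported as changed."""
    checks = 0
    result = set(untracked) - set(removed)
    for name in added:
        checks += 1
        if not is_excluded(name):
            result.add(name)
    return result, checks
```

Both functions end up with the same untracked set. The rescan pays one
exclude check per directory entry, while the delta pays one per
changed name.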

We could extend the "read-cache daemon" mentioned above though, to
hide all the hard work in the background and present a good view to
git: when a file/dir is added, the read-cache daemon classifies the
new files/dirs as tracked/untracked/ignored and updates its untracked
cache in memory. When "git status" asks about the index and untracked
cache, it receives the _updated_ cache (not the on-disk version any
more) with the latest dir mtime, so git can verify the cache is
perfect and skip opendir/.... All git has to do is write the index at
the end to make the updated data permanent. It sounds interesting. But
I'm not so sure if it's worth the complexity.
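
A very rough sketch of the in-memory bookkeeping such a daemon might
do. None of this exists; CacheDaemon and its methods are hypothetical,
and fnmatch is only a stand-in for git's exclude matching. It also
shows the "git rm --cached" trap mentioned earlier: a file that leaves
the index has to be classified again.

```python
import fnmatch


class CacheDaemon:
    """Hypothetical in-memory view of tracked/untracked/ignored paths."""

    def __init__(self, tracked, ignore_patterns):
        self.tracked = set(tracked)
        self.ignore_patterns = list(ignore_patterns)
        self.untracked = set()
        self.ignored = set()

    def classify(self, path):
        if path in self.tracked:
            return "tracked"
        if any(fnmatch.fnmatch(path, p) for p in self.ignore_patterns):
            return "ignored"
        return "untracked"

    def on_added(self, path):
        # Called when the file watcher reports a new file.
        kind = self.classify(path)
        if kind == "untracked":
            self.untracked.add(path)
        elif kind == "ignored":
            self.ignored.add(path)
        return kind

    def on_removed(self, path):
        # Called when the file watcher reports a deleted file.
        self.untracked.discard(path)
        self.ignored.discard(path)

    def untrack(self, path):
        # Models "git rm --cached": the file stays on disk but leaves
        # the index, so it becomes either untracked or ignored.
        self.tracked.discard(path)
        return self.on_added(path)
```

A "git status" client would then ask the daemon for the untracked set
instead of walking the directories itself.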

[1] http://article.gmane.org/gmane.comp.version-control.git/247268
-- 
Duy
