git.vger.kernel.org archive mirror
From: Duy Nguyen <pclouds@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: Watchman support for git
Date: Mon, 12 May 2014 17:45:42 +0700
Message-ID: <CACsJy8C_j2bKVwqOQtOqGFkc_-_AmY=bQXquRfL-aqk=z9YKdw@mail.gmail.com>
In-Reply-To: <1399848982.11843.161.camel@stross>

On Mon, May 12, 2014 at 5:56 AM, David Turner <dturner@twopensource.com> wrote:
>> So without watchman I got
>>
>>    299.871ms read_index_from:1538 if (verify_hdr(hdr, mmap_size) < 0) go
>>    498.205ms cmd_status:1300 refresh_index(&the_index, REFRESH_QUIE
>>    796.050ms wt_status_collect:622 wt_status_collect_untracked(s)
>>
>> and with watchman ("git status" ran several times to make sure it's cached)
>>
>>    301.950ms read_index_from:1538 if (verify_hdr(hdr, mmap_size) < 0) go
>>     34.918ms  read_fs_cache:347 if (verify_hdr(hdr, mmap_size) < 0) go
>>   1564.096ms  watchman_load_fs_cache:628 update_fs_cache(istate, result);
>>    161.930ms cmd_status:1300 refresh_index(&the_index, REFRESH_QUIE
>>    251.614ms wt_status_collect:622 wt_status_collect_untracked(s)
>>
>> Given the total time of "git status" without watchman is 1.9s,
>> update_fs_cache() nearly matches that number alone. All of that time is
>> spent in the exclude update code in that function, but if you do
>> last_exclude_matching() anyway, why cache it?
>
> My numbers are different (for my test repository):
>
> ---
>     30.031ms read_index:1386 r = read_index_from(istate, get_index_
>     71.625ms cmd_status:1302 refresh_index(&the_index, REFRESH_QUIE
>    259.712ms wt_status_collect:622 wt_status_collect_untracked(s)
> ----
>     41.110ms read_index:1386 r = read_index_from(istate, get_index_
>      9.294ms read_fs_cache:347 if (verify_hdr(hdr, mmap_size) < 0) go
>      0.173ms watchman_load_fs_cache:628 update_fs_cache(istate, result)
>     41.901ms read_index:1386 r = read_index_from(istate, get_index_
>     18.355ms cmd_status:1302 refresh_index(&the_index, REFRESH_QUIE
>     50.911ms wt_status_collect:622 wt_status_collect_untracked(s)
> ---
>
> I think something must be going wrong with update_fs_cache on your
> machine.  I have a few hypotheses:
>
> 1. Maybe watchman isn't fully started up when you run your tests.
> 2. Maybe there is a bug.

It's probably me doing something wrong (I ran it more than a couple of
times, so watchman must have loaded the whole thing). I get small
numbers in update_fs_cache() now.

>> A bit surprised about the wt_status_collect_untracked number. I verified
>> that last_exclude_matching() still runs (on the same number of
>> entries as in the no-watchman case). Replacing fs_cache_open() in
>> add_excludes_from_file_to_list() with plain open() does not change the
>> number, so we probably won't need that (unless your worktree is filled
>> with .gitignore files, which I doubt is the norm).
>
> My test repo has a couple hundred of them.  Maybe that's unusual?  A
> repo with a lot of projects will tend to have lots of gitignore files,
> because each project will want to maintain them independently.

I tried the worst case: every directory had an empty .gitignore. The
numbers did not change much. I found out why: add_excludes.. was
called only twice, not once per .gitignore, because of the condition
"!(dir->flags & DIR_EXCLUDE_CMDL_ONLY)". So the number of .gitignore
files does not matter (yet).
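To make the effect of such a guard concrete, here is a minimal sketch. It
is not the code from dir.c or from the patch: the flag value, the struct
and count_exclude_loads() are made up for illustration, and only the shape
of the condition is taken from above. It shows that when the guard fails,
the number of exclude-file loads stays constant no matter how many
directories carry a .gitignore.

```c
#include <assert.h>

/* Hypothetical flag bit; the real value in git's dir.h may differ. */
#define DIR_EXCLUDE_CMDL_ONLY (1u << 4)

/* Hypothetical stand-in for struct dir_struct; only the flags matter here. */
struct dir_sketch {
	unsigned flags;
};

/*
 * Count how many exclude files a walk over ndirs directories would load.
 * fixed_loads models loads that happen once per run; a per-directory
 * .gitignore load only happens when the guard passes.
 */
int count_exclude_loads(const struct dir_sketch *dir, int ndirs, int fixed_loads)
{
	int loads = fixed_loads;
	int i;

	assert(dir != 0 && ndirs >= 0);
	for (i = 0; i < ndirs; i++) {
		if (!(dir->flags & DIR_EXCLUDE_CMDL_ONLY))
			loads++; /* one .gitignore load for this directory */
	}
	return loads;
}
```

With the flag set, count_exclude_loads(&dir, 6000, 2) stays at 2; with the
flag clear it grows to 6002, which is the case where the number of
.gitignore files would start to matter.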

This is your quote from above, moved down a bit:

> update_fs_cache should only have to update based on what it has learned
> from watchman.  So if no .gitignore has been changed, it should not have
> to do very much work.
>
> I could take the fe_excluded check and move it above the
> last_exclude_matching check in fs_cache_is_excluded; it causes t7300 to
> fail when run under watchman but presumably that's fixable

So you exclude files early and make the real read_directory() pass do
pretty much nothing. This is probably not a good idea. Suppose I
touch $TOP/.gitignore and then run something other than "git status" (or
"git add"): I then have to pay the read_directory() cost.

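A rough sketch of the trade-off, under stated assumptions: the struct, the
counter and all function names below are hypothetical and do not come from
the patch, and match_rules() is a toy matcher that only excludes "*.o".
The eager variant recomputes every cached flag as soon as an ignore file
changes; the lazy variant runs the matcher only when a caller asks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry: a path plus its remembered exclude answer. */
struct fs_entry_sketch {
	const char *path;
	int excluded;
};

/* Counts how often the (pretend) expensive matcher runs. */
int matcher_calls;

/* Toy stand-in for the real exclude matching: excludes "*.o" only. */
int match_rules(const char *path)
{
	size_t n = strlen(path);

	matcher_calls++;
	return n >= 2 && strcmp(path + n - 2, ".o") == 0;
}

/* Eager: refresh every cached flag, whether or not anyone will ask. */
void refresh_all(struct fs_entry_sketch *e, size_t n)
{
	size_t i;

	assert(e != NULL || n == 0);
	for (i = 0; i < n; i++)
		e[i].excluded = match_rules(e[i].path);
}

/* Cached-first lookup: free once the flags are fresh. */
int is_excluded_cached(const struct fs_entry_sketch *e)
{
	return e->excluded;
}

/* Lazy lookup: pay for the matcher only on demand. */
int is_excluded_lazy(const struct fs_entry_sketch *e)
{
	return match_rules(e->path);
}
```

In the eager variant, whichever command runs first after a .gitignore
change pays for refresh_all() over every entry, even if it never looks at
untracked files. In the lazy variant that command pays nothing, and the
cost lands on the command that actually needs the answers.
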
Back to open vs fs_cache_open and the number of .gitignore files
above. I touched $TOP/.gitignore, then ran "git status" to make it read
all .gitignore files (6k of them), switching between open and
fs_cache_open. The numbers still do not show any visible
difference (~1620-1630ms).
-- 
Duy
