From: Duy Nguyen <pclouds@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: Watchman support for git
Date: Mon, 12 May 2014 17:45:42 +0700
Message-ID: <CACsJy8C_j2bKVwqOQtOqGFkc_-_AmY=bQXquRfL-aqk=z9YKdw@mail.gmail.com>
In-Reply-To: <1399848982.11843.161.camel@stross>

On Mon, May 12, 2014 at 5:56 AM, David Turner <dturner@twopensource.com> wrote:
>> So without watchman I got
>>
>> 299.871ms read_index_from:1538 if (verify_hdr(hdr, mmap_size) < 0) go
>> 498.205ms cmd_status:1300 refresh_index(&the_index, REFRESH_QUIE
>> 796.050ms wt_status_collect:622 wt_status_collect_untracked(s)
>>
>> and with watchman ("git status" ran several times to make sure it's cached)
>>
>> 301.950ms read_index_from:1538 if (verify_hdr(hdr, mmap_size) < 0) go
>> 34.918ms read_fs_cache:347 if (verify_hdr(hdr, mmap_size) < 0) go
>> 1564.096ms watchman_load_fs_cache:628 update_fs_cache(istate, result);
>> 161.930ms cmd_status:1300 refresh_index(&the_index, REFRESH_QUIE
>> 251.614ms wt_status_collect:622 wt_status_collect_untracked(s)
>>
>> Given the total time of "git status" without watchman is 1.9s,
>> update_fs_cache() nearly matches that number alone. All of that is spent
>> in the exclude update code in the function, but if you do
>> last_exclude_matching() anyway, why cache it?
>
> My numbers are different (for my test repository):
>
> ---
> 30.031ms read_index:1386 r = read_index_from(istate, get_index_
> 71.625ms cmd_status:1302 refresh_index(&the_index, REFRESH_QUIE
> 259.712ms wt_status_collect:622 wt_status_collect_untracked(s)
> ----
> 41.110ms read_index:1386 r = read_index_from(istate, get_index_
> 9.294ms read_fs_cache:347 if (verify_hdr(hdr, mmap_size) < 0) go
> 0.173ms watchman_load_fs_cache:628 update_fs_cache(istate, result)
> 41.901ms read_index:1386 r = read_index_from(istate, get_index_
> 18.355ms cmd_status:1302 refresh_index(&the_index, REFRESH_QUIE
> 50.911ms wt_status_collect:622 wt_status_collect_untracked(s)
> ---
>
> I think something must be going wrong with update_fs_cache on your
> machine. I have a few hypotheses:
>
> 1. Maybe watchman isn't fully started up when you run your tests.
> 2. Maybe there is a bug.

It's probably me doing something wrong (I ran it more than a couple of
times, so watchman must have loaded the whole thing). I get small
numbers in update_fs_cache() now.

>> A bit surprised about the wt_status_collect_untracked number. I verified
>> that last_exclude_matching() still runs (on the same number of
>> entries as in the no-watchman case). Replacing fs_cache_open() in
>> add_excludes_from_file_to_list() with plain open() does not change the
>> number, so we probably won't need that (unless your worktree is filled
>> with .gitignore files, which I doubt is the norm).
>
> My test repo has a couple hundred of them. Maybe that's unusual? A
> repo with a lot of projects will tend to have lots of gitignore files,
> because each project will want to maintain them independently.

I tried the worst case, where every directory had an empty .gitignore. The
numbers did not change much. I found out that this is because
add_excludes.. was called only twice, not once per .gitignore, due to
the condition "!(dir->flags & DIR_EXCLUDE_CMDL_ONLY)". So the number
of .gitignore files does not matter (yet).

This is your quote from above, moved down a bit:

> update_fs_cache should only have to update based on what it has learned
> from watchman. So if no .gitignore has been changed, it should not have
> to do very much work.
>
> I could take the fe_excluded check and move it above the
> last_exclude_matching check in fs_cache_is_excluded; it causes t7300 to
> fail when run under watchman but presumably that's fixable
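
One reading of the reordering proposed in the quote above is that a cached per-entry flag gets consulted before the expensive pattern match runs. A minimal sketch of that ordering follows, under loud assumptions: `fs_entry_sketch`, `slow_pattern_match`, `is_excluded` and `invalidate_all` are hypothetical names made up for illustration, the "pattern" is a hard-coded `.o` suffix, and none of this is taken from the actual patch or from git's dir.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry: excluded_cached is -1 (unknown), 0 or 1. */
struct fs_entry_sketch {
    const char *path;
    int excluded_cached;
};

/* Counts how often the expensive path is taken. */
static int pattern_match_calls;

/* Stand-in for the real exclude matching: "ignore anything ending in .o". */
static int slow_pattern_match(const char *path)
{
    size_t len = strlen(path);

    pattern_match_calls++;
    return len >= 2 && strcmp(path + len - 2, ".o") == 0;
}

/* Cached flag first; fall back to pattern matching only when unknown. */
static int is_excluded(struct fs_entry_sketch *e)
{
    if (e->excluded_cached >= 0)
        return e->excluded_cached;
    e->excluded_cached = slow_pattern_match(e->path);
    return e->excluded_cached;
}

/* What a changed .gitignore would force: every cached answer is stale. */
static void invalidate_all(struct fs_entry_sketch *entries, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        entries[i].excluded_cached = -1;
}
```

With this ordering, repeated queries are cheap as long as nothing invalidates the cache. After `invalidate_all()`, whichever caller asks first pays for the pattern matching again.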

So you exclude files early and make the real read_directory() pass do
pretty much nothing. This is probably not a good idea. Assume that I
touch $TOP/.gitignore and then do something other than "git status" (or
"git add"); then I still have to pay the read_directory() cost.

Back to open vs fs_cache_open and the number of .gitignore files
above: I touched $TOP/.gitignore, then ran "git status" to make it read
all .gitignore files (6k of them), switching between open and
fs_cache_open. I think the numbers still do not show any visible
difference (~1620-1630ms).
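
The traces quoted in this thread have the shape "elapsed time, function:line, statement text". A minimal sketch of how such per-statement timings could be collected follows, assuming POSIX clock_gettime() is available. `now_ms` and `TIMED` are hypothetical names, and this is not the instrumentation that produced the numbers above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() under strict -std modes */
#endif
#include <stdio.h>
#include <time.h>

/* Wall-clock time in milliseconds, read from a monotonic clock. */
static double now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/*
 * Hypothetical helper: run a statement, then print the elapsed time,
 * the enclosing function and line, and the statement text cut off
 * at 40 characters.
 */
#define TIMED(...) do { \
    double timed_start_ = now_ms(); \
    __VA_ARGS__; \
    fprintf(stderr, "%.3fms %s:%d %.40s\n", \
            now_ms() - timed_start_, __func__, __LINE__, #__VA_ARGS__); \
} while (0)
```

Wrapping the call under test, e.g. `TIMED(result = some_call(arg));` with `some_call` as a placeholder, prints one line per run to stderr, so two variants of the same call can be compared across repeated runs.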
--
Duy