git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: git <git@vger.kernel.org>, Ben Peart <benpeart@microsoft.com>,
	Alex Vandiver <alexmv@dropbox.com>,
	Christian Couder <christian.couder@gmail.com>,
	David Turner <dturner@twopensource.com>
Subject: Re: Some rough edges of core.fsmonitor
Date: Sat, 27 Jan 2018 12:43:41 +0100	[thread overview]
Message-ID: <87bmhfwmqa.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <CACsJy8BpO0s6facg+zcKC9icijpefkipM326n6xOArjn=ZW6+w@mail.gmail.com>


On Sat, Jan 27 2018, Duy Nguyen jotted:

> On Sat, Jan 27, 2018 at 7:28 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> 3) A lot of time spend reading the index (or something..)
>
> I'm resending a patch from my old index-helper series. It should
> measure all big time consuming blocks in git. Maybe we should get it
> merged...
>
>> While the hook itself takes ~20ms (and watchman itself 1/4 of that)
>> status as a whole takes much longer. gprof reveals:
>>
>>     Each sample counts as 0.01 seconds.
>>       %   cumulative   self              self     total
>>      time   seconds   seconds    calls  ms/call  ms/call  name
>>      15.38      0.02     0.02   221690     0.00     0.00  memihash
>
> This sounds like name-hash to me.
>
>>      15.38      0.04     0.02   221689     0.00     0.00  create_from_disk
>>       7.69      0.05     0.01  2216897     0.00     0.00  git_bswap32
>>       7.69      0.06     0.01   222661     0.00     0.00  ce_path_match
>>       7.69      0.07     0.01   221769     0.00     0.00  hashmap_add
>>       7.69      0.08     0.01    39941     0.00     0.00  prep_exclude
>>       7.69      0.09     0.01    39940     0.00     0.00  strbuf_addch
>>       7.69      0.10     0.01        1    10.00    10.00  read_one
>>       7.69      0.11     0.01        1    10.00    10.00  refresh_index
>>       7.69      0.12     0.01        1    10.00    10.00  tweak_fsmonitor
>>       7.69      0.13     0.01                             preload_thread
>>
>> The index is 24M in this case, I guess it's unpacking it, but I wonder
>> if this couldn't be much faster if we saved away the result of the last
>> "status" in something that's quick to access, and then if nothing
>
> No we could do better, we could cache parsed index content so
> everybody benefits. I demonstrated it with my "index v254" patch a
> while back:
>
> https://public-inbox.org/git/1399980019-8706-1-git-send-email-pclouds@gmail.com/
>
> With the patch I'm sending soon, we can see how much time reading an
> index take out of that ~140-150ms (and we probably can cut down index
> read time to like 10-20% when cached).
>
>> changed we just report that, and no need to re-write the index (or just
>> write the "it was clean at this time" part).
>
> Hmm.. does an index write increase that much time?

Your patch is very useful. Here's (with gcc -03) some runtimes. This
also includes my .git exclusion patch.

These are all best out of 5, and with the top (until <0.5% time) of
strace -c output (run as another invocation, timing not done with
strace)::

a) no fsmonitor

    $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
    12:32:44.947651 read-cache.c:1890       performance: 0.053153609 s: read cache .git/index
    12:32:44.967943 preload-index.c:112     performance: 0.020161093 s: preload index
    12:32:44.974217 read-cache.c:1446       performance: 0.006230611 s: refresh index
    12:32:44.979083 diff-lib.c:250          performance: 0.004649994 s: diff-files
    12:32:44.982511 diff-lib.c:527          performance: 0.002918416 s: diff-index
    12:32:45.037880 dir.c:2290              performance: 0.055331063 s: read directory
    On branch master
    Your branch is up to date with 'origin/master'.

    nothing to commit, working tree clean
    12:32:45.040666 trace.c:417             performance: 0.146724289 s: git command: '/home/aearnfjord/g/git/git-status'

    real    0m0.153s
    user    0m0.110s
    sys     0m0.354s
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     59.93    0.031924           1     39978         9 stat
     35.86    0.019104        6368         3           futex
      0.84    0.000446          12        36           mprotect
      0.73    0.000389          13        29           munmap
      0.66    0.000349           6        62           mmap
      0.53    0.000285          14        20           clone

b) with fsmonitor

    $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
    12:34:23.833625 read-cache.c:1890       performance: 0.049485685 s: read cache .git/index
    12:34:23.838622 preload-index.c:112     performance: 0.001221197 s: preload index
    12:34:23.858723 fsmonitor.c:170         performance: 0.020059647 s: fsmonitor process '.git/hooks/fsmonitor-watchman'
    12:34:23.871532 read-cache.c:1446       performance: 0.032870818 s: refresh index
    12:34:23.876427 diff-lib.c:250          performance: 0.004731427 s: diff-files
    12:34:23.880669 diff-lib.c:527          performance: 0.003944422 s: diff-index
    12:34:23.899225 dir.c:2290              performance: 0.018509066 s: read directory
    On branch master
    Your branch is up to date with 'origin/master'.

    nothing to commit, working tree clean
    12:34:23.901914 trace.c:417             performance: 0.118250995 s: git command: '/home/aearnfjord/g/git/git-status'

    real    0m0.125s
    user    0m0.086s
    sys     0m0.043s
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     36.61    0.001281          61        21           clone
     33.84    0.001184          41        29           munmap
      5.09    0.000178           5        36           mprotect
      4.34    0.000152           0       619           brk
      4.17    0.000146           2        62           mmap
      3.34    0.000117           2        55        20 open
      3.00    0.000105           2        60        27 lstat
      1.77    0.000062           2        37         9 stat
      1.60    0.000056           1        51           read
      1.57    0.000055           5        12           write
      1.17    0.000041          41         1           wait4
      0.83    0.000029           1        41           close
      0.83    0.000029           1        22           getcwd
      0.80    0.000028           1        34           fstat

c) with fsmonitor + don't write index

    $ time GIT_TRACE_PERFORMANCE=1 GIT_OPTIONAL_LOCKS=0 ~/g/git/git-status
    12:36:03.176866 read-cache.c:1890       performance: 0.048292088 s: read cache .git/index
    12:36:03.181006 preload-index.c:112     performance: 0.001343593 s: preload index
    12:36:03.200970 fsmonitor.c:170         performance: 0.019936338 s: fsmonitor process '.git/hooks/fsmonitor-watchman'
    12:36:03.210556 read-cache.c:1446       performance: 0.029525531 s: refresh index
    12:36:03.213366 diff-lib.c:250          performance: 0.002718730 s: diff-files
    12:36:03.216273 diff-lib.c:527          performance: 0.002666604 s: diff-index
    12:36:03.233579 dir.c:2290              performance: 0.017261702 s: read directory
    On branch master
    Your branch is up to date with 'origin/master'.

    nothing to commit, working tree clean
    12:36:03.233733 trace.c:417             performance: 0.105632105 s: git command: '/home/aearnfjord/g/git/git-status'

    real    0m0.111s
    user    0m0.073s
    sys     0m0.044s
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     24.42    0.001081          37        29           munmap
     20.74    0.000918          44        21           clone
      7.63    0.000338           5        62           mmap
      7.43    0.000329           6        54        20 open
      6.05    0.000268           7        36           mprotect
      5.99    0.000265           0       619           brk
      4.99    0.000221           4        60        27 lstat
      4.13    0.000183           4        51           read
      3.68    0.000163           9        19        10 access
      3.25    0.000144           4        34           fstat
      2.78    0.000123           3        40           close
      2.48    0.000110           3        37         9 stat
      2.24    0.000099           5        21           getcwd
      1.15    0.000051           4        12           write
      0.99    0.000044          22         2           poll

For comparison just the output from the hook:

    $ time ('.git/hooks/fsmonitor-watchman' '1' '1517052856494406191')
    .git
    real    0m0.017s
    user    0m0.011s
    sys     0m0.003s

And this is what goes on behind the scenes after we get rid of the
.git/hooks/fsmonitor-watchman overhead:
    
    $ time (echo '["query", "/home/aearnfjord/git_tree/2015-04-03-1M-git", {
                            "since": 1517052856,
                            "fields": ["name"],
                            "expression": ["not", ["allof", ["since", 1517052856, "cclock"], ["not", "exists"]]]
                    }]' | watchman -j)
    {
        "version": "4.9.0",
        "is_fresh_instance": false,
        "clock": "c:1517006351:31165:4:1113968",
        "files": [
            ".git"
        ]
    }
    
    real    0m0.003s
    user    0m0.000s
    sys     0m0.004s

  parent reply	other threads:[~2018-01-27 11:43 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-27  0:28 Some rough edges of core.fsmonitor Ævar Arnfjörð Bjarmason
2018-01-27  1:36 ` Duy Nguyen
2018-01-27  1:39   ` [PATCH] trace: measure where the time is spent in the index-heavy operations Nguyễn Thái Ngọc Duy
2018-01-27 11:58     ` Thomas Gummerer
2018-01-27 12:27       ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-01-27 11:43   ` Ævar Arnfjörð Bjarmason [this message]
2018-01-27 12:39     ` Some rough edges of core.fsmonitor Duy Nguyen
2018-01-27 13:09       ` Duy Nguyen
2018-01-27 19:01         ` Ævar Arnfjörð Bjarmason
2018-01-30 22:41           ` Ben Peart
2018-01-29  9:40     ` Duy Nguyen
2018-01-29 23:16       ` Ben Peart
2018-02-01 10:40         ` Duy Nguyen
2018-01-28 20:44 ` Johannes Schindelin
2018-01-28 22:28   ` Ævar Arnfjörð Bjarmason
2018-01-30  1:21     ` Ben Peart
2018-01-31 10:15       ` Duy Nguyen
2018-02-04  9:38         ` [PATCH] dir.c: ignore paths containing .git when invalidating untracked cache Nguyễn Thái Ngọc Duy
2018-02-05 17:44           ` Ben Peart
2018-02-06 12:02             ` Duy Nguyen
2018-02-07  9:21           ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-02-07  9:21             ` Nguyễn Thái Ngọc Duy
2018-02-07 16:59               ` Ben Peart
2018-02-13 10:00                 ` Duy Nguyen
2018-02-13 17:57                   ` Junio C Hamano
2018-02-14  1:24                     ` Duy Nguyen
2018-02-14  8:00                       ` Junio C Hamano
2018-01-30 22:57 ` Some rough edges of core.fsmonitor Ben Peart
2018-01-30 23:16   ` Ævar Arnfjörð Bjarmason
2018-01-31 16:12     ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bmhfwmqa.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=alexmv@dropbox.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).