git.vger.kernel.org archive mirror
From: dturner@twopensource.com
To: git@vger.kernel.org
Subject: Watchman support for git
Date: Fri,  2 May 2014 19:14:08 -0400
Message-ID: <1399072451-15561-1-git-send-email-dturner@twopensource.com>

The most significant patch uses Facebook's watchman daemon[1] to monitor
the repository work tree for changes.  This allows git status to avoid
traversing the entire work tree to find changes.
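
To make the idea concrete, here is a minimal Python sketch (not the code
from the patch, which is in C and talks to watchman through libwatchman).
It contrasts a full work-tree scan with a refresh that only stats the
paths a watcher reported as changed.  The function names and the
`changed_paths` argument are hypothetical; in the real patch the list of
changed paths would come from watchman.

```python
import os

def scan_full(root):
    """Baseline: lstat every file under root, as a scan without a watcher must."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            result[os.path.relpath(path, root)] = (st.st_mtime_ns, st.st_size)
    return result

def scan_incremental(root, cached, changed_paths):
    """Refresh only the paths a watcher reported; reuse cached entries for the rest.

    `changed_paths` stands in for whatever the file watcher reports
    (hypothetical input; paths are relative to root).
    """
    result = dict(cached)
    for rel in changed_paths:
        try:
            st = os.lstat(os.path.join(root, rel))
        except FileNotFoundError:
            result.pop(rel, None)   # the file was deleted since the last scan
        else:
            result[rel] = (st.st_mtime_ns, st.st_size)
    return result
```

The incremental version does work proportional to the number of changed
paths rather than to the size of the tree, which is where the saving on
large repositories would come from.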

This change requires libwatchman[2], a client library that I wrote for
watchman.

While making the watchman change, I also made a change to the index
format (contributed here in a separate patch).  Index integrity
checking uses the same SHA1 algorithm as the rest of git; this is
actually relatively slow.  It's not a huge part of run-time, but since
I wanted to do the same checking for the Watchman filesystem cache
(which is about twice as large as the index), I decided to optimize it
anyway.  I switched to VMAC.  VMAC is supposed to be a MAC, but
there's no reason it can't be used with a fixed key as a simple
integrity check.  VMAC is roughly five times faster than SHA1 on my
machine; this adds up to a 5% overall speed improvement on git status
(depending on the structure of your repo), and about 15% on git diff
--cached with no cached changes.
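
The "MAC with a fixed key as an integrity check" idea can be sketched as
follows.  This is only an illustration: VMAC is not available in the
Python standard library, so keyed BLAKE2b from hashlib is swapped in as
the MAC, and the key, tag length, and function names are all
hypothetical rather than taken from the patch.

```python
import hashlib

# Hypothetical fixed, public key.  Because the key is not secret, the tag
# only detects accidental corruption; it provides no authentication.
FIXED_KEY = b"\x00" * 16
TAG_LEN = 16

def _tag(payload: bytes) -> bytes:
    # Keyed BLAKE2b stands in for VMAC here.
    return hashlib.blake2b(payload, key=FIXED_KEY, digest_size=TAG_LEN).digest()

def seal(payload: bytes) -> bytes:
    """Append an integrity tag as a trailer, like a checksum at the end of a file."""
    return payload + _tag(payload)

def unseal(blob: bytes) -> bytes:
    """Verify the trailer and return the payload; raise ValueError on corruption."""
    if len(blob) < TAG_LEN:
        raise ValueError("blob is too short to contain a tag")
    payload, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    if tag != _tag(payload):
        raise ValueError("integrity check failed")
    return payload
```

A reader calls unseal() before trusting the contents; any flipped bit in
the payload or the trailer makes the check fail.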

The index format change might be less important with the split index;
I haven't investigated that since at the time I wrote these patches,
it didn't exist.

Some numbers follow.  They are on my laptop, which has 4x i5-2520M
processors, 8GB of RAM, and a solid state disk.  They're all tested
with a hot cache.

Test repository 1: Linux

Linux is about 45k files in 3k directories.  The average length of a
filename is about 32 bytes.

Git status timing:
no watchman: 125 ms
watchman: 90 ms

Test repository 2: Superscience

My second test repository (which is a semi-synthetic repo generated
from various Twitter internal repos) is somewhat larger than this, and
gets a correspondingly larger improvement.  It is about 65k files in
20k directories; the average length of a filename is 67 bytes.

Git status timing:
no watchman, index version 4: 370 ms
no watchman, index version 5: 365 ms
watchman, index version 4: 170 ms
watchman, index version 5: 165 ms
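
For readers who want the relative improvements, the short Python snippet
below derives them from the timings quoted above.  Only the millisecond
figures come from this message; the helper name is made up for the
example.

```python
def reduction_pct(before_ms, after_ms):
    """Percentage of run time saved when going from before_ms to after_ms."""
    return 100.0 * (before_ms - after_ms) / before_ms

# (no watchman, watchman) git status timings in ms, as reported above.
timings = {
    "Linux": (125, 90),
    "Superscience, index version 4": (370, 170),
    "Superscience, index version 5": (365, 165),
}

for name, (before, after) in timings.items():
    print(f"{name}: {reduction_pct(before, after):.1f}% less time, "
          f"{before / after:.2f}x speedup")
```

This works out to roughly 28% less time on Linux and roughly 54-55% less
on Superscience.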


[1] https://github.com/facebook/watchman
[2] https://github.com/twitter/libwatchman

Thread overview: 43+ messages
2014-05-02 23:14 dturner [this message]
2014-05-02 23:14 ` [PATCH 1/3] After chdir to run grep, return to old directory dturner
2014-05-06 22:24   ` Junio C Hamano
2014-05-07  0:06     ` David Turner
2014-05-07  3:00       ` Jeff King
2014-05-07  3:33         ` David Turner
2014-05-07 17:42           ` Junio C Hamano
2014-05-07 20:57             ` David Turner
2014-05-02 23:14 ` [PATCH 3/3] Watchman support dturner
2014-05-02 23:20 ` Watchman support for git Felipe Contreras
2014-05-03  2:24   ` David Turner
2014-05-03  3:40     ` Felipe Contreras
2014-05-05 18:08       ` David Turner
2014-05-05 18:14         ` Felipe Contreras
2014-05-08 19:17       ` Sebastian Schuberth
2014-05-09  7:08         ` David Lang
2014-05-09 17:17           ` David Turner
2014-05-09 18:08             ` David Lang
2014-05-09 18:17               ` David Turner
2014-05-09 18:27                 ` David Lang
2014-05-09 18:47                   ` David Turner
2014-05-03  0:52 ` Duy Nguyen
2014-05-03  4:39   ` David Turner
2014-05-03  8:49     ` Duy Nguyen
2014-05-03 20:49       ` David Turner
2014-05-04  0:15         ` Duy Nguyen
2014-05-06  3:13           ` David Turner
2014-05-06  0:26   ` Duy Nguyen
2014-05-06  0:30     ` Duy Nguyen
2014-05-10  5:26 ` Duy Nguyen
2014-05-10 18:38   ` David Turner
2014-05-11  0:21     ` Duy Nguyen
2014-05-11 22:56       ` David Turner
2014-05-12 10:45         ` Duy Nguyen
2014-05-13 22:38           ` David Turner
2014-05-13 22:54             ` Duy Nguyen
2014-05-13 23:19               ` David Turner
2014-05-10  8:16 ` Duy Nguyen
2014-05-13 23:44   ` David Turner
2014-05-14 10:36     ` Duy Nguyen
2014-05-14 10:52       ` Duy Nguyen
2014-05-15 19:42       ` David Turner
2014-05-19 10:10         ` Duy Nguyen
