From: Karsten Blees <karsten.blees@gmail.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: "Git List" <git@vger.kernel.org>,
	"Duy Nguyen" <pclouds@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"Robert Zeh" <robert.allan.zeh@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Erik Faye-Lund" <kusmabite@gmail.com>,
	"Drew Northup" <n1xim.email@gmail.com>
Subject: Re: [RFC/PATCH] Documentation/technical/api-fswatch.txt: start with outline
Date: Wed, 13 Mar 2013 00:21:00 +0100
Message-ID: <513FB85C.5010106@gmail.com>
In-Reply-To: <1362946623-23649-1-git-send-email-artagnon@gmail.com>

On 10.03.2013 21:17, Ramkumar Ramachandra wrote:
> git operations are slow on repositories with lots of files, and lots
> of tiny filesystem calls like lstat(), getdents(), open() are
> responsible for this.  On the linux-2.6 repository, for instance, the
> numbers for "git status" look like this:
> 
>   top syscalls sorted     top syscalls sorted
>   by acc. time            by number
>   ----------------------------------------------
>   0.401906 40950 lstat    0.401906 40950 lstat
>   0.190484 5343 getdents  0.150055 5374 open
>   0.150055 5374 open      0.190484 5343 getdents
>   0.074843 2806 close     0.074843 2806 close
>   0.003216 157 read       0.003216 157 read
> 
> To solve this problem, we propose to build a daemon which will watch
> the filesystem using inotify and report batched up events over a UNIX
> socket.

[...]

> +
> +The credential C API is meant to be called by Git code which needs
> +information about filesystem changes.  It is centered around an
> +object representing the changes to the filesystem since the last
> +invocation.
> +

Hmm... I don't see how reporting filesystem changes since the last invocation can solve the problem, or am I missing something? I think what you mean is that the daemon should keep track of the filesystem *state* of the working copy, or alternatively of the deltas/changes relative to some known state (such as .git/index)?
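
To make the distinction concrete, here is a minimal sketch of the "deltas relative to a known state" variant. All names (struct dirty_set, dirty_set_record() and so on) are made up for illustration and are not part of the proposed API; the fixed-size arrays are only there to keep the example short. The point is that querying does not consume the recorded changes, and only refreshing the baseline (e.g. rewriting the index) clears them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIRTY_MAX 64
#define DIRTY_PATH_LEN 256

/* Hypothetical: paths that changed relative to a known baseline. */
struct dirty_set {
	char paths[DIRTY_MAX][DIRTY_PATH_LEN];
	size_t nr;
};

/*
 * Record one change event. Repeated events for the same path
 * collapse into a single entry. Returns 0 on success, -1 if the
 * path is too long or the set is full.
 */
static int dirty_set_record(struct dirty_set *set, const char *path)
{
	size_t i;

	assert(set && path);
	if (strlen(path) >= DIRTY_PATH_LEN)
		return -1;
	for (i = 0; i < set->nr; i++)
		if (!strcmp(set->paths[i], path))
			return 0;
	if (set->nr == DIRTY_MAX)
		return -1;
	strcpy(set->paths[set->nr++], path);
	return 0;
}

/* Querying is read-only: asking twice gives the same answer. */
static size_t dirty_set_count(const struct dirty_set *set)
{
	assert(set);
	return set->nr;
}

/* Called only when the baseline itself has been refreshed. */
static void dirty_set_reset(struct dirty_set *set)
{
	assert(set);
	set->nr = 0;
}
```

With a "changes since last invocation" interface, a second query right after the first would report nothing, so every caller would have to persist the first answer somewhere itself.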

I'm also still skeptical whether a daemon will improve overall performance. As I understand it, it's essentially a filesystem cache in user mode. The difference from using the OS filesystem cache directly (via lstat/readdir) is that we replace ~50k syscalls with a single IPC call (i.e. the git <--> fswatch daemon communication is less 'chatty'). However, the 'chattiness' is still there between the fswatch daemon and the OS / inotify. Consider 'git status; make; make clean; git status'... that's a *lot* of changes to process for nothing (potentially slowing down make).
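
For reference, the 'chatty' pattern I mean looks roughly like the sketch below: one lstat() per path, so the syscall count grows linearly with the number of tracked files. The struct and function names are invented for this example and this is not how git's own code is organized; it only illustrates the cost model.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical result of one scan over a list of paths. */
struct scan_result {
	size_t syscalls; /* number of lstat() calls issued */
	size_t present;  /* paths that exist on disk */
};

/* Check every path with its own lstat() call. */
static struct scan_result scan_paths(const char *const *paths, size_t nr)
{
	struct scan_result res = { 0, 0 };
	struct stat st;
	size_t i;

	assert(paths || !nr);
	for (i = 0; i < nr; i++) {
		res.syscalls++;
		if (!lstat(paths[i], &st))
			res.present++;
	}
	return res;
}
```

A daemon would answer the same question with one request over the socket, but it still has to digest one inotify event per change on its side.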

Then there's the issue of stale data in the cache. Porcelain commands that modify the repository and use 'git status --porcelain' to compile their changesets will want 100% exact data. I'm not saying it's not doable, but adding another platform-specific caching daemon to the tool chain doesn't exactly simplify things...

But perhaps I'm too pessimistic (or just scarred by the inherently slow and out-of-date TGitCache/TSvnCache on Windows :-)
