From: Ben Peart <peartben@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	benpeart@microsoft.com,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>,
	David.Turner@twosigma.com, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH v2 5/6] fsmonitor: add documentation for the fsmonitor extension.
Date: Mon, 22 May 2017 12:18:20 -0400
Message-ID: <5ab333a4-c3cd-1cb5-ba3e-6b08fa14c9e7@gmail.com>
In-Reply-To: <CACBZZX5URAeA+=12ezW-oDGnkdAqvQqV7it=HBaYCKUdx0p_XA@mail.gmail.com>



On 5/20/2017 8:10 AM, Ævar Arnfjörð Bjarmason wrote:
>> +== File System Monitor cache
>> +
>> +  The file system monitor cache tracks files for which the query-fsmonitor
>> +  hook has told us about changes.  The signature for this extension is
>> +  { 'F', 'S', 'M', 'N' }.
>> +
>> +  The extension starts with
>> +
>> +  - 32-bit version number: the current supported version is 1.
>> +
>> +  - 64-bit time: the extension data reflects all changes through the given
>> +       time which is stored as the seconds elapsed since midnight, January 1, 1970.
>> +
>> +  - 32-bit bitmap size: the size of the CE_FSMONITOR_DIRTY bitmap.
>> +
>> +  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
>> +    is CE_FSMONITOR_DIRTY.
>
> We already have a uint64_t in one place in the codebase (getnanotime)
> which uses a 64 bit time for nanosecond accuracy, and numerous
> filesystems already support nanosecond timestamps (ext4, that new
> Apple thingy...).
>
> I don't know if any of the inotify/fsmonitor APIs support that yet,
> but it seems inevitable that that'll be added if not, in some
> pathological cases we can have a lot of files modified in 1 second, so
> using nanosecond accuracy means there'll be a lot less data to
> consider in some cases.
>
> It does mean this'll only work until the year ~2500, but that seems
> like an acceptable trade-off.
>

I really don't think nanosecond resolution is needed in this case, for
a few reasons.

The number of files that can change within a given second is limited by 
the IO throughput of the underlying device. Even assuming a very fast 
device and very small files and changes, this won't be that many files.

Without this patch, git would have scanned all those files every time.
With this patch, git will rescan only the files that were modified in
the same second as the first scan, *before that scan started* (the
"lots of files modified" section in the one-second timeline below).

|------------------------- one second ----------------------|
|-lots of files modified - git status - more files modified-|

Yes, some duplicate status checks can be made, but it's still a
significant win in any reasonable scenario, especially when you
consider that it is pretty unusual to run git status/add/commit in the
middle of making lots of changes to files.
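
To make the timeline above concrete, here is a minimal sketch of how many
files get reported again when "changes since" is answered at second
versus nanosecond granularity. It is a simplified model, not the code in
this series: count_reported() and NS_PER_SEC are hypothetical names, and
the inclusive "same second or later" comparison is an assumption about
how the watcher answers the query.

```c
#include <stddef.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: count how many of the n modification times
 * (in nanoseconds since the epoch) would be reported as "changed
 * since since_ns".  With use_nanoseconds == 0 both sides are
 * truncated to whole seconds first, which is the granularity the
 * version 1 extension stores.
 */
static size_t count_reported(const uint64_t *mtime_ns, size_t n,
			     uint64_t since_ns, int use_nanoseconds)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (use_nanoseconds) {
			if (mtime_ns[i] >= since_ns)
				count++;
		} else if (mtime_ns[i] / NS_PER_SEC >= since_ns / NS_PER_SEC) {
			/* same second as the scan, or later */
			count++;
		}
	}
	return count;
}
```

With second granularity, the only extra files are the ones modified
earlier in the very second the scan ran; everything from previous
seconds is skipped either way.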

In addition, the backing file system monitor (Watchman) supports
timestamps expressed as the number of seconds since the Unix epoch
(time_t style). This means any support for nanoseconds in git is
academic until someone provides a file system watcher that supports
nanosecond granularity.

Finally, the fsmonitor index extension is versioned so that we can
seamlessly upgrade to nanosecond resolution later if we desire.
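
As a rough illustration of what the version field buys us, a reader of
the fixed-size header described in the quoted documentation (32-bit
version, 64-bit time, 32-bit bitmap size) could look like the sketch
below. The names fsmn_header, read_be32(), read_be64() and
parse_fsmn_header() are hypothetical, and big-endian byte order is an
assumption here; this is not the reader from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of the fixed-size part of the extension. */
struct fsmn_header {
	uint32_t version;
	uint64_t last_update;	/* seconds since the epoch in version 1 */
	uint32_t bitmap_size;	/* size of the ewah bitmap that follows */
};

/* Assumes fields are stored big-endian (network byte order). */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t read_be64(const unsigned char *p)
{
	return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/*
 * Returns 0 on success, -1 if the data is too short or the version
 * is not one this reader understands.
 */
static int parse_fsmn_header(const unsigned char *data, size_t len,
			     struct fsmn_header *out)
{
	if (len < 16)
		return -1;
	out->version = read_be32(data);
	if (out->version != 1)
		return -1;	/* a later version could redefine the time field */
	out->last_update = read_be64(data + 4);
	out->bitmap_size = read_be32(data + 12);
	return 0;
}
```

Because the version is checked before the time field is interpreted, a
later version could change that field's unit without older readers
misreading it; they would simply ignore the extension.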
