All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Ben Peart <benpeart@microsoft.com>,
	David.Turner@twosigma.com, avarab@gmail.com,
	christian.couder@gmail.com, git@vger.kernel.org,
	johannes.schindelin@gmx.de, pclouds@gmail.com, peff@peff.net
Subject: Re: [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
Date: Thu, 21 Sep 2017 10:35:33 -0400	[thread overview]
Message-ID: <f50825a4-fa15-9f28-a079-853e78ee8e2e@gmail.com> (raw)
In-Reply-To: <a4ab4766-0367-ff18-a3a9-e48ed49ccd80@gmail.com>



On 9/20/2017 10:24 PM, Ben Peart wrote:
> 
> 
> On 9/20/2017 10:00 PM, Junio C Hamano wrote:
>> Ben Peart <peartben@gmail.com> writes:
>>
>>> Pretty much the same places you would also use CE_MATCH_IGNORE_VALID
>>> and CE_MATCH_IGNORE_SKIP_WORKTREE which serve the same role for those
>>> features.  That is generally when you are about to overwrite data so
>>> want to be *really* sure you have what you think you have.
>>
>> Now that makes me worried gravely.
>>
>> IGNORE_VALID is ignored in these places because we have been burned
>> by end-users lying to us.  IGNORE_SKIP_WORKTREE must be ignored
>> because we know that the working tree state does not match the
>> "reality" the index wants to have.  The fact that the code treats
>> the status reported and kept up to date by fsmonitor the same way as
>> these two implies that it is merely advisory and cannot be trusted?
>> Is that the reason why we tell the codepath with IGNORE_FSMONITOR to
>> ignore the state fsmonitor reported and check the state ourselves?
>>
> 
> Sorry for causing unnecessary worry.  The fsmonitor data can be trusted 
> (as much as you can trust that Watchman or your file system monitor is 
> not buggy).  I wasn't 100% sure *why* these places passed the various 
> IGNORE_VALID and IGNORE_SKIP_WORKTREE flags.  When I looked at them, 
> that lack of trust seemed to be the reason.
> 
> Adding IGNORE_FSMONITOR in those same places was simply an abundance of 
> caution on my part.  The only down side of passing the flag for 
> fsmonitor is that we will end up calling lstat() on a file where we 
> technically didn't need too.  That seemed safer than potentially missing 
> a change if I had misunderstood the code.
> 
> I'd much rather return correct results (and fall back to the old 
> performance) than potentially be incorrect.  I followed that same 
> principal in the entire design of fsmonitor - if anything doesn't look 
> right, fall back to the old code path just in case...
> 

I spent some time with git blame/show trying to figure out the *why* for 
all the places CE_MATCH_IGNORE_* are passed without gaining a lot of 
additional understanding.  Based on your description above of why these 
exist, I believe there are very few places we actually need to pass 
CE_MATCH_IGNORE_FSMONITOR and that I was being overly cautious.

Here is a patch that removes the unnecessary CE_MATCH_IGNORE_FSMONITOR 
instances.  While the test suite passes with this change, I'm not 100% 
confident that we actually have test cases that would have detected all 
the places that we needed the CE_MATCH_IGNORE_* flags.

If this seems like a reasonable additional optimization to make, I can 
roll it into the next iteration of the patch series as I have some 
spelling, documentation changes and other tweaks as a result of all the 
feedback.


 From 6ff7ed0467fd736dca73efe62391bb3ee9b4e771 Mon Sep 17 00:00:00 2001
From: Ben Peart <benpeart@microsoft.com>
Date: Thu, 21 Sep 2017 09:09:42 -0400
Subject: [PATCH] fsmonitor: remove unnecessary uses of
  CE_MATCH_IGNORE_FSMONITOR

With a better understanding of *why* the CE_MATCH_IGNORE_* flags are
used, it is now more clear they are not required in most cases where
CE_MATCH_IGNORE_FSMONITOR was being passed out of an abundance of
caution.

Since the fsmonitor data can be trusted and is kept in sync with the
working directory, the only remaining valid uses are those locations
where we don't want to trigger an unneeded refresh_fsmonitor() call.

One is where preload_index() is doing a fast precompute of state for
the bulk of the index entries but is not required for correctness as
refresh_cache_ent() will ensure any "missed" by preload_index() are
up-to-date if/when they are needed.

The second is in is_staging_gitmodules_ok() where we don't want to
trigger a complete refresh just to check the .gitignore file.

The net result of this change will be that there are more cases where
we will be able to use the cached index state and avoid unnecessary
lstat() calls.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
  apply.c        | 2 +-
  entry.c        | 2 +-
  read-cache.c   | 4 ++--
  unpack-trees.c | 6 +++---
  4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/apply.c b/apply.c
index 9061cc5f15..71cbbd141c 100644
--- a/apply.c
+++ b/apply.c
@@ -3399,7 +3399,7 @@ static int verify_index_match(const struct 
cache_entry *ce, struct stat *st)
  			return -1;
  		return 0;
  	}
-	return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return ce_match_stat(ce, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
  }

  #define SUBMODULE_PATCH_WITHOUT_INDEX 1
diff --git a/entry.c b/entry.c
index 5e6794f9fc..3a7b667373 100644
--- a/entry.c
+++ b/entry.c
@@ -404,7 +404,7 @@ int checkout_entry(struct cache_entry *ce,

  	if (!check_path(path.buf, path.len, &st, state->base_dir_len)) {
  		const struct submodule *sub;
-		unsigned changed = ce_match_stat(ce, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+		unsigned changed = ce_match_stat(ce, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
  		/*
  		 * Needs to be checked before !changed returns early,
  		 * as the possibly empty directory was not changed
diff --git a/read-cache.c b/read-cache.c
index 53093dbebf..05c0a33fdd 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -641,7 +641,7 @@ int add_to_index(struct index_state *istate, const 
char *path, struct stat *st,
  	int size, namelen, was_same;
  	mode_t st_mode = st->st_mode;
  	struct cache_entry *ce, *alias;
-	unsigned ce_option = 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY|CE_MATCH_IGNORE_FSMONITOR;
+	unsigned ce_option = 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_RACY_IS_DIRTY;
  	int verbose = flags & (ADD_CACHE_VERBOSE | ADD_CACHE_PRETEND);
  	int pretend = flags & ADD_CACHE_PRETEND;
  	int intent_only = flags & ADD_CACHE_INTENT;
@@ -1356,7 +1356,7 @@ int refresh_index(struct index_state *istate, 
unsigned int flags,
  	int first = 1;
  	int in_porcelain = (flags & REFRESH_IN_PORCELAIN);
  	unsigned int options = (CE_MATCH_REFRESH |
-				(really ? CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_FSMONITOR : 0) |
+				(really ? CE_MATCH_IGNORE_VALID : 0) |
  				(not_new ? CE_MATCH_IGNORE_MISSING : 0));
  	const char *modified_fmt;
  	const char *deleted_fmt;
diff --git a/unpack-trees.c b/unpack-trees.c
index f724a61ac0..1f5d371636 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1456,7 +1456,7 @@ static int verify_uptodate_1(const struct 
cache_entry *ce,
  		return 0;

  	if (!lstat(ce->name, &st)) {
-		int flags = 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR;
+		int flags = CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE;
  		unsigned changed = ie_match_stat(o->src_index, ce, &st, flags);

  		if (submodule_from_ce(ce)) {
@@ -1612,7 +1612,7 @@ static int icase_exists(struct 
unpack_trees_options *o, const char *name, int le
  	const struct cache_entry *src;

  	src = index_file_exists(o->src_index, name, len, 1);
-	return src && !ie_match_stat(o->src_index, src, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR);
+	return src && !ie_match_stat(o->src_index, src, st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE);
  }

  static int check_ok_to_remove(const char *name, int len, int dtype,
@@ -2136,7 +2136,7 @@ int oneway_merge(const struct cache_entry * const 
*src,
  		if (o->reset && o->update && !ce_uptodate(old) && 
!ce_skip_worktree(old)) {
  			struct stat st;
  			if (lstat(old->name, &st) ||
-			    ie_match_stat(o->src_index, old, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE|CE_MATCH_IGNORE_FSMONITOR))
+			    ie_match_stat(o->src_index, old, &st, 
CE_MATCH_IGNORE_VALID|CE_MATCH_IGNORE_SKIP_WORKTREE))
  				update |= CE_UPDATE;
  		}
  		add_entry(o, old, update, 0);
-- 
2.14.1.548.g237ef02b2b.dirty



>> Oh, wait...
>>
>>
>>> The other place I used it was in preload_index(). In that case, I
>>> didn't want to trigger the call to refresh_fsmonitor() as
>>> preload_index() is about trying to do a fast precompute of state for
>>> the bulk of the index entries but is not required for correctness as
>>> refresh_cache_ent() will ensure any "missed" by preload_index() are
>>> up-to-date if/when that is needed.
>>
>> That is a very valid design decision.  So IGNORE_FSMONITOR is,
>> unlike IGNORE_VALID and IGNORE_SKIP_WORKTREE, to tell us "do not
>> bother asking fsmonitor to refresh the state of this entry--it is OK
>> for us to use a slightly stale information"?  That would make sense
>> as an optimization, but that does not mesh well with the previous
>> "we need to be really really sure" usecase.  That one wants "we do
>> not trust fsmonitor, so do not bother asking to refresh; we will do
>> so ourselves", which would not help the "we can use slightly stale
>> one and that is OK" usecase.
>>
>> Puzzled...
>>

  reply	other threads:[~2017-09-21 14:35 UTC|newest]

Thread overview: 137+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-10 13:40 [PATCH v5 0/7] Fast git status via a file system watcher Ben Peart
2017-06-10 13:40 ` [PATCH v5 1/7] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-06-10 13:40 ` [PATCH v5 2/7] dir: make lookup_untracked() available outside of dir.c Ben Peart
2017-06-10 13:40 ` [PATCH v5 3/7] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-06-27 15:43   ` Christian Couder
2017-07-03 21:25     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 4/7] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-06-27 16:20   ` Christian Couder
2017-07-07 18:50     ` Ben Peart
2017-06-10 13:40 ` [PATCH v5 5/7] fsmonitor: add documentation for the " Ben Peart
2017-06-10 13:40 ` [PATCH v5 6/7] fsmonitor: add a sample query-fsmonitor hook script for Watchman Ben Peart
2017-06-10 13:40 ` [PATCH v5 7/7] fsmonitor: add a performance test Ben Peart
2017-06-10 14:04   ` Ben Peart
2017-06-12 22:04   ` Junio C Hamano
2017-06-14 14:12     ` Ben Peart
2017-06-14 18:36       ` Junio C Hamano
2017-07-07 18:14         ` Ben Peart
2017-07-07 18:35           ` Junio C Hamano
2017-07-07 19:07             ` Ben Peart
2017-07-07 19:33             ` David Turner
2017-07-08  7:19             ` Christian Couder
2017-06-28  5:11 ` [PATCH v5 0/7] Fast git status via a file system watcher Christian Couder
2017-07-10 13:36   ` Ben Peart
2017-07-10 14:40     ` Ben Peart
2017-09-15 19:20 ` [PATCH v6 00/12] " Ben Peart
2017-09-15 19:20   ` [PATCH v6 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-15 19:20   ` [PATCH v6 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-15 19:20   ` [PATCH v6 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-15 21:35     ` David Turner
2017-09-18 13:07       ` Ben Peart
2017-09-18 13:32         ` David Turner
2017-09-18 13:49           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-15 19:43     ` David Turner
2017-09-18 13:27       ` Ben Peart
2017-09-17  8:03     ` Junio C Hamano
2017-09-18 13:29       ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-15 20:34     ` David Turner
2017-09-15 19:20   ` [PATCH v6 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-15 19:20   ` [PATCH v6 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-17  8:02     ` Junio C Hamano
2017-09-18 13:38       ` Ben Peart
2017-09-18 15:43         ` Torsten Bögershausen
2017-09-18 16:28           ` Ben Peart
2017-09-19 14:16             ` Torsten Bögershausen
2017-09-19 15:36               ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 20:43     ` Jonathan Nieder
2017-09-20 17:11       ` Ben Peart
2017-09-20 17:46         ` Jonathan Nieder
2017-09-21  0:05           ` Ben Peart
2017-09-15 19:20   ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-15 22:00     ` David Turner
2017-09-19 19:32       ` David Turner
2017-09-19 20:30         ` Ben Peart
2017-09-16 15:27     ` Torsten Bögershausen
2017-09-17  5:43       ` [PATCH v1 1/1] test-lint: echo -e (or -E) is not portable tboegi
2017-09-19 20:37         ` Jonathan Nieder
2017-09-20 13:49           ` Torsten Bögershausen
2017-09-22  1:04             ` Junio C Hamano
2017-09-18 14:06       ` [PATCH v6 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-17  4:47     ` Junio C Hamano
2017-09-18 15:25       ` Ben Peart
2017-09-19 20:34         ` Jonathan Nieder
2017-09-15 19:20   ` [PATCH v6 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-15 19:20   ` [PATCH v6 12/12] fsmonitor: add a performance test Ben Peart
2017-09-15 21:56     ` David Turner
2017-09-18 14:24     ` Johannes Schindelin
2017-09-18 18:19       ` Ben Peart
2017-09-19 15:28         ` Johannes Schindelin
2017-09-19 19:27   ` [PATCH v7 00/12] Fast git status via a file system watcher Ben Peart
2017-09-19 19:27     ` [PATCH v7 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-19 19:27     ` [PATCH v7 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-20 22:06       ` Stefan Beller
2017-09-21  0:02         ` Ben Peart
2017-09-21  0:44           ` Stefan Beller
2017-09-19 19:27     ` [PATCH v7 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-20  5:47       ` Junio C Hamano
2017-09-20 14:58         ` Ben Peart
2017-09-21  1:46           ` Junio C Hamano
2017-09-21  2:06             ` Ben Peart
2017-09-21  2:18               ` Junio C Hamano
2017-09-21  2:32                 ` Junio C Hamano
2017-09-19 19:27     ` [PATCH v7 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-20  2:28       ` Junio C Hamano
2017-09-20 16:19         ` Ben Peart
2017-09-21  2:00           ` Junio C Hamano
2017-09-21  2:24             ` Ben Peart
2017-09-21 14:35               ` Ben Peart [this message]
2017-09-22  1:02                 ` Junio C Hamano
2017-09-20  6:23       ` Junio C Hamano
2017-09-20 16:29         ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-20 10:00       ` Martin Ågren
2017-09-20 17:02         ` Ben Peart
2017-09-20 17:11           ` Martin Ågren
2017-09-19 19:27     ` [PATCH v7 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-19 19:46       ` David Turner
2017-09-19 20:44         ` Ben Peart
2017-09-19 21:27           ` David Turner
2017-09-19 22:44             ` Ben Peart
2017-09-19 19:27     ` [PATCH v7 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-19 19:27     ` [PATCH v7 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-19 19:27     ` [PATCH v7 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-19 19:27     ` [PATCH v7 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-19 19:27     ` [PATCH v7 12/12] fsmonitor: add a performance test Ben Peart
2017-09-22 16:35     ` [PATCH v8 00/12] Fast git status via a file system watcher Ben Peart
2017-09-22 16:35       ` [PATCH v8 01/12] bswap: add 64 bit endianness helper get_be64 Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:31           ` Ben Peart
2017-09-24  3:51             ` Jeff King
2017-09-24  3:52             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 02/12] preload-index: add override to enable testing preload-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 03/12] update-index: add a new --force-write-index option Ben Peart
2017-09-22 16:35       ` [PATCH v8 04/12] fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files Ben Peart
2017-09-22 16:35       ` [PATCH v8 05/12] fsmonitor: add documentation for the fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 06/12] ls-files: Add support in ls-files to display the fsmonitor valid bit Ben Peart
2017-09-22 16:35       ` [PATCH v8 07/12] update-index: add fsmonitor support to update-index Ben Peart
2017-09-22 16:35       ` [PATCH v8 08/12] fsmonitor: add a test tool to dump the index extension Ben Peart
2017-09-22 23:37         ` Martin Ågren
2017-09-23 23:33           ` Ben Peart
2017-09-24  3:51             ` Junio C Hamano
2017-09-22 16:35       ` [PATCH v8 09/12] split-index: disable the fsmonitor extension when running the split index test Ben Peart
2017-09-22 16:35       ` [PATCH v8 10/12] fsmonitor: add test cases for fsmonitor extension Ben Peart
2017-09-22 16:35       ` [PATCH v8 11/12] fsmonitor: add a sample integration script for Watchman Ben Peart
2017-09-22 16:35       ` [PATCH v8 12/12] fsmonitor: add a performance test Ben Peart
2017-09-29  2:20       ` [PATCH v8 00/12] Fast git status via a file system watcher Junio C Hamano
2017-09-29 12:07         ` Ben Peart
2017-10-01  8:24           ` Junio C Hamano
2017-10-03 19:48             ` Ben Peart
2017-10-04  2:09               ` Junio C Hamano
2017-10-04  6:38                 ` Alex Vandiver
2017-10-04 12:48                   ` Ben Peart
2017-10-04 12:27                 ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f50825a4-fa15-9f28-a079-853e78ee8e2e@gmail.com \
    --to=peartben@gmail.com \
    --cc=David.Turner@twosigma.com \
    --cc=avarab@gmail.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.