git.vger.kernel.org archive mirror
* [PATCH 0/1] fsmonitor: fix watchman integration
@ 2019-11-04 17:50 Kevin Willford via GitGitGadget
  2019-11-04 17:50 ` [PATCH 1/1] " Kevin Willford via GitGitGadget
  2019-11-06  3:29 ` [PATCH 0/1] " Junio C Hamano
  0 siblings, 2 replies; 6+ messages in thread
From: Kevin Willford via GitGitGadget @ 2019-11-04 17:50 UTC (permalink / raw)
  To: git; +Cc: Kevin Willford, Junio C Hamano

When running Git commands quickly -- such as in a shell script or the test
suite -- the Git commands frequently complete and start again during the
same second. The example fsmonitor hooks to integrate with Watchman truncate
the nanosecond times to seconds. In principle, this is fine, as Watchman
claims to use inclusive comparisons [1]. The result should only be an
over-representation of the changed paths since the last Git command.

However, Watchman's own documentation claims "Using a timestamp is prone to
race conditions in understanding the complete state of the file tree" [2].
All of their documented examples use a "clockspec" that looks like
'c:123:234'. Git should eventually learn how to store this type of string to
provide a stronger integration, but that will be a more invasive change.

When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts
such as t7519-wtstatus.sh fail due to these race conditions. In fact,
running any test script with GIT_TEST_FSMONITOR pointing at
t/t7519/fsmonitor-watchman will cause failures in the test_commit function.
The 'git add "$indir$file"' command fails because too little time elapses
between the creation of '$file' and the 'git add' command.

For now, subtract one second from the timestamp we pass to Watchman. This
will make our window large enough to avoid these race conditions. Increasing
the window causes tests like t7519-wtstatus.sh to pass.
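
As an illustration only, the adjustment can be sketched in Python (the hook
itself is Perl). The function and constant names below are hypothetical; the
arithmetic mirrors the patched hook for non-negative timestamps.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def watchman_since(last_update_ns: int) -> int:
    """Convert Git's nanosecond timestamp to the value passed as "since".

    Hypothetical helper: truncate to whole seconds, then subtract one
    second so the query window is wider than strictly necessary.
    """
    return last_update_ns // NANOSECONDS_PER_SECOND - 1


# Two Git commands that ran within the same wall-clock second map to the
# same widened "since" value, one second before the truncated time.
first = watchman_since(1_572_889_841_100_000_000)
second = watchman_since(1_572_889_841_900_000_000)
print(first, second)
```

The trade-off is the one described above: a wider window can only
over-represent the changed paths, it cannot hide a change.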

When the integration was introduced in def437671 (fsmonitor: add a sample
integration script for Watchman, 2018-09-22), the query included an
expression that would ignore files created and deleted in that window. The
expression was a performance optimization meant to skip temporary files
created by a build between Git commands. However, it causes failures in
script scenarios where Git is creating or deleting files quickly.

When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails due to
this add-and-delete race condition.

By removing the "expression" from the Watchman query, we remove this race
condition. It will lead to some performance degradation in the case of users
creating and deleting temporary files inside their working directory between
Git commands. However, that is a cost we need to pay to be correct.
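
For comparison, a minimal Python sketch of the two query shapes. The helper
names and the example path are hypothetical; the JSON structure is copied
from the diff in the patch.

```python
import json


def build_query(work_tree: str, since: int) -> str:
    """Query after this change: "since" generator and file names only."""
    return json.dumps(["query", work_tree, {"since": since, "fields": ["name"]}])


def build_old_query(work_tree: str, since: int) -> str:
    """Query before this change, including the transient-file filter."""
    # Drops files created after `since` that no longer exist.
    expression = ["not", ["allof", ["since", since, "cclock"], ["not", "exists"]]]
    return json.dumps(
        ["query", work_tree,
         {"since": since, "fields": ["name"], "expression": expression}]
    )


print(build_query("/path/to/worktree", 1572889840))
```

Without the "expression" term, a file that was created and deleted between
two Git commands is still reported to Git.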

[1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
[2] https://facebook.github.io/watchman/docs/clockspec.html

Kevin Willford (1):
  fsmonitor: fix watchman integration

 t/t7519/fsmonitor-watchman                 | 13 ++++---------
 templates/hooks--fsmonitor-watchman.sample | 13 ++++---------
 2 files changed, 8 insertions(+), 18 deletions(-)


base-commit: da72936f544fec5a335e66432610e4cef4430991
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-444%2Fkewillford%2Ffsmonitor-watchman-fix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-444/kewillford/fsmonitor-watchman-fix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/444
-- 
gitgitgadget

* [PATCH 1/1] fsmonitor: fix watchman integration
  2019-11-04 17:50 [PATCH 0/1] fsmonitor: fix watchman integration Kevin Willford via GitGitGadget
@ 2019-11-04 17:50 ` Kevin Willford via GitGitGadget
  2019-11-12 11:32   ` SZEDER Gábor
  2019-11-06  3:29 ` [PATCH 0/1] " Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Kevin Willford via GitGitGadget @ 2019-11-04 17:50 UTC (permalink / raw)
  To: git; +Cc: Kevin Willford, Junio C Hamano, Kevin Willford

From: Kevin Willford <kewillf@microsoft.com>

When running Git commands quickly -- such as in a shell script or the
test suite -- the Git commands frequently complete and start again
during the same second. The example fsmonitor hooks to integrate with
Watchman truncate the nanosecond times to seconds. In principle, this is
fine, as Watchman claims to use inclusive comparisons [1]. The result
should only be an over-representation of the changed paths since the
last Git command.

However, Watchman's own documentation claims "Using a timestamp is prone
to race conditions in understanding the complete state of the file tree"
[2]. All of their documented examples use a "clockspec" that looks like
'c:123:234'. Git should eventually learn how to store this type of
string to provide a stronger integration, but that will be a more
invasive change.

When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts
such as t7519-wtstatus.sh fail due to these race conditions. In fact,
running any test script with GIT_TEST_FSMONITOR pointing at
t/t7519/fsmonitor-watchman will cause failures in the test_commit
function. The 'git add "$indir$file"' command fails because too little
time elapses between the creation of '$file' and the 'git add' command.

For now, subtract one second from the timestamp we pass to Watchman.
This will make our window large enough to avoid these race conditions.
Increasing the window causes tests like t7519-wtstatus.sh to pass.

When the integration was introduced in def437671 (fsmonitor: add a
sample integration script for Watchman, 2018-09-22), the query included
an expression that would ignore files created and deleted in that
window. The expression was a performance optimization meant to skip
temporary files created by a build between Git commands. However, it
causes failures in script scenarios where Git is creating or deleting
files quickly.

When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails
due to this add-and-delete race condition.

By removing the "expression" from the Watchman query, we remove this
race condition. It will lead to some performance degradation in the case
of users creating and deleting temporary files inside their working
directory between Git commands. However, that is a cost we need to pay
to be correct.

[1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
[2] https://facebook.github.io/watchman/docs/clockspec.html

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
---
 t/t7519/fsmonitor-watchman                 | 13 ++++---------
 templates/hooks--fsmonitor-watchman.sample | 13 ++++---------
 2 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
index 5514edcf68..d8e7a1e5ba 100755
--- a/t/t7519/fsmonitor-watchman
+++ b/t/t7519/fsmonitor-watchman
@@ -23,7 +23,8 @@ my ($version, $time) = @ARGV;
 
 if ($version == 1) {
 	# convert nanoseconds to seconds
-	$time = int $time / 1000000000;
+	# subtract one second to make sure watchman will return all changes
+	$time = int ($time / 1000000000) - 1;
 } else {
 	die "Unsupported query-fsmonitor hook version '$version'.\n" .
 	    "Falling back to scanning...\n";
@@ -54,18 +55,12 @@ sub launch_watchman {
 	#
 	# To accomplish this, we're using the "since" generator to use the
 	# recency index to select candidate nodes and "fields" to limit the
-	# output to file names only. Then we're using the "expression" term to
-	# further constrain the results.
-	#
-	# The category of transient files that we want to ignore will have a
-	# creation clock (cclock) newer than $time_t value and will also not
-	# currently exist.
+	# output to file names only.
 
 	my $query = <<"	END";
 		["query", "$git_work_tree", {
 			"since": $time,
-			"fields": ["name"],
-			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+			"fields": ["name"]
 		}]
 	END
 	
diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
index e673bb3980..ef94fa2938 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -22,7 +22,8 @@ my ($version, $time) = @ARGV;
 
 if ($version == 1) {
 	# convert nanoseconds to seconds
-	$time = int $time / 1000000000;
+	# subtract one second to make sure watchman will return all changes
+	$time = int ($time / 1000000000) - 1;
 } else {
 	die "Unsupported query-fsmonitor hook version '$version'.\n" .
 	    "Falling back to scanning...\n";
@@ -53,18 +54,12 @@ sub launch_watchman {
 	#
 	# To accomplish this, we're using the "since" generator to use the
 	# recency index to select candidate nodes and "fields" to limit the
-	# output to file names only. Then we're using the "expression" term to
-	# further constrain the results.
-	#
-	# The category of transient files that we want to ignore will have a
-	# creation clock (cclock) newer than $time_t value and will also not
-	# currently exist.
+	# output to file names only.
 
 	my $query = <<"	END";
 		["query", "$git_work_tree", {
 			"since": $time,
-			"fields": ["name"],
-			"expression": ["not", ["allof", ["since", $time, "cclock"], ["not", "exists"]]]
+			"fields": ["name"]
 		}]
 	END
 
-- 
gitgitgadget

* Re: [PATCH 0/1] fsmonitor: fix watchman integration
  2019-11-04 17:50 [PATCH 0/1] fsmonitor: fix watchman integration Kevin Willford via GitGitGadget
  2019-11-04 17:50 ` [PATCH 1/1] " Kevin Willford via GitGitGadget
@ 2019-11-06  3:29 ` Junio C Hamano
  2019-11-06 15:32   ` Kevin Willford
  1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2019-11-06  3:29 UTC (permalink / raw)
  To: Kevin Willford via GitGitGadget; +Cc: git, Kevin Willford

"Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:

> When running Git commands quickly -- such as in a shell script or the test
> suite -- the Git commands frequently complete and start again during the
> same second. The example fsmonitor hooks to integrate with Watchman truncate
> the nanosecond times to seconds. In principle, this is fine, as Watchman
> claims to use inclusive comparisons [1]. The result should only be an
> over-representation of the changed paths since the last Git command.
> ...

So, it doesn't seem to use "inclusive" and we need a workaround?

> Kevin Willford (1):
>   fsmonitor: fix watchman integration
>
>  t/t7519/fsmonitor-watchman                 | 13 ++++---------
>  templates/hooks--fsmonitor-watchman.sample | 13 ++++---------
>  2 files changed, 8 insertions(+), 18 deletions(-)

Thanks, will queue.

* RE: [PATCH 0/1] fsmonitor: fix watchman integration
  2019-11-06  3:29 ` [PATCH 0/1] " Junio C Hamano
@ 2019-11-06 15:32   ` Kevin Willford
  2019-11-06 15:46     ` Derrick Stolee
  0 siblings, 1 reply; 6+ messages in thread
From: Kevin Willford @ 2019-11-06 15:32 UTC (permalink / raw)
  To: Junio C Hamano, Kevin Willford via GitGitGadget; +Cc: git

> From: Junio C Hamano <gitster@pobox.com>
> Sent: Tuesday, November 5, 2019 8:30 PM
> 
> "Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > When running Git commands quickly -- such as in a shell script or the
> > test suite -- the Git commands frequently complete and start again
> > during the same second. The example fsmonitor hooks to integrate with
> > Watchman truncate the nanosecond times to seconds. In principle, this
> > is fine, as Watchman claims to use inclusive comparisons [1]. The
> > result should only be an over-representation of the changed paths since
> > the last Git command.
> > ...
> 
> So, it doesn't seem to use "inclusive" and we need a workaround?

That is what it seems like.  I would like to dig into the watchman code
to understand what is really going on.  They also document that "Using
a timestamp is prone to race conditions in understanding the complete
state of the file tree." Which could be the cause since the tests are
running things in quick succession, i.e. change a file, run a git command.

Long term we should switch to using watchman's clock id which the
documentation says does not have the race conditions. But the clock
id is a string and would take more invasive changes to integrate that
into the index where we are now simply using a uint64_t.
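
To make the type problem concrete, here is a hypothetical Python sketch (not
Git code, and not a proposed design): a stored token is either a decimal
timestamp or an opaque clock string such as the 'c:123:234' form cited in
the cover letter. All names are invented for illustration.

```python
from typing import Union

SinceToken = Union[int, str]


def parse_since_token(raw: str) -> SinceToken:
    """Classify a stored token (hypothetical helper).

    Strings starting with "c:" are treated as opaque Watchman clock ids
    and kept as-is; anything else must be a decimal timestamp.
    """
    if raw.startswith("c:"):
        return raw
    return int(raw)


print(parse_since_token("c:123:234"), parse_since_token("1572889840"))
```

The integer case fits the existing uint64_t; the string case is the part
that would need new storage in the index.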


* Re: [PATCH 0/1] fsmonitor: fix watchman integration
  2019-11-06 15:32   ` Kevin Willford
@ 2019-11-06 15:46     ` Derrick Stolee
  0 siblings, 0 replies; 6+ messages in thread
From: Derrick Stolee @ 2019-11-06 15:46 UTC (permalink / raw)
  To: Kevin Willford, Junio C Hamano, Kevin Willford via GitGitGadget; +Cc: git

On 11/6/2019 10:32 AM, Kevin Willford wrote:
>> From: Junio C Hamano <gitster@pobox.com>
>> Sent: Tuesday, November 5, 2019 8:30 PM
>>
>> "Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> When running Git commands quickly -- such as in a shell script or the
>>> test suite -- the Git commands frequently complete and start again
>>> during the same second. The example fsmonitor hooks to integrate with
>>> Watchman truncate the nanosecond times to seconds. In principle, this
>>> is fine, as Watchman claims to use inclusive comparisons [1]. The
>>> result should only be an over-representation of the changed paths since
>>> the last Git command.
>>> ...
>>
>> So, it doesn't seem to use "inclusive" and we need a workaround?
> 
> That is what it seems like.  I would like to dig into the watchman code
> to understand what is really going on.  They also document that "Using
> a timestamp is prone to race conditions in understanding the complete
> state of the file tree." Which could be the cause since the tests are
> running things in quick succession, i.e. change a file, run a git command.
> 
> Long term we should switch to using watchman's clock id which the
> documentation says does not have the race conditions. But the clock
> id is a string and would take more invasive changes to integrate that
> into the index where we are now simply using a uint64_t.

I should mention that I'm working on a patch series that will allow us
to use GIT_TEST_FSMONITOR pointing at Watchman in our CI builds. There
are several places where our fsmonitor integration does not work well
(such as when we delete submodules), but most of the remaining fixes
are small.

Fixing this timing issue resolves MOST of the problems we see when running
the test suite.

Thanks,
-Stolee


* Re: [PATCH 1/1] fsmonitor: fix watchman integration
  2019-11-04 17:50 ` [PATCH 1/1] " Kevin Willford via GitGitGadget
@ 2019-11-12 11:32   ` SZEDER Gábor
  0 siblings, 0 replies; 6+ messages in thread
From: SZEDER Gábor @ 2019-11-12 11:32 UTC (permalink / raw)
  To: Kevin Willford via GitGitGadget
  Cc: git, Kevin Willford, Junio C Hamano, Kevin Willford

On Mon, Nov 04, 2019 at 05:50:41PM +0000, Kevin Willford via GitGitGadget wrote:
> When running Git commands quickly -- such as in a shell script or the
> test suite -- the Git commands frequently complete and start again
> during the same second. The example fsmonitor hooks to integrate with
> Watchman truncate the nanosecond times to seconds. In principle, this is
> fine, as Watchman claims to use inclusive comparisons [1]. The result
> should only be an over-representation of the changed paths since the
> last Git command.

> [1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39

Nit: please refer to the specific blob in this link.  The file's
content in 'master' might change in time, and then the highlight will
point to different lines.  Worse, the file might be renamed, and then
we'll get a broken link.

