All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kevin Willford via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Kevin Willford <Kevin.Willford@microsoft.com>,
	Junio C Hamano <gitster@pobox.com>,
	Kevin Willford <Kevin.Willford@microsoft.com>
Subject: [PATCH 3/4] fsmonitor: add fsmonitor hook scripts for version 2
Date: Tue, 07 Jan 2020 19:04:30 +0000	[thread overview]
Message-ID: <e1fd5924d0441d1f4d6cce9f504a6a91f234c486.1578423871.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.510.git.1578423871.gitgitgadget@gmail.com>

From: Kevin Willford <Kevin.Willford@microsoft.com>

Version 2 of the fsmonitor hooks is passed the version and an update
token and must pass back a last update token to use for subsequent calls
to the hook.

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
---
 t/t7519/fsmonitor-all-v2                   |  21 +++
 t/t7519/fsmonitor-watchman-v2              | 173 +++++++++++++++++++++
 templates/hooks--fsmonitor-watchman.sample | 138 ++++++++++------
 3 files changed, 281 insertions(+), 51 deletions(-)
 create mode 100755 t/t7519/fsmonitor-all-v2
 create mode 100755 t/t7519/fsmonitor-watchman-v2

diff --git a/t/t7519/fsmonitor-all-v2 b/t/t7519/fsmonitor-all-v2
new file mode 100755
index 0000000000..061907e88b
--- /dev/null
+++ b/t/t7519/fsmonitor-all-v2
@@ -0,0 +1,21 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+#
+# An test hook script to integrate with git to test fsmonitor.
+#
+# The hook is passed a version (currently 2) and since token
+# formatted as a string and outputs to stdout all files that have been
+# modified since the given time. Paths must be relative to the root of
+# the working tree and separated by a single NUL.
+#
+#echo "$0 $*" >&2
+my ($version, $last_update_token) = @ARGV;
+
+if ($version ne 2) {
+	print "Unsupported query-fsmonitor hook version '$version'.\n";
+	exit 1;
+}
+
+print "last_update_token\0/\0"
diff --git a/t/t7519/fsmonitor-watchman-v2 b/t/t7519/fsmonitor-watchman-v2
new file mode 100755
index 0000000000..14ed0aa42d
--- /dev/null
+++ b/t/t7519/fsmonitor-watchman-v2
@@ -0,0 +1,173 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use IPC::Open2;
+
+# An example hook script to integrate Watchman
+# (https://facebook.github.io/watchman/) with git to speed up detecting
+# new and modified files.
+#
+# The hook is passed a version (currently 2) and last update token
+# formatted as a string and outputs to stdout a new update token and
+# all files that have been modified since the update token. Paths must
+# be relative to the root of the working tree and separated by a single NUL.
+#
+# To enable this hook, rename this file to "query-watchman" and set
+# 'git config core.fsmonitor .git/hooks/query-watchman'
+#
+my ($version, $last_update_token) = @ARGV;
+
+# Uncomment for debugging
+# print STDERR "$0 $version $last_update_token\n";
+
+# Check the hook interface version
+if ($version ne 2) {
+	die "Unsupported query-fsmonitor hook version '$version'.\n" .
+	    "Falling back to scanning...\n";
+}
+
+my $git_work_tree = get_working_dir();
+
+my $retry = 1;
+
+my $json_pkg;
+eval {
+	require JSON::XS;
+	$json_pkg = "JSON::XS";
+	1;
+} or do {
+	require JSON::PP;
+	$json_pkg = "JSON::PP";
+};
+
+launch_watchman();
+
+sub launch_watchman {
+	my $o = watchman_query();
+	if (is_work_tree_watched($o)) {
+		output_result($o->{clock}, @{$o->{files}});
+	}
+}
+
+sub output_result {
+	my ($clockid, @files) = @_;
+
+	# Uncomment for debugging watchman output
+	# open (my $fh, ">", ".git/watchman-output.out");
+	# binmode $fh, ":utf8";
+	# print $fh "$clockid\n@files\n";
+	# close $fh;
+
+	binmode STDOUT, ":utf8";
+	print $clockid;
+	print "\0";
+	local $, = "\0";
+	print @files;
+}
+
+sub watchman_clock {
+	my $response = qx/watchman clock "$git_work_tree"/;
+	die "Failed to get clock id on '$git_work_tree'.\n" .
+		"Falling back to scanning...\n" if $? != 0;
+
+	return $json_pkg->new->utf8->decode($response);
+}
+
+sub watchman_query {
+	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty')
+	or die "open2() failed: $!\n" .
+	"Falling back to scanning...\n";
+
+	# In the query expression below we're asking for names of files that
+	# changed since $last_update_token but not from the .git folder.
+	#
+	# To accomplish this, we're using the "since" generator to use the
+	# recency index to select candidate nodes and "fields" to limit the
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	if (substr($last_update_token, 0, 1) eq "c") {
+		$last_update_token = "\"$last_update_token\"";
+	}
+	my $query = <<"	END";
+		["query", "$git_work_tree", {
+			"since": $last_update_token,
+			"fields": ["name"],
+			"expression": ["not", ["dirname", ".git"]]
+		}]
+	END
+
+	# Uncomment for debugging the watchman query
+	# open (my $fh, ">", ".git/watchman-query.json");
+	# print $fh $query;
+	# close $fh;
+
+	print CHLD_IN $query;
+	close CHLD_IN;
+	my $response = do {local $/; <CHLD_OUT>};
+
+	# Uncomment for debugging the watch response
+	# open ($fh, ">", ".git/watchman-response.json");
+	# print $fh $response;
+	# close $fh;
+
+	die "Watchman: command returned no output.\n" .
+	"Falling back to scanning...\n" if $response eq "";
+	die "Watchman: command returned invalid output: $response\n" .
+	"Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	return $json_pkg->new->utf8->decode($response);
+}
+
+sub is_work_tree_watched {
+	my ($output) = @_;
+	my $error = $output->{error};
+	if ($retry > 0 and $error and $error =~ m/unable to resolve root .* directory (.*) is not watched/) {
+		$retry--;
+		my $response = qx/watchman watch "$git_work_tree"/;
+		die "Failed to make watchman watch '$git_work_tree'.\n" .
+		    "Falling back to scanning...\n" if $? != 0;
+		$output = $json_pkg->new->utf8->decode($response);
+		$error = $output->{error};
+		die "Watchman: $error.\n" .
+		"Falling back to scanning...\n" if $error;
+
+		# Uncomment for debugging watchman output
+		# open (my $fh, ">", ".git/watchman-output.out");
+		# close $fh;
+
+		# Watchman will always return all files on the first query so
+		# return the fast "everything is dirty" flag to git and do the
+		# Watchman query just to get it over with now so we won't pay
+		# the cost in git to look up each individual file.
+		my $o = watchman_clock();
+		$error = $output->{error};
+
+		die "Watchman: $error.\n" .
+		"Falling back to scanning...\n" if $error;
+
+		output_result($o->{clock}, ("/"));
+		$last_update_token = $o->{clock};
+
+		eval { launch_watchman() };
+		return 0;
+	}
+
+	die "Watchman: $error.\n" .
+	"Falling back to scanning...\n" if $error;
+
+	return 1;
+}
+
+sub get_working_dir {
+	my $working_dir;
+	if ($^O =~ 'msys' || $^O =~ 'cygwin') {
+		$working_dir = Win32::GetCwd();
+		$working_dir =~ tr/\\/\//;
+	} else {
+		require Cwd;
+		$working_dir = Cwd::cwd();
+	}
+
+	return $working_dir;
+}
diff --git a/templates/hooks--fsmonitor-watchman.sample b/templates/hooks--fsmonitor-watchman.sample
index ef94fa2938..31d2070b2c 100755
--- a/templates/hooks--fsmonitor-watchman.sample
+++ b/templates/hooks--fsmonitor-watchman.sample
@@ -8,58 +8,83 @@ use IPC::Open2;
 # (https://facebook.github.io/watchman/) with git to speed up detecting
 # new and modified files.
 #
-# The hook is passed a version (currently 1) and a time in nanoseconds
-# formatted as a string and outputs to stdout all files that have been
-# modified since the given time. Paths must be relative to the root of
-# the working tree and separated by a single NUL.
+# The hook is passed a version (currently 2) and last update token
+# formatted as a string and outputs to stdout a new update token and
+# all files that have been modified since the update token. Paths must
+# be relative to the root of the working tree and separated by a single NUL.
 #
 # To enable this hook, rename this file to "query-watchman" and set
 # 'git config core.fsmonitor .git/hooks/query-watchman'
 #
-my ($version, $time) = @ARGV;
+my ($version, $last_update_token) = @ARGV;
 
 # Check the hook interface version
-
-if ($version == 1) {
-	# convert nanoseconds to seconds
-	# subtract one second to make sure watchman will return all changes
-	$time = int ($time / 1000000000) - 1;
-} else {
+if ($version ne 2) {
 	die "Unsupported query-fsmonitor hook version '$version'.\n" .
 	    "Falling back to scanning...\n";
 }
 
-my $git_work_tree;
-if ($^O =~ 'msys' || $^O =~ 'cygwin') {
-	$git_work_tree = Win32::GetCwd();
-	$git_work_tree =~ tr/\\/\//;
-} else {
-	require Cwd;
-	$git_work_tree = Cwd::cwd();
-}
+my $git_work_tree = get_working_dir();
 
 my $retry = 1;
 
+my $json_pkg;
+eval {
+	require JSON::XS;
+	$json_pkg = "JSON::XS";
+	1;
+} or do {
+	require JSON::PP;
+	$json_pkg = "JSON::PP";
+};
+
 launch_watchman();
 
 sub launch_watchman {
+	$o = watchman_query();
+	if (is_work_tree_watched($o)) {
+		output_result($o->{clock}, @{$o->{files}});
+	}
+}
+
+sub output_result {
+	my ($clockid, @files) = @_;
+
+	binmode STDOUT, ":utf8";
+	print $clockid;
+	print "\0";
+	local $, = "\0";
+	print @files;
+}
+
+sub watchman_clock {
+	my $response = qx/watchman clock "$git_work_tree"/;
+	die "Failed to get clock id on '$git_work_tree'.\n" .
+		"Falling back to scanning...\n" if $? != 0;
 
+	return $json_pkg->new->utf8->decode($response);
+}
+
+sub watchman_query {
 	my $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'watchman -j --no-pretty')
-	    or die "open2() failed: $!\n" .
-	    "Falling back to scanning...\n";
+	or die "open2() failed: $!\n" .
+	"Falling back to scanning...\n";
 
 	# In the query expression below we're asking for names of files that
-	# changed since $time but were not transient (ie created after
-	# $time but no longer exist).
+	# changed since $last_update_token but not from the .git folder.
 	#
 	# To accomplish this, we're using the "since" generator to use the
 	# recency index to select candidate nodes and "fields" to limit the
-	# output to file names only.
-
+	# output to file names only. Then we're using the "expression" term to
+	# further constrain the results.
+	if (substr($last_update_token, 0, 1) eq "c") {
+		$last_update_token = "\"$last_update_token\"";
+	}
 	my $query = <<"	END";
 		["query", "$git_work_tree", {
-			"since": $time,
-			"fields": ["name"]
+			"since": $last_update_token,
+			"fields": ["name"],
+			"expression": ["not", ["dirname", ".git"]]
 		}]
 	END
 
@@ -68,24 +93,16 @@ sub launch_watchman {
 	my $response = do {local $/; <CHLD_OUT>};
 
 	die "Watchman: command returned no output.\n" .
-	    "Falling back to scanning...\n" if $response eq "";
+	"Falling back to scanning...\n" if $response eq "";
 	die "Watchman: command returned invalid output: $response\n" .
-	    "Falling back to scanning...\n" unless $response =~ /^\{/;
-
-	my $json_pkg;
-	eval {
-		require JSON::XS;
-		$json_pkg = "JSON::XS";
-		1;
-	} or do {
-		require JSON::PP;
-		$json_pkg = "JSON::PP";
-	};
-
-	my $o = $json_pkg->new->utf8->decode($response);
-
-	if ($retry > 0 and $o->{error} and $o->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
-		print STDERR "Adding '$git_work_tree' to watchman's watch list.\n";
+	"Falling back to scanning...\n" unless $response =~ /^\{/;
+
+	return $json_pkg->new->utf8->decode($response);
+}
+
+sub is_work_tree_watched {
+	my ($output) = @_;
+	if ($retry > 0 and $output->{error} and $output->{error} =~ m/unable to resolve root .* directory (.*) is not watched/) {
 		$retry--;
 		qx/watchman watch "$git_work_tree"/;
 		die "Failed to make watchman watch '$git_work_tree'.\n" .
@@ -95,15 +112,34 @@ sub launch_watchman {
 		# return the fast "everything is dirty" flag to git and do the
 		# Watchman query just to get it over with now so we won't pay
 		# the cost in git to look up each individual file.
-		print "/\0";
+		my $o = watchman_clock();
+		$error = $output->{error};
+
+		die "Watchman: $error.\n" .
+		"Falling back to scanning...\n" if $error;
+
+		output_result($o->{clock}, ("/"));
+		$last_update_token = $o->{clock};
+
 		eval { launch_watchman() };
-		exit 0;
+		return 0;
 	}
 
-	die "Watchman: $o->{error}.\n" .
-	    "Falling back to scanning...\n" if $o->{error};
+	die "Watchman: $output->{error}.\n" .
+	"Falling back to scanning...\n" if $output->{error};
 
-	binmode STDOUT, ":utf8";
-	local $, = "\0";
-	print @{$o->{files}};
+	return 1;
+}
+
+sub get_working_dir {
+	my $working_dir;
+	if ($^O =~ 'msys' || $^O =~ 'cygwin') {
+		$working_dir = Win32::GetCwd();
+		$working_dir =~ tr/\\/\//;
+	} else {
+		require Cwd;
+		$working_dir = Cwd::cwd();
+	}
+
+	return $working_dir;
 }
-- 
gitgitgadget


  parent reply	other threads:[~2020-01-07 19:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-07 19:04 [PATCH 0/4] fsmonitor: start using an opaque token for last update Kevin Willford via GitGitGadget
2020-01-07 19:04 ` [PATCH 1/4] fsmonitor: change last update timestamp on the index_state to opaque token Kevin Willford via GitGitGadget
2020-01-07 19:04 ` [PATCH 2/4] fsmonitor: handle version 2 of the hooks that will use " Kevin Willford via GitGitGadget
2020-01-07 19:04 ` Kevin Willford via GitGitGadget [this message]
2020-01-07 19:04 ` [PATCH 4/4] fsmonitor: update documentation for hook version and watchman hooks Kevin Willford via GitGitGadget
2020-01-23 15:26 ` [PATCH v2 0/4] fsmonitor: start using an opaque token for last update Kevin Willford via GitGitGadget
2020-01-23 15:26   ` [PATCH v2 1/4] fsmonitor: change last update timestamp on the index_state to opaque token Kevin Willford via GitGitGadget
2020-02-05 12:49     ` Johannes Schindelin
2020-01-23 15:26   ` [PATCH v2 2/4] fsmonitor: handle version 2 of the hooks that will use " Kevin Willford via GitGitGadget
2020-02-05 13:11     ` Johannes Schindelin
2020-01-23 15:26   ` [PATCH v2 3/4] fsmonitor: add fsmonitor hook scripts for version 2 Kevin Willford via GitGitGadget
2020-01-23 15:26   ` [PATCH v2 4/4] fsmonitor: update documentation for hook version and watchman hooks Kevin Willford via GitGitGadget
2020-02-05 13:15   ` [PATCH v2 0/4] fsmonitor: start using an opaque token for last update Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e1fd5924d0441d1f4d6cce9f504a6a91f234c486.1578423871.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=Kevin.Willford@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.