* [PATCH 00/23] [RFC] Builtin FSMonitor Feature
@ 2021-04-01 15:40 Jeff Hostetler via GitGitGadget
  2021-04-01 15:40 ` [PATCH 01/23] fsmonitor--daemon: man page and documentation Jeff Hostetler via GitGitGadget
                   ` (26 more replies)
  0 siblings, 27 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC
  To: git; +Cc: Jeff Hostetler

This patch series adds a builtin FSMonitor daemon to Git.

This daemon uses platform-specific filesystem notifications to keep track of
changes to a working directory. It also listens over the "Simple IPC"
facility for client requests and responds with a list of files/directories
that have been recently modified.

Client commands, such as git status, already know how to request a list of
modified files via the FSMonitor Hook. This patch series teaches client
commands to talk directly to the daemon via IPC and avoid the overhead of
the hook API. (Hook process creation can be expensive on Windows.)

Since the daemon is a feature of Git, rather than a generic third-party tool
like Watchman, the daemon can format its response to be exactly what the
client needs, so there is no need for a hook process to proxy and reformat
the data. For example, when Watchman is used, Watchman responds in JSON and
the hook process (typically a Perl script) must parse it and convert it into
a simple NUL-delimited list. FSMonitor daemon responses are already in this
NUL-delimited format, so no processing is required.
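
As a rough illustration of why a NUL-delimited list needs no extra
processing, the sketch below walks such a buffer in place. It is not code
from this series: for_each_nul_entry() and its callback type are
hypothetical names chosen for the example, and the sample buffer mentioned
afterwards is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: invoked once per NUL-terminated entry. */
typedef void (*entry_fn)(const char *entry, void *data);

/*
 * Walk a buffer of NUL-terminated strings in place and return the
 * number of complete entries.  Trailing bytes that lack a terminating
 * NUL are ignored.
 */
static size_t for_each_nul_entry(const char *buf, size_t len,
				 entry_fn fn, void *data)
{
	size_t count = 0;
	const char *p, *end;

	if (!len)
		return 0;
	assert(buf != NULL);

	p = buf;
	end = buf + len;
	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			break;
		if (fn)
			fn(p, data);
		count++;
		p = nul + 1;
	}
	return count;
}
```

For a buffer holding "a.txt\0dir/\0", a call such as
for_each_nul_entry(buf, len, NULL, NULL) would visit two entries without
copying or reformatting anything.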

The current daemon implementation is rather simple in that it just records
the set of files/directories that have changed. For example, it is not aware
of specific Git features, such as .gitignore, and doesn't attempt to filter
out ignored files. Having a Git-specific daemon lets us explore such things
in the future.

Finally, having a builtin daemon eliminates the need for users to download
and install a third-party tool. This makes enterprise deployments simpler
since there are fewer parts to install and maintain and fewer updates to track.

This RFC version includes support for Windows and macOS file system events.
A Linux version will be submitted in a later patch series.

This patch series is being previewed as an experimental feature in Git for
Windows v2.31.0.windows.1.

This patch series requires the jh/simple-ipc and jh/fsmonitor-prework patch
series.

Jeff Hostetler (21):
  fsmonitor--daemon: man page and documentation
  fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  fsmonitor--daemon: add a built-in fsmonitor daemon
  fsmonitor--daemon: implement client command options
  fsmonitor-fs-listen-win32: stub in backend for Windows
  fsmonitor-fs-listen-macos: stub in backend for MacOS
  fsmonitor--daemon: implement daemon command options
  fsmonitor--daemon: add pathname classification
  fsmonitor--daemon: define token-ids
  fsmonitor--daemon: create token-based changed path cache
  fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  fsmonitor-fs-listen-macos: add macos header files for FSEvent
  fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  fsmonitor--daemon: implement handle_client callback
  fsmonitor--daemon: periodically truncate list of modified files
  fsmonitor--daemon:: introduce client delay for testing
  fsmonitor--daemon: use a cookie file to sync with file system
  fsmonitor: force update index when fsmonitor token advances
  t7527: create test for fsmonitor--daemon
  p7519: add fsmonitor--daemon
  t7527: test status with untracked-cache and fsmonitor--daemon

Johannes Schindelin (2):
  config: FSMonitor is repository-specific
  fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via
    IPC

 .gitignore                                   |    1 +
 Documentation/config/core.txt                |   45 +-
 Documentation/git-fsmonitor--daemon.txt      |  104 ++
 Documentation/git-update-index.txt           |    4 +-
 Documentation/githooks.txt                   |    3 +-
 Makefile                                     |   15 +
 builtin.h                                    |    1 +
 builtin/fsmonitor--daemon.c                  | 1611 ++++++++++++++++++
 builtin/update-index.c                       |    4 +-
 compat/fsmonitor/fsmonitor-fs-listen-macos.c |  484 ++++++
 compat/fsmonitor/fsmonitor-fs-listen-win32.c |  514 ++++++
 compat/fsmonitor/fsmonitor-fs-listen.h       |   49 +
 config.c                                     |    9 +-
 config.h                                     |    2 +-
 config.mak.uname                             |    4 +
 contrib/buildsystems/CMakeLists.txt          |    8 +
 fsmonitor--daemon.h                          |  142 ++
 fsmonitor-ipc.c                              |  153 ++
 fsmonitor-ipc.h                              |   48 +
 fsmonitor.c                                  |   32 +-
 git.c                                        |    1 +
 help.c                                       |    4 +
 repo-settings.c                              |    3 +
 repository.h                                 |    2 +
 t/perf/p7519-fsmonitor.sh                    |   37 +-
 t/t7527-builtin-fsmonitor.sh                 |  582 +++++++
 26 files changed, 3839 insertions(+), 23 deletions(-)
 create mode 100644 Documentation/git-fsmonitor--daemon.txt
 create mode 100644 builtin/fsmonitor--daemon.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
 create mode 100644 fsmonitor--daemon.h
 create mode 100644 fsmonitor-ipc.c
 create mode 100644 fsmonitor-ipc.h
 create mode 100755 t/t7527-builtin-fsmonitor.sh


base-commit: f1725819714fbcd96c47ae5f14e00cc01045272f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-923%2Fjeffhostetler%2Fbuiltin-fsmonitor-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-923/jeffhostetler/builtin-fsmonitor-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/923
-- 
gitgitgadget


* [PATCH 01/23] fsmonitor--daemon: man page and documentation
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 14:13   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a manual page describing the `git fsmonitor--daemon` feature.

Update references to `core.fsmonitor`, `core.fsmonitorHookVersion` and
pointers to `watchman` to mention the built-in FSMonitor.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Documentation/config/core.txt           |  45 +++++++---
 Documentation/git-fsmonitor--daemon.txt | 104 ++++++++++++++++++++++++
 Documentation/git-update-index.txt      |   4 +-
 Documentation/githooks.txt              |   3 +-
 4 files changed, 144 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/git-fsmonitor--daemon.txt

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index c04f62a54a15..d6e2f01966cb 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -66,18 +66,43 @@ core.fsmonitor::
 	will identify all files that may have changed since the
 	requested date/time. This information is used to speed up git by
 	avoiding unnecessary processing of files that have not changed.
-	See the "fsmonitor-watchman" section of linkgit:githooks[5].
++
+See the "fsmonitor-watchman" section of linkgit:githooks[5].
++
+Note: FSMonitor hooks (and this config setting) are ignored if the
+built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).
 
 core.fsmonitorHookVersion::
-	Sets the version of hook that is to be used when calling fsmonitor.
-	There are currently versions 1 and 2. When this is not set,
-	version 2 will be tried first and if it fails then version 1
-	will be tried. Version 1 uses a timestamp as input to determine
-	which files have changes since that time but some monitors
-	like watchman have race conditions when used with a timestamp.
-	Version 2 uses an opaque string so that the monitor can return
-	something that can be used to determine what files have changed
-	without race conditions.
+	Sets the version of hook that is to be used when calling the
+	FSMonitor hook (as configured via `core.fsmonitor`).
++
+There are currently versions 1 and 2. When this is not set,
+version 2 will be tried first and if it fails then version 1
+will be tried. Version 1 uses a timestamp as input to determine
+which files have changes since that time but some monitors
+like watchman have race conditions when used with a timestamp.
+Version 2 uses an opaque string so that the monitor can return
+something that can be used to determine what files have changed
+without race conditions.
++
+Note: FSMonitor hooks (and this config setting) are ignored if the
+built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).
+
+core.useBuiltinFSMonitor::
+	If set to true, enable the built-in filesystem event watcher (for
+	technical details, see linkgit:git-fsmonitor--daemon[1]).
++
+Like external (hook-based) FSMonitors, the built-in FSMonitor can speed up
+Git commands that need to refresh the Git index (e.g. `git status`) in a
+worktree with many files. The built-in FSMonitor facility eliminates the
+need to install and maintain an external third-party monitoring tool.
++
+The built-in FSMonitor is currently available only on a limited set of
+supported platforms.
++
+Note: if this config setting is set to `true`, any FSMonitor hook
+configured via `core.fsmonitor` (and possibly `core.fsmonitorHookVersion`)
+is ignored.
 
 core.trustctime::
 	If false, the ctime differences between the index and the
diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
new file mode 100644
index 000000000000..b94f57c97fe4
--- /dev/null
+++ b/Documentation/git-fsmonitor--daemon.txt
@@ -0,0 +1,104 @@
+git-fsmonitor--daemon(1)
+========================
+
+NAME
+----
+git-fsmonitor--daemon - Builtin file system monitor daemon
+
+SYNOPSIS
+--------
+[verse]
+'git fsmonitor--daemon' --start
+'git fsmonitor--daemon' --run
+'git fsmonitor--daemon' --stop
+'git fsmonitor--daemon' --is-running
+'git fsmonitor--daemon' --is-supported
+'git fsmonitor--daemon' --query <token>
+'git fsmonitor--daemon' --query-index
+'git fsmonitor--daemon' --flush
+
+DESCRIPTION
+-----------
+
+Monitors files and directories in the working directory for changes using
+platform-specific file system notification facilities.
+
+It communicates directly with commands like `git status` using the
+link:technical/api-simple-ipc.html[simple IPC] interface instead of
+the slower linkgit:githooks[5] interface.
+
+OPTIONS
+-------
+
+--start::
+	Starts the fsmonitor daemon in the background.
+
+--run::
+	Runs the fsmonitor daemon in the foreground.
+
+--stop::
+	Stops the fsmonitor daemon running for the current working
+	directory, if present.
+
+--is-running::
+	Exits with zero status if the fsmonitor daemon is watching the
+	current working directory.
+
+--is-supported::
+	Exits with zero status if the fsmonitor daemon feature is supported
+	on this platform.
+
+--query <token>::
+	Connects to the fsmonitor daemon (starting it if necessary) and
+	requests the list of changed files and directories since the
+	given token.
+	This is intended for testing purposes.
+
+--query-index::
+	Read the current `<token>` from the File System Monitor index
+	extension (if present) and use it to query the fsmonitor daemon.
+	This is intended for testing purposes.
+
+--flush::
+	Force the fsmonitor daemon to flush its in-memory cache and
+	re-sync with the file system.
+	This is intended for testing purposes.
+
+REMARKS
+-------
+The fsmonitor daemon is a long running process that will watch a single
+working directory.  Commands, such as `git status`, should automatically
+start it (if necessary) when `core.useBuiltinFSMonitor` is set to `true`
+(see linkgit:git-config[1]).
+
+Configure the built-in FSMonitor via `core.useBuiltinFSMonitor` in each
+working directory separately, or globally via `git config --global
+core.useBuiltinFSMonitor true`.
+
+Tokens are opaque strings.  They are used by the fsmonitor daemon to
+mark a point in time and the associated internal state.  Callers should
+make no assumptions about the content of the token.  In particular,
+they should not assume that it is a timestamp.
+
+Query commands send a request-token to the daemon and it responds with
+a summary of the changes that have occurred since that token was
+created.  The daemon also returns a response-token that the client can
+use in a future query.
+
+For more information see the "File System Monitor" section in
+linkgit:git-update-index[1].
+
+CAVEATS
+-------
+
+The fsmonitor daemon does not currently know about submodules and does
+not know to filter out file system events that happen within a
+submodule.  If the fsmonitor daemon is watching a super repo and a file is
+modified within the working directory of a submodule, it will report
+the change (as happening against the super repo).  However, the client
+should properly ignore these extra events, so performance may be affected
+but it should not cause an incorrect result.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 2853f168d976..8169aad7ee9f 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
 This feature is intended to speed up git operations for repos that have
 large working directories.
 
-It enables git to work together with a file system monitor (see the
+It enables git to work together with a file system monitor (see
+linkgit:git-fsmonitor--daemon[1]
+and the
 "fsmonitor-watchman" section of linkgit:githooks[5]) that can
 inform it as to what files have been modified. This enables git to avoid
 having to lstat() every file to find modified files.
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index b51959ff9418..b7d5e926f7b0 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -593,7 +593,8 @@ fsmonitor-watchman
 
 This hook is invoked when the configuration option `core.fsmonitor` is
 set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
-depending on the version of the hook to use.
+depending on the version of the hook to use, unless overridden via
+`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).
 
 Version 1 takes two arguments, a version (1) and the time in elapsed
 nanoseconds since midnight, January 1, 1970.
-- 
gitgitgadget



* [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
  2021-04-01 15:40 ` [PATCH 01/23] fsmonitor--daemon: man page and documentation Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 14:31   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 03/23] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create client routines to spawn a fsmonitor daemon and send it an IPC
request using `simple-ipc`.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
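
The connect, spawn-once, and retry flow described above can be sketched in
isolation roughly as follows. This is a simplified model and not the code
in this patch: the three-value `enum conn_state`, `struct client_ops`,
query_with_one_spawn(), and the fake daemon helpers are all hypothetical
names, and the several distinct error states handled by the real routine
are collapsed into one.

```c
#include <assert.h>

/* Simplified stand-ins for the connection states used in this patch. */
enum conn_state { CONN_LISTENING, CONN_PATH_NOT_FOUND, CONN_OTHER_ERROR };

/* Hypothetical hooks replacing the real connect/spawn/send calls. */
struct client_ops {
	enum conn_state (*try_connect)(void *ctx, int wait_if_not_found);
	int (*spawn)(void *ctx);	/* 0 on success */
	int (*send)(void *ctx);		/* 0 on success */
	void *ctx;
};

/*
 * Connect and send; if nothing exists at the path, spawn the daemon
 * once and retry, this time waiting for the endpoint to appear.
 */
static int query_with_one_spawn(const struct client_ops *ops)
{
	int tried_to_spawn = 0;
	int wait_if_not_found = 0;

	assert(ops && ops->try_connect && ops->spawn && ops->send);

	for (;;) {
		switch (ops->try_connect(ops->ctx, wait_if_not_found)) {
		case CONN_LISTENING:
			return ops->send(ops->ctx);
		case CONN_PATH_NOT_FOUND:
			if (tried_to_spawn || ops->spawn(ops->ctx))
				return -1;
			tried_to_spawn = 1;
			wait_if_not_found = 1;
			break;	/* leave the switch and loop again */
		default:
			return -1;
		}
	}
}

/* A fake daemon used only to exercise the loop above. */
struct fake_daemon { int running; int can_start; int spawns; };

static enum conn_state fake_connect(void *ctx, int wait_if_not_found)
{
	struct fake_daemon *d = ctx;

	(void)wait_if_not_found;
	return d->running ? CONN_LISTENING : CONN_PATH_NOT_FOUND;
}

static int fake_spawn(void *ctx)
{
	struct fake_daemon *d = ctx;

	d->spawns++;
	d->running = d->can_start;
	return 0;
}

static int fake_send(void *ctx)
{
	(void)ctx;
	return 0;
}
```

With a fake daemon that is able to start, the query succeeds after exactly
one spawn. With one that never starts listening, the second connection
attempt fails and the function gives up rather than spawning again.
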
 Makefile        |   1 +
 fsmonitor-ipc.c | 153 ++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor-ipc.h |  48 +++++++++++++++
 help.c          |   4 ++
 4 files changed, 206 insertions(+)
 create mode 100644 fsmonitor-ipc.c
 create mode 100644 fsmonitor-ipc.h

diff --git a/Makefile b/Makefile
index a6a73c574191..50977911d41a 100644
--- a/Makefile
+++ b/Makefile
@@ -891,6 +891,7 @@ LIB_OBJS += fetch-pack.o
 LIB_OBJS += fmt-merge-msg.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
+LIB_OBJS += fsmonitor-ipc.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
new file mode 100644
index 000000000000..b0dc334ff02d
--- /dev/null
+++ b/fsmonitor-ipc.c
@@ -0,0 +1,153 @@
+#include "cache.h"
+#include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "trace2.h"
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+#define FSMONITOR_DAEMON_IS_SUPPORTED 1
+#else
+#define FSMONITOR_DAEMON_IS_SUPPORTED 0
+#endif
+
+/*
+ * A trivial function so that this source file always defines at least
+ * one symbol even when the feature is not supported.  This quiets an
+ * annoying compiler error.
+ */
+int fsmonitor_ipc__is_supported(void)
+{
+	return FSMONITOR_DAEMON_IS_SUPPORTED;
+}
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor")
+
+enum ipc_active_state fsmonitor_ipc__get_state(void)
+{
+	return ipc_get_active_state(fsmonitor_ipc__get_path());
+}
+
+static int spawn_daemon(void)
+{
+	const char *args[] = { "fsmonitor--daemon", "--start", NULL };
+
+	return run_command_v_opt_tr2(args, RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
+				    "fsmonitor");
+}
+
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer)
+{
+	int ret = -1;
+	int tried_to_spawn = 0;
+	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
+	struct ipc_client_connection *connection = NULL;
+	struct ipc_client_connect_options options
+		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
+
+	options.wait_if_busy = 1;
+	options.wait_if_not_found = 0;
+
+	trace2_region_enter("fsm_client", "query", NULL);
+
+	trace2_data_string("fsm_client", NULL, "query/command",
+			   since_token);
+
+try_again:
+	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
+				       &connection);
+
+	switch (state) {
+	case IPC_STATE__LISTENING:
+		ret = ipc_client_send_command_to_connection(
+			connection, since_token, answer);
+		ipc_client_close_connection(connection);
+
+		trace2_data_intmax("fsm_client", NULL,
+				   "query/response-length", answer->len);
+
+		if (fsmonitor_is_trivial_response(answer))
+			trace2_data_intmax("fsm_client", NULL,
+					   "query/trivial-response", 1);
+
+		goto done;
+
+	case IPC_STATE__NOT_LISTENING:
+		ret = error(_("fsmonitor_ipc__send_query: daemon not available"));
+		goto done;
+
+	case IPC_STATE__PATH_NOT_FOUND:
+		if (tried_to_spawn)
+			goto done;
+
+		tried_to_spawn++;
+		if (spawn_daemon())
+			goto done;
+
+		/*
+		 * Try again, but this time give the daemon a chance to
+		 * actually create the pipe/socket.
+		 *
+		 * Granted, the daemon just started so it can't possibly have
+		 * any FS cached yet, so we'll always get a trivial answer.
+		 * BUT the answer should include a new token that can serve
+		 * as the basis for subsequent requests.
+		 */
+		options.wait_if_not_found = 1;
+		goto try_again;
+
+	case IPC_STATE__INVALID_PATH:
+		ret = error(_("fsmonitor_ipc__send_query: invalid path '%s'"),
+			    fsmonitor_ipc__get_path());
+		goto done;
+
+	case IPC_STATE__OTHER_ERROR:
+	default:
+		ret = error(_("fsmonitor_ipc__send_query: unspecified error on '%s'"),
+			    fsmonitor_ipc__get_path());
+		goto done;
+	}
+
+done:
+	trace2_region_leave("fsm_client", "query", NULL);
+
+	return ret;
+}
+
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer)
+{
+	struct ipc_client_connection *connection = NULL;
+	struct ipc_client_connect_options options
+		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
+	int ret;
+	enum ipc_active_state state;
+
+	strbuf_reset(answer);
+
+	options.wait_if_busy = 1;
+	options.wait_if_not_found = 0;
+
+	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
+				       &connection);
+	if (state != IPC_STATE__LISTENING) {
+		die("fsmonitor--daemon is not running");
+		return -1;
+	}
+
+	ret = ipc_client_send_command_to_connection(connection, command, answer);
+	ipc_client_close_connection(connection);
+
+	if (ret == -1) {
+		die("could not send '%s' command to fsmonitor--daemon",
+		    command);
+		return -1;
+	}
+
+	return 0;
+}
+
+#endif
diff --git a/fsmonitor-ipc.h b/fsmonitor-ipc.h
new file mode 100644
index 000000000000..7d21c1260151
--- /dev/null
+++ b/fsmonitor-ipc.h
@@ -0,0 +1,48 @@
+#ifndef FSMONITOR_IPC_H
+#define FSMONITOR_IPC_H
+
+/*
+ * Returns true if a filesystem notification backend is defined
+ * for this platform.  This symbol must always be visible and
+ * outside of the HAVE_ ifdef.
+ */
+int fsmonitor_ipc__is_supported(void);
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+#include "run-command.h"
+#include "simple-ipc.h"
+
+/*
+ * Returns the pathname to the IPC named pipe or Unix domain socket
+ * where a `git-fsmonitor--daemon` process will listen.  This is a
+ * per-worktree value.
+ */
+const char *fsmonitor_ipc__get_path(void);
+
+/*
+ * Try to determine whether there is a `git-fsmonitor--daemon` process
+ * listening on the IPC pipe/socket.
+ */
+enum ipc_active_state fsmonitor_ipc__get_state(void);
+
+/*
+ * Connect to a `git-fsmonitor--daemon` process via simple-ipc
+ * and ask for the set of changed files since the given token.
+ *
+ * This DOES NOT use the hook interface.
+ *
+ * Spawn a daemon process in the background if necessary.
+ */
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer);
+
+/*
+ * Connect to a `git-fsmonitor--daemon` process via simple-ipc and
+ * send a command verb.  If no daemon is available, we DO NOT try to
+ * start one.
+ */
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer);
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_IPC_H */
diff --git a/help.c b/help.c
index 3c3bdec21356..e22ba1d246a5 100644
--- a/help.c
+++ b/help.c
@@ -11,6 +11,7 @@
 #include "version.h"
 #include "refs.h"
 #include "parse-options.h"
+#include "fsmonitor-ipc.h"
 
 struct category_description {
 	uint32_t category;
@@ -664,6 +665,9 @@ void get_version_info(struct strbuf *buf, int show_build_options)
 		strbuf_addf(buf, "sizeof-size_t: %d\n", (int)sizeof(size_t));
 		strbuf_addf(buf, "shell-path: %s\n", SHELL_PATH);
 		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
+
+		if (fsmonitor_ipc__is_supported())
+			strbuf_addstr(buf, "feature: fsmonitor--daemon\n");
 	}
 }
 
-- 
gitgitgadget



* [PATCH 03/23] config: FSMonitor is repository-specific
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
  2021-04-01 15:40 ` [PATCH 01/23] fsmonitor--daemon: man page and documentation Jeff Hostetler via GitGitGadget
  2021-04-01 15:40 ` [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Johannes Schindelin via GitGitGadget
  2021-04-01 15:40 ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 237+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-04-01 15:40 UTC
  To: git; +Cc: Jeff Hostetler, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This commit refactors `git_config_get_fsmonitor()` into the `repo_*()`
form that takes a parameter `struct repository *r`.

That change prepares for the upcoming `core.useBuiltinFSMonitor` flag which
will be stored in the `repo_settings` struct.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/update-index.c | 4 ++--
 config.c               | 4 ++--
 config.h               | 2 +-
 fsmonitor.c            | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 79087bccea4b..84793df8b2b6 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1214,14 +1214,14 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (fsmonitor > 0) {
-		if (git_config_get_fsmonitor() == 0)
+		if (repo_config_get_fsmonitor(r) == 0)
 			warning(_("core.fsmonitor is unset; "
 				"set it if you really want to "
 				"enable fsmonitor"));
 		add_fsmonitor(&the_index);
 		report(_("fsmonitor enabled"));
 	} else if (!fsmonitor) {
-		if (git_config_get_fsmonitor() == 1)
+		if (repo_config_get_fsmonitor(r) == 1)
 			warning(_("core.fsmonitor is set; "
 				"remove it if you really want to "
 				"disable fsmonitor"));
diff --git a/config.c b/config.c
index 6428393a4143..955ff4f9461d 100644
--- a/config.c
+++ b/config.c
@@ -2513,9 +2513,9 @@ int git_config_get_max_percent_split_change(void)
 	return -1; /* default value */
 }
 
-int git_config_get_fsmonitor(void)
+int repo_config_get_fsmonitor(struct repository *r)
 {
-	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
+	if (repo_config_get_pathname(r, "core.fsmonitor", &core_fsmonitor))
 		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
 
 	if (core_fsmonitor && !*core_fsmonitor)
diff --git a/config.h b/config.h
index 19a9adbaa9a3..3139de81d986 100644
--- a/config.h
+++ b/config.h
@@ -607,7 +607,7 @@ int git_config_get_index_threads(int *dest);
 int git_config_get_untracked_cache(void);
 int git_config_get_split_index(void);
 int git_config_get_max_percent_split_change(void);
-int git_config_get_fsmonitor(void);
+int repo_config_get_fsmonitor(struct repository *r);
 
 /* This dies if the configured or default date is in the future */
 int git_config_get_expiry(const char *key, const char **output);
diff --git a/fsmonitor.c b/fsmonitor.c
index ab9bfc60b34e..9c9b2abc9414 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -411,7 +411,7 @@ void remove_fsmonitor(struct index_state *istate)
 void tweak_fsmonitor(struct index_state *istate)
 {
 	unsigned int i;
-	int fsmonitor_enabled = git_config_get_fsmonitor();
+	int fsmonitor_enabled = repo_config_get_fsmonitor(istate->repo ? istate->repo : the_repository);
 
 	if (istate->fsmonitor_dirty) {
 		if (fsmonitor_enabled) {
-- 
gitgitgadget



* [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 03/23] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
@ 2021-04-01 15:40 ` Johannes Schindelin via GitGitGadget
  2021-04-26 14:56   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-04-01 15:40 UTC
  To: git; +Cc: Jeff Hostetler, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `core.fsmonitor` setting is supposed to be set to a path pointing
to a script or executable that (via the Hook API) queries an fsmonitor
process such as watchman.

We are about to implement our own fsmonitor backend, and do not want
to spawn hook processes just to query it.  Let's use `Simple IPC` to
directly communicate with the daemon (and start it if necessary),
guarded by the brand-new `core.useBuiltinFSMonitor` toggle.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
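
The precedence that this patch gives the new toggle over `core.fsmonitor`
can be summarized with a small standalone sketch. The names below
(`enum fsm_mode`, choose_fsm_mode(), check_fsm_mode_precedence()) are
hypothetical and do not appear in the patch. The sketch assumes that only
a positive setting enables the built-in daemon and that an empty hook path
counts as unset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of which FSMonitor source a client would use. */
enum fsm_mode { FSM_MODE_DISABLED, FSM_MODE_HOOK, FSM_MODE_BUILTIN };

/*
 * The built-in daemon wins whenever it is enabled; otherwise a
 * non-empty hook path selects the hook; otherwise nothing is used.
 */
static enum fsm_mode choose_fsm_mode(int use_builtin, const char *hook_path)
{
	if (use_builtin > 0)
		return FSM_MODE_BUILTIN;
	if (hook_path && *hook_path)
		return FSM_MODE_HOOK;
	return FSM_MODE_DISABLED;
}

/* Self-check of the precedence rules; call from a test or main(). */
static void check_fsm_mode_precedence(void)
{
	assert(choose_fsm_mode(1, "hook") == FSM_MODE_BUILTIN);
	assert(choose_fsm_mode(0, "hook") == FSM_MODE_HOOK);
	assert(choose_fsm_mode(0, NULL) == FSM_MODE_DISABLED);
}
```

Keeping the decision in one place like this makes it easy to see that a
configured hook is ignored, not merged, when the built-in daemon is enabled.
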
 config.c        |  5 +++++
 fsmonitor.c     | 20 +++++++++++++++++---
 repo-settings.c |  3 +++
 repository.h    |  2 ++
 4 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/config.c b/config.c
index 955ff4f9461d..31f2cbaf6dfb 100644
--- a/config.c
+++ b/config.c
@@ -2515,6 +2515,11 @@ int git_config_get_max_percent_split_change(void)
 
 int repo_config_get_fsmonitor(struct repository *r)
 {
+	if (r->settings.use_builtin_fsmonitor > 0) {
+		core_fsmonitor = "(built-in daemon)";
+		return 1;
+	}
+
 	if (repo_config_get_pathname(r, "core.fsmonitor", &core_fsmonitor))
 		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
 
diff --git a/fsmonitor.c b/fsmonitor.c
index 9c9b2abc9414..d7e18fc8cd47 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -3,6 +3,7 @@
 #include "dir.h"
 #include "ewah/ewok.h"
 #include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
 #include "run-command.h"
 #include "strbuf.h"
 
@@ -148,14 +149,27 @@ void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
 /*
  * Call the query-fsmonitor hook passing the last update token of the saved results.
  */
-static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
+static int query_fsmonitor(int version, struct index_state *istate, struct strbuf *query_result)
 {
+	struct repository *r = istate->repo ? istate->repo : the_repository;
+	const char *last_update = istate->fsmonitor_last_update;
 	struct child_process cp = CHILD_PROCESS_INIT;
 	int result;
 
 	if (!core_fsmonitor)
 		return -1;
 
+	if (r->settings.use_builtin_fsmonitor > 0) {
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+		return fsmonitor_ipc__send_query(last_update, query_result);
+#else
+		/* Fake a trivial response. */
+		warning(_("fsmonitor--daemon unavailable; falling back"));
+		strbuf_add(query_result, "/", 2);
+		return 0;
+#endif
+	}
+
 	strvec_push(&cp.args, core_fsmonitor);
 	strvec_pushf(&cp.args, "%d", version);
 	strvec_pushf(&cp.args, "%s", last_update);
@@ -263,7 +277,7 @@ void refresh_fsmonitor(struct index_state *istate)
 	if (istate->fsmonitor_last_update) {
 		if (hook_version == -1 || hook_version == HOOK_INTERFACE_VERSION2) {
 			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION2,
-				istate->fsmonitor_last_update, &query_result);
+				istate, &query_result);
 
 			if (query_success) {
 				if (hook_version < 0)
@@ -293,7 +307,7 @@ void refresh_fsmonitor(struct index_state *istate)
 
 		if (hook_version == HOOK_INTERFACE_VERSION1) {
 			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION1,
-				istate->fsmonitor_last_update, &query_result);
+				istate, &query_result);
 		}
 
 		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
diff --git a/repo-settings.c b/repo-settings.c
index f7fff0f5ab83..93aab92ff164 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
 		r->settings.core_multi_pack_index = value;
 	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
 
+	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
+		r->settings.use_builtin_fsmonitor = 1;
+
 	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
 		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
 		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
diff --git a/repository.h b/repository.h
index b385ca3c94b6..7eeab871ac3e 100644
--- a/repository.h
+++ b/repository.h
@@ -41,6 +41,8 @@ struct repo_settings {
 	enum fetch_negotiation_setting fetch_negotiation_algorithm;
 
 	int core_multi_pack_index;
+
+	int use_builtin_fsmonitor;
 };
 
 struct repository {
-- 
gitgitgadget



* [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 15:08   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 06/23] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a built-in file system monitoring daemon that can be used by
the existing `fsmonitor` feature (protocol API and index extension)
to improve the performance of various Git commands, such as `status`.

The `fsmonitor--daemon` feature builds upon the `Simple IPC` API and
provides an alternative to hook access to existing fsmonitors such
as `watchman`.

This commit merely adds the new command without any functionality.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 .gitignore                  |  1 +
 Makefile                    |  1 +
 builtin.h                   |  1 +
 builtin/fsmonitor--daemon.c | 52 +++++++++++++++++++++++++++++++++++++
 git.c                       |  1 +
 5 files changed, 56 insertions(+)
 create mode 100644 builtin/fsmonitor--daemon.c

diff --git a/.gitignore b/.gitignore
index 3dcdb6bb5ab8..beccf34abe9e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -71,6 +71,7 @@
 /git-format-patch
 /git-fsck
 /git-fsck-objects
+/git-fsmonitor--daemon
 /git-gc
 /git-get-tar-commit-id
 /git-grep
diff --git a/Makefile b/Makefile
index 50977911d41a..d792631d4250 100644
--- a/Makefile
+++ b/Makefile
@@ -1091,6 +1091,7 @@ BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
 BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
+BUILTIN_OBJS += builtin/fsmonitor--daemon.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
diff --git a/builtin.h b/builtin.h
index b6ce981b7377..7554476f90a4 100644
--- a/builtin.h
+++ b/builtin.h
@@ -158,6 +158,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
 int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
 int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
new file mode 100644
index 000000000000..6700bac92c7d
--- /dev/null
+++ b/builtin/fsmonitor--daemon.c
@@ -0,0 +1,52 @@
+#include "builtin.h"
+#include "config.h"
+#include "parse-options.h"
+#include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
+#include "simple-ipc.h"
+#include "khash.h"
+
+static const char * const builtin_fsmonitor__daemon_usage[] = {
+	NULL
+};
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
+{
+	enum daemon_mode {
+		UNDEFINED_MODE,
+	} mode = UNDEFINED_MODE;
+
+	struct option options[] = {
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	git_config(git_default_config, NULL);
+
+	argc = parse_options(argc, argv, prefix, options,
+			     builtin_fsmonitor__daemon_usage, 0);
+
+	switch (mode) {
+	case UNDEFINED_MODE:
+	default:
+		die(_("Unhandled command mode %d"), mode);
+	}
+}
+
+#else
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
+{
+	struct option options[] = {
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	die(_("fsmonitor--daemon not supported on this platform"));
+}
+#endif
diff --git a/git.c b/git.c
index 9bc077a025cb..239deb9823fc 100644
--- a/git.c
+++ b/git.c
@@ -523,6 +523,7 @@ static struct cmd_struct commands[] = {
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
+	{ "fsmonitor--daemon", cmd_fsmonitor__daemon, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id, NO_PARSEOPT },
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
-- 
gitgitgadget


* [PATCH 06/23] fsmonitor--daemon: implement client command options
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 15:12   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement command options `--stop`, `--is-running`, `--query`,
`--query-index`, and `--flush` to control and query the status of a
`fsmonitor--daemon` server process (and implicitly start a server
process if necessary).

Later commits will implement the actual server and monitor
the file system.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 144 ++++++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 6700bac92c7d..10434bce4b64 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,18 +7,144 @@
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon --stop"),
+	N_("git fsmonitor--daemon --is-running"),
+	N_("git fsmonitor--daemon --query <token>"),
+	N_("git fsmonitor--daemon --query-index"),
+	N_("git fsmonitor--daemon --flush"),
 	NULL
 };
 
 #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+/*
+ * Acting as a CLIENT.
+ *
+ * Send an IPC query to a `git-fsmonitor--daemon` SERVER process and
+ * ask for the changes since the given token.  This will implicitly
+ * start a daemon process if necessary.  The daemon process will
+ * persist after we exit.
+ *
+ * This feature is primarily used by the test suite.
+ */
+static int do_as_client__query_token(const char *token)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_query(token, &answer);
+	if (ret < 0)
+		die(_("could not query fsmonitor--daemon"));
+
+	write_in_full(1, answer.buf, answer.len);
+	strbuf_release(&answer);
+
+	return 0;
+}
+
+/*
+ * Acting as a CLIENT.
+ *
+ * Read the `.git/index` to get the last token written to the FSMonitor index
+ * extension and use that to make a query.
+ *
+ * This feature is primarily used by the test suite.
+ */
+static int do_as_client__query_from_index(void)
+{
+	struct index_state *istate = the_repository->index;
+
+	setup_git_directory();
+	if (do_read_index(istate, the_repository->index_file, 0) < 0)
+		die("unable to read index file");
+	if (!istate->fsmonitor_last_update)
+		die("index file does not have fsmonitor extension");
+
+	return do_as_client__query_token(istate->fsmonitor_last_update);
+}
+
+/*
+ * Acting as a CLIENT.
+ *
+ * Send a "quit" command to the `git-fsmonitor--daemon` (if running)
+ * and wait for it to shut down.
+ */
+static int do_as_client__send_stop(void)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_command("quit", &answer);
+
+	/* The quit command does not return any response data. */
+	strbuf_release(&answer);
+
+	if (ret)
+		return ret;
+
+	trace2_region_enter("fsm_client", "polling-for-daemon-exit", NULL);
+	while (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		sleep_millisec(50);
+	trace2_region_leave("fsm_client", "polling-for-daemon-exit", NULL);
+
+	return 0;
+}
+
+/*
+ * Acting as a CLIENT.
+ *
+ * Send a "flush" command to the `git-fsmonitor--daemon` (if running)
+ * and tell it to flush its cache.
+ *
+ * This feature is primarily used by the test suite to simulate a loss of
+ * sync with the filesystem where we miss kernel events.
+ */
+static int do_as_client__send_flush(void)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_command("flush", &answer);
+	if (ret)
+		return ret;
+
+	write_in_full(1, answer.buf, answer.len);
+	strbuf_release(&answer);
+
+	return 0;
+}
+
+static int is_ipc_daemon_listening(void)
+{
+	return fsmonitor_ipc__get_state() == IPC_STATE__LISTENING;
+}
 
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
 	enum daemon_mode {
 		UNDEFINED_MODE,
+		STOP,
+		IS_RUNNING,
+		QUERY,
+		QUERY_INDEX,
+		FLUSH,
 	} mode = UNDEFINED_MODE;
 
 	struct option options[] = {
+		OPT_CMDMODE(0, "stop", &mode, N_("stop the running daemon"),
+			    STOP),
+
+		OPT_CMDMODE(0, "is-running", &mode,
+			    N_("test whether the daemon is running"),
+			    IS_RUNNING),
+
+		OPT_CMDMODE(0, "query", &mode,
+			    N_("query the daemon (starting if necessary)"),
+			    QUERY),
+		OPT_CMDMODE(0, "query-index", &mode,
+			    N_("query the daemon (starting if necessary) using token from index"),
+			    QUERY_INDEX),
+		OPT_CMDMODE(0, "flush", &mode, N_("flush cached filesystem events"),
+			    FLUSH),
 		OPT_END()
 	};
 
@@ -31,6 +157,24 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 			     builtin_fsmonitor__daemon_usage, 0);
 
 	switch (mode) {
+	case STOP:
+		return !!do_as_client__send_stop();
+
+	case IS_RUNNING:
+		return !is_ipc_daemon_listening();
+
+	case QUERY:
+		if (argc != 1)
+			usage_with_options(builtin_fsmonitor__daemon_usage,
+					   options);
+		return !!do_as_client__query_token(argv[0]);
+
+	case QUERY_INDEX:
+		return !!do_as_client__query_from_index();
+
+	case FLUSH:
+		return !!do_as_client__send_flush();
+
 	case UNDEFINED_MODE:
 	default:
 		die(_("Unhandled command mode %d"), mode);
-- 
gitgitgadget
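
For what it is worth, the `--stop` client in the patch above polls `fsmonitor_ipc__get_state()` every 50 ms until the daemon stops listening, with no upper bound on the wait. Below is a minimal sketch of a bounded variant of that polling pattern, assuming a POSIX system with `nanosleep()`. The names `poll_until()`, `poll_pred_fn`, `sleep_ms()` and `countdown_done()` are hypothetical and exist only for this illustration; the real code uses Git's `sleep_millisec()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate type: returns nonzero once the condition holds. */
typedef int (*poll_pred_fn)(void *ctx);

/* Sleep for roughly `ms` milliseconds using POSIX nanosleep(). */
static void sleep_ms(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	nanosleep(&ts, NULL);
}

/*
 * Call pred(ctx) every interval_ms until it returns nonzero or roughly
 * timeout_ms have been spent sleeping.  Returns 0 on success and -1 on
 * timeout.
 */
int poll_until(poll_pred_fn pred, void *ctx, long timeout_ms, long interval_ms)
{
	long waited = 0;

	assert(pred != NULL && interval_ms > 0);

	for (;;) {
		if (pred(ctx))
			return 0;
		if (waited >= timeout_ms)
			return -1;
		sleep_ms(interval_ms);
		waited += interval_ms;
	}
}

/*
 * Example predicate (made up for this sketch): counts an int down by
 * one per call and reports success once it reaches zero.
 */
int countdown_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0)
		(*remaining)--;
	return *remaining == 0;
}
```

In the daemon case the predicate would wrap the "is it still listening?" check, and the caller could report an error instead of hanging when -1 comes back.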


* [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (5 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 06/23] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 15:23   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 08/23] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Stub in empty backend for fsmonitor--daemon on Windows.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile                                     | 13 ++++++
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
 compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
 config.mak.uname                             |  2 +
 contrib/buildsystems/CMakeLists.txt          |  5 ++
 5 files changed, 90 insertions(+)
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h

diff --git a/Makefile b/Makefile
index d792631d4250..014bc1baa03a 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,11 @@ all::
 # directory, and the JSON compilation database 'compile_commands.json' will be
 # created at the root of the repository.
 #
+# If your platform supports a built-in fsmonitor backend, set
+# FSMONITOR_DAEMON_BACKEND to the name of the corresponding
+# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
+# `fsmonitor_fs_listen__*()` routines.
+#
 # Define DEVELOPER to enable more compiler warnings. Compiler version
 # and family are auto detected, but could be overridden by defining
 # COMPILER_FEATURES (see config.mak.dev). You can still set
@@ -1904,6 +1909,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
 	COMPAT_OBJS += compat/access.o
 endif
 
+ifdef FSMONITOR_DAEMON_BACKEND
+	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
+	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -2761,6 +2771,9 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
+ifdef FSMONITOR_DAEMON_BACKEND
+	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
+endif
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
 endif
diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
new file mode 100644
index 000000000000..880446b49e35
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+#include "config.h"
+#include "fsmonitor.h"
+#include "fsmonitor-fs-listen.h"
+
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
+{
+}
+
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
+{
+	return -1;
+}
+
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
+{
+}
diff --git a/compat/fsmonitor/fsmonitor-fs-listen.h b/compat/fsmonitor/fsmonitor-fs-listen.h
new file mode 100644
index 000000000000..c7b5776b3b60
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen.h
@@ -0,0 +1,49 @@
+#ifndef FSMONITOR_FS_LISTEN_H
+#define FSMONITOR_FS_LISTEN_H
+
+/* This needs to be implemented by each backend */
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+struct fsmonitor_daemon_state;
+
+/*
+ * Initialize platform-specific data for the fsmonitor listener thread.
+ * This will be called from the main thread PRIOR to starting the
+ * fsmonitor_fs_listener thread.
+ *
+ * Returns 0 if successful.
+ * Returns -1 otherwise.
+ */
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state);
+
+/*
+ * Cleanup platform-specific data for the fsmonitor listener thread.
+ * This will be called from the main thread AFTER joining the listener.
+ */
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state);
+
+/*
+ * The main body of the platform-specific event loop to watch for
+ * filesystem events.  This will run in the fsmonitor_fs_listen thread.
+ *
+ * It should call `ipc_server_stop_async()` if the listener thread
+ * prematurely terminates (because of a filesystem error or if it
+ * detects that the .git directory has been deleted).  (It should NOT
+ * do so if the listener thread receives a normal shutdown signal from
+ * the IPC layer.)
+ *
+ * It should set `state->error_code` to -1 if the daemon should exit
+ * with an error.
+ */
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state);
+
+/*
+ * Gently request that the fsmonitor listener thread shut down.
+ * It does not wait for it to stop.  The caller should do a JOIN
+ * to wait for it.
+ */
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state);
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_FS_LISTEN_H */
diff --git a/config.mak.uname b/config.mak.uname
index cb443b4e023a..fcd88b60b14a 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -420,6 +420,7 @@ ifeq ($(uname_S),Windows)
 	# so we don't need this:
 	#
 	#   SNPRINTF_RETURNS_BOGUS = YesPlease
+	FSMONITOR_DAEMON_BACKEND = win32
 	NO_SVN_TESTS = YesPlease
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
@@ -598,6 +599,7 @@ ifneq (,$(findstring MINGW,$(uname_S)))
 	NO_STRTOUMAX = YesPlease
 	NO_MKDTEMP = YesPlease
 	NO_SVN_TESTS = YesPlease
+	FSMONITOR_DAEMON_BACKEND = win32
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
 	NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 9897fcc8ea2a..727cfd561169 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -252,6 +252,11 @@ else()
 	list(APPEND compat_SOURCES compat/simple-ipc/ipc-shared.c compat/simple-ipc/ipc-unix-socket.c)
 endif()
 
+if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
+	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
+	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-win32.c)
+endif()
+
 set(EXE_EXTENSION ${CMAKE_EXECUTABLE_SUFFIX})
 
 #header checks
-- 
gitgitgadget
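
To make the contract documented in `fsmonitor-fs-listen.h` a little more concrete, here is a deliberately simplified, single-threaded sketch of the ctor / loop / stop_async / dtor lifecycle. Everything in it is hypothetical: `struct demo_state` stands in for `struct fsmonitor_daemon_state`, the `demo_listen__*()` names are invented, and the loop takes an array of fake events (a negative value plays the role of a filesystem error), whereas the real `fsmonitor_fs_listen__loop()` takes only the state and runs in its own thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct fsmonitor_daemon_state`. */
struct demo_state {
	int backend_ready;	/* set by the ctor, cleared by the dtor */
	int stop_requested;	/* set by demo_listen__stop_async() */
	int error_code;		/* 0, or -1 if the daemon should exit with an error */
	int events_seen;	/* number of fake events handled by the loop */
};

/* Returns 0 if successful, -1 otherwise (this fake backend always succeeds). */
int demo_listen__ctor(struct demo_state *state)
{
	state->backend_ready = 1;
	return 0;
}

void demo_listen__dtor(struct demo_state *state)
{
	state->backend_ready = 0;
}

/* Only records the request; it does not wait for the loop to finish. */
void demo_listen__stop_async(struct demo_state *state)
{
	state->stop_requested = 1;
}

/*
 * Handle fake events until they run out, a stop is requested, or a
 * negative event (standing in for a filesystem error) is seen, in
 * which case error_code is set to -1.
 */
void demo_listen__loop(struct demo_state *state, const int *events, size_t nr)
{
	size_t i;

	assert(state->backend_ready);

	for (i = 0; i < nr && !state->stop_requested; i++) {
		if (events[i] < 0) {
			state->error_code = -1;
			return;
		}
		state->events_seen++;
	}
}
```

The point of the sketch is only the ordering: the ctor runs before the loop, a stop request is a flag that the loop observes, and the dtor runs last regardless of how the loop ended.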


* [PATCH 08/23] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (6 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-01 15:40 ` [PATCH 09/23] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Stub in empty implementation of fsmonitor--daemon
backend for MacOS.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 20 ++++++++++++++++++++
 config.mak.uname                             |  2 ++
 contrib/buildsystems/CMakeLists.txt          |  3 +++
 3 files changed, 25 insertions(+)
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
new file mode 100644
index 000000000000..b91058d1c4f8
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -0,0 +1,20 @@
+#include "cache.h"
+#include "fsmonitor.h"
+#include "fsmonitor-fs-listen.h"
+
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
+{
+	return -1;
+}
+
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
+{
+}
diff --git a/config.mak.uname b/config.mak.uname
index fcd88b60b14a..394355463e1e 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -147,6 +147,8 @@ ifeq ($(uname_S),Darwin)
 			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
 		endif
 	endif
+	FSMONITOR_DAEMON_BACKEND = macos
+	BASIC_LDFLAGS += -framework CoreServices
 endif
 ifeq ($(uname_S),SunOS)
 	NEEDS_SOCKET = YesPlease
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 727cfd561169..341a85e7bfc9 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -255,6 +255,9 @@ endif()
 if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
 	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-win32.c)
+elseif(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
+	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
+	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-macos.c)
 endif()
 
 set(EXE_EXTENSION ${CMAKE_EXECUTABLE_SUFFIX})
-- 
gitgitgadget


* [PATCH 09/23] fsmonitor--daemon: implement daemon command options
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (7 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 08/23] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 15:47   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 10/23] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement command options `--run` and `--start` to try to
begin listening for file system events.

This version defines the thread structure with a single
fsmonitor_fs_listen thread to watch for file system events
and a simple IPC thread pool to wait for connections from
Git clients over a well-known named pipe or Unix domain
socket.

This version does not actually do anything yet because the
backends are still just stubs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 395 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |  36 ++++
 2 files changed, 430 insertions(+), 1 deletion(-)
 create mode 100644 fsmonitor--daemon.h

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 10434bce4b64..23a063707972 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -3,10 +3,14 @@
 #include "parse-options.h"
 #include "fsmonitor.h"
 #include "fsmonitor-ipc.h"
+#include "compat/fsmonitor/fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
 #include "simple-ipc.h"
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon --start [<options>]"),
+	N_("git fsmonitor--daemon --run [<options>]"),
 	N_("git fsmonitor--daemon --stop"),
 	N_("git fsmonitor--daemon --is-running"),
 	N_("git fsmonitor--daemon --query <token>"),
@@ -16,6 +20,38 @@ static const char * const builtin_fsmonitor__daemon_usage[] = {
 };
 
 #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+/*
+ * Global state loaded from config.
+ */
+#define FSMONITOR__IPC_THREADS "fsmonitor.ipcthreads"
+static int fsmonitor__ipc_threads = 8;
+
+#define FSMONITOR__START_TIMEOUT "fsmonitor.starttimeout"
+static int fsmonitor__start_timeout_sec = 60;
+
+static int fsmonitor_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, FSMONITOR__IPC_THREADS)) {
+		int i = git_config_int(var, value);
+		if (i < 1)
+			return error(_("value of '%s' out of range: %d"),
+				     FSMONITOR__IPC_THREADS, i);
+		fsmonitor__ipc_threads = i;
+		return 0;
+	}
+
+	if (!strcmp(var, FSMONITOR__START_TIMEOUT)) {
+		int i = git_config_int(var, value);
+		if (i < 0)
+			return error(_("value of '%s' out of range: %d"),
+				     FSMONITOR__START_TIMEOUT, i);
+		fsmonitor__start_timeout_sec = i;
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 /*
  * Acting as a CLIENT.
  *
@@ -113,15 +149,350 @@ static int do_as_client__send_flush(void)
 	return 0;
 }
 
+static ipc_server_application_cb handle_client;
+
+static int handle_client(void *data, const char *command,
+			 ipc_server_reply_cb *reply,
+			 struct ipc_server_reply_data *reply_data)
+{
+	/* struct fsmonitor_daemon_state *state = data; */
+	int result;
+
+	trace2_region_enter("fsmonitor", "handle_client", the_repository);
+	trace2_data_string("fsmonitor", the_repository, "request", command);
+
+	result = 0; /* TODO Do something here. */
+
+	trace2_region_leave("fsmonitor", "handle_client", the_repository);
+
+	return result;
+}
+
+static void *fsmonitor_fs_listen__thread_proc(void *_state)
+{
+	struct fsmonitor_daemon_state *state = _state;
+
+	trace2_thread_start("fsm-listen");
+
+	trace_printf_key(&trace_fsmonitor, "Watching: worktree '%s'",
+			 state->path_worktree_watch.buf);
+	if (state->nr_paths_watching > 1)
+		trace_printf_key(&trace_fsmonitor, "Watching: gitdir '%s'",
+				 state->path_gitdir_watch.buf);
+
+	fsmonitor_fs_listen__loop(state);
+
+	trace2_thread_exit();
+	return NULL;
+}
+
+static int fsmonitor_run_daemon_1(struct fsmonitor_daemon_state *state)
+{
+	struct ipc_server_opts ipc_opts = {
+		.nr_threads = fsmonitor__ipc_threads,
+
+		/*
+		 * We know that there are no other active threads yet,
+		 * so we can let the IPC layer temporarily chdir() if
+		 * it needs to when creating the server side of the
+		 * Unix domain socket.
+		 */
+		.uds_disallow_chdir = 0
+	};
+
+	/*
+	 * Start the IPC thread pool before we've started the file
+	 * system event listener thread so that we have the IPC handle
+	 * before we need it.
+	 */
+	if (ipc_server_run_async(&state->ipc_server_data,
+				 fsmonitor_ipc__get_path(), &ipc_opts,
+				 handle_client, state))
+		return error(_("could not start IPC thread pool"));
+
+	/*
+	 * Start the fsmonitor listener thread to collect filesystem
+	 * events.
+	 */
+	if (pthread_create(&state->listener_thread, NULL,
+			   fsmonitor_fs_listen__thread_proc, state)) {
+		ipc_server_stop_async(state->ipc_server_data);
+		ipc_server_await(state->ipc_server_data);
+
+		return error(_("could not start fsmonitor listener thread"));
+	}
+
+	/*
+	 * The daemon is now fully functional in background threads.
+	 * Wait for the IPC thread pool to shut down (whether by client
+	 * request or from filesystem activity).
+	 */
+	ipc_server_await(state->ipc_server_data);
+
+	/*
+	 * The fsmonitor listener thread may have received a shutdown
+	 * event from the IPC thread pool, but it doesn't hurt to tell
+	 * it again.  And wait for it to shut down.
+	 */
+	fsmonitor_fs_listen__stop_async(state);
+	pthread_join(state->listener_thread, NULL);
+
+	return state->error_code;
+}
+
+static int fsmonitor_run_daemon(void)
+{
+	struct fsmonitor_daemon_state state;
+	int err;
+
+	memset(&state, 0, sizeof(state));
+
+	pthread_mutex_init(&state.main_lock, NULL);
+	state.error_code = 0;
+	state.current_token_data = NULL;
+	state.test_client_delay_ms = 0;
+
+	/* Prepare to (recursively) watch the <worktree-root> directory. */
+	strbuf_init(&state.path_worktree_watch, 0);
+	strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree()));
+	state.nr_paths_watching = 1;
+
+	/*
+	 * If ".git" is not a directory, then <gitdir> is not inside the
+	 * cone of <worktree-root>, so set up a second watch for it.
+	 */
+	strbuf_init(&state.path_gitdir_watch, 0);
+	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
+	strbuf_addstr(&state.path_gitdir_watch, "/.git");
+	if (!is_directory(state.path_gitdir_watch.buf)) {
+		strbuf_reset(&state.path_gitdir_watch);
+		strbuf_addstr(&state.path_gitdir_watch, absolute_path(get_git_dir()));
+		state.nr_paths_watching = 2;
+	}
+
+	/*
+	 * Confirm that we can create platform-specific resources for the
+	 * filesystem listener before we bother starting all the threads.
+	 */
+	if (fsmonitor_fs_listen__ctor(&state)) {
+		err = error(_("could not initialize listener thread"));
+		goto done;
+	}
+
+	err = fsmonitor_run_daemon_1(&state);
+
+done:
+	pthread_mutex_destroy(&state.main_lock);
+	fsmonitor_fs_listen__dtor(&state);
+
+	ipc_server_free(state.ipc_server_data);
+
+	strbuf_release(&state.path_worktree_watch);
+	strbuf_release(&state.path_gitdir_watch);
+
+	return err;
+}
+
 static int is_ipc_daemon_listening(void)
 {
 	return fsmonitor_ipc__get_state() == IPC_STATE__LISTENING;
 }
 
+static int try_to_run_foreground_daemon(void)
+{
+	/*
+	 * Technically, we don't need to probe for an existing daemon
+	 * process, since we could just call `fsmonitor_run_daemon()`
+	 * and let it fail if the pipe/socket is busy.
+	 *
+	 * However, this method gives us a nicer error message for a
+	 * common error case.
+	 */
+	if (is_ipc_daemon_listening())
+		die("fsmonitor--daemon is already running.");
+
+	return !!fsmonitor_run_daemon();
+}
+
+#ifndef GIT_WINDOWS_NATIVE
+/*
+ * This is adapted from `daemonize()`.  Use `fork()` to directly create
+ * and run the daemon in a child process.  The fork-parent returns the
+ * child PID so that we can wait for the child to startup before exiting.
+ */
+static int spawn_background_fsmonitor_daemon(pid_t *pid)
+{
+	*pid = fork();
+
+	switch (*pid) {
+	case 0:
+		if (setsid() == -1)
+			error_errno(_("setsid failed"));
+		close(0);
+		close(1);
+		close(2);
+		sanitize_stdfds();
+
+		return !!fsmonitor_run_daemon();
+
+	case -1:
+		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
+
+	default:
+		return 0;
+	}
+}
+#else
+/*
+ * Conceptually like `daemonize()` but different because Windows does not
+ * have `fork(2)`.  Spawn a normal Windows child process but without the
+ * limitations of `start_command()` and `finish_command()`.
+ */
+static int spawn_background_fsmonitor_daemon(pid_t *pid)
+{
+	char git_exe[MAX_PATH];
+	struct strvec args = STRVEC_INIT;
+	int in, out;
+
+	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
+
+	in = open("/dev/null", O_RDONLY);
+	out = open("/dev/null", O_WRONLY);
+
+	strvec_push(&args, git_exe);
+	strvec_push(&args, "fsmonitor--daemon");
+	strvec_push(&args, "--run");
+
+	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
+	close(in);
+	close(out);
+
+	strvec_clear(&args);
+
+	if (*pid < 0)
+		return error(_("could not spawn fsmonitor--daemon in the background"));
+
+	return 0;
+}
+#endif
+
+/*
+ * This is adapted from `wait_or_whine()`.  Watch the child process and
+ * let it get started and begin listening for requests on the socket
+ * before reporting our success.
+ */
+static int wait_for_background_startup(pid_t pid_child)
+{
+	int status;
+	pid_t pid_seen;
+	enum ipc_active_state s;
+	time_t time_limit, now;
+
+	time(&time_limit);
+	time_limit += fsmonitor__start_timeout_sec;
+
+	for (;;) {
+		pid_seen = waitpid(pid_child, &status, WNOHANG);
+
+		if (pid_seen == -1)
+			return error_errno(_("waitpid failed"));
+
+		else if (pid_seen == 0) {
+			/*
+			 * The child is still running (this should be
+			 * the normal case).  Try to connect to it on
+			 * the socket and see if it is ready for
+			 * business.
+			 *
+			 * If there is another daemon already running,
+			 * our child will fail to start (possibly
+			 * after a timeout on the lock), but we don't
+			 * care (who responds) if the socket is live.
+			 */
+			s = fsmonitor_ipc__get_state();
+			if (s == IPC_STATE__LISTENING)
+				return 0;
+
+			time(&now);
+			if (now > time_limit)
+				return error(_("fsmonitor--daemon not online yet"));
+
+			continue;
+		}
+
+		else if (pid_seen == pid_child) {
+			/*
+			 * The new child daemon process shut down while
+			 * it was starting up, so it is not listening
+			 * on the socket.
+			 *
+			 * Try to ping the socket in the odd chance
+			 * that another daemon started (or was already
+			 * running) while our child was starting.
+			 *
+			 * Again, we don't care who services the socket.
+			 */
+			s = fsmonitor_ipc__get_state();
+			if (s == IPC_STATE__LISTENING)
+				return 0;
+
+			/*
+			 * We don't care about the WEXITSTATUS() nor
+			 * any of the WIF*(status) values because
+			 * `cmd_fsmonitor__daemon()` does the `!!result`
+			 * trick on all function return values.
+			 *
+			 * So it is sufficient to just report the
+			 * early shutdown as an error.
+			 */
+			return error(_("fsmonitor--daemon failed to start"));
+		}
+
+		else
+			return error(_("waitpid is confused"));
+	}
+}
+
+static int try_to_start_background_daemon(void)
+{
+	pid_t pid_child;
+	int ret;
+
+	/*
+	 * Before we try to create a background daemon process, see
+	 * if a daemon process is already listening.  This makes it
+	 * easier for us to report an already-listening error to the
+	 * console, since our spawn/daemon can only report the success
+	 * of creating the background process (and not whether it
+	 * immediately exited).
+	 */
+	if (is_ipc_daemon_listening())
+		die("fsmonitor--daemon is already running.");
+
+	/*
+	 * Run the actual daemon in a background process.
+	 */
+	ret = spawn_background_fsmonitor_daemon(&pid_child);
+	if (pid_child <= 0)
+		return ret;
+
+	/*
+	 * Wait (with timeout) for the background child process to get
+	 * started and begin listening on the socket/pipe.  This makes
+	 * the "start" command more synchronous and more reliable in
+	 * tests.
+	 */
+	ret = wait_for_background_startup(pid_child);
+
+	return ret;
+}
+
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
 	enum daemon_mode {
 		UNDEFINED_MODE,
+		START,
+		RUN,
 		STOP,
 		IS_RUNNING,
 		QUERY,
@@ -130,6 +501,11 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 	} mode = UNDEFINED_MODE;
 
 	struct option options[] = {
+		OPT_CMDMODE(0, "start", &mode,
+			    N_("run the daemon in the background"),
+			    START),
+		OPT_CMDMODE(0, "run", &mode,
+			    N_("run the daemon in the foreground"), RUN),
 		OPT_CMDMODE(0, "stop", &mode, N_("stop the running daemon"),
 			    STOP),
 
@@ -145,18 +521,35 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 			    QUERY_INDEX),
 		OPT_CMDMODE(0, "flush", &mode, N_("flush cached filesystem events"),
 			    FLUSH),
+
+		OPT_GROUP(N_("Daemon options")),
+		OPT_INTEGER(0, "ipc-threads",
+			    &fsmonitor__ipc_threads,
+			    N_("use <n> ipc worker threads")),
+		OPT_INTEGER(0, "start-timeout",
+			    &fsmonitor__start_timeout_sec,
+			    N_("Max seconds to wait for background daemon startup")),
 		OPT_END()
 	};
 
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_fsmonitor__daemon_usage, options);
 
-	git_config(git_default_config, NULL);
+	git_config(fsmonitor_config, NULL);
 
 	argc = parse_options(argc, argv, prefix, options,
 			     builtin_fsmonitor__daemon_usage, 0);
+	if (fsmonitor__ipc_threads < 1)
+		die(_("invalid 'ipc-threads' value (%d)"),
+		    fsmonitor__ipc_threads);
 
 	switch (mode) {
+	case START:
+		return !!try_to_start_background_daemon();
+
+	case RUN:
+		return !!try_to_run_foreground_daemon();
+
 	case STOP:
 		return !!do_as_client__send_stop();
 
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
new file mode 100644
index 000000000000..09e4a6fb6675
--- /dev/null
+++ b/fsmonitor--daemon.h
@@ -0,0 +1,36 @@
+#ifndef FSMONITOR_DAEMON_H
+#define FSMONITOR_DAEMON_H
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+#include "cache.h"
+#include "dir.h"
+#include "run-command.h"
+#include "simple-ipc.h"
+#include "thread-utils.h"
+
+struct fsmonitor_batch;
+struct fsmonitor_token_data;
+
+struct fsmonitor_daemon_backend_data; /* opaque platform-specific data */
+
+struct fsmonitor_daemon_state {
+	pthread_t listener_thread;
+	pthread_mutex_t main_lock;
+
+	struct strbuf path_worktree_watch;
+	struct strbuf path_gitdir_watch;
+	int nr_paths_watching;
+
+	struct fsmonitor_token_data *current_token_data;
+
+	int error_code;
+	struct fsmonitor_daemon_backend_data *backend_data;
+
+	struct ipc_server_data *ipc_server_data;
+
+	int test_client_delay_ms;
+};
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget



* [PATCH 10/23] fsmonitor--daemon: add pathname classification
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (8 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 09/23] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 19:17   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 11/23] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to classify relative and absolute
pathnames and decide how they should be handled.  This will
be used by the platform-specific backend to respond to each
filesystem event.

When we register for filesystem notifications on a directory,
we get events for everything (recursively) in the directory.
We want to report to clients changes to tracked and untracked
paths within the working directory.  We do not want to report
changes within the .git directory, for example.

This classification will be used in a later commit by the
different backends to classify paths as events are received.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
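
Illustration only, not part of the patch: a standalone sketch of the
workdir-relative classification described in the commit message.  It
assumes a case-sensitive filesystem and so uses plain strncmp() where the
daemon uses fspathncmp().  All sketch_* / SKETCH_* names are made up for
this example.

```c
#include <string.h>

#define SKETCH_COOKIE_PREFIX ".fsmonitor-daemon-"

/* Hypothetical, trimmed-down mirror of enum fsmonitor_path_type. */
enum sketch_path_type {
	SKETCH_WORKDIR_PATH = 0,
	SKETCH_DOT_GIT,
	SKETCH_INSIDE_DOT_GIT,
	SKETCH_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX,
};

/*
 * Classify a path that is relative to the worktree root.
 * Assumption: case-sensitive comparison (strncmp, not fspathncmp).
 */
enum sketch_path_type sketch_classify_workdir_relative(const char *rel)
{
	if (strncmp(rel, ".git", 4))
		return SKETCH_WORKDIR_PATH;
	rel += 4;

	if (!*rel)
		return SKETCH_DOT_GIT;		/* exactly ".git" */
	if (*rel != '/')
		return SKETCH_WORKDIR_PATH;	/* e.g. ".gitignore" */
	rel++;

	if (!strncmp(rel, SKETCH_COOKIE_PREFIX, strlen(SKETCH_COOKIE_PREFIX)))
		return SKETCH_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;

	return SKETCH_INSIDE_DOT_GIT;
}
```

With this sketch, "src/main.c" and ".gitignore" are workdir paths, ".git"
is the exact directory, ".git/objects/ab" is inside it, and
".git/.fsmonitor-daemon-42" is recognized as a cookie file.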
 builtin/fsmonitor--daemon.c | 81 +++++++++++++++++++++++++++++++++++++
 fsmonitor--daemon.h         | 61 ++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 23a063707972..16252487240a 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -168,6 +168,87 @@ static int handle_client(void *data, const char *command,
 	return result;
 }
 
+#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
+
+enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
+	const char *rel)
+{
+	if (fspathncmp(rel, ".git", 4))
+		return IS_WORKDIR_PATH;
+	rel += 4;
+
+	if (!*rel)
+		return IS_DOT_GIT;
+	if (*rel != '/')
+		return IS_WORKDIR_PATH; /* e.g. .gitignore */
+	rel++;
+
+	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
+			strlen(FSMONITOR_COOKIE_PREFIX)))
+		return IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;
+
+	return IS_INSIDE_DOT_GIT;
+}
+
+enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
+	const char *rel)
+{
+	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
+			strlen(FSMONITOR_COOKIE_PREFIX)))
+		return IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX;
+
+	return IS_INSIDE_GITDIR;
+}
+
+static enum fsmonitor_path_type try_classify_workdir_abs_path(
+	struct fsmonitor_daemon_state *state,
+	const char *path)
+{
+	const char *rel;
+
+	if (fspathncmp(path, state->path_worktree_watch.buf,
+		       state->path_worktree_watch.len))
+		return IS_OUTSIDE_CONE;
+
+	rel = path + state->path_worktree_watch.len;
+
+	if (!*rel)
+		return IS_WORKDIR_PATH; /* it is the root dir exactly */
+	if (*rel != '/')
+		return IS_OUTSIDE_CONE;
+	rel++;
+
+	return fsmonitor_classify_path_workdir_relative(rel);
+}
+
+enum fsmonitor_path_type fsmonitor_classify_path_absolute(
+	struct fsmonitor_daemon_state *state,
+	const char *path)
+{
+	const char *rel;
+	enum fsmonitor_path_type t;
+
+	t = try_classify_workdir_abs_path(state, path);
+	if (state->nr_paths_watching == 1)
+		return t;
+	if (t != IS_OUTSIDE_CONE)
+		return t;
+
+	if (fspathncmp(path, state->path_gitdir_watch.buf,
+		       state->path_gitdir_watch.len))
+		return IS_OUTSIDE_CONE;
+
+	rel = path + state->path_gitdir_watch.len;
+
+	if (!*rel)
+		return IS_GITDIR; /* it is the <gitdir> exactly */
+	if (*rel != '/')
+		return IS_OUTSIDE_CONE;
+	rel++;
+
+	return fsmonitor_classify_path_gitdir_relative(rel);
+}
+
 static void *fsmonitor_fs_listen__thread_proc(void *_state)
 {
 	struct fsmonitor_daemon_state *state = _state;
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 09e4a6fb6675..97ea3766e900 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -32,5 +32,66 @@ struct fsmonitor_daemon_state {
 	int test_client_delay_ms;
 };
 
+/*
+ * Pathname classifications.
+ *
+ * The daemon classifies the pathnames that it receives from file
+ * system notification events into the following categories and uses
+ * that to decide whether clients are told about them.  (And to watch
+ * for file system synchronization events.)
+ *
+ * The client should only care about paths within the working
+ * directory proper (inside the working directory and not ".git" nor
+ * inside of ".git/").  That is, the client has read the index and is
+ * asking for a list of any paths in the working directory that have
+ * been modified since the last token.  The client does not care about
+ * file system changes within the .git directory (such as new loose
+ * objects or packfiles).  So the client will only receive paths that
+ * are classified as IS_WORKDIR_PATH.
+ *
+ * The daemon uses IS_DOT_GIT and IS_GITDIR internally to mean the
+ * exact ".git" directory or GITDIR.  For example, if the daemon
+ * receives a delete event for either of these directories, it will
+ * automatically shut down.
+ *
+ * Note that the daemon DOES NOT explicitly watch nor special case the
+ * ".git/index" file.  The daemon does not read the index and does not
+ * have any internal index-relative state.  The daemon only collects
+ * the set of modified paths within the working directory.
+ */
+enum fsmonitor_path_type {
+	IS_WORKDIR_PATH = 0,
+
+	IS_DOT_GIT,
+	IS_INSIDE_DOT_GIT,
+	IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX,
+
+	IS_GITDIR,
+	IS_INSIDE_GITDIR,
+	IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX,
+
+	IS_OUTSIDE_CONE,
+};
+
+/*
+ * Classify a pathname relative to the root of the working directory.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
+	const char *relative_path);
+
+/*
+ * Classify a pathname relative to a <gitdir> that is external to the
+ * worktree directory.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
+	const char *relative_path);
+
+/*
+ * Classify an absolute pathname received from a filesystem event.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_absolute(
+	struct fsmonitor_daemon_state *state,
+	const char *path);
+
 #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
 #endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget



* [PATCH 11/23] fsmonitor--daemon: define token-ids
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (9 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 10/23] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 19:49   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to create token-ids and define the
overall token naming scheme.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
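
Illustration only, not part of the patch: a standalone sketch of how the
"builtin" ":" <token_id> ":" <sequence_nr> layout could be formatted and
split again.  The sketch_* helpers are hypothetical.  The sketch splits at
the last ':' so that a <token_id> containing colons would still parse,
which is an assumption of this example rather than something the patch
specifies.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write "builtin:<token_id>:<sequence_nr>" into buf. */
int sketch_format_token(char *buf, size_t size,
			const char *token_id, uint64_t seq_nr)
{
	return snprintf(buf, size, "builtin:%s:%" PRIu64, token_id, seq_nr);
}

/*
 * Split a token into its id and sequence number.
 * Returns 0 on success and -1 if the token is malformed.
 */
int sketch_parse_token(const char *token, char *id, size_t id_size,
		       uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *id_start, *last_colon;
	char *end;
	size_t id_len;

	if (strncmp(token, prefix, prefix_len))
		return -1;
	id_start = token + prefix_len;

	last_colon = strrchr(id_start, ':');
	if (!last_colon || last_colon == id_start)
		return -1;
	id_len = (size_t)(last_colon - id_start);
	if (id_len >= id_size)
		return -1;

	/* Require at least one digit and nothing but digits after it. */
	if (last_colon[1] < '0' || last_colon[1] > '9')
		return -1;
	*seq_nr = strtoull(last_colon + 1, &end, 10);
	if (*end)
		return -1;

	memcpy(id, id_start, id_len);
	id[id_len] = '\0';
	return 0;
}

/* Only a token whose id matches the daemon's current id is usable. */
int sketch_token_is_current(const char *token, const char *current_id)
{
	char id[128];
	uint64_t seq_nr;

	if (sketch_parse_token(token, id, sizeof(id), &seq_nr))
		return 0;
	return !strcmp(id, current_id);
}
```

A client token that fails sketch_token_is_current() corresponds to the
"stale token_id" case above, where the daemon gives a trivial response
instead of an incremental one.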
 builtin/fsmonitor--daemon.c | 108 +++++++++++++++++++++++++++++++++++-
 1 file changed, 107 insertions(+), 1 deletion(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 16252487240a..2d25e36601fe 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -149,6 +149,112 @@ static int do_as_client__send_flush(void)
 	return 0;
 }
 
+/*
+ * Requests to and from a FSMonitor Protocol V2 provider use an opaque
+ * "token" as a virtual timestamp.  Clients can request a summary of all
+ * created/deleted/modified files relative to a token.  In the response,
+ * clients receive a new token for the next (relative) request.
+ *
+ *
+ * Token Format
+ * ============
+ *
+ * The contents of the token are private and provider-specific.
+ *
+ * For the built-in fsmonitor--daemon, we define a token as follows:
+ *
+ *     "builtin" ":" <token_id> ":" <sequence_nr>
+ *
+ * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
+ * UUID, or {timestamp,pid}.  It is used to group all filesystem
+ * events that happened while the daemon was monitoring (and in-sync
+ * with the filesystem).
+ *
+ *     Unlike FSMonitor Protocol V1, it is not defined as a timestamp
+ *     and does not define less-than/greater-than relationships.
+ *     (There are too many race conditions to rely on file system
+ *     event timestamps.)
+ *
+ * The <sequence_nr> is a simple integer incremented for each event
+ * received.  When a new <token_id> is created, the <sequence_nr> is
+ * reset to zero.
+ *
+ *
+ * About Token Ids
+ * ===============
+ *
+ * A new token_id is created:
+ *
+ * [1] each time the daemon is started.
+ *
+ * [2] any time that the daemon must re-sync with the filesystem
+ *     (such as when the kernel drops or we miss events on a very
+ *     active volume).
+ *
+ * [3] in response to a client "flush" command (for dropped event
+ *     testing).
+ *
+ * [4] MAYBE We might want to change the token_id after very complex
+ *     filesystem operations are performed, such as a directory move
+ *     sequence that affects many files within.  It might be simpler
+ *     to just give up and fake a re-sync (and let the client do a
+ *     full scan) than try to enumerate the effects of such a change.
+ *
+ * When a new token_id is created, the daemon is free to discard all
+ * cached filesystem events associated with any previous token_ids.
+ * Events associated with a non-current token_id will never be sent
+ * to a client.  A token_id change implicitly means that the daemon
+ * has a gap in its event history.
+ *
+ * Therefore, clients that present a token with a stale (non-current)
+ * token_id will always be given a trivial response.
+ */
+struct fsmonitor_token_data {
+	struct strbuf token_id;
+	struct fsmonitor_batch *batch_head;
+	struct fsmonitor_batch *batch_tail;
+	uint64_t client_ref_count;
+};
+
+static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
+{
+	static int test_env_value = -1;
+	static uint64_t flush_count = 0;
+	struct fsmonitor_token_data *token;
+
+	token = (struct fsmonitor_token_data *)xcalloc(1, sizeof(*token));
+
+	strbuf_init(&token->token_id, 0);
+	token->batch_head = NULL;
+	token->batch_tail = NULL;
+	token->client_ref_count = 0;
+
+	if (test_env_value < 0)
+		test_env_value = git_env_bool("GIT_TEST_FSMONITOR_TOKEN", 0);
+
+	if (!test_env_value) {
+		struct timeval tv;
+		struct tm tm;
+		time_t secs;
+
+		gettimeofday(&tv, NULL);
+		secs = tv.tv_sec;
+		gmtime_r(&secs, &tm);
+
+		strbuf_addf(&token->token_id,
+			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
+			    flush_count++,
+			    getpid(),
+			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+			    tm.tm_hour, tm.tm_min, tm.tm_sec,
+			    (long)tv.tv_usec);
+	} else {
+		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);
+	}
+
+	return token;
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data, const char *command,
@@ -330,7 +436,7 @@ static int fsmonitor_run_daemon(void)
 
 	pthread_mutex_init(&state.main_lock, NULL);
 	state.error_code = 0;
-	state.current_token_data = NULL;
+	state.current_token_data = fsmonitor_new_token_data();
 	state.test_client_delay_ms = 0;
 
 	/* Prepare to (recursively) watch the <worktree-root> directory. */
-- 
gitgitgadget



* [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (10 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 11/23] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 20:22   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to build lists of changed paths and associate
them with a token-id.  This will be used by the platform-specific
backends to accumulate changed paths in response to filesystem events.

The platform-specific event loops receive batches containing one or
more changed paths.  Their fs listener thread will accumulate them in
a `fsmonitor_batch` (and without locking) and then "publish" them to
associate them with the current token and to make them visible to the
client worker threads.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
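
Illustration only, not part of the patch: a standalone sketch of the
publish step described in the commit message.  A new batch is either
folded into an unpinned head batch, or prepended with the next sequence
number.  The sketch_* names are hypothetical.  Locking is omitted (the
caller is assumed to hold the lock that protects the list), and
allocation failures are not handled.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define SKETCH_COMBINE_LIMIT 1024

struct sketch_batch {
	struct sketch_batch *next;
	uint64_t seq_nr;
	const char **paths;	/* borrowed strings; the batch owns only the array */
	size_t nr, alloc;
	time_t pinned_time;	/* nonzero once the batch was sent to a client */
};

struct sketch_batch *sketch_batch_new(void)
{
	return calloc(1, sizeof(struct sketch_batch));
}

void sketch_batch_add(struct sketch_batch *b, const char *path)
{
	if (b->nr == b->alloc) {
		b->alloc = b->alloc ? 2 * b->alloc : 16;
		b->paths = realloc(b->paths, b->alloc * sizeof(*b->paths));
	}
	b->paths[b->nr++] = path;
}

/* Publish `batch` onto the list whose newest element is *head. */
void sketch_publish(struct sketch_batch **head, struct sketch_batch *batch)
{
	struct sketch_batch *h = *head;
	size_t k;

	if (!h) {
		/* First batch for this token. */
		batch->seq_nr = 0;
		batch->next = NULL;
		*head = batch;
	} else if (h->pinned_time ||
		   h->nr + batch->nr > SKETCH_COMBINE_LIMIT) {
		/* Head is visible to clients, or too large: prepend. */
		batch->seq_nr = h->seq_nr + 1;
		batch->next = h;
		*head = batch;
	} else {
		/* Head was never transmitted: fold the new paths into it. */
		for (k = 0; k < batch->nr; k++)
			sketch_batch_add(h, batch->paths[k]);
		free(batch->paths);
		free(batch);	/* this sketch releases the folded batch itself */
	}
}
```

In this sketch the sequence number only advances when a batch is
prepended, so a client that was handed the head batch can later ask for
everything newer than that batch.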
 builtin/fsmonitor--daemon.c | 192 ++++++++++++++++++++++++++++++++++++
 fsmonitor--daemon.h         |  40 ++++++++
 2 files changed, 232 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 2d25e36601fe..48071d445c49 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -255,6 +255,120 @@ static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
 	return token;
 }
 
+struct fsmonitor_batch {
+	struct fsmonitor_batch *next;
+	uint64_t batch_seq_nr;
+	const char **interned_paths;
+	size_t nr, alloc;
+	time_t pinned_time;
+};
+
+struct fsmonitor_batch *fsmonitor_batch__new(void)
+{
+	struct fsmonitor_batch *batch = xcalloc(1, sizeof(*batch));
+
+	return batch;
+}
+
+struct fsmonitor_batch *fsmonitor_batch__free(struct fsmonitor_batch *batch)
+{
+	struct fsmonitor_batch *next;
+
+	if (!batch)
+		return NULL;
+
+	next = batch->next;
+
+	/*
+	 * The actual strings within the array are interned, so we don't
+	 * own them.
+	 */
+	free(batch->interned_paths);
+
+	return next;
+}
+
+void fsmonitor_batch__add_path(struct fsmonitor_batch *batch,
+			       const char *path)
+{
+	const char *interned_path = strintern(path);
+
+	trace_printf_key(&trace_fsmonitor, "event: %s", interned_path);
+
+	ALLOC_GROW(batch->interned_paths, batch->nr + 1, batch->alloc);
+	batch->interned_paths[batch->nr++] = interned_path;
+}
+
+static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
+				     const struct fsmonitor_batch *batch_src)
+{
+	/* assert state->main_lock */
+
+	size_t k;
+
+	ALLOC_GROW(batch_dest->interned_paths,
+		   batch_dest->nr + batch_src->nr + 1,
+		   batch_dest->alloc);
+
+	for (k = 0; k < batch_src->nr; k++)
+		batch_dest->interned_paths[batch_dest->nr++] =
+			batch_src->interned_paths[k];
+}
+
+static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
+{
+	struct fsmonitor_batch *p;
+
+	if (!token)
+		return;
+
+	assert(token->client_ref_count == 0);
+
+	strbuf_release(&token->token_id);
+
+	for (p = token->batch_head; p; p = fsmonitor_batch__free(p))
+		;
+
+	free(token);
+}
+
+/*
+ * Flush all of our cached data about the filesystem.  Call this if we
+ * lose sync with the filesystem and miss some notification events.
+ *
+ * [1] If we are missing events, then we no longer have a complete
+ *     history of the directory (relative to our current start token).
+ *     We should create a new token and start fresh (as if we just
+ *     booted up).
+ *
+ * If there are no readers of the current token data series, we
+ * can free it now.  Otherwise, let the last reader free it.  Either
+ * way, the old token data series is no longer associated with our
+ * state data.
+ */
+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_token_data *free_me = NULL;
+	struct fsmonitor_token_data *new_one = NULL;
+
+	new_one = fsmonitor_new_token_data();
+
+	pthread_mutex_lock(&state->main_lock);
+
+	trace_printf_key(&trace_fsmonitor,
+			 "force resync [old '%s'][new '%s']",
+			 state->current_token_data->token_id.buf,
+			 new_one->token_id.buf);
+
+	if (state->current_token_data->client_ref_count == 0)
+		free_me = state->current_token_data;
+	state->current_token_data = new_one;
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	fsmonitor_free_token_data(free_me);
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data, const char *command,
@@ -355,6 +469,77 @@ enum fsmonitor_path_type fsmonitor_classify_path_absolute(
 	return fsmonitor_classify_path_gitdir_relative(rel);
 }
 
+/*
+ * We try to combine small batches at the front of the batch-list to avoid
+ * having a long list.  This hopefully makes it a little easier when we want
+ * to truncate and maintain the list.  However, we don't want the paths array
+ * to just keep growing and growing with realloc, so we insert an arbitrary
+ * limit.
+ */
+#define MY_COMBINE_LIMIT (1024)
+
+void fsmonitor_publish(struct fsmonitor_daemon_state *state,
+		       struct fsmonitor_batch *batch,
+		       const struct string_list *cookie_names)
+{
+	if (!batch && !cookie_names->nr)
+		return;
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (batch) {
+		struct fsmonitor_batch *head;
+
+		head = state->current_token_data->batch_head;
+		if (!head) {
+			batch->batch_seq_nr = 0;
+			batch->next = NULL;
+			state->current_token_data->batch_head = batch;
+			state->current_token_data->batch_tail = batch;
+		} else if (head->pinned_time) {
+			/*
+			 * We cannot alter the current batch list
+			 * because:
+			 *
+			 * [a] it is being transmitted to at least one
+			 * client and the handle_client() thread has a
+			 * ref-count, but not a lock on the batch list
+			 * starting with this item.
+			 *
+			 * [b] it has been transmitted in the past to
+			 * at least one client such that future
+			 * requests are relative to this head batch.
+			 *
+			 * So, we can only prepend a new batch onto
+			 * the front of the list.
+			 */
+			batch->batch_seq_nr = head->batch_seq_nr + 1;
+			batch->next = head;
+			state->current_token_data->batch_head = batch;
+		} else if (head->nr + batch->nr > MY_COMBINE_LIMIT) {
+			/*
+			 * The head batch in the list has never been
+			 * transmitted to a client, but folding the
+			 * contents of the new batch onto it would
+			 * exceed our arbitrary limit, so just prepend
+			 * the new batch onto the list.
+			 */
+			batch->batch_seq_nr = head->batch_seq_nr + 1;
+			batch->next = head;
+			state->current_token_data->batch_head = batch;
+		} else {
+			/*
+			 * We are free to append the paths in the given
+			 * batch onto the end of the current head batch.
+			 */
+			fsmonitor_batch__combine(head, batch);
+			fsmonitor_batch__free(batch);
+		}
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+}
+
 static void *fsmonitor_fs_listen__thread_proc(void *_state)
 {
 	struct fsmonitor_daemon_state *state = _state;
@@ -369,6 +554,13 @@ static void *fsmonitor_fs_listen__thread_proc(void *_state)
 
 	fsmonitor_fs_listen__loop(state);
 
+	pthread_mutex_lock(&state->main_lock);
+	if (state->current_token_data &&
+	    state->current_token_data->client_ref_count == 0)
+		fsmonitor_free_token_data(state->current_token_data);
+	state->current_token_data = NULL;
+	pthread_mutex_unlock(&state->main_lock);
+
 	trace2_thread_exit();
 	return NULL;
 }
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 97ea3766e900..06563b6ed56c 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -12,6 +12,27 @@
 struct fsmonitor_batch;
 struct fsmonitor_token_data;
 
+/*
+ * Create a new batch of path(s).  The returned batch is considered
+ * private and not linked into the fsmonitor daemon state.  The caller
+ * should fill this batch with one or more paths and then publish it.
+ */
+struct fsmonitor_batch *fsmonitor_batch__new(void);
+
+/*
+ * Free this batch and return the value of the batch->next field.
+ */
+struct fsmonitor_batch *fsmonitor_batch__free(struct fsmonitor_batch *batch);
+
+/*
+ * Add this path to this batch of modified files.
+ *
+ * The batch should be private and NOT (yet) linked into the fsmonitor
+ * daemon state and therefore not yet visible to worker threads and so
+ * no locking is required.
+ */
+void fsmonitor_batch__add_path(struct fsmonitor_batch *batch, const char *path);
+
 struct fsmonitor_daemon_backend_data; /* opaque platform-specific data */
 
 struct fsmonitor_daemon_state {
@@ -93,5 +114,24 @@ enum fsmonitor_path_type fsmonitor_classify_path_absolute(
 	struct fsmonitor_daemon_state *state,
 	const char *path);
 
+/*
+ * Prepend this batch of path(s) onto the list of batches associated
+ * with the current token.  This makes the batch visible to worker threads.
+ *
+ * The caller no longer owns the batch and must not free it.
+ *
+ * Wake up the client threads waiting on these cookies.
+ */
+void fsmonitor_publish(struct fsmonitor_daemon_state *state,
+		       struct fsmonitor_batch *batch,
+		       const struct string_list *cookie_names);
+
+/*
+ * If the platform-specific layer loses sync with the filesystem,
+ * it should call this to invalidate cached data and abort waiting
+ * threads.
+ */
+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state);
+
 #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
 #endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget



* [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (11 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-27 17:22   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach the win32 backend to register a watch on the working tree
root directory (recursively).  Also watch the <gitdir> if it is
not inside the working tree.  Collect path change notifications
into batches and publish them.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
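
Illustration only, not part of the patch: normalize_path_in_utf8() below
pre-reserves roughly 2 bytes per WCHAR and retries on
ERROR_INSUFFICIENT_BUFFER.  The portable sketch here counts the UTF-8
bytes a UTF-16 string needs, to show when that first guess is too small.
The function name is hypothetical, and counting a lone surrogate as 3
bytes is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed to encode n UTF-16 code units. */
size_t sketch_utf8_len_from_utf16(const uint16_t *s, size_t n)
{
	size_t i, len = 0;

	for (i = 0; i < n; i++) {
		uint16_t c = s[i];

		if (c < 0x80) {
			len += 1;	/* ASCII */
		} else if (c < 0x800) {
			len += 2;
		} else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < n &&
			   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
			len += 4;	/* surrogate pair: one code point */
			i++;
		} else {
			len += 3;	/* rest of the BMP, or a lone surrogate */
		}
	}
	return len;
}
```

ASCII and most Latin-script names fit within 2 bytes per code unit, while
code units at or above U+0800 (for example U+20AC) need 3 bytes each, so
names dominated by such characters are the ones that would take the retry
path.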
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 493 +++++++++++++++++++
 1 file changed, 493 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
index 880446b49e35..2f1fcf85a0a4 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -2,20 +2,513 @@
 #include "config.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
+
+/*
+ * The documentation of ReadDirectoryChangesW() states that the maximum
+ * buffer size is 64K when the monitored directory is remote.
+ *
+ * Larger buffers may be used when the monitored directory is local and
+ * will help us receive events faster from the kernel and avoid dropped
+ * events.
+ *
+ * So we try to use a very large buffer and silently fallback to 64K if
+ * we get an error.
+ */
+#define MAX_RDCW_BUF_FALLBACK (65536)
+#define MAX_RDCW_BUF          (65536 * 8)
+
+struct one_watch
+{
+	char buffer[MAX_RDCW_BUF];
+	DWORD buf_len;
+	DWORD count;
+
+	struct strbuf path;
+	HANDLE hDir;
+	HANDLE hEvent;
+	OVERLAPPED overlapped;
+
+	/*
+	 * Is there an active ReadDirectoryChangesW() call pending?  If so, we
+	 * need to later call GetOverlappedResult() and possibly CancelIoEx().
+	 */
+	BOOL is_active;
+};
+
+struct fsmonitor_daemon_backend_data
+{
+	struct one_watch *watch_worktree;
+	struct one_watch *watch_gitdir;
+
+	HANDLE hEventShutdown;
+
+	HANDLE hListener[3]; /* we don't own these handles */
+#define LISTENER_SHUTDOWN 0
+#define LISTENER_HAVE_DATA_WORKTREE 1
+#define LISTENER_HAVE_DATA_GITDIR 2
+	int nr_listener_handles;
+};
+
+/*
+ * Convert the WCHAR path from the notification into UTF8 and
+ * then normalize it.
+ */
+static int normalize_path_in_utf8(FILE_NOTIFY_INFORMATION *info,
+				  struct strbuf *normalized_path)
+{
+	int reserve;
+	int len = 0;
+
+	strbuf_reset(normalized_path);
+	if (!info->FileNameLength)
+		goto normalize;
+
+	/*
+	 * Pre-reserve enough space in the UTF8 buffer for
+	 * each Unicode WCHAR character to be mapped into a
+	 * sequence of 2 UTF-8 bytes.  That should let us
+	 * avoid ERROR_INSUFFICIENT_BUFFER 99.9+% of the time.
+	 */
+	reserve = info->FileNameLength + 1;
+	strbuf_grow(normalized_path, reserve);
+
+	for (;;) {
+		len = WideCharToMultiByte(CP_UTF8, 0, info->FileName,
+					  info->FileNameLength / sizeof(WCHAR),
+					  normalized_path->buf,
+					  strbuf_avail(normalized_path) - 1,
+					  NULL, NULL);
+		if (len > 0)
+			goto normalize;
+		if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
+			error("[GLE %ld] could not convert path to UTF-8: '%.*ls'",
+			      GetLastError(),
+			      (int)(info->FileNameLength / sizeof(WCHAR)),
+			      info->FileName);
+			return -1;
+		}
+
+		strbuf_grow(normalized_path,
+			    strbuf_avail(normalized_path) + reserve);
+	}
+
+normalize:
+	strbuf_setlen(normalized_path, len);
+	return strbuf_normalize_path(normalized_path);
+}
 
 void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
 {
+	SetEvent(state->backend_data->hListener[LISTENER_SHUTDOWN]);
+}
+
+static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
+				      const char *path)
+{
+	struct one_watch *watch = NULL;
+	DWORD desired_access = FILE_LIST_DIRECTORY;
+	DWORD share_mode =
+		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
+	HANDLE hDir;
+
+	hDir = CreateFileA(path,
+			   desired_access, share_mode, NULL, OPEN_EXISTING,
+			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
+			   NULL);
+	if (hDir == INVALID_HANDLE_VALUE) {
+		error(_("[GLE %ld] could not watch '%s'"),
+		      GetLastError(), path);
+		return NULL;
+	}
+
+	watch = xcalloc(1, sizeof(*watch));
+
+	watch->buf_len = sizeof(watch->buffer); /* assume full MAX_RDCW_BUF */
+
+	strbuf_init(&watch->path, 0);
+	strbuf_addstr(&watch->path, path);
+
+	watch->hDir = hDir;
+	watch->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
+
+	return watch;
+}
+
+static void destroy_watch(struct one_watch *watch)
+{
+	if (!watch)
+		return;
+
+	strbuf_release(&watch->path);
+	if (watch->hDir != INVALID_HANDLE_VALUE)
+		CloseHandle(watch->hDir);
+	if (watch->hEvent != INVALID_HANDLE_VALUE)
+		CloseHandle(watch->hEvent);
+
+	free(watch);
+}
+
+static int start_rdcw_watch(struct fsmonitor_daemon_backend_data *data,
+			    struct one_watch *watch)
+{
+	DWORD dwNotifyFilter =
+		FILE_NOTIFY_CHANGE_FILE_NAME |
+		FILE_NOTIFY_CHANGE_DIR_NAME |
+		FILE_NOTIFY_CHANGE_ATTRIBUTES |
+		FILE_NOTIFY_CHANGE_SIZE |
+		FILE_NOTIFY_CHANGE_LAST_WRITE |
+		FILE_NOTIFY_CHANGE_CREATION;
+
+	ResetEvent(watch->hEvent);
+
+	memset(&watch->overlapped, 0, sizeof(watch->overlapped));
+	watch->overlapped.hEvent = watch->hEvent;
+
+start_watch:
+	watch->is_active = ReadDirectoryChangesW(
+		watch->hDir, watch->buffer, watch->buf_len, TRUE,
+		dwNotifyFilter, &watch->count, &watch->overlapped, NULL);
+
+	if (!watch->is_active &&
+	    GetLastError() == ERROR_INVALID_PARAMETER &&
+	    watch->buf_len > MAX_RDCW_BUF_FALLBACK) {
+		watch->buf_len = MAX_RDCW_BUF_FALLBACK;
+		goto start_watch;
+	}
+
+	if (watch->is_active)
+		return 0;
+
+	error("ReadDirectoryChangesW failed on '%s' [GLE %ld]",
+	      watch->path.buf, GetLastError());
+	return -1;
+}
+
+static int recv_rdcw_watch(struct one_watch *watch)
+{
+	watch->is_active = FALSE;
+
+	if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count,
+				TRUE))
+		return 0;
+
+	// TODO If an external <gitdir> is deleted, the above returns an error.
+	// TODO I'm not sure that there's anything that we can do here other
+	// TODO than failing -- the <worktree>/.git link file would be broken
+	// TODO anyway.  We might try to check for that and return a better
+	// TODO error message.
+
+	error("GetOverlappedResult failed on '%s' [GLE %ld]",
+	      watch->path.buf, GetLastError());
+	return -1;
+}
+
+static void cancel_rdcw_watch(struct one_watch *watch)
+{
+	DWORD count;
+
+	if (!watch || !watch->is_active)
+		return;
+
+	CancelIoEx(watch->hDir, &watch->overlapped);
+	GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE);
+	watch->is_active = FALSE;
+}
+
+/*
+ * Process filesystem events that happen anywhere (recursively) under the
+ * <worktree> root directory.  For a normal working directory, this includes
+ * both version controlled files and the contents of the .git/ directory.
+ *
+ * If <worktree>/.git is a file, then we only see events for the file
+ * itself.
+ */
+static int process_worktree_events(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	struct one_watch *watch = data->watch_worktree;
+	struct strbuf path = STRBUF_INIT;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	struct fsmonitor_batch *batch = NULL;
+	const char *p = watch->buffer;
+
+	/*
+	 * If the kernel gets more events than will fit in the kernel
+	 * buffer associated with our RDCW handle, it drops them and
+	 * returns a count of zero.  (A successful call, but with
+	 * length zero.)
+	 */
+	if (!watch->count) {
+		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
+				   "overflow");
+		fsmonitor_force_resync(state);
+		return LISTENER_HAVE_DATA_WORKTREE;
+	}
+
+	/*
+	 * On Windows, `info` contains an "array" of paths that are
+	 * relative to the root of whichever directory handle received
+	 * the event.
+	 */
+	for (;;) {
+		FILE_NOTIFY_INFORMATION *info = (void *)p;
+		const char *slash;
+		enum fsmonitor_path_type t;
+
+		strbuf_reset(&path);
+		if (normalize_path_in_utf8(info, &path) == -1)
+			goto skip_this_path;
+
+		t = fsmonitor_classify_path_workdir_relative(path.buf);
+
+		switch (t) {
+		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+			/* special case cookie files within .git */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path.buf);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path.buf);
+			break;
+
+		case IS_INSIDE_DOT_GIT:
+			/* ignore everything inside of "<worktree>/.git/" */
+			break;
+
+		case IS_DOT_GIT:
+			/* "<worktree>/.git" was deleted (or renamed away) */
+			if ((info->Action == FILE_ACTION_REMOVED) ||
+			    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
+				trace2_data_string("fsmonitor", NULL,
+						   "fsm-listen/dotgit",
+						   "removed");
+				goto force_shutdown;
+			}
+			break;
+
+		case IS_WORKDIR_PATH:
+			/* queue normal pathname */
+			if (!batch)
+				batch = fsmonitor_batch__new();
+			fsmonitor_batch__add_path(batch, path.buf);
+			break;
+
+		case IS_GITDIR:
+		case IS_INSIDE_GITDIR:
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+		default:
+			BUG("unexpected path classification '%d' for '%s'",
+			    t, path.buf);
+			goto skip_this_path;
+		}
+
+skip_this_path:
+		if (!info->NextEntryOffset)
+			break;
+		p += info->NextEntryOffset;
+	}
+
+	fsmonitor_publish(state, batch, &cookie_list);
+	batch = NULL;
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_HAVE_DATA_WORKTREE;
+
+force_shutdown:
+	fsmonitor_batch__free(batch);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_SHUTDOWN;
+}
+
+/*
+ * Process filesystem events that happen anywhere (recursively) under the
+ * external <gitdir> (such as non-primary worktrees or submodules).
+ * We only care about cookie files that our client threads created here.
+ *
+ * Note that we DO NOT get filesystem events on the external <gitdir>
+ * itself (it is not inside something that we are watching).  In particular,
+ * we do not get an event if the external <gitdir> is deleted.
+ */
+static int process_gitdir_events(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	struct one_watch *watch = data->watch_gitdir;
+	struct strbuf path = STRBUF_INIT;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	const char *p = watch->buffer;
+
+	if (!watch->count) {
+		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
+				   "overflow");
+		fsmonitor_force_resync(state);
+		return LISTENER_HAVE_DATA_GITDIR;
+	}
+
+	for (;;) {
+		FILE_NOTIFY_INFORMATION *info = (void *)p;
+		const char *slash;
+		enum fsmonitor_path_type t;
+
+		strbuf_reset(&path);
+		if (normalize_path_in_utf8(info, &path) == -1)
+			goto skip_this_path;
+
+		t = fsmonitor_classify_path_gitdir_relative(path.buf);
+
+		trace_printf_key(&trace_fsmonitor, "BBB: %s", path.buf);
+
+		switch (t) {
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+			/* special case cookie files within gitdir */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path.buf);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path.buf);
+			break;
+
+		case IS_INSIDE_GITDIR:
+			goto skip_this_path;
+
+		default:
+			BUG("unexpected path classification '%d' for '%s'",
+			    t, path.buf);
+			goto skip_this_path;
+		}
+
+skip_this_path:
+		if (!info->NextEntryOffset)
+			break;
+		p += info->NextEntryOffset;
+	}
+
+	fsmonitor_publish(state, NULL, &cookie_list);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_HAVE_DATA_GITDIR;
 }
 
 void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	DWORD dwWait;
+
+	state->error_code = 0;
+
+	if (start_rdcw_watch(data, data->watch_worktree) == -1)
+		goto force_error_stop;
+
+	if (data->watch_gitdir &&
+	    start_rdcw_watch(data, data->watch_gitdir) == -1)
+		goto force_error_stop;
+
+	for (;;) {
+		dwWait = WaitForMultipleObjects(data->nr_listener_handles,
+						data->hListener,
+						FALSE, INFINITE);
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_WORKTREE) {
+			if (recv_rdcw_watch(data->watch_worktree) == -1)
+				goto force_error_stop;
+			if (process_worktree_events(state) == LISTENER_SHUTDOWN)
+				goto force_shutdown;
+			if (start_rdcw_watch(data, data->watch_worktree) == -1)
+				goto force_error_stop;
+			continue;
+		}
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_GITDIR) {
+			if (recv_rdcw_watch(data->watch_gitdir) == -1)
+				goto force_error_stop;
+			if (process_gitdir_events(state) == LISTENER_SHUTDOWN)
+				goto force_shutdown;
+			if (start_rdcw_watch(data, data->watch_gitdir) == -1)
+				goto force_error_stop;
+			continue;
+		}
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_SHUTDOWN)
+			goto clean_shutdown;
+
+		error(_("could not read directory changes [GLE %ld]"),
+		      GetLastError());
+		goto force_error_stop;
+	}
+
+force_error_stop:
+	state->error_code = -1;
+
+force_shutdown:
+	/*
+	 * Tell the IPC thread pool to stop (which completes the await
+	 * in the main thread (which will also signal this thread (if
+	 * we are still alive))).
+	 */
+	ipc_server_stop_async(state->ipc_server_data);
+
+clean_shutdown:
+	cancel_rdcw_watch(data->watch_worktree);
+	cancel_rdcw_watch(data->watch_gitdir);
 }
 
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = xcalloc(1, sizeof(*data));
+
+	data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL);
+
+	data->watch_worktree = create_watch(state,
+					    state->path_worktree_watch.buf);
+	if (!data->watch_worktree)
+		goto failed;
+
+	if (state->nr_paths_watching > 1) {
+		data->watch_gitdir = create_watch(state,
+						  state->path_gitdir_watch.buf);
+		if (!data->watch_gitdir)
+			goto failed;
+	}
+
+	data->hListener[LISTENER_SHUTDOWN] = data->hEventShutdown;
+	data->nr_listener_handles++;
+
+	data->hListener[LISTENER_HAVE_DATA_WORKTREE] =
+		data->watch_worktree->hEvent;
+	data->nr_listener_handles++;
+
+	if (data->watch_gitdir) {
+		data->hListener[LISTENER_HAVE_DATA_GITDIR] =
+			data->watch_gitdir->hEvent;
+		data->nr_listener_handles++;
+	}
+
+	state->backend_data = data;
+	return 0;
+
+failed:
+	CloseHandle(data->hEventShutdown);
+	destroy_watch(data->watch_worktree);
+	destroy_watch(data->watch_gitdir);
+
 	return -1;
 }
 
 void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	if (!state || !state->backend_data)
+		return;
+
+	data = state->backend_data;
+
+	CloseHandle(data->hEventShutdown);
+	destroy_watch(data->watch_worktree);
+	destroy_watch(data->watch_gitdir);
+
+	FREE_AND_NULL(state->backend_data);
 }
-- 
gitgitgadget



* [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (12 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-27 18:13   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Include MacOS system declarations to allow us to use FSEvent and
CoreFoundation APIs.  We need GCC and clang versions because of
compiler and header file conflicts.

While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:

In file included
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Security.h:42,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/CSIdentity.h:43,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/OSServices.h:29,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/IconsCore.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/LaunchServices.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
     /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Authorization.h:193:7: error: variably modified 'bytes' at file scope
       193 | char bytes[kAuthorizationExternalFormLength];
           |      ^~~~~

The underlying reason is that GCC (rightfully) objects that an `enum`
value such as `kAuthorizationExternalFormLength` is not a constant
(because it is not, the preprocessor has no knowledge of it, only the
actual C compiler does) and can therefore not be used to define the size
of a C array.

This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082

In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid the above-mentioned issue.

Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 96 ++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
index b91058d1c4f8..bec5130d9e1d 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -1,3 +1,99 @@
+#if defined(__GNUC__) && !defined(__clang__)
+/*
+ * It is possible to #include CoreFoundation/CoreFoundation.h when compiling
+ * with clang, but not with GCC as of time of writing.
+ *
+ * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082 for details.
+ */
+typedef unsigned int FSEventStreamCreateFlags;
+#define kFSEventStreamEventFlagNone               0x00000000
+#define kFSEventStreamEventFlagMustScanSubDirs    0x00000001
+#define kFSEventStreamEventFlagUserDropped        0x00000002
+#define kFSEventStreamEventFlagKernelDropped      0x00000004
+#define kFSEventStreamEventFlagEventIdsWrapped    0x00000008
+#define kFSEventStreamEventFlagHistoryDone        0x00000010
+#define kFSEventStreamEventFlagRootChanged        0x00000020
+#define kFSEventStreamEventFlagMount              0x00000040
+#define kFSEventStreamEventFlagUnmount            0x00000080
+#define kFSEventStreamEventFlagItemCreated        0x00000100
+#define kFSEventStreamEventFlagItemRemoved        0x00000200
+#define kFSEventStreamEventFlagItemInodeMetaMod   0x00000400
+#define kFSEventStreamEventFlagItemRenamed        0x00000800
+#define kFSEventStreamEventFlagItemModified       0x00001000
+#define kFSEventStreamEventFlagItemFinderInfoMod  0x00002000
+#define kFSEventStreamEventFlagItemChangeOwner    0x00004000
+#define kFSEventStreamEventFlagItemXattrMod       0x00008000
+#define kFSEventStreamEventFlagItemIsFile         0x00010000
+#define kFSEventStreamEventFlagItemIsDir          0x00020000
+#define kFSEventStreamEventFlagItemIsSymlink      0x00040000
+#define kFSEventStreamEventFlagOwnEvent           0x00080000
+#define kFSEventStreamEventFlagItemIsHardlink     0x00100000
+#define kFSEventStreamEventFlagItemIsLastHardlink 0x00200000
+#define kFSEventStreamEventFlagItemCloned         0x00400000
+
+typedef struct __FSEventStream *FSEventStreamRef;
+typedef const FSEventStreamRef ConstFSEventStreamRef;
+
+typedef unsigned int CFStringEncoding;
+#define kCFStringEncodingUTF8 0x08000100
+
+typedef const struct __CFString *CFStringRef;
+typedef const struct __CFArray *CFArrayRef;
+typedef const struct __CFRunLoop *CFRunLoopRef;
+
+struct FSEventStreamContext {
+    long long version;
+    void *cb_data, *retain, *release, *copy_description;
+};
+
+typedef struct FSEventStreamContext FSEventStreamContext;
+typedef unsigned int FSEventStreamEventFlags;
+#define kFSEventStreamCreateFlagNoDefer 0x02
+#define kFSEventStreamCreateFlagWatchRoot 0x04
+#define kFSEventStreamCreateFlagFileEvents 0x10
+
+typedef unsigned long long FSEventStreamEventId;
+#define kFSEventStreamEventIdSinceNow 0xFFFFFFFFFFFFFFFFULL
+
+typedef void (*FSEventStreamCallback)(ConstFSEventStreamRef streamRef,
+				      void *context,
+				      __SIZE_TYPE__ num_of_events,
+				      void *event_paths,
+				      const FSEventStreamEventFlags event_flags[],
+				      const FSEventStreamEventId event_ids[]);
+typedef double CFTimeInterval;
+FSEventStreamRef FSEventStreamCreate(void *allocator,
+				     FSEventStreamCallback callback,
+				     FSEventStreamContext *context,
+				     CFArrayRef paths_to_watch,
+				     FSEventStreamEventId since_when,
+				     CFTimeInterval latency,
+				     FSEventStreamCreateFlags flags);
+CFStringRef CFStringCreateWithCString(void *allocator, const char *string,
+				      CFStringEncoding encoding);
+CFArrayRef CFArrayCreate(void *allocator, const void **items, long long count,
+			 void *callbacks);
+void CFRunLoopRun(void);
+void CFRunLoopStop(CFRunLoopRef run_loop);
+CFRunLoopRef CFRunLoopGetCurrent(void);
+extern CFStringRef kCFRunLoopDefaultMode;
+void FSEventStreamScheduleWithRunLoop(FSEventStreamRef stream,
+				      CFRunLoopRef run_loop,
+				      CFStringRef run_loop_mode);
+unsigned char FSEventStreamStart(FSEventStreamRef stream);
+void FSEventStreamStop(FSEventStreamRef stream);
+void FSEventStreamInvalidate(FSEventStreamRef stream);
+void FSEventStreamRelease(FSEventStreamRef stream);
+#else
+/*
+ * Let Apple's headers declare `isalnum()` first, before
+ * Git's headers override it via a constant
+ */
+#include <string.h>
+#include <CoreFoundation/CoreFoundation.h>
+#include <CoreServices/CoreServices.h>
+#endif
+
 #include "cache.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
-- 
gitgitgadget



* [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (13 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-27 18:35   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 16/23] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
                   ` (11 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement file system event listener on MacOS using FSEvent,
CoreFoundation, and CoreServices.

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 368 +++++++++++++++++++
 1 file changed, 368 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
index bec5130d9e1d..e055fb579cc4 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -97,20 +97,388 @@ void FSEventStreamRelease(FSEventStreamRef stream);
 #include "cache.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
+
+struct fsmonitor_daemon_backend_data
+{
+	CFStringRef cfsr_worktree_path;
+	CFStringRef cfsr_gitdir_path;
+
+	CFArrayRef cfar_paths_to_watch;
+	int nr_paths_watching;
+
+	FSEventStreamRef stream;
+
+	CFRunLoopRef rl;
+
+	enum shutdown_style {
+		SHUTDOWN_EVENT = 0,
+		FORCE_SHUTDOWN,
+		FORCE_ERROR_STOP,
+	} shutdown_style;
+
+	unsigned int stream_scheduled:1;
+	unsigned int stream_started:1;
+};
+
+static void log_flags_set(const char *path, const FSEventStreamEventFlags flag)
+{
+	struct strbuf msg = STRBUF_INIT;
+
+	if (flag & kFSEventStreamEventFlagMustScanSubDirs)
+		strbuf_addstr(&msg, "MustScanSubDirs|");
+	if (flag & kFSEventStreamEventFlagUserDropped)
+		strbuf_addstr(&msg, "UserDropped|");
+	if (flag & kFSEventStreamEventFlagKernelDropped)
+		strbuf_addstr(&msg, "KernelDropped|");
+	if (flag & kFSEventStreamEventFlagEventIdsWrapped)
+		strbuf_addstr(&msg, "EventIdsWrapped|");
+	if (flag & kFSEventStreamEventFlagHistoryDone)
+		strbuf_addstr(&msg, "HistoryDone|");
+	if (flag & kFSEventStreamEventFlagRootChanged)
+		strbuf_addstr(&msg, "RootChanged|");
+	if (flag & kFSEventStreamEventFlagMount)
+		strbuf_addstr(&msg, "Mount|");
+	if (flag & kFSEventStreamEventFlagUnmount)
+		strbuf_addstr(&msg, "Unmount|");
+	if (flag & kFSEventStreamEventFlagItemChangeOwner)
+		strbuf_addstr(&msg, "ItemChangeOwner|");
+	if (flag & kFSEventStreamEventFlagItemCreated)
+		strbuf_addstr(&msg, "ItemCreated|");
+	if (flag & kFSEventStreamEventFlagItemFinderInfoMod)
+		strbuf_addstr(&msg, "ItemFinderInfoMod|");
+	if (flag & kFSEventStreamEventFlagItemInodeMetaMod)
+		strbuf_addstr(&msg, "ItemInodeMetaMod|");
+	if (flag & kFSEventStreamEventFlagItemIsDir)
+		strbuf_addstr(&msg, "ItemIsDir|");
+	if (flag & kFSEventStreamEventFlagItemIsFile)
+		strbuf_addstr(&msg, "ItemIsFile|");
+	if (flag & kFSEventStreamEventFlagItemIsHardlink)
+		strbuf_addstr(&msg, "ItemIsHardlink|");
+	if (flag & kFSEventStreamEventFlagItemIsLastHardlink)
+		strbuf_addstr(&msg, "ItemIsLastHardlink|");
+	if (flag & kFSEventStreamEventFlagItemIsSymlink)
+		strbuf_addstr(&msg, "ItemIsSymlink|");
+	if (flag & kFSEventStreamEventFlagItemModified)
+		strbuf_addstr(&msg, "ItemModified|");
+	if (flag & kFSEventStreamEventFlagItemRemoved)
+		strbuf_addstr(&msg, "ItemRemoved|");
+	if (flag & kFSEventStreamEventFlagItemRenamed)
+		strbuf_addstr(&msg, "ItemRenamed|");
+	if (flag & kFSEventStreamEventFlagItemXattrMod)
+		strbuf_addstr(&msg, "ItemXattrMod|");
+	if (flag & kFSEventStreamEventFlagOwnEvent)
+		strbuf_addstr(&msg, "OwnEvent|");
+	if (flag & kFSEventStreamEventFlagItemCloned)
+		strbuf_addstr(&msg, "ItemCloned|");
+
+	trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=%u %s",
+			 path, flag, msg.buf);
+
+	strbuf_release(&msg);
+}
+
+static int ef_is_root_delete(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagItemIsDir &&
+		ef & kFSEventStreamEventFlagItemRemoved);
+}
+
+static int ef_is_root_renamed(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagItemIsDir &&
+		ef & kFSEventStreamEventFlagItemRenamed);
+}
+
+static void fsevent_callback(ConstFSEventStreamRef streamRef,
+			     void *ctx,
+			     size_t num_of_events,
+			     void *event_paths,
+			     const FSEventStreamEventFlags event_flags[],
+			     const FSEventStreamEventId event_ids[])
+{
+	struct fsmonitor_daemon_state *state = ctx;
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	char **paths = (char **)event_paths;
+	struct fsmonitor_batch *batch = NULL;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	const char *path_k;
+	const char *slash;
+	int k;
+
+	/*
+	 * Build a list of all filesystem changes into a private/local
+	 * list and without holding any locks.
+	 */
+	for (k = 0; k < num_of_events; k++) {
+		/*
+		 * On Mac, we receive an array of absolute paths.
+		 */
+		path_k = paths[k];
+
+		/*
+		 * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR.
+		 * Please don't log them to Trace2.
+		 *
+		 * trace_printf_key(&trace_fsmonitor, "XXX '%s'", path_k);
+		 */
+
+		/*
+		 * If event[k] is marked as dropped, we assume that we have
+		 * lost sync with the filesystem and should flush our cached
+		 * data.  We need to:
+		 *
+		 * [1] Abort/wake any client threads waiting for a cookie and
+		 *     flush the cached state data (the current token), and
+		 *     create a new token.
+		 *
+		 * [2] Discard the batch that we were locally building (since
+		 *     they are conceptually relative to the just flushed
+		 *     token).
+		 */
+		if ((event_flags[k] & kFSEventStreamEventFlagKernelDropped) ||
+		    (event_flags[k] & kFSEventStreamEventFlagUserDropped)) {
+			/*
+			 * see also kFSEventStreamEventFlagMustScanSubDirs
+			 */
+			trace2_data_string("fsmonitor", NULL,
+					   "fsm-listen/kernel", "dropped");
+
+			fsmonitor_force_resync(state);
+
+			if (fsmonitor_batch__free(batch))
+				BUG("batch should not have a next");
+			string_list_clear(&cookie_list, 0);
+
+			/*
+			 * We assume that any events that we received
+			 * in this callback after this dropped event
+			 * may still be valid, so we continue rather
+			 * than break.  (And just in case there is a
+			 * delete of ".git" hiding in there.)
+			 */
+			continue;
+		}
+
+		switch (fsmonitor_classify_path_absolute(state, path_k)) {
+
+		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+			/* special case cookie files within .git or gitdir */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path_k);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path_k);
+			break;
+
+		case IS_INSIDE_DOT_GIT:
+		case IS_INSIDE_GITDIR:
+			/* ignore all other paths inside of .git or gitdir */
+			break;
+
+		case IS_DOT_GIT:
+		case IS_GITDIR:
+			/*
+			 * If .git directory is deleted or renamed away,
+			 * we have to quit.
+			 */
+			if (ef_is_root_delete(event_flags[k])) {
+				trace2_data_string("fsmonitor", NULL,
+						   "fsm-listen/gitdir",
+						   "removed");
+				goto force_shutdown;
+			}
+			if (ef_is_root_renamed(event_flags[k])) {
+				trace2_data_string("fsmonitor", NULL,
+						   "fsm-listen/gitdir",
+						   "renamed");
+				goto force_shutdown;
+			}
+			break;
+
+		case IS_WORKDIR_PATH:
+			/* try to queue normal pathnames */
+
+			if (trace_pass_fl(&trace_fsmonitor))
+				log_flags_set(path_k, event_flags[k]);
+
+			/* fsevent could be marked as both a file and directory */
+
+			if (event_flags[k] & kFSEventStreamEventFlagItemIsFile) {
+				const char *rel = path_k +
+					state->path_worktree_watch.len + 1;
+
+				if (!batch)
+					batch = fsmonitor_batch__new();
+				fsmonitor_batch__add_path(batch, rel);
+			}
+
+			if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) {
+				const char *rel = path_k +
+					state->path_worktree_watch.len + 1;
+				char *p = xstrfmt("%s/", rel);
+
+				if (!batch)
+					batch = fsmonitor_batch__new();
+				fsmonitor_batch__add_path(batch, p);
+
+				free(p);
+			}
+
+			break;
+
+		case IS_OUTSIDE_CONE:
+		default:
+			trace_printf_key(&trace_fsmonitor,
+					 "ignoring '%s'", path_k);
+			break;
+		}
+	}
+
+	fsmonitor_publish(state, batch, &cookie_list);
+	string_list_clear(&cookie_list, 0);
+	return;
+
+force_shutdown:
+	if (fsmonitor_batch__free(batch))
+		BUG("batch should not have a next");
+	string_list_clear(&cookie_list, 0);
+
+	data->shutdown_style = FORCE_SHUTDOWN;
+	CFRunLoopStop(data->rl);
+	return;
+}
+
+/*
+ * TODO Investigate the proper value for the `latency` argument in the call
+ * TODO to `FSEventStreamCreate()`.  I'm not sure that this needs to be a
+ * TODO config setting or just something that we tune after some testing.
+ * TODO
+ * TODO With a latency of 0.1, I was seeing lots of dropped events during
+ * TODO the "touch 100000" files test within t/perf/p7519, but with a
+ * TODO latency of 0.001 I did not see any dropped events.  So the "correct"
+ * TODO value may be somewhere in between.
+ * TODO
+ * TODO https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
+ */
 
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
+	FSEventStreamCreateFlags flags = kFSEventStreamCreateFlagNoDefer |
+		kFSEventStreamCreateFlagWatchRoot |
+		kFSEventStreamCreateFlagFileEvents;
+	FSEventStreamContext ctx = {
+		0,
+		state,
+		NULL,
+		NULL,
+		NULL
+	};
+	struct fsmonitor_daemon_backend_data *data;
+	const void *dir_array[2];
+
+	data = xcalloc(1, sizeof(*data));
+	state->backend_data = data;
+
+	data->cfsr_worktree_path = CFStringCreateWithCString(
+		NULL, state->path_worktree_watch.buf, kCFStringEncodingUTF8);
+	dir_array[data->nr_paths_watching++] = data->cfsr_worktree_path;
+
+	if (state->nr_paths_watching > 1) {
+		data->cfsr_gitdir_path = CFStringCreateWithCString(
+			NULL, state->path_gitdir_watch.buf,
+			kCFStringEncodingUTF8);
+		dir_array[data->nr_paths_watching++] = data->cfsr_gitdir_path;
+	}
+
+	data->cfar_paths_to_watch = CFArrayCreate(NULL, dir_array,
+						  data->nr_paths_watching,
+						  NULL);
+	data->stream = FSEventStreamCreate(NULL, fsevent_callback, &ctx,
+					   data->cfar_paths_to_watch,
+					   kFSEventStreamEventIdSinceNow,
+					   0.001, flags);
+	if (data->stream == NULL)
+		goto failed;
+
+	/*
+	 * `data->rl` needs to be set inside the listener thread.
+	 */
+
+	return 0;
+
+failed:
+	error("Unable to create FSEventStream.");
+
+	FREE_AND_NULL(state->backend_data);
 	return -1;
 }
 
 void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	if (!state || !state->backend_data)
+		return;
+
+	data = state->backend_data;
+
+	if (data->stream) {
+		if (data->stream_started)
+			FSEventStreamStop(data->stream);
+		if (data->stream_scheduled)
+			FSEventStreamInvalidate(data->stream);
+		FSEventStreamRelease(data->stream);
+	}
+
+	FREE_AND_NULL(state->backend_data);
 }
 
 void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = state->backend_data;
+	data->shutdown_style = SHUTDOWN_EVENT;
+
+	CFRunLoopStop(data->rl);
 }
 
 void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = state->backend_data;
+
+	data->rl = CFRunLoopGetCurrent();
+
+	FSEventStreamScheduleWithRunLoop(data->stream, data->rl, kCFRunLoopDefaultMode);
+	data->stream_scheduled = 1;
+
+	if (!FSEventStreamStart(data->stream)) {
+		error("Failed to start the FSEventStream");
+		goto force_error_stop_without_loop;
+	}
+	data->stream_started = 1;
+
+	CFRunLoopRun();
+
+	switch (data->shutdown_style) {
+	case FORCE_ERROR_STOP:
+		state->error_code = -1;
+		/* fall thru */
+	case FORCE_SHUTDOWN:
+		ipc_server_stop_async(state->ipc_server_data);
+		/* fall thru */
+	case SHUTDOWN_EVENT:
+	default:
+		break;
+	}
+	return;
+
+force_error_stop_without_loop:
+	state->error_code = -1;
+	ipc_server_stop_async(state->ipc_server_data);
+	return;
 }
-- 
gitgitgadget



* [PATCH 16/23] fsmonitor--daemon: implement handle_client callback
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (14 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-26 21:01   ` Derrick Stolee
  2021-05-13 18:52   ` Derrick Stolee
  2021-04-01 15:40 ` [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
                   ` (10 subsequent siblings)
  26 siblings, 2 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to handle IPC requests from client
Git processes and respond with a list of pathnames modified
since the provided token.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 335 +++++++++++++++++++++++++++++++++++-
 1 file changed, 333 insertions(+), 2 deletions(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 48071d445c49..32df392b25d3 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,6 +7,7 @@
 #include "fsmonitor--daemon.h"
 #include "simple-ipc.h"
 #include "khash.h"
+#include "pkt-line.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
 	N_("git fsmonitor--daemon --start [<options>]"),
@@ -369,19 +370,349 @@ void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
 	fsmonitor_free_token_data(free_me);
 }
 
+/*
+ * Format an opaque token string to send to the client.
+ */
+static void fsmonitor_format_response_token(
+	struct strbuf *response_token,
+	const struct strbuf *response_token_id,
+	const struct fsmonitor_batch *batch)
+{
+	uint64_t seq_nr = (batch) ? batch->batch_seq_nr + 1 : 0;
+
+	strbuf_reset(response_token);
+	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
+		    response_token_id->buf, seq_nr);
+}
+
+/*
+ * Parse an opaque token from the client.
+ */
+static int fsmonitor_parse_client_token(const char *buf_token,
+					struct strbuf *requested_token_id,
+					uint64_t *seq_nr)
+{
+	const char *p;
+	char *p_end;
+
+	strbuf_reset(requested_token_id);
+	*seq_nr = 0;
+
+	if (!skip_prefix(buf_token, "builtin:", &p))
+		return 1;
+
+	while (*p && *p != ':')
+		strbuf_addch(requested_token_id, *p++);
+	if (!*p++)
+		return 1;
+
+	*seq_nr = (uint64_t)strtoumax(p, &p_end, 10);
+	if (*p_end)
+		return 1;
+
+	return 0;
+}
+
+KHASH_INIT(str, const char *, int, 0, kh_str_hash_func, kh_str_hash_equal);
+
+static int do_handle_client(struct fsmonitor_daemon_state *state,
+			    const char *command,
+			    ipc_server_reply_cb *reply,
+			    struct ipc_server_reply_data *reply_data)
+{
+	struct fsmonitor_token_data *token_data = NULL;
+	struct strbuf response_token = STRBUF_INIT;
+	struct strbuf requested_token_id = STRBUF_INIT;
+	struct strbuf payload = STRBUF_INIT;
+	uint64_t requested_oldest_seq_nr = 0;
+	uint64_t total_response_len = 0;
+	const char *p;
+	const struct fsmonitor_batch *batch_head;
+	const struct fsmonitor_batch *batch;
+	intmax_t count = 0, duplicates = 0;
+	kh_str_t *shown;
+	int hash_ret;
+	int result;
+
+	/*
+	 * We expect `command` to be of the form:
+	 *
+	 * <command> := quit NUL
+	 *            | flush NUL
+	 *            | <V1-time-since-epoch-ns> NUL
+	 *            | <V2-opaque-fsmonitor-token> NUL
+	 */
+
+	if (!strcmp(command, "quit")) {
+		/*
+		 * A client has requested over the socket/pipe that the
+		 * daemon shutdown.
+		 *
+		 * Tell the IPC thread pool to shutdown (which completes
+		 * the await in the main thread (which can stop the
+		 * fsmonitor listener thread)).
+		 *
+		 * There is no reply to the client.
+		 */
+		return SIMPLE_IPC_QUIT;
+	}
+
+	if (!strcmp(command, "flush")) {
+		/*
+		 * Flush all of our cached data and generate a new token
+		 * just like if we lost sync with the filesystem.
+		 *
+		 * Then send a trivial response using the new token.
+		 */
+		fsmonitor_force_resync(state);
+		result = 0;
+		goto send_trivial_response;
+	}
+
+	if (!skip_prefix(command, "builtin:", &p)) {
+		/* assume V1 timestamp or garbage */
+
+		char *p_end;
+
+		strtoumax(command, &p_end, 10);
+		trace_printf_key(&trace_fsmonitor,
+				 ((*p_end) ?
+				  "fsmonitor: invalid command line '%s'" :
+				  "fsmonitor: unsupported V1 protocol '%s'"),
+				 command);
+		result = -1;
+		goto send_trivial_response;
+	}
+
+	/* try V2 token */
+
+	if (fsmonitor_parse_client_token(command, &requested_token_id,
+					 &requested_oldest_seq_nr)) {
+		trace_printf_key(&trace_fsmonitor,
+				 "fsmonitor: invalid V2 protocol token '%s'",
+				 command);
+		result = -1;
+		goto send_trivial_response;
+	}
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (!state->current_token_data) {
+		/*
+		 * We don't have a current token.  This may mean that
+		 * the listener thread has not yet started.
+		 */
+		pthread_mutex_unlock(&state->main_lock);
+		result = 0;
+		goto send_trivial_response;
+	}
+	if (strcmp(requested_token_id.buf,
+		   state->current_token_data->token_id.buf)) {
+		/*
+		 * The client last spoke to a different daemon
+		 * instance -OR- the daemon had to resync with
+		 * the filesystem (and lost events), so reject.
+		 */
+		pthread_mutex_unlock(&state->main_lock);
+		result = 0;
+		trace2_data_string("fsmonitor", the_repository,
+				   "response/token", "different");
+		goto send_trivial_response;
+	}
+	if (!state->current_token_data->batch_tail) {
+		/*
+		 * The listener has not received any filesystem
+		 * events yet since we created the current token.
+		 * We can respond with an empty list, since the
+		 * client has already seen the current token and
+		 * we have nothing new to report.  (This is
+		 * instead of sending a trivial response.)
+		 */
+		pthread_mutex_unlock(&state->main_lock);
+		result = 0;
+		goto send_empty_response;
+	}
+	if (requested_oldest_seq_nr <
+	    state->current_token_data->batch_tail->batch_seq_nr) {
+		/*
+		 * The client wants older events than we have for
+		 * this token_id.  This means that the end of our
+		 * batch list was truncated and we cannot give the
+		 * client a complete snapshot relative to their
+		 * request.
+		 */
+		pthread_mutex_unlock(&state->main_lock);
+
+		trace_printf_key(&trace_fsmonitor,
+				 "client requested truncated data");
+		result = 0;
+		goto send_trivial_response;
+	}
+
+	/*
+	 * We're going to hold onto a pointer to the current
+	 * token-data while we walk the list of batches of files.
+	 * During this time, we will NOT be under the lock.
+	 * So we ref-count it.
+	 *
+	 * This allows the listener thread to continue prepending
+	 * new batches of items to the token-data (which we'll ignore).
+	 *
+	 * AND it allows the listener thread to do a token-reset
+	 * (and install a new `current_token_data`).
+	 *
+	 * We mark the current head of the batch list as "pinned" so
+	 * that the listener thread will treat this item as read-only
+	 * (and prevent any more paths from being added to it) from
+	 * now on.
+	 */
+	token_data = state->current_token_data;
+	token_data->client_ref_count++;
+
+	batch_head = token_data->batch_head;
+	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	/*
+	 * FSMonitor Protocol V2 requires that we send a response header
+	 * with a "new current token" and then all of the paths that changed
+	 * since the "requested token".
+	 */
+	fsmonitor_format_response_token(&response_token,
+					&token_data->token_id,
+					batch_head);
+
+	reply(reply_data, response_token.buf, response_token.len + 1);
+	total_response_len += response_token.len + 1;
+
+	trace2_data_string("fsmonitor", the_repository, "response/token",
+			   response_token.buf);
+	trace_printf_key(&trace_fsmonitor, "response token: %s", response_token.buf);
+
+	shown = kh_init_str();
+	for (batch = batch_head;
+	     batch && batch->batch_seq_nr >= requested_oldest_seq_nr;
+	     batch = batch->next) {
+		size_t k;
+
+		for (k = 0; k < batch->nr; k++) {
+			const char *s = batch->interned_paths[k];
+			size_t s_len;
+
+			if (kh_get_str(shown, s) != kh_end(shown))
+				duplicates++;
+			else {
+				kh_put_str(shown, s, &hash_ret);
+
+				trace_printf_key(&trace_fsmonitor,
+						 "send[%"PRIuMAX"]: %s",
+						 count, s);
+
+				/* Each path gets written with a trailing NUL */
+				s_len = strlen(s) + 1;
+
+				if (payload.len + s_len >=
+				    LARGE_PACKET_DATA_MAX) {
+					reply(reply_data, payload.buf,
+					      payload.len);
+					total_response_len += payload.len;
+					strbuf_reset(&payload);
+				}
+
+				strbuf_add(&payload, s, s_len);
+				count++;
+			}
+		}
+	}
+
+	if (payload.len) {
+		reply(reply_data, payload.buf, payload.len);
+		total_response_len += payload.len;
+	}
+
+	kh_release_str(shown);
+
+	pthread_mutex_lock(&state->main_lock);
+	if (token_data->client_ref_count > 0)
+		token_data->client_ref_count--;
+
+	if (token_data->client_ref_count == 0) {
+		if (token_data != state->current_token_data) {
+			/*
+			 * The listener thread did a token-reset while we were
+			 * walking the batch list.  Therefore, this token is
+			 * stale and can be discarded completely.  If we are
+			 * the last reader thread using this token, we own
+			 * that work.
+			 */
+			fsmonitor_free_token_data(token_data);
+		}
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	trace2_data_intmax("fsmonitor", the_repository, "response/length", total_response_len);
+	trace2_data_intmax("fsmonitor", the_repository, "response/count/files", count);
+	trace2_data_intmax("fsmonitor", the_repository, "response/count/duplicates", duplicates);
+
+	strbuf_release(&response_token);
+	strbuf_release(&requested_token_id);
+	strbuf_release(&payload);
+
+	return 0;
+
+send_trivial_response:
+	pthread_mutex_lock(&state->main_lock);
+	fsmonitor_format_response_token(&response_token,
+					&state->current_token_data->token_id,
+					state->current_token_data->batch_head);
+	pthread_mutex_unlock(&state->main_lock);
+
+	reply(reply_data, response_token.buf, response_token.len + 1);
+	trace2_data_string("fsmonitor", the_repository, "response/token",
+			   response_token.buf);
+	reply(reply_data, "/", 2);
+	trace2_data_intmax("fsmonitor", the_repository, "response/trivial", 1);
+
+	strbuf_release(&response_token);
+	strbuf_release(&requested_token_id);
+
+	return result;
+
+send_empty_response:
+	pthread_mutex_lock(&state->main_lock);
+	fsmonitor_format_response_token(&response_token,
+					&state->current_token_data->token_id,
+					NULL);
+	pthread_mutex_unlock(&state->main_lock);
+
+	reply(reply_data, response_token.buf, response_token.len + 1);
+	trace2_data_string("fsmonitor", the_repository, "response/token",
+			   response_token.buf);
+	trace2_data_intmax("fsmonitor", the_repository, "response/empty", 1);
+
+	strbuf_release(&response_token);
+	strbuf_release(&requested_token_id);
+
+	return 0;
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data, const char *command,
 			 ipc_server_reply_cb *reply,
 			 struct ipc_server_reply_data *reply_data)
 {
-	/* struct fsmonitor_daemon_state *state = data; */
+	struct fsmonitor_daemon_state *state = data;
 	int result;
 
+	trace_printf_key(&trace_fsmonitor, "requested token: %s", command);
+
 	trace2_region_enter("fsmonitor", "handle_client", the_repository);
 	trace2_data_string("fsmonitor", the_repository, "request", command);
 
-	result = 0; /* TODO Do something here. */
+	result = do_handle_client(state, command, reply, reply_data);
 
 	trace2_region_leave("fsmonitor", "handle_client", the_repository);
 
-- 
gitgitgadget



* [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (15 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 16/23] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:40 ` Jeff Hostetler via GitGitGadget
  2021-04-27 13:24   ` Derrick Stolee
  2021-04-01 15:41 ` [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing Jeff Hostetler via GitGitGadget
                   ` (9 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:40 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to periodically truncate the list of
modified files to save some memory.

Clients will ask for the set of changes relative to a token that they
found in the FSMN extension of the index.  (This token acts as an
opaque virtual timestamp rather than a wall-clock time.)  Clients will
then update the index to contain the response token (so that
subsequent commands will be relative to this new token).

Therefore, the daemon can gradually truncate the in-memory list of
changed paths as they become obsolete (older than the previous token).
Since we may have multiple clients making concurrent requests with a
skew of tokens and clients may be racing to talk to the daemon,
we lazily truncate the list.

We introduce a 5 minute delay and truncate batches 5 minutes after
they are considered obsolete.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
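
Reader's note: the truncation rule described above can be sketched on a
plain singly linked list, newest batch first. Starting at the marker, the
walk skips unpinned batches and batches pinned less than the delay before
the marker, then cuts the list after the first batch that is old enough.
The `sketch_*` names and the reduced struct are assumptions for
illustration; locking and the `batch_tail` update are left out.

```c
#include <stdlib.h>
#include <time.h>

#define SKETCH_DELAY (5 * 60) /* seconds, as in the patch */

struct sketch_batch {
	struct sketch_batch *next; /* next older batch */
	time_t pinned_time;        /* 0 means never pinned */
};

/* Hypothetical helper: allocate a batch that is newer than `older`. */
static struct sketch_batch *sketch_push(struct sketch_batch *older,
					time_t pinned_time)
{
	struct sketch_batch *b = malloc(sizeof(*b));

	if (!b)
		abort();
	b->next = older;
	b->pinned_time = pinned_time;
	return b;
}

/* Free everything older than the cut point; returns the number freed. */
static int sketch_truncate(struct sketch_batch *marker)
{
	struct sketch_batch *b, *rest;
	int freed = 0;

	for (b = marker; b; b = b->next) {
		if (!b->pinned_time)
			continue; /* never pinned, no timestamp to compare */
		if (b->pinned_time + SKETCH_DELAY > marker->pinned_time)
			continue; /* too close to the marker */
		break;
	}
	if (!b)
		return 0; /* nothing is old enough yet */

	rest = b->next;
	b->next = NULL; /* b becomes the new tail */

	while (rest) {
		struct sketch_batch *next = rest->next;

		free(rest);
		rest = next;
		freed++;
	}
	return freed;
}
```

Note that the batch at the cut point is kept; only batches strictly older
than it are released.
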
 builtin/fsmonitor--daemon.c | 78 +++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 32df392b25d3..e9a9aea59ad6 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -316,6 +316,75 @@ static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
 			batch_src->interned_paths[k];
 }
 
+/*
+ * To keep the batch list from growing unbounded in response to filesystem
+ * activity, we try to truncate old batches from the end of the list as
+ * they become irrelevant.
+ *
+ * We assume that the .git/index will be updated with the most recent token
+ * any time the index is updated.  And future commands will only ask for
+ * recent changes *since* that new token.  So as tokens advance into the
+ * future, older batch items will never be requested/needed.  So we can
+ * truncate them without loss of functionality.
+ *
+ * However, multiple commands may be talking to the daemon concurrently
+ * or perform a slow command, so a little "token skew" is possible.
+ * Therefore, we want this to be a little bit lazy and have a generous
+ * delay.
+ *
+ * The current reader thread walked backwards in time from `token->batch_head`
+ * back to `batch_marker` somewhere in the middle of the batch list.
+ *
+ * Let's walk backwards in time from that marker an arbitrary delay
+ * and truncate the list there.  Note that these timestamps are completely
+ * artificial (based on when we pinned the batch item) and not on any
+ * filesystem activity.
+ */
+#define MY_TIME_DELAY (5 * 60) /* seconds */
+
+static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
+				      const struct fsmonitor_batch *batch_marker)
+{
+	/* assert state->main_lock */
+
+	const struct fsmonitor_batch *batch;
+	struct fsmonitor_batch *rest;
+	struct fsmonitor_batch *p;
+	time_t t;
+
+	if (!batch_marker)
+		return;
+
+	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",
+			 batch_marker->batch_seq_nr,
+			 (uint64_t)batch_marker->pinned_time);
+
+	for (batch = batch_marker; batch; batch = batch->next) {
+		if (!batch->pinned_time) /* an overflow batch */
+			continue;
+
+		t = batch->pinned_time + MY_TIME_DELAY;
+		if (t > batch_marker->pinned_time) /* too close to marker */
+			continue;
+
+		goto truncate_past_here;
+	}
+
+	return;
+
+truncate_past_here:
+	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
+
+	rest = ((struct fsmonitor_batch *)batch)->next;
+	((struct fsmonitor_batch *)batch)->next = NULL;
+
+	for (p = rest; p; p = fsmonitor_batch__free(p)) {
+		trace_printf_key(&trace_fsmonitor,
+				 "TRNC kill (%"PRIu64",%"PRIu64")",
+				 p->batch_seq_nr, (uint64_t)p->pinned_time);
+	}
+}
+
 static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
 {
 	struct fsmonitor_batch *p;
@@ -647,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 			 * that work.
 			 */
 			fsmonitor_free_token_data(token_data);
+		} else if (batch) {
+			/*
+			 * This batch is the first item in the list
+			 * that is older than the requested sequence
+			 * number and might be considered to be
+			 * obsolete.  See if we can truncate the list
+			 * and save some memory.
+			 */
+			fsmonitor_batch__truncate(state, batch);
 		}
 	}
 
-- 
gitgitgadget



* [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (16 preceding siblings ...)
  2021-04-01 15:40 ` [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:41 ` Jeff Hostetler via GitGitGadget
  2021-04-27 13:36   ` Derrick Stolee
  2021-04-01 15:41 ` [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:41 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Define GIT_TEST_FSMONITOR_CLIENT_DELAY as a millisecond delay.

Introduce an artificial delay when processing client requests.
This makes the CI/PR test suite a little more stable and avoids
the need to load up test scripts with sleep statements to avoid
racy failures.  These were mostly seen on 1 or 2 core CI build
machines where the test script would create a file and quickly
try to confirm that the daemon had seen it *before* the daemon
had received the kernel event, causing a test failure.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
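
Reader's note: a possible stricter way to read the delay is sketched
below. The patch itself uses `atoi()`; this variant swaps in `strtol()`
so that values such as "10ms" or out-of-range numbers fall back to 0
instead of being partially accepted. The `sketch_*` names are
assumptions for illustration and the caching of the result is omitted.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a millisecond delay; anything unusable means "no delay". */
static int sketch_parse_delay_ms(const char *s)
{
	char *end;
	long v;

	if (!s || !*s)
		return 0;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || *end || v < 0 || v > INT_MAX)
		return 0;

	return (int)v;
}

/* Read the variable named in this patch from the environment. */
static int sketch_lookup_delay_ms(void)
{
	return sketch_parse_delay_ms(
		getenv("GIT_TEST_FSMONITOR_CLIENT_DELAY"));
}
```

Splitting the parsing from the `getenv()` call keeps the parser testable
without touching the process environment.
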
 builtin/fsmonitor--daemon.c | 38 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index e9a9aea59ad6..0cb09ef0b984 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -150,6 +150,30 @@ static int do_as_client__send_flush(void)
 	return 0;
 }
 
+static int lookup_client_test_delay(void)
+{
+	static int delay_ms = -1;
+
+	const char *s;
+	int ms;
+
+	if (delay_ms >= 0)
+		return delay_ms;
+
+	delay_ms = 0;
+
+	s = getenv("GIT_TEST_FSMONITOR_CLIENT_DELAY");
+	if (!s)
+		return delay_ms;
+
+	ms = atoi(s);
+	if (ms < 0)
+		return delay_ms;
+
+	delay_ms = ms;
+	return delay_ms;
+}
+
 /*
  * Requests to and from a FSMonitor Protocol V2 provider use an opaque
  * "token" as a virtual timestamp.  Clients can request a summary of all
@@ -526,6 +550,18 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 		return SIMPLE_IPC_QUIT;
 	}
 
+	/*
+	 * For testing purposes, introduce an artificial delay in this
+	 * worker to allow the filesystem listener thread to receive
+	 * any fs events that may have been generated by the client
+	 * process on the other end of the pipe/socket.  This helps
+	 * make the CI/PR test suite runs a little more predictable
+	 * and hopefully eliminates the need to introduce `sleep`
+	 * commands in the test scripts.
+	 */
+	if (state->test_client_delay_ms)
+		sleep_millisec(state->test_client_delay_ms);
+
 	if (!strcmp(command, "flush")) {
 		/*
 		 * Flush all of our cached data and generate a new token
@@ -1038,7 +1074,7 @@ static int fsmonitor_run_daemon(void)
 	pthread_mutex_init(&state.main_lock, NULL);
 	state.error_code = 0;
 	state.current_token_data = fsmonitor_new_token_data();
-	state.test_client_delay_ms = 0;
+	state.test_client_delay_ms = lookup_client_test_delay();
 
 	/* Prepare to (recursively) watch the <worktree-root> directory. */
 	strbuf_init(&state.path_worktree_watch, 0);
-- 
gitgitgadget



* [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (17 preceding siblings ...)
  2021-04-01 15:41 ` [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:41 ` Jeff Hostetler via GitGitGadget
  2021-04-27 14:23   ` Derrick Stolee
  2021-04-01 15:41 ` [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances Jeff Hostetler via GitGitGadget
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:41 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon client threads to create a cookie file
inside the .git directory and then wait until FS events for the
cookie are observed by the FS listener thread.

This helps address the racy nature of file system events by
blocking the client response until the kernel has drained any
event backlog.

This is especially important on MacOS where kernel events are
only issued with a limited frequency.  See the `latency` argument
of `FSEventStreamCreate()`.  The kernel only signals every `latency`
seconds, but does not guarantee that the kernel queue is completely
drained, so we may have to wait more than one interval.  If we
increase the frequency, the system is more likely to drop events.
We avoid these issues by having each client thread create a unique
cookie file and then wait until it is seen in the event stream.

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
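
Reader's note: the wait/notify handshake described above follows the
usual condition-variable pattern, sketched here in reduced form. The
`sketch_*` names are assumptions for illustration. A single pending slot
stands in for the hashmap of cookies, the cookie file itself is not
created, and each helper takes the lock on its own, whereas in the patch
the listener-side helpers expect `main_lock` to be held by the caller.

```c
#include <pthread.h>
#include <string.h>

enum sketch_result { SK_INIT = 0, SK_SEEN, SK_ABORT };

struct sketch_cookie {
	const char *name;
	enum sketch_result result;
};

struct sketch_state {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct sketch_cookie *pending; /* one slot instead of a hashmap */
};

static void sketch_state_init(struct sketch_state *st)
{
	pthread_mutex_init(&st->lock, NULL);
	pthread_cond_init(&st->cond, NULL);
	st->pending = NULL;
}

/* Client side: publish the cookie before the file would be created. */
static void sketch_register(struct sketch_state *st, struct sketch_cookie *c)
{
	pthread_mutex_lock(&st->lock);
	c->result = SK_INIT;
	st->pending = c;
	pthread_mutex_unlock(&st->lock);
}

/* Listener side: an event for `name` arrived from the filesystem. */
static void sketch_mark_seen(struct sketch_state *st, const char *name)
{
	pthread_mutex_lock(&st->lock);
	if (st->pending && !strcmp(st->pending->name, name)) {
		st->pending->result = SK_SEEN;
		pthread_cond_broadcast(&st->cond);
	}
	pthread_mutex_unlock(&st->lock);
}

/* Listener side: sync was lost, so wake the waiter with _ABORT. */
static void sketch_abort_all(struct sketch_state *st)
{
	pthread_mutex_lock(&st->lock);
	if (st->pending) {
		st->pending->result = SK_ABORT;
		pthread_cond_broadcast(&st->cond);
	}
	pthread_mutex_unlock(&st->lock);
}

/* Client side: block until the cookie is resolved, then unregister. */
static enum sketch_result sketch_wait(struct sketch_state *st,
				      struct sketch_cookie *c)
{
	enum sketch_result r;

	pthread_mutex_lock(&st->lock);
	while (c->result == SK_INIT) /* loop guards against spurious wakeups */
		pthread_cond_wait(&st->cond, &st->lock);
	if (st->pending == c)
		st->pending = NULL;
	r = c->result;
	pthread_mutex_unlock(&st->lock);

	return r;
}
```

Because the predicate `c->result` is only read and written under the
lock, a notification that arrives before the client starts waiting is
not lost: the `while` condition is already false and the wait is skipped.
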
 builtin/fsmonitor--daemon.c | 198 ++++++++++++++++++++++++++++++++++++
 fsmonitor--daemon.h         |   5 +
 2 files changed, 203 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 0cb09ef0b984..d6b59a98cedd 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -150,6 +150,149 @@ static int do_as_client__send_flush(void)
 	return 0;
 }
 
+enum fsmonitor_cookie_item_result {
+	FCIR_ERROR = -1, /* could not create cookie file ? */
+	FCIR_INIT = 0,
+	FCIR_SEEN,
+	FCIR_ABORT,
+};
+
+struct fsmonitor_cookie_item {
+	struct hashmap_entry entry;
+	const char *name;
+	enum fsmonitor_cookie_item_result result;
+};
+
+static int cookies_cmp(const void *data, const struct hashmap_entry *he1,
+		     const struct hashmap_entry *he2, const void *keydata)
+{
+	const struct fsmonitor_cookie_item *a =
+		container_of(he1, const struct fsmonitor_cookie_item, entry);
+	const struct fsmonitor_cookie_item *b =
+		container_of(he2, const struct fsmonitor_cookie_item, entry);
+
+	return strcmp(a->name, keydata ? keydata : b->name);
+}
+
+static enum fsmonitor_cookie_item_result fsmonitor_wait_for_cookie(
+	struct fsmonitor_daemon_state *state)
+{
+	int fd;
+	struct fsmonitor_cookie_item cookie;
+	struct strbuf cookie_pathname = STRBUF_INIT;
+	struct strbuf cookie_filename = STRBUF_INIT;
+	const char *slash;
+	int my_cookie_seq;
+
+	pthread_mutex_lock(&state->main_lock);
+
+	my_cookie_seq = state->cookie_seq++;
+
+	strbuf_addbuf(&cookie_pathname, &state->path_cookie_prefix);
+	strbuf_addf(&cookie_pathname, "%i-%i", getpid(), my_cookie_seq);
+
+	slash = find_last_dir_sep(cookie_pathname.buf);
+	if (slash)
+		strbuf_addstr(&cookie_filename, slash + 1);
+	else
+		strbuf_addbuf(&cookie_filename, &cookie_pathname);
+	cookie.name = strbuf_detach(&cookie_filename, NULL);
+	cookie.result = FCIR_INIT;
+	// TODO should we have a case-insensitive hash (and in cookies_cmp()) ??
+	hashmap_entry_init(&cookie.entry, strhash(cookie.name));
+
+	/*
+	 * Warning: we are putting the address of a stack variable into a
+	 * global hashmap.  This feels dodgy.  We must ensure that we remove
+	 * it before this thread and stack frame returns.
+	 */
+	hashmap_add(&state->cookies, &cookie.entry);
+
+	trace_printf_key(&trace_fsmonitor, "cookie-wait: '%s' '%s'",
+			 cookie.name, cookie_pathname.buf);
+
+	/*
+	 * Create the cookie file on disk and then wait for a notification
+	 * that the listener thread has seen it.
+	 */
+	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
+	if (fd >= 0) {
+		close(fd);
+		unlink_or_warn(cookie_pathname.buf);
+
+		while (cookie.result == FCIR_INIT)
+			pthread_cond_wait(&state->cookies_cond,
+					  &state->main_lock);
+
+		hashmap_remove(&state->cookies, &cookie.entry, NULL);
+	} else {
+		error_errno(_("could not create fsmonitor cookie '%s'"),
+			    cookie.name);
+
+		cookie.result = FCIR_ERROR;
+		hashmap_remove(&state->cookies, &cookie.entry, NULL);
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	free((char*)cookie.name);
+	strbuf_release(&cookie_pathname);
+	return cookie.result;
+}
+
+/*
+ * Mark these cookies as _SEEN and wake up the corresponding client threads.
+ */
+static void fsmonitor_cookie_mark_seen(struct fsmonitor_daemon_state *state,
+				       const struct string_list *cookie_names)
+{
+	/* assert state->main_lock */
+
+	int k;
+	int nr_seen = 0;
+
+	for (k = 0; k < cookie_names->nr; k++) {
+		struct fsmonitor_cookie_item key;
+		struct fsmonitor_cookie_item *cookie;
+
+		key.name = cookie_names->items[k].string;
+		hashmap_entry_init(&key.entry, strhash(key.name));
+
+		cookie = hashmap_get_entry(&state->cookies, &key, entry, NULL);
+		if (cookie) {
+			trace_printf_key(&trace_fsmonitor, "cookie-seen: '%s'",
+					 cookie->name);
+			cookie->result = FCIR_SEEN;
+			nr_seen++;
+		}
+	}
+
+	if (nr_seen)
+		pthread_cond_broadcast(&state->cookies_cond);
+}
+
+/*
+ * Set _ABORT on all pending cookies and wake up all client threads.
+ */
+static void fsmonitor_cookie_abort_all(struct fsmonitor_daemon_state *state)
+{
+	/* assert state->main_lock */
+
+	struct hashmap_iter iter;
+	struct fsmonitor_cookie_item *cookie;
+	int nr_aborted = 0;
+
+	hashmap_for_each_entry(&state->cookies, &iter, cookie, entry) {
+		trace_printf_key(&trace_fsmonitor, "cookie-abort: '%s'",
+				 cookie->name);
+		cookie->result = FCIR_ABORT;
+		nr_aborted++;
+	}
+
+	if (nr_aborted)
+		pthread_cond_broadcast(&state->cookies_cond);
+}
+
 static int lookup_client_test_delay(void)
 {
 	static int delay_ms = -1;
@@ -435,6 +578,9 @@ static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
  *     We should create a new token and start fresh (as if we just
  *     booted up).
  *
+ * [2] Some of those lost events may have been for cookie files.  We
+ *     should assume the worst and abort them rather than letting them starve.
+ *
  * If there are no readers of the the current token data series, we
  * can free it now.  Otherwise, let the last reader free it.  Either
  * way, the old token data series is no longer associated with our
@@ -454,6 +600,8 @@ void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
 			 state->current_token_data->token_id.buf,
 			 new_one->token_id.buf);
 
+	fsmonitor_cookie_abort_all(state);
+
 	if (state->current_token_data->client_ref_count == 0)
 		free_me = state->current_token_data;
 	state->current_token_data = new_one;
@@ -526,6 +674,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 	kh_str_t *shown;
 	int hash_ret;
 	int result;
+	enum fsmonitor_cookie_item_result cookie_result;
 
 	/*
 	 * We expect `command` to be of the form:
@@ -654,6 +803,39 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 		goto send_trivial_response;
 	}
 
+	pthread_mutex_unlock(&state->main_lock);
+
+	/*
+	 * Write a cookie file inside the directory being watched in an
+	 * effort to flush out existing filesystem events that we actually
+	 * care about.  Suspend this client thread until we see the filesystem
+	 * events for this cookie file.
+	 */
+	cookie_result = fsmonitor_wait_for_cookie(state);
+	if (cookie_result != FCIR_SEEN) {
+		error(_("fsmonitor: cookie_result '%d' != SEEN"),
+		      cookie_result);
+		result = 0;
+		goto send_trivial_response;
+	}
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (strcmp(requested_token_id.buf,
+		   state->current_token_data->token_id.buf)) {
+		/*
+		 * Ack! The listener thread lost sync with the filesystem
+		 * and created a new token while we were waiting for the
+		 * cookie file to be created!  Just give up.
+		 */
+		pthread_mutex_unlock(&state->main_lock);
+
+		trace_printf_key(&trace_fsmonitor,
+				 "lost filesystem sync");
+		result = 0;
+		goto send_trivial_response;
+	}
+
 	/*
 	 * We're going to hold onto a pointer to the current
 	 * token-data while we walk the list of batches of files.
@@ -982,6 +1164,9 @@ void fsmonitor_publish(struct fsmonitor_daemon_state *state,
 		}
 	}
 
+	if (cookie_names->nr)
+		fsmonitor_cookie_mark_seen(state, cookie_names);
+
 	pthread_mutex_unlock(&state->main_lock);
 }
 
@@ -1071,7 +1256,9 @@ static int fsmonitor_run_daemon(void)
 
 	memset(&state, 0, sizeof(state));
 
+	hashmap_init(&state.cookies, cookies_cmp, NULL, 0);
 	pthread_mutex_init(&state.main_lock, NULL);
+	pthread_cond_init(&state.cookies_cond, NULL);
 	state.error_code = 0;
 	state.current_token_data = fsmonitor_new_token_data();
 	state.test_client_delay_ms = lookup_client_test_delay();
@@ -1094,6 +1281,15 @@ static int fsmonitor_run_daemon(void)
 		state.nr_paths_watching = 2;
 	}
 
+	/*
+	 * We will write filesystem syncing cookie files into
+	 * <gitdir>/<cookie-prefix><pid>-<seq>.
+	 */
+	strbuf_init(&state.path_cookie_prefix, 0);
+	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
+	strbuf_addch(&state.path_cookie_prefix, '/');
+	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_PREFIX);
+
 	/*
 	 * Confirm that we can create platform-specific resources for the
 	 * filesystem listener before we bother starting all the threads.
@@ -1106,6 +1302,7 @@ static int fsmonitor_run_daemon(void)
 	err = fsmonitor_run_daemon_1(&state);
 
 done:
+	pthread_cond_destroy(&state.cookies_cond);
 	pthread_mutex_destroy(&state.main_lock);
 	fsmonitor_fs_listen__dtor(&state);
 
@@ -1113,6 +1310,7 @@ static int fsmonitor_run_daemon(void)
 
 	strbuf_release(&state.path_worktree_watch);
 	strbuf_release(&state.path_gitdir_watch);
+	strbuf_release(&state.path_cookie_prefix);
 
 	return err;
 }
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 06563b6ed56c..4e580e285ed6 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -45,6 +45,11 @@ struct fsmonitor_daemon_state {
 
 	struct fsmonitor_token_data *current_token_data;
 
+	struct strbuf path_cookie_prefix;
+	pthread_cond_t cookies_cond;
+	int cookie_seq;
+	struct hashmap cookies;
+
 	int error_code;
 	struct fsmonitor_daemon_backend_data *backend_data;
 
-- 
gitgitgadget



* [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (18 preceding siblings ...)
  2021-04-01 15:41 ` [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:41 ` Jeff Hostetler via GitGitGadget
  2021-04-27 14:52   ` Derrick Stolee
  2021-04-01 15:41 ` [PATCH 21/23] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
                   ` (6 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:41 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when the
fsmonitor response contains a different token to ensure that the index
is written to disk.

Normally, when the fsmonitor response includes a tracked file, the
index is always updated.  Similarly, the index might be updated when
the response alters the untracked-cache (when enabled).  However, in
cases where neither of those cause the index to be considered changed,
the fsmonitor response is wasted.  And subsequent commands will
continue to make requests with the same token and if there have not
been any changes in the working directory, they will receive the same
response.

This was observed on Windows after a large checkout.  On Windows, the
kernel emits events for the files that are changed as they are
changed.  However, it might delay events for the containing
directories until the system is more idle (or someone scans the
directory (so it seems)).  The first status following a checkout would
get the list of files.  The subsequent status commands would get the
list of directories as the events trickled out.  But they would never
catch up because the token was not advanced because the index wasn't
updated.

This list of directories caused `wt_status_collect_untracked()` to
unnecessarily spend time actually scanning them during each command.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
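
Reader's note: the rule added by this patch can be restated as a small
standalone sketch. When a previous token exists and the response carries
a different one, the index is marked dirty so that the new token is
written out. The `sketch_*` names, the reduced struct, and the bit value
are assumptions for illustration and do not match Git's real
`index_state` or the real value of `FSMONITOR_CHANGED`.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_FSMONITOR_CHANGED (1u << 0) /* arbitrary placeholder bit */

struct sketch_index {
	char *last_token;       /* token stored with the index, or NULL */
	unsigned cache_changed; /* dirty bits; nonzero means "write me" */
};

/* Hypothetical helper: duplicate a string with malloc. */
static char *sketch_dup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (!copy)
		abort();
	memcpy(copy, s, n);
	return copy;
}

/* Record the token from a response, forcing a write if it advanced. */
static void sketch_note_token(struct sketch_index *idx, const char *token)
{
	if (idx->last_token && strcmp(idx->last_token, token))
		idx->cache_changed |= SKETCH_FSMONITOR_CHANGED;

	free(idx->last_token);
	idx->last_token = sketch_dup(token);
}
```

As in the patch, the very first token (no previous value) does not set
the bit by itself, and an unchanged token leaves the index clean.
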
 fsmonitor.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fsmonitor.c b/fsmonitor.c
index d7e18fc8cd47..8b544e31f29f 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -353,6 +353,16 @@ void refresh_fsmonitor(struct index_state *istate)
 	}
 	strbuf_release(&query_result);
 
+	/*
+	 * If the fsmonitor response and the subsequent scan of the disk
+	 * did not cause the in-memory index to be marked dirty, then force
+	 * it so that we advance the fsmonitor token in our extension, so
+	 * that future requests don't keep re-requesting the same range.
+	 */
+	if (istate->fsmonitor_last_update &&
+	    strcmp(istate->fsmonitor_last_update, last_update_token.buf))
+		istate->cache_changed |= FSMONITOR_CHANGED;
+
 	/* Now that we've updated istate, save the last_update_token */
 	FREE_AND_NULL(istate->fsmonitor_last_update);
 	istate->fsmonitor_last_update = strbuf_detach(&last_update_token, NULL);
-- 
gitgitgadget
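
The rule this patch adds can be sketched outside of Git as follows. This is a minimal illustration in shell rather than C; the helper name `needs_index_write` is hypothetical, and an empty string stands in for the NULL `istate->fsmonitor_last_update` pointer.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git.  It approximates the C condition
#   istate->fsmonitor_last_update &&
#   strcmp(istate->fsmonitor_last_update, last_update_token.buf)
# using an empty string where the C code would see a NULL pointer.
needs_index_write () {
	old_token=$1
	new_token=$2

	# No stored token: there is nothing to compare against, so this
	# rule does not fire.
	test -n "$old_token" || return 1

	# Fire only when the response carried a different token.
	test "$old_token" != "$new_token"
}

if needs_index_write "builtin:test_00000001:0" "builtin:test_00000001:5"
then
	echo "token advanced: mark index dirty"
fi
```

When the helper succeeds, the C code would set `FSMONITOR_CHANGED` so that the new token is written out with the index and the next command asks only for newer events.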



* [PATCH 21/23] t7527: create test for fsmonitor--daemon
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (19 preceding siblings ...)
  2021-04-01 15:41 ` [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:41 ` Jeff Hostetler via GitGitGadget
  2021-04-27 15:41   ` Derrick Stolee
  2021-04-01 15:41 ` [PATCH 22/23] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:41 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t7527-builtin-fsmonitor.sh | 485 +++++++++++++++++++++++++++++++++++
 1 file changed, 485 insertions(+)
 create mode 100755 t/t7527-builtin-fsmonitor.sh

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
new file mode 100755
index 000000000000..1fd230f1d4c6
--- /dev/null
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -0,0 +1,485 @@
+#!/bin/sh
+
+test_description='built-in file system watcher'
+
+. ./test-lib.sh
+
+# Ask the fsmonitor daemon to insert a little delay before responding to
+# client commands like `git status` and `git fsmonitor--daemon --query` to
+# allow recent filesystem events to be received by the daemon.  This helps
+# the CI/PR builds be more stable.
+#
+# An arbitrary millisecond value.
+#
+GIT_TEST_FSMONITOR_CLIENT_DELAY=1000
+export GIT_TEST_FSMONITOR_CLIENT_DELAY
+
+git version --build-options | grep "feature:" | grep "fsmonitor--daemon" || {
+	skip_all="The built-in FSMonitor is not supported on this platform"
+	test_done
+}
+
+kill_repo () {
+	r=$1
+	git -C $r fsmonitor--daemon --stop >/dev/null 2>/dev/null
+	rm -rf $1
+	return 0
+}
+
+start_daemon () {
+	case "$#" in
+		1) r="-C $1";;
+		*) r="";
+	esac
+
+	git $r fsmonitor--daemon --start || return $?
+	git $r fsmonitor--daemon --is-running || return $?
+
+	return 0
+}
+
+test_expect_success 'explicit daemon start and stop' '
+	test_when_finished "kill_repo test_explicit" &&
+
+	git init test_explicit &&
+	start_daemon test_explicit &&
+
+	git -C test_explicit fsmonitor--daemon --stop &&
+	test_must_fail git -C test_explicit fsmonitor--daemon --is-running
+'
+
+test_expect_success 'implicit daemon start' '
+	test_when_finished "kill_repo test_implicit" &&
+
+	git init test_implicit &&
+	test_must_fail git -C test_implicit fsmonitor--daemon --is-running &&
+
+	# query will implicitly start the daemon.
+	#
+	# for test-script simplicity, we send a V1 timestamp rather than
+	# a V2 token.  either way, the daemon response to any query contains
+	# a new V2 token.  (the daemon may complain that we sent a V1 request,
+	# but this test case is only concerned with whether the daemon was
+	# implicitly started.)
+
+	GIT_TRACE2_EVENT="$PWD/.git/trace" \
+		git -C test_implicit fsmonitor--daemon --query 0 >actual &&
+	nul_to_q <actual >actual.filtered &&
+	grep "builtin:" actual.filtered &&
+
+	# confirm that a daemon was started in the background.
+	#
+	# since the mechanism for starting the background daemon is platform
+	# dependent, just confirm that the foreground command received a
+	# response from the daemon.
+
+	grep :\"query/response-length\" .git/trace &&
+
+	git -C test_implicit fsmonitor--daemon --is-running &&
+	git -C test_implicit fsmonitor--daemon --stop &&
+	test_must_fail git -C test_implicit fsmonitor--daemon --is-running
+'
+
+test_expect_success 'implicit daemon stop (delete .git)' '
+	test_when_finished "kill_repo test_implicit_1" &&
+
+	git init test_implicit_1 &&
+
+	start_daemon test_implicit_1 &&
+
+	# deleting the .git directory will implicitly stop the daemon.
+	rm -rf test_implicit_1/.git &&
+
+	# Create an empty .git directory so that the following Git command
+	# will stay relative to the `-C` directory.  Without this, the Git
+	# command will (override the requested -C argument) and crawl out
+	# to the containing Git source tree.  This would make the test
+	# result dependent upon whether we were using fsmonitor on our
+	# development worktree.
+
+	sleep 1 &&
+	mkdir test_implicit_1/.git &&
+
+	test_must_fail git -C test_implicit_1 fsmonitor--daemon --is-running
+'
+
+test_expect_success 'implicit daemon stop (rename .git)' '
+	test_when_finished "kill_repo test_implicit_2" &&
+
+	git init test_implicit_2 &&
+
+	start_daemon test_implicit_2 &&
+
+	# renaming the .git directory will implicitly stop the daemon.
+	mv test_implicit_2/.git test_implicit_2/.xxx &&
+
+	# Create an empty .git directory so that the following Git command
+	# will stay relative to the `-C` directory.  Without this, the Git
+	# command will (override the requested -C argument) and crawl out
+	# to the containing Git source tree.  This would make the test
+	# result dependent upon whether we were using fsmonitor on our
+	# development worktree.
+
+	sleep 1 &&
+	mkdir test_implicit_2/.git &&
+
+	test_must_fail git -C test_implicit_2 fsmonitor--daemon --is-running
+'
+
+test_expect_success 'cannot start multiple daemons' '
+	test_when_finished "kill_repo test_multiple" &&
+
+	git init test_multiple &&
+
+	start_daemon test_multiple &&
+
+	test_must_fail git -C test_multiple fsmonitor--daemon --start 2>actual &&
+	grep "fsmonitor--daemon is already running" actual &&
+
+	git -C test_multiple fsmonitor--daemon --stop &&
+	test_must_fail git -C test_multiple fsmonitor--daemon --is-running
+'
+
+test_expect_success 'setup' '
+	>tracked &&
+	>modified &&
+	>delete &&
+	>rename &&
+	mkdir dir1 &&
+	>dir1/tracked &&
+	>dir1/modified &&
+	>dir1/delete &&
+	>dir1/rename &&
+	mkdir dir2 &&
+	>dir2/tracked &&
+	>dir2/modified &&
+	>dir2/delete &&
+	>dir2/rename &&
+	mkdir dirtorename &&
+	>dirtorename/a &&
+	>dirtorename/b &&
+
+	cat >.gitignore <<-\EOF &&
+	.gitignore
+	expect*
+	actual*
+	EOF
+
+	git -c core.useBuiltinFSMonitor= add . &&
+	test_tick &&
+	git -c core.useBuiltinFSMonitor= commit -m initial &&
+
+	git config core.useBuiltinFSMonitor true
+'
+
+test_expect_success 'update-index implicitly starts daemon' '
+	test_must_fail git fsmonitor--daemon --is-running &&
+
+	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_1" \
+		git update-index --fsmonitor &&
+
+	git fsmonitor--daemon --is-running &&
+	test_might_fail git fsmonitor--daemon --stop &&
+
+	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_1
+'
+
+test_expect_success 'status implicitly starts daemon' '
+	test_must_fail git fsmonitor--daemon --is-running &&
+
+	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_2" \
+		git status >actual &&
+
+	git fsmonitor--daemon --is-running &&
+	test_might_fail git fsmonitor--daemon --stop &&
+
+	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_2
+'
+
+edit_files() {
+	echo 1 >modified
+	echo 2 >dir1/modified
+	echo 3 >dir2/modified
+	>dir1/untracked
+}
+
+delete_files() {
+	rm -f delete
+	rm -f dir1/delete
+	rm -f dir2/delete
+}
+
+create_files() {
+	echo 1 >new
+	echo 2 >dir1/new
+	echo 3 >dir2/new
+}
+
+rename_files() {
+	mv rename renamed
+	mv dir1/rename dir1/renamed
+	mv dir2/rename dir2/renamed
+}
+
+file_to_directory() {
+	rm -f delete
+	mkdir delete
+	echo 1 >delete/new
+}
+
+directory_to_file() {
+	rm -rf dir1
+	echo 1 >dir1
+}
+
+verify_status() {
+	git status >actual &&
+	GIT_INDEX_FILE=.git/fresh-index git read-tree master &&
+	GIT_INDEX_FILE=.git/fresh-index git -c core.useBuiltinFSMonitor= status >expect &&
+	test_cmp expect actual &&
+	echo HELLO AFTER &&
+	cat .git/trace &&
+	echo HELLO AFTER
+}
+
+# The next few test cases confirm that our fsmonitor daemon sees each type
+# of OS filesystem notification that we care about.  At this layer we just
+# ensure we are getting the OS notifications and do not try to confirm what
+# is reported by `git status`.
+#
+# We run a simple query after modifying the filesystem just to introduce
+# a bit of a delay so that the trace logging from the daemon has time to
+# get flushed to disk.
+#
+# We `reset` and `clean` at the bottom of each test (and before stopping the
+# daemon) because these commands might implicitly restart the daemon.
+
+clean_up_repo_and_stop_daemon () {
+	git reset --hard HEAD
+	git clean -fd
+	git fsmonitor--daemon --stop
+	rm -f .git/trace
+}
+
+test_expect_success 'edit some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	edit_files &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/modified$"  .git/trace &&
+	grep "^event: dir2/modified$"  .git/trace &&
+	grep "^event: modified$"       .git/trace &&
+	grep "^event: dir1/untracked$" .git/trace
+'
+
+test_expect_success 'create some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	create_files &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/new$" .git/trace &&
+	grep "^event: dir2/new$" .git/trace &&
+	grep "^event: new$"      .git/trace
+'
+
+test_expect_success 'delete some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	delete_files &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/delete$" .git/trace &&
+	grep "^event: dir2/delete$" .git/trace &&
+	grep "^event: delete$"      .git/trace
+'
+
+test_expect_success 'rename some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	rename_files &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/rename$"  .git/trace &&
+	grep "^event: dir2/rename$"  .git/trace &&
+	grep "^event: rename$"       .git/trace &&
+	grep "^event: dir1/renamed$" .git/trace &&
+	grep "^event: dir2/renamed$" .git/trace &&
+	grep "^event: renamed$"      .git/trace
+'
+
+test_expect_success 'rename directory' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	mv dirtorename dirrenamed &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: dirtorename/*$" .git/trace &&
+	grep "^event: dirrenamed/*$"  .git/trace
+'
+
+test_expect_success 'file changes to directory' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	file_to_directory &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: delete$"     .git/trace &&
+	grep "^event: delete/new$" .git/trace
+'
+
+test_expect_success 'directory changes to a file' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	directory_to_file &&
+
+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1$" .git/trace
+'
+
+# The next few test cases exercise the token-resync code.  When filesystem
+# drops events (because of filesystem velocity or because the daemon isn't
+# polling fast enough), we need to discard the cached data (relative to the
+# current token) and start collecting events under a new token.
+#
+# the 'git fsmonitor--daemon --flush' command can be used to send a "flush"
+# message to a running daemon and ask it to do a flush/resync.
+
+test_expect_success 'flush cached data' '
+	test_when_finished "kill_repo test_flush" &&
+
+	git init test_flush &&
+
+	(
+		GIT_TEST_FSMONITOR_TOKEN=true &&
+		export GIT_TEST_FSMONITOR_TOKEN &&
+
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace_daemon" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon test_flush
+	) &&
+
+	# The daemon should have an initial token with no events in _0 and
+	# then a few (probably platform-specific number of) events in _1.
+	# These should both have the same <token_id>.
+
+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000001:0" >actual_0 &&
+	nul_to_q <actual_0 >actual_q0 &&
+
+	touch test_flush/file_1 &&
+	touch test_flush/file_2 &&
+
+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000001:0" >actual_1 &&
+	nul_to_q <actual_1 >actual_q1 &&
+
+	grep "file_1" actual_q1 &&
+
+	# Force a flush.  This will change the <token_id>, reset the <seq_nr>, and
+	# flush the file data.  Then create some events and ensure that the file
+	# again appears in the cache.  It should have the new <token_id>.
+
+	git -C test_flush fsmonitor--daemon --flush >flush_0 &&
+	nul_to_q <flush_0 >flush_q0 &&
+	grep "^builtin:test_00000002:0Q/Q$" flush_q0 &&
+
+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000002:0" >actual_2 &&
+	nul_to_q <actual_2 >actual_q2 &&
+
+	grep "^builtin:test_00000002:0Q$" actual_q2 &&
+
+	touch test_flush/file_3 &&
+
+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000002:0" >actual_3 &&
+	nul_to_q <actual_3 >actual_q3 &&
+
+	grep "file_3" actual_q3
+'
+
+# The next few test cases create repos where the .git directory is NOT
+# inside the working directory.  That is, where .git is a file
+# that points to a directory elsewhere.  This happens for submodules and
+# non-primary worktrees.
+
+test_expect_success 'setup worktree base' '
+	git init wt-base &&
+	echo 1 >wt-base/file1 &&
+	git -C wt-base add file1 &&
+	git -C wt-base commit -m "c1"
+'
+
+test_expect_success 'worktree with .git file' '
+	git -C wt-base worktree add ../wt-secondary &&
+
+	(
+		GIT_TRACE2_PERF="$PWD/trace2_wt_secondary" &&
+		export GIT_TRACE2_PERF &&
+
+		GIT_TRACE_FSMONITOR="$PWD/trace_wt_secondary" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon wt-secondary
+	) &&
+
+	git -C wt-secondary fsmonitor--daemon --stop &&
+	test_must_fail git -C wt-secondary fsmonitor--daemon --is-running
+'
+
+test_done
-- 
gitgitgadget
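
The 'flush cached data' test above relies on tokens of the form `builtin:<token_id>:<seq_nr>`. A minimal sketch of splitting such a token with POSIX parameter expansion follows; the helper name `parse_token` is hypothetical and this is not how the daemon itself parses requests.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git: split "builtin:<token_id>:<seq_nr>"
# into its two variable parts and print them separated by a space.
parse_token () {
	token=$1

	case "$token" in
	builtin:*:*)
		;;
	*)
		echo "not a builtin token: $token" >&2
		return 1
		;;
	esac

	rest=${token#builtin:}	# drop the fixed prefix
	token_id=${rest%:*}	# everything before the last colon
	seq_nr=${rest##*:}	# everything after the last colon

	echo "$token_id $seq_nr"
}

parse_token "builtin:test_00000002:0"
```

After a flush the test expects the `<token_id>` part to change (`test_00000001` to `test_00000002`) and the `<seq_nr>` part to restart at 0, which is what the `grep` patterns in the test check.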



* [PATCH 22/23] p7519: add fsmonitor--daemon
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (20 preceding siblings ...)
  2021-04-01 15:41 ` [PATCH 21/23] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:41 ` Jeff Hostetler via GitGitGadget
  2021-04-27 15:45   ` Derrick Stolee
  2021-04-01 15:41 ` [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:41 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and
the "Simple IPC" interface.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/perf/p7519-fsmonitor.sh | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 5eb5044a103c..2d018bc7d589 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -24,7 +24,8 @@ test_description="Test core.fsmonitor"
 # GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
 # GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor. May be an
 #   absolute path to an integration. May be a space delimited list of
-#   absolute paths to integrations.
+#   absolute paths to integrations.  (This hook or list of hooks does not
+#   include the built-in fsmonitor--daemon.)
 #
 # The big win for using fsmonitor is the elimination of the need to scan the
 # working directory looking for changed and untracked files. If the file
@@ -135,10 +136,16 @@ test_expect_success "one time repo setup" '
 
 setup_for_fsmonitor() {
 	# set INTEGRATION_SCRIPT depending on the environment
-	if test -n "$INTEGRATION_PATH"
+	if test -n "$USE_FSMONITOR_DAEMON"
 	then
+		git config core.useBuiltinFSMonitor true &&
+		INTEGRATION_SCRIPT=false
+	elif test -n "$INTEGRATION_PATH"
+	then
+		git config core.useBuiltinFSMonitor false &&
 		INTEGRATION_SCRIPT="$INTEGRATION_PATH"
 	else
+		git config core.useBuiltinFSMonitor false &&
 		#
 		# Choose integration script based on existence of Watchman.
 		# Fall back to an empty integration script.
@@ -285,4 +292,30 @@ test_expect_success "setup without fsmonitor" '
 test_fsmonitor_suite
 trace_stop
 
+#
+# Run a full set of perf tests using the built-in fsmonitor--daemon.
+# It does not use the Hook API, so it has a different setup.
+# Explicitly start the daemon here and before we start client commands
+# so that we can later add custom tracing.
+#
+
+test_lazy_prereq HAVE_FSMONITOR_DAEMON '
+	git version --build-options | grep "feature:" | grep "fsmonitor--daemon"
+'
+
+if test_have_prereq HAVE_FSMONITOR_DAEMON
+then
+	USE_FSMONITOR_DAEMON=t
+
+	trace_start fsmonitor--daemon--server
+	git fsmonitor--daemon --start
+
+	trace_start fsmonitor--daemon--client
+	test_expect_success "setup for fsmonitor--daemon" 'setup_for_fsmonitor'
+	test_fsmonitor_suite
+
+	git fsmonitor--daemon --stop
+	trace_stop
+fi
+
 test_done
-- 
gitgitgadget



* [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (21 preceding siblings ...)
  2021-04-01 15:41 ` [PATCH 22/23] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-01 15:41 ` Jeff Hostetler via GitGitGadget
  2021-04-27 15:51   ` Derrick Stolee
  2021-04-16 22:44 ` [PATCH 00/23] [RFC] Builtin FSMonitor Feature Junio C Hamano
                   ` (3 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-04-01 15:41 UTC (permalink / raw)
  To: git; +Cc: Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create 2x2 test matrix with the untracked-cache and fsmonitor--daemon
features and a series of edits and verify that status output is
identical.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t7527-builtin-fsmonitor.sh | 97 ++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 1fd230f1d4c6..ad2188169db7 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -163,6 +163,8 @@ test_expect_success 'setup' '
 	.gitignore
 	expect*
 	actual*
+	flush*
+	trace*
 	EOF
 
 	git -c core.useBuiltinFSMonitor= add . &&
@@ -482,4 +484,99 @@ test_expect_success 'worktree with .git file' '
 	test_must_fail git -C wt-secondary fsmonitor--daemon --is-running
 '
 
+# TODO Repeat one of the "edit" tests on wt-secondary and confirm that
+# TODO we get the same events and behavior -- that is, that fsmonitor--daemon
+# TODO correctly listens to events on both the working directory and to the
+# TODO referenced GITDIR.
+
+test_expect_success 'cleanup worktrees' '
+	kill_repo wt-secondary &&
+	kill_repo wt-base
+'
+
+# The next few tests perform arbitrary/contrived file operations and
+# confirm that status is correct.  That is, that the data (or lack of
+# data) from fsmonitor doesn't cause incorrect results.  And doesn't
+# cause incorrect results when the untracked-cache is enabled.
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success 'Matrix: setup for untracked-cache,fsmonitor matrix' '
+	test_might_fail git config --unset core.useBuiltinFSMonitor &&
+	git update-index --no-fsmonitor &&
+	test_might_fail git fsmonitor--daemon --stop
+'
+
+matrix_clean_up_repo () {
+	git reset --hard HEAD
+	git clean -fd
+}
+
+matrix_try () {
+	uc=$1
+	fsm=$2
+	fn=$3
+
+	test_expect_success "Matrix[uc:$uc][fsm:$fsm] $fn" '
+		matrix_clean_up_repo &&
+		$fn &&
+		if test $uc = false -a $fsm = false
+		then
+			git status --porcelain=v1 >.git/expect.$fn
+		else
+			git status --porcelain=v1 >.git/actual.$fn
+			test_cmp .git/expect.$fn .git/actual.$fn
+		fi
+	'
+
+	return $?
+}
+
+uc_values="false"
+test_have_prereq UNTRACKED_CACHE && uc_values="false true"
+for uc_val in $uc_values
+do
+	if test $uc_val = false
+	then
+		test_expect_success "Matrix[uc:$uc_val] disable untracked cache" '
+			git config core.untrackedcache false &&
+			git update-index --no-untracked-cache
+		'
+	else
+		test_expect_success "Matrix[uc:$uc_val] enable untracked cache" '
+			git config core.untrackedcache true &&
+			git update-index --untracked-cache
+		'
+	fi
+
+	fsm_values="false true"
+	for fsm_val in $fsm_values
+	do
+		if test $fsm_val = false
+		then
+			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] disable fsmonitor" '
+				test_might_fail git config --unset core.useBuiltinFSMonitor &&
+				git update-index --no-fsmonitor &&
+				test_might_fail git fsmonitor--daemon --stop 2>/dev/null
+			'
+		else
+			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] enable fsmonitor" '
+				git config core.useBuiltinFSMonitor true &&
+				git fsmonitor--daemon --start &&
+				git update-index --fsmonitor
+			'
+		fi
+
+		matrix_try $uc_val $fsm_val edit_files
+		matrix_try $uc_val $fsm_val delete_files
+		matrix_try $uc_val $fsm_val create_files
+		matrix_try $uc_val $fsm_val rename_files
+		matrix_try $uc_val $fsm_val file_to_directory
+		matrix_try $uc_val $fsm_val directory_to_file
+	done
+done
+
 test_done
-- 
gitgitgadget
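
The shape of the 2x2 matrix above (record a baseline in the cell where both features are off, then require every other cell to reproduce it) can be sketched without Git. In this illustration `fake_status` is a hypothetical stand-in for `git status --porcelain=v1`, and `run_matrix` is likewise an invented name.

```shell
#!/bin/sh
# Hypothetical stand-in for `git status --porcelain=v1`.  A real run would
# invoke Git with the given untracked-cache ($1) and fsmonitor ($2)
# settings; here the output is fixed so the sketch runs anywhere.
fake_status () {
	printf ' M modified\n?? dir1/untracked\n'
}

# Walk the matrix and print how many cells differ from the baseline.
run_matrix () {
	workdir=$(mktemp -d) || return 1
	mismatches=0

	for uc in false true
	do
		for fsm in false true
		do
			if test "$uc" = false && test "$fsm" = false
			then
				# Baseline cell: record the expected output.
				fake_status "$uc" "$fsm" >"$workdir/expect"
			else
				# Every other cell must match the baseline.
				fake_status "$uc" "$fsm" >"$workdir/actual"
				cmp -s "$workdir/expect" "$workdir/actual" ||
					mismatches=$((mismatches + 1))
			fi
		done
	done

	rm -rf "$workdir"
	echo "$mismatches"
}

run_matrix
```

Because the baseline cell is visited first, the comparison cells always have an `expect` file to compare against; the test script above depends on the same ordering of `uc_values` and `fsm_values`.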


* Re: [PATCH 00/23] [RFC] Builtin FSMonitor Feature
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (22 preceding siblings ...)
  2021-04-01 15:41 ` [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-16 22:44 ` Junio C Hamano
  2021-04-20 15:27   ` Johannes Schindelin
  2021-04-27 18:49 ` FS Monitor Windows Performance (was [PATCH 00/23] [RFC] Builtin FSMonitor Feature) Derrick Stolee
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-04-16 22:44 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget; +Cc: git, Jeff Hostetler

"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This patch series adds a builtin FSMonitor daemon to Git.

This hasn't seen much (if any) activity for a few weeks.

Does that mean nobody (other than obviously the author and whoever
wanted to have this feature) is interested?

What does it need to get this topic unstuck?

> Finally, having a builtin daemon eliminates the need for user to download
> and install a third-party tool. This makes enterprise deployments simpler
> since there are fewer parts to install, maintain, and updates to track.
>
> This RFC version includes support for Windows and MacOS file system events.
> A Linux version will be submitted in a later patch series.



* Re: [PATCH 00/23] [RFC] Builtin FSMonitor Feature
  2021-04-16 22:44 ` [PATCH 00/23] [RFC] Builtin FSMonitor Feature Junio C Hamano
@ 2021-04-20 15:27   ` Johannes Schindelin
  2021-04-20 19:13     ` Jeff Hostetler
  2021-04-21 13:17     ` Derrick Stolee
  0 siblings, 2 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-04-20 15:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler

Hi Junio,

On Fri, 16 Apr 2021, Junio C Hamano wrote:

> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > This patch series adds a builtin FSMonitor daemon to Git.
>
> This hasn't seen much (if any) activity for a few weeks.

It actually is a good sign: I integrated this into Git for Windows (as an
experimental feature) and am running with it for a couple of weeks already
(in _all_ worktrees, not just the massively large ones).

At first, I ran into a handful of Blue Screens of Death, and I was worried
that they should be attributed to FSMonitor. But it turns out that this
issue was most likely caused by a Windows update, and semi-resolved with
another Windows update (and only happens because I use WSL extensively).
In other words, those crashes are not related to FSMonitor.

So yeah, I find the lack of activity pretty good news.

However, I would have hoped that this patch series would see a couple of
reviews in the meantime. Since I was involved in the development of this
patch series (I started it just before I got dragged into all that
security work that led to v2.24.1, and never quite got back to it after
that), I wondered whether it would be "self review" if I reviewed those
patches, which is something I'd rather avoid.

But if nobody else reviews the patches, I will.

> Does that mean nobody (other than obviously the author and whoever
> wanted to have this feature) is interested?

The most likely reason why this does not see more reviews is that it
matters most for massive worktrees, and I don't think anybody here works
with those. The closest to a massive worktree I have is the `git-sdk-64`
repository (which has pretty much nothing to do with source code at all,
it is just a matter of convenience that this is a Git repository; Think of
it as if somebody mirrored their Ubuntu installation by tracking it in a
Git repository and cloning it onto all of their machines). And that is not
really all that massive:

	$ git -C / ls-files | wc -l
	162975

That's tiny compared to some worktrees I saw.

But we should not mistake the needs of those on the Git mailing list (`git
ls-tree -r v2.31.1 | wc -l` says we have only 3901 files/symlinks) for the
needs of some of our biggest users.

So I would like to respectfully ask for this patch series to be kept under
consideration for `next`.

> What does it need to get this topic unstuck?

The same resource that you keep complaining about, and that seems to be
drained more quickly than it can be replenished: reviewers.

I am as guilty as the next person, of course, and it does not help that I
get Cc:ed on several dozen patches seemingly every couple of days: this is
just too much, and I cannot do it, so I admittedly neglect too many patch
series (even the ones that I _do_ want to review, such as the
`bisect-in-c` one). My inbox is seriously no fun place to visit right now.

> > Finally, having a builtin daemon eliminates the need for user to download
> > and install a third-party tool. This makes enterprise deployments simpler
> > since there are fewer parts to install, maintain, and updates to track.
> >
> > This RFC version includes support for Windows and MacOS file system events.
> > A Linux version will be submitted in a later patch series.

I guess this is another reason why this patch series did not see many
reviews: the lack of a Linux backend. And I fear that the statement "A
Linux version will be submitted in a later patch series" is a bit strong,
given that my original implementation of that backend does not really do
its job well: it uses `inotify` and therefore requires one handle _per
directory_, which in turn drains the number of file handles rather quickly
when your worktree has many directories. Meaning: It does not work in the
massive worktrees for which it was intended.

Now, I heard rumors that there is a saner way to monitor directory trees
in recent Linux kernel versions (Jeff, can you fill in where I am
blanking?) and it might be a good idea to solicit volunteers to tackle
this backend, so that the Linux-leaning crowd on this here mailing list
is interested a bit more?

Ciao,
Dscho


* Re: [PATCH 00/23] [RFC] Builtin FSMonitor Feature
  2021-04-20 15:27   ` Johannes Schindelin
@ 2021-04-20 19:13     ` Jeff Hostetler
  2021-04-21 13:17     ` Derrick Stolee
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-20 19:13 UTC (permalink / raw)
  To: Johannes Schindelin, Junio C Hamano
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler



On 4/20/21 11:27 AM, Johannes Schindelin wrote:
> Hi Junio,
...
>>> This RFC version includes support for Windows and MacOS file system events.
>>> A Linux version will be submitted in a later patch series.
> 
> I guess this is another reason why this patch series did not see many
> reviews: the lack of a Linux backend. And I fear that the statement "A
> Linux version will be submitted in a later patch series" is a bit strong,
> given that my original implementation of that backend does not really do
> its job well: it uses `inotify` and therefore requires one handle _per
> directory_, which in turn drains the number of file handles rather quickly
> when your worktree has many directories. Meaning: It does not work in the
> massive worktrees for which it was intended.
> 
> Now, I heard rumors that there is a saner way to monitor directory trees
> in recent Linux kernel versions (Jeff, can you fill in where I am
> blanking?) and it might be a good idea to solicit volunteers to tackle
> this backend, so that the Linux-leaning crowd on this here mailing list
> is interested a bit more?

Yes, I removed the early inotify-based version because the kernel limits
the number of inotify handles to 8k (at least on my Mint box) and that
is a global limit -- shared by any process wanting to use inotify.
The first monorepo that I tried it on had 120K directories in my
sparse checkout...
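
To make the mismatch concrete, here is a minimal shell sketch, assuming the one-watch-per-directory model described above. The helper name `watches_fit` is hypothetical, and 8192 is only an assumed exact value for the "8k" figure; on Linux the live limit can typically be read from /proc/sys/fs/inotify/max_user_watches.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a worktree with $1 directories can
# be covered by $2 inotify watches, assuming one watch per directory.
watches_fit () {
	dir_count=$1
	watch_limit=$2
	test "$dir_count" -le "$watch_limit"
}

# Figures taken from the message above; 8192 is an assumed value for "8k".
dirs=120000
limit=8192

# On a real Linux system the two numbers could instead be gathered with:
#   limit=$(cat /proc/sys/fs/inotify/max_user_watches)
#   dirs=$(find . -type d | wc -l)

if watches_fit "$dirs" "$limit"
then
	echo "inotify can cover $dirs directories"
else
	echo "$dirs directories exceed the limit of $limit watches"
fi
```

With these numbers the worktree needs roughly fifteen times more watches than the limit allows, before counting any other process that also uses inotify.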

I'm told there is a newer "fanotify" facility available in newer
Linux kernels that behaves more like Windows and MacOS and handles
subdirectories.  I intend to jump into that shortly (unless someone
is already familiar with fanotify and wants to try it).

Jeff



* Re: [PATCH 00/23] [RFC] Builtin FSMonitor Feature
  2021-04-20 15:27   ` Johannes Schindelin
  2021-04-20 19:13     ` Jeff Hostetler
@ 2021-04-21 13:17     ` Derrick Stolee
  1 sibling, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-21 13:17 UTC (permalink / raw)
  To: Johannes Schindelin, Junio C Hamano
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler

On 4/20/2021 11:27 AM, Johannes Schindelin wrote:
> Hi Junio,
> 
> On Fri, 16 Apr 2021, Junio C Hamano wrote:
> 
>> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> This patch series adds a builtin FSMonitor daemon to Git.
>>
>> This hasn't seen much (if any) activity for a few weeks.
...
>> What does it need to get this topic unstuck?
> 
> The same resource that you keep complaining about, and that seems to be
> drained more quickly than it can be replenished: reviewers.

I purposefully stayed away from reviewing the series since we are on
the same team, but I have _not_ been involved in the development. At
least that lets me have fresh eyes.

If no external community members are willing to review it, then I will
dedicate time for a careful review this week.

Thanks,
-Stolee


* Re: [PATCH 01/23] fsmonitor--daemon: man page and documentation
  2021-04-01 15:40 ` [PATCH 01/23] fsmonitor--daemon: man page and documentation Jeff Hostetler via GitGitGadget
@ 2021-04-26 14:13   ` Derrick Stolee
  2021-04-28 13:54     ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 14:13 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Create a manual page describing the `git fsmonitor--daemon` feature.
> 
> Update references to `core.fsmonitor`, `core.fsmonitorHookVersion` and
> pointers to `watchman` to mention the built-in FSMonitor.

Makes sense to add clarity here, since there will be new ways
to interact with a filesystem monitor.
>  core.fsmonitorHookVersion::
> -	Sets the version of hook that is to be used when calling fsmonitor.
> -	There are currently versions 1 and 2. When this is not set,
> -	version 2 will be tried first and if it fails then version 1
> -	will be tried. Version 1 uses a timestamp as input to determine
> -	which files have changes since that time but some monitors
> -	like watchman have race conditions when used with a timestamp.
> -	Version 2 uses an opaque string so that the monitor can return
> -	something that can be used to determine what files have changed
> -	without race conditions.
> +	Sets the version of hook that is to be used when calling the
> +	FSMonitor hook (as configured via `core.fsmonitor`).
> ++
> +There are currently versions 1 and 2. When this is not set,
> +version 2 will be tried first and if it fails then version 1
> +will be tried. Version 1 uses a timestamp as input to determine
> +which files have changes since that time but some monitors
> +like watchman have race conditions when used with a timestamp.
> +Version 2 uses an opaque string so that the monitor can return
> +something that can be used to determine what files have changed
> +without race conditions.

This initially seemed like a big edit, but you just split the single
paragraph into multiple, with a better leading sentence and a final
statement about the built-in FSMonitor. Good.
> ++
> +Note: FSMonitor hooks (and this config setting) are ignored if the
> +built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).
> +
> +core.useBuiltinFSMonitor::
> +	If set to true, enable the built-in filesystem event watcher (for
> +	technical details, see linkgit:git-fsmonitor--daemon[1]).
> ++
> +Like external (hook-based) FSMonitors, the built-in FSMonitor can speed up
> +Git commands that need to refresh the Git index (e.g. `git status`) in a
> +worktree with many files. The built-in FSMonitor facility eliminates the
> +need to install and maintain an external third-party monitoring tool.
> ++
> +The built-in FSMonitor is currently available only on a limited set of
> +supported platforms.

Is there a way for users to know this set of platforms? Can they run
a command to find out? Will 'git fsmonitor--daemon --start' send a
helpful message to assist here? Or, could there be a 'git
fsmonitor--daemon --test' command?

> +Note: if this config setting is set to `true`, any FSMonitor hook
> +configured via `core.fsmonitor` (and possibly `core.fsmonitorHookVersion`)
> +is ignored.
...
> +git-fsmonitor--daemon(1)
> +========================
> +
> +NAME
> +----
> +git-fsmonitor--daemon - Builtin file system monitor daemon
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'git fsmonitor--daemon' --start
> +'git fsmonitor--daemon' --run
> +'git fsmonitor--daemon' --stop
> +'git fsmonitor--daemon' --is-running
> +'git fsmonitor--daemon' --is-supported
> +'git fsmonitor--daemon' --query <token>
> +'git fsmonitor--daemon' --query-index
> +'git fsmonitor--daemon' --flush

These arguments with the "--" prefix make it seem like they are
options that could be grouped together, but you really want these
to be verbs within the daemon. What do you think about removing
the "--" prefixes?

> +
> +DESCRIPTION
> +-----------
> +
> +Monitors files and directories in the working directory for changes using
> +platform-specific file system notification facilities.
> +
> +It communicates directly with commands like `git status` using the
> +link:technical/api-simple-ipc.html[simple IPC] interface instead of
> +the slower linkgit:githooks[5] interface.
> +
> +OPTIONS
> +-------

I typically view "OPTIONS" as arguments that can be grouped together,
but you are describing things more like verbs or subcommands. The
most recent example I know about is 'git maintenance <subcommand>',
documented at [1].

[1] https://git-scm.com/docs/git-maintenance#_subcommands

> +
> +--start::
> +	Starts the fsmonitor daemon in the background.
> +
> +--run::
> +	Runs the fsmonitor daemon in the foreground.
> +
> +--stop::
> +	Stops the fsmonitor daemon running for the current working
> +	directory, if present.

I'm noticing "fsmonitor" in lowercase throughout this document. Is
that the intended case for user-facing documentation? I've been 
seeing "FS Monitor", "filesystem monitor", or even "File System
Monitor" in other places.

> +--is-running::
> +	Exits with zero status if the fsmonitor daemon is watching the
> +	current working directory.

Another potential name for this verb is "status".

> +--is-supported::
> +	Exits with zero status if the fsmonitor daemon feature is supported
> +	on this platform.

Ah, here is an indicator of whether the platform is supported. Please
include details for this command in the earlier documentation. I'll
check later to see if a message is also sent over 'stderr', which
would be helpful. Documenting the exit status is good for third-party
tools that might use this.

> +--query <token>::
> +	Connects to the fsmonitor daemon (starting it if necessary) and
> +	requests the list of changed files and directories since the
> +	given token.
> +	This is intended for testing purposes.
> +
> +--query-index::
> +	Read the current `<token>` from the File System Monitor index
> +	extension (if present) and use it to query the fsmonitor daemon.
> +	This is intended for testing purposes.

These two could be grouped as "query [--token=X|--index]", especially
because they are for testing purposes.

> +
> +--flush::
> +	Force the fsmonitor daemon to flush its in-memory cache and
> +	re-sync with the file system.
> +	This is intended for testing purposes.

Do you see benefits to these being available in the CLI? Could these
be better served as a test helper?

> +REMARKS
> +-------
> +The fsmonitor daemon is a long running process that will watch a single
> +working directory.  Commands, such as `git status`, should automatically
> +start it (if necessary) when `core.useBuiltinFSMonitor` is set to `true`
> +(see linkgit:git-config[1]).
> +
> +Configure the built-in FSMonitor via `core.useBuiltinFSMonitor` in each
> +working directory separately, or globally via `git config --global
> +core.useBuiltinFSMonitor true`.
> +
> +Tokens are opaque strings.  They are used by the fsmonitor daemon to
> +mark a point in time and the associated internal state.  Callers should
> +make no assumptions about the content of the token.  In particular,
> +they should not assume that it is a timestamp.
> +
> +Query commands send a request-token to the daemon and it responds with
> +a summary of the changes that have occurred since that token was
> +created.  The daemon also returns a response-token that the client can
> +use in a future query.
> +
> +For more information see the "File System Monitor" section in
> +linkgit:git-update-index[1].
> +
> +CAVEATS
> +-------
> +
> +The fsmonitor daemon does not currently know about submodules and does
> +not know to filter out file system events that happen within a
> +submodule.  If fsmonitor daemon is watching a super repo and a file is
> +modified within the working directory of a submodule, it will report
> +the change (as happening against the super repo).  However, the client
> +should properly ignore these extra events, so performance may be affected
> +but it should not cause an incorrect result.

There are several uses of the word "should" where I think "will" is a
more appropriate word. That is, unless we do not actually have confidence
in this behavior.

> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
>  This feature is intended to speed up git operations for repos that have
>  large working directories.
>  
> -It enables git to work together with a file system monitor (see the
> +It enables git to work together with a file system monitor (see
> +linkgit:git-fsmonitor--daemon[1]
> +and the
>  "fsmonitor-watchman" section of linkgit:githooks[5]) that can
>  inform it as to what files have been modified. This enables git to avoid
>  having to lstat() every file to find modified files.
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index b51959ff9418..b7d5e926f7b0 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -593,7 +593,8 @@ fsmonitor-watchman
>  
>  This hook is invoked when the configuration option `core.fsmonitor` is
>  set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
> -depending on the version of the hook to use.
> +depending on the version of the hook to use, unless overridden via
> +`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).
>  
>  Version 1 takes two arguments, a version (1) and the time in elapsed
>  nanoseconds since midnight, January 1, 1970.

These are good connections to make.

Since the documentation for the fsmonitor--daemon is so deep, this
patch might be served well to split into two: one that just documents
the daemon, and another that updates existing documentation to point
to the new file.

This does provide a good basis for me to investigate during the rest
of the review.

Thanks,
-Stolee


* Re: [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-04-01 15:40 ` [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-26 14:31   ` Derrick Stolee
  2021-04-26 20:20     ` Eric Sunshine
  2021-04-28 19:26     ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 14:31 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
> +#define FSMONITOR_DAEMON_IS_SUPPORTED 1
> +#else
> +#define FSMONITOR_DAEMON_IS_SUPPORTED 0
> +#endif
> +
> +/*
> + * A trivial function so that this source file always defines at least
> + * one symbol even when the feature is not supported.  This quiets an
> + * annoying compiler error.
> + */
> +int fsmonitor_ipc__is_supported(void)
> +{
> +	return FSMONITOR_DAEMON_IS_SUPPORTED;
> +}

I don't see any other use of FSMONITOR_DAEMON_IS_SUPPORTED,
so I was thinking you could use the #ifdef/#else/#endif
construct within the implementation of this method instead
of creating a macro outside. But my suggestion might be an
anti-pattern, so feel free to ignore me.
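
A minimal sketch of that alternative, assuming (as in the quoted patch)
that the build defines HAVE_FSMONITOR_DAEMON_BACKEND only on platforms
with a backend. The conditional moves inside the function body, so the
intermediate FSMONITOR_DAEMON_IS_SUPPORTED macro is not needed and the
file still always defines one symbol:

```c
/*
 * Sketch only: same observable behavior as the quoted function, with
 * the compile-time switch folded into the body.
 */
int fsmonitor_ipc__is_supported(void)
{
#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
	return 1;	/* a platform backend was compiled in */
#else
	return 0;	/* no backend on this platform */
#endif
}
```

Whether this reads better than the separate macro is a matter of taste;
both forms give callers a plain function to test at run time.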

> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
> +
> +GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor")
> +
> +enum ipc_active_state fsmonitor_ipc__get_state(void)
> +{
> +	return ipc_get_active_state(fsmonitor_ipc__get_path());
> +}
> +
> +static int spawn_daemon(void)
> +{
> +	const char *args[] = { "fsmonitor--daemon", "--start", NULL };
> +
> +	return run_command_v_opt_tr2(args, RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
> +				    "fsmonitor");
> +}
> +
> +int fsmonitor_ipc__send_query(const char *since_token,
> +			      struct strbuf *answer)
> +{
> +	int ret = -1;
> +	int tried_to_spawn = 0;
> +	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
> +	struct ipc_client_connection *connection = NULL;
> +	struct ipc_client_connect_options options
> +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
> +
> +	options.wait_if_busy = 1;
> +	options.wait_if_not_found = 0;
> +
> +	trace2_region_enter("fsm_client", "query", NULL);
> +
> +	trace2_data_string("fsm_client", NULL, "query/command",
> +			   since_token);
> +
> +try_again:
> +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
> +				       &connection);
> +
> +	switch (state) {
> +	case IPC_STATE__LISTENING:
> +		ret = ipc_client_send_command_to_connection(
> +			connection, since_token, answer);
> +		ipc_client_close_connection(connection);
> +
> +		trace2_data_intmax("fsm_client", NULL,
> +				   "query/response-length", answer->len);
> +
> +		if (fsmonitor_is_trivial_response(answer))
> +			trace2_data_intmax("fsm_client", NULL,
> +					   "query/trivial-response", 1);
> +
> +		goto done;
> +
> +	case IPC_STATE__NOT_LISTENING:
> +		ret = error(_("fsmonitor_ipc__send_query: daemon not available"));
> +		goto done;

I'll need to read up on the IPC layer a bit to find out the difference
between IPC_STATE__NOT_LISTENING and IPC_STATE__PATH_NOT_FOUND. When
testing on my macOS machine, I got this error. I was expecting the
daemon to be spawned. After spawning it myself, it started working.

I expect that there are some cases where the process can fail and the
named pipe is not cleaned up. Let's investigate that soon. I should
make it clear that I had tested the builtin FS Monitor on this machine
a few weeks ago, but hadn't been using it much since. We should auto-
recover from this situation.

But also: what is the cost of treating these two cases the same? Could
we attempt to "restart" the daemon by spawning a new one? Will the new
one find a way to kill a stale one?

(Reading on.)

> +	case IPC_STATE__PATH_NOT_FOUND:
> +		if (tried_to_spawn)
> +			goto done;
> +
> +		tried_to_spawn++;
> +		if (spawn_daemon())
> +			goto done;

This should return zero on success, OK.

> +		/*
> +		 * Try again, but this time give the daemon a chance to
> +		 * actually create the pipe/socket.
> +		 *
> +		 * Granted, the daemon just started so it can't possibly have
> +		 * any FS cached yet, so we'll always get a trivial answer.
> +		 * BUT the answer should include a new token that can serve
> +		 * as the basis for subsequent requests.
> +		 */
> +		options.wait_if_not_found = 1;
> +		goto try_again;

Because of the tried_to_spawn check, we will re-run the request over
IPC but will not retry the spawn_daemon() request. I'm unsure how
this could be helpful: is it possible that spawn_daemon() returns a
non-zero error code after starting the daemon and somehow that
daemon starts working? Or, is this a race-condition thing with parallel
processes also starting up the daemon? It could be good to use this
comment to describe why a retry might be helpful.
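
To make the control flow under discussion concrete, here is a small
stand-alone sketch of the "spawn at most once, then reconnect" pattern.
Everything in it is hypothetical scaffolding (struct fake_daemon,
fake_connect(), fake_spawn(), the CONN_* states); it does not use the
real IPC layer and only models the tried_to_spawn logic from the quoted
code:

```c
/* Hypothetical connection states, loosely mirroring the IPC states. */
enum conn_state { CONN_LISTENING, CONN_PATH_NOT_FOUND, CONN_OTHER_ERROR };

/* Hypothetical stand-in for the daemon, so the sketch can run alone. */
struct fake_daemon {
	int running;     /* non-zero once the daemon is "listening" */
	int spawn_calls; /* how many times a spawn was attempted */
	int spawn_fails; /* non-zero to simulate a failed spawn */
};

static enum conn_state fake_connect(const struct fake_daemon *d)
{
	return d->running ? CONN_LISTENING : CONN_PATH_NOT_FOUND;
}

static int fake_spawn(struct fake_daemon *d)
{
	d->spawn_calls++;
	if (d->spawn_fails)
		return -1;
	d->running = 1;
	return 0;
}

/* Returns 0 if a connection was made, -1 otherwise. */
static int query_with_one_spawn(struct fake_daemon *d)
{
	int tried_to_spawn = 0;

	for (;;) {
		switch (fake_connect(d)) {
		case CONN_LISTENING:
			return 0;
		case CONN_PATH_NOT_FOUND:
			if (tried_to_spawn)
				return -1; /* already spawned once; give up */
			tried_to_spawn = 1;
			if (fake_spawn(d))
				return -1; /* spawn itself failed */
			continue;          /* retry the connection only */
		default:
			return -1;
		}
	}
}
```

In this model the second pass through the loop exists only to connect
to the daemon that was just started; the spawn is never repeated.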

> +
> +	case IPC_STATE__INVALID_PATH:
> +		ret = error(_("fsmonitor_ipc__send_query: invalid path '%s'"),
> +			    fsmonitor_ipc__get_path());
> +		goto done;
> +
> +	case IPC_STATE__OTHER_ERROR:
> +	default:
> +		ret = error(_("fsmonitor_ipc__send_query: unspecified error on '%s'"),
> +			    fsmonitor_ipc__get_path());
> +		goto done;
> +	}
> +
> +done:
> +	trace2_region_leave("fsm_client", "query", NULL);
> +
> +	return ret;
> +}
> +
> +int fsmonitor_ipc__send_command(const char *command,
> +				struct strbuf *answer)
> +{
> +	struct ipc_client_connection *connection = NULL;
> +	struct ipc_client_connect_options options
> +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
> +	int ret;
> +	enum ipc_active_state state;
> +
> +	strbuf_reset(answer);
> +
> +	options.wait_if_busy = 1;
> +	options.wait_if_not_found = 0;
> +
> +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
> +				       &connection);
> +	if (state != IPC_STATE__LISTENING) {
> +		die("fsmonitor--daemon is not running");
> +		return -1;
> +	}
> +
> +	ret = ipc_client_send_command_to_connection(connection, command, answer);
> +	ipc_client_close_connection(connection);
> +
> +	if (ret == -1) {
> +		die("could not send '%s' command to fsmonitor--daemon",
> +		    command);
> +		return -1;
> +	}
> +
> +	return 0;
> +}

I wonder if this ...send_command() method is too generic. It might
be nice to have more structure to its inputs and outputs to lessen
the cognitive load when plugging into other portions of the code.
However, I'll wait to see what those consumers look like in case the
generality is merited.
>  struct category_description {
>  	uint32_t category;
> @@ -664,6 +665,9 @@ void get_version_info(struct strbuf *buf, int show_build_options)
>  		strbuf_addf(buf, "sizeof-size_t: %d\n", (int)sizeof(size_t));
>  		strbuf_addf(buf, "shell-path: %s\n", SHELL_PATH);
>  		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
> +
> +		if (fsmonitor_ipc__is_supported())
> +			strbuf_addstr(buf, "feature: fsmonitor--daemon\n");

This change might deserve its own patch, including some documentation
about how users can use 'git version --build-options' to determine if
the builtin FS Monitor feature is available on their platform.

Thanks,
-Stolee


* Re: [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-04-01 15:40 ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
@ 2021-04-26 14:56   ` Derrick Stolee
  2021-04-27  9:20     ` Ævar Arnfjörð Bjarmason
  2021-04-30 14:23     ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 14:56 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget, git
  Cc: Jeff Hostetler, Johannes Schindelin

On 4/1/21 11:40 AM, Johannes Schindelin via GitGitGadget wrote:
> @@ -2515,6 +2515,11 @@ int git_config_get_max_percent_split_change(void)
>  
>  int repo_config_get_fsmonitor(struct repository *r)
>  {
> +	if (r->settings.use_builtin_fsmonitor > 0) {

Don't forget to run prepare_repo_settings(r) first.

> +		core_fsmonitor = "(built-in daemon)";
> +		return 1;
> +	}
> +

I found this odd, assigning a string to core_fsmonitor that
would definitely cause a problem trying to execute it as a
hook. I wondered the need for it at all, but found that
there are several places in the FS Monitor subsystem that use
core_fsmonitor as if it was a boolean, indicating whether or
not the feature is enabled at all.

A cleaner way to handle this would be to hide the data behind
a helper method, say "fsmonitor_enabled()" that could then
check a value on the repository (or index) and store the hook
value as a separate value that is only used by the hook-based
implementation.

It's probably a good idea to do that cleanup now, before we
find by accident that we missed a gap and start trying to run
this bogus string as a hook invocation.
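
A rough sketch of what I mean, with the caveat that every name here
(enum fsmonitor_mode, struct fsmonitor_settings, fsmonitor_enabled(),
fsmonitor_hook_path()) is made up for illustration and is not existing
code in the tree. The point is only that "is the feature on?" and
"which hook do I run?" become separate questions:

```c
#include <stddef.h>

/* Hypothetical: how the feature is provided, if at all. */
enum fsmonitor_mode {
	FSMONITOR_MODE_DISABLED = 0,
	FSMONITOR_MODE_HOOK, /* external hook, e.g. a Watchman script */
	FSMONITOR_MODE_IPC   /* built-in daemon over IPC */
};

/* Hypothetical per-repository settings. */
struct fsmonitor_settings {
	enum fsmonitor_mode mode;
	const char *hook_path; /* only meaningful in HOOK mode */
};

/* Boolean question that callers currently answer via core_fsmonitor. */
static int fsmonitor_enabled(const struct fsmonitor_settings *s)
{
	return s->mode != FSMONITOR_MODE_DISABLED;
}

/* Only the hook-based implementation should ever ask for this. */
static const char *fsmonitor_hook_path(const struct fsmonitor_settings *s)
{
	return s->mode == FSMONITOR_MODE_HOOK ? s->hook_path : NULL;
}
```

With a split like this, the built-in daemon case never carries a
string that could be mistaken for a hook to execute.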
> -static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
> +static int query_fsmonitor(int version, struct index_state *istate, struct strbuf *query_result)
>  {
> +	struct repository *r = istate->repo ? istate->repo : the_repository;
> +	const char *last_update = istate->fsmonitor_last_update;
>  	struct child_process cp = CHILD_PROCESS_INIT;
>  	int result;
>  
>  	if (!core_fsmonitor)
>  		return -1;

Here is an example of it being used as a boolean.

> +	if (r->settings.use_builtin_fsmonitor > 0) {
> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
> +		return fsmonitor_ipc__send_query(last_update, query_result);
> +#else
> +		/* Fake a trivial response. */
> +		warning(_("fsmonitor--daemon unavailable; falling back"));
> +		strbuf_add(query_result, "/", 2);
> +		return 0;
> +#endif

This seems like a case where the helper fsmonitor_ipc__is_supported()
could be used instead of compile-time macros.

(I think this is especially true when we consider the future of the
feature on Linux and the possibility of the same compiled code needing
to check run-time properties of the platform for compatibility.)

> --- a/repo-settings.c
> +++ b/repo-settings.c
> @@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
>  		r->settings.core_multi_pack_index = value;
>  	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
>  
> +	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
> +		r->settings.use_builtin_fsmonitor = 1;
> +

Follows the patterns of repo settings. Good.



* Re: [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-04-01 15:40 ` [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
@ 2021-04-26 15:08   ` Derrick Stolee
  2021-04-26 15:45     ` Derrick Stolee
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 15:08 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND

I think these compile-time macros should be replaced with a
method call, as I've said before. It should be simple to say

	if (!fsmonitor_ipc__is_supported())
		die(_("fsmonitor--daemon is not supported on this platform"));

and call it a day. This can be done before parsing arguments.

> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
> +{
> +	enum daemon_mode {
> +		UNDEFINED_MODE,
> +	} mode = UNDEFINED_MODE;
> +
> +	struct option options[] = {
> +		OPT_END()
> +	};

I can see where you are going here, to use the parse-opts API
to get your "--<verb>" arguments to populate an 'enum'. However,
it seems like you will run into the problem where a user enters
multiple such arguments and you lose the information as the
parser overwrites 'mode' here.

Better to use a positional argument and drop the "--" prefix,
in my opinion.

Thanks,
-Stolee


* Re: [PATCH 06/23] fsmonitor--daemon: implement client command options
  2021-04-01 15:40 ` [PATCH 06/23] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
@ 2021-04-26 15:12   ` Derrick Stolee
  2021-04-30 14:33     ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 15:12 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Implement command options `--stop`, `--is-running`, `--query`,
> `--query-index`, and `--flush` to control and query the status of a
> `fsmonitor--daemon` server process (and implicitly start a server
> process if necessary).
> 
> Later commits will implement the actual server and monitor
> the file system.

As mentioned before, I think the "query", "query-index", and
"flush" commands are better served in a test helper. Luckily,
the implementation you give here seems rather straightforward
and could fit into a test helper without a lot of duplicated
boilerplate. That's a good sign for the API presented here.

As a bonus, you could delay the implementation of those test
helpers until they are going to be used in a test.

Thanks,
-Stolee


* Re: [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-04-01 15:40 ` [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
@ 2021-04-26 15:23   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 15:23 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:

> +# If your platform supports a built-in fsmonitor backend, set
> +# FSMONITOR_DAEMON_BACKEND to the name of the corresponding
> +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
> +# `fsmonitor_fs_listen__*()` routines.
I found this to be a little confusing, specifically that you
care about the "<name>" part of the filename, not the full file
name. Here is an option:

# If your platform supports a built-in fsmonitor backend, set
# FSMONITOR_DAEMON_BACKEND to "<name>", corresponding to the file
# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
# `fsmonitor_fs_listen__*()` routines.

Everything else looks pretty standard. Good to create stubs this
way so they can be consumed by a platform-agnostic caller and then
implemented with that context.

Thanks,
-Stolee


* Re: [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-04-26 15:08   ` Derrick Stolee
@ 2021-04-26 15:45     ` Derrick Stolee
  2021-04-30 14:31       ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 15:45 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/26/21 11:08 AM, Derrick Stolee wrote:
> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
> 
> I think these compile-time macros should be replaced with a
> method call, as I've said before. It should be simple to say
> 
> 	if (!fsmonitor_ipc__is_supported())
> 		die(_("fsmonitor--daemon is not supported on this platform"));
> 
> and call it a day. This can be done before parsing arguments.
> 
>> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
>> +{
>> +	enum daemon_mode {
>> +		UNDEFINED_MODE,
>> +	} mode = UNDEFINED_MODE;
>> +
>> +	struct option options[] = {
>> +		OPT_END()
>> +	};
> 
> I can see where you are going here, to use the parse-opts API
> to get your "--<verb>" arguments to populate an 'enum'. However,
> it seems like you will run into the problem where a user enters
> multiple such arguments and you lose the information as the
> parser overwrites 'mode' here.

I see that you use OPT_CMDMODE in your implementation, which
makes this concern invalid.
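
For readers unfamiliar with command-mode options, the behavior that
removes my concern can be modeled by the stand-alone sketch below. It
is not the parse-options API; parse_daemon_mode() and the MODE_*
values are hypothetical, and the sketch only shows the idea that a
second, different verb is rejected instead of silently overwriting the
first:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the daemon verbs. */
enum daemon_mode { MODE_UNDEFINED = 0, MODE_START, MODE_RUN, MODE_STOP };

/*
 * Returns 0 and sets *mode on success. Returns -1 on an unknown
 * argument or when two different verbs are given.
 */
static int parse_daemon_mode(int argc, const char **argv,
			     enum daemon_mode *mode)
{
	static const struct {
		const char *flag;
		enum daemon_mode mode;
	} table[] = {
		{ "--start", MODE_START },
		{ "--run", MODE_RUN },
		{ "--stop", MODE_STOP },
	};
	int i;
	size_t j;

	*mode = MODE_UNDEFINED;
	for (i = 0; i < argc; i++) {
		enum daemon_mode seen = MODE_UNDEFINED;

		for (j = 0; j < sizeof(table) / sizeof(table[0]); j++)
			if (!strcmp(argv[i], table[j].flag))
				seen = table[j].mode;

		if (seen == MODE_UNDEFINED)
			return -1; /* not a verb we know */
		if (*mode != MODE_UNDEFINED && *mode != seen)
			return -1; /* conflicting verbs */
		*mode = seen;
	}
	return 0;
}
```

In this sketch repeating the same verb is accepted, while mixing
"--start" with "--stop" is reported as an error.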

> Better to use a positional argument and drop the "--" prefix,
> in my opinion.

This is my personal taste, but the technical reason to do this
doesn't exist.

Thanks,
-Stolee


* Re: [PATCH 09/23] fsmonitor--daemon: implement daemon command options
  2021-04-01 15:40 ` [PATCH 09/23] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
@ 2021-04-26 15:47   ` Derrick Stolee
  2021-04-26 16:12     ` Derrick Stolee
  2021-04-30 15:59     ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 15:47 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
...
> +	/* Prepare to (recursively) watch the <worktree-root> directory. */
> +	strbuf_init(&state.path_worktree_watch, 0);
> +	strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree()));
> +	state.nr_paths_watching = 1;

Yes, let's watch the working directory.

> +	/*
> +	 * If ".git" is not a directory, then <gitdir> is not inside the
> +	 * cone of <worktree-root>, so set up a second watch for it.
> +	 */
> +	strbuf_init(&state.path_gitdir_watch, 0);
> +	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
> +	strbuf_addstr(&state.path_gitdir_watch, "/.git");
> +	if (!is_directory(state.path_gitdir_watch.buf)) {
> +		strbuf_reset(&state.path_gitdir_watch);
> +		strbuf_addstr(&state.path_gitdir_watch, absolute_path(get_git_dir()));
> +		state.nr_paths_watching = 2;
> +	}

But why watch the .git directory, especially for a worktree (or
submodule I guess)? What benefit do we get from events within the
.git directory? I'm expecting any event within the .git directory
should be silently ignored.

> +
>  static int is_ipc_daemon_listening(void)
>  {
>  	return fsmonitor_ipc__get_state() == IPC_STATE__LISTENING;
>  }
>  
> +static int try_to_run_foreground_daemon(void)
> +{
> +	/*
> +	 * Technically, we don't need to probe for an existing daemon
> +	 * process, since we could just call `fsmonitor_run_daemon()`
> +	 * and let it fail if the pipe/socket is busy.
> +	 *
> +	 * However, this method gives us a nicer error message for a
> +	 * common error case.
> +	 */
> +	if (is_ipc_daemon_listening())
> +		die("fsmonitor--daemon is already running.");

Here, it seems like we only care about IPC_STATE__LISTENING, while
earlier I mentioned that I ended up in IPC_STATE__NOT_LISTENING,
and manually running the daemon helped.

> +	return !!fsmonitor_run_daemon();
> +}

You are ignoring the IPC_STATE__NOT_LISTENING and creating a new
process, which is good. I'm just wondering why that state exists
and what is the proper way to handle it?

> +
> +#ifndef GIT_WINDOWS_NATIVE

You are already creating a platform-specific mechanism for the
filesystem watcher. Shouldn't the implementation of this method
be part of that file in compat/fsmonitor/?

I guess the biggest reason is that macOS and Linux share this
implementation, so maybe this is the cleanest approach.

> +
> +/*
> + * This is adapted from `wait_or_whine()`.  Watch the child process and
> + * let it get started and begin listening for requests on the socket
> + * before reporting our success.
> + */
> +static int wait_for_background_startup(pid_t pid_child)
> +{
> +	int status;
> +	pid_t pid_seen;
> +	enum ipc_active_state s;
> +	time_t time_limit, now;
> +
> +	time(&time_limit);
> +	time_limit += fsmonitor__start_timeout_sec;
> +
> +	for (;;) {
> +		pid_seen = waitpid(pid_child, &status, WNOHANG);
> +
> +		if (pid_seen == -1)
> +			return error_errno(_("waitpid failed"));
> +
> +		else if (pid_seen == 0) {

There is some non-standard whitespace throughout this
if/else if/else:
...
> +			continue;
> +		}
> +
> +		else if (pid_seen == pid_child) {
...
> +			return error(_("fsmonitor--daemon failed to start"));
> +		}
> +
> +		else
> +			return error(_("waitpid is confused"));

The rest of the glue in this patch looks good.

Thanks,
-Stolee


* Re: [PATCH 09/23] fsmonitor--daemon: implement daemon command options
  2021-04-26 15:47   ` Derrick Stolee
@ 2021-04-26 16:12     ` Derrick Stolee
  2021-04-30 15:18       ` Jeff Hostetler
  2021-04-30 15:59     ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 16:12 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/26/2021 11:47 AM, Derrick Stolee wrote:
> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
> ...
>> +	/* Prepare to (recursively) watch the <worktree-root> directory. */
>> +	strbuf_init(&state.path_worktree_watch, 0);
>> +	strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree()));
>> +	state.nr_paths_watching = 1;
> 
> Yes, let's watch the working directory.
> 
>> +	/*
>> +	 * If ".git" is not a directory, then <gitdir> is not inside the
>> +	 * cone of <worktree-root>, so set up a second watch for it.
>> +	 */
>> +	strbuf_init(&state.path_gitdir_watch, 0);
>> +	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
>> +	strbuf_addstr(&state.path_gitdir_watch, "/.git");
>> +	if (!is_directory(state.path_gitdir_watch.buf)) {
>> +		strbuf_reset(&state.path_gitdir_watch);
>> +		strbuf_addstr(&state.path_gitdir_watch, absolute_path(get_git_dir()));
>> +		state.nr_paths_watching = 2;
>> +	}
> 
> But why watch the .git directory, especially for a worktree (or
> submodule I guess)? What benefit do we get from events within the
> .git directory? I'm expecting any event within the .git directory
> should be silently ignored.

I see in a following patch that we place a cookie file within the
.git directory. I'm reminded that this is done for a reason: other
filesystem watchers can get into a loop if we place the cookie
file outside of the .git directory. The classic example is VS Code
running 'git status' in a loop because Watchman writes a cookie
into the root of the working directory.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 10/23] fsmonitor--daemon: add pathname classification
  2021-04-01 15:40 ` [PATCH 10/23] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
@ 2021-04-26 19:17   ` Derrick Stolee
  2021-04-26 20:11     ` Eric Sunshine
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 19:17 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
...
> +#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
> +
> +enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
> +	const char *rel)
> +{
> +	if (fspathncmp(rel, ".git", 4))
> +		return IS_WORKDIR_PATH;
> +	rel += 4;
> +
> +	if (!*rel)
> +		return IS_DOT_GIT;
> +	if (*rel != '/')
> +		return IS_WORKDIR_PATH; /* e.g. .gitignore */
> +	rel++;
> +
> +	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
> +			strlen(FSMONITOR_COOKIE_PREFIX)))

Seems like this strlen() could be abstracted out. Is it
something the compiler can compute and set for us? Or,
should we create a macro for this constant?

> +		return IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;
> +
> +	return IS_INSIDE_DOT_GIT;
> +}

Here is the reasoning I was missing for why we watch the .git
directory.

> +enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
> +	const char *rel)
> +{
> +	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
> +			strlen(FSMONITOR_COOKIE_PREFIX)))
> +		return IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX;
> +
> +	return IS_INSIDE_GITDIR;
> +}

And I was about to ask "what happens if we are watching the .git
directory of a worktree?" but here we have a different classifier.

> +static enum fsmonitor_path_type try_classify_workdir_abs_path(
> +	struct fsmonitor_daemon_state *state,
> +	const char *path)
> +{
> +	const char *rel;
> +
> +	if (fspathncmp(path, state->path_worktree_watch.buf,
> +		       state->path_worktree_watch.len))
> +		return IS_OUTSIDE_CONE;
> +
> +	rel = path + state->path_worktree_watch.len;
> +
> +	if (!*rel)
> +		return IS_WORKDIR_PATH; /* it is the root dir exactly */
> +	if (*rel != '/')
> +		return IS_OUTSIDE_CONE;
> +	rel++;
> +
> +	return fsmonitor_classify_path_workdir_relative(rel);
> +}
> +
> +enum fsmonitor_path_type fsmonitor_classify_path_absolute(
> +	struct fsmonitor_daemon_state *state,
> +	const char *path)
> +{
> +	const char *rel;
> +	enum fsmonitor_path_type t;
> +
> +	t = try_classify_workdir_abs_path(state, path);
> +	if (state->nr_paths_watching == 1)
> +		return t;
> +	if (t != IS_OUTSIDE_CONE)
> +		return t;
> +
> +	if (fspathncmp(path, state->path_gitdir_watch.buf,
> +		       state->path_gitdir_watch.len))
> +		return IS_OUTSIDE_CONE;
> +
> +	rel = path + state->path_gitdir_watch.len;
> +
> +	if (!*rel)
> +		return IS_GITDIR; /* it is the <gitdir> exactly */
> +	if (*rel != '/')
> +		return IS_OUTSIDE_CONE;
> +	rel++;
> +
> +	return fsmonitor_classify_path_gitdir_relative(rel);
> +}

And here is where you differentiate the event across the two
cases. OK.

> +/*
> + * Pathname classifications.
> + *
> + * The daemon classifies the pathnames that it receives from file
> + * system notification events into the following categories and uses
> + * that to decide whether clients are told about them.  (And to watch
> + * for file system synchronization events.)
> + *
> + * The client should only care about paths within the working
> + * directory proper (inside the working directory and not ".git" nor
> + * inside of ".git/").  That is, the client has read the index and is
> + * asking for a list of any paths in the working directory that have
> + * been modified since the last token.  The client does not care about
> + * file system changes within the .git directory (such as new loose
> + * objects or packfiles).  So the client will only receive paths that
> + * are classified as IS_WORKDIR_PATH.
> + *
> + * The daemon uses the IS_DOT_GIT and IS_GITDIR internally to mean the
> + * exact ".git" directory or GITDIR.  If the daemon receives a delete
> + * event for either of these directories, it will automatically
> + * shutdown, for example.
> + *
> + * Note that the daemon DOES NOT explicitly watch nor special case the
> + * ".git/index" file.  The daemon does not read the index and does not
> + * have any internal index-relative state.  The daemon only collects
> + * the set of modified paths within the working directory.
> + */
> +enum fsmonitor_path_type {
> +	IS_WORKDIR_PATH = 0,
> +
> +	IS_DOT_GIT,
> +	IS_INSIDE_DOT_GIT,
> +	IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX,
> +
> +	IS_GITDIR,
> +	IS_INSIDE_GITDIR,
> +	IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX,
> +
> +	IS_OUTSIDE_CONE,
> +};
> +
> +/*
> + * Classify a pathname relative to the root of the working directory.
> + */
> +enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
> +	const char *relative_path);
> +
> +/*
> + * Classify a pathname relative to a <gitdir> that is external to the
> + * worktree directory.
> + */
> +enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
> +	const char *relative_path);
> +
> +/*
> + * Classify an absolute pathname received from a filesystem event.
> + */
> +enum fsmonitor_path_type fsmonitor_classify_path_absolute(
> +	struct fsmonitor_daemon_state *state,
> +	const char *path);
> +
>  #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
>  #endif /* FSMONITOR_DAEMON_H */

Had I looked ahead and read these comments beforehand, then I would
have had an easier time determining the intended behavior from the
implementations. Oops.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 11/23] fsmonitor--daemon: define token-ids
  2021-04-01 15:40 ` [PATCH 11/23] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
@ 2021-04-26 19:49   ` Derrick Stolee
  2021-04-26 20:01     ` Eric Sunshine
  2021-04-30 16:17     ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 19:49 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon to create token-ids and define the
> overall token naming scheme.
...
> +/*
> + * Requests to and from a FSMonitor Protocol V2 provider use an opaque
> + * "token" as a virtual timestamp.  Clients can request a summary of all
> + * created/deleted/modified files relative to a token.  In the response,
> + * clients receive a new token for the next (relative) request.
> + *
> + *
> + * Token Format
> + * ============
> + *
> + * The contents of the token are private and provider-specific.
> + *
> + * For the built-in fsmonitor--daemon, we define a token as follows:
> + *
> + *     "builtin" ":" <token_id> ":" <sequence_nr>
> + *
> + * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
> + * UUID, or {timestamp,pid}.  It is used to group all filesystem
> + * events that happened while the daemon was monitoring (and in-sync
> + * with the filesystem).
> + *
> + *     Unlike FSMonitor Protocol V1, it is not defined as a timestamp
> + *     and does not define less-than/greater-than relationships.
> + *     (There are too many race conditions to rely on file system
> + *     event timestamps.)
> + *
> + * The <sequence_nr> is a simple integer incremented for each event
> + * received.  When a new <token_id> is created, the <sequence_nr> is
> + * reset to zero.
> + *
> + *
> + * About Token Ids
> + * ===============
> + *
> + * A new token_id is created:
> + *
> + * [1] each time the daemon is started.
> + *
> + * [2] any time that the daemon must re-sync with the filesystem
> + *     (such as when the kernel drops or we miss events on a very
> + *     active volume).
> + *
> + * [3] in response to a client "flush" command (for dropped event
> + *     testing).
> + *
> + * [4] MAYBE We might want to change the token_id after very complex
> + *     filesystem operations are performed, such as a directory move
> + *     sequence that affects many files within.  It might be simpler
> + *     to just give up and fake a re-sync (and let the client do a
> + *     full scan) than try to enumerate the effects of such a change.
> + *
> + * When a new token_id is created, the daemon is free to discard all
> + * cached filesystem events associated with any previous token_ids.
> + * Events associated with a non-current token_id will never be sent
> + * to a client.  A token_id change implicitly means that the daemon
> + * has a gap in its event history.
> + *
> + * Therefore, clients that present a token with a stale (non-current)
> + * token_id will always be given a trivial response.

From this comment, it seems to be the case that concurrent Git
commands will race to advance the FS Monitor token and one of them
will lose, causing a full working directory scan. There is no list
of "recent" tokens.

I could see this changing in the future, but for now it is a
reasonable simplification.

> + */
> +struct fsmonitor_token_data {
> +	struct strbuf token_id;
> +	struct fsmonitor_batch *batch_head;
> +	struct fsmonitor_batch *batch_tail;
> +	uint64_t client_ref_count;
> +};
> +
> +static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
> +{
> +	static int test_env_value = -1;
> +	static uint64_t flush_count = 0;
> +	struct fsmonitor_token_data *token;
> +
> +	token = (struct fsmonitor_token_data *)xcalloc(1, sizeof(*token));

I think the best practice here is "CALLOC_ARRAY(token, 1);"

> +
> +	strbuf_init(&token->token_id, 0);

This is likely overkill since you used calloc() above.

> +	token->batch_head = NULL;
> +	token->batch_tail = NULL;
> +	token->client_ref_count = 0;
> +
> +	if (test_env_value < 0)
> +		test_env_value = git_env_bool("GIT_TEST_FSMONITOR_TOKEN", 0);
> +
> +	if (!test_env_value) {
> +		struct timeval tv;
> +		struct tm tm;
> +		time_t secs;
> +
> +		gettimeofday(&tv, NULL);
> +		secs = tv.tv_sec;
> +		gmtime_r(&secs, &tm);
> +
> +		strbuf_addf(&token->token_id,
> +			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
> +			    flush_count++,
> +			    getpid(),
> +			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
> +			    tm.tm_hour, tm.tm_min, tm.tm_sec,
> +			    (long)tv.tv_usec);

Between the PID, the flush count, and how deep you go in the
timestamp, this seems to be specific enough.

> +	} else {
> +		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);

And this will be nice for testing.

> +	}
> +
> +	return token;
> +}
> +
>  static ipc_server_application_cb handle_client;
>  
>  static int handle_client(void *data, const char *command,
> @@ -330,7 +436,7 @@ static int fsmonitor_run_daemon(void)
>  
>  	pthread_mutex_init(&state.main_lock, NULL);
>  	state.error_code = 0;
> -	state.current_token_data = NULL;
> +	state.current_token_data = fsmonitor_new_token_data();
>  	state.test_client_delay_ms = 0;
>  
>  	/* Prepare to (recursively) watch the <worktree-root> directory. */
> 

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 11/23] fsmonitor--daemon: define token-ids
  2021-04-26 19:49   ` Derrick Stolee
@ 2021-04-26 20:01     ` Eric Sunshine
  2021-04-26 20:03       ` Derrick Stolee
  2021-04-30 16:17     ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Eric Sunshine @ 2021-04-26 20:01 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On Mon, Apr 26, 2021 at 3:49 PM Derrick Stolee <stolee@gmail.com> wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> > +     token = (struct fsmonitor_token_data *)xcalloc(1, sizeof(*token));
>
> I think the best practice here is "CALLOC_ARRAY(token, 1);"
>
> > +
> > +     strbuf_init(&token->token_id, 0);
>
> This is likely overkill since you used calloc() above.

Not quite. A strbuf must be initialized either with STRBUF_INIT or
strbuf_init() in order to make strbuf.buf point at strbuf_slopbuf.
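
As an illustration only, here is a minimal sketch of that distinction. The
names mini_strbuf, mini_slopbuf, mini_strbuf_init() and
mini_strbuf_is_usable() are hypothetical stand-ins that merely mimic the
shape of a strbuf; they are not Git's actual definitions. A calloc()'d
struct ends up with a NULL buf, while an initialized one points at a shared
empty string and can always be treated as a valid C string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf-like type; not Git's real struct. */
struct mini_strbuf {
	size_t alloc;
	size_t len;
	char *buf;
};

/* Shared empty string that every freshly initialized buffer points at. */
static char mini_slopbuf[1];

static void mini_strbuf_init(struct mini_strbuf *sb)
{
	sb->alloc = 0;
	sb->len = 0;
	sb->buf = mini_slopbuf;
}

/* Returns 1 if sb->buf is a NUL-terminated string of length sb->len. */
static int mini_strbuf_is_usable(const struct mini_strbuf *sb)
{
	return sb->buf != NULL && sb->buf[sb->len] == '\0';
}
```

With these stand-ins, a zero-filled struct fails the usability check until
mini_strbuf_init() has been called on it.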

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 11/23] fsmonitor--daemon: define token-ids
  2021-04-26 20:01     ` Eric Sunshine
@ 2021-04-26 20:03       ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 20:03 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On 4/26/2021 4:01 PM, Eric Sunshine wrote:
> On Mon, Apr 26, 2021 at 3:49 PM Derrick Stolee <stolee@gmail.com> wrote:
>> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>>> +     token = (struct fsmonitor_token_data *)xcalloc(1, sizeof(*token));
>>
>> I think the best practice here is "CALLOC_ARRAY(token, 1);"
>>
>>> +
>>> +     strbuf_init(&token->token_id, 0);
>>
>> This is likely overkill since you used calloc() above.
> 
> Not quite. A strbuf must be initialized either with STRBUF_INIT or
> strbuf_init() in order to make strbuf.buf point at strbuf_slopbuf.

Thanks! I didn't know that detail, but it makes a lot of sense.

-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 10/23] fsmonitor--daemon: add pathname classification
  2021-04-26 19:17   ` Derrick Stolee
@ 2021-04-26 20:11     ` Eric Sunshine
  2021-04-26 20:24       ` Derrick Stolee
  0 siblings, 1 reply; 237+ messages in thread
From: Eric Sunshine @ 2021-04-26 20:11 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On Mon, Apr 26, 2021 at 3:17 PM Derrick Stolee <stolee@gmail.com> wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> > +#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
> > +
> > +     if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
> > +                     strlen(FSMONITOR_COOKIE_PREFIX)))
>
> Seems like this strlen() could be abstracted out. Is it
> something the compiler can compute and set for us? Or,
> should we create a macro for this constant?

If you're asking whether the compiler will resolve strlen("literal
string") to an integer constant at compile time rather than computing
the length at runtime, then the answer is that on this project we
presume that the compiler is smart enough to do that.

Or are you asking for a function something like this?

    fspathhasprefix(rel, FSMONITOR_COOKIE_PREFIX)
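
A minimal sketch of what such a helper might look like, under stated
assumptions: path_has_prefix_lit() and COOKIE_PREFIX are hypothetical names,
and plain strncmp() stands in for fspathncmp(), so this version is always
case-sensitive. It relies on sizeof() of a string literal, which is a
compile-time constant by definition rather than by optimization.

```c
#include <assert.h>
#include <string.h>

#define COOKIE_PREFIX ".fsmonitor-daemon-"

/*
 * Hypothetical helper: only valid when `lit` is a string literal (or a
 * macro expanding to one), because sizeof(lit) - 1 is then the length
 * of the literal without its trailing NUL.
 */
#define path_has_prefix_lit(path, lit) \
	(!strncmp((path), (lit), sizeof(lit) - 1))
```

A call site would then read path_has_prefix_lit(rel, COOKIE_PREFIX). Passing
a char pointer instead of a literal would silently use the pointer size, so
a real helper would need a guard against that.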

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-04-26 14:31   ` Derrick Stolee
@ 2021-04-26 20:20     ` Eric Sunshine
  2021-04-26 21:02       ` Derrick Stolee
  2021-04-28 19:26     ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Eric Sunshine @ 2021-04-26 20:20 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On Mon, Apr 26, 2021 at 10:31 AM Derrick Stolee <stolee@gmail.com> wrote:
> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> > +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
> > +#define FSMONITOR_DAEMON_IS_SUPPORTED 1
> > +#else
> > +#define FSMONITOR_DAEMON_IS_SUPPORTED 0
> > +#endif
> > +
> > +int fsmonitor_ipc__is_supported(void)
> > +{
> > +     return FSMONITOR_DAEMON_IS_SUPPORTED;
> > +}
>
> I don't see any other use of FSMONITOR_DAEMON_IS_SUPPORTED,
> so I was thinking you could use the #ifdef/#else/#endif
> construct within the implementation of this method instead
> of creating a macro outside. But my suggestion might be an
> anti-pattern, so feel free to ignore me.

On this project, it is preferred to keep the #if / #else / #endif
outside of functions since embedding them within functions often makes
it difficult to follow how the code flows (and generally makes
functions unnecessarily noisy). So, the way Jeff did this seems fine.

An alternative would have been:

  #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
  #define fsmonitor_ipc__is_supported() 1
  #else
  #define fsmonitor_ipc__is_supported() 0
  #endif

which would still allow calling it as a function:

    if (fsmonitor_ipc__is_supported())
        ...

but it's subjective whether that's actually any cleaner or better.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache
  2021-04-01 15:40 ` [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
@ 2021-04-26 20:22   ` Derrick Stolee
  2021-04-30 17:36     ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 20:22 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon to build lists of changed paths and associate
> them with a token-id.  This will be used by the platform-specific
> backends to accumulate changed paths in response to filesystem events.
> 
> The platform-specific event loops receive batches containing one or
> more changed paths.  Their fs listener thread will accumulate them in

I think the lowercase "fs" here is strange. "Their listener thread"
could be interpreted as the IPC listener, so it's probably best to
spell it out: "Their filesystem listener thread".

> a `fsmonitor_batch` (and without locking) and then "publish" them to
> associate them with the current token and to make them visible to the
> client worker threads.
...
> +struct fsmonitor_batch {
> +	struct fsmonitor_batch *next;
> +	uint64_t batch_seq_nr;
> +	const char **interned_paths;
> +	size_t nr, alloc;
> +	time_t pinned_time;
> +};

A linked list to help with adding while consuming it, but also
batching for efficiency. I can see how this will work out
nicely.

> +struct fsmonitor_batch *fsmonitor_batch__new(void)
> +{
> +	struct fsmonitor_batch *batch = xcalloc(1, sizeof(*batch));

I mentioned earlier that I think `CALLOC_ARRAY(batch, 1)` is the
typical pattern here.

> +
> +	return batch;
> +}
> +
> +struct fsmonitor_batch *fsmonitor_batch__free(struct fsmonitor_batch *batch)

Since this method frees the tip of the list and returns the next
item (instead of freeing the entire list), perhaps this would be
better named as _pop()?

> +{
> +	struct fsmonitor_batch *next;
> +
> +	if (!batch)
> +		return NULL;
> +
> +	next = batch->next;
> +
> +	/*
> +	 * The actual strings within the array are interned, so we don't
> +	 * own them.
> +	 */
> +	free(batch->interned_paths);
> +
> +	return next;
> +}
> +
> +void fsmonitor_batch__add_path(struct fsmonitor_batch *batch,
> +			       const char *path)
> +{
> +	const char *interned_path = strintern(path);

This use of interned paths is interesting, although I become
concerned for the amount of memory we are consuming over the
lifetime of the process. This could be considered as a target
for future improvements, perhaps with an LRU cache or something
similar.

> +
> +	trace_printf_key(&trace_fsmonitor, "event: %s", interned_path);
> +
> +	ALLOC_GROW(batch->interned_paths, batch->nr + 1, batch->alloc);
> +	batch->interned_paths[batch->nr++] = interned_path;
> +}
> +
> +static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
> +				     const struct fsmonitor_batch *batch_src)
> +{
> +	/* assert state->main_lock */
> +

This comment seems stale.

> +	size_t k;
> +
> +	ALLOC_GROW(batch_dest->interned_paths,
> +		   batch_dest->nr + batch_src->nr + 1,
> +		   batch_dest->alloc);
> +
> +	for (k = 0; k < batch_src->nr; k++)
> +		batch_dest->interned_paths[batch_dest->nr++] =
> +			batch_src->interned_paths[k];
> +}
> +
> +static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)

This one _does_ free the whole list.

> +{
> +	struct fsmonitor_batch *p;
> +
> +	if (!token)
> +		return;
> +
> +	assert(token->client_ref_count == 0);
> +
> +	strbuf_release(&token->token_id);
> +
> +	for (p = token->batch_head; p; p = fsmonitor_batch__free(p))
> +		;
> +
> +	free(token);
> +}
> +
> +/*
> + * Flush all of our cached data about the filesystem.  Call this if we
> + * lose sync with the filesystem and miss some notification events.
> + *
> + * [1] If we are missing events, then we no longer have a complete
> + *     history of the directory (relative to our current start token).
> + *     We should create a new token and start fresh (as if we just
> + *     booted up).
> + *
> + * If there are no readers of the current token data series, we
> + * can free it now.  Otherwise, let the last reader free it.  Either
> + * way, the old token data series is no longer associated with our
> + * state data.
> + */
> +void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
> +{
> +	struct fsmonitor_token_data *free_me = NULL;
> +	struct fsmonitor_token_data *new_one = NULL;
> +
> +	new_one = fsmonitor_new_token_data();
> +
> +	pthread_mutex_lock(&state->main_lock);
> +
> +	trace_printf_key(&trace_fsmonitor,
> +			 "force resync [old '%s'][new '%s']",
> +			 state->current_token_data->token_id.buf,
> +			 new_one->token_id.buf);
> +
> +	if (state->current_token_data->client_ref_count == 0)
> +		free_me = state->current_token_data;
> +	state->current_token_data = new_one;
> +
> +	pthread_mutex_unlock(&state->main_lock);
> +
> +	fsmonitor_free_token_data(free_me);
> +}
> +

Swap the pointer under a lock, free outside of it. Good.
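
The pattern in isolation, as a minimal sketch: demo_token, demo_state and
demo_swap_token() are hypothetical, simplified stand-ins and not the
daemon's real types. The lock is held only long enough to exchange the
pointer and to decide who owns the old value; the free() itself happens
after the unlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the token data and daemon state. */
struct demo_token {
	int id;
	unsigned ref_count;
};

struct demo_state {
	pthread_mutex_t lock;
	struct demo_token *current;
};

/*
 * Install `fresh` as the current token.  Returns the old token if the
 * caller should free it (no reader holds a reference), or NULL if a
 * reader still holds it and is expected to free it later.
 */
static struct demo_token *demo_swap_token(struct demo_state *state,
					  struct demo_token *fresh)
{
	struct demo_token *free_me = NULL;

	pthread_mutex_lock(&state->lock);
	if (state->current && state->current->ref_count == 0)
		free_me = state->current;
	state->current = fresh;
	pthread_mutex_unlock(&state->lock);

	/* The caller frees the result, outside of the lock. */
	return free_me;
}
```

Keeping free() out of the critical section means other threads never wait
on the allocator while contending for the lock.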

> +/*
> + * We try to combine small batches at the front of the batch-list to avoid
> + * having a long list.  This hopefully makes it a little easier when we want
> + * to truncate and maintain the list.  However, we don't want the paths array
> + * to just keep growing and growing with realloc, so we insert an arbitrary
> + * limit.
> + */
> +#define MY_COMBINE_LIMIT (1024)
> +
> +void fsmonitor_publish(struct fsmonitor_daemon_state *state,
> +		       struct fsmonitor_batch *batch,
> +		       const struct string_list *cookie_names)
> +{
> +	if (!batch && !cookie_names->nr)
> +		return;
> +
> +	pthread_mutex_lock(&state->main_lock);
> +
> +	if (batch) {
> +		struct fsmonitor_batch *head;
> +
> +		head = state->current_token_data->batch_head;
> +		if (!head) {
> +			batch->batch_seq_nr = 0;
> +			batch->next = NULL;
> +			state->current_token_data->batch_head = batch;
> +			state->current_token_data->batch_tail = batch;
> +		} else if (head->pinned_time) {
> +			/*
> +			 * We cannot alter the current batch list
> +			 * because:
> +			 *
> +			 * [a] it is being transmitted to at least one
> +			 * client and the handle_client() thread has a
> +			 * ref-count, but not a lock on the batch list
> +			 * starting with this item.
> +			 *
> +			 * [b] it has been transmitted in the past to
> +			 * at least one client such that future
> +			 * requests are relative to this head batch.
> +			 *
> +			 * So, we can only prepend a new batch onto
> +			 * the front of the list.
> +			 */
> +			batch->batch_seq_nr = head->batch_seq_nr + 1;
> +			batch->next = head;
> +			state->current_token_data->batch_head = batch;
> +		} else if (head->nr + batch->nr > MY_COMBINE_LIMIT) {
> +			/*
> +			 * The head batch in the list has never been
> +			 * transmitted to a client, but folding the
> +			 * contents of the new batch onto it would
> +			 * exceed our arbitrary limit, so just prepend
> +			 * the new batch onto the list.
> +			 */
> +			batch->batch_seq_nr = head->batch_seq_nr + 1;
> +			batch->next = head;
> +			state->current_token_data->batch_head = batch;
> +		} else {
> +			/*
> +			 * We are free to append the paths in the given
> +			 * batch onto the end of the current head batch.
> +			 */
> +			fsmonitor_batch__combine(head, batch);
> +			fsmonitor_batch__free(batch);
> +		}
> +	}
> +
> +	pthread_mutex_unlock(&state->main_lock);
> +}

I appreciate the careful comments in this critical piece of the
data structure. Also, it is good that you already have a batch
of results to merge into the list instead of updating a lock for
every filesystem event.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 10/23] fsmonitor--daemon: add pathname classification
  2021-04-26 20:11     ` Eric Sunshine
@ 2021-04-26 20:24       ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 20:24 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On 4/26/2021 4:11 PM, Eric Sunshine wrote:
> On Mon, Apr 26, 2021 at 3:17 PM Derrick Stolee <stolee@gmail.com> wrote:
>> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>>> +#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
>>> +
>>> +     if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
>>> +                     strlen(FSMONITOR_COOKIE_PREFIX)))
>>
>> Seems like this strlen() could be abstracted out. Is it
>> something the compiler can compute and set for us? Or,
>> should we create a macro for this constant?
> 
> If you're asking whether the compiler will resolve strlen("literal
> string") to an integer constant at compile time rather than computing
> the length at runtime, then the answer is that on this project we
> presume that the compiler is smart enough to do that.

That is what I was asking.

> Or are you asking for a function something like this?
> 
>     fspathhasprefix(rel, FSMONITOR_COOKIE_PREFIX)

The "fix" I would recommend otherwise would be

	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
			FSMONITOR_COOKIE_PREFIX_LEN))

which is much uglier. I'm glad we can trust the compiler to
be smart enough.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 16/23] fsmonitor--daemon: implement handle_client callback
  2021-04-01 15:40 ` [PATCH 16/23] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
@ 2021-04-26 21:01   ` Derrick Stolee
  2021-05-03 15:04     ` Jeff Hostetler
  2021-05-13 18:52   ` Derrick Stolee
  1 sibling, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 21:01 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon to respond to IPC requests from client
> Git processes and respond with a list of modified pathnames
> relative to the provided token.

(I'm skipping ahead to this part. I'll examine the platform
specific bits after I finish with "the Git bits".)

> +static void fsmonitor_format_response_token(
> +	struct strbuf *response_token,
> +	const struct strbuf *response_token_id,
> +	const struct fsmonitor_batch *batch)
> +{
> +	uint64_t seq_nr = (batch) ? batch->batch_seq_nr + 1 : 0;
> +
> +	strbuf_reset(response_token);
> +	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
> +		    response_token_id->buf, seq_nr);

Ah, right. The token string gets _even more specific_ to allow
for multiple "checkpoints" within a batch.

> +static int fsmonitor_parse_client_token(const char *buf_token,
> +					struct strbuf *requested_token_id,
> +					uint64_t *seq_nr)
> +{
> +	const char *p;
> +	char *p_end;
> +
> +	strbuf_reset(requested_token_id);
> +	*seq_nr = 0;
> +
> +	if (!skip_prefix(buf_token, "builtin:", &p))
> +		return 1;
> +
> +	while (*p && *p != ':')
> +		strbuf_addch(requested_token_id, *p++);

My mind is going towards microoptimizations, but I wonder if there
is a difference using

	q = strchr(p, ':');
	if (!q)
		return 1;
	strbuf_add(requested_token_id, p, q - p);

We trade one scan with several method calls for two scans and two
method calls, but those two scans are highly optimized.

Probably not worth it, as this is something like 20 bytes of data
per round-trip.
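
For reference, a self-contained sketch of parsing the
"builtin:<token_id>:<sequence_nr>" format with strchr(). The function
demo_parse_token() and its fixed-size output buffer are hypothetical and
only illustrate the approach; they do not use strbuf or skip_prefix().

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical parser for "builtin:<token_id>:<sequence_nr>".
 * Copies <token_id> into id_out (capacity id_cap) and stores the
 * sequence number in *seq_nr.  Returns 0 on success and 1 on any
 * malformed input.
 */
static int demo_parse_token(const char *token, char *id_out, size_t id_cap,
			    uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const char *p, *q;
	char *end;
	uintmax_t n;

	if (strncmp(token, prefix, sizeof(prefix) - 1))
		return 1;
	p = token + sizeof(prefix) - 1;

	/* One optimized scan for the separator, then one bulk copy. */
	q = strchr(p, ':');
	if (!q || q == p || (size_t)(q - p) >= id_cap)
		return 1;
	memcpy(id_out, p, (size_t)(q - p));
	id_out[q - p] = '\0';

	q++;
	if (*q < '0' || *q > '9')
		return 1; /* strtoumax() alone would accept signs and spaces */
	errno = 0;
	n = strtoumax(q, &end, 10);
	if (errno || *end)
		return 1;

	*seq_nr = (uint64_t)n;
	return 0;
}
```

A V1-style timestamp such as "1617291600" fails the prefix check, and a
token with no sequence number fails the strchr() check, so both are rejected
before any copying happens.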

> +	if (!skip_prefix(command, "builtin:", &p)) {
> +		/* assume V1 timestamp or garbage */
> +
> +		char *p_end;
> +
> +		strtoumax(command, &p_end, 10);
> +		trace_printf_key(&trace_fsmonitor,
> +				 ((*p_end) ?
> +				  "fsmonitor: invalid command line '%s'" :
> +				  "fsmonitor: unsupported V1 protocol '%s'"),
> +				 command);
> +		result = -1;
> +		goto send_trivial_response;
> +	}

This is an interesting protection for users who currently use FS
Monitor and then upgrade to the builtin approach.

> +	if (fsmonitor_parse_client_token(command, &requested_token_id,
> +					 &requested_oldest_seq_nr)) {

It appears you will call skip_prefix() twice this way, once to
determine we are actually the right kind of token, but a second
time as part of this call. Perhaps the helper method could start
from 'p' which has already advanced beyond "builtin:"?

> +		trace_printf_key(&trace_fsmonitor,
> +				 "fsmonitor: invalid V2 protocol token '%s'",
> +				 command);
> +		result = -1;
> +		goto send_trivial_response;
> +	}

This method is getting a bit long. Could the interesting data
structure code below be extracted as a method?

> +	pthread_mutex_lock(&state->main_lock);
> +
> +	if (!state->current_token_data) {
> +		/*
> +		 * We don't have a current token.  This may mean that
> +		 * the listener thread has not yet started.
> +		 */
> +		pthread_mutex_unlock(&state->main_lock);
> +		result = 0;
> +		goto send_trivial_response;
> +	}
> +	if (strcmp(requested_token_id.buf,
> +		   state->current_token_data->token_id.buf)) {
> +		/*
> +		 * The client last spoke to a different daemon
> +		 * instance -OR- the daemon had to resync with
> +		 * the filesystem (and lost events), so reject.
> +		 */
> +		pthread_mutex_unlock(&state->main_lock);
> +		result = 0;
> +		trace2_data_string("fsmonitor", the_repository,
> +				   "response/token", "different");
> +		goto send_trivial_response;
> +	}
> +	if (!state->current_token_data->batch_tail) {
> +		/*
> +		 * The listener has not received any filesystem
> +		 * events yet since we created the current token.
> +		 * We can respond with an empty list, since the
> +		 * client has already seen the current token and
> +		 * we have nothing new to report.  (This is
> +		 * instead of sending a trivial response.)
> +		 */
> +		pthread_mutex_unlock(&state->main_lock);
> +		result = 0;
> +		goto send_empty_response;
> +	}
> +	if (requested_oldest_seq_nr <
> +	    state->current_token_data->batch_tail->batch_seq_nr) {
> +		/*
> +		 * The client wants older events than we have for
> +		 * this token_id.  This means that the end of our
> +		 * batch list was truncated and we cannot give the
> +		 * client a complete snapshot relative to their
> +		 * request.
> +		 */
> +		pthread_mutex_unlock(&state->main_lock);
> +
> +		trace_printf_key(&trace_fsmonitor,
> +				 "client requested truncated data");
> +		result = 0;
> +		goto send_trivial_response;
> +	}

If these are part of a helper method, then they could be reorganized
to "goto" the end of the method which returns an error code after
unlocking the mutex. The multiple unlocks are making me nervous.
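
Roughly what I have in mind, as a sketch only: demo_req_state,
demo_verdict and demo_check_request() are hypothetical names, and the state
is trimmed down to what the checks need. It also ignores the ref-counting
and pinning that the real success path performs while still holding the
lock, so a real helper would have to do that work before reaching the
label.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

enum demo_verdict {
	DEMO_SEND_PATHS,   /* walk the batch list */
	DEMO_SEND_EMPTY,   /* nothing new since the client's token */
	DEMO_SEND_TRIVIAL  /* client must do a full scan */
};

/* Hypothetical, trimmed-down view of the daemon state. */
struct demo_req_state {
	pthread_mutex_t lock;
	const char *token_id;    /* NULL if there is no current token */
	int have_batches;
	uint64_t oldest_seq_nr;  /* valid only if have_batches is set */
};

static enum demo_verdict demo_check_request(struct demo_req_state *state,
					    const char *requested_id,
					    uint64_t requested_seq_nr)
{
	enum demo_verdict verdict = DEMO_SEND_PATHS;

	pthread_mutex_lock(&state->lock);

	if (!state->token_id || strcmp(requested_id, state->token_id)) {
		verdict = DEMO_SEND_TRIVIAL;
		goto done;
	}
	if (!state->have_batches) {
		verdict = DEMO_SEND_EMPTY;
		goto done;
	}
	if (requested_seq_nr < state->oldest_seq_nr)
		verdict = DEMO_SEND_TRIVIAL;

done:
	/* The only unlock in the function. */
	pthread_mutex_unlock(&state->lock);
	return verdict;
}
```

The caller can then switch on the verdict to pick between the trivial,
empty, and full responses without touching the mutex itself.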

> +
> +	/*
> +	 * We're going to hold onto a pointer to the current
> +	 * token-data while we walk the list of batches of files.
> +	 * During this time, we will NOT be under the lock.
> +	 * So we ref-count it.

I was wondering if this would happen. I'm glad it is.

> +	 * This allows the listener thread to continue prepending
> +	 * new batches of items to the token-data (which we'll ignore).
> +	 *
> +	 * AND it allows the listener thread to do a token-reset
> +	 * (and install a new `current_token_data`).
> +	 *
> +	 * We mark the current head of the batch list as "pinned" so
> +	 * that the listener thread will treat this item as read-only
> +	 * (and prevent any more paths from being added to it) from
> +	 * now on.
> +	 */
> +	token_data = state->current_token_data;
> +	token_data->client_ref_count++;
> +
> +	batch_head = token_data->batch_head;
> +	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
> +
> +	pthread_mutex_unlock(&state->main_lock);

We are now pinned. Makes sense.

> +	/*
> +	 * FSMonitor Protocol V2 requires that we send a response header
> +	 * with a "new current token" and then all of the paths that changed
> +	 * since the "requested token".
> +	 */
> +	fsmonitor_format_response_token(&response_token,
> +					&token_data->token_id,
> +					batch_head);
> +
> +	reply(reply_data, response_token.buf, response_token.len + 1);
> +	total_response_len += response_token.len + 1;

I was going to say we should let "reply" return the number of bytes written,
but its return value is already an error code. Even so, we seem to be ignoring
it here. Should we at least do something like "err |= reply()" to collect any
errors?
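
A rough sketch of that idea, assuming only that the callback returns 0 on success and non-zero on failure. The callback type and helper names below are made up for illustration and are not the Simple IPC API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success, non-zero on error. */
typedef int (*demo_reply_fn)(void *ctx, const char *buf, size_t len);

/*
 * Send each chunk and OR the return codes together, so a failure in
 * any chunk survives to the end instead of being dropped.
 */
static int send_chunks(demo_reply_fn reply, void *ctx,
		       const char *const *chunks, const size_t *lens,
		       size_t nr)
{
	int err = 0;
	size_t i;

	assert(reply != NULL);
	for (i = 0; i < nr; i++)
		err |= reply(ctx, chunks[i], lens[i]);

	return err;
}

/* Test double: *ctx counts down per call; the call that sees 0 fails. */
static int demo_reply_fail_on(void *ctx, const char *buf, size_t len)
{
	int *calls_until_failure = ctx;

	(void)buf;
	(void)len;
	return (*calls_until_failure)-- == 0 ? -1 : 0;
}
```

Note that OR-ing keeps sending after a failure. Bailing out on the first error is an equally reasonable choice; the sketch only shows that the error is not lost.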

> +
> +	trace2_data_string("fsmonitor", the_repository, "response/token",
> +			   response_token.buf);
> +	trace_printf_key(&trace_fsmonitor, "response token: %s", response_token.buf);
> +
> +	shown = kh_init_str();
> +	for (batch = batch_head;
> +	     batch && batch->batch_seq_nr >= requested_oldest_seq_nr;
> +	     batch = batch->next) {
> +		size_t k;
> +
> +		for (k = 0; k < batch->nr; k++) {
> +			const char *s = batch->interned_paths[k];
> +			size_t s_len;
> +
> +			if (kh_get_str(shown, s) != kh_end(shown))
> +				duplicates++;
> +			else {
> +				kh_put_str(shown, s, &hash_ret);

It appears that you could make use of 'struct strmap' instead of managing your
own khash structure.

> +
> +				trace_printf_key(&trace_fsmonitor,
> +						 "send[%"PRIuMAX"]: %s",
> +						 count, s);
> +
> +				/* Each path gets written with a trailing NUL */
> +				s_len = strlen(s) + 1;
> +
> +				if (payload.len + s_len >=
> +				    LARGE_PACKET_DATA_MAX) {
> +					reply(reply_data, payload.buf,
> +					      payload.len);
> +					total_response_len += payload.len;
> +					strbuf_reset(&payload);
> +				}
> +
> +				strbuf_add(&payload, s, s_len);
> +				count++;
> +			}
> +		}
> +	}
> +
> +	if (payload.len) {
> +		reply(reply_data, payload.buf, payload.len);
> +		total_response_len += payload.len;
> +	}
> +
> +	kh_release_str(shown);
> +
> +	pthread_mutex_lock(&state->main_lock);
> +	if (token_data->client_ref_count > 0)
> +		token_data->client_ref_count--;
> +
> +	if (token_data->client_ref_count == 0) {
> +		if (token_data != state->current_token_data) {
> +			/*
> +			 * The listener thread did a token-reset while we were
> +			 * walking the batch list.  Therefore, this token is
> +			 * stale and can be discarded completely.  If we are
> +			 * the last reader thread using this token, we own
> +			 * that work.
> +			 */
> +			fsmonitor_free_token_data(token_data);
> +		}
> +	}

Perhaps this could be extracted to a method, so that any (locked) caller
could run

	free_token_if_unused(state, token_data);

and the token will either be kept around (because client_ref_count > 0 or
state->current_token_data still points at token_data) or be freed. Otherwise
I predict this being implemented in two places, which is too many when
dealing with memory ownership.
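
Something along these lines, as a sketch only. The structs are trimmed-down, hypothetical versions of the real ones, and plain free() stands in for fsmonitor_free_token_data().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down versions of the daemon structs. */
struct demo_token_data {
	int client_ref_count;
};

struct demo_daemon_state {
	struct demo_token_data *current_token_data;
};

/*
 * Caller must hold the state lock.  Frees the token only when no
 * client references it and it is no longer the current token.
 * Returns 1 if the token was freed, 0 if it was kept.
 */
static int free_token_if_unused(struct demo_daemon_state *state,
				struct demo_token_data *token)
{
	assert(state != NULL);

	if (!token)
		return 0;
	if (token->client_ref_count > 0)
		return 0;
	if (token == state->current_token_data)
		return 0;

	free(token);
	return 1;
}
```

Keeping both conditions in one helper means the ownership rule lives in a single place, whichever thread ends up dropping the last reference.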

> +
> +	pthread_mutex_unlock(&state->main_lock);
> +
> +	trace2_data_intmax("fsmonitor", the_repository, "response/length", total_response_len);
> +	trace2_data_intmax("fsmonitor", the_repository, "response/count/files", count);
> +	trace2_data_intmax("fsmonitor", the_repository, "response/count/duplicates", duplicates);
> +
> +	strbuf_release(&response_token);
> +	strbuf_release(&requested_token_id);
> +	strbuf_release(&payload);
> +
> +	return 0;
> +
> +send_trivial_response:
> +	pthread_mutex_lock(&state->main_lock);
> +	fsmonitor_format_response_token(&response_token,
> +					&state->current_token_data->token_id,
> +					state->current_token_data->batch_head);
> +	pthread_mutex_unlock(&state->main_lock);
> +
> +	reply(reply_data, response_token.buf, response_token.len + 1);
> +	trace2_data_string("fsmonitor", the_repository, "response/token",
> +			   response_token.buf);
> +	reply(reply_data, "/", 2);
> +	trace2_data_intmax("fsmonitor", the_repository, "response/trivial", 1);
> +
> +	strbuf_release(&response_token);
> +	strbuf_release(&requested_token_id);
> +
> +	return result;
> +
> +send_empty_response:
> +	pthread_mutex_lock(&state->main_lock);
> +	fsmonitor_format_response_token(&response_token,
> +					&state->current_token_data->token_id,
> +					NULL);
> +	pthread_mutex_unlock(&state->main_lock);
> +
> +	reply(reply_data, response_token.buf, response_token.len + 1);
> +	trace2_data_string("fsmonitor", the_repository, "response/token",
> +			   response_token.buf);
> +	trace2_data_intmax("fsmonitor", the_repository, "response/empty", 1);
> +
> +	strbuf_release(&response_token);
> +	strbuf_release(&requested_token_id);
> +
> +	return 0;
> +}
> +
>  static ipc_server_application_cb handle_client;
>  
>  static int handle_client(void *data, const char *command,
>  			 ipc_server_reply_cb *reply,
>  			 struct ipc_server_reply_data *reply_data)
>  {
> -	/* struct fsmonitor_daemon_state *state = data; */
> +	struct fsmonitor_daemon_state *state = data;
>  	int result;
>  
> +	trace_printf_key(&trace_fsmonitor, "requested token: %s", command);
> +
>  	trace2_region_enter("fsmonitor", "handle_client", the_repository);
>  	trace2_data_string("fsmonitor", the_repository, "request", command);
>  
> -	result = 0; /* TODO Do something here. */
> +	result = do_handle_client(state, command, reply, reply_data);
>  
>  	trace2_region_leave("fsmonitor", "handle_client", the_repository);
>  

A simple integration with earlier work. Good.

Thanks,
-Stolee

* Re: [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-04-26 20:20     ` Eric Sunshine
@ 2021-04-26 21:02       ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-26 21:02 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On 4/26/2021 4:20 PM, Eric Sunshine wrote:
> On Mon, Apr 26, 2021 at 10:31 AM Derrick Stolee <stolee@gmail.com> wrote:
>> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>>> +#define FSMONITOR_DAEMON_IS_SUPPORTED 1
>>> +#else
>>> +#define FSMONITOR_DAEMON_IS_SUPPORTED 0
>>> +#endif
>>> +
>>> +int fsmonitor_ipc__is_supported(void)
>>> +{
>>> +     return FSMONITOR_DAEMON_IS_SUPPORTED;
>>> +}
>>
>> I don't see any other use of FSMONITOR_DAEMON_IS_SUPPORTED,
>> so I was thinking you could use the #ifdef/#else/#endif
>> construct within the implementation of this method instead
>> of creating a macro outside. But my suggestion might be an
>> anti-pattern, so feel free to ignore me.
> 
> On this project, it is preferred to keep the #if / #else / #endif
> outside of functions since embedding them within functions often makes
> it difficult to follow how the code flows (and generally makes
> functions unnecessarily noisy). So, the way Jeff did this seems fine.

Makes sense.

> An alternative would have been:
> 
>   #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>   #define fsmonitor_ipc__is_supported() 1
>   #else
>   #define fsmonitor_ipc__is_supported() 0
>   #endif
> 
> which would still allow calling it as a function:
> 
>     if (fsmonitor_ipc__is_supported())
>         ...
> 
> but it's subjective whether that's actually any cleaner or better.
 
True. I'm just thinking about a future where we need to do a runtime
check for compatibility, but let's use the YAGNI principle and skip
it for now.

Thanks,
-Stolee

* Re: [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-04-26 14:56   ` Derrick Stolee
@ 2021-04-27  9:20     ` Ævar Arnfjörð Bjarmason
  2021-04-27 12:42       ` Derrick Stolee
  2021-04-30 14:23     ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-27  9:20 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff Hostetler,
	Johannes Schindelin


On Mon, Apr 26 2021, Derrick Stolee wrote:

> On 4/1/21 11:40 AM, Johannes Schindelin via GitGitGadget wrote:
>> @@ -2515,6 +2515,11 @@ int git_config_get_max_percent_split_change(void)
>>  
>>  int repo_config_get_fsmonitor(struct repository *r)
>>  {
>> +	if (r->settings.use_builtin_fsmonitor > 0) {
>
> Don't forget to run prepare_repo_settings(r) first.
>
>> +		core_fsmonitor = "(built-in daemon)";
>> +		return 1;
>> +	}
>> +
>
> I found this odd, assigning a string to core_fsmonitor that
> would definitely cause a problem trying to execute it as a
> hook. I wondered the need for it at all, but found that
> there are several places in the FS Monitor subsystem that use
> core_fsmonitor as if it was a boolean, indicating whether or
> not the feature is enabled at all.
>
> A cleaner way to handle this would be to hide the data behind
> a helper method, say "fsmonitor_enabled()" that could then
> check a value on the repository (or index) and store the hook
> value as a separate value that is only used by the hook-based
> implementation.
>
> It's probably a good idea to do that cleanup now, before we
> find on accident that we missed a gap and start trying to run
> this bogus string as a hook invocation.
>
>> -static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
>> +static int query_fsmonitor(int version, struct index_state *istate, struct strbuf *query_result)
>>  {
>> +	struct repository *r = istate->repo ? istate->repo : the_repository;
>> +	const char *last_update = istate->fsmonitor_last_update;
>>  	struct child_process cp = CHILD_PROCESS_INIT;
>>  	int result;
>>  
>>  	if (!core_fsmonitor)
>>  		return -1;
>
> Here is an example of it being used as a boolean.
>
>> +	if (r->settings.use_builtin_fsmonitor > 0) {
>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>> +		return fsmonitor_ipc__send_query(last_update, query_result);
>> +#else
>> +		/* Fake a trivial response. */
>> +		warning(_("fsmonitor--daemon unavailable; falling back"));
>> +		strbuf_add(query_result, "/", 2);
>> +		return 0;
>> +#endif
>
> This seems like a case where the helper fsmonitor_ipc__is_supported()
> could be used instead of compile-time macros.
>
> (I think this is especially true when we consider the future of the
> feature on Linux and the possibility of the same compiled code needing
> to check run-time properties of the platform for compatibility.)
>
>> --- a/repo-settings.c
>> +++ b/repo-settings.c
>> @@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
>>  		r->settings.core_multi_pack_index = value;
>>  	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
>>  
>> +	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
>> +		r->settings.use_builtin_fsmonitor = 1;
>> +
>
> Follows the patterns of repo settings. Good.

It follows the pattern, but as an aside the pattern seems a bit odd. I see
it dates back to your 7211b9e7534 (repo-settings: consolidate some
config settings, 2019-08-13).

I.e. we memset() the whole thing to -1, then for most things do something like:

    if (!repo_config_get_bool(r, "gc.writecommitgraph", &value))
        r->settings.gc_write_commit_graph = value;
    UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);

But could do:

    if (repo_config_get_bool(r, "gc.writecommitgraph", &r->settings.gc_write_commit_graph))
        r->settings.gc_write_commit_graph = 1;

No? I.e. the repo_config_get_bool() function already returns non-zero if
we don't find it in the config.

I see the UPDATE_DEFAULT_BOOL() macro has also drifted from "set thing
default boolean" to "set any default value".

* Re: [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-04-27  9:20     ` Ævar Arnfjörð Bjarmason
@ 2021-04-27 12:42       ` Derrick Stolee
  2021-04-28  7:59         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 12:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff Hostetler,
	Johannes Schindelin

On 4/27/2021 5:20 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Apr 26 2021, Derrick Stolee wrote:
> 
>> On 4/1/21 11:40 AM, Johannes Schindelin via GitGitGadget wrote:
>>> @@ -2515,6 +2515,11 @@ int git_config_get_max_percent_split_change(void)
...
>>> --- a/repo-settings.c
>>> +++ b/repo-settings.c
>>> @@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
>>>  		r->settings.core_multi_pack_index = value;
>>>  	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
>>>  
>>> +	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
>>> +		r->settings.use_builtin_fsmonitor = 1;
>>> +
>>
>> Follows the patterns of repo settings. Good.
> 
> It follows the pattern, but as an aside the pattern seems a bit odd. I see
> it dates back to your 7211b9e7534 (repo-settings: consolidate some
> config settings, 2019-08-13).
> 
> I.e. we memset() the whole thing to -1, then for most things do something like:
> 
>     if (!repo_config_get_bool(r, "gc.writecommitgraph", &value))
>         r->settings.gc_write_commit_graph = value;
>     UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
> 
> But could do:
> 
>     if (repo_config_get_bool(r, "gc.writecommitgraph", &r->settings.gc_write_commit_graph))
>         r->settings.gc_write_commit_graph = 1;
> 
> No? I.e. the repo_config_get_bool() function already returns non-zero if
> we don't find it in the config.

I see how this is fewer lines of code, but it is harder to read the intent
of the implementation. The current layout makes it clear that we set the
value from the config, if it exists, but otherwise we choose a default.

Sometimes, this choice of a default _needs_ to be deferred, for example with
the fetch_negotiation_algorithm setting, which can be set both from the
fetch.negotiationAlgorithm config, but also the feature.experimental config.

However, perhaps it would be better still for these one-off requests to
create a new macro, say USE_CONFIG_OR_DEFAULT_BOOL() that fills a value
from config _or_ sets the given default:

#define USE_CONFIG_OR_DEFAULT_BOOL(r, v, s, d) \
	if (repo_config_get_bool(r, s, &v)) \
		v = d

And then for this example we would write

	USE_CONFIG_OR_DEFAULT_BOOL(r, r->settings.core_commit_graph,
				   "core.commitgraph", 1);

This would work for multiple config options in this file.
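
One caveat with the macro as written above: it expands to a bare "if", which can capture a following "else" at the call site. A hedged sketch of a safer shape follows. The config lookup is a hypothetical stand-in for repo_config_get_bool(), and the repository argument is dropped for brevity.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for repo_config_get_bool(): returns 0 and
 * fills *dest when the key is "set", non-zero when it is missing.
 */
static int demo_config_get_bool(const char *key, int *dest)
{
	assert(key != NULL && dest != NULL);
	if (!strcmp(key, "core.commitgraph")) {
		*dest = 0;	/* pretend the user disabled it */
		return 0;
	}
	return 1;		/* not found in config */
}

/*
 * Wrapping the body in do { } while (0) makes the macro behave like
 * a single statement, so it is safe inside an unbraced if/else.
 */
#define DEMO_CONFIG_OR_DEFAULT_BOOL(key, v, d) \
	do { \
		if (demo_config_get_bool((key), &(v))) \
			(v) = (d); \
	} while (0)

struct demo_settings {
	int core_commit_graph;
	int gc_write_commit_graph;
};

static void demo_prepare_settings(struct demo_settings *s)
{
	DEMO_CONFIG_OR_DEFAULT_BOOL("core.commitgraph",
				    s->core_commit_graph, 1);
	DEMO_CONFIG_OR_DEFAULT_BOOL("gc.writecommitgraph",
				    s->gc_write_commit_graph, 1);
}
```

With this stand-in, the first setting takes the configured value and the second falls back to the default of 1.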

> I see the UPDATE_DEFAULT_BOOL() macro has also drifted from "set thing
> default boolean" to "set any default value".
 
This is correct. I suppose it would be a good change to make some time.
Such a rename could be combined with the refactor above.

I would recommend waiting until such a change isn't conflicting with
ongoing topics, such as this one.

Thanks,
-Stolee

* Re: [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files
  2021-04-01 15:40 ` [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
@ 2021-04-27 13:24   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 13:24 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon to periodically truncate the list of
> modified files to save some memory.
> 
> Clients will ask for the set of changes relative to a token that they
> found in the FSMN index extension in the index.  (This token is like a
> point in time, but different).  Clients will then update the index to
> contain the response token (so that subsequent commands will be
> relative to this new token).
> 
> Therefore, the daemon can gradually truncate the in-memory list of
> changed paths as they become obsolete (older that the previous token).

s/older that/older than/

> Since we may have multiple clients making concurrent requests with a
> skew of tokens and clients may be racing to the talk to the daemon,
> we lazily truncate the list.
> 
> We introduce a 5 minute delay and truncate batches 5 minutes after
> they are considered obsolete.

5 minutes seems like a good default timeframe. We can consider
making this customizable in the future.

> +/*
> + * To keep the batch list from growing unbounded in response to filesystem
> + * activity, we try to truncate old batches from the end of the list as
> + * they become irrelevant.
> + *
> + * We assume that the .git/index will be updated with the most recent token
> + * any time the index is updated.  And future commands will only ask for
> + * recent changes *since* that new token.  So as tokens advance into the
> + * future, older batch items will never be requested/needed.  So we can
> + * truncate them without loss of functionality.
> + *
> + * However, multiple commands may be talking to the daemon concurrently
> + * or perform a slow command, so a little "token skew" is possible.
> + * Therefore, we want this to be a little bit lazy and have a generous
> + * delay.

I appreciate this documentation of the "expected" behavior and how it
compares to the "possible" behavior.

> + * The current reader thread walked backwards in time from `token->batch_head`
> + * back to `batch_marker` somewhere in the middle of the batch list.
> + *
> + * Let's walk backwards in time from that marker an arbitrary delay
> + * and truncate the list there.  Note that these timestamps are completely
> + * artificial (based on when we pinned the batch item) and not on any
> + * filesystem activity.
> + */
> +#define MY_TIME_DELAY (5 * 60) /* seconds */

Perhaps put the units into the macro? MY_TIME_DELAY_SECONDS?

> +static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
> +				      const struct fsmonitor_batch *batch_marker)
> +{
> +	/* assert state->main_lock */

If this comment is intended to be a warning for consumers that they should
have the lock around this method, then maybe that should be in a documentation
comment above the method declaration.

> +	const struct fsmonitor_batch *batch;
> +	struct fsmonitor_batch *rest;
> +	struct fsmonitor_batch *p;
> +	time_t t;

This is only used within the for loop, so it could be defined there.

> +
> +	if (!batch_marker)
> +		return;
> +
> +	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",

What's the value of abbreviating "truncate" like this? Is there a special
reason?

> +			 batch_marker->batch_seq_nr,
> +			 (uint64_t)batch_marker->pinned_time);
> +
> +	for (batch = batch_marker; batch; batch = batch->next) {
> +		if (!batch->pinned_time) /* an overflow batch */
> +			continue;
> +
> +		t = batch->pinned_time + MY_TIME_DELAY;
> +		if (t > batch_marker->pinned_time) /* too close to marker */
> +			continue;
> +
> +		goto truncate_past_here;
> +	}
> +
> +	return;
> +
> +truncate_past_here:
> +	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
> +
> +	rest = ((struct fsmonitor_batch *)batch)->next;
> +	((struct fsmonitor_batch *)batch)->next = NULL;
> +
> +	for (p = rest; p; p = fsmonitor_batch__free(p)) {
> +		trace_printf_key(&trace_fsmonitor,
> +				 "TRNC kill (%"PRIu64",%"PRIu64")",
> +				 p->batch_seq_nr, (uint64_t)p->pinned_time);
> +	}

I see that you are not using the method that frees the entire list so
you can trace each entry as it is deleted. That works.

> +}
> +
>  static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
>  {
>  	struct fsmonitor_batch *p;
> @@ -647,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  			 * that work.
>  			 */
>  			fsmonitor_free_token_data(token_data);
> +		} else if (batch) {
> +			/*
> +			 * This batch is the first item in the list
> +			 * that is older than the requested sequence
> +			 * number and might be considered to be
> +			 * obsolete.  See if we can truncate the list
> +			 * and save some memory.
> +			 */
> +			fsmonitor_batch__truncate(state, batch);

Seems to work as advertised.

Thanks,
-Stolee

* Re: [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing
  2021-04-01 15:41 ` [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing Jeff Hostetler via GitGitGadget
@ 2021-04-27 13:36   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 13:36 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Define GIT_TEST_FSMONITOR_CLIENT_DELAY as a millisecond delay.

This is a second delay introduced in this feature, but the units
are different. Could we put a unit in the name? Perhaps a "_MS"
suffix.

> Introduce an artificial delay when processing client requests.
> This make the CI/PR test suite a little more stable and avoids
> the need to load up test scripts with sleep statements to avoid
> racy failures.  This was mostly seen on 1 or 2 core CI build
> machines where the test script would create a file and quickly
> try to confirm that the daemon had seen it *before* the daemon
> had received the kernel event and causing a test failure.

Isn't the cookie file supposed to prevent this from happening?

Yes, our test suite interacts with the filesystem and Git commands
more quickly than a human user would, but Git is used all the time
by scripts or build machines to quickly process data. The FS
Monitor feature should be robust to such a situation.

I feel that as currently described, this patch is only hiding a
bug that shows up during heavy use.

Perhaps the test failures are limited to a small number of
specific tests that are checking the FS Monitor daemon in a
non-standard way, especially in a way that circumvents the
cookie file. In this case, I'd like to see _in this patch_ how
the environment variable is used in the test suite.

I understand that it is difficult to simultaneously build a new
feature like this in small increments, but the biggest issue I
have with the series' organization so far is that we are 18
patches deep and I still haven't seen a single test. Since this
variable exists only to serve the test suite, it would be good to
delay introducing it until the patch where a test script shows its
value.

Looking ahead, I see that you insert it as a blanket statement
in the t7527 test script, which seems like it has potential to
hide bugs instead of being an isolated cover for a specific
interaction.

As for the code, it all looks correct. However, please update
t/README with a description of the new GIT_TEST_* variable.

Thanks,
-Stolee

* Re: [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system
  2021-04-01 15:41 ` [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
@ 2021-04-27 14:23   ` Derrick Stolee
  2021-05-03 21:59     ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 14:23 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach fsmonitor--daemon client threads to create a cookie file
> inside the .git directory and then wait until FS events for the
> cookie are observed by the FS listener thread.
> 
> This helps address the racy nature of file system events by
> blocking the client response until the kernel has drained any
> event backlog.

This description matches my expectation of the cookie file,
which furthers my confusion about GIT_TEST_FSMONITOR_CLIENT_DELAY.

> +enum fsmonitor_cookie_item_result {
> +	FCIR_ERROR = -1, /* could not create cookie file ? */
> +	FCIR_INIT = 0,
> +	FCIR_SEEN,
> +	FCIR_ABORT,
> +};
> +
> +struct fsmonitor_cookie_item {
> +	struct hashmap_entry entry;
> +	const char *name;
> +	enum fsmonitor_cookie_item_result result;
> +};
> +
> +static int cookies_cmp(const void *data, const struct hashmap_entry *he1,
> +		     const struct hashmap_entry *he2, const void *keydata)

I'm interested to see why a hashset is necessary.

> +static enum fsmonitor_cookie_item_result fsmonitor_wait_for_cookie(
> +	struct fsmonitor_daemon_state *state)
> +{
> +	int fd;
> +	struct fsmonitor_cookie_item cookie;
> +	struct strbuf cookie_pathname = STRBUF_INIT;
> +	struct strbuf cookie_filename = STRBUF_INIT;
> +	const char *slash;
> +	int my_cookie_seq;
> +
> +	pthread_mutex_lock(&state->main_lock);

Hm. We are entering a locked region. I hope this is only for the
cookie write and not the entire waiting period.

> +	my_cookie_seq = state->cookie_seq++;
> +
> +	strbuf_addbuf(&cookie_pathname, &state->path_cookie_prefix);
> +	strbuf_addf(&cookie_pathname, "%i-%i", getpid(), my_cookie_seq);
> +
> +	slash = find_last_dir_sep(cookie_pathname.buf);
> +	if (slash)
> +		strbuf_addstr(&cookie_filename, slash + 1);
> +	else
> +		strbuf_addbuf(&cookie_filename, &cookie_pathname);

This business about the slash-or-not-slash is good defensive
programming. I imagine the only possible way for there to not
be a slash is if the Git process is running with the .git
directory as its working directory?

> +	cookie.name = strbuf_detach(&cookie_filename, NULL);
> +	cookie.result = FCIR_INIT;
> +	// TODO should we have case-insenstive hash (and in cookie_cmp()) ??

This TODO comment should be cleaned up. Doesn't match C-style, either.

As for the question, I believe that we can limit ourselves to names that
don't need case-insensitive hashes and trust that the filesystem will not
change the case. Using lowercase letters should help with this.

> +	hashmap_entry_init(&cookie.entry, strhash(cookie.name));
> +
> +	/*
> +	 * Warning: we are putting the address of a stack variable into a
> +	 * global hashmap.  This feels dodgy.  We must ensure that we remove
> +	 * it before this thread and stack frame returns.
> +	 */
> +	hashmap_add(&state->cookies, &cookie.entry);

I saw this warning and thought about avoiding it by using the heap, but
even with a heap pointer we need to be careful to remove the result
before returning and stopping the thread.

However, there is likely a higher potential of a bug leading to a
security issue through an error causing stack corruption and unsafe
code execution. Perhaps it is worth converting to using heap data here.

> +	trace_printf_key(&trace_fsmonitor, "cookie-wait: '%s' '%s'",
> +			 cookie.name, cookie_pathname.buf);
> +
> +	/*
> +	 * Create the cookie file on disk and then wait for a notification
> +	 * that the listener thread has seen it.
> +	 */
> +	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
> +	if (fd >= 0) {
> +		close(fd);
> +		unlink_or_warn(cookie_pathname.buf);

Interesting that we are ignoring the warning here. Is it possible that
these cookie files will continue to grow if this unlink fails?

> +
> +		while (cookie.result == FCIR_INIT)
> +			pthread_cond_wait(&state->cookies_cond,
> +					  &state->main_lock);

Ok, we are waiting here for another thread to signal that the cookie
file has been found in the events. What happens if the event gets lost?
I'll look for a later signal that cookie.result can change based on a
timeout, too.
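
For reference, a timeout-based wait could look roughly like the sketch below. This is not what the patch does; the struct and function names are hypothetical, and the series may prefer to handle lost events through the resync/abort path instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the per-cookie wait state. */
struct demo_cookie {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int seen;	/* set to 1 by the listener thread */
};

/*
 * Wait until cookie->seen is set or timeout_ms elapses.
 * Returns 0 if the cookie was seen, ETIMEDOUT on timeout, or
 * another error number if the wait itself fails.
 */
static int demo_wait_for_cookie(struct demo_cookie *cookie, long timeout_ms)
{
	struct timespec deadline;
	int ret = 0;

	assert(cookie != NULL);

	/* With default attributes the deadline is absolute CLOCK_REALTIME. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&cookie->lock);
	/* Loop to tolerate spurious wakeups; stop on timeout or error. */
	while (!cookie->seen && !ret)
		ret = pthread_cond_timedwait(&cookie->cond, &cookie->lock,
					     &deadline);
	if (cookie->seen)
		ret = 0;
	pthread_mutex_unlock(&cookie->lock);

	return ret;
}
```

As with pthread_cond_wait(), the mutex is released while the thread is blocked and re-acquired before the call returns, so other threads can take the lock during the wait.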

> +
> +		hashmap_remove(&state->cookies, &cookie.entry, NULL);
> +	} else {
> +		error_errno(_("could not create fsmonitor cookie '%s'"),
> +			    cookie.name);
> +
> +		cookie.result = FCIR_ERROR;
> +		hashmap_remove(&state->cookies, &cookie.entry, NULL);
> +	}

Both blocks here remove the cookie entry, so move it to the end of the
method with the other cleanups.

> +
> +	pthread_mutex_unlock(&state->main_lock);

Hm. We are locking the main state throughout this process. I suppose that
the listener thread could be watching multiple repos and updating them
while we wait here for one repo to update. This is a larger lock window
than I was hoping for, but I don't currently see how to reduce it safely.

> +
> +	free((char*)cookie.name);
> +	strbuf_release(&cookie_pathname);
> +	return cookie.result;

Remove the cookie from the hashset along with these lines.

> +}
> +
> +/*
> + * Mark these cookies as _SEEN and wake up the corresponding client threads.
> + */
> +static void fsmonitor_cookie_mark_seen(struct fsmonitor_daemon_state *state,
> +				       const struct string_list *cookie_names)
> +{
> +	/* assert state->main_lock */

I'm now confused what this is trying to document. The 'state' should be
locked by another thread while we are waiting for a cookie response, so
this method is updating the cookie as seen from a different thread that
doesn't have the lock.

...
> +/*
> + * Set _ABORT on all pending cookies and wake up all client threads.
> + */
> +static void fsmonitor_cookie_abort_all(struct fsmonitor_daemon_state *state)
...

> + * [2] Some of those lost events may have been for cookie files.  We
> + *     should assume the worst and abort them rather letting them starve.
> + *
>   * If there are no readers of the the current token data series, we
>   * can free it now.  Otherwise, let the last reader free it.  Either
>   * way, the old token data series is no longer associated with our
> @@ -454,6 +600,8 @@ void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
>  			 state->current_token_data->token_id.buf,
>  			 new_one->token_id.buf);
>  
> +	fsmonitor_cookie_abort_all(state);
> +

I see we abort here if we force a resync. I lost the detail of whether
this is triggered by a timeout, too.

> @@ -654,6 +803,39 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  		goto send_trivial_response;
>  	}
>  
> +	pthread_mutex_unlock(&state->main_lock);
> +
> +	/*
> +	 * Write a cookie file inside the directory being watched in an
> +	 * effort to flush out existing filesystem events that we actually
> +	 * care about.  Suspend this client thread until we see the filesystem
> +	 * events for this cookie file.
> +	 */
> +	cookie_result = fsmonitor_wait_for_cookie(state);

Odd that we unlock before calling this method, then just take the lock
again inside of it.

> +	if (cookie_result != FCIR_SEEN) {
> +		error(_("fsmonitor: cookie_result '%d' != SEEN"),
> +		      cookie_result);
> +		result = 0;
> +		goto send_trivial_response;
> +	}
> +
> +	pthread_mutex_lock(&state->main_lock);
> +
> +	if (strcmp(requested_token_id.buf,
> +		   state->current_token_data->token_id.buf)) {
> +		/*
> +		 * Ack! The listener thread lost sync with the filesystem
> +		 * and created a new token while we were waiting for the
> +		 * cookie file to be created!  Just give up.
> +		 */
> +		pthread_mutex_unlock(&state->main_lock);
> +
> +		trace_printf_key(&trace_fsmonitor,
> +				 "lost filesystem sync");
> +		result = 0;
> +		goto send_trivial_response;
> +	}
> +
>  	/*
>  	 * We're going to hold onto a pointer to the current
>  	 * token-data while we walk the list of batches of files.
> @@ -982,6 +1164,9 @@ void fsmonitor_publish(struct fsmonitor_daemon_state *state,
>  		}
>  	}
>  
> +	if (cookie_names->nr)
> +		fsmonitor_cookie_mark_seen(state, cookie_names);
> +

I was confused as to what updates 'cookie_names', but it appears that
these are updated in the platform-specific code. That seems to happen
in later patches.

Thanks,
-Stolee

* Re: [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances
  2021-04-01 15:41 ` [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances Jeff Hostetler via GitGitGadget
@ 2021-04-27 14:52   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 14:52 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:...
> +	/*
> +	 * If the fsmonitor response and the subsequent scan of the disk
> +	 * did not cause the in-memory index to be marked dirty, then force
> +	 * it so that we advance the fsmonitor token in our extension, so
> +	 * that future requests don't keep re-requesting the same range.
> +	 */
> +	if (istate->fsmonitor_last_update &&
> +	    strcmp(istate->fsmonitor_last_update, last_update_token.buf))
> +		istate->cache_changed |= FSMONITOR_CHANGED;
> +

This could lead to extra index writes that don't normally happen in
the case without the FS Monitor feature. I'm particularly sensitive
to this because my sparse-index work is trying to solve for the
I/O cost of large indexes, but perhaps this cost is worth the benefit.

I'll keep an eye out as I do performance testing.

Thanks,
-Stolee

* Re: [PATCH 21/23] t7527: create test for fsmonitor--daemon
  2021-04-01 15:41 ` [PATCH 21/23] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-27 15:41   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 15:41 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>

It might be nice to summarize the testing strategy here. Are these just
the basics? Is this a full list of every conceivable client/server
interaction? Do some platforms need special tests?

> +# Ask the fsmonitor daemon to insert a little delay before responding to
> +# client commands like `git status` and `git fsmonitor--daemon --query` to
> +# allow recent filesystem events to be received by the daemon.  This helps
> +# the CI/PR builds be more stable.
> +#
> +# An arbitrary millisecond value.
> +#
> +GIT_TEST_FSMONITOR_CLIENT_DELAY=1000
> +export GIT_TEST_FSMONITOR_CLIENT_DELAY

As I mentioned before, this seems like it is hiding a bug, especially
with a full second delay. Even with a 1 millisecond delay, it seems
incorrect to assume this feature works correctly if the tests require
the delay.

If there is a specific interaction that has issues, then it might be
valid to insert this delay in a specific test or two.

> +git version --build-options | grep "feature:" | grep "fsmonitor--daemon" || {
> +	skip_all="The built-in FSMonitor is not supported on this platform"
> +	test_done
> +}

I see some precedent of this pattern, but it might be nice to instead
register a prereq and then test for the prereq here in the test script.
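
A minimal sketch of the feature check that such a prereq could wrap.
The helper name `has_build_feature` is hypothetical, and it reads the
build-options text from stdin so it can be tried without a particular
Git build. It assumes the "feature:" line format quoted in the patch.

```shell
# Hypothetical helper: succeed if the build-options text on stdin
# advertises the named feature on a "feature:" line.
has_build_feature () {
	grep "feature:" | grep -q -e "$1"
}

# Intended use (assumes the output format quoted in the patch):
#   git version --build-options | has_build_feature fsmonitor--daemon
```

Inside the test suite this check would sit in the body of a lazy
prereq, so that each script can just test for the prereq.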

> +kill_repo () {

Perhaps "kill_repo_daemon" might be more specific?

> +	r=$1
> +	git -C $r fsmonitor--daemon --stop >/dev/null 2>/dev/null
> +	rm -rf $1
> +	return 0
> +}
> +
> +start_daemon () {
> +	case "$#" in
> +		1) r="-C $1";;
> +		*) r="";
> +	esac
> +
> +	git $r fsmonitor--daemon --start || return $?
> +	git $r fsmonitor--daemon --is-running || return $?

Perhaps add 'test_when_finished kill_repo "$r"' as a line here so
consumers don't need to do it themselves.

> +	return 0
> +}
> +
> +test_expect_success 'explicit daemon start and stop' '
> +	test_when_finished "kill_repo test_explicit" &&
> +
> +	git init test_explicit &&
> +	start_daemon test_explicit &&
> +
> +	git -C test_explicit fsmonitor--daemon --stop &&
> +	test_must_fail git -C test_explicit fsmonitor--daemon --is-running
> +'

This is an example of a test that could have been created as early as
patch 09/23.

> +test_expect_success 'implicit daemon start' '
> +	test_when_finished "kill_repo test_implicit" &&
> +
> +	git init test_implicit &&
> +	test_must_fail git -C test_implicit fsmonitor--daemon --is-running &&
> +
> +	# query will implicitly start the daemon.
> +	#
> +	# for test-script simplicity, we send a V1 timestamp rather than
> +	# a V2 token.  either way, the daemon response to any query contains
> +	# a new V2 token.  (the daemon may complain that we sent a V1 request,
> +	# but this test case is only concerned with whether the daemon was
> +	# implicitly started.)
> +
> +	GIT_TRACE2_EVENT="$PWD/.git/trace" \
> +		git -C test_implicit fsmonitor--daemon --query 0 >actual &&
> +	nul_to_q <actual >actual.filtered &&
> +	grep "builtin:" actual.filtered &&
> +
> +	# confirm that a daemon was started in the background.
> +	#
> +	# since the mechanism for starting the background daemon is platform
> +	# dependent, just confirm that the foreground command received a
> +	# response from the daemon.
> +
> +	grep :\"query/response-length\" .git/trace &&
> +
> +	git -C test_implicit fsmonitor--daemon --is-running &&
> +	git -C test_implicit fsmonitor--daemon --stop &&
> +	test_must_fail git -C test_implicit fsmonitor--daemon --is-running
> +'
> +
> +test_expect_success 'implicit daemon stop (delete .git)' '
> +	test_when_finished "kill_repo test_implicit_1" &&
> +
> +	git init test_implicit_1 &&
> +
> +	start_daemon test_implicit_1 &&
> +
> +	# deleting the .git directory will implicitly stop the daemon.
> +	rm -rf test_implicit_1/.git &&
> +
> +	# Create an empty .git directory so that the following Git command
> +	# will stay relative to the `-C` directory.  Without this, the Git
> +	# command will (override the requested -C argument) and crawl out

Why the parentheses here?

> +	# to the containing Git source tree.  This would make the test
> +	# result dependent upon whether we were using fsmonitor on our
> +	# development worktree.
> +
> +	sleep 1 &&

I can understand this sleep, as we are waiting for a background process
to end in response to a directory being deleted.

I'm surprised this works on Windows! I recall having issues deleting
repos that are being watched by Watchman.

> +	mkdir test_implicit_1/.git &&
> +
> +	test_must_fail git -C test_implicit_1 fsmonitor--daemon --is-running
> +'
> +
> +test_expect_success 'implicit daemon stop (rename .git)' '
> +	test_when_finished "kill_repo test_implicit_2" &&
> +
> +	git init test_implicit_2 &&
> +
> +	start_daemon test_implicit_2 &&
> +
> +	# renaming the .git directory will implicitly stop the daemon.
> +	mv test_implicit_2/.git test_implicit_2/.xxx &&
> +
> +	# Create an empty .git directory so that the following Git command
> +	# will stay relative to the `-C` directory.  Without this, the Git
> +	# command will (override the requested -C argument) and crawl out
> +	# to the containing Git source tree.  This would make the test
> +	# result dependent upon whether we were using fsmonitor on our
> +	# development worktree.
> +
> +	sleep 1 &&
> +	mkdir test_implicit_2/.git &&
> +
> +	test_must_fail git -C test_implicit_2 fsmonitor--daemon --is-running
> +'
> +
> +test_expect_success 'cannot start multiple daemons' '
> +	test_when_finished "kill_repo test_multiple" &&
> +
> +	git init test_multiple &&
> +
> +	start_daemon test_multiple &&
> +
> +	test_must_fail git -C test_multiple fsmonitor--daemon --start 2>actual &&
> +	grep "fsmonitor--daemon is already running" actual &&
> +
> +	git -C test_multiple fsmonitor--daemon --stop &&
> +	test_must_fail git -C test_multiple fsmonitor--daemon --is-running
> +'

The tests above seem like they could be inserted as soon as the
platform-specific listeners are created. None of this requires the
linked-list of batched updates or cookie file checks.

> +test_expect_success 'setup' '
> +	>tracked &&
> +	>modified &&
> +	>delete &&
> +	>rename &&
> +	mkdir dir1 &&
> +	>dir1/tracked &&
> +	>dir1/modified &&
> +	>dir1/delete &&
> +	>dir1/rename &&
> +	mkdir dir2 &&
> +	>dir2/tracked &&
> +	>dir2/modified &&
> +	>dir2/delete &&
> +	>dir2/rename &&
> +	mkdir dirtorename &&
> +	>dirtorename/a &&
> +	>dirtorename/b &&
> +
> +	cat >.gitignore <<-\EOF &&
> +	.gitignore
> +	expect*
> +	actual*
> +	EOF
> +
> +	git -c core.useBuiltinFSMonitor= add . &&
> +	test_tick &&
> +	git -c core.useBuiltinFSMonitor= commit -m initial &&
> +
> +	git config core.useBuiltinFSMonitor true
> +'

Now we are getting into the meat of the interactions with Git
features. I can understand these not being ready until all of
the previous product patches are in place.

> +test_expect_success 'update-index implicitly starts daemon' '
> +	test_must_fail git fsmonitor--daemon --is-running &&
> +
> +	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_1" \
> +		git update-index --fsmonitor &&
> +
> +	git fsmonitor--daemon --is-running &&
> +	test_might_fail git fsmonitor--daemon --stop &&

Should this be a "test_when_finished kill_repo ." at the
beginning of the test?

> +
> +	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_1
> +'
> +
> +test_expect_success 'status implicitly starts daemon' '
> +	test_must_fail git fsmonitor--daemon --is-running &&
> +
> +	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_2" \
> +		git status >actual &&
> +
> +	git fsmonitor--daemon --is-running &&
> +	test_might_fail git fsmonitor--daemon --stop &&
> +
> +	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_2
> +'
> +
> +edit_files() {
> +	echo 1 >modified
> +	echo 2 >dir1/modified
> +	echo 3 >dir2/modified
> +	>dir1/untracked
> +}
> +
> +delete_files() {
> +	rm -f delete
> +	rm -f dir1/delete
> +	rm -f dir2/delete
> +}
> +
> +create_files() {
> +	echo 1 >new
> +	echo 2 >dir1/new
> +	echo 3 >dir2/new
> +}
> +
> +rename_files() {
> +	mv rename renamed
> +	mv dir1/rename dir1/renamed
> +	mv dir2/rename dir2/renamed
> +}
> +
> +file_to_directory() {
> +	rm -f delete
> +	mkdir delete
> +	echo 1 >delete/new
> +}
> +
> +directory_to_file() {
> +	rm -rf dir1
> +	echo 1 >dir1
> +}
> +
> +verify_status() {
> +	git status >actual &&
> +	GIT_INDEX_FILE=.git/fresh-index git read-tree master &&
> +	GIT_INDEX_FILE=.git/fresh-index git -c core.useBuiltinFSMonitor= status >expect &&
> +	test_cmp expect actual &&
> +	echo HELLO AFTER &&
> +	cat .git/trace &&
> +	echo HELLO AFTER
> +}
> +
> +# The next few test cases confirm that our fsmonitor daemon sees each type
> +# of OS filesystem notification that we care about.  At this layer we just
> +# ensure we are getting the OS notifications and do not try to confirm what
> +# is reported by `git status`.
> +#
> +# We run a simple query after modifying the filesystem just to introduce
> +# a bit of a delay so that the trace logging from the daemon has time to
> +# get flushed to disk.
> +#
> +# We `reset` and `clean` at the bottom of each test (and before stopping the
> +# daemon) because these commands might implicitly restart the daemon.
> +
> +clean_up_repo_and_stop_daemon () {
> +	git reset --hard HEAD
> +	git clean -fd
> +	git fsmonitor--daemon --stop
> +	rm -f .git/trace
> +}
> +
> +test_expect_success 'edit some files' '
> +	test_when_finished "clean_up_repo_and_stop_daemon" &&

Do you need the quotes here?

> +
> +	(
> +		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&

Use "$(pwd)/.git/trace". There are some strange things with $PWD
especially on Windows.

> +		export GIT_TRACE_FSMONITOR &&
> +
> +		start_daemon
> +	) &&
> +
> +	edit_files &&
> +
> +	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
> +
> +	grep "^event: dir1/modified$"  .git/trace &&
> +	grep "^event: dir2/modified$"  .git/trace &&
> +	grep "^event: modified$"       .git/trace &&
> +	grep "^event: dir1/untracked$" .git/trace
> +'
> +
> +test_expect_success 'create some files' '
> +	test_when_finished "clean_up_repo_and_stop_daemon" &&
> +
> +	(
> +		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
> +		export GIT_TRACE_FSMONITOR &&
> +
> +		start_daemon
> +	) &&
> +
> +	create_files &&
> +
> +	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
> +
> +	grep "^event: dir1/new$" .git/trace &&
> +	grep "^event: dir2/new$" .git/trace &&
> +	grep "^event: new$"      .git/trace
> +'

I wonder if we can scan the trace for the number of events
and ensure we have the right count, to ensure we aren't getting
_extra_ events that we don't want?
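
A minimal sketch of such a count check, assuming the trace keeps the
"event: <path>" line format that the greps above rely on. The helper
name `count_trace_events` is hypothetical, and a real test may need to
tolerate platform-specific duplicate events before comparing.

```shell
# Hypothetical helper: print how many "event: <path>" lines a trace
# file contains, so a test can compare against the number it expects.
count_trace_events () {
	grep -c "^event: " "$1"
}

# Possible use after create_files, which touches three paths
# (assumes one trace line per path, which may not hold everywhere):
#   test "$(count_trace_events .git/trace)" -eq 3
```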

The rest of the tests seem similarly structured and testing
important cases. I'll delay thinking of new tests until I see
the rest of the tests you are adding.

Thanks,
-Stolee

* Re: [PATCH 22/23] p7519: add fsmonitor--daemon
  2021-04-01 15:41 ` [PATCH 22/23] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-27 15:45   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 15:45 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and
> the "Simple IPC" interface.

It would be nice to see some numbers for how this test performs
on some standard Git repositories across Windows and macOS.

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  t/perf/p7519-fsmonitor.sh | 37 +++++++++++++++++++++++++++++++++++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 5eb5044a103c..2d018bc7d589 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -24,7 +24,8 @@ test_description="Test core.fsmonitor"
>  # GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
>  # GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor. May be an
>  #   absolute path to an integration. May be a space delimited list of
> -#   absolute paths to integrations.
> +#   absolute paths to integrations.  (This hook or list of hooks does not
> +#   include the built-in fsmonitor--daemon.)
>  #
>  # The big win for using fsmonitor is the elimination of the need to scan the
>  # working directory looking for changed and untracked files. If the file
> @@ -135,10 +136,16 @@ test_expect_success "one time repo setup" '
>  
>  setup_for_fsmonitor() {
>  	# set INTEGRATION_SCRIPT depending on the environment
> -	if test -n "$INTEGRATION_PATH"
> +	if test -n "$USE_FSMONITOR_DAEMON"
>  	then
> +		git config core.useBuiltinFSMonitor true &&
> +		INTEGRATION_SCRIPT=false
> +	elif test -n "$INTEGRATION_PATH"
> +	then
> +		git config core.useBuiltinFSMonitor false &&
>  		INTEGRATION_SCRIPT="$INTEGRATION_PATH"
>  	else
> +		git config core.useBuiltinFSMonitor false &&
>  		#
>  		# Choose integration script based on existence of Watchman.
>  		# Fall back to an empty integration script.
> @@ -285,4 +292,30 @@ test_expect_success "setup without fsmonitor" '
>  test_fsmonitor_suite
>  trace_stop
>  
> +#
> +# Run a full set of perf tests using the built-in fsmonitor--daemon.
> +# It does not use the Hook API, so it has a different setup.
> +# Explicitly start the daemon here and before we start client commands
> +# so that we can later add custom tracing.
> +#
> +
> +test_lazy_prereq HAVE_FSMONITOR_DAEMON '
> +	git version --build-options | grep "feature:" | grep "fsmonitor--daemon"
> +'

Here you do create the prereq. Let's put this into t/test-lib.sh
or t/test-lib-functions.sh, whichever is more appropriate.

> +
> +if test_have_prereq HAVE_FSMONITOR_DAEMON
> +then
> +	USE_FSMONITOR_DAEMON=t
> +
> +	trace_start fsmonitor--daemon--server
> +	git fsmonitor--daemon --start
> +
> +	trace_start fsmonitor--daemon--client
> +	test_expect_success "setup for fsmonitor--daemon" 'setup_for_fsmonitor'

Maybe this is copied from the rest of the file, but we should probably
use the standard layout for tests here:

	test_expect_success 'setup for fsmonitor--daemon' '
		setup_for_fsmonitor
	'

> +	test_fsmonitor_suite
> +
> +	git fsmonitor--daemon --stop
> +	trace_stop
> +fi
> +

Thanks,
-Stolee

* Re: [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon
  2021-04-01 15:41 ` [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-04-27 15:51   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 15:51 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Create 2x2 test matrix with the untracked-cache and fsmonitor--daemon
> features and a series of edits and verify that status output is
> identical.

I value the detail here. It also signals that there is something
interesting going on with the untracked cache, which I have also
discovered in my testing of this feature. I'll follow up in a
response to the cover letter.

Thanks,
-Stolee

* Re: [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-04-01 15:40 ` [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
@ 2021-04-27 17:22   ` Derrick Stolee
  2021-04-27 17:41     ` Eric Sunshine
  2021-04-30 19:32     ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 17:22 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Teach the win32 backend to register a watch on the working tree
> root directory (recursively).  Also watch the <gitdir> if it is
> not inside the working tree.  And to collect path change notifications
> into batches and publish.

Is it valuable to list the important API methods here for an interested
reader to discover them? Perhaps using links to the docs [1] might be
too ephemeral, in case those URLs stop being valid.

In any case, here are the URLs I found helpful:

[1] https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw
[2] https://docs.microsoft.com/en-us/windows/win32/api/ioapiset/nf-ioapiset-getoverlappedresult
[3] https://docs.microsoft.com/en-us/windows/win32/fileio/cancelioex-func
[4] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-resetevent
[5] https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-file_notify_information
[6] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 493 +++++++++++++++++++
>  1 file changed, 493 insertions(+)
> 
> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> index 880446b49e35..2f1fcf85a0a4 100644
> --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> @@ -2,20 +2,513 @@
>  #include "config.h"
>  #include "fsmonitor.h"
>  #include "fsmonitor-fs-listen.h"
> +#include "fsmonitor--daemon.h"
> +
> +/*
> + * The documentation of ReadDirectoryChangesW() states that the maximum
> + * buffer size is 64K when the monitored directory is remote.
> + *
> + * Larger buffers may be used when the monitored directory is local and
> + * will help us receive events faster from the kernel and avoid dropped
> + * events.
> + *
> + * So we try to use a very large buffer and silently fallback to 64K if
> + * we get an error.
> + */
> +#define MAX_RDCW_BUF_FALLBACK (65536)
> +#define MAX_RDCW_BUF          (65536 * 8)
> +
> +struct one_watch
> +{
> +	char buffer[MAX_RDCW_BUF];
> +	DWORD buf_len;
> +	DWORD count;
> +
> +	struct strbuf path;
> +	HANDLE hDir;
> +	HANDLE hEvent;
> +	OVERLAPPED overlapped;
> +
> +	/*
> +	 * Is there an active ReadDirectoryChangesW() call pending.  If so, we
> +	 * need to later call GetOverlappedResult() and possibly CancelIoEx().
> +	 */
> +	BOOL is_active;
> +};
> +
> +struct fsmonitor_daemon_backend_data
> +{
> +	struct one_watch *watch_worktree;
> +	struct one_watch *watch_gitdir;
> +
> +	HANDLE hEventShutdown;
> +
> +	HANDLE hListener[3]; /* we don't own these handles */
> +#define LISTENER_SHUTDOWN 0
> +#define LISTENER_HAVE_DATA_WORKTREE 1
> +#define LISTENER_HAVE_DATA_GITDIR 2
> +	int nr_listener_handles;
> +};
> +
> +/*
> + * Convert the WCHAR path from the notification into UTF8 and
> + * then normalize it.
> + */
> +static int normalize_path_in_utf8(FILE_NOTIFY_INFORMATION *info,
> +				  struct strbuf *normalized_path)
> +{
> +	int reserve;
> +	int len = 0;
> +
> +	strbuf_reset(normalized_path);
> +	if (!info->FileNameLength)
> +		goto normalize;
> +
> +	/*
> +	 * Pre-reserve enough space in the UTF8 buffer for
> +	 * each Unicode WCHAR character to be mapped into a
> +	 * sequence of 2 UTF8 characters.  That should let us
> +	 * avoid ERROR_INSUFFICIENT_BUFFER 99.9+% of the time.
> +	 */
> +	reserve = info->FileNameLength + 1;
> +	strbuf_grow(normalized_path, reserve);
> +
> +	for (;;) {
> +		len = WideCharToMultiByte(CP_UTF8, 0, info->FileName,
> +					  info->FileNameLength / sizeof(WCHAR),
> +					  normalized_path->buf,
> +					  strbuf_avail(normalized_path) - 1,
> +					  NULL, NULL);
> +		if (len > 0)
> +			goto normalize;
> +		if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
> +			error("[GLE %ld] could not convert path to UTF-8: '%.*ls'",
> +			      GetLastError(),
> +			      (int)(info->FileNameLength / sizeof(WCHAR)),
> +			      info->FileName);
> +			return -1;
> +		}
> +
> +		strbuf_grow(normalized_path,
> +			    strbuf_avail(normalized_path) + reserve);
> +	}
> +
> +normalize:
> +	strbuf_setlen(normalized_path, len);
> +	return strbuf_normalize_path(normalized_path);
> +}
>  
>  void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
>  {
> +	SetEvent(state->backend_data->hListener[LISTENER_SHUTDOWN]);
> +}
> +
> +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
> +				      const char *path)
> +{
> +	struct one_watch *watch = NULL;
> +	DWORD desired_access = FILE_LIST_DIRECTORY;
> +	DWORD share_mode =
> +		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;

Ah, this is probably why we can delete a repo that is under a watch.

> +	HANDLE hDir;
> +
> +	hDir = CreateFileA(path,
> +			   desired_access, share_mode, NULL, OPEN_EXISTING,
> +			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
> +			   NULL);
> +	if (hDir == INVALID_HANDLE_VALUE) {
> +		error(_("[GLE %ld] could not watch '%s'"),
> +		      GetLastError(), path);
> +		return NULL;
> +	}
> +
> +	watch = xcalloc(1, sizeof(*watch));
> +
> +	watch->buf_len = sizeof(watch->buffer); /* assume full MAX_RDCW_BUF */
> +
> +	strbuf_init(&watch->path, 0);
> +	strbuf_addstr(&watch->path, path);
> +
> +	watch->hDir = hDir;
> +	watch->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
> +
> +	return watch;
> +}
> +
> +static void destroy_watch(struct one_watch *watch)
> +{
> +	if (!watch)
> +		return;
> +
> +	strbuf_release(&watch->path);
> +	if (watch->hDir != INVALID_HANDLE_VALUE)
> +		CloseHandle(watch->hDir);
> +	if (watch->hEvent != INVALID_HANDLE_VALUE)
> +		CloseHandle(watch->hEvent);
> +
> +	free(watch);
> +}
> +
> +static int start_rdcw_watch(struct fsmonitor_daemon_backend_data *data,
> +			    struct one_watch *watch)
> +{
> +	DWORD dwNotifyFilter =
> +		FILE_NOTIFY_CHANGE_FILE_NAME |
> +		FILE_NOTIFY_CHANGE_DIR_NAME |
> +		FILE_NOTIFY_CHANGE_ATTRIBUTES |
> +		FILE_NOTIFY_CHANGE_SIZE |
> +		FILE_NOTIFY_CHANGE_LAST_WRITE |
> +		FILE_NOTIFY_CHANGE_CREATION;
> +
> +	ResetEvent(watch->hEvent);
> +
> +	memset(&watch->overlapped, 0, sizeof(watch->overlapped));
> +	watch->overlapped.hEvent = watch->hEvent;
> +
> +start_watch:
> +	watch->is_active = ReadDirectoryChangesW(
> +		watch->hDir, watch->buffer, watch->buf_len, TRUE,
> +		dwNotifyFilter, &watch->count, &watch->overlapped, NULL);
> +
> +	if (!watch->is_active &&
> +	    GetLastError() == ERROR_INVALID_PARAMETER &&
> +	    watch->buf_len > MAX_RDCW_BUF_FALLBACK) {
> +		watch->buf_len = MAX_RDCW_BUF_FALLBACK;
> +		goto start_watch;
> +	}
> +
> +	if (watch->is_active)
> +		return 0;
> +
> +	error("ReadDirectoryChangedW failed on '%s' [GLE %ld]",
> +	      watch->path.buf, GetLastError());
> +	return -1;
> +}
> +
> +static int recv_rdcw_watch(struct one_watch *watch)
> +{
> +	watch->is_active = FALSE;
> +
> +	if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count,
> +				TRUE))
> +		return 0;
> +
> +	// TODO If an external <gitdir> is deleted, the above returns an error.
> +	// TODO I'm not sure that there's anything that we can do here other
> +	// TODO than failing -- the <worktree>/.git link file would be broken
> +	// TODO anyway.  We might try to check for that and return a better
> +	// TODO error message.

These are not proper C-style comments. This situation can be handled
by a later patch series, if valuable enough.

> +
> +	error("GetOverlappedResult failed on '%s' [GLE %ld]",
> +	      watch->path.buf, GetLastError());
> +	return -1;
> +}
> +
> +static void cancel_rdcw_watch(struct one_watch *watch)
> +{
> +	DWORD count;
> +
> +	if (!watch || !watch->is_active)
> +		return;
> +
> +	CancelIoEx(watch->hDir, &watch->overlapped);
> +	GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE);
> +	watch->is_active = FALSE;
> +}
> +
> +/*
> + * Process filesystem events that happen anywhere (recursively) under the
> + * <worktree> root directory.  For a normal working directory, this includes
> + * both version controlled files and the contents of the .git/ directory.
> + *
> + * If <worktree>/.git is a file, then we only see events for the file
> + * itself.
> + */
> +static int process_worktree_events(struct fsmonitor_daemon_state *state)
> +{
> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
> +	struct one_watch *watch = data->watch_worktree;
> +	struct strbuf path = STRBUF_INIT;
> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
> +	struct fsmonitor_batch *batch = NULL;
> +	const char *p = watch->buffer;
> +
> +	/*
> +	 * If the kernel gets more events than will fit in the kernel
> +	 * buffer associated with our RDCW handle, it drops them and
> +	 * returns a count of zero.  (A successful call, but with
> +	 * length zero.)
> +	 */

I suppose that since we create a cookie file, we don't expect a zero
result to ever be a meaningful value? Or, is there another way to
differentiate between "nothing happened" and "too much happened"?

> +	if (!watch->count) {
> +		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
> +				   "overflow");
> +		fsmonitor_force_resync(state);
> +		return LISTENER_HAVE_DATA_WORKTREE;
> +	}
> +
> +	/*
> +	 * On Windows, `info` contains an "array" of paths that are
> +	 * relative to the root of whichever directory handle received
> +	 * the event.
> +	 */
> +	for (;;) {
> +		FILE_NOTIFY_INFORMATION *info = (void *)p;
> +		const char *slash;
> +		enum fsmonitor_path_type t;
> +
> +		strbuf_reset(&path);
> +		if (normalize_path_in_utf8(info, &path) == -1)
> +			goto skip_this_path;
> +
> +		t = fsmonitor_classify_path_workdir_relative(path.buf);
> +
> +		switch (t) {
> +		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
> +			/* special case cookie files within .git */
> +
> +			/* Use just the filename of the cookie file. */
> +			slash = find_last_dir_sep(path.buf);
> +			string_list_append(&cookie_list,
> +					   slash ? slash + 1 : path.buf);

Ok, I see now how we special-case cookies in the list of events.

> +			break;
> +
> +		case IS_INSIDE_DOT_GIT:
> +			/* ignore everything inside of "<worktree>/.git/" */
> +			break;
> +
> +		case IS_DOT_GIT:
> +			/* "<worktree>/.git" was deleted (or renamed away) */
> +			if ((info->Action == FILE_ACTION_REMOVED) ||
> +			    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
> +				trace2_data_string("fsmonitor", NULL,
> +						   "fsm-listen/dotgit",
> +						   "removed");
> +				goto force_shutdown;
> +			}
> +			break;
> +
> +		case IS_WORKDIR_PATH:
> +			/* queue normal pathname */
> +			if (!batch)
> +				batch = fsmonitor_batch__new();
> +			fsmonitor_batch__add_path(batch, path.buf);
> +			break;
> +
> +		case IS_GITDIR:
> +		case IS_INSIDE_GITDIR:
> +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
> +		default:
> +			BUG("unexpected path classification '%d' for '%s'",
> +			    t, path.buf);

So these events should be caught by the _other_ watcher. I suppose
BUG() is somewhat appropriate, but it also seems heavy-handed. For
example, the 'goto' in the next line will never be visited. A die()
would also be appropriate and is somewhat less harsh than a BUG(),
especially for a background process.

> +			goto skip_this_path;
> +		}
> +
> +skip_this_path:
> +		if (!info->NextEntryOffset)
> +			break;
> +		p += info->NextEntryOffset;
> +	}
> +
> +	fsmonitor_publish(state, batch, &cookie_list);
> +	batch = NULL;
> +	string_list_clear(&cookie_list, 0);
> +	strbuf_release(&path);
> +	return LISTENER_HAVE_DATA_WORKTREE;
> +
> +force_shutdown:
> +	fsmonitor_batch__free(batch);
> +	string_list_clear(&cookie_list, 0);
> +	strbuf_release(&path);
> +	return LISTENER_SHUTDOWN;
> +}
> +
> +/*
> + * Process filesystem events that happend anywhere (recursively) under the

s/happend/happened

> + * external <gitdir> (such as non-primary worktrees or submodules).
> + * We only care about cookie files that our client threads created here.
> + *
> + * Note that we DO NOT get filesystem events on the external <gitdir>
> + * itself (it is not inside something that we are watching).  In particular,
> + * we do not get an event if the external <gitdir> is deleted.

This is an interesting change of behavior. I forget if it is listed in
the documentation file, but definitely could be. I imagine wanting a
"Troubleshooting" section that describes special cases like this.

Also, because of this worktree-specific behavior, we might want to
recommend using 'git config --worktree' when choosing to use FS Monitor,
so that each worktree is opted-in as requested. Without --worktree, all
worktrees with a common base would start using FS Monitor simultaneously.
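
A sketch of per-worktree opt-in, assuming the core.useBuiltinFSMonitor
setting used elsewhere in this series. The function name is
hypothetical. `git config --worktree` needs extensions.worktreeConfig
enabled once there is more than one worktree, so the sketch turns that
on first.

```shell
# Hypothetical helper: opt a single worktree in to the builtin FSMonitor
# without affecting other worktrees that share the same repository.
enable_fsmonitor_for_worktree () {
	worktree=$1
	# Per-worktree config files are only honored with this extension.
	git -C "$worktree" config extensions.worktreeConfig true &&
	# Writes to the worktree's config.worktree, not the shared config.
	git -C "$worktree" config --worktree core.useBuiltinFSMonitor true
}
```

Enabling the extension has its own caveats, described in the
configuration section of git-worktree(1), so this is only one possible
recommendation.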

> + */
> +static int process_gitdir_events(struct fsmonitor_daemon_state *state)
> +{
> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
> +	struct one_watch *watch = data->watch_gitdir;
> +	struct strbuf path = STRBUF_INIT;
> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
> +	const char *p = watch->buffer;
> +
> +	if (!watch->count) {
> +		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
> +				   "overflow");
> +		fsmonitor_force_resync(state);
> +		return LISTENER_HAVE_DATA_GITDIR;
> +	}
> +
> +	for (;;) {
> +		FILE_NOTIFY_INFORMATION *info = (void *)p;
> +		const char *slash;
> +		enum fsmonitor_path_type t;
> +
> +		strbuf_reset(&path);
> +		if (normalize_path_in_utf8(info, &path) == -1)
> +			goto skip_this_path;
> +
> +		t = fsmonitor_classify_path_gitdir_relative(path.buf);
> +
> +		trace_printf_key(&trace_fsmonitor, "BBB: %s", path.buf);
> +
> +		switch (t) {
> +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
> +			/* special case cookie files within gitdir */
> +
> +			/* Use just the filename of the cookie file. */
> +			slash = find_last_dir_sep(path.buf);
> +			string_list_append(&cookie_list,
> +					   slash ? slash + 1 : path.buf);
> +			break;
> +
> +		case IS_INSIDE_GITDIR:
> +			goto skip_this_path;
> +
> +		default:
> +			BUG("unexpected path classification '%d' for '%s'",
> +			    t, path.buf);

If we decide against BUG() earlier, then also get this one.

> +			goto skip_this_path;
> +		}
> +
> +skip_this_path:
> +		if (!info->NextEntryOffset)
> +			break;
> +		p += info->NextEntryOffset;
> +	}
> +
> +	fsmonitor_publish(state, NULL, &cookie_list);
> +	string_list_clear(&cookie_list, 0);
> +	strbuf_release(&path);
> +	return LISTENER_HAVE_DATA_GITDIR;
>  }
>  
>  void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
>  {
> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
> +	DWORD dwWait;
> +
> +	state->error_code = 0;
> +
> +	if (start_rdcw_watch(data, data->watch_worktree) == -1)
> +		goto force_error_stop;
> +
> +	if (data->watch_gitdir &&
> +	    start_rdcw_watch(data, data->watch_gitdir) == -1)
> +		goto force_error_stop;
> +
> +	for (;;) {
> +		dwWait = WaitForMultipleObjects(data->nr_listener_handles,
> +						data->hListener,
> +						FALSE, INFINITE);

Since you use INFINITE here, that says that we will wait for at least one
signal, solving the confusion about zero results: zero results unambiguously
indicates a loss of events.

> +
> +		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_WORKTREE) {
> +			if (recv_rdcw_watch(data->watch_worktree) == -1)
> +				goto force_error_stop;
> +			if (process_worktree_events(state) == LISTENER_SHUTDOWN)
> +				goto force_shutdown;
> +			if (start_rdcw_watch(data, data->watch_worktree) == -1)
> +				goto force_error_stop;
> +			continue;
> +		}
> +
> +		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_GITDIR) {
> +			if (recv_rdcw_watch(data->watch_gitdir) == -1)
> +				goto force_error_stop;
> +			if (process_gitdir_events(state) == LISTENER_SHUTDOWN)
> +				goto force_shutdown;
> +			if (start_rdcw_watch(data, data->watch_gitdir) == -1)
> +				goto force_error_stop;
> +			continue;
> +		}
> +
> +		if (dwWait == WAIT_OBJECT_0 + LISTENER_SHUTDOWN)
> +			goto clean_shutdown;
> +
> +		error(_("could not read directory changes [GLE %ld]"),
> +		      GetLastError());
> +		goto force_error_stop;
> +	}
> +
> +force_error_stop:
> +	state->error_code = -1;
> +
> +force_shutdown:
> +	/*
> +	 * Tell the IPC thead pool to stop (which completes the await
> +	 * in the main thread (which will also signal this thread (if
> +	 * we are still alive))).
> +	 */
> +	ipc_server_stop_async(state->ipc_server_data);
> +
> +clean_shutdown:
> +	cancel_rdcw_watch(data->watch_worktree);
> +	cancel_rdcw_watch(data->watch_gitdir);
>  }
>  
>  int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
>  {
> +	struct fsmonitor_daemon_backend_data *data;
> +
> +	data = xcalloc(1, sizeof(*data));

CALLOC_ARRAY()

> +
> +	data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL);
> +
> +	data->watch_worktree = create_watch(state,
> +					    state->path_worktree_watch.buf);
> +	if (!data->watch_worktree)
> +		goto failed;
> +
> +	if (state->nr_paths_watching > 1) {
> +		data->watch_gitdir = create_watch(state,
> +						  state->path_gitdir_watch.buf);
> +		if (!data->watch_gitdir)
> +			goto failed;
> +	}
> +
> +	data->hListener[LISTENER_SHUTDOWN] = data->hEventShutdown;
> +	data->nr_listener_handles++;
> +
> +	data->hListener[LISTENER_HAVE_DATA_WORKTREE] =
> +		data->watch_worktree->hEvent;
> +	data->nr_listener_handles++;
> +
> +	if (data->watch_gitdir) {
> +		data->hListener[LISTENER_HAVE_DATA_GITDIR] =
> +			data->watch_gitdir->hEvent;
> +		data->nr_listener_handles++;
> +	}

This is a clever organization of the event handles. I imagine it
will require some rework if we decide to include another optional
handle whose inclusion is orthogonal to the gitdir one, but that
is unlikely enough to keep these well-defined array indices.
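
To make the index layout concrete, here is a minimal, portable sketch
of the same idea. It is not the patch's code: `handle_t`,
`fill_listeners()` and `LISTENER_MAX` are hypothetical names, and the
enum values are assumed (the patch only shows the three LISTENER_*
names used as array indices). The point is that the optional handle
must sit in the last slot, so that the first `nr` entries passed to
WaitForMultipleObjects() are always populated.

```c
#include <assert.h>
#include <stddef.h>

typedef void *handle_t;	/* stand-in for the Win32 HANDLE type */

/* Assumed values; the patch only shows these names used as indices. */
enum listener_index {
	LISTENER_SHUTDOWN = 0,
	LISTENER_HAVE_DATA_WORKTREE,
	LISTENER_HAVE_DATA_GITDIR,
	LISTENER_MAX	/* hypothetical: number of slots */
};

/* The optional entry has to be the last slot for the prefix to stay dense. */
static_assert(LISTENER_HAVE_DATA_GITDIR == LISTENER_MAX - 1,
	      "optional handle must occupy the final index");

/*
 * Fill the handle array in index order and return how many leading
 * entries are valid. A NULL gitdir handle means "no external gitdir".
 */
static size_t fill_listeners(handle_t out[LISTENER_MAX], handle_t shutdown,
			     handle_t worktree, handle_t gitdir)
{
	size_t nr = 0;

	out[LISTENER_SHUTDOWN] = shutdown;
	nr++;

	out[LISTENER_HAVE_DATA_WORKTREE] = worktree;
	nr++;

	if (gitdir) {
		out[LISTENER_HAVE_DATA_GITDIR] = gitdir;
		nr++;
	}

	return nr;
}
```

A second optional handle, independent of the gitdir one, would break
the "wait result minus WAIT_OBJECT_0 equals the enum value" property
whenever the gitdir slot is empty, which is the rework I have in mind.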

> +	state->backend_data = data;
> +	return 0;
> +
> +failed:
> +	CloseHandle(data->hEventShutdown);
> +	destroy_watch(data->watch_worktree);
> +	destroy_watch(data->watch_gitdir);
> +
>  	return -1;
>  }
>  
>  void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
>  {
> +	struct fsmonitor_daemon_backend_data *data;
> +
> +	if (!state || !state->backend_data)
> +		return;
> +
> +	data = state->backend_data;
> +
> +	CloseHandle(data->hEventShutdown);
> +	destroy_watch(data->watch_worktree);
> +	destroy_watch(data->watch_gitdir);
> +
> +	FREE_AND_NULL(state->backend_data);
>  }

I tried to follow all the API calls and check the documentation for
any misuse, but did not find any. I can only contribute nitpicks
here, and rely on the tests to really see that this is working as
expected.

I was hoping to find in here why we need to sleep in the test suite,
but have not pinpointed that issue yet.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-04-27 17:22   ` Derrick Stolee
@ 2021-04-27 17:41     ` Eric Sunshine
  2021-04-30 19:32     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Eric Sunshine @ 2021-04-27 17:41 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Jeff Hostetler via GitGitGadget, Git List, Jeff Hostetler

On Tue, Apr 27, 2021 at 1:22 PM Derrick Stolee <stolee@gmail.com> wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> > +     // TODO If an external <gitdir> is deleted, the above returns an error.
> > +     // TODO I'm not sure that there's anything that we can do here other
> > +     // TODO than failing -- the <worktree>/.git link file would be broken
> > +     // TODO anyway.  We might try to check for that and return a better
> > +     // TODO error message.
>
> These are not fit C-style comments. This situation can be handled
> by a later patch series, if valuable enough.

In this project, a comment like this would normally be prefixed by
NEEDSWORK rather than TODO, and only the first line would carry the
prefix, not all of them.

    /* NEEDSWORK: the foinkster blorps the wooz */

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent
  2021-04-01 15:40 ` [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
@ 2021-04-27 18:13   ` Derrick Stolee
  0 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 18:13 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Include MacOS system declarations to allow us to use FSEvent and
> CoreFoundation APIs.  We need GCC and clang versions because of
> compiler and header file conflicts.
...
> This is a known problem and tracked in GCC's bug tracker:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082
> 
> In the meantime, let's not block things and go the slightly ugly route
> of declaring/defining the FSEvents constants, data structures and
> functions that we need, so that we can avoid above-mentioned issue.
> 
> Let's do this _only_ for GCC, though, so that the CI/PR builds (which
> build both with clang and with GCC) can guarantee that we _are_ using
> the correct data types.

I appreciate that this issue with header files is isolated to its own
patch.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  2021-04-01 15:40 ` [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
@ 2021-04-27 18:35   ` Derrick Stolee
  2021-04-30 20:05     ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 18:35 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler, Eric Sunshine

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Implement file system event listener on MacOS using FSEvent,
> CoreFoundation, and CoreServices.

Again, I'm not sure if we _should_ be including URLs to
documentation in our messages, but here are some I found helpful:

[1] https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/UsingtheFSEventsFramework/UsingtheFSEventsFramework.html
[2] https://developer.apple.com/documentation/corefoundation/1541796-cfrunloopstop
[3] https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Multithreading/RunLoopManagement/RunLoopManagement.html

> Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  compat/fsmonitor/fsmonitor-fs-listen-macos.c | 368 +++++++++++++++++++
>  1 file changed, 368 insertions(+)
> 
> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> index bec5130d9e1d..e055fb579cc4 100644
> --- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> +++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> @@ -97,20 +97,388 @@ void FSEventStreamRelease(FSEventStreamRef stream);
>  #include "cache.h"
>  #include "fsmonitor.h"
>  #include "fsmonitor-fs-listen.h"
> +#include "fsmonitor--daemon.h"
> +
> +struct fsmonitor_daemon_backend_data
> +{
> +	CFStringRef cfsr_worktree_path;
> +	CFStringRef cfsr_gitdir_path;
> +
> +	CFArrayRef cfar_paths_to_watch;
> +	int nr_paths_watching;
> +
> +	FSEventStreamRef stream;
> +
> +	CFRunLoopRef rl;
> +
> +	enum shutdown_style {
> +		SHUTDOWN_EVENT = 0,
> +		FORCE_SHUTDOWN,
> +		FORCE_ERROR_STOP,
> +	} shutdown_style;
> +
> +	unsigned int stream_scheduled:1;
> +	unsigned int stream_started:1;
> +};
> +
> +static void log_flags_set(const char *path, const FSEventStreamEventFlags flag)
> +{
> +	struct strbuf msg = STRBUF_INIT;

Before going through these ifs and constructing a string, it
might be a good idea to check if the trace event will actually
be sent somewhere. If the logging method is switched to a
trace2 method, then up here we can do:

	if (!trace2_is_enabled())
		return;

> +	if (flag & kFSEventStreamEventFlagMustScanSubDirs)
> +		strbuf_addstr(&msg, "MustScanSubDirs|");
> +	if (flag & kFSEventStreamEventFlagUserDropped)
> +		strbuf_addstr(&msg, "UserDropped|");
> +	if (flag & kFSEventStreamEventFlagKernelDropped)
> +		strbuf_addstr(&msg, "KernelDropped|");
> +	if (flag & kFSEventStreamEventFlagEventIdsWrapped)
> +		strbuf_addstr(&msg, "EventIdsWrapped|");
> +	if (flag & kFSEventStreamEventFlagHistoryDone)
> +		strbuf_addstr(&msg, "HistoryDone|");
> +	if (flag & kFSEventStreamEventFlagRootChanged)
> +		strbuf_addstr(&msg, "RootChanged|");
> +	if (flag & kFSEventStreamEventFlagMount)
> +		strbuf_addstr(&msg, "Mount|");
> +	if (flag & kFSEventStreamEventFlagUnmount)
> +		strbuf_addstr(&msg, "Unmount|");
> +	if (flag & kFSEventStreamEventFlagItemChangeOwner)
> +		strbuf_addstr(&msg, "ItemChangeOwner|");
> +	if (flag & kFSEventStreamEventFlagItemCreated)
> +		strbuf_addstr(&msg, "ItemCreated|");
> +	if (flag & kFSEventStreamEventFlagItemFinderInfoMod)
> +		strbuf_addstr(&msg, "ItemFinderInfoMod|");
> +	if (flag & kFSEventStreamEventFlagItemInodeMetaMod)
> +		strbuf_addstr(&msg, "ItemInodeMetaMod|");
> +	if (flag & kFSEventStreamEventFlagItemIsDir)
> +		strbuf_addstr(&msg, "ItemIsDir|");
> +	if (flag & kFSEventStreamEventFlagItemIsFile)
> +		strbuf_addstr(&msg, "ItemIsFile|");
> +	if (flag & kFSEventStreamEventFlagItemIsHardlink)
> +		strbuf_addstr(&msg, "ItemIsHardlink|");
> +	if (flag & kFSEventStreamEventFlagItemIsLastHardlink)
> +		strbuf_addstr(&msg, "ItemIsLastHardlink|");
> +	if (flag & kFSEventStreamEventFlagItemIsSymlink)
> +		strbuf_addstr(&msg, "ItemIsSymlink|");
> +	if (flag & kFSEventStreamEventFlagItemModified)
> +		strbuf_addstr(&msg, "ItemModified|");
> +	if (flag & kFSEventStreamEventFlagItemRemoved)
> +		strbuf_addstr(&msg, "ItemRemoved|");
> +	if (flag & kFSEventStreamEventFlagItemRenamed)
> +		strbuf_addstr(&msg, "ItemRenamed|");
> +	if (flag & kFSEventStreamEventFlagItemXattrMod)
> +		strbuf_addstr(&msg, "ItemXattrMod|");
> +	if (flag & kFSEventStreamEventFlagOwnEvent)
> +		strbuf_addstr(&msg, "OwnEvent|");
> +	if (flag & kFSEventStreamEventFlagItemCloned)
> +		strbuf_addstr(&msg, "ItemCloned|");
> +
> +	trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=%u %s",
> +			 path, flag, msg.buf);

Should this be a trace2 call?

> +
> +	strbuf_release(&msg);
> +}
> +
> +static int ef_is_root_delete(const FSEventStreamEventFlags ef)
> +{
> +	return (ef & kFSEventStreamEventFlagItemIsDir &&
> +		ef & kFSEventStreamEventFlagItemRemoved);
> +}
> +
> +static int ef_is_root_renamed(const FSEventStreamEventFlags ef)
> +{
> +	return (ef & kFSEventStreamEventFlagItemIsDir &&
> +		ef & kFSEventStreamEventFlagItemRenamed);
> +}

Will these be handled differently? Or is it enough to detect
ef_is_root_moved_or_deleted()?

> +static void fsevent_callback(ConstFSEventStreamRef streamRef,
> +			     void *ctx,
> +			     size_t num_of_events,
> +			     void *event_paths,
> +			     const FSEventStreamEventFlags event_flags[],
> +			     const FSEventStreamEventId event_ids[])
> +{
> +	struct fsmonitor_daemon_state *state = ctx;
> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
> +	char **paths = (char **)event_paths;
> +	struct fsmonitor_batch *batch = NULL;
> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
> +	const char *path_k;
> +	const char *slash;
> +	int k;
> +
> +	/*
> +	 * Build a list of all filesystem changes into a private/local
> +	 * list and without holding any locks.
> +	 */
> +	for (k = 0; k < num_of_events; k++) {
> +		/*
> +		 * On Mac, we receive an array of absolute paths.
> +		 */
> +		path_k = paths[k];
> +
> +		/*
> +		 * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR.
> +		 * Please don't log them to Trace2.
> +		 *
> +		 * trace_printf_key(&trace_fsmonitor, "XXX '%s'", path_k);
> +		 */

Oh, I see. _Not_ trace2. What should we do to see if this is enabled
to avoid over-working in the case we are not using GIT_TRACE_FSMONITOR?

> +		/*
> +		 * If event[k] is marked as dropped, we assume that we have
> +		 * lost sync with the filesystem and should flush our cached
> +		 * data.  We need to:
> +		 *
> +		 * [1] Abort/wake any client threads waiting for a cookie and
> +		 *     flush the cached state data (the current token), and
> +		 *     create a new token.
> +		 *
> +		 * [2] Discard the batch that we were locally building (since
> +		 *     they are conceptually relative to the just flushed
> +		 *     token).
> +		 */
> +		if ((event_flags[k] & kFSEventStreamEventFlagKernelDropped) ||
> +		    (event_flags[k] & kFSEventStreamEventFlagUserDropped)) {

Perhaps create a macro EVENT_FLAG_DROPPED that is the union of these two? Then
a single "event_flags[k] & EVENT_FLAG_DROPPED" would suffice here. Helps cover
up how complicated the macOS API names are, too.
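
Roughly what I have in mind, as a standalone sketch rather than a
proposed patch. The typedef and the two flag definitions are stand-ins
so that this builds without the CoreServices headers (the bit values
are placeholders), and `count_dropped_events()` is a hypothetical
helper that only exists to exercise the macro over a batch of flags,
the way the callback loop would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins so the sketch builds without <CoreServices/CoreServices.h>.
 * On macOS the type and constants come from that header; the values
 * here are placeholders for illustration.
 */
typedef uint32_t FSEventStreamEventFlags;
#define kFSEventStreamEventFlagUserDropped   0x00000002u
#define kFSEventStreamEventFlagKernelDropped 0x00000004u

/* Union of the two "we lost events" bits. */
#define EVENT_FLAG_DROPPED \
	(kFSEventStreamEventFlagKernelDropped | \
	 kFSEventStreamEventFlagUserDropped)

/* Hypothetical helper: count the events in a batch that report a drop. */
static size_t count_dropped_events(const FSEventStreamEventFlags *flags,
				   size_t nr)
{
	size_t k, dropped = 0;

	assert(flags || !nr);

	for (k = 0; k < nr; k++)
		if (flags[k] & EVENT_FLAG_DROPPED)
			dropped++;

	return dropped;
}
```

In the callback itself, the two-line condition would then shrink to a
single `if (event_flags[k] & EVENT_FLAG_DROPPED)`.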

> +			/*
> +			 * see also kFSEventStreamEventFlagMustScanSubDirs
> +			 */
> +			trace2_data_string("fsmonitor", NULL,
> +					   "fsm-listen/kernel", "dropped");
> +
> +			fsmonitor_force_resync(state);
> +
> +			if (fsmonitor_batch__free(batch))
> +				BUG("batch should not have a next");

I mentioned before that BUG() seems overkill for these processes, but this
one fits. If this batch has a next, then we did something wrong, right? Do
we have an automated test that checks enough events to maybe cause a second
batch to be created?

> +			string_list_clear(&cookie_list, 0);
> +
> +			/*
> +			 * We assume that any events that we received
> +			 * in this callback after this dropped event
> +			 * may still be valid, so we continue rather
> +			 * than break.  (And just in case there is a
> +			 * delete of ".git" hiding in there.)
> +			 */
> +			continue;
> +		}
> +
> +		switch (fsmonitor_classify_path_absolute(state, path_k)) {
> +
> +		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
> +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
> +			/* special case cookie files within .git or gitdir */
> +
> +			/* Use just the filename of the cookie file. */
> +			slash = find_last_dir_sep(path_k);
> +			string_list_append(&cookie_list,
> +					   slash ? slash + 1 : path_k);
> +			break;
> +
> +		case IS_INSIDE_DOT_GIT:
> +		case IS_INSIDE_GITDIR:
> +			/* ignore all other paths inside of .git or gitdir */
> +			break;
> +
> +		case IS_DOT_GIT:
> +		case IS_GITDIR:
> +			/*
> +			 * If .git directory is deleted or renamed away,
> +			 * we have to quit.
> +			 */
> +			if (ef_is_root_delete(event_flags[k])) {
> +				trace2_data_string("fsmonitor", NULL,
> +						   "fsm-listen/gitdir",
> +						   "removed");
> +				goto force_shutdown;
> +			}
> +			if (ef_is_root_renamed(event_flags[k])) {
> +				trace2_data_string("fsmonitor", NULL,
> +						   "fsm-listen/gitdir",
> +						   "renamed");
> +				goto force_shutdown;
> +			}

I see. The only difference is in how we trace the result. I'm not sure
this tracing message is worth the differentiation.
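
If the two cases are merged, the helper could look something like the
sketch below. This is an illustration, not the patch's code: the
typedef and flag definitions are stand-ins with placeholder values so
that it builds off macOS, and `EF_ROOT_GONE_MASK` is a hypothetical
name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-ins so the sketch builds without the CoreServices headers.
 * The bit values are placeholders for illustration only.
 */
typedef uint32_t FSEventStreamEventFlags;
#define kFSEventStreamEventFlagItemRemoved 0x00000200u
#define kFSEventStreamEventFlagItemRenamed 0x00000800u
#define kFSEventStreamEventFlagItemIsDir   0x00020000u

/* Hypothetical name: either change that makes the root unusable. */
#define EF_ROOT_GONE_MASK \
	(kFSEventStreamEventFlagItemRemoved | \
	 kFSEventStreamEventFlagItemRenamed)

static_assert((EF_ROOT_GONE_MASK & kFSEventStreamEventFlagItemIsDir) == 0,
	      "the directory bit must not overlap the change bits");

/* True when a directory event reports a removal or a rename. */
static int ef_is_root_moved_or_deleted(FSEventStreamEventFlags ef)
{
	return (ef & kFSEventStreamEventFlagItemIsDir) &&
	       (ef & EF_ROOT_GONE_MASK);
}
```

The trade-off is that the trace output could no longer say "removed"
versus "renamed" without inspecting the flags a second time.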

> +			break;
> +
> +		case IS_WORKDIR_PATH:
> +			/* try to queue normal pathnames */
> +
> +			if (trace_pass_fl(&trace_fsmonitor))
> +				log_flags_set(path_k, event_flags[k]);
> +
> +			/* fsevent could be marked as both a file and directory */

The _same_ event? Interesting. And I see that you need to log the name
differently in the case of a file or a directory.

> +			if (event_flags[k] & kFSEventStreamEventFlagItemIsFile) {
> +				const char *rel = path_k +
> +					state->path_worktree_watch.len + 1;
> +
> +				if (!batch)
> +					batch = fsmonitor_batch__new();
> +				fsmonitor_batch__add_path(batch, rel);
> +			}
> +
> +			if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) {
> +				const char *rel = path_k +
> +					state->path_worktree_watch.len + 1;
> +				char *p = xstrfmt("%s/", rel);

In a critical path, xstrfmt() may be too slow for such a simple case.
Likely we should instead use a strbuf with:

	strbuf_addstr(&p, rel);
	strbuf_addch(&p, '/');

Bonus points if we can use the data to predict the size of the strbuf's
buffer.
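
To show the shape of the idea outside of Git's tree, here is a small
standalone sketch. `struct pathbuf`, `pathbuf_set_dir()` and
`pathbuf_release()` are hypothetical stand-ins for a `struct strbuf`
that is declared once outside the loop and reset with strbuf_reset()
on each iteration. The buffer only grows when a longer path arrives,
so most events would not allocate at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Git's strbuf so the sketch builds alone. */
struct pathbuf {
	char *buf;
	size_t len;
	size_t alloc;
};

/*
 * Overwrite pb with "<rel>/", growing the allocation only when the
 * current one is too small, so repeated calls reuse the same memory.
 * Returns pb->buf, or NULL if the allocation fails.
 */
static const char *pathbuf_set_dir(struct pathbuf *pb, const char *rel)
{
	size_t rel_len, need;

	assert(pb && rel);

	rel_len = strlen(rel);
	need = rel_len + 2;	/* rel + '/' + NUL */

	if (need > pb->alloc) {
		char *grown = realloc(pb->buf, need);
		if (!grown)
			return NULL;
		pb->buf = grown;
		pb->alloc = need;
	}

	memcpy(pb->buf, rel, rel_len);
	pb->buf[rel_len] = '/';
	pb->buf[rel_len + 1] = '\0';
	pb->len = rel_len + 1;
	return pb->buf;
}

/* Free the buffer and return pb to its initial empty state. */
static void pathbuf_release(struct pathbuf *pb)
{
	free(pb->buf);
	pb->buf = NULL;
	pb->len = pb->alloc = 0;
}
```

Whether this is measurably faster than xstrfmt() for the event rates
we see is something I have not measured.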

> +
> +				if (!batch)
> +					batch = fsmonitor_batch__new();
> +				fsmonitor_batch__add_path(batch, p);
> +
> +				free(p);
> +			}
> +
> +			break;
> +
> +		case IS_OUTSIDE_CONE:
> +		default:
> +			trace_printf_key(&trace_fsmonitor,
> +					 "ignoring '%s'", path_k);
> +			break;
> +		}
> +	}
> +
> +	fsmonitor_publish(state, batch, &cookie_list);
> +	string_list_clear(&cookie_list, 0);
> +	return;
> +
> +force_shutdown:
> +	if (fsmonitor_batch__free(batch))
> +		BUG("batch should not have a next");
> +	string_list_clear(&cookie_list, 0);
> +
> +	data->shutdown_style = FORCE_SHUTDOWN;
> +	CFRunLoopStop(data->rl);
> +	return;
> +}
> +
> +/*
> + * TODO Investigate the proper value for the `latency` argument in the call
> + * TODO to `FSEventStreamCreate()`.  I'm not sure that this needs to be a
> + * TODO config setting or just something that we tune after some testing.
> + * TODO
> + * TODO With a latency of 0.1, I was seeing lots of dropped events during
> + * TODO the "touch 100000" files test within t/perf/p7519, but with a
> + * TODO latency of 0.001 I did not see any dropped events.  So the "correct"
> + * TODO value may be somewhere in between.
> + * TODO
> + * TODO https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
> + */

As Eric mentioned in another thread, this should say "NEEDSWORK" at
the top. This is a good candidate for follow-up after the basics of
the series are stable.

>  int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
>  {
> +	FSEventStreamCreateFlags flags = kFSEventStreamCreateFlagNoDefer |
> +		kFSEventStreamCreateFlagWatchRoot |
> +		kFSEventStreamCreateFlagFileEvents;
> +	FSEventStreamContext ctx = {
> +		0,
> +		state,
> +		NULL,
> +		NULL,
> +		NULL
> +	};
> +	struct fsmonitor_daemon_backend_data *data;
> +	const void *dir_array[2];
> +
> +	data = xcalloc(1, sizeof(*data));

CALLOC_ARRAY()

> +	state->backend_data = data;
> +
> +	data->cfsr_worktree_path = CFStringCreateWithCString(
> +		NULL, state->path_worktree_watch.buf, kCFStringEncodingUTF8);
> +	dir_array[data->nr_paths_watching++] = data->cfsr_worktree_path;
> +
> +	if (state->nr_paths_watching > 1) {
> +		data->cfsr_gitdir_path = CFStringCreateWithCString(
> +			NULL, state->path_gitdir_watch.buf,
> +			kCFStringEncodingUTF8);
> +		dir_array[data->nr_paths_watching++] = data->cfsr_gitdir_path;
> +	}
> +
> +	data->cfar_paths_to_watch = CFArrayCreate(NULL, dir_array,
> +						  data->nr_paths_watching,
> +						  NULL);
> +	data->stream = FSEventStreamCreate(NULL, fsevent_callback, &ctx,
> +					   data->cfar_paths_to_watch,
> +					   kFSEventStreamEventIdSinceNow,
> +					   0.001, flags);
> +	if (data->stream == NULL)
> +		goto failed;
> +
> +	/*
> +	 * `data->rl` needs to be set inside the listener thread.
> +	 */
> +
> +	return 0;
> +
> +failed:
> +	error("Unable to create FSEventStream.");
> +
> +	FREE_AND_NULL(state->backend_data);
>  	return -1;
>  }
>  
>  void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
>  {
> +	struct fsmonitor_daemon_backend_data *data;
> +
> +	if (!state || !state->backend_data)
> +		return;
> +
> +	data = state->backend_data;
> +
> +	if (data->stream) {
> +		if (data->stream_started)
> +			FSEventStreamStop(data->stream);
> +		if (data->stream_scheduled)
> +			FSEventStreamInvalidate(data->stream);
> +		FSEventStreamRelease(data->stream);
> +	}
> +
> +	FREE_AND_NULL(state->backend_data);
>  }
>  
>  void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
>  {
> +	struct fsmonitor_daemon_backend_data *data;
> +
> +	data = state->backend_data;
> +	data->shutdown_style = SHUTDOWN_EVENT;
> +
> +	CFRunLoopStop(data->rl);
>  }
>  
>  void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
>  {
> +	struct fsmonitor_daemon_backend_data *data;
> +
> +	data = state->backend_data;
> +
> +	data->rl = CFRunLoopGetCurrent();
> +
> +	FSEventStreamScheduleWithRunLoop(data->stream, data->rl, kCFRunLoopDefaultMode);
> +	data->stream_scheduled = 1;
> +
> +	if (!FSEventStreamStart(data->stream)) {
> +		error("Failed to start the FSEventStream");
> +		goto force_error_stop_without_loop;
> +	}
> +	data->stream_started = 1;
> +
> +	CFRunLoopRun();
> +
> +	switch (data->shutdown_style) {
> +	case FORCE_ERROR_STOP:
> +		state->error_code = -1;
> +		/* fall thru */
> +	case FORCE_SHUTDOWN:
> +		ipc_server_stop_async(state->ipc_server_data);
> +		/* fall thru */
> +	case SHUTDOWN_EVENT:
> +	default:
> +		break;
> +	}
> +	return;
> +
> +force_error_stop_without_loop:
> +	state->error_code = -1;
> +	ipc_server_stop_async(state->ipc_server_data);
> +	return;
>  }

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* FS Monitor Windows Performance (was [PATCH 00/23] [RFC] Builtin FSMonitor Feature)
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (23 preceding siblings ...)
  2021-04-16 22:44 ` [PATCH 00/23] [RFC] Builtin FSMonitor Feature Junio C Hamano
@ 2021-04-27 18:49 ` Derrick Stolee
  2021-04-27 19:31 ` FS Monitor macOS " Derrick Stolee
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
  26 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 18:49 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> This patch series adds a builtin FSMonitor daemon to Git.
> 
> This daemon uses platform-specific filesystem notifications to keep track of
> changes to a working directory. It also listens over the "Simple IPC"
> facility for client requests and responds with a list of files/directories
> that have been recently modified.
...
> This RFC version includes support for Windows and MacOS file system events.
> A Linux version will be submitted in a later patch series.

I finished a full read-through of the series, and pointed out what I
could. I'm not an expert on filesystems or these platform-specific
APIs, so I could only do a surface-level check that those integrations
are correct. They certainly appear to be, and the real proof is in the
tests and its performance.

I mentioned that I am concerned about the need for delays in the test
suite, since the feature itself should be robust to scripts and tools
interacting with Git shortly after modifying the filesystem. I hope we
can isolate the need for such delays.

As for performance, I wanted to check the timings for how this improves
the case for large repositories. I believe it should be clear that this
makes things easier when there is a large set of filesystem events,
causing Git to need to walk more of the workdir in a command like 'git
status'. So, I wanted to focus on zero or one changes, and see how
that affects performance.

This message focuses only on the Windows case. I will provide my macOS
performance numbers in a separate message.

I've been using two cases, one that tests 'git status' when there are no
changes to the filesystem, and another where a file is modified and then
deleted (with 'git status' run after each of those two steps).

hyperfine \
	-n "none (clean)" "$GIT -c core.useBuiltinFSMonitor=false status" \
	-n "builtin (clean)" "$GIT -c core.useBuiltinFSMonitor=true status" \
	--warmup=5

hyperfine \
	-n "none (dirty)" "echo >>$FILE && $GIT -c core.useBuiltinFSMonitor=false status && rm $FILE && $GIT -c core.useBuiltinFSMonitor=false status" \
	-n "builtin (dirty)" "echo >>$FILE && $GIT -c core.useBuiltinFSMonitor=true status && rm $FILE && $GIT -c core.useBuiltinFSMonitor=true status" \
	--warmup=5

Note that we are running 'git status' twice in the dirty case, which
will make it appear like things are more than twice as slow as the
clean case.

I then got some disappointing results on my first run:

sparse-index disabled, untracked cache disabled
-----------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      1.870 s ±  0.029 s    [User: 6.1 ms, System: 13.9 ms]
  Range (min … max):    1.814 s …  1.903 s    10 runs

Benchmark #2: builtin (clean)
  Time (mean ± σ):      1.961 s ±  0.102 s    [User: 3.6 ms, System: 12.4 ms]
  Range (min … max):    1.832 s …  2.172 s    10 runs

Summary
  'none (clean)' ran
    1.05 ± 0.06 times faster than 'builtin (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      3.738 s ±  0.044 s    [User: 5.3 ms, System: 17.0 ms]
  Range (min … max):    3.663 s …  3.832 s    10 runs

Benchmark #2: builtin (dirty)
  Time (mean ± σ):      5.987 s ±  0.062 s    [User: 2.8 ms, System: 17.3 ms]
  Range (min … max):    5.895 s …  6.090 s    10 runs

Summary
  'none (dirty)' ran
    1.60 ± 0.03 times faster than 'builtin (dirty)'

This all depends on the index being very large. I'm testing using a repo
with 2 million files at HEAD, but only 4% actually checked out on disk.
This exaggerates the cost of the index rewrite. The FS Monitor feature
forces 'git status' to rewrite the index because it updates the token in
the extension. I wish I had a better understanding of why the index is
not updated in the default case.

Interestingly, the untracked cache extension makes a big difference here.
The performance of the overall behavior is much faster if the untracked
cache exists (when paired with the builtin FS Monitor; it doesn't make a
significant difference when FS Monitor is disabled).

sparse-index disabled, untracked cache enabled
----------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      1.803 s ±  0.037 s    [User: 1.3 ms, System: 19.5 ms]
  Range (min … max):    1.748 s …  1.878 s    10 runs

Benchmark #2: builtin (clean)
  Time (mean ± σ):      1.071 s ±  0.035 s    [User: 1.3 ms, System: 14.0 ms]
  Range (min … max):    1.019 s …  1.138 s    10 runs

Summary
  'builtin (clean)' ran
    1.68 ± 0.07 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      3.648 s ±  0.079 s    [User: 3.9 ms, System: 20.5 ms]
  Range (min … max):    3.533 s …  3.761 s    10 runs

Benchmark #2: builtin (dirty)
  Time (mean ± σ):      4.268 s ±  0.095 s    [User: 2.6 ms, System: 20.8 ms]
  Range (min … max):    4.115 s …  4.403 s    10 runs

Summary
  'none (dirty)' ran
    1.17 ± 0.04 times faster than 'builtin (dirty)'

However, when I enable the sparse-index and the code where 'git status'
works with it, then I get the results we hope for with the FS Monitor
feature:

sparse-index enabled, untracked cache enabled
---------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):     568.3 ms ±  21.7 ms    [User: 5.0 ms, System: 10.4 ms]
  Range (min … max):   541.6 ms … 598.3 ms    10 runs

Benchmark #2: builtin (clean)
  Time (mean ± σ):     214.8 ms ±  24.9 ms    [User: 1.0 ms, System: 16.0 ms]
  Range (min … max):   175.9 ms … 249.4 ms    12 runs

Summary
  'builtin (clean)' ran
    2.65 ± 0.32 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):     979.1 ms ±  30.2 ms    [User: 2.4 ms, System: 18.7 ms]
  Range (min … max):   951.8 ms … 1051.1 ms    10 runs

Benchmark #2: builtin (dirty)
  Time (mean ± σ):     529.2 ms ±  43.1 ms    [User: 8.0 ms, System: 12.4 ms]
  Range (min … max):   461.2 ms … 590.6 ms    10 runs

Summary
  'builtin (dirty)' ran
    1.85 ± 0.16 times faster than 'none (dirty)'

However, if I disable the untracked cache, we are back to where we
started:

sparse-index enabled, untracked cache disabled
----------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):     542.3 ms ±  28.5 ms    [User: 4.0 ms, System: 17.2 ms]
  Range (min … max):   501.8 ms … 594.4 ms    10 runs

Benchmark #2: builtin (clean)
  Time (mean ± σ):      1.126 s ±  0.034 s    [User: 5.2 ms, System: 7.6 ms]
  Range (min … max):    1.074 s …  1.163 s    10 runs

Summary
  'none (clean)' ran
    2.08 ± 0.13 times faster than 'builtin (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      1.128 s ±  0.032 s    [User: 6.9 ms, System: 5.0 ms]
  Range (min … max):    1.078 s …  1.202 s    10 runs

Benchmark #2: builtin (dirty)
  Time (mean ± σ):      2.334 s ±  0.072 s    [User: 2.9 ms, System: 21.7 ms]
  Range (min … max):    2.220 s …  2.444 s    10 runs

Summary
  'none (dirty)' ran
    2.07 ± 0.09 times faster than 'builtin (dirty)'

When I use a much smaller repository (git.git) without sparse-checkout,
I see that this extra cost is not there, but the untracked cache still
helps FS Monitor more than the standard case:

untracked cache disabled
-------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):     118.0 ms ±   8.4 ms    [User: 2.3 ms, System: 4.8 ms]
  Range (min … max):   102.6 ms … 133.7 ms    22 runs

Benchmark #2: builtin (clean)
  Time (mean ± σ):      72.2 ms ±   9.5 ms    [User: 3.0 ms, System: 7.8 ms]
  Range (min … max):    53.5 ms …  96.0 ms    33 runs

Summary
  'builtin (clean)' ran
    1.63 ± 0.25 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):     270.3 ms ±  17.6 ms    [User: 1.3 ms, System: 13.6 ms]
  Range (min … max):   248.7 ms … 306.8 ms    10 runs

Benchmark #2: builtin (dirty)
  Time (mean ± σ):     165.5 ms ±  10.4 ms    [User: 3.3 ms, System: 11.5 ms]
  Range (min … max):   146.0 ms … 183.7 ms    16 runs

Summary
  'builtin (dirty)' ran
    1.63 ± 0.15 times faster than 'none (dirty)'


untracked cache enabled
-------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):     129.3 ms ±  10.9 ms    [User: 2.2 ms, System: 6.9 ms]
  Range (min … max):   108.2 ms … 146.0 ms    19 runs

Benchmark #2: builtin (clean)
  Time (mean ± σ):      51.6 ms ±  10.5 ms    [User: 5.3 ms, System: 10.4 ms]
  Range (min … max):    34.9 ms …  99.1 ms    48 runs

Summary
  'builtin (clean)' ran
    2.51 ± 0.55 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):     214.5 ms ±   7.5 ms    [User: 7.7 ms, System: 3.4 ms]
  Range (min … max):   207.1 ms … 234.5 ms    12 runs

Benchmark #2: builtin (dirty)
  Time (mean ± σ):     131.8 ms ±  13.1 ms    [User: 2.9 ms, System: 9.8 ms]
  Range (min … max):   110.8 ms … 159.7 ms    22 runs

Summary
  'builtin (dirty)' ran
    1.63 ± 0.17 times faster than 'none (dirty)'

I think it would be valuable to discover why using the builtin FS Monitor
without the untracked cache causes such performance problems (on Windows).
That might be reason enough to enable the untracked cache feature when the
FS Monitor feature is enabled, as in the following diff:

--- >8 ---

diff --git a/repo-settings.c b/repo-settings.c
index 93aab92ff16..1f25609f019 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -58,9 +58,13 @@ void prepare_repo_settings(struct repository *r)
 		r->settings.core_multi_pack_index = value;
 	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
 
-	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
+	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value) {
 		r->settings.use_builtin_fsmonitor = 1;
 
+		/* Use untracked cache if FS Monitor is enabled. */
+		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
+	}
+
 	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
 		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
 		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);

--- >8 ---
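
For illustration, here is a minimal, self-contained sketch of why that diff
would leave an explicit core.untrackedCache setting alone. The macro has the
same shape as UPDATE_DEFAULT_BOOL() in repo-settings.c; the mock_settings
struct, mock_prepare() and the UC_* constants are hypothetical stand-ins, not
Git's real definitions, and the sketch assumes the explicit config value has
already been parsed before this block runs:

```c
/*
 * Same shape as the macro in repo-settings.c: fill in a default only
 * while the setting is still unset (-1).
 */
#define UPDATE_DEFAULT_BOOL(s, v) do { if ((s) == -1) { (s) = (v); } } while (0)

/* Hypothetical stand-ins; the real enum values live in repository.h. */
enum { UC_UNSET = -1, UC_REMOVE = 0, UC_WRITE = 1 };

struct mock_settings {
	int use_builtin_fsmonitor;
	int core_untracked_cache;
};

/*
 * explicit_uc models what core.untrackedCache parsing produced earlier
 * (UC_UNSET if the key was absent); fsmonitor_on models
 * core.useBuiltinFSMonitor=true.
 */
static void mock_prepare(struct mock_settings *s, int explicit_uc,
			 int fsmonitor_on)
{
	s->core_untracked_cache = explicit_uc;
	s->use_builtin_fsmonitor = 0;

	if (fsmonitor_on) {
		s->use_builtin_fsmonitor = 1;
		/* Use untracked cache if FS Monitor is enabled. */
		UPDATE_DEFAULT_BOOL(s->core_untracked_cache, UC_WRITE);
	}
}
```

With these stand-ins, an unset untracked cache becomes "write" once the
builtin FS Monitor is on, while an explicit "remove" is preserved.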

Thanks,
-Stolee


* FS Monitor macOS Performance (was [PATCH 00/23] [RFC] Builtin FSMonitor Feature)
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (24 preceding siblings ...)
  2021-04-27 18:49 ` FS Monitor Windows Performance (was [PATCH 00/23] [RFC] Builtin FSMonitor Feature) Derrick Stolee
@ 2021-04-27 19:31 ` Derrick Stolee
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
  26 siblings, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-27 19:31 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
> This patch series adds a builtin FSMonitor daemon to Git.
> 
> This daemon uses platform-specific filesystem notifications to keep track of
> changes to a working directory. It also listens over the "Simple IPC"
> facility for client requests and responds with a list of files/directories
> that have been recently modified.
...
> This RFC version includes support for Windows and MacOS file system events.
> A Linux version will be submitted in a later patch series.

Similarly to my message about testing the Windows performance, I
repeated those tests on macOS.

The same testing procedure was used, except now I'm on a MacBook
Pro laptop instead of a desktop, so the CPU power is likely to be
significantly less.

However, I am pleased to report that the FS Monitor feature is
a clear winner in all scenarios. Using the untracked cache is
still highly recommended, but not necessary in order to get a
speed boost from the builtin FS Monitor.


Sparse Index Disabled, Untracked Cache Enabled
----------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      3.980 s ±  0.026 s    [User: 919.1 ms, System: 1891.8 ms]
  Range (min … max):    3.940 s …  4.028 s    10 runs
 
Benchmark #2: builtin (clean)
  Time (mean ± σ):     477.9 ms ±   6.6 ms    [User: 772.9 ms, System: 379.7 ms]
  Range (min … max):   468.1 ms … 489.5 ms    10 runs
 
Summary
  'builtin (clean)' ran
    8.33 ± 0.13 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      5.411 s ±  0.199 s    [User: 2.993 s, System: 4.120 s]
  Range (min … max):    5.026 s …  5.756 s    10 runs
 
Benchmark #2: builtin (dirty)
  Time (mean ± σ):      2.588 s ±  0.025 s    [User: 3.752 s, System: 2.853 s]
  Range (min … max):    2.540 s …  2.628 s    10 runs
 
Summary
  'builtin (dirty)' ran
    2.09 ± 0.08 times faster than 'none (dirty)'

Sparse Index Disabled, Untracked Cache Disabled
-----------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      2.993 s ±  0.115 s    [User: 1.562 s, System: 2.289 s]
  Range (min … max):    2.741 s …  3.167 s    10 runs
 
Benchmark #2: builtin (clean)
  Time (mean ± σ):     939.4 ms ±  10.1 ms    [User: 1.452 s, System: 1.519 s]
  Range (min … max):   925.1 ms … 961.0 ms    10 runs
 
Summary
  'builtin (clean)' ran
    3.19 ± 0.13 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      8.245 s ±  1.118 s    [User: 3.204 s, System: 5.684 s]
  Range (min … max):    5.927 s …  8.985 s    10 runs
 
Benchmark #2: builtin (dirty)
  Time (mean ± σ):      2.969 s ±  0.034 s    [User: 3.832 s, System: 3.160 s]
  Range (min … max):    2.927 s …  3.023 s    10 runs
 
Summary
  'builtin (dirty)' ran
    2.78 ± 0.38 times faster than 'none (dirty)'


Sparse Index Enabled, Untracked Cache Enabled
---------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      1.250 s ±  0.050 s    [User: 216.9 ms, System: 1836.9 ms]
  Range (min … max):    1.177 s …  1.300 s    10 runs
 
Benchmark #2: builtin (clean)
  Time (mean ± σ):      89.3 ms ±   2.9 ms    [User: 51.3 ms, System: 22.6 ms]
  Range (min … max):    81.9 ms …  93.5 ms    31 runs
 
Summary
  'builtin (clean)' ran
   14.01 ± 0.72 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      2.087 s ±  0.095 s    [User: 320.9 ms, System: 3327.5 ms]
  Range (min … max):    1.943 s …  2.242 s    10 runs
 
Benchmark #2: builtin (dirty)
  Time (mean ± σ):     233.5 ms ±   2.7 ms    [User: 165.5 ms, System: 74.1 ms]
  Range (min … max):   227.8 ms … 237.1 ms    12 runs
 
Summary
  'builtin (dirty)' ran
    8.94 ± 0.42 times faster than 'none (dirty)'


Sparse Index Enabled, Untracked Cache Disabled
----------------------------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      1.277 s ±  0.101 s    [User: 215.5 ms, System: 1877.9 ms]
  Range (min … max):    1.138 s …  1.458 s    10 runs
 
Benchmark #2: builtin (clean)
  Time (mean ± σ):     300.0 ms ±   6.1 ms    [User: 119.4 ms, System: 183.1 ms]
  Range (min … max):   293.0 ms … 313.2 ms    10 runs
 
Summary
  'builtin (clean)' ran
    4.26 ± 0.35 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):      2.488 s ±  0.088 s    [User: 432.6 ms, System: 3631.6 ms]
  Range (min … max):    2.328 s …  2.601 s    10 runs
 
Benchmark #2: builtin (dirty)
  Time (mean ± σ):     636.4 ms ±  12.8 ms    [User: 266.2 ms, System: 374.0 ms]
  Range (min … max):   624.4 ms … 671.0 ms    10 runs

Summary
  'builtin (dirty)' ran
    3.91 ± 0.16 times faster than 'none (dirty)'


Here are my results for the Git repository:

Untracked Cache Enabled
-----------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      51.2 ms ±   4.0 ms    [User: 12.9 ms, System: 61.2 ms]
  Range (min … max):    46.2 ms …  65.7 ms    54 runs
 
Benchmark #2: builtin (clean)
  Time (mean ± σ):      38.6 ms ±   1.7 ms    [User: 9.9 ms, System: 9.7 ms]
  Range (min … max):    28.6 ms …  42.4 ms    75 runs
 
Summary
  'builtin (clean)' ran
    1.33 ± 0.12 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):     108.1 ms ±   7.2 ms    [User: 27.2 ms, System: 126.9 ms]
  Range (min … max):    97.6 ms … 130.4 ms    25 runs
 
Benchmark #2: builtin (dirty)
  Time (mean ± σ):      91.7 ms ±   3.8 ms    [User: 25.4 ms, System: 27.0 ms]
  Range (min … max):    88.5 ms … 105.1 ms    32 runs
 
Summary
  'builtin (dirty)' ran
    1.18 ± 0.09 times faster than 'none (dirty)'


Untracked Cache Disabled
------------------------

Benchmark #1: none (clean)
  Time (mean ± σ):      59.5 ms ±   4.0 ms    [User: 15.2 ms, System: 67.7 ms]
  Range (min … max):    55.5 ms …  71.6 ms    46 runs
 
Benchmark #2: builtin (clean)
  Time (mean ± σ):      48.9 ms ±   1.0 ms    [User: 12.5 ms, System: 17.3 ms]
  Range (min … max):    46.7 ms …  51.3 ms    58 runs
 
Summary
  'builtin (clean)' ran
    1.22 ± 0.08 times faster than 'none (clean)'

Benchmark #1: none (dirty)
  Time (mean ± σ):     124.4 ms ±   6.8 ms    [User: 31.5 ms, System: 140.2 ms]
  Range (min … max):   116.8 ms … 140.0 ms    24 runs
 
Benchmark #2: builtin (dirty)
  Time (mean ± σ):     104.1 ms ±   1.7 ms    [User: 27.4 ms, System: 37.8 ms]
  Range (min … max):    99.7 ms … 106.6 ms    27 runs
 
Summary
  'builtin (dirty)' ran
    1.19 ± 0.07 times faster than 'none (dirty)'

I think it is valuable to point out that in my initial tests I had forgotten
to disable the Watchman-based FS Monitor hook, and the results looked even
more impressive (on the small Git repository). Dropping the hook overhead is
a huge benefit here.

Thanks,
-Stolee



* Re: [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-04-27 12:42       ` Derrick Stolee
@ 2021-04-28  7:59         ` Ævar Arnfjörð Bjarmason
  2021-04-28 16:26           ` [PATCH] repo-settings.c: simplify the setup Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-28  7:59 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Johannes Schindelin via GitGitGadget, git, Jeff Hostetler,
	Johannes Schindelin


On Tue, Apr 27 2021, Derrick Stolee wrote:

> On 4/27/2021 5:20 AM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Mon, Apr 26 2021, Derrick Stolee wrote:
>> 
>>> On 4/1/21 11:40 AM, Johannes Schindelin via GitGitGadget wrote:
>>>> @@ -2515,6 +2515,11 @@ int git_config_get_max_percent_split_change(void)
> ...
>>>> --- a/repo-settings.c
>>>> +++ b/repo-settings.c
>>>> @@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
>>>>  		r->settings.core_multi_pack_index = value;
>>>>  	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
>>>>  
>>>> +	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
>>>> +		r->settings.use_builtin_fsmonitor = 1;
>>>> +
>>>
>>> Follows the patterns of repo settings. Good.
>> 
>> It follows the pattern, but as an aside the pattern seems a bit odd. I see
>> it dates back to your 7211b9e7534 (repo-settings: consolidate some
>> config settings, 2019-08-13).
>> 
>> I.e. we memset() the whole thing to -1, then for most things do something like:
>> 
>>     if (!repo_config_get_bool(r, "gc.writecommitgraph", &value))
>>         r->settings.gc_write_commit_graph = value;
>>     UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
>> 
>> But could do:
>> 
>>     if (repo_config_get_bool(r, "gc.writecommitgraph", &r->settings.gc_write_commit_graph))
>>         r->settings.gc_write_commit_graph = 1;
>> 
>> No? I.e. the repo_config_get_bool() function already returns non-zero if
>> we don't find it in the config.
>
> I see how this is fewer lines of code, but it is harder to read the intent
> of the implementation. The current [...]

That's exactly the reason I find the existing version unreadable, i.e.:

> layout makes it clear that we set the value from the config, if it
> exists, but otherwise we choose a default.

The repo_config_get_*() functions only return non-zero if the value
doesn't exist, so the pattern of:

    if (repo_config_get(..., "some.key", &value))
        value = 123;

Is idiomatic for "use 123 if some.key doesn't exist in config".

Maybe I'm missing something and that isn't true, but it seems like a
case of going out of one's way to use what the return value is going to
give you.
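
As a minimal sketch of that idiom (mock_config_get_bool() and get_bool_or()
are hypothetical stand-ins, assumed to share repo_config_get_bool()'s
convention of returning 0 when the key is found and non-zero otherwise):

```c
#include <string.h>

struct mock_entry {
	const char *key;
	int value;
};

/*
 * Returns 0 and stores the value in *dest when the key is found;
 * returns 1 and leaves *dest untouched when it is missing.
 */
static int mock_config_get_bool(const struct mock_entry *cfg, size_t nr,
				const char *key, int *dest)
{
	for (size_t i = 0; i < nr; i++) {
		if (!strcmp(cfg[i].key, key)) {
			*dest = cfg[i].value;
			return 0;
		}
	}
	return 1;
}

/* "Use def if the key doesn't exist in config", with no sentinel value. */
static int get_bool_or(const struct mock_entry *cfg, size_t nr,
		       const char *key, int def)
{
	int value;

	if (mock_config_get_bool(cfg, nr, key, &value))
		value = def;
	return value;
}
```

The point is that no "-1 means unset" state has to be carried around: the
return value already says whether the key was present.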

> Sometimes, this choice of a default _needs_ to be deferred, for example with
> the fetch_negotiation_algorithm setting, which can be set both from the
> fetch.negotiationAlgorithm config, but also the feature.experimental config.

Don't FETCH_NEGOTIATION_UNSET and UNTRACKED_CACHE_UNSET only exist as
action-at-a-distance interaction with the memset to -1 that this
function does?

I.e. it's somewhat complex state management, first we set it to
"uninit", then later act on fetch.negotiationalgorithm, and then on
feature.experimental, and then set a default only if we didn't do any of
the previous things.

I.e. something like:

    x = -1;
    if (fetch.negotiationalgorithm is set)
    if (x == -1 && feature.experimental is set)
    if (x == -1) x = default
    settings->x = x;

As opposed to a (to me at least) simpler:

    int x;
    if (fetch.negotiationalgorithm is set)
    else if (feature.experimental is set)
    else x = default
    settings->x = x;
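
Spelled out as compilable C, the second shape could look roughly like this.
This is only a sketch: the enum, pick_negotiation() and the choice of
NEG_SKIPPING for the experimental case are assumptions for illustration, not
the actual repo-settings.c code:

```c
#include <stddef.h>

/* Hypothetical stand-in for the fetch negotiation setting. */
enum negotiation {
	NEG_DEFAULT,
	NEG_SKIPPING,
};

/*
 * explicit_setting is NULL when fetch.negotiationAlgorithm is not
 * configured; experimental is non-zero when feature.experimental is
 * true. The first source that has an opinion wins, so no "unset"
 * enum member is needed.
 */
static enum negotiation pick_negotiation(const enum negotiation *explicit_setting,
					 int experimental)
{
	if (explicit_setting)
		return *explicit_setting;
	else if (experimental)
		return NEG_SKIPPING;
	return NEG_DEFAULT;
}
```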

> However, perhaps it would be better still for these one-off requests to
> create a new macro, say USE_CONFIG_OR_DEFAULT_BOOL() that fills a value
> from config _or_ sets the given default:
>
> #define USE_CONFIG_OR_DEFAULT_BOOL(r, v, s, d) \
> 	if (repo_config_get_bool(r, s, &v)) \
> 		v = d
>
> And then for this example we would write
>
> 	USE_CONFIG_OR_DEFAULT_BOOL(r, r->settings.core_commit_graph,
> 				   "core.commitgraph", 1);
>
> This would work for multiple config options in this file.

I came up with this:

+static void repo_env_config_bool_or_default(struct repository *r, const char *env,
+					    const char *key, int *dest, int def)
+{
+	if (env) {
+		int val = git_env_bool(env, -1);
+		if (val != -1) {
+			*dest = val;
+			return;
+		}
+	}
+	if (repo_config_get_bool(r, key, dest))
+		*dest = def;
+}

Used as e.g.:

+	repo_env_config_bool_or_default(r, NULL, "pack.usesparse",
+					&r->settings.pack_use_sparse, 1);
+	repo_env_config_bool_or_default(r, GIT_TEST_MULTI_PACK_INDEX, "core.multipackindex",
+					&r->settings.core_multi_pack_index, 1);

It works for most things there.

Using that sort of pattern also fixes e.g. a bug in your 18e449f86b7
(midx: enable core.multiPackIndex by default, 2020-09-25), where we'll
ignore a false-but-existing env value in favor of a true config key.
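
To make the intended precedence concrete, here is a small sketch with
tri-state inputs (-1 meaning "not set"). resolve_bool() is a hypothetical
helper, not part of Git; it only models the order "environment, then config,
then default":

```c
/*
 * env_val and cfg_val are -1 when the respective source is not set,
 * otherwise 0 (false) or 1 (true). An explicit "false" from the
 * environment must win over a "true" from the config.
 */
static int resolve_bool(int env_val, int cfg_val, int def)
{
	if (env_val != -1)
		return env_val;
	if (cfg_val != -1)
		return cfg_val;
	return def;
}
```

A check that only asks "is the env variable true?" cannot tell "set to
false" apart from "not set", which is how the false env value gets lost.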

>> I see the UPDATE_DEFAULT_BOOL() macro has also drifted from "set thing
>> default boolean" to "set any default value".
>  
> This is correct. I suppose it would be a good change to make some time.
> Such a rename could be combined with the refactor above.
>
> I would recommend waiting until such a change isn't conflicting with
> ongoing topics, such as this one.

I'm not planning to work on it, but thought I'd ask/prod the original
author if they were interested :)


* Re: [PATCH 01/23] fsmonitor--daemon: man page and documentation
  2021-04-26 14:13   ` Derrick Stolee
@ 2021-04-28 13:54     ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-28 13:54 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 10:13 AM, Derrick Stolee wrote:
> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Create a manual page describing the `git fsmonitor--daemon` feature.
>>
>> Update references to `core.fsmonitor`, `core.fsmonitorHookVersion` and
>> pointers to `watchman` to mention the built-in FSMonitor.
> 
> Makes sense to add clarity here, since there will be new ways
> to interact with a filesystem monitor.
>>   core.fsmonitorHookVersion::
>> -	Sets the version of hook that is to be used when calling fsmonitor.
>> -	There are currently versions 1 and 2. When this is not set,
>> -	version 2 will be tried first and if it fails then version 1
>> -	will be tried. Version 1 uses a timestamp as input to determine
>> -	which files have changes since that time but some monitors
>> -	like watchman have race conditions when used with a timestamp.
>> -	Version 2 uses an opaque string so that the monitor can return
>> -	something that can be used to determine what files have changed
>> -	without race conditions.
>> +	Sets the version of hook that is to be used when calling the
>> +	FSMonitor hook (as configured via `core.fsmonitor`).
>> ++
>> +There are currently versions 1 and 2. When this is not set,
>> +version 2 will be tried first and if it fails then version 1
>> +will be tried. Version 1 uses a timestamp as input to determine
>> +which files have changes since that time but some monitors
>> +like watchman have race conditions when used with a timestamp.
>> +Version 2 uses an opaque string so that the monitor can return
>> +something that can be used to determine what files have changed
>> +without race conditions.
> 
> This initially seemed like a big edit, but you just split the single
> paragraph into multiple, with a better leading sentence and a final
> statement about the built-in FSMonitor. Good.
>> ++
>> +Note: FSMonitor hooks (and this config setting) are ignored if the
>> +built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).
>> +
>> +core.useBuiltinFSMonitor::
>> +	If set to true, enable the built-in filesystem event watcher (for
>> +	technical details, see linkgit:git-fsmonitor--daemon[1]).
>> ++
>> +Like external (hook-based) FSMonitors, the built-in FSMonitor can speed up
>> +Git commands that need to refresh the Git index (e.g. `git status`) in a
>> +worktree with many files. The built-in FSMonitor facility eliminates the
>> +need to install and maintain an external third-party monitoring tool.
>> ++
>> +The built-in FSMonitor is currently available only on a limited set of
>> +supported platforms.
> 
> Is there a way for users to know this set of platforms? Can they run
> a command to find out? Will 'git fsmonitor--daemon --start' send a
> helpful message to assist here? Or, could there be a 'git
> fsmonitor--daemon --test' command?

I do have a `git fsmonitor--daemon --is-supported` option.  It will
exit with 0 if the current platform is supported.

It would probably be helpful to list the current platforms and/or
add a statement about the `--is-supported` command here.
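
A third-party tool could probe this from C along the following lines. This
is a hedged sketch: command_succeeds() is a hypothetical helper built on
POSIX system(), and the `--is-supported` spelling is the one from this RFC,
which may change if the options become subcommands:

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Returns 1 if `cmd` ran via the shell and exited with status 0, else 0. */
static int command_succeeds(const char *cmd)
{
	int status = system(cmd);

	if (status == -1)
		return 0; /* the shell could not be spawned */
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A caller would then use something like
command_succeeds("git fsmonitor--daemon --is-supported") before setting
core.useBuiltinFSMonitor.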

> 
>> +Note: if this config setting is set to `true`, any FSMonitor hook
>> +configured via `core.fsmonitor` (and possibly `core.fsmonitorHookVersion`)
>> +is ignored.
> ...
>> +git-fsmonitor--daemon(1)
>> +========================
>> +
>> +NAME
>> +----
>> +git-fsmonitor--daemon - Builtin file system monitor daemon
>> +
>> +SYNOPSIS
>> +--------
>> +[verse]
>> +'git fsmonitor--daemon' --start
>> +'git fsmonitor--daemon' --run
>> +'git fsmonitor--daemon' --stop
>> +'git fsmonitor--daemon' --is-running
>> +'git fsmonitor--daemon' --is-supported
>> +'git fsmonitor--daemon' --query <token>
>> +'git fsmonitor--daemon' --query-index
>> +'git fsmonitor--daemon' --flush
> 
> These arguments with the "--" prefix make it seem like they are
> options that could be grouped together, but you really want these
> to be verbs within the daemon. What do you think about removing
> the "--" prefixes?

That's easy enough.  The OPT_CMDMODE() made it easy to do it this
way.

> 
>> +
>> +DESCRIPTION
>> +-----------
>> +
>> +Monitors files and directories in the working directory for changes using
>> +platform-specific file system notification facilities.
>> +
>> +It communicates directly with commands like `git status` using the
>> +link:technical/api-simple-ipc.html[simple IPC] interface instead of
>> +the slower linkgit:githooks[5] interface.
>> +
>> +OPTIONS
>> +-------
> 
> I typically view "OPTIONS" as arguments that can be grouped together,
> but you are describing things more like verbs or subcommands. The
> most recent example I know about is 'git maintenance <subcommand>',
> documented at [1].
> 
> [1] https://git-scm.com/docs/git-maintenance#_subcommands

Let me take a look at doing the subcommand way.

> 
>> +
>> +--start::
>> +	Starts the fsmonitor daemon in the background.
>> +
>> +--run::
>> +	Runs the fsmonitor daemon in the foreground.
>> +
>> +--stop::
>> +	Stops the fsmonitor daemon running for the current working
>> +	directory, if present.
> 
> I'm noticing "fsmonitor" in lowercase throughout this document. Is
> that the intended case for user-facing documentation? I've been
> seeing "FS Monitor", "filesystem monitor", or even "File System
> Monitor" in other places.

I think I want to rewrite this whole man-page and address all
of the different spellings and phrasing.


> 
>> +--is-running::
>> +	Exits with zero status if the fsmonitor daemon is watching the
>> +	current working directory.
> 
> Another potential name for this verb is "status".
> 
>> +--is-supported::
>> +	Exits with zero status if the fsmonitor daemon feature is supported
>> +	on this platform.
> 
> Ah, here is an indicator of whether the platform is supported. Please
> include details for this command in the earlier documentation. I'll
> check later to see if a message is also sent over 'stderr', which
> would be helpful. Documenting the exit status is good for third-party
> tools that might use this.
> 
>> +--query <token>::
>> +	Connects to the fsmonitor daemon (starting it if necessary) and
>> +	requests the list of changed files and directories since the
>> +	given token.
>> +	This is intended for testing purposes.
>> +
>> +--query-index::
>> +	Read the current `<token>` from the File System Monitor index
>> +	extension (if present) and use it to query the fsmonitor daemon.
>> +	This is intended for testing purposes.
> 
> These two could be grouped as "query [--token=X|--index]", especially
> because they are for testing purposes.
> 
>> +
>> +--flush::
>> +	Force the fsmonitor daemon to flush its in-memory cache and
>> +	re-sync with the file system.
>> +	This is intended for testing purposes.
> 
> Do you see benefits to these being available in the CLI? Could these
> be better served as a test helper?

I debated putting the 3 test options into a test helper.
Let me take a look at that.

> 
>> +REMARKS
>> +-------
>> +The fsmonitor daemon is a long running process that will watch a single
>> +working directory.  Commands, such as `git status`, should automatically
>> +start it (if necessary) when `core.useBuiltinFSMonitor` is set to `true`
>> +(see linkgit:git-config[1]).
>> +
>> +Configure the built-in FSMonitor via `core.useBuiltinFSMonitor` in each
>> +working directory separately, or globally via `git config --global
>> +core.useBuiltinFSMonitor true`.
>> +
>> +Tokens are opaque strings.  They are used by the fsmonitor daemon to
>> +mark a point in time and the associated internal state.  Callers should
>> +make no assumptions about the content of the token.  In particular,
>> +they should not assume that it is a timestamp.
>> +
>> +Query commands send a request-token to the daemon and it responds with
>> +a summary of the changes that have occurred since that token was
>> +created.  The daemon also returns a response-token that the client can
>> +use in a future query.
>> +
>> +For more information see the "File System Monitor" section in
>> +linkgit:git-update-index[1].
>> +
>> +CAVEATS
>> +-------
>> +
>> +The fsmonitor daemon does not currently know about submodules and does
>> +not know to filter out file system events that happen within a
>> +submodule.  If fsmonitor daemon is watching a super repo and a file is
>> +modified within the working directory of a submodule, it will report
>> +the change (as happening against the super repo).  However, the client
>> +should properly ignore these extra events, so performance may be affected
>> +but it should not cause an incorrect result.
> 
> There are several uses of the word "should" where I think "will" is a
> more appropriate word. That is, unless we do not actually have confidence
> in this behavior.

I think I was just being overly conservative in my language.

> 
>> --- a/Documentation/git-update-index.txt
>> +++ b/Documentation/git-update-index.txt
>> @@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
>>   This feature is intended to speed up git operations for repos that have
>>   large working directories.
>>   
>> -It enables git to work together with a file system monitor (see the
>> +It enables git to work together with a file system monitor (see
>> +linkgit:git-fsmonitor--daemon[1]
>> +and the
>>   "fsmonitor-watchman" section of linkgit:githooks[5]) that can
>>   inform it as to what files have been modified. This enables git to avoid
>>   having to lstat() every file to find modified files.
>> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
>> index b51959ff9418..b7d5e926f7b0 100644
>> --- a/Documentation/githooks.txt
>> +++ b/Documentation/githooks.txt
>> @@ -593,7 +593,8 @@ fsmonitor-watchman
>>   
>>   This hook is invoked when the configuration option `core.fsmonitor` is
>>   set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
>> -depending on the version of the hook to use.
>> +depending on the version of the hook to use, unless overridden via
>> +`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).
>>   
>>   Version 1 takes two arguments, a version (1) and the time in elapsed
>>   nanoseconds since midnight, January 1, 1970.
> 
> These are good connections to make.
> 
> Since the documentation for the fsmonitor--daemon is so deep, this
> patch might be served well to split into two: one that just documents
> the daemon, and another that updates existing documentation to point
> to the new file.

Good point.  Thanks!

> 
> This does provide a good basis for me to investigate during the rest
> of the review.
> 
> Thanks,
> -Stolee
> 

Thanks
Jeff


* [PATCH] repo-settings.c: simplify the setup
  2021-04-28  7:59         ` Ævar Arnfjörð Bjarmason
@ 2021-04-28 16:26           ` Ævar Arnfjörð Bjarmason
  2021-04-28 19:09             ` Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup) Derrick Stolee
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-28 16:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Derrick Stolee, Taylor Blau, Patrick Steinhardt,
	Ævar Arnfjörð Bjarmason

Simplify the setup code in repo-settings.c in various ways, making the
code shorter, easier to read, and requiring fewer hacks to do the same
thing as it did before:

Since 7211b9e7534 (repo-settings: consolidate some config settings,
2019-08-13) we have memset() the whole "settings" structure to -1, and
subsequently relied on the -1 value. As it turns out most things did
not need to be initialized to -1, and e.g. UNTRACKED_CACHE_UNSET and
FETCH_NEGOTIATION_UNSET existed purely to reflect the previous
internal state of the prepare_repo_settings() function.

Much of the "are we -1, then read xyz" can simply be removed by
re-arranging what we read first. E.g. we should read
feature.experimental first, set some values, and then e.g. an explicit
index.version setting should override that. We don't need to read
index.version first, and then check when reading feature.experimental
if it's still -1.

Instead of the global ignore_untracked_cache_config variable added in
dae6c322fa1 (test-dump-untracked-cache: don't modify the untracked
cache, 2016-01-27) we can make use of the new facility to set config
via environment variables added in d8d77153eaf (config: allow
specifying config entries via envvar pairs, 2021-01-12).

It's arguably a bit hacky to use setenv() and getenv() to pass
messages between the same program, but since the test helpers are not
the main intended audience of repo-settings.c I think it's better than
hardcoding the test-only special-case in prepare_repo_settings().

In ad0fb659993 (repo-settings: parse core.untrackedCache, 2019-08-13)
the "unset" and "keep" handling for core.untrackedCache was
consolidated. But it apparently wasn't noticed that while we
understand the "keep" value, we actually don't handle it differently
than the case of any other unknown value.

So we can remove UNTRACKED_CACHE_KEEP from the codebase. It's not
handled any differently than UNTRACKED_CACHE_UNSET once we get past
the config parsing step.

The UPDATE_DEFAULT_BOOL() wrapper added in 31b1de6a09b (commit-graph:
turn on commit-graph by default, 2019-08-13) is redundant to simply
using the return value from repo_config_get_bool(), which is non-zero
if the provided key does not exist in the config.

This also fixes an (admittedly obscure) logic error in the previous
code where we'd conflate an explicit "-1" value in the config with our
own earlier memset() -1.

Since the two enum fields added in aaf633c2ad1 (repo-settings: create
feature.experimental setting, 2019-08-13) and
ad0fb659993 (repo-settings: parse core.untrackedCache, 2019-08-13)
don't rely on the memset() setting them to "-1" anymore we don't have
to provide them with explicit values. Let's also explicitly use the
enum type in read-cache.c and fetch-negotiator.c for
self-documentation. Since the FETCH_NEGOTIATION_UNSET is gone we can
remove the "default" case in fetch-negotiator.c, and rely on the
compiler to complain about missing enum values instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

On Wed, Apr 28 2021, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Apr 27 2021, Derrick Stolee wrote:
>
>> This is correct. I suppose it would be a good change to make some time.
>> Such a rename could be combined with the refactor above.
>>
>> I would recommend waiting until such a change isn't conflicting with
>> ongoing topics, such as this one.
>
> I'm not planning to work on it, but thought I'd ask/prod the original
> author if they were interested :)

Seems I'm pretty bad at sticking to my plans. Here's that refactoring,
since I mostly had this hacked-up locally anyway.

The conflict with the fsmonitor work can be resolved by adding:

	repo_config_get_bool_or(r, "core.usebuiltinfsmonitor",
				&r->settings.use_builtin_fsmonitor, 0);

To the "Boolean config or default, does not cascade (simple)" section
in my version. I.e. I assume nothing past 04/23 cares about the case
where it was set to "-1", which as noted in the commit message above
was (like many other setting variables) leaking an internal
implementation detail.

 cache.h                              |   7 --
 environment.c                        |   7 --
 fetch-negotiator.c                   |   6 +-
 read-cache.c                         |  17 ++--
 repo-settings.c                      | 119 +++++++++++++++------------
 repository.h                         |  15 ++--
 t/helper/test-dump-untracked-cache.c |   6 +-
 7 files changed, 92 insertions(+), 85 deletions(-)

diff --git a/cache.h b/cache.h
index 148d9ab5f18..7ea0feb3462 100644
--- a/cache.h
+++ b/cache.h
@@ -1684,13 +1684,6 @@ int update_server_info(int);
 const char *get_log_output_encoding(void);
 const char *get_commit_output_encoding(void);
 
-/*
- * This is a hack for test programs like test-dump-untracked-cache to
- * ensure that they do not modify the untracked cache when reading it.
- * Do not use it otherwise!
- */
-extern int ignore_untracked_cache_config;
-
 int committer_ident_sufficiently_given(void);
 int author_ident_sufficiently_given(void);
 
diff --git a/environment.c b/environment.c
index 2f27008424a..bc825cc7e05 100644
--- a/environment.c
+++ b/environment.c
@@ -96,13 +96,6 @@ int auto_comment_line_char;
 /* Parallel index stat data preload? */
 int core_preload_index = 1;
 
-/*
- * This is a hack for test programs like test-dump-untracked-cache to
- * ensure that they do not modify the untracked cache when reading it.
- * Do not use it otherwise!
- */
-int ignore_untracked_cache_config;
-
 /* This is set by setup_git_dir_gently() and/or git_default_config() */
 char *git_work_tree_cfg;
 
diff --git a/fetch-negotiator.c b/fetch-negotiator.c
index 57ed5784e14..c7c0eda7e21 100644
--- a/fetch-negotiator.c
+++ b/fetch-negotiator.c
@@ -8,8 +8,11 @@
 void fetch_negotiator_init(struct repository *r,
 			   struct fetch_negotiator *negotiator)
 {
+	enum fetch_negotiation_setting setting;
 	prepare_repo_settings(r);
-	switch(r->settings.fetch_negotiation_algorithm) {
+	setting = r->settings.fetch_negotiation_algorithm;
+
+	switch (setting) {
 	case FETCH_NEGOTIATION_SKIPPING:
 		skipping_negotiator_init(negotiator);
 		return;
@@ -19,7 +22,6 @@ void fetch_negotiator_init(struct repository *r,
 		return;
 
 	case FETCH_NEGOTIATION_DEFAULT:
-	default:
 		default_negotiator_init(negotiator);
 		return;
 	}
diff --git a/read-cache.c b/read-cache.c
index 5a907af2fb5..1aefe4a5c23 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1889,16 +1889,23 @@ static void check_ce_order(struct index_state *istate)
 static void tweak_untracked_cache(struct index_state *istate)
 {
 	struct repository *r = the_repository;
+	enum untracked_cache_setting setting;
 
 	prepare_repo_settings(r);
+	setting = r->settings.core_untracked_cache;
 
-	if (r->settings.core_untracked_cache  == UNTRACKED_CACHE_REMOVE) {
+	switch (setting) {
+	case UNTRACKED_CACHE_REMOVE:
 		remove_untracked_cache(istate);
-		return;
-	}
-
-	if (r->settings.core_untracked_cache == UNTRACKED_CACHE_WRITE)
+		break;
+	case UNTRACKED_CACHE_WRITE:
 		add_untracked_cache(istate);
+		break;
+	case UNTRACKED_CACHE_UNSET:
+		/* This includes core.untrackedCache=keep */
+		break;
+	}
+	return;
 }
 
 static void tweak_split_index(struct index_state *istate)
diff --git a/repo-settings.c b/repo-settings.c
index f7fff0f5ab8..2be242fde1d 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -3,40 +3,84 @@
 #include "repository.h"
 #include "midx.h"
 
-#define UPDATE_DEFAULT_BOOL(s,v) do { if (s == -1) { s = v; } } while(0)
+static void repo_config_get_bool_or(struct repository *r, const char *key,
+				    int *dest, int def)
+{
+	if (repo_config_get_bool(r, key, dest))
+		*dest = def;
+}
 
 void prepare_repo_settings(struct repository *r)
 {
-	int value;
+	int experimental;
+	int intval;
 	char *strval;
+	int manyfiles;
 
 	if (r->settings.initialized)
 		return;
 
 	/* Defaults */
-	memset(&r->settings, -1, sizeof(r->settings));
+	r->settings.index_version = -1;
+	r->settings.core_untracked_cache = UNTRACKED_CACHE_UNSET;
+	r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
+
+	/* Booleans config or default, cascades to other settings */
+	repo_config_get_bool_or(r, "feature.manyfiles", &manyfiles, 0);
+	repo_config_get_bool_or(r, "feature.experimental", &experimental, 0);
+
+	/* Defaults modified by feature.* */
+	if (experimental) {
+		r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
+	}
+	if (manyfiles) {
+		r->settings.index_version = 4;
+		r->settings.core_untracked_cache = UNTRACKED_CACHE_WRITE;
+	}
 
-	if (!repo_config_get_bool(r, "core.commitgraph", &value))
-		r->settings.core_commit_graph = value;
-	if (!repo_config_get_bool(r, "commitgraph.readchangedpaths", &value))
-		r->settings.commit_graph_read_changed_paths = value;
-	if (!repo_config_get_bool(r, "gc.writecommitgraph", &value))
-		r->settings.gc_write_commit_graph = value;
-	UPDATE_DEFAULT_BOOL(r->settings.core_commit_graph, 1);
-	UPDATE_DEFAULT_BOOL(r->settings.commit_graph_read_changed_paths, 1);
-	UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
+	/* Boolean config or default, does not cascade (simple)  */
+	repo_config_get_bool_or(r, "core.commitgraph",
+				&r->settings.core_commit_graph, 1);
+	repo_config_get_bool_or(r, "commitgraph.readchangedpaths",
+				&r->settings.commit_graph_read_changed_paths, 1);
+	repo_config_get_bool_or(r, "gc.writecommitgraph",
+				&r->settings.gc_write_commit_graph, 1);
+	repo_config_get_bool_or(r, "fetch.writecommitgraph",
+				&r->settings.fetch_write_commit_graph, 0);
+	repo_config_get_bool_or(r, "pack.usesparse",
+				&r->settings.pack_use_sparse, 1);
+	repo_config_get_bool_or(r, "core.multipackindex",
+				&r->settings.core_multi_pack_index, 1);
 
-	if (!repo_config_get_int(r, "index.version", &value))
-		r->settings.index_version = value;
-	if (!repo_config_get_maybe_bool(r, "core.untrackedcache", &value)) {
-		if (value == 0)
-			r->settings.core_untracked_cache = UNTRACKED_CACHE_REMOVE;
-		else
-			r->settings.core_untracked_cache = UNTRACKED_CACHE_WRITE;
-	} else if (!repo_config_get_string(r, "core.untrackedcache", &strval)) {
-		if (!strcasecmp(strval, "keep"))
-			r->settings.core_untracked_cache = UNTRACKED_CACHE_KEEP;
+	/*
+	 * The GIT_TEST_MULTI_PACK_INDEX variable is special in that
+	 * either it *or* the config sets
+	 * r->settings.core_multi_pack_index if true. We don't take
+	 * the environment variable if it exists (even if false) over
+	 * any config, as in other cases.
+	 */
+	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
+		r->settings.core_multi_pack_index = 1;
 
+	/*
+	 * Non-boolean config
+	 */
+	if (!repo_config_get_int(r, "index.version", &intval))
+		r->settings.index_version = intval;
+
+	if (!repo_config_get_string(r, "core.untrackedcache", &strval)) {
+		int maybe_bool = git_parse_maybe_bool(strval);
+		if (maybe_bool == -1) {
+			/*
+			 * Set to "keep", or some other non-boolean
+			 * value. In either case we do nothing but
+			 * keep UNTRACKED_CACHE_UNSET.
+			 */
+		} else {
+			r->settings.core_untracked_cache = maybe_bool
+				? UNTRACKED_CACHE_WRITE
+				: UNTRACKED_CACHE_REMOVE;
+		}
 		free(strval);
 	}
 
@@ -45,36 +89,5 @@ void prepare_repo_settings(struct repository *r)
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
 		else if (!strcasecmp(strval, "noop"))
 			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_NOOP;
-		else
-			r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_DEFAULT;
 	}
-
-	if (!repo_config_get_bool(r, "pack.usesparse", &value))
-		r->settings.pack_use_sparse = value;
-	UPDATE_DEFAULT_BOOL(r->settings.pack_use_sparse, 1);
-
-	value = git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0);
-	if (value || !repo_config_get_bool(r, "core.multipackindex", &value))
-		r->settings.core_multi_pack_index = value;
-	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
-
-	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
-		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
-		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
-	}
-
-	if (!repo_config_get_bool(r, "fetch.writecommitgraph", &value))
-		r->settings.fetch_write_commit_graph = value;
-	UPDATE_DEFAULT_BOOL(r->settings.fetch_write_commit_graph, 0);
-
-	if (!repo_config_get_bool(r, "feature.experimental", &value) && value)
-		UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_SKIPPING);
-
-	/* Hack for test programs like test-dump-untracked-cache */
-	if (ignore_untracked_cache_config)
-		r->settings.core_untracked_cache = UNTRACKED_CACHE_KEEP;
-	else
-		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_KEEP);
-
-	UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_DEFAULT);
 }
diff --git a/repository.h b/repository.h
index b385ca3c94b..9345423c5ba 100644
--- a/repository.h
+++ b/repository.h
@@ -12,18 +12,15 @@ struct raw_object_store;
 struct submodule_cache;
 
 enum untracked_cache_setting {
-	UNTRACKED_CACHE_UNSET = -1,
-	UNTRACKED_CACHE_REMOVE = 0,
-	UNTRACKED_CACHE_KEEP = 1,
-	UNTRACKED_CACHE_WRITE = 2
+	UNTRACKED_CACHE_UNSET,
+	UNTRACKED_CACHE_REMOVE,
+	UNTRACKED_CACHE_WRITE,
 };
 
 enum fetch_negotiation_setting {
-	FETCH_NEGOTIATION_UNSET = -1,
-	FETCH_NEGOTIATION_NONE = 0,
-	FETCH_NEGOTIATION_DEFAULT = 1,
-	FETCH_NEGOTIATION_SKIPPING = 2,
-	FETCH_NEGOTIATION_NOOP = 3,
+	FETCH_NEGOTIATION_DEFAULT,
+	FETCH_NEGOTIATION_SKIPPING,
+	FETCH_NEGOTIATION_NOOP,
 };
 
 struct repo_settings {
diff --git a/t/helper/test-dump-untracked-cache.c b/t/helper/test-dump-untracked-cache.c
index cf0f2c7228e..8b73a2f8bc3 100644
--- a/t/helper/test-dump-untracked-cache.c
+++ b/t/helper/test-dump-untracked-cache.c
@@ -45,8 +45,10 @@ int cmd__dump_untracked_cache(int ac, const char **av)
 	struct untracked_cache *uc;
 	struct strbuf base = STRBUF_INIT;
 
-	/* Hack to avoid modifying the untracked cache when we read it */
-	ignore_untracked_cache_config = 1;
+	/* Set core.untrackedCache=keep before setup_git_directory() */
+	setenv("GIT_CONFIG_COUNT", "1", 1);
+	setenv("GIT_CONFIG_KEY_0", "core.untrackedCache", 1);
+	setenv("GIT_CONFIG_VALUE_0", "keep", 1);
 
 	setup_git_directory();
 	if (read_cache() < 0)
-- 
2.31.1.734.g8d26f61af32


^ permalink raw reply related	[flat|nested] 237+ messages in thread

* Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup)
  2021-04-28 16:26           ` [PATCH] repo-settings.c: simplify the setup Ævar Arnfjörð Bjarmason
@ 2021-04-28 19:09             ` Derrick Stolee
  2021-04-28 23:01               ` Ævar Arnfjörð Bjarmason
  2021-04-29  5:12               ` Nesting topics within other threads Junio C Hamano
  0 siblings, 2 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-04-28 19:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Taylor Blau, Patrick Steinhardt

On 4/28/2021 12:26 PM, Ævar Arnfjörð Bjarmason wrote:
> Simplify the setup code in repo-settings.c in various ways, making the
> code shorter, easier to read, and requiring fewer hacks to do the same
> thing as it did before:

This patch is interesting, and I'll review it when I have some more
time. Probably tomorrow.

But I thought that I would point out that this pattern of adding a
patch within the thread of a larger series makes it very difficult
to separate the two. I use an email client that groups messages by
thread in order to help parse meaningful discussion from the list
which otherwise looks like a fire hose of noise. Now, this patch is
linked to the FS Monitor thread and feedback to either will trigger
the thread as having unread messages.

I find it very difficult to track multiple patch series that are
being juggled in the same thread. It is mentally taxing enough that
I have avoided reviewing code presented this way to save myself the
effort of tracking which patches go with what topic in what order.

Since I've committed to reviewing the FS Monitor code, I'd prefer if
this patch (or maybe its v2, since this is here already) be sent as
a top-level message so it can be discussed independently.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-04-26 14:31   ` Derrick Stolee
  2021-04-26 20:20     ` Eric Sunshine
@ 2021-04-28 19:26     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-28 19:26 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 10:31 AM, Derrick Stolee wrote:
> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>> +#define FSMONITOR_DAEMON_IS_SUPPORTED 1
>> +#else
>> +#define FSMONITOR_DAEMON_IS_SUPPORTED 0
>> +#endif
>> +
>> +/*
>> + * A trivial function so that this source file always defines at least
>> + * one symbol even when the feature is not supported.  This quiets an
>> + * annoying compiler error.
>> + */
>> +int fsmonitor_ipc__is_supported(void)
>> +{
>> +	return FSMONITOR_DAEMON_IS_SUPPORTED;
>> +}
> 
> I don't see any other use of FSMONITOR_DAEMON_IS_SUPPORTED,
> so I was thinking you could use the #ifdef/#else/#endif
> construct within the implementation of this method instead
> of creating a macro outside. But my suggestion might be an
> anti-pattern, so feel free to ignore me.

I think an earlier draft did more with the macros
and wasn't as distilled as it is here.  So yes, I
could simplify it here.
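
As a rough sketch of that simplification (not the actual follow-up
patch; it assumes only the HAVE_FSMONITOR_DAEMON_BACKEND define from
the quoted hunk), the conditional could move inside the function:

```c
/*
 * Sketch only: the same always-defined symbol, without the
 * intermediate FSMONITOR_DAEMON_IS_SUPPORTED macro.
 */
int fsmonitor_ipc__is_supported(void)
{
#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
	return 1;
#else
	return 0;
#endif
}
```

The source file still defines one symbol on every platform, so the
original reason for the function is preserved.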


>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>> +
>> +GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor")
>> +
>> +enum ipc_active_state fsmonitor_ipc__get_state(void)
>> +{
>> +	return ipc_get_active_state(fsmonitor_ipc__get_path());
>> +}
>> +
>> +static int spawn_daemon(void)
>> +{
>> +	const char *args[] = { "fsmonitor--daemon", "--start", NULL };
>> +
>> +	return run_command_v_opt_tr2(args, RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
>> +				    "fsmonitor");
>> +}
>> +
>> +int fsmonitor_ipc__send_query(const char *since_token,
>> +			      struct strbuf *answer)
>> +{
>> +	int ret = -1;
>> +	int tried_to_spawn = 0;
>> +	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
>> +	struct ipc_client_connection *connection = NULL;
>> +	struct ipc_client_connect_options options
>> +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
>> +
>> +	options.wait_if_busy = 1;
>> +	options.wait_if_not_found = 0;
>> +
>> +	trace2_region_enter("fsm_client", "query", NULL);
>> +
>> +	trace2_data_string("fsm_client", NULL, "query/command",
>> +			   since_token);
>> +
>> +try_again:
>> +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
>> +				       &connection);
>> +
>> +	switch (state) {
>> +	case IPC_STATE__LISTENING:
>> +		ret = ipc_client_send_command_to_connection(
>> +			connection, since_token, answer);
>> +		ipc_client_close_connection(connection);
>> +
>> +		trace2_data_intmax("fsm_client", NULL,
>> +				   "query/response-length", answer->len);
>> +
>> +		if (fsmonitor_is_trivial_response(answer))
>> +			trace2_data_intmax("fsm_client", NULL,
>> +					   "query/trivial-response", 1);
>> +
>> +		goto done;
>> +
>> +	case IPC_STATE__NOT_LISTENING:
>> +		ret = error(_("fsmonitor_ipc__send_query: daemon not available"));
>> +		goto done;
> 
> I'll need to read up on the IPC layer a bit to find out the difference
> between IPC_STATE__NOT_LISTENING and IPC_STATE__PATH_NOT_FOUND. When
> testing on my macOS machine, I got this error. I was expecting the
> daemon to be spawned. After spawning it myself, it started working.
> 
> I expect that there are some cases where the process can fail and the
> named pipe is not cleaned up. Let's investigate that soon. I should
> make it clear that I had tested the builtin FS Monitor on this machine
> a few weeks ago, but hadn't been using it much since. We should auto-
> recover from this situation.

This probably means that you had a leftover (dead) unix domain
socket in your .git directory from your prior testing.  I delete
it when the daemon shuts down normally, but if it was killed or
crashed, it may have been left behind.

When the client tries to connect (to a socket with no listener)
the OS refuses the connection and the IPC layer maps that back
to the __NOT_LISTENING error.

(There are other ways to get a __NOT_LISTENING error, such as when
the client times out because the daemon is too busy to respond,
but I'll ignore that for now.)

The unix daemon startup code tries to gently create the socket
(without stealing it from an existing server) and if that fails
(because there is no server present), it force creates a new socket
and starts listening on it.  (There was a large conversation on the
Simple IPC patch series about this.)  So this is how it fixed itself
after you started the daemon.


> But also: what is the cost of treating these two cases the same? Could
> we attempt to "restart" the daemon by spawning a new one? Will the new
> one find a way to kill a stale one?

On Windows, named pipes are magically deleted when the last handle
is closed (they are hosted on a special Named Pipe FS rather than NTFS,
so they have slightly different semantics).  If a named pipe exists,
but the connect fails, then a server is present but busy (or wedged).
The __NOT_LISTENING error basically means that the connect timed out.
So we know that the server is technically present, but it did not
respond.


On both platforms, if the socket/pipe is not present then the connect
returns __PATH_NOT_FOUND.  So we know that no daemon is present and are
free to implicitly start one.


The subtle difference in the __NOT_LISTENING case between the platforms
is why I hesitated to implicitly start (or restart) the daemon in this
case.
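
To make the Unix side of that concrete, here is a minimal sketch of
how a client-side probe could tell the two cases apart.  It is not the
Simple IPC code; the probe_* names are made up for illustration, and
the errno mapping (ENOENT for a missing path, ECONNREFUSED for a
socket file with no listener) is my assumption about typical Unix
behavior:

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical, simplified stand-ins for the IPC_STATE__* values. */
enum probe_state {
	PROBE_LISTENING,
	PROBE_NOT_LISTENING,
	PROBE_PATH_NOT_FOUND,
	PROBE_INVALID_PATH,
	PROBE_OTHER_ERROR
};

/* Try to connect to a unix domain socket and classify the outcome. */
static enum probe_state probe_unix_socket(const char *path)
{
	struct sockaddr_un sa;
	enum probe_state state;
	int fd;

	/* The path must fit in sun_path, including the trailing NUL. */
	if (strlen(path) >= sizeof(sa.sun_path))
		return PROBE_INVALID_PATH;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return PROBE_OTHER_ERROR;

	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	strcpy(sa.sun_path, path);

	if (!connect(fd, (struct sockaddr *)&sa, sizeof(sa)))
		state = PROBE_LISTENING;
	else if (errno == ENOENT)
		state = PROBE_PATH_NOT_FOUND;	/* no socket file at all */
	else if (errno == ECONNREFUSED)
		state = PROBE_NOT_LISTENING;	/* dead socket left behind */
	else
		state = PROBE_OTHER_ERROR;

	close(fd);
	return state;
}
```

On Unix a PROBE_NOT_LISTENING result from a refused connect most
likely means a stale socket, while on Windows the equivalent state
means a server exists but did not answer in time, which is the
asymmetry described above.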

I would like to revisit auto-starting the daemon (at least on Unix)
when we have a dead socket.  I'll review this.

Thanks for the question.


> 
> (Reading on.)
> 
>> +	case IPC_STATE__PATH_NOT_FOUND:
>> +		if (tried_to_spawn)
>> +			goto done;
>> +
>> +		tried_to_spawn++;
>> +		if (spawn_daemon())
>> +			goto done;
> 
> This should return zero on success, OK.
> 
>> +		/*
>> +		 * Try again, but this time give the daemon a chance to
>> +		 * actually create the pipe/socket.
>> +		 *
>> +		 * Granted, the daemon just started so it can't possibly have
>> +		 * any FS cached yet, so we'll always get a trivial answer.
>> +		 * BUT the answer should include a new token that can serve
>> +		 * as the basis for subsequent requests.
>> +		 */
>> +		options.wait_if_not_found = 1;
>> +		goto try_again;
> 
> Because of the tried_to_spawn check, we will re-run the request over
> IPC but will not retry the spawn_daemon() request. I'm unsure how
> this could be helpful: is it possible that spawn_daemon() returns a
> non-zero error code after starting the daemon and somehow that
> daemon starts working? Or, is this a race-condition thing with parallel
> processes also starting up the daemon? It could be good to use this
> comment to describe why a retry might be helpful.

I'm trying to be fairly conservative here.  If no daemon/socket/pipe
is present, we try once to start it and then (with a small delay) try
to connect to the new daemon.  There is a little race with our process
and the new daemon instance, but we have the client spin a little to
give the daemon a chance to get started up.  Normally, that connect
will then succeed.

If that new daemon fails to start or we have some other error, we
need to just give up and tell the caller to do the work -- we've
already held up the caller long enough IMHO.

The thought here is that if that first daemon failed to start, then
subsequent attempts are likely to also fail.  And we don't want to
cause the client to get stuck trying to repeatedly start the daemon.
Better to just give up and go on.
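
Stripped of the tracing and IPC details, the policy is roughly the
following.  This is only a sketch: conn_state, daemon_ops and the
fake_* helpers are invented here so the loop can be exercised on its
own, and they are not part of the patch.

```c
/* Hypothetical, simplified connection states. */
enum conn_state { CONN_LISTENING, CONN_PATH_NOT_FOUND, CONN_OTHER_ERROR };

struct daemon_ops {
	enum conn_state (*try_connect)(int wait_if_not_found, void *data);
	int (*spawn)(void *data);	/* returns 0 on success */
	void *data;
};

/*
 * Connect to the daemon, spawning it at most once.  Returns 0 when
 * connected and -1 when the caller should do the work itself.
 */
static int connect_spawning_once(const struct daemon_ops *ops)
{
	int tried_to_spawn = 0;
	int wait_if_not_found = 0;

	for (;;) {
		switch (ops->try_connect(wait_if_not_found, ops->data)) {
		case CONN_LISTENING:
			return 0;
		case CONN_PATH_NOT_FOUND:
			/* Never spawn twice; give up if spawning fails. */
			if (tried_to_spawn || ops->spawn(ops->data))
				return -1;
			tried_to_spawn = 1;
			/* Let the new daemon create its pipe/socket. */
			wait_if_not_found = 1;
			continue;
		default:
			return -1;
		}
	}
}

/* Toy daemon used only to exercise the loop above. */
struct fake_daemon {
	int running;		/* is a listener present? */
	int spawn_works;	/* does the spawn call succeed? */
	int comes_up;		/* does a spawned daemon start listening? */
	int spawn_calls;	/* number of spawn attempts made */
};

static enum conn_state fake_try_connect(int wait_if_not_found, void *data)
{
	struct fake_daemon *d = data;

	(void)wait_if_not_found;
	return d->running ? CONN_LISTENING : CONN_PATH_NOT_FOUND;
}

static int fake_spawn(void *data)
{
	struct fake_daemon *d = data;

	d->spawn_calls++;
	if (!d->spawn_works)
		return -1;
	if (d->comes_up)
		d->running = 1;
	return 0;
}
```

With a fake daemon that spawns successfully but never starts
listening, connect_spawning_once() returns -1 after exactly one spawn
attempt, which is the "give up and go on" behavior.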

> 
>> +
>> +	case IPC_STATE__INVALID_PATH:
>> +		ret = error(_("fsmonitor_ipc__send_query: invalid path '%s'"),
>> +			    fsmonitor_ipc__get_path());
>> +		goto done;
>> +
>> +	case IPC_STATE__OTHER_ERROR:
>> +	default:
>> +		ret = error(_("fsmonitor_ipc__send_query: unspecified error on '%s'"),
>> +			    fsmonitor_ipc__get_path());
>> +		goto done;
>> +	}
>> +
>> +done:
>> +	trace2_region_leave("fsm_client", "query", NULL);
>> +
>> +	return ret;
>> +}
>> +
>> +int fsmonitor_ipc__send_command(const char *command,
>> +				struct strbuf *answer)
>> +{
>> +	struct ipc_client_connection *connection = NULL;
>> +	struct ipc_client_connect_options options
>> +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
>> +	int ret;
>> +	enum ipc_active_state state;
>> +
>> +	strbuf_reset(answer);
>> +
>> +	options.wait_if_busy = 1;
>> +	options.wait_if_not_found = 0;
>> +
>> +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
>> +				       &connection);
>> +	if (state != IPC_STATE__LISTENING) {
>> +		die("fsmonitor--daemon is not running");
>> +		return -1;
>> +	}
>> +
>> +	ret = ipc_client_send_command_to_connection(connection, command, answer);
>> +	ipc_client_close_connection(connection);
>> +
>> +	if (ret == -1) {
>> +		die("could not send '%s' command to fsmonitor--daemon",
>> +		    command);
>> +		return -1;
>> +	}
>> +
>> +	return 0;
>> +}
> 
> I wonder if this ...send_command() method is too generic. It might
> be nice to have more structure to its inputs and outputs to lessen
> the cognitive load when plugging into other portions of the code.
> However, I'll wait to see what those consumers look like in case the
> generality is merited.
>>   struct category_description {
>>   	uint32_t category;
>> @@ -664,6 +665,9 @@ void get_version_info(struct strbuf *buf, int show_build_options)
>>   		strbuf_addf(buf, "sizeof-size_t: %d\n", (int)sizeof(size_t));
>>   		strbuf_addf(buf, "shell-path: %s\n", SHELL_PATH);
>>   		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
>> +
>> +		if (fsmonitor_ipc__is_supported())
>> +			strbuf_addstr(buf, "feature: fsmonitor--daemon\n");
> 
> This change might deserve its own patch, including some documentation
> about how users can use 'git version --build-options' to determine if
> the builtin FS Monitor feature is available on their platform.
> 

Good point.  Thanks.

> Thanks,
> -Stolee
> 

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup)
  2021-04-28 19:09             ` Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup) Derrick Stolee
@ 2021-04-28 23:01               ` Ævar Arnfjörð Bjarmason
  2021-05-05 16:12                 ` Johannes Schindelin
  2021-04-29  5:12               ` Nesting topics within other threads Junio C Hamano
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-28 23:01 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: git, Junio C Hamano, Taylor Blau, Patrick Steinhardt,
	Johannes Schindelin


On Wed, Apr 28 2021, Derrick Stolee wrote:

> On 4/28/2021 12:26 PM, Ævar Arnfjörð Bjarmason wrote:
>> Simplify the setup code in repo-settings.c in various ways, making the
>> code shorter, easier to read, and requiring fewer hacks to do the same
>> thing as it did before:
>
> This patch is interesting, and I'll review it when I have some more
> time. Probably tomorrow.
>
> But I thought that I would point out that this pattern of adding a
> patch within the thread of a larger series makes it very difficult
> to separate the two. I use an email client that groups messages by
> thread in order to help parse meaningful discussion from the list
> which otherwise looks like a fire hose of noise. Now, this patch is
> linked to the FS Monitor thread and feedback to either will trigger
> the thread as having unread messages.
>
> I find it very difficult to track multiple patch series that are
> being juggled in the same thread. It is mentally taxing enough that
> I have avoided reviewing code presented this way to save myself the
> effort of tracking which patches go with what topic in what order.
>
> Since I've committed to reviewing the FS Monitor code, I'd prefer if
> this patch (or maybe its v2, since this is here already) be sent as
> a top-level message so it can be discussed independently.

As a practical matter I think any effort I make to accommodate your
request will be dwarfed by your own starting of a sub-thread on
E-Mail/MUA nuances :)

When [1] was brought up the other day (showing that I'm probably not the
best person to ask about on-list In-Reply-To semantics) I was surprised
to find that we don't have much (if any) explicit documentation about
In-Reply-To best practices. There's a passing mention in
Documentation/MyFirstContribution.txt, but as far as I can tell from a
cursory glance that's it.

Personally I draw the line at "this random unrelated thing occurred to
me while reading X" vs. "this is directly in reply to X".

Reading the upthread I don't really see a good point at which to start
breaking the reply chain and not make things harder for others reading
along with clients that aren't yours (which, looking at your headers
seems to be Thunderbird 78).

I.e. the only feedback on the patch idea so far is your upthread "waiting
until such a change". With threading you can see that context, but without
it you'd need to get it via some non-MUA side-channel (presumably a
lore.kernel.org link). Sending a v2 (if any) without threading would
break the chain again.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2103191540330.57@tvgsbejvaqbjf.bet/

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: Nesting topics within other threads
  2021-04-28 19:09             ` Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup) Derrick Stolee
  2021-04-28 23:01               ` Ævar Arnfjörð Bjarmason
@ 2021-04-29  5:12               ` Junio C Hamano
  2021-04-29 12:14                 ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-04-29  5:12 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Ævar Arnfjörð Bjarmason, git, Taylor Blau,
	Patrick Steinhardt

Derrick Stolee <stolee@gmail.com> writes:

> On 4/28/2021 12:26 PM, Ævar Arnfjörð Bjarmason wrote:
>> Simplify the setup code in repo-settings.c in various ways, making the
>> code shorter, easier to read, and requiring fewer hacks to do the same
>> thing as it did before:
>
> This patch is interesting, and I'll review it when I have some more
> time. Probably tomorrow.
>
> But I thought that I would point out that this pattern of adding a
> patch within the thread of a larger series makes it very difficult
> to separate the two. I use an email client that groups messages by
> thread in order to help parse meaningful discussion from the list
> which otherwise looks like a fire hose of noise. Now, this patch is
> linked to the FS Monitor thread and feedback to either will trigger
> the thread as having unread messages.
>
> I find it very difficult to track multiple patch series that are
> being juggled in the same thread. It is mentally taxing enough that
> I have avoided reviewing code presented this way to save myself the
> effort of tracking which patches go with what topic in what order.

I do find a full "ah, I just thought of something while discussing
this unrelated series" patch in the middle of a thread fairly
irritating for the same reason.  It however is unavoidable human
nature that we come up with ideas while thinking about something not
necessarily related.  So it largely is a presentation issue.

I really appreciate the way some people (Peff is a stellar example,
but there are others who are as good at this) handle these tangents,
where the message sent to an existing thread is limited to only give
an outline of the idea (possibly with "something like this?" patch
for illustration) and then they quickly get out of the way of the
discussion by starting a separate thread, while back-referencing "So
here is a proper patch based on the idea I interjected in the
discussion of that other topic."  And the discussion on the tangent
will be done on its own thread.



^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: Nesting topics within other threads
  2021-04-29  5:12               ` Nesting topics within other threads Junio C Hamano
@ 2021-04-29 12:14                 ` Ævar Arnfjörð Bjarmason
  2021-04-29 20:14                   ` Jeff King
  2021-04-30  0:07                   ` Junio C Hamano
  0 siblings, 2 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-04-29 12:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, Taylor Blau, Patrick Steinhardt


On Thu, Apr 29 2021, Junio C Hamano wrote:

> Derrick Stolee <stolee@gmail.com> writes:
>
>> On 4/28/2021 12:26 PM, Ævar Arnfjörð Bjarmason wrote:
>>> Simplify the setup code in repo-settings.c in various ways, making the
>>> code shorter, easier to read, and requiring fewer hacks to do the same
>>> thing as it did before:
>>
>> This patch is interesting, and I'll review it when I have some more
>> time. Probably tomorrow.
>>
>> But I thought that I would point out that this pattern of adding a
>> patch within the thread of a larger series makes it very difficult
>> to separate the two. I use an email client that groups messages by
>> thread in order to help parse meaningful discussion from the list
>> which otherwise looks like a fire hose of noise. Now, this patch is
>> linked to the FS Monitor thread and feedback to either will trigger
>> the thread as having unread messages.
>>
>> I find it very difficult to track multiple patch series that are
>> being juggled in the same thread. It is mentally taxing enough that
>> I have avoided reviewing code presented this way to save myself the
>> effort of tracking which patches go with what topic in what order.
>
> I do find a full "ah, I just thought of something while discussing
> this unrelated series" patch in the middle of a thread fairly
> irritating for the same reason.  It however is unavoidable human
> nature that we come up with ideas while thinking about something not
> necessarily related.  So it largely is a presentation issue.
>
> I really appreciate the way some people (Peff is a stellar example,
> but there are others who are as good at this) handle these tangents,
> where the message sent to an existing thread is limited to only give
> an outline of the idea (possibly with "something like this?" patch
> for illustration) and then they quickly get out of the way of the
> discussion by starting a separate thread, while back-referencing "So
> here is a proper patch based on the idea I interjected in the
> discussion of that other topic."  And the discussion on the tangent
> will be done on its own thread.

In RFC 822 terms, are you talking about the In-Reply-To[1] or
References[2] headers, or both/neither?

I'm happy to go along with whatever the convention is but, as noted,
I think it's valuable to come to some explicit decision and document
the convention.

Threading isn't a concept that exists in E-Mail protocols per se. Just
In-Reply-To and References. The References header can reference N
messages most would think about as a separate "thread", and "thread" is
ultimately some fuzzy MUA-specific concept on top of these (and others).

E.g. in my client right now I'm looking at just 4 messages in this
"thread"; it doesn't descend down the whole In-Reply-To chain. Others
would act differently.

Some (such as GMail) have their own ad-hoc concept of "thread" separate
from anything in RFCs (which includes some fuzzy group-by-subject). In
GMail's web UI everything as of my "upthread"
<patch-1.1-e1d8c842c70-20210428T161817Z-avarab@gmail.com> is presented
as its own thread.

The ML is read as it happens, but it's also a collectively maintained
data structure.

It seems to me to be better to err on the side of using standard fields
for their intended purpose for archiving / future use. I.e. making "a
reference" universally machine-readable, as opposed to a lore.kernel.org
link, or a free-form "in a recent thread" blurb.

ML Archive Formats Matter[3] :)

But yes, maybe MUAs in the wild these days mostly render things one way
or another, so catering to them would be a good trade-off. I'm writing
this from within an Emacs MUA, so I don't have much of a feel for common
MUA conventions these days.

I'm prodding to see if we can define the problem exactly, because
e.g. maybe "References: <break@threading.hack> [actual <references>]" is
something that would achieve both aims, i.e. make the references
machine-readable, but break up threading in common in-the-wild
clients. We could then patch format-patch etc. to support such
"detached" threading.

1. https://tools.ietf.org/html/rfc822#section-4.6.2
2. https://tools.ietf.org/html/rfc822#section-4.6.3
3. https://keithp.com/blogs/Repository_Formats_Matter/

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: Nesting topics within other threads
  2021-04-29 12:14                 ` Ævar Arnfjörð Bjarmason
@ 2021-04-29 20:14                   ` Jeff King
  2021-04-30  0:07                   ` Junio C Hamano
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff King @ 2021-04-29 20:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Derrick Stolee, git, Taylor Blau, Patrick Steinhardt

On Thu, Apr 29, 2021 at 02:14:52PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I really appreciate the way some people (Peff is a stellar example,
> > but there are others who are as good at this) handle these tangents,
> > where the message sent to an existing thread is limited to only give
> > an outline of the idea (possibly with "something like this?" patch
> > for illustration) and then they quickly get out of the way of the
> > discussion by starting a separate thread, while back-referencing "So
> > here is a proper patch based on the idea I interjected in the
> > discussion of that other topic."  And the discussion on the tangent
> > will be done on its own thread.
> 
> In RFC 822 terms. Are you talking about the In-Reply-To[1] or
> References[2] headers, or both/neither?

Since I got listed as an example, I can tell you what I do: I start a
totally new thread with no in-reply-to or references to the old thread.
And the subject is new (usually "[PATCH 0/N] foo..."), so no clever
group-by-subject heuristics will link them.

It's usually a good idea to reference the message-id/lore link in at
least one direction, though (usually I'd do it in the new thread, saying
"this is a followup to ...", but you could also follow-up in the
original to say "I've spun this off into its own series here...").

Which is really _sort of_ like putting it into "References", except that
it's not machine readable. Which is a good thing, because it's a weaker
form and doesn't tell mail clients to group it all into one thread.
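
To make the distinction concrete, here is a minimal sketch of one common
threading heuristic (prefer the last id in References, else fall back to
In-Reply-To).  The struct and function names are hypothetical and real
clients differ in the details; the point is only that a message carrying
neither header gives the client nothing to group on, even if its body
mentions the old thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal view of the headers a mail client threads on. */
struct mail_headers {
	const char *message_id;   /* e.g. "<new@example>" */
	const char *in_reply_to;  /* NULL when the header is absent */
	const char *references;   /* space-separated ids, NULL when absent */
};

/*
 * Copy the id this message would be attached under into `out` and
 * return it, or return NULL if there is nothing to thread on and the
 * message has to be treated as a new root.
 */
static const char *thread_parent(const struct mail_headers *h,
				 char *out, size_t outsz)
{
	const char *last;
	size_t len;

	assert(h && out && outsz > 0);

	if (h->references && *h->references) {
		/* Use the last id listed in References. */
		last = strrchr(h->references, ' ');
		last = last ? last + 1 : h->references;
	} else if (h->in_reply_to && *h->in_reply_to) {
		last = h->in_reply_to;
	} else {
		return NULL; /* new thread; any link lives only in the body */
	}

	len = strlen(last);
	if (len >= outsz)
		len = outsz - 1;
	memcpy(out, last, len);
	out[len] = '\0';
	return out;
}
```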

> Threading isn't a concept that exists in E-Mail protocols per-se. Just
> In-Reply-To and References. The References header can reference N
> messages most would think about as a separate "thread", and "thread" is
> ultimately some fuzzy MUA-specific concept on top of these (and others).
> 
> E.g. in my client right now I'm looking at just 4 messages in this
> "thread", it doesn't descend down the whole In-Reply-To, others would
> act differently.

Interesting. Mutt (and notmuch, and public-inbox) definitely view these
as part of a larger thread.  It looks like you're using mu4e; I'm
surprised it doesn't, too (of course some clients will give a partial
view of a thread if you've already marked the older messages as read and
moved them into an archival folder).

> It seems to me to be better to err on the side of using standard fields
> for their intended purpose for archiving / future use. I.e. making "a
> reference" universally machine-readable, as opposed to a lore.kernel.org
> link, or a free-form "in a recent thread" blurb.

I'd disagree here. There's a long history of intentionally breaking the
thread in mailing lists and newsgroups exactly because the topic is
sufficiently different that you want to make it easy for people to treat
it as a separate unit. I admit there's a bit of an art form to deciding
when that is appropriate and when not.

-Peff


* Re: Nesting topics within other threads
  2021-04-29 12:14                 ` Ævar Arnfjörð Bjarmason
  2021-04-29 20:14                   ` Jeff King
@ 2021-04-30  0:07                   ` Junio C Hamano
  1 sibling, 0 replies; 237+ messages in thread
From: Junio C Hamano @ 2021-04-30  0:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, git, Taylor Blau, Patrick Steinhardt

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> In RFC 822 terms. Are you talking about the In-Reply-To[1] or
> References[2] headers, or both/neither?

Neither (I think Peff explained why it is a good idea to defer to
verbal communication not to confuse tools better than I could).


* Re: [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-04-26 14:56   ` Derrick Stolee
  2021-04-27  9:20     ` Ævar Arnfjörð Bjarmason
@ 2021-04-30 14:23     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 14:23 UTC (permalink / raw)
  To: Derrick Stolee, Johannes Schindelin via GitGitGadget, git
  Cc: Jeff Hostetler, Johannes Schindelin



On 4/26/21 10:56 AM, Derrick Stolee wrote:
> On 4/1/21 11:40 AM, Johannes Schindelin via GitGitGadget wrote:
>> @@ -2515,6 +2515,11 @@ int git_config_get_max_percent_split_change(void)
>>   
>>   int repo_config_get_fsmonitor(struct repository *r)
>>   {
>> +	if (r->settings.use_builtin_fsmonitor > 0) {
> 
> Don't forget to run prepare_repo_settings(r) first.
> 
>> +		core_fsmonitor = "(built-in daemon)";
>> +		return 1;
>> +	}
>> +
> 
> I found this odd, assigning a string to core_fsmonitor that
> would definitely cause a problem trying to execute it as a
> hook. I wondered the need for it at all, but found that
> there are several places in the FS Monitor subsystem that use
> core_fsmonitor as if it was a boolean, indicating whether or
> not the feature is enabled at all.
> 
> A cleaner way to handle this would be to hide the data behind
> a helper method, say "fsmonitor_enabled()" that could then
> check a value on the repository (or index) and store the hook
> value as a separate value that is only used by the hook-based
> implementation.
> 
> It's probably a good idea to do that cleanup now, before we
> find by accident that we missed a gap and start trying to run
> this bogus string as a hook invocation.

Good point.  In an earlier draft we were using that known
string as a bogus hook path to indicate that we should
call the IPC routines rather than the hook API.  But then
we added the `core.useBuiltinFSMonitor` boolean and had it
override all of the existing fsmonitor config settings.
So we don't technically need it to have a value now and can
and should stop using the pointer as a boolean.
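
A rough sketch of the shape that cleanup could take, assuming an explicit
mode enum replaces the "is the pointer non-NULL" test.  Every name below
(`fsm_mode`, `fsm_settings`, `fsm_enabled()`) is hypothetical and is not
taken from the series; it only illustrates keeping the hook path separate
from the on/off decision.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not Git's actual settings layout. */
enum fsm_mode {
	FSM_DISABLED = 0,
	FSM_HOOK,      /* core.fsmonitor names a hook to run */
	FSM_BUILTIN    /* core.useBuiltinFSMonitor is true */
};

struct fsm_settings {
	enum fsm_mode mode;
	const char *hook_path; /* only meaningful when mode == FSM_HOOK */
};

/* Resolve the two config values; the builtin boolean overrides the hook. */
static void fsm_settings_init(struct fsm_settings *s,
			      int use_builtin, const char *hook_path)
{
	assert(s);
	s->hook_path = NULL;
	if (use_builtin > 0) {
		s->mode = FSM_BUILTIN;
	} else if (hook_path && *hook_path) {
		s->mode = FSM_HOOK;
		s->hook_path = hook_path;
	} else {
		s->mode = FSM_DISABLED;
	}
}

/* Callers ask this instead of testing a string pointer for non-NULL. */
static int fsm_enabled(const struct fsm_settings *s)
{
	return s->mode != FSM_DISABLED;
}
```

With this split there is no placeholder string that could ever be handed
to the hook runner by mistake.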

Thanks!

>> -static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
>> +static int query_fsmonitor(int version, struct index_state *istate, struct strbuf *query_result)
>>   {
>> +	struct repository *r = istate->repo ? istate->repo : the_repository;
>> +	const char *last_update = istate->fsmonitor_last_update;
>>   	struct child_process cp = CHILD_PROCESS_INIT;
>>   	int result;
>>   
>>   	if (!core_fsmonitor)
>>   		return -1;
> 
> Here is an example of it being used as a boolean.
> 
>> +	if (r->settings.use_builtin_fsmonitor > 0) {
>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>> +		return fsmonitor_ipc__send_query(last_update, query_result);
>> +#else
>> +		/* Fake a trivial response. */
>> +		warning(_("fsmonitor--daemon unavailable; falling back"));
>> +		strbuf_add(query_result, "/", 2);
>> +		return 0;
>> +#endif
> 
> This seems like a case where the helper fsmonitor_ipc__is_supported()
> could be used instead of compile-time macros.
> 
> (I think this is especially true when we consider the future of the
> feature on Linux and the possibility of the same compiled code needing
> to check run-time properties of the platform for compatibility.)

Yes.
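
As a sketch of what that looks like at a call site: the single
preprocessor test moves into a probe function, and callers branch at run
time.  `HAVE_BACKEND_SKETCH`, `backend_is_supported()` and
`query_changes()` are illustrative names only, standing in for the real
macro and helpers, and the "/" response mirrors the trivial response in
the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a support probe.  The only #ifdef lives
 * here; HAVE_BACKEND_SKETCH is an illustrative macro, not Git's.
 */
static int backend_is_supported(void)
{
#ifdef HAVE_BACKEND_SKETCH
	return 1;
#else
	return 0;
#endif
}

/* Call sites branch at run time and contain no preprocessor logic. */
static int query_changes(char *out, size_t outsz)
{
	assert(out && outsz >= 2);

	if (!backend_is_supported()) {
		/* Fake a trivial response, as the quoted patch does. */
		snprintf(out, outsz, "/");
		return 0;
	}

	/* This sketch has no daemon to talk to. */
	out[0] = '\0';
	return -1;
}
```

A later run-time check (for example, probing the filesystem type) could
then be added inside the probe without touching any caller.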

>> --- a/repo-settings.c
>> +++ b/repo-settings.c
>> @@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
>>   		r->settings.core_multi_pack_index = value;
>>   	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
>>   
>> +	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
>> +		r->settings.use_builtin_fsmonitor = 1;
>> +
> 
> Follows the patterns of repo settings. Good.
> 

I'm going to ignore all of the thread responses to this patch
dealing with how we acquire config settings, macros, etc.
Those issues are completely independent of FSMonitor (which is
already way too big).

Jeff



* Re: [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-04-26 15:45     ` Derrick Stolee
@ 2021-04-30 14:31       ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 14:31 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 11:45 AM, Derrick Stolee wrote:
> On 4/26/21 11:08 AM, Derrick Stolee wrote:
>> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>>
>> I think these compile-time macros should be replaced with a
>> method call, as I've said before. It should be simple to say
>>
>> 	if (!fsmonitor_ipc__is_supported())
>> 		die(_("fsmonitor--daemon is not supported on this platform"));
>>
>> and call it a day. This can be done before parsing arguments.
>>
>>> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
>>> +{
>>> +	enum daemon_mode {
>>> +		UNDEFINED_MODE,
>>> +	} mode = UNDEFINED_MODE;
>>> +
>>> +	struct option options[] = {
>>> +		OPT_END()
>>> +	};
>>
>> I can see where you are going here, to use the parse-opts API
>> to get your "--<verb>" arguments to populate an 'enum'. However,
>> it seems like you will run into the problem where a user enters
>> multiple such arguments and you lose the information as the
>> parser overwrites 'mode' here.
> 
> I see that you use OPT_CMDMODE in your implementation, which
> makes this concern invalid.
> 
>> Better to use a positional argument and drop the "--" prefix,
>> in my opinion.
> 
> This is my personal taste, but the technical reason to do this
> doesn't exist.

Either method is fine/equivalent and I'm open to doing it either
way.  (In fact, I did the t/helper/test-simple-ipc the other way
and didn't even think about it.)

Does the mailing list have a preference for one form over the other?
That is:

     git fsmonitor--daemon --start [<options>]
vs
     git fsmonitor--daemon start [<options>]
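
For what it's worth, the two spellings differ only in the leading "--" as
far as mapping a verb to a mode goes.  The sketch below is a standalone
illustration that accepts either form; it does not use the parse-options
API, and `daemon_verb` / `parse_verb()` are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum daemon_verb {
	VERB_UNDEFINED = 0,
	VERB_START,
	VERB_RUN,
	VERB_STOP,
	VERB_IS_RUNNING
};

/* Map "start" or "--start" (and friends) to a verb. */
static enum daemon_verb parse_verb(const char *arg)
{
	static const struct {
		const char *name;
		enum daemon_verb verb;
	} table[] = {
		{ "start", VERB_START },
		{ "run", VERB_RUN },
		{ "stop", VERB_STOP },
		{ "is-running", VERB_IS_RUNNING },
	};
	size_t i;

	assert(arg);

	/* Tolerate the option-style spelling by skipping the prefix. */
	if (!strncmp(arg, "--", 2))
		arg += 2;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (!strcmp(arg, table[i].name))
			return table[i].verb;

	return VERB_UNDEFINED;
}
```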

Jeff



* Re: [PATCH 06/23] fsmonitor--daemon: implement client command options
  2021-04-26 15:12   ` Derrick Stolee
@ 2021-04-30 14:33     ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 14:33 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 11:12 AM, Derrick Stolee wrote:
> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Implement command options `--stop`, `--is-running`, `--query`,
>> `--query-index`, and `--flush` to control and query the status of a
>> `fsmonitor--daemon` server process (and implicitly start a server
>> process if necessary).
>>
>> Later commits will implement the actual server and monitor
>> the file system.
> 
> As mentioned before, I think the "query", "query-index", and
> "flush" commands are better served in a test helper. Luckily,
> the implementation you give here seems rather straightforward
> and could fit into a test helper without a lot of duplicated
> boilerplate. That's a good sign for the API presented here.
> 
> As a bonus, you could delay the implementation of those test
> helpers until they are going to be used in a test.
> 
> Thanks,
> -Stolee
> 

Good point.  I'll take a look at this.

Thanks
Jeff


* Re: [PATCH 09/23] fsmonitor--daemon: implement daemon command options
  2021-04-26 16:12     ` Derrick Stolee
@ 2021-04-30 15:18       ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 15:18 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 12:12 PM, Derrick Stolee wrote:
> On 4/26/2021 11:47 AM, Derrick Stolee wrote:
>> On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>> ...
>>> +	/* Prepare to (recursively) watch the <worktree-root> directory. */
>>> +	strbuf_init(&state.path_worktree_watch, 0);
>>> +	strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree()));
>>> +	state.nr_paths_watching = 1;
>>
>> Yes, let's watch the working directory.
>>
>>> +	/*
>>> +	 * If ".git" is not a directory, then <gitdir> is not inside the
>>> +	 * cone of <worktree-root>, so set up a second watch for it.
>>> +	 */
>>> +	strbuf_init(&state.path_gitdir_watch, 0);
>>> +	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
>>> +	strbuf_addstr(&state.path_gitdir_watch, "/.git");
>>> +	if (!is_directory(state.path_gitdir_watch.buf)) {
>>> +		strbuf_reset(&state.path_gitdir_watch);
>>> +		strbuf_addstr(&state.path_gitdir_watch, absolute_path(get_git_dir()));
>>> +		state.nr_paths_watching = 2;
>>> +	}
>>
>> But why watch the .git directory, especially for a worktree (or
>> submodule I guess)? What benefit do we get from events within the
>> .git directory? I'm expecting any event within the .git directory
>> should be silently ignored.
> 
> I see in a following patch that we place a cookie file within the
> .git directory. I'm reminded that this is done for a reason: other
> filesystem watchers can get into a loop if we place the cookie
> file outside of the .git directory. The classic example is VS Code
> running 'git status' in a loop because Watchman writes a cookie
> into the root of the working directory.

Yes.  I'll add a comment explaining the need for the second watch.
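
The decision in the quoted hunk can be restated as a small standalone
sketch.  It assumes fixed-size buffers and takes the directory test as a
function pointer so it can be exercised without touching the disk;
`watch_plan`, `plan_watches()` and the two stub predicates are
hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct watch_plan {
	char worktree[256];
	char gitdir[256];
	int nr_paths_watching;
};

typedef int (*is_dir_fn)(const char *path);

/* Stand-ins for a real stat()-based directory check. */
static int stub_is_dir_yes(const char *path) { (void)path; return 1; }
static int stub_is_dir_no(const char *path) { (void)path; return 0; }

static void plan_watches(struct watch_plan *plan, const char *worktree,
			 const char *real_gitdir, is_dir_fn is_dir)
{
	assert(plan && worktree && real_gitdir && is_dir);

	/* Always watch the worktree root (recursively). */
	snprintf(plan->worktree, sizeof(plan->worktree), "%s", worktree);
	snprintf(plan->gitdir, sizeof(plan->gitdir), "%s/.git", worktree);
	plan->nr_paths_watching = 1;

	if (!is_dir(plan->gitdir)) {
		/*
		 * ".git" is a link file, so the real gitdir is outside the
		 * worktree and cookie files written there would not be seen
		 * by the first watch.  Add a second one.
		 */
		snprintf(plan->gitdir, sizeof(plan->gitdir), "%s", real_gitdir);
		plan->nr_paths_watching = 2;
	}
}
```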


* Re: [PATCH 09/23] fsmonitor--daemon: implement daemon command options
  2021-04-26 15:47   ` Derrick Stolee
  2021-04-26 16:12     ` Derrick Stolee
@ 2021-04-30 15:59     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 15:59 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 11:47 AM, Derrick Stolee wrote:
...
> 
>> +
>>   static int is_ipc_daemon_listening(void)
>>   {
>>   	return fsmonitor_ipc__get_state() == IPC_STATE__LISTENING;
>>   }
>>   
>> +static int try_to_run_foreground_daemon(void)
>> +{
>> +	/*
>> +	 * Technically, we don't need to probe for an existing daemon
>> +	 * process, since we could just call `fsmonitor_run_daemon()`
>> +	 * and let it fail if the pipe/socket is busy.
>> +	 *
>> +	 * However, this method gives us a nicer error message for a
>> +	 * common error case.
>> +	 */
>> +	if (is_ipc_daemon_listening())
>> +		die("fsmonitor--daemon is already running.");
> Here, it seems like we only care about IPC_STATE__LISTENING, while
> earlier I mentioned that I ended up in IPC_STATE__NOT_LISTENING,
> and manually running the daemon helped.
> 
>> +	return !!fsmonitor_run_daemon();
>> +}
> 
> You are ignoring the IPC_STATE__NOT_LISTENING and creating a new
> process, which is good. I'm just wondering why that state exists
> and what is the proper way to handle it?

I'll revisit this and clarify.

> 
>> +
>> +#ifndef GIT_WINDOWS_NATIVE
> 
> You are already creating a platform-specific mechanism for the
> filesystem watcher. Shouldn't the implementation of this method
> be part of that file in compat/fsmonitor/?
> 
> I guess the biggest reason is that macOS and Linux share this
> implementation, so maybe this is the cleanest approach.

This has to do with how to spawn a background process and
disassociate from the console and all that.

On Windows, the "git fsmonitor--daemon --start" process[1] must
start a child process[2] with "git fsmonitor--daemon --run" and
then the [1] can exit (to let the caller/shell continue) while
[2] is free to continue.

On Unix, the "git fsmonitor--daemon --start" process[1] can
fork() a child process[2] and [2] can just call the _run_daemon()
code.  We don't need to exec the child, so this is a bit faster.
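
A minimal POSIX sketch of the Unix flavor, for illustration only.  The
`setsid()` call is an assumption about how one might detach from the
controlling terminal and is not necessarily what the patch does;
`spawn_background()` and `demo_daemon_body()` are hypothetical names, and
the latter stands in for `fsmonitor_run_daemon()`.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the daemon's main loop; returns its exit code. */
static int demo_daemon_body(void)
{
	return 7;
}

/*
 * Fork a child that runs `run` directly (no exec).  Returns the child
 * pid in the parent, or -1 if fork() failed.  Never returns in the child.
 */
static pid_t spawn_background(int (*run)(void))
{
	pid_t pid;

	assert(run);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child [2]: detach (assumed step), then run in-process. */
		setsid();
		_exit(run());
	}

	/* Parent [1]: free to return to the caller/shell. */
	return pid;
}
```

The Windows flavor cannot be written this way because there is no fork(),
which is why it has to start a second "--run" process instead.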

This code is platform-specific, so maybe it should go elsewhere,
but it has knowledge of fsmonitor--daemon-specific command line
args and private functions (`fsmonitor_run_daemon()`) and it knows
that it is not a library function.  So it made sense to keep it
close to the fsmonitor--daemon main entry point.


It didn't feel right to make these 2 versions of
`spawn_background_fsmonitor_daemon()` more generic (such as putting
them near `daemonize()`), because they know too much about
fsmonitor--daemon.

I did the same thing in `t/helper/test-simple-ipc.c` where I
created variants of this that started the test-tool in the background.
36a7eb6876 (t0052: add simple-ipc tests and t/helper/test-simple-ipc 
tool, 2021-03-22)


I thought about putting them inside `compat/fsmonitor/*.c`
but that has different problems.  Those files are concerned strictly
with the FS layer and how to get FS notification events from the
kernel and translating them into something cross-platform.  They
are, in a sense, a "driver" for that FS.  Process spawning is outside
of their scope.

And as you say, MacOS and Linux can both use the same process
spawning code, but will have vastly different FS layers.

So I left them here.


Thanks,
Jeff


* Re: [PATCH 11/23] fsmonitor--daemon: define token-ids
  2021-04-26 19:49   ` Derrick Stolee
  2021-04-26 20:01     ` Eric Sunshine
@ 2021-04-30 16:17     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 16:17 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 3:49 PM, Derrick Stolee wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach fsmonitor--daemon to create token-ids and define the
>> overall token naming scheme.
> ...
>> +/*
>> + * Requests to and from a FSMonitor Protocol V2 provider use an opaque
>> + * "token" as a virtual timestamp.  Clients can request a summary of all
>> + * created/deleted/modified files relative to a token.  In the response,
>> + * clients receive a new token for the next (relative) request.
>> + *
>> + *
>> + * Token Format
>> + * ============
>> + *
>> + * The contents of the token are private and provider-specific.
>> + *
>> + * For the built-in fsmonitor--daemon, we define a token as follows:
>> + *
>> + *     "builtin" ":" <token_id> ":" <sequence_nr>
>> + *
>> + * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
>> + * UUID, or {timestamp,pid}.  It is used to group all filesystem
>> + * events that happened while the daemon was monitoring (and in-sync
>> + * with the filesystem).
>> + *
>> + *     Unlike FSMonitor Protocol V1, it is not defined as a timestamp
>> + *     and does not define less-than/greater-than relationships.
>> + *     (There are too many race conditions to rely on file system
>> + *     event timestamps.)
>> + *
>> + * The <sequence_nr> is a simple integer incremented for each event
>> + * received.  When a new <token_id> is created, the <sequence_nr> is
>> + * reset to zero.
>> + *
>> + *
>> + * About Token Ids
>> + * ===============
>> + *
>> + * A new token_id is created:
>> + *
>> + * [1] each time the daemon is started.
>> + *
>> + * [2] any time that the daemon must re-sync with the filesystem
>> + *     (such as when the kernel drops or we miss events on a very
>> + *     active volume).
>> + *
>> + * [3] in response to a client "flush" command (for dropped event
>> + *     testing).
>> + *
>> + * [4] MAYBE We might want to change the token_id after very complex
>> + *     filesystem operations are performed, such as a directory move
>> + *     sequence that affects many files within.  It might be simpler
>> + *     to just give up and fake a re-sync (and let the client do a
>> + *     full scan) than try to enumerate the effects of such a change.
>> + *
>> + * When a new token_id is created, the daemon is free to discard all
>> + * cached filesystem events associated with any previous token_ids.
>> + * Events associated with a non-current token_id will never be sent
>> + * to a client.  A token_id change implicitly means that the daemon
>> + * has a gap in its event history.
>> + *
>> + * Therefore, clients that present a token with a stale (non-current)
>> + * token_id will always be given a trivial response.
> 
> From this comment, it seems to be the case that concurrent Git
> commands will race to advance the FS Monitor token and one of them
> will lose, causing a full working directory scan. There is no list
> of "recent" tokens.
> 
> I could see this changing in the future, but for now it is a
> reasonable simplification.

The daemon only creates a new token-id when it needs to because of
a loss of sync with the FS.  And the sequence-nr is advanced based
upon the quantity of FS activity.  Clients don't cause either to
change or advance (except for the flush, which is a testing hack).

Ideally, the token-id is created when the daemon starts up and is
never changed.

Concurrent clients all receive normalized event data from the
in-memory cache/queue from threads reading the queue in parallel.
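
To illustrate the client-visible rule, here is a sketch that splits a
token of the documented form "builtin:<token_id>:<sequence_nr>" and
decides whether a trivial response is needed.  The function names, the
64-byte id limit, and splitting at the last colon are assumptions made
for the example, not details of the daemon.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct parsed_token {
	char token_id[64];
	unsigned long sequence_nr;
};

/* Parse "builtin:<token_id>:<sequence_nr>"; 0 on success, -1 otherwise. */
static int parse_token(const char *token, struct parsed_token *out)
{
	const char *id, *colon;
	char *end;
	size_t len;

	assert(token && out);

	if (strncmp(token, "builtin:", 8))
		return -1; /* e.g. a V1 timestamp or another provider */
	id = token + 8;

	/* Split at the last colon so the id itself may contain colons. */
	colon = strrchr(id, ':');
	if (!colon || colon == id)
		return -1;
	len = (size_t)(colon - id);
	if (len >= sizeof(out->token_id))
		return -1;
	memcpy(out->token_id, id, len);
	out->token_id[len] = '\0';

	if (colon[1] < '0' || colon[1] > '9')
		return -1;
	out->sequence_nr = strtoul(colon + 1, &end, 10);
	return *end ? -1 : 0;
}

/* A stale or unparseable token always gets the trivial response. */
static int needs_trivial_response(const char *client_token,
				  const char *current_token_id)
{
	struct parsed_token t;

	if (parse_token(client_token, &t) < 0)
		return 1;
	return strcmp(t.token_id, current_token_id) != 0;
}
```

Note that a client presenting the current token_id never changes it; it
only receives the events after its sequence_nr.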


I included [4] as a possible future enhancement, but so far haven't
actually needed it.  The event stream (at least on Windows and MacOS)
from the OS is sufficient that I didn't need to implement that.

I'll remove [4] from the comments.

Thanks,
Jeff


* Re: [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache
  2021-04-26 20:22   ` Derrick Stolee
@ 2021-04-30 17:36     ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 17:36 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 4:22 PM, Derrick Stolee wrote:
...
>> +
>> +void fsmonitor_batch__add_path(struct fsmonitor_batch *batch,
>> +			       const char *path)
>> +{
>> +	const char *interned_path = strintern(path);
> 
> This use of interned paths is interesting, although I become
> concerned for the amount of memory we are consuming over the
> lifetime of the process. This could be considered as a target
> for future improvements, perhaps with an LRU cache or something
> similar.

Interning gives us a fixed pointer for any given path.  This
gives us a way to de-dup paths using just pointers rather than
string compares.
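
A toy illustration of that property, assuming a fixed-size table with a
linear scan (a real interning table would use a hash map, and
`intern_sketch()` is a made-up name): equal strings always come back as
the same pointer, so de-duplication is a pointer comparison.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_MAX 1024

static char *intern_table[INTERN_MAX];
static size_t intern_nr;

/*
 * Return the canonical copy of `s`.  Entries are never freed, which is
 * what makes the returned pointer stable for the life of the process.
 * Returns NULL if the table is full or allocation fails.
 */
static const char *intern_sketch(const char *s)
{
	size_t i, len;
	char *copy;

	assert(s);

	for (i = 0; i < intern_nr; i++)
		if (!strcmp(intern_table[i], s))
			return intern_table[i];

	if (intern_nr == INTERN_MAX)
		return NULL;

	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		return NULL;
	memcpy(copy, s, len);
	intern_table[intern_nr++] = copy;
	return copy;
}
```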

Yes, we will accumulate paths in that dictionary, but the set of
paths present in the typical working directory are usually pretty
fixed.

We only generate these for modified paths.  Users don't typically
create/modify/delete that many paths in their source trees during
normal development.

Compilers may generate lots of trash files in their worktree, but
those names are usually repeated (with each "make").  So although we
might accumulate a lot of paths for a repo, the set should become stable.
However, if they use temp files in the tree, it might invalidate
this statement.

WRT LRUs, that gets us into threading and lock contention problems
and ref-counting.  I have it designed such that parallel threads
read and send the current queue to the client without a lock.  They
only need a quick lock to get the current head pointer; the rest
is done lock free.  Also, purging from the end of the LRU would
put us in contention with the FS listener thread that is adding
new paths to the LRU.
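
The locking pattern described above can be sketched roughly as follows,
assuming a singly linked list whose nodes are never modified or freed
once published.  The names are hypothetical and this is not the daemon's
actual data structure; it only shows why readers need the lock just long
enough to copy the head pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct batch_sketch {
	const struct batch_sketch *next; /* older batch; immutable once published */
	const char *path;
};

static pthread_mutex_t head_lock = PTHREAD_MUTEX_INITIALIZER;
static const struct batch_sketch *head;

/* Listener thread: build the node fully, then publish it under the lock. */
static int publish(const char *path)
{
	struct batch_sketch *b = malloc(sizeof(*b));

	assert(path);
	if (!b)
		return -1;
	b->path = path;

	pthread_mutex_lock(&head_lock);
	b->next = head;
	head = b;
	pthread_mutex_unlock(&head_lock);
	return 0;
}

/* Client thread: hold the lock only long enough to snapshot the head. */
static size_t count_batches(void)
{
	const struct batch_sketch *b;
	size_t n = 0;

	pthread_mutex_lock(&head_lock);
	b = head;
	pthread_mutex_unlock(&head_lock);

	/* Everything reachable from the snapshot is immutable; no lock. */
	for (; b; b = b->next)
		n++;
	return n;
}
```

Purging old nodes would break the "never freed" assumption that the
unlocked walk relies on, which is where the ref-counting comes in.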

So, yeah, maybe this is something to keep an eye on -- especially
in the monorepo case, but I don't think we need to address it now.

Thanks,
Jeff


* Re: [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-04-27 17:22   ` Derrick Stolee
  2021-04-27 17:41     ` Eric Sunshine
@ 2021-04-30 19:32     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 19:32 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/27/21 1:22 PM, Derrick Stolee wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach the win32 backend to register a watch on the working tree
>> root directory (recursively).  Also watch the <gitdir> if it is
>> not inside the working tree.  And to collect path change notifications
>> into batches and publish.
> 
> Is it valuable to list the important API methods here for an interested
> reader to discover them? Perhaps using links to the docs [1] might be
> too ephemeral, in case those URLs stop being valid.
> 
> In any case, here are the URLs I found helpful:
> 
> [1] https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-readdirectorychangesw
> [2] https://docs.microsoft.com/en-us/windows/win32/api/ioapiset/nf-ioapiset-getoverlappedresult
> [3] https://docs.microsoft.com/en-us/windows/win32/fileio/cancelioex-func
> [4] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-resetevent
> [5] https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-file_notify_information
> [6] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects

I could see adding them.  I think I had some in an earlier draft.
(And I'm always glad to find them when I go back and revisit the
code later :-)

And yes, the Win32 APIs and code are a bit dense and tricky to
understand -- especially the async IO stuff, so this would be good.

> 
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
...
>> +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
>> +				      const char *path)
>> +{
>> +	struct one_watch *watch = NULL;
>> +	DWORD desired_access = FILE_LIST_DIRECTORY;
>> +	DWORD share_mode =
>> +		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
> 
> Ah, this is probably why we can delete a repo that is under a watch.

Yes, we're holding a handle to the directory (and we're CD'd into it),
but I have code in there to detect when the .git directory is deleted
and trigger a shutdown.  A "rm -rf" on the worktree will delete the
.git directory first and we "usually" win that race before "rm" gets
around to deleting the root directory.

I intend to harden this a bit more.  The Git startup code does the CD
to the root directory and I'd like to CD out after we get everything
setup.

...
>> +static int recv_rdcw_watch(struct one_watch *watch)
>> +{
>> +	watch->is_active = FALSE;
>> +
>> +	if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count,
>> +				TRUE))
>> +		return 0;
>> +
>> +	// TODO If an external <gitdir> is deleted, the above returns an error.
>> +	// TODO I'm not sure that there's anything that we can do here other
>> +	// TODO than failing -- the <worktree>/.git link file would be broken
>> +	// TODO anyway.  We might try to check for that and return a better
>> +	// TODO error message.
> 
> These are not fit C-style comments. This situation can be handled
> by a later patch series, if valuable enough.

Oooops, I missed that one before I posted it.

> 
>> +
>> +	error("GetOverlappedResult failed on '%s' [GLE %ld]",
>> +	      watch->path.buf, GetLastError());
>> +	return -1;
>> +}
>> +
>> +static void cancel_rdcw_watch(struct one_watch *watch)
>> +{
>> +	DWORD count;
>> +
>> +	if (!watch || !watch->is_active)
>> +		return;
>> +
>> +	CancelIoEx(watch->hDir, &watch->overlapped);
>> +	GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE);
>> +	watch->is_active = FALSE;
>> +}
>> +
>> +/*
>> + * Process filesystem events that happen anywhere (recursively) under the
>> + * <worktree> root directory.  For a normal working directory, this includes
>> + * both version controlled files and the contents of the .git/ directory.
>> + *
>> + * If <worktree>/.git is a file, then we only see events for the file
>> + * itself.
>> + */
>> +static int process_worktree_events(struct fsmonitor_daemon_state *state)
>> +{
>> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
>> +	struct one_watch *watch = data->watch_worktree;
>> +	struct strbuf path = STRBUF_INIT;
>> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
>> +	struct fsmonitor_batch *batch = NULL;
>> +	const char *p = watch->buffer;
>> +
>> +	/*
>> +	 * If the kernel gets more events than will fit in the kernel
>> +	 * buffer associated with our RDCW handle, it drops them and
>> +	 * returns a count of zero.  (A successful call, but with
>> +	 * length zero.)
>> +	 */
> 
> I suppose that since we create a cookie file, we don't expect a zero
> result to ever be a meaningful value? Or, is there another way to
> differentiate between "nothing happened" and "too much happened"?

This is independent of cookie files.  We have a thread watching
for FS events and building up batches of changes and
assembling the list of batches.  This is always running.

Cookies are created by the threads in the IPC thread pool and
those threads wait until we get a FS event for their cookie file.

All of these threads are running independently.

The code here is saying that Windows will give us a non-error but
zero result when the kernel had to drop events.  (The "too much
happened" case.)

If nothing happens on the FS, our Async IO doesn't trigger a
wakeup and the FS listener thread stays in the WaitForMultipleObjects().

All of this is pretty dense I realize.
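
A portable sketch of just the buffer handling may help.  It assumes a
simplified stand-in record (`notify_record`) instead of the real
FILE_NOTIFY_INFORMATION, which carries a length-prefixed wide-character
name; only the "zero bytes means events were dropped" check and the
next-entry-offset walk are meant to match the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the fields of FILE_NOTIFY_INFORMATION used here. */
struct notify_record {
	unsigned int next_entry_offset; /* bytes to the next record, 0 = last */
	unsigned int action;
	char name[24];
};

/*
 * Returns the number of records visited, or -1 when the completed read
 * reports zero bytes (events were dropped; the caller should resync).
 */
static int walk_records(const char *buffer, unsigned int byte_count)
{
	const char *p = buffer;
	int n = 0;

	assert(buffer);

	if (!byte_count)
		return -1; /* successful call, zero length: "too much happened" */

	for (;;) {
		struct notify_record rec;

		/* Copy out rather than cast, to avoid unaligned access. */
		memcpy(&rec, p, sizeof(rec));
		n++;

		if (!rec.next_entry_offset)
			break;
		p += rec.next_entry_offset;
	}
	return n;
}
```

The "nothing happened" case never reaches this function at all, because
the read simply does not complete and the listener stays asleep.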

> 
>> +	if (!watch->count) {
>> +		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
>> +				   "overflow");
>> +		fsmonitor_force_resync(state);
>> +		return LISTENER_HAVE_DATA_WORKTREE;
>> +	}
>> +
>> +	/*
>> +	 * On Windows, `info` contains an "array" of paths that are
>> +	 * relative to the root of whichever directory handle received
>> +	 * the event.
>> +	 */
>> +	for (;;) {
>> +		FILE_NOTIFY_INFORMATION *info = (void *)p;
>> +		const char *slash;
>> +		enum fsmonitor_path_type t;
>> +
>> +		strbuf_reset(&path);
>> +		if (normalize_path_in_utf8(info, &path) == -1)
>> +			goto skip_this_path;
>> +
>> +		t = fsmonitor_classify_path_workdir_relative(path.buf);
>> +
>> +		switch (t) {
>> +		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
>> +			/* special case cookie files within .git */
>> +
>> +			/* Use just the filename of the cookie file. */
>> +			slash = find_last_dir_sep(path.buf);
>> +			string_list_append(&cookie_list,
>> +					   slash ? slash + 1 : path.buf);
> 
> Ok, I see now how we special-case cookies in the list of events.
> 
>> +			break;
>> +
>> +		case IS_INSIDE_DOT_GIT:
>> +			/* ignore everything inside of "<worktree>/.git/" */
>> +			break;
>> +
>> +		case IS_DOT_GIT:
>> +			/* "<worktree>/.git" was deleted (or renamed away) */
>> +			if ((info->Action == FILE_ACTION_REMOVED) ||
>> +			    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
>> +				trace2_data_string("fsmonitor", NULL,
>> +						   "fsm-listen/dotgit",
>> +						   "removed");
>> +				goto force_shutdown;
>> +			}
>> +			break;
>> +
>> +		case IS_WORKDIR_PATH:
>> +			/* queue normal pathname */
>> +			if (!batch)
>> +				batch = fsmonitor_batch__new();
>> +			fsmonitor_batch__add_path(batch, path.buf);
>> +			break;
>> +
>> +		case IS_GITDIR:
>> +		case IS_INSIDE_GITDIR:
>> +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
>> +		default:
>> +			BUG("unexpected path classification '%d' for '%s'",
>> +			    t, path.buf);
> So these events should be caught by the _other_ watcher. I suppose
> BUG() is somewhat appropriate, but also seems heavy-handed. For
> example, the 'goto' in the next line will never be visited. A die()
> would even be appropriate, but somewhat less harsh than a BUG(),
> especially for a background process.
> 
>> +			goto skip_this_path;

Right, the IS_*GITDIR events should only happen in the other
watcher.  They are bugs in the sense that the path classifier
failed.  And the "default:" is my usual backstop for such an enum
where any other value would be a bug.

I'm not sure whether BUG(), die(), or error() is better than the
others.  The daemon normally runs without a console, so the
messages will not be seen -- unless you're debugging it or running
it in the foreground.

I could get rid of the goto.

>> +		}
>> +
>> +skip_this_path:
>> +		if (!info->NextEntryOffset)
>> +			break;
>> +		p += info->NextEntryOffset;
>> +	}
>> +
>> +	fsmonitor_publish(state, batch, &cookie_list);
>> +	batch = NULL;
>> +	string_list_clear(&cookie_list, 0);
>> +	strbuf_release(&path);
>> +	return LISTENER_HAVE_DATA_WORKTREE;
>> +
>> +force_shutdown:
>> +	fsmonitor_batch__free(batch);
>> +	string_list_clear(&cookie_list, 0);
>> +	strbuf_release(&path);
>> +	return LISTENER_SHUTDOWN;
>> +}
>> +
>> +/*
>> + * Process filesystem events that happend anywhere (recursively) under the
> 
> s/happend/happened
> 
>> + * external <gitdir> (such as non-primary worktrees or submodules).
>> + * We only care about cookie files that our client threads created here.
>> + *
>> + * Note that we DO NOT get filesystem events on the external <gitdir>
>> + * itself (it is not inside something that we are watching).  In particular,
>> + * we do not get an event if the external <gitdir> is deleted.
> 
> This is an interesting change of behavior. I forget if it is listed in
> the documentation file, but definitely could be. I imagine wanting a
> "Troubleshooting" section that describes special cases like this.
> 
> Also, because of this worktree-specific behavior, we might want to
> recommend using 'git config --worktree' when choosing to use FS Monitor,
> so that each worktree is opted-in as requested. Without --worktree, all
> worktrees with a common base would start using FS Monitor simultaneously.

I'll take a look.

> 
>> + */
>> +static int process_gitdir_events(struct fsmonitor_daemon_state *state)
>> +{
>> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
>> +	struct one_watch *watch = data->watch_gitdir;
>> +	struct strbuf path = STRBUF_INIT;
>> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
>> +	const char *p = watch->buffer;
>> +
>> +	if (!watch->count) {
>> +		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
>> +				   "overflow");
>> +		fsmonitor_force_resync(state);
>> +		return LISTENER_HAVE_DATA_GITDIR;
>> +	}
>> +
>> +	for (;;) {
>> +		FILE_NOTIFY_INFORMATION *info = (void *)p;
>> +		const char *slash;
>> +		enum fsmonitor_path_type t;
>> +
>> +		strbuf_reset(&path);
>> +		if (normalize_path_in_utf8(info, &path) == -1)
>> +			goto skip_this_path;
>> +
>> +		t = fsmonitor_classify_path_gitdir_relative(path.buf);
>> +
>> +		trace_printf_key(&trace_fsmonitor, "BBB: %s", path.buf);
>> +
>> +		switch (t) {
>> +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
>> +			/* special case cookie files within gitdir */
>> +
>> +			/* Use just the filename of the cookie file. */
>> +			slash = find_last_dir_sep(path.buf);
>> +			string_list_append(&cookie_list,
>> +					   slash ? slash + 1 : path.buf);
>> +			break;
>> +
>> +		case IS_INSIDE_GITDIR:
>> +			goto skip_this_path;
>> +
>> +		default:
>> +			BUG("unexpected path classification '%d' for '%s'",
>> +			    t, path.buf);
> 
> If we decide against BUG() earlier, then also get this one.
> 
>> +			goto skip_this_path;
>> +		}
>> +
>> +skip_this_path:
>> +		if (!info->NextEntryOffset)
>> +			break;
>> +		p += info->NextEntryOffset;
>> +	}
>> +
>> +	fsmonitor_publish(state, NULL, &cookie_list);
>> +	string_list_clear(&cookie_list, 0);
>> +	strbuf_release(&path);
>> +	return LISTENER_HAVE_DATA_GITDIR;
>>   }
>>   
>>   void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
>>   {
>> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
>> +	DWORD dwWait;
>> +
>> +	state->error_code = 0;
>> +
>> +	if (start_rdcw_watch(data, data->watch_worktree) == -1)
>> +		goto force_error_stop;
>> +
>> +	if (data->watch_gitdir &&
>> +	    start_rdcw_watch(data, data->watch_gitdir) == -1)
>> +		goto force_error_stop;
>> +
>> +	for (;;) {
>> +		dwWait = WaitForMultipleObjects(data->nr_listener_handles,
>> +						data->hListener,
>> +						FALSE, INFINITE);
> 
> Since you use INFINITE here, that says that we will wait for at least one
> signal, solving the confusion about zero results: zero results unambiguously
> indicates a loss of events.

Right.

> 
>> +
>> +		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_WORKTREE) {
>> +			if (recv_rdcw_watch(data->watch_worktree) == -1)
>> +				goto force_error_stop;
>> +			if (process_worktree_events(state) == LISTENER_SHUTDOWN)
>> +				goto force_shutdown;
>> +			if (start_rdcw_watch(data, data->watch_worktree) == -1)
>> +				goto force_error_stop;
>> +			continue;
>> +		}
>> +
>> +		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_GITDIR) {
>> +			if (recv_rdcw_watch(data->watch_gitdir) == -1)
>> +				goto force_error_stop;
>> +			if (process_gitdir_events(state) == LISTENER_SHUTDOWN)
>> +				goto force_shutdown;
>> +			if (start_rdcw_watch(data, data->watch_gitdir) == -1)
>> +				goto force_error_stop;
>> +			continue;
>> +		}
>> +
>> +		if (dwWait == WAIT_OBJECT_0 + LISTENER_SHUTDOWN)
>> +			goto clean_shutdown;
>> +
>> +		error(_("could not read directory changes [GLE %ld]"),
>> +		      GetLastError());
>> +		goto force_error_stop;
>> +	}
>> +
>> +force_error_stop:
>> +	state->error_code = -1;
>> +
>> +force_shutdown:
>> +	/*
>> +	 * Tell the IPC thead pool to stop (which completes the await
>> +	 * in the main thread (which will also signal this thread (if
>> +	 * we are still alive))).
>> +	 */
>> +	ipc_server_stop_async(state->ipc_server_data);
>> +
>> +clean_shutdown:
>> +	cancel_rdcw_watch(data->watch_worktree);
>> +	cancel_rdcw_watch(data->watch_gitdir);
>>   }
>>   
>>   int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
>>   {
>> +	struct fsmonitor_daemon_backend_data *data;
>> +
>> +	data = xcalloc(1, sizeof(*data));
> 
> CALLOC_ARRAY()
> 
>> +
>> +	data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL);
>> +
>> +	data->watch_worktree = create_watch(state,
>> +					    state->path_worktree_watch.buf);
>> +	if (!data->watch_worktree)
>> +		goto failed;
>> +
>> +	if (state->nr_paths_watching > 1) {
>> +		data->watch_gitdir = create_watch(state,
>> +						  state->path_gitdir_watch.buf);
>> +		if (!data->watch_gitdir)
>> +			goto failed;
>> +	}
>> +
>> +	data->hListener[LISTENER_SHUTDOWN] = data->hEventShutdown;
>> +	data->nr_listener_handles++;
>> +
>> +	data->hListener[LISTENER_HAVE_DATA_WORKTREE] =
>> +		data->watch_worktree->hEvent;
>> +	data->nr_listener_handles++;
>> +
>> +	if (data->watch_gitdir) {
>> +		data->hListener[LISTENER_HAVE_DATA_GITDIR] =
>> +			data->watch_gitdir->hEvent;
>> +		data->nr_listener_handles++;
>> +	}
> 
> This is a clever organization of the event handles. I imagine it
> will require some rework if we decide to include another optional
> handle whose inclusion is orthogonal to the gitdir one, but that
> is unlikely enough to keep these well-defined array indices.

I think we're good here.  I think an INVALID_HANDLE value can
be used to supply a hole in the vector of event handles.
But yes, we don't have to worry about that now.

> 
>> +	state->backend_data = data;
>> +	return 0;
>> +
>> +failed:
>> +	CloseHandle(data->hEventShutdown);
>> +	destroy_watch(data->watch_worktree);
>> +	destroy_watch(data->watch_gitdir);
>> +
>>   	return -1;
>>   }
>>   
>>   void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
>>   {
>> +	struct fsmonitor_daemon_backend_data *data;
>> +
>> +	if (!state || !state->backend_data)
>> +		return;
>> +
>> +	data = state->backend_data;
>> +
>> +	CloseHandle(data->hEventShutdown);
>> +	destroy_watch(data->watch_worktree);
>> +	destroy_watch(data->watch_gitdir);
>> +
>> +	FREE_AND_NULL(state->backend_data);
>>   }
> 
> I tried to follow all the API calls and check the documentation for
> any misuse, but did not find any. I can only contribute nitpicks
> here, and rely on the tests to really see that this is working as
> expected.
> 
> I was hoping to find in here why we need to sleep in the test suite,
> but have not pinpointed that issue yet.

The sleep was only needed on severely under-powered CI machines
(think 1 core).  I'll revisit this.  I think I might be able to get
rid of it.

Thanks,
Jeff




* Re: [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  2021-04-27 18:35   ` Derrick Stolee
@ 2021-04-30 20:05     ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-04-30 20:05 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git
  Cc: Jeff Hostetler, Eric Sunshine



On 4/27/21 2:35 PM, Derrick Stolee wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Implement file system event listener on MacOS using FSEvent,
>> CoreFoundation, and CoreServices.
> 
> Again, I'm not sure if we _should_ be including URLs to
> documentation in our messages, but here are some I found helpful:
> 
> [1] https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/UsingtheFSEventsFramework/UsingtheFSEventsFramework.html
> [2] https://developer.apple.com/documentation/corefoundation/1541796-cfrunloopstop
> [3] https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Multithreading/RunLoopManagement/RunLoopManagement.html
> 

Sure.

>> Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
>> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   compat/fsmonitor/fsmonitor-fs-listen-macos.c | 368 +++++++++++++++++++
>>   1 file changed, 368 insertions(+)
>>
>> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
>> index bec5130d9e1d..e055fb579cc4 100644
>> --- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
>> +++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
>> @@ -97,20 +97,388 @@ void FSEventStreamRelease(FSEventStreamRef stream);
>>   #include "cache.h"
>>   #include "fsmonitor.h"
>>   #include "fsmonitor-fs-listen.h"
>> +#include "fsmonitor--daemon.h"
>> +
>> +struct fsmonitor_daemon_backend_data
>> +{
>> +	CFStringRef cfsr_worktree_path;
>> +	CFStringRef cfsr_gitdir_path;
>> +
>> +	CFArrayRef cfar_paths_to_watch;
>> +	int nr_paths_watching;
>> +
>> +	FSEventStreamRef stream;
>> +
>> +	CFRunLoopRef rl;
>> +
>> +	enum shutdown_style {
>> +		SHUTDOWN_EVENT = 0,
>> +		FORCE_SHUTDOWN,
>> +		FORCE_ERROR_STOP,
>> +	} shutdown_style;
>> +
>> +	unsigned int stream_scheduled:1;
>> +	unsigned int stream_started:1;
>> +};
>> +
>> +static void log_flags_set(const char *path, const FSEventStreamEventFlags flag)
>> +{
>> +	struct strbuf msg = STRBUF_INIT;
> 
> Before going through these ifs and constructing a string, it
> might be a good idea to check if the trace event will actually
> be sent somewhere. If the logging method is switched to a
> trace2 method, then up here we can do:
> 
> 	if (!trace2_is_enabled())
> 		return;

No, we're routing these messages (very noisy and low value) to
the old-style GIT_TRACE_FSMONITOR key.  There's a whole set of
existing FSMonitor tracing that Ben and Kevin added that is
useful for interactive debugging.

> 
>> +	if (flag & kFSEventStreamEventFlagMustScanSubDirs)
>> +		strbuf_addstr(&msg, "MustScanSubDirs|");
>> +	if (flag & kFSEventStreamEventFlagUserDropped)
>> +		strbuf_addstr(&msg, "UserDropped|");
>> +	if (flag & kFSEventStreamEventFlagKernelDropped)
>> +		strbuf_addstr(&msg, "KernelDropped|");
>> +	if (flag & kFSEventStreamEventFlagEventIdsWrapped)
>> +		strbuf_addstr(&msg, "EventIdsWrapped|");
>> +	if (flag & kFSEventStreamEventFlagHistoryDone)
>> +		strbuf_addstr(&msg, "HistoryDone|");
>> +	if (flag & kFSEventStreamEventFlagRootChanged)
>> +		strbuf_addstr(&msg, "RootChanged|");
>> +	if (flag & kFSEventStreamEventFlagMount)
>> +		strbuf_addstr(&msg, "Mount|");
>> +	if (flag & kFSEventStreamEventFlagUnmount)
>> +		strbuf_addstr(&msg, "Unmount|");
>> +	if (flag & kFSEventStreamEventFlagItemChangeOwner)
>> +		strbuf_addstr(&msg, "ItemChangeOwner|");
>> +	if (flag & kFSEventStreamEventFlagItemCreated)
>> +		strbuf_addstr(&msg, "ItemCreated|");
>> +	if (flag & kFSEventStreamEventFlagItemFinderInfoMod)
>> +		strbuf_addstr(&msg, "ItemFinderInfoMod|");
>> +	if (flag & kFSEventStreamEventFlagItemInodeMetaMod)
>> +		strbuf_addstr(&msg, "ItemInodeMetaMod|");
>> +	if (flag & kFSEventStreamEventFlagItemIsDir)
>> +		strbuf_addstr(&msg, "ItemIsDir|");
>> +	if (flag & kFSEventStreamEventFlagItemIsFile)
>> +		strbuf_addstr(&msg, "ItemIsFile|");
>> +	if (flag & kFSEventStreamEventFlagItemIsHardlink)
>> +		strbuf_addstr(&msg, "ItemIsHardlink|");
>> +	if (flag & kFSEventStreamEventFlagItemIsLastHardlink)
>> +		strbuf_addstr(&msg, "ItemIsLastHardlink|");
>> +	if (flag & kFSEventStreamEventFlagItemIsSymlink)
>> +		strbuf_addstr(&msg, "ItemIsSymlink|");
>> +	if (flag & kFSEventStreamEventFlagItemModified)
>> +		strbuf_addstr(&msg, "ItemModified|");
>> +	if (flag & kFSEventStreamEventFlagItemRemoved)
>> +		strbuf_addstr(&msg, "ItemRemoved|");
>> +	if (flag & kFSEventStreamEventFlagItemRenamed)
>> +		strbuf_addstr(&msg, "ItemRenamed|");
>> +	if (flag & kFSEventStreamEventFlagItemXattrMod)
>> +		strbuf_addstr(&msg, "ItemXattrMod|");
>> +	if (flag & kFSEventStreamEventFlagOwnEvent)
>> +		strbuf_addstr(&msg, "OwnEvent|");
>> +	if (flag & kFSEventStreamEventFlagItemCloned)
>> +		strbuf_addstr(&msg, "ItemCloned|");
>> +
>> +	trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=%u %s",
>> +			 path, flag, msg.buf);
> 
> Should this be a trace2 call?

No. I wanted to keep these messages on the old-style key.

> 
>> +
>> +	strbuf_release(&msg);
>> +}
>> +
>> +static int ef_is_root_delete(const FSEventStreamEventFlags ef)
>> +{
>> +	return (ef & kFSEventStreamEventFlagItemIsDir &&
>> +		ef & kFSEventStreamEventFlagItemRemoved);
>> +}
>> +
>> +static int ef_is_root_renamed(const FSEventStreamEventFlags ef)
>> +{
>> +	return (ef & kFSEventStreamEventFlagItemIsDir &&
>> +		ef & kFSEventStreamEventFlagItemRenamed);
>> +}
> 
> Will these be handled differently? Or is it enough to detect
> ef_is_root_moved_or_deleted()?

The whole set of Apple kernel APIs is very foreign territory
to me, so I kept it simple and distinct.  Also, it will let us
log different trace events.
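
For comparison, a minimal sketch of what the merged predicate from the
review could look like.  The flag bits below are stand-ins so the
fragment compiles without CoreServices; a real build would use the
kFSEventStreamEventFlag* constants, and the helper name is only the one
floated in the review, not something in the patch.

```c
#include <stdint.h>

/*
 * Stand-in flag bits (assumption, for illustration only).  Real code
 * would use kFSEventStreamEventFlagItemRemoved, ...ItemRenamed and
 * ...ItemIsDir from <CoreServices/CoreServices.h>.
 */
#define EF_ITEM_REMOVED (1u << 0)
#define EF_ITEM_RENAMED (1u << 1)
#define EF_ITEM_IS_DIR  (1u << 2)

/* Hypothetical merged check: a directory that was removed or renamed. */
static int ef_is_root_moved_or_deleted(uint32_t ef)
{
	return (ef & EF_ITEM_IS_DIR) &&
	       (ef & (EF_ITEM_REMOVED | EF_ITEM_RENAMED));
}
```

The cost of merging is that the caller can no longer emit distinct
"removed" and "renamed" trace events without re-testing the bits.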

> 
>> +static void fsevent_callback(ConstFSEventStreamRef streamRef,
>> +			     void *ctx,
>> +			     size_t num_of_events,
>> +			     void *event_paths,
>> +			     const FSEventStreamEventFlags event_flags[],
>> +			     const FSEventStreamEventId event_ids[])
>> +{
>> +	struct fsmonitor_daemon_state *state = ctx;
>> +	struct fsmonitor_daemon_backend_data *data = state->backend_data;
>> +	char **paths = (char **)event_paths;
>> +	struct fsmonitor_batch *batch = NULL;
>> +	struct string_list cookie_list = STRING_LIST_INIT_DUP;
>> +	const char *path_k;
>> +	const char *slash;
>> +	int k;
>> +
>> +	/*
>> +	 * Build a list of all filesystem changes into a private/local
>> +	 * list and without holding any locks.
>> +	 */
>> +	for (k = 0; k < num_of_events; k++) {
>> +		/*
>> +		 * On Mac, we receive an array of absolute paths.
>> +		 */
>> +		path_k = paths[k];
>> +
>> +		/*
>> +		 * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR.
>> +		 * Please don't log them to Trace2.
>> +		 *
>> +		 * trace_printf_key(&trace_fsmonitor, "XXX '%s'", path_k);
>> +		 */
> 
> Oh, I see. _Not_ trace2. What should we do to see if this is enabled
> to avoid over-working in the case we are not using GIT_TRACE_FSMONITOR?

Right. I'll see if there's a nice way to avoid the log_flags_set() call.
And I should rename the "XXX" message.

> 
>> +		/*
>> +		 * If event[k] is marked as dropped, we assume that we have
>> +		 * lost sync with the filesystem and should flush our cached
>> +		 * data.  We need to:
>> +		 *
>> +		 * [1] Abort/wake any client threads waiting for a cookie and
>> +		 *     flush the cached state data (the current token), and
>> +		 *     create a new token.
>> +		 *
>> +		 * [2] Discard the batch that we were locally building (since
>> +		 *     they are conceptually relative to the just flushed
>> +		 *     token).
>> +		 */
>> +		if ((event_flags[k] & kFSEventStreamEventFlagKernelDropped) ||
>> +		    (event_flags[k] & kFSEventStreamEventFlagUserDropped)) {
> 
> Perhaps create a macro EVENT_FLAG_DROPPED that is the union of these two? Then
> a single "event_flags[k] & EVENT_FLAG_DROPPED" would suffice here. Helps cover
> up how complicated the macOS API names are, too.
> 
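
A rough sketch of the suggested mask, assuming stand-in bit values so
that it compiles off macOS.  In the daemon the two operands would be
kFSEventStreamEventFlagUserDropped and
kFSEventStreamEventFlagKernelDropped; EVENT_FLAG_DROPPED is the name
proposed in the review and ef_is_dropped() is hypothetical.

```c
#include <stdint.h>

/* Stand-in bits (assumption); real code would OR the Apple constants. */
#define EF_USER_DROPPED   (1u << 1)
#define EF_KERNEL_DROPPED (1u << 2)

/* Union of both "we lost events" flags. */
#define EVENT_FLAG_DROPPED (EF_USER_DROPPED | EF_KERNEL_DROPPED)

/* Nonzero if either the kernel or the user-space side dropped events. */
static int ef_is_dropped(uint32_t ef)
{
	return (ef & EVENT_FLAG_DROPPED) != 0;
}
```
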
>> +			/*
>> +			 * see also kFSEventStreamEventFlagMustScanSubDirs
>> +			 */
>> +			trace2_data_string("fsmonitor", NULL,
>> +					   "fsm-listen/kernel", "dropped");
>> +
>> +			fsmonitor_force_resync(state);
>> +
>> +			if (fsmonitor_batch__free(batch))
>> +				BUG("batch should not have a next");
> 
> I mentioned before that BUG() seems overkill for these processes, but this
> one fits. If this batch has a next, then we did something wrong, right? Do
> we have an automated test that checks enough events to maybe cause a second
> batch to be created?

This is me being paranoid.  I hate to see a returned pointer go to waste.

The local variable `batch` has a flexarray of interned paths; it should
never have a next pointer until it has been "published".  Publishing the
batch links it into the list of batches (and we lose ownership of it).

The flexarray grows as needed, so we won't have a second batch locally.
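
To make that invariant concrete, here is a simplified, self-contained
sketch.  The struct and function names are hypothetical, and the path
array is a separately allocated, doubling array rather than whatever
layout the daemon actually uses; the point is only that an unpublished
batch grows in place and its free routine returns NULL for `next`.

```c
#include <assert.h>
#include <stdlib.h>

struct batch {
	struct batch *next;  /* stays NULL until the batch is published */
	const char **paths;  /* grows as needed; strings are not owned */
	size_t nr, alloc;
};

static struct batch *batch_new(void)
{
	struct batch *b = calloc(1, sizeof(*b));
	assert(b);
	return b;
}

static void batch_add_path(struct batch *b, const char *path)
{
	if (b->nr == b->alloc) {
		size_t n = b->alloc ? 2 * b->alloc : 16;
		const char **p = realloc(b->paths, n * sizeof(*p));
		assert(p);
		b->paths = p;
		b->alloc = n;
	}
	b->paths[b->nr++] = path;
}

/* Free one batch and hand back its successor (NULL if unpublished). */
static struct batch *batch_free(struct batch *b)
{
	struct batch *next;

	if (!b)
		return NULL;
	next = b->next;
	free(b->paths);
	free(b);
	return next;
}
```

With this shape, a non-NULL return from freeing a local batch can only
mean that something linked it into a list behind our back.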

> 
>> +			string_list_clear(&cookie_list, 0);
>> +
>> +			/*
>> +			 * We assume that any events that we received
>> +			 * in this callback after this dropped event
>> +			 * may still be valid, so we continue rather
>> +			 * than break.  (And just in case there is a
>> +			 * delete of ".git" hiding in there.)
>> +			 */
>> +			continue;
>> +		}
>> +
>> +		switch (fsmonitor_classify_path_absolute(state, path_k)) {
>> +
>> +		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
>> +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
>> +			/* special case cookie files within .git or gitdir */
>> +
>> +			/* Use just the filename of the cookie file. */
>> +			slash = find_last_dir_sep(path_k);
>> +			string_list_append(&cookie_list,
>> +					   slash ? slash + 1 : path_k);
>> +			break;
>> +
>> +		case IS_INSIDE_DOT_GIT:
>> +		case IS_INSIDE_GITDIR:
>> +			/* ignore all other paths inside of .git or gitdir */
>> +			break;
>> +
>> +		case IS_DOT_GIT:
>> +		case IS_GITDIR:
>> +			/*
>> +			 * If .git directory is deleted or renamed away,
>> +			 * we have to quit.
>> +			 */
>> +			if (ef_is_root_delete(event_flags[k])) {
>> +				trace2_data_string("fsmonitor", NULL,
>> +						   "fsm-listen/gitdir",
>> +						   "removed");
>> +				goto force_shutdown;
>> +			}
>> +			if (ef_is_root_renamed(event_flags[k])) {
>> +				trace2_data_string("fsmonitor", NULL,
>> +						   "fsm-listen/gitdir",
>> +						   "renamed");
>> +				goto force_shutdown;
>> +			}
> 
> I see. The only difference is in how we trace the result. I'm not sure
> this tracing message is worth the differentiation.

I could get rid of it.  It was helpful during development to ensure
that I had covered all the bases.

> 
>> +			break;
>> +
>> +		case IS_WORKDIR_PATH:
>> +			/* try to queue normal pathnames */
>> +
>> +			if (trace_pass_fl(&trace_fsmonitor))
>> +				log_flags_set(path_k, event_flags[k]);
>> +
>> +			/* fsevent could be marked as both a file and directory */
> 
> The _same_ event? Interesting. And I see that you need to log the name
> differently in the case of a file or a directory.

Apple will consolidate messages, so if you delete a file and create a
directory with the same name quickly enough, they will send a single event
with both bits set.  We receive a batch of events from the kernel with a
certain frequency, so we might get 2 events in separate bins, or 1 event
with both bits.

Yes, we queue up different paths.  Adding the trailing slash tells the
client to invalidate a range of cache-entries, for example.

> 
>> +			if (event_flags[k] & kFSEventStreamEventFlagItemIsFile) {
>> +				const char *rel = path_k +
>> +					state->path_worktree_watch.len + 1;
>> +
>> +				if (!batch)
>> +					batch = fsmonitor_batch__new();
>> +				fsmonitor_batch__add_path(batch, rel);
>> +			}
>> +
>> +			if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) {
>> +				const char *rel = path_k +
>> +					state->path_worktree_watch.len + 1;
>> +				char *p = xstrfmt("%s/", rel);
> 
> In a critical path, xstrfmt() may be too slow for such a simple case.
> Likely we should instead use a strbuf with:
> 
> 	strbuf_addstr(&p, rel);
> 	strbuf_addch(&p, '/');
> 
> Bonus points if we can use the data to predict the size of the strbuf's
> buffer.

Good point!
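
A minimal sketch of the idea outside of Git's strbuf API: since the
length of `rel` is known, the "<rel>/" string can be built with one
exactly sized allocation and a memcpy() instead of going through a
printf-style formatter.  The helper name is hypothetical; inside the
daemon a strbuf reused across loop iterations would also avoid the
per-event allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd "<rel>/"; the caller frees it. */
static char *dir_path_with_slash(const char *rel)
{
	size_t len = strlen(rel);
	char *p = malloc(len + 2); /* room for '/' and the NUL */

	assert(p);
	memcpy(p, rel, len);
	p[len] = '/';
	p[len + 1] = '\0';
	return p;
}
```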

> 
>> +
>> +				if (!batch)
>> +					batch = fsmonitor_batch__new();
>> +				fsmonitor_batch__add_path(batch, p);
>> +
>> +				free(p);
>> +			}
>> +
>> +			break;
>> +
>> +		case IS_OUTSIDE_CONE:
>> +		default:
>> +			trace_printf_key(&trace_fsmonitor,
>> +					 "ignoring '%s'", path_k);
>> +			break;
>> +		}
>> +	}
>> +
>> +	fsmonitor_publish(state, batch, &cookie_list);
>> +	string_list_clear(&cookie_list, 0);
>> +	return;
>> +
>> +force_shutdown:
>> +	if (fsmonitor_batch__free(batch))
>> +		BUG("batch should not have a next");
>> +	string_list_clear(&cookie_list, 0);
>> +
>> +	data->shutdown_style = FORCE_SHUTDOWN;
>> +	CFRunLoopStop(data->rl);
>> +	return;
>> +}
>> +
>> +/*
>> + * TODO Investigate the proper value for the `latency` argument in the call
>> + * TODO to `FSEventStreamCreate()`.  I'm not sure that this needs to be a
>> + * TODO config setting or just something that we tune after some testing.
>> + * TODO
>> + * TODO With a latency of 0.1, I was seeing lots of dropped events during
>> + * TODO the "touch 100000" files test within t/perf/p7519, but with a
>> + * TODO latency of 0.001 I did not see any dropped events.  So the "correct"
>> + * TODO value may be somewhere in between.
>> + * TODO
>> + * TODO https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
>> + */
> 
> As Eric mentioned in another thread, this should say "NEEDSWORK" at
> the top. This is a good candidate for follow-up after the basics of
> the series is stable.

Right.

> 
>>   int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
>>   {
>> +	FSEventStreamCreateFlags flags = kFSEventStreamCreateFlagNoDefer |
>> +		kFSEventStreamCreateFlagWatchRoot |
>> +		kFSEventStreamCreateFlagFileEvents;
>> +	FSEventStreamContext ctx = {
>> +		0,
>> +		state,
>> +		NULL,
>> +		NULL,
>> +		NULL
>> +	};
>> +	struct fsmonitor_daemon_backend_data *data;
>> +	const void *dir_array[2];
>> +
>> +	data = xcalloc(1, sizeof(*data));
> 
> CALLOC_ARRAY()
> 
>> +	state->backend_data = data;
>> +
>> +	data->cfsr_worktree_path = CFStringCreateWithCString(
>> +		NULL, state->path_worktree_watch.buf, kCFStringEncodingUTF8);
>> +	dir_array[data->nr_paths_watching++] = data->cfsr_worktree_path;
>> +
>> +	if (state->nr_paths_watching > 1) {
>> +		data->cfsr_gitdir_path = CFStringCreateWithCString(
>> +			NULL, state->path_gitdir_watch.buf,
>> +			kCFStringEncodingUTF8);
>> +		dir_array[data->nr_paths_watching++] = data->cfsr_gitdir_path;
>> +	}
>> +
>> +	data->cfar_paths_to_watch = CFArrayCreate(NULL, dir_array,
>> +						  data->nr_paths_watching,
>> +						  NULL);
>> +	data->stream = FSEventStreamCreate(NULL, fsevent_callback, &ctx,
>> +					   data->cfar_paths_to_watch,
>> +					   kFSEventStreamEventIdSinceNow,
>> +					   0.001, flags);
>> +	if (data->stream == NULL)
>> +		goto failed;
>> +
>> +	/*
>> +	 * `data->rl` needs to be set inside the listener thread.
>> +	 */
>> +
>> +	return 0;
>> +
>> +failed:
>> +	error("Unable to create FSEventStream.");
>> +
>> +	FREE_AND_NULL(state->backend_data);
>>   	return -1;
>>   }
>>   
>>   void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
>>   {
>> +	struct fsmonitor_daemon_backend_data *data;
>> +
>> +	if (!state || !state->backend_data)
>> +		return;
>> +
>> +	data = state->backend_data;
>> +
>> +	if (data->stream) {
>> +		if (data->stream_started)
>> +			FSEventStreamStop(data->stream);
>> +		if (data->stream_scheduled)
>> +			FSEventStreamInvalidate(data->stream);
>> +		FSEventStreamRelease(data->stream);
>> +	}
>> +
>> +	FREE_AND_NULL(state->backend_data);
>>   }
>>   
>>   void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
>>   {
>> +	struct fsmonitor_daemon_backend_data *data;
>> +
>> +	data = state->backend_data;
>> +	data->shutdown_style = SHUTDOWN_EVENT;
>> +
>> +	CFRunLoopStop(data->rl);
>>   }
>>   
>>   void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
>>   {
>> +	struct fsmonitor_daemon_backend_data *data;
>> +
>> +	data = state->backend_data;
>> +
>> +	data->rl = CFRunLoopGetCurrent();
>> +
>> +	FSEventStreamScheduleWithRunLoop(data->stream, data->rl, kCFRunLoopDefaultMode);
>> +	data->stream_scheduled = 1;
>> +
>> +	if (!FSEventStreamStart(data->stream)) {
>> +		error("Failed to start the FSEventStream");
>> +		goto force_error_stop_without_loop;
>> +	}
>> +	data->stream_started = 1;
>> +
>> +	CFRunLoopRun();
>> +
>> +	switch (data->shutdown_style) {
>> +	case FORCE_ERROR_STOP:
>> +		state->error_code = -1;
>> +		/* fall thru */
>> +	case FORCE_SHUTDOWN:
>> +		ipc_server_stop_async(state->ipc_server_data);
>> +		/* fall thru */
>> +	case SHUTDOWN_EVENT:
>> +	default:
>> +		break;
>> +	}
>> +	return;
>> +
>> +force_error_stop_without_loop:
>> +	state->error_code = -1;
>> +	ipc_server_stop_async(state->ipc_server_data);
>> +	return;
>>   }
> 
> Thanks,
> -Stolee
> 

Big thanks for looking at all of this!
Jeff


* Re: [PATCH 16/23] fsmonitor--daemon: implement handle_client callback
  2021-04-26 21:01   ` Derrick Stolee
@ 2021-05-03 15:04     ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-05-03 15:04 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/26/21 5:01 PM, Derrick Stolee wrote:
> On 4/1/2021 11:40 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach fsmonitor--daemon to respond to IPC requests from client
>> Git processes and respond with a list of modified pathnames
>> relative to the provided token.
> 
> (I'm skipping ahead to this part. I'll examine the platform
> specific bits after I finish with "the Git bits".)
> 
>> +static void fsmonitor_format_response_token(
>> +	struct strbuf *response_token,
>> +	const struct strbuf *response_token_id,
>> +	const struct fsmonitor_batch *batch)
>> +{
>> +	uint64_t seq_nr = (batch) ? batch->batch_seq_nr + 1 : 0;
>> +
>> +	strbuf_reset(response_token);
>> +	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
>> +		    response_token_id->buf, seq_nr);
> 
> Ah, right. The token string gets _even more specific_ to allow
> for multiple "checkpoints" within a batch.
> 
>> +static int fsmonitor_parse_client_token(const char *buf_token,
>> +					struct strbuf *requested_token_id,
>> +					uint64_t *seq_nr)
>> +{
>> +	const char *p;
>> +	char *p_end;
>> +
>> +	strbuf_reset(requested_token_id);
>> +	*seq_nr = 0;
>> +
>> +	if (!skip_prefix(buf_token, "builtin:", &p))
>> +		return 1;
>> +
>> +	while (*p && *p != ':')
>> +		strbuf_addch(requested_token_id, *p++);
> 
> My mind is going towards microoptimizations, but I wonder if there
> is a difference using
> 
> 	q = strchr(p, ':');
> 	if (!q)
> 		return 1;
> 	strbuf_add(requested_token_id, p, q - p);
> 
> We trade one scan with several method calls for two scans and
> two method calls, but those two scans are very optimized.
> 
> Probably not worth it, as this is something like 20 bytes of data
> per round-trip.
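
A self-contained sketch of the strchr()-based variant, assuming the
caller has already skipped the "builtin:" prefix and passes the rest.
The function name, the fixed-size id buffer, and the stricter checks
(non-empty id, digits only after the colon) are assumptions made for
illustration; the patch itself accumulates into a strbuf.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse "<token_id>:<seq_nr>".  Copies the id into id_buf and stores
 * the sequence number.  Returns 0 on success, 1 on malformed input.
 */
static int parse_token_body(const char *p, char *id_buf, size_t id_size,
			    uint64_t *seq_nr)
{
	const char *q = strchr(p, ':');
	char *end;
	size_t id_len;

	if (!q)
		return 1;
	id_len = (size_t)(q - p);
	if (!id_len || id_len >= id_size)
		return 1;
	memcpy(id_buf, p, id_len);
	id_buf[id_len] = '\0';

	/* strtoumax() would accept whitespace or a sign; reject those. */
	if (!isdigit((unsigned char)q[1]))
		return 1;
	*seq_nr = strtoumax(q + 1, &end, 10);
	return *end ? 1 : 0;
}
```

Taking the already-advanced pointer also sidesteps scanning the
"builtin:" prefix a second time.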

I'll take a look at this and your later comment about double parsing
"builtin:".

> 
>> +	if (!skip_prefix(command, "builtin:", &p)) {
>> +		/* assume V1 timestamp or garbage */
>> +
>> +		char *p_end;
>> +
>> +		strtoumax(command, &p_end, 10);
>> +		trace_printf_key(&trace_fsmonitor,
>> +				 ((*p_end) ?
>> +				  "fsmonitor: invalid command line '%s'" :
>> +				  "fsmonitor: unsupported V1 protocol '%s'"),
>> +				 command);
>> +		result = -1;
>> +		goto send_trivial_response;
>> +	}
> 
> This is an interesting protection for users who currently use FS
> Monitor and then upgrade to the builtin approach.

Yes, the token is stored in the .git/index extension, so we always
have to assume that we'll get an old-school token that they inherited
from the last command executed before they switched FSMonitor providers.

Also, there is a chicken-n-egg problem in the FSMonitor protocol that
we inherited from the V1 effort -- the client has to initiate the
conversation with a token/timestamp, but has yet to talk to the daemon
(or hook) and doesn't know what a V2 token looks like (since they are
opaque to the client).  So I let the client blindly send a V1 timestamp
which the daemon will silently reject and send a trivial response (which
tells the client to do the work itself and gives it a V2 token for the
next call).
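
The fallback can be sketched as a small classifier.  This is only an
illustration: the enum and function names are hypothetical, and unlike
the quoted hunk it treats an empty string as garbage rather than as a
V1 timestamp.  In both non-V2 cases the daemon would answer with a
trivial response carrying a fresh V2 token.

```c
#include <inttypes.h>
#include <string.h>

enum token_kind {
	TOKEN_V2_BUILTIN,   /* "builtin:<id>:<seq>" */
	TOKEN_V1_TIMESTAMP, /* all decimal digits */
	TOKEN_GARBAGE
};

static enum token_kind classify_token(const char *command)
{
	char *end;

	if (!strncmp(command, "builtin:", strlen("builtin:")))
		return TOKEN_V2_BUILTIN;

	/* Only the end pointer matters here; the value is discarded. */
	(void)strtoumax(command, &end, 10);
	if (end != command && !*end)
		return TOKEN_V1_TIMESTAMP;
	return TOKEN_GARBAGE;
}
```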

> 
>> +	if (fsmonitor_parse_client_token(command, &requested_token_id,
>> +					 &requested_oldest_seq_nr)) {
> 
> It appears you will call skip_prefix() twice this way, once to
> determine we are actually the right kind of token, but a second
> time as part of this call. Perhaps the helper method could start
> from 'p' which has already advanced beyond "buildin:"?
> 
>> +		trace_printf_key(&trace_fsmonitor,
>> +				 "fsmonitor: invalid V2 protocol token '%s'",
>> +				 command);
>> +		result = -1;
>> +		goto send_trivial_response;
>> +	}
> 
> This method is getting a bit long. Could the interesting data
> structure code below be extracted as a method?

Let me try refactoring it.

> 
>> +	pthread_mutex_lock(&state->main_lock);
>> +
>> +	if (!state->current_token_data) {
>> +		/*
>> +		 * We don't have a current token.  This may mean that
>> +		 * the listener thread has not yet started.
>> +		 */
>> +		pthread_mutex_unlock(&state->main_lock);
>> +		result = 0;
>> +		goto send_trivial_response;
>> +	}
>> +	if (strcmp(requested_token_id.buf,
>> +		   state->current_token_data->token_id.buf)) {
>> +		/*
>> +		 * The client last spoke to a different daemon
>> +		 * instance -OR- the daemon had to resync with
>> +		 * the filesystem (and lost events), so reject.
>> +		 */
>> +		pthread_mutex_unlock(&state->main_lock);
>> +		result = 0;
>> +		trace2_data_string("fsmonitor", the_repository,
>> +				   "response/token", "different");
>> +		goto send_trivial_response;
>> +	}
>> +	if (!state->current_token_data->batch_tail) {
>> +		/*
>> +		 * The listener has not received any filesystem
>> +		 * events yet since we created the current token.
>> +		 * We can respond with an empty list, since the
>> +		 * client has already seen the current token and
>> +		 * we have nothing new to report.  (This is
>> +		 * instead of sending a trivial response.)
>> +		 */
>> +		pthread_mutex_unlock(&state->main_lock);
>> +		result = 0;
>> +		goto send_empty_response;
>> +	}
>> +	if (requested_oldest_seq_nr <
>> +	    state->current_token_data->batch_tail->batch_seq_nr) {
>> +		/*
>> +		 * The client wants older events than we have for
>> +		 * this token_id.  This means that the end of our
>> +		 * batch list was truncated and we cannot give the
>> +		 * client a complete snapshot relative to their
>> +		 * request.
>> +		 */
>> +		pthread_mutex_unlock(&state->main_lock);
>> +
>> +		trace_printf_key(&trace_fsmonitor,
>> +				 "client requested truncated data");
>> +		result = 0;
>> +		goto send_trivial_response;
>> +	}
> 
> If these are part of a helper method, then they could be reorganized
> to "goto" the end of the method which returns an error code after
> unlocking the mutex. The multiple unlocks are making me nervous.
> 
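
One possible shape for such a helper, as a simplified sketch: every
early exit jumps to a single label that drops the lock, and the helper
reports which kind of response to send.  The struct, its fields, and
the enum are hypothetical stand-ins for the daemon state, not the
layout used in the patch.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

enum response_kind { RESPONSE_TRIVIAL, RESPONSE_EMPTY, RESPONSE_PATHS };

struct token_state {                  /* hypothetical, simplified */
	pthread_mutex_t lock;
	const char *current_token_id; /* NULL until the listener starts */
	int have_batches;             /* any events since the token? */
	uint64_t oldest_seq_nr;       /* oldest batch still retained */
	int client_ref_count;
};

static enum response_kind choose_response(struct token_state *st,
					  const char *requested_id,
					  uint64_t requested_seq_nr)
{
	enum response_kind kind = RESPONSE_TRIVIAL;

	pthread_mutex_lock(&st->lock);

	if (!st->current_token_id)
		goto done; /* no token yet */
	if (strcmp(requested_id, st->current_token_id))
		goto done; /* different daemon instance or resync */
	if (!st->have_batches) {
		kind = RESPONSE_EMPTY; /* nothing new to report */
		goto done;
	}
	if (requested_seq_nr < st->oldest_seq_nr)
		goto done; /* history was truncated */

	st->client_ref_count++; /* pin while still under the lock */
	kind = RESPONSE_PATHS;

done:
	pthread_mutex_unlock(&st->lock);
	return kind;
}
```

The trace calls that the hunk makes after unlocking would move to the
caller, keyed on the returned kind.
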
>> +
>> +	/*
>> +	 * We're going to hold onto a pointer to the current
>> +	 * token-data while we walk the list of batches of files.
>> +	 * During this time, we will NOT be under the lock.
>> +	 * So we ref-count it.
> 
> I was wondering if this would happen. I'm glad it is.
> 
>> +	 * This allows the listener thread to continue prepending
>> +	 * new batches of items to the token-data (which we'll ignore).
>> +	 *
>> +	 * AND it allows the listener thread to do a token-reset
>> +	 * (and install a new `current_token_data`).
>> +	 *
>> +	 * We mark the current head of the batch list as "pinned" so
>> +	 * that the listener thread will treat this item as read-only
>> +	 * (and prevent any more paths from being added to it) from
>> +	 * now on.
>> +	 */
>> +	token_data = state->current_token_data;
>> +	token_data->client_ref_count++;
>> +
>> +	batch_head = token_data->batch_head;
>> +	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
>> +
>> +	pthread_mutex_unlock(&state->main_lock);
> 
> We are now pinned. Makes sense.
> 
>> +	/*
>> +	 * FSMonitor Protocol V2 requires that we send a response header
>> +	 * with a "new current token" and then all of the paths that changed
>> +	 * since the "requested token".
>> +	 */
>> +	fsmonitor_format_response_token(&response_token,
>> +					&token_data->token_id,
>> +					batch_head);
>> +
>> +	reply(reply_data, response_token.buf, response_token.len + 1);
>> +	total_response_len += response_token.len + 1;
> 
> I was going to say we should let "reply" return the number of bytes written,
> but that is already an error code. But then we seem to be ignoring it here.
> Should we at least do something like "err |= reply()" to collect any errors?

Maybe. I'll have to look and see what the daemon thread can do
on such an error.

> 
>> +
>> +	trace2_data_string("fsmonitor", the_repository, "response/token",
>> +			   response_token.buf);
>> +	trace_printf_key(&trace_fsmonitor, "response token: %s", response_token.buf);
>> +
>> +	shown = kh_init_str();
>> +	for (batch = batch_head;
>> +	     batch && batch->batch_seq_nr >= requested_oldest_seq_nr;
>> +	     batch = batch->next) {
>> +		size_t k;
>> +
>> +		for (k = 0; k < batch->nr; k++) {
>> +			const char *s = batch->interned_paths[k];
>> +			size_t s_len;
>> +
>> +			if (kh_get_str(shown, s) != kh_end(shown))
>> +				duplicates++;
>> +			else {
>> +				kh_put_str(shown, s, &hash_ret);
> 
> It appears that you could make use of 'struct strmap' instead of managing your
> own khash structure.

Since all of the strings in the batches are already interned
(and we have constant fixed pointers for each string), I'd
eventually like to have this khash take advantage of that and
make this (essentially) a Set on the pointer values rather than
a Set on the strings.  This version is a step in that direction.
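
A minimal sketch of that idea, not the daemon's code: because every path
is interned, two equal paths share one address, so a set can hash and
compare the pointer value instead of the string bytes. The `ptr_set`
type and its functions are hypothetical names invented for this
illustration; it does not use khash, and allocation failures are not
handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pointer set: open addressing, power-of-two capacity. */
struct ptr_set {
	const void **slots; /* NULL marks an empty slot, so NULL cannot be stored */
	size_t cap;         /* always a power of two */
	size_t nr;          /* number of stored pointers */
};

static size_t ptr_hash(const void *p)
{
	uintptr_t v = (uintptr_t)p;

	/* Mix the bits; the low bits of a pointer are often all zero. */
	v ^= v >> 17;
	v *= (uintptr_t)0x9E3779B1u;
	v ^= v >> 13;
	return (size_t)v;
}

static void ptr_set_init(struct ptr_set *set)
{
	set->cap = 16;
	set->nr = 0;
	set->slots = calloc(set->cap, sizeof(*set->slots));
}

static void ptr_set_release(struct ptr_set *set)
{
	free(set->slots);
	set->slots = NULL;
	set->cap = set->nr = 0;
}

static int ptr_set_add(struct ptr_set *set, const void *p);

/* Double the table and reinsert every stored pointer. */
static void ptr_set_grow(struct ptr_set *set)
{
	struct ptr_set bigger;
	size_t i;

	bigger.cap = set->cap * 2;
	bigger.nr = 0;
	bigger.slots = calloc(bigger.cap, sizeof(*bigger.slots));
	for (i = 0; i < set->cap; i++)
		if (set->slots[i])
			ptr_set_add(&bigger, set->slots[i]);
	free(set->slots);
	*set = bigger;
}

/* Returns 1 if `p` was newly added, 0 if it was already present. */
static int ptr_set_add(struct ptr_set *set, const void *p)
{
	size_t i;

	/* Keep the load factor at or below one half. */
	if ((set->nr + 1) * 2 > set->cap)
		ptr_set_grow(set);

	for (i = ptr_hash(p) & (set->cap - 1); set->slots[i];
	     i = (i + 1) & (set->cap - 1))
		if (set->slots[i] == p)
			return 0; /* same interned string: a duplicate */
	set->slots[i] = p;
	set->nr++;
	return 1;
}
```

In the response loop this would take the place of the
kh_get_str()/kh_put_str() pair: a return value of 0 from ptr_set_add()
counts as a duplicate, and no string is hashed or compared.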

> 
>> +
>> +				trace_printf_key(&trace_fsmonitor,
>> +						 "send[%"PRIuMAX"]: %s",
>> +						 count, s);
>> +
>> +				/* Each path gets written with a trailing NUL */
>> +				s_len = strlen(s) + 1;
>> +
>> +				if (payload.len + s_len >=
>> +				    LARGE_PACKET_DATA_MAX) {
>> +					reply(reply_data, payload.buf,
>> +					      payload.len);
>> +					total_response_len += payload.len;
>> +					strbuf_reset(&payload);
>> +				}
>> +
>> +				strbuf_add(&payload, s, s_len);
>> +				count++;
>> +			}
>> +		}
>> +	}
>> +
>> +	if (payload.len) {
>> +		reply(reply_data, payload.buf, payload.len);
>> +		total_response_len += payload.len;
>> +	}
>> +
>> +	kh_release_str(shown);
>> +
>> +	pthread_mutex_lock(&state->main_lock);
>> +	if (token_data->client_ref_count > 0)
>> +		token_data->client_ref_count--;
>> +
>> +	if (token_data->client_ref_count == 0) {
>> +		if (token_data != state->current_token_data) {
>> +			/*
>> +			 * The listener thread did a token-reset while we were
>> +			 * walking the batch list.  Therefore, this token is
>> +			 * stale and can be discarded completely.  If we are
>> +			 * the last reader thread using this token, we own
>> +			 * that work.
>> +			 */
>> +			fsmonitor_free_token_data(token_data);
>> +		}
>> +	}
> 
> Perhaps this could be extracted to a method, so that any (locked) caller
> could run
> 
> 	free_token_if_unused(state, token_data);
> 
> and the token will either be kept around (because client_ref_count > 0 or
> state->current_token_data still points at token_data) or be freed. Otherwise
> I predict this being implemented in two places, which is too many when
> dealing with memory ownership.

I'll take a look at this.
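
One possible shape for such a helper, as a sketch only. The structs
below are simplified stand-ins for the daemon's real types, plain free()
stands in for fsmonitor_free_token_data(), and release_token_ref() is a
hypothetical companion that was not proposed in the review. Both
functions assume the caller already holds the main lock.

```c
#include <stdlib.h>

/* Simplified stand-ins for the daemon's structures (assumptions). */
struct token_data {
	int client_ref_count;
};

struct daemon_state {
	struct token_data *current_token_data;
};

/*
 * Caller must hold the main lock.  Returns 1 if the token was freed,
 * 0 if it is still needed (a reader holds it or it is the current token).
 */
static int free_token_if_unused(struct daemon_state *state,
				struct token_data *token)
{
	if (!token)
		return 0;
	if (token->client_ref_count > 0)
		return 0; /* some reader is still walking its batches */
	if (token == state->current_token_data)
		return 0; /* still the live token; the listener owns it */
	free(token);      /* stand-in for fsmonitor_free_token_data() */
	return 1;
}

/* Drop one reader reference, then free the token if nobody needs it. */
static int release_token_ref(struct daemon_state *state,
			     struct token_data *token)
{
	if (token->client_ref_count > 0)
		token->client_ref_count--;
	return free_token_if_unused(state, token);
}
```

With this split, the client thread would call release_token_ref() after
walking the batch list, and a token reset could call
free_token_if_unused() on the old token, so the ownership rule lives in
one place.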

> 
>> +
>> +	pthread_mutex_unlock(&state->main_lock);
>> +
>> +	trace2_data_intmax("fsmonitor", the_repository, "response/length", total_response_len);
>> +	trace2_data_intmax("fsmonitor", the_repository, "response/count/files", count);
>> +	trace2_data_intmax("fsmonitor", the_repository, "response/count/duplicates", duplicates);
>> +
>> +	strbuf_release(&response_token);
>> +	strbuf_release(&requested_token_id);
>> +	strbuf_release(&payload);
>> +
>> +	return 0;
>> +
>> +send_trivial_response:
>> +	pthread_mutex_lock(&state->main_lock);
>> +	fsmonitor_format_response_token(&response_token,
>> +					&state->current_token_data->token_id,
>> +					state->current_token_data->batch_head);
>> +	pthread_mutex_unlock(&state->main_lock);
>> +
>> +	reply(reply_data, response_token.buf, response_token.len + 1);
>> +	trace2_data_string("fsmonitor", the_repository, "response/token",
>> +			   response_token.buf);
>> +	reply(reply_data, "/", 2);
>> +	trace2_data_intmax("fsmonitor", the_repository, "response/trivial", 1);
>> +
>> +	strbuf_release(&response_token);
>> +	strbuf_release(&requested_token_id);
>> +
>> +	return result;
>> +
>> +send_empty_response:
>> +	pthread_mutex_lock(&state->main_lock);
>> +	fsmonitor_format_response_token(&response_token,
>> +					&state->current_token_data->token_id,
>> +					NULL);
>> +	pthread_mutex_unlock(&state->main_lock);
>> +
>> +	reply(reply_data, response_token.buf, response_token.len + 1);
>> +	trace2_data_string("fsmonitor", the_repository, "response/token",
>> +			   response_token.buf);
>> +	trace2_data_intmax("fsmonitor", the_repository, "response/empty", 1);
>> +
>> +	strbuf_release(&response_token);
>> +	strbuf_release(&requested_token_id);
>> +
>> +	return 0;
>> +}
>> +
>>   static ipc_server_application_cb handle_client;
>>   
>>   static int handle_client(void *data, const char *command,
>>   			 ipc_server_reply_cb *reply,
>>   			 struct ipc_server_reply_data *reply_data)
>>   {
>> -	/* struct fsmonitor_daemon_state *state = data; */
>> +	struct fsmonitor_daemon_state *state = data;
>>   	int result;
>>   
>> +	trace_printf_key(&trace_fsmonitor, "requested token: %s", command);
>> +
>>   	trace2_region_enter("fsmonitor", "handle_client", the_repository);
>>   	trace2_data_string("fsmonitor", the_repository, "request", command);
>>   
>> -	result = 0; /* TODO Do something here. */
>> +	result = do_handle_client(state, command, reply, reply_data);
>>   
>>   	trace2_region_leave("fsmonitor", "handle_client", the_repository);
>>   
> 
> A simple integration with earlier work. Good.
> 
> Thanks,
> -Stolee
> 

Thanks,
Jeff


* Re: [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system
  2021-04-27 14:23   ` Derrick Stolee
@ 2021-05-03 21:59     ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-05-03 21:59 UTC (permalink / raw)
  To: Derrick Stolee, Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler



On 4/27/21 10:23 AM, Derrick Stolee wrote:
> On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach fsmonitor--daemon client threads to create a cookie file
>> inside the .git directory and then wait until FS events for the
>> cookie are observed by the FS listener thread.
>>
>> This helps address the racy nature of file system events by
>> blocking the client response until the kernel has drained any
>> event backlog.
> 
> This description matches my expectation of the cookie file,
> which furthers my confusion about GIT_TEST_FSMONITOR_CLIENT_DELAY.

I'm going to try to create the cookie earlier in the thread
and see if that lets me eliminate the delay.  I don't remember
if I added the delay first and then the cookie when I was testing
or not.  IIRC I was switching between the 2 techniques at one point.

> 
>> +enum fsmonitor_cookie_item_result {
>> +	FCIR_ERROR = -1, /* could not create cookie file ? */
>> +	FCIR_INIT = 0,
>> +	FCIR_SEEN,
>> +	FCIR_ABORT,
>> +};
>> +
>> +struct fsmonitor_cookie_item {
>> +	struct hashmap_entry entry;
>> +	const char *name;
>> +	enum fsmonitor_cookie_item_result result;
>> +};
>> +
>> +static int cookies_cmp(const void *data, const struct hashmap_entry *he1,
>> +		     const struct hashmap_entry *he2, const void *keydata)
> 
> I'm interested to see why a hashset is necessary.

I suppose I could search a linked list of active cookies, but
this seemed easier.  Basically, we have an active cookie (and a
socket listener thread blocked) for each active client connection.
When the FS event thread receives an FS notification for a cookie
file, it needs to do a quick lookup on the cookie file and release
the associated socket thread.

Given that we're only likely to have a few clients connected at
any given time, a list might be faster.

> 
>> +static enum fsmonitor_cookie_item_result fsmonitor_wait_for_cookie(
>> +	struct fsmonitor_daemon_state *state)
>> +{
>> +	int fd;
>> +	struct fsmonitor_cookie_item cookie;
>> +	struct strbuf cookie_pathname = STRBUF_INIT;
>> +	struct strbuf cookie_filename = STRBUF_INIT;
>> +	const char *slash;
>> +	int my_cookie_seq;
>> +
>> +	pthread_mutex_lock(&state->main_lock);
> 
> Hm. We are entering a locked region. I hope this is only for the
> cookie write and not the entire waiting period.

I'm taking the lock to increment the cookie_seq and to add
the hash-entry to the hashmap mainly.  The cond_wait() after
the open() is an atomic unlock-and-wait-and-relock.  So we
wait there for the FS thread to tell us it has seen our cookie
file.  Then we remove our hash-entry from the hashmap and unlock.

Yes, I am doing several things here, but it didn't seem
worth it to lock-unlock-lock-unlock-lock-cond_wait...

> 
>> +	my_cookie_seq = state->cookie_seq++;
>> +
>> +	strbuf_addbuf(&cookie_pathname, &state->path_cookie_prefix);
>> +	strbuf_addf(&cookie_pathname, "%i-%i", getpid(), my_cookie_seq);
>> +
>> +	slash = find_last_dir_sep(cookie_pathname.buf);
>> +	if (slash)
>> +		strbuf_addstr(&cookie_filename, slash + 1);
>> +	else
>> +		strbuf_addbuf(&cookie_filename, &cookie_pathname);
> 
> This business about the slash-or-not-slash is good defensive
> programming. I imagine the only possible way for there to not
> be a slash is if the Git process is running with the .git
> directory as its working directory?
> 
>> +	cookie.name = strbuf_detach(&cookie_filename, NULL);
>> +	cookie.result = FCIR_INIT;
>> +	// TODO should we have case-insenstive hash (and in cookie_cmp()) ??
> 
> This TODO comment should be cleaned up. Doesn't match C-style, either.
> 
> As for the question, I believe that we can limit ourselves to names that
> don't need case-insensitive hashes and trust that the filesystem will not
> change the case. Using lowercase letters should help with this.
> 

I'm going to redo the pathname construction (to solve a conflict
with VSCode) and will clean up this.

>> +	hashmap_entry_init(&cookie.entry, strhash(cookie.name));
>> +
>> +	/*
>> +	 * Warning: we are putting the address of a stack variable into a
>> +	 * global hashmap.  This feels dodgy.  We must ensure that we remove
>> +	 * it before this thread and stack frame returns.
>> +	 */
>> +	hashmap_add(&state->cookies, &cookie.entry);
> 
> I saw this warning and thought about avoiding it by using the heap, but
> even with a heap pointer we need to be careful to remove the result
> before returning and stopping the thread.
> 
> However, there is likely a higher potential of a bug leading to a
> security issue through an error causing stack corruption and unsafe
> code execution. Perhaps it is worth converting to using heap data here.

I never liked the stack buffer.  I'm going to move it to the heap.

> 
>> +	trace_printf_key(&trace_fsmonitor, "cookie-wait: '%s' '%s'",
>> +			 cookie.name, cookie_pathname.buf);
>> +
>> +	/*
>> +	 * Create the cookie file on disk and then wait for a notification
>> +	 * that the listener thread has seen it.
>> +	 */
>> +	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
>> +	if (fd >= 0) {
>> +		close(fd);
>> +		unlink_or_warn(cookie_pathname.buf);
> 
> Interesting that we are ignoring the warning here. Is it possible that
> these cookie files will continue to grow if this unlink fails?

It is possible that the unlink() could fail, but I'm not
sure what we can do about it.  The FS event from the open()
(and/or the close()) will be sufficient to wake up this thread.

> 
>> +
>> +		while (cookie.result == FCIR_INIT)
>> +			pthread_cond_wait(&state->cookies_cond,
>> +					  &state->main_lock);
> 
> Ok, we are waiting here for another thread to signal that the cookie
> file has been found in the events. What happens if the event gets lost?
> I'll look for a later signal that cookie.result can change based on a
> timeout, too.

I'd like to use `pthread_cond_timedwait()` here, but I'm not
sure it is supported everywhere.

I do have code in the FS layers to dump/alert all cookies
at certain times, such as loss of sync.
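
For reference, a bounded wait on a plain POSIX system could look like
the sketch below. It assumes pthread_cond_timedwait() and
clock_gettime() are available, which is exactly the portability question
raised above, so it says nothing about Git's compat layer. The
`cookie_waiter` struct, the enum values, and wait_for_cookie() are
hypothetical names; COOKIE_TIMEOUT in particular is not part of the
patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical result codes; COOKIE_TIMEOUT does not exist in the patch. */
enum cookie_result {
	COOKIE_INIT = 0,
	COOKIE_SEEN,
	COOKIE_ABORT,
	COOKIE_TIMEOUT
};

/* Hypothetical stand-in for the daemon state used by the cookie wait. */
struct cookie_waiter {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	enum cookie_result result;
};

/*
 * Wait until another thread moves `result` away from COOKIE_INIT or
 * until `timeout_ms` milliseconds have passed.  The caller must hold
 * w->lock; it is released while waiting and held again on return.
 */
static enum cookie_result wait_for_cookie(struct cookie_waiter *w,
					  long timeout_ms)
{
	struct timespec deadline;

	/* A default condition variable measures against CLOCK_REALTIME. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	/* Loop to cope with spurious wakeups. */
	while (w->result == COOKIE_INIT) {
		int err = pthread_cond_timedwait(&w->cond, &w->lock,
						 &deadline);
		if (err == ETIMEDOUT) {
			if (w->result == COOKIE_INIT)
				w->result = COOKIE_TIMEOUT;
		} else if (err && w->result == COOKIE_INIT) {
			w->result = COOKIE_ABORT; /* unexpected error */
		}
	}
	return w->result;
}
```

A caller could treat COOKIE_TIMEOUT like an abort and send the trivial
response, so a lost cookie event would delay a client but not block it
indefinitely.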

> 
>> +
>> +		hashmap_remove(&state->cookies, &cookie.entry, NULL);
>> +	} else {
>> +		error_errno(_("could not create fsmonitor cookie '%s'"),
>> +			    cookie.name);
>> +
>> +		cookie.result = FCIR_ERROR;
>> +		hashmap_remove(&state->cookies, &cookie.entry, NULL);
>> +	}
> 
> Both blocks here remove the cookie entry, so move it to the end of the
> method with the other cleanups.

I can move it outside of the IF, but it has to be before we unlock.

> 
>> +
>> +	pthread_mutex_unlock(&state->main_lock);
> 
> Hm. We are locking the main state throughout this process. I suppose that
> the listener thread could be watching multiple repos and updating them
> while we wait here for one repo to update. This is a larger lock window
> than I was hoping for, but I don't currently see how to reduce it safely.

We only watch a single repo/working directory.  We're locking because
we could have multiple clients all hitting us at the same time.

> 
>> +
>> +	free((char*)cookie.name);
>> +	strbuf_release(&cookie_pathname);
>> +	return cookie.result;
> 
> Remove the cookie from the hashset along with these lines.

No, it has to be within the lock above.

> 
>> +}
>> +
>> +/*
>> + * Mark these cookies as _SEEN and wake up the corresponding client threads.
>> + */
>> +static void fsmonitor_cookie_mark_seen(struct fsmonitor_daemon_state *state,
>> +				       const struct string_list *cookie_names)
>> +{
>> +	/* assert state->main_lock */
> 
> I'm now confused what this is trying to document. The 'state' should be
> locked by another thread while we are waiting for a cookie response, so
> this method is updating the cookie as seen from a different thread that
> doesn't have the lock.

I'm trying to document that this function must be called while the
thread is holding the main lock (without paying for a lock check or
trying to do a recursive lock or whatever).

Since it is a little static function and I control the 2 or 3 callers,
I can just visually check this without fuss.

> 
> ...
>> +/*
>> + * Set _ABORT on all pending cookies and wake up all client threads.
>> + */
>> +static void fsmonitor_cookie_abort_all(struct fsmonitor_daemon_state *state)
> ...
> 
>> + * [2] Some of those lost events may have been for cookie files.  We
>> + *     should assume the worst and abort them rather letting them starve.
>> + *
>>    * If there are no readers of the the current token data series, we
>>    * can free it now.  Otherwise, let the last reader free it.  Either
>>    * way, the old token data series is no longer associated with our
>> @@ -454,6 +600,8 @@ void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
>>   			 state->current_token_data->token_id.buf,
>>   			 new_one->token_id.buf);
>>   
>> +	fsmonitor_cookie_abort_all(state);
>> +
> 
> I see we abort here if we force a resync. I lost the detail of whether
> this is triggered by a timeout, too.

I don't currently have a cookie timeout for each thread.  I'd like
to use pthread_cond_timedwait(), but I didn't see it in the
compat headers, so I'm not sure if it is portable.  I'll make a note
to look into this again.

> 
>> @@ -654,6 +803,39 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>>   		goto send_trivial_response;
>>   	}
>>   
>> +	pthread_mutex_unlock(&state->main_lock);
>> +
>> +	/*
>> +	 * Write a cookie file inside the directory being watched in an
>> +	 * effort to flush out existing filesystem events that we actually
>> +	 * care about.  Suspend this client thread until we see the filesystem
>> +	 * events for this cookie file.
>> +	 */
>> +	cookie_result = fsmonitor_wait_for_cookie(state);
> 
> Odd that we unlock before calling this method, then just take the lock
> again inside of it.

Yeah, I didn't like doing that.  I'll revisit.

> 
>> +	if (cookie_result != FCIR_SEEN) {
>> +		error(_("fsmonitor: cookie_result '%d' != SEEN"),
>> +		      cookie_result);
>> +		result = 0;
>> +		goto send_trivial_response;
>> +	}
>> +
>> +	pthread_mutex_lock(&state->main_lock);
>> +
>> +	if (strcmp(requested_token_id.buf,
>> +		   state->current_token_data->token_id.buf)) {
>> +		/*
>> +		 * Ack! The listener thread lost sync with the filesystem
>> +		 * and created a new token while we were waiting for the
>> +		 * cookie file to be created!  Just give up.
>> +		 */
>> +		pthread_mutex_unlock(&state->main_lock);
>> +
>> +		trace_printf_key(&trace_fsmonitor,
>> +				 "lost filesystem sync");
>> +		result = 0;
>> +		goto send_trivial_response;
>> +	}
>> +
>>   	/*
>>   	 * We're going to hold onto a pointer to the current
>>   	 * token-data while we walk the list of batches of files.
>> @@ -982,6 +1164,9 @@ void fsmonitor_publish(struct fsmonitor_daemon_state *state,
>>   		}
>>   	}
>>   
>> +	if (cookie_names->nr)
>> +		fsmonitor_cookie_mark_seen(state, cookie_names);
>> +
> 
> I was confused as to what updates 'cookie_names', but it appears that
> these are updated in the platform-specific code. That seems to happen
> in later patches.

Yes, this is a list of the cookies that the platform layer saw events
for.  It was passed in along with the set of batched paths.  So the
platform code can "publish/prepend" a new set of changed paths and
wake any threads whose cookie was seen.

> 
> Thanks,
> -Stolee
> 

Thanks,
Jeff


* Re: Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup)
  2021-04-28 23:01               ` Ævar Arnfjörð Bjarmason
@ 2021-05-05 16:12                 ` Johannes Schindelin
  0 siblings, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-05-05 16:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, git, Junio C Hamano, Taylor Blau, Patrick Steinhardt


Hi Ævar,

On Thu, 29 Apr 2021, Ævar Arnfjörð Bjarmason wrote:

> On Wed, Apr 28 2021, Derrick Stolee wrote:
>
> > On 4/28/2021 12:26 PM, Ævar Arnfjörð Bjarmason wrote:
> >> Simplify the setup code in repo-settings.c in various ways, making the
> >> code shorter, easier to read, and requiring fewer hacks to do the same
> >> thing as it did before:
> >
> > [...]
> > Since I've committed to reviewing the FS Monitor code, I'd prefer if
> > this patch (or maybe its v2, since this is here already) be sent as
> > a top-level message so it can be discussed independently.
>
> As a practical matter I think any effort I make to accommodate your
> request will be dwarfed by your own starting of a sub-thread on
> E-Mail/MUA nuances :)
>
> When [1] was brought up the other day (showing that I'm probably not the
> best person to ask about on-list In-Reply-To semantics) I was surprised
> to find that we don't have much (if any) explicit documentation about
> In-Reply-To best practices. [...]
>
> 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2103191540330.57@tvgsbejvaqbjf.bet/

I find it a bit disingenuous to compare my complaint about your
disconnected cover letter (which _definitely_ belongs with the patches
that it covers) with the practice of hiding patches or patch
series deep in a thread discussing a (lengthy!) patch series,
_especially_ if it threatens to totally conflict with that patch series
and thereby disrupt the flow.

Couldn't you hold off with your patch for a while, instead help FSMonitor
get over the finish line, and _then_ submit that simplification of
repo-settings? That would be constructive, from my perspective.

Ciao,
Dscho


* Re: [PATCH 16/23] fsmonitor--daemon: implement handle_client callback
  2021-04-01 15:40 ` [PATCH 16/23] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
  2021-04-26 21:01   ` Derrick Stolee
@ 2021-05-13 18:52   ` Derrick Stolee
  1 sibling, 0 replies; 237+ messages in thread
From: Derrick Stolee @ 2021-05-13 18:52 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git; +Cc: Jeff Hostetler

On 4/1/21 11:40 AM, Jeff Hostetler via GitGitGadget wrote:

Here is a rather important bug that I found on a whim while working
with sparse-index integrations. The sparse-index isn't important
except that it caused a different pattern of batch creation and
responses from the daemon.

> +/*
> + * Format an opaque token string to send to the client.
> + */
> +static void fsmonitor_format_response_token(
> +	struct strbuf *response_token,
> +	const struct strbuf *response_token_id,
> +	const struct fsmonitor_batch *batch)
> +{
> +	uint64_t seq_nr = (batch) ? batch->batch_seq_nr + 1 : 0;

Here, you add one to the batch value to indicate a difference
between "zero" and "positive" values.

> +
> +	strbuf_reset(response_token);
> +	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
> +		    response_token_id->buf, seq_nr);
> +}
> +
> +/*
> + * Parse an opaque token from the client.
> + */
> +static int fsmonitor_parse_client_token(const char *buf_token,
> +					struct strbuf *requested_token_id,
> +					uint64_t *seq_nr)
> +{
> +	const char *p;
> +	char *p_end;
> +
> +	strbuf_reset(requested_token_id);
> +	*seq_nr = 0;
> +
> +	if (!skip_prefix(buf_token, "builtin:", &p))
> +		return 1;
> +
> +	while (*p && *p != ':')
> +		strbuf_addch(requested_token_id, *p++);
> +	if (!*p++)
> +		return 1;
> +
> +	*seq_nr = (uint64_t)strtoumax(p, &p_end, 10);

Which means here you should decrement one from the value, possibly,
(except if it is zero).

> +	if (*p_end)
> +		return 1;
> +
> +	return 0;
> +}

...

> +	shown = kh_init_str();
> +	for (batch = batch_head;
> +	     batch && batch->batch_seq_nr >= requested_oldest_seq_nr;
> +	     batch = batch->next) {

And without either decrementing one from requested_oldest_seq_nr or
adding one to the batch_seq_nr here, this loop could terminate
immediately.

In my testing, I added one to the left-hand side of the inequality.
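
The off-by-one can be reproduced in isolation. The sketch below is a
simplified stand-in for the two token functions quoted above; the names
format_token(), parse_token_seq_nr(), and batch_is_wanted() are invented
here. batch_is_wanted() encodes the adjustment described in this message
(adding one to the left-hand side), which is not necessarily the fix
that was eventually applied.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "builtin:<token_id>:<seq_nr>".  As in the quoted patch, the
 * number sent is the batch's sequence number plus one, and 0 means
 * "no batch".
 */
static void format_token(char *out, size_t out_len, const char *token_id,
			 int have_batch, uint64_t batch_seq_nr)
{
	uint64_t seq_nr = have_batch ? batch_seq_nr + 1 : 0;

	snprintf(out, out_len, "builtin:%s:%" PRIu64, token_id, seq_nr);
}

/* Parse the sequence number back out.  Returns 0 on success. */
static int parse_token_seq_nr(const char *token, uint64_t *seq_nr)
{
	const char *p;
	char *end;

	*seq_nr = 0;
	if (strncmp(token, "builtin:", 8))
		return 1;
	p = strchr(token + 8, ':');
	if (!p || !p[1])
		return 1;
	*seq_nr = (uint64_t)strtoumax(p + 1, &end, 10);
	return *end ? 1 : 0;
}

/*
 * Loop condition with one added to the left-hand side, so that the
 * "+ 1" applied by format_token() is accounted for.
 */
static int batch_is_wanted(uint64_t batch_seq_nr, uint64_t requested_seq_nr)
{
	return batch_seq_nr + 1 >= requested_seq_nr;
}
```

For a head batch numbered 5 the token carries 6. The unadjusted test
`5 >= 6` is false, so the loop would stop before visiting that batch,
while batch_is_wanted(5, 6) is true.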

Thanks,
-Stolee


* [PATCH v2 00/28] Builtin FSMonitor Feature
  2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                   ` (25 preceding siblings ...)
  2021-04-27 19:31 ` FS Monitor macOS " Derrick Stolee
@ 2021-05-22 13:56 ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 01/28] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
                     ` (29 more replies)
  26 siblings, 30 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler

Here is V2 of my patch series to add a builtin FSMonitor daemon to Git.

V2 addresses most of the review comments from the RFC and many of
the issues seen during our beta-testing with selected GVFS users. However,
there are still a few items that I need to address:

[ ] Revisit how the client handles the IPC_STATE__NOT_LISTENING state
    (where a daemon appears to be running, but is non-responsive).
[ ] Revisit use of global core_fsmonitor as both a pathname and a boolean.
    The existing fsmonitor code uses it as the pathname to the fsmonitor
    hook and as a flag to indicate that a hook is configured.
[ ] Consider having the daemon chdir() out of the working directory to
    avoid directory handle issues on Windows.
[ ] Some documentation recommendations.
[ ] Split up the commit containing the tests and move some earlier in
    the patch series.
[ ] Move my FSMonitor PREREQ to test-lib.sh instead of having it in my
    test scripts.
[ ] Document performance gains.
[ ] On Windows, if the daemon is started as an elevated process, then
    client commands might not have access to communicate with it.
[ ] Review if/how we decide to shut down the FSMonitor daemon after a
    significant idle period.
[ ] Investigate ways to temporarily shut down FSMonitor daemon processes
    so that the Git for Windows installer can install an upgrade.

In this version, the first commit updates the Simple IPC API to make it
easier to pass binary data using {char *, size_t} rather than assuming that
the message is a null-terminated string. FSMonitor does not use binary
messages and doesn't really need this API change, but I thought it best to
fix the API now before we have other callers of IPC.
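
To illustrate why an explicit length matters for a message with embedded
NUL bytes, here is a small self-contained sketch. It is not the Simple
IPC API; for_each_nul_delimited() and the `path_cb` type are
hypothetical names used only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: called once per string in the buffer. */
typedef void (*path_cb)(const char *path, void *data);

/*
 * Walk a {buf, len} message that holds NUL-terminated strings back to
 * back.  Returns the number of complete strings visited; a trailing
 * fragment without a terminating NUL is ignored.  `cb` may be NULL.
 */
static size_t for_each_nul_delimited(const char *buf, size_t len,
				     path_cb cb, void *data)
{
	size_t count = 0;
	const char *p = buf;
	const char *end = buf + len;

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			break;
		if (cb)
			cb(p, data);
		count++;
		p = nul + 1;
	}
	return count;
}
```

Given a buffer holding three such strings, strlen() on the buffer stops
at the first NUL and sees only the first string, while the {buf, len}
walk above sees all three.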

This V2 version will hopefully be previewed as an experimental feature in
Git for Windows v2.32.0.windows.*.

Jeff Hostetler (26):
  simple-ipc: preparations for supporting binary messages.
  fsmonitor--daemon: man page
  fsmonitor--daemon: update fsmonitor documentation
  fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  help: include fsmonitor--daemon feature flag in version info
  fsmonitor--daemon: add a built-in fsmonitor daemon
  fsmonitor--daemon: implement client command options
  t/helper/fsmonitor-client: create IPC client to talk to FSMonitor
    Daemon
  fsmonitor-fs-listen-win32: stub in backend for Windows
  fsmonitor-fs-listen-macos: stub in backend for MacOS
  fsmonitor--daemon: implement daemon command options
  fsmonitor--daemon: add pathname classification
  fsmonitor--daemon: define token-ids
  fsmonitor--daemon: create token-based changed path cache
  fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  fsmonitor-fs-listen-macos: add macos header files for FSEvent
  fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  fsmonitor--daemon: implement handle_client callback
  fsmonitor--daemon: periodically truncate list of modified files
  fsmonitor--daemon: use a cookie file to sync with file system
  fsmonitor: enhance existing comments
  fsmonitor: force update index after large responses
  t7527: create test for fsmonitor--daemon
  p7519: add fsmonitor--daemon
  t7527: test status with untracked-cache and fsmonitor--daemon
  t/perf: avoid copying builtin fsmonitor files into test repo

Johannes Schindelin (2):
  config: FSMonitor is repository-specific
  fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via
    IPC

 .gitignore                                   |    1 +
 Documentation/config/core.txt                |   56 +-
 Documentation/git-fsmonitor--daemon.txt      |   75 +
 Documentation/git-update-index.txt           |   27 +-
 Documentation/githooks.txt                   |    3 +-
 Makefile                                     |   16 +
 builtin.h                                    |    1 +
 builtin/fsmonitor--daemon.c                  | 1511 ++++++++++++++++++
 builtin/update-index.c                       |    4 +-
 compat/fsmonitor/fsmonitor-fs-listen-macos.c |  497 ++++++
 compat/fsmonitor/fsmonitor-fs-listen-win32.c |  553 +++++++
 compat/fsmonitor/fsmonitor-fs-listen.h       |   49 +
 compat/simple-ipc/ipc-unix-socket.c          |   14 +-
 compat/simple-ipc/ipc-win32.c                |   14 +-
 config.c                                     |    9 +-
 config.h                                     |    2 +-
 config.mak.uname                             |    4 +
 contrib/buildsystems/CMakeLists.txt          |    8 +
 fsmonitor--daemon.h                          |  140 ++
 fsmonitor-ipc.c                              |  179 +++
 fsmonitor-ipc.h                              |   48 +
 fsmonitor.c                                  |  132 +-
 git.c                                        |    1 +
 help.c                                       |    4 +
 repo-settings.c                              |    3 +
 repository.h                                 |    2 +
 simple-ipc.h                                 |    7 +-
 t/helper/test-fsmonitor-client.c             |  125 ++
 t/helper/test-simple-ipc.c                   |   34 +-
 t/helper/test-tool.c                         |    1 +
 t/helper/test-tool.h                         |    1 +
 t/perf/p7519-fsmonitor.sh                    |   42 +-
 t/perf/perf-lib.sh                           |    2 +-
 t/t7527-builtin-fsmonitor.sh                 |  572 +++++++
 34 files changed, 4069 insertions(+), 68 deletions(-)
 create mode 100644 Documentation/git-fsmonitor--daemon.txt
 create mode 100644 builtin/fsmonitor--daemon.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
 create mode 100644 fsmonitor--daemon.h
 create mode 100644 fsmonitor-ipc.c
 create mode 100644 fsmonitor-ipc.h
 create mode 100644 t/helper/test-fsmonitor-client.c
 create mode 100755 t/t7527-builtin-fsmonitor.sh


base-commit: b0c09ab8796fb736efa432b8e817334f3e5ee75a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-923%2Fjeffhostetler%2Fbuiltin-fsmonitor-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-923/jeffhostetler/builtin-fsmonitor-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/923

Range-diff vs v1:

  -:  ------------ >  1:  763fa1ee7bb6 simple-ipc: preparations for supporting binary messages.
  -:  ------------ >  2:  fc180e8591bf fsmonitor--daemon: man page
  1:  074273330f8d !  3:  d56f3e91db9f fsmonitor--daemon: man page and documentation
     @@ Metadata
      Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    fsmonitor--daemon: man page and documentation
     +    fsmonitor--daemon: update fsmonitor documentation
      
     -    Create a manual page describing the `git fsmonitor--daemon` feature.
     -
     -    Update references to `core.fsmonitor`, `core.fsmonitorHookVersion` and
     -    pointers to `watchman` to mention the built-in FSMonitor.
     +    Update references to `core.fsmonitor` and `core.fsmonitorHookVersion` and
     +    pointers to `Watchman` to mention the new built-in `fsmonitor--daemon`.
      
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Documentation/config/core.txt ##
     -@@ Documentation/config/core.txt: core.fsmonitor::
     - 	will identify all files that may have changed since the
     - 	requested date/time. This information is used to speed up git by
     - 	avoiding unnecessary processing of files that have not changed.
     +@@ Documentation/config/core.txt: core.protectNTFS::
     + 	Defaults to `true` on Windows, and `false` elsewhere.
     + 
     + core.fsmonitor::
     +-	If set, the value of this variable is used as a command which
     +-	will identify all files that may have changed since the
     +-	requested date/time. This information is used to speed up git by
     +-	avoiding unnecessary processing of files that have not changed.
      -	See the "fsmonitor-watchman" section of linkgit:githooks[5].
     ++	If set, this variable contains the pathname of the "fsmonitor"
     ++	hook command.
     +++
     ++This hook command is used to identify all files that may have changed
     ++since the requested date/time. This information is used to speed up
     ++git by avoiding unnecessary scanning of files that have not changed.
      ++
      +See the "fsmonitor-watchman" section of linkgit:githooks[5].
      ++
     -+Note: FSMonitor hooks (and this config setting) are ignored if the
     -+built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).
     ++Note: The value of this config setting is ignored if the
     ++built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
       
       core.fsmonitorHookVersion::
      -	Sets the version of hook that is to be used when calling fsmonitor.
     @@ Documentation/config/core.txt: core.fsmonitor::
      -	Version 2 uses an opaque string so that the monitor can return
      -	something that can be used to determine what files have changed
      -	without race conditions.
     -+	Sets the version of hook that is to be used when calling the
     -+	FSMonitor hook (as configured via `core.fsmonitor`).
     ++	Sets the protocol version to be used when invoking the
     ++	"fsmonitor" hook.
      ++
      +There are currently versions 1 and 2. When this is not set,
      +version 2 will be tried first and if it fails then version 1
      +will be tried. Version 1 uses a timestamp as input to determine
      +which files have changes since that time but some monitors
     -+like watchman have race conditions when used with a timestamp.
     ++like Watchman have race conditions when used with a timestamp.
      +Version 2 uses an opaque string so that the monitor can return
      +something that can be used to determine what files have changed
      +without race conditions.
      ++
     -+Note: FSMonitor hooks (and this config setting) are ignored if the
     -+built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).
     ++Note: The value of this config setting is ignored if the
     ++built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
      +
      +core.useBuiltinFSMonitor::
     -+	If set to true, enable the built-in filesystem event watcher (for
     -+	technical details, see linkgit:git-fsmonitor--daemon[1]).
     ++	If set to true, enable the built-in file system monitor
     ++	daemon for this working directory (linkgit:git-fsmonitor--daemon[1]).
      ++
     -+Like external (hook-based) FSMonitors, the built-in FSMonitor can speed up
     -+Git commands that need to refresh the Git index (e.g. `git status`) in a
     -+worktree with many files. The built-in FSMonitor facility eliminates the
     -+need to install and maintain an external third-party monitoring tool.
     ++Like hook-based file system monitors, the built-in file system monitor
     ++can speed up Git commands that need to refresh the Git index
     ++(e.g. `git status`) in a working directory with many files.  The
     ++built-in monitor eliminates the need to install and maintain an
     ++external third-party tool.
      ++
     -+The built-in FSMonitor is currently available only on a limited set of
     -+supported platforms.
     ++The built-in file system monitor is currently available only on a
     ++limited set of supported platforms.  Currently, this includes Windows
     ++and MacOS.
      ++
     -+Note: if this config setting is set to `true`, any FSMonitor hook
     -+configured via `core.fsmonitor` (and possibly `core.fsmonitorHookVersion`)
     -+is ignored.
     ++Note: if this config setting is set to `true`, the values of
     ++`core.fsmonitor` and `core.fsmonitorHookVersion` are ignored.
       
       core.trustctime::
       	If false, the ctime differences between the index and the
      
     - ## Documentation/git-fsmonitor--daemon.txt (new) ##
     -@@
     -+git-fsmonitor--daemon(1)
     -+========================
     -+
     -+NAME
     -+----
     -+git-fsmonitor--daemon - Builtin file system monitor daemon
     -+
     -+SYNOPSIS
     -+--------
     -+[verse]
     -+'git fsmonitor--daemon' --start
     -+'git fsmonitor--daemon' --run
     -+'git fsmonitor--daemon' --stop
     -+'git fsmonitor--daemon' --is-running
     -+'git fsmonitor--daemon' --is-supported
     -+'git fsmonitor--daemon' --query <token>
     -+'git fsmonitor--daemon' --query-index
     -+'git fsmonitor--daemon' --flush
     -+
     -+DESCRIPTION
     -+-----------
     -+
     -+Monitors files and directories in the working directory for changes using
     -+platform-specific file system notification facilities.
     -+
     -+It communicates directly with commands like `git status` using the
     -+link:technical/api-simple-ipc.html[simple IPC] interface instead of
     -+the slower linkgit:githooks[5] interface.
     -+
     -+OPTIONS
     -+-------
     -+
     -+--start::
     -+	Starts the fsmonitor daemon in the background.
     -+
     -+--run::
     -+	Runs the fsmonitor daemon in the foreground.
     -+
     -+--stop::
     -+	Stops the fsmonitor daemon running for the current working
     -+	directory, if present.
     -+
     -+--is-running::
     -+	Exits with zero status if the fsmonitor daemon is watching the
     -+	current working directory.
     -+
     -+--is-supported::
     -+	Exits with zero status if the fsmonitor daemon feature is supported
     -+	on this platform.
     -+
     -+--query <token>::
     -+	Connects to the fsmonitor daemon (starting it if necessary) and
     -+	requests the list of changed files and directories since the
     -+	given token.
     -+	This is intended for testing purposes.
     -+
     -+--query-index::
     -+	Read the current `<token>` from the File System Monitor index
     -+	extension (if present) and use it to query the fsmonitor daemon.
     -+	This is intended for testing purposes.
     -+
     -+--flush::
     -+	Force the fsmonitor daemon to flush its in-memory cache and
     -+	re-sync with the file system.
     -+	This is intended for testing purposes.
     -+
     -+REMARKS
     -+-------
     -+The fsmonitor daemon is a long running process that will watch a single
     -+working directory.  Commands, such as `git status`, should automatically
     -+start it (if necessary) when `core.useBuiltinFSMonitor` is set to `true`
     -+(see linkgit:git-config[1]).
     -+
     -+Configure the built-in FSMonitor via `core.useBuiltinFSMonitor` in each
     -+working directory separately, or globally via `git config --global
     -+core.useBuiltinFSMonitor true`.
     -+
     -+Tokens are opaque strings.  They are used by the fsmonitor daemon to
     -+mark a point in time and the associated internal state.  Callers should
     -+make no assumptions about the content of the token.  In particular,
     -+the should not assume that it is a timestamp.
     -+
     -+Query commands send a request-token to the daemon and it responds with
     -+a summary of the changes that have occurred since that token was
     -+created.  The daemon also returns a response-token that the client can
     -+use in a future query.
     -+
     -+For more information see the "File System Monitor" section in
     -+linkgit:git-update-index[1].
     -+
     -+CAVEATS
     -+-------
     -+
     -+The fsmonitor daemon does not currently know about submodules and does
     -+not know to filter out file system events that happen within a
     -+submodule.  If fsmonitor daemon is watching a super repo and a file is
     -+modified within the working directory of a submodule, it will report
     -+the change (as happening against the super repo).  However, the client
     -+should properly ignore these extra events, so performance may be affected
     -+but it should not cause an incorrect result.
     -+
     -+GIT
     -+---
     -+Part of the linkgit:git[1] suite
     -
       ## Documentation/git-update-index.txt ##
      @@ Documentation/git-update-index.txt: FILE SYSTEM MONITOR
       This feature is intended to speed up git operations for repos that have
     @@ Documentation/git-update-index.txt: FILE SYSTEM MONITOR
       "fsmonitor-watchman" section of linkgit:githooks[5]) that can
       inform it as to what files have been modified. This enables git to avoid
       having to lstat() every file to find modified files.
     +@@ Documentation/git-update-index.txt: performance by avoiding the cost of scanning the entire working directory
     + looking for new files.
     + 
     + If you want to enable (or disable) this feature, it is easier to use
     +-the `core.fsmonitor` configuration variable (see
     +-linkgit:git-config[1]) than using the `--fsmonitor` option to
     +-`git update-index` in each repository, especially if you want to do so
     +-across all repositories you use, because you can set the configuration
     +-variable in your `$HOME/.gitconfig` just once and have it affect all
     +-repositories you touch.
     +-
     +-When the `core.fsmonitor` configuration variable is changed, the
     +-file system monitor is added to or removed from the index the next time
     +-a command reads the index. When `--[no-]fsmonitor` are used, the file
     +-system monitor is immediately added to or removed from the index.
     ++the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
     ++variable (see linkgit:git-config[1]) than using the `--fsmonitor`
     ++option to `git update-index` in each repository, especially if you
     ++want to do so across all repositories you use, because you can set the
     ++configuration variable in your `$HOME/.gitconfig` just once and have
     ++it affect all repositories you touch.
     ++
     ++When the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
     ++variable is changed, the file system monitor is added to or removed
     ++from the index the next time a command reads the index. When
     ++`--[no-]fsmonitor` are used, the file system monitor is immediately
     ++added to or removed from the index.
     + 
     + CONFIGURATION
     + -------------
      
       ## Documentation/githooks.txt ##
      @@ Documentation/githooks.txt: fsmonitor-watchman
  2:  3dac63eae201 !  4:  e4a263728773 fsmonitor-ipc: create client routines for git-fsmonitor--daemon
     @@ Metadata
       ## Commit message ##
          fsmonitor-ipc: create client routines for git-fsmonitor--daemon
      
     -    Create client routines to spawn a fsmonitor daemon and send it an IPC
     -    request using `simple-ipc`.
     +    Create fsmonitor_ipc__*() client routines to spawn the built-in file
     +    system monitor daemon and send it an IPC request using the `Simple
     +    IPC` API.
     +
     +    Stub in empty fsmonitor_ipc__*() functions for unsupported platforms.
      
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
     @@ fsmonitor-ipc.c (new)
      @@
      +#include "cache.h"
      +#include "fsmonitor.h"
     ++#include "simple-ipc.h"
      +#include "fsmonitor-ipc.h"
      +#include "run-command.h"
      +#include "strbuf.h"
      +#include "trace2.h"
      +
      +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
     -+#define FSMONITOR_DAEMON_IS_SUPPORTED 1
     -+#else
     -+#define FSMONITOR_DAEMON_IS_SUPPORTED 0
     -+#endif
      +
     -+/*
     -+ * A trivial function so that this source file always defines at least
     -+ * one symbol even when the feature is not supported.  This quiets an
     -+ * annoying compiler error.
     -+ */
      +int fsmonitor_ipc__is_supported(void)
      +{
     -+	return FSMONITOR_DAEMON_IS_SUPPORTED;
     ++	return 1;
      +}
      +
     -+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
     -+
     -+GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor")
     ++GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor--daemon.ipc")
      +
      +enum ipc_active_state fsmonitor_ipc__get_state(void)
      +{
     @@ fsmonitor-ipc.c (new)
      +
      +static int spawn_daemon(void)
      +{
     -+	const char *args[] = { "fsmonitor--daemon", "--start", NULL };
     ++	const char *args[] = { "fsmonitor--daemon", "start", NULL };
      +
      +	return run_command_v_opt_tr2(args, RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
      +				    "fsmonitor");
     @@ fsmonitor-ipc.c (new)
      +	switch (state) {
      +	case IPC_STATE__LISTENING:
      +		ret = ipc_client_send_command_to_connection(
     -+			connection, since_token, answer);
     ++			connection, since_token, strlen(since_token), answer);
      +		ipc_client_close_connection(connection);
      +
      +		trace2_data_intmax("fsm_client", NULL,
     @@ fsmonitor-ipc.c (new)
      +		return -1;
      +	}
      +
     -+	ret = ipc_client_send_command_to_connection(connection, command, answer);
     ++	ret = ipc_client_send_command_to_connection(connection,
     ++						    command, strlen(command),
     ++						    answer);
      +	ipc_client_close_connection(connection);
      +
      +	if (ret == -1) {
     @@ fsmonitor-ipc.c (new)
      +	return 0;
      +}
      +
     ++#else
     ++
     ++/*
     ++ * A trivial implementation of the fsmonitor_ipc__ API for unsupported
     ++ * platforms.
     ++ */
     ++
     ++int fsmonitor_ipc__is_supported(void)
     ++{
     ++	return 0;
     ++}
     ++
     ++const char *fsmonitor_ipc__get_path(void)
     ++{
     ++	return NULL;
     ++}
     ++
     ++enum ipc_active_state fsmonitor_ipc__get_state(void)
     ++{
     ++	return IPC_STATE__OTHER_ERROR;
     ++}
     ++
     ++int fsmonitor_ipc__send_query(const char *since_token,
     ++			      struct strbuf *answer)
     ++{
     ++	return -1;
     ++}
     ++
     ++int fsmonitor_ipc__send_command(const char *command,
     ++				struct strbuf *answer)
     ++{
     ++	return -1;
     ++}
     ++
      +#endif
      
       ## fsmonitor-ipc.h (new) ##
     @@ fsmonitor-ipc.h (new)
      +#define FSMONITOR_IPC_H
      +
      +/*
     -+ * Returns true if a filesystem notification backend is defined
     -+ * for this platform.  This symbol must always be visible and
     -+ * outside of the HAVE_ ifdef.
     ++ * Returns true if built-in file system monitor daemon is defined
     ++ * for this platform.
      + */
      +int fsmonitor_ipc__is_supported(void);
      +
     -+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
     -+#include "run-command.h"
     -+#include "simple-ipc.h"
     -+
      +/*
      + * Returns the pathname to the IPC named pipe or Unix domain socket
      + * where a `git-fsmonitor--daemon` process will listen.  This is a
      + * per-worktree value.
     ++ *
     ++ * Returns NULL if the daemon is not supported on this platform.
      + */
      +const char *fsmonitor_ipc__get_path(void);
      +
     @@ fsmonitor-ipc.h (new)
      + * This DOES NOT use the hook interface.
      + *
      + * Spawn a daemon process in the background if necessary.
     ++ *
     ++ * Returns -1 on error; 0 on success.
      + */
      +int fsmonitor_ipc__send_query(const char *since_token,
      +			      struct strbuf *answer);
     @@ fsmonitor-ipc.h (new)
      + * Connect to a `git-fsmonitor--daemon` process via simple-ipc and
      + * send a command verb.  If no daemon is available, we DO NOT try to
      + * start one.
     ++ *
     ++ * Returns -1 on error; 0 on success.
      + */
      +int fsmonitor_ipc__send_command(const char *command,
      +				struct strbuf *answer);
      +
     -+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
      +#endif /* FSMONITOR_IPC_H */
     -
     - ## help.c ##
     -@@
     - #include "version.h"
     - #include "refs.h"
     - #include "parse-options.h"
     -+#include "fsmonitor-ipc.h"
     - 
     - struct category_description {
     - 	uint32_t category;
     -@@ help.c: void get_version_info(struct strbuf *buf, int show_build_options)
     - 		strbuf_addf(buf, "sizeof-size_t: %d\n", (int)sizeof(size_t));
     - 		strbuf_addf(buf, "shell-path: %s\n", SHELL_PATH);
     - 		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
     -+
     -+		if (fsmonitor_ipc__is_supported())
     -+			strbuf_addstr(buf, "feature: fsmonitor--daemon\n");
     - 	}
     - }
     - 
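
The fsmonitor-ipc.c change above moves from "one always-defined symbol plus an ifdef'd API" to "the whole fsmonitor_ipc__*() API is always defined, with trivial stubs on unsupported platforms", so callers no longer need their own `#ifdef`. A minimal sketch of that pattern follows. The `example_ipc__*` names and the `HAVE_EXAMPLE_BACKEND` macro are hypothetical and are not part of Git.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_EXAMPLE_BACKEND

/* Real implementation, compiled only when a backend exists. */
int example_ipc__is_supported(void) { return 1; }
const char *example_ipc__get_path(void) { return "example.ipc"; }
int example_ipc__send_command(const char *command)
{
	assert(command);
	/* A real backend would connect and send `command` here. */
	return 0;
}

#else

/* Trivial stubs: same signatures, every call reports "unavailable". */
int example_ipc__is_supported(void) { return 0; }
const char *example_ipc__get_path(void) { return NULL; }
int example_ipc__send_command(const char *command)
{
	(void)command;
	return -1;
}

#endif

/* A caller written once for all platforms, with no #ifdef of its own. */
int example_try_command(const char *command)
{
	if (!example_ipc__is_supported())
		return -1;
	return example_ipc__send_command(command);
}
```

The cost of this layout is a few extra stub functions. The benefit is that a caller such as refresh_fsmonitor() can treat "unsupported" and "failed to get a response" as the same error path.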
  -:  ------------ >  5:  d5d09eb1635b help: include fsmonitor--daemon feature flag in version info
  3:  18c125ec73dc !  6:  67bcf57f5948 config: FSMonitor is repository-specific
     @@ Commit message
          This commit refactors `git_config_get_fsmonitor()` into the `repo_*()`
          form that takes a parameter `struct repository *r`.
      
     -    That change prepares for the upcoming `core.useFSMonitorDaemon` flag which
     +    That change prepares for the upcoming `core.useBuiltinFSMonitor` flag which
          will be stored in the `repo_settings` struct.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
  4:  7082528d8f7c !  7:  7e097cebc143 fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
     @@ Metadata
       ## Commit message ##
          fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
      
     -    The `core.fsmonitor` setting is supposed to be set to a path pointing
     -    to a script or executable that (via the Hook API) queries an fsmonitor
     -    process such as watchman.
     +    Use simple IPC to directly communicate with the new builtin file
     +    system monitor daemon.
      
     -    We are about to implement our own fsmonitor backend, and do not want
     -    to spawn hook processes just to query it.  Let's use `Simple IPC` to
     -    directly communicate with the daemon (and start it if necessary),
     -    guarded by the brand-new `core.useBuiltinFSMonitor` toggle.
     +    Define a new config setting `core.useBuiltinFSMonitor` to enable the
     +    builtin file system monitor.
     +
     +    The `core.fsmonitor` setting has already been defined as a HOOK
     +    pathname.  Historically, this has been set to a HOOK script that will
     +    talk with Watchman.  For compatibility reasons, we do not want to
     +    overload that definition (and cause problems if users have multiple
     +    versions of Git installed).
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
     @@ fsmonitor.c
       #include "run-command.h"
       #include "strbuf.h"
       
     -@@ fsmonitor.c: void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
     - /*
     -  * Call the query-fsmonitor hook passing the last update token of the saved results.
     -  */
     --static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
     -+static int query_fsmonitor(int version, struct index_state *istate, struct strbuf *query_result)
     +@@ fsmonitor.c: static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
     + 
     + void refresh_fsmonitor(struct index_state *istate)
       {
      +	struct repository *r = istate->repo ? istate->repo : the_repository;
     -+	const char *last_update = istate->fsmonitor_last_update;
     - 	struct child_process cp = CHILD_PROCESS_INIT;
     - 	int result;
     - 
     - 	if (!core_fsmonitor)
     - 		return -1;
     + 	struct strbuf query_result = STRBUF_INIT;
     + 	int query_success = 0, hook_version = -1;
     + 	size_t bol = 0; /* beginning of line */
     +@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     + 	istate->fsmonitor_has_run_once = 1;
       
     + 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
     ++
      +	if (r->settings.use_builtin_fsmonitor > 0) {
     -+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
     -+		return fsmonitor_ipc__send_query(last_update, query_result);
     -+#else
     -+		/* Fake a trivial response. */
     -+		warning(_("fsmonitor--daemon unavailable; falling back"));
     -+		strbuf_add(query_result, "/", 2);
     -+		return 0;
     -+#endif
     ++		query_success = !fsmonitor_ipc__send_query(
     ++			istate->fsmonitor_last_update, &query_result);
     ++		if (query_success) {
     ++			/*
     ++			 * The response contains a series of nul terminated
     ++			 * strings.  The first is the new token.
     ++			 *
     ++			 * Use `char *buf` as an interlude to trick the CI
     ++			 * static analysis to let us use `strbuf_addstr()`
     ++			 * here (and only copy the token) rather than
     ++			 * `strbuf_addbuf()`.
     ++			 */
     ++			buf = query_result.buf;
     ++			strbuf_addstr(&last_update_token, buf);
     ++			bol = last_update_token.len + 1;
     ++		} else {
     ++			/*
     ++			 * The builtin daemon is not available on this
     ++			 * platform -OR- we failed to get a response.
     ++			 *
     ++			 * Generate a fake token (rather than a V1
     ++			 * timestamp) for the index extension.  (If
     ++			 * they switch back to the hook API, we don't
     ++			 * want ambiguous state.)
     ++			 */
     ++			strbuf_addstr(&last_update_token, "builtin:fake");
     ++		}
     ++
     ++		/*
     ++		 * Regardless of whether we successfully talked to a
     ++		 * fsmonitor daemon or not, we skip over and do not
     ++		 * try to use the hook.  The "core.useBuiltinFSMonitor"
     ++		 * config setting ALWAYS overrides the "core.fsmonitor"
     ++		 * hook setting.
     ++		 */
     ++		goto apply_results;
      +	}
      +
     - 	strvec_push(&cp.args, core_fsmonitor);
     - 	strvec_pushf(&cp.args, "%d", version);
     - 	strvec_pushf(&cp.args, "%s", last_update);
     + 	/*
     + 	 * This could be racy so save the date/time now and query_fsmonitor
     + 	 * should be inclusive to ensure we don't miss potential changes.
      @@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     - 	if (istate->fsmonitor_last_update) {
     - 		if (hook_version == -1 || hook_version == HOOK_INTERFACE_VERSION2) {
     - 			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION2,
     --				istate->fsmonitor_last_update, &query_result);
     -+				istate, &query_result);
     + 			core_fsmonitor, query_success ? "success" : "failure");
     + 	}
       
     - 			if (query_success) {
     - 				if (hook_version < 0)
     -@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     - 
     - 		if (hook_version == HOOK_INTERFACE_VERSION1) {
     - 			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION1,
     --				istate->fsmonitor_last_update, &query_result);
     -+				istate, &query_result);
     - 		}
     - 
     - 		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
     ++apply_results:
     + 	/* a fsmonitor process can return '/' to indicate all entries are invalid */
     + 	if (query_success && query_result.buf[bol] != '/') {
     + 		/* Mark all entries returned by the monitor as dirty */
      
       ## repo-settings.c ##
      @@ repo-settings.c: void prepare_repo_settings(struct repository *r)
     @@ repo-settings.c: void prepare_repo_settings(struct repository *r)
       		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
      
       ## repository.h ##
     -@@ repository.h: struct repo_settings {
     - 	enum fetch_negotiation_setting fetch_negotiation_algorithm;
     +@@ repository.h: enum fetch_negotiation_setting {
     + struct repo_settings {
     + 	int initialized;
       
     - 	int core_multi_pack_index;
     -+
      +	int use_builtin_fsmonitor;
     - };
     - 
     - struct repository {
     ++
     + 	int core_commit_graph;
     + 	int commit_graph_read_changed_paths;
     + 	int gc_write_commit_graph;
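
The new refresh_fsmonitor() comment above describes the daemon response as a series of NUL-terminated strings in which the first string is the new token. The sketch below shows one way to walk such a buffer. It is an illustration only: `walk_response()` and `path_cb` are hypothetical names, and it assumes the final string in the buffer is NUL-terminated within the given length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type; not part of Git. */
typedef void (*path_cb)(const char *path, void *data);

/*
 * Walk `len` bytes of NUL-terminated strings.  The first string is
 * stored in *token; `cb` (if non-NULL) is called for each later string.
 * Returns the number of strings after the token, or -1 if the buffer
 * is empty or does not end with a NUL.
 */
static int walk_response(const char *buf, size_t len,
			 const char **token, path_cb cb, void *data)
{
	size_t bol; /* beginning of the current string */
	int count = 0;

	assert(buf && token);
	if (!len || buf[len - 1] != '\0')
		return -1;

	*token = buf;
	bol = strlen(buf) + 1; /* skip the token and its NUL */

	while (bol < len) {
		const char *path = buf + bol;

		if (cb)
			cb(path, data);
		count++;
		bol += strlen(path) + 1;
	}
	return count;
}
```

The `bol` computation mirrors the `bol = last_update_token.len + 1` line in the hunk above: the path list starts one byte past the token's terminating NUL.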
  5:  95d511d83b12 !  8:  f362a88632e4 fsmonitor--daemon: add a built-in fsmonitor daemon
     @@ builtin/fsmonitor--daemon.c (new)
      +
      +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
      +{
     -+	enum daemon_mode {
     -+		UNDEFINED_MODE,
     -+	} mode = UNDEFINED_MODE;
     ++	const char *subcmd;
      +
      +	struct option options[] = {
      +		OPT_END()
      +	};
      +
     ++	if (argc < 2)
     ++		usage_with_options(builtin_fsmonitor__daemon_usage, options);
     ++
      +	if (argc == 2 && !strcmp(argv[1], "-h"))
      +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
      +
      +	git_config(git_default_config, NULL);
      +
     ++	subcmd = argv[1];
     ++	argv--;
     ++	argc++;
     ++
      +	argc = parse_options(argc, argv, prefix, options,
      +			     builtin_fsmonitor__daemon_usage, 0);
      +
     -+	switch (mode) {
     -+	case UNDEFINED_MODE:
     -+	default:
     -+		die(_("Unhandled command mode %d"), mode);
     -+	}
     ++	die(_("Unhandled subcommand '%s'"), subcmd);
      +}
      +
      +#else
  6:  77170e521f67 !  9:  4f401310539e fsmonitor--daemon: implement client command options
     @@ Metadata
       ## Commit message ##
          fsmonitor--daemon: implement client command options
      
     -    Implement command options `--stop`, `--is-running`, `--query`,
     -    `--query-index`, and `--flush` to control and query the status of a
     -    `fsmonitor--daemon` server process (and implicitly start a server
     -    process if necessary).
     +    Implement `stop` and `status` client commands to control and query the
     +    status of a `fsmonitor--daemon` server process (and implicitly start a
     +    server process if necessary).
      
     -    Later commits will implement the actual server and monitor
     -    the file system.
     +    Later commits will implement the actual server and monitor the file
     +    system.
      
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
     @@ builtin/fsmonitor--daemon.c
       #include "khash.h"
       
       static const char * const builtin_fsmonitor__daemon_usage[] = {
     -+	N_("git fsmonitor--daemon --stop"),
     -+	N_("git fsmonitor--daemon --is-running"),
     -+	N_("git fsmonitor--daemon --query <token>"),
     -+	N_("git fsmonitor--daemon --query-index"),
     -+	N_("git fsmonitor--daemon --flush"),
     ++	N_("git fsmonitor--daemon stop"),
     ++	N_("git fsmonitor--daemon status"),
       	NULL
       };
       
     @@ builtin/fsmonitor--daemon.c
      +/*
      + * Acting as a CLIENT.
      + *
     -+ * Send an IPC query to a `git-fsmonitor--daemon` SERVER process and
     -+ * ask for the changes since the given token.  This will implicitly
     -+ * start a daemon process if necessary.  The daemon process will
     -+ * persist after we exit.
     -+ *
     -+ * This feature is primarily used by the test suite.
     -+ */
     -+static int do_as_client__query_token(const char *token)
     -+{
     -+	struct strbuf answer = STRBUF_INIT;
     -+	int ret;
     -+
     -+	ret = fsmonitor_ipc__send_query(token, &answer);
     -+	if (ret < 0)
     -+		die(_("could not query fsmonitor--daemon"));
     -+
     -+	write_in_full(1, answer.buf, answer.len);
     -+	strbuf_release(&answer);
     -+
     -+	return 0;
     -+}
     -+
     -+/*
     -+ * Acting as a CLIENT.
     -+ *
     -+ * Read the `.git/index` to get the last token written to the FSMonitor index
     -+ * extension and use that to make a query.
     -+ *
     -+ * This feature is primarily used by the test suite.
     -+ */
     -+static int do_as_client__query_from_index(void)
     -+{
     -+	struct index_state *istate = the_repository->index;
     -+
     -+	setup_git_directory();
     -+	if (do_read_index(istate, the_repository->index_file, 0) < 0)
     -+		die("unable to read index file");
     -+	if (!istate->fsmonitor_last_update)
     -+		die("index file does not have fsmonitor extension");
     -+
     -+	return do_as_client__query_token(istate->fsmonitor_last_update);
     -+}
     -+
     -+/*
     -+ * Acting as a CLIENT.
     -+ *
      + * Send a "quit" command to the `git-fsmonitor--daemon` (if running)
      + * and wait for it to shutdown.
      + */
     @@ builtin/fsmonitor--daemon.c
      +	return 0;
      +}
      +
     -+/*
     -+ * Acting as a CLIENT.
     -+ *
     -+ * Send a "flush" command to the `git-fsmonitor--daemon` (if running)
     -+ * and tell it to flush its cache.
     -+ *
     -+ * This feature is primarily used by the test suite to simulate a loss of
     -+ * sync with the filesystem where we miss kernel events.
     -+ */
     -+static int do_as_client__send_flush(void)
     ++static int do_as_client__status(void)
      +{
     -+	struct strbuf answer = STRBUF_INIT;
     -+	int ret;
     -+
     -+	ret = fsmonitor_ipc__send_command("flush", &answer);
     -+	if (ret)
     -+		return ret;
     -+
     -+	write_in_full(1, answer.buf, answer.len);
     -+	strbuf_release(&answer);
     ++	enum ipc_active_state state = fsmonitor_ipc__get_state();
      +
     -+	return 0;
     -+}
     ++	switch (state) {
     ++	case IPC_STATE__LISTENING:
     ++		printf(_("The built-in file system monitor is active\n"));
     ++		return 0;
      +
     -+static int is_ipc_daemon_listening(void)
     -+{
     -+	return fsmonitor_ipc__get_state() == IPC_STATE__LISTENING;
     ++	default:
     ++		printf(_("The built-in file system monitor is not active\n"));
     ++		return 1;
     ++	}
      +}
       
       int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
       {
     - 	enum daemon_mode {
     - 		UNDEFINED_MODE,
     -+		STOP,
     -+		IS_RUNNING,
     -+		QUERY,
     -+		QUERY_INDEX,
     -+		FLUSH,
     - 	} mode = UNDEFINED_MODE;
     - 
     - 	struct option options[] = {
     -+		OPT_CMDMODE(0, "stop", &mode, N_("stop the running daemon"),
     -+			    STOP),
     -+
     -+		OPT_CMDMODE(0, "is-running", &mode,
     -+			    N_("test whether the daemon is running"),
     -+			    IS_RUNNING),
     -+
     -+		OPT_CMDMODE(0, "query", &mode,
     -+			    N_("query the daemon (starting if necessary)"),
     -+			    QUERY),
     -+		OPT_CMDMODE(0, "query-index", &mode,
     -+			    N_("query the daemon (starting if necessary) using token from index"),
     -+			    QUERY_INDEX),
     -+		OPT_CMDMODE(0, "flush", &mode, N_("flush cached filesystem events"),
     -+			    FLUSH),
     - 		OPT_END()
     - 	};
     - 
      @@ builtin/fsmonitor--daemon.c: int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
     + 	argc = parse_options(argc, argv, prefix, options,
       			     builtin_fsmonitor__daemon_usage, 0);
       
     - 	switch (mode) {
     -+	case STOP:
     ++	if (!strcmp(subcmd, "stop"))
      +		return !!do_as_client__send_stop();
      +
     -+	case IS_RUNNING:
     -+		return !is_ipc_daemon_listening();
     -+
     -+	case QUERY:
     -+		if (argc != 1)
     -+			usage_with_options(builtin_fsmonitor__daemon_usage,
     -+					   options);
     -+		return !!do_as_client__query_token(argv[0]);
     ++	if (!strcmp(subcmd, "status"))
     ++		return !!do_as_client__status();
      +
     -+	case QUERY_INDEX:
     -+		return !!do_as_client__query_from_index();
     -+
     -+	case FLUSH:
     -+		return !!do_as_client__send_flush();
     -+
     - 	case UNDEFINED_MODE:
     - 	default:
     - 		die(_("Unhandled command mode %d"), mode);
     + 	die(_("Unhandled subcommand '%s'"), subcmd);
     + }
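
The hunk above replaces the OPT_CMDMODE switch with a plain string comparison on the subcommand word. A minimal sketch of that dispatch style follows. The handler functions are hypothetical stand-ins for the do_as_client__*() routines, and unlike the patch (which calls die()), this sketch returns -1 for an unknown subcommand so that it can be exercised in isolation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical handlers; each returns 0 on success, non-zero otherwise. */
static int do_stop(void) { return 0; }
static int do_status(void) { return 1; } /* pretend the daemon is not active */

/*
 * Dispatch on a bare subcommand word such as "stop" or "status".
 * Handler results are normalized to 0 or 1 with `!!`, as in the patch.
 * Returns -1 if the word is not a known subcommand.
 */
static int dispatch_subcmd(const char *subcmd)
{
	assert(subcmd);

	if (!strcmp(subcmd, "stop"))
		return !!do_stop();

	if (!strcmp(subcmd, "status"))
		return !!do_status();

	return -1;
}
```

With this style, the old dashed spellings such as `--stop` are no longer recognized as subcommands and fall through to the unknown-subcommand path.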
     + 
  -:  ------------ > 10:  d21af7ff842c t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
  7:  27f47dfbd9cf ! 11:  49f9e2e3d49c fsmonitor-fs-listen-win32: stub in backend for Windows
     @@ Makefile: all::
       # directory, and the JSON compilation database 'compile_commands.json' will be
       # created at the root of the repository.
       #
     -+# If your platform supports an built-in fsmonitor backend, set
     -+# FSMONITOR_DAEMON_BACKEND to the name of the corresponding
     ++# If your platform supports a built-in fsmonitor backend, set
     ++# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
      +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
      +# `fsmonitor_fs_listen__*()` routines.
      +#
  8:  a84dee24e13e = 12:  2aa85151f03c fsmonitor-fs-listen-macos: stub in backend for MacOS
  9:  2b291d805d59 ! 13:  2aa05ad5c67f fsmonitor--daemon: implement daemon command options
     @@ Metadata
       ## Commit message ##
          fsmonitor--daemon: implement daemon command options
      
     -    Implement command options `--run` and `--start` to try to
     +    Implement `run` and `start` commands to try to
          begin listening for file system events.
      
          This version defines the thread structure with a single
     @@ builtin/fsmonitor--daemon.c
       #include "khash.h"
       
       static const char * const builtin_fsmonitor__daemon_usage[] = {
     -+	N_("git fsmonitor--daemon --start [<options>]"),
     -+	N_("git fsmonitor--daemon --run [<options>]"),
     - 	N_("git fsmonitor--daemon --stop"),
     - 	N_("git fsmonitor--daemon --is-running"),
     - 	N_("git fsmonitor--daemon --query <token>"),
     -@@ builtin/fsmonitor--daemon.c: static const char * const builtin_fsmonitor__daemon_usage[] = {
     ++	N_("git fsmonitor--daemon start [<options>]"),
     ++	N_("git fsmonitor--daemon run [<options>]"),
     + 	N_("git fsmonitor--daemon stop"),
     + 	N_("git fsmonitor--daemon status"),
     + 	NULL
       };
       
       #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
     @@ builtin/fsmonitor--daemon.c: static const char * const builtin_fsmonitor__daemon
       /*
        * Acting as a CLIENT.
        *
     -@@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
     - 	return 0;
     +@@ builtin/fsmonitor--daemon.c: static int do_as_client__status(void)
     + 	}
       }
       
      +static ipc_server_application_cb handle_client;
      +
     -+static int handle_client(void *data, const char *command,
     ++static int handle_client(void *data,
     ++			 const char *command, size_t command_len,
      +			 ipc_server_reply_cb *reply,
      +			 struct ipc_server_reply_data *reply_data)
      +{
      +	/* struct fsmonitor_daemon_state *state = data; */
      +	int result;
      +
     ++	/*
     ++	 * The Simple IPC API now supports {char*, len} arguments, but
     ++	 * FSMonitor always uses proper null-terminated strings, so
     ++	 * we can ignore the command_len argument.  (Trust, but verify.)
     ++	 */
     ++	if (command_len != strlen(command))
     ++		BUG("FSMonitor assumes text messages");
     ++
      +	trace2_region_enter("fsmonitor", "handle_client", the_repository);
      +	trace2_data_string("fsmonitor", the_repository, "request", command);
      +
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	pthread_mutex_init(&state.main_lock, NULL);
      +	state.error_code = 0;
      +	state.current_token_data = NULL;
     -+	state.test_client_delay_ms = 0;
      +
      +	/* Prepare to (recursively) watch the <worktree-root> directory. */
      +	strbuf_init(&state.path_worktree_watch, 0);
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	state.nr_paths_watching = 1;
      +
      +	/*
     -+	 * If ".git" is not a directory, then <gitdir> is not inside the
     -+	 * cone of <worktree-root>, so set up a second watch for it.
     ++	 * We create/delete cookie files inside the .git directory to
     ++	 * help us keep in sync with the file system.  If ".git" is not a
     ++	 * directory, then <gitdir> is not inside the cone of
     ++	 * <worktree-root>, so set up a second watch for it.
      +	 */
      +	strbuf_init(&state.path_gitdir_watch, 0);
      +	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	return err;
      +}
      +
     - static int is_ipc_daemon_listening(void)
     - {
     - 	return fsmonitor_ipc__get_state() == IPC_STATE__LISTENING;
     - }
     - 
      +static int try_to_run_foreground_daemon(void)
      +{
      +	/*
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	 * However, this method gives us a nicer error message for a
      +	 * common error case.
      +	 */
     -+	if (is_ipc_daemon_listening())
     ++	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
      +		die("fsmonitor--daemon is already running.");
      +
      +	return !!fsmonitor_run_daemon();
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +
      +	strvec_push(&args, git_exe);
      +	strvec_push(&args, "fsmonitor--daemon");
     -+	strvec_push(&args, "--run");
     ++	strvec_push(&args, "run");
      +
      +	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
      +	close(in);
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +
      +		if (pid_seen == -1)
      +			return error_errno(_("waitpid failed"));
     -+
      +		else if (pid_seen == 0) {
      +			/*
      +			 * The child is still running (this should be
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +			time(&now);
      +			if (now > time_limit)
      +				return error(_("fsmonitor--daemon not online yet"));
     -+
     -+			continue;
     -+		}
     -+
     -+		else if (pid_seen == pid_child) {
     ++		} else if (pid_seen == pid_child) {
      +			/*
      +			 * The new child daemon process shut down while
      +			 * it was starting up, so it is not listening
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +			 * early shutdown as an error.
      +			 */
      +			return error(_("fsmonitor--daemon failed to start"));
     -+		}
     -+
     -+		else
     ++		} else
      +			return error(_("waitpid is confused"));
      +	}
      +}
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	 * of creating the background process (and not whether it
      +	 * immediately exited).
      +	 */
     -+	if (is_ipc_daemon_listening())
     ++	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
      +		die("fsmonitor--daemon is already running.");
      +
      +	/*
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +
       int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
       {
     - 	enum daemon_mode {
     - 		UNDEFINED_MODE,
     -+		START,
     -+		RUN,
     - 		STOP,
     - 		IS_RUNNING,
     - 		QUERY,
     -@@ builtin/fsmonitor--daemon.c: int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
     - 	} mode = UNDEFINED_MODE;
     + 	const char *subcmd;
       
       	struct option options[] = {
     -+		OPT_CMDMODE(0, "start", &mode,
     -+			    N_("run the daemon in the background"),
     -+			    START),
     -+		OPT_CMDMODE(0, "run", &mode,
     -+			    N_("run the daemon in the foreground"), RUN),
     - 		OPT_CMDMODE(0, "stop", &mode, N_("stop the running daemon"),
     - 			    STOP),
     - 
     -@@ builtin/fsmonitor--daemon.c: int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
     - 			    QUERY_INDEX),
     - 		OPT_CMDMODE(0, "flush", &mode, N_("flush cached filesystem events"),
     - 			    FLUSH),
     -+
     -+		OPT_GROUP(N_("Daemon options")),
      +		OPT_INTEGER(0, "ipc-threads",
      +			    &fsmonitor__ipc_threads,
      +			    N_("use <n> ipc worker threads")),
      +		OPT_INTEGER(0, "start-timeout",
      +			    &fsmonitor__start_timeout_sec,
      +			    N_("Max seconds to wait for background daemon startup")),
     ++
       		OPT_END()
       	};
       
     +@@ builtin/fsmonitor--daemon.c: int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
       	if (argc == 2 && !strcmp(argv[1], "-h"))
       		usage_with_options(builtin_fsmonitor__daemon_usage, options);
       
      -	git_config(git_default_config, NULL);
      +	git_config(fsmonitor_config, NULL);
       
     + 	subcmd = argv[1];
     + 	argv--;
     +@@ builtin/fsmonitor--daemon.c: int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
     + 
       	argc = parse_options(argc, argv, prefix, options,
       			     builtin_fsmonitor__daemon_usage, 0);
      +	if (fsmonitor__ipc_threads < 1)
      +		die(_("invalid 'ipc-threads' value (%d)"),
      +		    fsmonitor__ipc_threads);
     - 
     - 	switch (mode) {
     -+	case START:
     ++
     ++	if (!strcmp(subcmd, "start"))
      +		return !!try_to_start_background_daemon();
      +
     -+	case RUN:
     ++	if (!strcmp(subcmd, "run"))
      +		return !!try_to_run_foreground_daemon();
     -+
     - 	case STOP:
     - 		return !!do_as_client__send_stop();
       
     + 	if (!strcmp(subcmd, "stop"))
     + 		return !!do_as_client__send_stop();
      
       ## fsmonitor--daemon.h (new) ##
      @@
     @@ fsmonitor--daemon.h (new)
      +	struct fsmonitor_daemon_backend_data *backend_data;
      +
      +	struct ipc_server_data *ipc_server_data;
     -+
     -+	int test_client_delay_ms;
      +};
      +
      +#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
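
For readers following the switch from `--start`/`--run` command-mode options
to plain `start`/`run`/`stop`/`status` subcommands, the string dispatch can be
pictured with the sketch below. It is only an illustration: the enum and the
lookup function are hypothetical names, not code from this series, which
compares `subcmd` with strcmp() inline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers; the series itself uses inline strcmp() calls. */
enum sketch_subcmd {
	SUBCMD_START,
	SUBCMD_RUN,
	SUBCMD_STOP,
	SUBCMD_STATUS,
	SUBCMD_UNKNOWN
};

/* Map a subcommand word to an id; anything else is SUBCMD_UNKNOWN. */
static enum sketch_subcmd sketch_lookup_subcmd(const char *subcmd)
{
	static const struct {
		const char *name;
		enum sketch_subcmd id;
	} table[] = {
		{ "start", SUBCMD_START },
		{ "run", SUBCMD_RUN },
		{ "stop", SUBCMD_STOP },
		{ "status", SUBCMD_STATUS },
	};
	size_t i;

	assert(subcmd);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (!strcmp(subcmd, table[i].name))
			return table[i].id;
	return SUBCMD_UNKNOWN;
}
```

With this shape the old spelling `--run` falls through to the unknown case,
which corresponds to the die("Unhandled subcommand ...") path in the series.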
 10:  451563314d84 ! 14:  d5ababfd03e9 fsmonitor--daemon: add pathname classification
     @@ Commit message
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## builtin/fsmonitor--daemon.c ##
     -@@ builtin/fsmonitor--daemon.c: static int handle_client(void *data, const char *command,
     +@@ builtin/fsmonitor--daemon.c: static int handle_client(void *data,
       	return result;
       }
       
     @@ builtin/fsmonitor--daemon.c: static int handle_client(void *data, const char *co
      
       ## fsmonitor--daemon.h ##
      @@ fsmonitor--daemon.h: struct fsmonitor_daemon_state {
     - 	int test_client_delay_ms;
     + 	struct ipc_server_data *ipc_server_data;
       };
       
      +/*
 11:  304fe03034f8 ! 15:  c092cdf2c8b7 fsmonitor--daemon: define token-ids
     @@ Commit message
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## builtin/fsmonitor--daemon.c ##
     -@@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
     - 	return 0;
     +@@ builtin/fsmonitor--daemon.c: static int do_as_client__status(void)
     + 	}
       }
       
      +/*
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      + *
      + *     "builtin" ":" <token_id> ":" <sequence_nr>
      + *
     ++ * The "builtin" prefix is used as a namespace to avoid conflicts
     ++ * with other providers (such as Watchman).
     ++ *
      + * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
      + * UUID, or {timestamp,pid}.  It is used to group all filesystem
      + * events that happened while the daemon was monitoring (and in-sync
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      + *     (There are too many race conditions to rely on file system
      + *     event timestamps.)
      + *
     -+ * The <sequence_nr> is a simple integer incremented for each event
     -+ * received.  When a new <token_id> is created, the <sequence_nr> is
     -+ * reset to zero.
     ++ * The <sequence_nr> is a simple integer incremented whenever the
     ++ * daemon needs to make its state public.  For example, if 1000 file
     ++ * system events come in, but no clients have requested the data,
     ++ * the daemon can continue to accumulate file changes in the same
     ++ * bin and does not need to advance the sequence number.  However,
     ++ * as soon as a client does arrive, the daemon needs to start a new
     ++ * bin and increment the sequence number.
     ++ *
     ++ *     The sequence number serves as the boundary between 2 sets
     ++ *     of bins -- the older ones that the client has already seen
     ++ *     and the newer ones that it hasn't.
     ++ *
     ++ * When a new <token_id> is created, the <sequence_nr> is reset to
     ++ * zero.
      + *
      + *
      + * About Token Ids
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      + * [3] in response to a client "flush" command (for dropped event
      + *     testing).
      + *
     -+ * [4] MAYBE We might want to change the token_id after very complex
     -+ *     filesystem operations are performed, such as a directory move
     -+ *     sequence that affects many files within.  It might be simpler
     -+ *     to just give up and fake a re-sync (and let the client do a
     -+ *     full scan) than try to enumerate the effects of such a change.
     -+ *
      + * When a new token_id is created, the daemon is free to discard all
      + * cached filesystem events associated with any previous token_ids.
      + * Events associated with a non-current token_id will never be sent
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	static uint64_t flush_count = 0;
      +	struct fsmonitor_token_data *token;
      +
     -+	token = (struct fsmonitor_token_data *)xcalloc(1, sizeof(*token));
     ++	CALLOC_ARRAY(token, 1);
      +
      +	strbuf_init(&token->token_id, 0);
      +	token->batch_head = NULL;
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +
       static ipc_server_application_cb handle_client;
       
     - static int handle_client(void *data, const char *command,
     + static int handle_client(void *data,
      @@ builtin/fsmonitor--daemon.c: static int fsmonitor_run_daemon(void)
       
       	pthread_mutex_init(&state.main_lock, NULL);
       	state.error_code = 0;
      -	state.current_token_data = NULL;
      +	state.current_token_data = fsmonitor_new_token_data();
     - 	state.test_client_delay_ms = 0;
       
       	/* Prepare to (recursively) watch the <worktree-root> directory. */
     + 	strbuf_init(&state.path_worktree_watch, 0);
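
To make the token layout described above concrete, here is a minimal sketch
of splitting a "builtin:<token_id>:<sequence_nr>" string. The struct, the
function name, and the 128-byte id buffer are assumptions made for the
illustration and do not appear in the series. Because the <token_id> is
opaque, the sketch takes the sequence number from the last ':' rather than
the first.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a parsed token; not a type from the series. */
struct parsed_token {
	char token_id[128];   /* assumed upper bound on the id length */
	uint64_t sequence_nr;
};

/*
 * Parse "builtin:<token_id>:<sequence_nr>".
 * Returns 0 on success, -1 if the string does not have that form.
 */
static int parse_builtin_token(const char *s, struct parsed_token *out)
{
	static const char prefix[] = "builtin:";
	const char *id, *colon;
	char *end;
	size_t id_len;
	unsigned long long seq;

	assert(s && out);

	/* A different prefix means another provider's token. */
	if (strncmp(s, prefix, sizeof(prefix) - 1))
		return -1;
	id = s + sizeof(prefix) - 1;

	/* The id is opaque, so split on the LAST colon. */
	colon = strrchr(id, ':');
	if (!colon || colon == id)
		return -1;
	id_len = (size_t)(colon - id);
	if (id_len >= sizeof(out->token_id))
		return -1;

	/* strtoull() accepts signs and whitespace; insist on a digit. */
	if (colon[1] < '0' || colon[1] > '9')
		return -1;
	errno = 0;
	seq = strtoull(colon + 1, &end, 10);
	if (errno || *end)
		return -1;

	memcpy(out->token_id, id, id_len);
	out->token_id[id_len] = '\0';
	out->sequence_nr = seq;
	return 0;
}
```

A request whose <token_id> does not match the daemon's current one would
then get the trivial response, regardless of the parsed sequence number.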
 12:  f1fa803ebe9c ! 16:  2ed7bc3fae7a fsmonitor--daemon: create token-based changed path cache
     @@ Metadata
       ## Commit message ##
          fsmonitor--daemon: create token-based changed path cache
      
     -    Teach fsmonitor--daemon to build lists of changed paths and associate
     +    Teach fsmonitor--daemon to build a list of changed paths and associate
          them with a token-id.  This will be used by the platform-specific
          backends to accumulate changed paths in response to filesystem events.
      
     -    The platform-specific event loops receive batches containing one or
     -    more changed paths.  Their fs listener thread will accumulate them in
     -    a `fsmonitor_batch` (and without locking) and then "publish" them to
     -    associate them with the current token and to make them visible to the
     -    client worker threads.
     +    The platform-specific file system listener thread receives file system
     +    events containing one or more changed pathnames (with whatever bucketing
     +    or grouping that is convenient for the file system).  These paths are
     +    accumulated (without locking) by the file system layer into a `fsmonitor_batch`.
     +
     +    When the file system layer has drained the kernel event queue, it will
     +    "publish" them to our token queue and make them visible to concurrent
     +    client worker threads.  The token layer is free to combine and/or de-dup
     +    paths within these batches for efficient presentation to clients.
      
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## builtin/fsmonitor--daemon.c ##
     -@@ builtin/fsmonitor--daemon.c: static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
     - 	return token;
     - }
     +@@ builtin/fsmonitor--daemon.c: struct fsmonitor_token_data {
     + 	uint64_t client_ref_count;
     + };
       
      +struct fsmonitor_batch {
      +	struct fsmonitor_batch *next;
     @@ builtin/fsmonitor--daemon.c: static struct fsmonitor_token_data *fsmonitor_new_t
      +	time_t pinned_time;
      +};
      +
     + static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
     + {
     + 	static int test_env_value = -1;
     + 	static uint64_t flush_count = 0;
     + 	struct fsmonitor_token_data *token;
     ++	struct fsmonitor_batch *batch;
     + 
     + 	CALLOC_ARRAY(token, 1);
     ++	batch = fsmonitor_batch__new();
     + 
     + 	strbuf_init(&token->token_id, 0);
     +-	token->batch_head = NULL;
     +-	token->batch_tail = NULL;
     ++	token->batch_head = batch;
     ++	token->batch_tail = batch;
     + 	token->client_ref_count = 0;
     + 
     + 	if (test_env_value < 0)
     +@@ builtin/fsmonitor--daemon.c: static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
     + 		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);
     + 	}
     + 
     ++	/*
     ++	 * We created a new <token_id> and are starting a new series
     ++	 * of tokens with a zero <sequence_nr>.
     ++	 *
     ++	 * Since clients cannot guess our new (non test) <token_id>
     ++	 * they will always receive a trivial response (because of the
     ++	 * mismatch on the <token_id>).  The trivial response will
     ++	 * tell them our new <token_id> so that subsequent requests
     ++	 * will be relative to our new series.  (And when sending that
     ++	 * response, we pin the current head of the batch list.)
     ++	 *
     ++	 * Even if the client correctly guesses the <token_id>, their
     ++	 * request of "builtin:<token_id>:0" asks for all changes MORE
     ++	 * RECENT than batch/bin 0.
     ++	 *
     ++	 * This implies that it is a waste to accumulate paths in the
     ++	 * initial batch/bin (because they will never be transmitted).
     ++	 *
     ++	 * So the daemon could be running for days and watching the
     ++	 * file system, but doesn't need to actually accumulate any
     ++	 * paths UNTIL we need to set a reference point for a later
     ++	 * relative request.
     ++	 *
     ++	 * However, it is very useful for testing to always have a
     ++	 * reference point set.  Pin batch 0 to force early file system
     ++	 * events to accumulate.
     ++	 */
     ++	if (test_env_value)
     ++		batch->pinned_time = time(NULL);
     ++
     + 	return token;
     + }
     + 
      +struct fsmonitor_batch *fsmonitor_batch__new(void)
      +{
     -+	struct fsmonitor_batch *batch = xcalloc(1, sizeof(*batch));
     ++	struct fsmonitor_batch *batch;
     ++
     ++	CALLOC_ARRAY(batch, 1);
      +
      +	return batch;
      +}
      +
     -+struct fsmonitor_batch *fsmonitor_batch__free(struct fsmonitor_batch *batch)
     ++struct fsmonitor_batch *fsmonitor_batch__pop(struct fsmonitor_batch *batch)
      +{
      +	struct fsmonitor_batch *next;
      +
     @@ builtin/fsmonitor--daemon.c: static struct fsmonitor_token_data *fsmonitor_new_t
      +static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
      +				     const struct fsmonitor_batch *batch_src)
      +{
     -+	/* assert state->main_lock */
     -+
      +	size_t k;
      +
      +	ALLOC_GROW(batch_dest->interned_paths,
     @@ builtin/fsmonitor--daemon.c: static struct fsmonitor_token_data *fsmonitor_new_t
      +
      +	strbuf_release(&token->token_id);
      +
     -+	for (p = token->batch_head; p; p = fsmonitor_batch__free(p))
     ++	for (p = token->batch_head; p; p = fsmonitor_batch__pop(p))
      +		;
      +
      +	free(token);
     @@ builtin/fsmonitor--daemon.c: static struct fsmonitor_token_data *fsmonitor_new_t
      + *     We should create a new token and start fresh (as if we just
      + *     booted up).
      + *
     -+ * If there are no readers of the the current token data series, we
     -+ * can free it now.  Otherwise, let the last reader free it.  Either
     -+ * way, the old token data series is no longer associated with our
     -+ * state data.
     ++ * If there are no concurrent threads reading the current token data
     ++ * series, we can free it now.  Otherwise, let the last reader free
     ++ * it.
     ++ *
     ++ * Either way, the old token data series is no longer associated with
     ++ * our state data.
      + */
     -+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
     ++static void with_lock__do_force_resync(struct fsmonitor_daemon_state *state)
      +{
     ++	/* assert current thread holding state->main_lock */
     ++
      +	struct fsmonitor_token_data *free_me = NULL;
      +	struct fsmonitor_token_data *new_one = NULL;
      +
      +	new_one = fsmonitor_new_token_data();
      +
     -+	pthread_mutex_lock(&state->main_lock);
     -+
     -+	trace_printf_key(&trace_fsmonitor,
     -+			 "force resync [old '%s'][new '%s']",
     -+			 state->current_token_data->token_id.buf,
     -+			 new_one->token_id.buf);
     -+
      +	if (state->current_token_data->client_ref_count == 0)
      +		free_me = state->current_token_data;
      +	state->current_token_data = new_one;
      +
     -+	pthread_mutex_unlock(&state->main_lock);
     -+
      +	fsmonitor_free_token_data(free_me);
      +}
     ++
     ++void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
     ++{
     ++	pthread_mutex_lock(&state->main_lock);
     ++	with_lock__do_force_resync(state);
     ++	pthread_mutex_unlock(&state->main_lock);
     ++}
      +
       static ipc_server_application_cb handle_client;
       
     - static int handle_client(void *data, const char *command,
     + static int handle_client(void *data,
      @@ builtin/fsmonitor--daemon.c: enum fsmonitor_path_type fsmonitor_classify_path_absolute(
       	return fsmonitor_classify_path_gitdir_relative(rel);
       }
     @@ builtin/fsmonitor--daemon.c: enum fsmonitor_path_type fsmonitor_classify_path_ab
      +
      +		head = state->current_token_data->batch_head;
      +		if (!head) {
     -+			batch->batch_seq_nr = 0;
     -+			batch->next = NULL;
     -+			state->current_token_data->batch_head = batch;
     -+			state->current_token_data->batch_tail = batch;
     ++			BUG("token does not have batch");
      +		} else if (head->pinned_time) {
      +			/*
      +			 * We cannot alter the current batch list
     @@ builtin/fsmonitor--daemon.c: enum fsmonitor_path_type fsmonitor_classify_path_ab
      +			batch->batch_seq_nr = head->batch_seq_nr + 1;
      +			batch->next = head;
      +			state->current_token_data->batch_head = batch;
     ++		} else if (!head->batch_seq_nr) {
     ++			/*
     ++			 * Batch 0 is unpinned.  See the note in
     ++			 * `fsmonitor_new_token_data()` about why we
     ++			 * don't need to accumulate these paths.
     ++			 */
     ++			fsmonitor_batch__pop(batch);
      +		} else if (head->nr + batch->nr > MY_COMBINE_LIMIT) {
      +			/*
      +			 * The head batch in the list has never been
     @@ builtin/fsmonitor--daemon.c: enum fsmonitor_path_type fsmonitor_classify_path_ab
      +			 * batch onto the end of the current head batch.
      +			 */
      +			fsmonitor_batch__combine(head, batch);
     -+			fsmonitor_batch__free(batch);
     ++			fsmonitor_batch__pop(batch);
      +		}
      +	}
      +
     @@ fsmonitor--daemon.h
      +/*
      + * Free this batch and return the value of the batch->next field.
      + */
     -+struct fsmonitor_batch *fsmonitor_batch__free(struct fsmonitor_batch *batch);
     ++struct fsmonitor_batch *fsmonitor_batch__pop(struct fsmonitor_batch *batch);
      +
      +/*
      + * Add this path to this batch of modified files.
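
The publish-time handling of an incoming batch in the hunk above can be
summarized as a small decision function. This is a sketch under stated
assumptions: the struct is a cut-down stand-in for `fsmonitor_batch`, the
limit value is invented, and the action taken when the combine limit is
exceeded (prepending a new batch) is assumed, since that branch is elided in
the range-diff context shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_COMBINE_LIMIT 1024 /* assumed; stands in for MY_COMBINE_LIMIT */

/* Cut-down stand-in for struct fsmonitor_batch. */
struct sketch_batch {
	struct sketch_batch *next;
	uint64_t batch_seq_nr;
	size_t nr;          /* number of paths in this batch */
	time_t pinned_time; /* nonzero once a client has been sent this batch */
};

enum sketch_action {
	SKETCH_PREPEND, /* incoming batch becomes the new head, seq_nr + 1 */
	SKETCH_DISCARD, /* paths are dropped; nobody can ever ask for them */
	SKETCH_COMBINE  /* paths are appended to the existing head batch */
};

/* Decide what to do with an incoming batch given the current head. */
static enum sketch_action sketch_classify(const struct sketch_batch *head,
					  const struct sketch_batch *incoming)
{
	assert(head && incoming);

	if (head->pinned_time)
		return SKETCH_PREPEND; /* a client saw head; do not alter it */
	if (!head->batch_seq_nr)
		return SKETCH_DISCARD; /* unpinned batch 0 is never sent */
	if (head->nr + incoming->nr > SKETCH_COMBINE_LIMIT)
		return SKETCH_PREPEND; /* assumed behavior of the elided branch */
	return SKETCH_COMBINE;
}
```

The order of the checks matters: the pinned test comes first, so a pinned
batch 0 (as forced under the test environment) still accumulates events.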
 13:  a57ddb3bc7cc ! 17:  9ea4b04b8215 fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +		return NULL;
      +	}
      +
     -+	watch = xcalloc(1, sizeof(*watch));
     ++	CALLOC_ARRAY(watch, 1);
      +
      +	watch->buf_len = sizeof(watch->buffer); /* assume full MAX_RDCW_BUF */
      +
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +	watch->overlapped.hEvent = watch->hEvent;
      +
      +start_watch:
     ++	/*
     ++	 * Queue an async call using Overlapped IO.  This returns immediately.
     ++	 * Our event handle will be signalled when the real result is available.
     ++	 *
     ++	 * The return value here just means that we successfully queued it.
     ++	 * We won't know if the Read...() actually produces data until later.
     ++	 */
      +	watch->is_active = ReadDirectoryChangesW(
      +		watch->hDir, watch->buffer, watch->buf_len, TRUE,
      +		dwNotifyFilter, &watch->count, &watch->overlapped, NULL);
      +
     ++	/*
     ++	 * The kernel throws an invalid parameter error when our buffer
     ++	 * is too big and we are pointed at a remote directory (and possibly
     ++	 * for other reasons).  Quietly set it down and try again.
     ++	 *
     ++	 * See note about MAX_RDCW_BUF at the top.
     ++	 */
      +	if (!watch->is_active &&
      +	    GetLastError() == ERROR_INVALID_PARAMETER &&
      +	    watch->buf_len > MAX_RDCW_BUF_FALLBACK) {
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +{
      +	watch->is_active = FALSE;
      +
     ++	/*
     ++	 * The overlapped result is ready.  If the Read...() was successful
     ++	 * we finally receive the actual result into our buffer.
     ++	 */
      +	if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count,
      +				TRUE))
      +		return 0;
      +
     -+	// TODO If an external <gitdir> is deleted, the above returns an error.
     -+	// TODO I'm not sure that there's anything that we can do here other
     -+	// TODO than failing -- the <worktree>/.git link file would be broken
     -+	// TODO anyway.  We might try to check for that and return a better
     -+	// TODO error message.
     ++	/*
     ++	 * NEEDSWORK: If an external <gitdir> is deleted, the above
     ++	 * returns an error.  I'm not sure that there's anything that
     ++	 * we can do here other than failing -- the <worktree>/.git
     ++	 * link file would be broken anyway.  We might try to check
     ++	 * for that and return a better error message, but I'm not
     ++	 * sure it is worth it.
     ++	 */
      +
      +	error("GetOverlappedResult failed on '%s' [GLE %ld]",
      +	      watch->path.buf, GetLastError());
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +	if (!watch || !watch->is_active)
      +		return;
      +
     ++	/*
     ++	 * The calls to ReadDirectoryChangesW() and GetOverlappedResult()
     ++	 * form a "pair" (my term) where we queue an IO and promise to
     ++	 * hang around and wait for the kernel to give us the result.
     ++	 *
     ++	 * If for some reason after we queue the IO, we have to quit
     ++	 * or otherwise not stick around for the second half, we must
     ++	 * tell the kernel to abort the IO.  This prevents the kernel
     ++	 * from writing to our buffer and/or signalling our event
     ++	 * after we free them.
     ++	 *
     ++	 * (Ask me how much fun it was to track that one down).
     ++	 */
      +	CancelIoEx(watch->hDir, &watch->overlapped);
      +	GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE);
      +	watch->is_active = FALSE;
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +	/*
      +	 * If the kernel gets more events than will fit in the kernel
      +	 * buffer associated with our RDCW handle, it drops them and
     -+	 * returns a count of zero.  (A successful call, but with
     -+	 * length zero.)
     ++	 * returns a count of zero.
     ++	 *
     ++	 * Yes, the call returns WITHOUT error and with length zero.
     ++	 *
     ++	 * (The "overflow" case is not ambiguous with the "no data" case
     ++	 * because we did an INFINITE wait.)
     ++	 *
     ++	 * This means we have a gap in coverage.  Tell the daemon layer
     ++	 * to resync.
      +	 */
      +	if (!watch->count) {
      +		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +		default:
      +			BUG("unexpected path classification '%d' for '%s'",
      +			    t, path.buf);
     -+			goto skip_this_path;
      +		}
      +
      +skip_this_path:
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +	return LISTENER_HAVE_DATA_WORKTREE;
      +
      +force_shutdown:
     -+	fsmonitor_batch__free(batch);
     ++	fsmonitor_batch__pop(batch);
      +	string_list_clear(&cookie_list, 0);
      +	strbuf_release(&path);
      +	return LISTENER_SHUTDOWN;
      +}
      +
      +/*
     -+ * Process filesystem events that happend anywhere (recursively) under the
     ++ * Process filesystem events that happened anywhere (recursively) under the
      + * external <gitdir> (such as non-primary worktrees or submodules).
      + * We only care about cookie files that our client threads created here.
      + *
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +		default:
      +			BUG("unexpected path classification '%d' for '%s'",
      +			    t, path.buf);
     -+			goto skip_this_path;
      +		}
      +
      +skip_this_path:
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
       {
      +	struct fsmonitor_daemon_backend_data *data;
      +
     -+	data = xcalloc(1, sizeof(*data));
     ++	CALLOC_ARRAY(data, 1);
      +
      +	data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL);
      +
 14:  67fa7c7b8ac7 = 18:  21b2b4f941b2 fsmonitor-fs-listen-macos: add macos header files for FSEvent
 15:  d469d3f02e33 ! 19:  08474bad8303 fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +		ef & kFSEventStreamEventFlagItemRenamed);
      +}
      +
     ++static int ef_is_dropped(const FSEventStreamEventFlags ef)
     ++{
     ++	return (ef & kFSEventStreamEventFlagKernelDropped ||
     ++		ef & kFSEventStreamEventFlagUserDropped);
     ++}
     ++
      +static void fsevent_callback(ConstFSEventStreamRef streamRef,
      +			     void *ctx,
      +			     size_t num_of_events,
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +	const char *path_k;
      +	const char *slash;
      +	int k;
     ++	struct strbuf tmp = STRBUF_INIT;
      +
      +	/*
      +	 * Build a list of all filesystem changes into a private/local
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +		 * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR.
      +		 * Please don't log them to Trace2.
      +		 *
     -+		 * trace_printf_key(&trace_fsmonitor, "XXX '%s'", path_k);
     ++		 * trace_printf_key(&trace_fsmonitor, "Path: '%s'", path_k);
      +		 */
      +
      +		/*
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +		 *     they are conceptually relative to the just flushed
      +		 *     token).
      +		 */
     -+		if ((event_flags[k] & kFSEventStreamEventFlagKernelDropped) ||
     -+		    (event_flags[k] & kFSEventStreamEventFlagUserDropped)) {
     ++		if (ef_is_dropped(event_flags[k])) {
      +			/*
      +			 * see also kFSEventStreamEventFlagMustScanSubDirs
      +			 */
     -+			trace2_data_string("fsmonitor", NULL,
     -+					   "fsm-listen/kernel", "dropped");
     ++			trace_printf_key(&trace_fsmonitor, "event: dropped");
      +
      +			fsmonitor_force_resync(state);
     -+
     -+			if (fsmonitor_batch__free(batch))
     -+				BUG("batch should not have a next");
     ++			fsmonitor_batch__pop(batch);
      +			string_list_clear(&cookie_list, 0);
      +
      +			/*
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +			 * we have to quit.
      +			 */
      +			if (ef_is_root_delete(event_flags[k])) {
     -+				trace2_data_string("fsmonitor", NULL,
     -+						   "fsm-listen/gitdir",
     -+						   "removed");
     ++				trace_printf_key(&trace_fsmonitor,
     ++						 "event: gitdir removed");
      +				goto force_shutdown;
      +			}
      +			if (ef_is_root_renamed(event_flags[k])) {
     -+				trace2_data_string("fsmonitor", NULL,
     -+						   "fsm-listen/gitdir",
     -+						   "renamed");
     ++				trace_printf_key(&trace_fsmonitor,
     ++						 "event: gitdir renamed");
      +				goto force_shutdown;
      +			}
      +			break;
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +			if (trace_pass_fl(&trace_fsmonitor))
      +				log_flags_set(path_k, event_flags[k]);
      +
     -+			/* fsevent could be marked as both a file and directory */
     ++			/*
     ++			 * Because of the implicit "binning" (the
     ++			 * kernel calls us at a given frequency) and
     ++			 * de-duping (the kernel is free to combine
     ++			 * multiple events for a given pathname), an
     ++			 * individual fsevent could be marked as both
     ++			 * a file and directory.  Add it to the queue
     ++			 * with both spellings so that the client will
     ++			 * know how much to invalidate/refresh.
     ++			 */
      +
      +			if (event_flags[k] & kFSEventStreamEventFlagItemIsFile) {
      +				const char *rel = path_k +
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +			if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) {
      +				const char *rel = path_k +
      +					state->path_worktree_watch.len + 1;
     -+				char *p = xstrfmt("%s/", rel);
     ++
     ++				strbuf_reset(&tmp);
     ++				strbuf_addstr(&tmp, rel);
     ++				strbuf_addch(&tmp, '/');
      +
      +				if (!batch)
      +					batch = fsmonitor_batch__new();
     -+				fsmonitor_batch__add_path(batch, p);
     -+
     -+				free(p);
     ++				fsmonitor_batch__add_path(batch, tmp.buf);
      +			}
      +
      +			break;
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +
      +	fsmonitor_publish(state, batch, &cookie_list);
      +	string_list_clear(&cookie_list, 0);
     ++	strbuf_release(&tmp);
      +	return;
      +
      +force_shutdown:
     -+	if (fsmonitor_batch__free(batch))
     -+		BUG("batch should not have a next");
     ++	fsmonitor_batch__pop(batch);
      +	string_list_clear(&cookie_list, 0);
      +
      +	data->shutdown_style = FORCE_SHUTDOWN;
      +	CFRunLoopStop(data->rl);
     ++	strbuf_release(&tmp);
      +	return;
      +}
      +
      +/*
     -+ * TODO Investigate the proper value for the `latency` argument in the call
     -+ * TODO to `FSEventStreamCreate()`.  I'm not sure that this needs to be a
     -+ * TODO config setting or just something that we tune after some testing.
     -+ * TODO
     -+ * TODO With a latency of 0.1, I was seeing lots of dropped events during
     -+ * TODO the "touch 100000" files test within t/perf/p7519, but with a
     -+ * TODO latency of 0.001 I did not see any dropped events.  So the "correct"
     -+ * TODO value may be somewhere in between.
     -+ * TODO
     -+ * TODO https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
     ++ * NEEDSWORK: Investigate the proper value for the `latency` argument
     ++ * in the call to `FSEventStreamCreate()`.  I'm not sure that this
     ++ * needs to be a config setting or just something that we tune after
     ++ * some testing.
     ++ *
     ++ * With a latency of 0.1, I was seeing lots of dropped events during
     ++ * the "touch 100000" files test within t/perf/p7519, but with a
     ++ * latency of 0.001 I did not see any dropped events.  So the
     ++ * "correct" value may be somewhere in between.
     ++ *
     ++ * https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
      + */
       
       int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
     @@ compat/fsmonitor/fsmonitor-fs-listen-macos.c: void FSEventStreamRelease(FSEventS
      +	struct fsmonitor_daemon_backend_data *data;
      +	const void *dir_array[2];
      +
     -+	data = xcalloc(1, sizeof(*data));
     ++	CALLOC_ARRAY(data, 1);
      +	state->backend_data = data;
      +
      +	data->cfsr_worktree_path = CFStringCreateWithCString(
 16:  2b4ae4fc3d62 ! 20:  cc4a596d17c7 fsmonitor--daemon: implement handle_client callback
     @@ builtin/fsmonitor--daemon.c
      +#include "pkt-line.h"
       
       static const char * const builtin_fsmonitor__daemon_usage[] = {
     - 	N_("git fsmonitor--daemon --start [<options>]"),
     + 	N_("git fsmonitor--daemon start [<options>]"),
      @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
     - 	fsmonitor_free_token_data(free_me);
     + 	pthread_mutex_unlock(&state->main_lock);
       }
       
      +/*
      + * Format an opaque token string to send to the client.
      + */
     -+static void fsmonitor_format_response_token(
     ++static void with_lock__format_response_token(
      +	struct strbuf *response_token,
      +	const struct strbuf *response_token_id,
      +	const struct fsmonitor_batch *batch)
      +{
     -+	uint64_t seq_nr = (batch) ? batch->batch_seq_nr + 1 : 0;
     ++	/* assert current thread holding state->main_lock */
      +
      +	strbuf_reset(response_token);
      +	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
     -+		    response_token_id->buf, seq_nr);
     ++		    response_token_id->buf, batch->batch_seq_nr);
      +}
      +
      +/*
      + * Parse an opaque token from the client.
     ++ * Returns -1 on error.
      + */
      +static int fsmonitor_parse_client_token(const char *buf_token,
      +					struct strbuf *requested_token_id,
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +	*seq_nr = 0;
      +
      +	if (!skip_prefix(buf_token, "builtin:", &p))
     -+		return 1;
     ++		return -1;
      +
      +	while (*p && *p != ':')
      +		strbuf_addch(requested_token_id, *p++);
      +	if (!*p++)
     -+		return 1;
     ++		return -1;
      +
      +	*seq_nr = (uint64_t)strtoumax(p, &p_end, 10);
      +	if (*p_end)
     -+		return 1;
     ++		return -1;
      +
      +	return 0;
      +}
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +	intmax_t count = 0, duplicates = 0;
      +	kh_str_t *shown;
      +	int hash_ret;
     -+	int result;
     ++	int do_trivial = 0;
     ++	int do_flush = 0;
      +
      +	/*
      +	 * We expect `command` to be of the form:
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +		 * There is no reply to the client.
      +		 */
      +		return SIMPLE_IPC_QUIT;
     -+	}
      +
     -+	if (!strcmp(command, "flush")) {
     ++	} else if (!strcmp(command, "flush")) {
      +		/*
      +		 * Flush all of our cached data and generate a new token
      +		 * just like if we lost sync with the filesystem.
      +		 *
      +		 * Then send a trivial response using the new token.
      +		 */
     -+		fsmonitor_force_resync(state);
     -+		result = 0;
     -+		goto send_trivial_response;
     -+	}
     ++		do_flush = 1;
     ++		do_trivial = 1;
      +
     -+	if (!skip_prefix(command, "builtin:", &p)) {
     ++	} else if (!skip_prefix(command, "builtin:", &p)) {
      +		/* assume V1 timestamp or garbage */
      +
      +		char *p_end;
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +				  "fsmonitor: invalid command line '%s'" :
      +				  "fsmonitor: unsupported V1 protocol '%s'"),
      +				 command);
     -+		result = -1;
     -+		goto send_trivial_response;
     ++		do_trivial = 1;
     ++
     ++	} else {
     ++		/* We have "builtin:*" */
     ++		if (fsmonitor_parse_client_token(command, &requested_token_id,
     ++						 &requested_oldest_seq_nr)) {
     ++			trace_printf_key(&trace_fsmonitor,
     ++					 "fsmonitor: invalid V2 protocol token '%s'",
     ++					 command);
     ++			do_trivial = 1;
     ++
     ++		} else {
     ++			/*
     ++			 * We have a V2 valid token:
     ++			 *     "builtin:<token_id>:<seq_nr>"
     ++			 */
     ++		}
      +	}
      +
     -+	/* try V2 token */
     ++	pthread_mutex_lock(&state->main_lock);
      +
     -+	if (fsmonitor_parse_client_token(command, &requested_token_id,
     -+					 &requested_oldest_seq_nr)) {
     -+		trace_printf_key(&trace_fsmonitor,
     -+				 "fsmonitor: invalid V2 protocol token '%s'",
     -+				 command);
     -+		result = -1;
     -+		goto send_trivial_response;
     -+	}
     ++	if (!state->current_token_data)
     ++		BUG("fsmonitor state does not have a current token");
      +
     -+	pthread_mutex_lock(&state->main_lock);
     ++	if (do_flush)
     ++		with_lock__do_force_resync(state);
      +
     -+	if (!state->current_token_data) {
     -+		/*
     -+		 * We don't have a current token.  This may mean that
     -+		 * the listener thread has not yet started.
     -+		 */
     -+		pthread_mutex_unlock(&state->main_lock);
     -+		result = 0;
     -+		goto send_trivial_response;
     -+	}
     -+	if (strcmp(requested_token_id.buf,
     -+		   state->current_token_data->token_id.buf)) {
     -+		/*
     -+		 * The client last spoke to a different daemon
     -+		 * instance -OR- the daemon had to resync with
     -+		 * the filesystem (and lost events), so reject.
     -+		 */
     -+		pthread_mutex_unlock(&state->main_lock);
     -+		result = 0;
     -+		trace2_data_string("fsmonitor", the_repository,
     -+				   "response/token", "different");
     -+		goto send_trivial_response;
     -+	}
     -+	if (!state->current_token_data->batch_tail) {
     -+		/*
     -+		 * The listener has not received any filesystem
     -+		 * events yet since we created the current token.
     -+		 * We can respond with an empty list, since the
     -+		 * client has already seen the current token and
     -+		 * we have nothing new to report.  (This is
     -+		 * instead of sending a trivial response.)
     -+		 */
     -+		pthread_mutex_unlock(&state->main_lock);
     -+		result = 0;
     -+		goto send_empty_response;
     ++	/*
     ++	 * We mark the current head of the batch list as "pinned" so
     ++	 * that the listener thread will treat this item as read-only
     ++	 * (and prevent any more paths from being added to it) from
     ++	 * now on.
     ++	 */
     ++	token_data = state->current_token_data;
     ++	batch_head = token_data->batch_head;
     ++	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
     ++
     ++	/*
     ++	 * FSMonitor Protocol V2 requires that we send a response header
     ++	 * with a "new current token" and then all of the paths that changed
     ++	 * since the "requested token".  We send the seq_nr of the just-pinned
     ++	 * head batch so that future requests from a client will be relative
     ++	 * to it.
     ++	 */
     ++	with_lock__format_response_token(&response_token,
     ++					 &token_data->token_id, batch_head);
     ++
     ++	reply(reply_data, response_token.buf, response_token.len + 1);
     ++	total_response_len += response_token.len + 1;
     ++
     ++	trace2_data_string("fsmonitor", the_repository, "response/token",
     ++			   response_token.buf);
     ++	trace_printf_key(&trace_fsmonitor, "response token: %s",
     ++			 response_token.buf);
     ++
     ++	if (!do_trivial) {
     ++		if (strcmp(requested_token_id.buf, token_data->token_id.buf)) {
     ++			/*
     ++			 * The client last spoke to a different daemon
     ++			 * instance -OR- the daemon had to resync with
     ++			 * the filesystem (and lost events), so reject.
     ++			 */
     ++			trace2_data_string("fsmonitor", the_repository,
     ++					   "response/token", "different");
     ++			do_trivial = 1;
     ++
     ++		} else if (requested_oldest_seq_nr <
     ++			   token_data->batch_tail->batch_seq_nr) {
     ++			/*
     ++			 * The client wants older events than we have for
     ++			 * this token_id.  This means that the end of our
     ++			 * batch list was truncated and we cannot give the
     ++			 * client a complete snapshot relative to their
     ++			 * request.
     ++			 */
     ++			trace_printf_key(&trace_fsmonitor,
     ++					 "client requested truncated data");
     ++			do_trivial = 1;
     ++		}
      +	}
     -+	if (requested_oldest_seq_nr <
     -+	    state->current_token_data->batch_tail->batch_seq_nr) {
     -+		/*
     -+		 * The client wants older events than we have for
     -+		 * this token_id.  This means that the end of our
     -+		 * batch list was truncated and we cannot give the
     -+		 * client a complete snapshot relative to their
     -+		 * request.
     -+		 */
     ++
     ++	if (do_trivial) {
      +		pthread_mutex_unlock(&state->main_lock);
      +
     -+		trace_printf_key(&trace_fsmonitor,
     -+				 "client requested truncated data");
     -+		result = 0;
     -+		goto send_trivial_response;
     ++		reply(reply_data, "/", 2);
     ++
     ++		trace2_data_intmax("fsmonitor", the_repository,
     ++				   "response/trivial", 1);
     ++
     ++		strbuf_release(&response_token);
     ++		strbuf_release(&requested_token_id);
     ++		return 0;
      +	}
      +
      +	/*
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +	 *
      +	 * AND it allows the listener thread to do a token-reset
      +	 * (and install a new `current_token_data`).
     -+	 *
     -+	 * We mark the current head of the batch list as "pinned" so
     -+	 * that the listener thread will treat this item as read-only
     -+	 * (and prevent any more paths from being added to it) from
     -+	 * now on.
      +	 */
     -+	token_data = state->current_token_data;
      +	token_data->client_ref_count++;
      +
     -+	batch_head = token_data->batch_head;
     -+	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
     -+
      +	pthread_mutex_unlock(&state->main_lock);
      +
      +	/*
     -+	 * FSMonitor Protocol V2 requires that we send a response header
     -+	 * with a "new current token" and then all of the paths that changed
     -+	 * since the "requested token".
     ++	 * The client request is relative to the token that they sent,
     ++	 * so walk the batch list backwards from the current head back
     ++	 * to the batch (sequence number) they named.
     ++	 *
     ++	 * We use khash to de-dup the list of pathnames.
     ++	 *
     ++	 * NEEDSWORK: each batch contains a list of interned strings,
     ++	 * so we only need to do pointer comparisons here to build the
     ++	 * hash table.  Currently, we're still comparing the string
     ++	 * values.
      +	 */
     -+	fsmonitor_format_response_token(&response_token,
     -+					&token_data->token_id,
     -+					batch_head);
     -+
     -+	reply(reply_data, response_token.buf, response_token.len + 1);
     -+	total_response_len += response_token.len + 1;
     -+
     -+	trace2_data_string("fsmonitor", the_repository, "response/token",
     -+			   response_token.buf);
     -+	trace_printf_key(&trace_fsmonitor, "response token: %s", response_token.buf);
     -+
      +	shown = kh_init_str();
      +	for (batch = batch_head;
     -+	     batch && batch->batch_seq_nr >= requested_oldest_seq_nr;
     ++	     batch && batch->batch_seq_nr > requested_oldest_seq_nr;
      +	     batch = batch->next) {
      +		size_t k;
      +
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +	kh_release_str(shown);
      +
      +	pthread_mutex_lock(&state->main_lock);
     ++
      +	if (token_data->client_ref_count > 0)
      +		token_data->client_ref_count--;
      +
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +	strbuf_release(&payload);
      +
      +	return 0;
     -+
     -+send_trivial_response:
     -+	pthread_mutex_lock(&state->main_lock);
     -+	fsmonitor_format_response_token(&response_token,
     -+					&state->current_token_data->token_id,
     -+					state->current_token_data->batch_head);
     -+	pthread_mutex_unlock(&state->main_lock);
     -+
     -+	reply(reply_data, response_token.buf, response_token.len + 1);
     -+	trace2_data_string("fsmonitor", the_repository, "response/token",
     -+			   response_token.buf);
     -+	reply(reply_data, "/", 2);
     -+	trace2_data_intmax("fsmonitor", the_repository, "response/trivial", 1);
     -+
     -+	strbuf_release(&response_token);
     -+	strbuf_release(&requested_token_id);
     -+
     -+	return result;
     -+
     -+send_empty_response:
     -+	pthread_mutex_lock(&state->main_lock);
     -+	fsmonitor_format_response_token(&response_token,
     -+					&state->current_token_data->token_id,
     -+					NULL);
     -+	pthread_mutex_unlock(&state->main_lock);
     -+
     -+	reply(reply_data, response_token.buf, response_token.len + 1);
     -+	trace2_data_string("fsmonitor", the_repository, "response/token",
     -+			   response_token.buf);
     -+	trace2_data_intmax("fsmonitor", the_repository, "response/empty", 1);
     -+
     -+	strbuf_release(&response_token);
     -+	strbuf_release(&requested_token_id);
     -+
     -+	return 0;
      +}
      +
       static ipc_server_application_cb handle_client;
       
     - static int handle_client(void *data, const char *command,
     + static int handle_client(void *data,
     +@@ builtin/fsmonitor--daemon.c: static int handle_client(void *data,
       			 ipc_server_reply_cb *reply,
       			 struct ipc_server_reply_data *reply_data)
       {
     @@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon
      +	struct fsmonitor_daemon_state *state = data;
       	int result;
       
     + 	/*
     +@@ builtin/fsmonitor--daemon.c: static int handle_client(void *data,
     + 	if (command_len != strlen(command))
     + 		BUG("FSMonitor assumes text messages");
     + 
      +	trace_printf_key(&trace_fsmonitor, "requested token: %s", command);
      +
       	trace2_region_enter("fsmonitor", "handle_client", the_repository);
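
The V2 token handled above has the shape "builtin:<token_id>:<seq_nr>".
As a rough standalone illustration of that round trip (not the daemon's
code: the sketch_* helper names and the fixed-size buffers are
assumptions made here, where the patch uses strbufs and skip_prefix()),
formatting and parsing could look like this.  As in the patch, parse
errors return -1, and an empty sequence number is accepted as 0 because
strtoumax() consumes nothing.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write "builtin:<id>:<seq_nr>" into `out`. */
static int sketch_format_token(char *out, size_t out_size,
			       const char *id, uint64_t seq_nr)
{
	int n = snprintf(out, out_size, "builtin:%s:%" PRIu64, id, seq_nr);

	/* Treat truncation as an error. */
	return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/*
 * Hypothetical helper: split "builtin:<id>:<seq_nr>" into its parts.
 * Returns 0 on success and -1 on error.
 */
static int sketch_parse_token(const char *token, char *id, size_t id_size,
			      uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const char *p, *colon;
	char *end;
	size_t id_len;

	assert(token && id && id_size && seq_nr);

	*seq_nr = 0;
	id[0] = '\0';

	/* Anything without the prefix is a V1 timestamp or garbage. */
	if (strncmp(token, prefix, strlen(prefix)))
		return -1;
	p = token + strlen(prefix);

	/* The token id runs up to the next ':'. */
	colon = strchr(p, ':');
	if (!colon)
		return -1;
	id_len = (size_t)(colon - p);
	if (id_len >= id_size)
		return -1;
	memcpy(id, p, id_len);
	id[id_len] = '\0';

	/* The rest must be a decimal number with no trailing junk. */
	*seq_nr = (uint64_t)strtoumax(colon + 1, &end, 10);
	if (*end)
		return -1;

	return 0;
}
```

A caller would treat any -1 from the parser the same way the daemon
treats an invalid token: fall back to a trivial response.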
 17:  9f263e70c724 ! 21:  f0da90e9b050 fsmonitor--daemon: periodically truncate list of modified files
     @@ Commit message
          relative to this new token).
      
          Therefore, the daemon can gradually truncate the in-memory list of
     -    changed paths as they become obsolete (older that the previous token).
     +    changed paths as they become obsolete (older than the previous token).
          Since we may have multiple clients making concurrent requests with a
          skew of tokens and clients may be racing to talk to the daemon,
          we lazily truncate the list.
     @@ builtin/fsmonitor--daemon.c: static void fsmonitor_batch__combine(struct fsmonit
      + * artificial (based on when we pinned the batch item) and not on any
      + * filesystem activity.
      + */
     -+#define MY_TIME_DELAY (5 * 60) /* seconds */
     ++#define MY_TIME_DELAY_SECONDS (5 * 60) /* seconds */
      +
     -+static void fsmonitor_batch__truncate(struct fsmonitor_daemon_state *state,
     -+				      const struct fsmonitor_batch *batch_marker)
     ++static void with_lock__truncate_old_batches(
     ++	struct fsmonitor_daemon_state *state,
     ++	const struct fsmonitor_batch *batch_marker)
      +{
     -+	/* assert state->main_lock */
     ++	/* assert current thread holding state->main_lock */
      +
      +	const struct fsmonitor_batch *batch;
      +	struct fsmonitor_batch *rest;
      +	struct fsmonitor_batch *p;
     -+	time_t t;
      +
      +	if (!batch_marker)
      +		return;
      +
     -+	trace_printf_key(&trace_fsmonitor, "TRNC mark (%"PRIu64",%"PRIu64")",
     ++	trace_printf_key(&trace_fsmonitor, "Truncate: mark (%"PRIu64",%"PRIu64")",
      +			 batch_marker->batch_seq_nr,
      +			 (uint64_t)batch_marker->pinned_time);
      +
      +	for (batch = batch_marker; batch; batch = batch->next) {
     ++		time_t t;
     ++
      +		if (!batch->pinned_time) /* an overflow batch */
      +			continue;
      +
     -+		t = batch->pinned_time + MY_TIME_DELAY;
     ++		t = batch->pinned_time + MY_TIME_DELAY_SECONDS;
      +		if (t > batch_marker->pinned_time) /* too close to marker */
      +			continue;
      +
     @@ builtin/fsmonitor--daemon.c: static void fsmonitor_batch__combine(struct fsmonit
      +	rest = ((struct fsmonitor_batch *)batch)->next;
      +	((struct fsmonitor_batch *)batch)->next = NULL;
      +
     -+	for (p = rest; p; p = fsmonitor_batch__free(p)) {
     ++	for (p = rest; p; p = fsmonitor_batch__pop(p)) {
      +		trace_printf_key(&trace_fsmonitor,
     -+				 "TRNC kill (%"PRIu64",%"PRIu64")",
     ++				 "Truncate: kill (%"PRIu64",%"PRIu64")",
      +				 p->batch_seq_nr, (uint64_t)p->pinned_time);
      +	}
      +}
     @@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon
      +			 * obsolete.  See if we can truncate the list
      +			 * and save some memory.
      +			 */
     -+			fsmonitor_batch__truncate(state, batch);
     ++			with_lock__truncate_old_batches(state, batch);
       		}
       	}
       
 18:  c6d5f045fb56 <  -:  ------------ fsmonitor--daemon:: introduce client delay for testing
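
The lazy truncation in 21/23 can be pictured with a simplified model.
The sketch below is an assumption-laden illustration, not the daemon's
implementation: struct sketch_batch, sketch_push() and sketch_truncate()
are hypothetical names, the list is a bare singly linked list running
from newest to oldest, and the daemon's locking and token bookkeeping
are left out.  It keeps every batch up to and including the first one
that was pinned at least five minutes before the marker and frees
everything older than that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define SKETCH_DELAY_SECONDS (5 * 60)

/* Hypothetical stand-in for a batch of changed paths. */
struct sketch_batch {
	struct sketch_batch *next; /* points toward older batches */
	uint64_t seq_nr;
	time_t pinned_time;        /* 0 means "never pinned" */
};

/* Hypothetical helper: prepend a new (newest) batch to the list. */
static struct sketch_batch *sketch_push(struct sketch_batch *head,
					uint64_t seq_nr, time_t pinned_time)
{
	struct sketch_batch *b = malloc(sizeof(*b));

	assert(b);
	b->next = head;
	b->seq_nr = seq_nr;
	b->pinned_time = pinned_time;
	return b;
}

/*
 * Walk from `marker` toward older batches.  Stop at the first batch
 * pinned at least SKETCH_DELAY_SECONDS before the marker, cut the
 * list after it and free the remainder.  Returns the number of
 * batches freed.
 */
static size_t sketch_truncate(struct sketch_batch *marker)
{
	struct sketch_batch *b, *rest;
	size_t nr_freed = 0;

	if (!marker)
		return 0;

	for (b = marker; b; b = b->next) {
		if (!b->pinned_time) /* never pinned, skip it */
			continue;
		if (b->pinned_time + SKETCH_DELAY_SECONDS >
		    marker->pinned_time) /* too close to marker */
			continue;
		break;
	}
	if (!b)
		return 0; /* nothing is old enough yet */

	rest = b->next;
	b->next = NULL;
	while (rest) {
		struct sketch_batch *next = rest->next;

		free(rest);
		rest = next;
		nr_freed++;
	}
	return nr_freed;
}
```

Because the cut happens after the first sufficiently old batch rather
than at the marker itself, a client that is slightly behind can still
be answered from the retained batches.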
 19:  038b62dc6744 ! 22:  bb7b1912bb47 fsmonitor--daemon: use a cookie file to sync with file system
     @@ Commit message
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## builtin/fsmonitor--daemon.c ##
     -@@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
     - 	return 0;
     +@@ builtin/fsmonitor--daemon.c: static int do_as_client__status(void)
     + 	}
       }
       
      +enum fsmonitor_cookie_item_result {
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	return strcmp(a->name, keydata ? keydata : b->name);
      +}
      +
     -+static enum fsmonitor_cookie_item_result fsmonitor_wait_for_cookie(
     ++static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
      +	struct fsmonitor_daemon_state *state)
      +{
     ++	/* assert current thread holding state->main_lock */
     ++
      +	int fd;
     -+	struct fsmonitor_cookie_item cookie;
     ++	struct fsmonitor_cookie_item *cookie;
      +	struct strbuf cookie_pathname = STRBUF_INIT;
      +	struct strbuf cookie_filename = STRBUF_INIT;
     -+	const char *slash;
     ++	enum fsmonitor_cookie_item_result result;
      +	int my_cookie_seq;
      +
     -+	pthread_mutex_lock(&state->main_lock);
     ++	CALLOC_ARRAY(cookie, 1);
      +
      +	my_cookie_seq = state->cookie_seq++;
      +
     ++	strbuf_addf(&cookie_filename, "%i-%i", getpid(), my_cookie_seq);
     ++
      +	strbuf_addbuf(&cookie_pathname, &state->path_cookie_prefix);
     -+	strbuf_addf(&cookie_pathname, "%i-%i", getpid(), my_cookie_seq);
     -+
     -+	slash = find_last_dir_sep(cookie_pathname.buf);
     -+	if (slash)
     -+		strbuf_addstr(&cookie_filename, slash + 1);
     -+	else
     -+		strbuf_addbuf(&cookie_filename, &cookie_pathname);
     -+	cookie.name = strbuf_detach(&cookie_filename, NULL);
     -+	cookie.result = FCIR_INIT;
     -+	// TODO should we have case-insenstive hash (and in cookie_cmp()) ??
     -+	hashmap_entry_init(&cookie.entry, strhash(cookie.name));
     ++	strbuf_addbuf(&cookie_pathname, &cookie_filename);
      +
     -+	/*
     -+	 * Warning: we are putting the address of a stack variable into a
     -+	 * global hashmap.  This feels dodgy.  We must ensure that we remove
     -+	 * it before this thread and stack frame returns.
     -+	 */
     -+	hashmap_add(&state->cookies, &cookie.entry);
     ++	cookie->name = strbuf_detach(&cookie_filename, NULL);
     ++	cookie->result = FCIR_INIT;
     ++	hashmap_entry_init(&cookie->entry, strhash(cookie->name));
     ++
     ++	hashmap_add(&state->cookies, &cookie->entry);
      +
      +	trace_printf_key(&trace_fsmonitor, "cookie-wait: '%s' '%s'",
     -+			 cookie.name, cookie_pathname.buf);
     ++			 cookie->name, cookie_pathname.buf);
      +
      +	/*
      +	 * Create the cookie file on disk and then wait for a notification
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
      +	if (fd >= 0) {
      +		close(fd);
     -+		unlink_or_warn(cookie_pathname.buf);
     ++		unlink(cookie_pathname.buf);
      +
     -+		while (cookie.result == FCIR_INIT)
     ++		/*
     ++		 * NEEDSWORK: This is an infinite wait (well, unless another
     ++		 * thread sends us an abort).  I'd like to change this to
     ++		 * use `pthread_cond_timedwait()` and return an error/timeout
     ++		 * and let the caller do the trivial response thing.
     ++		 */
     ++		while (cookie->result == FCIR_INIT)
      +			pthread_cond_wait(&state->cookies_cond,
      +					  &state->main_lock);
     -+
     -+		hashmap_remove(&state->cookies, &cookie.entry, NULL);
      +	} else {
      +		error_errno(_("could not create fsmonitor cookie '%s'"),
     -+			    cookie.name);
     ++			    cookie->name);
      +
     -+		cookie.result = FCIR_ERROR;
     -+		hashmap_remove(&state->cookies, &cookie.entry, NULL);
     ++		cookie->result = FCIR_ERROR;
      +	}
      +
     -+	pthread_mutex_unlock(&state->main_lock);
     ++	hashmap_remove(&state->cookies, &cookie->entry, NULL);
     ++
     ++	result = cookie->result;
      +
     -+	free((char*)cookie.name);
     ++	free((char*)cookie->name);
     ++	free(cookie);
      +	strbuf_release(&cookie_pathname);
     -+	return cookie.result;
     ++
     ++	return result;
      +}
      +
      +/*
      + * Mark these cookies as _SEEN and wake up the corresponding client threads.
      + */
     -+static void fsmonitor_cookie_mark_seen(struct fsmonitor_daemon_state *state,
     -+				       const struct string_list *cookie_names)
     ++static void with_lock__mark_cookies_seen(struct fsmonitor_daemon_state *state,
     ++					 const struct string_list *cookie_names)
      +{
     -+	/* assert state->main_lock */
     ++	/* assert current thread holding state->main_lock */
      +
      +	int k;
      +	int nr_seen = 0;
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +/*
      + * Set _ABORT on all pending cookies and wake up all client threads.
      + */
     -+static void fsmonitor_cookie_abort_all(struct fsmonitor_daemon_state *state)
     ++static void with_lock__abort_all_cookies(struct fsmonitor_daemon_state *state)
      +{
     -+	/* assert state->main_lock */
     ++	/* assert current thread holding state->main_lock */
      +
      +	struct hashmap_iter iter;
      +	struct fsmonitor_cookie_item *cookie;
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__send_flush(void)
      +		pthread_cond_broadcast(&state->cookies_cond);
      +}
      +
     - static int lookup_client_test_delay(void)
     - {
     - 	static int delay_ms = -1;
     + /*
     +  * Requests to and from a FSMonitor Protocol V2 provider use an opaque
     +  * "token" as a virtual timestamp.  Clients can request a summary of all
      @@ builtin/fsmonitor--daemon.c: static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
        *     We should create a new token and start fresh (as if we just
        *     booted up).
     @@ builtin/fsmonitor--daemon.c: static void fsmonitor_free_token_data(struct fsmoni
      + * [2] Some of those lost events may have been for cookie files.  We
      + *     should assume the worst and abort them rather than letting them starve.
      + *
     -  * If there are no readers of the the current token data series, we
     -  * can free it now.  Otherwise, let the last reader free it.  Either
     -  * way, the old token data series is no longer associated with our
     -@@ builtin/fsmonitor--daemon.c: void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
     - 			 state->current_token_data->token_id.buf,
     - 			 new_one->token_id.buf);
     +  * If there are no concurrent threads reading the current token data
     +  * series, we can free it now.  Otherwise, let the last reader free
     +  * it.
     +@@ builtin/fsmonitor--daemon.c: static void with_lock__do_force_resync(struct fsmonitor_daemon_state *state)
     + 	state->current_token_data = new_one;
       
     -+	fsmonitor_cookie_abort_all(state);
     + 	fsmonitor_free_token_data(free_me);
      +
     - 	if (state->current_token_data->client_ref_count == 0)
     - 		free_me = state->current_token_data;
     - 	state->current_token_data = new_one;
     ++	with_lock__abort_all_cookies(state);
     + }
     + 
     + void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
      @@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     - 	kh_str_t *shown;
       	int hash_ret;
     - 	int result;
     + 	int do_trivial = 0;
     + 	int do_flush = 0;
     ++	int do_cookie = 0;
      +	enum fsmonitor_cookie_item_result cookie_result;
       
       	/*
       	 * We expect `command` to be of the form:
      @@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     - 		goto send_trivial_response;
     + 		 */
     + 		do_flush = 1;
     + 		do_trivial = 1;
     ++		do_cookie = 1;
     + 
     + 	} else if (!skip_prefix(command, "builtin:", &p)) {
     + 		/* assume V1 timestamp or garbage */
     +@@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     + 				  "fsmonitor: unsupported V1 protocol '%s'"),
     + 				 command);
     + 		do_trivial = 1;
     ++		do_cookie = 1;
     + 
     + 	} else {
     + 		/* We have "builtin:*" */
     +@@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     + 					 "fsmonitor: invalid V2 protocol token '%s'",
     + 					 command);
     + 			do_trivial = 1;
     ++			do_cookie = 1;
     + 
     + 		} else {
     + 			/*
     + 			 * We have a V2 valid token:
     + 			 *     "builtin:<token_id>:<seq_nr>"
     + 			 */
     ++			do_cookie = 1;
     + 		}
       	}
       
     -+	pthread_mutex_unlock(&state->main_lock);
     -+
     +@@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     + 	if (!state->current_token_data)
     + 		BUG("fsmonitor state does not have a current token");
     + 
      +	/*
     -+	 * Write a cookie file inside the directory being watched in an
     -+	 * effort to flush out existing filesystem events that we actually
     -+	 * care about.  Suspend this client thread until we see the filesystem
     -+	 * events for this cookie file.
     ++	 * Write a cookie file inside the directory being watched in
     ++	 * an effort to flush out existing filesystem events that we
     ++	 * actually care about.  Suspend this client thread until we
     ++	 * see the filesystem events for this cookie file.
     ++	 *
     ++	 * Creating the cookie lets us guarantee that our FS listener
     ++	 * thread has drained the kernel queue and we are caught up
     ++	 * with the kernel.
     ++	 *
     ++	 * If we cannot create the cookie (or otherwise guarantee that
     ++	 * we are caught up), we send a trivial response.  We have to
     ++	 * assume that there might be some very, very recent activity
     ++	 * on the FS still in flight.
      +	 */
     -+	cookie_result = fsmonitor_wait_for_cookie(state);
     -+	if (cookie_result != FCIR_SEEN) {
     -+		error(_("fsmonitor: cookie_result '%d' != SEEN"),
     -+		      cookie_result);
     -+		result = 0;
     -+		goto send_trivial_response;
     -+	}
     -+
     -+	pthread_mutex_lock(&state->main_lock);
     -+
     -+	if (strcmp(requested_token_id.buf,
     -+		   state->current_token_data->token_id.buf)) {
     -+		/*
     -+		 * Ack! The listener thread lost sync with the filesystem
     -+		 * and created a new token while we were waiting for the
     -+		 * cookie file to be created!  Just give up.
     -+		 */
     -+		pthread_mutex_unlock(&state->main_lock);
     -+
     -+		trace_printf_key(&trace_fsmonitor,
     -+				 "lost filesystem sync");
     -+		result = 0;
     -+		goto send_trivial_response;
     ++	if (do_cookie) {
     ++		cookie_result = with_lock__wait_for_cookie(state);
     ++		if (cookie_result != FCIR_SEEN) {
     ++			error(_("fsmonitor: cookie_result '%d' != SEEN"),
     ++			      cookie_result);
     ++			do_trivial = 1;
     ++		}
      +	}
      +
     - 	/*
     - 	 * We're going to hold onto a pointer to the current
     - 	 * token-data while we walk the list of batches of files.
     + 	if (do_flush)
     + 		with_lock__do_force_resync(state);
     + 
     +@@ builtin/fsmonitor--daemon.c: static int handle_client(void *data,
     + 	return result;
     + }
     + 
     +-#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
     ++#define FSMONITOR_DIR           "fsmonitor--daemon"
     ++#define FSMONITOR_COOKIE_DIR    "cookies"
     ++#define FSMONITOR_COOKIE_PREFIX (FSMONITOR_DIR "/" FSMONITOR_COOKIE_DIR "/")
     + 
     + enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
     + 	const char *rel)
      @@ builtin/fsmonitor--daemon.c: void fsmonitor_publish(struct fsmonitor_daemon_state *state,
       		}
       	}
       
      +	if (cookie_names->nr)
     -+		fsmonitor_cookie_mark_seen(state, cookie_names);
     ++		with_lock__mark_cookies_seen(state, cookie_names);
      +
       	pthread_mutex_unlock(&state->main_lock);
       }
     @@ builtin/fsmonitor--daemon.c: static int fsmonitor_run_daemon(void)
      +	pthread_cond_init(&state.cookies_cond, NULL);
       	state.error_code = 0;
       	state.current_token_data = fsmonitor_new_token_data();
     - 	state.test_client_delay_ms = lookup_client_test_delay();
     + 
      @@ builtin/fsmonitor--daemon.c: static int fsmonitor_run_daemon(void)
       		state.nr_paths_watching = 2;
       	}
       
      +	/*
      +	 * We will write filesystem syncing cookie files into
     -+	 * <gitdir>/<cookie-prefix><pid>-<seq>.
     ++	 * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>.
      +	 */
      +	strbuf_init(&state.path_cookie_prefix, 0);
      +	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
     ++
     ++	strbuf_addch(&state.path_cookie_prefix, '/');
     ++	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_DIR);
     ++	mkdir(state.path_cookie_prefix.buf, 0777);
     ++
     ++	strbuf_addch(&state.path_cookie_prefix, '/');
     ++	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_DIR);
     ++	mkdir(state.path_cookie_prefix.buf, 0777);
     ++
      +	strbuf_addch(&state.path_cookie_prefix, '/');
     -+	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_PREFIX);
      +
       	/*
       	 * Confirm that we can create platform-specific resources for the
     @@ builtin/fsmonitor--daemon.c: static int fsmonitor_run_daemon(void)
       	strbuf_release(&state.path_worktree_watch);
       	strbuf_release(&state.path_gitdir_watch);
      +	strbuf_release(&state.path_cookie_prefix);
     ++
     ++	/*
     ++	 * NEEDSWORK: Consider "rm -rf <gitdir>/<fsmonitor-dir>"
     ++	 */
       
       	return err;
       }
 20:  d699ad597d2c ! 23:  102e17cbc875 fsmonitor: force update index when fsmonitor token advances
     @@ Metadata
      Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    fsmonitor: force update index when fsmonitor token advances
     -
     -    Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when the
     -    fsmonitor response contains a different token to ensure that the index
     -    is written to disk.
     -
     -    Normally, when the fsmonitor response includes a tracked file, the
     -    index is always updated.  Similarly, the index might be updated when
     -    the response alters the untracked-cache (when enabled).  However, in
     -    cases where neither of those cause the index to be considered changed,
     -    the fsmonitor response is wasted.  And subsequent commands will
     -    continue to make requests with the same token and if there have not
     -    been any changes in the working directory, they will receive the same
     -    response.
     -
     -    This was observed on Windows after a large checkout.  On Windows, the
     -    kernel emits events for the files that are changed as they are
     -    changed.  However, it might delay events for the containing
     -    directories until the system is more idle (or someone scans the
     -    directory (so it seems)).  The first status following a checkout would
     -    get the list of files.  The subsequent status commands would get the
     -    list of directories as the events trickled out.  But they would never
     -    catch up because the token was not advanced because the index wasn't
     -    updated.
     -
     -    This list of directories caused `wt_status_collect_untracked()` to
     -    unnecessarily spend time actually scanning them during each command.
     +    fsmonitor: enhance existing comments
      
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## fsmonitor.c ##
      @@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
       	}
     - 	strbuf_release(&query_result);
       
     + apply_results:
     +-	/* a fsmonitor process can return '/' to indicate all entries are invalid */
      +	/*
     -+	 * If the fsmonitor response and the subsequent scan of the disk
     -+	 * did not cause the in-memory index to be marked dirty, then force
     -+	 * it so that we advance the fsmonitor token in our extension, so
     -+	 * that future requests don't keep re-requesting the same range.
     ++	 * The response from FSMonitor (excluding the header token) is
     ++	 * either:
     ++	 *
     ++	 * [a] a (possibly empty) list of NUL delimited relative
     ++	 *     pathnames of changed paths.  This list can contain
     ++	 *     files and directories.  Directories have a trailing
     ++	 *     slash.
     ++	 *
     ++	 * [b] a single '/' to indicate the provider had no
     ++	 *     information and that we should consider everything
     ++	 *     invalid.  We call this a trivial response.
      +	 */
     -+	if (istate->fsmonitor_last_update &&
     -+	    strcmp(istate->fsmonitor_last_update, last_update_token.buf))
     -+		istate->cache_changed |= FSMONITOR_CHANGED;
     + 	if (query_success && query_result.buf[bol] != '/') {
     +-		/* Mark all entries returned by the monitor as dirty */
     ++		/*
     ++		 * Mark all pathnames returned by the monitor as dirty.
     ++		 *
     ++		 * This updates both the cache-entries and the untracked-cache.
     ++		 */
     + 		buf = query_result.buf;
     + 		for (i = bol; i < query_result.len; i++) {
     + 			if (buf[i] != '\0')
     +@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     + 		if (istate->untracked)
     + 			istate->untracked->use_fsmonitor = 1;
     + 	} else {
     +-
     +-		/* We only want to run the post index changed hook if we've actually changed entries, so keep track
     +-		 * if we actually changed entries or not */
     ++		/*
     ++		 * We received a trivial response, so invalidate everything.
     ++		 *
     ++		 * We only want to run the post index changed hook if
     ++		 * we've actually changed entries, so keep track if we
     ++		 * actually changed entries or not.
     ++		 */
     + 		int is_cache_changed = 0;
     +-		/* Mark all entries invalid */
      +
     - 	/* Now that we've updated istate, save the last_update_token */
     - 	FREE_AND_NULL(istate->fsmonitor_last_update);
     - 	istate->fsmonitor_last_update = strbuf_detach(&last_update_token, NULL);
     + 		for (i = 0; i < istate->cache_nr; i++) {
     + 			if (istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) {
     + 				is_cache_changed = 1;
     +@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     + 			}
     + 		}
     + 
     +-		/* If we're going to check every file, ensure we save the results */
     ++		/*
     ++		 * If we're going to check every file, ensure we save
     ++		 * the results.
     ++		 */
     + 		if (is_cache_changed)
     + 			istate->cache_changed |= FSMONITOR_CHANGED;
     + 
  -:  ------------ > 24:  11ea2f97def6 fsmonitor: force update index after large responses
 21:  8b2280e5c4d2 ! 25:  c9159db718a7 t7527: create test for fsmonitor--daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +. ./test-lib.sh
      +
     -+# Ask the fsmonitor daemon to insert a little delay before responding to
     -+# client commands like `git status` and `git fsmonitor--daemon --query` to
     -+# allow recent filesystem events to be received by the daemon.  This helps
     -+# the CI/PR builds be more stable.
     -+#
     -+# An arbitrary millisecond value.
     -+#
     -+GIT_TEST_FSMONITOR_CLIENT_DELAY=1000
     -+export GIT_TEST_FSMONITOR_CLIENT_DELAY
     -+
      +git version --build-options | grep "feature:" | grep "fsmonitor--daemon" || {
      +	skip_all="The built-in FSMonitor is not supported on this platform"
      +	test_done
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +kill_repo () {
      +	r=$1
     -+	git -C $r fsmonitor--daemon --stop >/dev/null 2>/dev/null
     ++	git -C $r fsmonitor--daemon stop >/dev/null 2>/dev/null
      +	rm -rf $1
      +	return 0
      +}
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +		*) r="";
      +	esac
      +
     -+	git $r fsmonitor--daemon --start || return $?
     -+	git $r fsmonitor--daemon --is-running || return $?
     ++	git $r fsmonitor--daemon start || return $?
     ++	git $r fsmonitor--daemon status || return $?
      +
      +	return 0
      +}
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	git init test_explicit &&
      +	start_daemon test_explicit &&
      +
     -+	git -C test_explicit fsmonitor--daemon --stop &&
     -+	test_must_fail git -C test_explicit fsmonitor--daemon --is-running
     ++	git -C test_explicit fsmonitor--daemon stop &&
     ++	test_must_fail git -C test_explicit fsmonitor--daemon status
      +'
      +
      +test_expect_success 'implicit daemon start' '
      +	test_when_finished "kill_repo test_implicit" &&
      +
      +	git init test_implicit &&
     -+	test_must_fail git -C test_implicit fsmonitor--daemon --is-running &&
     ++	test_must_fail git -C test_implicit fsmonitor--daemon status &&
      +
      +	# query will implicitly start the daemon.
      +	#
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	# implicitly started.)
      +
      +	GIT_TRACE2_EVENT="$PWD/.git/trace" \
     -+		git -C test_implicit fsmonitor--daemon --query 0 >actual &&
     ++		test-tool -C test_implicit fsmonitor-client query --token 0 >actual &&
      +	nul_to_q <actual >actual.filtered &&
      +	grep "builtin:" actual.filtered &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	grep :\"query/response-length\" .git/trace &&
      +
     -+	git -C test_implicit fsmonitor--daemon --is-running &&
     -+	git -C test_implicit fsmonitor--daemon --stop &&
     -+	test_must_fail git -C test_implicit fsmonitor--daemon --is-running
     ++	git -C test_implicit fsmonitor--daemon status &&
     ++	git -C test_implicit fsmonitor--daemon stop &&
     ++	test_must_fail git -C test_implicit fsmonitor--daemon status
      +'
      +
      +test_expect_success 'implicit daemon stop (delete .git)' '
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	sleep 1 &&
      +	mkdir test_implicit_1/.git &&
      +
     -+	test_must_fail git -C test_implicit_1 fsmonitor--daemon --is-running
     ++	test_must_fail git -C test_implicit_1 fsmonitor--daemon status
      +'
      +
      +test_expect_success 'implicit daemon stop (rename .git)' '
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	sleep 1 &&
      +	mkdir test_implicit_2/.git &&
      +
     -+	test_must_fail git -C test_implicit_2 fsmonitor--daemon --is-running
     ++	test_must_fail git -C test_implicit_2 fsmonitor--daemon status
      +'
      +
      +test_expect_success 'cannot start multiple daemons' '
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	start_daemon test_multiple &&
      +
     -+	test_must_fail git -C test_multiple fsmonitor--daemon --start 2>actual &&
     ++	test_must_fail git -C test_multiple fsmonitor--daemon start 2>actual &&
      +	grep "fsmonitor--daemon is already running" actual &&
      +
     -+	git -C test_multiple fsmonitor--daemon --stop &&
     -+	test_must_fail git -C test_multiple fsmonitor--daemon --is-running
     ++	git -C test_multiple fsmonitor--daemon stop &&
     ++	test_must_fail git -C test_multiple fsmonitor--daemon status
      +'
      +
      +test_expect_success 'setup' '
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'update-index implicitly starts daemon' '
     -+	test_must_fail git fsmonitor--daemon --is-running &&
     ++	test_must_fail git fsmonitor--daemon status &&
      +
      +	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_1" \
      +		git update-index --fsmonitor &&
      +
     -+	git fsmonitor--daemon --is-running &&
     -+	test_might_fail git fsmonitor--daemon --stop &&
     ++	git fsmonitor--daemon status &&
     ++	test_might_fail git fsmonitor--daemon stop &&
      +
      +	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_1
      +'
      +
      +test_expect_success 'status implicitly starts daemon' '
     -+	test_must_fail git fsmonitor--daemon --is-running &&
     ++	test_must_fail git fsmonitor--daemon status &&
      +
      +	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_2" \
      +		git status >actual &&
      +
     -+	git fsmonitor--daemon --is-running &&
     -+	test_might_fail git fsmonitor--daemon --stop &&
     ++	git fsmonitor--daemon status &&
     ++	test_might_fail git fsmonitor--daemon stop &&
      +
      +	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_2
      +'
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +clean_up_repo_and_stop_daemon () {
      +	git reset --hard HEAD
      +	git clean -fd
     -+	git fsmonitor--daemon --stop
     ++	git fsmonitor--daemon stop
      +	rm -f .git/trace
      +}
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	edit_files &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: dir1/modified$"  .git/trace &&
      +	grep "^event: dir2/modified$"  .git/trace &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	create_files &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: dir1/new$" .git/trace &&
      +	grep "^event: dir2/new$" .git/trace &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	delete_files &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: dir1/delete$" .git/trace &&
      +	grep "^event: dir2/delete$" .git/trace &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	rename_files &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: dir1/rename$"  .git/trace &&
      +	grep "^event: dir2/rename$"  .git/trace &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	mv dirtorename dirrenamed &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: dirtorename/*$" .git/trace &&
      +	grep "^event: dirrenamed/*$"  .git/trace
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	file_to_directory &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: delete$"     .git/trace &&
      +	grep "^event: delete/new$" .git/trace
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +	directory_to_file &&
      +
     -+	git fsmonitor--daemon --query 0 >/dev/null 2>&1 &&
     ++	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
      +
      +	grep "^event: dir1$" .git/trace
      +'
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +# polling fast enough), we need to discard the cached data (relative to the
      +# current token) and start collecting events under a new token.
      +#
     -+# the 'git fsmonitor--daemon --flush' command can be used to send a "flush"
     -+# message to a running daemon and ask it to do a flush/resync.
     ++# the 'test-tool fsmonitor-client flush' command can be used to send a
     ++# "flush" message to a running daemon and ask it to do a flush/resync.
      +
      +test_expect_success 'flush cached data' '
      +	test_when_finished "kill_repo test_flush" &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	# then a few (probably platform-specific number of) events in _1.
      +	# These should both have the same <token_id>.
      +
     -+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000001:0" >actual_0 &&
     ++	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000001:0" >actual_0 &&
      +	nul_to_q <actual_0 >actual_q0 &&
      +
      +	touch test_flush/file_1 &&
      +	touch test_flush/file_2 &&
      +
     -+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000001:0" >actual_1 &&
     ++	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000001:0" >actual_1 &&
      +	nul_to_q <actual_1 >actual_q1 &&
      +
      +	grep "file_1" actual_q1 &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	# flush the file data.  Then create some events and ensure that the file
      +	# again appears in the cache.  It should have the new <token_id>.
      +
     -+	git -C test_flush fsmonitor--daemon --flush >flush_0 &&
     ++	test-tool -C test_flush fsmonitor-client flush >flush_0 &&
      +	nul_to_q <flush_0 >flush_q0 &&
      +	grep "^builtin:test_00000002:0Q/Q$" flush_q0 &&
      +
     -+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000002:0" >actual_2 &&
     ++	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000002:0" >actual_2 &&
      +	nul_to_q <actual_2 >actual_q2 &&
      +
      +	grep "^builtin:test_00000002:0Q$" actual_q2 &&
      +
      +	touch test_flush/file_3 &&
      +
     -+	git -C test_flush fsmonitor--daemon --query "builtin:test_00000002:0" >actual_3 &&
     ++	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000002:0" >actual_3 &&
      +	nul_to_q <actual_3 >actual_q3 &&
      +
      +	grep "file_3" actual_q3
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +		start_daemon wt-secondary
      +	) &&
      +
     -+	git -C wt-secondary fsmonitor--daemon --stop &&
     -+	test_must_fail git -C wt-secondary fsmonitor--daemon --is-running
     ++	git -C wt-secondary fsmonitor--daemon stop &&
     ++	test_must_fail git -C wt-secondary fsmonitor--daemon status
      +'
      +
      +test_done
 22:  da5094e52032 ! 26:  5b035c6e0d60 p7519: add fsmonitor--daemon
     @@ t/perf/p7519-fsmonitor.sh: test_expect_success "one time repo setup" '
       		#
       		# Choose integration script based on existence of Watchman.
       		# Fall back to an empty integration script.
     +@@ t/perf/p7519-fsmonitor.sh: test_perf_w_drop_caches () {
     + }
     + 
     + test_fsmonitor_suite() {
     +-	if test -n "$INTEGRATION_SCRIPT"; then
     ++	if test -n "$USE_FSMONITOR_DAEMON"
     ++	then
     ++		DESC="builtin fsmonitor--daemon"
     ++	elif test -n "$INTEGRATION_SCRIPT"; then
     + 		DESC="fsmonitor=$(basename $INTEGRATION_SCRIPT)"
     + 	else
     + 		DESC="fsmonitor=disabled"
      @@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup without fsmonitor" '
       test_fsmonitor_suite
       trace_stop
     @@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup without fsmonitor" '
      +	USE_FSMONITOR_DAEMON=t
      +
      +	trace_start fsmonitor--daemon--server
     -+	git fsmonitor--daemon --start
     ++	git fsmonitor--daemon start
      +
      +	trace_start fsmonitor--daemon--client
      +	test_expect_success "setup for fsmonitor--daemon" 'setup_for_fsmonitor'
      +	test_fsmonitor_suite
      +
     -+	git fsmonitor--daemon --stop
     ++	git fsmonitor--daemon stop
      +	trace_stop
      +fi
      +
 23:  3eafd0b5cb09 ! 27:  1483c68855cb t7527: test status with untracked-cache and fsmonitor--daemon
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'setup' '
       
       	git -c core.useBuiltinFSMonitor= add . &&
      @@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'worktree with .git file' '
     - 	test_must_fail git -C wt-secondary fsmonitor--daemon --is-running
     + 	test_must_fail git -C wt-secondary fsmonitor--daemon status
       '
       
      +# TODO Repeat one of the "edit" tests on wt-secondary and confirm that
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'worktree with .git file' '
      +test_expect_success 'Matrix: setup for untracked-cache,fsmonitor matrix' '
      +	test_might_fail git config --unset core.useBuiltinFSMonitor &&
      +	git update-index --no-fsmonitor &&
     -+	test_might_fail git fsmonitor--daemon --stop
     ++	test_might_fail git fsmonitor--daemon stop
      +'
      +
      +matrix_clean_up_repo () {
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'worktree with .git file' '
      +			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] disable fsmonitor" '
      +				test_might_fail git config --unset core.useBuiltinFSMonitor &&
      +				git update-index --no-fsmonitor &&
     -+				test_might_fail git fsmonitor--daemon --stop 2>/dev/null
     ++				test_might_fail git fsmonitor--daemon stop 2>/dev/null
      +			'
      +		else
      +			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] enable fsmonitor" '
      +				git config core.useBuiltinFSMonitor true &&
     -+				git fsmonitor--daemon --start &&
     ++				git fsmonitor--daemon start &&
      +				git update-index --fsmonitor
      +			'
      +		fi
  -:  ------------ > 28:  96a3eab819f4 t/perf: avoid copying builtin fsmonitor files into test repo

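As an illustration of the response format described in the fsmonitor.c
comment in the range-diff above, here is a minimal sketch of how a consumer
might distinguish a trivial response from a path list. This is not the code
in fsmonitor.c: the helper names `is_trivial_response()` and `count_paths()`
are hypothetical, and the sketch assumes that the header token has already
been skipped and that every pathname in the body is NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: `buf`/`len` hold the response body with the
 * header token already removed.  Relative pathnames never start with
 * '/', so a leading '/' marks the trivial "consider everything
 * invalid" case.
 */
static int is_trivial_response(const char *buf, size_t len)
{
	return len > 0 && buf[0] == '/';
}

/*
 * Hypothetical helper: count the NUL-delimited pathnames in a
 * non-trivial response body and report how many of them are
 * directories (that is, have a trailing slash).
 */
static size_t count_paths(const char *buf, size_t len, size_t *nr_dirs)
{
	size_t nr = 0, bol = 0, i;

	assert(nr_dirs);
	*nr_dirs = 0;

	for (i = 0; i < len; i++) {
		if (buf[i] != '\0')
			continue;
		if (i > bol) {
			/* buf[bol..i) is one non-empty pathname */
			nr++;
			if (buf[i - 1] == '/')
				(*nr_dirs)++;
		}
		bol = i + 1;
	}
	return nr;
}
```

A caller would check `is_trivial_response()` first and only walk the list
when it returns zero, mirroring the `query_result.buf[bol] != '/'` test in
the hunk above.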
-- 
gitgitgadget

* [PATCH v2 01/28] simple-ipc: preparations for supporting binary messages.
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 02/28] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
                     ` (28 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Add `command_len` argument to the Simple IPC API.

In my original Simple IPC API, I assumed that the request
would always be a null-terminated string of text characters.
The command arg was just a `const char *`.

I found a caller that would like to pass a binary command
to the daemon, so I want to amend the Simple IPC API to
take `const char *command, size_t command_len` and pass
that to the daemon.  (Really, the first arg should just be
a `void *` or `const unsigned char *` to make that clearer.)

Note, the response side has always been a `struct strbuf`
which includes the buffer and length, so we already support
returning a binary answer.  (Yes, it feels a little weird
returning a binary buffer in a `strbuf`, but it works.)

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
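Note: the following sketch is not part of the patch. It only illustrates,
under stated assumptions, why an explicit length matters once a command may
contain NUL bytes. The helpers `bytes_sent_old_api()`,
`bytes_sent_new_api()` and `command_is()` are hypothetical stand-ins and do
not exist in simple-ipc; `command_is()` uses memcmp() where the patch itself
uses strncmp() with a length check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the old API shape: the number of bytes that
 * would be sent when the length is derived with strlen().
 */
static size_t bytes_sent_old_api(const char *message)
{
	return strlen(message);
}

/*
 * Hypothetical stand-in for the new API shape: the caller supplies the
 * length, so embedded NUL bytes do not truncate the message.
 */
static size_t bytes_sent_new_api(const char *message, size_t message_len)
{
	assert(message);
	return message_len;
}

/*
 * Length-aware command match.  The request is not assumed to be
 * NUL-terminated, so compare exactly `command_len` bytes.
 */
static int command_is(const char *command, size_t command_len,
		      const char *name)
{
	size_t name_len = strlen(name);

	return command_len == name_len && !memcmp(command, name, name_len);
}
```

With a five-byte message such as { 'a', 'b', '\0', 'c', 'd' }, the
strlen()-based variant would stop at the embedded NUL, while the
length-based variant keeps all five bytes.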
 compat/simple-ipc/ipc-unix-socket.c | 14 +++++++-----
 compat/simple-ipc/ipc-win32.c       | 14 +++++++-----
 simple-ipc.h                        |  7 ++++--
 t/helper/test-simple-ipc.c          | 34 +++++++++++++++++++----------
 4 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/compat/simple-ipc/ipc-unix-socket.c b/compat/simple-ipc/ipc-unix-socket.c
index 38689b278df3..0a2d1c1162b9 100644
--- a/compat/simple-ipc/ipc-unix-socket.c
+++ b/compat/simple-ipc/ipc-unix-socket.c
@@ -164,7 +164,8 @@ void ipc_client_close_connection(struct ipc_client_connection *connection)
 
 int ipc_client_send_command_to_connection(
 	struct ipc_client_connection *connection,
-	const char *message, struct strbuf *answer)
+	const char *message, size_t message_len,
+	struct strbuf *answer)
 {
 	int ret = 0;
 
@@ -172,7 +173,7 @@ int ipc_client_send_command_to_connection(
 
 	trace2_region_enter("ipc-client", "send-command", NULL);
 
-	if (write_packetized_from_buf_no_flush(message, strlen(message),
+	if (write_packetized_from_buf_no_flush(message, message_len,
 					       connection->fd) < 0 ||
 	    packet_flush_gently(connection->fd) < 0) {
 		ret = error(_("could not send IPC command"));
@@ -193,7 +194,8 @@ int ipc_client_send_command_to_connection(
 
 int ipc_client_send_command(const char *path,
 			    const struct ipc_client_connect_options *options,
-			    const char *message, struct strbuf *answer)
+			    const char *message, size_t message_len,
+			    struct strbuf *answer)
 {
 	int ret = -1;
 	enum ipc_active_state state;
@@ -204,7 +206,9 @@ int ipc_client_send_command(const char *path,
 	if (state != IPC_STATE__LISTENING)
 		return ret;
 
-	ret = ipc_client_send_command_to_connection(connection, message, answer);
+	ret = ipc_client_send_command_to_connection(connection,
+						    message, message_len,
+						    answer);
 
 	ipc_client_close_connection(connection);
 
@@ -499,7 +503,7 @@ static int worker_thread__do_io(
 	if (ret >= 0) {
 		ret = worker_thread_data->server_data->application_cb(
 			worker_thread_data->server_data->application_data,
-			buf.buf, do_io_reply_callback, &reply_data);
+			buf.buf, buf.len, do_io_reply_callback, &reply_data);
 
 		packet_flush_gently(reply_data.fd);
 	}
diff --git a/compat/simple-ipc/ipc-win32.c b/compat/simple-ipc/ipc-win32.c
index 8f89c02037e3..632fb3c7ea24 100644
--- a/compat/simple-ipc/ipc-win32.c
+++ b/compat/simple-ipc/ipc-win32.c
@@ -204,7 +204,8 @@ void ipc_client_close_connection(struct ipc_client_connection *connection)
 
 int ipc_client_send_command_to_connection(
 	struct ipc_client_connection *connection,
-	const char *message, struct strbuf *answer)
+	const char *message, size_t message_len,
+	struct strbuf *answer)
 {
 	int ret = 0;
 
@@ -212,7 +213,7 @@ int ipc_client_send_command_to_connection(
 
 	trace2_region_enter("ipc-client", "send-command", NULL);
 
-	if (write_packetized_from_buf_no_flush(message, strlen(message),
+	if (write_packetized_from_buf_no_flush(message, message_len,
 					       connection->fd) < 0 ||
 	    packet_flush_gently(connection->fd) < 0) {
 		ret = error(_("could not send IPC command"));
@@ -235,7 +236,8 @@ int ipc_client_send_command_to_connection(
 
 int ipc_client_send_command(const char *path,
 			    const struct ipc_client_connect_options *options,
-			    const char *message, struct strbuf *response)
+			    const char *message, size_t message_len,
+			    struct strbuf *response)
 {
 	int ret = -1;
 	enum ipc_active_state state;
@@ -246,7 +248,9 @@ int ipc_client_send_command(const char *path,
 	if (state != IPC_STATE__LISTENING)
 		return ret;
 
-	ret = ipc_client_send_command_to_connection(connection, message, response);
+	ret = ipc_client_send_command_to_connection(connection,
+						    message, message_len,
+						    response);
 
 	ipc_client_close_connection(connection);
 
@@ -454,7 +458,7 @@ static int do_io(struct ipc_server_thread_data *server_thread_data)
 	if (ret >= 0) {
 		ret = server_thread_data->server_data->application_cb(
 			server_thread_data->server_data->application_data,
-			buf.buf, do_io_reply_callback, &reply_data);
+			buf.buf, buf.len, do_io_reply_callback, &reply_data);
 
 		packet_flush_gently(reply_data.fd);
 
diff --git a/simple-ipc.h b/simple-ipc.h
index dc3606e30bd6..c4d5225b41c2 100644
--- a/simple-ipc.h
+++ b/simple-ipc.h
@@ -111,7 +111,8 @@ void ipc_client_close_connection(struct ipc_client_connection *connection);
  */
 int ipc_client_send_command_to_connection(
 	struct ipc_client_connection *connection,
-	const char *message, struct strbuf *answer);
+	const char *message, size_t message_len,
+	struct strbuf *answer);
 
 /*
  * Used by the client to synchronously connect and send and receive a
@@ -123,7 +124,8 @@ int ipc_client_send_command_to_connection(
  */
 int ipc_client_send_command(const char *path,
 			    const struct ipc_client_connect_options *options,
-			    const char *message, struct strbuf *answer);
+			    const char *message, size_t message_len,
+			    struct strbuf *answer);
 
 /*
  * Simple IPC Server Side API.
@@ -148,6 +150,7 @@ typedef int (ipc_server_reply_cb)(struct ipc_server_reply_data *,
  */
 typedef int (ipc_server_application_cb)(void *application_data,
 					const char *request,
+					size_t request_len,
 					ipc_server_reply_cb *reply_cb,
 					struct ipc_server_reply_data *reply_data);
 
diff --git a/t/helper/test-simple-ipc.c b/t/helper/test-simple-ipc.c
index 42040ef81b1e..913451807509 100644
--- a/t/helper/test-simple-ipc.c
+++ b/t/helper/test-simple-ipc.c
@@ -112,7 +112,7 @@ static int app__slow_command(ipc_server_reply_cb *reply_cb,
 /*
  * The client sent a command followed by a (possibly very) large buffer.
  */
-static int app__sendbytes_command(const char *received,
+static int app__sendbytes_command(const char *received, size_t received_len,
 				  ipc_server_reply_cb *reply_cb,
 				  struct ipc_server_reply_data *reply_data)
 {
@@ -123,6 +123,13 @@ static int app__sendbytes_command(const char *received,
 	int errs = 0;
 	int ret;
 
+	/*
+	 * The test is setup to send:
+	 *     "sendbytes" SP <n * char>
+	 */
+	if (received_len < strlen("sendbytes "))
+		BUG("received_len is short in app__sendbytes_command");
+
 	if (skip_prefix(received, "sendbytes ", &p))
 		len_ballast = strlen(p);
 
@@ -160,7 +167,7 @@ static ipc_server_application_cb test_app_cb;
  * by this application.
  */
 static int test_app_cb(void *application_data,
-		       const char *command,
+		       const char *command, size_t command_len,
 		       ipc_server_reply_cb *reply_cb,
 		       struct ipc_server_reply_data *reply_data)
 {
@@ -173,7 +180,7 @@ static int test_app_cb(void *application_data,
 	if (application_data != (void*)&my_app_data)
 		BUG("application_cb: application_data pointer wrong");
 
-	if (!strcmp(command, "quit")) {
+	if (command_len == 4 && !strncmp(command, "quit", 4)) {
 		/*
 		 * The client sent a "quit" command.  This is an async
 		 * request for the server to shutdown.
@@ -193,22 +200,23 @@ static int test_app_cb(void *application_data,
 		return SIMPLE_IPC_QUIT;
 	}
 
-	if (!strcmp(command, "ping")) {
+	if (command_len == 4 && !strncmp(command, "ping", 4)) {
 		const char *answer = "pong";
 		return reply_cb(reply_data, answer, strlen(answer));
 	}
 
-	if (!strcmp(command, "big"))
+	if (command_len == 3 && !strncmp(command, "big", 3))
 		return app__big_command(reply_cb, reply_data);
 
-	if (!strcmp(command, "chunk"))
+	if (command_len == 5 && !strncmp(command, "chunk", 5))
 		return app__chunk_command(reply_cb, reply_data);
 
-	if (!strcmp(command, "slow"))
+	if (command_len == 4 && !strncmp(command, "slow", 4))
 		return app__slow_command(reply_cb, reply_data);
 
-	if (starts_with(command, "sendbytes "))
-		return app__sendbytes_command(command, reply_cb, reply_data);
+	if (command_len >= 10 && starts_with(command, "sendbytes "))
+		return app__sendbytes_command(command, command_len,
+					      reply_cb, reply_data);
 
 	return app__unhandled_command(command, reply_cb, reply_data);
 }
@@ -488,7 +496,9 @@ static int client__send_ipc(void)
 	options.wait_if_busy = 1;
 	options.wait_if_not_found = 0;
 
-	if (!ipc_client_send_command(cl_args.path, &options, command, &buf)) {
+	if (!ipc_client_send_command(cl_args.path, &options,
+				     command, strlen(command),
+				     &buf)) {
 		if (buf.len) {
 			printf("%s\n", buf.buf);
 			fflush(stdout);
@@ -556,7 +566,9 @@ static int do_sendbytes(int bytecount, char byte, const char *path,
 	strbuf_addstr(&buf_send, "sendbytes ");
 	strbuf_addchars(&buf_send, byte, bytecount);
 
-	if (!ipc_client_send_command(path, options, buf_send.buf, &buf_resp)) {
+	if (!ipc_client_send_command(path, options,
+				     buf_send.buf, buf_send.len,
+				     &buf_resp)) {
 		strbuf_rtrim(&buf_resp);
 		printf("sent:%c%08d %s\n", byte, bytecount, buf_resp.buf);
 		fflush(stdout);
-- 
gitgitgadget
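
The hunks above replace strcmp() with an explicit length test followed by strncmp(), so that a command no longer has to be NUL-terminated. One way to express that pattern once is sketched below. This is only an illustration, not code from the series: `command_is()` and `command_has_prefix()` are hypothetical helper names, and memcmp() is used in place of strncmp() as an assumed equivalent once the lengths are known to match.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the `len` bytes at `command` are
 * exactly the NUL-terminated string `verb`, else 0.  The length is
 * compared first, so `command` need not be NUL-terminated.
 */
static int command_is(const char *command, size_t len, const char *verb)
{
	size_t verb_len = strlen(verb);

	return len == verb_len && !memcmp(command, verb, verb_len);
}

/*
 * Hypothetical helper: return 1 if the `len` bytes at `command` begin
 * with the NUL-terminated string `prefix`, else 0.
 */
static int command_has_prefix(const char *command, size_t len,
			      const char *prefix)
{
	size_t prefix_len = strlen(prefix);

	return len >= prefix_len && !memcmp(command, prefix, prefix_len);
}
```

With helpers like these, a test such as `command_len == 4 && !strncmp(command, "quit", 4)` could be written as `command_is(command, command_len, "quit")`, which keeps the literal and its length from drifting apart.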



* [PATCH v2 02/28] fsmonitor--daemon: man page
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 01/28] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 03/28] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
                     ` (27 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a manual page describing the `git fsmonitor--daemon` feature.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Documentation/git-fsmonitor--daemon.txt | 75 +++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 Documentation/git-fsmonitor--daemon.txt

diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
new file mode 100644
index 000000000000..154e7684daae
--- /dev/null
+++ b/Documentation/git-fsmonitor--daemon.txt
@@ -0,0 +1,75 @@
+git-fsmonitor--daemon(1)
+========================
+
+NAME
+----
+git-fsmonitor--daemon - A Built-in File System Monitor
+
+SYNOPSIS
+--------
+[verse]
+'git fsmonitor--daemon' start
+'git fsmonitor--daemon' run
+'git fsmonitor--daemon' stop
+'git fsmonitor--daemon' status
+
+DESCRIPTION
+-----------
+
+A daemon to watch the working directory for file and directory
+changes using platform-specific file system notification facilities.
+
+This daemon communicates directly with commands like `git status`
+using the link:technical/api-simple-ipc.html[simple IPC] interface
+instead of the slower linkgit:githooks[5] interface.
+
+This daemon is built into Git so that no third-party tools are
+required.
+
+OPTIONS
+-------
+
+start::
+	Starts a daemon in the background.
+
+run::
+	Runs a daemon in the foreground.
+
+stop::
+	Stops the daemon running in the current working
+	directory, if present.
+
+status::
+	Exits with zero status if a daemon is watching the
+	current working directory.
+
+REMARKS
+-------
+
+This daemon is a long-running process used to watch a single working
+directory and maintain a list of the recently changed files and
+directories.  Performance of commands such as `git status` can be
+increased if they just ask for a summary of changes to the working
+directory and can avoid scanning the disk.
+
+When `core.useBuiltinFSMonitor` is set to `true` (see
+linkgit:git-config[1]), commands such as `git status` will ask the
+daemon for changes and automatically start it (if necessary).
+
+For more information see the "File System Monitor" section in
+linkgit:git-update-index[1].
+
+CAVEATS
+-------
+
+The fsmonitor daemon does not currently know about submodules and does
+not know to filter out file system events that happen within a
+submodule.  If the fsmonitor daemon is watching a super repo and a file is
+modified within the working directory of a submodule, it will report
+the change (as happening against the super repo).  However, the client
+will properly ignore these extra events, so performance may be affected
+but it will not cause an incorrect result.
+
+GIT
+---
+Part of the linkgit:git[1] suite
-- 
gitgitgadget



* [PATCH v2 03/28] fsmonitor--daemon: update fsmonitor documentation
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 01/28] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 02/28] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
                     ` (26 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Update references to `core.fsmonitor` and `core.fsmonitorHookVersion` and
pointers to `Watchman` to mention the new built-in `fsmonitor--daemon`.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Documentation/config/core.txt      | 56 ++++++++++++++++++++++--------
 Documentation/git-update-index.txt | 27 +++++++-------
 Documentation/githooks.txt         |  3 +-
 3 files changed, 59 insertions(+), 27 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index c04f62a54a15..4f6e519bc025 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -62,22 +62,50 @@ core.protectNTFS::
 	Defaults to `true` on Windows, and `false` elsewhere.
 
 core.fsmonitor::
-	If set, the value of this variable is used as a command which
-	will identify all files that may have changed since the
-	requested date/time. This information is used to speed up git by
-	avoiding unnecessary processing of files that have not changed.
-	See the "fsmonitor-watchman" section of linkgit:githooks[5].
+	If set, this variable contains the pathname of the "fsmonitor"
+	hook command.
++
+This hook command is used to identify all files that may have changed
+since the requested date/time. This information is used to speed up
+git by avoiding unnecessary scanning of files that have not changed.
++
+See the "fsmonitor-watchman" section of linkgit:githooks[5].
++
+Note: The value of this config setting is ignored if the
+built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
 
 core.fsmonitorHookVersion::
-	Sets the version of hook that is to be used when calling fsmonitor.
-	There are currently versions 1 and 2. When this is not set,
-	version 2 will be tried first and if it fails then version 1
-	will be tried. Version 1 uses a timestamp as input to determine
-	which files have changes since that time but some monitors
-	like watchman have race conditions when used with a timestamp.
-	Version 2 uses an opaque string so that the monitor can return
-	something that can be used to determine what files have changed
-	without race conditions.
+	Sets the protocol version to be used when invoking the
+	"fsmonitor" hook.
++
+There are currently versions 1 and 2. When this is not set,
+version 2 will be tried first and if it fails then version 1
+will be tried. Version 1 uses a timestamp as input to determine
+which files have changed since that time, but some monitors
+like Watchman have race conditions when used with a timestamp.
+Version 2 uses an opaque string so that the monitor can return
+something that can be used to determine what files have changed
+without race conditions.
++
+Note: The value of this config setting is ignored if the
+built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
+
+core.useBuiltinFSMonitor::
+	If set to true, enable the built-in file system monitor
+	daemon for this working directory (linkgit:git-fsmonitor--daemon[1]).
++
+Like hook-based file system monitors, the built-in file system monitor
+can speed up Git commands that need to refresh the Git index
+(e.g. `git status`) in a working directory with many files.  The
+built-in monitor eliminates the need to install and maintain an
+external third-party tool.
++
+The built-in file system monitor is currently available only on a
+limited set of supported platforms.  Currently, this includes Windows
+and macOS.
++
+Note: if this config setting is set to `true`, the values of
+`core.fsmonitor` and `core.fsmonitorHookVersion` are ignored.
 
 core.trustctime::
 	If false, the ctime differences between the index and the
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 2853f168d976..c7c31b3fcf9c 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
 This feature is intended to speed up git operations for repos that have
 large working directories.
 
-It enables git to work together with a file system monitor (see the
+It enables git to work together with a file system monitor (see
+linkgit:git-fsmonitor--daemon[1]
+and the
 "fsmonitor-watchman" section of linkgit:githooks[5]) that can
 inform it as to what files have been modified. This enables git to avoid
 having to lstat() every file to find modified files.
@@ -508,17 +510,18 @@ performance by avoiding the cost of scanning the entire working directory
 looking for new files.
 
 If you want to enable (or disable) this feature, it is easier to use
-the `core.fsmonitor` configuration variable (see
-linkgit:git-config[1]) than using the `--fsmonitor` option to
-`git update-index` in each repository, especially if you want to do so
-across all repositories you use, because you can set the configuration
-variable in your `$HOME/.gitconfig` just once and have it affect all
-repositories you touch.
-
-When the `core.fsmonitor` configuration variable is changed, the
-file system monitor is added to or removed from the index the next time
-a command reads the index. When `--[no-]fsmonitor` are used, the file
-system monitor is immediately added to or removed from the index.
+the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
+variable (see linkgit:git-config[1]) than using the `--fsmonitor`
+option to `git update-index` in each repository, especially if you
+want to do so across all repositories you use, because you can set the
+configuration variable in your `$HOME/.gitconfig` just once and have
+it affect all repositories you touch.
+
+When the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
+variable is changed, the file system monitor is added to or removed
+from the index the next time a command reads the index. When
+`--[no-]fsmonitor` is used, the file system monitor is immediately
+added to or removed from the index.
 
 CONFIGURATION
 -------------
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index b51959ff9418..b7d5e926f7b0 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -593,7 +593,8 @@ fsmonitor-watchman
 
 This hook is invoked when the configuration option `core.fsmonitor` is
 set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
-depending on the version of the hook to use.
+depending on the version of the hook to use, unless overridden via
+`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).
 
 Version 1 takes two arguments, a version (1) and the time in elapsed
 nanoseconds since midnight, January 1, 1970.
-- 
gitgitgadget
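
The documentation above says that `core.useBuiltinFSMonitor`, when true, causes `core.fsmonitor` and `core.fsmonitorHookVersion` to be ignored. That precedence can be sketched as a small selection function. This is an illustrative sketch only: `enum monitor_kind` and `choose_monitor()` are hypothetical names that do not appear in the series, and treating an empty hook path as "no hook" is an assumption about how an empty `core.fsmonitor` value is handled.

```c
#include <stddef.h>

/* Hypothetical result type: which monitor, if any, a command would use. */
enum monitor_kind {
	MONITOR_NONE,
	MONITOR_HOOK,
	MONITOR_BUILTIN
};

/*
 * Hypothetical helper.  `use_builtin` stands for the value of
 * core.useBuiltinFSMonitor and `hook_path` for the value of
 * core.fsmonitor (NULL if unset).  The built-in daemon always wins.
 */
static enum monitor_kind choose_monitor(int use_builtin,
					const char *hook_path)
{
	if (use_builtin > 0)
		return MONITOR_BUILTIN;
	if (hook_path && *hook_path)
		return MONITOR_HOOK;
	return MONITOR_NONE;
}
```

For example, `choose_monitor(1, ".git/hooks/fsmonitor-watchman")` yields `MONITOR_BUILTIN`, which matches the note that the hook settings are ignored whenever the built-in monitor is enabled.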



* [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 03/28] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-06-02 11:24     ` Johannes Schindelin
  2021-05-22 13:56   ` [PATCH v2 05/28] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
                     ` (25 subsequent siblings)
  29 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create fsmonitor_ipc__*() client routines to spawn the built-in file
system monitor daemon and send it an IPC request using the `Simple
IPC` API.

Stub in empty fsmonitor_ipc__*() functions for unsupported platforms.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile        |   1 +
 fsmonitor-ipc.c | 179 ++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor-ipc.h |  48 +++++++++++++
 3 files changed, 228 insertions(+)
 create mode 100644 fsmonitor-ipc.c
 create mode 100644 fsmonitor-ipc.h

diff --git a/Makefile b/Makefile
index 21c0bf16672b..23f3b9890acd 100644
--- a/Makefile
+++ b/Makefile
@@ -892,6 +892,7 @@ LIB_OBJS += fetch-pack.o
 LIB_OBJS += fmt-merge-msg.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
+LIB_OBJS += fsmonitor-ipc.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
new file mode 100644
index 000000000000..e62901a85b5d
--- /dev/null
+++ b/fsmonitor-ipc.c
@@ -0,0 +1,179 @@
+#include "cache.h"
+#include "fsmonitor.h"
+#include "simple-ipc.h"
+#include "fsmonitor-ipc.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "trace2.h"
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+int fsmonitor_ipc__is_supported(void)
+{
+	return 1;
+}
+
+GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor--daemon.ipc")
+
+enum ipc_active_state fsmonitor_ipc__get_state(void)
+{
+	return ipc_get_active_state(fsmonitor_ipc__get_path());
+}
+
+static int spawn_daemon(void)
+{
+	const char *args[] = { "fsmonitor--daemon", "start", NULL };
+
+	return run_command_v_opt_tr2(args, RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
+				    "fsmonitor");
+}
+
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer)
+{
+	int ret = -1;
+	int tried_to_spawn = 0;
+	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
+	struct ipc_client_connection *connection = NULL;
+	struct ipc_client_connect_options options
+		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
+
+	options.wait_if_busy = 1;
+	options.wait_if_not_found = 0;
+
+	trace2_region_enter("fsm_client", "query", NULL);
+
+	trace2_data_string("fsm_client", NULL, "query/command",
+			   since_token);
+
+try_again:
+	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
+				       &connection);
+
+	switch (state) {
+	case IPC_STATE__LISTENING:
+		ret = ipc_client_send_command_to_connection(
+			connection, since_token, strlen(since_token), answer);
+		ipc_client_close_connection(connection);
+
+		trace2_data_intmax("fsm_client", NULL,
+				   "query/response-length", answer->len);
+
+		if (fsmonitor_is_trivial_response(answer))
+			trace2_data_intmax("fsm_client", NULL,
+					   "query/trivial-response", 1);
+
+		goto done;
+
+	case IPC_STATE__NOT_LISTENING:
+		ret = error(_("fsmonitor_ipc__send_query: daemon not available"));
+		goto done;
+
+	case IPC_STATE__PATH_NOT_FOUND:
+		if (tried_to_spawn)
+			goto done;
+
+		tried_to_spawn++;
+		if (spawn_daemon())
+			goto done;
+
+		/*
+		 * Try again, but this time give the daemon a chance to
+		 * actually create the pipe/socket.
+		 *
+		 * Granted, the daemon just started so it can't possibly have
+		 * any FS cached yet, so we'll always get a trivial answer.
+		 * BUT the answer should include a new token that can serve
+		 * as the basis for subsequent requests.
+		 */
+		options.wait_if_not_found = 1;
+		goto try_again;
+
+	case IPC_STATE__INVALID_PATH:
+		ret = error(_("fsmonitor_ipc__send_query: invalid path '%s'"),
+			    fsmonitor_ipc__get_path());
+		goto done;
+
+	case IPC_STATE__OTHER_ERROR:
+	default:
+		ret = error(_("fsmonitor_ipc__send_query: unspecified error on '%s'"),
+			    fsmonitor_ipc__get_path());
+		goto done;
+	}
+
+done:
+	trace2_region_leave("fsm_client", "query", NULL);
+
+	return ret;
+}
+
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer)
+{
+	struct ipc_client_connection *connection = NULL;
+	struct ipc_client_connect_options options
+		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
+	int ret;
+	enum ipc_active_state state;
+
+	strbuf_reset(answer);
+
+	options.wait_if_busy = 1;
+	options.wait_if_not_found = 0;
+
+	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
+				       &connection);
+	if (state != IPC_STATE__LISTENING) {
+		die("fsmonitor--daemon is not running");
+		return -1;
+	}
+
+	ret = ipc_client_send_command_to_connection(connection,
+						    command, strlen(command),
+						    answer);
+	ipc_client_close_connection(connection);
+
+	if (ret == -1) {
+		die("could not send '%s' command to fsmonitor--daemon",
+		    command);
+		return -1;
+	}
+
+	return 0;
+}
+
+#else
+
+/*
+ * A trivial implementation of the fsmonitor_ipc__ API for unsupported
+ * platforms.
+ */
+
+int fsmonitor_ipc__is_supported(void)
+{
+	return 0;
+}
+
+const char *fsmonitor_ipc__get_path(void)
+{
+	return NULL;
+}
+
+enum ipc_active_state fsmonitor_ipc__get_state(void)
+{
+	return IPC_STATE__OTHER_ERROR;
+}
+
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer)
+{
+	return -1;
+}
+
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer)
+{
+	return -1;
+}
+
+#endif
diff --git a/fsmonitor-ipc.h b/fsmonitor-ipc.h
new file mode 100644
index 000000000000..837c5e5b64ad
--- /dev/null
+++ b/fsmonitor-ipc.h
@@ -0,0 +1,48 @@
+#ifndef FSMONITOR_IPC_H
+#define FSMONITOR_IPC_H
+
+/*
+ * Returns true if the built-in file system monitor daemon is defined
+ * for this platform.
+ */
+int fsmonitor_ipc__is_supported(void);
+
+/*
+ * Returns the pathname to the IPC named pipe or Unix domain socket
+ * where a `git-fsmonitor--daemon` process will listen.  This is a
+ * per-worktree value.
+ *
+ * Returns NULL if the daemon is not supported on this platform.
+ */
+const char *fsmonitor_ipc__get_path(void);
+
+/*
+ * Try to determine whether there is a `git-fsmonitor--daemon` process
+ * listening on the IPC pipe/socket.
+ */
+enum ipc_active_state fsmonitor_ipc__get_state(void);
+
+/*
+ * Connect to a `git-fsmonitor--daemon` process via simple-ipc
+ * and ask for the set of changed files since the given token.
+ *
+ * This DOES NOT use the hook interface.
+ *
+ * Spawn a daemon process in the background if necessary.
+ *
+ * Returns -1 on error; 0 on success.
+ */
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer);
+
+/*
+ * Connect to a `git-fsmonitor--daemon` process via simple-ipc and
+ * send a command verb.  If no daemon is available, we DO NOT try to
+ * start one.
+ *
+ * Returns -1 on error; 0 on success.
+ */
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer);
+
+#endif /* FSMONITOR_IPC_H */
-- 
gitgitgadget
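
The control flow of fsmonitor_ipc__send_query() above (connect; if the IPC path does not exist, spawn the daemon once and retry while waiting for the path to appear) can be shown in isolation. The sketch below is a simplified stand-in rather than the code from the patch: `enum sketch_state`, `struct sketch_ops`, `connect_or_spawn()` and the `fake_*` helpers are all hypothetical, and the real connect and spawn steps are replaced by callbacks so that the example compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the IPC states used in the patch. */
enum sketch_state {
	SKETCH_LISTENING,
	SKETCH_NOT_LISTENING,
	SKETCH_PATH_NOT_FOUND,
	SKETCH_OTHER_ERROR
};

/* Hypothetical callbacks standing in for the connect and spawn steps. */
struct sketch_ops {
	enum sketch_state (*try_connect)(void *ctx, int wait_if_not_found);
	int (*spawn)(void *ctx);	/* returns 0 on success */
	void *ctx;
};

/*
 * Returns 0 once a connection is made, -1 otherwise.  The daemon is
 * spawned at most once, and only when the IPC path does not exist.
 */
static int connect_or_spawn(const struct sketch_ops *ops)
{
	int tried_to_spawn = 0;
	int wait_if_not_found = 0;

	for (;;) {
		switch (ops->try_connect(ops->ctx, wait_if_not_found)) {
		case SKETCH_LISTENING:
			return 0;
		case SKETCH_PATH_NOT_FOUND:
			if (tried_to_spawn || ops->spawn(ops->ctx))
				return -1;
			tried_to_spawn = 1;
			/* Let the new daemon create the pipe/socket. */
			wait_if_not_found = 1;
			break;
		default:
			return -1;
		}
	}
}

/* A fake daemon for demonstration: it listens once spawned. */
struct fake_daemon {
	int running;
	int spawn_fails;
	int spawn_calls;
};

static enum sketch_state fake_try_connect(void *ctx, int wait_if_not_found)
{
	struct fake_daemon *d = ctx;

	(void)wait_if_not_found;
	return d->running ? SKETCH_LISTENING : SKETCH_PATH_NOT_FOUND;
}

static int fake_spawn(void *ctx)
{
	struct fake_daemon *d = ctx;

	d->spawn_calls++;
	if (d->spawn_fails)
		return -1;
	d->running = 1;
	return 0;
}
```

Keeping `tried_to_spawn` as a one-shot flag is what bounds the retry: a daemon that starts but never creates its pipe/socket leads to a single extra connection attempt, not a spawn loop.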



* [PATCH v2 05/28] help: include fsmonitor--daemon feature flag in version info
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 06/28] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
                     ` (24 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Add the "feature: fsmonitor--daemon" message to the output of
`git version --build-options`.

This allows users to know if the built-in fsmonitor feature is
supported on their platform.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 help.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/help.c b/help.c
index 3c3bdec21356..e22ba1d246a5 100644
--- a/help.c
+++ b/help.c
@@ -11,6 +11,7 @@
 #include "version.h"
 #include "refs.h"
 #include "parse-options.h"
+#include "fsmonitor-ipc.h"
 
 struct category_description {
 	uint32_t category;
@@ -664,6 +665,9 @@ void get_version_info(struct strbuf *buf, int show_build_options)
 		strbuf_addf(buf, "sizeof-size_t: %d\n", (int)sizeof(size_t));
 		strbuf_addf(buf, "shell-path: %s\n", SHELL_PATH);
 		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
+
+		if (fsmonitor_ipc__is_supported())
+			strbuf_addstr(buf, "feature: fsmonitor--daemon\n");
 	}
 }
 
-- 
gitgitgadget



* [PATCH v2 06/28] config: FSMonitor is repository-specific
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 05/28] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Johannes Schindelin via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
                     ` (23 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

This commit refactors `git_config_get_fsmonitor()` into the `repo_*()`
form that takes a parameter `struct repository *r`.

That change prepares for the upcoming `core.useBuiltinFSMonitor` flag which
will be stored in the `repo_settings` struct.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/update-index.c | 4 ++--
 config.c               | 4 ++--
 config.h               | 2 +-
 fsmonitor.c            | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 79087bccea4b..84793df8b2b6 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1214,14 +1214,14 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (fsmonitor > 0) {
-		if (git_config_get_fsmonitor() == 0)
+		if (repo_config_get_fsmonitor(r) == 0)
 			warning(_("core.fsmonitor is unset; "
 				"set it if you really want to "
 				"enable fsmonitor"));
 		add_fsmonitor(&the_index);
 		report(_("fsmonitor enabled"));
 	} else if (!fsmonitor) {
-		if (git_config_get_fsmonitor() == 1)
+		if (repo_config_get_fsmonitor(r) == 1)
 			warning(_("core.fsmonitor is set; "
 				"remove it if you really want to "
 				"disable fsmonitor"));
diff --git a/config.c b/config.c
index 870d9534defc..a896f44cba1f 100644
--- a/config.c
+++ b/config.c
@@ -2499,9 +2499,9 @@ int git_config_get_max_percent_split_change(void)
 	return -1; /* default value */
 }
 
-int git_config_get_fsmonitor(void)
+int repo_config_get_fsmonitor(struct repository *r)
 {
-	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
+	if (repo_config_get_pathname(r, "core.fsmonitor", &core_fsmonitor))
 		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
 
 	if (core_fsmonitor && !*core_fsmonitor)
diff --git a/config.h b/config.h
index 19a9adbaa9a3..3139de81d986 100644
--- a/config.h
+++ b/config.h
@@ -607,7 +607,7 @@ int git_config_get_index_threads(int *dest);
 int git_config_get_untracked_cache(void);
 int git_config_get_split_index(void);
 int git_config_get_max_percent_split_change(void);
-int git_config_get_fsmonitor(void);
+int repo_config_get_fsmonitor(struct repository *r);
 
 /* This dies if the configured or default date is in the future */
 int git_config_get_expiry(const char *key, const char **output);
diff --git a/fsmonitor.c b/fsmonitor.c
index ab9bfc60b34e..9c9b2abc9414 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -411,7 +411,7 @@ void remove_fsmonitor(struct index_state *istate)
 void tweak_fsmonitor(struct index_state *istate)
 {
 	unsigned int i;
-	int fsmonitor_enabled = git_config_get_fsmonitor();
+	int fsmonitor_enabled = repo_config_get_fsmonitor(istate->repo ? istate->repo : the_repository);
 
 	if (istate->fsmonitor_dirty) {
 		if (fsmonitor_enabled) {
-- 
gitgitgadget



* [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (5 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 06/28] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
@ 2021-05-22 13:56   ` Johannes Schindelin via GitGitGadget
  2021-06-14 21:28     ` Johannes Schindelin
  2021-05-22 13:56   ` [PATCH v2 08/28] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
                     ` (22 subsequent siblings)
  29 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Use simple IPC to directly communicate with the new builtin file
system monitor daemon.

Define a new config setting `core.useBuiltinFSMonitor` to enable the
builtin file system monitor.

The `core.fsmonitor` setting has already been defined as a HOOK
pathname.  Historically, this has been set to a HOOK script that will
talk with Watchman.  For compatibility reasons, we do not want to
overload that definition (and cause problems if users have multiple
versions of Git installed).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 config.c        |  5 +++++
 fsmonitor.c     | 43 +++++++++++++++++++++++++++++++++++++++++++
 repo-settings.c |  3 +++
 repository.h    |  2 ++
 4 files changed, 53 insertions(+)

diff --git a/config.c b/config.c
index a896f44cba1f..c82f40c22b43 100644
--- a/config.c
+++ b/config.c
@@ -2501,6 +2501,11 @@ int git_config_get_max_percent_split_change(void)
 
 int repo_config_get_fsmonitor(struct repository *r)
 {
+	if (r->settings.use_builtin_fsmonitor > 0) {
+		core_fsmonitor = "(built-in daemon)";
+		return 1;
+	}
+
 	if (repo_config_get_pathname(r, "core.fsmonitor", &core_fsmonitor))
 		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
 
diff --git a/fsmonitor.c b/fsmonitor.c
index 9c9b2abc9414..c6d3c34ad78e 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -3,6 +3,7 @@
 #include "dir.h"
 #include "ewah/ewok.h"
 #include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
 #include "run-command.h"
 #include "strbuf.h"
 
@@ -231,6 +232,7 @@ static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
 
 void refresh_fsmonitor(struct index_state *istate)
 {
+	struct repository *r = istate->repo ? istate->repo : the_repository;
 	struct strbuf query_result = STRBUF_INIT;
 	int query_success = 0, hook_version = -1;
 	size_t bol = 0; /* beginning of line */
@@ -247,6 +249,46 @@ void refresh_fsmonitor(struct index_state *istate)
 	istate->fsmonitor_has_run_once = 1;
 
 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
+
+	if (r->settings.use_builtin_fsmonitor > 0) {
+		query_success = !fsmonitor_ipc__send_query(
+			istate->fsmonitor_last_update, &query_result);
+		if (query_success) {
+			/*
+			 * The response contains a series of NUL-terminated
+			 * strings.  The first is the new token.
+			 *
+			 * Use `char *buf` as an interlude to trick the CI
+			 * static analysis to let us use `strbuf_addstr()`
+			 * here (and only copy the token) rather than
+			 * `strbuf_addbuf()`.
+			 */
+			buf = query_result.buf;
+			strbuf_addstr(&last_update_token, buf);
+			bol = last_update_token.len + 1;
+		} else {
+			/*
+			 * The builtin daemon is not available on this
+			 * platform -OR- we failed to get a response.
+			 *
+			 * Generate a fake token (rather than a V1
+			 * timestamp) for the index extension.  (If
+			 * they switch back to the hook API, we don't
+			 * want ambiguous state.)
+			 */
+			strbuf_addstr(&last_update_token, "builtin:fake");
+		}
+
+		/*
+		 * Regardless of whether we successfully talked to a
+		 * fsmonitor daemon or not, we skip over and do not
+		 * try to use the hook.  The "core.useBuiltinFSMonitor"
+		 * config setting ALWAYS overrides the "core.fsmonitor"
+		 * hook setting.
+		 */
+		goto apply_results;
+	}
+
 	/*
 	 * This could be racy so save the date/time now and query_fsmonitor
 	 * should be inclusive to ensure we don't miss potential changes.
@@ -301,6 +343,7 @@ void refresh_fsmonitor(struct index_state *istate)
 			core_fsmonitor, query_success ? "success" : "failure");
 	}
 
+apply_results:
 	/* a fsmonitor process can return '/' to indicate all entries are invalid */
 	if (query_success && query_result.buf[bol] != '/') {
 		/* Mark all entries returned by the monitor as dirty */
diff --git a/repo-settings.c b/repo-settings.c
index f7fff0f5ab83..93aab92ff164 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
 		r->settings.core_multi_pack_index = value;
 	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
 
+	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
+		r->settings.use_builtin_fsmonitor = 1;
+
 	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
 		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
 		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
diff --git a/repository.h b/repository.h
index b385ca3c94b6..d6e7f61f9cf7 100644
--- a/repository.h
+++ b/repository.h
@@ -29,6 +29,8 @@ enum fetch_negotiation_setting {
 struct repo_settings {
 	int initialized;
 
+	int use_builtin_fsmonitor;
+
 	int core_commit_graph;
 	int commit_graph_read_changed_paths;
 	int gc_write_commit_graph;
-- 
gitgitgadget
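
The comment in refresh_fsmonitor() above describes the daemon response as a series of NUL-terminated strings whose first element is the new token. A minimal, standalone walk over such a buffer might look like the sketch below. It is an illustration under stated assumptions, not the parser used by the series: `parse_response()`, `path_cb` and `count_path()` are hypothetical names, and ignoring an unterminated trailing fragment is a choice made for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: called once per path after the token. */
typedef void (*path_cb)(const char *path, void *data);

/*
 * Hypothetical helper.  Walk `len` bytes at `buf` holding consecutive
 * NUL-terminated strings.  The first string is the token; every later
 * string is passed to `cb`.  Returns a pointer to the token, or NULL
 * if the buffer does not contain a terminated token.
 */
static const char *parse_response(const char *buf, size_t len,
				  path_cb cb, void *data)
{
	const char *end = buf + len;
	const char *p;
	const char *nul;

	if (!len)
		return NULL;

	nul = memchr(buf, '\0', len);
	if (!nul)
		return NULL;

	for (p = nul + 1; p < end; p = nul + 1) {
		nul = memchr(p, '\0', (size_t)(end - p));
		if (!nul)
			break;	/* ignore an unterminated trailing fragment */
		cb(p, data);
	}

	return buf;
}

/* Example callback: count the paths that follow the token. */
static void count_path(const char *path, void *data)
{
	(void)path;
	++*(int *)data;
}
```

Because the buffer contains embedded NUL bytes, its length has to travel alongside the pointer; the same concern is why the token is copied separately in the hunk above and the remainder is addressed by offset (`bol`).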



* [PATCH v2 08/28] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (6 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 09/28] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
                     ` (21 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a built-in file system monitoring daemon that can be used by
the existing `fsmonitor` feature (protocol API and index extension)
to improve the performance of various Git commands, such as `status`.

The `fsmonitor--daemon` feature builds upon the `Simple IPC` API and
provides an alternative to hook access to existing fsmonitors such
as `watchman`.

This commit merely adds the new command without any functionality.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 .gitignore                  |  1 +
 Makefile                    |  1 +
 builtin.h                   |  1 +
 builtin/fsmonitor--daemon.c | 53 +++++++++++++++++++++++++++++++++++++
 git.c                       |  1 +
 5 files changed, 57 insertions(+)
 create mode 100644 builtin/fsmonitor--daemon.c

diff --git a/.gitignore b/.gitignore
index 3dcdb6bb5ab8..beccf34abe9e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -71,6 +71,7 @@
 /git-format-patch
 /git-fsck
 /git-fsck-objects
+/git-fsmonitor--daemon
 /git-gc
 /git-get-tar-commit-id
 /git-grep
diff --git a/Makefile b/Makefile
index 23f3b9890acd..74673acc9833 100644
--- a/Makefile
+++ b/Makefile
@@ -1092,6 +1092,7 @@ BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
 BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
+BUILTIN_OBJS += builtin/fsmonitor--daemon.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
diff --git a/builtin.h b/builtin.h
index b6ce981b7377..7554476f90a4 100644
--- a/builtin.h
+++ b/builtin.h
@@ -158,6 +158,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
 int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
 int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
new file mode 100644
index 000000000000..df2bad531118
--- /dev/null
+++ b/builtin/fsmonitor--daemon.c
@@ -0,0 +1,53 @@
+#include "builtin.h"
+#include "config.h"
+#include "parse-options.h"
+#include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
+#include "simple-ipc.h"
+#include "khash.h"
+
+static const char * const builtin_fsmonitor__daemon_usage[] = {
+	NULL
+};
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
+{
+	const char *subcmd;
+
+	struct option options[] = {
+		OPT_END()
+	};
+
+	if (argc < 2)
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	git_config(git_default_config, NULL);
+
+	subcmd = argv[1];
+	argv--;
+	argc++;
+
+	argc = parse_options(argc, argv, prefix, options,
+			     builtin_fsmonitor__daemon_usage, 0);
+
+	die(_("Unhandled subcommand '%s'"), subcmd);
+}
+
+#else
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
+{
+	struct option options[] = {
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	die(_("fsmonitor--daemon not supported on this platform"));
+}
+#endif
diff --git a/git.c b/git.c
index b53e66567138..41980c897964 100644
--- a/git.c
+++ b/git.c
@@ -523,6 +523,7 @@ static struct cmd_struct commands[] = {
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
+	{ "fsmonitor--daemon", cmd_fsmonitor__daemon, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id, NO_PARSEOPT },
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
-- 
gitgitgadget



* [PATCH v2 09/28] fsmonitor--daemon: implement client command options
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (7 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 08/28] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
                     ` (20 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement `stop` and `status` client commands to control and query the
status of a `fsmonitor--daemon` server process (and implicitly start a
server process if necessary).

Later commits will implement the actual server and monitor the file
system.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
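Note on `stop`: after sending "quit", the client polls the IPC state
until the daemon is no longer listening.  The sketch below shows that
polling shape in isolation.  Everything prefixed demo_ is hypothetical:
the probe callback stands in for fsmonitor_ipc__get_state(), the sleep
hook for sleep_millisec(), and the max_polls cap is an addition of the
sketch (the loop in the diff below has no upper bound).

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_LISTENING, DEMO_NOT_LISTENING };

/* Hypothetical probe; stands in for fsmonitor_ipc__get_state(). */
typedef enum demo_state (*demo_probe_fn)(void *ctx);
/* Hypothetical delay hook; stands in for sleep_millisec(). */
typedef void (*demo_sleep_fn)(int ms);

/*
 * Poll until the daemon stops listening.  Returns the number of
 * sleeps performed, or -1 if max_polls sleeps were not enough.
 */
static int demo_wait_for_exit(demo_probe_fn probe, void *ctx,
			      demo_sleep_fn nap, int max_polls)
{
	int polls = 0;

	assert(probe != NULL && nap != NULL);
	while (probe(ctx) == DEMO_LISTENING) {
		if (polls >= max_polls)
			return -1;
		nap(50);
		polls++;
	}
	return polls;
}

/* Fake probe: reports "listening" for *ctx more calls, then stops. */
static enum demo_state demo_countdown_probe(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return DEMO_LISTENING;
	}
	return DEMO_NOT_LISTENING;
}

/* Fake delay so the sketch runs instantly. */
static void demo_no_sleep(int ms)
{
	(void)ms;
}
```

Passing the probe and the delay as callbacks is only there to make the
loop testable without a running daemon.
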
 builtin/fsmonitor--daemon.c | 49 +++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index df2bad531118..16ff68b65407 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,10 +7,53 @@
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon stop"),
+	N_("git fsmonitor--daemon status"),
 	NULL
 };
 
 #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+/*
+ * Acting as a CLIENT.
+ *
+ * Send a "quit" command to the `git-fsmonitor--daemon` (if running)
+ * and wait for it to shut down.
+ */
+static int do_as_client__send_stop(void)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_command("quit", &answer);
+
+	/* The quit command does not return any response data. */
+	strbuf_release(&answer);
+
+	if (ret)
+		return ret;
+
+	trace2_region_enter("fsm_client", "polling-for-daemon-exit", NULL);
+	while (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		sleep_millisec(50);
+	trace2_region_leave("fsm_client", "polling-for-daemon-exit", NULL);
+
+	return 0;
+}
+
+static int do_as_client__status(void)
+{
+	enum ipc_active_state state = fsmonitor_ipc__get_state();
+
+	switch (state) {
+	case IPC_STATE__LISTENING:
+		printf(_("The built-in file system monitor is active\n"));
+		return 0;
+
+	default:
+		printf(_("The built-in file system monitor is not active\n"));
+		return 1;
+	}
+}
 
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
@@ -35,6 +78,12 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options,
 			     builtin_fsmonitor__daemon_usage, 0);
 
+	if (!strcmp(subcmd, "stop"))
+		return !!do_as_client__send_stop();
+
+	if (!strcmp(subcmd, "status"))
+		return !!do_as_client__status();
+
 	die(_("Unhandled subcommand '%s'"), subcmd);
 }
 
-- 
gitgitgadget



* [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (8 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 09/28] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-06-11  6:32     ` Junio C Hamano
  2021-05-22 13:56   ` [PATCH v2 11/28] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
                     ` (19 subsequent siblings)
  29 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create an IPC client to send query and flush commands to the daemon.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
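Note on consuming the answer: the cover letter describes daemon
responses as a simple NUL-delimited list.  The helper below writes the
raw answer to stdout; a caller that wants individual items could walk
the buffer roughly as sketched here.  This is an illustration under
assumptions: demo_each_nul_item() and its callback type are hypothetical
names, and the exact response layout (for example, whether a token
precedes the paths) is not specified in this part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback invoked once per NUL-terminated item. */
typedef void (*demo_item_cb)(const char *item, void *ctx);

/*
 * Walk buf[0..len) as consecutive NUL-terminated strings and return
 * how many complete items were seen.  A trailing fragment without a
 * terminating NUL is ignored.
 */
static size_t demo_each_nul_item(const char *buf, size_t len,
				 demo_item_cb cb, void *ctx)
{
	size_t count = 0;
	const char *p, *end;

	if (!len)
		return 0;
	assert(buf != NULL);

	p = buf;
	end = buf + len;
	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			break;
		if (cb)
			cb(p, ctx);
		count++;
		p = nul + 1;
	}
	return count;
}
```

The length is passed explicitly because a NUL-delimited buffer cannot be
measured with strlen().
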
 Makefile                         |   1 +
 t/helper/test-fsmonitor-client.c | 125 +++++++++++++++++++++++++++++++
 t/helper/test-tool.c             |   1 +
 t/helper/test-tool.h             |   1 +
 4 files changed, 128 insertions(+)
 create mode 100644 t/helper/test-fsmonitor-client.c

diff --git a/Makefile b/Makefile
index 74673acc9833..80059032c4e3 100644
--- a/Makefile
+++ b/Makefile
@@ -709,6 +709,7 @@ TEST_BUILTINS_OBJS += test-dump-split-index.o
 TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-fast-rebase.o
+TEST_BUILTINS_OBJS += test-fsmonitor-client.o
 TEST_BUILTINS_OBJS += test-genrandom.o
 TEST_BUILTINS_OBJS += test-genzeros.o
 TEST_BUILTINS_OBJS += test-hash-speed.o
diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
new file mode 100644
index 000000000000..4961f28e3e02
--- /dev/null
+++ b/t/helper/test-fsmonitor-client.c
@@ -0,0 +1,125 @@
+/*
+ * test-fsmonitor-client.c: client code to send commands/requests to
+ * a `git fsmonitor--daemon` daemon.
+ */
+
+#include "test-tool.h"
+#include "cache.h"
+#include "parse-options.h"
+//#include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
+//#include "compat/fsmonitor/fsmonitor-fs-listen.h"
+//#include "fsmonitor--daemon.h"
+//#include "simple-ipc.h"
+
+#ifndef HAVE_FSMONITOR_DAEMON_BACKEND
+int cmd__fsmonitor_client(int argc, const char **argv)
+{
+	die("fsmonitor--daemon not available on this platform");
+}
+#else
+
+/*
+ * Read the `.git/index` to get the last token written to the
+ * FSMonitor Index Extension.
+ */
+static const char *get_token_from_index(void)
+{
+	struct index_state *istate = the_repository->index;
+
+	if (do_read_index(istate, the_repository->index_file, 0) < 0)
+		die("unable to read index file");
+	if (!istate->fsmonitor_last_update)
+		die("index file does not have fsmonitor extension");
+
+	return istate->fsmonitor_last_update;
+}
+
+/*
+ * Send an IPC query to a `git-fsmonitor--daemon` daemon and
+ * ask for the changes since the given token or from the last
+ * token in the index extension.
+ *
+ * This will implicitly start a daemon process if necessary.  The
+ * daemon process will persist after we exit.
+ */
+static int do_send_query(const char *token)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	if (!token || !*token)
+		token = get_token_from_index();
+
+	ret = fsmonitor_ipc__send_query(token, &answer);
+	if (ret < 0)
+		die(_("could not query fsmonitor--daemon"));
+
+	write_in_full(1, answer.buf, answer.len);
+	strbuf_release(&answer);
+
+	return 0;
+}
+
+/*
+ * Send a "flush" command to the `git-fsmonitor--daemon` (if running)
+ * and tell it to flush its cache.
+ *
+ * This feature is primarily used by the test suite to simulate a loss of
+ * sync with the filesystem where we miss kernel events.
+ */
+static int do_send_flush(void)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_command("flush", &answer);
+	if (ret)
+		return ret;
+
+	write_in_full(1, answer.buf, answer.len);
+	strbuf_release(&answer);
+
+	return 0;
+}
+
+int cmd__fsmonitor_client(int argc, const char **argv)
+{
+	const char *subcmd;
+	const char *token = NULL;
+
+	const char * const fsmonitor_client_usage[] = {
+		N_("test-helper fsmonitor-client query [<token>]"),
+		N_("test-helper fsmonitor-client flush"),
+		NULL,
+	};
+
+	struct option options[] = {
+		OPT_STRING(0, "token", &token, N_("token"),
+			   N_("command token to send to the server")),
+		OPT_END()
+	};
+
+	if (argc < 2)
+		usage_with_options(fsmonitor_client_usage, options);
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(fsmonitor_client_usage, options);
+
+	subcmd = argv[1];
+	argv--;
+	argc++;
+
+	argc = parse_options(argc, argv, NULL, options, fsmonitor_client_usage, 0);
+
+	setup_git_directory();
+
+	if (!strcmp(subcmd, "query"))
+		return !!do_send_query(token);
+
+	if (!strcmp(subcmd, "flush"))
+		return !!do_send_flush();
+
+	die("Unhandled subcommand: '%s'", subcmd);
+}
+#endif
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 25c6a37e93e5..b15d328f9a41 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -31,6 +31,7 @@ static struct test_cmd cmds[] = {
 	{ "dump-untracked-cache", cmd__dump_untracked_cache },
 	{ "example-decorate", cmd__example_decorate },
 	{ "fast-rebase", cmd__fast_rebase },
+	{ "fsmonitor-client", cmd__fsmonitor_client },
 	{ "genrandom", cmd__genrandom },
 	{ "genzeros", cmd__genzeros },
 	{ "hashmap", cmd__hashmap },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index f03c5988b20c..a8e96b97c419 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -21,6 +21,7 @@ int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
+int cmd__fsmonitor_client(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);
 int cmd__hashmap(int argc, const char **argv);
-- 
gitgitgadget



* [PATCH v2 11/28] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (9 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 12/28] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
                     ` (18 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Stub in empty backend for fsmonitor--daemon on Windows.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
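Note on the backend contract: the header added here asks each backend
for a constructor, a destructor, an event loop and an asynchronous stop
request.  The sketch below shows the call order a driver might follow,
including the case where the constructor fails (as the stub does).  It
is only an illustration: the series selects the backend at link time,
whereas this sketch swaps in a table of function pointers to stay
self-contained, and all demo_ names and counters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fsmonitor_daemon_state. */
struct demo_state {
	int error_code;
	int ctor_calls, loop_calls, dtor_calls;
};

/* Function-pointer table used only by this sketch. */
struct demo_backend {
	int (*ctor)(struct demo_state *st);
	void (*loop)(struct demo_state *st);
	void (*dtor)(struct demo_state *st);
};

/* Run the loop only if ctor succeeds; call dtor on every path. */
static int demo_run(const struct demo_backend *be, struct demo_state *st)
{
	int err;

	assert(be != NULL && st != NULL);
	if (be->ctor(st)) {
		err = -1;
	} else {
		be->loop(st);
		err = st->error_code;
	}
	be->dtor(st);
	return err;
}

/* Stub backend mirroring the empty one in this patch: ctor fails. */
static int demo_stub_ctor(struct demo_state *st)
{
	st->ctor_calls++;
	return -1;
}

static void demo_stub_loop(struct demo_state *st)
{
	st->loop_calls++;
}

static void demo_stub_dtor(struct demo_state *st)
{
	st->dtor_calls++;
}

static const struct demo_backend demo_stub_backend = {
	demo_stub_ctor, demo_stub_loop, demo_stub_dtor
};
```

With the stub, demo_run() reports failure without ever entering the
loop, which is why a daemon built on these stubs cannot start yet.
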
 Makefile                                     | 13 ++++++
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
 compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
 config.mak.uname                             |  2 +
 contrib/buildsystems/CMakeLists.txt          |  5 ++
 5 files changed, 90 insertions(+)
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h

diff --git a/Makefile b/Makefile
index 80059032c4e3..3f31adfd135c 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,11 @@ all::
 # directory, and the JSON compilation database 'compile_commands.json' will be
 # created at the root of the repository.
 #
+# If your platform supports a built-in fsmonitor backend, set
+# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
+# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
+# `fsmonitor_fs_listen__*()` routines.
+#
 # Define DEVELOPER to enable more compiler warnings. Compiler version
 # and family are auto detected, but could be overridden by defining
 # COMPILER_FEATURES (see config.mak.dev). You can still set
@@ -1906,6 +1911,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
 	COMPAT_OBJS += compat/access.o
 endif
 
+ifdef FSMONITOR_DAEMON_BACKEND
+	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
+	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -2765,6 +2775,9 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
+ifdef FSMONITOR_DAEMON_BACKEND
+	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
+endif
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
 endif
diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
new file mode 100644
index 000000000000..880446b49e35
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+#include "config.h"
+#include "fsmonitor.h"
+#include "fsmonitor-fs-listen.h"
+
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
+{
+}
+
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
+{
+	return -1;
+}
+
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
+{
+}
diff --git a/compat/fsmonitor/fsmonitor-fs-listen.h b/compat/fsmonitor/fsmonitor-fs-listen.h
new file mode 100644
index 000000000000..c7b5776b3b60
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen.h
@@ -0,0 +1,49 @@
+#ifndef FSMONITOR_FS_LISTEN_H
+#define FSMONITOR_FS_LISTEN_H
+
+/* This needs to be implemented by each backend */
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+struct fsmonitor_daemon_state;
+
+/*
+ * Initialize platform-specific data for the fsmonitor listener thread.
+ * This will be called from the main thread PRIOR to starting the
+ * fsmonitor_fs_listener thread.
+ *
+ * Returns 0 if successful.
+ * Returns -1 otherwise.
+ */
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state);
+
+/*
+ * Cleanup platform-specific data for the fsmonitor listener thread.
+ * This will be called from the main thread AFTER joining the listener.
+ */
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state);
+
+/*
+ * The main body of the platform-specific event loop to watch for
+ * filesystem events.  This will run in the fsmonitor_fs_listen thread.
+ *
+ * It should call `ipc_server_stop_async()` if the listener thread
+ * prematurely terminates (because of a filesystem error or if it
+ * detects that the .git directory has been deleted).  (It should NOT
+ * do so if the listener thread receives a normal shutdown signal from
+ * the IPC layer.)
+ *
+ * It should set `state->error_code` to -1 if the daemon should exit
+ * with an error.
+ */
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state);
+
+/*
+ * Gently request that the fsmonitor listener thread shut down.
+ * It does not wait for it to stop.  The caller should do a JOIN
+ * to wait for it.
+ */
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state);
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_FS_LISTEN_H */
diff --git a/config.mak.uname b/config.mak.uname
index cb443b4e023a..fcd88b60b14a 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -420,6 +420,7 @@ ifeq ($(uname_S),Windows)
 	# so we don't need this:
 	#
 	#   SNPRINTF_RETURNS_BOGUS = YesPlease
+	FSMONITOR_DAEMON_BACKEND = win32
 	NO_SVN_TESTS = YesPlease
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
@@ -598,6 +599,7 @@ ifneq (,$(findstring MINGW,$(uname_S)))
 	NO_STRTOUMAX = YesPlease
 	NO_MKDTEMP = YesPlease
 	NO_SVN_TESTS = YesPlease
+	FSMONITOR_DAEMON_BACKEND = win32
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
 	NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 75ed198a6a36..4e812462d955 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -256,6 +256,11 @@ else()
 	list(APPEND compat_SOURCES compat/simple-ipc/ipc-shared.c compat/simple-ipc/ipc-unix-socket.c)
 endif()
 
+if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
+	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
+	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-win32.c)
+endif()
+
 set(EXE_EXTENSION ${CMAKE_EXECUTABLE_SUFFIX})
 
 #header checks
-- 
gitgitgadget



* [PATCH v2 12/28] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (10 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 11/28] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 13/28] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
                     ` (17 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Stub in empty implementation of fsmonitor--daemon
backend for MacOS.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 20 ++++++++++++++++++++
 config.mak.uname                             |  2 ++
 contrib/buildsystems/CMakeLists.txt          |  3 +++
 3 files changed, 25 insertions(+)
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
new file mode 100644
index 000000000000..b91058d1c4f8
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -0,0 +1,20 @@
+#include "cache.h"
+#include "fsmonitor.h"
+#include "fsmonitor-fs-listen.h"
+
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
+{
+	return -1;
+}
+
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
+{
+}
diff --git a/config.mak.uname b/config.mak.uname
index fcd88b60b14a..394355463e1e 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -147,6 +147,8 @@ ifeq ($(uname_S),Darwin)
 			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
 		endif
 	endif
+	FSMONITOR_DAEMON_BACKEND = macos
+	BASIC_LDFLAGS += -framework CoreServices
 endif
 ifeq ($(uname_S),SunOS)
 	NEEDS_SOCKET = YesPlease
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 4e812462d955..22dec4600431 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -259,6 +259,9 @@ endif()
 if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
 	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-win32.c)
+elseif(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
+	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
+	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-macos.c)
 endif()
 
 set(EXE_EXTENSION ${CMAKE_EXECUTABLE_SUFFIX})
-- 
gitgitgadget



* [PATCH v2 13/28] fsmonitor--daemon: implement daemon command options
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (11 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 12/28] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 14/28] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
                     ` (16 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement `run` and `start` commands to try to
begin listening for file system events.

This version defines the thread structure with a single
fsmonitor_fs_listen thread to watch for file system events
and a simple IPC thread pool to wait for connections from
Git clients over a well-known named pipe or Unix domain
socket.

This version does not actually do anything yet because the
backends are still just stubs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
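Note on the two settings read by this patch: `fsmonitor.ipcthreads`
must be at least 1 and `fsmonitor.starttimeout` must not be negative;
the defaults are 8 threads and 60 seconds.  The range checks can be
sketched on their own as below.  The struct and function names are
hypothetical, and the sketch takes an already-parsed integer instead of
going through git_config_int().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the two values. */
struct demo_settings {
	int ipc_threads;
	int start_timeout_sec;
};

/* Defaults taken from the diff below. */
#define DEMO_SETTINGS_INIT { 8, 60 }

/*
 * Apply one key/value pair.  Returns 0 if the value was accepted or
 * the key is unrelated, -1 if the value is out of range (in which
 * case the setting keeps its previous value).
 */
static int demo_apply_setting(struct demo_settings *s,
			      const char *key, int value)
{
	assert(s != NULL && key != NULL);

	if (!strcmp(key, "fsmonitor.ipcthreads")) {
		if (value < 1)
			return -1;
		s->ipc_threads = value;
		return 0;
	}
	if (!strcmp(key, "fsmonitor.starttimeout")) {
		if (value < 0)
			return -1;
		s->start_timeout_sec = value;
		return 0;
	}
	return 0; /* not ours; the real callback defers to git_default_config() */
}
```

A start timeout of 0 is accepted, while 0 IPC threads is rejected, which
matches the `i < 0` and `i < 1` tests in fsmonitor_config().
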
 builtin/fsmonitor--daemon.c | 390 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |  34 ++++
 2 files changed, 423 insertions(+), 1 deletion(-)
 create mode 100644 fsmonitor--daemon.h

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 16ff68b65407..85f99dba861f 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -3,16 +3,52 @@
 #include "parse-options.h"
 #include "fsmonitor.h"
 #include "fsmonitor-ipc.h"
+#include "compat/fsmonitor/fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
 #include "simple-ipc.h"
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon start [<options>]"),
+	N_("git fsmonitor--daemon run [<options>]"),
 	N_("git fsmonitor--daemon stop"),
 	N_("git fsmonitor--daemon status"),
 	NULL
 };
 
 #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+/*
+ * Global state loaded from config.
+ */
+#define FSMONITOR__IPC_THREADS "fsmonitor.ipcthreads"
+static int fsmonitor__ipc_threads = 8;
+
+#define FSMONITOR__START_TIMEOUT "fsmonitor.starttimeout"
+static int fsmonitor__start_timeout_sec = 60;
+
+static int fsmonitor_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, FSMONITOR__IPC_THREADS)) {
+		int i = git_config_int(var, value);
+		if (i < 1)
+			return error(_("value of '%s' out of range: %d"),
+				     FSMONITOR__IPC_THREADS, i);
+		fsmonitor__ipc_threads = i;
+		return 0;
+	}
+
+	if (!strcmp(var, FSMONITOR__START_TIMEOUT)) {
+		int i = git_config_int(var, value);
+		if (i < 0)
+			return error(_("value of '%s' out of range: %d"),
+				     FSMONITOR__START_TIMEOUT, i);
+		fsmonitor__start_timeout_sec = i;
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 /*
  * Acting as a CLIENT.
  *
@@ -55,11 +91,354 @@ static int do_as_client__status(void)
 	}
 }
 
+static ipc_server_application_cb handle_client;
+
+static int handle_client(void *data,
+			 const char *command, size_t command_len,
+			 ipc_server_reply_cb *reply,
+			 struct ipc_server_reply_data *reply_data)
+{
+	/* struct fsmonitor_daemon_state *state = data; */
+	int result;
+
+	/*
+	 * The Simple IPC API now supports {char*, len} arguments, but
+	 * FSMonitor always uses proper null-terminated strings, so
+	 * we can ignore the command_len argument.  (Trust, but verify.)
+	 */
+	if (command_len != strlen(command))
+		BUG("FSMonitor assumes text messages");
+
+	trace2_region_enter("fsmonitor", "handle_client", the_repository);
+	trace2_data_string("fsmonitor", the_repository, "request", command);
+
+	result = 0; /* TODO Do something here. */
+
+	trace2_region_leave("fsmonitor", "handle_client", the_repository);
+
+	return result;
+}
+
+static void *fsmonitor_fs_listen__thread_proc(void *_state)
+{
+	struct fsmonitor_daemon_state *state = _state;
+
+	trace2_thread_start("fsm-listen");
+
+	trace_printf_key(&trace_fsmonitor, "Watching: worktree '%s'",
+			 state->path_worktree_watch.buf);
+	if (state->nr_paths_watching > 1)
+		trace_printf_key(&trace_fsmonitor, "Watching: gitdir '%s'",
+				 state->path_gitdir_watch.buf);
+
+	fsmonitor_fs_listen__loop(state);
+
+	trace2_thread_exit();
+	return NULL;
+}
+
+static int fsmonitor_run_daemon_1(struct fsmonitor_daemon_state *state)
+{
+	struct ipc_server_opts ipc_opts = {
+		.nr_threads = fsmonitor__ipc_threads,
+
+		/*
+		 * We know that there are no other active threads yet,
+		 * so we can let the IPC layer temporarily chdir() if
+		 * it needs to when creating the server side of the
+		 * Unix domain socket.
+		 */
+		.uds_disallow_chdir = 0
+	};
+
+	/*
+	 * Start the IPC thread pool before we've started the file
+	 * system event listener thread so that we have the IPC handle
+	 * before we need it.
+	 */
+	if (ipc_server_run_async(&state->ipc_server_data,
+				 fsmonitor_ipc__get_path(), &ipc_opts,
+				 handle_client, state))
+		return error(_("could not start IPC thread pool"));
+
+	/*
+	 * Start the fsmonitor listener thread to collect filesystem
+	 * events.
+	 */
+	if (pthread_create(&state->listener_thread, NULL,
+			   fsmonitor_fs_listen__thread_proc, state) < 0) {
+		ipc_server_stop_async(state->ipc_server_data);
+		ipc_server_await(state->ipc_server_data);
+
+		return error(_("could not start fsmonitor listener thread"));
+	}
+
+	/*
+	 * The daemon is now fully functional in background threads.
+	 * Wait for the IPC thread pool to shut down (whether by client
+	 * request or from filesystem activity).
+	 */
+	ipc_server_await(state->ipc_server_data);
+
+	/*
+	 * The fsmonitor listener thread may have received a shutdown
+	 * event from the IPC thread pool, but it doesn't hurt to tell
+	 * it again.  And wait for it to shut down.
+	 */
+	fsmonitor_fs_listen__stop_async(state);
+	pthread_join(state->listener_thread, NULL);
+
+	return state->error_code;
+}
+
+static int fsmonitor_run_daemon(void)
+{
+	struct fsmonitor_daemon_state state;
+	int err;
+
+	memset(&state, 0, sizeof(state));
+
+	pthread_mutex_init(&state.main_lock, NULL);
+	state.error_code = 0;
+	state.current_token_data = NULL;
+
+	/* Prepare to (recursively) watch the <worktree-root> directory. */
+	strbuf_init(&state.path_worktree_watch, 0);
+	strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree()));
+	state.nr_paths_watching = 1;
+
+	/*
+	 * We create/delete cookie files inside the .git directory to
+	 * help us keep sync with the file system.  If ".git" is not a
+	 * directory, then <gitdir> is not inside the cone of
+	 * <worktree-root>, so set up a second watch for it.
+	 */
+	strbuf_init(&state.path_gitdir_watch, 0);
+	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
+	strbuf_addstr(&state.path_gitdir_watch, "/.git");
+	if (!is_directory(state.path_gitdir_watch.buf)) {
+		strbuf_reset(&state.path_gitdir_watch);
+		strbuf_addstr(&state.path_gitdir_watch, absolute_path(get_git_dir()));
+		state.nr_paths_watching = 2;
+	}
+
+	/*
+	 * Confirm that we can create platform-specific resources for the
+	 * filesystem listener before we bother starting all the threads.
+	 */
+	if (fsmonitor_fs_listen__ctor(&state)) {
+		err = error(_("could not initialize listener thread"));
+		goto done;
+	}
+
+	err = fsmonitor_run_daemon_1(&state);
+
+done:
+	pthread_mutex_destroy(&state.main_lock);
+	fsmonitor_fs_listen__dtor(&state);
+
+	ipc_server_free(state.ipc_server_data);
+
+	strbuf_release(&state.path_worktree_watch);
+	strbuf_release(&state.path_gitdir_watch);
+
+	return err;
+}
+
+static int try_to_run_foreground_daemon(void)
+{
+	/*
+	 * Technically, we don't need to probe for an existing daemon
+	 * process, since we could just call `fsmonitor_run_daemon()`
+	 * and let it fail if the pipe/socket is busy.
+	 *
+	 * However, this method gives us a nicer error message for a
+	 * common error case.
+	 */
+	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		die("fsmonitor--daemon is already running.");
+
+	return !!fsmonitor_run_daemon();
+}
+
+#ifndef GIT_WINDOWS_NATIVE
+/*
+ * This is adapted from `daemonize()`.  Use `fork()` to directly create
+ * and run the daemon in a child process.  The fork-parent returns the
+ * child PID so that we can wait for the child to startup before exiting.
+ */
+static int spawn_background_fsmonitor_daemon(pid_t *pid)
+{
+	*pid = fork();
+
+	switch (*pid) {
+	case 0:
+		if (setsid() == -1)
+			error_errno(_("setsid failed"));
+		close(0);
+		close(1);
+		close(2);
+		sanitize_stdfds();
+
+		return !!fsmonitor_run_daemon();
+
+	case -1:
+		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
+
+	default:
+		return 0;
+	}
+}
+#else
+/*
+ * Conceptually like `daemonize()` but different because Windows does not
+ * have `fork(2)`.  Spawn a normal Windows child process but without the
+ * limitations of `start_command()` and `finish_command()`.
+ */
+static int spawn_background_fsmonitor_daemon(pid_t *pid)
+{
+	char git_exe[MAX_PATH];
+	struct strvec args = STRVEC_INIT;
+	int in, out;
+
+	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
+
+	in = open("/dev/null", O_RDONLY);
+	out = open("/dev/null", O_WRONLY);
+
+	strvec_push(&args, git_exe);
+	strvec_push(&args, "fsmonitor--daemon");
+	strvec_push(&args, "run");
+
+	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
+	close(in);
+	close(out);
+
+	strvec_clear(&args);
+
+	if (*pid < 0)
+		return error(_("could not spawn fsmonitor--daemon in the background"));
+
+	return 0;
+}
+#endif
+
+/*
+ * This is adapted from `wait_or_whine()`.  Watch the child process and
+ * let it get started and begin listening for requests on the socket
+ * before reporting our success.
+ */
+static int wait_for_background_startup(pid_t pid_child)
+{
+	int status;
+	pid_t pid_seen;
+	enum ipc_active_state s;
+	time_t time_limit, now;
+
+	time(&time_limit);
+	time_limit += fsmonitor__start_timeout_sec;
+
+	for (;;) {
+		pid_seen = waitpid(pid_child, &status, WNOHANG);
+
+		if (pid_seen == -1)
+			return error_errno(_("waitpid failed"));
+		else if (pid_seen == 0) {
+			/*
+			 * The child is still running (this should be
+			 * the normal case).  Try to connect to it on
+			 * the socket and see if it is ready for
+			 * business.
+			 *
+			 * If there is another daemon already running,
+			 * our child will fail to start (possibly
+			 * after a timeout on the lock), but we don't
+			 * care (who responds) if the socket is live.
+			 */
+			s = fsmonitor_ipc__get_state();
+			if (s == IPC_STATE__LISTENING)
+				return 0;
+
+			time(&now);
+			if (now > time_limit)
+				return error(_("fsmonitor--daemon not online yet"));
+		} else if (pid_seen == pid_child) {
+			/*
+			 * The new child daemon process shut down while
+			 * it was starting up, so it is not listening
+			 * on the socket.
+			 *
+			 * Try to ping the socket in the odd chance
+			 * that another daemon started (or was already
+			 * running) while our child was starting.
+			 *
+			 * Again, we don't care who services the socket.
+			 */
+			s = fsmonitor_ipc__get_state();
+			if (s == IPC_STATE__LISTENING)
+				return 0;
+
+			/*
+			 * We don't care about the WEXITSTATUS() nor
+			 * any of the WIF*(status) values because
+			 * `cmd_fsmonitor__daemon()` does the `!!result`
+			 * trick on all function return values.
+			 *
+			 * So it is sufficient to just report the
+			 * early shutdown as an error.
+			 */
+			return error(_("fsmonitor--daemon failed to start"));
+		} else
+			return error(_("waitpid is confused"));
+	}
+}
+
+static int try_to_start_background_daemon(void)
+{
+	pid_t pid_child;
+	int ret;
+
+	/*
+	 * Before we try to create a background daemon process, see
+	 * if a daemon process is already listening.  This makes it
+	 * easier for us to report an already-listening error to the
+	 * console, since our spawn/daemon can only report the success
+	 * of creating the background process (and not whether it
+	 * immediately exited).
+	 */
+	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		die("fsmonitor--daemon is already running.");
+
+	/*
+	 * Run the actual daemon in a background process.
+	 */
+	ret = spawn_background_fsmonitor_daemon(&pid_child);
+	if (pid_child <= 0)
+		return ret;
+
+	/*
+	 * Wait (with timeout) for the background child process to get
+	 * started and begin listening on the socket/pipe.  This makes
+	 * the "start" command more synchronous and more reliable in
+	 * tests.
+	 */
+	ret = wait_for_background_startup(pid_child);
+
+	return ret;
+}
+
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
 	const char *subcmd;
 
 	struct option options[] = {
+		OPT_INTEGER(0, "ipc-threads",
+			    &fsmonitor__ipc_threads,
+			    N_("use <n> ipc worker threads")),
+		OPT_INTEGER(0, "start-timeout",
+			    &fsmonitor__start_timeout_sec,
+			    N_("Max seconds to wait for background daemon startup")),
+
 		OPT_END()
 	};
 
@@ -69,7 +448,7 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_fsmonitor__daemon_usage, options);
 
-	git_config(git_default_config, NULL);
+	git_config(fsmonitor_config, NULL);
 
 	subcmd = argv[1];
 	argv--;
@@ -77,6 +456,15 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, options,
 			     builtin_fsmonitor__daemon_usage, 0);
+	if (fsmonitor__ipc_threads < 1)
+		die(_("invalid 'ipc-threads' value (%d)"),
+		    fsmonitor__ipc_threads);
+
+	if (!strcmp(subcmd, "start"))
+		return !!try_to_start_background_daemon();
+
+	if (!strcmp(subcmd, "run"))
+		return !!try_to_run_foreground_daemon();
 
 	if (!strcmp(subcmd, "stop"))
 		return !!do_as_client__send_stop();
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
new file mode 100644
index 000000000000..3009c1a83de7
--- /dev/null
+++ b/fsmonitor--daemon.h
@@ -0,0 +1,34 @@
+#ifndef FSMONITOR_DAEMON_H
+#define FSMONITOR_DAEMON_H
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+#include "cache.h"
+#include "dir.h"
+#include "run-command.h"
+#include "simple-ipc.h"
+#include "thread-utils.h"
+
+struct fsmonitor_batch;
+struct fsmonitor_token_data;
+
+struct fsmonitor_daemon_backend_data; /* opaque platform-specific data */
+
+struct fsmonitor_daemon_state {
+	pthread_t listener_thread;
+	pthread_mutex_t main_lock;
+
+	struct strbuf path_worktree_watch;
+	struct strbuf path_gitdir_watch;
+	int nr_paths_watching;
+
+	struct fsmonitor_token_data *current_token_data;
+
+	int error_code;
+	struct fsmonitor_daemon_backend_data *backend_data;
+
+	struct ipc_server_data *ipc_server_data;
+};
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget



* [PATCH v2 14/28] fsmonitor--daemon: add pathname classification
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (12 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 13/28] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 15/28] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
                     ` (15 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to classify relative and absolute
pathnames and decide how they should be handled.  This will
be used by the platform-specific backend to respond to each
filesystem event.

When we register for filesystem notifications on a directory,
we get events for everything (recursively) in the directory.
We want to report to clients changes to tracked and untracked
paths within the working directory.  We do not want to report
changes within the .git directory, for example.

This classification will be used in a later commit by the
different backends to classify paths as events are received.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 81 +++++++++++++++++++++++++++++++++++++
 fsmonitor--daemon.h         | 61 ++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+)
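
Reviewer-style note, not part of the patch: the classification rules
described above can be sketched in a few standalone lines.  This is a
minimal sketch under stated assumptions.  The `sk_` names and the enum
are hypothetical, only the worktree case is covered, and plain
`strncmp()` stands in for Git's `fspathncmp()`, so the sketch is always
case-sensitive, whereas the real helper may compare case-insensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch; the patch uses enum fsmonitor_path_type. */
enum sk_path_type {
	SK_WORKDIR_PATH = 0,
	SK_DOT_GIT,
	SK_INSIDE_DOT_GIT,
	SK_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX,
	SK_OUTSIDE_CONE,
};

#define SK_COOKIE_PREFIX ".fsmonitor-daemon-"

/* Classify a path that is relative to the root of the working directory. */
enum sk_path_type sk_classify_workdir_relative(const char *rel)
{
	assert(rel);

	if (strncmp(rel, ".git", 4))
		return SK_WORKDIR_PATH;
	rel += 4;

	if (!*rel)
		return SK_DOT_GIT;	/* exactly ".git" */
	if (*rel != '/')
		return SK_WORKDIR_PATH;	/* e.g. ".gitignore" */
	rel++;

	if (!strncmp(rel, SK_COOKIE_PREFIX, strlen(SK_COOKIE_PREFIX)))
		return SK_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;

	return SK_INSIDE_DOT_GIT;
}

/*
 * Classify an absolute path against a worktree root.  The character
 * after the root prefix must be NUL or '/', so that "/work/repo2" is
 * not mistaken for something inside "/work/repo".
 */
enum sk_path_type sk_classify_workdir_absolute(const char *root,
					       const char *path)
{
	size_t len;

	assert(root && path);

	len = strlen(root);
	if (strncmp(path, root, len))
		return SK_OUTSIDE_CONE;
	path += len;

	if (!*path)
		return SK_WORKDIR_PATH;	/* the root directory itself */
	if (*path != '/')
		return SK_OUTSIDE_CONE;

	return sk_classify_workdir_relative(path + 1);
}
```

The boundary check after the prefix comparison is the part that is easy
to get wrong: a bare prefix match would treat a sibling directory whose
name merely starts with the worktree name as being inside the cone.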

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 85f99dba861f..d6e35ded9f68 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -119,6 +119,87 @@ static int handle_client(void *data,
 	return result;
 }
 
+#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
+
+enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
+	const char *rel)
+{
+	if (fspathncmp(rel, ".git", 4))
+		return IS_WORKDIR_PATH;
+	rel += 4;
+
+	if (!*rel)
+		return IS_DOT_GIT;
+	if (*rel != '/')
+		return IS_WORKDIR_PATH; /* e.g. .gitignore */
+	rel++;
+
+	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
+			strlen(FSMONITOR_COOKIE_PREFIX)))
+		return IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;
+
+	return IS_INSIDE_DOT_GIT;
+}
+
+enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
+	const char *rel)
+{
+	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
+			strlen(FSMONITOR_COOKIE_PREFIX)))
+		return IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX;
+
+	return IS_INSIDE_GITDIR;
+}
+
+static enum fsmonitor_path_type try_classify_workdir_abs_path(
+	struct fsmonitor_daemon_state *state,
+	const char *path)
+{
+	const char *rel;
+
+	if (fspathncmp(path, state->path_worktree_watch.buf,
+		       state->path_worktree_watch.len))
+		return IS_OUTSIDE_CONE;
+
+	rel = path + state->path_worktree_watch.len;
+
+	if (!*rel)
+		return IS_WORKDIR_PATH; /* it is the root dir exactly */
+	if (*rel != '/')
+		return IS_OUTSIDE_CONE;
+	rel++;
+
+	return fsmonitor_classify_path_workdir_relative(rel);
+}
+
+enum fsmonitor_path_type fsmonitor_classify_path_absolute(
+	struct fsmonitor_daemon_state *state,
+	const char *path)
+{
+	const char *rel;
+	enum fsmonitor_path_type t;
+
+	t = try_classify_workdir_abs_path(state, path);
+	if (state->nr_paths_watching == 1)
+		return t;
+	if (t != IS_OUTSIDE_CONE)
+		return t;
+
+	if (fspathncmp(path, state->path_gitdir_watch.buf,
+		       state->path_gitdir_watch.len))
+		return IS_OUTSIDE_CONE;
+
+	rel = path + state->path_gitdir_watch.len;
+
+	if (!*rel)
+		return IS_GITDIR; /* it is the <gitdir> exactly */
+	if (*rel != '/')
+		return IS_OUTSIDE_CONE;
+	rel++;
+
+	return fsmonitor_classify_path_gitdir_relative(rel);
+}
+
 static void *fsmonitor_fs_listen__thread_proc(void *_state)
 {
 	struct fsmonitor_daemon_state *state = _state;
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 3009c1a83de7..7bbb3a27a1ce 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -30,5 +30,66 @@ struct fsmonitor_daemon_state {
 	struct ipc_server_data *ipc_server_data;
 };
 
+/*
+ * Pathname classifications.
+ *
+ * The daemon classifies the pathnames that it receives from file
+ * system notification events into the following categories and uses
+ * that to decide whether clients are told about them.  (And to watch
+ * for file system synchronization events.)
+ *
+ * The client should only care about paths within the working
+ * directory proper (inside the working directory and not ".git" nor
+ * inside of ".git/").  That is, the client has read the index and is
+ * asking for a list of any paths in the working directory that have
+ * been modified since the last token.  The client does not care about
+ * file system changes within the .git directory (such as new loose
+ * objects or packfiles).  So the client will only receive paths that
+ * are classified as IS_WORKDIR_PATH.
+ *
+ * The daemon uses the IS_DOT_GIT and IS_GITDIR internally to mean the
+ * exact ".git" directory or GITDIR.  If the daemon receives a delete
+ * event for either of these directories, it will automatically
+ * shut down, for example.
+ *
+ * Note that the daemon DOES NOT explicitly watch nor special case the
+ * ".git/index" file.  The daemon does not read the index and does not
+ * have any internal index-relative state.  The daemon only collects
+ * the set of modified paths within the working directory.
+ */
+enum fsmonitor_path_type {
+	IS_WORKDIR_PATH = 0,
+
+	IS_DOT_GIT,
+	IS_INSIDE_DOT_GIT,
+	IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX,
+
+	IS_GITDIR,
+	IS_INSIDE_GITDIR,
+	IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX,
+
+	IS_OUTSIDE_CONE,
+};
+
+/*
+ * Classify a pathname relative to the root of the working directory.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
+	const char *relative_path);
+
+/*
+ * Classify a pathname relative to a <gitdir> that is external to the
+ * worktree directory.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
+	const char *relative_path);
+
+/*
+ * Classify an absolute pathname received from a filesystem event.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_absolute(
+	struct fsmonitor_daemon_state *state,
+	const char *path);
+
 #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
 #endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget



* [PATCH v2 15/28] fsmonitor--daemon: define token-ids
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (13 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 14/28] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 16/28] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
                     ` (14 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to create token-ids and define the
overall token naming scheme.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 116 +++++++++++++++++++++++++++++++++++-
 1 file changed, 115 insertions(+), 1 deletion(-)
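
Reviewer-style note, not part of the patch: this commit only creates
token-ids, so the following is a hypothetical sketch of how a request
token of the form "builtin:<token_id>:<sequence_nr>" might be split and
compared later on.  The `sk_` names, the fixed-size buffer, and the
choice to split on the last ':' are all assumptions of this sketch.  It
relies only on the token_id format shown in this patch, which contains
no ':' characters.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SK_TOKEN_ID_MAX 128	/* arbitrary limit for this sketch */

struct sk_token {
	char token_id[SK_TOKEN_ID_MAX];
	uint64_t seq_nr;
};

/* Parse "builtin:<token_id>:<seq_nr>".  Returns 0 on success, -1 if malformed. */
int sk_parse_token(const char *s, struct sk_token *out)
{
	static const char prefix[] = "builtin:";
	const char *id, *colon;
	char *end;
	size_t id_len;
	unsigned long long v;

	assert(s && out);

	if (strncmp(s, prefix, strlen(prefix)))
		return -1;	/* some other provider's namespace */
	id = s + strlen(prefix);

	colon = strrchr(id, ':');
	if (!colon || colon == id)
		return -1;	/* missing <seq_nr> or empty <token_id> */
	id_len = (size_t)(colon - id);
	if (id_len >= SK_TOKEN_ID_MAX)
		return -1;

	if (colon[1] < '0' || colon[1] > '9')
		return -1;	/* reject signs, spaces and empty numbers */
	errno = 0;
	v = strtoull(colon + 1, &end, 10);
	if (errno || *end)
		return -1;

	memcpy(out->token_id, id, id_len);
	out->token_id[id_len] = '\0';
	out->seq_nr = (uint64_t)v;
	return 0;
}

/*
 * The token_id is opaque: only equality is meaningful.  A request with
 * a stale (non-current) token_id gets a trivial response.
 */
int sk_needs_trivial_response(const struct sk_token *req,
			      const char *current_token_id)
{
	return strcmp(req->token_id, current_token_id) != 0;
}
```

Note that the sketch never orders two token_ids, in keeping with the
comment in the patch that tokens do not define less-than/greater-than
relationships.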

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index d6e35ded9f68..ecd456dc9284 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -91,6 +91,120 @@ static int do_as_client__status(void)
 	}
 }
 
+/*
+ * Requests to and from a FSMonitor Protocol V2 provider use an opaque
+ * "token" as a virtual timestamp.  Clients can request a summary of all
+ * created/deleted/modified files relative to a token.  In the response,
+ * clients receive a new token for the next (relative) request.
+ *
+ *
+ * Token Format
+ * ============
+ *
+ * The contents of the token are private and provider-specific.
+ *
+ * For the built-in fsmonitor--daemon, we define a token as follows:
+ *
+ *     "builtin" ":" <token_id> ":" <sequence_nr>
+ *
+ * The "builtin" prefix is used as a namespace to avoid conflicts
+ * with other providers (such as Watchman).
+ *
+ * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
+ * UUID, or {timestamp,pid}.  It is used to group all filesystem
+ * events that happened while the daemon was monitoring (and in-sync
+ * with the filesystem).
+ *
+ *     Unlike FSMonitor Protocol V1, it is not defined as a timestamp
+ *     and does not define less-than/greater-than relationships.
+ *     (There are too many race conditions to rely on file system
+ *     event timestamps.)
+ *
+ * The <sequence_nr> is a simple integer incremented whenever the
+ * daemon needs to make its state public.  For example, if 1000 file
+ * system events come in, but no clients have requested the data,
+ * the daemon can continue to accumulate file changes in the same
+ * bin and does not need to advance the sequence number.  However,
+ * as soon as a client does arrive, the daemon needs to start a new
+ * bin and increment the sequence number.
+ *
+ *     The sequence number serves as the boundary between 2 sets
+ *     of bins -- the older ones that the client has already seen
+ *     and the newer ones that it hasn't.
+ *
+ * When a new <token_id> is created, the <sequence_nr> is reset to
+ * zero.
+ *
+ *
+ * About Token Ids
+ * ===============
+ *
+ * A new token_id is created:
+ *
+ * [1] each time the daemon is started.
+ *
+ * [2] any time that the daemon must re-sync with the filesystem
+ *     (such as when the kernel drops or we miss events on a very
+ *     active volume).
+ *
+ * [3] in response to a client "flush" command (for dropped event
+ *     testing).
+ *
+ * When a new token_id is created, the daemon is free to discard all
+ * cached filesystem events associated with any previous token_ids.
+ * Events associated with a non-current token_id will never be sent
+ * to a client.  A token_id change implicitly means that the daemon
+ * has a gap in its event history.
+ *
+ * Therefore, clients that present a token with a stale (non-current)
+ * token_id will always be given a trivial response.
+ */
+struct fsmonitor_token_data {
+	struct strbuf token_id;
+	struct fsmonitor_batch *batch_head;
+	struct fsmonitor_batch *batch_tail;
+	uint64_t client_ref_count;
+};
+
+static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
+{
+	static int test_env_value = -1;
+	static uint64_t flush_count = 0;
+	struct fsmonitor_token_data *token;
+
+	CALLOC_ARRAY(token, 1);
+
+	strbuf_init(&token->token_id, 0);
+	token->batch_head = NULL;
+	token->batch_tail = NULL;
+	token->client_ref_count = 0;
+
+	if (test_env_value < 0)
+		test_env_value = git_env_bool("GIT_TEST_FSMONITOR_TOKEN", 0);
+
+	if (!test_env_value) {
+		struct timeval tv;
+		struct tm tm;
+		time_t secs;
+
+		gettimeofday(&tv, NULL);
+		secs = tv.tv_sec;
+		gmtime_r(&secs, &tm);
+
+		strbuf_addf(&token->token_id,
+			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
+			    flush_count++,
+			    getpid(),
+			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+			    tm.tm_hour, tm.tm_min, tm.tm_sec,
+			    (long)tv.tv_usec);
+	} else {
+		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);
+	}
+
+	return token;
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data,
@@ -281,7 +395,7 @@ static int fsmonitor_run_daemon(void)
 
 	pthread_mutex_init(&state.main_lock, NULL);
 	state.error_code = 0;
-	state.current_token_data = NULL;
+	state.current_token_data = fsmonitor_new_token_data();
 
 	/* Prepare to (recursively) watch the <worktree-root> directory. */
 	strbuf_init(&state.path_worktree_watch, 0);
-- 
gitgitgadget



* [PATCH v2 16/28] fsmonitor--daemon: create token-based changed path cache
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (14 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 15/28] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 17/28] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
                     ` (13 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to build a list of changed paths and associate
them with a token-id.  This will be used by the platform-specific
backends to accumulate changed paths in response to filesystem events.

The platform-specific file system listener thread receives file system
events containing one or more changed pathnames (with whatever bucketing
or grouping that is convenient for the file system).  These paths are
accumulated (without locking) by the file system layer into a `fsmonitor_batch`.

When the file system layer has drained the kernel event queue, it will
"publish" them to our token queue and make them visible to concurrent
client worker threads.  The token layer is free to combine and/or de-dup
paths within these batches for efficient presentation to clients.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 234 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |  40 ++++++
 2 files changed, 272 insertions(+), 2 deletions(-)
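
Reviewer-style note, not part of the patch: the publish step chooses
between prepending the new batch, dropping it, or folding it into the
current head.  The decision can be isolated as a small pure function.
This is a sketch under stated assumptions: the `sk_` names and the
reduced head struct are hypothetical, and locking, path storage and
cookie handling are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SK_COMBINE_LIMIT 1024	/* mirrors MY_COMBINE_LIMIT in the patch */

enum sk_publish_action {
	SK_PREPEND,	/* link the new batch in front of the head */
	SK_DROP,	/* discard the new batch */
	SK_COMBINE,	/* append the new paths onto the head batch */
};

/* Only the fields of the head batch that drive the decision. */
struct sk_head {
	uint64_t seq_nr;
	size_t nr;
	time_t pinned_time;	/* nonzero once a client has seen this batch */
};

enum sk_publish_action sk_choose_action(const struct sk_head *head,
					size_t new_nr)
{
	assert(head);

	if (head->pinned_time)
		return SK_PREPEND;	/* head is visible to clients; do not alter it */
	if (!head->seq_nr)
		return SK_DROP;		/* unpinned batch 0 is never transmitted */
	if (head->nr + new_nr > SK_COMBINE_LIMIT)
		return SK_PREPEND;	/* combining would exceed the arbitrary limit */
	return SK_COMBINE;
}
```

The order of the checks matters: the pinned test comes first, so a
pinned batch 0 (as set up under GIT_TEST_FSMONITOR_TOKEN) still gets new
batches prepended rather than dropped.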

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index ecd456dc9284..663fead0d66e 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -166,17 +166,27 @@ struct fsmonitor_token_data {
 	uint64_t client_ref_count;
 };
 
+struct fsmonitor_batch {
+	struct fsmonitor_batch *next;
+	uint64_t batch_seq_nr;
+	const char **interned_paths;
+	size_t nr, alloc;
+	time_t pinned_time;
+};
+
 static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
 {
 	static int test_env_value = -1;
 	static uint64_t flush_count = 0;
 	struct fsmonitor_token_data *token;
+	struct fsmonitor_batch *batch;
 
 	CALLOC_ARRAY(token, 1);
+	batch = fsmonitor_batch__new();
 
 	strbuf_init(&token->token_id, 0);
-	token->batch_head = NULL;
-	token->batch_tail = NULL;
+	token->batch_head = batch;
+	token->batch_tail = batch;
 	token->client_ref_count = 0;
 
 	if (test_env_value < 0)
@@ -202,9 +212,147 @@ static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
 		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);
 	}
 
+	/*
+	 * We created a new <token_id> and are starting a new series
+	 * of tokens with a zero <seq_nr>.
+	 *
+	 * Since clients cannot guess our new (non test) <token_id>
+	 * they will always receive a trivial response (because of the
+	 * mismatch on the <token_id>).  The trivial response will
+	 * tell them our new <token_id> so that subsequent requests
+	 * will be relative to our new series.  (And when sending that
+	 * response, we pin the current head of the batch list.)
+	 *
+	 * Even if the client correctly guesses the <token_id>, their
+	 * request of "builtin:<token_id>:0" asks for all changes MORE
+	 * RECENT than batch/bin 0.
+	 *
+	 * This implies that it is a waste to accumulate paths in the
+	 * initial batch/bin (because they will never be transmitted).
+	 *
+	 * So the daemon could be running for days and watching the
+	 * file system, but doesn't need to actually accumulate any
+	 * paths UNTIL we need to set a reference point for a later
+	 * relative request.
+	 *
+	 * However, it is very useful for testing to always have a
+	 * reference point set.  Pin batch 0 to force early file system
+	 * events to accumulate.
+	 */
+	if (test_env_value)
+		batch->pinned_time = time(NULL);
+
 	return token;
 }
 
+struct fsmonitor_batch *fsmonitor_batch__new(void)
+{
+	struct fsmonitor_batch *batch;
+
+	CALLOC_ARRAY(batch, 1);
+
+	return batch;
+}
+
+struct fsmonitor_batch *fsmonitor_batch__pop(struct fsmonitor_batch *batch)
+{
+	struct fsmonitor_batch *next;
+
+	if (!batch)
+		return NULL;
+
+	next = batch->next;
+
+	/*
+	 * The actual strings within the array are interned, so we don't
+	 * own them.
+	 */
+	free(batch->interned_paths);
+
+	return next;
+}
+
+void fsmonitor_batch__add_path(struct fsmonitor_batch *batch,
+			       const char *path)
+{
+	const char *interned_path = strintern(path);
+
+	trace_printf_key(&trace_fsmonitor, "event: %s", interned_path);
+
+	ALLOC_GROW(batch->interned_paths, batch->nr + 1, batch->alloc);
+	batch->interned_paths[batch->nr++] = interned_path;
+}
+
+static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
+				     const struct fsmonitor_batch *batch_src)
+{
+	size_t k;
+
+	ALLOC_GROW(batch_dest->interned_paths,
+		   batch_dest->nr + batch_src->nr + 1,
+		   batch_dest->alloc);
+
+	for (k = 0; k < batch_src->nr; k++)
+		batch_dest->interned_paths[batch_dest->nr++] =
+			batch_src->interned_paths[k];
+}
+
+static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
+{
+	struct fsmonitor_batch *p;
+
+	if (!token)
+		return;
+
+	assert(token->client_ref_count == 0);
+
+	strbuf_release(&token->token_id);
+
+	for (p = token->batch_head; p; p = fsmonitor_batch__pop(p))
+		;
+
+	free(token);
+}
+
+/*
+ * Flush all of our cached data about the filesystem.  Call this if we
+ * lose sync with the filesystem and miss some notification events.
+ *
+ * [1] If we are missing events, then we no longer have a complete
+ *     history of the directory (relative to our current start token).
+ *     We should create a new token and start fresh (as if we just
+ *     booted up).
+ *
+ * If there are no concurrent threads reading the current token data
+ * series, we can free it now.  Otherwise, let the last reader free
+ * it.
+ *
+ * Either way, the old token data series is no longer associated with
+ * our state data.
+ */
+static void with_lock__do_force_resync(struct fsmonitor_daemon_state *state)
+{
+	/* assert current thread holding state->main_lock */
+
+	struct fsmonitor_token_data *free_me = NULL;
+	struct fsmonitor_token_data *new_one = NULL;
+
+	new_one = fsmonitor_new_token_data();
+
+	if (state->current_token_data->client_ref_count == 0)
+		free_me = state->current_token_data;
+	state->current_token_data = new_one;
+
+	fsmonitor_free_token_data(free_me);
+}
+
+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
+{
+	pthread_mutex_lock(&state->main_lock);
+	with_lock__do_force_resync(state);
+	pthread_mutex_unlock(&state->main_lock);
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data,
@@ -314,6 +462,81 @@ enum fsmonitor_path_type fsmonitor_classify_path_absolute(
 	return fsmonitor_classify_path_gitdir_relative(rel);
 }
 
+/*
+ * We try to combine small batches at the front of the batch-list to avoid
+ * having a long list.  This hopefully makes it a little easier when we want
+ * to truncate and maintain the list.  However, we don't want the paths array
+ * to just keep growing and growing with realloc, so we insert an arbitrary
+ * limit.
+ */
+#define MY_COMBINE_LIMIT (1024)
+
+void fsmonitor_publish(struct fsmonitor_daemon_state *state,
+		       struct fsmonitor_batch *batch,
+		       const struct string_list *cookie_names)
+{
+	if (!batch && !cookie_names->nr)
+		return;
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (batch) {
+		struct fsmonitor_batch *head;
+
+		head = state->current_token_data->batch_head;
+		if (!head) {
+			BUG("token does not have batch");
+		} else if (head->pinned_time) {
+			/*
+			 * We cannot alter the current batch list
+			 * because:
+			 *
+			 * [a] it is being transmitted to at least one
+			 * client and the handle_client() thread has a
+			 * ref-count, but not a lock on the batch list
+			 * starting with this item.
+			 *
+			 * [b] it has been transmitted in the past to
+			 * at least one client such that future
+			 * requests are relative to this head batch.
+			 *
+			 * So, we can only prepend a new batch onto
+			 * the front of the list.
+			 */
+			batch->batch_seq_nr = head->batch_seq_nr + 1;
+			batch->next = head;
+			state->current_token_data->batch_head = batch;
+		} else if (!head->batch_seq_nr) {
+			/*
+			 * Batch 0 is unpinned.  See the note in
+			 * `fsmonitor_new_token_data()` about why we
+			 * don't need to accumulate these paths.
+			 */
+			fsmonitor_batch__pop(batch);
+		} else if (head->nr + batch->nr > MY_COMBINE_LIMIT) {
+			/*
+			 * The head batch in the list has never been
+			 * transmitted to a client, but folding the
+			 * contents of the new batch onto it would
+			 * exceed our arbitrary limit, so just prepend
+			 * the new batch onto the list.
+			 */
+			batch->batch_seq_nr = head->batch_seq_nr + 1;
+			batch->next = head;
+			state->current_token_data->batch_head = batch;
+		} else {
+			/*
+			 * We are free to append the paths in the given
+			 * batch onto the end of the current head batch.
+			 */
+			fsmonitor_batch__combine(head, batch);
+			fsmonitor_batch__pop(batch);
+		}
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+}
+
 static void *fsmonitor_fs_listen__thread_proc(void *_state)
 {
 	struct fsmonitor_daemon_state *state = _state;
@@ -328,6 +551,13 @@ static void *fsmonitor_fs_listen__thread_proc(void *_state)
 
 	fsmonitor_fs_listen__loop(state);
 
+	pthread_mutex_lock(&state->main_lock);
+	if (state->current_token_data &&
+	    state->current_token_data->client_ref_count == 0)
+		fsmonitor_free_token_data(state->current_token_data);
+	state->current_token_data = NULL;
+	pthread_mutex_unlock(&state->main_lock);
+
 	trace2_thread_exit();
 	return NULL;
 }
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 7bbb3a27a1ce..89a9ef20b24b 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -12,6 +12,27 @@
 struct fsmonitor_batch;
 struct fsmonitor_token_data;
 
+/*
+ * Create a new batch of path(s).  The returned batch is considered
+ * private and not linked into the fsmonitor daemon state.  The caller
+ * should fill this batch with one or more paths and then publish it.
+ */
+struct fsmonitor_batch *fsmonitor_batch__new(void);
+
+/*
+ * Free this batch and return the value of the batch->next field.
+ */
+struct fsmonitor_batch *fsmonitor_batch__pop(struct fsmonitor_batch *batch);
+
+/*
+ * Add this path to this batch of modified files.
+ *
+ * The batch should be private and NOT (yet) linked into the fsmonitor
+ * daemon state and therefore not yet visible to worker threads and so
+ * no locking is required.
+ */
+void fsmonitor_batch__add_path(struct fsmonitor_batch *batch, const char *path);
+
 struct fsmonitor_daemon_backend_data; /* opaque platform-specific data */
 
 struct fsmonitor_daemon_state {
@@ -91,5 +112,24 @@ enum fsmonitor_path_type fsmonitor_classify_path_absolute(
 	struct fsmonitor_daemon_state *state,
 	const char *path);
 
+/*
+ * Prepend this batch of path(s) onto the list of batches associated
+ * with the current token.  This makes the batch visible to worker threads.
+ *
+ * The caller no longer owns the batch and must not free it.
+ *
+ * Wake up the client threads waiting on these cookies.
+ */
+void fsmonitor_publish(struct fsmonitor_daemon_state *state,
+		       struct fsmonitor_batch *batch,
+		       const struct string_list *cookie_names);
+
+/*
+ * If the platform-specific layer loses sync with the filesystem,
+ * it should call this to invalidate cached data and abort waiting
+ * threads.
+ */
+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state);
+
 #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
 #endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget



* [PATCH v2 17/28] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (15 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 16/28] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 18/28] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
                     ` (12 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach the win32 backend to register a watch on the working tree
root directory (recursively).  Also watch the <gitdir> if it is
not inside the working tree.  Collect path change notifications
into batches and publish them.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 532 +++++++++++++++++++
 1 file changed, 532 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
index 880446b49e35..ba087b292df6 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -2,20 +2,552 @@
 #include "config.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
+
+/*
+ * The documentation of ReadDirectoryChangesW() states that the maximum
+ * buffer size is 64K when the monitored directory is remote.
+ *
+ * Larger buffers may be used when the monitored directory is local and
+ * will help us receive events faster from the kernel and avoid dropped
+ * events.
+ *
+ * So we try to use a very large buffer and silently fall back to 64K if
+ * we get an error.
+ */
+#define MAX_RDCW_BUF_FALLBACK (65536)
+#define MAX_RDCW_BUF          (65536 * 8)
+
+struct one_watch
+{
+	char buffer[MAX_RDCW_BUF];
+	DWORD buf_len;
+	DWORD count;
+
+	struct strbuf path;
+	HANDLE hDir;
+	HANDLE hEvent;
+	OVERLAPPED overlapped;
+
+	/*
+	 * Is there an active ReadDirectoryChangesW() call pending.  If so, we
+	 * need to later call GetOverlappedResult() and possibly CancelIoEx().
+	 */
+	BOOL is_active;
+};
+
+struct fsmonitor_daemon_backend_data
+{
+	struct one_watch *watch_worktree;
+	struct one_watch *watch_gitdir;
+
+	HANDLE hEventShutdown;
+
+	HANDLE hListener[3]; /* we don't own these handles */
+#define LISTENER_SHUTDOWN 0
+#define LISTENER_HAVE_DATA_WORKTREE 1
+#define LISTENER_HAVE_DATA_GITDIR 2
+	int nr_listener_handles;
+};
+
+/*
+ * Convert the WCHAR path from the notification into UTF8 and
+ * then normalize it.
+ */
+static int normalize_path_in_utf8(FILE_NOTIFY_INFORMATION *info,
+				  struct strbuf *normalized_path)
+{
+	int reserve;
+	int len = 0;
+
+	strbuf_reset(normalized_path);
+	if (!info->FileNameLength)
+		goto normalize;
+
+	/*
+	 * Pre-reserve enough space in the UTF8 buffer for
+	 * each Unicode WCHAR character to be mapped into a
+	 * sequence of 2 UTF8 characters.  That should let us
+	 * avoid ERROR_INSUFFICIENT_BUFFER 99.9+% of the time.
+	 */
+	reserve = info->FileNameLength + 1;
+	strbuf_grow(normalized_path, reserve);
+
+	for (;;) {
+		len = WideCharToMultiByte(CP_UTF8, 0, info->FileName,
+					  info->FileNameLength / sizeof(WCHAR),
+					  normalized_path->buf,
+					  strbuf_avail(normalized_path) - 1,
+					  NULL, NULL);
+		if (len > 0)
+			goto normalize;
+		if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
+			error("[GLE %ld] could not convert path to UTF-8: '%.*ls'",
+			      GetLastError(),
+			      (int)(info->FileNameLength / sizeof(WCHAR)),
+			      info->FileName);
+			return -1;
+		}
+
+		strbuf_grow(normalized_path,
+			    strbuf_avail(normalized_path) + reserve);
+	}
+
+normalize:
+	strbuf_setlen(normalized_path, len);
+	return strbuf_normalize_path(normalized_path);
+}
 
 void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
 {
+	SetEvent(state->backend_data->hListener[LISTENER_SHUTDOWN]);
+}
+
+static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
+				      const char *path)
+{
+	struct one_watch *watch = NULL;
+	DWORD desired_access = FILE_LIST_DIRECTORY;
+	DWORD share_mode =
+		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
+	HANDLE hDir;
+
+	hDir = CreateFileA(path,
+			   desired_access, share_mode, NULL, OPEN_EXISTING,
+			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
+			   NULL);
+	if (hDir == INVALID_HANDLE_VALUE) {
+		error(_("[GLE %ld] could not watch '%s'"),
+		      GetLastError(), path);
+		return NULL;
+	}
+
+	CALLOC_ARRAY(watch, 1);
+
+	watch->buf_len = sizeof(watch->buffer); /* assume full MAX_RDCW_BUF */
+
+	strbuf_init(&watch->path, 0);
+	strbuf_addstr(&watch->path, path);
+
+	watch->hDir = hDir;
+	watch->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
+
+	return watch;
+}
+
+static void destroy_watch(struct one_watch *watch)
+{
+	if (!watch)
+		return;
+
+	strbuf_release(&watch->path);
+	if (watch->hDir != INVALID_HANDLE_VALUE)
+		CloseHandle(watch->hDir);
+	if (watch->hEvent != INVALID_HANDLE_VALUE)
+		CloseHandle(watch->hEvent);
+
+	free(watch);
+}
+
+static int start_rdcw_watch(struct fsmonitor_daemon_backend_data *data,
+			    struct one_watch *watch)
+{
+	DWORD dwNotifyFilter =
+		FILE_NOTIFY_CHANGE_FILE_NAME |
+		FILE_NOTIFY_CHANGE_DIR_NAME |
+		FILE_NOTIFY_CHANGE_ATTRIBUTES |
+		FILE_NOTIFY_CHANGE_SIZE |
+		FILE_NOTIFY_CHANGE_LAST_WRITE |
+		FILE_NOTIFY_CHANGE_CREATION;
+
+	ResetEvent(watch->hEvent);
+
+	memset(&watch->overlapped, 0, sizeof(watch->overlapped));
+	watch->overlapped.hEvent = watch->hEvent;
+
+start_watch:
+	/*
+	 * Queue an async call using Overlapped IO.  This returns immediately.
+	 * Our event handle will be signalled when the real result is available.
+	 *
+	 * The return value here just means that we successfully queued it.
+	 * We won't know if the Read...() actually produces data until later.
+	 */
+	watch->is_active = ReadDirectoryChangesW(
+		watch->hDir, watch->buffer, watch->buf_len, TRUE,
+		dwNotifyFilter, &watch->count, &watch->overlapped, NULL);
+
+	/*
+	 * The kernel throws an invalid parameter error when our buffer
+	 * is too big and we are pointed at a remote directory (and possibly
+	 * for other reasons).  Quietly set it down and try again.
+	 *
+	 * See note about MAX_RDCW_BUF at the top.
+	 */
+	if (!watch->is_active &&
+	    GetLastError() == ERROR_INVALID_PARAMETER &&
+	    watch->buf_len > MAX_RDCW_BUF_FALLBACK) {
+		watch->buf_len = MAX_RDCW_BUF_FALLBACK;
+		goto start_watch;
+	}
+
+	if (watch->is_active)
+		return 0;
+
+	error("ReadDirectoryChangesW failed on '%s' [GLE %ld]",
+	      watch->path.buf, GetLastError());
+	return -1;
+}
+
+static int recv_rdcw_watch(struct one_watch *watch)
+{
+	watch->is_active = FALSE;
+
+	/*
+	 * The overlapped result is ready.  If the Read...() was successful
+	 * we finally receive the actual result into our buffer.
+	 */
+	if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count,
+				TRUE))
+		return 0;
+
+	/*
+	 * NEEDSWORK: If an external <gitdir> is deleted, the above
+	 * returns an error.  I'm not sure that there's anything that
+	 * we can do here other than failing -- the <worktree>/.git
+	 * link file would be broken anyway.  We might try to check
+	 * for that and return a better error message, but I'm not
+	 * sure it is worth it.
+	 */
+
+	error("GetOverlappedResult failed on '%s' [GLE %ld]",
+	      watch->path.buf, GetLastError());
+	return -1;
+}
+
+static void cancel_rdcw_watch(struct one_watch *watch)
+{
+	DWORD count;
+
+	if (!watch || !watch->is_active)
+		return;
+
+	/*
+	 * The calls to ReadDirectoryChangesW() and GetOverlappedResult()
+	 * form a "pair" (my term) where we queue an IO and promise to
+	 * hang around and wait for the kernel to give us the result.
+	 *
+	 * If for some reason after we queue the IO, we have to quit
+	 * or otherwise not stick around for the second half, we must
+	 * tell the kernel to abort the IO.  This prevents the kernel
+	 * from writing to our buffer and/or signalling our event
+	 * after we free them.
+	 *
+	 * (Ask me how much fun it was to track that one down).
+	 */
+	CancelIoEx(watch->hDir, &watch->overlapped);
+	GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE);
+	watch->is_active = FALSE;
+}
+
+/*
+ * Process filesystem events that happen anywhere (recursively) under the
+ * <worktree> root directory.  For a normal working directory, this includes
+ * both version controlled files and the contents of the .git/ directory.
+ *
+ * If <worktree>/.git is a file, then we only see events for the file
+ * itself.
+ */
+static int process_worktree_events(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	struct one_watch *watch = data->watch_worktree;
+	struct strbuf path = STRBUF_INIT;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	struct fsmonitor_batch *batch = NULL;
+	const char *p = watch->buffer;
+
+	/*
+	 * If the kernel gets more events than will fit in the kernel
+	 * buffer associated with our RDCW handle, it drops them and
+	 * returns a count of zero.
+	 *
+	 * Yes, the call returns WITHOUT error and with length zero.
+	 *
+	 * (The "overflow" case is not ambiguous with the "no data" case
+	 * because we did an INFINITE wait.)
+	 *
+	 * This means we have a gap in coverage.  Tell the daemon layer
+	 * to resync.
+	 */
+	if (!watch->count) {
+		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
+				   "overflow");
+		fsmonitor_force_resync(state);
+		return LISTENER_HAVE_DATA_WORKTREE;
+	}
+
+	/*
+	 * On Windows, `info` contains an "array" of paths that are
+	 * relative to the root of whichever directory handle received
+	 * the event.
+	 */
+	for (;;) {
+		FILE_NOTIFY_INFORMATION *info = (void *)p;
+		const char *slash;
+		enum fsmonitor_path_type t;
+
+		strbuf_reset(&path);
+		if (normalize_path_in_utf8(info, &path) == -1)
+			goto skip_this_path;
+
+		t = fsmonitor_classify_path_workdir_relative(path.buf);
+
+		switch (t) {
+		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+			/* special case cookie files within .git */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path.buf);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path.buf);
+			break;
+
+		case IS_INSIDE_DOT_GIT:
+			/* ignore everything inside of "<worktree>/.git/" */
+			break;
+
+		case IS_DOT_GIT:
+			/* "<worktree>/.git" was deleted (or renamed away) */
+			if ((info->Action == FILE_ACTION_REMOVED) ||
+			    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
+				trace2_data_string("fsmonitor", NULL,
+						   "fsm-listen/dotgit",
+						   "removed");
+				goto force_shutdown;
+			}
+			break;
+
+		case IS_WORKDIR_PATH:
+			/* queue normal pathname */
+			if (!batch)
+				batch = fsmonitor_batch__new();
+			fsmonitor_batch__add_path(batch, path.buf);
+			break;
+
+		case IS_GITDIR:
+		case IS_INSIDE_GITDIR:
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+		default:
+			BUG("unexpected path classification '%d' for '%s'",
+			    t, path.buf);
+		}
+
+skip_this_path:
+		if (!info->NextEntryOffset)
+			break;
+		p += info->NextEntryOffset;
+	}
+
+	fsmonitor_publish(state, batch, &cookie_list);
+	batch = NULL;
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_HAVE_DATA_WORKTREE;
+
+force_shutdown:
+	fsmonitor_batch__pop(batch);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_SHUTDOWN;
+}
+
+/*
+ * Process filesystem events that happened anywhere (recursively) under the
+ * external <gitdir> (such as non-primary worktrees or submodules).
+ * We only care about cookie files that our client threads created here.
+ *
+ * Note that we DO NOT get filesystem events on the external <gitdir>
+ * itself (it is not inside something that we are watching).  In particular,
+ * we do not get an event if the external <gitdir> is deleted.
+ */
+static int process_gitdir_events(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	struct one_watch *watch = data->watch_gitdir;
+	struct strbuf path = STRBUF_INIT;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	const char *p = watch->buffer;
+
+	if (!watch->count) {
+		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
+				   "overflow");
+		fsmonitor_force_resync(state);
+		return LISTENER_HAVE_DATA_GITDIR;
+	}
+
+	for (;;) {
+		FILE_NOTIFY_INFORMATION *info = (void *)p;
+		const char *slash;
+		enum fsmonitor_path_type t;
+
+		strbuf_reset(&path);
+		if (normalize_path_in_utf8(info, &path) == -1)
+			goto skip_this_path;
+
+		t = fsmonitor_classify_path_gitdir_relative(path.buf);
+
+		trace_printf_key(&trace_fsmonitor, "BBB: %s", path.buf);
+
+		switch (t) {
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+			/* special case cookie files within gitdir */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path.buf);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path.buf);
+			break;
+
+		case IS_INSIDE_GITDIR:
+			goto skip_this_path;
+
+		default:
+			BUG("unexpected path classification '%d' for '%s'",
+			    t, path.buf);
+		}
+
+skip_this_path:
+		if (!info->NextEntryOffset)
+			break;
+		p += info->NextEntryOffset;
+	}
+
+	fsmonitor_publish(state, NULL, &cookie_list);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_HAVE_DATA_GITDIR;
 }
 
 void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	DWORD dwWait;
+
+	state->error_code = 0;
+
+	if (start_rdcw_watch(data, data->watch_worktree) == -1)
+		goto force_error_stop;
+
+	if (data->watch_gitdir &&
+	    start_rdcw_watch(data, data->watch_gitdir) == -1)
+		goto force_error_stop;
+
+	for (;;) {
+		dwWait = WaitForMultipleObjects(data->nr_listener_handles,
+						data->hListener,
+						FALSE, INFINITE);
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_WORKTREE) {
+			if (recv_rdcw_watch(data->watch_worktree) == -1)
+				goto force_error_stop;
+			if (process_worktree_events(state) == LISTENER_SHUTDOWN)
+				goto force_shutdown;
+			if (start_rdcw_watch(data, data->watch_worktree) == -1)
+				goto force_error_stop;
+			continue;
+		}
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_GITDIR) {
+			if (recv_rdcw_watch(data->watch_gitdir) == -1)
+				goto force_error_stop;
+			if (process_gitdir_events(state) == LISTENER_SHUTDOWN)
+				goto force_shutdown;
+			if (start_rdcw_watch(data, data->watch_gitdir) == -1)
+				goto force_error_stop;
+			continue;
+		}
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_SHUTDOWN)
+			goto clean_shutdown;
+
+		error(_("could not read directory changes [GLE %ld]"),
+		      GetLastError());
+		goto force_error_stop;
+	}
+
+force_error_stop:
+	state->error_code = -1;
+
+force_shutdown:
+	/*
+	 * Tell the IPC thread pool to stop (which completes the await
+	 * in the main thread (which will also signal this thread (if
+	 * we are still alive))).
+	 */
+	ipc_server_stop_async(state->ipc_server_data);
+
+clean_shutdown:
+	cancel_rdcw_watch(data->watch_worktree);
+	cancel_rdcw_watch(data->watch_gitdir);
 }
 
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	CALLOC_ARRAY(data, 1);
+
+	data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL);
+
+	data->watch_worktree = create_watch(state,
+					    state->path_worktree_watch.buf);
+	if (!data->watch_worktree)
+		goto failed;
+
+	if (state->nr_paths_watching > 1) {
+		data->watch_gitdir = create_watch(state,
+						  state->path_gitdir_watch.buf);
+		if (!data->watch_gitdir)
+			goto failed;
+	}
+
+	data->hListener[LISTENER_SHUTDOWN] = data->hEventShutdown;
+	data->nr_listener_handles++;
+
+	data->hListener[LISTENER_HAVE_DATA_WORKTREE] =
+		data->watch_worktree->hEvent;
+	data->nr_listener_handles++;
+
+	if (data->watch_gitdir) {
+		data->hListener[LISTENER_HAVE_DATA_GITDIR] =
+			data->watch_gitdir->hEvent;
+		data->nr_listener_handles++;
+	}
+
+	state->backend_data = data;
+	return 0;
+
+failed:
+	CloseHandle(data->hEventShutdown);
+	destroy_watch(data->watch_worktree);
+	destroy_watch(data->watch_gitdir);
+
 	return -1;
 }
 
 void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	if (!state || !state->backend_data)
+		return;
+
+	data = state->backend_data;
+
+	CloseHandle(data->hEventShutdown);
+	destroy_watch(data->watch_worktree);
+	destroy_watch(data->watch_gitdir);
+
+	FREE_AND_NULL(state->backend_data);
 }
-- 
gitgitgadget



* [PATCH v2 18/28] fsmonitor-fs-listen-macos: add macos header files for FSEvent
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (16 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 17/28] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 19/28] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
                     ` (11 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Include MacOS system declarations to allow us to use FSEvent and
CoreFoundation APIs.  We need GCC and clang versions because of
compiler and header file conflicts.

While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:

In file included
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Security.h:42,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/CSIdentity.h:43,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/OSServices.h:29,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/IconsCore.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/LaunchServices.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
     /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Authorization.h:193:7: error: variably modified 'bytes' at file scope
       193 | char bytes[kAuthorizationExternalFormLength];
           |      ^~~~~

The underlying reason is that GCC (rightfully) objects that an `enum`
value such as `kAuthorizationExternalFormLength` is not a constant
(because it is not, the preprocessor has no knowledge of it, only the
actual C compiler does) and can therefore not be used to define the size
of a C array.

This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082

In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid the above-mentioned issue.

Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
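A note on telling the two compilers apart: clang, in its default
GCC-compatible mode, also defines __GNUC__, so a test on __GNUC__ alone
does not single out GCC.  A minimal sketch of the distinction, using
hypothetical demo_* helper names that are not part of this patch:

```c
/* 1 when the compiler defines __GNUC__ (GCC does; clang usually does too). */
static int demo_defines_gnuc(void)
{
#if defined(__GNUC__)
	return 1;
#else
	return 0;
#endif
}

/* 1 when the compiler identifies itself as clang. */
static int demo_is_clang(void)
{
#if defined(__clang__)
	return 1;
#else
	return 0;
#endif
}

/*
 * 1 only when hand-written fallback declarations would be wanted:
 * __GNUC__ is defined and __clang__ is not.
 */
static int demo_wants_fallback_decls(void)
{
	return demo_defines_gnuc() && !demo_is_clang();
}
```

A preprocessor guard built the same way would read
"#if defined(__GNUC__) && !defined(__clang__)".
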
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 96 ++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
index b91058d1c4f8..bec5130d9e1d 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -1,3 +1,99 @@
+#if defined(__GNUC__) && !defined(__clang__)
+/*
+ * It is possible to #include CoreFoundation/CoreFoundation.h when compiling
+ * with clang, but not with GCC as of time of writing.
+ *
+ * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082 for details.
+ */
+typedef unsigned int FSEventStreamCreateFlags;
+#define kFSEventStreamEventFlagNone               0x00000000
+#define kFSEventStreamEventFlagMustScanSubDirs    0x00000001
+#define kFSEventStreamEventFlagUserDropped        0x00000002
+#define kFSEventStreamEventFlagKernelDropped      0x00000004
+#define kFSEventStreamEventFlagEventIdsWrapped    0x00000008
+#define kFSEventStreamEventFlagHistoryDone        0x00000010
+#define kFSEventStreamEventFlagRootChanged        0x00000020
+#define kFSEventStreamEventFlagMount              0x00000040
+#define kFSEventStreamEventFlagUnmount            0x00000080
+#define kFSEventStreamEventFlagItemCreated        0x00000100
+#define kFSEventStreamEventFlagItemRemoved        0x00000200
+#define kFSEventStreamEventFlagItemInodeMetaMod   0x00000400
+#define kFSEventStreamEventFlagItemRenamed        0x00000800
+#define kFSEventStreamEventFlagItemModified       0x00001000
+#define kFSEventStreamEventFlagItemFinderInfoMod  0x00002000
+#define kFSEventStreamEventFlagItemChangeOwner    0x00004000
+#define kFSEventStreamEventFlagItemXattrMod       0x00008000
+#define kFSEventStreamEventFlagItemIsFile         0x00010000
+#define kFSEventStreamEventFlagItemIsDir          0x00020000
+#define kFSEventStreamEventFlagItemIsSymlink      0x00040000
+#define kFSEventStreamEventFlagOwnEvent           0x00080000
+#define kFSEventStreamEventFlagItemIsHardlink     0x00100000
+#define kFSEventStreamEventFlagItemIsLastHardlink 0x00200000
+#define kFSEventStreamEventFlagItemCloned         0x00400000
+
+typedef struct __FSEventStream *FSEventStreamRef;
+typedef const FSEventStreamRef ConstFSEventStreamRef;
+
+typedef unsigned int CFStringEncoding;
+#define kCFStringEncodingUTF8 0x08000100
+
+typedef const struct __CFString *CFStringRef;
+typedef const struct __CFArray *CFArrayRef;
+typedef const struct __CFRunLoop *CFRunLoopRef;
+
+struct FSEventStreamContext {
+    long long version;
+    void *cb_data, *retain, *release, *copy_description;
+};
+
+typedef struct FSEventStreamContext FSEventStreamContext;
+typedef unsigned int FSEventStreamEventFlags;
+#define kFSEventStreamCreateFlagNoDefer 0x02
+#define kFSEventStreamCreateFlagWatchRoot 0x04
+#define kFSEventStreamCreateFlagFileEvents 0x10
+
+typedef unsigned long long FSEventStreamEventId;
+#define kFSEventStreamEventIdSinceNow 0xFFFFFFFFFFFFFFFFULL
+
+typedef void (*FSEventStreamCallback)(ConstFSEventStreamRef streamRef,
+				      void *context,
+				      __SIZE_TYPE__ num_of_events,
+				      void *event_paths,
+				      const FSEventStreamEventFlags event_flags[],
+				      const FSEventStreamEventId event_ids[]);
+typedef double CFTimeInterval;
+FSEventStreamRef FSEventStreamCreate(void *allocator,
+				     FSEventStreamCallback callback,
+				     FSEventStreamContext *context,
+				     CFArrayRef paths_to_watch,
+				     FSEventStreamEventId since_when,
+				     CFTimeInterval latency,
+				     FSEventStreamCreateFlags flags);
+CFStringRef CFStringCreateWithCString(void *allocator, const char *string,
+				      CFStringEncoding encoding);
+CFArrayRef CFArrayCreate(void *allocator, const void **items, long long count,
+			 void *callbacks);
+void CFRunLoopRun(void);
+void CFRunLoopStop(CFRunLoopRef run_loop);
+CFRunLoopRef CFRunLoopGetCurrent(void);
+extern CFStringRef kCFRunLoopDefaultMode;
+void FSEventStreamScheduleWithRunLoop(FSEventStreamRef stream,
+				      CFRunLoopRef run_loop,
+				      CFStringRef run_loop_mode);
+unsigned char FSEventStreamStart(FSEventStreamRef stream);
+void FSEventStreamStop(FSEventStreamRef stream);
+void FSEventStreamInvalidate(FSEventStreamRef stream);
+void FSEventStreamRelease(FSEventStreamRef stream);
+#else
+/*
+ * Let Apple's headers declare `isalnum()` first, before
+ * Git's headers override it via a constant
+ */
+#include <string.h>
+#include <CoreFoundation/CoreFoundation.h>
+#include <CoreServices/CoreServices.h>
+#endif
+
 #include "cache.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
-- 
gitgitgadget



* [PATCH v2 19/28] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (17 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 18/28] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:56   ` [PATCH v2 20/28] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
                     ` (10 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement a file system event listener on MacOS using FSEvent,
CoreFoundation, and CoreServices.

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
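The callback in this patch queues a path under two spellings when a
coalesced event is flagged as both a file and a directory: the
worktree-relative path, and the same path with a trailing slash.  A
small portable sketch of that step follows.  The flag values are copied
from the kFSEventStreamEventFlagItemIs* defines in the previous patch;
the demo_* names and the fixed-size output array are assumptions made
for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/* Values match kFSEventStreamEventFlagItemIsFile / ...ItemIsDir. */
#define DEMO_FLAG_IS_FILE 0x00010000u
#define DEMO_FLAG_IS_DIR  0x00020000u

#define DEMO_PATH_MAX 256

/*
 * Given an absolute path inside the worktree and the length of the
 * worktree root (without a trailing slash), write the spellings to
 * queue into `out` and return how many were written (0, 1 or 2).
 */
static int demo_spellings(const char *abs_path, size_t root_len,
			  unsigned int flags, char out[2][DEMO_PATH_MAX])
{
	const char *rel = abs_path + root_len + 1; /* skip "<root>/" */
	int nr = 0;

	if (flags & DEMO_FLAG_IS_FILE)
		snprintf(out[nr++], DEMO_PATH_MAX, "%s", rel);
	if (flags & DEMO_FLAG_IS_DIR)
		snprintf(out[nr++], DEMO_PATH_MAX, "%s/", rel);
	return nr;
}
```

For example, "/repo/src/x" under the root "/repo" with both flags set
yields "src/x" and "src/x/", so a client can invalidate either the
single entry or everything below the directory.
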
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 381 +++++++++++++++++++
 1 file changed, 381 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
index bec5130d9e1d..02f89de216e9 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -97,20 +97,401 @@ void FSEventStreamRelease(FSEventStreamRef stream);
 #include "cache.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
+
+struct fsmonitor_daemon_backend_data
+{
+	CFStringRef cfsr_worktree_path;
+	CFStringRef cfsr_gitdir_path;
+
+	CFArrayRef cfar_paths_to_watch;
+	int nr_paths_watching;
+
+	FSEventStreamRef stream;
+
+	CFRunLoopRef rl;
+
+	enum shutdown_style {
+		SHUTDOWN_EVENT = 0,
+		FORCE_SHUTDOWN,
+		FORCE_ERROR_STOP,
+	} shutdown_style;
+
+	unsigned int stream_scheduled:1;
+	unsigned int stream_started:1;
+};
+
+static void log_flags_set(const char *path, const FSEventStreamEventFlags flag)
+{
+	struct strbuf msg = STRBUF_INIT;
+
+	if (flag & kFSEventStreamEventFlagMustScanSubDirs)
+		strbuf_addstr(&msg, "MustScanSubDirs|");
+	if (flag & kFSEventStreamEventFlagUserDropped)
+		strbuf_addstr(&msg, "UserDropped|");
+	if (flag & kFSEventStreamEventFlagKernelDropped)
+		strbuf_addstr(&msg, "KernelDropped|");
+	if (flag & kFSEventStreamEventFlagEventIdsWrapped)
+		strbuf_addstr(&msg, "EventIdsWrapped|");
+	if (flag & kFSEventStreamEventFlagHistoryDone)
+		strbuf_addstr(&msg, "HistoryDone|");
+	if (flag & kFSEventStreamEventFlagRootChanged)
+		strbuf_addstr(&msg, "RootChanged|");
+	if (flag & kFSEventStreamEventFlagMount)
+		strbuf_addstr(&msg, "Mount|");
+	if (flag & kFSEventStreamEventFlagUnmount)
+		strbuf_addstr(&msg, "Unmount|");
+	if (flag & kFSEventStreamEventFlagItemChangeOwner)
+		strbuf_addstr(&msg, "ItemChangeOwner|");
+	if (flag & kFSEventStreamEventFlagItemCreated)
+		strbuf_addstr(&msg, "ItemCreated|");
+	if (flag & kFSEventStreamEventFlagItemFinderInfoMod)
+		strbuf_addstr(&msg, "ItemFinderInfoMod|");
+	if (flag & kFSEventStreamEventFlagItemInodeMetaMod)
+		strbuf_addstr(&msg, "ItemInodeMetaMod|");
+	if (flag & kFSEventStreamEventFlagItemIsDir)
+		strbuf_addstr(&msg, "ItemIsDir|");
+	if (flag & kFSEventStreamEventFlagItemIsFile)
+		strbuf_addstr(&msg, "ItemIsFile|");
+	if (flag & kFSEventStreamEventFlagItemIsHardlink)
+		strbuf_addstr(&msg, "ItemIsHardlink|");
+	if (flag & kFSEventStreamEventFlagItemIsLastHardlink)
+		strbuf_addstr(&msg, "ItemIsLastHardlink|");
+	if (flag & kFSEventStreamEventFlagItemIsSymlink)
+		strbuf_addstr(&msg, "ItemIsSymlink|");
+	if (flag & kFSEventStreamEventFlagItemModified)
+		strbuf_addstr(&msg, "ItemModified|");
+	if (flag & kFSEventStreamEventFlagItemRemoved)
+		strbuf_addstr(&msg, "ItemRemoved|");
+	if (flag & kFSEventStreamEventFlagItemRenamed)
+		strbuf_addstr(&msg, "ItemRenamed|");
+	if (flag & kFSEventStreamEventFlagItemXattrMod)
+		strbuf_addstr(&msg, "ItemXattrMod|");
+	if (flag & kFSEventStreamEventFlagOwnEvent)
+		strbuf_addstr(&msg, "OwnEvent|");
+	if (flag & kFSEventStreamEventFlagItemCloned)
+		strbuf_addstr(&msg, "ItemCloned|");
+
+	trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=%u %s",
+			 path, flag, msg.buf);
+
+	strbuf_release(&msg);
+}
+
+static int ef_is_root_delete(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagItemIsDir &&
+		ef & kFSEventStreamEventFlagItemRemoved);
+}
+
+static int ef_is_root_renamed(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagItemIsDir &&
+		ef & kFSEventStreamEventFlagItemRenamed);
+}
+
+static int ef_is_dropped(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagKernelDropped ||
+		ef & kFSEventStreamEventFlagUserDropped);
+}
+
+static void fsevent_callback(ConstFSEventStreamRef streamRef,
+			     void *ctx,
+			     size_t num_of_events,
+			     void *event_paths,
+			     const FSEventStreamEventFlags event_flags[],
+			     const FSEventStreamEventId event_ids[])
+{
+	struct fsmonitor_daemon_state *state = ctx;
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	char **paths = (char **)event_paths;
+	struct fsmonitor_batch *batch = NULL;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	const char *path_k;
+	const char *slash;
+	int k;
+	struct strbuf tmp = STRBUF_INIT;
+
+	/*
+	 * Collect all filesystem changes into a private/local
+	 * list without holding any locks.
+	 */
+	for (k = 0; k < num_of_events; k++) {
+		/*
+		 * On Mac, we receive an array of absolute paths.
+		 */
+		path_k = paths[k];
+
+		/*
+		 * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR.
+		 * Please don't log them to Trace2.
+		 *
+		 * trace_printf_key(&trace_fsmonitor, "Path: '%s'", path_k);
+		 */
+
+		/*
+		 * If event[k] is marked as dropped, we assume that we have
+		 * lost sync with the filesystem and should flush our cached
+		 * data.  We need to:
+		 *
+		 * [1] Abort/wake any client threads waiting for a cookie and
+		 *     flush the cached state data (the current token), and
+		 *     create a new token.
+		 *
+		 * [2] Discard the batch that we were locally building (since
+		 *     its paths are conceptually relative to the just flushed
+		 *     token).
+		 */
+		if (ef_is_dropped(event_flags[k])) {
+			/*
+			 * see also kFSEventStreamEventFlagMustScanSubDirs
+			 */
+			trace_printf_key(&trace_fsmonitor, "event: dropped");
+
+			fsmonitor_force_resync(state);
+			fsmonitor_batch__pop(batch);
+			string_list_clear(&cookie_list, 0);
+
+			/*
+			 * We assume that any events that we received
+			 * in this callback after this dropped event
+			 * may still be valid, so we continue rather
+			 * than break.  (And just in case there is a
+			 * delete of ".git" hiding in there.)
+			 */
+			continue;
+		}
+
+		switch (fsmonitor_classify_path_absolute(state, path_k)) {
+
+		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+			/* special case cookie files within .git or gitdir */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path_k);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path_k);
+			break;
+
+		case IS_INSIDE_DOT_GIT:
+		case IS_INSIDE_GITDIR:
+			/* ignore all other paths inside of .git or gitdir */
+			break;
+
+		case IS_DOT_GIT:
+		case IS_GITDIR:
+			/*
+			 * If .git directory is deleted or renamed away,
+			 * we have to quit.
+			 */
+			if (ef_is_root_delete(event_flags[k])) {
+				trace_printf_key(&trace_fsmonitor,
+						 "event: gitdir removed");
+				goto force_shutdown;
+			}
+			if (ef_is_root_renamed(event_flags[k])) {
+				trace_printf_key(&trace_fsmonitor,
+						 "event: gitdir renamed");
+				goto force_shutdown;
+			}
+			break;
+
+		case IS_WORKDIR_PATH:
+			/* try to queue normal pathnames */
+
+			if (trace_pass_fl(&trace_fsmonitor))
+				log_flags_set(path_k, event_flags[k]);
+
+			/*
+			 * Because of the implicit "binning" (the
+			 * kernel calls us at a given frequency) and
+			 * de-duping (the kernel is free to combine
+			 * multiple events for a given pathname), an
+			 * individual fsevent could be marked as both
+			 * a file and directory.  Add it to the queue
+			 * with both spellings so that the client will
+			 * know how much to invalidate/refresh.
+			 */
+
+			if (event_flags[k] & kFSEventStreamEventFlagItemIsFile) {
+				const char *rel = path_k +
+					state->path_worktree_watch.len + 1;
+
+				if (!batch)
+					batch = fsmonitor_batch__new();
+				fsmonitor_batch__add_path(batch, rel);
+			}
+
+			if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) {
+				const char *rel = path_k +
+					state->path_worktree_watch.len + 1;
+
+				strbuf_reset(&tmp);
+				strbuf_addstr(&tmp, rel);
+				strbuf_addch(&tmp, '/');
+
+				if (!batch)
+					batch = fsmonitor_batch__new();
+				fsmonitor_batch__add_path(batch, tmp.buf);
+			}
+
+			break;
+
+		case IS_OUTSIDE_CONE:
+		default:
+			trace_printf_key(&trace_fsmonitor,
+					 "ignoring '%s'", path_k);
+			break;
+		}
+	}
+
+	fsmonitor_publish(state, batch, &cookie_list);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&tmp);
+	return;
+
+force_shutdown:
+	fsmonitor_batch__pop(batch);
+	string_list_clear(&cookie_list, 0);
+
+	data->shutdown_style = FORCE_SHUTDOWN;
+	CFRunLoopStop(data->rl);
+	strbuf_release(&tmp);
+	return;
+}
+
+/*
+ * NEEDSWORK: Investigate the proper value for the `latency` argument
+ * in the call to `FSEventStreamCreate()`.  I'm not sure that this
+ * needs to be a config setting or just something that we tune after
+ * some testing.
+ *
+ * With a latency of 0.1, I was seeing lots of dropped events during
+ * the "touch 100000" files test within t/perf/p7519, but with a
+ * latency of 0.001 I did not see any dropped events.  So the
+ * "correct" value may be somewhere in between.
+ *
+ * https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
+ */
 
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
+	FSEventStreamCreateFlags flags = kFSEventStreamCreateFlagNoDefer |
+		kFSEventStreamCreateFlagWatchRoot |
+		kFSEventStreamCreateFlagFileEvents;
+	FSEventStreamContext ctx = {
+		0,
+		state,
+		NULL,
+		NULL,
+		NULL
+	};
+	struct fsmonitor_daemon_backend_data *data;
+	const void *dir_array[2];
+
+	CALLOC_ARRAY(data, 1);
+	state->backend_data = data;
+
+	data->cfsr_worktree_path = CFStringCreateWithCString(
+		NULL, state->path_worktree_watch.buf, kCFStringEncodingUTF8);
+	dir_array[data->nr_paths_watching++] = data->cfsr_worktree_path;
+
+	if (state->nr_paths_watching > 1) {
+		data->cfsr_gitdir_path = CFStringCreateWithCString(
+			NULL, state->path_gitdir_watch.buf,
+			kCFStringEncodingUTF8);
+		dir_array[data->nr_paths_watching++] = data->cfsr_gitdir_path;
+	}
+
+	data->cfar_paths_to_watch = CFArrayCreate(NULL, dir_array,
+						  data->nr_paths_watching,
+						  NULL);
+	data->stream = FSEventStreamCreate(NULL, fsevent_callback, &ctx,
+					   data->cfar_paths_to_watch,
+					   kFSEventStreamEventIdSinceNow,
+					   0.001, flags);
+	if (data->stream == NULL)
+		goto failed;
+
+	/*
+	 * `data->rl` needs to be set inside the listener thread.
+	 */
+
+	return 0;
+
+failed:
+	error("Unable to create FSEventStream.");
+
+	FREE_AND_NULL(state->backend_data);
 	return -1;
 }
 
 void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	if (!state || !state->backend_data)
+		return;
+
+	data = state->backend_data;
+
+	if (data->stream) {
+		if (data->stream_started)
+			FSEventStreamStop(data->stream);
+		if (data->stream_scheduled)
+			FSEventStreamInvalidate(data->stream);
+		FSEventStreamRelease(data->stream);
+	}
+
+	FREE_AND_NULL(state->backend_data);
 }
 
 void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = state->backend_data;
+	data->shutdown_style = SHUTDOWN_EVENT;
+
+	CFRunLoopStop(data->rl);
 }
 
 void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = state->backend_data;
+
+	data->rl = CFRunLoopGetCurrent();
+
+	FSEventStreamScheduleWithRunLoop(data->stream, data->rl, kCFRunLoopDefaultMode);
+	data->stream_scheduled = 1;
+
+	if (!FSEventStreamStart(data->stream)) {
+		error("Failed to start the FSEventStream");
+		goto force_error_stop_without_loop;
+	}
+	data->stream_started = 1;
+
+	CFRunLoopRun();
+
+	switch (data->shutdown_style) {
+	case FORCE_ERROR_STOP:
+		state->error_code = -1;
+		/* fall thru */
+	case FORCE_SHUTDOWN:
+		ipc_server_stop_async(state->ipc_server_data);
+		/* fall thru */
+	case SHUTDOWN_EVENT:
+	default:
+		break;
+	}
+	return;
+
+force_error_stop_without_loop:
+	state->error_code = -1;
+	ipc_server_stop_async(state->ipc_server_data);
+	return;
 }
-- 
gitgitgadget



* [PATCH v2 20/28] fsmonitor--daemon: implement handle_client callback
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (18 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 19/28] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:56   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 21/28] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
                     ` (9 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:56 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to handle IPC requests from client
Git processes and respond with a list of pathnames modified
since the provided token.
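
As a rough illustration of the token handling described above, here is
a minimal stand-alone sketch of splitting a V2 token of the form
"builtin:<token_id>:<seq_nr>".  The name `sketch_parse_token` and the
fixed-size `id` buffer are assumptions made for this example; the patch
itself uses `skip_prefix()` and a `strbuf`.  The sketch also rejects an
empty <seq_nr>, which is a choice of the example, not of the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of this patch): split
 * "builtin:<token_id>:<seq_nr>" into its two parts.
 * Returns 0 on success and -1 on a malformed token.
 */
int sketch_parse_token(const char *token, char *id, size_t id_size,
		       uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const char *p, *colon;
	char *end;
	size_t id_len;

	*seq_nr = 0;

	/* Anything without the prefix is a V1 timestamp or garbage. */
	if (strncmp(token, prefix, sizeof(prefix) - 1))
		return -1;
	p = token + sizeof(prefix) - 1;

	/* The token id runs up to the next colon. */
	colon = strchr(p, ':');
	if (!colon)
		return -1;
	id_len = (size_t)(colon - p);
	if (id_len >= id_size)
		return -1;
	memcpy(id, p, id_len);
	id[id_len] = '\0';

	/* The rest must be a decimal sequence number and nothing else. */
	p = colon + 1;
	if (!*p)
		return -1;
	*seq_nr = (uint64_t)strtoumax(p, &end, 10);
	if (*end)
		return -1;

	return 0;
}
```

A caller that gets -1 back would fall through to the trivial response
("/"), which is what the daemon does for any token it cannot use.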

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 312 +++++++++++++++++++++++++++++++++++-
 1 file changed, 310 insertions(+), 2 deletions(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 663fead0d66e..33b4f09c72ca 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,6 +7,7 @@
 #include "fsmonitor--daemon.h"
 #include "simple-ipc.h"
 #include "khash.h"
+#include "pkt-line.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
 	N_("git fsmonitor--daemon start [<options>]"),
@@ -353,6 +354,311 @@ void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
 	pthread_mutex_unlock(&state->main_lock);
 }
 
+/*
+ * Format an opaque token string to send to the client.
+ */
+static void with_lock__format_response_token(
+	struct strbuf *response_token,
+	const struct strbuf *response_token_id,
+	const struct fsmonitor_batch *batch)
+{
+	/* assert current thread holding state->main_lock */
+
+	strbuf_reset(response_token);
+	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
+		    response_token_id->buf, batch->batch_seq_nr);
+}
+
+/*
+ * Parse an opaque token from the client.
+ * Returns -1 on error.
+ */
+static int fsmonitor_parse_client_token(const char *buf_token,
+					struct strbuf *requested_token_id,
+					uint64_t *seq_nr)
+{
+	const char *p;
+	char *p_end;
+
+	strbuf_reset(requested_token_id);
+	*seq_nr = 0;
+
+	if (!skip_prefix(buf_token, "builtin:", &p))
+		return -1;
+
+	while (*p && *p != ':')
+		strbuf_addch(requested_token_id, *p++);
+	if (!*p++)
+		return -1;
+
+	*seq_nr = (uint64_t)strtoumax(p, &p_end, 10);
+	if (*p_end)
+		return -1;
+
+	return 0;
+}
+
+KHASH_INIT(str, const char *, int, 0, kh_str_hash_func, kh_str_hash_equal);
+
+static int do_handle_client(struct fsmonitor_daemon_state *state,
+			    const char *command,
+			    ipc_server_reply_cb *reply,
+			    struct ipc_server_reply_data *reply_data)
+{
+	struct fsmonitor_token_data *token_data = NULL;
+	struct strbuf response_token = STRBUF_INIT;
+	struct strbuf requested_token_id = STRBUF_INIT;
+	struct strbuf payload = STRBUF_INIT;
+	uint64_t requested_oldest_seq_nr = 0;
+	uint64_t total_response_len = 0;
+	const char *p;
+	const struct fsmonitor_batch *batch_head;
+	const struct fsmonitor_batch *batch;
+	intmax_t count = 0, duplicates = 0;
+	kh_str_t *shown;
+	int hash_ret;
+	int do_trivial = 0;
+	int do_flush = 0;
+
+	/*
+	 * We expect `command` to be of the form:
+	 *
+	 * <command> := quit NUL
+	 *            | flush NUL
+	 *            | <V1-time-since-epoch-ns> NUL
+	 *            | <V2-opaque-fsmonitor-token> NUL
+	 */
+
+	if (!strcmp(command, "quit")) {
+		/*
+		 * A client has requested over the socket/pipe that the
+		 * daemon shut down.
+		 *
+		 * Tell the IPC thread pool to shut down (which completes
+		 * the await in the main thread (which can stop the
+		 * fsmonitor listener thread)).
+		 *
+		 * There is no reply to the client.
+		 */
+		return SIMPLE_IPC_QUIT;
+
+	} else if (!strcmp(command, "flush")) {
+		/*
+		 * Flush all of our cached data and generate a new token
+		 * just like if we lost sync with the filesystem.
+		 *
+		 * Then send a trivial response using the new token.
+		 */
+		do_flush = 1;
+		do_trivial = 1;
+
+	} else if (!skip_prefix(command, "builtin:", &p)) {
+		/* assume V1 timestamp or garbage */
+
+		char *p_end;
+
+		strtoumax(command, &p_end, 10);
+		trace_printf_key(&trace_fsmonitor,
+				 ((*p_end) ?
+				  "fsmonitor: invalid command line '%s'" :
+				  "fsmonitor: unsupported V1 protocol '%s'"),
+				 command);
+		do_trivial = 1;
+
+	} else {
+		/* We have "builtin:*" */
+		if (fsmonitor_parse_client_token(command, &requested_token_id,
+						 &requested_oldest_seq_nr)) {
+			trace_printf_key(&trace_fsmonitor,
+					 "fsmonitor: invalid V2 protocol token '%s'",
+					 command);
+			do_trivial = 1;
+
+		} else {
+			/*
+			 * We have a valid V2 token:
+			 *     "builtin:<token_id>:<seq_nr>"
+			 */
+		}
+	}
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (!state->current_token_data)
+		BUG("fsmonitor state does not have a current token");
+
+	if (do_flush)
+		with_lock__do_force_resync(state);
+
+	/*
+	 * We mark the current head of the batch list as "pinned" so
+	 * that the listener thread will treat this item as read-only
+	 * (and prevent any more paths from being added to it) from
+	 * now on.
+	 */
+	token_data = state->current_token_data;
+	batch_head = token_data->batch_head;
+	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
+
+	/*
+	 * FSMonitor Protocol V2 requires that we send a response header
+	 * with a "new current token" and then all of the paths that changed
+	 * since the "requested token".  We send the seq_nr of the just-pinned
+	 * head batch so that future requests from a client will be relative
+	 * to it.
+	 */
+	with_lock__format_response_token(&response_token,
+					 &token_data->token_id, batch_head);
+
+	reply(reply_data, response_token.buf, response_token.len + 1);
+	total_response_len += response_token.len + 1;
+
+	trace2_data_string("fsmonitor", the_repository, "response/token",
+			   response_token.buf);
+	trace_printf_key(&trace_fsmonitor, "response token: %s",
+			 response_token.buf);
+
+	if (!do_trivial) {
+		if (strcmp(requested_token_id.buf, token_data->token_id.buf)) {
+			/*
+			 * The client last spoke to a different daemon
+			 * instance -OR- the daemon had to resync with
+			 * the filesystem (and lost events), so reject.
+			 */
+			trace2_data_string("fsmonitor", the_repository,
+					   "response/token", "different");
+			do_trivial = 1;
+
+		} else if (requested_oldest_seq_nr <
+			   token_data->batch_tail->batch_seq_nr) {
+			/*
+			 * The client wants older events than we have for
+			 * this token_id.  This means that the end of our
+			 * batch list was truncated and we cannot give the
+			 * client a complete snapshot relative to their
+			 * request.
+			 */
+			trace_printf_key(&trace_fsmonitor,
+					 "client requested truncated data");
+			do_trivial = 1;
+		}
+	}
+
+	if (do_trivial) {
+		pthread_mutex_unlock(&state->main_lock);
+
+		reply(reply_data, "/", 2);
+
+		trace2_data_intmax("fsmonitor", the_repository,
+				   "response/trivial", 1);
+
+		strbuf_release(&response_token);
+		strbuf_release(&requested_token_id);
+		return 0;
+	}
+
+	/*
+	 * We're going to hold onto a pointer to the current
+	 * token-data while we walk the list of batches of files.
+	 * During this time, we will NOT be under the lock.
+	 * So we ref-count it.
+	 *
+	 * This allows the listener thread to continue prepending
+	 * new batches of items to the token-data (which we'll ignore).
+	 *
+	 * AND it allows the listener thread to do a token-reset
+	 * (and install a new `current_token_data`).
+	 */
+	token_data->client_ref_count++;
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	/*
+	 * The client request is relative to the token that they sent,
+	 * so walk the batch list backwards from the current head back
+	 * to the batch (sequence number) they named.
+	 *
+	 * We use khash to de-dup the list of pathnames.
+	 *
+	 * NEEDSWORK: each batch contains a list of interned strings,
+	 * so we only need to do pointer comparisons here to build the
+	 * hash table.  Currently, we're still comparing the string
+	 * values.
+	 */
+	shown = kh_init_str();
+	for (batch = batch_head;
+	     batch && batch->batch_seq_nr > requested_oldest_seq_nr;
+	     batch = batch->next) {
+		size_t k;
+
+		for (k = 0; k < batch->nr; k++) {
+			const char *s = batch->interned_paths[k];
+			size_t s_len;
+
+			if (kh_get_str(shown, s) != kh_end(shown))
+				duplicates++;
+			else {
+				kh_put_str(shown, s, &hash_ret);
+
+				trace_printf_key(&trace_fsmonitor,
+						 "send[%"PRIuMAX"]: %s",
+						 count, s);
+
+				/* Each path gets written with a trailing NUL */
+				s_len = strlen(s) + 1;
+
+				if (payload.len + s_len >=
+				    LARGE_PACKET_DATA_MAX) {
+					reply(reply_data, payload.buf,
+					      payload.len);
+					total_response_len += payload.len;
+					strbuf_reset(&payload);
+				}
+
+				strbuf_add(&payload, s, s_len);
+				count++;
+			}
+		}
+	}
+
+	if (payload.len) {
+		reply(reply_data, payload.buf, payload.len);
+		total_response_len += payload.len;
+	}
+
+	kh_release_str(shown);
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (token_data->client_ref_count > 0)
+		token_data->client_ref_count--;
+
+	if (token_data->client_ref_count == 0) {
+		if (token_data != state->current_token_data) {
+			/*
+			 * The listener thread did a token-reset while we were
+			 * walking the batch list.  Therefore, this token is
+			 * stale and can be discarded completely.  If we are
+			 * the last reader thread using this token, we own
+			 * that work.
+			 */
+			fsmonitor_free_token_data(token_data);
+		}
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	trace2_data_intmax("fsmonitor", the_repository, "response/length", total_response_len);
+	trace2_data_intmax("fsmonitor", the_repository, "response/count/files", count);
+	trace2_data_intmax("fsmonitor", the_repository, "response/count/duplicates", duplicates);
+
+	strbuf_release(&response_token);
+	strbuf_release(&requested_token_id);
+	strbuf_release(&payload);
+
+	return 0;
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data,
@@ -360,7 +666,7 @@ static int handle_client(void *data,
 			 ipc_server_reply_cb *reply,
 			 struct ipc_server_reply_data *reply_data)
 {
-	/* struct fsmonitor_daemon_state *state = data; */
+	struct fsmonitor_daemon_state *state = data;
 	int result;
 
 	/*
@@ -371,10 +677,12 @@ static int handle_client(void *data,
 	if (command_len != strlen(command))
 		BUG("FSMonitor assumes text messages");
 
+	trace_printf_key(&trace_fsmonitor, "requested token: %s", command);
+
 	trace2_region_enter("fsmonitor", "handle_client", the_repository);
 	trace2_data_string("fsmonitor", the_repository, "request", command);
 
-	result = 0; /* TODO Do something here. */
+	result = do_handle_client(state, command, reply, reply_data);
 
 	trace2_region_leave("fsmonitor", "handle_client", the_repository);
 
-- 
gitgitgadget



* [PATCH v2 21/28] fsmonitor--daemon: periodically truncate list of modified files
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (19 preceding siblings ...)
  2021-05-22 13:56   ` [PATCH v2 20/28] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
                     ` (8 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to periodically truncate the list of
modified files to save some memory.

Clients will ask for the set of changes relative to a token that they
found in the FSMN extension of the index.  (This token is like a
point in time, but different.)  Clients will then update the index to
contain the response token (so that subsequent commands will be
relative to this new token).

Therefore, the daemon can gradually truncate the in-memory list of
changed paths as they become obsolete (older than the previous token).
Since we may have multiple clients making concurrent requests with a
skew of tokens and clients may be racing to talk to the daemon,
we lazily truncate the list.

We introduce a 5 minute delay and truncate batches 5 minutes after
they are considered obsolete.
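
The truncation rule can be sketched in isolation as follows.  This is
only an illustration under simplifying assumptions: `sketch_batch` is a
hypothetical stand-in for `struct fsmonitor_batch` that keeps just the
link and the pinned time, batches are freed with plain `free()`, and
no locking is shown.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define SKETCH_DELAY_SECONDS (5 * 60)

/* Hypothetical, simplified batch; the list runs newest -> oldest. */
struct sketch_batch {
	struct sketch_batch *next;
	time_t pinned_time; /* 0 means "never pinned" */
};

/*
 * Starting at `marker`, find the first pinned batch that is at least
 * SKETCH_DELAY_SECONDS older than the marker, keep it as the new tail
 * and free everything older.  Returns the number of batches freed.
 */
size_t sketch_truncate(struct sketch_batch *marker)
{
	struct sketch_batch *b, *rest;
	size_t nr_freed = 0;

	if (!marker)
		return 0;

	for (b = marker; b; b = b->next) {
		if (!b->pinned_time)
			continue; /* never pinned, so it has no usable age */
		if (b->pinned_time + SKETCH_DELAY_SECONDS <=
		    marker->pinned_time)
			break; /* old enough: this becomes the new tail */
	}
	if (!b)
		return 0; /* everything is still too close to the marker */

	rest = b->next;
	b->next = NULL;

	while (rest) {
		struct sketch_batch *next = rest->next;

		free(rest);
		nr_freed++;
		rest = next;
	}

	return nr_freed;
}
```

Note that the ages compared here are pin times, not filesystem
timestamps, so the delay only measures how long ago a client last
caused a batch to be pinned.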

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 80 +++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 33b4f09c72ca..e807aa8f6741 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -298,6 +298,77 @@ static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
 			batch_src->interned_paths[k];
 }
 
+/*
+ * To keep the batch list from growing unbounded in response to filesystem
+ * activity, we try to truncate old batches from the end of the list as
+ * they become irrelevant.
+ *
+ * We assume that the .git/index will be updated with the most recent token
+ * any time the index is updated.  And future commands will only ask for
+ * recent changes *since* that new token.  So as tokens advance into the
+ * future, older batch items will never be requested/needed.  So we can
+ * truncate them without loss of functionality.
+ *
+ * However, multiple commands may be talking to the daemon concurrently,
+ * or a command may be slow, so a little "token skew" is possible.
+ * Therefore, we want this to be a little bit lazy and have a generous
+ * delay.
+ *
+ * The current reader thread walked backwards in time from `token->batch_head`
+ * back to `batch_marker` somewhere in the middle of the batch list.
+ *
+ * Let's walk backwards in time from that marker an arbitrary delay
+ * and truncate the list there.  Note that these timestamps are completely
+ * artificial (based on when we pinned the batch item) and not on any
+ * filesystem activity.
+ */
+#define MY_TIME_DELAY_SECONDS (5 * 60) /* seconds */
+
+static void with_lock__truncate_old_batches(
+	struct fsmonitor_daemon_state *state,
+	const struct fsmonitor_batch *batch_marker)
+{
+	/* assert current thread holding state->main_lock */
+
+	const struct fsmonitor_batch *batch;
+	struct fsmonitor_batch *rest;
+	struct fsmonitor_batch *p;
+
+	if (!batch_marker)
+		return;
+
+	trace_printf_key(&trace_fsmonitor, "Truncate: mark (%"PRIu64",%"PRIu64")",
+			 batch_marker->batch_seq_nr,
+			 (uint64_t)batch_marker->pinned_time);
+
+	for (batch = batch_marker; batch; batch = batch->next) {
+		time_t t;
+
+		if (!batch->pinned_time) /* an overflow batch */
+			continue;
+
+		t = batch->pinned_time + MY_TIME_DELAY_SECONDS;
+		if (t > batch_marker->pinned_time) /* too close to marker */
+			continue;
+
+		goto truncate_past_here;
+	}
+
+	return;
+
+truncate_past_here:
+	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
+
+	rest = ((struct fsmonitor_batch *)batch)->next;
+	((struct fsmonitor_batch *)batch)->next = NULL;
+
+	for (p = rest; p; p = fsmonitor_batch__pop(p)) {
+		trace_printf_key(&trace_fsmonitor,
+				 "Truncate: kill (%"PRIu64",%"PRIu64")",
+				 p->batch_seq_nr, (uint64_t)p->pinned_time);
+	}
+}
+
 static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
 {
 	struct fsmonitor_batch *p;
@@ -643,6 +714,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 			 * that work.
 			 */
 			fsmonitor_free_token_data(token_data);
+		} else if (batch) {
+			/*
+			 * This batch is the first item in the list
+			 * that is older than the requested sequence
+			 * number and might be considered to be
+			 * obsolete.  See if we can truncate the list
+			 * and save some memory.
+			 */
+			with_lock__truncate_old_batches(state, batch);
 		}
 	}
 
-- 
gitgitgadget



* [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (20 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 21/28] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-06-14 21:42     ` Johannes Schindelin
  2021-05-22 13:57   ` [PATCH v2 23/28] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
                     ` (7 subsequent siblings)
  29 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon client threads to create a cookie file
inside the .git directory and then wait until FS events for the
cookie are observed by the FS listener thread.

This helps address the racy nature of file system events by
blocking the client response until the kernel has drained any
event backlog.

This is especially important on MacOS where kernel events are
only issued with a limited frequency.  See the `latency` argument
of `FSEventStreamCreate()`.  The kernel only signals every `latency`
seconds, but does not guarantee that the kernel queue is completely
drained, so we may have to wait more than one interval.  If we
increase the latency, the system is more likely to drop events.
We avoid these issues by having each client thread create a unique
cookie file and then wait until it is seen in the event stream.
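
The handshake can be sketched without threads as follows.  This is a
simplified, single-threaded illustration: the `sketch_*` names and the
plain array of pending cookies are assumptions of the example, whereas
the patch keeps cookies in a hashmap and blocks the client thread on a
condition variable until the listener thread marks its cookie.

```c
#include <stdio.h>
#include <string.h>

enum sketch_result { SKETCH_INIT = 0, SKETCH_SEEN, SKETCH_ABORT };

/* Hypothetical stand-in for the per-request cookie record. */
struct sketch_cookie {
	char name[64];
	enum sketch_result result;
};

/* Client side: format the per-request cookie filename "<pid>-<seq>". */
void sketch_cookie_init(struct sketch_cookie *c, int pid, int seq)
{
	snprintf(c->name, sizeof(c->name), "%d-%d", pid, seq);
	c->result = SKETCH_INIT;
}

/*
 * Listener side: reduce the path from a filesystem event to its final
 * component and mark every waiting cookie with that name as seen.
 * Returns the number of cookies marked.
 */
int sketch_mark_seen(struct sketch_cookie *pending, size_t nr,
		     const char *event_path)
{
	const char *slash = strrchr(event_path, '/');
	const char *name = slash ? slash + 1 : event_path;
	size_t k;
	int nr_seen = 0;

	for (k = 0; k < nr; k++) {
		if (pending[k].result == SKETCH_INIT &&
		    !strcmp(pending[k].name, name)) {
			pending[k].result = SKETCH_SEEN;
			nr_seen++;
		}
	}
	return nr_seen;
}

/*
 * Resync: events may have been lost, so give up on every cookie that
 * is still waiting.  (This sketch leaves already-seen cookies alone.)
 */
void sketch_abort_all(struct sketch_cookie *pending, size_t nr)
{
	size_t k;

	for (k = 0; k < nr; k++)
		if (pending[k].result == SKETCH_INIT)
			pending[k].result = SKETCH_ABORT;
}
```

A client whose cookie ends up in any state other than "seen" cannot
assume the daemon is caught up, so it should get the trivial response.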

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 210 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |   5 +
 2 files changed, 214 insertions(+), 1 deletion(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index e807aa8f6741..985a82cf39e0 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -92,6 +92,149 @@ static int do_as_client__status(void)
 	}
 }
 
+enum fsmonitor_cookie_item_result {
+	FCIR_ERROR = -1, /* could not create cookie file ? */
+	FCIR_INIT = 0,
+	FCIR_SEEN,
+	FCIR_ABORT,
+};
+
+struct fsmonitor_cookie_item {
+	struct hashmap_entry entry;
+	const char *name;
+	enum fsmonitor_cookie_item_result result;
+};
+
+static int cookies_cmp(const void *data, const struct hashmap_entry *he1,
+		     const struct hashmap_entry *he2, const void *keydata)
+{
+	const struct fsmonitor_cookie_item *a =
+		container_of(he1, const struct fsmonitor_cookie_item, entry);
+	const struct fsmonitor_cookie_item *b =
+		container_of(he2, const struct fsmonitor_cookie_item, entry);
+
+	return strcmp(a->name, keydata ? keydata : b->name);
+}
+
+static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
+	struct fsmonitor_daemon_state *state)
+{
+	/* assert current thread holding state->main_lock */
+
+	int fd;
+	struct fsmonitor_cookie_item *cookie;
+	struct strbuf cookie_pathname = STRBUF_INIT;
+	struct strbuf cookie_filename = STRBUF_INIT;
+	enum fsmonitor_cookie_item_result result;
+	int my_cookie_seq;
+
+	CALLOC_ARRAY(cookie, 1);
+
+	my_cookie_seq = state->cookie_seq++;
+
+	strbuf_addf(&cookie_filename, "%i-%i", getpid(), my_cookie_seq);
+
+	strbuf_addbuf(&cookie_pathname, &state->path_cookie_prefix);
+	strbuf_addbuf(&cookie_pathname, &cookie_filename);
+
+	cookie->name = strbuf_detach(&cookie_filename, NULL);
+	cookie->result = FCIR_INIT;
+	hashmap_entry_init(&cookie->entry, strhash(cookie->name));
+
+	hashmap_add(&state->cookies, &cookie->entry);
+
+	trace_printf_key(&trace_fsmonitor, "cookie-wait: '%s' '%s'",
+			 cookie->name, cookie_pathname.buf);
+
+	/*
+	 * Create the cookie file on disk and then wait for a notification
+	 * that the listener thread has seen it.
+	 */
+	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
+	if (fd >= 0) {
+		close(fd);
+		unlink(cookie_pathname.buf);
+
+		/*
+		 * NEEDSWORK: This is an infinite wait (well, unless another
+		 * thread sends us an abort).  I'd like to change this to
+		 * use `pthread_cond_timedwait()` and return an error/timeout
+		 * and let the caller do the trivial response thing.
+		 */
+		while (cookie->result == FCIR_INIT)
+			pthread_cond_wait(&state->cookies_cond,
+					  &state->main_lock);
+	} else {
+		error_errno(_("could not create fsmonitor cookie '%s'"),
+			    cookie->name);
+
+		cookie->result = FCIR_ERROR;
+	}
+
+	hashmap_remove(&state->cookies, &cookie->entry, NULL);
+
+	result = cookie->result;
+
+	free((char*)cookie->name);
+	free(cookie);
+	strbuf_release(&cookie_pathname);
+
+	return result;
+}
+
+/*
+ * Mark these cookies as _SEEN and wake up the corresponding client threads.
+ */
+static void with_lock__mark_cookies_seen(struct fsmonitor_daemon_state *state,
+					 const struct string_list *cookie_names)
+{
+	/* assert current thread holding state->main_lock */
+
+	int k;
+	int nr_seen = 0;
+
+	for (k = 0; k < cookie_names->nr; k++) {
+		struct fsmonitor_cookie_item key;
+		struct fsmonitor_cookie_item *cookie;
+
+		key.name = cookie_names->items[k].string;
+		hashmap_entry_init(&key.entry, strhash(key.name));
+
+		cookie = hashmap_get_entry(&state->cookies, &key, entry, NULL);
+		if (cookie) {
+			trace_printf_key(&trace_fsmonitor, "cookie-seen: '%s'",
+					 cookie->name);
+			cookie->result = FCIR_SEEN;
+			nr_seen++;
+		}
+	}
+
+	if (nr_seen)
+		pthread_cond_broadcast(&state->cookies_cond);
+}
+
+/*
+ * Set _ABORT on all pending cookies and wake up all client threads.
+ */
+static void with_lock__abort_all_cookies(struct fsmonitor_daemon_state *state)
+{
+	/* assert current thread holding state->main_lock */
+
+	struct hashmap_iter iter;
+	struct fsmonitor_cookie_item *cookie;
+	int nr_aborted = 0;
+
+	hashmap_for_each_entry(&state->cookies, &iter, cookie, entry) {
+		trace_printf_key(&trace_fsmonitor, "cookie-abort: '%s'",
+				 cookie->name);
+		cookie->result = FCIR_ABORT;
+		nr_aborted++;
+	}
+
+	if (nr_aborted)
+		pthread_cond_broadcast(&state->cookies_cond);
+}
+
 /*
  * Requests to and from a FSMonitor Protocol V2 provider use an opaque
  * "token" as a virtual timestamp.  Clients can request a summary of all
@@ -395,6 +538,9 @@ static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
  *     We should create a new token and start fresh (as if we just
  *     booted up).
  *
+ * [2] Some of those lost events may have been for cookie files.  We
+ *     should assume the worst and abort them rather than letting them starve.
+ *
  * If there are no concurrent threads readering the current token data
  * series, we can free it now.  Otherwise, let the last reader free
  * it.
@@ -416,6 +562,8 @@ static void with_lock__do_force_resync(struct fsmonitor_daemon_state *state)
 	state->current_token_data = new_one;
 
 	fsmonitor_free_token_data(free_me);
+
+	with_lock__abort_all_cookies(state);
 }
 
 void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
@@ -490,6 +638,8 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 	int hash_ret;
 	int do_trivial = 0;
 	int do_flush = 0;
+	int do_cookie = 0;
+	enum fsmonitor_cookie_item_result cookie_result;
 
 	/*
 	 * We expect `command` to be of the form:
@@ -522,6 +672,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 		 */
 		do_flush = 1;
 		do_trivial = 1;
+		do_cookie = 1;
 
 	} else if (!skip_prefix(command, "builtin:", &p)) {
 		/* assume V1 timestamp or garbage */
@@ -535,6 +686,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 				  "fsmonitor: unsupported V1 protocol '%s'"),
 				 command);
 		do_trivial = 1;
+		do_cookie = 1;
 
 	} else {
 		/* We have "builtin:*" */
@@ -544,12 +696,14 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 					 "fsmonitor: invalid V2 protocol token '%s'",
 					 command);
 			do_trivial = 1;
+			do_cookie = 1;
 
 		} else {
 			/*
 			 * We have a V2 valid token:
 			 *     "builtin:<token_id>:<seq_nr>"
 			 */
+			do_cookie = 1;
 		}
 	}
 
@@ -558,6 +712,30 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 	if (!state->current_token_data)
 		BUG("fsmonitor state does not have a current token");
 
+	/*
+	 * Write a cookie file inside the directory being watched in
+	 * an effort to flush out existing filesystem events that we
+	 * actually care about.  Suspend this client thread until we
+	 * see the filesystem events for this cookie file.
+	 *
+	 * Creating the cookie lets us guarantee that our FS listener
+	 * thread has drained the kernel queue and we are caught up
+	 * with the kernel.
+	 *
+	 * If we cannot create the cookie (or otherwise guarantee that
+	 * we are caught up), we send a trivial response.  We have to
+	 * assume that there might be some very, very recent activity
+	 * on the FS still in flight.
+	 */
+	if (do_cookie) {
+		cookie_result = with_lock__wait_for_cookie(state);
+		if (cookie_result != FCIR_SEEN) {
+			error(_("fsmonitor: cookie_result '%d' != SEEN"),
+			      cookie_result);
+			do_trivial = 1;
+		}
+	}
+
 	if (do_flush)
 		with_lock__do_force_resync(state);
 
@@ -769,7 +947,9 @@ static int handle_client(void *data,
 	return result;
 }
 
-#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
+#define FSMONITOR_DIR           "fsmonitor--daemon"
+#define FSMONITOR_COOKIE_DIR    "cookies"
+#define FSMONITOR_COOKIE_PREFIX (FSMONITOR_DIR "/" FSMONITOR_COOKIE_DIR "/")
 
 enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
 	const char *rel)
@@ -922,6 +1102,9 @@ void fsmonitor_publish(struct fsmonitor_daemon_state *state,
 		}
 	}
 
+	if (cookie_names->nr)
+		with_lock__mark_cookies_seen(state, cookie_names);
+
 	pthread_mutex_unlock(&state->main_lock);
 }
 
@@ -1011,7 +1194,9 @@ static int fsmonitor_run_daemon(void)
 
 	memset(&state, 0, sizeof(state));
 
+	hashmap_init(&state.cookies, cookies_cmp, NULL, 0);
 	pthread_mutex_init(&state.main_lock, NULL);
+	pthread_cond_init(&state.cookies_cond, NULL);
 	state.error_code = 0;
 	state.current_token_data = fsmonitor_new_token_data();
 
@@ -1035,6 +1220,23 @@ static int fsmonitor_run_daemon(void)
 		state.nr_paths_watching = 2;
 	}
 
+	/*
+	 * We will write filesystem syncing cookie files into
+	 * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>.
+	 */
+	strbuf_init(&state.path_cookie_prefix, 0);
+	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
+
+	strbuf_addch(&state.path_cookie_prefix, '/');
+	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_DIR);
+	mkdir(state.path_cookie_prefix.buf, 0777);
+
+	strbuf_addch(&state.path_cookie_prefix, '/');
+	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_DIR);
+	mkdir(state.path_cookie_prefix.buf, 0777);
+
+	strbuf_addch(&state.path_cookie_prefix, '/');
+
 	/*
 	 * Confirm that we can create platform-specific resources for the
 	 * filesystem listener before we bother starting all the threads.
@@ -1047,6 +1249,7 @@ static int fsmonitor_run_daemon(void)
 	err = fsmonitor_run_daemon_1(&state);
 
 done:
+	pthread_cond_destroy(&state.cookies_cond);
 	pthread_mutex_destroy(&state.main_lock);
 	fsmonitor_fs_listen__dtor(&state);
 
@@ -1054,6 +1257,11 @@ static int fsmonitor_run_daemon(void)
 
 	strbuf_release(&state.path_worktree_watch);
 	strbuf_release(&state.path_gitdir_watch);
+	strbuf_release(&state.path_cookie_prefix);
+
+	/*
+	 * NEEDSWORK: Consider "rm -rf <gitdir>/<fsmonitor-dir>"
+	 */
 
 	return err;
 }
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 89a9ef20b24b..e9fc099bae9c 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -45,6 +45,11 @@ struct fsmonitor_daemon_state {
 
 	struct fsmonitor_token_data *current_token_data;
 
+	struct strbuf path_cookie_prefix;
+	pthread_cond_t cookies_cond;
+	int cookie_seq;
+	struct hashmap cookies;
+
 	int error_code;
 	struct fsmonitor_daemon_backend_data *backend_data;
 
-- 
gitgitgadget
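
For reference, the cookie path layout described in the patch above
(<gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>) can be sketched in
isolation.  This is only an illustration: `format_cookie_path()` is a
hypothetical helper that does not exist in the series, the daemon itself
builds the prefix with strbuf calls, and the exact formatting of the
`<pid>-<seq>` suffix is an assumption.

```c
#include <stdio.h>

/* Directory names taken from the patch. */
#define FSMONITOR_DIR        "fsmonitor--daemon"
#define FSMONITOR_COOKIE_DIR "cookies"

/*
 * Hypothetical helper (not part of the series): format the pathname
 * of a cookie file into `out`.  Returns 0 on success, or -1 if the
 * result does not fit in `outlen` bytes.
 *
 * The "<pid>-<seq>" suffix format is an assumption made for this
 * sketch.
 */
static int format_cookie_path(char *out, size_t outlen,
			      const char *gitdir, long pid, int seq)
{
	int n = snprintf(out, outlen, "%s/%s/%s/%ld-%d",
			 gitdir, FSMONITOR_DIR, FSMONITOR_COOKIE_DIR,
			 pid, seq);

	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

Because every cookie lives under <gitdir>/fsmonitor--daemon/cookies/,
the listener can recognize a cookie event by its path prefix alone.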



* [PATCH v2 23/28] fsmonitor: enhance existing comments
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (21 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 24/28] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
                     ` (6 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 fsmonitor.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/fsmonitor.c b/fsmonitor.c
index c6d3c34ad78e..29adf3e53ef3 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -344,9 +344,25 @@ void refresh_fsmonitor(struct index_state *istate)
 	}
 
 apply_results:
-	/* a fsmonitor process can return '/' to indicate all entries are invalid */
+	/*
+	 * The response from FSMonitor (excluding the header token) is
+	 * either:
+	 *
+	 * [a] a (possibly empty) list of NUL delimited relative
+	 *     pathnames of changed paths.  This list can contain
+	 *     files and directories.  Directories have a trailing
+	 *     slash.
+	 *
+	 * [b] a single '/' to indicate the provider had no
+	 *     information and that we should consider everything
+	 *     invalid.  We call this a trivial response.
+	 */
 	if (query_success && query_result.buf[bol] != '/') {
-		/* Mark all entries returned by the monitor as dirty */
+		/*
+		 * Mark all pathnames returned by the monitor as dirty.
+		 *
+		 * This updates both the cache-entries and the untracked-cache.
+		 */
 		buf = query_result.buf;
 		for (i = bol; i < query_result.len; i++) {
 			if (buf[i] != '\0')
@@ -361,11 +377,15 @@ void refresh_fsmonitor(struct index_state *istate)
 		if (istate->untracked)
 			istate->untracked->use_fsmonitor = 1;
 	} else {
-
-		/* We only want to run the post index changed hook if we've actually changed entries, so keep track
-		 * if we actually changed entries or not */
+		/*
+		 * We received a trivial response, so invalidate everything.
+		 *
+		 * We only want to run the post index changed hook if
+		 * we've actually changed entries, so keep track if we
+		 * actually changed entries or not.
+		 */
 		int is_cache_changed = 0;
-		/* Mark all entries invalid */
+
 		for (i = 0; i < istate->cache_nr; i++) {
 			if (istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) {
 				is_cache_changed = 1;
@@ -373,7 +393,10 @@ void refresh_fsmonitor(struct index_state *istate)
 			}
 		}
 
-		/* If we're going to check every file, ensure we save the results */
+		/*
+		 * If we're going to check every file, ensure we save
+		 * the results.
+		 */
 		if (is_cache_changed)
 			istate->cache_changed |= FSMONITOR_CHANGED;
 
-- 
gitgitgadget



* [PATCH v2 24/28] fsmonitor: force update index after large responses
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (22 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 23/28] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 25/28] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
                     ` (5 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when
FSMonitor returns a very large response to ensure that the index is
written to disk.

Normally, when the FSMonitor response includes a tracked file, the
index is always updated.  Similarly, the index might be updated when
the response alters the untracked-cache (when enabled).  However, in
cases where neither of those cause the index to be considered changed,
the FSMonitor response is wasted.  Subsequent Git commands will make
requests with the same token and receive the same response.

If that response is very large, performance may suffer.  It would be
more efficient to force update the index now (and the token in the
index extension) in order to reduce the size of the response received
by future commands.

This was observed on Windows after a large checkout.  On Windows, the
kernel emits events for the files that are changed as they are
changed.  However, it might delay events for the containing
directories until the system is more idle (or, so it seems, until
someone scans the directory).  The first status following a checkout
would get the list of files.  The subsequent status commands would get
the list of directories as the events trickled out.  But they would
never catch up: the index wasn't updated, so the token was not
advanced.

This list of directories caused `wt_status_collect_untracked()` to
unnecessarily spend time actually scanning them during each command.
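
The rule described above can be sketched in isolation.  This is a
minimal illustration, not the code in fsmonitor.c: it assumes a
response buffer whose header token has already been stripped, and both
`count_response_paths()` and `should_force_index_update()` are
hypothetical names.  The threshold is passed in as a parameter here,
whereas the patch uses a static variable.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the NUL-delimited pathnames in
 * buf[0..len).  A final pathname without a trailing NUL is still
 * counted, mirroring the loop in refresh_fsmonitor().
 */
static size_t count_response_paths(const char *buf, size_t len)
{
	size_t i, bol = 0, count = 0;

	for (i = 0; i < len; i++) {
		if (buf[i] != '\0')
			continue;
		bol = i + 1;
		count++;
	}
	if (bol < len)
		count++;

	return count;
}

/*
 * Hypothetical helper: return 1 if the response holds more than
 * `threshold` pathnames, meaning the index should be written even
 * though no flag bits changed.  A trivial response (a single '/')
 * is handled by a separate code path, so report 0 for it here.
 */
static int should_force_index_update(const char *buf, size_t len,
				     size_t threshold)
{
	if (len && buf[0] == '/')
		return 0;

	return count_response_paths(buf, len) > threshold;
}
```

With the threshold of 100 chosen in this patch, a response carrying
101 or more pathnames would cause `FSMONITOR_CHANGED` to be set.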

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 fsmonitor.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/fsmonitor.c b/fsmonitor.c
index 29adf3e53ef3..22623fd228fc 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -230,6 +230,45 @@ static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
 	untracked_cache_invalidate_path(istate, name, 0);
 }
 
+/*
+ * The number of pathnames that we need to receive from FSMonitor
+ * before we force the index to be updated.
+ *
+ * Note that any pathname within the set of received paths MAY cause
+ * cache-entry or istate flag bits to be updated and thus cause the
+ * index to be updated on disk.
+ *
+ * However, the response may contain many paths (such as ignored
+ * paths) that will not update any flag bits and thus will not force
+ * the index to be updated.  (This is fine and normal.)  It also means
+ * that the token will not be updated in the FSMonitor index
+ * extension.  So the next Git command will find the same token in the
+ * index, make the same token-relative request, and receive the same
+ * response (plus any newly changed paths).  If this response is large
+ * (and continues to grow), performance could be impacted.
+ *
+ * For example, if the user runs a build and it writes 100K object
+ * files but doesn't modify any source files, the index would not need
+ * to be updated.  The FSMonitor response (after the build and
+ * relative to a pre-build token) might be 5MB.  Each subsequent Git
+ * command will receive that same 100K/5MB response until something
+ * causes the index to be updated.  And `refresh_fsmonitor()` will
+ * have to iterate over those 100K paths each time.
+ *
+ * Performance could be improved if we optionally force update the
+ * index after a very large response and get an updated token into
+ * the FSMonitor index extension.  This should allow subsequent
+ * commands to get smaller and more current responses.
+ *
+ * The value chosen here does not need to be precise.  The index
+ * will be updated automatically the first time the user touches
+ * a tracked file and causes a command like `git status` to
+ * update an mtime and/or set a flag bit.
+ *
+ * NEEDSWORK: Does this need to be a config value?
+ */
+static int fsmonitor_force_update_threshold = 100;
+
 void refresh_fsmonitor(struct index_state *istate)
 {
 	struct repository *r = istate->repo ? istate->repo : the_repository;
@@ -363,19 +402,28 @@ void refresh_fsmonitor(struct index_state *istate)
 		 *
 		 * This updates both the cache-entries and the untracked-cache.
 		 */
+		int count = 0;
+
 		buf = query_result.buf;
 		for (i = bol; i < query_result.len; i++) {
 			if (buf[i] != '\0')
 				continue;
 			fsmonitor_refresh_callback(istate, buf + bol);
 			bol = i + 1;
+			count++;
 		}
-		if (bol < query_result.len)
+		if (bol < query_result.len) {
 			fsmonitor_refresh_callback(istate, buf + bol);
+			count++;
+		}
 
 		/* Now mark the untracked cache for fsmonitor usage */
 		if (istate->untracked)
 			istate->untracked->use_fsmonitor = 1;
+
+		if (count > fsmonitor_force_update_threshold)
+			istate->cache_changed |= FSMONITOR_CHANGED;
+
 	} else {
 		/*
 		 * We received a trivial response, so invalidate everything.
-- 
gitgitgadget



* [PATCH v2 25/28] t7527: create test for fsmonitor--daemon
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (23 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 24/28] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 26/28] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
                     ` (4 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t7527-builtin-fsmonitor.sh | 475 +++++++++++++++++++++++++++++++++++
 1 file changed, 475 insertions(+)
 create mode 100755 t/t7527-builtin-fsmonitor.sh

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
new file mode 100755
index 000000000000..eaed44ebad63
--- /dev/null
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -0,0 +1,475 @@
+#!/bin/sh
+
+test_description='built-in file system watcher'
+
+. ./test-lib.sh
+
+git version --build-options | grep "feature:" | grep "fsmonitor--daemon" || {
+	skip_all="The built-in FSMonitor is not supported on this platform"
+	test_done
+}
+
+kill_repo () {
+	r=$1
+	git -C "$r" fsmonitor--daemon stop >/dev/null 2>/dev/null
+	rm -rf "$r"
+	return 0
+}
+
+start_daemon () {
+	case "$#" in
+		1) r="-C $1";;
+		*) r="";
+	esac
+
+	git $r fsmonitor--daemon start || return $?
+	git $r fsmonitor--daemon status || return $?
+
+	return 0
+}
+
+test_expect_success 'explicit daemon start and stop' '
+	test_when_finished "kill_repo test_explicit" &&
+
+	git init test_explicit &&
+	start_daemon test_explicit &&
+
+	git -C test_explicit fsmonitor--daemon stop &&
+	test_must_fail git -C test_explicit fsmonitor--daemon status
+'
+
+test_expect_success 'implicit daemon start' '
+	test_when_finished "kill_repo test_implicit" &&
+
+	git init test_implicit &&
+	test_must_fail git -C test_implicit fsmonitor--daemon status &&
+
+	# query will implicitly start the daemon.
+	#
+	# for test-script simplicity, we send a V1 timestamp rather than
+	# a V2 token.  either way, the daemon response to any query contains
+	# a new V2 token.  (the daemon may complain that we sent a V1 request,
+	# but this test case is only concerned with whether the daemon was
+	# implicitly started.)
+
+	GIT_TRACE2_EVENT="$PWD/.git/trace" \
+		test-tool -C test_implicit fsmonitor-client query --token 0 >actual &&
+	nul_to_q <actual >actual.filtered &&
+	grep "builtin:" actual.filtered &&
+
+	# confirm that a daemon was started in the background.
+	#
+	# since the mechanism for starting the background daemon is platform
+	# dependent, just confirm that the foreground command received a
+	# response from the daemon.
+
+	grep :\"query/response-length\" .git/trace &&
+
+	git -C test_implicit fsmonitor--daemon status &&
+	git -C test_implicit fsmonitor--daemon stop &&
+	test_must_fail git -C test_implicit fsmonitor--daemon status
+'
+
+test_expect_success 'implicit daemon stop (delete .git)' '
+	test_when_finished "kill_repo test_implicit_1" &&
+
+	git init test_implicit_1 &&
+
+	start_daemon test_implicit_1 &&
+
+	# deleting the .git directory will implicitly stop the daemon.
+	rm -rf test_implicit_1/.git &&
+
+	# Create an empty .git directory so that the following Git command
+	# will stay relative to the `-C` directory.  Without this, the Git
+	# command will override the requested -C argument and crawl out
+	# to the containing Git source tree.  This would make the test
+	# result dependent upon whether we were using fsmonitor on our
+	# development worktree.
+
+	sleep 1 &&
+	mkdir test_implicit_1/.git &&
+
+	test_must_fail git -C test_implicit_1 fsmonitor--daemon status
+'
+
+test_expect_success 'implicit daemon stop (rename .git)' '
+	test_when_finished "kill_repo test_implicit_2" &&
+
+	git init test_implicit_2 &&
+
+	start_daemon test_implicit_2 &&
+
+	# renaming the .git directory will implicitly stop the daemon.
+	mv test_implicit_2/.git test_implicit_2/.xxx &&
+
+	# Create an empty .git directory so that the following Git command
+	# will stay relative to the `-C` directory.  Without this, the Git
+	# command will override the requested -C argument and crawl out
+	# to the containing Git source tree.  This would make the test
+	# result dependent upon whether we were using fsmonitor on our
+	# development worktree.
+
+	sleep 1 &&
+	mkdir test_implicit_2/.git &&
+
+	test_must_fail git -C test_implicit_2 fsmonitor--daemon status
+'
+
+test_expect_success 'cannot start multiple daemons' '
+	test_when_finished "kill_repo test_multiple" &&
+
+	git init test_multiple &&
+
+	start_daemon test_multiple &&
+
+	test_must_fail git -C test_multiple fsmonitor--daemon start 2>actual &&
+	grep "fsmonitor--daemon is already running" actual &&
+
+	git -C test_multiple fsmonitor--daemon stop &&
+	test_must_fail git -C test_multiple fsmonitor--daemon status
+'
+
+test_expect_success 'setup' '
+	>tracked &&
+	>modified &&
+	>delete &&
+	>rename &&
+	mkdir dir1 &&
+	>dir1/tracked &&
+	>dir1/modified &&
+	>dir1/delete &&
+	>dir1/rename &&
+	mkdir dir2 &&
+	>dir2/tracked &&
+	>dir2/modified &&
+	>dir2/delete &&
+	>dir2/rename &&
+	mkdir dirtorename &&
+	>dirtorename/a &&
+	>dirtorename/b &&
+
+	cat >.gitignore <<-\EOF &&
+	.gitignore
+	expect*
+	actual*
+	EOF
+
+	git -c core.useBuiltinFSMonitor= add . &&
+	test_tick &&
+	git -c core.useBuiltinFSMonitor= commit -m initial &&
+
+	git config core.useBuiltinFSMonitor true
+'
+
+test_expect_success 'update-index implicitly starts daemon' '
+	test_must_fail git fsmonitor--daemon status &&
+
+	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_1" \
+		git update-index --fsmonitor &&
+
+	git fsmonitor--daemon status &&
+	test_might_fail git fsmonitor--daemon stop &&
+
+	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_1
+'
+
+test_expect_success 'status implicitly starts daemon' '
+	test_must_fail git fsmonitor--daemon status &&
+
+	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_2" \
+		git status >actual &&
+
+	git fsmonitor--daemon status &&
+	test_might_fail git fsmonitor--daemon stop &&
+
+	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_2
+'
+
+edit_files() {
+	echo 1 >modified
+	echo 2 >dir1/modified
+	echo 3 >dir2/modified
+	>dir1/untracked
+}
+
+delete_files() {
+	rm -f delete
+	rm -f dir1/delete
+	rm -f dir2/delete
+}
+
+create_files() {
+	echo 1 >new
+	echo 2 >dir1/new
+	echo 3 >dir2/new
+}
+
+rename_files() {
+	mv rename renamed
+	mv dir1/rename dir1/renamed
+	mv dir2/rename dir2/renamed
+}
+
+file_to_directory() {
+	rm -f delete
+	mkdir delete
+	echo 1 >delete/new
+}
+
+directory_to_file() {
+	rm -rf dir1
+	echo 1 >dir1
+}
+
+verify_status() {
+	git status >actual &&
+	GIT_INDEX_FILE=.git/fresh-index git read-tree master &&
+	GIT_INDEX_FILE=.git/fresh-index git -c core.useBuiltinFSMonitor= status >expect &&
+	test_cmp expect actual &&
+	echo HELLO AFTER &&
+	cat .git/trace &&
+	echo HELLO AFTER
+}
+
+# The next few test cases confirm that our fsmonitor daemon sees each type
+# of OS filesystem notification that we care about.  At this layer we just
+# ensure we are getting the OS notifications and do not try to confirm what
+# is reported by `git status`.
+#
+# We run a simple query after modifying the filesystem just to introduce
+# a bit of a delay so that the trace logging from the daemon has time to
+# get flushed to disk.
+#
+# We `reset` and `clean` at the bottom of each test (and before stopping the
+# daemon) because these commands might implicitly restart the daemon.
+
+clean_up_repo_and_stop_daemon () {
+	git reset --hard HEAD
+	git clean -fd
+	git fsmonitor--daemon stop
+	rm -f .git/trace
+}
+
+test_expect_success 'edit some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	edit_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/modified$"  .git/trace &&
+	grep "^event: dir2/modified$"  .git/trace &&
+	grep "^event: modified$"       .git/trace &&
+	grep "^event: dir1/untracked$" .git/trace
+'
+
+test_expect_success 'create some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	create_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/new$" .git/trace &&
+	grep "^event: dir2/new$" .git/trace &&
+	grep "^event: new$"      .git/trace
+'
+
+test_expect_success 'delete some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	delete_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/delete$" .git/trace &&
+	grep "^event: dir2/delete$" .git/trace &&
+	grep "^event: delete$"      .git/trace
+'
+
+test_expect_success 'rename some files' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	rename_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/rename$"  .git/trace &&
+	grep "^event: dir2/rename$"  .git/trace &&
+	grep "^event: rename$"       .git/trace &&
+	grep "^event: dir1/renamed$" .git/trace &&
+	grep "^event: dir2/renamed$" .git/trace &&
+	grep "^event: renamed$"      .git/trace
+'
+
+test_expect_success 'rename directory' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	mv dirtorename dirrenamed &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dirtorename/*$" .git/trace &&
+	grep "^event: dirrenamed/*$"  .git/trace
+'
+
+test_expect_success 'file changes to directory' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	file_to_directory &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: delete$"     .git/trace &&
+	grep "^event: delete/new$" .git/trace
+'
+
+test_expect_success 'directory changes to a file' '
+	test_when_finished "clean_up_repo_and_stop_daemon" &&
+
+	(
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	directory_to_file &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1$" .git/trace
+'
+
+# The next few test cases exercise the token-resync code.  When the filesystem
+# drops events (because of filesystem velocity or because the daemon isn't
+# polling fast enough), we need to discard the cached data (relative to the
+# current token) and start collecting events under a new token.
+#
+# the 'test-tool fsmonitor-client flush' command can be used to send a
+# "flush" message to a running daemon and ask it to do a flush/resync.
+
+test_expect_success 'flush cached data' '
+	test_when_finished "kill_repo test_flush" &&
+
+	git init test_flush &&
+
+	(
+		GIT_TEST_FSMONITOR_TOKEN=true &&
+		export GIT_TEST_FSMONITOR_TOKEN &&
+
+		GIT_TRACE_FSMONITOR="$PWD/.git/trace_daemon" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon test_flush
+	) &&
+
+	# The daemon should have an initial token with no events in _0 and
+	# then a few (probably platform-specific number of) events in _1.
+	# These should both have the same <token_id>.
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000001:0" >actual_0 &&
+	nul_to_q <actual_0 >actual_q0 &&
+
+	touch test_flush/file_1 &&
+	touch test_flush/file_2 &&
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000001:0" >actual_1 &&
+	nul_to_q <actual_1 >actual_q1 &&
+
+	grep "file_1" actual_q1 &&
+
+	# Force a flush.  This will change the <token_id>, reset the <seq_nr>, and
+	# flush the file data.  Then create some events and ensure that the file
+	# again appears in the cache.  It should have the new <token_id>.
+
+	test-tool -C test_flush fsmonitor-client flush >flush_0 &&
+	nul_to_q <flush_0 >flush_q0 &&
+	grep "^builtin:test_00000002:0Q/Q$" flush_q0 &&
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000002:0" >actual_2 &&
+	nul_to_q <actual_2 >actual_q2 &&
+
+	grep "^builtin:test_00000002:0Q$" actual_q2 &&
+
+	touch test_flush/file_3 &&
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000002:0" >actual_3 &&
+	nul_to_q <actual_3 >actual_q3 &&
+
+	grep "file_3" actual_q3
+'
+
+# The next few test cases create repos where the .git directory is NOT
+# inside the working directory.  That is, where .git is a file
+# that points to a directory elsewhere.  This happens for submodules and
+# non-primary worktrees.
+
+test_expect_success 'setup worktree base' '
+	git init wt-base &&
+	echo 1 >wt-base/file1 &&
+	git -C wt-base add file1 &&
+	git -C wt-base commit -m "c1"
+'
+
+test_expect_success 'worktree with .git file' '
+	git -C wt-base worktree add ../wt-secondary &&
+
+	(
+		GIT_TRACE2_PERF="$PWD/trace2_wt_secondary" &&
+		export GIT_TRACE2_PERF &&
+
+		GIT_TRACE_FSMONITOR="$PWD/trace_wt_secondary" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon wt-secondary
+	) &&
+
+	git -C wt-secondary fsmonitor--daemon stop &&
+	test_must_fail git -C wt-secondary fsmonitor--daemon status
+'
+
+test_done
-- 
gitgitgadget



* [PATCH v2 26/28] p7519: add fsmonitor--daemon
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (24 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 25/28] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 27/28] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
                     ` (3 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and
the "Simple IPC" interface.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/perf/p7519-fsmonitor.sh | 42 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 5eb5044a103c..542ae61c99d2 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -24,7 +24,8 @@ test_description="Test core.fsmonitor"
 # GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
 # GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor. May be an
 #   absolute path to an integration. May be a space delimited list of
-#   absolute paths to integrations.
+#   absolute paths to integrations.  (This hook or list of hooks does not
+#   include the built-in fsmonitor--daemon.)
 #
 # The big win for using fsmonitor is the elimination of the need to scan the
 # working directory looking for changed and untracked files. If the file
@@ -135,10 +136,16 @@ test_expect_success "one time repo setup" '
 
 setup_for_fsmonitor() {
 	# set INTEGRATION_SCRIPT depending on the environment
-	if test -n "$INTEGRATION_PATH"
+	if test -n "$USE_FSMONITOR_DAEMON"
 	then
+		git config core.useBuiltinFSMonitor true &&
+		INTEGRATION_SCRIPT=false
+	elif test -n "$INTEGRATION_PATH"
+	then
+		git config core.useBuiltinFSMonitor false &&
 		INTEGRATION_SCRIPT="$INTEGRATION_PATH"
 	else
+		git config core.useBuiltinFSMonitor false &&
 		#
 		# Choose integration script based on existence of Watchman.
 		# Fall back to an empty integration script.
@@ -174,7 +181,10 @@ test_perf_w_drop_caches () {
 }
 
 test_fsmonitor_suite() {
-	if test -n "$INTEGRATION_SCRIPT"; then
+	if test -n "$USE_FSMONITOR_DAEMON"
+	then
+		DESC="builtin fsmonitor--daemon"
+	elif test -n "$INTEGRATION_SCRIPT"; then
 		DESC="fsmonitor=$(basename $INTEGRATION_SCRIPT)"
 	else
 		DESC="fsmonitor=disabled"
@@ -285,4 +295,30 @@ test_expect_success "setup without fsmonitor" '
 test_fsmonitor_suite
 trace_stop
 
+#
+# Run a full set of perf tests using the built-in fsmonitor--daemon.
+# It does not use the Hook API, so it has a different setup.
+# Explicitly start the daemon here, before we start client commands,
+# so that we can later add custom tracing.
+#
+
+test_lazy_prereq HAVE_FSMONITOR_DAEMON '
+	git version --build-options | grep "feature:" | grep "fsmonitor--daemon"
+'
+
+if test_have_prereq HAVE_FSMONITOR_DAEMON
+then
+	USE_FSMONITOR_DAEMON=t
+
+	trace_start fsmonitor--daemon--server
+	git fsmonitor--daemon start
+
+	trace_start fsmonitor--daemon--client
+	test_expect_success "setup for fsmonitor--daemon" 'setup_for_fsmonitor'
+	test_fsmonitor_suite
+
+	git fsmonitor--daemon stop
+	trace_stop
+fi
+
 test_done
-- 
gitgitgadget



* [PATCH v2 27/28] t7527: test status with untracked-cache and fsmonitor--daemon
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (25 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 26/28] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-22 13:57   ` [PATCH v2 28/28] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
                     ` (2 subsequent siblings)
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a 2x2 test matrix of the untracked-cache and fsmonitor--daemon
features, run a series of edits under each combination, and verify
that the status output is identical.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t7527-builtin-fsmonitor.sh | 97 ++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index eaed44ebad63..106adc2a7ee0 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -153,6 +153,8 @@ test_expect_success 'setup' '
 	.gitignore
 	expect*
 	actual*
+	flush*
+	trace*
 	EOF
 
 	git -c core.useBuiltinFSMonitor= add . &&
@@ -472,4 +474,99 @@ test_expect_success 'worktree with .git file' '
 	test_must_fail git -C wt-secondary fsmonitor--daemon status
 '
 
+# TODO Repeat one of the "edit" tests on wt-secondary and confirm that
+# TODO we get the same events and behavior -- that is, that fsmonitor--daemon
+# TODO correctly listens to events on both the working directory and to the
+# TODO referenced GITDIR.
+
+test_expect_success 'cleanup worktrees' '
+	kill_repo wt-secondary &&
+	kill_repo wt-base
+'
+
+# The next few tests perform arbitrary/contrived file operations and
+# confirm that status is correct.  That is, that the data (or lack of
+# data) from fsmonitor doesn't cause incorrect results.  And doesn't
+# cause incorrect results when the untracked-cache is enabled.
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success 'Matrix: setup for untracked-cache,fsmonitor matrix' '
+	test_might_fail git config --unset core.useBuiltinFSMonitor &&
+	git update-index --no-fsmonitor &&
+	test_might_fail git fsmonitor--daemon stop
+'
+
+matrix_clean_up_repo () {
+	git reset --hard HEAD
+	git clean -fd
+}
+
+matrix_try () {
+	uc=$1
+	fsm=$2
+	fn=$3
+
+	test_expect_success "Matrix[uc:$uc][fsm:$fsm] $fn" '
+		matrix_clean_up_repo &&
+		$fn &&
+		if test $uc = false -a $fsm = false
+		then
+			git status --porcelain=v1 >.git/expect.$fn
+		else
+			git status --porcelain=v1 >.git/actual.$fn &&
+			test_cmp .git/expect.$fn .git/actual.$fn
+		fi
+	'
+
+	return $?
+}
+
+uc_values="false"
+test_have_prereq UNTRACKED_CACHE && uc_values="false true"
+for uc_val in $uc_values
+do
+	if test $uc_val = false
+	then
+		test_expect_success "Matrix[uc:$uc_val] disable untracked cache" '
+			git config core.untrackedcache false &&
+			git update-index --no-untracked-cache
+		'
+	else
+		test_expect_success "Matrix[uc:$uc_val] enable untracked cache" '
+			git config core.untrackedcache true &&
+			git update-index --untracked-cache
+		'
+	fi
+
+	fsm_values="false true"
+	for fsm_val in $fsm_values
+	do
+		if test $fsm_val = false
+		then
+			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] disable fsmonitor" '
+				test_might_fail git config --unset core.useBuiltinFSMonitor &&
+				git update-index --no-fsmonitor &&
+				test_might_fail git fsmonitor--daemon stop 2>/dev/null
+			'
+		else
+			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] enable fsmonitor" '
+				git config core.useBuiltinFSMonitor true &&
+				git fsmonitor--daemon start &&
+				git update-index --fsmonitor
+			'
+		fi
+
+		matrix_try $uc_val $fsm_val edit_files
+		matrix_try $uc_val $fsm_val delete_files
+		matrix_try $uc_val $fsm_val create_files
+		matrix_try $uc_val $fsm_val rename_files
+		matrix_try $uc_val $fsm_val file_to_directory
+		matrix_try $uc_val $fsm_val directory_to_file
+	done
+done
+
 test_done
-- 
gitgitgadget



* [PATCH v2 28/28] t/perf: avoid copying builtin fsmonitor files into test repo
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (26 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 27/28] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-05-22 13:57   ` Jeff Hostetler via GitGitGadget
  2021-05-27  2:06   ` [PATCH v2 00/28] Builtin FSMonitor Feature Junio C Hamano
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
  29 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-05-22 13:57 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Do not try to copy a fsmonitor--daemon socket from the current
development directory into the test trash directory.

When we run the perf suite without an explicit source repo set,
we copy the contents of the current $GIT_DIR into the test trash directory.
Unix domain sockets cannot be copied in that manner, so the test
setup fails.

Additionally, omit any other fsmonitor--daemon temp files inside
the $GIT_DIR directory.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/perf/perf-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 601d9f67ddb0..3b97e3fc0f27 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -74,7 +74,7 @@ test_perf_copy_repo_contents () {
 	for stuff in "$1"/*
 	do
 		case "$stuff" in
-		*/objects|*/hooks|*/config|*/commondir|*/gitdir|*/worktrees)
+		*/objects|*/hooks|*/config|*/commondir|*/gitdir|*/worktrees|*/fsmonitor--daemon*)
 			;;
 		*)
 			cp -R "$stuff" "$repo/.git/" || exit 1
-- 
gitgitgadget


* Re: [PATCH v2 00/28] Builtin FSMonitor Feature
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (27 preceding siblings ...)
  2021-05-22 13:57   ` [PATCH v2 28/28] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
@ 2021-05-27  2:06   ` Junio C Hamano
  2021-06-02 11:28     ` Johannes Schindelin
  2021-06-22 15:45     ` Jeff Hostetler
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
  29 siblings, 2 replies; 237+ messages in thread
From: Junio C Hamano @ 2021-05-27  2:06 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler

These new global symbols are introduced by the series, but never
used outside the file they are added to:

fsmonitor-ipc.o        - fsmonitor_ipc__get_path
fsmonitor-ipc.o        - fsmonitor_ipc__get_state
fsmonitor-ipc.o        - fsmonitor_ipc__send_command

Perhaps make them file-scope static?

Thanks.


* Re: [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-05-22 13:56   ` [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-06-02 11:24     ` Johannes Schindelin
  2021-06-14 21:23       ` Johannes Schindelin
  0 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin @ 2021-06-02 11:24 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Jeff Hostetler, Derrick Stolee, Jeff Hostetler, Jeff Hostetler

Hi Jeff,

I know you're on vacation, therefore I would like to apologize for adding
to your post-vacation notification overload, but...

On Sat, 22 May 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
> new file mode 100644
> index 000000000000..e62901a85b5d
> --- /dev/null
> +++ b/fsmonitor-ipc.c
> @@ -0,0 +1,179 @@
> [...]
> +
> +int fsmonitor_ipc__send_query(const char *since_token,
> +			      struct strbuf *answer)
> +{
> +	int ret = -1;
> +	int tried_to_spawn = 0;
> +	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
> +	struct ipc_client_connection *connection = NULL;
> +	struct ipc_client_connect_options options
> +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
> +
> +	options.wait_if_busy = 1;
> +	options.wait_if_not_found = 0;
> +
> +	trace2_region_enter("fsm_client", "query", NULL);
> +
> +	trace2_data_string("fsm_client", NULL, "query/command",
> +			   since_token);
> +
> +try_again:
> +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
> +				       &connection);
> +
> +	switch (state) {
> +	case IPC_STATE__LISTENING:
> +		ret = ipc_client_send_command_to_connection(
> +			connection, since_token, strlen(since_token), answer);

Here, `since_token` can be `NULL` (and hence the `strlen(since_token)` can
lead to a segmentation fault). I ran into this situation while `git rebase
-i --autostash` wanted to apply the stashed changes.

Since I picked up your v2 and included it in Git for Windows v2.32.0-rc2,
I needed this hotfix: https://github.com/git-for-windows/git/pull/3241

Thanks,
Dscho


* Re: [PATCH v2 00/28] Builtin FSMonitor Feature
  2021-05-27  2:06   ` [PATCH v2 00/28] Builtin FSMonitor Feature Junio C Hamano
@ 2021-06-02 11:28     ` Johannes Schindelin
  2021-06-22 15:45     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-06-02 11:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler

Hi Junio,

On Thu, 27 May 2021, Junio C Hamano wrote:

> These new global symbols are introduced by the series, but never
> used outside the file they are added to:
>
> fsmonitor-ipc.o        - fsmonitor_ipc__get_path
> fsmonitor-ipc.o        - fsmonitor_ipc__get_state
> fsmonitor-ipc.o        - fsmonitor_ipc__send_command
>
> Perhaps make them file-scope static?

Good idea!

By the way, GitGitGadget keeps getting confused by the fact that one of
Peff's patches looks very similar to the tip commit of this here patch
series, and mislabels Jeff's PR as being closed.

Would you terribly mind picking up v2 some time soon so that I do not have
to click "Reopen" on Jeff's PR all the time? (I had to reopen it a couple
of times already: https://github.com/gitgitgadget/git/pull/923, and I do
lack the time to teach GitGitGadget new tricks to avoid this mislabeling.)

Thank you,
Dscho


* Re: [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
  2021-05-22 13:56   ` [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
@ 2021-06-11  6:32     ` Junio C Hamano
  0 siblings, 0 replies; 237+ messages in thread
From: Junio C Hamano @ 2021-06-11  6:32 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler

"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +#include "test-tool.h"
> +#include "cache.h"
> +#include "parse-options.h"
> +//#include "fsmonitor.h"
> +#include "fsmonitor-ipc.h"
> +//#include "compat/fsmonitor/fsmonitor-fs-listen.h"
> +//#include "fsmonitor--daemon.h"
> +//#include "simple-ipc.h"

Please never commit commented-out code.  Comments are for humans
(and they shouldn't use the // style in this project)---use removal
for machines.

Thanks.




* Re: [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-06-02 11:24     ` Johannes Schindelin
@ 2021-06-14 21:23       ` Johannes Schindelin
  0 siblings, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-06-14 21:23 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Jeff Hostetler, Derrick Stolee, Jeff Hostetler, Jeff Hostetler

Hi Jeff,

On Wed, 2 Jun 2021, Johannes Schindelin wrote:

> I know you're on vacation, therefore I would like to apologize for adding
> to your post-vacation notification overload, but...

Now that you're back from vacation...

> On Sat, 22 May 2021, Jeff Hostetler via GitGitGadget wrote:
>
> > From: Jeff Hostetler <jeffhost@microsoft.com>
> >
> > diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
> > new file mode 100644
> > index 000000000000..e62901a85b5d
> > --- /dev/null
> > +++ b/fsmonitor-ipc.c
> > @@ -0,0 +1,179 @@
> > [...]
> > +
> > +int fsmonitor_ipc__send_query(const char *since_token,
> > +			      struct strbuf *answer)
> > +{
> > +	int ret = -1;
> > +	int tried_to_spawn = 0;
> > +	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
> > +	struct ipc_client_connection *connection = NULL;
> > +	struct ipc_client_connect_options options
> > +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
> > +
> > +	options.wait_if_busy = 1;
> > +	options.wait_if_not_found = 0;
> > +
> > +	trace2_region_enter("fsm_client", "query", NULL);
> > +
> > +	trace2_data_string("fsm_client", NULL, "query/command",
> > +			   since_token);
> > +
> > +try_again:
> > +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
> > +				       &connection);
> > +
> > +	switch (state) {
> > +	case IPC_STATE__LISTENING:
> > +		ret = ipc_client_send_command_to_connection(
> > +			connection, since_token, strlen(since_token), answer);
>
> Here, `since_token` can be `NULL` (and hence the `strlen(since_token)` can
> lead to a segmentation fault). I ran into this situation while `git rebase
> -i --autostash` wanted to apply the stashed changes.
>
> Since I picked up your v2 and included it in Git for Windows v2.32.0-rc2,
> I needed this hotfix: https://github.com/git-for-windows/git/pull/3241

I actually noticed another similar issue and fixed it in time for Git for
Windows v2.32.0, but eventually figured out the actual culprit, with a
much better fix:

-- snip --
commit bc40a560d3c95040b55fd7be6fe5b7012d267f8f
Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Date:   Wed Jun 9 09:49:50 2021 +0200

    fixup! fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC

    In FSMonitor v1, we made sure to only use a valid `since_token` when
    querying the FSMonitor. This condition was accidentally lost in v2, and
    caused segmentation faults uncovered by Scalar's Functional Tests.

    I had tried to fix this in https://github.com/git-for-windows/git/pull/3241,
    but the fix was incomplete, and I had to follow up with
    https://github.com/git-for-windows/git/pull/3258. However, it turns out that
    both of them were actually only work-arounds; I should have dug deeper
    to figure out _why_ the `since_token` was no longer guaranteed not to be
    `NULL`, and I finally did.

    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

diff --git a/fsmonitor.c b/fsmonitor.c
index 22623fd228f..0b40643442e 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -290,8 +290,9 @@ void refresh_fsmonitor(struct index_state *istate)
 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");

 	if (r->settings.use_builtin_fsmonitor > 0) {
-		query_success = !fsmonitor_ipc__send_query(
-			istate->fsmonitor_last_update, &query_result);
+		query_success = istate->fsmonitor_last_update &&
+			!fsmonitor_ipc__send_query(istate->fsmonitor_last_update,
+						   &query_result);
 		if (query_success) {
 			/*
 			 * The response contains a series of nul terminated

-- snap --

Would you mind squashing this in when you re-roll?

Ciao,
Dscho


* Re: [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
  2021-05-22 13:56   ` [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
@ 2021-06-14 21:28     ` Johannes Schindelin
  0 siblings, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-06-14 21:28 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Jeff Hostetler, Derrick Stolee, Jeff Hostetler

Hi Jeff,

On Sat, 22 May 2021, Johannes Schindelin via GitGitGadget wrote:

> diff --git a/fsmonitor.c b/fsmonitor.c
> index 9c9b2abc9414..c6d3c34ad78e 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -3,6 +3,7 @@
>  #include "dir.h"
>  #include "ewah/ewok.h"
>  #include "fsmonitor.h"
> +#include "fsmonitor-ipc.h"
>  #include "run-command.h"
>  #include "strbuf.h"
>
> @@ -231,6 +232,7 @@ static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
>
>  void refresh_fsmonitor(struct index_state *istate)
>  {
> +	struct repository *r = istate->repo ? istate->repo : the_repository;
>  	struct strbuf query_result = STRBUF_INIT;
>  	int query_success = 0, hook_version = -1;
>  	size_t bol = 0; /* beginning of line */
> @@ -247,6 +249,46 @@ void refresh_fsmonitor(struct index_state *istate)
>  	istate->fsmonitor_has_run_once = 1;
>
>  	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
> +
> +	if (r->settings.use_builtin_fsmonitor > 0) {
> +		query_success = !fsmonitor_ipc__send_query(
> +			istate->fsmonitor_last_update, &query_result);

As I pointed out elsewhere in the thread, this is a slight change in
behavior: in the previous iteration, we had this call in
`query_fsmonitor()`, which was only ever called if
`istate->fsmonitor_last_update` is non-`NULL`.

The code in `fsmonitor_ipc__send_query()` does actually depend on this,
therefore we need this change to be squashed in:

-- snip --
diff --git a/fsmonitor.c b/fsmonitor.c
index 22623fd228f..0b40643442e 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -290,8 +290,9 @@ void refresh_fsmonitor(struct index_state *istate)
 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");

 	if (r->settings.use_builtin_fsmonitor > 0) {
-		query_success = !fsmonitor_ipc__send_query(
-			istate->fsmonitor_last_update, &query_result);
+		query_success = istate->fsmonitor_last_update &&
+			!fsmonitor_ipc__send_query(istate->fsmonitor_last_update,
+						   &query_result);
 		if (query_success) {
 			/*
 			 * The response contains a series of nul terminated
-- snap --
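
To see the short-circuit from the diff above in isolation, here is a
minimal sketch. It assumes a stand-in for the real query function:
`send_query_stub()` and `query_builtin()` are hypothetical names used
only for illustration, and the stub merely imitates the convention
(visible in the quoted code) that 0 means success and that the token is
passed to `strlen()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int calls; /* how often the stub below was invoked */

/*
 * Hypothetical stand-in for fsmonitor_ipc__send_query(): it calls
 * strlen() on the token, so it must never be handed NULL.
 * Returns 0 on success and -1 on failure.
 */
static int send_query_stub(const char *since_token)
{
	assert(since_token);
	calls++;
	return strlen(since_token) ? 0 : -1;
}

/*
 * Same shape as the fixed expression in refresh_fsmonitor(): the
 * left operand of && is evaluated first, so the stub is skipped
 * entirely when there is no previous token.
 */
static int query_builtin(const char *last_update)
{
	return last_update && !send_query_stub(last_update);
}
```

With a `NULL` token, `query_builtin()` reports failure without ever
reaching the `strlen()` call, which is the behavior the squashed change
is after.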

Thanks,
Dscho

> +		if (query_success) {
> +			/*
> +			 * The response contains a series of nul terminated
> +			 * strings.  The first is the new token.
> +			 *
> +			 * Use `char *buf` as an interlude to trick the CI
> +			 * static analysis to let us use `strbuf_addstr()`
> +			 * here (and only copy the token) rather than
> +			 * `strbuf_addbuf()`.
> +			 */
> +			buf = query_result.buf;
> +			strbuf_addstr(&last_update_token, buf);
> +			bol = last_update_token.len + 1;
> +		} else {
> +			/*
> +			 * The builtin daemon is not available on this
> +			 * platform -OR- we failed to get a response.
> +			 *
> +			 * Generate a fake token (rather than a V1
> +			 * timestamp) for the index extension.  (If
> +			 * they switch back to the hook API, we don't
> +			 * want ambiguous state.)
> +			 */
> +			strbuf_addstr(&last_update_token, "builtin:fake");
> +		}
> +
> +		/*
> +		 * Regardless of whether we successfully talked to a
> +		 * fsmonitor daemon or not, we skip over and do not
> +		 * try to use the hook.  The "core.useBuiltinFSMonitor"
> +		 * config setting ALWAYS overrides the "core.fsmonitor"
> +		 * hook setting.
> +		 */
> +		goto apply_results;
> +	}
> +
>  	/*
>  	 * This could be racy so save the date/time now and query_fsmonitor
>  	 * should be inclusive to ensure we don't miss potential changes.
> @@ -301,6 +343,7 @@ void refresh_fsmonitor(struct index_state *istate)
>  			core_fsmonitor, query_success ? "success" : "failure");
>  	}
>
> +apply_results:
>  	/* a fsmonitor process can return '/' to indicate all entries are invalid */
>  	if (query_success && query_result.buf[bol] != '/') {
>  		/* Mark all entries returned by the monitor as dirty */
> diff --git a/repo-settings.c b/repo-settings.c
> index f7fff0f5ab83..93aab92ff164 100644
> --- a/repo-settings.c
> +++ b/repo-settings.c
> @@ -58,6 +58,9 @@ void prepare_repo_settings(struct repository *r)
>  		r->settings.core_multi_pack_index = value;
>  	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
>
> +	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
> +		r->settings.use_builtin_fsmonitor = 1;
> +
>  	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
>  		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
>  		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
> diff --git a/repository.h b/repository.h
> index b385ca3c94b6..d6e7f61f9cf7 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -29,6 +29,8 @@ enum fetch_negotiation_setting {
>  struct repo_settings {
>  	int initialized;
>
> +	int use_builtin_fsmonitor;
> +
>  	int core_commit_graph;
>  	int commit_graph_read_changed_paths;
>  	int gc_write_commit_graph;
> --
> gitgitgadget
>
>


* Re: [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system
  2021-05-22 13:57   ` [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
@ 2021-06-14 21:42     ` Johannes Schindelin
  0 siblings, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-06-14 21:42 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Jeff Hostetler, Derrick Stolee, Jeff Hostetler, Jeff Hostetler

Hi Jeff,

I've found an issue with this patch that I unfortunately failed to fix in
time for Git for Windows v2.32.0. It is a subtle issue, and it only reared
its head in the form of flakiness when running Scalar's Functional Tests
(which is the most comprehensive test suite we have in the absence of
proper integration tests for Git).

The issue is the change in behavior between the previous iteration and
this one when replying with the trivial response:

On Sat, 22 May 2021, Jeff Hostetler via GitGitGadget wrote:

> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index e807aa8f6741..985a82cf39e0 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> [...]
> @@ -522,6 +672,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  		 */
>  		do_flush = 1;
>  		do_trivial = 1;
> +		do_cookie = 1;
>
>  	} else if (!skip_prefix(command, "builtin:", &p)) {
>  		/* assume V1 timestamp or garbage */
> @@ -535,6 +686,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  				  "fsmonitor: unsupported V1 protocol '%s'"),
>  				 command);
>  		do_trivial = 1;
> +		do_cookie = 1;
>
>  	} else {
>  		/* We have "builtin:*" */
> @@ -544,12 +696,14 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  					 "fsmonitor: invalid V2 protocol token '%s'",
>  					 command);
>  			do_trivial = 1;
> +			do_cookie = 1;

In the first iteration of this patch series, these three trivial responses
were sent without first writing a cookie file and then waiting for the
event to arrive.

The symptom of the issue here is that some of Scalar's Functional Tests
hung, ostensibly waiting for a cookie file that never arrived.

I am not 100% clear on why this only happened in Scalar's Functional Tests
and not in the regression test in Git's test suite, but here are my
thoughts on this:

- I _suspect_ that the .git/ directory gets deleted on the client side
  after receiving a trivial response and finishing the test case, and
  _before_ the daemon can receive the event. As the directory already got
  deleted, the event never arrives.

- I _think_ that running the test cases in parallel (10 concurrent test
  cases, if memory serves) exacerbates this problem.

- Simply _not_ writing that cookie file (i.e. removing those three
  `do_cookie = 1` assignments above) is sufficient to let Scalar's
  Functional Tests pass.

- Those trivial responses do not _actually_ need to be guarded behind that
  cookie file. The worst that could happen is for the FSMonitor daemon to
  over-report paths that need to be `lstat()`ed.

Therefore, I would like to suggest this diff to be squashed into this
patch in a re-roll:

-- snip --
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 985a82cf39e..4afbb36fe61 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -672,7 +672,6 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 		 */
 		do_flush = 1;
 		do_trivial = 1;
-		do_cookie = 1;

 	} else if (!skip_prefix(command, "builtin:", &p)) {
 		/* assume V1 timestamp or garbage */
@@ -686,7 +685,6 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 				  "fsmonitor: unsupported V1 protocol '%s'"),
 				 command);
 		do_trivial = 1;
-		do_cookie = 1;

 	} else {
 		/* We have "builtin:*" */
@@ -696,7 +694,6 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 					 "fsmonitor: invalid V2 protocol token '%s'",
 					 command);
 			do_trivial = 1;
-			do_cookie = 1;

 		} else {
 			/*
-- snap --
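
The rule that remains after this diff (only a well-formed V2 token
waits for a cookie, every trivial response skips it) can be sketched
roughly as below. This is a simplified illustration, not the daemon's
code: `plan_response()` and `struct response_plan` are hypothetical
names, the "flush" command string is an assumption since the quoted
hunks do not show that condition, and the token check here only looks
at the "builtin:<token_id>:<seq_nr>" shape, whereas the real daemon
presumably validates more than that.

```c
#include <assert.h>
#include <string.h>

struct response_plan {
	int do_flush;   /* discard cached data and start over */
	int do_trivial; /* reply that everything may have changed */
	int do_cookie;  /* wait for a cookie-file event before replying */
};

/* Hypothetical helper: decide how to answer a client command. */
static struct response_plan plan_response(const char *command)
{
	struct response_plan plan = { 0, 0, 0 };
	const char *token, *colon;

	assert(command);

	if (!strcmp(command, "flush")) { /* assumed command name */
		plan.do_flush = 1;
		plan.do_trivial = 1;
	} else if (strncmp(command, "builtin:", 8)) {
		/* V1 timestamp or garbage: answer trivially, no cookie. */
		plan.do_trivial = 1;
	} else {
		token = command + 8;
		colon = strchr(token, ':');
		if (!colon || colon == token || !colon[1] ||
		    strspn(colon + 1, "0123456789") != strlen(colon + 1)) {
			/* Malformed V2 token: answer trivially, no cookie. */
			plan.do_trivial = 1;
		} else {
			/* Well-formed "builtin:<token_id>:<seq_nr>". */
			plan.do_cookie = 1;
		}
	}
	return plan;
}
```

The point of the sketch is that `do_trivial` and `do_cookie` are never
set together by the command parsing, so a trivial response cannot block
on a cookie event that may never arrive.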

For the record, the Windows installer and macOS/Linux packages available
at https://github.com/microsoft/git/releases/tag/v2.32.0.vfs.0.1 do come
with this fix.

Thanks,
Dscho

>
>  		} else {
>  			/*
>  			 * We have a V2 valid token:
>  			 *     "builtin:<token_id>:<seq_nr>"
>  			 */
> +			do_cookie = 1;
>  		}
>  	}
>
> @@ -558,6 +712,30 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
>  	if (!state->current_token_data)
>  		BUG("fsmonitor state does not have a current token");
>
> +	/*
> +	 * Write a cookie file inside the directory being watched in
> +	 * an effort to flush out existing filesystem events that we
> +	 * actually care about.  Suspend this client thread until we
> +	 * see the filesystem events for this cookie file.
> +	 *
> +	 * Creating the cookie lets us guarantee that our FS listener
> +	 * thread has drained the kernel queue and we are caught up
> +	 * with the kernel.
> +	 *
> +	 * If we cannot create the cookie (or otherwise guarantee that
> +	 * we are caught up), we send a trivial response.  We have to
> +	 * assume that there might be some very, very recent activity
> +	 * on the FS still in flight.
> +	 */
> +	if (do_cookie) {
> +		cookie_result = with_lock__wait_for_cookie(state);
> +		if (cookie_result != FCIR_SEEN) {
> +			error(_("fsmonitor: cookie_result '%d' != SEEN"),
> +			      cookie_result);
> +			do_trivial = 1;
> +		}
> +	}
> +
>  	if (do_flush)
>  		with_lock__do_force_resync(state);
>
> @@ -769,7 +947,9 @@ static int handle_client(void *data,
>  	return result;
>  }
>
> -#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
> +#define FSMONITOR_DIR           "fsmonitor--daemon"
> +#define FSMONITOR_COOKIE_DIR    "cookies"
> +#define FSMONITOR_COOKIE_PREFIX (FSMONITOR_DIR "/" FSMONITOR_COOKIE_DIR "/")
>
>  enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
>  	const char *rel)
> @@ -922,6 +1102,9 @@ void fsmonitor_publish(struct fsmonitor_daemon_state *state,
>  		}
>  	}
>
> +	if (cookie_names->nr)
> +		with_lock__mark_cookies_seen(state, cookie_names);
> +
>  	pthread_mutex_unlock(&state->main_lock);
>  }
>
> @@ -1011,7 +1194,9 @@ static int fsmonitor_run_daemon(void)
>
>  	memset(&state, 0, sizeof(state));
>
> +	hashmap_init(&state.cookies, cookies_cmp, NULL, 0);
>  	pthread_mutex_init(&state.main_lock, NULL);
> +	pthread_cond_init(&state.cookies_cond, NULL);
>  	state.error_code = 0;
>  	state.current_token_data = fsmonitor_new_token_data();
>
> @@ -1035,6 +1220,23 @@ static int fsmonitor_run_daemon(void)
>  		state.nr_paths_watching = 2;
>  	}
>
> +	/*
> +	 * We will write filesystem syncing cookie files into
> +	 * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>.
> +	 */
> +	strbuf_init(&state.path_cookie_prefix, 0);
> +	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
> +
> +	strbuf_addch(&state.path_cookie_prefix, '/');
> +	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_DIR);
> +	mkdir(state.path_cookie_prefix.buf, 0777);
> +
> +	strbuf_addch(&state.path_cookie_prefix, '/');
> +	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_DIR);
> +	mkdir(state.path_cookie_prefix.buf, 0777);
> +
> +	strbuf_addch(&state.path_cookie_prefix, '/');
> +
>  	/*
>  	 * Confirm that we can create platform-specific resources for the
>  	 * filesystem listener before we bother starting all the threads.
> @@ -1047,6 +1249,7 @@ static int fsmonitor_run_daemon(void)
>  	err = fsmonitor_run_daemon_1(&state);
>
>  done:
> +	pthread_cond_destroy(&state.cookies_cond);
>  	pthread_mutex_destroy(&state.main_lock);
>  	fsmonitor_fs_listen__dtor(&state);
>
> @@ -1054,6 +1257,11 @@ static int fsmonitor_run_daemon(void)
>
>  	strbuf_release(&state.path_worktree_watch);
>  	strbuf_release(&state.path_gitdir_watch);
> +	strbuf_release(&state.path_cookie_prefix);
> +
> +	/*
> +	 * NEEDSWORK: Consider "rm -rf <gitdir>/<fsmonitor-dir>"
> +	 */
>
>  	return err;
>  }
> diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
> index 89a9ef20b24b..e9fc099bae9c 100644
> --- a/fsmonitor--daemon.h
> +++ b/fsmonitor--daemon.h
> @@ -45,6 +45,11 @@ struct fsmonitor_daemon_state {
>
>  	struct fsmonitor_token_data *current_token_data;
>
> +	struct strbuf path_cookie_prefix;
> +	pthread_cond_t cookies_cond;
> +	int cookie_seq;
> +	struct hashmap cookies;
> +
>  	int error_code;
>  	struct fsmonitor_daemon_backend_data *backend_data;
>
> --
> gitgitgadget
>
>


* Re: [PATCH v2 00/28] Builtin FSMonitor Feature
  2021-05-27  2:06   ` [PATCH v2 00/28] Builtin FSMonitor Feature Junio C Hamano
  2021-06-02 11:28     ` Johannes Schindelin
@ 2021-06-22 15:45     ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-06-22 15:45 UTC (permalink / raw)
  To: Junio C Hamano, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 5/26/21 10:06 PM, Junio C Hamano wrote:
> These new global symbols are introduced by the series, but never
> used outside the file they are added to:
> 
> fsmonitor-ipc.o        - fsmonitor_ipc__get_path
> fsmonitor-ipc.o        - fsmonitor_ipc__get_state
> fsmonitor-ipc.o        - fsmonitor_ipc__send_command
> 
> Perhaps make them file-scope static?

I intended these to be part of the API for talking to a builtin
FSMonitor.  They are called from builtin/fsmonitor--daemon.c
and t/helper/test-fsmonitor-client.c

Jeff


* [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
                     ` (28 preceding siblings ...)
  2021-05-27  2:06   ` [PATCH v2 00/28] Builtin FSMonitor Feature Junio C Hamano
@ 2021-07-01 14:47   ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 01/34] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
                       ` (34 more replies)
  29 siblings, 35 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler

Here is V3 of my patch series to add a builtin FSMonitor daemon to Git. I
rebased this series onto v2.32.0.

V3 addresses most of the previous review comments and things we've learned
from our experimental testing of V2. (A version of V2 was shipped as an
experimental feature in the v2.32.0-based releases of Git for Windows and
VFS for Git.)

There are still a few items that I need to address, but that list is getting
very short.

The following items from my V2 cover have not yet been addressed:

[ ] Revisit how the client handles the IPC_STATE__NOT_LISTENING state
(where a daemon appears to be running, but is non-responsive)

[ ] Consider having daemon chdir() out of the working directory to avoid
directory handle issues on Windows.

[ ] On Windows, if the daemon is started as an elevated process, then client
commands might not have access to communicate with it.

[ ] Review if/how we decide to shut down the FSMonitor daemon after a
significant idle period.

Also, there are potential problems with the untracked-cache that we have
been looking at. Concurrently, Tao Klerks independently noticed similar
problems with the untracked-cache and has reported/discussed them here on
the mailing list. I would like to get to the bottom of them before going
further -- at this point I don't know whether they are related to FSMonitor or not.

In this version, the first commit updates the Simple IPC API to make it
easier to pass binary data using {char *, size_t} rather than assuming that
the message is a null-terminated string. FSMonitor does not use binary
messages and doesn't really need this API change, but I thought it best to
fix the API now before we have other callers of IPC.
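
As a rough illustration of why an explicit {char *, size_t} pair
matters, consider a payload with embedded NUL bytes. The sketch below
is not the Simple IPC API and does not claim to show any real wire
format; `count_fields()` and the sample payload are made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the NUL-separated fields in
 * buf[0..len).  The caller supplies the length, so embedded NUL
 * bytes do not cut the message short.
 */
static size_t count_fields(const char *buf, size_t len)
{
	size_t n = 0, pos = 0;

	assert(buf || !len);

	while (pos < len) {
		const char *nul = memchr(buf + pos, '\0', len - pos);

		n++;
		if (!nul)
			break; /* a final field without a terminator still counts */
		pos = (size_t)(nul - buf) + 1;
	}
	return n;
}
```

For a buffer such as `"alpha\0beta\0gamma"` passed with its full size,
the helper walks all three fields, while `strlen()` on the same buffer
stops at the first NUL and sees only "alpha".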

This patch series contains 34 commits and is rather large. If it would help
with the review, I could try to divide it into a client-side part 1 and a
daemon-side part 2 -- if there is interest.

Jeff Hostetler (34):
  simple-ipc: preparations for supporting binary messages.
  fsmonitor--daemon: man page
  fsmonitor--daemon: update fsmonitor documentation
  fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  help: include fsmonitor--daemon feature flag in version info
  fsmonitor: config settings are repository-specific
  fsmonitor: use IPC to query the builtin FSMonitor daemon
  fsmonitor--daemon: add a built-in fsmonitor daemon
  fsmonitor--daemon: implement 'stop' and 'status' commands
  t/helper/fsmonitor-client: create IPC client to talk to FSMonitor
    Daemon
  fsmonitor-fs-listen-win32: stub in backend for Windows
  fsmonitor-fs-listen-macos: stub in backend for MacOS
  fsmonitor--daemon: implement 'run' command
  fsmonitor--daemon: implement 'start' command
  fsmonitor: do not try to operate on bare repos
  fsmonitor--daemon: add pathname classification
  fsmonitor--daemon: define token-ids
  fsmonitor--daemon: create token-based changed path cache
  fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  fsmonitor-fs-listen-macos: add macos header files for FSEvent
  fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  fsmonitor--daemon: implement handle_client callback
  t/helper/test-touch: add helper to touch a series of files
  t/perf/p7519: speed up test using "test-tool touch"
  t/perf: avoid copying builtin fsmonitor files into test repo
  t/perf/p7519: add fsmonitor--daemon test cases
  t7527: create test for fsmonitor--daemon
  fsmonitor--daemon: periodically truncate list of modified files
  fsmonitor--daemon: use a cookie file to sync with file system
  fsmonitor: enhance existing comments
  fsmonitor: force update index after large responses
  t7527: test status with untracked-cache and fsmonitor--daemon
  fsmonitor: handle shortname for .git
  t7527: test FS event reporting on MacOS WRT case and Unicode

 .gitignore                                   |    1 +
 Documentation/config/core.txt                |   56 +-
 Documentation/git-fsmonitor--daemon.txt      |   75 +
 Documentation/git-update-index.txt           |   27 +-
 Documentation/githooks.txt                   |    3 +-
 Makefile                                     |   17 +
 builtin.h                                    |    1 +
 builtin/fsmonitor--daemon.c                  | 1567 ++++++++++++++++++
 builtin/update-index.c                       |   20 +-
 cache.h                                      |    1 -
 compat/fsmonitor/fsmonitor-fs-listen-macos.c |  497 ++++++
 compat/fsmonitor/fsmonitor-fs-listen-win32.c |  663 ++++++++
 compat/fsmonitor/fsmonitor-fs-listen.h       |   49 +
 compat/simple-ipc/ipc-unix-socket.c          |   14 +-
 compat/simple-ipc/ipc-win32.c                |   14 +-
 config.c                                     |   14 -
 config.h                                     |    1 -
 config.mak.uname                             |    4 +
 contrib/buildsystems/CMakeLists.txt          |    8 +
 environment.c                                |    1 -
 fsmonitor--daemon.h                          |  140 ++
 fsmonitor-ipc.c                              |  179 ++
 fsmonitor-ipc.h                              |   48 +
 fsmonitor.c                                  |  189 ++-
 fsmonitor.h                                  |   14 +-
 git.c                                        |    1 +
 help.c                                       |    4 +
 repo-settings.c                              |   48 +
 repository.h                                 |   11 +
 simple-ipc.h                                 |    7 +-
 t/README                                     |    4 +-
 t/helper/test-fsmonitor-client.c             |  121 ++
 t/helper/test-simple-ipc.c                   |   34 +-
 t/helper/test-tool.c                         |    2 +
 t/helper/test-tool.h                         |    2 +
 t/helper/test-touch.c                        |  126 ++
 t/perf/p7519-fsmonitor.sh                    |   51 +-
 t/perf/perf-lib.sh                           |    2 +-
 t/t7519-status-fsmonitor.sh                  |   38 +
 t/t7527-builtin-fsmonitor.sh                 |  679 ++++++++
 t/test-lib.sh                                |    6 +
 41 files changed, 4618 insertions(+), 121 deletions(-)
 create mode 100644 Documentation/git-fsmonitor--daemon.txt
 create mode 100644 builtin/fsmonitor--daemon.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
 create mode 100644 fsmonitor--daemon.h
 create mode 100644 fsmonitor-ipc.c
 create mode 100644 fsmonitor-ipc.h
 create mode 100644 t/helper/test-fsmonitor-client.c
 create mode 100644 t/helper/test-touch.c
 create mode 100755 t/t7527-builtin-fsmonitor.sh


base-commit: ebf3c04b262aa27fbb97f8a0156c2347fecafafb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-923%2Fjeffhostetler%2Fbuiltin-fsmonitor-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-923/jeffhostetler/builtin-fsmonitor-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/923

Range-diff vs v2:

  1:  763fa1ee7bb =  1:  cafc71c8d7d simple-ipc: preparations for supporting binary messages.
  2:  fc180e8591b =  2:  5db2c0390a6 fsmonitor--daemon: man page
  3:  d56f3e91db9 =  3:  86413bfe347 fsmonitor--daemon: update fsmonitor documentation
  4:  e4a26372877 !  4:  dfcd3e5ac97 fsmonitor-ipc: create client routines for git-fsmonitor--daemon
     @@ fsmonitor-ipc.c (new)
      +	struct ipc_client_connection *connection = NULL;
      +	struct ipc_client_connect_options options
      +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
     ++	const char *tok = since_token ? since_token : "";
     ++	size_t tok_len = since_token ? strlen(since_token) : 0;
      +
      +	options.wait_if_busy = 1;
      +	options.wait_if_not_found = 0;
      +
      +	trace2_region_enter("fsm_client", "query", NULL);
     -+
     -+	trace2_data_string("fsm_client", NULL, "query/command",
     -+			   since_token);
     ++	trace2_data_string("fsm_client", NULL, "query/command", tok);
      +
      +try_again:
      +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
     @@ fsmonitor-ipc.c (new)
      +	switch (state) {
      +	case IPC_STATE__LISTENING:
      +		ret = ipc_client_send_command_to_connection(
     -+			connection, since_token, strlen(since_token), answer);
     ++			connection, tok, tok_len, answer);
      +		ipc_client_close_connection(connection);
      +
      +		trace2_data_intmax("fsm_client", NULL,
     @@ fsmonitor-ipc.c (new)
      +		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
      +	int ret;
      +	enum ipc_active_state state;
     ++	const char *c = command ? command : "";
     ++	size_t c_len = command ? strlen(command) : 0;
      +
      +	strbuf_reset(answer);
      +
     @@ fsmonitor-ipc.c (new)
      +		return -1;
      +	}
      +
     -+	ret = ipc_client_send_command_to_connection(connection,
     -+						    command, strlen(command),
     ++	ret = ipc_client_send_command_to_connection(connection, c, c_len,
      +						    answer);
      +	ipc_client_close_connection(connection);
      +
      +	if (ret == -1) {
     -+		die("could not send '%s' command to fsmonitor--daemon",
     -+		    command);
     ++		die("could not send '%s' command to fsmonitor--daemon", c);
      +		return -1;
      +	}
      +
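Note on the range-diff entry above: the v3 change to fsmonitor-ipc.c guards
against a NULL token or command by substituting an empty string and a zero
length before calling strlen() or passing the value to a "%s" format. The
following is only a minimal standalone sketch of that pattern, not code from
the series; `struct payload` and `normalize_command()` are hypothetical names
invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch series): pairs a buffer
 * with its length so callers never call strlen() on a NULL pointer.
 */
struct payload {
	const char *buf;
	size_t len;
};

/*
 * Map a possibly-NULL command to a non-NULL buffer and a length.
 * A NULL command becomes the empty string with length 0, mirroring
 * the `c`/`c_len` and `tok`/`tok_len` locals in the diff above.
 */
static struct payload normalize_command(const char *command)
{
	struct payload p;

	p.buf = command ? command : "";
	p.len = command ? strlen(command) : 0;
	return p;
}
```

With this shape, both the send call and any later error message can use
`p.buf` safely, which is what the `die(..., c)` change above relies on.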
  5:  d5d09eb1635 !  5:  0aaca2f9390 help: include fsmonitor--daemon feature flag in version info
     @@ help.c: void get_version_info(struct strbuf *buf, int show_build_options)
       	}
       }
       
     +
     + ## t/test-lib.sh ##
     +@@ t/test-lib.sh: test_lazy_prereq REBASE_P '
     + # Tests that verify the scheduler integration must set this locally
     + # to avoid errors.
     + GIT_TEST_MAINT_SCHEDULER="none:exit 1"
     ++
     ++# Does this platform support `git fsmonitor--daemon`
     ++#
     ++test_lazy_prereq FSMONITOR_DAEMON '
     ++	git version --build-options | grep "feature:" | grep "fsmonitor--daemon"
     ++'
  6:  67bcf57f594 !  6:  8b64b7cd367 config: FSMonitor is repository-specific
     @@
       ## Metadata ##
     -Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
     +Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    config: FSMonitor is repository-specific
     +    fsmonitor: config settings are repository-specific
      
     -    This commit refactors `git_config_get_fsmonitor()` into the `repo_*()`
     -    form that takes a parameter `struct repository *r`.
     +    Move FSMonitor config settings to `struct repo_settings`, get rid of
     +    the `core_fsmonitor` global variable, and add support for the new
     +    `core.useBuiltinFSMonitor` config setting.  Move config code to lookup
     +    `core.fsmonitor` into `prepare_repo_settings()` with the rest of the
     +    setup code.
      
     -    That change prepares for the upcoming `core.useBuiltinFSMonitor` flag which
     -    will be stored in the `repo_settings` struct.
     +    The `core_fsmonitor` global variable was used to store the pathname to
     +    the FSMonitor hook and it was used as a boolean to see if FSMonitor
     +    was enabled.  This dual usage will lead to confusion when we add
     +    support for a builtin FSMonitor based on IPC, since the builtin
     +    FSMonitor doesn't need the hook pathname.
      
     -    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     +    Replace the boolean usage with an `enum fsmonitor_mode` to represent
     +    the state of FSMonitor.  And only set the pathname when in HOOK mode.
     +
     +    Also, disable FSMonitor when the repository working directory is
     +    incompatible.  For example, in bare repositories, since there aren't
     +    any files to watch.
     +
     +    Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## builtin/update-index.c ##
      @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const char *prefix)
     @@ builtin/update-index.c: int cmd_update_index(int argc, const char **argv, const
       
       	if (fsmonitor > 0) {
      -		if (git_config_get_fsmonitor() == 0)
     -+		if (repo_config_get_fsmonitor(r) == 0)
     ++		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_INCOMPATIBLE)
     ++			return error(
     ++				_("repository is incompatible with fsmonitor"));
     ++
     ++		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_DISABLED) {
     ++			warning(_("core.useBuiltinFSMonitor is unset; "
     ++				"set it if you really want to enable the "
     ++				"builtin fsmonitor"));
       			warning(_("core.fsmonitor is unset; "
     - 				"set it if you really want to "
     - 				"enable fsmonitor"));
     +-				"set it if you really want to "
     +-				"enable fsmonitor"));
     ++				"set it if you really want to enable the "
     ++				"hook-based fsmonitor"));
     ++		}
       		add_fsmonitor(&the_index);
       		report(_("fsmonitor enabled"));
       	} else if (!fsmonitor) {
      -		if (git_config_get_fsmonitor() == 1)
     -+		if (repo_config_get_fsmonitor(r) == 1)
     ++		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_IPC)
     ++			warning(_("core.useBuiltinFSMonitor is set; "
     ++				"remove it if you really want to "
     ++				"disable fsmonitor"));
     ++		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_HOOK)
       			warning(_("core.fsmonitor is set; "
       				"remove it if you really want to "
       				"disable fsmonitor"));
      
     + ## cache.h ##
     +@@ cache.h: extern int core_preload_index;
     + extern int precomposed_unicode;
     + extern int protect_hfs;
     + extern int protect_ntfs;
     +-extern const char *core_fsmonitor;
     + 
     + extern int core_apply_sparse_checkout;
     + extern int core_sparse_checkout_cone;
     +
       ## config.c ##
      @@ config.c: int git_config_get_max_percent_split_change(void)
       	return -1; /* default value */
       }
       
      -int git_config_get_fsmonitor(void)
     -+int repo_config_get_fsmonitor(struct repository *r)
     - {
     +-{
      -	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
     -+	if (repo_config_get_pathname(r, "core.fsmonitor", &core_fsmonitor))
     - 		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
     - 
     - 	if (core_fsmonitor && !*core_fsmonitor)
     +-		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
     +-
     +-	if (core_fsmonitor && !*core_fsmonitor)
     +-		core_fsmonitor = NULL;
     +-
     +-	if (core_fsmonitor)
     +-		return 1;
     +-
     +-	return 0;
     +-}
     +-
     + int git_config_get_index_threads(int *dest)
     + {
     + 	int is_bool, val;
      
       ## config.h ##
      @@ config.h: int git_config_get_index_threads(int *dest);
     @@ config.h: int git_config_get_index_threads(int *dest);
       int git_config_get_split_index(void);
       int git_config_get_max_percent_split_change(void);
      -int git_config_get_fsmonitor(void);
     -+int repo_config_get_fsmonitor(struct repository *r);
       
       /* This dies if the configured or default date is in the future */
       int git_config_get_expiry(const char *key, const char **output);
      
     + ## environment.c ##
     +@@ environment.c: int protect_hfs = PROTECT_HFS_DEFAULT;
     + #define PROTECT_NTFS_DEFAULT 1
     + #endif
     + int protect_ntfs = PROTECT_NTFS_DEFAULT;
     +-const char *core_fsmonitor;
     + 
     + /*
     +  * The character that begins a commented line in user-editable file
     +
       ## fsmonitor.c ##
     +@@
     + #include "dir.h"
     + #include "ewah/ewok.h"
     + #include "fsmonitor.h"
     ++#include "fsmonitor-ipc.h"
     + #include "run-command.h"
     + #include "strbuf.h"
     + 
     +@@ fsmonitor.c: void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
     + /*
     +  * Call the query-fsmonitor hook passing the last update token of the saved results.
     +  */
     +-static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
     ++static int query_fsmonitor_hook(struct repository *r,
     ++				int version,
     ++				const char *last_update,
     ++				struct strbuf *query_result)
     + {
     + 	struct child_process cp = CHILD_PROCESS_INIT;
     + 	int result;
     + 
     +-	if (!core_fsmonitor)
     ++	if (r->settings.fsmonitor_mode != FSMONITOR_MODE_HOOK)
     + 		return -1;
     + 
     +-	strvec_push(&cp.args, core_fsmonitor);
     ++	assert(r->settings.fsmonitor_hook_path);
     ++	assert(*r->settings.fsmonitor_hook_path);
     ++
     ++	strvec_push(&cp.args, r->settings.fsmonitor_hook_path);
     + 	strvec_pushf(&cp.args, "%d", version);
     + 	strvec_pushf(&cp.args, "%s", last_update);
     + 	cp.use_shell = 1;
     +@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     + 	struct strbuf last_update_token = STRBUF_INIT;
     + 	char *buf;
     + 	unsigned int i;
     ++	struct repository *r = istate->repo ? istate->repo : the_repository;
     + 
     +-	if (!core_fsmonitor || istate->fsmonitor_has_run_once)
     ++	if (r->settings.fsmonitor_mode <= FSMONITOR_MODE_DISABLED ||
     ++	    istate->fsmonitor_has_run_once)
     + 		return;
     + 
     +-	hook_version = fsmonitor_hook_version();
     +-
     + 	istate->fsmonitor_has_run_once = 1;
     + 
     + 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
     ++
     ++	if (r->settings.fsmonitor_mode == FSMONITOR_MODE_IPC) {
     ++		/* TODO */
     ++		return;
     ++	}
     ++
     ++	assert(r->settings.fsmonitor_mode == FSMONITOR_MODE_HOOK);
     ++
     ++	hook_version = fsmonitor_hook_version();
     ++
     + 	/*
     +-	 * This could be racy so save the date/time now and query_fsmonitor
     ++	 * This could be racy so save the date/time now and query_fsmonitor_hook
     + 	 * should be inclusive to ensure we don't miss potential changes.
     + 	 */
     + 	last_update = getnanotime();
     +@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     + 		strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
     + 
     + 	/*
     +-	 * If we have a last update token, call query_fsmonitor for the set of
     ++	 * If we have a last update token, call query_fsmonitor_hook for the set of
     + 	 * changes since that token, else assume everything is possibly dirty
     + 	 * and check it all.
     + 	 */
     + 	if (istate->fsmonitor_last_update) {
     + 		if (hook_version == -1 || hook_version == HOOK_INTERFACE_VERSION2) {
     +-			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION2,
     ++			query_success = !query_fsmonitor_hook(
     ++				r, HOOK_INTERFACE_VERSION2,
     + 				istate->fsmonitor_last_update, &query_result);
     + 
     + 			if (query_success) {
     +@@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     + 		}
     + 
     + 		if (hook_version == HOOK_INTERFACE_VERSION1) {
     +-			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION1,
     ++			query_success = !query_fsmonitor_hook(
     ++				r, HOOK_INTERFACE_VERSION1,
     + 				istate->fsmonitor_last_update, &query_result);
     + 		}
     + 
     +-		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
     +-		trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s",
     +-			core_fsmonitor, query_success ? "success" : "failure");
     ++		trace_performance_since(last_update, "fsmonitor process '%s'",
     ++					r->settings.fsmonitor_hook_path);
     ++		trace_printf_key(&trace_fsmonitor,
     ++				 "fsmonitor process '%s' returned %s",
     ++				 r->settings.fsmonitor_hook_path,
     ++				 query_success ? "success" : "failure");
     + 	}
     + 
     + 	/* a fsmonitor process can return '/' to indicate all entries are invalid */
      @@ fsmonitor.c: void remove_fsmonitor(struct index_state *istate)
       void tweak_fsmonitor(struct index_state *istate)
       {
       	unsigned int i;
      -	int fsmonitor_enabled = git_config_get_fsmonitor();
     -+	int fsmonitor_enabled = repo_config_get_fsmonitor(istate->repo ? istate->repo : the_repository);
     ++	struct repository *r = istate->repo ? istate->repo : the_repository;
     ++	int fsmonitor_enabled = r->settings.fsmonitor_mode > FSMONITOR_MODE_DISABLED;
       
       	if (istate->fsmonitor_dirty) {
       		if (fsmonitor_enabled) {
     +@@ fsmonitor.c: void tweak_fsmonitor(struct index_state *istate)
     + 		istate->fsmonitor_dirty = NULL;
     + 	}
     + 
     +-	switch (fsmonitor_enabled) {
     +-	case -1: /* keep: do nothing */
     +-		break;
     +-	case 0: /* false */
     +-		remove_fsmonitor(istate);
     +-		break;
     +-	case 1: /* true */
     ++	if (fsmonitor_enabled)
     + 		add_fsmonitor(istate);
     +-		break;
     +-	default: /* unknown value: do nothing */
     +-		break;
     +-	}
     ++	else
     ++		remove_fsmonitor(istate);
     + }
     +
     + ## fsmonitor.h ##
     +@@ fsmonitor.h: int fsmonitor_is_trivial_response(const struct strbuf *query_result);
     +  */
     + static inline int is_fsmonitor_refreshed(const struct index_state *istate)
     + {
     +-	return !core_fsmonitor || istate->fsmonitor_has_run_once;
     ++	struct repository *r = istate->repo ? istate->repo : the_repository;
     ++
     ++	return r->settings.fsmonitor_mode <= FSMONITOR_MODE_DISABLED ||
     ++		istate->fsmonitor_has_run_once;
     + }
     + 
     + /*
     +@@ fsmonitor.h: static inline int is_fsmonitor_refreshed(const struct index_state *istate)
     +  */
     + static inline void mark_fsmonitor_valid(struct index_state *istate, struct cache_entry *ce)
     + {
     +-	if (core_fsmonitor && !(ce->ce_flags & CE_FSMONITOR_VALID)) {
     ++	struct repository *r = istate->repo ? istate->repo : the_repository;
     ++
     ++	if (r->settings.fsmonitor_mode > FSMONITOR_MODE_DISABLED &&
     ++	    !(ce->ce_flags & CE_FSMONITOR_VALID)) {
     + 		istate->cache_changed = 1;
     + 		ce->ce_flags |= CE_FSMONITOR_VALID;
     + 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_clean '%s'", ce->name);
     +@@ fsmonitor.h: static inline void mark_fsmonitor_valid(struct index_state *istate, struct cache
     +  */
     + static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
     + {
     +-	if (core_fsmonitor) {
     ++	struct repository *r = istate->repo ? istate->repo : the_repository;
     ++
     ++	if (r->settings.fsmonitor_mode > FSMONITOR_MODE_DISABLED) {
     + 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
     + 		untracked_cache_invalidate_path(istate, ce->name, 1);
     + 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
     +
     + ## repo-settings.c ##
     +@@
     + 
     + #define UPDATE_DEFAULT_BOOL(s,v) do { if (s == -1) { s = v; } } while(0)
     + 
     ++/*
     ++ * Return 1 if the repo/workdir is incompatible with FSMonitor.
     ++ */
     ++static int is_repo_incompatible_with_fsmonitor(struct repository *r)
     ++{
     ++	const char *const_strval;
     ++
     ++	/*
     ++	 * Bare repositories don't have a working directory and
     ++	 * therefore, nothing to watch.
     ++	 */
     ++	if (!r->worktree)
     ++		return 1;
     ++
     ++	/*
     ++	 * GVFS (aka VFS for Git) is incompatible with FSMonitor.
     ++	 *
     ++	 * Granted, core Git does not know anything about GVFS and
     ++	 * we shouldn't make assumptions about a downstream feature,
     ++	 * but users can install both versions.  And this can lead
     ++	 * to incorrect results from core Git commands.  So, without
     ++	 * bringing in any of the GVFS code, do a simple config test
     ++	 * for a published config setting.  (We do not look at the
     ++	 * various *_TEST_* environment variables.)
     ++	 */
     ++	if (!repo_config_get_value(r, "core.virtualfilesystem", &const_strval))
     ++		return 1;
     ++
     ++	return 0;
     ++}
     ++
     + void prepare_repo_settings(struct repository *r)
     + {
     + 	int value;
     + 	char *strval;
     ++	const char *const_strval;
     + 
     + 	if (r->settings.initialized)
     + 		return;
     +@@ repo-settings.c: void prepare_repo_settings(struct repository *r)
     + 	UPDATE_DEFAULT_BOOL(r->settings.commit_graph_read_changed_paths, 1);
     + 	UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
     + 
     ++	r->settings.fsmonitor_hook_path = NULL;
     ++	r->settings.fsmonitor_mode = FSMONITOR_MODE_DISABLED;
     ++	if (is_repo_incompatible_with_fsmonitor(r))
     ++		r->settings.fsmonitor_mode = FSMONITOR_MODE_INCOMPATIBLE;
     ++	else if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value)
     ++		   && value)
     ++		r->settings.fsmonitor_mode = FSMONITOR_MODE_IPC;
     ++	else {
     ++		if (repo_config_get_pathname(r, "core.fsmonitor", &const_strval))
     ++			const_strval = getenv("GIT_TEST_FSMONITOR");
     ++		if (const_strval && *const_strval) {
     ++			r->settings.fsmonitor_hook_path = strdup(const_strval);
     ++			r->settings.fsmonitor_mode = FSMONITOR_MODE_HOOK;
     ++		}
     ++	}
     ++
     + 	if (!repo_config_get_int(r, "index.version", &value))
     + 		r->settings.index_version = value;
     + 	if (!repo_config_get_maybe_bool(r, "core.untrackedcache", &value)) {
     +
     + ## repository.h ##
     +@@ repository.h: enum fetch_negotiation_setting {
     + 	FETCH_NEGOTIATION_NOOP = 3,
     + };
     + 
     ++enum fsmonitor_mode {
     ++	FSMONITOR_MODE_INCOMPATIBLE = -2,
     ++	FSMONITOR_MODE_UNSET = -1,
     ++	FSMONITOR_MODE_DISABLED = 0,
     ++	FSMONITOR_MODE_HOOK = 1, /* core.fsmonitor */
     ++	FSMONITOR_MODE_IPC = 2, /* core.useBuiltinFSMonitor */
     ++};
     ++
     + struct repo_settings {
     + 	int initialized;
     + 
     +@@ repository.h: struct repo_settings {
     + 	int gc_write_commit_graph;
     + 	int fetch_write_commit_graph;
     + 
     ++	enum fsmonitor_mode fsmonitor_mode;
     ++	char *fsmonitor_hook_path;
     ++
     + 	int index_version;
     + 	enum untracked_cache_setting core_untracked_cache;
     + 
     +
     + ## t/README ##
     +@@ t/README: every 'git commit-graph write', as if the `--changed-paths` option was
     + passed in.
     + 
     + GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor
     +-code path for utilizing a file system monitor to speed up detecting
     +-new or changed files.
     ++code path for utilizing a (hook based) file system monitor to speed up
     ++detecting new or changed files.
     + 
     + GIT_TEST_INDEX_VERSION=<n> exercises the index read/write code path
     + for the index version specified.  Can be set to any valid version
     +
     + ## t/t7519-status-fsmonitor.sh ##
     +@@ t/t7519-status-fsmonitor.sh: test_expect_success 'status succeeds after staging/unstaging' '
     + 	)
     + '
     + 
     ++# Test that we detect and disallow repos that are incompatible with FSMonitor.
     ++test_expect_success 'incompatible bare repo' '
     ++	test_when_finished "rm -rf ./bare-clone" &&
     ++	git clone --bare . ./bare-clone &&
     ++	cat >expect <<-\EOF &&
     ++	error: repository is incompatible with fsmonitor
     ++	EOF
     ++	test_must_fail git -C ./bare-clone update-index --fsmonitor 2>actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'incompatible core.virtualfilesystem' '
     ++	test_when_finished "rm -rf ./fake-gvfs-clone" &&
     ++	git clone . ./fake-gvfs-clone &&
     ++	git -C ./fake-gvfs-clone config core.virtualfilesystem true &&
     ++	cat >expect <<-\EOF &&
     ++	error: repository is incompatible with fsmonitor
     ++	EOF
     ++	test_must_fail git -C ./fake-gvfs-clone update-index --fsmonitor 2>actual &&
     ++	test_cmp expect actual
     ++'
     ++
     + test_done
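Note on the range-diff entry above: the new code in prepare_repo_settings()
picks the FSMonitor mode in a fixed order of precedence. An incompatible
repository wins, then `core.useBuiltinFSMonitor`, then a non-empty
`core.fsmonitor` hook path, and otherwise the mode is disabled. The sketch
below restates that order as a standalone function. The enum values are
copied from the repository.h hunk; `struct fake_inputs`, `resolve_mode()` and
`fsmonitor_is_enabled()` are hypothetical stand-ins for the real config
lookups and are not part of the series.

```c
#include <stddef.h>

/* Values copied from the repository.h hunk in the range-diff. */
enum fsmonitor_mode {
	FSMONITOR_MODE_INCOMPATIBLE = -2,
	FSMONITOR_MODE_UNSET = -1,
	FSMONITOR_MODE_DISABLED = 0,
	FSMONITOR_MODE_HOOK = 1, /* core.fsmonitor */
	FSMONITOR_MODE_IPC = 2,  /* core.useBuiltinFSMonitor */
};

/* Hypothetical stand-in for the repository and config state. */
struct fake_inputs {
	int has_worktree;          /* 0 for a bare repository */
	int has_virtualfilesystem; /* core.virtualfilesystem is set */
	int use_builtin;           /* core.useBuiltinFSMonitor is true */
	const char *hook_path;     /* core.fsmonitor, may be NULL or "" */
};

/* Apply the same precedence as the diff: incompatible, IPC, hook, disabled. */
static enum fsmonitor_mode resolve_mode(const struct fake_inputs *in)
{
	if (!in->has_worktree || in->has_virtualfilesystem)
		return FSMONITOR_MODE_INCOMPATIBLE;
	if (in->use_builtin)
		return FSMONITOR_MODE_IPC;
	if (in->hook_path && *in->hook_path)
		return FSMONITOR_MODE_HOOK;
	return FSMONITOR_MODE_DISABLED;
}

/* Both negative values and DISABLED count as "off", as in fsmonitor.h. */
static int fsmonitor_is_enabled(enum fsmonitor_mode mode)
{
	return mode > FSMONITOR_MODE_DISABLED;
}
```

Because the incompatible and unset values are negative, a single comparison
against FSMONITOR_MODE_DISABLED distinguishes "on" from "off", which is how
the fsmonitor.c and fsmonitor.h hunks above test the mode.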
  7:  7e097cebc14 !  7:  c86b5651ecc fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
     @@
       ## Metadata ##
     -Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
     +Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC
     +    fsmonitor: use IPC to query the builtin FSMonitor daemon
      
          Use simple IPC to directly communicate with the new builtin file
     -    system monitor daemon.
     -
     -    Define a new config setting `core.useBuiltinFSMonitor` to enable the
     -    builtin file system monitor.
     +    system monitor daemon when `core.useBuiltinFSMonitor` is set.
      
          The `core.fsmonitor` setting has already been defined as a HOOK
          pathname.  Historically, this has been set to a HOOK script that will
     @@ Commit message
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      
     - ## config.c ##
     -@@ config.c: int git_config_get_max_percent_split_change(void)
     - 
     - int repo_config_get_fsmonitor(struct repository *r)
     - {
     -+	if (r->settings.use_builtin_fsmonitor > 0) {
     -+		core_fsmonitor = "(built-in daemon)";
     -+		return 1;
     -+	}
     -+
     - 	if (repo_config_get_pathname(r, "core.fsmonitor", &core_fsmonitor))
     - 		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
     - 
     -
       ## fsmonitor.c ##
     -@@
     - #include "dir.h"
     - #include "ewah/ewok.h"
     - #include "fsmonitor.h"
     -+#include "fsmonitor-ipc.h"
     - #include "run-command.h"
     - #include "strbuf.h"
     - 
     -@@ fsmonitor.c: static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
     - 
     - void refresh_fsmonitor(struct index_state *istate)
     - {
     -+	struct repository *r = istate->repo ? istate->repo : the_repository;
     - 	struct strbuf query_result = STRBUF_INIT;
     - 	int query_success = 0, hook_version = -1;
     - 	size_t bol = 0; /* beginning of line */
      @@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     - 	istate->fsmonitor_has_run_once = 1;
     - 
       	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
     -+
     -+	if (r->settings.use_builtin_fsmonitor > 0) {
     + 
     + 	if (r->settings.fsmonitor_mode == FSMONITOR_MODE_IPC) {
     +-		/* TODO */
     +-		return;
      +		query_success = !fsmonitor_ipc__send_query(
     -+			istate->fsmonitor_last_update, &query_result);
     ++			istate->fsmonitor_last_update ?
     ++			istate->fsmonitor_last_update : "builtin:fake",
     ++			&query_result);
      +		if (query_success) {
      +			/*
      +			 * The response contains a series of nul terminated
     @@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
      +		 * hook setting.
      +		 */
      +		goto apply_results;
     -+	}
     -+
     - 	/*
     - 	 * This could be racy so save the date/time now and query_fsmonitor
     - 	 * should be inclusive to ensure we don't miss potential changes.
     + 	}
     + 
     + 	assert(r->settings.fsmonitor_mode == FSMONITOR_MODE_HOOK);
      @@ fsmonitor.c: void refresh_fsmonitor(struct index_state *istate)
     - 			core_fsmonitor, query_success ? "success" : "failure");
     + 				 query_success ? "success" : "failure");
       	}
       
      +apply_results:
       	/* a fsmonitor process can return '/' to indicate all entries are invalid */
       	if (query_success && query_result.buf[bol] != '/') {
       		/* Mark all entries returned by the monitor as dirty */
     -
     - ## repo-settings.c ##
     -@@ repo-settings.c: void prepare_repo_settings(struct repository *r)
     - 		r->settings.core_multi_pack_index = value;
     - 	UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1);
     - 
     -+	if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value) && value)
     -+		r->settings.use_builtin_fsmonitor = 1;
     -+
     - 	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
     - 		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
     - 		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
     -
     - ## repository.h ##
     -@@ repository.h: enum fetch_negotiation_setting {
     - struct repo_settings {
     - 	int initialized;
     - 
     -+	int use_builtin_fsmonitor;
     -+
     - 	int core_commit_graph;
     - 	int commit_graph_read_changed_paths;
     - 	int gc_write_commit_graph;
  8:  f362a88632e =  8:  f88db92d425 fsmonitor--daemon: add a built-in fsmonitor daemon
  9:  4f401310539 !  9:  02e21384ef0 fsmonitor--daemon: implement client command options
     @@ Metadata
      Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    fsmonitor--daemon: implement client command options
     +    fsmonitor--daemon: implement 'stop' and 'status' commands
      
          Implement `stop` and `status` client commands to control and query the
          status of a `fsmonitor--daemon` server process (and implicitly start a
     @@ builtin/fsmonitor--daemon.c
      +
      +	switch (state) {
      +	case IPC_STATE__LISTENING:
     -+		printf(_("The built-in file system monitor is active\n"));
     ++		printf(_("fsmonitor-daemon is watching '%s'\n"),
     ++		       the_repository->worktree);
      +		return 0;
      +
      +	default:
     -+		printf(_("The built-in file system monitor is not active\n"));
     ++		printf(_("fsmonitor-daemon is not watching '%s'\n"),
     ++		       the_repository->worktree);
      +		return 1;
      +	}
      +}
 10:  d21af7ff842 ! 10:  c2adac8ed4b t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
     @@ t/helper/test-fsmonitor-client.c (new)
      +#include "test-tool.h"
      +#include "cache.h"
      +#include "parse-options.h"
     -+//#include "fsmonitor.h"
      +#include "fsmonitor-ipc.h"
     -+//#include "compat/fsmonitor/fsmonitor-fs-listen.h"
     -+//#include "fsmonitor--daemon.h"
     -+//#include "simple-ipc.h"
      +
      +#ifndef HAVE_FSMONITOR_DAEMON_BACKEND
      +int cmd__fsmonitor_client(int argc, const char **argv)
 11:  49f9e2e3d49 ! 11:  5a9bda72203 fsmonitor-fs-listen-win32: stub in backend for Windows
     @@ config.mak.uname: ifneq (,$(findstring MINGW,$(uname_S)))
      
       ## contrib/buildsystems/CMakeLists.txt ##
      @@ contrib/buildsystems/CMakeLists.txt: else()
     - 	list(APPEND compat_SOURCES compat/simple-ipc/ipc-shared.c compat/simple-ipc/ipc-unix-socket.c)
     + 	endif()
       endif()
       
      +if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 12:  2aa85151f03 = 12:  58758048947 fsmonitor-fs-listen-macos: stub in backend for MacOS
 13:  2aa05ad5c67 ! 13:  5d6646df93a fsmonitor--daemon: implement daemon command options
     @@ Metadata
      Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    fsmonitor--daemon: implement daemon command options
     +    fsmonitor--daemon: implement 'run' command
      
     -    Implement `run` and `start` commands to try to
     -    begin listening for file system events.
     +    Implement `run` command to try to begin listening for file system events.
      
     -    This version defines the thread structure with a single
     -    fsmonitor_fs_listen thread to watch for file system events
     -    and a simple IPC thread pool to wait for connections from
     -    Git clients over a well-known named pipe or Unix domain
     -    socket.
     +    This version defines the thread structure with a single fsmonitor_fs_listen
     +    thread to watch for file system events and a simple IPC thread pool to
     +    watch for connection from Git clients over a well-known named pipe or
     +    Unix domain socket.
      
     -    This version does not actually do anything yet because the
     +    This commit does not actually do anything yet because the platform
          backends are still just stubs.
      
          Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
     @@ builtin/fsmonitor--daemon.c
       #include "khash.h"
       
       static const char * const builtin_fsmonitor__daemon_usage[] = {
     -+	N_("git fsmonitor--daemon start [<options>]"),
      +	N_("git fsmonitor--daemon run [<options>]"),
       	N_("git fsmonitor--daemon stop"),
       	N_("git fsmonitor--daemon status"),
     @@ builtin/fsmonitor--daemon.c
      +#define FSMONITOR__IPC_THREADS "fsmonitor.ipcthreads"
      +static int fsmonitor__ipc_threads = 8;
      +
     -+#define FSMONITOR__START_TIMEOUT "fsmonitor.starttimeout"
     -+static int fsmonitor__start_timeout_sec = 60;
     -+
      +static int fsmonitor_config(const char *var, const char *value, void *cb)
      +{
      +	if (!strcmp(var, FSMONITOR__IPC_THREADS)) {
     @@ builtin/fsmonitor--daemon.c
      +		return 0;
      +	}
      +
     -+	if (!strcmp(var, FSMONITOR__START_TIMEOUT)) {
     -+		int i = git_config_int(var, value);
     -+		if (i < 0)
     -+			return error(_("value of '%s' out of range: %d"),
     -+				     FSMONITOR__START_TIMEOUT, i);
     -+		fsmonitor__start_timeout_sec = i;
     -+		return 0;
     -+	}
     -+
      +	return git_default_config(var, value, cb);
      +}
      +
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__status(void)
      +	state.nr_paths_watching = 1;
      +
      +	/*
     -+	 * We create/delete cookie files inside the .git directory to
     -+	 * help us keep sync with the file system.  If ".git" is not a
     -+	 * directory, then <gitdir> is not inside the cone of
     -+	 * <worktree-root>, so set up a second watch for it.
     ++	 * We create and delete cookie files somewhere inside the .git
     ++	 * directory to help us keep sync with the file system.  If
     ++	 * ".git" is not a directory, then <gitdir> is not inside the
     ++	 * cone of <worktree-root>, so set up a second watch to watch
     ++	 * the <gitdir> so that we get events for the cookie files.
      +	 */
      +	strbuf_init(&state.path_gitdir_watch, 0);
      +	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__status(void)
      +	 * common error case.
      +	 */
      +	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
     -+		die("fsmonitor--daemon is already running.");
     -+
     -+	return !!fsmonitor_run_daemon();
     -+}
     -+
     -+#ifndef GIT_WINDOWS_NATIVE
     -+/*
     -+ * This is adapted from `daemonize()`.  Use `fork()` to directly create
     -+ * and run the daemon in a child process.  The fork-parent returns the
     -+ * child PID so that we can wait for the child to startup before exiting.
     -+ */
     -+static int spawn_background_fsmonitor_daemon(pid_t *pid)
     -+{
     -+	*pid = fork();
     -+
     -+	switch (*pid) {
     -+	case 0:
     -+		if (setsid() == -1)
     -+			error_errno(_("setsid failed"));
     -+		close(0);
     -+		close(1);
     -+		close(2);
     -+		sanitize_stdfds();
     -+
     -+		return !!fsmonitor_run_daemon();
     -+
     -+	case -1:
     -+		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
     -+
     -+	default:
     -+		return 0;
     -+	}
     -+}
     -+#else
     -+/*
     -+ * Conceptually like `daemonize()` but different because Windows does not
     -+ * have `fork(2)`.  Spawn a normal Windows child process but without the
     -+ * limitations of `start_command()` and `finish_command()`.
     -+ */
     -+static int spawn_background_fsmonitor_daemon(pid_t *pid)
     -+{
     -+	char git_exe[MAX_PATH];
     -+	struct strvec args = STRVEC_INIT;
     -+	int in, out;
     -+
     -+	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
     -+
     -+	in = open("/dev/null", O_RDONLY);
     -+	out = open("/dev/null", O_WRONLY);
     -+
     -+	strvec_push(&args, git_exe);
     -+	strvec_push(&args, "fsmonitor--daemon");
     -+	strvec_push(&args, "run");
     ++		die("fsmonitor--daemon is already running '%s'",
     ++		    the_repository->worktree);
      +
     -+	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
     -+	close(in);
     -+	close(out);
     ++	printf(_("running fsmonitor-daemon in '%s'\n"),
     ++	       the_repository->worktree);
     ++	fflush(stdout);
      +
     -+	strvec_clear(&args);
     -+
     -+	if (*pid < 0)
     -+		return error(_("could not spawn fsmonitor--daemon in the background"));
     -+
     -+	return 0;
     -+}
     -+#endif
     -+
     -+/*
     -+ * This is adapted from `wait_or_whine()`.  Watch the child process and
     -+ * let it get started and begin listening for requests on the socket
     -+ * before reporting our success.
     -+ */
     -+static int wait_for_background_startup(pid_t pid_child)
     -+{
     -+	int status;
     -+	pid_t pid_seen;
     -+	enum ipc_active_state s;
     -+	time_t time_limit, now;
     -+
     -+	time(&time_limit);
     -+	time_limit += fsmonitor__start_timeout_sec;
     -+
     -+	for (;;) {
     -+		pid_seen = waitpid(pid_child, &status, WNOHANG);
     -+
     -+		if (pid_seen == -1)
     -+			return error_errno(_("waitpid failed"));
     -+		else if (pid_seen == 0) {
     -+			/*
     -+			 * The child is still running (this should be
     -+			 * the normal case).  Try to connect to it on
     -+			 * the socket and see if it is ready for
     -+			 * business.
     -+			 *
     -+			 * If there is another daemon already running,
     -+			 * our child will fail to start (possibly
     -+			 * after a timeout on the lock), but we don't
     -+			 * care (who responds) if the socket is live.
     -+			 */
     -+			s = fsmonitor_ipc__get_state();
     -+			if (s == IPC_STATE__LISTENING)
     -+				return 0;
     -+
     -+			time(&now);
     -+			if (now > time_limit)
     -+				return error(_("fsmonitor--daemon not online yet"));
     -+		} else if (pid_seen == pid_child) {
     -+			/*
     -+			 * The new child daemon process shutdown while
     -+			 * it was starting up, so it is not listening
     -+			 * on the socket.
     -+			 *
     -+			 * Try to ping the socket in the odd chance
     -+			 * that another daemon started (or was already
     -+			 * running) while our child was starting.
     -+			 *
     -+			 * Again, we don't care who services the socket.
     -+			 */
     -+			s = fsmonitor_ipc__get_state();
     -+			if (s == IPC_STATE__LISTENING)
     -+				return 0;
     -+
     -+			/*
     -+			 * We don't care about the WEXITSTATUS() nor
     -+			 * any of the WIF*(status) values because
     -+			 * `cmd_fsmonitor__daemon()` does the `!!result`
     -+			 * trick on all function return values.
     -+			 *
     -+			 * So it is sufficient to just report the
     -+			 * early shutdown as an error.
     -+			 */
     -+			return error(_("fsmonitor--daemon failed to start"));
     -+		} else
     -+			return error(_("waitpid is confused"));
     -+	}
     -+}
     -+
     -+static int try_to_start_background_daemon(void)
     -+{
     -+	pid_t pid_child;
     -+	int ret;
     -+
     -+	/*
     -+	 * Before we try to create a background daemon process, see
     -+	 * if a daemon process is already listening.  This makes it
     -+	 * easier for us to report an already-listening error to the
     -+	 * console, since our spawn/daemon can only report the success
     -+	 * of creating the background process (and not whether it
     -+	 * immediately exited).
     -+	 */
     -+	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
     -+		die("fsmonitor--daemon is already running.");
     -+
     -+	/*
     -+	 * Run the actual daemon in a background process.
     -+	 */
     -+	ret = spawn_background_fsmonitor_daemon(&pid_child);
     -+	if (pid_child <= 0)
     -+		return ret;
     -+
     -+	/*
     -+	 * Wait (with timeout) for the background child process get
     -+	 * started and begin listening on the socket/pipe.  This makes
     -+	 * the "start" command more synchronous and more reliable in
     -+	 * tests.
     -+	 */
     -+	ret = wait_for_background_startup(pid_child);
     -+
     -+	return ret;
     ++	return !!fsmonitor_run_daemon();
      +}
      +
       int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
     @@ builtin/fsmonitor--daemon.c: static int do_as_client__status(void)
      +		OPT_INTEGER(0, "ipc-threads",
      +			    &fsmonitor__ipc_threads,
      +			    N_("use <n> ipc worker threads")),
     -+		OPT_INTEGER(0, "start-timeout",
     -+			    &fsmonitor__start_timeout_sec,
     -+			    N_("Max seconds to wait for background daemon startup")),
     -+
       		OPT_END()
       	};
       
     @@ builtin/fsmonitor--daemon.c: int cmd_fsmonitor__daemon(int argc, const char **ar
      +		die(_("invalid 'ipc-threads' value (%d)"),
      +		    fsmonitor__ipc_threads);
      +
     -+	if (!strcmp(subcmd, "start"))
     -+		return !!try_to_start_background_daemon();
     -+
      +	if (!strcmp(subcmd, "run"))
      +		return !!try_to_run_foreground_daemon();
       
  -:  ----------- > 14:  9fe902aad87 fsmonitor--daemon: implement 'start' command
  -:  ----------- > 15:  eef39aa168f fsmonitor: do not try to operate on bare repos
 14:  d5ababfd03e = 16:  3b12f668060 fsmonitor--daemon: add pathname classification
 15:  c092cdf2c8b = 17:  37fdce5ec3a fsmonitor--daemon: define token-ids
 16:  2ed7bc3fae7 = 18:  84444c44c32 fsmonitor--daemon: create token-based changed path cache
 17:  9ea4b04b821 ! 19:  5bba5eb3d1b fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
     @@ compat/fsmonitor/fsmonitor-fs-listen-win32.c
      +
      +		t = fsmonitor_classify_path_gitdir_relative(path.buf);
      +
     -+		trace_printf_key(&trace_fsmonitor, "BBB: %s", path.buf);
     -+
      +		switch (t) {
      +		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
      +			/* special case cookie files within gitdir */
 18:  21b2b4f941b = 20:  175ae9a757e fsmonitor-fs-listen-macos: add macos header files for FSEvent
 19:  08474bad830 = 21:  5d12b5d808a fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
 20:  cc4a596d17c = 22:  39df123143b fsmonitor--daemon: implement handle_client callback
  -:  ----------- > 23:  3cf8f3cd771 t/helper/test-touch: add helper to touch a series of files
  -:  ----------- > 24:  f1ef9656fc3 t/perf/p7519: speed up test using "test-tool touch"
 28:  96a3eab819f = 25:  a83485fb10f t/perf: avoid copying builtin fsmonitor files into test repo
 26:  5b035c6e0d6 ! 26:  de517a8259a p7519: add fsmonitor--daemon
     @@ Metadata
      Author: Jeff Hostetler <jeffhost@microsoft.com>
      
       ## Commit message ##
     -    p7519: add fsmonitor--daemon
     +    t/perf/p7519: add fsmonitor--daemon test cases
      
          Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and
          the "Simple IPC" interface.
     @@ t/perf/p7519-fsmonitor.sh: test_expect_success "setup without fsmonitor" '
      +# Explicitly start the daemon here and before we start client commands
      +# so that we can later add custom tracing.
      +#
     -+
     -+test_lazy_prereq HAVE_FSMONITOR_DAEMON '
     -+	git version --build-options | grep "feature:" | grep "fsmonitor--daemon"
     -+'
     -+
     -+if test_have_prereq HAVE_FSMONITOR_DAEMON
     ++if test_have_prereq FSMONITOR_DAEMON
      +then
      +	USE_FSMONITOR_DAEMON=t
      +
 25:  c9159db718a ! 27:  99279c0ebd2 t7527: create test for fsmonitor--daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +
      +. ./test-lib.sh
      +
     -+git version --build-options | grep "feature:" | grep "fsmonitor--daemon" || {
     -+	skip_all="The built-in FSMonitor is not supported on this platform"
     ++if ! test_have_prereq FSMONITOR_DAEMON
     ++then
     ++	skip_all="fsmonitor--daemon is not supported on this platform"
      +	test_done
     -+}
     ++fi
      +
     -+kill_repo () {
     ++stop_daemon_delete_repo () {
      +	r=$1
      +	git -C $r fsmonitor--daemon stop >/dev/null 2>/dev/null
      +	rm -rf $1
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +}
      +
      +test_expect_success 'explicit daemon start and stop' '
     -+	test_when_finished "kill_repo test_explicit" &&
     ++	test_when_finished "stop_daemon_delete_repo test_explicit" &&
      +
      +	git init test_explicit &&
      +	start_daemon test_explicit &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'implicit daemon start' '
     -+	test_when_finished "kill_repo test_implicit" &&
     ++	test_when_finished "stop_daemon_delete_repo test_implicit" &&
      +
      +	git init test_implicit &&
      +	test_must_fail git -C test_implicit fsmonitor--daemon status &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	# but this test case is only concerned with whether the daemon was
      +	# implicitly started.)
      +
     -+	GIT_TRACE2_EVENT="$PWD/.git/trace" \
     ++	GIT_TRACE2_EVENT="$(pwd)/.git/trace" \
      +		test-tool -C test_implicit fsmonitor-client query --token 0 >actual &&
      +	nul_to_q <actual >actual.filtered &&
      +	grep "builtin:" actual.filtered &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'implicit daemon stop (delete .git)' '
     -+	test_when_finished "kill_repo test_implicit_1" &&
     ++	test_when_finished "stop_daemon_delete_repo test_implicit_1" &&
      +
      +	git init test_implicit_1 &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	# deleting the .git directory will implicitly stop the daemon.
      +	rm -rf test_implicit_1/.git &&
      +
     -+	# Create an empty .git directory so that the following Git command
     -+	# will stay relative to the `-C` directory.  Without this, the Git
     -+	# command will (override the requested -C argument) and crawl out
     -+	# to the containing Git source tree.  This would make the test
     -+	# result dependent upon whether we were using fsmonitor on our
     -+	# development worktree.
     -+
     ++	# [1] Create an empty .git directory so that the following Git
     ++	#     command will stay relative to the `-C` directory.
     ++	#
     ++	#     Without this, the Git command will override the requested
     ++	#     -C argument and crawl out to the containing Git source tree.
     ++	#     This would make the test result dependent upon whether we
     ++	#     were using fsmonitor on our development worktree.
     ++	#
      +	sleep 1 &&
      +	mkdir test_implicit_1/.git &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'implicit daemon stop (rename .git)' '
     -+	test_when_finished "kill_repo test_implicit_2" &&
     ++	test_when_finished "stop_daemon_delete_repo test_implicit_2" &&
      +
      +	git init test_implicit_2 &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	# renaming the .git directory will implicitly stop the daemon.
      +	mv test_implicit_2/.git test_implicit_2/.xxx &&
      +
     -+	# Create an empty .git directory so that the following Git command
     -+	# will stay relative to the `-C` directory.  Without this, the Git
     -+	# command will (override the requested -C argument) and crawl out
     -+	# to the containing Git source tree.  This would make the test
     -+	# result dependent upon whether we were using fsmonitor on our
     -+	# development worktree.
     -+
     ++	# See [1] above.
     ++	#
      +	sleep 1 &&
      +	mkdir test_implicit_2/.git &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'cannot start multiple daemons' '
     -+	test_when_finished "kill_repo test_multiple" &&
     ++	test_when_finished "stop_daemon_delete_repo test_multiple" &&
      +
      +	git init test_multiple &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	test_must_fail git -C test_multiple fsmonitor--daemon status
      +'
      +
     ++# These tests use the main repo in the trash directory
     ++
      +test_expect_success 'setup' '
      +	>tracked &&
      +	>modified &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	git config core.useBuiltinFSMonitor true
      +'
      +
     ++# The test already explicitly stopped (or tried to stop) the daemon.
     ++# This is here in case something else fails first.
     ++#
     ++redundant_stop_daemon () {
     ++	git fsmonitor--daemon stop
     ++	return 0
     ++}
     ++
      +test_expect_success 'update-index implicitly starts daemon' '
     ++	test_when_finished redundant_stop_daemon &&
     ++
      +	test_must_fail git fsmonitor--daemon status &&
      +
     -+	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_1" \
     ++	GIT_TRACE2_EVENT="$(pwd)/.git/trace_implicit_1" \
      +		git update-index --fsmonitor &&
      +
      +	git fsmonitor--daemon status &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'status implicitly starts daemon' '
     ++	test_when_finished redundant_stop_daemon &&
     ++
      +	test_must_fail git fsmonitor--daemon status &&
      +
     -+	GIT_TRACE2_EVENT="$PWD/.git/trace_implicit_2" \
     ++	GIT_TRACE2_EVENT="$(pwd)/.git/trace_implicit_2" \
      +		git status >actual &&
      +
      +	git fsmonitor--daemon status &&
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +}
      +
      +test_expect_success 'edit some files' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'create some files' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'delete some files' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'rename some files' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'rename directory' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'file changes to directory' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +'
      +
      +test_expect_success 'directory changes to a file' '
     -+	test_when_finished "clean_up_repo_and_stop_daemon" &&
     ++	test_when_finished clean_up_repo_and_stop_daemon &&
      +
      +	(
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +# "flush" message to a running daemon and ask it to do a flush/resync.
      +
      +test_expect_success 'flush cached data' '
     -+	test_when_finished "kill_repo test_flush" &&
     ++	test_when_finished "stop_daemon_delete_repo test_flush" &&
      +
      +	git init test_flush &&
      +
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +		GIT_TEST_FSMONITOR_TOKEN=true &&
      +		export GIT_TEST_FSMONITOR_TOKEN &&
      +
     -+		GIT_TRACE_FSMONITOR="$PWD/.git/trace_daemon" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace_daemon" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon test_flush
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	git -C wt-base worktree add ../wt-secondary &&
      +
      +	(
     -+		GIT_TRACE2_PERF="$PWD/trace2_wt_secondary" &&
     ++		GIT_TRACE2_PERF="$(pwd)/trace2_wt_secondary" &&
      +		export GIT_TRACE2_PERF &&
      +
     -+		GIT_TRACE_FSMONITOR="$PWD/trace_wt_secondary" &&
     ++		GIT_TRACE_FSMONITOR="$(pwd)/trace_wt_secondary" &&
      +		export GIT_TRACE_FSMONITOR &&
      +
      +		start_daemon wt-secondary
     @@ t/t7527-builtin-fsmonitor.sh (new)
      +	test_must_fail git -C wt-secondary fsmonitor--daemon status
      +'
      +
     ++# NEEDSWORK: Repeat one of the "edit" tests on wt-secondary and
     ++# confirm that we get the same events and behavior -- that is, that
     ++# fsmonitor--daemon correctly watches BOTH the working directory and
     ++# the external GITDIR directory and behaves the same as when ".git"
     ++# is a directory inside the working directory.
     ++
     ++test_expect_success 'cleanup worktrees' '
     ++	stop_daemon_delete_repo wt-secondary &&
     ++	stop_daemon_delete_repo wt-base
     ++'
     ++
      +test_done
 21:  f0da90e9b05 = 28:  3f36a31eb42 fsmonitor--daemon: periodically truncate list of modified files
 22:  bb7b1912bb4 ! 29:  555caca2216 fsmonitor--daemon: use a cookie file to sync with file system
     @@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon
       	/*
       	 * We expect `command` to be of the form:
      @@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     - 		 */
     - 		do_flush = 1;
     - 		do_trivial = 1;
     -+		do_cookie = 1;
     - 
     - 	} else if (!skip_prefix(command, "builtin:", &p)) {
     - 		/* assume V1 timestamp or garbage */
     -@@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     - 				  "fsmonitor: unsupported V1 protocol '%s'"),
     - 				 command);
     - 		do_trivial = 1;
     -+		do_cookie = 1;
     - 
     - 	} else {
     - 		/* We have "builtin:*" */
     -@@ builtin/fsmonitor--daemon.c: static int do_handle_client(struct fsmonitor_daemon_state *state,
     - 					 "fsmonitor: invalid V2 protocol token '%s'",
     - 					 command);
     - 			do_trivial = 1;
     -+			do_cookie = 1;
     - 
     - 		} else {
     - 			/*
       			 * We have a V2 valid token:
       			 *     "builtin:<token_id>:<seq_nr>"
       			 */
     @@ builtin/fsmonitor--daemon.c: static int fsmonitor_run_daemon(void)
      +	/*
      +	 * We will write filesystem syncing cookie files into
      +	 * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>.
     ++	 *
     ++	 * The extra layers of subdirectories here keep us from
     ++	 * changing the mtime on ".git/" or ".git/foo/" when we create
     ++	 * or delete cookie files.
     ++	 *
     ++	 * There have been problems with some IDEs that do a
     ++	 * non-recursive watch of the ".git/" directory and run a
     ++	 * series of commands any time something happens.
     ++	 *
     ++	 * For example, if we place our cookie files directly in
     ++	 * ".git/" or ".git/foo/" then a `git status` (or similar
     ++	 * command) from the IDE will cause a cookie file to be
     ++	 * created in one of those dirs.  This causes the mtime of
     ++	 * those dirs to change.  This triggers the IDE's watch
     ++	 * notification.  This triggers the IDE to run those commands
     ++	 * again.  And the process repeats and the machine never goes
     ++	 * idle.
     ++	 *
     ++	 * Adding the extra layers of subdirectories prevents the
     ++	 * mtime of ".git/" and ".git/foo" from changing when a
     ++	 * cookie file is created.
      +	 */
      +	strbuf_init(&state.path_cookie_prefix, 0);
      +	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
 23:  102e17cbc87 = 30:  75bb4bc8463 fsmonitor: enhance existing comments
 24:  11ea2f97def ! 31:  8b3c4f4e6dd fsmonitor: force update index after large responses
     @@ fsmonitor.c: static void fsmonitor_refresh_callback(struct index_state *istate,
      +
       void refresh_fsmonitor(struct index_state *istate)
       {
     - 	struct repository *r = istate->repo ? istate->repo : the_repository;
     + 	struct strbuf query_result = STRBUF_INIT;
      @@ fsmonitor.c: apply_results:
       		 *
       		 * This updates both the cache-entries and the untracked-cache.
 27:  1483c68855c ! 32:  97dce46d1d0 t7527: test status with untracked-cache and fsmonitor--daemon
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'setup' '
       	EOF
       
       	git -c core.useBuiltinFSMonitor= add . &&
     -@@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'worktree with .git file' '
     - 	test_must_fail git -C wt-secondary fsmonitor--daemon status
     +@@ t/t7527-builtin-fsmonitor.sh: test_expect_success 'cleanup worktrees' '
     + 	stop_daemon_delete_repo wt-base
       '
       
     -+# TODO Repeat one of the "edit" tests on wt-secondary and confirm that
     -+# TODO we get the same events and behavior -- that is, that fsmonitor--daemon
     -+# TODO correctly listens to events on both the working directory and to the
     -+# TODO referenced GITDIR.
     -+
     -+test_expect_success 'cleanup worktrees' '
     -+	kill_repo wt-secondary &&
     -+	kill_repo wt-base
     -+'
     -+
      +# The next few tests perform arbitrary/contrived file operations and
      +# confirm that status is correct.  That is, that the data (or lack of
      +# data) from fsmonitor doesn't cause incorrect results.  And doesn't
  -:  ----------- > 33:  e32ba686f7e fsmonitor: handle shortname for .git
  -:  ----------- > 34:  627e27fe60b t7527: test FS event reporing on MacOS WRT case and Unicode

-- 
gitgitgadget


* [PATCH v3 01/34] simple-ipc: preparations for supporting binary messages.
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 02/34] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
                       ` (33 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Add `command_len` argument to the Simple IPC API.

In my original Simple IPC API, I assumed that the request
would always be a null-terminated string of text characters.
The command arg was just a `const char *`.

I found a caller that would like to pass a binary command
to the daemon, so I want to amend the Simple IPC API to
take `const char *command, size_t command_len` and pass
that to the daemon.  (Really, the first arg should just be
a `void *` or `const unsigned char *` to make that clearer.)

Note, the response side has always been a `struct strbuf`
which includes the buffer and length, so we already support
returning a binary answer.  (Yes, it feels a little weird
returning a binary buffer in a `strbuf`, but it works.)

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
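
Illustration only, not part of the patch: a minimal sketch of why an
explicit length is needed once a command may contain binary data.  The
`demo_send()` and `demo_send_text()` helpers are hypothetical stand-ins
for a transport and are not part of the Simple IPC API.  The second
helper infers the length with `strlen()`, as the old call sites did, so
it stops at the first NUL byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical transport stand-in: copy `len` bytes of `msg` into
 * `out` (capacity `cap`) and return the number of bytes "sent".
 */
static size_t demo_send(char *out, size_t cap, const char *msg, size_t len)
{
	assert(out && msg);
	if (len > cap)
		len = cap;
	memcpy(out, msg, len);
	return len;
}

/*
 * Old style: the length is inferred with strlen(), so anything after
 * the first NUL byte in `msg` is silently dropped.
 */
static size_t demo_send_text(char *out, size_t cap, const char *msg)
{
	return demo_send(out, cap, msg, strlen(msg));
}
```

With a five-byte payload such as { 'a', 'b', '\0', 'c', 'd' }, the
`strlen()`-based helper sends only the first two bytes, while passing
the length explicitly sends all five.  That is the motivation for
threading `message_len` through the client functions below.
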
 compat/simple-ipc/ipc-unix-socket.c | 14 +++++++-----
 compat/simple-ipc/ipc-win32.c       | 14 +++++++-----
 simple-ipc.h                        |  7 ++++--
 t/helper/test-simple-ipc.c          | 34 +++++++++++++++++++----------
 4 files changed, 46 insertions(+), 23 deletions(-)

diff --git a/compat/simple-ipc/ipc-unix-socket.c b/compat/simple-ipc/ipc-unix-socket.c
index 1927e6ef4bc..4e28857a0a1 100644
--- a/compat/simple-ipc/ipc-unix-socket.c
+++ b/compat/simple-ipc/ipc-unix-socket.c
@@ -168,7 +168,8 @@ void ipc_client_close_connection(struct ipc_client_connection *connection)
 
 int ipc_client_send_command_to_connection(
 	struct ipc_client_connection *connection,
-	const char *message, struct strbuf *answer)
+	const char *message, size_t message_len,
+	struct strbuf *answer)
 {
 	int ret = 0;
 
@@ -176,7 +177,7 @@ int ipc_client_send_command_to_connection(
 
 	trace2_region_enter("ipc-client", "send-command", NULL);
 
-	if (write_packetized_from_buf_no_flush(message, strlen(message),
+	if (write_packetized_from_buf_no_flush(message, message_len,
 					       connection->fd) < 0 ||
 	    packet_flush_gently(connection->fd) < 0) {
 		ret = error(_("could not send IPC command"));
@@ -197,7 +198,8 @@ done:
 
 int ipc_client_send_command(const char *path,
 			    const struct ipc_client_connect_options *options,
-			    const char *message, struct strbuf *answer)
+			    const char *message, size_t message_len,
+			    struct strbuf *answer)
 {
 	int ret = -1;
 	enum ipc_active_state state;
@@ -208,7 +210,9 @@ int ipc_client_send_command(const char *path,
 	if (state != IPC_STATE__LISTENING)
 		return ret;
 
-	ret = ipc_client_send_command_to_connection(connection, message, answer);
+	ret = ipc_client_send_command_to_connection(connection,
+						    message, message_len,
+						    answer);
 
 	ipc_client_close_connection(connection);
 
@@ -503,7 +507,7 @@ static int worker_thread__do_io(
 	if (ret >= 0) {
 		ret = worker_thread_data->server_data->application_cb(
 			worker_thread_data->server_data->application_data,
-			buf.buf, do_io_reply_callback, &reply_data);
+			buf.buf, buf.len, do_io_reply_callback, &reply_data);
 
 		packet_flush_gently(reply_data.fd);
 	}
diff --git a/compat/simple-ipc/ipc-win32.c b/compat/simple-ipc/ipc-win32.c
index 8dc7bda087d..8e889d6a506 100644
--- a/compat/simple-ipc/ipc-win32.c
+++ b/compat/simple-ipc/ipc-win32.c
@@ -208,7 +208,8 @@ void ipc_client_close_connection(struct ipc_client_connection *connection)
 
 int ipc_client_send_command_to_connection(
 	struct ipc_client_connection *connection,
-	const char *message, struct strbuf *answer)
+	const char *message, size_t message_len,
+	struct strbuf *answer)
 {
 	int ret = 0;
 
@@ -216,7 +217,7 @@ int ipc_client_send_command_to_connection(
 
 	trace2_region_enter("ipc-client", "send-command", NULL);
 
-	if (write_packetized_from_buf_no_flush(message, strlen(message),
+	if (write_packetized_from_buf_no_flush(message, message_len,
 					       connection->fd) < 0 ||
 	    packet_flush_gently(connection->fd) < 0) {
 		ret = error(_("could not send IPC command"));
@@ -239,7 +240,8 @@ done:
 
 int ipc_client_send_command(const char *path,
 			    const struct ipc_client_connect_options *options,
-			    const char *message, struct strbuf *response)
+			    const char *message, size_t message_len,
+			    struct strbuf *response)
 {
 	int ret = -1;
 	enum ipc_active_state state;
@@ -250,7 +252,9 @@ int ipc_client_send_command(const char *path,
 	if (state != IPC_STATE__LISTENING)
 		return ret;
 
-	ret = ipc_client_send_command_to_connection(connection, message, response);
+	ret = ipc_client_send_command_to_connection(connection,
+						    message, message_len,
+						    response);
 
 	ipc_client_close_connection(connection);
 
@@ -458,7 +462,7 @@ static int do_io(struct ipc_server_thread_data *server_thread_data)
 	if (ret >= 0) {
 		ret = server_thread_data->server_data->application_cb(
 			server_thread_data->server_data->application_data,
-			buf.buf, do_io_reply_callback, &reply_data);
+			buf.buf, buf.len, do_io_reply_callback, &reply_data);
 
 		packet_flush_gently(reply_data.fd);
 
diff --git a/simple-ipc.h b/simple-ipc.h
index 2c48a5ee004..9c7330fcda0 100644
--- a/simple-ipc.h
+++ b/simple-ipc.h
@@ -107,7 +107,8 @@ void ipc_client_close_connection(struct ipc_client_connection *connection);
  */
 int ipc_client_send_command_to_connection(
 	struct ipc_client_connection *connection,
-	const char *message, struct strbuf *answer);
+	const char *message, size_t message_len,
+	struct strbuf *answer);
 
 /*
  * Used by the client to synchronously connect and send and receive a
@@ -119,7 +120,8 @@ int ipc_client_send_command_to_connection(
  */
 int ipc_client_send_command(const char *path,
 			    const struct ipc_client_connect_options *options,
-			    const char *message, struct strbuf *answer);
+			    const char *message, size_t message_len,
+			    struct strbuf *answer);
 
 /*
  * Simple IPC Server Side API.
@@ -144,6 +146,7 @@ typedef int (ipc_server_reply_cb)(struct ipc_server_reply_data *,
  */
 typedef int (ipc_server_application_cb)(void *application_data,
 					const char *request,
+					size_t request_len,
 					ipc_server_reply_cb *reply_cb,
 					struct ipc_server_reply_data *reply_data);
 
diff --git a/t/helper/test-simple-ipc.c b/t/helper/test-simple-ipc.c
index 42040ef81b1..91345180750 100644
--- a/t/helper/test-simple-ipc.c
+++ b/t/helper/test-simple-ipc.c
@@ -112,7 +112,7 @@ static int app__slow_command(ipc_server_reply_cb *reply_cb,
 /*
  * The client sent a command followed by a (possibly very) large buffer.
  */
-static int app__sendbytes_command(const char *received,
+static int app__sendbytes_command(const char *received, size_t received_len,
 				  ipc_server_reply_cb *reply_cb,
 				  struct ipc_server_reply_data *reply_data)
 {
@@ -123,6 +123,13 @@ static int app__sendbytes_command(const char *received,
 	int errs = 0;
 	int ret;
 
+	/*
+	 * The test is set up to send:
+	 *     "sendbytes" SP <n * char>
+	 */
+	if (received_len < strlen("sendbytes "))
+		BUG("received_len is short in app__sendbytes_command");
+
 	if (skip_prefix(received, "sendbytes ", &p))
 		len_ballast = strlen(p);
 
@@ -160,7 +167,7 @@ static ipc_server_application_cb test_app_cb;
  * by this application.
  */
 static int test_app_cb(void *application_data,
-		       const char *command,
+		       const char *command, size_t command_len,
 		       ipc_server_reply_cb *reply_cb,
 		       struct ipc_server_reply_data *reply_data)
 {
@@ -173,7 +180,7 @@ static int test_app_cb(void *application_data,
 	if (application_data != (void*)&my_app_data)
 		BUG("application_cb: application_data pointer wrong");
 
-	if (!strcmp(command, "quit")) {
+	if (command_len == 4 && !strncmp(command, "quit", 4)) {
 		/*
 		 * The client sent a "quit" command.  This is an async
 		 * request for the server to shutdown.
@@ -193,22 +200,23 @@ static int test_app_cb(void *application_data,
 		return SIMPLE_IPC_QUIT;
 	}
 
-	if (!strcmp(command, "ping")) {
+	if (command_len == 4 && !strncmp(command, "ping", 4)) {
 		const char *answer = "pong";
 		return reply_cb(reply_data, answer, strlen(answer));
 	}
 
-	if (!strcmp(command, "big"))
+	if (command_len == 3 && !strncmp(command, "big", 3))
 		return app__big_command(reply_cb, reply_data);
 
-	if (!strcmp(command, "chunk"))
+	if (command_len == 5 && !strncmp(command, "chunk", 5))
 		return app__chunk_command(reply_cb, reply_data);
 
-	if (!strcmp(command, "slow"))
+	if (command_len == 4 && !strncmp(command, "slow", 4))
 		return app__slow_command(reply_cb, reply_data);
 
-	if (starts_with(command, "sendbytes "))
-		return app__sendbytes_command(command, reply_cb, reply_data);
+	if (command_len >= 10 && starts_with(command, "sendbytes "))
+		return app__sendbytes_command(command, command_len,
+					      reply_cb, reply_data);
 
 	return app__unhandled_command(command, reply_cb, reply_data);
 }
@@ -488,7 +496,9 @@ static int client__send_ipc(void)
 	options.wait_if_busy = 1;
 	options.wait_if_not_found = 0;
 
-	if (!ipc_client_send_command(cl_args.path, &options, command, &buf)) {
+	if (!ipc_client_send_command(cl_args.path, &options,
+				     command, strlen(command),
+				     &buf)) {
 		if (buf.len) {
 			printf("%s\n", buf.buf);
 			fflush(stdout);
@@ -556,7 +566,9 @@ static int do_sendbytes(int bytecount, char byte, const char *path,
 	strbuf_addstr(&buf_send, "sendbytes ");
 	strbuf_addchars(&buf_send, byte, bytecount);
 
-	if (!ipc_client_send_command(path, options, buf_send.buf, &buf_resp)) {
+	if (!ipc_client_send_command(path, options,
+				     buf_send.buf, buf_send.len,
+				     &buf_resp)) {
 		strbuf_rtrim(&buf_resp);
 		printf("sent:%c%08d %s\n", byte, bytecount, buf_resp.buf);
 		fflush(stdout);
-- 
gitgitgadget
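
Note on the length-aware dispatch above: once the request is described by
a pointer plus an explicit length, the server can compare verbs without
assuming NUL termination. The following is a minimal sketch of that idea,
not code from the patch. The helper names `command_is()` and
`command_has_prefix()` are hypothetical, and `memcmp()` is used here
instead of the `strncmp()` plus length check that the patch itself uses.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): return 1 if the
 * length-delimited buffer `buf` is exactly the NUL-terminated
 * literal `verb`.  memcmp() is used, so `buf` may contain NUL
 * bytes and need not be NUL-terminated.
 */
static int command_is(const char *buf, size_t len, const char *verb)
{
	size_t verb_len = strlen(verb);

	return len == verb_len && !memcmp(buf, verb, len);
}

/*
 * Hypothetical helper (not in the patch): return 1 if `buf` begins
 * with `prefix`, and report the remaining payload and its length
 * through the out parameters.  The payload length comes from `len`,
 * not from strlen(), so embedded NUL bytes are counted.
 */
static int command_has_prefix(const char *buf, size_t len,
			      const char *prefix,
			      const char **payload, size_t *payload_len)
{
	size_t prefix_len = strlen(prefix);

	if (len < prefix_len || memcmp(buf, prefix, prefix_len))
		return 0;

	*payload = buf + prefix_len;
	*payload_len = len - prefix_len;
	return 1;
}
```

With helpers like these, a "sendbytes " request whose ballast contains a
NUL byte would still report the full payload length, whereas a
`strlen()` on the payload would stop at the first NUL.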



* [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 01/34] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:29       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 03/34] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
                       ` (32 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a manual page describing the `git fsmonitor--daemon` feature.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Documentation/git-fsmonitor--daemon.txt | 75 +++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 Documentation/git-fsmonitor--daemon.txt

diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
new file mode 100644
index 00000000000..154e7684daa
--- /dev/null
+++ b/Documentation/git-fsmonitor--daemon.txt
@@ -0,0 +1,75 @@
+git-fsmonitor--daemon(1)
+========================
+
+NAME
+----
+git-fsmonitor--daemon - A Built-in File System Monitor
+
+SYNOPSIS
+--------
+[verse]
+'git fsmonitor--daemon' start
+'git fsmonitor--daemon' run
+'git fsmonitor--daemon' stop
+'git fsmonitor--daemon' status
+
+DESCRIPTION
+-----------
+
+A daemon to watch the working directory for file and directory
+changes using platform-specific file system notification facilities.
+
+This daemon communicates directly with commands like `git status`
+using the link:technical/api-simple-ipc.html[simple IPC] interface
+instead of the slower linkgit:githooks[5] interface.
+
+This daemon is built into Git so that no third-party tools are
+required.
+
+OPTIONS
+-------
+
+start::
+	Starts a daemon in the background.
+
+run::
+	Runs a daemon in the foreground.
+
+stop::
+	Stops the daemon running in the current working
+	directory, if present.
+
+status::
+	Exits with zero status if a daemon is watching the
+	current working directory.
+
+REMARKS
+-------
+
+This daemon is a long running process used to watch a single working
+directory and maintain a list of the recently changed files and
+directories.  Performance of commands such as `git status` can be
+increased if they just ask for a summary of changes to the working
+directory and can avoid scanning the disk.
+
+When `core.useBuiltinFSMonitor` is set to `true` (see
+linkgit:git-config[1]), commands such as `git status` will ask the
+daemon for changes and automatically start it (if necessary).
+
+For more information see the "File System Monitor" section in
+linkgit:git-update-index[1].
+
+CAVEATS
+-------
+
+The fsmonitor daemon does not currently know about submodules and does
+not know to filter out file system events that happen within a
+submodule.  If the fsmonitor daemon is watching a super repo and a file is
+modified within the working directory of a submodule, it will report
+the change (as happening against the super repo).  However, the client
+will properly ignore these extra events, so performance may be affected
+but it will not cause an incorrect result.
+
+GIT
+---
+Part of the linkgit:git[1] suite
-- 
gitgitgadget



* [PATCH v3 03/34] fsmonitor--daemon: update fsmonitor documentation
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 01/34] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 02/34] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:31       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 04/34] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
                       ` (31 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Update references to `core.fsmonitor` and `core.fsmonitorHookVersion` and
pointers to `Watchman` to mention the new built-in `fsmonitor--daemon`.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
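
Note on `core.fsmonitorHookVersion`: the documentation below says that
when the setting is unset, version 2 is tried first and version 1 is
tried if that fails. A minimal sketch of that fallback order follows. It
is illustrative only: `sketch_query_with_fallback()`, the callback type
and the fake hook are hypothetical names, and the behavior when an
explicitly configured version fails (no fallback) is an assumption, since
the text does not spell it out.

```c
/* Hypothetical callback: query the hook with `version`; 0 on success. */
typedef int (*sketch_query_fn)(int version, void *ctx);

/*
 * Return the protocol version that answered, or -1 if none did.
 * `configured_version` is 1 or 2 when set, anything else when unset.
 */
static int sketch_query_with_fallback(int configured_version,
				      sketch_query_fn query, void *ctx)
{
	/* Assumption: an explicitly configured version is used as-is. */
	if (configured_version == 1 || configured_version == 2)
		return query(configured_version, ctx) ? -1 : configured_version;

	/* Unset: try version 2 first, then fall back to version 1. */
	if (!query(2, ctx))
		return 2;
	if (!query(1, ctx))
		return 1;
	return -1;
}

/*
 * Fake hook for illustration: `ctx` points to a bitmask in which
 * bit N set means "protocol version N works".
 */
static int fake_hook_query(int version, void *ctx)
{
	unsigned int supported = *(unsigned int *)ctx;

	return (supported & (1u << version)) ? 0 : -1;
}
```

For a fake hook that only understands version 1, an unset configuration
ends up on version 1, while an explicit request for version 2 reports
failure.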
 Documentation/config/core.txt      | 56 ++++++++++++++++++++++--------
 Documentation/git-update-index.txt | 27 +++++++-------
 Documentation/githooks.txt         |  3 +-
 3 files changed, 59 insertions(+), 27 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index c04f62a54a1..4f6e519bc02 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -62,22 +62,50 @@ core.protectNTFS::
 	Defaults to `true` on Windows, and `false` elsewhere.
 
 core.fsmonitor::
-	If set, the value of this variable is used as a command which
-	will identify all files that may have changed since the
-	requested date/time. This information is used to speed up git by
-	avoiding unnecessary processing of files that have not changed.
-	See the "fsmonitor-watchman" section of linkgit:githooks[5].
+	If set, this variable contains the pathname of the "fsmonitor"
+	hook command.
++
+This hook command is used to identify all files that may have changed
+since the requested date/time. This information is used to speed up
+git by avoiding unnecessary scanning of files that have not changed.
++
+See the "fsmonitor-watchman" section of linkgit:githooks[5].
++
+Note: The value of this config setting is ignored if the
+built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
 
 core.fsmonitorHookVersion::
-	Sets the version of hook that is to be used when calling fsmonitor.
-	There are currently versions 1 and 2. When this is not set,
-	version 2 will be tried first and if it fails then version 1
-	will be tried. Version 1 uses a timestamp as input to determine
-	which files have changes since that time but some monitors
-	like watchman have race conditions when used with a timestamp.
-	Version 2 uses an opaque string so that the monitor can return
-	something that can be used to determine what files have changed
-	without race conditions.
+	Sets the protocol version to be used when invoking the
+	"fsmonitor" hook.
++
+There are currently versions 1 and 2. When this is not set,
+version 2 will be tried first and if it fails then version 1
+will be tried. Version 1 uses a timestamp as input to determine
+which files have changed since that time but some monitors
+like Watchman have race conditions when used with a timestamp.
+Version 2 uses an opaque string so that the monitor can return
+something that can be used to determine what files have changed
+without race conditions.
++
+Note: The value of this config setting is ignored if the
+built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
+
+core.useBuiltinFSMonitor::
+	If set to true, enable the built-in file system monitor
+	daemon for this working directory (linkgit:git-fsmonitor--daemon[1]).
++
+Like hook-based file system monitors, the built-in file system monitor
+can speed up Git commands that need to refresh the Git index
+(e.g. `git status`) in a working directory with many files.  The
+built-in monitor eliminates the need to install and maintain an
+external third-party tool.
++
+The built-in file system monitor is currently available only on a
+limited set of supported platforms.  Currently, this includes Windows
+and macOS.
++
+Note: if this config setting is set to `true`, the values of
+`core.fsmonitor` and `core.fsmonitorHookVersion` are ignored.
 
 core.trustctime::
 	If false, the ctime differences between the index and the
diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index 2853f168d97..c7c31b3fcf9 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
 This feature is intended to speed up git operations for repos that have
 large working directories.
 
-It enables git to work together with a file system monitor (see the
+It enables git to work together with a file system monitor (see
+linkgit:git-fsmonitor--daemon[1]
+and the
 "fsmonitor-watchman" section of linkgit:githooks[5]) that can
 inform it as to what files have been modified. This enables git to avoid
 having to lstat() every file to find modified files.
@@ -508,17 +510,18 @@ performance by avoiding the cost of scanning the entire working directory
 looking for new files.
 
 If you want to enable (or disable) this feature, it is easier to use
-the `core.fsmonitor` configuration variable (see
-linkgit:git-config[1]) than using the `--fsmonitor` option to
-`git update-index` in each repository, especially if you want to do so
-across all repositories you use, because you can set the configuration
-variable in your `$HOME/.gitconfig` just once and have it affect all
-repositories you touch.
-
-When the `core.fsmonitor` configuration variable is changed, the
-file system monitor is added to or removed from the index the next time
-a command reads the index. When `--[no-]fsmonitor` are used, the file
-system monitor is immediately added to or removed from the index.
+the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
+variable (see linkgit:git-config[1]) than using the `--fsmonitor`
+option to `git update-index` in each repository, especially if you
+want to do so across all repositories you use, because you can set the
+configuration variable in your `$HOME/.gitconfig` just once and have
+it affect all repositories you touch.
+
+When the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
+variable is changed, the file system monitor is added to or removed
+from the index the next time a command reads the index. When
+`--[no-]fsmonitor` is used, the file system monitor is immediately
+added to or removed from the index.
 
 CONFIGURATION
 -------------
diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index b51959ff941..b7d5e926f7b 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -593,7 +593,8 @@ fsmonitor-watchman
 
 This hook is invoked when the configuration option `core.fsmonitor` is
 set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
-depending on the version of the hook to use.
+depending on the version of the hook to use, unless overridden via
+`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).
 
 Version 1 takes two arguments, a version (1) and the time in elapsed
 nanoseconds since midnight, January 1, 1970.
-- 
gitgitgadget



* [PATCH v3 04/34] fsmonitor-ipc: create client routines for git-fsmonitor--daemon
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 03/34] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 05/34] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
                       ` (30 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create fsmonitor_ipc__*() client routines to spawn the built-in file
system monitor daemon and send it an IPC request using the `Simple
IPC` API.

Stub in empty fsmonitor_ipc__*() functions for unsupported platforms.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
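
Note on the control flow of `fsmonitor_ipc__send_query()`: the client
tries to connect, spawns the daemon at most once if the pipe/socket path
does not exist, and then retries with `wait_if_not_found` enabled. The
sketch below restates that flow as a loop over stand-in callbacks so it
can be read (and compiled) in isolation. Everything prefixed `sketch_` or
`fake_` is hypothetical and is not part of the simple-ipc API; the fake
daemon exists only to exercise the loop.

```c
/* Hypothetical stand-ins for the simple-ipc connection states. */
enum sketch_state {
	SKETCH_LISTENING,
	SKETCH_NOT_LISTENING,
	SKETCH_PATH_NOT_FOUND,
	SKETCH_OTHER_ERROR
};

struct sketch_ops {
	/* Try to connect; the flag mirrors options.wait_if_not_found. */
	enum sketch_state (*try_connect)(void *ctx, int wait_if_not_found);
	/* Start the daemon in the background; 0 on success. */
	int (*spawn)(void *ctx);
	/* Send the query over an established connection; 0 on success. */
	int (*send)(void *ctx);
	void *ctx;
};

static int sketch_send_query(const struct sketch_ops *ops)
{
	int tried_to_spawn = 0;
	int wait_if_not_found = 0;

	for (;;) {
		switch (ops->try_connect(ops->ctx, wait_if_not_found)) {
		case SKETCH_LISTENING:
			return ops->send(ops->ctx);

		case SKETCH_PATH_NOT_FOUND:
			/* Spawn at most once, then give up. */
			if (tried_to_spawn || ops->spawn(ops->ctx))
				return -1;
			tried_to_spawn = 1;
			/* Give the new daemon time to create the path. */
			wait_if_not_found = 1;
			continue;

		default:
			return -1;
		}
	}
}

/* Fake daemon for illustration: absent until spawn() succeeds. */
struct fake_daemon {
	int running;
	int spawn_calls;
	int spawn_fails;
};

static enum sketch_state fake_try_connect(void *ctx, int wait_if_not_found)
{
	struct fake_daemon *d = ctx;

	(void)wait_if_not_found;
	return d->running ? SKETCH_LISTENING : SKETCH_PATH_NOT_FOUND;
}

static int fake_spawn(void *ctx)
{
	struct fake_daemon *d = ctx;

	d->spawn_calls++;
	if (d->spawn_fails)
		return -1;
	d->running = 1;
	return 0;
}

static int fake_send(void *ctx)
{
	(void)ctx;
	return 0;
}
```

The first query against an absent fake daemon spawns it once and then
succeeds; later queries connect directly without spawning again.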
 Makefile        |   1 +
 fsmonitor-ipc.c | 179 ++++++++++++++++++++++++++++++++++++++++++++++++
 fsmonitor-ipc.h |  48 +++++++++++++
 3 files changed, 228 insertions(+)
 create mode 100644 fsmonitor-ipc.c
 create mode 100644 fsmonitor-ipc.h

diff --git a/Makefile b/Makefile
index c3565fc0f8f..209c97aa22d 100644
--- a/Makefile
+++ b/Makefile
@@ -893,6 +893,7 @@ LIB_OBJS += fetch-pack.o
 LIB_OBJS += fmt-merge-msg.o
 LIB_OBJS += fsck.o
 LIB_OBJS += fsmonitor.o
+LIB_OBJS += fsmonitor-ipc.o
 LIB_OBJS += gettext.o
 LIB_OBJS += gpg-interface.o
 LIB_OBJS += graph.o
diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
new file mode 100644
index 00000000000..41706972520
--- /dev/null
+++ b/fsmonitor-ipc.c
@@ -0,0 +1,179 @@
+#include "cache.h"
+#include "fsmonitor.h"
+#include "simple-ipc.h"
+#include "fsmonitor-ipc.h"
+#include "run-command.h"
+#include "strbuf.h"
+#include "trace2.h"
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+int fsmonitor_ipc__is_supported(void)
+{
+	return 1;
+}
+
+GIT_PATH_FUNC(fsmonitor_ipc__get_path, "fsmonitor--daemon.ipc")
+
+enum ipc_active_state fsmonitor_ipc__get_state(void)
+{
+	return ipc_get_active_state(fsmonitor_ipc__get_path());
+}
+
+static int spawn_daemon(void)
+{
+	const char *args[] = { "fsmonitor--daemon", "start", NULL };
+
+	return run_command_v_opt_tr2(args, RUN_COMMAND_NO_STDIN | RUN_GIT_CMD,
+				    "fsmonitor");
+}
+
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer)
+{
+	int ret = -1;
+	int tried_to_spawn = 0;
+	enum ipc_active_state state = IPC_STATE__OTHER_ERROR;
+	struct ipc_client_connection *connection = NULL;
+	struct ipc_client_connect_options options
+		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
+	const char *tok = since_token ? since_token : "";
+	size_t tok_len = since_token ? strlen(since_token) : 0;
+
+	options.wait_if_busy = 1;
+	options.wait_if_not_found = 0;
+
+	trace2_region_enter("fsm_client", "query", NULL);
+	trace2_data_string("fsm_client", NULL, "query/command", tok);
+
+try_again:
+	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
+				       &connection);
+
+	switch (state) {
+	case IPC_STATE__LISTENING:
+		ret = ipc_client_send_command_to_connection(
+			connection, tok, tok_len, answer);
+		ipc_client_close_connection(connection);
+
+		trace2_data_intmax("fsm_client", NULL,
+				   "query/response-length", answer->len);
+
+		if (fsmonitor_is_trivial_response(answer))
+			trace2_data_intmax("fsm_client", NULL,
+					   "query/trivial-response", 1);
+
+		goto done;
+
+	case IPC_STATE__NOT_LISTENING:
+		ret = error(_("fsmonitor_ipc__send_query: daemon not available"));
+		goto done;
+
+	case IPC_STATE__PATH_NOT_FOUND:
+		if (tried_to_spawn)
+			goto done;
+
+		tried_to_spawn++;
+		if (spawn_daemon())
+			goto done;
+
+		/*
+		 * Try again, but this time give the daemon a chance to
+		 * actually create the pipe/socket.
+		 *
+		 * Granted, the daemon just started so it can't possibly have
+		 * any FS cached yet, so we'll always get a trivial answer.
+		 * BUT the answer should include a new token that can serve
+		 * as the basis for subsequent requests.
+		 */
+		options.wait_if_not_found = 1;
+		goto try_again;
+
+	case IPC_STATE__INVALID_PATH:
+		ret = error(_("fsmonitor_ipc__send_query: invalid path '%s'"),
+			    fsmonitor_ipc__get_path());
+		goto done;
+
+	case IPC_STATE__OTHER_ERROR:
+	default:
+		ret = error(_("fsmonitor_ipc__send_query: unspecified error on '%s'"),
+			    fsmonitor_ipc__get_path());
+		goto done;
+	}
+
+done:
+	trace2_region_leave("fsm_client", "query", NULL);
+
+	return ret;
+}
+
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer)
+{
+	struct ipc_client_connection *connection = NULL;
+	struct ipc_client_connect_options options
+		= IPC_CLIENT_CONNECT_OPTIONS_INIT;
+	int ret;
+	enum ipc_active_state state;
+	const char *c = command ? command : "";
+	size_t c_len = command ? strlen(command) : 0;
+
+	strbuf_reset(answer);
+
+	options.wait_if_busy = 1;
+	options.wait_if_not_found = 0;
+
+	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
+				       &connection);
+	if (state != IPC_STATE__LISTENING) {
+		die("fsmonitor--daemon is not running");
+		return -1;
+	}
+
+	ret = ipc_client_send_command_to_connection(connection, c, c_len,
+						    answer);
+	ipc_client_close_connection(connection);
+
+	if (ret == -1) {
+		die("could not send '%s' command to fsmonitor--daemon", c);
+		return -1;
+	}
+
+	return 0;
+}
+
+#else
+
+/*
+ * A trivial implementation of the fsmonitor_ipc__ API for unsupported
+ * platforms.
+ */
+
+int fsmonitor_ipc__is_supported(void)
+{
+	return 0;
+}
+
+const char *fsmonitor_ipc__get_path(void)
+{
+	return NULL;
+}
+
+enum ipc_active_state fsmonitor_ipc__get_state(void)
+{
+	return IPC_STATE__OTHER_ERROR;
+}
+
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer)
+{
+	return -1;
+}
+
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer)
+{
+	return -1;
+}
+
+#endif
diff --git a/fsmonitor-ipc.h b/fsmonitor-ipc.h
new file mode 100644
index 00000000000..837c5e5b64a
--- /dev/null
+++ b/fsmonitor-ipc.h
@@ -0,0 +1,48 @@
+#ifndef FSMONITOR_IPC_H
+#define FSMONITOR_IPC_H
+
+/*
+ * Returns true if the built-in file system monitor daemon is defined
+ * for this platform.
+ */
+int fsmonitor_ipc__is_supported(void);
+
+/*
+ * Returns the pathname to the IPC named pipe or Unix domain socket
+ * where a `git-fsmonitor--daemon` process will listen.  This is a
+ * per-worktree value.
+ *
+ * Returns NULL if the daemon is not supported on this platform.
+ */
+const char *fsmonitor_ipc__get_path(void);
+
+/*
+ * Try to determine whether there is a `git-fsmonitor--daemon` process
+ * listening on the IPC pipe/socket.
+ */
+enum ipc_active_state fsmonitor_ipc__get_state(void);
+
+/*
+ * Connect to a `git-fsmonitor--daemon` process via simple-ipc
+ * and ask for the set of changed files since the given token.
+ *
+ * This DOES NOT use the hook interface.
+ *
+ * Spawn a daemon process in the background if necessary.
+ *
+ * Returns -1 on error; 0 on success.
+ */
+int fsmonitor_ipc__send_query(const char *since_token,
+			      struct strbuf *answer);
+
+/*
+ * Connect to a `git-fsmonitor--daemon` process via simple-ipc and
+ * send a command verb.  If no daemon is available, we DO NOT try to
+ * start one.
+ *
+ * Returns -1 on error; 0 on success.
+ */
+int fsmonitor_ipc__send_command(const char *command,
+				struct strbuf *answer);
+
+#endif /* FSMONITOR_IPC_H */
-- 
gitgitgadget



* [PATCH v3 05/34] help: include fsmonitor--daemon feature flag in version info
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 04/34] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 06/34] fsmonitor: config settings are repository-specific Jeff Hostetler via GitGitGadget
                       ` (29 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Add the "feature: fsmonitor--daemon" message to the output of
`git version --build-options`.

This allows users to know if the built-in fsmonitor feature is
supported on their platform.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 help.c        | 4 ++++
 t/test-lib.sh | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/help.c b/help.c
index 3c3bdec2135..e22ba1d246a 100644
--- a/help.c
+++ b/help.c
@@ -11,6 +11,7 @@
 #include "version.h"
 #include "refs.h"
 #include "parse-options.h"
+#include "fsmonitor-ipc.h"
 
 struct category_description {
 	uint32_t category;
@@ -664,6 +665,9 @@ void get_version_info(struct strbuf *buf, int show_build_options)
 		strbuf_addf(buf, "sizeof-size_t: %d\n", (int)sizeof(size_t));
 		strbuf_addf(buf, "shell-path: %s\n", SHELL_PATH);
 		/* NEEDSWORK: also save and output GIT-BUILD_OPTIONS? */
+
+		if (fsmonitor_ipc__is_supported())
+			strbuf_addstr(buf, "feature: fsmonitor--daemon\n");
 	}
 }
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index adaf03543e8..bfb7ff6ed17 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1674,3 +1674,9 @@ test_lazy_prereq REBASE_P '
 # Tests that verify the scheduler integration must set this locally
 # to avoid errors.
 GIT_TEST_MAINT_SCHEDULER="none:exit 1"
+
+# Does this platform support `git fsmonitor--daemon`?
+#
+test_lazy_prereq FSMONITOR_DAEMON '
+	git version --build-options | grep "feature:" | grep "fsmonitor--daemon"
+'
-- 
gitgitgadget



* [PATCH v3 06/34] fsmonitor: config settings are repository-specific
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (4 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 05/34] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 16:46       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 07/34] fsmonitor: use IPC to query the builtin FSMonitor daemon Jeff Hostetler via GitGitGadget
                       ` (28 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Move FSMonitor config settings to `struct repo_settings`, get rid of
the `core_fsmonitor` global variable, and add support for the new
`core.useBuiltinFSMonitor` config setting.  Move config code to look up
`core.fsmonitor` into `prepare_repo_settings()` with the rest of the
setup code.

The `core_fsmonitor` global variable was used to store the pathname to
the FSMonitor hook and it was used as a boolean to see if FSMonitor
was enabled.  This dual usage will lead to confusion when we add
support for a builtin FSMonitor based on IPC, since the builtin
FSMonitor doesn't need the hook pathname.

Replace the boolean usage with an `enum fsmonitor_mode` to represent
the state of FSMonitor.  Set the pathname only when in HOOK mode.

Also, disable FSMonitor when the repository working directory is
incompatible, for example in bare repositories, where there aren't
any files to watch.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
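
Note on how the mode is chosen: the commit message and the updated
documentation together imply a precedence, namely that an incompatible
(e.g. bare) repository disables the feature, the builtin setting
overrides the hook settings, and an empty hook path counts as unset. The
sketch below is one way to write that precedence down. It is not the
code in `repo-settings.c`: the `sketch_` names, the numeric enum values
and the order of the checks are assumptions, chosen only to be
consistent with the `<= FSMONITOR_MODE_DISABLED` test in
`refresh_fsmonitor()`.

```c
#include <stddef.h>

/* Assumed values: only the ordering relative to DISABLED matters. */
enum sketch_fsmonitor_mode {
	SKETCH_MODE_INCOMPATIBLE = -1,
	SKETCH_MODE_DISABLED = 0,
	SKETCH_MODE_HOOK = 1,
	SKETCH_MODE_IPC = 2
};

/*
 * Hypothetical resolver: `use_builtin` stands for
 * core.useBuiltinFSMonitor and `hook_path` for core.fsmonitor
 * (NULL or "" when unset).
 */
static enum sketch_fsmonitor_mode sketch_resolve_mode(int is_bare,
							int use_builtin,
							const char *hook_path)
{
	if (is_bare)
		return SKETCH_MODE_INCOMPATIBLE; /* nothing to watch */
	if (use_builtin)
		return SKETCH_MODE_IPC;		 /* hook settings ignored */
	if (hook_path && *hook_path)
		return SKETCH_MODE_HOOK;
	return SKETCH_MODE_DISABLED;
}

/* Mirrors the "mode <= DISABLED means do nothing" early return. */
static int sketch_mode_is_active(enum sketch_fsmonitor_mode mode)
{
	return mode > SKETCH_MODE_DISABLED;
}
```

Keeping INCOMPATIBLE below DISABLED lets a single comparison cover both
"off" states, which is how the early return in `refresh_fsmonitor()`
reads.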
 builtin/update-index.c      | 20 +++++++++---
 cache.h                     |  1 -
 config.c                    | 14 --------
 config.h                    |  1 -
 environment.c               |  1 -
 fsmonitor.c                 | 65 +++++++++++++++++++++++--------------
 fsmonitor.h                 | 14 ++++++--
 repo-settings.c             | 48 +++++++++++++++++++++++++++
 repository.h                | 11 +++++++
 t/README                    |  4 +--
 t/t7519-status-fsmonitor.sh | 22 +++++++++++++
 11 files changed, 150 insertions(+), 51 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index f1f16f2de52..0141899caa3 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1216,14 +1216,26 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 	}
 
 	if (fsmonitor > 0) {
-		if (git_config_get_fsmonitor() == 0)
+		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_INCOMPATIBLE)
+			return error(
+				_("repository is incompatible with fsmonitor"));
+
+		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_DISABLED) {
+			warning(_("core.useBuiltinFSMonitor is unset; "
+				"set it if you really want to enable the "
+				"builtin fsmonitor"));
 			warning(_("core.fsmonitor is unset; "
-				"set it if you really want to "
-				"enable fsmonitor"));
+				"set it if you really want to enable the "
+				"hook-based fsmonitor"));
+		}
 		add_fsmonitor(&the_index);
 		report(_("fsmonitor enabled"));
 	} else if (!fsmonitor) {
-		if (git_config_get_fsmonitor() == 1)
+		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_IPC)
+			warning(_("core.useBuiltinFSMonitor is set; "
+				"remove it if you really want to "
+				"disable fsmonitor"));
+		if (r->settings.fsmonitor_mode == FSMONITOR_MODE_HOOK)
 			warning(_("core.fsmonitor is set; "
 				"remove it if you really want to "
 				"disable fsmonitor"));
diff --git a/cache.h b/cache.h
index ba04ff8bd36..50463876852 100644
--- a/cache.h
+++ b/cache.h
@@ -981,7 +981,6 @@ extern int core_preload_index;
 extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
-extern const char *core_fsmonitor;
 
 extern int core_apply_sparse_checkout;
 extern int core_sparse_checkout_cone;
diff --git a/config.c b/config.c
index f9c400ad306..ec3b88b4639 100644
--- a/config.c
+++ b/config.c
@@ -2516,20 +2516,6 @@ int git_config_get_max_percent_split_change(void)
 	return -1; /* default value */
 }
 
-int git_config_get_fsmonitor(void)
-{
-	if (git_config_get_pathname("core.fsmonitor", &core_fsmonitor))
-		core_fsmonitor = getenv("GIT_TEST_FSMONITOR");
-
-	if (core_fsmonitor && !*core_fsmonitor)
-		core_fsmonitor = NULL;
-
-	if (core_fsmonitor)
-		return 1;
-
-	return 0;
-}
-
 int git_config_get_index_threads(int *dest)
 {
 	int is_bool, val;
diff --git a/config.h b/config.h
index 9038538ffdc..b8d6b75d4fe 100644
--- a/config.h
+++ b/config.h
@@ -609,7 +609,6 @@ int git_config_get_index_threads(int *dest);
 int git_config_get_untracked_cache(void);
 int git_config_get_split_index(void);
 int git_config_get_max_percent_split_change(void);
-int git_config_get_fsmonitor(void);
 
 /* This dies if the configured or default date is in the future */
 int git_config_get_expiry(const char *key, const char **output);
diff --git a/environment.c b/environment.c
index 2f27008424a..7b5e8ff78da 100644
--- a/environment.c
+++ b/environment.c
@@ -84,7 +84,6 @@ int protect_hfs = PROTECT_HFS_DEFAULT;
 #define PROTECT_NTFS_DEFAULT 1
 #endif
 int protect_ntfs = PROTECT_NTFS_DEFAULT;
-const char *core_fsmonitor;
 
 /*
  * The character that begins a commented line in user-editable file
diff --git a/fsmonitor.c b/fsmonitor.c
index ab9bfc60b34..374189be7d9 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -3,6 +3,7 @@
 #include "dir.h"
 #include "ewah/ewok.h"
 #include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
 #include "run-command.h"
 #include "strbuf.h"
 
@@ -148,15 +149,21 @@ void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
 /*
  * Call the query-fsmonitor hook passing the last update token of the saved results.
  */
-static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
+static int query_fsmonitor_hook(struct repository *r,
+				int version,
+				const char *last_update,
+				struct strbuf *query_result)
 {
 	struct child_process cp = CHILD_PROCESS_INIT;
 	int result;
 
-	if (!core_fsmonitor)
+	if (r->settings.fsmonitor_mode != FSMONITOR_MODE_HOOK)
 		return -1;
 
-	strvec_push(&cp.args, core_fsmonitor);
+	assert(r->settings.fsmonitor_hook_path);
+	assert(*r->settings.fsmonitor_hook_path);
+
+	strvec_push(&cp.args, r->settings.fsmonitor_hook_path);
 	strvec_pushf(&cp.args, "%d", version);
 	strvec_pushf(&cp.args, "%s", last_update);
 	cp.use_shell = 1;
@@ -238,17 +245,27 @@ void refresh_fsmonitor(struct index_state *istate)
 	struct strbuf last_update_token = STRBUF_INIT;
 	char *buf;
 	unsigned int i;
+	struct repository *r = istate->repo ? istate->repo : the_repository;
 
-	if (!core_fsmonitor || istate->fsmonitor_has_run_once)
+	if (r->settings.fsmonitor_mode <= FSMONITOR_MODE_DISABLED ||
+	    istate->fsmonitor_has_run_once)
 		return;
 
-	hook_version = fsmonitor_hook_version();
-
 	istate->fsmonitor_has_run_once = 1;
 
 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
+
+	if (r->settings.fsmonitor_mode == FSMONITOR_MODE_IPC) {
+		/* TODO */
+		return;
+	}
+
+	assert(r->settings.fsmonitor_mode == FSMONITOR_MODE_HOOK);
+
+	hook_version = fsmonitor_hook_version();
+
 	/*
-	 * This could be racy so save the date/time now and query_fsmonitor
+	 * This could be racy so save the date/time now and query_fsmonitor_hook
 	 * should be inclusive to ensure we don't miss potential changes.
 	 */
 	last_update = getnanotime();
@@ -256,13 +273,14 @@ void refresh_fsmonitor(struct index_state *istate)
 		strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
 
 	/*
-	 * If we have a last update token, call query_fsmonitor for the set of
+	 * If we have a last update token, call query_fsmonitor_hook for the set of
 	 * changes since that token, else assume everything is possibly dirty
 	 * and check it all.
 	 */
 	if (istate->fsmonitor_last_update) {
 		if (hook_version == -1 || hook_version == HOOK_INTERFACE_VERSION2) {
-			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION2,
+			query_success = !query_fsmonitor_hook(
+				r, HOOK_INTERFACE_VERSION2,
 				istate->fsmonitor_last_update, &query_result);
 
 			if (query_success) {
@@ -292,13 +310,17 @@ void refresh_fsmonitor(struct index_state *istate)
 		}
 
 		if (hook_version == HOOK_INTERFACE_VERSION1) {
-			query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION1,
+			query_success = !query_fsmonitor_hook(
+				r, HOOK_INTERFACE_VERSION1,
 				istate->fsmonitor_last_update, &query_result);
 		}
 
-		trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
-		trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s",
-			core_fsmonitor, query_success ? "success" : "failure");
+		trace_performance_since(last_update, "fsmonitor process '%s'",
+					r->settings.fsmonitor_hook_path);
+		trace_printf_key(&trace_fsmonitor,
+				 "fsmonitor process '%s' returned %s",
+				 r->settings.fsmonitor_hook_path,
+				 query_success ? "success" : "failure");
 	}
 
 	/* a fsmonitor process can return '/' to indicate all entries are invalid */
@@ -411,7 +433,8 @@ void remove_fsmonitor(struct index_state *istate)
 void tweak_fsmonitor(struct index_state *istate)
 {
 	unsigned int i;
-	int fsmonitor_enabled = git_config_get_fsmonitor();
+	struct repository *r = istate->repo ? istate->repo : the_repository;
+	int fsmonitor_enabled = r->settings.fsmonitor_mode > FSMONITOR_MODE_DISABLED;
 
 	if (istate->fsmonitor_dirty) {
 		if (fsmonitor_enabled) {
@@ -431,16 +454,8 @@ void tweak_fsmonitor(struct index_state *istate)
 		istate->fsmonitor_dirty = NULL;
 	}
 
-	switch (fsmonitor_enabled) {
-	case -1: /* keep: do nothing */
-		break;
-	case 0: /* false */
-		remove_fsmonitor(istate);
-		break;
-	case 1: /* true */
+	if (fsmonitor_enabled)
 		add_fsmonitor(istate);
-		break;
-	default: /* unknown value: do nothing */
-		break;
-	}
+	else
+		remove_fsmonitor(istate);
 }
diff --git a/fsmonitor.h b/fsmonitor.h
index f20d72631d7..9cc14e05239 100644
--- a/fsmonitor.h
+++ b/fsmonitor.h
@@ -57,7 +57,10 @@ int fsmonitor_is_trivial_response(const struct strbuf *query_result);
  */
 static inline int is_fsmonitor_refreshed(const struct index_state *istate)
 {
-	return !core_fsmonitor || istate->fsmonitor_has_run_once;
+	struct repository *r = istate->repo ? istate->repo : the_repository;
+
+	return r->settings.fsmonitor_mode <= FSMONITOR_MODE_DISABLED ||
+		istate->fsmonitor_has_run_once;
 }
 
 /*
@@ -67,7 +70,10 @@ static inline int is_fsmonitor_refreshed(const struct index_state *istate)
  */
 static inline void mark_fsmonitor_valid(struct index_state *istate, struct cache_entry *ce)
 {
-	if (core_fsmonitor && !(ce->ce_flags & CE_FSMONITOR_VALID)) {
+	struct repository *r = istate->repo ? istate->repo : the_repository;
+
+	if (r->settings.fsmonitor_mode > FSMONITOR_MODE_DISABLED &&
+	    !(ce->ce_flags & CE_FSMONITOR_VALID)) {
 		istate->cache_changed = 1;
 		ce->ce_flags |= CE_FSMONITOR_VALID;
 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_clean '%s'", ce->name);
@@ -83,7 +89,9 @@ static inline void mark_fsmonitor_valid(struct index_state *istate, struct cache
  */
 static inline void mark_fsmonitor_invalid(struct index_state *istate, struct cache_entry *ce)
 {
-	if (core_fsmonitor) {
+	struct repository *r = istate->repo ? istate->repo : the_repository;
+
+	if (r->settings.fsmonitor_mode > FSMONITOR_MODE_DISABLED) {
 		ce->ce_flags &= ~CE_FSMONITOR_VALID;
 		untracked_cache_invalidate_path(istate, ce->name, 1);
 		trace_printf_key(&trace_fsmonitor, "mark_fsmonitor_invalid '%s'", ce->name);
diff --git a/repo-settings.c b/repo-settings.c
index 0cfe8b787db..faf197ff60a 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -5,10 +5,42 @@
 
 #define UPDATE_DEFAULT_BOOL(s,v) do { if (s == -1) { s = v; } } while(0)
 
+/*
+ * Return 1 if the repo/workdir is incompatible with FSMonitor.
+ */
+static int is_repo_incompatible_with_fsmonitor(struct repository *r)
+{
+	const char *const_strval;
+
+	/*
+	 * Bare repositories don't have a working directory and,
+	 * therefore, have nothing to watch.
+	 */
+	if (!r->worktree)
+		return 1;
+
+	/*
+	 * GVFS (aka VFS for Git) is incompatible with FSMonitor.
+	 *
+	 * Granted, core Git does not know anything about GVFS and
+	 * we shouldn't make assumptions about a downstream feature,
+	 * but users can install both versions.  And this can lead
+	 * to incorrect results from core Git commands.  So, without
+	 * bringing in any of the GVFS code, do a simple config test
+	 * for a published config setting.  (We do not look at the
+	 * various *_TEST_* environment variables.)
+	 */
+	if (!repo_config_get_value(r, "core.virtualfilesystem", &const_strval))
+		return 1;
+
+	return 0;
+}
+
 void prepare_repo_settings(struct repository *r)
 {
 	int value;
 	char *strval;
+	const char *const_strval;
 
 	if (r->settings.initialized)
 		return;
@@ -26,6 +58,22 @@ void prepare_repo_settings(struct repository *r)
 	UPDATE_DEFAULT_BOOL(r->settings.commit_graph_read_changed_paths, 1);
 	UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
 
+	r->settings.fsmonitor_hook_path = NULL;
+	r->settings.fsmonitor_mode = FSMONITOR_MODE_DISABLED;
+	if (is_repo_incompatible_with_fsmonitor(r))
+		r->settings.fsmonitor_mode = FSMONITOR_MODE_INCOMPATIBLE;
+	else if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value)
+		   && value)
+		r->settings.fsmonitor_mode = FSMONITOR_MODE_IPC;
+	else {
+		if (repo_config_get_pathname(r, "core.fsmonitor", &const_strval))
+			const_strval = getenv("GIT_TEST_FSMONITOR");
+		if (const_strval && *const_strval) {
+			r->settings.fsmonitor_hook_path = xstrdup(const_strval);
+			r->settings.fsmonitor_mode = FSMONITOR_MODE_HOOK;
+		}
+	}
+
 	if (!repo_config_get_int(r, "index.version", &value))
 		r->settings.index_version = value;
 	if (!repo_config_get_maybe_bool(r, "core.untrackedcache", &value)) {
diff --git a/repository.h b/repository.h
index a45f7520fd9..09154298ba1 100644
--- a/repository.h
+++ b/repository.h
@@ -26,6 +26,14 @@ enum fetch_negotiation_setting {
 	FETCH_NEGOTIATION_NOOP = 3,
 };
 
+enum fsmonitor_mode {
+	FSMONITOR_MODE_INCOMPATIBLE = -2,
+	FSMONITOR_MODE_UNSET = -1,
+	FSMONITOR_MODE_DISABLED = 0,
+	FSMONITOR_MODE_HOOK = 1, /* core.fsmonitor */
+	FSMONITOR_MODE_IPC = 2, /* core.useBuiltinFSMonitor */
+};
+
 struct repo_settings {
 	int initialized;
 
@@ -34,6 +42,9 @@ struct repo_settings {
 	int gc_write_commit_graph;
 	int fetch_write_commit_graph;
 
+	enum fsmonitor_mode fsmonitor_mode;
+	char *fsmonitor_hook_path;
+
 	int index_version;
 	enum untracked_cache_setting core_untracked_cache;
 
diff --git a/t/README b/t/README
index 1a2072b2c8a..852a4eae9da 100644
--- a/t/README
+++ b/t/README
@@ -398,8 +398,8 @@ every 'git commit-graph write', as if the `--changed-paths` option was
 passed in.
 
 GIT_TEST_FSMONITOR=$PWD/t7519/fsmonitor-all exercises the fsmonitor
-code path for utilizing a file system monitor to speed up detecting
-new or changed files.
+code path for utilizing a (hook based) file system monitor to speed up
+detecting new or changed files.
 
 GIT_TEST_INDEX_VERSION=<n> exercises the index read/write code path
 for the index version specified.  Can be set to any valid version
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 637391c6ce4..02919c68ddd 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -383,4 +383,26 @@ test_expect_success 'status succeeds after staging/unstaging' '
 	)
 '
 
+# Test that we detect and disallow repos that are incompatible with FSMonitor.
+test_expect_success 'incompatible bare repo' '
+	test_when_finished "rm -rf ./bare-clone" &&
+	git clone --bare . ./bare-clone &&
+	cat >expect <<-\EOF &&
+	error: repository is incompatible with fsmonitor
+	EOF
+	test_must_fail git -C ./bare-clone update-index --fsmonitor 2>actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'incompatible core.virtualfilesystem' '
+	test_when_finished "rm -rf ./fake-gvfs-clone" &&
+	git clone . ./fake-gvfs-clone &&
+	git -C ./fake-gvfs-clone config core.virtualfilesystem true &&
+	cat >expect <<-\EOF &&
+	error: repository is incompatible with fsmonitor
+	EOF
+	test_must_fail git -C ./fake-gvfs-clone update-index --fsmonitor 2>actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v3 07/34] fsmonitor: use IPC to query the builtin FSMonitor daemon
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (5 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 06/34] fsmonitor: config settings are repository-specific Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
                       ` (27 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Use simple IPC to directly communicate with the new builtin file
system monitor daemon when `core.useBuiltinFSMonitor` is set.
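
As an illustration of the response format this patch consumes (a
series of NUL-terminated strings, with the new token first), here is a
minimal standalone sketch. The helper name `walk_response` and its
callback are hypothetical and not part of the patch, and the sketch
assumes the final string's NUL is counted in the buffer length.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): walk a response buffer
 * made of NUL-terminated strings.  The first string is the new token;
 * every following string is a changed path.
 *
 * Returns the number of paths seen, or -1 if the buffer is empty or
 * its last string is not NUL-terminated within `len` bytes.
 */
static int walk_response(const char *buf, size_t len,
			 const char **token_out,
			 void (*on_path)(const char *path))
{
	size_t pos;
	int nr_paths = 0;

	if (!len || buf[len - 1] != '\0')
		return -1;

	*token_out = buf;
	/* Skip the token and its NUL, like `bol = token.len + 1`. */
	pos = strlen(buf) + 1;

	while (pos < len) {
		if (on_path)
			on_path(buf + pos);
		nr_paths++;
		pos += strlen(buf + pos) + 1;
	}

	return nr_paths;
}
```

Passing NULL for the callback simply counts the paths, which is enough
to check that a buffer is well formed before applying it.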

The `core.fsmonitor` setting has already been defined as a HOOK
pathname.  Historically, this has been set to a HOOK script that will
talk with Watchman.  For compatibility reasons, we do not want to
overload that definition (and cause problems if users have multiple
versions of Git installed).
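
The resulting precedence between the two settings can be sketched as
follows. This is a standalone approximation: the enum values are
copied from the previous patch in this series, while `struct
fake_config` and `pick_mode()` are hypothetical stand-ins for the real
config lookups.

```c
#include <stddef.h>

enum fsmonitor_mode {
	FSMONITOR_MODE_INCOMPATIBLE = -2,
	FSMONITOR_MODE_UNSET = -1,
	FSMONITOR_MODE_DISABLED = 0,
	FSMONITOR_MODE_HOOK = 1, /* core.fsmonitor */
	FSMONITOR_MODE_IPC = 2, /* core.useBuiltinFSMonitor */
};

/* Hypothetical stand-in for the values Git reads from the config. */
struct fake_config {
	int incompatible;          /* bare repo or core.virtualfilesystem */
	int use_builtin_fsmonitor; /* core.useBuiltinFSMonitor */
	const char *hook_path;     /* core.fsmonitor; may be NULL or "" */
};

/*
 * Incompatibility wins over everything, the builtin daemon wins over
 * the hook, and an empty hook path counts as "not configured".
 */
static enum fsmonitor_mode pick_mode(const struct fake_config *cfg)
{
	if (cfg->incompatible)
		return FSMONITOR_MODE_INCOMPATIBLE;
	if (cfg->use_builtin_fsmonitor)
		return FSMONITOR_MODE_IPC;
	if (cfg->hook_path && *cfg->hook_path)
		return FSMONITOR_MODE_HOOK;
	return FSMONITOR_MODE_DISABLED;
}
```

Keeping the hook path in its own setting means an older Git that only
understands `core.fsmonitor` still sees a valid hook pathname.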

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 fsmonitor.c | 41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/fsmonitor.c b/fsmonitor.c
index 374189be7d9..3719ddfeec9 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -256,8 +256,44 @@ void refresh_fsmonitor(struct index_state *istate)
 	trace_printf_key(&trace_fsmonitor, "refresh fsmonitor");
 
 	if (r->settings.fsmonitor_mode == FSMONITOR_MODE_IPC) {
-		/* TODO */
-		return;
+		query_success = !fsmonitor_ipc__send_query(
+			istate->fsmonitor_last_update ?
+			istate->fsmonitor_last_update : "builtin:fake",
+			&query_result);
+		if (query_success) {
+			/*
+			 * The response contains a series of NUL-terminated
+			 * strings.  The first is the new token.
+			 *
+			 * Use `char *buf` as an intermediary to trick the
+			 * CI static analysis into letting us use
+			 * `strbuf_addstr()` here (and only copy the token)
+			 * rather than `strbuf_addbuf()`.
+			 */
+			buf = query_result.buf;
+			strbuf_addstr(&last_update_token, buf);
+			bol = last_update_token.len + 1;
+		} else {
+			/*
+			 * The builtin daemon is not available on this
+			 * platform -OR- we failed to get a response.
+			 *
+			 * Generate a fake token (rather than a V1
+			 * timestamp) for the index extension.  (If
+			 * they switch back to the hook API, we don't
+			 * want ambiguous state.)
+			 */
+			strbuf_addstr(&last_update_token, "builtin:fake");
+		}
+
+		/*
+		 * Regardless of whether we successfully talked to a
+		 * fsmonitor daemon or not, we skip over and do not
+		 * try to use the hook.  The "core.useBuiltinFSMonitor"
+		 * config setting ALWAYS overrides the "core.fsmonitor"
+		 * hook setting.
+		 */
+		goto apply_results;
 	}
 
 	assert(r->settings.fsmonitor_mode == FSMONITOR_MODE_HOOK);
@@ -323,6 +359,7 @@ void refresh_fsmonitor(struct index_state *istate)
 				 query_success ? "success" : "failure");
 	}
 
+apply_results:
 	/* a fsmonitor process can return '/' to indicate all entries are invalid */
 	if (query_success && query_result.buf[bol] != '/') {
 		/* Mark all entries returned by the monitor as dirty */
-- 
gitgitgadget



* [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (6 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 07/34] fsmonitor: use IPC to query the builtin FSMonitor daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:36       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 09/34] fsmonitor--daemon: implement 'stop' and 'status' commands Jeff Hostetler via GitGitGadget
                       ` (26 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create a built-in file system monitoring daemon that can be used by
the existing `fsmonitor` feature (protocol API and index extension)
to improve the performance of various Git commands, such as `status`.

The `fsmonitor--daemon` feature builds upon the `Simple IPC` API and
provides an alternative to hook-based access to existing file system
monitors such as Watchman.

This commit merely adds the new command without any functionality.
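
The shape of the new command at this point in the series can be
sketched in isolation. In this rough approximation, `daemon_entry()`
is a hypothetical stand-in for `cmd_fsmonitor__daemon()`, return
values replace Git's die() and usage helpers, and the usage text is
invented for illustration (the usage array in the patch is still
empty).

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for cmd_fsmonitor__daemon().  The return
 * values 129 and 128 follow Git's usual exit codes for usage errors
 * and fatal errors.
 */
#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
static int daemon_entry(int argc, const char **argv)
{
	if (argc < 2) {
		/* Invented usage text, for illustration only. */
		fprintf(stderr, "usage: git fsmonitor--daemon <subcommand>\n");
		return 129;
	}
	/* No subcommands exist yet, so every name is rejected. */
	fprintf(stderr, "fatal: Unhandled subcommand '%s'\n", argv[1]);
	return 128;
}
#else
static int daemon_entry(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	fprintf(stderr,
		"fatal: fsmonitor--daemon not supported on this platform\n");
	return 128;
}
#endif
```

Either way the command name is always registered, so callers get a
clear error on platforms without a backend rather than "not a git
command".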

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 .gitignore                  |  1 +
 Makefile                    |  1 +
 builtin.h                   |  1 +
 builtin/fsmonitor--daemon.c | 53 +++++++++++++++++++++++++++++++++++++
 git.c                       |  1 +
 5 files changed, 57 insertions(+)
 create mode 100644 builtin/fsmonitor--daemon.c

diff --git a/.gitignore b/.gitignore
index 311841f9bed..4baba472aa8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -72,6 +72,7 @@
 /git-format-patch
 /git-fsck
 /git-fsck-objects
+/git-fsmonitor--daemon
 /git-gc
 /git-get-tar-commit-id
 /git-grep
diff --git a/Makefile b/Makefile
index 209c97aa22d..8fe1e42a435 100644
--- a/Makefile
+++ b/Makefile
@@ -1097,6 +1097,7 @@ BUILTIN_OBJS += builtin/fmt-merge-msg.o
 BUILTIN_OBJS += builtin/for-each-ref.o
 BUILTIN_OBJS += builtin/for-each-repo.o
 BUILTIN_OBJS += builtin/fsck.o
+BUILTIN_OBJS += builtin/fsmonitor--daemon.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
diff --git a/builtin.h b/builtin.h
index 16ecd5586f0..2470d1cd3a2 100644
--- a/builtin.h
+++ b/builtin.h
@@ -159,6 +159,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
 int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
 int cmd_format_patch(int argc, const char **argv, const char *prefix);
 int cmd_fsck(int argc, const char **argv, const char *prefix);
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix);
 int cmd_gc(int argc, const char **argv, const char *prefix);
 int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
new file mode 100644
index 00000000000..df2bad53111
--- /dev/null
+++ b/builtin/fsmonitor--daemon.c
@@ -0,0 +1,53 @@
+#include "builtin.h"
+#include "config.h"
+#include "parse-options.h"
+#include "fsmonitor.h"
+#include "fsmonitor-ipc.h"
+#include "simple-ipc.h"
+#include "khash.h"
+
+static const char * const builtin_fsmonitor__daemon_usage[] = {
+	NULL
+};
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
+{
+	const char *subcmd;
+
+	struct option options[] = {
+		OPT_END()
+	};
+
+	if (argc < 2)
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	git_config(git_default_config, NULL);
+
+	subcmd = argv[1];
+	argv++;
+	argc--;
+
+	argc = parse_options(argc, argv, prefix, options,
+			     builtin_fsmonitor__daemon_usage, 0);
+
+	die(_("Unhandled subcommand '%s'"), subcmd);
+}
+
+#else
+int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
+{
+	struct option options[] = {
+		OPT_END()
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_fsmonitor__daemon_usage, options);
+
+	die(_("fsmonitor--daemon not supported on this platform"));
+}
+#endif
diff --git a/git.c b/git.c
index 18bed9a9964..c6160f4a886 100644
--- a/git.c
+++ b/git.c
@@ -533,6 +533,7 @@ static struct cmd_struct commands[] = {
 	{ "format-patch", cmd_format_patch, RUN_SETUP },
 	{ "fsck", cmd_fsck, RUN_SETUP },
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
+	{ "fsmonitor--daemon", cmd_fsmonitor__daemon, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id, NO_PARSEOPT },
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
-- 
gitgitgadget



* [PATCH v3 09/34] fsmonitor--daemon: implement 'stop' and 'status' commands
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (7 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 10/34] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
                       ` (25 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement `stop` and `status` client commands to control and query the
status of a `fsmonitor--daemon` server process (and implicitly start a
server process if necessary).

Later commits will implement the actual server and monitor the file
system.
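
The "send quit, then poll until the daemon stops listening" pattern
used by `stop` can be sketched on its own. The names below
(`wait_for_exit`, `countdown_state`, the `FAKE_IPC_*` states) are
hypothetical, and unlike the loop in this patch the sketch gives up
after a fixed number of polls, which is an addition for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

enum fake_ipc_state { FAKE_IPC_LISTENING, FAKE_IPC_NOT_LISTENING };

typedef enum fake_ipc_state (*state_fn)(void *data);

static void sleep_ms(long ms)
{
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	nanosleep(&ts, NULL);
}

/*
 * Poll until the (fake) daemon stops listening.  Returns the number
 * of sleeps performed, or -1 if it is still listening after
 * `max_polls` sleeps.
 */
static int wait_for_exit(state_fn get_state, void *data,
			 long interval_ms, int max_polls)
{
	int polls = 0;

	while (get_state(data) == FAKE_IPC_LISTENING) {
		if (polls == max_polls)
			return -1;
		sleep_ms(interval_ms);
		polls++;
	}
	return polls;
}

/* Fake daemon: reports "listening" for the next *data queries. */
static enum fake_ipc_state countdown_state(void *data)
{
	int *remaining = data;

	if (*remaining > 0) {
		(*remaining)--;
		return FAKE_IPC_LISTENING;
	}
	return FAKE_IPC_NOT_LISTENING;
}
```

A bound like `max_polls` is one way to keep a client from hanging if
the daemon never exits; whether that is desirable here is a design
choice this patch does not make.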

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 51 +++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index df2bad53111..62efd5ea787 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,10 +7,55 @@
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon stop"),
+	N_("git fsmonitor--daemon status"),
 	NULL
 };
 
 #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+/*
+ * Acting as a CLIENT.
+ *
+ * Send a "quit" command to the `git-fsmonitor--daemon` (if running)
+ * and wait for it to shut down.
+ */
+static int do_as_client__send_stop(void)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_command("quit", &answer);
+
+	/* The quit command does not return any response data. */
+	strbuf_release(&answer);
+
+	if (ret)
+		return ret;
+
+	trace2_region_enter("fsm_client", "polling-for-daemon-exit", NULL);
+	while (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		sleep_millisec(50);
+	trace2_region_leave("fsm_client", "polling-for-daemon-exit", NULL);
+
+	return 0;
+}
+
+static int do_as_client__status(void)
+{
+	enum ipc_active_state state = fsmonitor_ipc__get_state();
+
+	switch (state) {
+	case IPC_STATE__LISTENING:
+		printf(_("fsmonitor-daemon is watching '%s'\n"),
+		       the_repository->worktree);
+		return 0;
+
+	default:
+		printf(_("fsmonitor-daemon is not watching '%s'\n"),
+		       the_repository->worktree);
+		return 1;
+	}
+}
 
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
@@ -35,6 +80,12 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options,
 			     builtin_fsmonitor__daemon_usage, 0);
 
+	if (!strcmp(subcmd, "stop"))
+		return !!do_as_client__send_stop();
+
+	if (!strcmp(subcmd, "status"))
+		return !!do_as_client__status();
+
 	die(_("Unhandled subcommand '%s'"), subcmd);
 }
 
-- 
gitgitgadget



* [PATCH v3 10/34] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (8 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 09/34] fsmonitor--daemon: implement 'stop' and 'status' commands Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:41       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
                       ` (24 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create an IPC client to send query and flush commands to the daemon.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile                         |   1 +
 t/helper/test-fsmonitor-client.c | 121 +++++++++++++++++++++++++++++++
 t/helper/test-tool.c             |   1 +
 t/helper/test-tool.h             |   1 +
 4 files changed, 124 insertions(+)
 create mode 100644 t/helper/test-fsmonitor-client.c

diff --git a/Makefile b/Makefile
index 8fe1e42a435..c45caacf2c3 100644
--- a/Makefile
+++ b/Makefile
@@ -709,6 +709,7 @@ TEST_BUILTINS_OBJS += test-dump-split-index.o
 TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-fast-rebase.o
+TEST_BUILTINS_OBJS += test-fsmonitor-client.o
 TEST_BUILTINS_OBJS += test-genrandom.o
 TEST_BUILTINS_OBJS += test-genzeros.o
 TEST_BUILTINS_OBJS += test-hash-speed.o
diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
new file mode 100644
index 00000000000..f7a5b3a32fa
--- /dev/null
+++ b/t/helper/test-fsmonitor-client.c
@@ -0,0 +1,121 @@
+/*
+ * test-fsmonitor-client.c: client code to send commands/requests to
+ * a `git fsmonitor--daemon` daemon.
+ */
+
+#include "test-tool.h"
+#include "cache.h"
+#include "parse-options.h"
+#include "fsmonitor-ipc.h"
+
+#ifndef HAVE_FSMONITOR_DAEMON_BACKEND
+int cmd__fsmonitor_client(int argc, const char **argv)
+{
+	die("fsmonitor--daemon not available on this platform");
+}
+#else
+
+/*
+ * Read the `.git/index` to get the last token written to the
+ * FSMonitor Index Extension.
+ */
+static const char *get_token_from_index(void)
+{
+	struct index_state *istate = the_repository->index;
+
+	if (do_read_index(istate, the_repository->index_file, 0) < 0)
+		die("unable to read index file");
+	if (!istate->fsmonitor_last_update)
+		die("index file does not have fsmonitor extension");
+
+	return istate->fsmonitor_last_update;
+}
+
+/*
+ * Send an IPC query to a `git-fsmonitor--daemon` daemon and
+ * ask for the changes since the given token or from the last
+ * token in the index extension.
+ *
+ * This will implicitly start a daemon process if necessary.  The
+ * daemon process will persist after we exit.
+ */
+static int do_send_query(const char *token)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	if (!token || !*token)
+		token = get_token_from_index();
+
+	ret = fsmonitor_ipc__send_query(token, &answer);
+	if (ret < 0)
+		die(_("could not query fsmonitor--daemon"));
+
+	write_in_full(1, answer.buf, answer.len);
+	strbuf_release(&answer);
+
+	return 0;
+}
+
+/*
+ * Send a "flush" command to the `git-fsmonitor--daemon` (if running)
+ * and tell it to flush its cache.
+ *
+ * This feature is primarily used by the test suite to simulate a loss of
+ * sync with the filesystem where we miss kernel events.
+ */
+static int do_send_flush(void)
+{
+	struct strbuf answer = STRBUF_INIT;
+	int ret;
+
+	ret = fsmonitor_ipc__send_command("flush", &answer);
+	if (ret)
+		return ret;
+
+	write_in_full(1, answer.buf, answer.len);
+	strbuf_release(&answer);
+
+	return 0;
+}
+
+int cmd__fsmonitor_client(int argc, const char **argv)
+{
+	const char *subcmd;
+	const char *token = NULL;
+
+	const char * const fsmonitor_client_usage[] = {
+		N_("test-tool fsmonitor-client query [--token=<token>]"),
+		N_("test-tool fsmonitor-client flush"),
+		NULL,
+	};
+
+	struct option options[] = {
+		OPT_STRING(0, "token", &token, N_("token"),
+			   N_("command token to send to the server")),
+		OPT_END()
+	};
+
+	if (argc < 2)
+		usage_with_options(fsmonitor_client_usage, options);
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(fsmonitor_client_usage, options);
+
+	subcmd = argv[1];
+	argv++;
+	argc--;
+
+	argc = parse_options(argc, argv, NULL, options, fsmonitor_client_usage, 0);
+
+	setup_git_directory();
+
+	if (!strcmp(subcmd, "query"))
+		return !!do_send_query(token);
+
+	if (!strcmp(subcmd, "flush"))
+		return !!do_send_flush();
+
+	die("Unhandled subcommand: '%s'", subcmd);
+}
+#endif
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index c5bd0c6d4c7..af879e4a5d7 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -31,6 +31,7 @@ static struct test_cmd cmds[] = {
 	{ "dump-untracked-cache", cmd__dump_untracked_cache },
 	{ "example-decorate", cmd__example_decorate },
 	{ "fast-rebase", cmd__fast_rebase },
+	{ "fsmonitor-client", cmd__fsmonitor_client },
 	{ "genrandom", cmd__genrandom },
 	{ "genzeros", cmd__genzeros },
 	{ "hashmap", cmd__hashmap },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index e8069a3b222..6c5134b46d9 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -21,6 +21,7 @@ int cmd__dump_split_index(int argc, const char **argv);
 int cmd__dump_untracked_cache(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
+int cmd__fsmonitor_client(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);
 int cmd__hashmap(int argc, const char **argv);
-- 
gitgitgadget



* [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (9 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 10/34] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:45       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
                       ` (23 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Stub in an empty backend for fsmonitor--daemon on Windows.
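
The ctor/loop/dtor contract that each backend implements can be
sketched without threads. Everything below is hypothetical: the patch
selects a backend at link time rather than through a function table,
the real loop runs in its own listener thread, and skipping the dtor
after a failed ctor is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct fsmonitor_daemon_state. */
struct fake_state {
	int error_code;
	int loop_ran;
};

struct fake_backend {
	int (*ctor)(struct fake_state *state);
	void (*loop)(struct fake_state *state);
	void (*dtor)(struct fake_state *state);
};

/*
 * Drive one backend through its lifecycle.  Returns -1 if the ctor
 * fails, otherwise the error code left behind by the loop.
 */
static int run_backend(const struct fake_backend *be,
		       struct fake_state *state)
{
	if (be->ctor(state) < 0)
		return -1; /* assumed: nothing to clean up */
	be->loop(state);
	be->dtor(state);
	return state->error_code;
}

/* A stub like the Windows one in this patch: the ctor always fails. */
static int stub_ctor(struct fake_state *state) { (void)state; return -1; }
static void stub_loop(struct fake_state *state) { (void)state; }
static void stub_dtor(struct fake_state *state) { (void)state; }

/* A trivial working backend whose loop returns immediately. */
static int ok_ctor(struct fake_state *state) { (void)state; return 0; }
static void ok_loop(struct fake_state *state) { state->loop_ran = 1; }

static const struct fake_backend stub_backend = {
	stub_ctor, stub_loop, stub_dtor
};
static const struct fake_backend ok_backend = {
	ok_ctor, ok_loop, stub_dtor
};
```

With the stub's ctor returning -1, a daemon built on this backend would
refuse to start until a later patch fills in the real implementation.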

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile                                     | 13 ++++++
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
 compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
 config.mak.uname                             |  2 +
 contrib/buildsystems/CMakeLists.txt          |  5 ++
 5 files changed, 90 insertions(+)
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h

diff --git a/Makefile b/Makefile
index c45caacf2c3..a2a6e1f20f6 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,11 @@ all::
 # directory, and the JSON compilation database 'compile_commands.json' will be
 # created at the root of the repository.
 #
+# If your platform supports a built-in fsmonitor backend, set
+# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
+# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
+# `fsmonitor_fs_listen__*()` routines.
+#
 # Define DEVELOPER to enable more compiler warnings. Compiler version
 # and family are auto detected, but could be overridden by defining
 # COMPILER_FEATURES (see config.mak.dev). You can still set
@@ -1929,6 +1934,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
 	COMPAT_OBJS += compat/access.o
 endif
 
+ifdef FSMONITOR_DAEMON_BACKEND
+	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
+	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
+ifdef FSMONITOR_DAEMON_BACKEND
+	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
+endif
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
 endif
diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
new file mode 100644
index 00000000000..880446b49e3
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+#include "config.h"
+#include "fsmonitor.h"
+#include "fsmonitor-fs-listen.h"
+
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
+{
+}
+
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
+{
+	return -1;
+}
+
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
+{
+}
diff --git a/compat/fsmonitor/fsmonitor-fs-listen.h b/compat/fsmonitor/fsmonitor-fs-listen.h
new file mode 100644
index 00000000000..c7b5776b3b6
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen.h
@@ -0,0 +1,49 @@
+#ifndef FSMONITOR_FS_LISTEN_H
+#define FSMONITOR_FS_LISTEN_H
+
+/* This needs to be implemented by each backend */
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+struct fsmonitor_daemon_state;
+
+/*
+ * Initialize platform-specific data for the fsmonitor listener thread.
+ * This will be called from the main thread PRIOR to starting the
+ * fsmonitor_fs_listen thread.
+ *
+ * Returns 0 if successful.
+ * Returns -1 otherwise.
+ */
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state);
+
+/*
+ * Cleanup platform-specific data for the fsmonitor listener thread.
+ * This will be called from the main thread AFTER joining the listener.
+ */
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state);
+
+/*
+ * The main body of the platform-specific event loop to watch for
+ * filesystem events.  This will run in the fsmonitor_fs_listen thread.
+ *
+ * It should call `ipc_server_stop_async()` if the listener thread
+ * prematurely terminates (because of a filesystem error or if it
+ * detects that the .git directory has been deleted).  (It should NOT
+ * do so if the listener thread receives a normal shutdown signal from
+ * the IPC layer.)
+ *
+ * It should set `state->error_code` to -1 if the daemon should exit
+ * with an error.
+ */
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state);
+
+/*
+ * Gently request that the fsmonitor listener thread shutdown.
+ * It does not wait for it to stop.  The caller should do a JOIN
+ * to wait for it.
+ */
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state);
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_FS_LISTEN_H */
diff --git a/config.mak.uname b/config.mak.uname
index cb443b4e023..fcd88b60b14 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -420,6 +420,7 @@ ifeq ($(uname_S),Windows)
 	# so we don't need this:
 	#
 	#   SNPRINTF_RETURNS_BOGUS = YesPlease
+	FSMONITOR_DAEMON_BACKEND = win32
 	NO_SVN_TESTS = YesPlease
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
@@ -598,6 +599,7 @@ ifneq (,$(findstring MINGW,$(uname_S)))
 	NO_STRTOUMAX = YesPlease
 	NO_MKDTEMP = YesPlease
 	NO_SVN_TESTS = YesPlease
+	FSMONITOR_DAEMON_BACKEND = win32
 	RUNTIME_PREFIX = YesPlease
 	HAVE_WPGMPTR = YesWeDo
 	NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index a87841340e6..1ab94eb284f 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -263,6 +263,11 @@ else()
 	endif()
 endif()
 
+if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
+	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
+	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-win32.c)
+endif()
+
 set(EXE_EXTENSION ${CMAKE_EXECUTABLE_SUFFIX})
 
 #header checks
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 237+ messages in thread

* [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (10 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:49       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 13/34] fsmonitor--daemon: implement 'run' command Jeff Hostetler via GitGitGadget
                       ` (22 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Stub in an empty implementation of the fsmonitor--daemon
backend for macOS.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 20 ++++++++++++++++++++
 config.mak.uname                             |  2 ++
 contrib/buildsystems/CMakeLists.txt          |  3 +++
 3 files changed, 25 insertions(+)
 create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
new file mode 100644
index 00000000000..b91058d1c4f
--- /dev/null
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -0,0 +1,20 @@
+#include "cache.h"
+#include "fsmonitor.h"
+#include "fsmonitor-fs-listen.h"
+
+int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
+{
+	return -1;
+}
+
+void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
+{
+}
+
+void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
+{
+}
diff --git a/config.mak.uname b/config.mak.uname
index fcd88b60b14..394355463e1 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -147,6 +147,8 @@ ifeq ($(uname_S),Darwin)
 			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
 		endif
 	endif
+	FSMONITOR_DAEMON_BACKEND = macos
+	BASIC_LDFLAGS += -framework CoreServices
 endif
 ifeq ($(uname_S),SunOS)
 	NEEDS_SOCKET = YesPlease
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 1ab94eb284f..aa80671045a 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -266,6 +266,9 @@ endif()
 if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
 	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-win32.c)
+elseif(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
+	add_compile_definitions(HAVE_FSMONITOR_DAEMON_BACKEND)
+	list(APPEND compat_SOURCES compat/fsmonitor/fsmonitor-fs-listen-macos.c)
 endif()
 
 set(EXE_EXTENSION ${CMAKE_EXECUTABLE_SUFFIX})
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 237+ messages in thread

* [PATCH v3 13/34] fsmonitor--daemon: implement 'run' command
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (11 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command Jeff Hostetler via GitGitGadget
                       ` (21 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement `run` command to try to begin listening for file system events.

This version defines the thread structure: a single fsmonitor_fs_listen
thread watches for file system events, and a Simple IPC thread pool
watches for connections from Git clients over a well-known named pipe or
Unix domain socket.

This commit does not actually do anything yet because the platform
backends are still just stubs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
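
Note for readers: the thread structure described above can be sketched
in isolation.  This is a minimal sketch under stated assumptions, not
the daemon's implementation.  All `demo_*` names are hypothetical, a
condition variable stands in for the platform's filesystem event
source, and the point where the real daemon blocks in
`ipc_server_await()` is only marked by a comment.  It shows the
ordering the patch relies on: ctor on the main thread, loop on the
listener thread, stop_async followed by a join, then dtor.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct fsmonitor_daemon_state. */
struct demo_state {
	pthread_t listener_thread;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int stop_requested;
	int error_code;
};

/* Called on the main thread before the listener thread exists. */
static int demo_listen_ctor(struct demo_state *s)
{
	if (pthread_mutex_init(&s->lock, NULL))
		return -1;
	if (pthread_cond_init(&s->cond, NULL)) {
		pthread_mutex_destroy(&s->lock);
		return -1;
	}
	s->stop_requested = 0;
	s->error_code = 0;
	return 0;
}

/* Called on the main thread after the listener has been joined. */
static void demo_listen_dtor(struct demo_state *s)
{
	pthread_cond_destroy(&s->cond);
	pthread_mutex_destroy(&s->lock);
}

/* A real backend would block on filesystem events here instead. */
static void demo_listen_loop(struct demo_state *s)
{
	pthread_mutex_lock(&s->lock);
	while (!s->stop_requested)
		pthread_cond_wait(&s->cond, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

/* Requests shutdown without waiting; the caller joins afterwards. */
static void demo_listen_stop_async(struct demo_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->stop_requested = 1;
	pthread_cond_signal(&s->cond);
	pthread_mutex_unlock(&s->lock);
}

static void *demo_thread_proc(void *arg)
{
	demo_listen_loop(arg);
	return NULL;
}

static int demo_run(void)
{
	struct demo_state s;
	int err;

	if (demo_listen_ctor(&s))
		return -1;

	/* pthread_create() returns 0 on success, an error number otherwise. */
	if (pthread_create(&s.listener_thread, NULL, demo_thread_proc, &s)) {
		demo_listen_dtor(&s);
		return -1;
	}

	/* The real daemon would wait in ipc_server_await() at this point. */

	demo_listen_stop_async(&s);
	pthread_join(s.listener_thread, NULL);

	err = s.error_code;
	demo_listen_dtor(&s);
	return err;
}

/* Usage: the whole lifecycle should complete without an error code. */
static void demo_self_check(void)
{
	assert(demo_run() == 0);
}
```

Because the stop flag is set and tested under the same mutex, it does
not matter whether `demo_listen_stop_async()` runs before or after the
listener reaches its wait.
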
 builtin/fsmonitor--daemon.c | 210 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |  34 ++++++
 2 files changed, 243 insertions(+), 1 deletion(-)
 create mode 100644 fsmonitor--daemon.h

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 62efd5ea787..a265c962ccc 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -3,16 +3,39 @@
 #include "parse-options.h"
 #include "fsmonitor.h"
 #include "fsmonitor-ipc.h"
+#include "compat/fsmonitor/fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
 #include "simple-ipc.h"
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon run [<options>]"),
 	N_("git fsmonitor--daemon stop"),
 	N_("git fsmonitor--daemon status"),
 	NULL
 };
 
 #ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+/*
+ * Global state loaded from config.
+ */
+#define FSMONITOR__IPC_THREADS "fsmonitor.ipcthreads"
+static int fsmonitor__ipc_threads = 8;
+
+static int fsmonitor_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, FSMONITOR__IPC_THREADS)) {
+		int i = git_config_int(var, value);
+		if (i < 1)
+			return error(_("value of '%s' out of range: %d"),
+				     FSMONITOR__IPC_THREADS, i);
+		fsmonitor__ipc_threads = i;
+		return 0;
+	}
+
+	return git_default_config(var, value, cb);
+}
+
 /*
  * Acting as a CLIENT.
  *
@@ -57,11 +80,190 @@ static int do_as_client__status(void)
 	}
 }
 
+static ipc_server_application_cb handle_client;
+
+static int handle_client(void *data,
+			 const char *command, size_t command_len,
+			 ipc_server_reply_cb *reply,
+			 struct ipc_server_reply_data *reply_data)
+{
+	/* struct fsmonitor_daemon_state *state = data; */
+	int result;
+
+	/*
+	 * The Simple IPC API now supports {char*, len} arguments, but
+	 * FSMonitor always uses proper null-terminated strings, so
+	 * we can ignore the command_len argument.  (Trust, but verify.)
+	 */
+	if (command_len != strlen(command))
+		BUG("FSMonitor assumes text messages");
+
+	trace2_region_enter("fsmonitor", "handle_client", the_repository);
+	trace2_data_string("fsmonitor", the_repository, "request", command);
+
+	result = 0; /* TODO Do something here. */
+
+	trace2_region_leave("fsmonitor", "handle_client", the_repository);
+
+	return result;
+}
+
+static void *fsmonitor_fs_listen__thread_proc(void *_state)
+{
+	struct fsmonitor_daemon_state *state = _state;
+
+	trace2_thread_start("fsm-listen");
+
+	trace_printf_key(&trace_fsmonitor, "Watching: worktree '%s'",
+			 state->path_worktree_watch.buf);
+	if (state->nr_paths_watching > 1)
+		trace_printf_key(&trace_fsmonitor, "Watching: gitdir '%s'",
+				 state->path_gitdir_watch.buf);
+
+	fsmonitor_fs_listen__loop(state);
+
+	trace2_thread_exit();
+	return NULL;
+}
+
+static int fsmonitor_run_daemon_1(struct fsmonitor_daemon_state *state)
+{
+	struct ipc_server_opts ipc_opts = {
+		.nr_threads = fsmonitor__ipc_threads,
+
+		/*
+		 * We know that there are no other active threads yet,
+		 * so we can let the IPC layer temporarily chdir() if
+		 * it needs to when creating the server side of the
+		 * Unix domain socket.
+		 */
+		.uds_disallow_chdir = 0
+	};
+
+	/*
+	 * Start the IPC thread pool before we start the file system
+	 * event listener thread so that we have the IPC handle before
+	 * we need it.
+	 */
+	if (ipc_server_run_async(&state->ipc_server_data,
+				 fsmonitor_ipc__get_path(), &ipc_opts,
+				 handle_client, state))
+		return error(_("could not start IPC thread pool"));
+
+	/*
+	 * Start the fsmonitor listener thread to collect filesystem
+	 * events.
+	 */
+	if (pthread_create(&state->listener_thread, NULL,
+			   fsmonitor_fs_listen__thread_proc, state)) {
+		ipc_server_stop_async(state->ipc_server_data);
+		ipc_server_await(state->ipc_server_data);
+
+		return error(_("could not start fsmonitor listener thread"));
+	}
+
+	/*
+	 * The daemon is now fully functional in background threads.
+	 * Wait for the IPC thread pool to shutdown (whether by client
+	 * request or from filesystem activity).
+	 */
+	ipc_server_await(state->ipc_server_data);
+
+	/*
+	 * The fsmonitor listener thread may have received a shutdown
+	 * event from the IPC thread pool, but it doesn't hurt to tell
+	 * it again.  And wait for it to shutdown.
+	 */
+	fsmonitor_fs_listen__stop_async(state);
+	pthread_join(state->listener_thread, NULL);
+
+	return state->error_code;
+}
+
+static int fsmonitor_run_daemon(void)
+{
+	struct fsmonitor_daemon_state state;
+	int err;
+
+	memset(&state, 0, sizeof(state));
+
+	pthread_mutex_init(&state.main_lock, NULL);
+	state.error_code = 0;
+	state.current_token_data = NULL;
+
+	/* Prepare to (recursively) watch the <worktree-root> directory. */
+	strbuf_init(&state.path_worktree_watch, 0);
+	strbuf_addstr(&state.path_worktree_watch, absolute_path(get_git_work_tree()));
+	state.nr_paths_watching = 1;
+
+	/*
+	 * We create and delete cookie files somewhere inside the .git
+	 * directory to help us stay in sync with the file system.  If
+	 * ".git" is not a directory, then <gitdir> is not inside the
+	 * cone of <worktree-root>, so set up a second watch to watch
+	 * the <gitdir> so that we get events for the cookie files.
+	 */
+	strbuf_init(&state.path_gitdir_watch, 0);
+	strbuf_addbuf(&state.path_gitdir_watch, &state.path_worktree_watch);
+	strbuf_addstr(&state.path_gitdir_watch, "/.git");
+	if (!is_directory(state.path_gitdir_watch.buf)) {
+		strbuf_reset(&state.path_gitdir_watch);
+		strbuf_addstr(&state.path_gitdir_watch, absolute_path(get_git_dir()));
+		state.nr_paths_watching = 2;
+	}
+
+	/*
+	 * Confirm that we can create platform-specific resources for the
+	 * filesystem listener before we bother starting all the threads.
+	 */
+	if (fsmonitor_fs_listen__ctor(&state)) {
+		err = error(_("could not initialize listener thread"));
+		goto done;
+	}
+
+	err = fsmonitor_run_daemon_1(&state);
+
+done:
+	pthread_mutex_destroy(&state.main_lock);
+	fsmonitor_fs_listen__dtor(&state);
+
+	ipc_server_free(state.ipc_server_data);
+
+	strbuf_release(&state.path_worktree_watch);
+	strbuf_release(&state.path_gitdir_watch);
+
+	return err;
+}
+
+static int try_to_run_foreground_daemon(void)
+{
+	/*
+	 * Technically, we don't need to probe for an existing daemon
+	 * process, since we could just call `fsmonitor_run_daemon()`
+	 * and let it fail if the pipe/socket is busy.
+	 *
+	 * However, this method gives us a nicer error message for a
+	 * common error case.
+	 */
+	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		die("fsmonitor--daemon is already running '%s'",
+		    the_repository->worktree);
+
+	printf(_("running fsmonitor-daemon in '%s'\n"),
+	       the_repository->worktree);
+	fflush(stdout);
+
+	return !!fsmonitor_run_daemon();
+}
+
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
 	const char *subcmd;
 
 	struct option options[] = {
+		OPT_INTEGER(0, "ipc-threads",
+			    &fsmonitor__ipc_threads,
+			    N_("use <n> ipc worker threads")),
 		OPT_END()
 	};
 
@@ -71,7 +273,7 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 	if (argc == 2 && !strcmp(argv[1], "-h"))
 		usage_with_options(builtin_fsmonitor__daemon_usage, options);
 
-	git_config(git_default_config, NULL);
+	git_config(fsmonitor_config, NULL);
 
 	subcmd = argv[1];
 	argv--;
@@ -79,6 +281,12 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, options,
 			     builtin_fsmonitor__daemon_usage, 0);
+	if (fsmonitor__ipc_threads < 1)
+		die(_("invalid 'ipc-threads' value (%d)"),
+		    fsmonitor__ipc_threads);
+
+	if (!strcmp(subcmd, "run"))
+		return !!try_to_run_foreground_daemon();
 
 	if (!strcmp(subcmd, "stop"))
 		return !!do_as_client__send_stop();
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
new file mode 100644
index 00000000000..3009c1a83de
--- /dev/null
+++ b/fsmonitor--daemon.h
@@ -0,0 +1,34 @@
+#ifndef FSMONITOR_DAEMON_H
+#define FSMONITOR_DAEMON_H
+
+#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
+
+#include "cache.h"
+#include "dir.h"
+#include "run-command.h"
+#include "simple-ipc.h"
+#include "thread-utils.h"
+
+struct fsmonitor_batch;
+struct fsmonitor_token_data;
+
+struct fsmonitor_daemon_backend_data; /* opaque platform-specific data */
+
+struct fsmonitor_daemon_state {
+	pthread_t listener_thread;
+	pthread_mutex_t main_lock;
+
+	struct strbuf path_worktree_watch;
+	struct strbuf path_gitdir_watch;
+	int nr_paths_watching;
+
+	struct fsmonitor_token_data *current_token_data;
+
+	int error_code;
+	struct fsmonitor_daemon_backend_data *backend_data;
+
+	struct ipc_server_data *ipc_server_data;
+};
+
+#endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
+#endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 237+ messages in thread

* [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (12 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 13/34] fsmonitor--daemon: implement 'run' command Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:18       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 15/34] fsmonitor: do not try to operate on bare repos Jeff Hostetler via GitGitGadget
                       ` (20 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement the 'git fsmonitor--daemon start' command.  This command
tries to start a daemon in the background by spawning a background
process that runs it.

The updated daemon does not actually do anything yet because the
platform backends are still just stubs.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
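
Note for readers: the startup wait used by `start` can be sketched on
its own for the POSIX case.  This is a hedged sketch, not the code in
the patch.  The `demo_*` names and the readiness callback are
hypothetical; the callback stands in for the
`fsmonitor_ipc__get_state() == IPC_STATE__LISTENING` probe.  The short
sleep between polls is also an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical readiness probe: returns non-zero if a daemon is listening. */
typedef int (*demo_ready_fn)(void *data);

/*
 * Returns 0 once the probe succeeds.  Returns -1 if the child exits
 * before anyone is listening, if the deadline passes, or if waitpid()
 * fails.
 */
static int demo_wait_for_startup(pid_t child, int timeout_sec,
				 demo_ready_fn is_ready, void *data)
{
	time_t deadline = time(NULL) + timeout_sec;

	for (;;) {
		int status;
		pid_t seen = waitpid(child, &status, WNOHANG);

		if (seen == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (is_ready(data))
			return 0;	/* we do not care who is listening */
		if (seen == child)
			return -1;	/* child exited before coming online */
		if (time(NULL) > deadline)
			return -1;

		/* Nap for 10ms between polls (an assumption of this sketch). */
		nanosleep(&(struct timespec){ .tv_nsec = 10 * 1000 * 1000 }, NULL);
	}
}

static int demo_never_ready(void *data) { (void)data; return 0; }
static int demo_always_ready(void *data) { (void)data; return 1; }

/* Fork a child that sleeps for `sleep_sec` seconds and then exits. */
static pid_t demo_spawn_child(unsigned sleep_sec)
{
	pid_t pid = fork();

	if (pid == 0) {
		sleep(sleep_sec);
		_exit(0);
	}
	return pid;
}

/* Usage: one child that dies early, one that outlives the wait. */
static void demo_self_check(void)
{
	pid_t quick = demo_spawn_child(0);
	pid_t slow = demo_spawn_child(30);

	assert(quick > 0 && slow > 0);
	/* Child exits and nobody listens: reported as a failed start. */
	assert(demo_wait_for_startup(quick, 5, demo_never_ready, NULL) == -1);
	/* Child still running and the probe succeeds: reported as started. */
	assert(demo_wait_for_startup(slow, 5, demo_always_ready, NULL) == 0);

	kill(slow, SIGKILL);
	waitpid(slow, NULL, 0);
}
```

The probe is checked before the child's exit status so that a daemon
that was already running counts as success, matching the comments in
the patch below.
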
 builtin/fsmonitor--daemon.c | 208 ++++++++++++++++++++++++++++++++++++
 1 file changed, 208 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index a265c962ccc..7fcf960652f 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -9,6 +9,7 @@
 #include "khash.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
+	N_("git fsmonitor--daemon start [<options>]"),
 	N_("git fsmonitor--daemon run [<options>]"),
 	N_("git fsmonitor--daemon stop"),
 	N_("git fsmonitor--daemon status"),
@@ -22,6 +23,9 @@ static const char * const builtin_fsmonitor__daemon_usage[] = {
 #define FSMONITOR__IPC_THREADS "fsmonitor.ipcthreads"
 static int fsmonitor__ipc_threads = 8;
 
+#define FSMONITOR__START_TIMEOUT "fsmonitor.starttimeout"
+static int fsmonitor__start_timeout_sec = 60;
+
 static int fsmonitor_config(const char *var, const char *value, void *cb)
 {
 	if (!strcmp(var, FSMONITOR__IPC_THREADS)) {
@@ -33,6 +37,15 @@ static int fsmonitor_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, FSMONITOR__START_TIMEOUT)) {
+		int i = git_config_int(var, value);
+		if (i < 0)
+			return error(_("value of '%s' out of range: %d"),
+				     FSMONITOR__START_TIMEOUT, i);
+		fsmonitor__start_timeout_sec = i;
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -256,6 +269,194 @@ static int try_to_run_foreground_daemon(void)
 	return !!fsmonitor_run_daemon();
 }
 
+#ifdef GIT_WINDOWS_NATIVE
+/*
+ * Create a background process to run the daemon.  It should be completely
+ * disassociated from the terminal.
+ *
+ * Conceptually like `daemonize()` but different because Windows does not
+ * have `fork(2)`.  Spawn a normal Windows child process but without the
+ * limitations of `start_command()` and `finish_command()`.
+ *
+ * The child process execs the "git fsmonitor--daemon run" command.
+ *
+ * The current process returns so that the caller can wait for the child
+ * to startup before exiting.
+ */
+static int spawn_background_fsmonitor_daemon(pid_t *pid)
+{
+	char git_exe[MAX_PATH];
+	struct strvec args = STRVEC_INIT;
+	int in, out;
+
+	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
+
+	in = open("/dev/null", O_RDONLY);
+	out = open("/dev/null", O_WRONLY);
+
+	strvec_push(&args, git_exe);
+	strvec_push(&args, "fsmonitor--daemon");
+	strvec_push(&args, "run");
+	strvec_pushf(&args, "--ipc-threads=%d", fsmonitor__ipc_threads);
+
+	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
+	close(in);
+	close(out);
+
+	strvec_clear(&args);
+
+	if (*pid < 0)
+		return error(_("could not spawn fsmonitor--daemon in the background"));
+
+	return 0;
+}
+#else
+/*
+ * Create a background process to run the daemon.  It should be completely
+ * disassociated from the terminal.
+ *
+ * This is adapted from `daemonize()`.  Use `fork()` to directly
+ * create and run the daemon in the child process.
+ *
+ * The fork-child can just call the run code; it does not need to exec
+ * it.
+ *
+ * The fork-parent returns the child PID so that we can wait for the
+ * child to startup before exiting.
+ */
+static int spawn_background_fsmonitor_daemon(pid_t *pid)
+{
+	*pid = fork();
+
+	switch (*pid) {
+	case 0:
+		if (setsid() == -1)
+			error_errno(_("setsid failed"));
+		close(0);
+		close(1);
+		close(2);
+		sanitize_stdfds();
+
+		return !!fsmonitor_run_daemon();
+
+	case -1:
+		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
+
+	default:
+		return 0;
+	}
+}
+#endif
+
+/*
+ * This is adapted from `wait_or_whine()`.  Watch the child process and
+ * let it get started and begin listening for requests on the socket
+ * before reporting our success.
+ */
+static int wait_for_background_startup(pid_t pid_child)
+{
+	int status;
+	pid_t pid_seen;
+	enum ipc_active_state s;
+	time_t time_limit, now;
+
+	time(&time_limit);
+	time_limit += fsmonitor__start_timeout_sec;
+
+	for (;;) {
+		pid_seen = waitpid(pid_child, &status, WNOHANG);
+
+		if (pid_seen == -1)
+			return error_errno(_("waitpid failed"));
+		else if (pid_seen == 0) {
+			/*
+			 * The child is still running (this should be
+			 * the normal case).  Try to connect to it on
+			 * the socket and see if it is ready for
+			 * business.
+			 *
+			 * If there is another daemon already running,
+			 * our child will fail to start (possibly
+			 * after a timeout on the lock), but we don't
+			 * care (who responds) if the socket is live.
+			 */
+			s = fsmonitor_ipc__get_state();
+			if (s == IPC_STATE__LISTENING)
+				return 0;
+
+			time(&now);
+			if (now > time_limit)
+				return error(_("fsmonitor--daemon not online yet"));
+		} else if (pid_seen == pid_child) {
+			/*
+			 * The new child daemon process shut down while
+			 * it was starting up, so it is not listening
+			 * on the socket.
+			 *
+			 * Try to ping the socket in the odd chance
+			 * that another daemon started (or was already
+			 * running) while our child was starting.
+			 *
+			 * Again, we don't care who services the socket.
+			 */
+			s = fsmonitor_ipc__get_state();
+			if (s == IPC_STATE__LISTENING)
+				return 0;
+
+			/*
+			 * We don't care about the WEXITSTATUS() nor
+			 * any of the WIF*(status) values because
+			 * `cmd_fsmonitor__daemon()` does the `!!result`
+			 * trick on all function return values.
+			 *
+			 * So it is sufficient to just report the
+			 * early shutdown as an error.
+			 */
+			return error(_("fsmonitor--daemon failed to start"));
+		} else
+			return error(_("waitpid is confused"));
+	}
+}
+
+static int try_to_start_background_daemon(void)
+{
+	pid_t pid_child;
+	int ret;
+
+	/*
+	 * Before we try to create a background daemon process, see
+	 * if a daemon process is already listening.  This makes it
+	 * easier for us to report an already-listening error to the
+	 * console, since our spawn/daemon can only report the success
+	 * of creating the background process (and not whether it
+	 * immediately exited).
+	 */
+	if (fsmonitor_ipc__get_state() == IPC_STATE__LISTENING)
+		die("fsmonitor--daemon is already running '%s'",
+		    the_repository->worktree);
+
+	printf(_("starting fsmonitor-daemon in '%s'\n"),
+	       the_repository->worktree);
+	fflush(stdout);
+
+	/*
+	 * Run the actual daemon in a background process.
+	 */
+	ret = spawn_background_fsmonitor_daemon(&pid_child);
+	if (pid_child <= 0)
+		return ret;
+
+	/*
+	 * Wait (with timeout) for the background child process to get
+	 * started and begin listening on the socket/pipe.  This makes
+	 * the "start" command more synchronous and more reliable in
+	 * tests.
+	 */
+	ret = wait_for_background_startup(pid_child);
+
+	return ret;
+}
+
 int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 {
 	const char *subcmd;
@@ -264,6 +465,10 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 		OPT_INTEGER(0, "ipc-threads",
 			    &fsmonitor__ipc_threads,
 			    N_("use <n> ipc worker threads")),
+		OPT_INTEGER(0, "start-timeout",
+			    &fsmonitor__start_timeout_sec,
+			    N_("max seconds to wait for background daemon startup")),
+
 		OPT_END()
 	};
 
@@ -285,6 +490,9 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 		die(_("invalid 'ipc-threads' value (%d)"),
 		    fsmonitor__ipc_threads);
 
+	if (!strcmp(subcmd, "start"))
+		return !!try_to_start_background_daemon();
+
 	if (!strcmp(subcmd, "run"))
 		return !!try_to_run_foreground_daemon();
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 237+ messages in thread

* [PATCH v3 15/34] fsmonitor: do not try to operate on bare repos
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (13 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:53       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 16/34] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
                       ` (19 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Bare repos do not have a working directory, so there is no
directory for the daemon to register a watch on, and therefore
no files for it to watch.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c |  8 ++++++++
 t/t7519-status-fsmonitor.sh | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 7fcf960652f..d6161ad95a5 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -490,6 +490,14 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
 		die(_("invalid 'ipc-threads' value (%d)"),
 		    fsmonitor__ipc_threads);
 
+	prepare_repo_settings(the_repository);
+	if (!the_repository->worktree)
+		return error(_("fsmonitor-daemon does not support bare repos '%s'"),
+			     xgetcwd());
+	if (the_repository->settings.fsmonitor_mode == FSMONITOR_MODE_INCOMPATIBLE)
+		return error(_("fsmonitor-daemon is incompatible with this repo '%s'"),
+			     the_repository->worktree);
+
 	if (!strcmp(subcmd, "start"))
 		return !!try_to_start_background_daemon();
 
diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
index 02919c68ddd..ed20a4f7fb9 100755
--- a/t/t7519-status-fsmonitor.sh
+++ b/t/t7519-status-fsmonitor.sh
@@ -394,6 +394,22 @@ test_expect_success 'incompatible bare repo' '
 	test_cmp expect actual
 '
 
+test_expect_success FSMONITOR_DAEMON 'try running fsmonitor-daemon in bare repo' '
+	test_when_finished "rm -rf ./bare-clone" &&
+	git clone --bare . ./bare-clone &&
+	test_must_fail git -C ./bare-clone fsmonitor--daemon run 2>actual &&
+	grep "fsmonitor-daemon does not support bare repos" actual
+'
+
+test_expect_success FSMONITOR_DAEMON 'try running fsmonitor-daemon in virtual repo' '
+	test_when_finished "rm -rf ./fake-virtual-clone" &&
+	git clone . ./fake-virtual-clone &&
+	test_must_fail git -C ./fake-virtual-clone \
+			   -c core.virtualfilesystem=true \
+			   fsmonitor--daemon run 2>actual &&
+	grep "fsmonitor-daemon is incompatible with this repo" actual
+'
+
 test_expect_success 'incompatible core.virtualfilesystem' '
 	test_when_finished "rm -rf ./fake-gvfs-clone" &&
 	git clone . ./fake-gvfs-clone &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 237+ messages in thread

* [PATCH v3 16/34] fsmonitor--daemon: add pathname classification
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (14 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 15/34] fsmonitor: do not try to operate on bare repos Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 17/34] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
                       ` (18 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to classify relative and absolute
pathnames and decide how they should be handled.  This will
be used by the platform-specific backend to respond to each
filesystem event.

When we register for filesystem notifications on a directory,
we get events for everything (recursively) in the directory.
We want to report changes to tracked and untracked paths within
the working directory to clients.  We do not want to report
changes within the .git directory, for example.

This classification will be used in a later commit by the
different backends to classify paths as events are received.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
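
Note for readers: the worktree-relative classification can be tried in
isolation.  This is a simplified sketch, not the patch's code.  The
`demo_*` and `DEMO_*` names are hypothetical, and plain `strncmp()`
replaces `fspathncmp()`, so the sketch assumes a case-sensitive
filesystem.

```c
#include <assert.h>
#include <string.h>

enum demo_path_type {
	DEMO_WORKDIR_PATH,
	DEMO_DOT_GIT,
	DEMO_INSIDE_DOT_GIT,
	DEMO_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX
};

#define DEMO_COOKIE_PREFIX ".fsmonitor-daemon-"

/* Classify a path given relative to the root of the working directory. */
static enum demo_path_type demo_classify_workdir_relative(const char *rel)
{
	/* Anything that does not start with ".git" is a worktree path. */
	if (strncmp(rel, ".git", 4))
		return DEMO_WORKDIR_PATH;
	rel += 4;

	if (!*rel)
		return DEMO_DOT_GIT;		/* exactly ".git" */
	if (*rel != '/')
		return DEMO_WORKDIR_PATH;	/* e.g. ".gitignore" */
	rel++;

	/* Cookie files live directly under ".git/" with a fixed prefix. */
	if (!strncmp(rel, DEMO_COOKIE_PREFIX, strlen(DEMO_COOKIE_PREFIX)))
		return DEMO_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;

	return DEMO_INSIDE_DOT_GIT;
}

/* Usage: spot checks for the cases the classification distinguishes. */
static void demo_self_check(void)
{
	assert(demo_classify_workdir_relative("src/main.c") == DEMO_WORKDIR_PATH);
	assert(demo_classify_workdir_relative(".gitignore") == DEMO_WORKDIR_PATH);
	assert(demo_classify_workdir_relative(".git") == DEMO_DOT_GIT);
	assert(demo_classify_workdir_relative(".git/index") == DEMO_INSIDE_DOT_GIT);
	assert(demo_classify_workdir_relative(".git/.fsmonitor-daemon-1") ==
	       DEMO_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX);
}
```

Only paths in the first category would be reported to clients; the
others either feed the daemon's own bookkeeping or are dropped.
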
 builtin/fsmonitor--daemon.c | 81 +++++++++++++++++++++++++++++++++++++
 fsmonitor--daemon.h         | 61 ++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index d6161ad95a5..e942f7c5840 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -121,6 +121,87 @@ static int handle_client(void *data,
 	return result;
 }
 
+#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
+
+enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
+	const char *rel)
+{
+	if (fspathncmp(rel, ".git", 4))
+		return IS_WORKDIR_PATH;
+	rel += 4;
+
+	if (!*rel)
+		return IS_DOT_GIT;
+	if (*rel != '/')
+		return IS_WORKDIR_PATH; /* e.g. .gitignore */
+	rel++;
+
+	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
+			strlen(FSMONITOR_COOKIE_PREFIX)))
+		return IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX;
+
+	return IS_INSIDE_DOT_GIT;
+}
+
+enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
+	const char *rel)
+{
+	if (!fspathncmp(rel, FSMONITOR_COOKIE_PREFIX,
+			strlen(FSMONITOR_COOKIE_PREFIX)))
+		return IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX;
+
+	return IS_INSIDE_GITDIR;
+}
+
+static enum fsmonitor_path_type try_classify_workdir_abs_path(
+	struct fsmonitor_daemon_state *state,
+	const char *path)
+{
+	const char *rel;
+
+	if (fspathncmp(path, state->path_worktree_watch.buf,
+		       state->path_worktree_watch.len))
+		return IS_OUTSIDE_CONE;
+
+	rel = path + state->path_worktree_watch.len;
+
+	if (!*rel)
+		return IS_WORKDIR_PATH; /* it is the root dir exactly */
+	if (*rel != '/')
+		return IS_OUTSIDE_CONE;
+	rel++;
+
+	return fsmonitor_classify_path_workdir_relative(rel);
+}
+
+enum fsmonitor_path_type fsmonitor_classify_path_absolute(
+	struct fsmonitor_daemon_state *state,
+	const char *path)
+{
+	const char *rel;
+	enum fsmonitor_path_type t;
+
+	t = try_classify_workdir_abs_path(state, path);
+	if (state->nr_paths_watching == 1)
+		return t;
+	if (t != IS_OUTSIDE_CONE)
+		return t;
+
+	if (fspathncmp(path, state->path_gitdir_watch.buf,
+		       state->path_gitdir_watch.len))
+		return IS_OUTSIDE_CONE;
+
+	rel = path + state->path_gitdir_watch.len;
+
+	if (!*rel)
+		return IS_GITDIR; /* it is the <gitdir> exactly */
+	if (*rel != '/')
+		return IS_OUTSIDE_CONE;
+	rel++;
+
+	return fsmonitor_classify_path_gitdir_relative(rel);
+}
+
 static void *fsmonitor_fs_listen__thread_proc(void *_state)
 {
 	struct fsmonitor_daemon_state *state = _state;
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 3009c1a83de..7bbb3a27a1c 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -30,5 +30,66 @@ struct fsmonitor_daemon_state {
 	struct ipc_server_data *ipc_server_data;
 };
 
+/*
+ * Pathname classifications.
+ *
+ * The daemon classifies the pathnames that it receives from file
+ * system notification events into the following categories and uses
+ * that to decide whether clients are told about them.  (And to watch
+ * for file system synchronization events.)
+ *
+ * The client should only care about paths within the working
+ * directory proper (inside the working directory and not ".git" nor
+ * inside of ".git/").  That is, the client has read the index and is
+ * asking for a list of any paths in the working directory that have
+ * been modified since the last token.  The client does not care about
+ * file system changes within the .git directory (such as new loose
+ * objects or packfiles).  So the client will only receive paths that
+ * are classified as IS_WORKDIR_PATH.
+ *
+ * The daemon uses IS_DOT_GIT and IS_GITDIR internally to mean the
+ * exact ".git" directory or GITDIR.  For example, if the daemon
+ * receives a delete event for either of these directories, it will
+ * automatically shut down.
+ *
+ * Note that the daemon DOES NOT explicitly watch nor special case the
+ * ".git/index" file.  The daemon does not read the index and does not
+ * have any internal index-relative state.  The daemon only collects
+ * the set of modified paths within the working directory.
+ */
+enum fsmonitor_path_type {
+	IS_WORKDIR_PATH = 0,
+
+	IS_DOT_GIT,
+	IS_INSIDE_DOT_GIT,
+	IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX,
+
+	IS_GITDIR,
+	IS_INSIDE_GITDIR,
+	IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX,
+
+	IS_OUTSIDE_CONE,
+};
+
+/*
+ * Classify a pathname relative to the root of the working directory.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
+	const char *relative_path);
+
+/*
+ * Classify a pathname relative to a <gitdir> that is external to the
+ * worktree directory.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_gitdir_relative(
+	const char *relative_path);
+
+/*
+ * Classify an absolute pathname received from a filesystem event.
+ */
+enum fsmonitor_path_type fsmonitor_classify_path_absolute(
+	struct fsmonitor_daemon_state *state,
+	const char *path);
+
 #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
 #endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget


* [PATCH v3 17/34] fsmonitor--daemon: define token-ids
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (15 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 16/34] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 22:58       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 18/34] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
                       ` (17 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to create token-ids and define the
overall token naming scheme.
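
To make the naming scheme concrete, the sketch below formats and
splits a token of the form "builtin:<token_id>:<sequence_nr>".  The
helper names are hypothetical and do not appear in this patch; the
sketch only assumes the layout described in the code comment.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write "builtin:<token_id>:<seq_nr>" into buf. */
int format_token(char *buf, size_t len, const char *token_id,
		 uint64_t seq_nr)
{
	return snprintf(buf, len, "builtin:%s:%" PRIu64, token_id, seq_nr);
}

/*
 * Hypothetical helper: split a token into its id and sequence number.
 * Returns 0 on success and -1 if the token is not in the builtin
 * format.  The sequence number is taken after the LAST ':'.
 */
int parse_token(const char *token, char *id, size_t id_len,
		uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const char *p, *colon;
	char *end;
	size_t n;

	assert(token && id && seq_nr);

	if (strncmp(token, prefix, strlen(prefix)))
		return -1;              /* some other provider's token */
	p = token + strlen(prefix);

	colon = strrchr(p, ':');
	if (!colon || colon == p || !colon[1])
		return -1;              /* missing id or sequence number */
	n = (size_t)(colon - p);
	if (n >= id_len)
		return -1;              /* id does not fit in caller's buffer */

	*seq_nr = strtoull(colon + 1, &end, 10);
	if (*end)
		return -1;              /* trailing junk after the number */

	memcpy(id, p, n);
	id[n] = '\0';
	return 0;
}

/* A token_id is only compared for equality, never ordered. */
int token_id_is_current(const char *client_id, const char *current_id)
{
	return !strcmp(client_id, current_id);
}
```

A client whose token fails token_id_is_current() would be the "stale
token_id" case that receives a trivial response.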

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 116 +++++++++++++++++++++++++++++++++++-
 1 file changed, 115 insertions(+), 1 deletion(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index e942f7c5840..e991925fafc 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -93,6 +93,120 @@ static int do_as_client__status(void)
 	}
 }
 
+/*
+ * Requests to and from a FSMonitor Protocol V2 provider use an opaque
+ * "token" as a virtual timestamp.  Clients can request a summary of all
+ * created/deleted/modified files relative to a token.  In the response,
+ * clients receive a new token for the next (relative) request.
+ *
+ *
+ * Token Format
+ * ============
+ *
+ * The contents of the token are private and provider-specific.
+ *
+ * For the built-in fsmonitor--daemon, we define a token as follows:
+ *
+ *     "builtin" ":" <token_id> ":" <sequence_nr>
+ *
+ * The "builtin" prefix is used as a namespace to avoid conflicts
+ * with other providers (such as Watchman).
+ *
+ * The <token_id> is an arbitrary OPAQUE string, such as a GUID,
+ * UUID, or {timestamp,pid}.  It is used to group all filesystem
+ * events that happened while the daemon was monitoring (and in-sync
+ * with the filesystem).
+ *
+ *     Unlike FSMonitor Protocol V1, it is not defined as a timestamp
+ *     and does not define less-than/greater-than relationships.
+ *     (There are too many race conditions to rely on file system
+ *     event timestamps.)
+ *
+ * The <sequence_nr> is a simple integer incremented whenever the
+ * daemon needs to make its state public.  For example, if 1000 file
+ * system events come in, but no clients have requested the data,
+ * the daemon can continue to accumulate file changes in the same
+ * bin and does not need to advance the sequence number.  However,
+ * as soon as a client does arrive, the daemon needs to start a new
+ * bin and increment the sequence number.
+ *
+ *     The sequence number serves as the boundary between 2 sets
+ *     of bins -- the older ones that the client has already seen
+ *     and the newer ones that it hasn't.
+ *
+ * When a new <token_id> is created, the <sequence_nr> is reset to
+ * zero.
+ *
+ *
+ * About Token Ids
+ * ===============
+ *
+ * A new token_id is created:
+ *
+ * [1] each time the daemon is started.
+ *
+ * [2] any time that the daemon must re-sync with the filesystem
+ *     (such as when the kernel drops events or we miss them on a
+ *     very active volume).
+ *
+ * [3] in response to a client "flush" command (for dropped event
+ *     testing).
+ *
+ * When a new token_id is created, the daemon is free to discard all
+ * cached filesystem events associated with any previous token_ids.
+ * Events associated with a non-current token_id will never be sent
+ * to a client.  A token_id change implicitly means that the daemon
+ * has a gap in its event history.
+ *
+ * Therefore, clients that present a token with a stale (non-current)
+ * token_id will always be given a trivial response.
+ */
+struct fsmonitor_token_data {
+	struct strbuf token_id;
+	struct fsmonitor_batch *batch_head;
+	struct fsmonitor_batch *batch_tail;
+	uint64_t client_ref_count;
+};
+
+static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
+{
+	static int test_env_value = -1;
+	static uint64_t flush_count = 0;
+	struct fsmonitor_token_data *token;
+
+	CALLOC_ARRAY(token, 1);
+
+	strbuf_init(&token->token_id, 0);
+	token->batch_head = NULL;
+	token->batch_tail = NULL;
+	token->client_ref_count = 0;
+
+	if (test_env_value < 0)
+		test_env_value = git_env_bool("GIT_TEST_FSMONITOR_TOKEN", 0);
+
+	if (!test_env_value) {
+		struct timeval tv;
+		struct tm tm;
+		time_t secs;
+
+		gettimeofday(&tv, NULL);
+		secs = tv.tv_sec;
+		gmtime_r(&secs, &tm);
+
+		strbuf_addf(&token->token_id,
+			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
+			    flush_count++,
+			    getpid(),
+			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
+			    tm.tm_hour, tm.tm_min, tm.tm_sec,
+			    (long)tv.tv_usec);
+	} else {
+		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);
+	}
+
+	return token;
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data,
@@ -283,7 +397,7 @@ static int fsmonitor_run_daemon(void)
 
 	pthread_mutex_init(&state.main_lock, NULL);
 	state.error_code = 0;
-	state.current_token_data = NULL;
+	state.current_token_data = fsmonitor_new_token_data();
 
 	/* Prepare to (recursively) watch the <worktree-root> directory. */
 	strbuf_init(&state.path_worktree_watch, 0);
-- 
gitgitgadget


* [PATCH v3 18/34] fsmonitor--daemon: create token-based changed path cache
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (16 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 17/34] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
                       ` (16 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to build a list of changed paths and associate
them with a token-id.  This will be used by the platform-specific
backends to accumulate changed paths in response to filesystem events.

The platform-specific file system listener thread receives file system
events containing one or more changed pathnames (with whatever bucketing
or grouping that is convenient for the file system).  These paths are
accumulated (without locking) by the file system layer into a `fsmonitor_batch`.

When the file system layer has drained the kernel event queue, it will
"publish" them to our token queue and make them visible to concurrent
client worker threads.  The token layer is free to combine and/or de-dup
paths within these batches for efficient presentation to clients.
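
The publish step boils down to four cases on the current head batch.
The following standalone sketch shows that decision in isolation.  The
struct and helper names are simplified, hypothetical stand-ins rather
than the ones in this patch, locking is left to the caller, and
batch_free() here releases the struct as well as its path array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define COMBINE_LIMIT 1024

/* Hypothetical, simplified stand-in for struct fsmonitor_batch. */
struct batch {
	struct batch *next;
	uint64_t seq_nr;
	const char **paths;   /* borrowed (interned) strings, not owned */
	size_t nr, alloc;
	time_t pinned_time;   /* nonzero once a client has seen it */
};

struct batch *batch_new(void)
{
	struct batch *b = calloc(1, sizeof(*b));
	assert(b);
	return b;
}

void batch_add(struct batch *b, const char *path)
{
	if (b->nr == b->alloc) {
		b->alloc = b->alloc ? 2 * b->alloc : 16;
		b->paths = realloc(b->paths, b->alloc * sizeof(*b->paths));
		assert(b->paths);
	}
	b->paths[b->nr++] = path;
}

void batch_free(struct batch *b)
{
	free(b->paths);
	free(b);
}

static void prepend(struct batch **head, struct batch *batch)
{
	batch->seq_nr = (*head)->seq_nr + 1;
	batch->next = *head;
	*head = batch;
}

/* Publish `batch` onto the list at *head; the list takes ownership. */
void publish(struct batch **head, struct batch *batch)
{
	struct batch *h = *head;
	size_t k;

	assert(h && batch);

	if (h->pinned_time) {
		/* head was (or is being) sent to a client: never modify it */
		prepend(head, batch);
	} else if (!h->seq_nr) {
		/* unpinned batch 0: no client can ask for these paths */
		batch_free(batch);
	} else if (h->nr + batch->nr > COMBINE_LIMIT) {
		/* combining would exceed the arbitrary size limit */
		prepend(head, batch);
	} else {
		/* fold the new paths into the unpinned head */
		for (k = 0; k < batch->nr; k++)
			batch_add(h, batch->paths[k]);
		batch_free(batch);
	}
}
```

Only the prepend cases advance the sequence number, which keeps the
list short while no client is asking for data.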

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 234 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |  40 ++++++
 2 files changed, 272 insertions(+), 2 deletions(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index e991925fafc..ea3a52d34e3 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -168,17 +168,27 @@ struct fsmonitor_token_data {
 	uint64_t client_ref_count;
 };
 
+struct fsmonitor_batch {
+	struct fsmonitor_batch *next;
+	uint64_t batch_seq_nr;
+	const char **interned_paths;
+	size_t nr, alloc;
+	time_t pinned_time;
+};
+
 static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
 {
 	static int test_env_value = -1;
 	static uint64_t flush_count = 0;
 	struct fsmonitor_token_data *token;
+	struct fsmonitor_batch *batch;
 
 	CALLOC_ARRAY(token, 1);
+	batch = fsmonitor_batch__new();
 
 	strbuf_init(&token->token_id, 0);
-	token->batch_head = NULL;
-	token->batch_tail = NULL;
+	token->batch_head = batch;
+	token->batch_tail = batch;
 	token->client_ref_count = 0;
 
 	if (test_env_value < 0)
@@ -204,9 +214,147 @@ static struct fsmonitor_token_data *fsmonitor_new_token_data(void)
 		strbuf_addf(&token->token_id, "test_%08x", test_env_value++);
 	}
 
+	/*
+	 * We created a new <token_id> and are starting a new series
+	 * of tokens with a zero <seq_nr>.
+	 *
+	 * Since clients cannot guess our new (non test) <token_id>
+	 * they will always receive a trivial response (because of the
+	 * mismatch on the <token_id>).  The trivial response will
+	 * tell them our new <token_id> so that subsequent requests
+	 * will be relative to our new series.  (And when sending that
+	 * response, we pin the current head of the batch list.)
+	 *
+	 * Even if the client correctly guesses the <token_id>, their
+	 * request of "builtin:<token_id>:0" asks for all changes MORE
+	 * RECENT than batch/bin 0.
+	 *
+	 * This implies that it is a waste to accumulate paths in the
+	 * initial batch/bin (because they will never be transmitted).
+	 *
+	 * So the daemon could be running for days and watching the
+	 * file system, but doesn't need to actually accumulate any
+	 * paths UNTIL we need to set a reference point for a later
+	 * relative request.
+	 *
+	 * However, it is very useful for testing to always have a
+	 * reference point set.  Pin batch 0 to force early file system
+	 * events to accumulate.
+	 */
+	if (test_env_value)
+		batch->pinned_time = time(NULL);
+
 	return token;
 }
 
+struct fsmonitor_batch *fsmonitor_batch__new(void)
+{
+	struct fsmonitor_batch *batch;
+
+	CALLOC_ARRAY(batch, 1);
+
+	return batch;
+}
+
+struct fsmonitor_batch *fsmonitor_batch__pop(struct fsmonitor_batch *batch)
+{
+	struct fsmonitor_batch *next;
+
+	if (!batch)
+		return NULL;
+
+	next = batch->next;
+
+	/*
+	 * The actual strings within the array are interned, so we don't
+	 * own them.
+	 */
+	free(batch->interned_paths);
+
+	return next;
+}
+
+void fsmonitor_batch__add_path(struct fsmonitor_batch *batch,
+			       const char *path)
+{
+	const char *interned_path = strintern(path);
+
+	trace_printf_key(&trace_fsmonitor, "event: %s", interned_path);
+
+	ALLOC_GROW(batch->interned_paths, batch->nr + 1, batch->alloc);
+	batch->interned_paths[batch->nr++] = interned_path;
+}
+
+static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
+				     const struct fsmonitor_batch *batch_src)
+{
+	size_t k;
+
+	ALLOC_GROW(batch_dest->interned_paths,
+		   batch_dest->nr + batch_src->nr + 1,
+		   batch_dest->alloc);
+
+	for (k = 0; k < batch_src->nr; k++)
+		batch_dest->interned_paths[batch_dest->nr++] =
+			batch_src->interned_paths[k];
+}
+
+static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
+{
+	struct fsmonitor_batch *p;
+
+	if (!token)
+		return;
+
+	assert(token->client_ref_count == 0);
+
+	strbuf_release(&token->token_id);
+
+	for (p = token->batch_head; p; p = fsmonitor_batch__pop(p))
+		;
+
+	free(token);
+}
+
+/*
+ * Flush all of our cached data about the filesystem.  Call this if we
+ * lose sync with the filesystem and miss some notification events.
+ *
+ * [1] If we are missing events, then we no longer have a complete
+ *     history of the directory (relative to our current start token).
+ *     We should create a new token and start fresh (as if we just
+ *     booted up).
+ *
+ * If there are no concurrent threads reading the current token data
+ * series, we can free it now.  Otherwise, let the last reader free
+ * it.
+ *
+ * Either way, the old token data series is no longer associated with
+ * our state data.
+ */
+static void with_lock__do_force_resync(struct fsmonitor_daemon_state *state)
+{
+	/* assert current thread holding state->main_lock */
+
+	struct fsmonitor_token_data *free_me = NULL;
+	struct fsmonitor_token_data *new_one = NULL;
+
+	new_one = fsmonitor_new_token_data();
+
+	if (state->current_token_data->client_ref_count == 0)
+		free_me = state->current_token_data;
+	state->current_token_data = new_one;
+
+	fsmonitor_free_token_data(free_me);
+}
+
+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
+{
+	pthread_mutex_lock(&state->main_lock);
+	with_lock__do_force_resync(state);
+	pthread_mutex_unlock(&state->main_lock);
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data,
@@ -316,6 +464,81 @@ enum fsmonitor_path_type fsmonitor_classify_path_absolute(
 	return fsmonitor_classify_path_gitdir_relative(rel);
 }
 
+/*
+ * We try to combine small batches at the front of the batch-list to avoid
+ * having a long list.  This hopefully makes it a little easier when we want
+ * to truncate and maintain the list.  However, we don't want the paths array
+ * to just keep growing and growing with realloc, so we insert an arbitrary
+ * limit.
+ */
+#define MY_COMBINE_LIMIT (1024)
+
+void fsmonitor_publish(struct fsmonitor_daemon_state *state,
+		       struct fsmonitor_batch *batch,
+		       const struct string_list *cookie_names)
+{
+	if (!batch && !cookie_names->nr)
+		return;
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (batch) {
+		struct fsmonitor_batch *head;
+
+		head = state->current_token_data->batch_head;
+		if (!head) {
+			BUG("token does not have batch");
+		} else if (head->pinned_time) {
+			/*
+			 * We cannot alter the current batch list
+			 * because:
+			 *
+			 * [a] it is being transmitted to at least one
+			 * client and the handle_client() thread has a
+			 * ref-count, but not a lock on the batch list
+			 * starting with this item.
+			 *
+			 * [b] it has been transmitted in the past to
+			 * at least one client such that future
+			 * requests are relative to this head batch.
+			 *
+			 * So, we can only prepend a new batch onto
+			 * the front of the list.
+			 */
+			batch->batch_seq_nr = head->batch_seq_nr + 1;
+			batch->next = head;
+			state->current_token_data->batch_head = batch;
+		} else if (!head->batch_seq_nr) {
+			/*
+			 * Batch 0 is unpinned.  See the note in
+			 * `fsmonitor_new_token_data()` about why we
+			 * don't need to accumulate these paths.
+			 */
+			fsmonitor_batch__pop(batch);
+		} else if (head->nr + batch->nr > MY_COMBINE_LIMIT) {
+			/*
+			 * The head batch in the list has never been
+			 * transmitted to a client, but folding the
+			 * contents of the new batch onto it would
+			 * exceed our arbitrary limit, so just prepend
+			 * the new batch onto the list.
+			 */
+			batch->batch_seq_nr = head->batch_seq_nr + 1;
+			batch->next = head;
+			state->current_token_data->batch_head = batch;
+		} else {
+			/*
+			 * We are free to append the paths in the given
+			 * batch onto the end of the current head batch.
+			 */
+			fsmonitor_batch__combine(head, batch);
+			fsmonitor_batch__pop(batch);
+		}
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+}
+
 static void *fsmonitor_fs_listen__thread_proc(void *_state)
 {
 	struct fsmonitor_daemon_state *state = _state;
@@ -330,6 +553,13 @@ static void *fsmonitor_fs_listen__thread_proc(void *_state)
 
 	fsmonitor_fs_listen__loop(state);
 
+	pthread_mutex_lock(&state->main_lock);
+	if (state->current_token_data &&
+	    state->current_token_data->client_ref_count == 0)
+		fsmonitor_free_token_data(state->current_token_data);
+	state->current_token_data = NULL;
+	pthread_mutex_unlock(&state->main_lock);
+
 	trace2_thread_exit();
 	return NULL;
 }
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 7bbb3a27a1c..89a9ef20b24 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -12,6 +12,27 @@
 struct fsmonitor_batch;
 struct fsmonitor_token_data;
 
+/*
+ * Create a new batch of path(s).  The returned batch is considered
+ * private and not linked into the fsmonitor daemon state.  The caller
+ * should fill this batch with one or more paths and then publish it.
+ */
+struct fsmonitor_batch *fsmonitor_batch__new(void);
+
+/*
+ * Free this batch and return the value of the batch->next field.
+ */
+struct fsmonitor_batch *fsmonitor_batch__pop(struct fsmonitor_batch *batch);
+
+/*
+ * Add this path to this batch of modified files.
+ *
+ * The batch should be private and NOT (yet) linked into the fsmonitor
+ * daemon state and therefore not yet visible to worker threads and so
+ * no locking is required.
+ */
+void fsmonitor_batch__add_path(struct fsmonitor_batch *batch, const char *path);
+
 struct fsmonitor_daemon_backend_data; /* opaque platform-specific data */
 
 struct fsmonitor_daemon_state {
@@ -91,5 +112,24 @@ enum fsmonitor_path_type fsmonitor_classify_path_absolute(
 	struct fsmonitor_daemon_state *state,
 	const char *path);
 
+/*
+ * Prepend this batch of path(s) onto the list of batches associated
+ * with the current token.  This makes the batch visible to worker threads.
+ *
+ * The caller no longer owns the batch and must not free it.
+ *
+ * Wake up the client threads waiting on these cookies.
+ */
+void fsmonitor_publish(struct fsmonitor_daemon_state *state,
+		       struct fsmonitor_batch *batch,
+		       const struct string_list *cookie_names);
+
+/*
+ * If the platform-specific layer loses sync with the filesystem,
+ * it should call this to invalidate cached data and abort waiting
+ * threads.
+ */
+void fsmonitor_force_resync(struct fsmonitor_daemon_state *state);
+
 #endif /* HAVE_FSMONITOR_DAEMON_BACKEND */
 #endif /* FSMONITOR_DAEMON_H */
-- 
gitgitgadget


* [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (17 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 18/34] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 23:02       ` Ævar Arnfjörð Bjarmason
  2021-07-06 19:09       ` Johannes Schindelin
  2021-07-01 14:47     ` [PATCH v3 20/34] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
                       ` (15 subsequent siblings)
  34 siblings, 2 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach the win32 backend to register a watch on the working tree
root directory (recursively).  Also watch the <gitdir> if it is
not inside the working tree.  Collect path change notifications
into batches and publish them.
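
Each completed read returns a buffer of variable-length records that
are chained by a "next entry offset" field, with zero marking the last
record.  The sketch below walks such a buffer in portable C.  It is
only an approximation for illustration: struct notify_header is a
hypothetical stand-in for FILE_NOTIFY_INFORMATION, names are stored as
plain bytes rather than WCHARs, and append_record() exists only so the
example can build its own input.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fixed part of a notification record. */
struct notify_header {
	uint32_t next_entry_offset; /* bytes to the next record, 0 = last */
	uint32_t action;
	uint32_t name_len;          /* bytes; the name is not NUL-terminated */
};

struct record_writer {
	char *buf;
	size_t cap, used, last;     /* `last` = offset of the previous record */
};

/* Append one record and link the previous record to it. */
int append_record(struct record_writer *w, uint32_t action,
		  const char *name)
{
	struct notify_header h = { 0, action, (uint32_t)strlen(name) };
	size_t start = (w->used + 3) & ~(size_t)3; /* 4-byte alignment */

	if (start + sizeof(h) + h.name_len > w->cap)
		return -1;
	if (w->used) {
		uint32_t off = (uint32_t)(start - w->last);
		memcpy(w->buf + w->last, &off, sizeof(off));
	}
	memcpy(w->buf + start, &h, sizeof(h));
	memcpy(w->buf + start + sizeof(h), name, h.name_len);
	w->last = start;
	w->used = start + sizeof(h) + h.name_len;
	return 0;
}

typedef void (*record_fn)(const char *name, uint32_t name_len, void *ctx);

/* Visit every record in a non-empty buffer; returns the record count. */
size_t walk_records(const char *buf, record_fn fn, void *ctx)
{
	const char *p = buf;
	size_t n = 0;

	assert(buf && fn);

	for (;;) {
		struct notify_header h;

		memcpy(&h, p, sizeof(h)); /* avoids unaligned access */
		fn(p + sizeof(h), h.name_len, ctx);
		n++;
		if (!h.next_entry_offset)
			break;
		p += h.next_entry_offset;
	}
	return n;
}

/* Example callback: sum the lengths of all names seen. */
void count_name_bytes(const char *name, uint32_t name_len, void *ctx)
{
	(void)name;
	*(size_t *)ctx += name_len;
}
```

A caller must handle the "zero bytes returned" overflow case before
walking, since an empty buffer contains no valid first record.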

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
 1 file changed, 530 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
index 880446b49e3..d707d47a0d7 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -2,20 +2,550 @@
 #include "config.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
+
+/*
+ * The documentation of ReadDirectoryChangesW() states that the maximum
+ * buffer size is 64K when the monitored directory is remote.
+ *
+ * Larger buffers may be used when the monitored directory is local and
+ * will help us receive events faster from the kernel and avoid dropped
+ * events.
+ *
+ * So we try to use a very large buffer and silently fall back to 64K if
+ * we get an error.
+ */
+#define MAX_RDCW_BUF_FALLBACK (65536)
+#define MAX_RDCW_BUF          (65536 * 8)
+
+struct one_watch
+{
+	char buffer[MAX_RDCW_BUF];
+	DWORD buf_len;
+	DWORD count;
+
+	struct strbuf path;
+	HANDLE hDir;
+	HANDLE hEvent;
+	OVERLAPPED overlapped;
+
+	/*
+	 * Is there an active ReadDirectoryChangesW() call pending?  If so, we
+	 * need to later call GetOverlappedResult() and possibly CancelIoEx().
+	 */
+	BOOL is_active;
+};
+
+struct fsmonitor_daemon_backend_data
+{
+	struct one_watch *watch_worktree;
+	struct one_watch *watch_gitdir;
+
+	HANDLE hEventShutdown;
+
+	HANDLE hListener[3]; /* we don't own these handles */
+#define LISTENER_SHUTDOWN 0
+#define LISTENER_HAVE_DATA_WORKTREE 1
+#define LISTENER_HAVE_DATA_GITDIR 2
+	int nr_listener_handles;
+};
+
+/*
+ * Convert the WCHAR path from the notification into UTF8 and
+ * then normalize it.
+ */
+static int normalize_path_in_utf8(FILE_NOTIFY_INFORMATION *info,
+				  struct strbuf *normalized_path)
+{
+	int reserve;
+	int len = 0;
+
+	strbuf_reset(normalized_path);
+	if (!info->FileNameLength)
+		goto normalize;
+
+	/*
+	 * Pre-reserve enough space in the UTF8 buffer for
+	 * each Unicode WCHAR character to be mapped into a
+	 * sequence of 2 UTF8 bytes.  That should let us
+	 * avoid ERROR_INSUFFICIENT_BUFFER 99.9+% of the time.
+	 */
+	reserve = info->FileNameLength + 1;
+	strbuf_grow(normalized_path, reserve);
+
+	for (;;) {
+		len = WideCharToMultiByte(CP_UTF8, 0, info->FileName,
+					  info->FileNameLength / sizeof(WCHAR),
+					  normalized_path->buf,
+					  strbuf_avail(normalized_path) - 1,
+					  NULL, NULL);
+		if (len > 0)
+			goto normalize;
+		if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
+			error("[GLE %ld] could not convert path to UTF-8: '%.*ls'",
+			      GetLastError(),
+			      (int)(info->FileNameLength / sizeof(WCHAR)),
+			      info->FileName);
+			return -1;
+		}
+
+		strbuf_grow(normalized_path,
+			    strbuf_avail(normalized_path) + reserve);
+	}
+
+normalize:
+	strbuf_setlen(normalized_path, len);
+	return strbuf_normalize_path(normalized_path);
+}
 
 void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
 {
+	SetEvent(state->backend_data->hListener[LISTENER_SHUTDOWN]);
+}
+
+static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
+				      const char *path)
+{
+	struct one_watch *watch = NULL;
+	DWORD desired_access = FILE_LIST_DIRECTORY;
+	DWORD share_mode =
+		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
+	HANDLE hDir;
+
+	hDir = CreateFileA(path,
+			   desired_access, share_mode, NULL, OPEN_EXISTING,
+			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
+			   NULL);
+	if (hDir == INVALID_HANDLE_VALUE) {
+		error(_("[GLE %ld] could not watch '%s'"),
+		      GetLastError(), path);
+		return NULL;
+	}
+
+	CALLOC_ARRAY(watch, 1);
+
+	watch->buf_len = sizeof(watch->buffer); /* assume full MAX_RDCW_BUF */
+
+	strbuf_init(&watch->path, 0);
+	strbuf_addstr(&watch->path, path);
+
+	watch->hDir = hDir;
+	watch->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
+
+	return watch;
+}
+
+static void destroy_watch(struct one_watch *watch)
+{
+	if (!watch)
+		return;
+
+	strbuf_release(&watch->path);
+	if (watch->hDir != INVALID_HANDLE_VALUE)
+		CloseHandle(watch->hDir);
+	if (watch->hEvent != INVALID_HANDLE_VALUE)
+		CloseHandle(watch->hEvent);
+
+	free(watch);
+}
+
+static int start_rdcw_watch(struct fsmonitor_daemon_backend_data *data,
+			    struct one_watch *watch)
+{
+	DWORD dwNotifyFilter =
+		FILE_NOTIFY_CHANGE_FILE_NAME |
+		FILE_NOTIFY_CHANGE_DIR_NAME |
+		FILE_NOTIFY_CHANGE_ATTRIBUTES |
+		FILE_NOTIFY_CHANGE_SIZE |
+		FILE_NOTIFY_CHANGE_LAST_WRITE |
+		FILE_NOTIFY_CHANGE_CREATION;
+
+	ResetEvent(watch->hEvent);
+
+	memset(&watch->overlapped, 0, sizeof(watch->overlapped));
+	watch->overlapped.hEvent = watch->hEvent;
+
+start_watch:
+	/*
+	 * Queue an async call using Overlapped IO.  This returns immediately.
+	 * Our event handle will be signalled when the real result is available.
+	 *
+	 * The return value here just means that we successfully queued it.
+	 * We won't know if the Read...() actually produces data until later.
+	 */
+	watch->is_active = ReadDirectoryChangesW(
+		watch->hDir, watch->buffer, watch->buf_len, TRUE,
+		dwNotifyFilter, &watch->count, &watch->overlapped, NULL);
+
+	/*
+	 * The kernel throws an invalid parameter error when our buffer
+	 * is too big and we are pointed at a remote directory (and possibly
+	 * for other reasons).  Quietly set it down and try again.
+	 *
+	 * See note about MAX_RDCW_BUF at the top.
+	 */
+	if (!watch->is_active &&
+	    GetLastError() == ERROR_INVALID_PARAMETER &&
+	    watch->buf_len > MAX_RDCW_BUF_FALLBACK) {
+		watch->buf_len = MAX_RDCW_BUF_FALLBACK;
+		goto start_watch;
+	}
+
+	if (watch->is_active)
+		return 0;
+
+	error("ReadDirectoryChangesW failed on '%s' [GLE %ld]",
+	      watch->path.buf, GetLastError());
+	return -1;
+}
+
+static int recv_rdcw_watch(struct one_watch *watch)
+{
+	watch->is_active = FALSE;
+
+	/*
+	 * The overlapped result is ready.  If the Read...() was successful
+	 * we finally receive the actual result into our buffer.
+	 */
+	if (GetOverlappedResult(watch->hDir, &watch->overlapped, &watch->count,
+				TRUE))
+		return 0;
+
+	/*
+	 * NEEDSWORK: If an external <gitdir> is deleted, the above
+	 * returns an error.  I'm not sure that there's anything that
+	 * we can do here other than failing -- the <worktree>/.git
+	 * link file would be broken anyway.  We might try to check
+	 * for that and return a better error message, but I'm not
+	 * sure it is worth it.
+	 */
+
+	error("GetOverlappedResult failed on '%s' [GLE %ld]",
+	      watch->path.buf, GetLastError());
+	return -1;
+}
+
+static void cancel_rdcw_watch(struct one_watch *watch)
+{
+	DWORD count;
+
+	if (!watch || !watch->is_active)
+		return;
+
+	/*
+	 * The calls to ReadDirectoryChangesW() and GetOverlappedResult()
+	 * form a "pair" (my term) where we queue an IO and promise to
+	 * hang around and wait for the kernel to give us the result.
+	 *
+	 * If for some reason after we queue the IO, we have to quit
+	 * or otherwise not stick around for the second half, we must
+	 * tell the kernel to abort the IO.  This prevents the kernel
+	 * from writing to our buffer and/or signalling our event
+	 * after we free them.
+	 *
+	 * (Ask me how much fun it was to track that one down).
+	 */
+	CancelIoEx(watch->hDir, &watch->overlapped);
+	GetOverlappedResult(watch->hDir, &watch->overlapped, &count, TRUE);
+	watch->is_active = FALSE;
+}
+
+/*
+ * Process filesystem events that happen anywhere (recursively) under the
+ * <worktree> root directory.  For a normal working directory, this includes
+ * both version controlled files and the contents of the .git/ directory.
+ *
+ * If <worktree>/.git is a file, then we only see events for the file
+ * itself.
+ */
+static int process_worktree_events(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	struct one_watch *watch = data->watch_worktree;
+	struct strbuf path = STRBUF_INIT;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	struct fsmonitor_batch *batch = NULL;
+	const char *p = watch->buffer;
+
+	/*
+	 * If the kernel gets more events than will fit in the kernel
+	 * buffer associated with our RDCW handle, it drops them and
+	 * returns a count of zero.
+	 *
+	 * Yes, the call returns WITHOUT error and with length zero.
+	 *
+	 * (The "overflow" case is not ambiguous with the "no data" case
+	 * because we did an INFINITE wait.)
+	 *
+	 * This means we have a gap in coverage.  Tell the daemon layer
+	 * to resync.
+	 */
+	if (!watch->count) {
+		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
+				   "overflow");
+		fsmonitor_force_resync(state);
+		return LISTENER_HAVE_DATA_WORKTREE;
+	}
+
+	/*
+	 * On Windows, `info` contains an "array" of paths that are
+	 * relative to the root of whichever directory handle received
+	 * the event.
+	 */
+	for (;;) {
+		FILE_NOTIFY_INFORMATION *info = (void *)p;
+		const char *slash;
+		enum fsmonitor_path_type t;
+
+		strbuf_reset(&path);
+		if (normalize_path_in_utf8(info, &path) == -1)
+			goto skip_this_path;
+
+		t = fsmonitor_classify_path_workdir_relative(path.buf);
+
+		switch (t) {
+		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+			/* special case cookie files within .git */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path.buf);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path.buf);
+			break;
+
+		case IS_INSIDE_DOT_GIT:
+			/* ignore everything inside of "<worktree>/.git/" */
+			break;
+
+		case IS_DOT_GIT:
+			/* "<worktree>/.git" was deleted (or renamed away) */
+			if ((info->Action == FILE_ACTION_REMOVED) ||
+			    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
+				trace2_data_string("fsmonitor", NULL,
+						   "fsm-listen/dotgit",
+						   "removed");
+				goto force_shutdown;
+			}
+			break;
+
+		case IS_WORKDIR_PATH:
+			/* queue normal pathname */
+			if (!batch)
+				batch = fsmonitor_batch__new();
+			fsmonitor_batch__add_path(batch, path.buf);
+			break;
+
+		case IS_GITDIR:
+		case IS_INSIDE_GITDIR:
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+		default:
+			BUG("unexpected path classification '%d' for '%s'",
+			    t, path.buf);
+		}
+
+skip_this_path:
+		if (!info->NextEntryOffset)
+			break;
+		p += info->NextEntryOffset;
+	}
+
+	fsmonitor_publish(state, batch, &cookie_list);
+	batch = NULL;
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_HAVE_DATA_WORKTREE;
+
+force_shutdown:
+	fsmonitor_batch__pop(batch);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_SHUTDOWN;
+}
+
+/*
+ * Process filesystem events that happened anywhere (recursively) under the
+ * external <gitdir> (such as non-primary worktrees or submodules).
+ * We only care about cookie files that our client threads created here.
+ *
+ * Note that we DO NOT get filesystem events on the external <gitdir>
+ * itself (it is not inside something that we are watching).  In particular,
+ * we do not get an event if the external <gitdir> is deleted.
+ */
+static int process_gitdir_events(struct fsmonitor_daemon_state *state)
+{
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	struct one_watch *watch = data->watch_gitdir;
+	struct strbuf path = STRBUF_INIT;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	const char *p = watch->buffer;
+
+	if (!watch->count) {
+		trace2_data_string("fsmonitor", NULL, "fsm-listen/kernel",
+				   "overflow");
+		fsmonitor_force_resync(state);
+		return LISTENER_HAVE_DATA_GITDIR;
+	}
+
+	for (;;) {
+		FILE_NOTIFY_INFORMATION *info = (void *)p;
+		const char *slash;
+		enum fsmonitor_path_type t;
+
+		strbuf_reset(&path);
+		if (normalize_path_in_utf8(info, &path) == -1)
+			goto skip_this_path;
+
+		t = fsmonitor_classify_path_gitdir_relative(path.buf);
+
+		switch (t) {
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+			/* special case cookie files within gitdir */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path.buf);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path.buf);
+			break;
+
+		case IS_INSIDE_GITDIR:
+			goto skip_this_path;
+
+		default:
+			BUG("unexpected path classification '%d' for '%s'",
+			    t, path.buf);
+		}
+
+skip_this_path:
+		if (!info->NextEntryOffset)
+			break;
+		p += info->NextEntryOffset;
+	}
+
+	fsmonitor_publish(state, NULL, &cookie_list);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&path);
+	return LISTENER_HAVE_DATA_GITDIR;
 }
 
 void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	DWORD dwWait;
+
+	state->error_code = 0;
+
+	if (start_rdcw_watch(data, data->watch_worktree) == -1)
+		goto force_error_stop;
+
+	if (data->watch_gitdir &&
+	    start_rdcw_watch(data, data->watch_gitdir) == -1)
+		goto force_error_stop;
+
+	for (;;) {
+		dwWait = WaitForMultipleObjects(data->nr_listener_handles,
+						data->hListener,
+						FALSE, INFINITE);
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_WORKTREE) {
+			if (recv_rdcw_watch(data->watch_worktree) == -1)
+				goto force_error_stop;
+			if (process_worktree_events(state) == LISTENER_SHUTDOWN)
+				goto force_shutdown;
+			if (start_rdcw_watch(data, data->watch_worktree) == -1)
+				goto force_error_stop;
+			continue;
+		}
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_HAVE_DATA_GITDIR) {
+			if (recv_rdcw_watch(data->watch_gitdir) == -1)
+				goto force_error_stop;
+			if (process_gitdir_events(state) == LISTENER_SHUTDOWN)
+				goto force_shutdown;
+			if (start_rdcw_watch(data, data->watch_gitdir) == -1)
+				goto force_error_stop;
+			continue;
+		}
+
+		if (dwWait == WAIT_OBJECT_0 + LISTENER_SHUTDOWN)
+			goto clean_shutdown;
+
+		error(_("could not read directory changes [GLE %ld]"),
+		      GetLastError());
+		goto force_error_stop;
+	}
+
+force_error_stop:
+	state->error_code = -1;
+
+force_shutdown:
+	/*
+	 * Tell the IPC thread pool to stop (which completes the await
+	 * in the main thread (which will also signal this thread (if
+	 * we are still alive))).
+	 */
+	ipc_server_stop_async(state->ipc_server_data);
+
+clean_shutdown:
+	cancel_rdcw_watch(data->watch_worktree);
+	cancel_rdcw_watch(data->watch_gitdir);
 }
 
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	CALLOC_ARRAY(data, 1);
+
+	data->hEventShutdown = CreateEvent(NULL, TRUE, FALSE, NULL);
+
+	data->watch_worktree = create_watch(state,
+					    state->path_worktree_watch.buf);
+	if (!data->watch_worktree)
+		goto failed;
+
+	if (state->nr_paths_watching > 1) {
+		data->watch_gitdir = create_watch(state,
+						  state->path_gitdir_watch.buf);
+		if (!data->watch_gitdir)
+			goto failed;
+	}
+
+	data->hListener[LISTENER_SHUTDOWN] = data->hEventShutdown;
+	data->nr_listener_handles++;
+
+	data->hListener[LISTENER_HAVE_DATA_WORKTREE] =
+		data->watch_worktree->hEvent;
+	data->nr_listener_handles++;
+
+	if (data->watch_gitdir) {
+		data->hListener[LISTENER_HAVE_DATA_GITDIR] =
+			data->watch_gitdir->hEvent;
+		data->nr_listener_handles++;
+	}
+
+	state->backend_data = data;
+	return 0;
+
+failed:
+	CloseHandle(data->hEventShutdown);
+	destroy_watch(data->watch_worktree);
+	destroy_watch(data->watch_gitdir);
+
 	return -1;
 }
 
 void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	if (!state || !state->backend_data)
+		return;
+
+	data = state->backend_data;
+
+	CloseHandle(data->hEventShutdown);
+	destroy_watch(data->watch_worktree);
+	destroy_watch(data->watch_gitdir);
+
+	FREE_AND_NULL(state->backend_data);
 }
-- 
gitgitgadget



* [PATCH v3 20/34] fsmonitor-fs-listen-macos: add macos header files for FSEvent
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (18 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 21/34] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
                       ` (14 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Include MacOS system declarations to allow us to use FSEvent and
CoreFoundation APIs.  We need GCC and clang versions because of
compiler and header file conflicts.

While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:

In file included
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Security.h:42,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/CSIdentity.h:43,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/OSServices.h:29,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/IconsCore.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/LaunchServices.h:23,
   from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
     /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Authorization.h:193:7: error: variably modified 'bytes' at file scope
       193 | char bytes[kAuthorizationExternalFormLength];
           |      ^~~~~

The underlying reason is that GCC (rightly) objects that
`kAuthorizationExternalFormLength`, as declared in that header, is not
an integer constant expression and therefore cannot be used to define
the size of a C array at file scope.

This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082

In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid the above-mentioned issue.

Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.
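
As a rough illustration of the "only for GCC" selection, here is a minimal
sketch. One wrinkle to keep in mind is that clang also defines `__GNUC__`
for compatibility, so a check meant to single out GCC typically tests
`__clang__` as well. The macro `USING_LOCAL_FSEVENT_DECLS` and the helper
`fsevent_decl_source()` are hypothetical names used only for this example;
they are not part of the patch.

```c
/*
 * Hypothetical sketch (not part of the patch): decide at compile time
 * whether the hand-written FSEvents declarations would be used.
 * clang defines __GNUC__ too, so it has to be excluded explicitly.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define USING_LOCAL_FSEVENT_DECLS 1
#else
#define USING_LOCAL_FSEVENT_DECLS 0
#endif

/* Report which set of declarations this translation unit would see. */
static const char *fsevent_decl_source(void)
{
	return USING_LOCAL_FSEVENT_DECLS ? "local declarations"
					 : "system headers";
}
```

Built with clang, `fsevent_decl_source()` reports the system headers; built
with GCC, it reports the local declarations.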

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 96 ++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
index b91058d1c4f..bec5130d9e1 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -1,3 +1,99 @@
+#if defined(__GNUC__) && !defined(__clang__)
+/*
+ * It is possible to #include CoreFoundation/CoreFoundation.h when compiling
+ * with clang, but not with GCC as of time of writing.
+ *
+ * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082 for details.
+ */
+typedef unsigned int FSEventStreamCreateFlags;
+#define kFSEventStreamEventFlagNone               0x00000000
+#define kFSEventStreamEventFlagMustScanSubDirs    0x00000001
+#define kFSEventStreamEventFlagUserDropped        0x00000002
+#define kFSEventStreamEventFlagKernelDropped      0x00000004
+#define kFSEventStreamEventFlagEventIdsWrapped    0x00000008
+#define kFSEventStreamEventFlagHistoryDone        0x00000010
+#define kFSEventStreamEventFlagRootChanged        0x00000020
+#define kFSEventStreamEventFlagMount              0x00000040
+#define kFSEventStreamEventFlagUnmount            0x00000080
+#define kFSEventStreamEventFlagItemCreated        0x00000100
+#define kFSEventStreamEventFlagItemRemoved        0x00000200
+#define kFSEventStreamEventFlagItemInodeMetaMod   0x00000400
+#define kFSEventStreamEventFlagItemRenamed        0x00000800
+#define kFSEventStreamEventFlagItemModified       0x00001000
+#define kFSEventStreamEventFlagItemFinderInfoMod  0x00002000
+#define kFSEventStreamEventFlagItemChangeOwner    0x00004000
+#define kFSEventStreamEventFlagItemXattrMod       0x00008000
+#define kFSEventStreamEventFlagItemIsFile         0x00010000
+#define kFSEventStreamEventFlagItemIsDir          0x00020000
+#define kFSEventStreamEventFlagItemIsSymlink      0x00040000
+#define kFSEventStreamEventFlagOwnEvent           0x00080000
+#define kFSEventStreamEventFlagItemIsHardlink     0x00100000
+#define kFSEventStreamEventFlagItemIsLastHardlink 0x00200000
+#define kFSEventStreamEventFlagItemCloned         0x00400000
+
+typedef struct __FSEventStream *FSEventStreamRef;
+typedef const FSEventStreamRef ConstFSEventStreamRef;
+
+typedef unsigned int CFStringEncoding;
+#define kCFStringEncodingUTF8 0x08000100
+
+typedef const struct __CFString *CFStringRef;
+typedef const struct __CFArray *CFArrayRef;
+typedef const struct __CFRunLoop *CFRunLoopRef;
+
+struct FSEventStreamContext {
+    long long version;
+    void *cb_data, *retain, *release, *copy_description;
+};
+
+typedef struct FSEventStreamContext FSEventStreamContext;
+typedef unsigned int FSEventStreamEventFlags;
+#define kFSEventStreamCreateFlagNoDefer 0x02
+#define kFSEventStreamCreateFlagWatchRoot 0x04
+#define kFSEventStreamCreateFlagFileEvents 0x10
+
+typedef unsigned long long FSEventStreamEventId;
+#define kFSEventStreamEventIdSinceNow 0xFFFFFFFFFFFFFFFFULL
+
+typedef void (*FSEventStreamCallback)(ConstFSEventStreamRef streamRef,
+				      void *context,
+				      __SIZE_TYPE__ num_of_events,
+				      void *event_paths,
+				      const FSEventStreamEventFlags event_flags[],
+				      const FSEventStreamEventId event_ids[]);
+typedef double CFTimeInterval;
+FSEventStreamRef FSEventStreamCreate(void *allocator,
+				     FSEventStreamCallback callback,
+				     FSEventStreamContext *context,
+				     CFArrayRef paths_to_watch,
+				     FSEventStreamEventId since_when,
+				     CFTimeInterval latency,
+				     FSEventStreamCreateFlags flags);
+CFStringRef CFStringCreateWithCString(void *allocator, const char *string,
+				      CFStringEncoding encoding);
+CFArrayRef CFArrayCreate(void *allocator, const void **items, long long count,
+			 void *callbacks);
+void CFRunLoopRun(void);
+void CFRunLoopStop(CFRunLoopRef run_loop);
+CFRunLoopRef CFRunLoopGetCurrent(void);
+extern CFStringRef kCFRunLoopDefaultMode;
+void FSEventStreamScheduleWithRunLoop(FSEventStreamRef stream,
+				      CFRunLoopRef run_loop,
+				      CFStringRef run_loop_mode);
+unsigned char FSEventStreamStart(FSEventStreamRef stream);
+void FSEventStreamStop(FSEventStreamRef stream);
+void FSEventStreamInvalidate(FSEventStreamRef stream);
+void FSEventStreamRelease(FSEventStreamRef stream);
+#else
+/*
+ * Let Apple's headers declare `isalnum()` first, before
+ * Git's headers override it via a constant
+ */
+#include <string.h>
+#include <CoreFoundation/CoreFoundation.h>
+#include <CoreServices/CoreServices.h>
+#endif
+
 #include "cache.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
-- 
gitgitgadget



* [PATCH v3 21/34] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (19 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 20/34] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 22/34] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
                       ` (13 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Implement a file system event listener on MacOS using FSEvent,
CoreFoundation, and CoreServices.
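
To make the event-flag handling in the callback easier to follow, here is a
small standalone sketch of two of its decisions: treating an event as
"dropped", and queueing a path under both a file and a directory spelling
when the kernel has coalesced events. The flag values are copied from the
declarations in the previous patch. The `FLAG_*` names, `is_dropped()` and
`queue_spellings()` are hypothetical stand-ins for this example only; the
real code uses the FSEvents constants and `fsmonitor_batch__add_path()`.

```c
#include <stdio.h>

/* Values copied from the kFSEventStreamEventFlag* declarations. */
#define FLAG_USER_DROPPED   0x00000002u
#define FLAG_KERNEL_DROPPED 0x00000004u
#define FLAG_ITEM_REMOVED   0x00000200u
#define FLAG_ITEM_IS_FILE   0x00010000u
#define FLAG_ITEM_IS_DIR    0x00020000u

/* Nonzero if either the kernel or the user-space side lost events. */
static int is_dropped(unsigned int ef)
{
	return (ef & (FLAG_KERNEL_DROPPED | FLAG_USER_DROPPED)) != 0;
}

/*
 * Hypothetical helper: write the spellings that would be queued for
 * one event into out[] and return how many were written (0, 1 or 2).
 * A directory is spelled with a trailing slash.
 */
static int queue_spellings(unsigned int ef, const char *rel,
			   char out[2][256])
{
	int n = 0;

	if (ef & FLAG_ITEM_IS_FILE)
		snprintf(out[n++], 256, "%s", rel);
	if (ef & FLAG_ITEM_IS_DIR)
		snprintf(out[n++], 256, "%s/", rel);
	return n;
}
```

An event flagged as both file and directory for "a/b" would be queued as
"a/b" and "a/b/", so the client can decide how much to invalidate.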

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-macos.c | 381 +++++++++++++++++++
 1 file changed, 381 insertions(+)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
index bec5130d9e1..02f89de216e 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-macos.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
@@ -97,20 +97,401 @@ void FSEventStreamRelease(FSEventStreamRef stream);
 #include "cache.h"
 #include "fsmonitor.h"
 #include "fsmonitor-fs-listen.h"
+#include "fsmonitor--daemon.h"
+
+struct fsmonitor_daemon_backend_data
+{
+	CFStringRef cfsr_worktree_path;
+	CFStringRef cfsr_gitdir_path;
+
+	CFArrayRef cfar_paths_to_watch;
+	int nr_paths_watching;
+
+	FSEventStreamRef stream;
+
+	CFRunLoopRef rl;
+
+	enum shutdown_style {
+		SHUTDOWN_EVENT = 0,
+		FORCE_SHUTDOWN,
+		FORCE_ERROR_STOP,
+	} shutdown_style;
+
+	unsigned int stream_scheduled:1;
+	unsigned int stream_started:1;
+};
+
+static void log_flags_set(const char *path, const FSEventStreamEventFlags flag)
+{
+	struct strbuf msg = STRBUF_INIT;
+
+	if (flag & kFSEventStreamEventFlagMustScanSubDirs)
+		strbuf_addstr(&msg, "MustScanSubDirs|");
+	if (flag & kFSEventStreamEventFlagUserDropped)
+		strbuf_addstr(&msg, "UserDropped|");
+	if (flag & kFSEventStreamEventFlagKernelDropped)
+		strbuf_addstr(&msg, "KernelDropped|");
+	if (flag & kFSEventStreamEventFlagEventIdsWrapped)
+		strbuf_addstr(&msg, "EventIdsWrapped|");
+	if (flag & kFSEventStreamEventFlagHistoryDone)
+		strbuf_addstr(&msg, "HistoryDone|");
+	if (flag & kFSEventStreamEventFlagRootChanged)
+		strbuf_addstr(&msg, "RootChanged|");
+	if (flag & kFSEventStreamEventFlagMount)
+		strbuf_addstr(&msg, "Mount|");
+	if (flag & kFSEventStreamEventFlagUnmount)
+		strbuf_addstr(&msg, "Unmount|");
+	if (flag & kFSEventStreamEventFlagItemChangeOwner)
+		strbuf_addstr(&msg, "ItemChangeOwner|");
+	if (flag & kFSEventStreamEventFlagItemCreated)
+		strbuf_addstr(&msg, "ItemCreated|");
+	if (flag & kFSEventStreamEventFlagItemFinderInfoMod)
+		strbuf_addstr(&msg, "ItemFinderInfoMod|");
+	if (flag & kFSEventStreamEventFlagItemInodeMetaMod)
+		strbuf_addstr(&msg, "ItemInodeMetaMod|");
+	if (flag & kFSEventStreamEventFlagItemIsDir)
+		strbuf_addstr(&msg, "ItemIsDir|");
+	if (flag & kFSEventStreamEventFlagItemIsFile)
+		strbuf_addstr(&msg, "ItemIsFile|");
+	if (flag & kFSEventStreamEventFlagItemIsHardlink)
+		strbuf_addstr(&msg, "ItemIsHardlink|");
+	if (flag & kFSEventStreamEventFlagItemIsLastHardlink)
+		strbuf_addstr(&msg, "ItemIsLastHardlink|");
+	if (flag & kFSEventStreamEventFlagItemIsSymlink)
+		strbuf_addstr(&msg, "ItemIsSymlink|");
+	if (flag & kFSEventStreamEventFlagItemModified)
+		strbuf_addstr(&msg, "ItemModified|");
+	if (flag & kFSEventStreamEventFlagItemRemoved)
+		strbuf_addstr(&msg, "ItemRemoved|");
+	if (flag & kFSEventStreamEventFlagItemRenamed)
+		strbuf_addstr(&msg, "ItemRenamed|");
+	if (flag & kFSEventStreamEventFlagItemXattrMod)
+		strbuf_addstr(&msg, "ItemXattrMod|");
+	if (flag & kFSEventStreamEventFlagOwnEvent)
+		strbuf_addstr(&msg, "OwnEvent|");
+	if (flag & kFSEventStreamEventFlagItemCloned)
+		strbuf_addstr(&msg, "ItemCloned|");
+
+	trace_printf_key(&trace_fsmonitor, "fsevent: '%s', flags=%u %s",
+			 path, flag, msg.buf);
+
+	strbuf_release(&msg);
+}
+
+static int ef_is_root_delete(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagItemIsDir &&
+		ef & kFSEventStreamEventFlagItemRemoved);
+}
+
+static int ef_is_root_renamed(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagItemIsDir &&
+		ef & kFSEventStreamEventFlagItemRenamed);
+}
+
+static int ef_is_dropped(const FSEventStreamEventFlags ef)
+{
+	return (ef & kFSEventStreamEventFlagKernelDropped ||
+		ef & kFSEventStreamEventFlagUserDropped);
+}
+
+static void fsevent_callback(ConstFSEventStreamRef streamRef,
+			     void *ctx,
+			     size_t num_of_events,
+			     void *event_paths,
+			     const FSEventStreamEventFlags event_flags[],
+			     const FSEventStreamEventId event_ids[])
+{
+	struct fsmonitor_daemon_state *state = ctx;
+	struct fsmonitor_daemon_backend_data *data = state->backend_data;
+	char **paths = (char **)event_paths;
+	struct fsmonitor_batch *batch = NULL;
+	struct string_list cookie_list = STRING_LIST_INIT_DUP;
+	const char *path_k;
+	const char *slash;
+	int k;
+	struct strbuf tmp = STRBUF_INIT;
+
+	/*
+	 * Build a list of all filesystem changes into a private/local
+	 * list and without holding any locks.
+	 */
+	for (k = 0; k < num_of_events; k++) {
+		/*
+		 * On Mac, we receive an array of absolute paths.
+		 */
+		path_k = paths[k];
+
+		/*
+		 * If you want to debug FSEvents, log them to GIT_TRACE_FSMONITOR.
+		 * Please don't log them to Trace2.
+		 *
+		 * trace_printf_key(&trace_fsmonitor, "Path: '%s'", path_k);
+		 */
+
+		/*
+		 * If event[k] is marked as dropped, we assume that we have
+		 * lost sync with the filesystem and should flush our cached
+		 * data.  We need to:
+		 *
+		 * [1] Abort/wake any client threads waiting for a cookie and
+		 *     flush the cached state data (the current token), and
+		 *     create a new token.
+		 *
+		 * [2] Discard the batch that we were locally building (since
+		 *     they are conceptually relative to the just flushed
+		 *     token).
+		 */
+		if (ef_is_dropped(event_flags[k])) {
+			/*
+			 * see also kFSEventStreamEventFlagMustScanSubDirs
+			 */
+			trace_printf_key(&trace_fsmonitor, "event: dropped");
+
+			fsmonitor_force_resync(state);
+			fsmonitor_batch__pop(batch);
+			string_list_clear(&cookie_list, 0);
+
+			/*
+			 * We assume that any events that we received
+			 * in this callback after this dropped event
+			 * may still be valid, so we continue rather
+			 * than break.  (And just in case there is a
+			 * delete of ".git" hiding in there.)
+			 */
+			continue;
+		}
+
+		switch (fsmonitor_classify_path_absolute(state, path_k)) {
+
+		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+			/* special case cookie files within .git or gitdir */
+
+			/* Use just the filename of the cookie file. */
+			slash = find_last_dir_sep(path_k);
+			string_list_append(&cookie_list,
+					   slash ? slash + 1 : path_k);
+			break;
+
+		case IS_INSIDE_DOT_GIT:
+		case IS_INSIDE_GITDIR:
+			/* ignore all other paths inside of .git or gitdir */
+			break;
+
+		case IS_DOT_GIT:
+		case IS_GITDIR:
+			/*
+			 * If .git directory is deleted or renamed away,
+			 * we have to quit.
+			 */
+			if (ef_is_root_delete(event_flags[k])) {
+				trace_printf_key(&trace_fsmonitor,
+						 "event: gitdir removed");
+				goto force_shutdown;
+			}
+			if (ef_is_root_renamed(event_flags[k])) {
+				trace_printf_key(&trace_fsmonitor,
+						 "event: gitdir renamed");
+				goto force_shutdown;
+			}
+			break;
+
+		case IS_WORKDIR_PATH:
+			/* try to queue normal pathnames */
+
+			if (trace_pass_fl(&trace_fsmonitor))
+				log_flags_set(path_k, event_flags[k]);
+
+			/*
+			 * Because of the implicit "binning" (the
+			 * kernel calls us at a given frequency) and
+			 * de-duping (the kernel is free to combine
+			 * multiple events for a given pathname), an
+			 * individual fsevent could be marked as both
+			 * a file and directory.  Add it to the queue
+			 * with both spellings so that the client will
+			 * know how much to invalidate/refresh.
+			 */
+
+			if (event_flags[k] & kFSEventStreamEventFlagItemIsFile) {
+				const char *rel = path_k +
+					state->path_worktree_watch.len + 1;
+
+				if (!batch)
+					batch = fsmonitor_batch__new();
+				fsmonitor_batch__add_path(batch, rel);
+			}
+
+			if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) {
+				const char *rel = path_k +
+					state->path_worktree_watch.len + 1;
+
+				strbuf_reset(&tmp);
+				strbuf_addstr(&tmp, rel);
+				strbuf_addch(&tmp, '/');
+
+				if (!batch)
+					batch = fsmonitor_batch__new();
+				fsmonitor_batch__add_path(batch, tmp.buf);
+			}
+
+			break;
+
+		case IS_OUTSIDE_CONE:
+		default:
+			trace_printf_key(&trace_fsmonitor,
+					 "ignoring '%s'", path_k);
+			break;
+		}
+	}
+
+	fsmonitor_publish(state, batch, &cookie_list);
+	string_list_clear(&cookie_list, 0);
+	strbuf_release(&tmp);
+	return;
+
+force_shutdown:
+	fsmonitor_batch__pop(batch);
+	string_list_clear(&cookie_list, 0);
+
+	data->shutdown_style = FORCE_SHUTDOWN;
+	CFRunLoopStop(data->rl);
+	strbuf_release(&tmp);
+	return;
+}
+
+/*
+ * NEEDSWORK: Investigate the proper value for the `latency` argument
+ * in the call to `FSEventStreamCreate()`.  I'm not sure that this
+ * needs to be a config setting or just something that we tune after
+ * some testing.
+ *
+ * With a latency of 0.1, I was seeing lots of dropped events during
+ * the "touch 100000" files test within t/perf/p7519, but with a
+ * latency of 0.001 I did not see any dropped events.  So the
+ * "correct" value may be somewhere in between.
+ *
+ * https://developer.apple.com/documentation/coreservices/1443980-fseventstreamcreate
+ */
 
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
+	FSEventStreamCreateFlags flags = kFSEventStreamCreateFlagNoDefer |
+		kFSEventStreamCreateFlagWatchRoot |
+		kFSEventStreamCreateFlagFileEvents;
+	FSEventStreamContext ctx = {
+		0,
+		state,
+		NULL,
+		NULL,
+		NULL
+	};
+	struct fsmonitor_daemon_backend_data *data;
+	const void *dir_array[2];
+
+	CALLOC_ARRAY(data, 1);
+	state->backend_data = data;
+
+	data->cfsr_worktree_path = CFStringCreateWithCString(
+		NULL, state->path_worktree_watch.buf, kCFStringEncodingUTF8);
+	dir_array[data->nr_paths_watching++] = data->cfsr_worktree_path;
+
+	if (state->nr_paths_watching > 1) {
+		data->cfsr_gitdir_path = CFStringCreateWithCString(
+			NULL, state->path_gitdir_watch.buf,
+			kCFStringEncodingUTF8);
+		dir_array[data->nr_paths_watching++] = data->cfsr_gitdir_path;
+	}
+
+	data->cfar_paths_to_watch = CFArrayCreate(NULL, dir_array,
+						  data->nr_paths_watching,
+						  NULL);
+	data->stream = FSEventStreamCreate(NULL, fsevent_callback, &ctx,
+					   data->cfar_paths_to_watch,
+					   kFSEventStreamEventIdSinceNow,
+					   0.001, flags);
+	if (data->stream == NULL)
+		goto failed;
+
+	/*
+	 * `data->rl` needs to be set inside the listener thread.
+	 */
+
+	return 0;
+
+failed:
+	error("Unable to create FSEventStream.");
+
+	FREE_AND_NULL(state->backend_data);
 	return -1;
 }
 
 void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	if (!state || !state->backend_data)
+		return;
+
+	data = state->backend_data;
+
+	if (data->stream) {
+		if (data->stream_started)
+			FSEventStreamStop(data->stream);
+		if (data->stream_scheduled)
+			FSEventStreamInvalidate(data->stream);
+		FSEventStreamRelease(data->stream);
+	}
+
+	FREE_AND_NULL(state->backend_data);
 }
 
 void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = state->backend_data;
+	data->shutdown_style = SHUTDOWN_EVENT;
+
+	CFRunLoopStop(data->rl);
 }
 
 void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
 {
+	struct fsmonitor_daemon_backend_data *data;
+
+	data = state->backend_data;
+
+	data->rl = CFRunLoopGetCurrent();
+
+	FSEventStreamScheduleWithRunLoop(data->stream, data->rl, kCFRunLoopDefaultMode);
+	data->stream_scheduled = 1;
+
+	if (!FSEventStreamStart(data->stream)) {
+		error("Failed to start the FSEventStream");
+		goto force_error_stop_without_loop;
+	}
+	data->stream_started = 1;
+
+	CFRunLoopRun();
+
+	switch (data->shutdown_style) {
+	case FORCE_ERROR_STOP:
+		state->error_code = -1;
+		/* fall thru */
+	case FORCE_SHUTDOWN:
+		ipc_server_stop_async(state->ipc_server_data);
+		/* fall thru */
+	case SHUTDOWN_EVENT:
+	default:
+		break;
+	}
+	return;
+
+force_error_stop_without_loop:
+	state->error_code = -1;
+	ipc_server_stop_async(state->ipc_server_data);
+	return;
 }
-- 
gitgitgadget



* [PATCH v3 22/34] fsmonitor--daemon: implement handle_client callback
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (20 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 21/34] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files Jeff Hostetler via GitGitGadget
                       ` (12 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to handle IPC requests from client
Git processes and respond with a list of pathnames modified
since the provided token.
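
For readers unfamiliar with the token, the V2 token exchanged here has the
shape "builtin:<token_id>:<seq_nr>". The following is a minimal standalone
sketch of formatting and parsing such a token using only the C library.
The names `format_token()` and `parse_token()` and the fixed-size buffers
are assumptions made for this example; the patch itself uses strbufs and
`skip_prefix()`.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Write "builtin:<id>:<seq_nr>" into buf; returns the snprintf() result. */
static int format_token(char *buf, size_t sz, const char *id,
			uint64_t seq_nr)
{
	return snprintf(buf, sz, "builtin:%s:%" PRIu64, id, seq_nr);
}

/*
 * Split "builtin:<id>:<seq_nr>" into its parts.
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_token(const char *tok, char *id, size_t id_sz,
		       uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const char *p, *colon;
	char *end;
	size_t len;

	*seq_nr = 0;
	if (strncmp(tok, prefix, sizeof(prefix) - 1))
		return -1;	/* V1 timestamp or garbage */
	p = tok + sizeof(prefix) - 1;

	colon = strchr(p, ':');
	if (!colon)
		return -1;	/* no sequence number field */
	len = (size_t)(colon - p);
	if (len >= id_sz)
		return -1;	/* id does not fit the caller's buffer */
	memcpy(id, p, len);
	id[len] = '\0';

	*seq_nr = (uint64_t)strtoumax(colon + 1, &end, 10);
	if (*end)
		return -1;	/* trailing junk after the number */
	return 0;
}
```

A token that fails to parse is not treated as fatal by the daemon; the
request is simply answered with a trivial response.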

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 312 +++++++++++++++++++++++++++++++++++-
 1 file changed, 310 insertions(+), 2 deletions(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index ea3a52d34e3..7a7fef681fe 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,6 +7,7 @@
 #include "fsmonitor--daemon.h"
 #include "simple-ipc.h"
 #include "khash.h"
+#include "pkt-line.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
 	N_("git fsmonitor--daemon start [<options>]"),
@@ -355,6 +356,311 @@ void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
 	pthread_mutex_unlock(&state->main_lock);
 }
 
+/*
+ * Format an opaque token string to send to the client.
+ */
+static void with_lock__format_response_token(
+	struct strbuf *response_token,
+	const struct strbuf *response_token_id,
+	const struct fsmonitor_batch *batch)
+{
+	/* assert current thread holding state->main_lock */
+
+	strbuf_reset(response_token);
+	strbuf_addf(response_token, "builtin:%s:%"PRIu64,
+		    response_token_id->buf, batch->batch_seq_nr);
+}
+
+/*
+ * Parse an opaque token from the client.
+ * Returns -1 on error.
+ */
+static int fsmonitor_parse_client_token(const char *buf_token,
+					struct strbuf *requested_token_id,
+					uint64_t *seq_nr)
+{
+	const char *p;
+	char *p_end;
+
+	strbuf_reset(requested_token_id);
+	*seq_nr = 0;
+
+	if (!skip_prefix(buf_token, "builtin:", &p))
+		return -1;
+
+	while (*p && *p != ':')
+		strbuf_addch(requested_token_id, *p++);
+	if (!*p++)
+		return -1;
+
+	*seq_nr = (uint64_t)strtoumax(p, &p_end, 10);
+	if (*p_end)
+		return -1;
+
+	return 0;
+}
+
+KHASH_INIT(str, const char *, int, 0, kh_str_hash_func, kh_str_hash_equal);
+
+static int do_handle_client(struct fsmonitor_daemon_state *state,
+			    const char *command,
+			    ipc_server_reply_cb *reply,
+			    struct ipc_server_reply_data *reply_data)
+{
+	struct fsmonitor_token_data *token_data = NULL;
+	struct strbuf response_token = STRBUF_INIT;
+	struct strbuf requested_token_id = STRBUF_INIT;
+	struct strbuf payload = STRBUF_INIT;
+	uint64_t requested_oldest_seq_nr = 0;
+	uint64_t total_response_len = 0;
+	const char *p;
+	const struct fsmonitor_batch *batch_head;
+	const struct fsmonitor_batch *batch;
+	intmax_t count = 0, duplicates = 0;
+	kh_str_t *shown;
+	int hash_ret;
+	int do_trivial = 0;
+	int do_flush = 0;
+
+	/*
+	 * We expect `command` to be of the form:
+	 *
+	 * <command> := quit NUL
+	 *            | flush NUL
+	 *            | <V1-time-since-epoch-ns> NUL
+	 *            | <V2-opaque-fsmonitor-token> NUL
+	 */
+
+	if (!strcmp(command, "quit")) {
+		/*
+		 * A client has requested over the socket/pipe that the
+		 * daemon shutdown.
+		 *
+		 * Tell the IPC thread pool to shutdown (which completes
+		 * the await in the main thread (which can stop the
+		 * fsmonitor listener thread)).
+		 *
+		 * There is no reply to the client.
+		 */
+		return SIMPLE_IPC_QUIT;
+
+	} else if (!strcmp(command, "flush")) {
+		/*
+		 * Flush all of our cached data and generate a new token
+		 * just like if we lost sync with the filesystem.
+		 *
+		 * Then send a trivial response using the new token.
+		 */
+		do_flush = 1;
+		do_trivial = 1;
+
+	} else if (!skip_prefix(command, "builtin:", &p)) {
+		/* assume V1 timestamp or garbage */
+
+		char *p_end;
+
+		strtoumax(command, &p_end, 10);
+		trace_printf_key(&trace_fsmonitor,
+				 ((*p_end) ?
+				  "fsmonitor: invalid command line '%s'" :
+				  "fsmonitor: unsupported V1 protocol '%s'"),
+				 command);
+		do_trivial = 1;
+
+	} else {
+		/* We have "builtin:*" */
+		if (fsmonitor_parse_client_token(command, &requested_token_id,
+						 &requested_oldest_seq_nr)) {
+			trace_printf_key(&trace_fsmonitor,
+					 "fsmonitor: invalid V2 protocol token '%s'",
+					 command);
+			do_trivial = 1;
+
+		} else {
+			/*
+			 * We have a valid V2 token:
+			 *     "builtin:<token_id>:<seq_nr>"
+			 */
+		}
+	}
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (!state->current_token_data)
+		BUG("fsmonitor state does not have a current token");
+
+	if (do_flush)
+		with_lock__do_force_resync(state);
+
+	/*
+	 * We mark the current head of the batch list as "pinned" so
+	 * that the listener thread will treat this item as read-only
+	 * (and prevent any more paths from being added to it) from
+	 * now on.
+	 */
+	token_data = state->current_token_data;
+	batch_head = token_data->batch_head;
+	((struct fsmonitor_batch *)batch_head)->pinned_time = time(NULL);
+
+	/*
+	 * FSMonitor Protocol V2 requires that we send a response header
+	 * with a "new current token" and then all of the paths that changed
+	 * since the "requested token".  We send the seq_nr of the just-pinned
+	 * head batch so that future requests from a client will be relative
+	 * to it.
+	 */
+	with_lock__format_response_token(&response_token,
+					 &token_data->token_id, batch_head);
+
+	reply(reply_data, response_token.buf, response_token.len + 1);
+	total_response_len += response_token.len + 1;
+
+	trace2_data_string("fsmonitor", the_repository, "response/token",
+			   response_token.buf);
+	trace_printf_key(&trace_fsmonitor, "response token: %s",
+			 response_token.buf);
+
+	if (!do_trivial) {
+		if (strcmp(requested_token_id.buf, token_data->token_id.buf)) {
+			/*
+			 * The client last spoke to a different daemon
+			 * instance -OR- the daemon had to resync with
+			 * the filesystem (and lost events), so reject.
+			 */
+			trace2_data_string("fsmonitor", the_repository,
+					   "response/token", "different");
+			do_trivial = 1;
+
+		} else if (requested_oldest_seq_nr <
+			   token_data->batch_tail->batch_seq_nr) {
+			/*
+			 * The client wants older events than we have for
+			 * this token_id.  This means that the end of our
+			 * batch list was truncated and we cannot give the
+			 * client a complete snapshot relative to their
+			 * request.
+			 */
+			trace_printf_key(&trace_fsmonitor,
+					 "client requested truncated data");
+			do_trivial = 1;
+		}
+	}
+
+	if (do_trivial) {
+		pthread_mutex_unlock(&state->main_lock);
+
+		reply(reply_data, "/", 2);
+
+		trace2_data_intmax("fsmonitor", the_repository,
+				   "response/trivial", 1);
+
+		strbuf_release(&response_token);
+		strbuf_release(&requested_token_id);
+		return 0;
+	}
+
+	/*
+	 * We're going to hold onto a pointer to the current
+	 * token-data while we walk the list of batches of files.
+	 * During this time, we will NOT be under the lock.
+	 * So we ref-count it.
+	 *
+	 * This allows the listener thread to continue prepending
+	 * new batches of items to the token-data (which we'll ignore).
+	 *
+	 * AND it allows the listener thread to do a token-reset
+	 * (and install a new `current_token_data`).
+	 */
+	token_data->client_ref_count++;
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	/*
+	 * The client request is relative to the token that they sent,
+	 * so walk the batch list backwards from the current head back
+	 * to the batch (sequence number) they named.
+	 *
+	 * We use khash to de-dup the list of pathnames.
+	 *
+	 * NEEDSWORK: each batch contains a list of interned strings,
+	 * so we only need to do pointer comparisons here to build the
+	 * hash table.  Currently, we're still comparing the string
+	 * values.
+	 */
+	shown = kh_init_str();
+	for (batch = batch_head;
+	     batch && batch->batch_seq_nr > requested_oldest_seq_nr;
+	     batch = batch->next) {
+		size_t k;
+
+		for (k = 0; k < batch->nr; k++) {
+			const char *s = batch->interned_paths[k];
+			size_t s_len;
+
+			if (kh_get_str(shown, s) != kh_end(shown))
+				duplicates++;
+			else {
+				kh_put_str(shown, s, &hash_ret);
+
+				trace_printf_key(&trace_fsmonitor,
+						 "send[%"PRIdMAX"]: %s",
+						 count, s);
+
+				/* Each path gets written with a trailing NUL */
+				s_len = strlen(s) + 1;
+
+				if (payload.len + s_len >=
+				    LARGE_PACKET_DATA_MAX) {
+					reply(reply_data, payload.buf,
+					      payload.len);
+					total_response_len += payload.len;
+					strbuf_reset(&payload);
+				}
+
+				strbuf_add(&payload, s, s_len);
+				count++;
+			}
+		}
+	}
+
+	if (payload.len) {
+		reply(reply_data, payload.buf, payload.len);
+		total_response_len += payload.len;
+	}
+
+	kh_release_str(shown);
+
+	pthread_mutex_lock(&state->main_lock);
+
+	if (token_data->client_ref_count > 0)
+		token_data->client_ref_count--;
+
+	if (token_data->client_ref_count == 0) {
+		if (token_data != state->current_token_data) {
+			/*
+			 * The listener thread did a token-reset while we were
+			 * walking the batch list.  Therefore, this token is
+			 * stale and can be discarded completely.  If we are
+			 * the last reader thread using this token, we own
+			 * that work.
+			 */
+			fsmonitor_free_token_data(token_data);
+		}
+	}
+
+	pthread_mutex_unlock(&state->main_lock);
+
+	trace2_data_intmax("fsmonitor", the_repository, "response/length", total_response_len);
+	trace2_data_intmax("fsmonitor", the_repository, "response/count/files", count);
+	trace2_data_intmax("fsmonitor", the_repository, "response/count/duplicates", duplicates);
+
+	strbuf_release(&response_token);
+	strbuf_release(&requested_token_id);
+	strbuf_release(&payload);
+
+	return 0;
+}
+
 static ipc_server_application_cb handle_client;
 
 static int handle_client(void *data,
@@ -362,7 +668,7 @@ static int handle_client(void *data,
 			 ipc_server_reply_cb *reply,
 			 struct ipc_server_reply_data *reply_data)
 {
-	/* struct fsmonitor_daemon_state *state = data; */
+	struct fsmonitor_daemon_state *state = data;
 	int result;
 
 	/*
@@ -373,10 +679,12 @@ static int handle_client(void *data,
 	if (command_len != strlen(command))
 		BUG("FSMonitor assumes text messages");
 
+	trace_printf_key(&trace_fsmonitor, "requested token: %s", command);
+
 	trace2_region_enter("fsmonitor", "handle_client", the_repository);
 	trace2_data_string("fsmonitor", the_repository, "request", command);
 
-	result = 0; /* TODO Do something here. */
+	result = do_handle_client(state, command, reply, reply_data);
 
 	trace2_region_leave("fsmonitor", "handle_client", the_repository);
 
-- 
gitgitgadget
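As a reading aid for the V2 token handling in do_handle_client() above, here is a minimal standalone sketch of splitting a token of the form "builtin:<token_id>:<seq_nr>". It is not the patch's fsmonitor_parse_client_token(): the name parse_v2_token, the fixed-size output buffer, and the assumption that <token_id> contains no ':' are all illustrative. Requiring at least one digit in <seq_nr> is also a choice made for this sketch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): split a V2 token of the
 * form "builtin:<token_id>:<seq_nr>" into its parts.
 *
 * Returns 0 on success and -1 on malformed input.  `token_id` must
 * have room for `token_id_size` bytes including the trailing NUL.
 */
int parse_v2_token(const char *command, char *token_id,
		   size_t token_id_size, uint64_t *seq_nr)
{
	static const char prefix[] = "builtin:";
	const char *id, *colon;
	char *end;
	size_t id_len;

	assert(command && token_id && seq_nr);

	if (strncmp(command, prefix, sizeof(prefix) - 1))
		return -1;
	id = command + sizeof(prefix) - 1;

	/* Assumption: the token id runs up to the next ':'. */
	colon = strchr(id, ':');
	if (!colon)
		return -1;
	id_len = (size_t)(colon - id);
	if (!id_len || id_len >= token_id_size)
		return -1;

	/* Require at least one digit, and nothing after the digits. */
	if (colon[1] < '0' || colon[1] > '9')
		return -1;
	*seq_nr = (uint64_t)strtoumax(colon + 1, &end, 10);
	if (*end)
		return -1;

	memcpy(token_id, id, id_len);
	token_id[id_len] = '\0';
	return 0;
}
```

A caller would treat a -1 return the way the daemon treats an unparsable token in the patch: log it and fall back to a trivial response.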



* [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (21 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 22/34] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 20:00       ` Junio C Hamano
  2021-07-01 14:47     ` [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch" Jeff Hostetler via GitGitGadget
                       ` (11 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create `test-tool touch` that can update a series of files
using either a pattern given on the command line or a list
of files read from stdin.

This will be used in a later commit to speed up p7519
which needs to generate/update many thousands of files.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
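For reference, the "touch" semantics this helper relies on can be sketched in plain POSIX C as below. This is a simplified illustration, not the code in the patch: the name touch_one is made up, and unlike do_touch_one() in the diff (which warns and keeps going) this variant reports failure to its caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <utime.h>

/*
 * Sketch of "touch": bump the timestamps of an existing file, or
 * create an empty file when the path does not exist yet.
 *
 * Returns 0 on success and -1 on failure (errno is left set).
 */
int touch_one(const char *path)
{
	int fd;

	assert(path);

	/* A NULL times argument sets atime and mtime to "now". */
	if (!utime(path, NULL))
		return 0;

	/* Anything other than "no such file" is a real error. */
	if (errno != ENOENT)
		return -1;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;
	close(fd);
	return 0;
}
```

As in the patch, any missing parent directories are not created; a path under a nonexistent directory simply fails.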
 Makefile              |   1 +
 t/helper/test-tool.c  |   1 +
 t/helper/test-tool.h  |   1 +
 t/helper/test-touch.c | 126 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 129 insertions(+)
 create mode 100644 t/helper/test-touch.c

diff --git a/Makefile b/Makefile
index a2a6e1f20f6..c07cfb75532 100644
--- a/Makefile
+++ b/Makefile
@@ -757,6 +757,7 @@ TEST_BUILTINS_OBJS += test-string-list.o
 TEST_BUILTINS_OBJS += test-submodule-config.o
 TEST_BUILTINS_OBJS += test-submodule-nested-repo-config.o
 TEST_BUILTINS_OBJS += test-subprocess.o
+TEST_BUILTINS_OBJS += test-touch.o
 TEST_BUILTINS_OBJS += test-trace2.o
 TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
 TEST_BUILTINS_OBJS += test-userdiff.o
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index af879e4a5d7..1ad8d5fbd82 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -73,6 +73,7 @@ static struct test_cmd cmds[] = {
 	{ "submodule-config", cmd__submodule_config },
 	{ "submodule-nested-repo-config", cmd__submodule_nested_repo_config },
 	{ "subprocess", cmd__subprocess },
+	{ "touch", cmd__touch },
 	{ "trace2", cmd__trace2 },
 	{ "userdiff", cmd__userdiff },
 	{ "urlmatch-normalization", cmd__urlmatch_normalization },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 6c5134b46d9..58fde0a62e5 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -63,6 +63,7 @@ int cmd__string_list(int argc, const char **argv);
 int cmd__submodule_config(int argc, const char **argv);
 int cmd__submodule_nested_repo_config(int argc, const char **argv);
 int cmd__subprocess(int argc, const char **argv);
+int cmd__touch(int argc, const char **argv);
 int cmd__trace2(int argc, const char **argv);
 int cmd__userdiff(int argc, const char **argv);
 int cmd__urlmatch_normalization(int argc, const char **argv);
diff --git a/t/helper/test-touch.c b/t/helper/test-touch.c
new file mode 100644
index 00000000000..e9b3b754f1f
--- /dev/null
+++ b/t/helper/test-touch.c
@@ -0,0 +1,126 @@
+/*
+ * test-touch.c: variation on /usr/bin/touch to speed up tests
+ * with a large number of files (primarily on Windows where child
+ * processes are very, very expensive).
+ */
+
+#include "test-tool.h"
+#include "cache.h"
+#include "parse-options.h"
+
+static char *seq_pattern;
+static int seq_start = 1;
+static int seq_count = 1;
+
+static int do_touch_one(const char *path)
+{
+	int fd;
+
+	if (!utime(path, NULL))
+		return 0;
+
+	if (errno != ENOENT) {
+		warning_errno("could not touch '%s'", path);
+		return 0;
+	}
+
+	fd = open(path, O_RDWR | O_CREAT, 0644);
+	if (fd == -1) {
+		warning_errno("could not create '%s'", path);
+		return 0;
+	}
+	close(fd);
+
+	return 0;
+}
+
+/*
+ * Touch a series of files.  We assume that any required subdirs
+ * already exist.  This function allows us to replace the following
+ * test script fragment:
+ *
+ *    for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
+ *
+ * with a single process:
+ *
+ *    test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000
+ *
+ * which is much faster on Windows.
+ */
+static int do_sequence(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	int k;
+
+	for (k = seq_start; k < seq_start + seq_count; k++) {
+		strbuf_reset(&buf);
+		strbuf_addf(&buf, seq_pattern, k);
+
+		if (do_touch_one(buf.buf))
+			return 1;
+	}
+
+	return 0;
+}
+
+/*
+ * Read a list of pathnames from stdin and touch them.  We assume that
+ * any required subdirs already exist.
+ */
+static int do_stdin(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	while (strbuf_getline(&buf, stdin) != EOF && buf.len)
+		if (do_touch_one(buf.buf))
+			return 1;
+
+	return 0;
+}
+
+int cmd__touch(int argc, const char **argv)
+{
+	const char *touch_usage[] = {
+		N_("test-tool touch sequence --pattern=<format> [--start=<n>] [--count=<n>]"),
+		N_("test-tool touch stdin"),
+		NULL,
+	};
+
+	struct option touch_options[] = {
+		OPT_GROUP(N_("sequence")),
+		OPT_STRING(0, "pattern", &seq_pattern, N_("format"),
+			   N_("sequence pathname pattern")),
+		OPT_INTEGER(0, "start", &seq_start,
+			    N_("sequence starting value")),
+		OPT_INTEGER(0, "count", &seq_count,
+			    N_("sequence count")),
+		OPT_END()
+	};
+
+	const char *subcmd;
+
+	if (argc < 2)
+		usage_with_options(touch_usage, touch_options);
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(touch_usage, touch_options);
+
+	subcmd = argv[1];
+	argv++;
+	argc--;
+
+	argc = parse_options(argc, argv, NULL, touch_options, touch_usage, 0);
+
+	if (!strcmp(subcmd, "sequence")) {
+		if (!seq_pattern || !strstr(seq_pattern, "%d"))
+			die("invalid sequence pattern");
+		if (seq_count < 1)
+			die("invalid sequence count: %d", seq_count);
+		return !!do_sequence();
+	}
+
+	if (!strcmp(subcmd, "stdin")) {
+		return !!do_stdin();
+	}
+
+	die("Unhandled subcommand: '%s'", subcmd);
+}
-- 
gitgitgadget



* [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (22 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 23:09       ` Ævar Arnfjörð Bjarmason
  2021-07-13 18:04       ` Jeff Hostetler
  2021-07-01 14:47     ` [PATCH v3 25/34] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
                       ` (10 subsequent siblings)
  34 siblings, 2 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Change p7519 to use a single "test-tool touch" command to update
the mtime on a series of (thousands of) files instead of invoking
thousands of commands that each update a single file.

This primarily helps on Windows, where process creation is very
slow, and reduces the test run time by minutes.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
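To illustrate what a "--pattern=10000_files/%d" style sequence expands to, here is a small sketch that substitutes a counter for the first "%d" in a pattern. It is an assumption-laden illustration rather than the helper's actual code: expand_seq_pattern is a hypothetical name, and it deliberately copies the text around the "%d" instead of passing the user-supplied pattern to a printf-style function as the format string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: expand the first "%d" in `pattern` to the
 * decimal value of `k`, writing the result into `out`.
 *
 * Returns 0 on success, or -1 if the pattern has no "%d" or the
 * result does not fit in `out_size` bytes.
 */
int expand_seq_pattern(const char *pattern, int k, char *out, size_t out_size)
{
	const char *slot;
	int n;

	assert(pattern && out);

	slot = strstr(pattern, "%d");
	if (!slot)
		return -1;

	/* "%.*s" copies the text before the slot; the tail follows the number. */
	n = snprintf(out, out_size, "%.*s%d%s",
		     (int)(slot - pattern), pattern, k, slot + 2);
	if (n < 0 || (size_t)n >= out_size)
		return -1;
	return 0;
}
```

Calling this in a loop from the start value for count iterations yields the same file names that the replaced "for i in $(test_seq ...)" loops produced, all within one process.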
 t/perf/p7519-fsmonitor.sh | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index 5eb5044a103..f74e6014a0a 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
 	fi &&
 
 	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
-	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
-	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
-	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
-	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
+	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
+	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
+	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
+	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&
+
 	git add 1_file 10_files 100_files 1000_files 10000_files &&
 	git commit -qm "Add files" &&
 
@@ -200,15 +201,12 @@ test_fsmonitor_suite() {
 	# Update the mtimes on upto 100k files to make status think
 	# that they are dirty.  For simplicity, omit any files with
 	# LFs (i.e. anything that ls-files thinks it needs to dquote).
-	# Then fully backslash-quote the paths to capture any
-	# whitespace so that they pass thru xargs properly.
 	#
 	test_perf_w_drop_caches "status (dirty) ($DESC)" '
 		git ls-files | \
 			head -100000 | \
 			grep -v \" | \
-			sed '\''s/\(.\)/\\\1/g'\'' | \
-			xargs test-tool chmtime -300 &&
+			test-tool touch stdin &&
 		git status
 	'
 
-- 
gitgitgadget



* [PATCH v3 25/34] t/perf: avoid copying builtin fsmonitor files into test repo
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (23 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch" Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 23:11       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 26/34] t/perf/p7519: add fsmonitor--daemon test cases Jeff Hostetler via GitGitGadget
                       ` (9 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Do not try to copy a fsmonitor--daemon socket from the current
development directory into the test trash directory.

When we run the perf suite without an explicit source repo set,
we copy the current $GIT_DIR into the test trash directory.
Unix domain sockets cannot be copied in that manner, so the test
setup fails.

Additionally, omit any other fsmonitor--daemon temp files inside
the $GIT_DIR directory.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
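The effect of the extended `case` pattern can be checked outside the shell with POSIX fnmatch(), as in the sketch below. The function name skip_when_copying and the sample paths in the usage note are illustrative only; the pattern list is the one from the diff.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/*
 * Return 1 if `path` matches one of the patterns that the perf
 * library skips when copying $GIT_DIR contents, else 0.
 */
int skip_when_copying(const char *path)
{
	static const char *const skip[] = {
		"*/objects", "*/hooks", "*/config", "*/commondir",
		"*/gitdir", "*/worktrees", "*/fsmonitor--daemon*",
	};
	size_t i;

	assert(path);

	/* Flags 0: '*' may match '/', as in a shell `case` pattern. */
	for (i = 0; i < sizeof(skip) / sizeof(skip[0]); i++)
		if (!fnmatch(skip[i], path, 0))
			return 1;
	return 0;
}
```

Because the new pattern ends in '*', any entry whose name begins with "fsmonitor--daemon" is skipped, whatever suffix it carries, while entries such as "refs" or "index" are still copied.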
 t/perf/perf-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 601d9f67ddb..3b97e3fc0f2 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -74,7 +74,7 @@ test_perf_copy_repo_contents () {
 	for stuff in "$1"/*
 	do
 		case "$stuff" in
-		*/objects|*/hooks|*/config|*/commondir|*/gitdir|*/worktrees)
+		*/objects|*/hooks|*/config|*/commondir|*/gitdir|*/worktrees|*/fsmonitor--daemon*)
 			;;
 		*)
 			cp -R "$stuff" "$repo/.git/" || exit 1
-- 
gitgitgadget



* [PATCH v3 26/34] t/perf/p7519: add fsmonitor--daemon test cases
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (24 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 25/34] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 27/34] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
                       ` (8 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and
the "Simple IPC" interface.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/perf/p7519-fsmonitor.sh | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
index f74e6014a0a..3a3fc5748ae 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -24,7 +24,8 @@ test_description="Test core.fsmonitor"
 # GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
 # GIT_PERF_7519_FSMONITOR: used to configure core.fsMonitor. May be an
 #   absolute path to an integration. May be a space delimited list of
-#   absolute paths to integrations.
+#   absolute paths to integrations.  (This hook or list of hooks does not
+#   include the built-in fsmonitor--daemon.)
 #
 # The big win for using fsmonitor is the elimination of the need to scan the
 # working directory looking for changed and untracked files. If the file
@@ -136,10 +137,16 @@ test_expect_success "one time repo setup" '
 
 setup_for_fsmonitor() {
 	# set INTEGRATION_SCRIPT depending on the environment
-	if test -n "$INTEGRATION_PATH"
+	if test -n "$USE_FSMONITOR_DAEMON"
 	then
+		git config core.useBuiltinFSMonitor true &&
+		INTEGRATION_SCRIPT=false
+	elif test -n "$INTEGRATION_PATH"
+	then
+		git config core.useBuiltinFSMonitor false &&
 		INTEGRATION_SCRIPT="$INTEGRATION_PATH"
 	else
+		git config core.useBuiltinFSMonitor false &&
 		#
 		# Choose integration script based on existence of Watchman.
 		# Fall back to an empty integration script.
@@ -175,7 +182,10 @@ test_perf_w_drop_caches () {
 }
 
 test_fsmonitor_suite() {
-	if test -n "$INTEGRATION_SCRIPT"; then
+	if test -n "$USE_FSMONITOR_DAEMON"
+	then
+		DESC="builtin fsmonitor--daemon"
+	elif test -n "$INTEGRATION_SCRIPT"; then
 		DESC="fsmonitor=$(basename $INTEGRATION_SCRIPT)"
 	else
 		DESC="fsmonitor=disabled"
@@ -283,4 +293,25 @@ test_expect_success "setup without fsmonitor" '
 test_fsmonitor_suite
 trace_stop
 
+#
+# Run a full set of perf tests using the built-in fsmonitor--daemon.
+# It does not use the Hook API, so it has a different setup.
+# Explicitly start the daemon here, before we start client commands,
+# so that we can later add custom tracing.
+#
+if test_have_prereq FSMONITOR_DAEMON
+then
+	USE_FSMONITOR_DAEMON=t
+
+	trace_start fsmonitor--daemon--server
+	git fsmonitor--daemon start
+
+	trace_start fsmonitor--daemon--client
+	test_expect_success "setup for fsmonitor--daemon" 'setup_for_fsmonitor'
+	test_fsmonitor_suite
+
+	git fsmonitor--daemon stop
+	trace_stop
+fi
+
 test_done
-- 
gitgitgadget



* [PATCH v3 27/34] t7527: create test for fsmonitor--daemon
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (25 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 26/34] t/perf/p7519: add fsmonitor--daemon test cases Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 23:15       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 28/34] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
                       ` (7 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
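Several tests below pipe the output of a query through nul_to_q because the daemon's reply is a sequence of NUL-terminated items: first the new token, then one item per changed path (or a single "/" for a trivial response). A minimal sketch of walking such a buffer follows; count_response_items is a hypothetical name and the buffer contents in the usage note are made-up sample data, not captured daemon output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk a buffer of NUL-terminated items laid out back to back.
 * Stores a pointer to the first item (the token) in `*token` and
 * returns the number of items, or -1 if the buffer is empty or
 * does not end with a NUL.
 */
int count_response_items(const char *buf, size_t len, const char **token)
{
	size_t pos = 0;
	int count = 0;

	assert(buf && token);

	if (!len || buf[len - 1] != '\0')
		return -1;

	*token = buf;
	while (pos < len) {
		/* Step over this item and its terminating NUL. */
		pos += strlen(buf + pos) + 1;
		count++;
	}
	return count;
}
```

For a buffer holding a token followed by "file_1" and "file_2", the walk visits three items; a trivial response holds two, the token and "/".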
 t/t7527-builtin-fsmonitor.sh | 497 +++++++++++++++++++++++++++++++++++
 1 file changed, 497 insertions(+)
 create mode 100755 t/t7527-builtin-fsmonitor.sh

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
new file mode 100755
index 00000000000..58b56dc9940
--- /dev/null
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -0,0 +1,497 @@
+#!/bin/sh
+
+test_description='built-in file system watcher'
+
+. ./test-lib.sh
+
+if ! test_have_prereq FSMONITOR_DAEMON
+then
+	skip_all="fsmonitor--daemon is not supported on this platform"
+	test_done
+fi
+
+stop_daemon_delete_repo () {
+	r=$1
+	git -C "$r" fsmonitor--daemon stop >/dev/null 2>/dev/null
+	rm -rf "$r"
+	return 0
+}
+
+start_daemon () {
+	case "$#" in
+		1) r="-C $1";;
+		*) r="";
+	esac
+
+	git $r fsmonitor--daemon start || return $?
+	git $r fsmonitor--daemon status || return $?
+
+	return 0
+}
+
+test_expect_success 'explicit daemon start and stop' '
+	test_when_finished "stop_daemon_delete_repo test_explicit" &&
+
+	git init test_explicit &&
+	start_daemon test_explicit &&
+
+	git -C test_explicit fsmonitor--daemon stop &&
+	test_must_fail git -C test_explicit fsmonitor--daemon status
+'
+
+test_expect_success 'implicit daemon start' '
+	test_when_finished "stop_daemon_delete_repo test_implicit" &&
+
+	git init test_implicit &&
+	test_must_fail git -C test_implicit fsmonitor--daemon status &&
+
+	# query will implicitly start the daemon.
+	#
+	# for test-script simplicity, we send a V1 timestamp rather than
+	# a V2 token.  either way, the daemon response to any query contains
+	# a new V2 token.  (the daemon may complain that we sent a V1 request,
+	# but this test case is only concerned with whether the daemon was
+	# implicitly started.)
+
+	GIT_TRACE2_EVENT="$(pwd)/.git/trace" \
+		test-tool -C test_implicit fsmonitor-client query --token 0 >actual &&
+	nul_to_q <actual >actual.filtered &&
+	grep "builtin:" actual.filtered &&
+
+	# confirm that a daemon was started in the background.
+	#
+	# since the mechanism for starting the background daemon is platform
+	# dependent, just confirm that the foreground command received a
+	# response from the daemon.
+
+	grep :\"query/response-length\" .git/trace &&
+
+	git -C test_implicit fsmonitor--daemon status &&
+	git -C test_implicit fsmonitor--daemon stop &&
+	test_must_fail git -C test_implicit fsmonitor--daemon status
+'
+
+test_expect_success 'implicit daemon stop (delete .git)' '
+	test_when_finished "stop_daemon_delete_repo test_implicit_1" &&
+
+	git init test_implicit_1 &&
+
+	start_daemon test_implicit_1 &&
+
+	# deleting the .git directory will implicitly stop the daemon.
+	rm -rf test_implicit_1/.git &&
+
+	# [1] Create an empty .git directory so that the following Git
+	#     command will stay relative to the `-C` directory.
+	#
+	#     Without this, the Git command will override the requested
+	#     -C argument and crawl out to the containing Git source tree.
+	#     This would make the test result dependent upon whether we
+	#     were using fsmonitor on our development worktree.
+	#
+	sleep 1 &&
+	mkdir test_implicit_1/.git &&
+
+	test_must_fail git -C test_implicit_1 fsmonitor--daemon status
+'
+
+test_expect_success 'implicit daemon stop (rename .git)' '
+	test_when_finished "stop_daemon_delete_repo test_implicit_2" &&
+
+	git init test_implicit_2 &&
+
+	start_daemon test_implicit_2 &&
+
+	# renaming the .git directory will implicitly stop the daemon.
+	mv test_implicit_2/.git test_implicit_2/.xxx &&
+
+	# See [1] above.
+	#
+	sleep 1 &&
+	mkdir test_implicit_2/.git &&
+
+	test_must_fail git -C test_implicit_2 fsmonitor--daemon status
+'
+
+test_expect_success 'cannot start multiple daemons' '
+	test_when_finished "stop_daemon_delete_repo test_multiple" &&
+
+	git init test_multiple &&
+
+	start_daemon test_multiple &&
+
+	test_must_fail git -C test_multiple fsmonitor--daemon start 2>actual &&
+	grep "fsmonitor--daemon is already running" actual &&
+
+	git -C test_multiple fsmonitor--daemon stop &&
+	test_must_fail git -C test_multiple fsmonitor--daemon status
+'
+
+# These tests use the main repo in the trash directory
+
+test_expect_success 'setup' '
+	>tracked &&
+	>modified &&
+	>delete &&
+	>rename &&
+	mkdir dir1 &&
+	>dir1/tracked &&
+	>dir1/modified &&
+	>dir1/delete &&
+	>dir1/rename &&
+	mkdir dir2 &&
+	>dir2/tracked &&
+	>dir2/modified &&
+	>dir2/delete &&
+	>dir2/rename &&
+	mkdir dirtorename &&
+	>dirtorename/a &&
+	>dirtorename/b &&
+
+	cat >.gitignore <<-\EOF &&
+	.gitignore
+	expect*
+	actual*
+	EOF
+
+	git -c core.useBuiltinFSMonitor= add . &&
+	test_tick &&
+	git -c core.useBuiltinFSMonitor= commit -m initial &&
+
+	git config core.useBuiltinFSMonitor true
+'
+
+# The test already explicitly stopped (or tried to stop) the daemon.
+# This is here in case something else fails first.
+#
+redundant_stop_daemon () {
+	git fsmonitor--daemon stop
+	return 0
+}
+
+test_expect_success 'update-index implicitly starts daemon' '
+	test_when_finished redundant_stop_daemon &&
+
+	test_must_fail git fsmonitor--daemon status &&
+
+	GIT_TRACE2_EVENT="$(pwd)/.git/trace_implicit_1" \
+		git update-index --fsmonitor &&
+
+	git fsmonitor--daemon status &&
+	test_might_fail git fsmonitor--daemon stop &&
+
+	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_1
+'
+
+test_expect_success 'status implicitly starts daemon' '
+	test_when_finished redundant_stop_daemon &&
+
+	test_must_fail git fsmonitor--daemon status &&
+
+	GIT_TRACE2_EVENT="$(pwd)/.git/trace_implicit_2" \
+		git status >actual &&
+
+	git fsmonitor--daemon status &&
+	test_might_fail git fsmonitor--daemon stop &&
+
+	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_2
+'
+
+edit_files() {
+	echo 1 >modified
+	echo 2 >dir1/modified
+	echo 3 >dir2/modified
+	>dir1/untracked
+}
+
+delete_files() {
+	rm -f delete
+	rm -f dir1/delete
+	rm -f dir2/delete
+}
+
+create_files() {
+	echo 1 >new
+	echo 2 >dir1/new
+	echo 3 >dir2/new
+}
+
+rename_files() {
+	mv rename renamed
+	mv dir1/rename dir1/renamed
+	mv dir2/rename dir2/renamed
+}
+
+file_to_directory() {
+	rm -f delete
+	mkdir delete
+	echo 1 >delete/new
+}
+
+directory_to_file() {
+	rm -rf dir1
+	echo 1 >dir1
+}
+
+verify_status() {
+	git status >actual &&
+	GIT_INDEX_FILE=.git/fresh-index git read-tree master &&
+	GIT_INDEX_FILE=.git/fresh-index git -c core.useBuiltinFSMonitor= status >expect &&
+	test_cmp expect actual &&
+	echo HELLO AFTER &&
+	cat .git/trace &&
+	echo HELLO AFTER
+}
+
+# The next few test cases confirm that our fsmonitor daemon sees each type
+# of OS filesystem notification that we care about.  At this layer we just
+# ensure we are getting the OS notifications and do not try to confirm what
+# is reported by `git status`.
+#
+# We run a simple query after modifying the filesystem just to introduce
+# a bit of a delay so that the trace logging from the daemon has time to
+# get flushed to disk.
+#
+# We `reset` and `clean` at the bottom of each test (and before stopping the
+# daemon) because these commands might implicitly restart the daemon.
+
+clean_up_repo_and_stop_daemon () {
+	git reset --hard HEAD
+	git clean -fd
+	git fsmonitor--daemon stop
+	rm -f .git/trace
+}
+
+test_expect_success 'edit some files' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	edit_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/modified$"  .git/trace &&
+	grep "^event: dir2/modified$"  .git/trace &&
+	grep "^event: modified$"       .git/trace &&
+	grep "^event: dir1/untracked$" .git/trace
+'
+
+test_expect_success 'create some files' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	create_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/new$" .git/trace &&
+	grep "^event: dir2/new$" .git/trace &&
+	grep "^event: new$"      .git/trace
+'
+
+test_expect_success 'delete some files' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	delete_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/delete$" .git/trace &&
+	grep "^event: dir2/delete$" .git/trace &&
+	grep "^event: delete$"      .git/trace
+'
+
+test_expect_success 'rename some files' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	rename_files &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1/rename$"  .git/trace &&
+	grep "^event: dir2/rename$"  .git/trace &&
+	grep "^event: rename$"       .git/trace &&
+	grep "^event: dir1/renamed$" .git/trace &&
+	grep "^event: dir2/renamed$" .git/trace &&
+	grep "^event: renamed$"      .git/trace
+'
+
+test_expect_success 'rename directory' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	mv dirtorename dirrenamed &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dirtorename/*$" .git/trace &&
+	grep "^event: dirrenamed/*$"  .git/trace
+'
+
+test_expect_success 'file changes to directory' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	file_to_directory &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: delete$"     .git/trace &&
+	grep "^event: delete/new$" .git/trace
+'
+
+test_expect_success 'directory changes to a file' '
+	test_when_finished clean_up_repo_and_stop_daemon &&
+
+	(
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon
+	) &&
+
+	directory_to_file &&
+
+	test-tool fsmonitor-client query --token 0 >/dev/null 2>&1 &&
+
+	grep "^event: dir1$" .git/trace
+'
+
+# The next few test cases exercise the token-resync code.  When the filesystem
+# drops events (because of filesystem velocity or because the daemon isn't
+# polling fast enough), we need to discard the cached data (relative to the
+# current token) and start collecting events under a new token.
+#
+# The 'test-tool fsmonitor-client flush' command can be used to send a
+# "flush" message to a running daemon and ask it to do a flush/resync.
+
+test_expect_success 'flush cached data' '
+	test_when_finished "stop_daemon_delete_repo test_flush" &&
+
+	git init test_flush &&
+
+	(
+		GIT_TEST_FSMONITOR_TOKEN=true &&
+		export GIT_TEST_FSMONITOR_TOKEN &&
+
+		GIT_TRACE_FSMONITOR="$(pwd)/.git/trace_daemon" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon test_flush
+	) &&
+
+	# The daemon should have an initial token with no events in _0 and
+	# then a few (probably platform-specific number of) events in _1.
+	# These should both have the same <token_id>.
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000001:0" >actual_0 &&
+	nul_to_q <actual_0 >actual_q0 &&
+
+	touch test_flush/file_1 &&
+	touch test_flush/file_2 &&
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000001:0" >actual_1 &&
+	nul_to_q <actual_1 >actual_q1 &&
+
+	grep "file_1" actual_q1 &&
+
+	# Force a flush.  This will change the <token_id>, reset the <seq_nr>, and
+	# flush the file data.  Then create some events and ensure that the file
+	# again appears in the cache.  It should have the new <token_id>.
+
+	test-tool -C test_flush fsmonitor-client flush >flush_0 &&
+	nul_to_q <flush_0 >flush_q0 &&
+	grep "^builtin:test_00000002:0Q/Q$" flush_q0 &&
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000002:0" >actual_2 &&
+	nul_to_q <actual_2 >actual_q2 &&
+
+	grep "^builtin:test_00000002:0Q$" actual_q2 &&
+
+	touch test_flush/file_3 &&
+
+	test-tool -C test_flush fsmonitor-client query --token "builtin:test_00000002:0" >actual_3 &&
+	nul_to_q <actual_3 >actual_q3 &&
+
+	grep "file_3" actual_q3
+'
+
+# The next few test cases create repos where the .git directory is NOT
+# inside the working directory.  That is, where .git is a file
+# that points to a directory elsewhere.  This happens for submodules and
+# non-primary worktrees.
+
+test_expect_success 'setup worktree base' '
+	git init wt-base &&
+	echo 1 >wt-base/file1 &&
+	git -C wt-base add file1 &&
+	git -C wt-base commit -m "c1"
+'
+
+test_expect_success 'worktree with .git file' '
+	git -C wt-base worktree add ../wt-secondary &&
+
+	(
+		GIT_TRACE2_PERF="$(pwd)/trace2_wt_secondary" &&
+		export GIT_TRACE2_PERF &&
+
+		GIT_TRACE_FSMONITOR="$(pwd)/trace_wt_secondary" &&
+		export GIT_TRACE_FSMONITOR &&
+
+		start_daemon wt-secondary
+	) &&
+
+	git -C wt-secondary fsmonitor--daemon stop &&
+	test_must_fail git -C wt-secondary fsmonitor--daemon status
+'
+
+# NEEDSWORK: Repeat one of the "edit" tests on wt-secondary and
+# confirm that we get the same events and behavior -- that is, that
+# fsmonitor--daemon correctly watches BOTH the working directory and
+# the external GITDIR directory and behaves the same as when ".git"
+# is a directory inside the working directory.
+
+test_expect_success 'cleanup worktrees' '
+	stop_daemon_delete_repo wt-secondary &&
+	stop_daemon_delete_repo wt-base
+'
+
+test_done
-- 
gitgitgadget
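
A note on the tokens used by the 'flush cached data' test above: they have
the shape "builtin:<token_id>:<seq_nr>", for example
"builtin:test_00000001:0".  The following is only a minimal sketch of how
such a string could be split apart.  `struct parsed_token` and
`parse_builtin_token()` are hypothetical names made up for this
illustration; they are not functions from this series, and the daemon's
real parsing may differ.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the two variable parts of a token. */
struct parsed_token {
	char token_id[64];
	uint64_t seq_nr;
};

/*
 * Split "builtin:<token_id>:<seq_nr>" into its parts.
 * Returns 0 on success and -1 if `s` does not have that shape.
 */
static int parse_builtin_token(const char *s, struct parsed_token *out)
{
	static const char prefix[] = "builtin:";
	const char *id, *colon;
	char *end;
	size_t id_len;
	unsigned long long v;

	if (strncmp(s, prefix, sizeof(prefix) - 1))
		return -1;
	id = s + sizeof(prefix) - 1;

	/* The sequence number follows the last ':' in the string. */
	colon = strrchr(id, ':');
	if (!colon)
		return -1;
	id_len = (size_t)(colon - id);
	if (!id_len || id_len >= sizeof(out->token_id))
		return -1;

	/* Require at least one digit and nothing but digits. */
	if (colon[1] < '0' || colon[1] > '9')
		return -1;
	errno = 0;
	v = strtoull(colon + 1, &end, 10);
	if (errno || *end)
		return -1;

	memcpy(out->token_id, id, id_len);
	out->token_id[id_len] = '\0';
	out->seq_nr = (uint64_t)v;
	return 0;
}
```

With this sketch, a bare "0" (as passed by the earlier tests) is rejected
because it lacks the "builtin:" prefix.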



* [PATCH v3 28/34] fsmonitor--daemon: periodically truncate list of modified files
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (26 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 27/34] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
                       ` (6 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon to periodically truncate the list of
modified files to save some memory.

Clients will ask for the set of changes relative to a token that they
found in the FSMN index extension in the index.  (This token is like a
point in time, but different).  Clients will then update the index to
contain the response token (so that subsequent commands will be
relative to this new token).

Therefore, the daemon can gradually truncate the in-memory list of
changed paths as they become obsolete (older than the previous token).
Since we may have multiple clients making concurrent requests with a
skew of tokens and clients may be racing to talk to the daemon,
we lazily truncate the list.

We introduce a 5 minute delay and truncate batches 5 minutes after
they are considered obsolete.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
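
For readers who want the truncation rule in isolation, here is a minimal
standalone sketch of the idea described in the commit message.  It is not
the code from this patch: `struct batch` and `truncate_old_batches()` are
simplified, hypothetical stand-ins, the list is assumed to be linked from
newest to oldest, and there is no locking or tracing.

```c
#include <stdlib.h>
#include <time.h>

#define DELAY_SECONDS (5 * 60)

/* Simplified stand-in for a batch; the list runs newest to oldest. */
struct batch {
	struct batch *next;  /* next older batch, NULL at the tail */
	time_t pinned_time;  /* 0 means "never pinned"; such items are skipped */
};

/*
 * Starting at `marker`, find the first batch that was pinned at least
 * DELAY_SECONDS before the marker, make it the new tail, and free
 * everything older than it.  Returns the number of batches freed.
 */
static int truncate_old_batches(struct batch *marker)
{
	struct batch *b, *rest;
	int nr_freed = 0;

	if (!marker)
		return 0;

	for (b = marker; b; b = b->next) {
		if (!b->pinned_time)
			continue;
		if (b->pinned_time + DELAY_SECONDS > marker->pinned_time)
			continue; /* too close to the marker */
		break;
	}
	if (!b)
		return 0; /* nothing is old enough yet */

	rest = b->next;
	b->next = NULL;
	while (rest) {
		struct batch *dead = rest;

		rest = rest->next;
		free(dead);
		nr_freed++;
	}
	return nr_freed;
}
```

Calling it again with the same marker frees nothing, which matches the
"lazy" intent: the list only shrinks once newer markers move far enough
ahead of the old batches.
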
 builtin/fsmonitor--daemon.c | 80 +++++++++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 7a7fef681fe..8249420ba18 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -300,6 +300,77 @@ static void fsmonitor_batch__combine(struct fsmonitor_batch *batch_dest,
 			batch_src->interned_paths[k];
 }
 
+/*
+ * To keep the batch list from growing unbounded in response to filesystem
+ * activity, we try to truncate old batches from the end of the list as
+ * they become irrelevant.
+ *
+ * We assume that the .git/index will be updated with the most recent token
+ * any time the index is updated.  And future commands will only ask for
+ * recent changes *since* that new token.  So as tokens advance into the
+ * future, older batch items will never be requested/needed.  So we can
+ * truncate them without loss of functionality.
+ *
+ * However, multiple commands may be talking to the daemon concurrently
+ * or running a slow command, so a little "token skew" is possible.
+ * Therefore, we want this to be a little bit lazy and have a generous
+ * delay.
+ *
+ * The current reader thread walked backwards in time from `token->batch_head`
+ * back to `batch_marker` somewhere in the middle of the batch list.
+ *
+ * Let's walk backwards in time from that marker an arbitrary delay
+ * and truncate the list there.  Note that these timestamps are completely
+ * artificial (based on when we pinned the batch item) and not on any
+ * filesystem activity.
+ */
+#define MY_TIME_DELAY_SECONDS (5 * 60) /* seconds */
+
+static void with_lock__truncate_old_batches(
+	struct fsmonitor_daemon_state *state,
+	const struct fsmonitor_batch *batch_marker)
+{
+	/* assert current thread holding state->main_lock */
+
+	const struct fsmonitor_batch *batch;
+	struct fsmonitor_batch *rest;
+	struct fsmonitor_batch *p;
+
+	if (!batch_marker)
+		return;
+
+	trace_printf_key(&trace_fsmonitor, "Truncate: mark (%"PRIu64",%"PRIu64")",
+			 batch_marker->batch_seq_nr,
+			 (uint64_t)batch_marker->pinned_time);
+
+	for (batch = batch_marker; batch; batch = batch->next) {
+		time_t t;
+
+		if (!batch->pinned_time) /* an overflow batch */
+			continue;
+
+		t = batch->pinned_time + MY_TIME_DELAY_SECONDS;
+		if (t > batch_marker->pinned_time) /* too close to marker */
+			continue;
+
+		goto truncate_past_here;
+	}
+
+	return;
+
+truncate_past_here:
+	state->current_token_data->batch_tail = (struct fsmonitor_batch *)batch;
+
+	rest = ((struct fsmonitor_batch *)batch)->next;
+	((struct fsmonitor_batch *)batch)->next = NULL;
+
+	for (p = rest; p; p = fsmonitor_batch__pop(p)) {
+		trace_printf_key(&trace_fsmonitor,
+				 "Truncate: kill (%"PRIu64",%"PRIu64")",
+				 p->batch_seq_nr, (uint64_t)p->pinned_time);
+	}
+}
+
 static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
 {
 	struct fsmonitor_batch *p;
@@ -645,6 +716,15 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 			 * that work.
 			 */
 			fsmonitor_free_token_data(token_data);
+		} else if (batch) {
+			/*
+			 * This batch is the first item in the list
+			 * that is older than the requested sequence
+			 * number and might be considered to be
+			 * obsolete.  See if we can truncate the list
+			 * and save some memory.
+			 */
+			with_lock__truncate_old_batches(state, batch);
 		}
 	}
 
-- 
gitgitgadget



* [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (27 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 28/34] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 23:17       ` Ævar Arnfjörð Bjarmason
  2021-07-01 14:47     ` [PATCH v3 30/34] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
                       ` (5 subsequent siblings)
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach fsmonitor--daemon client threads to create a cookie file
inside the .git directory and then wait until FS events for the
cookie are observed by the FS listener thread.

This helps address the racy nature of file system events by
blocking the client response until the kernel has drained any
event backlog.

This is especially important on macOS where kernel events are
only issued with a limited frequency.  See the `latency` argument
of `FSEventStreamCreate()`.  The kernel only signals every `latency`
seconds, but does not guarantee that the kernel queue is completely
drained, so we may have to wait more than one interval.  If we
increase the frequency, the system is more likely to drop events.
We avoid these issues by having each client thread create a unique
cookie file and then wait until it is seen in the event stream.

Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
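
The wait/notify handshake described in the commit message can be looked at
on its own.  The sketch below is a simplified illustration and not the
code in this patch: it uses a small fixed-size table instead of a hashmap,
takes the lock inside each helper, never removes entries, and does not
create any file on disk.  All names (`struct cookie_table`,
`cookie_register()`, `cookie_wait()`, `cookie_mark_seen()`,
`cookie_abort_all()`) are hypothetical.

```c
#include <pthread.h>
#include <string.h>

enum cookie_result { COOKIE_INIT = 0, COOKIE_SEEN, COOKIE_ABORT };

#define MAX_COOKIES 8

struct cookie_table {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	const char *names[MAX_COOKIES];
	enum cookie_result results[MAX_COOKIES];
	int nr;
};

static void cookie_table_init(struct cookie_table *t)
{
	memset(t, 0, sizeof(*t));
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->cond, NULL);
}

/* Client side: register a pending cookie; returns its slot or -1. */
static int cookie_register(struct cookie_table *t, const char *name)
{
	int slot = -1;

	pthread_mutex_lock(&t->lock);
	if (t->nr < MAX_COOKIES) {
		slot = t->nr++;
		t->names[slot] = name;
		t->results[slot] = COOKIE_INIT;
	}
	pthread_mutex_unlock(&t->lock);
	return slot;
}

/* Client side: block until the cookie is marked SEEN or ABORT. */
static enum cookie_result cookie_wait(struct cookie_table *t, int slot)
{
	enum cookie_result r;

	pthread_mutex_lock(&t->lock);
	while (t->results[slot] == COOKIE_INIT)
		pthread_cond_wait(&t->cond, &t->lock);
	r = t->results[slot];
	pthread_mutex_unlock(&t->lock);
	return r;
}

/* Listener side: an event for `name` showed up in the stream. */
static void cookie_mark_seen(struct cookie_table *t, const char *name)
{
	int k;

	pthread_mutex_lock(&t->lock);
	for (k = 0; k < t->nr; k++)
		if (t->results[k] == COOKIE_INIT && !strcmp(t->names[k], name))
			t->results[k] = COOKIE_SEEN;
	pthread_cond_broadcast(&t->cond);
	pthread_mutex_unlock(&t->lock);
}

/* Resync side: events may have been lost, so release all waiters. */
static void cookie_abort_all(struct cookie_table *t)
{
	int k;

	pthread_mutex_lock(&t->lock);
	for (k = 0; k < t->nr; k++)
		if (t->results[k] == COOKIE_INIT)
			t->results[k] = COOKIE_ABORT;
	pthread_cond_broadcast(&t->cond);
	pthread_mutex_unlock(&t->lock);
}
```

In a real daemon the client thread would call `cookie_wait()` while a
separate listener thread calls `cookie_mark_seen()`; anything other than
COOKIE_SEEN would then lead to a trivial response, as in the patch.
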
 builtin/fsmonitor--daemon.c | 228 +++++++++++++++++++++++++++++++++++-
 fsmonitor--daemon.h         |   5 +
 2 files changed, 232 insertions(+), 1 deletion(-)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 8249420ba18..25f18f2726b 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -94,6 +94,149 @@ static int do_as_client__status(void)
 	}
 }
 
+enum fsmonitor_cookie_item_result {
+	FCIR_ERROR = -1, /* could not create cookie file ? */
+	FCIR_INIT = 0,
+	FCIR_SEEN,
+	FCIR_ABORT,
+};
+
+struct fsmonitor_cookie_item {
+	struct hashmap_entry entry;
+	const char *name;
+	enum fsmonitor_cookie_item_result result;
+};
+
+static int cookies_cmp(const void *data, const struct hashmap_entry *he1,
+		     const struct hashmap_entry *he2, const void *keydata)
+{
+	const struct fsmonitor_cookie_item *a =
+		container_of(he1, const struct fsmonitor_cookie_item, entry);
+	const struct fsmonitor_cookie_item *b =
+		container_of(he2, const struct fsmonitor_cookie_item, entry);
+
+	return strcmp(a->name, keydata ? keydata : b->name);
+}
+
+static enum fsmonitor_cookie_item_result with_lock__wait_for_cookie(
+	struct fsmonitor_daemon_state *state)
+{
+	/* assert current thread holding state->main_lock */
+
+	int fd;
+	struct fsmonitor_cookie_item *cookie;
+	struct strbuf cookie_pathname = STRBUF_INIT;
+	struct strbuf cookie_filename = STRBUF_INIT;
+	enum fsmonitor_cookie_item_result result;
+	int my_cookie_seq;
+
+	CALLOC_ARRAY(cookie, 1);
+
+	my_cookie_seq = state->cookie_seq++;
+
+	strbuf_addf(&cookie_filename, "%i-%i", getpid(), my_cookie_seq);
+
+	strbuf_addbuf(&cookie_pathname, &state->path_cookie_prefix);
+	strbuf_addbuf(&cookie_pathname, &cookie_filename);
+
+	cookie->name = strbuf_detach(&cookie_filename, NULL);
+	cookie->result = FCIR_INIT;
+	hashmap_entry_init(&cookie->entry, strhash(cookie->name));
+
+	hashmap_add(&state->cookies, &cookie->entry);
+
+	trace_printf_key(&trace_fsmonitor, "cookie-wait: '%s' '%s'",
+			 cookie->name, cookie_pathname.buf);
+
+	/*
+	 * Create the cookie file on disk and then wait for a notification
+	 * that the listener thread has seen it.
+	 */
+	fd = open(cookie_pathname.buf, O_WRONLY | O_CREAT | O_EXCL, 0600);
+	if (fd >= 0) {
+		close(fd);
+		unlink(cookie_pathname.buf);
+
+		/*
+		 * NEEDSWORK: This is an infinite wait (well, unless another
+		 * thread sends us an abort).  I'd like to change this to
+		 * use `pthread_cond_timedwait()` and return an error/timeout
+		 * and let the caller do the trivial response thing.
+		 */
+		while (cookie->result == FCIR_INIT)
+			pthread_cond_wait(&state->cookies_cond,
+					  &state->main_lock);
+	} else {
+		error_errno(_("could not create fsmonitor cookie '%s'"),
+			    cookie->name);
+
+		cookie->result = FCIR_ERROR;
+	}
+
+	hashmap_remove(&state->cookies, &cookie->entry, NULL);
+
+	result = cookie->result;
+
+	free((char*)cookie->name);
+	free(cookie);
+	strbuf_release(&cookie_pathname);
+
+	return result;
+}
+
+/*
+ * Mark these cookies as _SEEN and wake up the corresponding client threads.
+ */
+static void with_lock__mark_cookies_seen(struct fsmonitor_daemon_state *state,
+					 const struct string_list *cookie_names)
+{
+	/* assert current thread holding state->main_lock */
+
+	int k;
+	int nr_seen = 0;
+
+	for (k = 0; k < cookie_names->nr; k++) {
+		struct fsmonitor_cookie_item key;
+		struct fsmonitor_cookie_item *cookie;
+
+		key.name = cookie_names->items[k].string;
+		hashmap_entry_init(&key.entry, strhash(key.name));
+
+		cookie = hashmap_get_entry(&state->cookies, &key, entry, NULL);
+		if (cookie) {
+			trace_printf_key(&trace_fsmonitor, "cookie-seen: '%s'",
+					 cookie->name);
+			cookie->result = FCIR_SEEN;
+			nr_seen++;
+		}
+	}
+
+	if (nr_seen)
+		pthread_cond_broadcast(&state->cookies_cond);
+}
+
+/*
+ * Set _ABORT on all pending cookies and wake up all client threads.
+ */
+static void with_lock__abort_all_cookies(struct fsmonitor_daemon_state *state)
+{
+	/* assert current thread holding state->main_lock */
+
+	struct hashmap_iter iter;
+	struct fsmonitor_cookie_item *cookie;
+	int nr_aborted = 0;
+
+	hashmap_for_each_entry(&state->cookies, &iter, cookie, entry) {
+		trace_printf_key(&trace_fsmonitor, "cookie-abort: '%s'",
+				 cookie->name);
+		cookie->result = FCIR_ABORT;
+		nr_aborted++;
+	}
+
+	if (nr_aborted)
+		pthread_cond_broadcast(&state->cookies_cond);
+}
+
 /*
  * Requests to and from a FSMonitor Protocol V2 provider use an opaque
  * "token" as a virtual timestamp.  Clients can request a summary of all
@@ -397,6 +540,9 @@ static void fsmonitor_free_token_data(struct fsmonitor_token_data *token)
  *     We should create a new token and start fresh (as if we just
  *     booted up).
  *
+ * [2] Some of those lost events may have been for cookie files.  We
+ *     should assume the worst and abort them rather than letting them starve.
+ *
  * If there are no concurrent threads readering the current token data
  * series, we can free it now.  Otherwise, let the last reader free
  * it.
@@ -418,6 +564,8 @@ static void with_lock__do_force_resync(struct fsmonitor_daemon_state *state)
 	state->current_token_data = new_one;
 
 	fsmonitor_free_token_data(free_me);
+
+	with_lock__abort_all_cookies(state);
 }
 
 void fsmonitor_force_resync(struct fsmonitor_daemon_state *state)
@@ -492,6 +640,8 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 	int hash_ret;
 	int do_trivial = 0;
 	int do_flush = 0;
+	int do_cookie = 0;
+	enum fsmonitor_cookie_item_result cookie_result;
 
 	/*
 	 * We expect `command` to be of the form:
@@ -552,6 +702,7 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 			 * We have a V2 valid token:
 			 *     "builtin:<token_id>:<seq_nr>"
 			 */
+			do_cookie = 1;
 		}
 	}
 
@@ -560,6 +711,30 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 	if (!state->current_token_data)
 		BUG("fsmonitor state does not have a current token");
 
+	/*
+	 * Write a cookie file inside the directory being watched in
+	 * an effort to flush out existing filesystem events that we
+	 * actually care about.  Suspend this client thread until we
+	 * see the filesystem events for this cookie file.
+	 *
+	 * Creating the cookie lets us guarantee that our FS listener
+	 * thread has drained the kernel queue and we are caught up
+	 * with the kernel.
+	 *
+	 * If we cannot create the cookie (or otherwise guarantee that
+	 * we are caught up), we send a trivial response.  We have to
+	 * assume that there might be some very, very recent activity
+	 * on the FS still in flight.
+	 */
+	if (do_cookie) {
+		cookie_result = with_lock__wait_for_cookie(state);
+		if (cookie_result != FCIR_SEEN) {
+			error(_("fsmonitor: cookie_result '%d' != SEEN"),
+			      cookie_result);
+			do_trivial = 1;
+		}
+	}
+
 	if (do_flush)
 		with_lock__do_force_resync(state);
 
@@ -771,7 +946,9 @@ static int handle_client(void *data,
 	return result;
 }
 
-#define FSMONITOR_COOKIE_PREFIX ".fsmonitor-daemon-"
+#define FSMONITOR_DIR           "fsmonitor--daemon"
+#define FSMONITOR_COOKIE_DIR    "cookies"
+#define FSMONITOR_COOKIE_PREFIX (FSMONITOR_DIR "/" FSMONITOR_COOKIE_DIR "/")
 
 enum fsmonitor_path_type fsmonitor_classify_path_workdir_relative(
 	const char *rel)
@@ -924,6 +1101,9 @@ void fsmonitor_publish(struct fsmonitor_daemon_state *state,
 		}
 	}
 
+	if (cookie_names->nr)
+		with_lock__mark_cookies_seen(state, cookie_names);
+
 	pthread_mutex_unlock(&state->main_lock);
 }
 
@@ -1013,7 +1193,9 @@ static int fsmonitor_run_daemon(void)
 
 	memset(&state, 0, sizeof(state));
 
+	hashmap_init(&state.cookies, cookies_cmp, NULL, 0);
 	pthread_mutex_init(&state.main_lock, NULL);
+	pthread_cond_init(&state.cookies_cond, NULL);
 	state.error_code = 0;
 	state.current_token_data = fsmonitor_new_token_data();
 
@@ -1038,6 +1220,44 @@ static int fsmonitor_run_daemon(void)
 		state.nr_paths_watching = 2;
 	}
 
+	/*
+	 * We will write filesystem syncing cookie files into
+	 * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>.
+	 *
+	 * The extra layers of subdirectories here keep us from
+	 * changing the mtime on ".git/" or ".git/foo/" when we create
+	 * or delete cookie files.
+	 *
+	 * There have been problems with some IDEs that do a
+	 * non-recursive watch of the ".git/" directory and run a
+	 * series of commands any time something happens.
+	 *
+	 * For example, if we place our cookie files directly in
+	 * ".git/" or ".git/foo/" then a `git status` (or similar
+	 * command) from the IDE will cause a cookie file to be
+	 * created in one of those dirs.  This causes the mtime of
+	 * those dirs to change.  This triggers the IDE's watch
+	 * notification.  This triggers the IDE to run those commands
+	 * again.  And the process repeats and the machine never goes
+	 * idle.
+	 *
+	 * Adding the extra layers of subdirectories prevents the
+	 * mtime of ".git/" and ".git/foo" from changing when a
+	 * cookie file is created.
+	 */
+	strbuf_init(&state.path_cookie_prefix, 0);
+	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
+
+	strbuf_addch(&state.path_cookie_prefix, '/');
+	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_DIR);
+	mkdir(state.path_cookie_prefix.buf, 0777);
+
+	strbuf_addch(&state.path_cookie_prefix, '/');
+	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_DIR);
+	mkdir(state.path_cookie_prefix.buf, 0777);
+
+	strbuf_addch(&state.path_cookie_prefix, '/');
+
 	/*
 	 * Confirm that we can create platform-specific resources for the
 	 * filesystem listener before we bother starting all the threads.
@@ -1050,6 +1270,7 @@ static int fsmonitor_run_daemon(void)
 	err = fsmonitor_run_daemon_1(&state);
 
 done:
+	pthread_cond_destroy(&state.cookies_cond);
 	pthread_mutex_destroy(&state.main_lock);
 	fsmonitor_fs_listen__dtor(&state);
 
@@ -1057,6 +1278,11 @@ done:
 
 	strbuf_release(&state.path_worktree_watch);
 	strbuf_release(&state.path_gitdir_watch);
+	strbuf_release(&state.path_cookie_prefix);
+
+	/*
+	 * NEEDSWORK: Consider "rm -rf <gitdir>/<fsmonitor-dir>"
+	 */
 
 	return err;
 }
diff --git a/fsmonitor--daemon.h b/fsmonitor--daemon.h
index 89a9ef20b24..e9fc099bae9 100644
--- a/fsmonitor--daemon.h
+++ b/fsmonitor--daemon.h
@@ -45,6 +45,11 @@ struct fsmonitor_daemon_state {
 
 	struct fsmonitor_token_data *current_token_data;
 
+	struct strbuf path_cookie_prefix;
+	pthread_cond_t cookies_cond;
+	int cookie_seq;
+	struct hashmap cookies;
+
 	int error_code;
 	struct fsmonitor_daemon_backend_data *backend_data;
 
-- 
gitgitgadget



* [PATCH v3 30/34] fsmonitor: enhance existing comments
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (28 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 31/34] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
                       ` (4 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
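
The comments added by this patch describe two response shapes: a list of
NUL-delimited relative pathnames (directories carry a trailing slash), or
a single '/' meaning "trivial response".  The following is a minimal
sketch of a consumer of that format, written for illustration only.
`for_each_changed_path()`, `path_fn` and `count_dirs()` are hypothetical
names, not part of fsmonitor.c.  The sketch assumes the header token has
already been stripped and that `buf[len]` is a NUL byte, and unlike the
real loop it skips empty entries.

```c
#include <stddef.h>
#include <string.h>

typedef void (*path_fn)(const char *path, int is_dir, void *data);

/*
 * Walk the body of a response.  Returns -1 for a trivial response
 * (caller should invalidate everything), otherwise the number of
 * pathnames passed to `fn`.
 */
static int for_each_changed_path(const char *buf, size_t len,
				 path_fn fn, void *data)
{
	size_t bol = 0, i;
	int count = 0;

	if (len && buf[0] == '/')
		return -1;

	for (i = 0; i <= len; i++) {
		size_t n;

		/* An entry ends at a NUL or at the end of the buffer. */
		if (i < len && buf[i] != '\0')
			continue;
		n = i - bol;
		if (n) {
			fn(buf + bol, buf[bol + n - 1] == '/', data);
			count++;
		}
		bol = i + 1;
	}
	return count;
}

/* Example callback: count how many of the reported paths are directories. */
static void count_dirs(const char *path, int is_dir, void *data)
{
	(void)path;
	if (is_dir)
		++*(int *)data;
}
```
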
 fsmonitor.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/fsmonitor.c b/fsmonitor.c
index 3719ddfeec9..f53791c8674 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -360,9 +360,25 @@ void refresh_fsmonitor(struct index_state *istate)
 	}
 
 apply_results:
-	/* a fsmonitor process can return '/' to indicate all entries are invalid */
+	/*
+	 * The response from FSMonitor (excluding the header token) is
+	 * either:
+	 *
+	 * [a] a (possibly empty) list of NUL delimited relative
+	 *     pathnames of changed paths.  This list can contain
+	 *     files and directories.  Directories have a trailing
+	 *     slash.
+	 *
+	 * [b] a single '/' to indicate the provider had no
+	 *     information and that we should consider everything
+	 *     invalid.  We call this a trivial response.
+	 */
 	if (query_success && query_result.buf[bol] != '/') {
-		/* Mark all entries returned by the monitor as dirty */
+		/*
+		 * Mark all pathnames returned by the monitor as dirty.
+		 *
+		 * This updates both the cache-entries and the untracked-cache.
+		 */
 		buf = query_result.buf;
 		for (i = bol; i < query_result.len; i++) {
 			if (buf[i] != '\0')
@@ -377,11 +393,15 @@ apply_results:
 		if (istate->untracked)
 			istate->untracked->use_fsmonitor = 1;
 	} else {
-
-		/* We only want to run the post index changed hook if we've actually changed entries, so keep track
-		 * if we actually changed entries or not */
+		/*
+		 * We received a trivial response, so invalidate everything.
+		 *
+		 * We only want to run the post index changed hook if
+		 * we've actually changed entries, so keep track if we
+		 * actually changed entries or not.
+		 */
 		int is_cache_changed = 0;
-		/* Mark all entries invalid */
+
 		for (i = 0; i < istate->cache_nr; i++) {
 			if (istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) {
 				is_cache_changed = 1;
@@ -389,7 +409,10 @@ apply_results:
 			}
 		}
 
-		/* If we're going to check every file, ensure we save the results */
+		/*
+		 * If we're going to check every file, ensure we save
+		 * the results.
+		 */
 		if (is_cache_changed)
 			istate->cache_changed |= FSMONITOR_CHANGED;
 
-- 
gitgitgadget



* [PATCH v3 31/34] fsmonitor: force update index after large responses
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (29 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 30/34] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 32/34] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
                       ` (3 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when
FSMonitor returns a very large response to ensure that the index is
written to disk.

Normally, when the FSMonitor response includes a tracked file, the
index is always updated.  Similarly, the index might be updated when
the response alters the untracked-cache (when enabled).  However, in
cases where neither of those cause the index to be considered changed,
the FSMonitor response is wasted.  Subsequent Git commands will make
requests with the same token and receive the same response.

If that response is very large, performance may suffer.  It would be
more efficient to force update the index now (and the token in the
index extension) in order to reduce the size of the response received
by future commands.

This was observed on Windows after a large checkout.  On Windows, the
kernel emits events for the files that are changed as they are
changed.  However, it might delay events for the containing
directories until the system is more idle (or someone scans the
directory (so it seems)).  The first status following a checkout would
get the list of files.  The subsequent status commands would get the
list of directories as the events trickled out.  But they would never
catch up because the token was not advanced because the index wasn't
updated.

This list of directories caused `wt_status_collect_untracked()` to
unnecessarily spend time actually scanning them during each command.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
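
The decision described in the commit message boils down to "count the
pathnames in the response and compare against a threshold".  A minimal
standalone sketch of that check follows.  `count_paths()` and
`should_force_index_update()` are hypothetical helpers for illustration;
the patch itself does the counting inline in `refresh_fsmonitor()`.  The
threshold value mirrors the one in the patch.

```c
#include <stddef.h>

#define FORCE_UPDATE_THRESHOLD 100 /* same value as in the patch */

/*
 * Count the entries in a NUL-delimited list.  A final entry that is
 * not NUL-terminated within `len` bytes is counted as well.
 */
static int count_paths(const char *buf, size_t len)
{
	size_t bol = 0, i;
	int count = 0;

	for (i = 0; i < len; i++) {
		if (buf[i] != '\0')
			continue;
		bol = i + 1;
		count++;
	}
	if (bol < len)
		count++;
	return count;
}

/* Nonzero when the response is large enough to justify writing the index. */
static int should_force_index_update(const char *buf, size_t len)
{
	return count_paths(buf, len) > FORCE_UPDATE_THRESHOLD;
}
```

Because the comparison is strict, a response with exactly 100 pathnames
does not force an update in this sketch; 101 does.
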
 fsmonitor.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/fsmonitor.c b/fsmonitor.c
index f53791c8674..eee653f9337 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -236,6 +236,45 @@ static void fsmonitor_refresh_callback(struct index_state *istate, char *name)
 	untracked_cache_invalidate_path(istate, name, 0);
 }
 
+/*
+ * The number of pathnames that we need to receive from FSMonitor
+ * before we force the index to be updated.
+ *
+ * Note that any pathname within the set of received paths MAY cause
+ * cache-entry or istate flag bits to be updated and thus cause the
+ * index to be updated on disk.
+ *
+ * However, the response may contain many paths (such as ignored
+ * paths) that will not update any flag bits.  And thus not force the
+ * index to be updated.  (This is fine and normal.)  It also means
+ * that the token will not be updated in the FSMonitor index
+ * extension.  So the next Git command will find the same token in the
+ * index, make the same token-relative request, and receive the same
+ * response (plus any newly changed paths).  If this response is large
+ * (and continues to grow), performance could be impacted.
+ *
+ * For example, if the user runs a build and it writes 100K object
+ * files but doesn't modify any source files, the index would not need
+ * to be updated.  The FSMonitor response (after the build and
+ * relative to a pre-build token) might be 5MB.  Each subsequent Git
+ * command will receive that same 100K/5MB response until something
+ * causes the index to be updated.  And `refresh_fsmonitor()` will
+ * have to iterate over those 100K paths each time.
+ *
+ * Performance could be improved if we optionally force update the
+ * index after a very large response and get an updated token into
+ * the FSMonitor index extension.  This should allow subsequent
+ * commands to get smaller and more current responses.
+ *
+ * The value chosen here does not need to be precise.  The index
+ * will be updated automatically the first time the user touches
+ * a tracked file and causes a command like `git status` to
+ * update an mtime and/or set a flag bit.
+ *
+ * NEEDSWORK: Does this need to be a config value?
+ */
+static int fsmonitor_force_update_threshold = 100;
+
 void refresh_fsmonitor(struct index_state *istate)
 {
 	struct strbuf query_result = STRBUF_INIT;
@@ -379,19 +418,28 @@ apply_results:
 		 *
 		 * This updates both the cache-entries and the untracked-cache.
 		 */
+		int count = 0;
+
 		buf = query_result.buf;
 		for (i = bol; i < query_result.len; i++) {
 			if (buf[i] != '\0')
 				continue;
 			fsmonitor_refresh_callback(istate, buf + bol);
 			bol = i + 1;
+			count++;
 		}
-		if (bol < query_result.len)
+		if (bol < query_result.len) {
 			fsmonitor_refresh_callback(istate, buf + bol);
+			count++;
+		}
 
 		/* Now mark the untracked cache for fsmonitor usage */
 		if (istate->untracked)
 			istate->untracked->use_fsmonitor = 1;
+
+		if (count > fsmonitor_force_update_threshold)
+			istate->cache_changed |= FSMONITOR_CHANGED;
+
 	} else {
 		/*
 		 * We received a trivial response, so invalidate everything.
-- 
gitgitgadget



* [PATCH v3 32/34] t7527: test status with untracked-cache and fsmonitor--daemon
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (30 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 31/34] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 33/34] fsmonitor: handle shortname for .git Jeff Hostetler via GitGitGadget
                       ` (2 subsequent siblings)
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Create 2x2 test matrix with the untracked-cache and fsmonitor--daemon
features and a series of edits and verify that status output is
identical.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t7527-builtin-fsmonitor.sh | 87 ++++++++++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 58b56dc9940..d1832702397 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -152,6 +152,8 @@ test_expect_success 'setup' '
 	.gitignore
 	expect*
 	actual*
+	flush*
+	trace*
 	EOF
 
 	git -c core.useBuiltinFSMonitor= add . &&
@@ -494,4 +496,89 @@ test_expect_success 'cleanup worktrees' '
 	stop_daemon_delete_repo wt-base
 '
 
+# The next few tests perform arbitrary/contrived file operations and
+# confirm that status is correct.  That is, that the data (or lack of
+# data) from fsmonitor doesn't cause incorrect results.  And doesn't
+# cause incorrect results when the untracked-cache is enabled.
+
+test_lazy_prereq UNTRACKED_CACHE '
+	{ git update-index --test-untracked-cache; ret=$?; } &&
+	test $ret -ne 1
+'
+
+test_expect_success 'Matrix: setup for untracked-cache,fsmonitor matrix' '
+	test_might_fail git config --unset core.useBuiltinFSMonitor &&
+	git update-index --no-fsmonitor &&
+	test_might_fail git fsmonitor--daemon stop
+'
+
+matrix_clean_up_repo () {
+	git reset --hard HEAD
+	git clean -fd
+}
+
+matrix_try () {
+	uc=$1
+	fsm=$2
+	fn=$3
+
+	test_expect_success "Matrix[uc:$uc][fsm:$fsm] $fn" '
+		matrix_clean_up_repo &&
+		$fn &&
+		if test $uc = false -a $fsm = false
+		then
+			git status --porcelain=v1 >.git/expect.$fn
+		else
+			git status --porcelain=v1 >.git/actual.$fn
+			test_cmp .git/expect.$fn .git/actual.$fn
+		fi
+	'
+
+	return $?
+}
+
+uc_values="false"
+test_have_prereq UNTRACKED_CACHE && uc_values="false true"
+for uc_val in $uc_values
+do
+	if test $uc_val = false
+	then
+		test_expect_success "Matrix[uc:$uc_val] disable untracked cache" '
+			git config core.untrackedcache false &&
+			git update-index --no-untracked-cache
+		'
+	else
+		test_expect_success "Matrix[uc:$uc_val] enable untracked cache" '
+			git config core.untrackedcache true &&
+			git update-index --untracked-cache
+		'
+	fi
+
+	fsm_values="false true"
+	for fsm_val in $fsm_values
+	do
+		if test $fsm_val = false
+		then
+			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] disable fsmonitor" '
+				test_might_fail git config --unset core.useBuiltinFSMonitor &&
+				git update-index --no-fsmonitor &&
+				test_might_fail git fsmonitor--daemon stop 2>/dev/null
+			'
+		else
+			test_expect_success "Matrix[uc:$uc_val][fsm:$fsm_val] enable fsmonitor" '
+				git config core.useBuiltinFSMonitor true &&
+				git fsmonitor--daemon start &&
+				git update-index --fsmonitor
+			'
+		fi
+
+		matrix_try $uc_val $fsm_val edit_files
+		matrix_try $uc_val $fsm_val delete_files
+		matrix_try $uc_val $fsm_val create_files
+		matrix_try $uc_val $fsm_val rename_files
+		matrix_try $uc_val $fsm_val file_to_directory
+		matrix_try $uc_val $fsm_val directory_to_file
+	done
+done
+
 test_done
-- 
gitgitgadget



* [PATCH v3 33/34] fsmonitor: handle shortname for .git
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (31 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 32/34] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 14:47     ` [PATCH v3 34/34] t7527: test FS event reporing on MacOS WRT case and Unicode Jeff Hostetler via GitGitGadget
  2021-07-01 17:40     ` [PATCH v3 00/34] Builtin FSMonitor Feature Ævar Arnfjörð Bjarmason
  34 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

On Windows, teach FSMonitor to recognize the shortname of ".git"
as an alias for ".git".

Sometimes we receive FS events using the shortname, such as when
a CMD shell runs "RENAME GIT~1 FOO" or "RMDIR GIT~1".  The FS
notification arrives using whatever combination of long and
shortnames used by the other process.  (Shortnames do seem to
be case normalized, however.)

NEEDSWORK: This only addresses the case of removing or renaming
the ".git" directory using the shortname alias, so that the daemon
properly shuts down.  I'm leaving it a task for later to handle
the general case of shortnames and report them to the fsmonitor
client process.  This would include tracked and untracked paths
that just happen to have a shortname alias.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 192 +++++++++++++++----
 t/t7527-builtin-fsmonitor.sh                 |  65 +++++++
 2 files changed, 217 insertions(+), 40 deletions(-)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
index d707d47a0d7..f2ea5940790 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -48,6 +48,8 @@ struct fsmonitor_daemon_backend_data
 #define LISTENER_HAVE_DATA_WORKTREE 1
 #define LISTENER_HAVE_DATA_GITDIR 2
 	int nr_listener_handles;
+
+	struct strbuf dot_git_shortname;
 };
 
 /*
@@ -250,6 +252,62 @@ static void cancel_rdcw_watch(struct one_watch *watch)
 	watch->is_active = FALSE;
 }
 
+/*
+ * Process a single relative pathname event.
+ * Return 1 if we should shutdown.
+ */
+static int process_1_worktree_event(
+	FILE_NOTIFY_INFORMATION *info,
+	struct string_list *cookie_list,
+	struct fsmonitor_batch **batch,
+	const struct strbuf *path,
+	enum fsmonitor_path_type t)
+{
+	const char *slash;
+
+	switch (t) {
+	case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
+		/* special case cookie files within .git */
+
+		/* Use just the filename of the cookie file. */
+		slash = find_last_dir_sep(path->buf);
+		string_list_append(cookie_list,
+				   slash ? slash + 1 : path->buf);
+		break;
+
+	case IS_INSIDE_DOT_GIT:
+		/* ignore everything inside of "<worktree>/.git/" */
+		break;
+
+	case IS_DOT_GIT:
+		/* "<worktree>/.git" was deleted (or renamed away) */
+		if ((info->Action == FILE_ACTION_REMOVED) ||
+		    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
+			trace2_data_string("fsmonitor", NULL,
+					   "fsm-listen/dotgit",
+					   "removed");
+			return 1;
+		}
+		break;
+
+	case IS_WORKDIR_PATH:
+		/* queue normal pathname */
+		if (!*batch)
+			*batch = fsmonitor_batch__new();
+		fsmonitor_batch__add_path(*batch, path->buf);
+		break;
+
+	case IS_GITDIR:
+	case IS_INSIDE_GITDIR:
+	case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
+	default:
+		BUG("unexpected path classification '%d' for '%s'",
+		    t, path->buf);
+	}
+
+	return 0;
+}
+
 /*
  * Process filesystem events that happen anywhere (recursively) under the
  * <worktree> root directory.  For a normal working directory, this includes
@@ -294,7 +352,6 @@ static int process_worktree_events(struct fsmonitor_daemon_state *state)
 	 */
 	for (;;) {
 		FILE_NOTIFY_INFORMATION *info = (void *)p;
-		const char *slash;
 		enum fsmonitor_path_type t;
 
 		strbuf_reset(&path);
@@ -303,45 +360,45 @@ static int process_worktree_events(struct fsmonitor_daemon_state *state)
 
 		t = fsmonitor_classify_path_workdir_relative(path.buf);
 
-		switch (t) {
-		case IS_INSIDE_DOT_GIT_WITH_COOKIE_PREFIX:
-			/* special case cookie files within .git */
-
-			/* Use just the filename of the cookie file. */
-			slash = find_last_dir_sep(path.buf);
-			string_list_append(&cookie_list,
-					   slash ? slash + 1 : path.buf);
-			break;
-
-		case IS_INSIDE_DOT_GIT:
-			/* ignore everything inside of "<worktree>/.git/" */
-			break;
-
-		case IS_DOT_GIT:
-			/* "<worktree>/.git" was deleted (or renamed away) */
-			if ((info->Action == FILE_ACTION_REMOVED) ||
-			    (info->Action == FILE_ACTION_RENAMED_OLD_NAME)) {
-				trace2_data_string("fsmonitor", NULL,
-						   "fsm-listen/dotgit",
-						   "removed");
-				goto force_shutdown;
-			}
-			break;
-
-		case IS_WORKDIR_PATH:
-			/* queue normal pathname */
-			if (!batch)
-				batch = fsmonitor_batch__new();
-			fsmonitor_batch__add_path(batch, path.buf);
-			break;
-
-		case IS_GITDIR:
-		case IS_INSIDE_GITDIR:
-		case IS_INSIDE_GITDIR_WITH_COOKIE_PREFIX:
-		default:
-			BUG("unexpected path classification '%d' for '%s'",
-			    t, path.buf);
-		}
+		if (process_1_worktree_event(info, &cookie_list, &batch,
+					     &path, t))
+			goto force_shutdown;
+
+		/*
+		 * NEEDSWORK: If `path` contains a shortname (that is,
+		 * if any component within it is a shortname), we
+		 * should expand it to a longname (See
+		 * `GetLongPathNameW()`) and re-normalize, classify,
+		 * and process it because our client is probably
+		 * expecting "normal" paths.
+		 *
+		 * HOWEVER, if our process has called `chdir()` to get
+		 * us out of the root of the worktree (so that the
+		 * root directory is not busy), then we have to be
+		 * careful to convert the paths in the INFO array
+		 * (which are relative to the directory of the RDCW
+		 * watch and not the CWD) into absolute paths before
+		 * calling GetLongPathNameW() and then convert the
+		 * computed value back to a RDCW-relative pathname
+		 * (which is what we and the client expect).
+		 *
+		 * FOR NOW, just handle case (1) exactly so that we
+		 * shutdown properly when ".git" is deleted via the
+		 * shortname alias.
+		 *
+		 * We might see case (2) events for cookie files, but
+		 * we can ignore them.
+		 *
+		 * FOR LATER, handle case (3) where the worktree
+		 * events contain shortnames.  We should convert
+		 * them to longnames to avoid confusing the client.
+		 */
+		if (data->dot_git_shortname.len &&
+		    !strcmp(path.buf, data->dot_git_shortname.buf) &&
+		    process_1_worktree_event(info, &cookie_list, &batch,
+					     &data->dot_git_shortname,
+					     IS_DOT_GIT))
+			goto force_shutdown;
 
 skip_this_path:
 		if (!info->NextEntryOffset)
@@ -415,6 +472,14 @@ static int process_gitdir_events(struct fsmonitor_daemon_state *state)
 			    t, path.buf);
 		}
 
+		/*
+		 * WRT shortnames, this external gitdir will not see
+		 * case (1) nor case (3) events.
+		 *
+		 * We might see case (2) events for cookie files, but
+		 * we can ignore them.
+		 */
+
 skip_this_path:
 		if (!info->NextEntryOffset)
 			break;
@@ -493,6 +558,7 @@ clean_shutdown:
 int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 {
 	struct fsmonitor_daemon_backend_data *data;
+	char shortname[16]; /* a padded 8.3 buffer */
 
 	CALLOC_ARRAY(data, 1);
 
@@ -523,6 +589,52 @@ int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
 		data->nr_listener_handles++;
 	}
 
+	/*
+	 * NEEDSWORK: Properly handle 8.3 shortnames.  RDCW events can
+	 * contain a shortname (if another application uses a
+	 * shortname in a system call).  We care about aliasing and
+	 * the use of shortnames for:
+	 *
+	 * (1) ".git",
+	 *     -- if an external process deletes ".git" using "GIT~1",
+	 *        we need to catch that and shutdown.
+	 *
+	 * (2) our cookie files,
+	 *     -- if an external process deletes one of our cookie
+	 *        files using a shortname, we will get a shortname
+	 *        event for it.  However, we should have already
+	 *        gotten a longname event for it when we created the
+	 *        cookie, so we can safely discard the shortname
+	 *        events for cookie files.
+	 *
+	 * (3) the spelling of modified files that we report to clients.
+	 *     -- we need to report the longname to the client because
+	 *        that is what they are expecting.  Presumably, the
+	 *        client is going to lookup the paths that we report
+	 *        in their index and untracked-cache, so we should
+	 *        normalize the data for them.  (Technically, they
+	 *        could adapt, so we could relax this maybe.)
+	 *
+	 * FOR NOW, while our CWD is at the root of the worktree we
+	 * can easily get the spelling of the shortname of ".git" (if
+	 * the volume has shortnames enabled).  For most worktrees
+	 * this value will be "GIT~1", but we don't want to assume
+	 * that.
+	 *
+	 * Capture this so that we can handle (1).
+	 *
+	 * We leave (3) for a future effort.
+	 */
+	strbuf_init(&data->dot_git_shortname, 0);
+	GetShortPathNameA(".git", shortname, sizeof(shortname));
+	if (!strcmp(".git", shortname))
+		trace_printf_key(&trace_fsmonitor, "No shortname for '.git'");
+	else {
+		trace_printf_key(&trace_fsmonitor,
+				 "Shortname of '.git' is '%s'", shortname);
+		strbuf_addstr(&data->dot_git_shortname, shortname);
+	}
+
 	state->backend_data = data;
 	return 0;
 
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index d1832702397..b166b4a0a31 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -113,6 +113,71 @@ test_expect_success 'implicit daemon stop (rename .git)' '
 	test_must_fail git -C test_implicit_2 fsmonitor--daemon status
 '
 
+# File systems on Windows may or may not have shortnames.
+# This is a volume-specific setting on modern systems.
+# "C:/" drives are required to have them enabled.  Other
+# hard drives default to disabled.
+#
+# This is a crude test to see if shortnames are enabled
+# on the volume containing the test directory.  It is
+# crude, but it does not require elevation like `fsutil`.
+#
+test_lazy_prereq SHORTNAMES '
+	mkdir .foo &&
+	test -d "FOO~1"
+'
+
+# Here we assume that the shortname of ".git" is "GIT~1".
+test_expect_success MINGW,SHORTNAMES 'implicit daemon stop (rename GIT~1)' '
+	test_when_finished "stop_daemon_delete_repo test_implicit_1s" &&
+
+	git init test_implicit_1s &&
+
+	start_daemon test_implicit_1s &&
+
+	# renaming the .git directory will implicitly stop the daemon.
+	# this moves {.git, GIT~1} to {.gitxyz, GITXYZ~1}.
+	# the rename-from FS Event will contain the shortname.
+	#
+	mv test_implicit_1s/GIT~1 test_implicit_1s/.gitxyz &&
+
+	sleep 1 &&
+	# put it back so that our status will not crawl out to our
+	# parent directory.
+	# this moves {.gitxyz, GITXYZ~1} to {.git, GIT~1}.
+	mv test_implicit_1s/.gitxyz test_implicit_1s/.git &&
+
+	test_must_fail git -C test_implicit_1s fsmonitor--daemon status
+'
+
+# Here we first create a file with LONGNAME of "GIT~1" before
+# we create the repo.  This will cause the shortname of ".git"
+# to be "GIT~2".
+test_expect_success MINGW,SHORTNAMES 'implicit daemon stop (rename GIT~2)' '
+	test_when_finished "stop_daemon_delete_repo test_implicit_1s2" &&
+
+	mkdir test_implicit_1s2 &&
+	echo HELLO >test_implicit_1s2/GIT~1 &&
+	git init test_implicit_1s2 &&
+
+	[ -f test_implicit_1s2/GIT~1 ] &&
+	[ -d test_implicit_1s2/GIT~2 ] &&
+
+	start_daemon test_implicit_1s2 &&
+
+	# renaming the .git directory will implicitly stop the daemon.
+	# the rename-from FS Event will contain the shortname.
+	#
+	mv test_implicit_1s2/GIT~2 test_implicit_1s2/.gitxyz &&
+
+	sleep 1 &&
+	# put it back so that our status will not crawl out to our
+	# parent directory.
+	mv test_implicit_1s2/.gitxyz test_implicit_1s2/.git &&
+
+	test_must_fail git -C test_implicit_1s2 fsmonitor--daemon status
+'
+
 test_expect_success 'cannot start multiple daemons' '
 	test_when_finished "stop_daemon_delete_repo test_multiple" &&
 
-- 
gitgitgadget



* [PATCH v3 34/34] t7527: test FS event reporing on MacOS WRT case and Unicode
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (32 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 33/34] fsmonitor: handle shortname for .git Jeff Hostetler via GitGitGadget
@ 2021-07-01 14:47     ` Jeff Hostetler via GitGitGadget
  2021-07-01 23:39       ` Ævar Arnfjörð Bjarmason
  2021-07-01 17:40     ` [PATCH v3 00/34] Builtin FSMonitor Feature Ævar Arnfjörð Bjarmason
  34 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler via GitGitGadget @ 2021-07-01 14:47 UTC (permalink / raw)
  To: git
  Cc: Johannes Schindelin, Jeff Hostetler, Derrick Stolee,
	Jeff Hostetler, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Confirm that MacOS FS events are reported with a normalized spelling.

APFS (and/or HFS+) is case-insensitive.  This means that case-independent
lookups ( [ -d .git ] and [ -d .GIT ] ) should both succeed.  But that
doesn't tell us how FS events are reported if we try "rm -rf .git" versus
"rm -rf .GIT".  Are the events reported using the on-disk spelling of the
pathname or in the spelling used by the command?

NEEDSWORK: I was only able to test case.  It would be nice to add tests
that use different Unicode spellings/normalizations and understand the
differences between APFS and HFS+ in this area.  We should confirm that
the spelling of the workdir paths that the daemon sends to clients are
always properly normalized.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 t/t7527-builtin-fsmonitor.sh | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index b166b4a0a31..d2ff1bf6c49 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -178,6 +178,36 @@ test_expect_success MINGW,SHORTNAMES 'implicit daemon stop (rename GIT~2)' '
 	test_must_fail git -C test_implicit_1s2 fsmonitor--daemon status
 '
 
+# Confirm that MacOS hides all of the Unicode normalization and/or
+# case folding from the FS events.  That is, are the pathnames in the
+# FS events reported using the spelling on the disk or in the spelling
+# used by the other process?
+#
+# Note that we assume that the filesystem is set to case insensitive.
+#
+# NEEDSWORK: APFS handles Unicode and Unicode normalization
+# differently than HFS+.  I only have an APFS partition, so
+# more testing here would be helpful.
+#
+
+# Rename .git using alternate spelling and confirm that the daemon
+# sees the event using the correct spelling and shuts down.
+test_expect_success UTF8_NFD_TO_NFC 'MacOS event spelling (rename .GIT)' '
+	test_when_finished "stop_daemon_delete_repo test_apfs" &&
+
+	git init test_apfs &&
+	start_daemon test_apfs &&
+
+	[ -d test_apfs/.git ] &&
+	[ -d test_apfs/.GIT ] &&
+
+	mv test_apfs/.GIT test_apfs/.FOO &&
+	sleep 1 &&
+	mv test_apfs/.FOO test_apfs/.git &&
+
+	test_must_fail git -C test_apfs fsmonitor--daemon status
+'
+
 test_expect_success 'cannot start multiple daemons' '
 	test_when_finished "stop_daemon_delete_repo test_multiple" &&
 
-- 
gitgitgadget


* Re: [PATCH v3 06/34] fsmonitor: config settings are repository-specific
  2021-07-01 14:47     ` [PATCH v3 06/34] fsmonitor: config settings are repository-specific Jeff Hostetler via GitGitGadget
@ 2021-07-01 16:46       ` Ævar Arnfjörð Bjarmason
  2021-07-19 20:36         ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 16:46 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

In a reference to a discussion[1] about an earlier version of this patch
you said:

    I'm going to ignore all of the thread responses to this patch
    dealing with how we acquire config settings and macros and etc.
    Those issues are completely independent of FSMonitor (which is
    already way too big).

Since then the changes to repo-settings.c have become a lot larger, so
let's take a look...

1. https://lore.kernel.org/git/87mttkyrqq.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/4552971c-0a23-c19a-6a23-cb5737e43b2a@jeffhostetler.com/


> diff --git a/repo-settings.c b/repo-settings.c
> index 0cfe8b787db..faf197ff60a 100644
> --- a/repo-settings.c
> +++ b/repo-settings.c
> @@ -5,10 +5,42 @@
>  
>  #define UPDATE_DEFAULT_BOOL(s,v) do { if (s == -1) { s = v; } } while(0)
>  
> +/*
> + * Return 1 if the repo/workdir is incompatible with FSMonitor.
> + */
> +static int is_repo_incompatible_with_fsmonitor(struct repository *r)
> +{
> +	const char *const_strval;
> +
> +	/*
> +	 * Bare repositories don't have a working directory and
> +	 * therefore, nothing to watch.
> +	 */
> +	if (!r->worktree)
> +		return 1;

Looking ahead in this series you end up using
FSMONITOR_MODE_INCOMPATIBLE in two places in the codebase. In
builtin/update-index.c to throw a "repository is incompatible with
fsmonitor" error.

Can't that case just be replaced with setup_work_tree()? Other sub-modes
of update-index already die implicitly on that, e.g.:

	$ git update-index test
	fatal: this operation must be run in a work tree

The other case is:
	
	+       prepare_repo_settings(the_repository);
	+       if (!the_repository->worktree)
	+               return error(_("fsmonitor-daemon does not support bare repos '%s'"),
	+                            xgetcwd());
	+       if (the_repository->settings.fsmonitor_mode == FSMONITOR_MODE_INCOMPATIBLE)
	+               return error(_("fsmonitor-daemon is incompatible with this repo '%s'"),
	+                            the_repository->worktree);

I.e. we just checked the_repository->worktree, but it's not that, but....

> +
> +	/*
> +	 * GVFS (aka VFS for Git) is incompatible with FSMonitor.
> +	 *
> +	 * Granted, core Git does not know anything about GVFS and
> +	 * we shouldn't make assumptions about a downstream feature,
> +	 * but users can install both versions.  And this can lead
> +	 * to incorrect results from core Git commands.  So, without
> +	 * bringing in any of the GVFS code, do a simple config test
> +	 * for a published config setting.  (We do not look at the
> +	 * various *_TEST_* environment variables.)
> +	 */
> +	if (!repo_config_get_value(r, "core.virtualfilesystem", &const_strval))
> +		return 1;

I'm skeptical of us hardcoding a third-party software config
variable. Can't GitVFS handle this somehow on its end?

But just in terms of implementation it seems the end result of that is
to emit a very confusing error to the user. Since we already checked for
bare repos, we run into this, and when we should really say "hey, maybe
disable your core.virtualFileSystem setting", we instead say "your repo
is incompatible".

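One way to get a more specific message is to have the check report why
the repo is incompatible rather than just 0 or 1. What follows is only a
sketch and not code from this series: the enum, its values, the function
name and the message wording are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical reason codes; the patch itself only returns 0 or 1. */
enum fsm_incompat_reason {
	FSM_COMPAT_OK = 0,
	FSM_INCOMPAT_BARE,
	FSM_INCOMPAT_VIRTUAL_FS
};

/*
 * Map a reason to a message that tells the user what to change.
 * Returns NULL when the repository is compatible.
 */
static const char *fsm_incompat_message(enum fsm_incompat_reason reason)
{
	switch (reason) {
	case FSM_INCOMPAT_BARE:
		return "bare repository has no working directory to watch";
	case FSM_INCOMPAT_VIRTUAL_FS:
		return "core.virtualFileSystem is set; unset it to use fsmonitor";
	case FSM_COMPAT_OK:
		break;
	}
	return NULL;
}
```

A caller could then pass the returned string to error() instead of the
generic "incompatible" text.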
> +
> +	return 0;
> +}
> +
>  void prepare_repo_settings(struct repository *r)
>  {
>  	int value;
>  	char *strval;
> +	const char *const_strval;

Can be declared in the "else" below.

>  
>  	if (r->settings.initialized)
>  		return;
> @@ -26,6 +58,22 @@ void prepare_repo_settings(struct repository *r)
>  	UPDATE_DEFAULT_BOOL(r->settings.commit_graph_read_changed_paths, 1);
>  	UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
>  
> +	r->settings.fsmonitor_hook_path = NULL;
> +	r->settings.fsmonitor_mode = FSMONITOR_MODE_DISABLED;

With the memset earlier (b.t.w. I've got a patch to fix all this bizarre
behavior in repo-settings.c, but have been waiting on this series) we
implicitly set it to FSMONITOR_MODE_UNSET (-1), but then never use that
value.

Your code in update-index.c then checks against
"FSMONITOR_MODE_DISABLED" and says "core.useBuiltinFSMonitor is unset;".

> +	if (is_repo_incompatible_with_fsmonitor(r))
> +		r->settings.fsmonitor_mode = FSMONITOR_MODE_INCOMPATIBLE;

Style: should have {} braces on all arms.

> +	else if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value)
> +		   && value)
> +		r->settings.fsmonitor_mode = FSMONITOR_MODE_IPC;

Here you're conflating false with whether the variable is set at all. I
guess that works out here since if it's false we want to fall through
to...

> +	else {

...ignoring it and looking at core.fsmonitor instead.

> +		if (repo_config_get_pathname(r, "core.fsmonitor", &const_strval))
> +			const_strval = getenv("GIT_TEST_FSMONITOR");

If it's not set we pay attention to GIT_TEST_FSMONITOR, which matches the
behavior of the old git_config_get_fsmonitor(). So even if the env variable
is set we want to take the config variable over it, correct?

> +		if (const_strval && *const_strval) {
> +			r->settings.fsmonitor_hook_path = strdup(const_strval);

We had a strbuf_detach()'d string in the case of
repo_config_get_pathname(), but here we strdup() it again in case we
were in the getenv() codepath. This code probably leaks memory now
anyway, but perhaps it's better to split up the two so we make it easier
to deal with who owns/frees what in the future.
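
To make the ownership point concrete, here is a rough sketch of keeping
the two sources apart. It is not code from the patch: `dup_string` and
`pick_hook_path` are hypothetical names, and it assumes (as described
above) that the config value arrives heap-allocated while the getenv()
value is borrowed. It keeps the precedence of the quoted hunk, where the
environment is only consulted when the config variable is not set.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a NUL-terminated string; the caller frees the result. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Pick the hook path.  `config_val` is heap-allocated and handed
 * over by the caller (or NULL if unset); `env_val` is borrowed,
 * e.g. from getenv().  The result is either NULL or a heap string
 * that the caller must free(), and `config_val` must not be used
 * again after this call.
 */
static char *pick_hook_path(char *config_val, const char *env_val)
{
	if (config_val) {
		if (*config_val)
			return config_val; /* reuse the allocation */
		free(config_val);          /* set but empty: no hook */
		return NULL;
	}
	if (env_val && *env_val)
		return dup_string(env_val); /* copy borrowed memory */
	return NULL;
}
```

With this shape there is exactly one owner of the returned string and no
second copy of the config value.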


* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
                       ` (33 preceding siblings ...)
  2021-07-01 14:47     ` [PATCH v3 34/34] t7527: test FS event reporing on MacOS WRT case and Unicode Jeff Hostetler via GitGitGadget
@ 2021-07-01 17:40     ` Ævar Arnfjörð Bjarmason
  2021-07-01 18:29       ` Jeff Hostetler
  34 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 17:40 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> Here is V3 of my patch series to add a builtin FSMonitor daemon to Git. I
> rebased this series onto v2.32.0.
>
> V3 addresses most of the previous review comments and things we've learned
> from our experimental testing of V2. (A version of V2 was shipped as an
> experimental feature in the v2.32.0-based releases of Git for Windows and
> VFS for Git.)
>
> There are still a few items that I need to address, but that list is getting
> very short.

...
>   fsmonitor-fs-listen-win32: stub in backend for Windows
>   fsmonitor-fs-listen-macos: stub in backend for MacOS

I left some light comments on the repo-settings.c part of this to follow
up from a previous round.

Any other testing of it is stalled by there being no linux backend for
it as part of this series. I see from spelunking repos that Johannes had
a WIP compat/fsmonitor/linux.c which looks like it could/should mostly
work, but the API names all changed since then, and after a short try I
gave up on trying to rebase it.

I'd really prefer for git not to have features that place free platforms
at a disadvantage against proprietary platforms if it can be avoided,
and in this case the lack of a Linux backend also means much less
widespread testing of the feature among the development community / CI.


* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-01 17:40     ` [PATCH v3 00/34] Builtin FSMonitor Feature Ævar Arnfjörð Bjarmason
@ 2021-07-01 18:29       ` Jeff Hostetler
  2021-07-01 21:26         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-01 18:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> Here is V3 of my patch series to add a builtin FSMonitor daemon to Git. I
>> rebased this series onto v2.32.0.
>>
>> V3 addresses most of the previous review comments and things we've learned
>> from our experimental testing of V2. (A version of V2 was shipped as an
>> experimental feature in the v2.32.0-based releases of Git for Windows and
>> VFS for Git.)
>>
>> There are still a few items that I need to address, but that list is getting
>> very short.
> 
> ...
>>    fsmonitor-fs-listen-win32: stub in backend for Windows
>>    fsmonitor-fs-listen-macos: stub in backend for MacOS
> 
> I left some light comments on the repo-settings.c part of this to follow
> up from a previous round.

Thanks.

> 
> Any other testing of it is stalled by there being no linux backend for
> it as part of this series. I see from spelunking repos that Johannes had
> a WIP compat/fsmonitor/linux.c which looks like it could/should mostly
> work, but the API names all changed since then, and after a short try I
> gave up on trying to rebase it.

The early Linux version was dropped because inotify does not give
recursive coverage -- only the requested directory.  Using inotify
requires adding a watch to each subdirectory (recursively) in the
worktree.  There's a system limit on the maximum number of watched
directories (defaults to 8K IIRC) and that limit is system-wide.

Since the whole point was to support very large repos, using
inotify was a non-starter, so I removed the Linux version from our
patch series.  For example, the first repo I tried it on (outside
of the test suite) had 25K subdirectories.

I'm told there is a new fanotify API in recent Linux kernels that
is a better fit for what we need, but we haven't had time to investigate
it yet.

> 
> I'd really prefer for git not to have features that place free platforms
> at a disadvantage against proprietary platforms if it can be avoided,
> and in this case the lack of a Linux backend also means much less
> widespread testing of the feature among the development community / CI.
> 

This feature is always going to have platform-specific components,
so the lack of one platform or another should not stop us from
discussing it for the platforms that can be supported.

And given the size and complexity of the platform-specific code,
we should not assume that "just test it on Linux" is sufficient.
Yes, there are some common/shared routines/data structures in the
daemon, but the hard/tricky parts are in the platform layer.

Jeff


* Re: [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files
  2021-07-01 14:47     ` [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files Jeff Hostetler via GitGitGadget
@ 2021-07-01 20:00       ` Junio C Hamano
  2021-07-13 16:45         ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-07-01 20:00 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler

"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/t/helper/test-touch.c b/t/helper/test-touch.c
> new file mode 100644
> index 00000000000..e9b3b754f1f
> --- /dev/null
> +++ b/t/helper/test-touch.c
> @@ -0,0 +1,126 @@
> +/*
> + * test-touch.c: variation on /usr/bin/touch to speed up tests
> + * with a large number of files (primarily on Windows where child
> + * process are very, very expensive).
> + */
> +
> +#include "test-tool.h"
> +#include "cache.h"
> +#include "parse-options.h"
> +
> +char *seq_pattern;
> +int seq_start = 1;
> +int seq_count = 1;

With this in, "make sparse" dies like this:

    SP t/helper/test-touch.c
t/helper/test-touch.c:11:6: error: symbol 'seq_pattern' was not declared. Should it be static?
t/helper/test-touch.c:12:5: error: symbol 'seq_start' was not declared. Should it be static?
t/helper/test-touch.c:13:5: error: symbol 'seq_count' was not declared. Should it be static?



* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-01 18:29       ` Jeff Hostetler
@ 2021-07-01 21:26         ` Ævar Arnfjörð Bjarmason
  2021-07-02 19:06           ` Jeff Hostetler
  2021-07-05 21:35           ` Johannes Schindelin
  0 siblings, 2 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 21:26 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler, David Turner


On Thu, Jul 01 2021, Jeff Hostetler wrote:

> On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> Here is V3 of my patch series to add a builtin FSMonitor daemon to Git. I
>>> rebased this series onto v2.32.0.
>>>
>>> V3 addresses most of the previous review comments and things we've learned
>>> from our experimental testing of V2. (A version of V2 was shipped as an
>>> experimental feature in the v2.32.0-based releases of Git for Windows and
>>> VFS for Git.)
>>>
>>> There are still a few items that I need to address, but that list is getting
>>> very short.
>> ...
>>>    fsmonitor-fs-listen-win32: stub in backend for Windows
>>>    fsmonitor-fs-listen-macos: stub in backend for MacOS
>> I left some light comments on the repo-settings.c part of this to
>> follow
>> up from a previous round.
>
> Thanks.
>
>> Any other testing of it is stalled by there being no linux backend
>> for
>> it as part of this series. I see from spelunking repos that Johannes had
>> a WIP compat/fsmonitor/linux.c which looks like it could/should mostly
>> work, but the API names all changed since then, and after a short try I
>> gave up on trying to rebase it.
>
> The early Linux version was dropped because inotify does not give
> recursive coverage -- only the requested directory.  Using inotify
> requires adding a watch to each subdirectory (recursively) in the
> worktree.  There's a system limit on the maximum number of watched
> directories (defaults to 8K IIRC) and that limit is system-wide.
>
> Since the whole point was to support large very large repos, using
> inotify was a non-starter, so I removed the Linux version from our
> patch series.  For example, the first repo I tried it on (outside
> of the test suite) had 25K subdirectories.
>
> I'm told there is a new fanotify API in recent Linux kernels that
> is a better fit for what we need, but we haven't had time to investigate
> it yet.

That default limit is a bit annoying, but I don't see how it's a blocker
in any way.

You simply adjust the limit. E.g. I deployed and tested the hook version
of inotify (using watchman) in a sizable development environment, and
have written my own code using the API. This was all before fanotify(7)
existed. IIRC I set most of the limits to 2^24 or 2^20. I've used it
with really large Git repos, including with David Turner's
2015-04-03-1M-git for testing (`git ls-files | wc -l` on that is around
a quarter-million).

If you have a humongous repository and don't have root on your own
machine you're out of luck. But I think most people who'd use this are
either using their own laptop, or e.g. in a corporate setting where
administrators would tweak the sysctl limits given the performance
advantages (as I did).

Once you adjust the limits, Linux deals with large trees just fine; it
just has overly conservative limits for most things in sysctl. The API
is a bit annoying: your event loop needs to run around and add watches.

AFAICT you need Linux 5.1 for fanotify(7) to be useful, e.g. Debian
stable, RHEL etc. aren't using anything that new. So having an inotify
backend as well as possibly a fanotify one would be very useful.

And linux.git's docs suggest that the default limit was bumped from 8192
to 1M in v5.10, a really recent kernel, so if you've got that you've
also got fanotify.

In any case, even if Linux's inotify limit was something hardcoded and
impossible to change you could still use such an API to test & debug the
basics of this feature on that platform.

>> I'd really prefer for git not to have features that place free
>> platforms
>> at a disadvantage against proprietary platforms if it can be avoided,
>> and in this case the lack of a Linux backend also means much less
>> widespread testing of the feature among the development community / CI.
>> 
>
> This feature is always going to have platform-specific components,
> so the lack of one platform or another should not stop us from
> discussing it for the platforms that can be supported.

(I think per the above that's s/can be/are/)

> And given the size and complexity of the platform-specific code,
> we should not assume that "just test it on Linux" is sufficient.
> Yes, there are some common/shared routines/data structures in the
> daemon, but hard/tricky parts are in the platform layer.

I think we're talking past each other a bit here. I'm not saying that
you can get full or meaningful testing for it on Windows if you test it
on Linux, or the other way around.

Of course there's platform-specific stuff, although there's also a lot
of non-platform-specific stuff, so even a very basic implementation of
inotify would make reviewing this easier / give access to more reviewers.

I'm saying that I prefer that Git as a free software project not be in
the situation of saying the best use-case for a given size/shape of repo
is to use Git in combination with proprietary operating systems over
freely licensed ones.

IOW what the FSF has as a policy for GNU projects. Now, I think the FSF
probably goes too far in that (famously removing rather obscure font
rendering features from Emacs on OSX), but "manage lots of data" is a
core feature of git.


^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-01 14:47     ` [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:18       ` Ævar Arnfjörð Bjarmason
  2021-07-05 21:52         ` Johannes Schindelin
  2021-07-13 14:39         ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:18 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> +#ifdef GIT_WINDOWS_NATIVE
> +/*
> + * Create a background process to run the daemon.  It should be completely
> + * disassociated from the terminal.
> + *
> + * Conceptually like `daemonize()` but different because Windows does not
> + * have `fork(2)`.  Spawn a normal Windows child process but without the
> + * limitations of `start_command()` and `finish_command()`.
> + *
> + * The child process execs the "git fsmonitor--daemon run" command.
> + *
> + * The current process returns so that the caller can wait for the child
> + * to startup before exiting.
> + */
> +static int spawn_background_fsmonitor_daemon(pid_t *pid)
> +{
> +	char git_exe[MAX_PATH];
> +	struct strvec args = STRVEC_INIT;
> +	int in, out;
> +
> +	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
> +
> +	in = open("/dev/null", O_RDONLY);
> +	out = open("/dev/null", O_WRONLY);
> +
> +	strvec_push(&args, git_exe);
> +	strvec_push(&args, "fsmonitor--daemon");
> +	strvec_push(&args, "run");
> +	strvec_pushf(&args, "--ipc-threads=%d", fsmonitor__ipc_threads);
> +
> +	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
> +	close(in);
> +	close(out);
> +
> +	strvec_clear(&args);
> +
> +	if (*pid < 0)
> +		return error(_("could not spawn fsmonitor--daemon in the background"));
> +
> +	return 0;
> +}
> +#else
> +/*
> + * Create a background process to run the daemon.  It should be completely
> + * disassociated from the terminal.
> + *
> + * This is adapted from `daemonize()`.  Use `fork()` to directly
> + * create and run the daemon in the child process.
> + *
> + * The fork-child can just call the run code; it does not need to exec
> + * it.
> + *
> + * The fork-parent returns the child PID so that we can wait for the
> + * child to startup before exiting.
> + */
> +static int spawn_background_fsmonitor_daemon(pid_t *pid)
> +{
> +	*pid = fork();
> +
> +	switch (*pid) {
> +	case 0:
> +		if (setsid() == -1)
> +			error_errno(_("setsid failed"));
> +		close(0);
> +		close(1);
> +		close(2);
> +		sanitize_stdfds();
> +
> +		return !!fsmonitor_run_daemon();
> +
> +	case -1:
> +		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
> +
> +	default:
> +		return 0;
> +	}
> +}
> +#endif

The spawn_background_fsmonitor_daemon() function here is almost the same
as daemonize(). I wonder if this & the Windows-specific one you have
here can't be refactored into an API from what's now in setup.c.

Then we could make builtin/gc.c and daemon.c use that, so Windows could
have background GC, and we'd have a more battle-tested central codepath
for this tricky bit.

It seems to me like the only limitations on it are to have this return
slightly more general things (e.g. not set its own errors, return
structured data), and maybe some callback for what to do in the
child/parent.

> +/*
> + * This is adapted from `wait_or_whine()`.  Watch the child process and
> + * let it get started and begin listening for requests on the socket
> + * before reporting our success.
> + */
> +static int wait_for_background_startup(pid_t pid_child)
> +{
> +	int status;
> +	pid_t pid_seen;
> +	enum ipc_active_state s;
> +	time_t time_limit, now;
> +
> +	time(&time_limit);
> +	time_limit += fsmonitor__start_timeout_sec;
> +
> +	for (;;) {
> +		pid_seen = waitpid(pid_child, &status, WNOHANG);
> +
> +		if (pid_seen == -1)
> +			return error_errno(_("waitpid failed"));
> +		else if (pid_seen == 0) {
> +			/*
> +			 * The child is still running (this should be
> +			 * the normal case).  Try to connect to it on
> +			 * the socket and see if it is ready for
> +			 * business.
> +			 *
> +			 * If there is another daemon already running,
> +			 * our child will fail to start (possibly
> +			 * after a timeout on the lock), but we don't
> +			 * care (who responds) if the socket is live.
> +			 */
> +			s = fsmonitor_ipc__get_state();
> +			if (s == IPC_STATE__LISTENING)
> +				return 0;
> +
> +			time(&now);
> +			if (now > time_limit)
> +				return error(_("fsmonitor--daemon not online yet"));
> +		} else if (pid_seen == pid_child) {
> +			/*
> +			 * The new child daemon process shutdown while
> +			 * it was starting up, so it is not listening
> +			 * on the socket.
> +			 *
> +			 * Try to ping the socket in the odd chance
> +			 * that another daemon started (or was already
> +			 * running) while our child was starting.
> +			 *
> +			 * Again, we don't care who services the socket.
> +			 */
> +			s = fsmonitor_ipc__get_state();
> +			if (s == IPC_STATE__LISTENING)
> +				return 0;
> +
> +			/*
> +			 * We don't care about the WEXITSTATUS() nor
> +			 * any of the WIF*(status) values because
> +			 * `cmd_fsmonitor__daemon()` does the `!!result`
> +			 * trick on all function return values.
> +			 *
> +			 * So it is sufficient to just report the
> +			 * early shutdown as an error.
> +			 */
> +			return error(_("fsmonitor--daemon failed to start"));
> +		} else
> +			return error(_("waitpid is confused"));
> +	}
> +}

Ditto this. Could we extend the wait_or_whine() function (or some
extended version thereof) to do what you need with callbacks?

It seems the main difference is just being able to pass down a flag for
waitpid(), and the loop needing to check EINTR or not depending on
whether WNOHANG is passed.

For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
behavior with an adjusted wait_or_whine(). Wouldn't it be better to
report what exit status it exits with e.g. if the top-level process is
signalled? We do so in trace2 for other things we spawn...
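
Something like the following is what I mean, as a sketch only. The
names poll_child(), struct child_status and enum child_state are
invented for this mail and are not the existing wait_or_whine()
interface. The caller passes the waitpid() flags, and the exit code or
signal is handed back instead of being thrown away:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

enum child_state {
	CHILD_RUNNING,		/* only possible with WNOHANG */
	CHILD_EXITED,
	CHILD_SIGNALED,
	CHILD_WAIT_ERROR
};

struct child_status {
	enum child_state state;
	int code;	/* exit code, signal number, or errno */
};

static struct child_status poll_child(pid_t pid, int flags)
{
	struct child_status cs = { CHILD_WAIT_ERROR, 0 };
	int status;
	pid_t seen;

	/*
	 * Retrying on EINTR matters for a blocking wait.  With WNOHANG
	 * the call does not block, so the retry is simply never taken.
	 */
	do {
		seen = waitpid(pid, &status, flags);
	} while (seen < 0 && errno == EINTR);

	if (seen < 0) {
		cs.code = errno;
	} else if (seen == 0) {
		cs.state = CHILD_RUNNING;
	} else if (WIFEXITED(status)) {
		cs.state = CHILD_EXITED;
		cs.code = WEXITSTATUS(status);
	} else if (WIFSIGNALED(status)) {
		cs.state = CHILD_SIGNALED;
		cs.code = WTERMSIG(status);
	}
	/* Anything else (e.g. stopped) is left as CHILD_WAIT_ERROR. */
	return cs;
}
```

The startup loop in the patch would then call this with WNOHANG, and
could log cs.state and cs.code when the child goes away early.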

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-01 14:47     ` [PATCH v3 02/34] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:29       ` Ævar Arnfjörð Bjarmason
  2021-07-05 22:00         ` Johannes Schindelin
  2021-07-12 19:23         ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:29 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Create a manual page describing the `git fsmonitor--daemon` feature.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  Documentation/git-fsmonitor--daemon.txt | 75 +++++++++++++++++++++++++
>  1 file changed, 75 insertions(+)
>  create mode 100644 Documentation/git-fsmonitor--daemon.txt
>
> diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
> new file mode 100644
> index 00000000000..154e7684daa
> --- /dev/null
> +++ b/Documentation/git-fsmonitor--daemon.txt
> @@ -0,0 +1,75 @@
> +git-fsmonitor--daemon(1)
> +========================
> +
> +NAME
> +----
> +git-fsmonitor--daemon - A Built-in File System Monitor
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'git fsmonitor--daemon' start
> +'git fsmonitor--daemon' run
> +'git fsmonitor--daemon' stop
> +'git fsmonitor--daemon' status
> +
> +DESCRIPTION
> +-----------
> +
> +A daemon to watch the working directory for file and directory
> +changes using platform-specific file system notification facilities.
> +
> +This daemon communicates directly with commands like `git status`
> +using the link:technical/api-simple-ipc.html[simple IPC] interface
> +instead of the slower linkgit:githooks[5] interface.
> +
> +This daemon is built into Git so that no third-party tools are
> +required.
> +
> +OPTIONS
> +-------
> +
> +start::
> +	Starts a daemon in the background.
> +
> +run::
> +	Runs a daemon in the foreground.
> +
> +stop::
> +	Stops the daemon running in the current working
> +	directory, if present.
> +
> +status::
> +	Exits with zero status if a daemon is watching the
> +	current working directory.
> +
> +REMARKS
> +-------
> +
> +This daemon is a long running process used to watch a single working
> +directory and maintain a list of the recently changed files and
> +directories.  Performance of commands such as `git status` can be
> +increased if they just ask for a summary of changes to the working
> +directory and can avoid scanning the disk.
> +
> +When `core.useBuiltinFSMonitor` is set to `true` (see
> +linkgit:git-config[1]) commands, such as `git status`, will ask the
> +daemon for changes and automatically start it (if necessary).
> +
> +For more information see the "File System Monitor" section in
> +linkgit:git-update-index[1].
> +
> +CAVEATS
> +-------
> +
> +The fsmonitor daemon does not currently know about submodules and does
> +not know to filter out file system events that happen within a
> +submodule.  If fsmonitor daemon is watching a super repo and a file is
> +modified within the working directory of a submodule, it will report
> +the change (as happening against the super repo).  However, the client
> +will properly ignore these extra events, so performance may be affected
> +but it will not cause an incorrect result.
> +
> +GIT
> +---
> +Part of the linkgit:git[1] suite

Later in the series we incrementally add features to the daemon, so this
is describing a state that doesn't exist yet at this point.

I think it would be better to start with a stub here and add
documentation as we add features, e.g. the patch that adds "start" should
add that to the synopsis + options etc.

See the outstanding ab/config-based-hooks-base for a small example of
that.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 03/34] fsmonitor--daemon: update fsmonitor documentation
  2021-07-01 14:47     ` [PATCH v3 03/34] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:31       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:31 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Update references to `core.fsmonitor` and `core.fsmonitorHookVersion` and
> pointers to `Watchman` to mention the new built-in `fsmonitor--daemon`.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  Documentation/config/core.txt      | 56 ++++++++++++++++++++++--------
>  Documentation/git-update-index.txt | 27 +++++++-------
>  Documentation/githooks.txt         |  3 +-
>  3 files changed, 59 insertions(+), 27 deletions(-)
>
> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index c04f62a54a1..4f6e519bc02 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -62,22 +62,50 @@ core.protectNTFS::
>  	Defaults to `true` on Windows, and `false` elsewhere.
>  
>  core.fsmonitor::
> -	If set, the value of this variable is used as a command which
> -	will identify all files that may have changed since the
> -	requested date/time. This information is used to speed up git by
> -	avoiding unnecessary processing of files that have not changed.
> -	See the "fsmonitor-watchman" section of linkgit:githooks[5].
> +	If set, this variable contains the pathname of the "fsmonitor"
> +	hook command.
> ++
> +This hook command is used to identify all files that may have changed
> +since the requested date/time. This information is used to speed up
> +git by avoiding unnecessary scanning of files that have not changed.
> ++
> +See the "fsmonitor-watchman" section of linkgit:githooks[5].
> ++
> +Note: The value of this config setting is ignored if the
> +built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
>  
>  core.fsmonitorHookVersion::
> -	Sets the version of hook that is to be used when calling fsmonitor.
> -	There are currently versions 1 and 2. When this is not set,
> -	version 2 will be tried first and if it fails then version 1
> -	will be tried. Version 1 uses a timestamp as input to determine
> -	which files have changes since that time but some monitors
> -	like watchman have race conditions when used with a timestamp.
> -	Version 2 uses an opaque string so that the monitor can return
> -	something that can be used to determine what files have changed
> -	without race conditions.
> +	Sets the protocol version to be used when invoking the
> +	"fsmonitor" hook.
> ++
> +There are currently versions 1 and 2. When this is not set,
> +version 2 will be tried first and if it fails then version 1
> +will be tried. Version 1 uses a timestamp as input to determine
> +which files have changes since that time but some monitors
> +like Watchman have race conditions when used with a timestamp.
> +Version 2 uses an opaque string so that the monitor can return
> +something that can be used to determine what files have changed
> +without race conditions.
> ++
> +Note: The value of this config setting is ignored if the
> +built-in file system monitor is enabled (see `core.useBuiltinFSMonitor`).
> +
> +core.useBuiltinFSMonitor::
> +	If set to true, enable the built-in file system monitor
> +	daemon for this working directory (linkgit:git-fsmonitor--daemon[1]).
> ++
> +Like hook-based file system monitors, the built-in file system monitor
> +can speed up Git commands that need to refresh the Git index
> +(e.g. `git status`) in a working directory with many files.  The
> +built-in monitor eliminates the need to install and maintain an
> +external third-party tool.
> ++
> +The built-in file system monitor is currently available only on a
> +limited set of supported platforms.  Currently, this includes Windows
> +and MacOS.
> ++
> +Note: if this config setting is set to `true`, the values of
> +`core.fsmonitor` and `core.fsmonitorHookVersion` are ignored.
>  
>  core.trustctime::
>  	If false, the ctime differences between the index and the
> diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
> index 2853f168d97..c7c31b3fcf9 100644
> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
>  This feature is intended to speed up git operations for repos that have
>  large working directories.
>  
> -It enables git to work together with a file system monitor (see the
> +It enables git to work together with a file system monitor (see
> +linkgit:git-fsmonitor--daemon[1]
> +and the
>  "fsmonitor-watchman" section of linkgit:githooks[5]) that can
>  inform it as to what files have been modified. This enables git to avoid
>  having to lstat() every file to find modified files.
> @@ -508,17 +510,18 @@ performance by avoiding the cost of scanning the entire working directory
>  looking for new files.
>  
>  If you want to enable (or disable) this feature, it is easier to use
> -the `core.fsmonitor` configuration variable (see
> -linkgit:git-config[1]) than using the `--fsmonitor` option to
> -`git update-index` in each repository, especially if you want to do so
> -across all repositories you use, because you can set the configuration
> -variable in your `$HOME/.gitconfig` just once and have it affect all
> -repositories you touch.
> -
> -When the `core.fsmonitor` configuration variable is changed, the
> -file system monitor is added to or removed from the index the next time
> -a command reads the index. When `--[no-]fsmonitor` are used, the file
> -system monitor is immediately added to or removed from the index.
> +the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
> +variable (see linkgit:git-config[1]) than using the `--fsmonitor`
> +option to `git update-index` in each repository, especially if you
> +want to do so across all repositories you use, because you can set the
> +configuration variable in your `$HOME/.gitconfig` just once and have
> +it affect all repositories you touch.
> +
> +When the `core.fsmonitor` or `core.useBuiltinFSMonitor` configuration
> +variable is changed, the file system monitor is added to or removed
> +from the index the next time a command reads the index. When
> +`--[no-]fsmonitor` are used, the file system monitor is immediately
> +added to or removed from the index.
>  
>  CONFIGURATION
>  -------------
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index b51959ff941..b7d5e926f7b 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -593,7 +593,8 @@ fsmonitor-watchman
>  
>  This hook is invoked when the configuration option `core.fsmonitor` is
>  set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
> -depending on the version of the hook to use.
> +depending on the version of the hook to use, unless overridden via
> +`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).
>  
>  Version 1 takes two arguments, a version (1) and the time in elapsed
>  nanoseconds since midnight, January 1, 1970.

Ditto the comment on 02/34, mostly docs for things that don't exist
until in later patches.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-07-01 14:47     ` [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:36       ` Ævar Arnfjörð Bjarmason
  2021-07-19 20:56         ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:36 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

A general comment on this series (including previous patches). We've
usually tried to bend over backwards in git's codebase not to have big
ifdef blocks, so we compile most code the same everywhere. We waste a
bit of object code, but that's fine.

See 9c897c5c2ad (pack-objects: remove #ifdef NO_PTHREADS, 2018-11-03)
for a good example of bad code being turned to good.

E.g. in this case:

> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
> +
> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
> +{
> +	const char *subcmd;
> +
> +	struct option options[] = {
> +		OPT_END()
> +	};
> +
> +	if (argc < 2)
> +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
> +
> +	if (argc == 2 && !strcmp(argv[1], "-h"))
> +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
> +
> +	git_config(git_default_config, NULL);
> +
> +	subcmd = argv[1];
> +	argv--;
> +	argc++;
> +
> +	argc = parse_options(argc, argv, prefix, options,
> +			     builtin_fsmonitor__daemon_usage, 0);
> +
> +	die(_("Unhandled subcommand '%s'"), subcmd);
> +}
> +
> +#else
> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
> +{
> +	struct option options[] = {
> +		OPT_END()
> +	};
> +
> +	if (argc == 2 && !strcmp(argv[1], "-h"))
> +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
> +
> +	die(_("fsmonitor--daemon not supported on this platform"));
> +}
> +#endif

This whole thing could really just be a
-DHAVE_FSMONITOR_DAEMON_BACKEND=1 or -DHAVE_FSMONITOR_DAEMON_BACKEND=0
somewhere (depending), and then somewhere in the middle of the first
function:

	if (!HAVE_FSMONITOR_DAEMON_BACKEND)
	    	die(_("fsmonitor--daemon not supported on this platform"));
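
Spelled out a bit more, as a standalone sketch rather than a drop-in
replacement: fsmonitor_daemon_main() is a hypothetical stand-in for
cmd_fsmonitor__daemon() that returns an exit code instead of calling
usage_with_options() or die(), and the #ifndef default is only there so
the sketch compiles without the build system. The important part is
that the macro is always defined to 0 or 1, so both paths get compiled
everywhere:

```c
#include <stdio.h>
#include <string.h>

/* Normally supplied as -DHAVE_FSMONITOR_DAEMON_BACKEND=0 or =1. */
#ifndef HAVE_FSMONITOR_DAEMON_BACKEND
#define HAVE_FSMONITOR_DAEMON_BACKEND 0
#endif

/* Hypothetical stand-in for cmd_fsmonitor__daemon(). */
static int fsmonitor_daemon_main(int argc, const char **argv)
{
	if (argc < 2 || !strcmp(argv[1], "-h")) {
		fprintf(stderr, "usage: git fsmonitor--daemon <subcommand>\n");
		return 129; /* what git's usage helpers exit with */
	}

	/* A plain 'if': the compiler sees both branches on every platform. */
	if (!HAVE_FSMONITOR_DAEMON_BACKEND) {
		fprintf(stderr, "fatal: fsmonitor--daemon not supported on this platform\n");
		return 128; /* what die() exits with */
	}

	fprintf(stderr, "fatal: Unhandled subcommand '%s'\n", argv[1]);
	return 128;
}
```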

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 10/34] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
  2021-07-01 14:47     ` [PATCH v3 10/34] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:41       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:41 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Create an IPC client to send query and flush commands to the daemon.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  Makefile                         |   1 +
>  t/helper/test-fsmonitor-client.c | 121 +++++++++++++++++++++++++++++++
>  t/helper/test-tool.c             |   1 +
>  t/helper/test-tool.h             |   1 +
>  4 files changed, 124 insertions(+)
>  create mode 100644 t/helper/test-fsmonitor-client.c
>
> diff --git a/Makefile b/Makefile
> index 8fe1e42a435..c45caacf2c3 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -709,6 +709,7 @@ TEST_BUILTINS_OBJS += test-dump-split-index.o
>  TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
>  TEST_BUILTINS_OBJS += test-example-decorate.o
>  TEST_BUILTINS_OBJS += test-fast-rebase.o
> +TEST_BUILTINS_OBJS += test-fsmonitor-client.o
>  TEST_BUILTINS_OBJS += test-genrandom.o
>  TEST_BUILTINS_OBJS += test-genzeros.o
>  TEST_BUILTINS_OBJS += test-hash-speed.o
> diff --git a/t/helper/test-fsmonitor-client.c b/t/helper/test-fsmonitor-client.c
> new file mode 100644
> index 00000000000..f7a5b3a32fa
> --- /dev/null
> +++ b/t/helper/test-fsmonitor-client.c
> @@ -0,0 +1,121 @@
> +/*
> + * test-fsmonitor-client.c: client code to send commands/requests to
> + * a `git fsmonitor--daemon` daemon.
> + */
> +
> +#include "test-tool.h"
> +#include "cache.h"
> +#include "parse-options.h"
> +#include "fsmonitor-ipc.h"
> +
> +#ifndef HAVE_FSMONITOR_DAEMON_BACKEND
> +int cmd__fsmonitor_client(int argc, const char **argv)
> +{
> +	die("fsmonitor--daemon not available on this platform");
> +}
> +#else

Re my earlier comments on excessive ifdefs: In this case don't we just
want to not compile test-fsmonitor-client at all unless
HAVE_FSMONITOR_DAEMON_BACKEND is true?

You'll get the same error as though you ran "helper/test-tool
does-not-exist", but the tests check for the prerequisite earlier
anyway, so why get this far on an unsupported platform for a pure test
helper?

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-01 14:47     ` [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:45       ` Ævar Arnfjörð Bjarmason
  2021-07-16 15:47         ` Johannes Schindelin
  2021-07-19 16:54         ` Jeff Hostetler
  0 siblings, 2 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:45 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Stub in empty backend for fsmonitor--daemon on Windows.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  Makefile                                     | 13 ++++++
>  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
>  compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
>  config.mak.uname                             |  2 +
>  contrib/buildsystems/CMakeLists.txt          |  5 ++
>  5 files changed, 90 insertions(+)
>  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
>  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
>
> diff --git a/Makefile b/Makefile
> index c45caacf2c3..a2a6e1f20f6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -467,6 +467,11 @@ all::
>  # directory, and the JSON compilation database 'compile_commands.json' will be
>  # created at the root of the repository.
>  #
> +# If your platform supports a built-in fsmonitor backend, set
> +# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
> +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
> +# `fsmonitor_fs_listen__*()` routines.
> +#
>  # Define DEVELOPER to enable more compiler warnings. Compiler version
>  # and family are auto detected, but could be overridden by defining
>  # COMPILER_FEATURES (see config.mak.dev). You can still set
> @@ -1929,6 +1934,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
>  	COMPAT_OBJS += compat/access.o
>  endif
>  
> +ifdef FSMONITOR_DAEMON_BACKEND
> +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
> +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
> +endif
> +
>  ifeq ($(TCLTK_PATH),)
>  NO_TCLTK = NoThanks
>  endif
> @@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
>  	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>  	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
>  	@echo X=\'$(X)\' >>$@+
> +ifdef FSMONITOR_DAEMON_BACKEND
> +	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
> +endif

Why put this in an ifdef?

In 342e9ef2d9e (Introduce a performance testing framework, 2012-02-17)
we started doing that for some perf/test options (which b.t.w., I don't
really see the reason for, maybe it's some subtlety in how test-lib.sh
picks those up).

But for all the other compile-time stuff we don't ifdef it, we just
define it, and then you get an empty value or not.

This would AFAICT be the first build-time-for-the-C-program option we
ifdef for writing a line to GIT-BUILD-OPTIONS.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-01 14:47     ` [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:49       ` Ævar Arnfjörð Bjarmason
  2021-07-16 15:51         ` Johannes Schindelin
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:49 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Stub in empty implementation of fsmonitor--daemon
> backend for MacOS.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  compat/fsmonitor/fsmonitor-fs-listen-macos.c | 20 ++++++++++++++++++++
>  config.mak.uname                             |  2 ++
>  contrib/buildsystems/CMakeLists.txt          |  3 +++
>  3 files changed, 25 insertions(+)
>  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c
>
> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> new file mode 100644
> index 00000000000..b91058d1c4f
> --- /dev/null
> +++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> @@ -0,0 +1,20 @@
> +#include "cache.h"
> +#include "fsmonitor.h"
> +#include "fsmonitor-fs-listen.h"
> +
> +int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
> +{
> +	return -1;
> +}
> +
> +void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
> +{
> +}
> +
> +void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
> +{
> +}
> +
> +void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
> +{
> +}
> diff --git a/config.mak.uname b/config.mak.uname
> index fcd88b60b14..394355463e1 100644
> --- a/config.mak.uname
> +++ b/config.mak.uname
> @@ -147,6 +147,8 @@ ifeq ($(uname_S),Darwin)
>  			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
>  		endif
>  	endif
> +	FSMONITOR_DAEMON_BACKEND = macos

A rather trivial point, but can't we pick one of "macos" or "darwin"
(I'd think going with the existing uname is better) and name the file
after the uname (or lower-case thereof)?

That would make these make rules more consistent too: we could just set
this to "YesPlease" here, and then lower-case the uname for the file
compilation/include.

* Re: [PATCH v3 15/34] fsmonitor: do not try to operate on bare repos
  2021-07-01 14:47     ` [PATCH v3 15/34] fsmonitor: do not try to operate on bare repos Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:53       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:53 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Bare repos do not have a working directory, so there is no
> directory for the daemon to register a watch upon.  And therefore
> there are no files within the directory for it to actually watch.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  builtin/fsmonitor--daemon.c |  8 ++++++++
>  t/t7519-status-fsmonitor.sh | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+)
>
> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index 7fcf960652f..d6161ad95a5 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> @@ -490,6 +490,14 @@ int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
>  		die(_("invalid 'ipc-threads' value (%d)"),
>  		    fsmonitor__ipc_threads);
>  
> +	prepare_repo_settings(the_repository);
> +	if (!the_repository->worktree)
> +		return error(_("fsmonitor-daemon does not support bare repos '%s'"),
> +			     xgetcwd());
> +	if (the_repository->settings.fsmonitor_mode == FSMONITOR_MODE_INCOMPATIBLE)
> +		return error(_("fsmonitor-daemon is incompatible with this repo '%s'"),
> +			     the_repository->worktree);

I commented on another patch that this second condition seems to be hit
only under core.virtualfilesystem=true...


> +test_expect_success FSMONITOR_DAEMON 'try running fsmonitor-daemon in bare repo' '
> +	test_when_finished "rm -rf ./bare-clone" &&
> +	git clone --bare . ./bare-clone &&
> +	test_must_fail git -C ./bare-clone fsmonitor--daemon run 2>actual &&
> +	grep "fsmonitor-daemon does not support bare repos" actual
> +'

Isn't just:

    git init --bare bare.git
    test_must_fail git -C bare.git [...]

enough, or does the repository need content to get to that error?

> +test_expect_success FSMONITOR_DAEMON 'try running fsmonitor-daemon in virtual repo' '
> +	test_when_finished "rm -rf ./fake-virtual-clone" &&
> +	git clone . ./fake-virtual-clone &&
> +	test_must_fail git -C ./fake-virtual-clone \
> +			   -c core.virtualfilesystem=true \
> +			   fsmonitor--daemon run 2>actual &&
> +	grep "fsmonitor-daemon is incompatible with this repo" actual
> +'
> +

Ditto. 

* Re: [PATCH v3 17/34] fsmonitor--daemon: define token-ids
  2021-07-01 14:47     ` [PATCH v3 17/34] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
@ 2021-07-01 22:58       ` Ævar Arnfjörð Bjarmason
  2021-07-13 15:15         ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 22:58 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> +	if (!test_env_value) {
> +		struct timeval tv;
> +		struct tm tm;
> +		time_t secs;
> +
> +		gettimeofday(&tv, NULL);
> +		secs = tv.tv_sec;
> +		gmtime_r(&secs, &tm);
> +
> +		strbuf_addf(&token->token_id,
> +			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
> +			    flush_count++,
> +			    getpid(),
> +			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
> +			    tm.tm_hour, tm.tm_min, tm.tm_sec,
> +			    (long)tv.tv_usec);

Just bikeshedding, but can we have tokens that mostly sort numerically
in time order? That is, with the time at the start, not the
flush_count/getpid.

Maybe I'm missing something, but couldn't we just re-use the trace2 SID
+ a more trivial trailer? It would have the nice property that you could
find the trace2 SID whenever you looked at such a token (could
e.g. split them by "/" too), and add the tv_usec, flush_count+whatever
else is needed to make it unique after the "/", no?
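
A rough sketch of what a time-first layout could look like. The function
names and the exact trailer after the "/" are made up for illustration;
they are not taken from the series, and this does not use the real trace2
SID:

```c
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format a token id with the UTC timestamp first
 * and the pid/flush_count as a trailer after "/".
 * Returns the snprintf() result, or -1 if the time cannot be converted.
 */
static int format_time_first_token(char *buf, size_t len, time_t secs,
				   long usec, int pid, uint64_t flush_count)
{
	struct tm tm;

	assert(buf && len);
	if (!gmtime_r(&secs, &tm))
		return -1;
	return snprintf(buf, len,
			"%04d%02d%02dT%02d%02d%02d.%06ldZ/%d.%llu",
			tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
			tm.tm_hour, tm.tm_min, tm.tm_sec, usec,
			pid, (unsigned long long)flush_count);
}

/*
 * Plain byte-wise comparison.  Because the fixed-width timestamp comes
 * first, tokens created at different times compare in time order.
 */
static int token_cmp(const char *a, const char *b)
{
	return strcmp(a, b);
}
```

Tokens created within the same microsecond would still fall back to
comparing the trailer as a string, hence "mostly" above.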

* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-01 14:47     ` [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
@ 2021-07-01 23:02       ` Ævar Arnfjörð Bjarmason
  2021-07-13 15:46         ` Jeff Hostetler
  2021-07-06 19:09       ` Johannes Schindelin
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 23:02 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Teach the win32 backend to register a watch on the working tree
> root directory (recursively).  Also watch the <gitdir> if it is
> not inside the working tree.  And to collect path change notifications
> into batches and publish.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++

<bikeshed mode> Spying on the early history of this (looking for the
Linux backend) I saw that at some point we had just
compat/fsmonitor/linux.c, and presumably some of
compat/fsmonitor/{windows,win32,macos,darwin}.c.

At some point those filenames became much much longer.

I've noticed you tend to prefer really long file and function names,
e.g. your borrowed daemonize() became
spawn_background_fsmonitor_daemon(). I think aiming for shorter
filenames & function names helps: these long names widen diffstats,
and many people who hack on the code stick religiously to
80-character-wide terminals.

* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-01 14:47     ` [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch" Jeff Hostetler via GitGitGadget
@ 2021-07-01 23:09       ` Ævar Arnfjörð Bjarmason
  2021-07-13 17:06         ` Jeff Hostetler
  2021-07-13 18:04       ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 23:09 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Change p7519 to use a single "test-tool touch" command to update
> the mtime on a series of (thousands) files instead of invoking
> thousands of commands to update a single file.
>
> This is primarily for Windows where process creation is so
> very slow and reduces the test run time by minutes.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  t/perf/p7519-fsmonitor.sh | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 5eb5044a103..f74e6014a0a 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
>  	fi &&
>  
>  	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
> -	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
> -	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
> -	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
> -	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
> +	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
> +	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
> +	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
> +	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&
> +
>  	git add 1_file 10_files 100_files 1000_files 10000_files &&
>  	git commit -qm "Add files" &&
>  
> @@ -200,15 +201,12 @@ test_fsmonitor_suite() {
>  	# Update the mtimes on upto 100k files to make status think
>  	# that they are dirty.  For simplicity, omit any files with
>  	# LFs (i.e. anything that ls-files thinks it needs to dquote).
> -	# Then fully backslash-quote the paths to capture any
> -	# whitespace so that they pass thru xargs properly.
>  	#
>  	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>  		git ls-files | \
>  			head -100000 | \
>  			grep -v \" | \
> -			sed '\''s/\(.\)/\\\1/g'\'' | \
> -			xargs test-tool chmtime -300 &&
> +			test-tool touch stdin &&
>  		git status
>  	'

Did you try to replace this with some variant of:

    test_seq 1 10000 | xargs touch

Which (depending on your xargs version) would invoke "touch" commands
with however many argv items it thinks you can handle.

* Re: [PATCH v3 25/34] t/perf: avoid copying builtin fsmonitor files into test repo
  2021-07-01 14:47     ` [PATCH v3 25/34] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
@ 2021-07-01 23:11       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 23:11 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Do not try to copy a fsmonitor--daemon socket from the current
> development directory into the test trash directory.

Okay, the */fsmonitor--daemon* rule covers that...

> When we run the perf suite without an explicit source repo set,
> we copy the current $GIT_DIR into the test trash directory.
> Unix domain sockets cannot be copied in that manner, so the test
> setup fails.
>
> Additionally, omit any other fsmonitor--daemon temp files inside
> the $GIT_DIR directory.

So is the "any other" also matched by that rule? I don't know these
files, so part of this is just phrasing, but it would be less confusing
(if that's true, and you didn't just forget to add a match for them) as:

    The */fsmonitor--daemon* glob will also match temporary files the
    daemon creates, but that's OK. We'd like to ignore these too.

* Re: [PATCH v3 27/34] t7527: create test for fsmonitor--daemon
  2021-07-01 14:47     ` [PATCH v3 27/34] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
@ 2021-07-01 23:15       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 23:15 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> +	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_1
> +'
> +
> +test_expect_success 'status implicitly starts daemon' '
> +	test_when_finished redundant_stop_daemon &&
> +
> +	test_must_fail git fsmonitor--daemon status &&
> +
> +	GIT_TRACE2_EVENT="$(pwd)/.git/trace_implicit_2" \
> +		git status >actual &&
> +
> +	git fsmonitor--daemon status &&
> +	test_might_fail git fsmonitor--daemon stop &&
> +
> +	grep \"event\":\"start\".*\"fsmonitor--daemon\" .git/trace_implicit_2
> +'

Seems like this and test_region could eventually be factored into some
common function that would be more flexible about grabbing trace2 data.

* Re: [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system
  2021-07-01 14:47     ` [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
@ 2021-07-01 23:17       ` Ævar Arnfjörð Bjarmason
  2021-07-21 14:40         ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 23:17 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Teach fsmonitor--daemon client threads to create a cookie file
> inside the .git directory and then wait until FS events for the
> cookie are observed by the FS listener thread.
>
> This helps address the racy nature of file system events by
> blocking the client response until the kernel has drained any
> event backlog.
>
> This is especially important on MacOS where kernel events are
> only issued with a limited frequency.  See the `latency` argument
> of `FSeventStreamCreate()`.  The kernel only signals every `latency`
> seconds, but does not guarantee that the kernel queue is completely
> drained, so we may have to wait more than one interval.  If we
> increase the frequency, the system is more likely to drop events.
> We avoid these issues by having each client thread create a unique
> cookie file and then wait until it is seen in the event stream.

Is it a guaranteed property of every API fsmonitor might need to work
with (Linux, Darwin, Windows) that if I perform a bunch of FS operations
on my working tree and finish up by touching this cookie file, the event
for the cookie will be delivered last?

I'd think that wouldn't be the case, i.e. on POSIX filesystems unless
you run around fsyncing both files and directories you're not guaranteed
that they're on disk, and even then the kernel might decide to sync your
cookie earlier, won't it?

E.g. on Linux you can even have cross-FS watches, and mix & match
different FS types. I'd expect to get events in whatever
implementation-defined order the VFS layer + FS decided to sync them to
disk in & get to firing off an event for me.

Or do these APIs all guarantee that a linear view of the world is
presented to the API consumer?
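
To make the assumption concrete, here is a stripped-down, single-threaded
sketch of the handshake as I read the commit message. The struct, enum and
function names are invented for illustration (the patch itself uses a
hashmap and separate threads), and the cookie path is a made-up example of
the <pid>-<seq> naming:

```c
#include <assert.h>
#include <string.h>

enum cookie_result { COOKIE_INIT = 0, COOKIE_SEEN };

struct cookie_item {
	const char *name;          /* path of the cookie file */
	enum cookie_result result; /* has the listener observed it yet? */
};

/*
 * Listener side: called once per path reported by the FS event
 * stream, in whatever order the platform delivers the events.
 */
static void cookie_on_event(struct cookie_item *items, size_t nr,
			    const char *path)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(items[i].name, path))
			items[i].result = COOKIE_SEEN;
}

/*
 * Client side: the response is released only after the client's own
 * cookie has shown up in the event stream.
 */
static int cookie_may_respond(const struct cookie_item *item)
{
	assert(item);
	return item->result == COOKIE_SEEN;
}
```

Releasing the response at that point is only correct if every event for an
earlier working tree change has already been delivered when the cookie
event arrives, which is exactly the ordering guarantee I'm asking about.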


> Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  builtin/fsmonitor--daemon.c | 228 +++++++++++++++++++++++++++++++++++-
>  fsmonitor--daemon.h         |   5 +
>  2 files changed, 232 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index 8249420ba18..25f18f2726b 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> @@ -94,6 +94,149 @@ static int do_as_client__status(void)
>  	}
>  }
>  
> +enum fsmonitor_cookie_item_result {
> +	FCIR_ERROR = -1, /* could not create cookie file ? */
> +	FCIR_INIT = 0,
> +	FCIR_SEEN,
> +	FCIR_ABORT,
> +};
> +
> +struct fsmonitor_cookie_item {
> +	struct hashmap_entry entry;
> +	const char *name;
> +	enum fsmonitor_cookie_item_result result;
> +};
> +
> +static int cookies_cmp(const void *data, const struct hashmap_entry *he1,
> +		     const struct hashmap_entry *he2, const void *keydata)
> +{
> +	const struct fsmonitor_cookie_item *a =
> +		container_of(he1, const struct fsmonitor_cookie_item, entry);
> +	const struct fsmonitor_cookie_item *b =
> +		container_of(he2, const struct fsmonitor_cookie_item, entry);

Re earlier comments about verbose names, these are just enums in a
builtin/*.c file, so a name like "cookie_item" is fine, and then the
whole thing might even fit on one line.. :)

> [...]
> +	/*
> +	 * We will write filesystem syncing cookie files into
> +	 * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>.
> +	 *
> +	 * The extra layers of subdirectories here keep us from
> +	 * changing the mtime on ".git/" or ".git/foo/" when we create
> +	 * or delete cookie files.
> +	 *
> +	 * There have been problems with some IDEs that do a
> +	 * non-recursive watch of the ".git/" directory and run a
> +	 * series of commands any time something happens.
> +	 *
> +	 * For example, if we place our cookie files directly in
> +	 * ".git/" or ".git/foo/" then a `git status` (or similar
> +	 * command) from the IDE will cause a cookie file to be
> +	 * created in one of those dirs.  This causes the mtime of
> +	 * those dirs to change.  This triggers the IDE's watch
> +	 * notification.  This triggers the IDE to run those commands
> +	 * again.  And the process repeats and the machine never goes
> +	 * idle.
> +	 *
> +	 * Adding the extra layers of subdirectories prevents the
> +	 * mtime of ".git/" and ".git/foo" from changing when a
> +	 * cookie file is created.
> +	 */
> +	strbuf_init(&state.path_cookie_prefix, 0);
> +	strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch);
> +
> +	strbuf_addch(&state.path_cookie_prefix, '/');
> +	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_DIR);
> +	mkdir(state.path_cookie_prefix.buf, 0777);
> +
> +	strbuf_addch(&state.path_cookie_prefix, '/');
> +	strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_DIR);
> +	mkdir(state.path_cookie_prefix.buf, 0777);
> +
> +	strbuf_addch(&state.path_cookie_prefix, '/');

So, on some stupid IDEs (would be nice to have specifics in the commit
message, which ones/versions?) this avoids causing infinite activity,
but on slightly more industrious stupid IDEs that would do their own
recursive watch we'll have the same problem?

Perhaps we should just consider creating it at the top-level and those
IDEs will eventually sort out their bugs, sooner rather than later if
this feature ships...

> +	strbuf_release(&state.path_cookie_prefix);
> +
> +	/*
> +	 * NEEDSWORK: Consider "rm -rf <gitdir>/<fsmonitor-dir>"
> +	 */

That would also make this trivial. Presumably it's a "needswork" because
you have this recursive structure, but if the cookie is at the top-level
we already did the unlink() above, so is the NEEDSWORK solved then?

* Re: [PATCH v3 34/34] t7527: test FS event reporing on MacOS WRT case and Unicode
  2021-07-01 14:47     ` [PATCH v3 34/34] t7527: test FS event reporing on MacOS WRT case and Unicode Jeff Hostetler via GitGitGadget
@ 2021-07-01 23:39       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-01 23:39 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:


> +	[ -d test_apfs/.git ] &&
> +	[ -d test_apfs/.GIT ] &&

Better as "test_path_is_dir".

* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-01 21:26         ` Ævar Arnfjörð Bjarmason
@ 2021-07-02 19:06           ` Jeff Hostetler
  2021-07-05 22:52             ` Ævar Arnfjörð Bjarmason
  2021-07-05 21:35           ` Johannes Schindelin
  1 sibling, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-02 19:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler, David Turner



On 7/1/21 5:26 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler wrote:
> 
>> On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>
>>>> Here is V3 of my patch series to add a builtin FSMonitor daemon to Git. I
>>>> rebased this series onto v2.32.0.
>>>>
>>>> V3 addresses most of the previous review comments and things we've learned
>>>> from our experimental testing of V2. (A version of V2 was shipped as an
>>>> experimental feature in the v2.32.0-based releases of Git for Windows and
>>>> VFS for Git.)
>>>>
>>>> There are still a few items that I need to address, but that list is getting
>>>> very short.
>>> ...
>>>>     fsmonitor-fs-listen-win32: stub in backend for Windows
>>>>     fsmonitor-fs-listen-macos: stub in backend for MacOS
>>> I left some light comments on the repo-settings.c part of this to
>>> follow
>>> up from a previous round.
>>
>> Thanks.
>>
>>> Any other testing of it is stalled by there being no linux backend
>>> for
>>> it as part of this series. I see from spelunking repos that Johannes had
>>> a WIP compat/fsmonitor/linux.c which looks like it could/should mostly
>>> work, but the API names all changed since then, and after a short try I
>>> gave up on trying to rebase it.
>>
>> The early Linux version was dropped because inotify does not give
>> recursive coverage -- only the requested directory.  Using inotify
>> requires adding a watch to each subdirectory (recursively) in the
>> worktree.  There's a system limit on the maximum number of watched
>> directories (defaults to 8K IIRC) and that limit is system-wide.
>>
>> Since the whole point was to support very large repos, using
>> inotify was a non-starter, so I removed the Linux version from our
>> patch series.  For example, the first repo I tried it on (outside
>> of the test suite) had 25K subdirectories.
>>
>> I'm told there is a new fanotify API in recent Linux kernels that
>> is a better fit for what we need, but we haven't had time to investigate
>> it yet.
> 
> That default limit is a bit annoying, but I don't see how it's a blocker
> in any way.
> 
> You simply adjust the limit. E.g. I deployed and tested the hook version
> of inotify (using watchman) in a sizable development environment, and
> have written my own code using the API. This was all before fanotify(7)
> existed. IIRC I set most of the limits to 2^24 or 2^20. I've used it
> with really large Git repos, including with David Turner's
> 2015-04-03-1M-git for testing (`git ls-files | wc -l` on that is around
> a quarter-million).
> 
> If you have a humongous repository and don't have root on your own
> machine you're out of luck. But I think most people who'd use this are
> either using their own laptop, or e.g. in a corporate setting where
> administrators would tweak the sysctl limits given the performance
> advantages (as I did).
> 
> Once you adjust the limits Linux deals with large trees just fine, it
> just has overly conservative limits for most things in sysctl. The API
> is a bit annoying, your event loop needs to run around and add watches.
> 
> AFAICT you need Linux 5.1 for fanotify(7) to be useful, e.g. Debian
> stable, RHEL etc. aren't using anything that new. So having an inotify
> backend as well as possibly a fanotify one would be very useful.
> 
> And linux.git's docs suggest that the default limit was bumped from 8192
> to 1M in v5.10, a really recent kernel, so if you've got that you've
> also got fanotify.
> 
> In any case, even if Linux's inotify limit was something hardcoded and
> impossible to change you could still use such an API to test & debug the
> basics of this feature on that platform.

Good points.  If the inotify limits can be increased into the millions
then we can revisit supporting it.  I do worry about possible race
conditions as we have to scan the worktree and add/delete a watch for
each directory, but we don't need to worry about that right now.
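
As a rough illustration of the capacity question, a daemon could compare
the number of directories it needs to watch against the current limit
before committing to inotify. This is only a sketch: the function names
are made up, and it assumes the limit is exposed by the
fs.inotify.max_user_watches sysctl, which the caller passes in as a path
such as /proc/sys/fs/inotify/max_user_watches:

```c
#include <stdio.h>

/*
 * Read a numeric limit from the given sysctl file.
 * Returns -1 if the file cannot be read or parsed (e.g. not Linux).
 */
static long read_inotify_watch_limit(const char *path)
{
	FILE *fp = fopen(path, "r");
	long limit = -1;

	if (!fp)
		return -1;
	if (fscanf(fp, "%ld", &limit) != 1)
		limit = -1;
	fclose(fp);
	return limit;
}

/*
 * inotify needs one watch per directory.
 * Returns 1 if the worktree fits, 0 if it does not, and -1 if the
 * limit is unknown.
 */
static int worktree_fits_watch_limit(long nr_dirs, long limit)
{
	if (limit < 0)
		return -1;
	return nr_dirs <= limit;
}
```

With the numbers from this thread, a worktree with 25K subdirectories
does not fit under a limit of 8192 but does fit once the limit is in the
millions.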

I do want to have Linux support eventually, but I was saving it for
a later effort (and/or looking for volunteers).  My IPC-based builtin
daemon complements the existing hook-based fsmonitor support that Ben
Peart and Kevin Willford added a few years ago.  That model (with its
PERL hook script and Watchman integration) works fine for Linux, so the
advantages of a builtin monitor aren't as compelling there.

For example, on Linux, hook process creation is fast, PERL is fast,
and it is easy to just apt-get a tool like Watchman.  But on
Windows, process creation is slow, NTFS is slow, PERL is available
as part of the Git-for-Windows distribution, and installing third-party
tools like Watchman onto a collection of enterprise users' machines is
a chore, so it made sense for us to pick platforms where the existing
hook model has issues and add other platforms later.

Besides, this patch series is already at 34 commits.  And some of
them are quite large.  Adding another platform would just make it
even larger.

Right now I'm more interested in the larger questions: do we
WANT to have a builtin fsmonitor, and do we like the overall
design of what I have here?  Picking up a new platform, whether
it is Linux, or AIX, or Solaris, or Nonstop, or whatever, should
nicely fit into one platform-specific file in compat/fsmonitor
and not take that long.

> 
>>> I'd really prefer for git not to have features that place free
>>> platforms
>>> at a disadvantage against proprietary platforms if it can be avoided,
>>> and in this case the lack of a Linux backend also means much less
>>> widespread testing of the feature among the development community / CI.
>>>
>>
>> This feature is always going to have platform-specific components,
>> so the lack of one platform or another should not stop us from
>> discussing it for the platforms that can be supported.
> 
> (I think per the above that's s/can be/are/)
> 
>> And given the size and complexity of the platform-specific code,
>> we should not assume that "just test it on Linux" is sufficient.
>> Yes, there are some common/shared routines/data structures in the
>> daemon, but hard/tricky parts are in the platform layer.
> 
> I think we're talking past each other a bit here. I'm not saying that
> you can get full or meaningful testing for it on Windows if you test it
> on Linux, or the other way around.
> 
> Of course there's platform-specific stuff, although there's also a lot
> of non-platform-specific stuff, so even a very basic implementation of
> inotify would make reviewing this easier / give access to more reviewers.
> 
> I'm saying that I prefer that Git as a free software project not be in
> the situation of saying the best use-case for a given size/shape of repo
> is to use Git in combination with proprietary operating systems over
> freely licensed ones.

I wouldn't worry about that.  Even without Watchman integration,
Linux runs things so much faster than anything else it's not funny.

If anything, we need things like fsmonitor on Windows to help keep
up with Linux.

> 
> IOW what the FSF has as a policy for GNU projects. Now, I think the FSF
> probably goes too far in that (famously removing rather obscure font
> rendering features from Emacs on OSX), but "manage lots of data" is a
> core feature of git.
> 

Jeff

* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-01 21:26         ` Ævar Arnfjörð Bjarmason
  2021-07-02 19:06           ` Jeff Hostetler
@ 2021-07-05 21:35           ` Johannes Schindelin
  2021-07-05 22:02             ` Ævar Arnfjörð Bjarmason
  2021-07-07  1:53             ` Felipe Contreras
  1 sibling, 2 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-05 21:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, David Turner

Hi Ævar,

On Thu, 1 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jul 01 2021, Jeff Hostetler wrote:
>
> > On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
> >
> >> Any other testing of it is stalled by there being no linux backend
> >> for it as part of this series. I see from spelunking repos that
> >> Johannes had a WIP compat/fsmonitor/linux.c which looks like it
> >> could/should mostly work, but the API names all changed since then,
> >> and after a short try I gave up on trying to rebase it.

I am a bit surprised that you gave up so easily, it cannot be that hard if
you use `git rebase -i` smartly.

But I think there is an even bigger obstacle lurking than just the
challenge of rebasing that experimental backend.

> > The early Linux version was dropped because inotify does not give
> > recursive coverage -- only the requested directory.  Using inotify
> > requires adding a watch to each subdirectory (recursively) in the
> > worktree.  There's a system limit on the maximum number of watched
> > directories (defaults to 8K IIRC) and that limit is system-wide.
> >
> > Since the whole point was to support very large repos, using
> > inotify was a non-starter, so I removed the Linux version from our
> > patch series.  For example, the first repo I tried it on (outside of
> > the test suite) had 25K subdirectories.
> >
> > I'm told there is a new fanotify API in recent Linux kernels that is a
> > better fit for what we need, but we haven't had time to investigate it
> > yet.
>
> That default limit is a bit annoying, but I don't see how it's a blocker
> in any way.

Let me help you to see it.

So let's assume that you start FSMonitor-enabled Git, with the default
values. What is going to happen if you have any decently-sized worktree?
You run out of handles. What then? Throw your hands in the air? Stop
working? Report incorrect results?

Those are real design challenges, and together with the race problems Jeff
mentioned, they pose a much bigger obstacle than the rebasing you
mentioned above.

> You simply adjust the limit. E.g. I deployed and tested the hook version
> of inotify (using watchman) in a sizable development environment, and
> have written my own code using the API. This was all before fanotify(7)
> existed. IIRC I set most of the limits to 2^24 or 2^20. I've used it
> with really large Git repos, including with David Turner's
> 2015-04-03-1M-git for testing (`git ls-files | wc -l` on that is around
> a quarter-million).
>
> If you have a humongous repository and don't have root on your own
> machine you're out of luck. But I think most people who'd use this are
> either using their own laptop, or e.g. in a corporate setting where
> administrators would tweak the sysctl limits given the performance
> advantages (as I did).

This conjecture that most people who'd use this are using their own laptop
or have a corporate setting where administrators would tweak the sysctl
limits according to engineers' wishes strikes me as totally made up from
thin air, nothing else.

In other words, I find it an incredibly unconvincing argument.

I prefer not to address the rest of your mail, as I found it not only a
lengthy tangent (basically trying to talk Jeff into adding Linux support
in what could have been a much shorter mail), but actively distracting
from the already long patch series Jeff presented. It is so long, in fact,
that we had to put in an exemption in GitGitGadget because it is already
longer than a usually-unreasonable 30 patches. Also, at this point,
insisting on Linux support (in so many words) is unhelpful.

Let me summarize why I think this is unhelpful: In Git, it is our
tradition to develop incrementally, for better or worse. Jeff's effort
brought us to a point where we already have Windows and macOS support,
i.e. support for the most prevalent development platforms (see e.g.
https://insights.stackoverflow.com/survey/2020#technology-developers-primary-operating-systems).
We already established multiple obstacles for Linux support, therefore
demanding Linux support to be included Right Now would increase the patch
series even further, making it even less reviewable, being even less
incremental, hold up the current known-to-work-well state, force Jeff to
work on something he probably cannot work on right now, and therefore
delaying the entire effort even further.

Ciao,
Johannes


* Re: [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-01 22:18       ` Ævar Arnfjörð Bjarmason
@ 2021-07-05 21:52         ` Johannes Schindelin
  2021-07-13 14:39         ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-05 21:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler


Hi Ævar,

On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>
> > +#ifdef GIT_WINDOWS_NATIVE
> > +/*
> > + * Create a background process to run the daemon.  It should be completely
> > + * disassociated from the terminal.
> > + *
> > + * Conceptually like `daemonize()` but different because Windows does not
> > + * have `fork(2)`.  Spawn a normal Windows child process but without the
> > + * limitations of `start_command()` and `finish_command()`.
> > + *
> > + * The child process execs the "git fsmonitor--daemon run" command.
> > + *
> > + * The current process returns so that the caller can wait for the child
> > + * to startup before exiting.
> > + */
> > +static int spawn_background_fsmonitor_daemon(pid_t *pid)
> > +{
> > +	char git_exe[MAX_PATH];
> > +	struct strvec args = STRVEC_INIT;
> > +	int in, out;
> > +
> > +	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
> > +
> > +	in = open("/dev/null", O_RDONLY);
> > +	out = open("/dev/null", O_WRONLY);
> > +
> > +	strvec_push(&args, git_exe);
> > +	strvec_push(&args, "fsmonitor--daemon");
> > +	strvec_push(&args, "run");
> > +	strvec_pushf(&args, "--ipc-threads=%d", fsmonitor__ipc_threads);
> > +
> > +	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
> > +	close(in);
> > +	close(out);
> > +
> > +	strvec_clear(&args);
> > +
> > +	if (*pid < 0)
> > +		return error(_("could not spawn fsmonitor--daemon in the background"));
> > +
> > +	return 0;
> > +}
> > +#else
> > +/*
> > + * Create a background process to run the daemon.  It should be completely
> > + * disassociated from the terminal.
> > + *
> > + * This is adapted from `daemonize()`.  Use `fork()` to directly
> > + * create and run the daemon in the child process.
> > + *
> > + * The fork-child can just call the run code; it does not need to exec
> > + * it.
> > + *
> > + * The fork-parent returns the child PID so that we can wait for the
> > + * child to startup before exiting.
> > + */
> > +static int spawn_background_fsmonitor_daemon(pid_t *pid)
> > +{
> > +	*pid = fork();
> > +
> > +	switch (*pid) {
> > +	case 0:
> > +		if (setsid() == -1)
> > +			error_errno(_("setsid failed"));
> > +		close(0);
> > +		close(1);
> > +		close(2);
> > +		sanitize_stdfds();
> > +
> > +		return !!fsmonitor_run_daemon();
> > +
> > +	case -1:
> > +		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
> > +
> > +	default:
> > +		return 0;
> > +	}
> > +}
> > +#endif
>
> The spawn_background_fsmonitor_daemon() function here is almost the same
> as daemonize().

Yes, the code comment above that function even says that it was adapted
from `daemonize()`.

And above that, of course, is a _completely different_ implementation that
works on Windows (you will notice that this is in stark contrast to the
`daemonize()` function, which will simply fail with `ENOSYS` on Windows).

> I wonder if this & the Windows-specific one you have here can't be
> refactored into an API from what's now in setup.c.

Given that there is no `fork()` on Windows (which has been the subject of
many a message to this mailing list), I think the answer to this question
of yours is a resounding "No".

> Then we could make builtin/gc.c and daemon.c use that, so Windows could
> have background GC, and we'd have a more battle-tested central codepath
> for this tricky bit.

Please. Not _more_ sidetracks.

The issue of getting `git gc --auto` to daemonize on Windows is a rather
complicated one. I won't bore this list with the details, but link to
https://github.com/git-for-windows/git/issues/2221#issuecomment-542589590
(a ~950 word analysis of the problem).

> It seems to me like the only limitations on it are to have this return
> slightly more general things (e.g. not set its own errors, return
> structured data), and maybe some callback for what to do in the
> child/parent.

And one version doesn't `die()`. Nor does it call `exit(0)` in the parent
process. But it calls `fsmonitor_listen()` in the child process. And if
you wanted to refactor `daemonize()` to do all that, it would have to be
renamed (because it would no longer _necessarily_ daemonize), and it would
have to have a `gentle` flag, and it would somehow have to indicate in its
return value whether `0` means that the parent process returned
successfully or the child process. And soon we'll end up with a function
that is both longer and more unreadable than what we have right now.

>
> > +/*
> > + * This is adapted from `wait_or_whine()`.  Watch the child process and
> > + * let it get started and begin listening for requests on the socket
> > + * before reporting our success.
> > + */
> > +static int wait_for_background_startup(pid_t pid_child)
> > +{
> > +	int status;
> > +	pid_t pid_seen;
> > +	enum ipc_active_state s;
> > +	time_t time_limit, now;
> > +
> > +	time(&time_limit);
> > +	time_limit += fsmonitor__start_timeout_sec;
> > +
> > +	for (;;) {
> > +		pid_seen = waitpid(pid_child, &status, WNOHANG);
> > +
> > +		if (pid_seen == -1)
> > +			return error_errno(_("waitpid failed"));
> > +		else if (pid_seen == 0) {
> > +			/*
> > +			 * The child is still running (this should be
> > +			 * the normal case).  Try to connect to it on
> > +			 * the socket and see if it is ready for
> > +			 * business.
> > +			 *
> > +			 * If there is another daemon already running,
> > +			 * our child will fail to start (possibly
> > +			 * after a timeout on the lock), but we don't
> > +			 * care (who responds) if the socket is live.
> > +			 */
> > +			s = fsmonitor_ipc__get_state();
> > +			if (s == IPC_STATE__LISTENING)
> > +				return 0;
> > +
> > +			time(&now);
> > +			if (now > time_limit)
> > +				return error(_("fsmonitor--daemon not online yet"));
> > +		} else if (pid_seen == pid_child) {
> > +			/*
> > +			 * The new child daemon process shutdown while
> > +			 * it was starting up, so it is not listening
> > +			 * on the socket.
> > +			 *
> > +			 * Try to ping the socket in the odd chance
> > +			 * that another daemon started (or was already
> > +			 * running) while our child was starting.
> > +			 *
> > +			 * Again, we don't care who services the socket.
> > +			 */
> > +			s = fsmonitor_ipc__get_state();
> > +			if (s == IPC_STATE__LISTENING)
> > +				return 0;
> > +
> > +			/*
> > +			 * We don't care about the WEXITSTATUS() nor
> > +			 * any of the WIF*(status) values because
> > +			 * `cmd_fsmonitor__daemon()` does the `!!result`
> > +			 * trick on all function return values.
> > +			 *
> > +			 * So it is sufficient to just report the
> > +			 * early shutdown as an error.
> > +			 */
> > +			return error(_("fsmonitor--daemon failed to start"));
> > +		} else
> > +			return error(_("waitpid is confused"));
> > +	}
> > +}
>
> Ditto this. Could we extend the wait_or_whine() function (or some
> extended version thereof) to do what you need with callbacks?
>
> It seems the main difference is just being able to pass down a flag for
> waitpid(), and the loop needing to check EINTR or not depending on
> whether WNOHANG is passed.

Given that over half of `wait_or_whine()` is concerned with signals, which
the `wait_for_background_startup()` function is not at all concerned with,
I see another main difference.

> For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
> behavior with an adjusted wait_or_whine(). Wouldn't it be better to
> report what exit status it exits with e.g. if the top-level process is
> signalled? We do so in trace2 for other things we spawn...

The `wait_or_whine()` function also adjusts `atexit()` behavior, which we
would not want here.

Therefore, just like the suggestion about `daemonize()` above, it appears
to me as if the suggested refactoring would make the code dramatically
more complex and less readable.

In other words, it would be a refactoring for refactoring's sake.
Definitely not something I would suggest.

Ciao,
Johannes


* Re: [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-01 22:29       ` Ævar Arnfjörð Bjarmason
@ 2021-07-05 22:00         ` Johannes Schindelin
  2021-07-12 19:23         ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-05 22:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler


Hi Ævar,

On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>
> > From: Jeff Hostetler <jeffhost@microsoft.com>
> >
> > Create a manual page describing the `git fsmonitor--daemon` feature.
> >
> > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> > ---
> >  Documentation/git-fsmonitor--daemon.txt | 75 +++++++++++++++++++++++++
> >  1 file changed, 75 insertions(+)
> >  create mode 100644 Documentation/git-fsmonitor--daemon.txt
> >
> > diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
> > new file mode 100644
> > index 00000000000..154e7684daa
> > --- /dev/null
> > +++ b/Documentation/git-fsmonitor--daemon.txt
> > @@ -0,0 +1,75 @@
> > +git-fsmonitor--daemon(1)
> > +========================
> > +
> > +NAME
> > +----
> > +git-fsmonitor--daemon - A Built-in File System Monitor
> > +
> > +SYNOPSIS
> > +--------
> > +[verse]
> > +'git fsmonitor--daemon' start
> > +'git fsmonitor--daemon' run
> > +'git fsmonitor--daemon' stop
> > +'git fsmonitor--daemon' status
> > +
> > +DESCRIPTION
> > +-----------
> > +
> > +A daemon to watch the working directory for file and directory
> > +changes using platform-specific file system notification facilities.
> > +
> > +This daemon communicates directly with commands like `git status`
> > +using the link:technical/api-simple-ipc.html[simple IPC] interface
> > +instead of the slower linkgit:githooks[5] interface.
> > +
> > +This daemon is built into Git so that no third-party tools are
> > +required.
> > +
> > +OPTIONS
> > +-------
> > +
> > +start::
> > +	Starts a daemon in the background.
> > +
> > +run::
> > +	Runs a daemon in the foreground.
> > +
> > +stop::
> > +	Stops the daemon running in the current working
> > +	directory, if present.
> > +
> > +status::
> > +	Exits with zero status if a daemon is watching the
> > +	current working directory.
> > +
> > +REMARKS
> > +-------
> > +
> > +This daemon is a long running process used to watch a single working
> > +directory and maintain a list of the recently changed files and
> > +directories.  Performance of commands such as `git status` can be
> > +increased if they just ask for a summary of changes to the working
> > +directory and can avoid scanning the disk.
> > +
> > +When `core.useBuiltinFSMonitor` is set to `true` (see
> > +linkgit:git-config[1]) commands, such as `git status`, will ask the
> > +daemon for changes and automatically start it (if necessary).
> > +
> > +For more information see the "File System Monitor" section in
> > +linkgit:git-update-index[1].
> > +
> > +CAVEATS
> > +-------
> > +
> > +The fsmonitor daemon does not currently know about submodules and does
> > +not know to filter out file system events that happen within a
> > +submodule.  If fsmonitor daemon is watching a super repo and a file is
> > +modified within the working directory of a submodule, it will report
> > +the change (as happening against the super repo).  However, the client
> > +will properly ignore these extra events, so performance may be affected
> > +but it will not cause an incorrect result.
> > +
> > +GIT
> > +---
> > +Part of the linkgit:git[1] suite
>
> Later in the series we incrementally add features to the daemon, so this
> is describing a state that doesn't exist yet at this point.
>
> I think it would be better to start with a stub here and add
> documentation as we add features, e.g. the patch that adds "start" should
> add that to the synopsis + options etc.

Incidentally, I had structured the patch series that way until Jeff took
over from me last year. And it was definitely not more reviewable because
it was not clear from the get-go what I intended this command to do.

Therefore, I object to your suggestion.

Ciao,
Johannes

P.S.: I will not address your other reviews in this patch series, mostly
because I am technically off from work this week.


* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-05 21:35           ` Johannes Schindelin
@ 2021-07-05 22:02             ` Ævar Arnfjörð Bjarmason
  2021-07-06 13:12               ` Johannes Schindelin
  2021-07-07  1:53             ` Felipe Contreras
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-05 22:02 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, David Turner


On Mon, Jul 05 2021, Johannes Schindelin wrote:

> On Thu, 1 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>
>> On Thu, Jul 01 2021, Jeff Hostetler wrote:
>>
>> > On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
> [...]
>> > The early Linux version was dropped because inotify does not give
>> > recursive coverage -- only the requested directory.  Using inotify
>> > requires adding a watch to each subdirectory (recursively) in the
>> > worktree.  There's a system limit on the maximum number of watched
>> > directories (defaults to 8K IIRC) and that limit is system-wide.
>> >
>> > Since the whole point was to support very large repos, using
>> > inotify was a non-starter, so I removed the Linux version from our
>> > patch series.  For example, the first repo I tried it on (outside of
>> > the test suite) had 25K subdirectories.
>> >
>> > I'm told there is a new fanotify API in recent Linux kernels that is a
>> > better fit for what we need, but we haven't had time to investigate it
>> > yet.
>>
>> That default limit is a bit annoying, but I don't see how it's a blocker
>> in any way.
>
> Let me help you to see it.
>
> So let's assume that you start FSMonitor-enabled Git, with the default
> values. What is going to happen if you have any decently-sized worktree?
> You run out of handles. What then? Throw your hands in the air? Stop
> working? Report incorrect results?
>
> Those are real design challenges, and together with the race problems Jeff
> mentioned, they pose a much bigger obstacle than the rebasing you
> mentioned above.

You report an error and tell the user to raise the limit, and cover this
in your install docs. It's what watchman does:

https://github.com/facebook/watchman/blob/master/watchman/error_category.cpp#L28-L45
https://facebook.github.io/watchman/docs/install.html#system-specific-preparation
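
To make that concrete, here is a minimal C sketch for Linux of the
"report an error and tell the user to raise the limit" approach. The
helper names (inotify_limit_hint, add_watch_or_explain) are made up for
this mail and exist in neither Git nor watchman; the only real interfaces
assumed are inotify_add_watch(2) and the fs.inotify.* sysctls.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/*
 * Hypothetical helper: map an errno from inotify_init1() or
 * inotify_add_watch() to advice for the user.  Returns NULL when the
 * error is not related to a tunable limit.
 */
const char *inotify_limit_hint(int err)
{
	switch (err) {
	case ENOSPC:
		/* inotify_add_watch(2): the per-user watch limit was hit */
		return "raise fs.inotify.max_user_watches (see sysctl(8))";
	case EMFILE:
		/* inotify_init1(2): instance limit or open-file limit hit */
		return "raise fs.inotify.max_user_instances or the open file limit";
	default:
		return NULL;
	}
}

/*
 * Hypothetical wrapper: watch one directory and, on failure, print the
 * error plus a hint (if any) instead of silently giving up.
 * Returns the watch descriptor, or -1 with errno preserved.
 */
int add_watch_or_explain(int fd, const char *path)
{
	int wd = inotify_add_watch(fd, path,
				   IN_CREATE | IN_DELETE | IN_MODIFY |
				   IN_MOVED_FROM | IN_MOVED_TO | IN_ATTRIB);

	if (wd < 0) {
		int saved_errno = errno;
		const char *hint = inotify_limit_hint(saved_errno);

		fprintf(stderr, "error: cannot watch '%s': %s\n",
			path, strerror(saved_errno));
		if (hint)
			fprintf(stderr, "hint: %s\n", hint);
		errno = saved_errno;
	}
	return wd;
}
```

A daemon doing this would presumably stop watching and let the client
fall back to a full scan rather than report incomplete results, but that
policy is outside this sketch.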

>> You simply adjust the limit. E.g. I deployed and tested the hook version
>> of inotify (using watchman) in a sizable development environment, and
>> wrote my own code using the API. This was all before fanotify(7)
>> existed. IIRC I set most of the limits to 2^24 or 2^20. I've used it
>> with really large Git repos, including with David Turner's
>> 2015-04-03-1M-git for testing (`git ls-files | wc -l` on that is around
>> a quarter-million).
>>
>> If you have a humongous repository and don't have root on your own
>> machine you're out of luck. But I think most people who'd use this are
>> either using their own laptop, or e.g. in a corporate setting where
>> administrators would tweak the sysctl limits given the performance
>> advantages (as I did).
>
> This conjecture that most people who'd use this are using their own laptop
> or have a corporate setting where administrators would tweak the sysctl
> limits according to engineers' wishes strikes me as totally made up from
> thin air, nothing else.
>
> In other words, I find it an incredibly unconvincing argument.

It's from a sample size of one experience of deploying this in a BigCorp
setting.

But sure, perhaps things are done differently where you work/have
worked. My experience is that even if you're dealing with some BOFHs and
e.g. are using shared racked development servers it's generally not an
insurmountable task to get some useful software installed, or some
system configuration tweaked.

In this case we're talking about ~40MB of kernel memory for 1 million
dirs IIRC, that coupled with the target audience that benefits most from
this probably being deployments that are *painfully* aware of their "git
status" slowness...

> I prefer not to address the rest of your mail, as I found it not only a
> lengthy tangent (basically trying to talk Jeff into adding Linux support
> in what could have been a much shorter mail), but actively distracting
> from the already long patch series Jeff presented. It is so long, in fact,
> that we had to put in an exemption in GitGitGadget because it is already
> longer than a usually-unreasonable 30 patches. Also, at this point,
> insisting on Linux support (in so many words) is unhelpful.

This part of the thread started because Jeff H. claimed upthread:

    [...]inotify was a non-starter, so I removed the Linux version from
    our patch series

But as I noted, it works just fine; you just need to change some
sysctl limits.
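
As a rough sketch of what checking that limit up front could look like
(as opposed to waiting for a watch to fail), the C below reads the
current value from procfs and compares it with the number of directories
to be watched. All function names are invented for illustration; the
procfs path is the standard location of the fs.inotify.max_user_watches
sysctl, and per inotify(7) the limit is per user, so other watchers owned
by the same user count against it too.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text of a /proc/sys/fs/inotify/ file; -1 on bad input. */
long parse_inotify_limit(const char *text)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(text, &end, 10);
	if (errno || end == text || value < 0)
		return -1;
	return value;
}

/* Read the current per-user watch limit; -1 if it cannot be read. */
long read_inotify_watch_limit(void)
{
	char buf[64];
	long limit = -1;
	FILE *fp = fopen("/proc/sys/fs/inotify/max_user_watches", "r");

	if (!fp)
		return -1;
	if (fgets(buf, sizeof(buf), fp))
		limit = parse_inotify_limit(buf);
	fclose(fp);
	return limit;
}

/*
 * Hypothetical policy check: can nr_dirs directories be watched under
 * this limit?  Ignores watches already in use by other processes.
 */
int inotify_limit_is_sufficient(long limit, long nr_dirs)
{
	return limit >= 0 && nr_dirs <= limit;
}
```

With the numbers from this thread, a limit of 8192 is not sufficient for
a worktree with 25K subdirectories, while a limit of 2^20 is.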

It seems at this point we're debating whether some installations of
Linux have BOFH-y enough administrators that they won't tweak sysctl
limits for you. OK, but given that I've run this thing in a production
setting it's clearly not a "non-starter". I think it could be useful for
a lot of users.

I'll reply with more (and hopefully helpful) specifics to Jeff's mail.

> Let me summarize why I think this is unhelpful: In Git, it is our
> tradition to develop incrementally, for better or worse. Jeff's effort
> brought us to a point where we already have Windows and macOS support,
> i.e. support for the most prevalent development platforms (see e.g.
> https://insights.stackoverflow.com/survey/2020#technology-developers-primary-operating-systems).
> We already established multiple obstacles for Linux support, therefore
> demanding Linux support to be included Right Now would increase the patch
> series even further, making it even less reviewable, being even less
> incremental, hold up the current known-to-work-well state, force Jeff to
> work on something he probably cannot work on right now, and therefore
> delaying the entire effort even further.

I think we just disagree. I wouldn't call my opinion "unhelpful" any
more than yours.

I don't think Git's ever had anything like a major feature (built in,
config settings, etc. etc.) contributed by a proprietary OS vendor that
works on that vendor's OS, as well as another vendor's proprietary OS,
but not comparable free systems.

Is that less incremental and perhaps less practical? Sure. It's not an
entirely practical viewpoint. I work on free software partly for
idealistic reasons. I'd prefer if the project I'm working on doesn't
give users a carrot to pick proprietary systems over free ones.

But ultimately it's not my call, but Junio's.





* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-02 19:06           ` Jeff Hostetler
@ 2021-07-05 22:52             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-05 22:52 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler, David Turner


On Fri, Jul 02 2021, Jeff Hostetler wrote:

> On 7/1/21 5:26 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler wrote:
>> 
>>> On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>>
>>>>> Here is V3 of my patch series to add a builtin FSMonitor daemon to Git. I
>>>>> rebased this series onto v2.32.0.
>>>>>
>>>>> V3 addresses most of the previous review comments and things we've learned
>>>>> from our experimental testing of V2. (A version of V2 was shipped as an
>>>>> experimental feature in the v2.32.0-based releases of Git for Windows and
>>>>> VFS for Git.)
>>>>>
>>>>> There are still a few items that I need to address, but that list is getting
>>>>> very short.
>>>> ...
>>>>>     fsmonitor-fs-listen-win32: stub in backend for Windows
>>>>>     fsmonitor-fs-listen-macos: stub in backend for MacOS
>>>> I left some light comments on the repo-settings.c part of this to
>>>> follow up from a previous round.
>>>
>>> Thanks.
>>>
>>>> Any other testing of it is stalled by there being no linux backend for
>>>> it as part of this series. I see from spelunking repos that Johannes had
>>>> a WIP compat/fsmonitor/linux.c which looks like it could/should mostly
>>>> work, but the API names all changed since then, and after a short try I
>>>> gave up on trying to rebase it.
>>>
>>> The early Linux version was dropped because inotify does not give
>>> recursive coverage -- only the requested directory.  Using inotify
>>> requires adding a watch to each subdirectory (recursively) in the
>>> worktree.  There's a system limit on the maximum number of watched
>>> directories (defaults to 8K IIRC) and that limit is system-wide.
>>>
>>> Since the whole point was to support very large repos, using
>>> inotify was a non-starter, so I removed the Linux version from our
>>> patch series.  For example, the first repo I tried it on (outside
>>> of the test suite) had 25K subdirectories.
>>>
>>> I'm told there is a new fanotify API in recent Linux kernels that
>>> is a better fit for what we need, but we haven't had time to investigate
>>> it yet.
>>
>> That default limit is a bit annoying, but I don't see how it's a
>> blocker in any way.
>>
>> You simply adjust the limit. E.g. I deployed and tested the hook
>> version of inotify (using watchman) in a sizable development
>> environment, and wrote my own code using the API. This was all before
>> fanotify(7) existed. IIRC I set most of the limits to 2^24 or 2^20.
>> I've used it with really large Git repos, including with David Turner's
>> 2015-04-03-1M-git for testing (`git ls-files | wc -l` on that is around
>> a quarter-million).
>>
>> If you have a humongous repository and don't have root on your own
>> machine you're out of luck. But I think most people who'd use this are
>> either using their own laptop, or e.g. in a corporate setting where
>> administrators would tweak the sysctl limits given the performance
>> advantages (as I did).
>>
>> Once you adjust the limits Linux deals with large trees just fine, it
>> just has overly conservative limits for most things in sysctl. The API
>> is a bit annoying, your event loop needs to run around and add watches.
>>
>> AFAICT you need Linux 5.1 for fanotify(7) to be useful, e.g. Debian
>> stable, RHEL etc. aren't using anything that new. So having an inotify
>> backend as well as possibly a fanotify one would be very useful.
>>
>> And linux.git's docs suggest that the default limit was bumped from
>> 8192 to 1M in v5.10, a really recent kernel, so if you've got that
>> you've also got fanotify.
>>
>> In any case, even if Linux's inotify limit was something hardcoded and
>> impossible to change you could still use such an API to test & debug the
>> basics of this feature on that platform.
>
> Good points.  If the inotify limits can be increased into the millions
> then we can revisit supporting it.

I also replied in the side-thread on this topic:
https://lore.kernel.org/git/874kd874qv.fsf@evledraar.gmail.com/

> [...] I do worry about possible race
> conditions as we have to scan the worktree and add/delete a watch for
> each directory, but we don't need to worry about that right now.

In the watchman code there's a scheduleRecrawl() in such cases, but it's
not just something that happens on Linux.
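
For what it's worth, one well-known case on Linux that forces this kind
of full rescan is an inotify queue overflow, where the kernel drops
events and reports IN_Q_OVERFLOW. Below is a minimal C sketch of
detecting it. The name inotify_needs_rescan is invented here, and this
is not how watchman's code is structured; it only illustrates the idea.

```c
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/*
 * Hypothetical helper: walk a buffer filled by read(2) on an inotify
 * descriptor and report whether the kernel dropped events.
 * Returns 1 if the worktree must be rescanned from scratch, else 0.
 */
int inotify_needs_rescan(const char *buf, size_t len)
{
	size_t off = 0;

	while (off + sizeof(struct inotify_event) <= len) {
		struct inotify_event ev;

		/* copy the fixed-size header; the buffer may be unaligned */
		memcpy(&ev, buf + off, sizeof(ev));
		if (ev.mask & IN_Q_OVERFLOW)
			return 1;
		/* each record is followed by ev.len bytes of padded name */
		off += sizeof(struct inotify_event) + ev.len;
	}
	return 0;
}
```

The caller would then throw away whatever it believes about the tree,
re-walk it, and re-add the watches, which is the same recovery a
Windows backend would need after ERROR_NOTIFY_ENUM_DIR.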

On Windows it's if ReadDirectoryChangesW() returns
ERROR_NOTIFY_ENUM_DIR. This builtin version doesn't check that at
all. Here's someone who seems to work at MSFT talking about it:

    https://devblogs.microsoft.com/oldnewthing/20110812-00/?p=9913

So isn't this an issue on Windows as well?

> I do want to have Linux support eventually, but I was saving it for
> a later effort (and/or looking for volunteers).

OK, upthread you said it was a "non-starter" because of the adjustable
limits we've been discussing...

> My IPC-based builtin daemon complements the existing hook-based
> fsmonitor support that Ben Peart and Kevin Willford added a few years
> ago.  That model (and PERL hook script and Watchman integration) work
> fine for Linux, so the advantages of a builtin monitor aren't as
> compelling.
>
> For example, on Linux, hook process creation is fast, PERL is fast,[...]

Did you or Ben ever try to reproduce what I noted at:

    https://lore.kernel.org/git/CACBZZX5e58bWuf3NdDYTxu2KyZj29hHONzN=rp-7vXd8nURyWQ@mail.gmail.com/

And more recently with reference to that at:

    https://lore.kernel.org/git/87h7lgfchm.fsf@evledraar.gmail.com/

I.e. when I tested it back then the issue wasn't Perl's performance, it
was something between watchman and the user, happening in
git. I.e. watchman would return in the single-digit milliseconds, git
would take hundreds or thousands of ms.

Maybe I missed some intervening analysis, but this didn't have to do
with process overhead back then.

Also, watchman has a JSON network interface. So for a "native" solution
I also wonder if you or Ben tried a solution that involved just talking
to that over the local network. The hook/perl script was just a trivial
aid to make doing that easier. Wouldn't that be just as fast (or near
enough) on Windows?

Ben mentioned in the linked thread that there were some talks between
Microsoft (or well, the "Git team") and the watchman people. So again, I
missed some in-between discussions. I'm just wondering about the design
choices...

> and it is easy to just apt-get a tool like Watchman.  But on
> Windows, process creation is slow, NTFS is slow, PERL is available
> as part of the Git-for-Windows distribution, and installing third-party
> tools like Watchman onto a collection of enterprise users' machines is
> a chore, so it made sense for us to pick platforms where the existing
> hook model has issues and add other platforms later.

If we're trying to do an end-run around package systems being a pain
would including the relevant part of watchman in contrib + talking to it
over its network interface be a replacement for some/most/all of this
series?

> Besides, this patch series is already at 34 commits.  And some of
> them are quite large.  Adding another platform would just make it
> even larger.
>
> Right now I'm more interested in the larger question of whether
> we WANT to have a builtin fsmonitor and do we like the overall
> design of what I have here?  Picking up a new platform, whether
> it is Linux, or AIX, or Solaris, or Nonstop, or whatever, should
> nicely fit into one platform-specific file in compat/fsmonitor
> and not take that long.

Sure, FWIW I'm not opposed to this being a built-in per-se, or us
re-implementing parts of watchman here. I'm not able to test this
series meaningfully, and a lot of these questions are probably easy to
answer on a supported OS...

>>>> I'd really prefer for git not to have features that place free platforms
>>>> at a disadvantage against proprietary platforms if it can be avoided,
>>>> and in this case the lack of a Linux backend also means much less
>>>> widespread testing of the feature among the development community / CI.
>>>>
>>>
>>> This feature is always going to have platform-specific components,
>>> so the lack of one platform or another should not stop us from
>>> discussing it for the platforms that can be supported.
>> (I think per the above that's s/can be/are/)
>> 
>>> And given the size and complexity of the platform-specific code,
>>> we should not assume that "just test it on Linux" is sufficient.
>>> Yes, there are some common/shared routines/data structures in the
>>> daemon, but hard/tricky parts are in the platform layer.
>>
>> I think we're talking past each other a bit here. I'm not saying that
>> you can get full or meaningful testing for it on Windows if you test it
>> on Linux, or the other way around.
>>
>> Of course there's platform-specific stuff, although there's also a lot
>> of non-platform-specific stuff, so even a very basic implementation of
>> inotify would make reviewing this easier / give access to more reviewers.
>>
>> I'm saying that I prefer that Git as a free software project not be in
>> the situation of saying the best use-case for a given size/shape of repo
>> is to use Git in combination with proprietary operating systems over
>> freely licensed ones.
>
> I wouldn't worry about that.  Even without Watchman integration,
> Linux runs things so much faster than anything else it's not funny.
>
> If anything, we need things like fsmonitor on Windows to help keep
> up with Linux.

Even on Linux if you do a "git status" on a humongous repository it can
take 1-2 seconds (see old but still valid linked numbers above), whereas
if you've been logging fs events and ask a daemon "what changed since
time xyz" it can take <5ms.

But that's just been on repositories I've tested. I'd assumed that this
part of GVFS wasn't something that could have been fairly easily
replaced by having the relevant developers run a Linux distro in vmware
or whatever, and that's been consistent with my own testing on
still-big-but-smaller repos than that.


* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-05 22:02             ` Ævar Arnfjörð Bjarmason
@ 2021-07-06 13:12               ` Johannes Schindelin
  2021-07-07  2:14                 ` Felipe Contreras
  0 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-06 13:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, David Turner


Hi Ævar,

On Tue, 6 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> I think we just disagree. I wouldn't call my opinion "unhelpful" any
> more than yours.

You still misunderstand. This is not about any "opinion" of yours, it is
about your delay tactics to make it deliberately difficult to finish this
patch series, by raising the bar beyond what is reasonable for a single
patch series.

And you keep doing it. I would appreciate if you just stopped with all
those tangents and long and many replies that do not seem designed to help
the patch series stabilize, but do the opposite.

Ciao,
Johannes


* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-01 14:47     ` [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
  2021-07-01 23:02       ` Ævar Arnfjörð Bjarmason
@ 2021-07-06 19:09       ` Johannes Schindelin
  2021-07-13 15:18         ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-06 19:09 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget
  Cc: git, Jeff Hostetler, Derrick Stolee, Jeff Hostetler, Jeff Hostetler

Hi Jeff,


On Thu, 1 Jul 2021, Jeff Hostetler via GitGitGadget wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Teach the win32 backend to register a watch on the working tree
> root directory (recursively).  Also watch the <gitdir> if it is
> not inside the working tree.  And to collect path change notifications
> into batches and publish.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
>  1 file changed, 530 insertions(+)

> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> index 880446b49e3..d707d47a0d7 100644
> --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> @@ -2,20 +2,550 @@
> + [...]
> +
> +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
> +				      const char *path)
> +{
> +	struct one_watch *watch = NULL;
> +	DWORD desired_access = FILE_LIST_DIRECTORY;
> +	DWORD share_mode =
> +		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
> +	HANDLE hDir;
> +
> +	hDir = CreateFileA(path,
> +			   desired_access, share_mode, NULL, OPEN_EXISTING,
> +			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
> +			   NULL);

The `*A()` family of Win32 API functions disagree with Git in one very
interesting aspect: Git always assumes UTF-8, while e.g. `CreateFileA()`
will use the current Win32 locale to internally transform to wide
characters and then call `CreateFileW()`.

This poses no problem when your locale is US American and your paths
contain no non-ASCII characters.

In the Git for Windows bug tracker, it was reported that it _does_ cause
problems when venturing outside such a cozy scenario (for full details,
see https://github.com/git-for-windows/git/issues/3262)
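
To illustrate the mismatch, here is a minimal, portable sketch (it does
not call the Win32 API). It decodes the same bytes twice: once as UTF-8,
and once as a single-byte code page, modeled here as Latin-1, which is an
assumption that matches Windows-1252 only for bytes 0xA0-0xFF. The helper
names `decode_utf8()` and `decode_latin1()` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Single-byte code page model: every byte becomes one code point. */
static size_t decode_latin1(const unsigned char *s, size_t n, uint32_t *out)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = s[i];
	return n;
}

/*
 * Minimal UTF-8 decoder (1- to 4-byte sequences, no overlong checks).
 * Returns the number of code points, or (size_t)-1 on malformed input.
 */
static size_t decode_utf8(const unsigned char *s, size_t n, uint32_t *out)
{
	size_t i = 0, count = 0;

	while (i < n) {
		uint32_t cp;
		size_t extra, k;

		if (s[i] < 0x80) {
			cp = s[i];
			extra = 0;
		} else if ((s[i] & 0xE0) == 0xC0) {
			cp = s[i] & 0x1F;
			extra = 1;
		} else if ((s[i] & 0xF0) == 0xE0) {
			cp = s[i] & 0x0F;
			extra = 2;
		} else if ((s[i] & 0xF8) == 0xF0) {
			cp = s[i] & 0x07;
			extra = 3;
		} else {
			return (size_t)-1;
		}

		/* The continuation bytes must all be present. */
		if (extra > n - i - 1)
			return (size_t)-1;
		for (k = 1; k <= extra; k++) {
			if ((s[i + k] & 0xC0) != 0x80)
				return (size_t)-1;
			cp = (cp << 6) | (s[i + k] & 0x3F);
		}
		out[count++] = cp;
		i += extra + 1;
	}
	return count;
}
```

For the UTF-8 bytes of "café" (`63 61 66 C3 A9`), the UTF-8 decoder
yields four code points ending in U+00E9, while the single-byte reading
yields five, ending in U+00C3 U+00A9. A path converted that way names a
different file than the one Git meant.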

I need this (and merged it before starting the process to release Git for
Windows v2.32.0(2)) to fix that (could I ask you to integrate this in case
a re-roll will become necessary?):

-- snipsnap --
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 5 Jul 2021 13:51:05 +0200
Subject: [PATCH] fixup! fsmonitor-fs-listen-win32: implement FSMonitor backend
 on Windows

Let's keep avoiding the `*A()` family of Win32 API functions because
they are susceptible to incoherent encoding problems. In Git for
Windows, we always assume paths to be UTF-8 encoded. Let's use the
dedicated helper to convert such a path to the wide character version,
and then use the `*W()` function instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 compat/fsmonitor/fsmonitor-fs-listen-win32.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
index ba087b292df..3b42ab311d9 100644
--- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
+++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
@@ -111,8 +111,14 @@ static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
 	DWORD share_mode =
 		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
 	HANDLE hDir;
+	wchar_t wpath[MAX_PATH];

-	hDir = CreateFileA(path,
+	if (xutftowcs_path(wpath, path) < 0) {
+		error(_("could not convert to wide characters: '%s'"), path);
+		return NULL;
+	}
+
+	hDir = CreateFileW(wpath,
 			   desired_access, share_mode, NULL, OPEN_EXISTING,
 			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
 			   NULL);
--
2.32.0.windows.1.15.gf1590a75e2d



* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-05 21:35           ` Johannes Schindelin
  2021-07-05 22:02             ` Ævar Arnfjörð Bjarmason
@ 2021-07-07  1:53             ` Felipe Contreras
  1 sibling, 0 replies; 237+ messages in thread
From: Felipe Contreras @ 2021-07-07  1:53 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, David Turner

Johannes Schindelin wrote:

> In Git, it is our tradition to develop incrementally, for better or
> worse.

We develop incrementally, but the first version of a big feature has to
have as many eyes on it as possible in order to track potential issues
in the future.

It is very common to look back at an API and say "oh, if only we had
thought of that back then". Therefore we should do our best to think
about it now.

We want to avoid future aw-shucks.

> Jeff's effort
> brought us to a point where we already have Windows and macOS support,
> i.e. support for the most prevalent development platforms (see e.g.
> https://insights.stackoverflow.com/survey/2020#technology-developers-primary-operating-systems).
> We already established multiple obstacles for Linux support, therefore
> demanding Linux support to be included Right Now would increase the patch
> series even further, making it even less reviewable, being even less
> incremental, hold up the current known-to-work-well state, force Jeff to
> work on something he probably cannot work on right now, and therefore
> delaying the entire effort even further.

This is a red herring.

You don't need to send the Linux support as part of the patch series,
you can simply provide a branch for the people that want to give it a
try.

Even if it's not ready for wide use, even if it needs a specific version
of Linux, even if it's a proof of concept, you can provide it.

The real reason it's not provided is laziness (this is not an attack,
laziness is a good trait in a programmer, although not always).

Cheers.

-- 
Felipe Contreras


* Re: [PATCH v3 00/34] Builtin FSMonitor Feature
  2021-07-06 13:12               ` Johannes Schindelin
@ 2021-07-07  2:14                 ` Felipe Contreras
  0 siblings, 0 replies; 237+ messages in thread
From: Felipe Contreras @ 2021-07-07  2:14 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, David Turner

Johannes Schindelin wrote:
> On Tue, 6 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
> 
> > I think we just disagree. I wouldn't call my opinion "unhelpful" any
> > more than yours.
> 
> You still misunderstand. This is not about any "opinion" of yours, it is
> about your delay tactics to make it deliberately difficult to finish this
> patch series,

The whole job of a reviewer is to *delay* the inclusion of the patch
series being reviewed until it has passed his/her personal standards.

Ævar is just doing his job (metaphorically).

> by raising the bar beyond what is reasonable for a single patch
> series.

It is not up to you to determine what is reasonable for a patch series,
and given that you have a vested interest you are also not an objective
observer.

I have hundreds of patches pending review, and I would love for anyone
to try to find issues with them, even if I ultimately disagreed with their
assessment.

The Git project doesn't owe your patches any preferential treatment.
The review process will take as long as the review process takes. You
can't put deadlines on open source projects.

-- 
Felipe Contreras


* Re: [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-01 22:29       ` Ævar Arnfjörð Bjarmason
  2021-07-05 22:00         ` Johannes Schindelin
@ 2021-07-12 19:23         ` Jeff Hostetler
  2021-07-13 17:46           ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-12 19:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 6:29 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Create a manual page describing the `git fsmonitor--daemon` feature.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   Documentation/git-fsmonitor--daemon.txt | 75 +++++++++++++++++++++++++
>>   1 file changed, 75 insertions(+)
>>   create mode 100644 Documentation/git-fsmonitor--daemon.txt
>>
>> diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
>> new file mode 100644
>> index 00000000000..154e7684daa
>> --- /dev/null
>> +++ b/Documentation/git-fsmonitor--daemon.txt
>> @@ -0,0 +1,75 @@
>> +git-fsmonitor--daemon(1)
>> +========================
>> +
>> +NAME
>> +----
>> +git-fsmonitor--daemon - A Built-in File System Monitor
>> +
>> +SYNOPSIS
>> +--------
>> +[verse]
>> +'git fsmonitor--daemon' start
>> +'git fsmonitor--daemon' run
>> +'git fsmonitor--daemon' stop
>> +'git fsmonitor--daemon' status
>> +
>> +DESCRIPTION
>> +-----------
>> +
>> +A daemon to watch the working directory for file and directory
>> +changes using platform-specific file system notification facilities.
>> +
>> +This daemon communicates directly with commands like `git status`
>> +using the link:technical/api-simple-ipc.html[simple IPC] interface
>> +instead of the slower linkgit:githooks[5] interface.
>> +
>> +This daemon is built into Git so that no third-party tools are
>> +required.
>> +
>> +OPTIONS
>> +-------
>> +
>> +start::
>> +	Starts a daemon in the background.
>> +
>> +run::
>> +	Runs a daemon in the foreground.
>> +
>> +stop::
>> +	Stops the daemon running in the current working
>> +	directory, if present.
>> +
>> +status::
>> +	Exits with zero status if a daemon is watching the
>> +	current working directory.
>> +
>> +REMARKS
>> +-------
>> +
>> +This daemon is a long running process used to watch a single working
>> +directory and maintain a list of the recently changed files and
>> +directories.  Performance of commands such as `git status` can be
>> +increased if they just ask for a summary of changes to the working
>> +directory and can avoid scanning the disk.
>> +
>> +When `core.useBuiltinFSMonitor` is set to `true` (see
>> +linkgit:git-config[1]) commands, such as `git status`, will ask the
>> +daemon for changes and automatically start it (if necessary).
>> +
>> +For more information see the "File System Monitor" section in
>> +linkgit:git-update-index[1].
>> +
>> +CAVEATS
>> +-------
>> +
>> +The fsmonitor daemon does not currently know about submodules and does
>> +not know to filter out file system events that happen within a
>> +submodule.  If fsmonitor daemon is watching a super repo and a file is
>> +modified within the working directory of a submodule, it will report
>> +the change (as happening against the super repo).  However, the client
>> +will properly ignore these extra events, so performance may be affected
>> +but it will not cause an incorrect result.
>> +
>> +GIT
>> +---
>> +Part of the linkgit:git[1] suite
> 
> Later in the series we incrementally add features to the daemon, so this
> is describing a state that doesn't exist yet at this point.
> 
> I think it would be better to start with a stub here and add
> documentation as we add features, e.g. the patch that adds "start" should
> add that to the synopsis + options etc.
> 
> See the outstanding ab/config-based-hooks-base for a small example of
> that.
> 

I like to lead the series with the documentation that summarizes the
purpose of the entire feature or patch series.  This gives the reviewer
the context for the complete series that follows.  In the past, we've
had discussions on the list about how hard it is to review a series when
the foo.c comes (alphabetically) before foo.h in the patch and all
the documentation is attached to the prototypes in the .h file so the
reviewer needs to bounce around in the patch or series to read the
intent and then go back to the beginning to read the code.  In that
spirit, I think that having the complete man page come first provides
necessary context and is helpful.

The argument that the man-page should grow as the feature grows
presumes that there is a meaningful cut-point mid-series where you
would adopt the first portion and delay the second to a later release
or something.  That division would not be useful/usable.

And it just clutters up later commits in the series with man-page
deltas.

So I'd like to keep it as it is unless there are further objections.

Jeff



* Re: [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-01 22:18       ` Ævar Arnfjörð Bjarmason
  2021-07-05 21:52         ` Johannes Schindelin
@ 2021-07-13 14:39         ` Jeff Hostetler
  2021-07-13 17:54           ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 14:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler


My response here is in addition to Dscho's remarks on this topic.
He makes excellent points that I'll just #include here.  I do want
to add my own $0.02 here.

On 7/1/21 6:18 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> +#ifdef GIT_WINDOWS_NATIVE
>> +/*
>> + * Create a background process to run the daemon.  It should be completely
>> + * disassociated from the terminal.
>> + *
>> + * Conceptually like `daemonize()` but different because Windows does not
>> + * have `fork(2)`.  Spawn a normal Windows child process but without the
>> + * limitations of `start_command()` and `finish_command()`.
>> + *
>> + * The child process execs the "git fsmonitor--daemon run" command.
>> + *
>> + * The current process returns so that the caller can wait for the child
>> + * to startup before exiting.
>> + */
>> +static int spawn_background_fsmonitor_daemon(pid_t *pid)
>> +{
>> +	char git_exe[MAX_PATH];
>> +	struct strvec args = STRVEC_INIT;
>> +	int in, out;
>> +
>> +	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
>> +
>> +	in = open("/dev/null", O_RDONLY);
>> +	out = open("/dev/null", O_WRONLY);
>> +
>> +	strvec_push(&args, git_exe);
>> +	strvec_push(&args, "fsmonitor--daemon");
>> +	strvec_push(&args, "run");
>> +	strvec_pushf(&args, "--ipc-threads=%d", fsmonitor__ipc_threads);
>> +
>> +	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
>> +	close(in);
>> +	close(out);
>> +
>> +	strvec_clear(&args);
>> +
>> +	if (*pid < 0)
>> +		return error(_("could not spawn fsmonitor--daemon in the background"));
>> +
>> +	return 0;
>> +}
>> +#else
>> +/*
>> + * Create a background process to run the daemon.  It should be completely
>> + * disassociated from the terminal.
>> + *
>> + * This is adapted from `daemonize()`.  Use `fork()` to directly
>> + * create and run the daemon in the child process.
>> + *
>> + * The fork-child can just call the run code; it does not need to exec
>> + * it.
>> + *
>> + * The fork-parent returns the child PID so that we can wait for the
>> + * child to startup before exiting.
>> + */
>> +static int spawn_background_fsmonitor_daemon(pid_t *pid)
>> +{
>> +	*pid = fork();
>> +
>> +	switch (*pid) {
>> +	case 0:
>> +		if (setsid() == -1)
>> +			error_errno(_("setsid failed"));
>> +		close(0);
>> +		close(1);
>> +		close(2);
>> +		sanitize_stdfds();
>> +
>> +		return !!fsmonitor_run_daemon();
>> +
>> +	case -1:
>> +		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
>> +
>> +	default:
>> +		return 0;
>> +	}
>> +}
>> +#endif
> 
> The spawn_background_fsmonitor_daemon() function here is almost the same
> as daemonize(). I wonder if this & the Windows-specific one you have
> here can't be refactored into an API from what's now in setup.c.
> 
> Then we could make builtin/gc.c and daemon.c use that, so Windows could
> have background GC, and we'd have a more battle-tested central codepath
> for this tricky bit.
> 

I'd rather not refactor all of this and add unnecessary generality
and complexity just to save duplicating some of the code in daemonize().

And I'd rather not destabilize existing commands like gc and daemon
by changing the daemonize() layer on them.  If those commands need help,
let's have a separate conversation _later_ about what help they need
and if it makes sense to combine them.


> It seems to me like the only limitations on it are to have this return
> slightly more general things (e.g. not set its own errors, return
> structured data), and maybe some callback for what to do in the
> child/parent.

There are several issues here when trying to start a background process
and we're already on the edge of the behavioral differences between
Windows and Unix -- let's not make things more confusing with multiple
callbacks, returning structures, custom errors, etc.

Also, since Windows doesn't do fork(), we don't have child/parent
branches in the call, so this whole "just pretend it's all Unix"
model doesn't work.

Even if we did pretend I'd still need ifdef'd callback routines to
either call `fsmonitor_run_daemon()` or build a command line (or have
blocks of functions that "just happen to never be called on one
platform or the other").


What I have here is an API that the primary (read: parent) calls
and gets back a 0 or -1 (with error message).  And that's it.
The primary can then wait for the child (whether from fork or
CreateProcess) to become responsive or fail to start.  And then
the primary can exit (with or without error).

So I think we're good.  Yes, there is an ifdef here, but I think
it is worth it.


> 
>> +/*
>> + * This is adapted from `wait_or_whine()`.  Watch the child process and
>> + * let it get started and begin listening for requests on the socket
>> + * before reporting our success.
>> + */
>> +static int wait_for_background_startup(pid_t pid_child)
>> +{
>> +	int status;
>> +	pid_t pid_seen;
>> +	enum ipc_active_state s;
>> +	time_t time_limit, now;
>> +
>> +	time(&time_limit);
>> +	time_limit += fsmonitor__start_timeout_sec;
>> +
>> +	for (;;) {
>> +		pid_seen = waitpid(pid_child, &status, WNOHANG);
>> +
>> +		if (pid_seen == -1)
>> +			return error_errno(_("waitpid failed"));
>> +		else if (pid_seen == 0) {
>> +			/*
>> +			 * The child is still running (this should be
>> +			 * the normal case).  Try to connect to it on
>> +			 * the socket and see if it is ready for
>> +			 * business.
>> +			 *
>> +			 * If there is another daemon already running,
>> +			 * our child will fail to start (possibly
>> +			 * after a timeout on the lock), but we don't
>> +			 * care (who responds) if the socket is live.
>> +			 */
>> +			s = fsmonitor_ipc__get_state();
>> +			if (s == IPC_STATE__LISTENING)
>> +				return 0;
>> +
>> +			time(&now);
>> +			if (now > time_limit)
>> +				return error(_("fsmonitor--daemon not online yet"));
>> +		} else if (pid_seen == pid_child) {
>> +			/*
>> +			 * The new child daemon process shutdown while
>> +			 * it was starting up, so it is not listening
>> +			 * on the socket.
>> +			 *
>> +			 * Try to ping the socket in the odd chance
>> +			 * that another daemon started (or was already
>> +			 * running) while our child was starting.
>> +			 *
>> +			 * Again, we don't care who services the socket.
>> +			 */
>> +			s = fsmonitor_ipc__get_state();
>> +			if (s == IPC_STATE__LISTENING)
>> +				return 0;
>> +
>> +			/*
>> +			 * We don't care about the WEXITSTATUS() nor
>> +			 * any of the WIF*(status) values because
>> +			 * `cmd_fsmonitor__daemon()` does the `!!result`
>> +			 * trick on all function return values.
>> +			 *
>> +			 * So it is sufficient to just report the
>> +			 * early shutdown as an error.
>> +			 */
>> +			return error(_("fsmonitor--daemon failed to start"));
>> +		} else
>> +			return error(_("waitpid is confused"));
>> +	}
>> +}
> 
> Ditto this. could we extend the wait_or_whine() function (or some
> extended version thereof) to do what you need with callbacks?
> 
> It seems the main difference is just being able to pass down a flag for
> waitpid(), and the loop needing to check EINTR or not depending on
> whether WNOHANG is passed.
> 
> For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
> behavior with an adjusted wait_or_whine(). Wouldn't it be better to
> report what exit status it exits with e.g. if the top-level process is
> signalled? We do so in trace2 for other things we spawn...
> 

Again, I don't want to mix my usage here with the existing code
and destabilize all existing callers.  Here we are spinning to give
the child a chance to *start* and confirm that it is in a listening
state and ready for connections.  We do not wait for the child to
exit (unless it dies quickly without becoming ready).

We want to end our wait as soon as we confirm that the child is
ready and return.  All I really need from the system is `waitpid()`.
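
As a rough, self-contained illustration of that polling pattern (this is
not the code in this series): the probe callback `path_exists()` and the
short `nanosleep()` pause between polls are assumptions of this sketch.
The daemon code instead asks `fsmonitor_ipc__get_state()` whether the
socket is listening.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical readiness probe: returns nonzero once the service answers. */
typedef int (*ready_fn)(void *data);

/* Example probe (an assumption): "ready" means that a given path exists. */
static int path_exists(void *data)
{
	return access((const char *)data, F_OK) == 0;
}

/*
 * Poll until the probe succeeds, the child dies, or the time limit
 * passes.  Returns 0 if something answers the probe, -1 otherwise.
 */
static int wait_for_startup(pid_t child, ready_fn is_ready, void *data,
			    int timeout_sec)
{
	time_t limit = time(NULL) + timeout_sec;
	struct timespec pause = { 0, 10 * 1000 * 1000 }; /* 10ms */

	for (;;) {
		int status;
		pid_t seen = waitpid(child, &status, WNOHANG);

		if (seen == -1)
			return -1;	/* waitpid itself failed */

		/* We do not care who answers, only that someone does. */
		if (is_ready(data))
			return 0;

		if (seen == child)
			return -1;	/* child exited before becoming ready */

		if (time(NULL) > limit)
			return -1;	/* still running, but not ready in time */

		nanosleep(&pause, NULL);
	}
}
```

The caller only needs the child's pid and a timeout; the exit status is
deliberately ignored, mirroring the reasoning above that an early
shutdown is simply reported as a failure to start.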

Also, since we started the child in my `spawn_background...()`, it
is not in the `children_to_clean` list, so there is no need to mess
with that.

So I'd like to leave this as is.

Jeff


* Re: [PATCH v3 17/34] fsmonitor--daemon: define token-ids
  2021-07-01 22:58       ` Ævar Arnfjörð Bjarmason
@ 2021-07-13 15:15         ` Jeff Hostetler
  2021-07-13 18:11           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 15:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 6:58 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> +	if (!test_env_value) {
>> +		struct timeval tv;
>> +		struct tm tm;
>> +		time_t secs;
>> +
>> +		gettimeofday(&tv, NULL);
>> +		secs = tv.tv_sec;
>> +		gmtime_r(&secs, &tm);
>> +
>> +		strbuf_addf(&token->token_id,
>> +			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
>> +			    flush_count++,
>> +			    getpid(),
>> +			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
>> +			    tm.tm_hour, tm.tm_min, tm.tm_sec,
>> +			    (long)tv.tv_usec);
> 
> Just bikeshedding, but can we have tokens that mostly sort numeric-wise
> by time order? So time at the start, not the flush_count/getpid.

As I described in a rather large comment in the code, tokens are opaque
strings -- without a less-than / greater-than relationship -- just a
random string that the daemon can use (along with a sequence number) to
ensure that a later request is well-defined.

Here I'm using a counter, pid, and date-stamp.  I'd prefer using a GUID
or UUID just to drive that home, but I didn't want to add a new .lib or
.a to the build if not necessary.

Perhaps I should compute this portion as hex(hash(time())) to remove the
temptation to look inside my opaque token ??
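
A minimal sketch of the "opaque token" idea, assuming the same layout as
the quoted `strbuf_addf()` call. The function names `format_token_id()`
and `token_ids_match()` are hypothetical, and the counter, pid and time
are passed in as parameters only so that the example is deterministic.

```c
#define _POSIX_C_SOURCE 200809L
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Build "<flush_count>.<pid>.<YYYYMMDD>T<HHMMSS>.<usec>Z" in buf.
 * Returns the snprintf() result, or -1 if the time cannot be converted.
 */
static int format_token_id(char *buf, size_t len, uint64_t flush_count,
			   long pid, time_t secs, long usec)
{
	struct tm tm;

	if (!gmtime_r(&secs, &tm))
		return -1;
	return snprintf(buf, len,
			"%" PRIu64 ".%ld.%4d%02d%02dT%02d%02d%02d.%06ldZ",
			flush_count, pid,
			tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
			tm.tm_hour, tm.tm_min, tm.tm_sec, usec);
}

/* Tokens are opaque: the only meaningful comparison is equality. */
static int token_ids_match(const char *a, const char *b)
{
	return !strcmp(a, b);
}
```

With `flush_count` 0, pid 1234 and the Unix epoch, this produces
`0.1234.19700101T000000.000000Z`. A client would hand that string back
unchanged, and the daemon would only ever test it for equality, never
for ordering.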

> 
> Maybe I'm missing something, but couldn't we just re-use the trace2 SID
> + a more trivial trailer? It would have the nice property that you could
> find the trace2 SID whenever you looked at such a token (could
> e.g. split them by "/" too), and add the tv_usec, flush_count+whatever
> else is needed to make it unique after the "/", no?
> 

I would rather keep Trace2 out of this.  The SID is another opaque
string and I don't want to reach inside it.

Jeff


* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-06 19:09       ` Johannes Schindelin
@ 2021-07-13 15:18         ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 15:18 UTC (permalink / raw)
  To: Johannes Schindelin, Jeff Hostetler via GitGitGadget
  Cc: git, Derrick Stolee, Jeff Hostetler



On 7/6/21 3:09 PM, Johannes Schindelin wrote:
> Hi Jeff,
> 
> 
> On Thu, 1 Jul 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach the win32 backend to register a watch on the working tree
>> root directory (recursively).  Also watch the <gitdir> if it is
>> not inside the working tree.  And to collect path change notifications
>> into batches and publish.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
>>  1 file changed, 530 insertions(+)
> 
>> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
>> index 880446b49e3..d707d47a0d7 100644
>> --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
>> +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
>> @@ -2,20 +2,550 @@
>> + [...]
>> +
>> +static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
>> +				      const char *path)
>> +{
>> +	struct one_watch *watch = NULL;
>> +	DWORD desired_access = FILE_LIST_DIRECTORY;
>> +	DWORD share_mode =
>> +		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
>> +	HANDLE hDir;
>> +
>> +	hDir = CreateFileA(path,
>> +			   desired_access, share_mode, NULL, OPEN_EXISTING,
>> +			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
>> +			   NULL);
> 
> The `*A()` family of Win32 API functions disagree with Git in one very
> interesting aspect: Git always assumes UTF-8, while e.g. `CreateFileA()`
> will use the current Win32 locale to internally transform to wide
> characters and then call `CreateFileW()`.
> 
> This poses no problem when your locale is US American and your paths
> contain no non-ASCII characters.
> 
> In the Git for Windows bug tracker, it was reported that it _does_ cause
> problems when venturing outside such a cozy scenario (for full details,
> see https://github.com/git-for-windows/git/issues/3262)
> 
> I need this (and merged it before starting the process to release Git for
> Windows v2.32.0(2)) to fix that (could I ask you to integrate this in case
> a re-roll will become necessary?):
> 
> -- snipsnap --
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Mon, 5 Jul 2021 13:51:05 +0200
> Subject: [PATCH] fixup! fsmonitor-fs-listen-win32: implement FSMonitor backend
>   on Windows
> 
> Let's keep avoiding the `*A()` family of Win32 API functions because
> they are susceptible to incoherent encoding problems. In Git for
> Windows, we always assume paths to be UTF-8 encoded. Let's use the
> dedicated helper to convert such a path to the wide character version,
> and then use the `*W()` function instead.
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/compat/fsmonitor/fsmonitor-fs-listen-win32.c b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> index ba087b292df..3b42ab311d9 100644
> --- a/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> +++ b/compat/fsmonitor/fsmonitor-fs-listen-win32.c
> @@ -111,8 +111,14 @@ static struct one_watch *create_watch(struct fsmonitor_daemon_state *state,
>   	DWORD share_mode =
>   		FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE;
>   	HANDLE hDir;
> +	wchar_t wpath[MAX_PATH];
> 
> -	hDir = CreateFileA(path,
> +	if (xutftowcs_path(wpath, path) < 0) {
> +		error(_("could not convert to wide characters: '%s'"), path);
> +		return NULL;
> +	}
> +
> +	hDir = CreateFileW(wpath,
>   			   desired_access, share_mode, NULL, OPEN_EXISTING,
>   			   FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED,
>   			   NULL);
> --
> 2.32.0.windows.1.15.gf1590a75e2d
> 

Thanks for the heads up.  I'll pull this into my next release.
Jeff


* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-01 23:02       ` Ævar Arnfjörð Bjarmason
@ 2021-07-13 15:46         ` Jeff Hostetler
  2021-07-13 18:15           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 15:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach the win32 backend to register a watch on the working tree
>> root directory (recursively).  Also watch the <gitdir> if it is
>> not inside the working tree.  And to collect path change notifications
>> into batches and publish.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
> 
> <bikeshed mode> Spying on the early history of this (looking for the
> Linux backend) I saw that at some point we had just
> compat/fsmonitor/linux.c, and presumably some of
> compat/fsmonitor/{windows,win32,macos,darwin}.c.
> 
> At some point those filenames became much much longer.
> 

Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c"
would cause confusion in the debugger (I've long since forgotten
which).  Breaking at win32.c:30 was no longer unique.

Also, if the Makefile sends all .o's to the root directory or a
unified OBJS directory rather than to the subdir containing the .c,
then we have another issue during linking...

So, having been burned too many times, I prefer to make source
filenames unique when possible.


> I've noticed you tend to prefer really long file and function names,
> e.g. your borrowed daemonize() became
> spawn_background_fsmonitor_daemon(), I think aiming for shorter
> filenames & function names helps, e.g. these long names widen diffstats,
> and many people who hack on the code stick religiously to 80 character
> width terminals.
> 

I prefer self-documenting code.

Jeff


* Re: [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files
  2021-07-01 20:00       ` Junio C Hamano
@ 2021-07-13 16:45         ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 16:45 UTC (permalink / raw)
  To: Junio C Hamano, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 4:00 PM, Junio C Hamano wrote:
> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> diff --git a/t/helper/test-touch.c b/t/helper/test-touch.c
>> new file mode 100644
>> index 00000000000..e9b3b754f1f
>> --- /dev/null
>> +++ b/t/helper/test-touch.c
>> @@ -0,0 +1,126 @@
>> +/*
>> + * test-touch.c: variation on /usr/bin/touch to speed up tests
>> + * with a large number of files (primarily on Windows where child
>> + * process are very, very expensive).
>> + */
>> +
>> +#include "test-tool.h"
>> +#include "cache.h"
>> +#include "parse-options.h"
>> +
>> +char *seq_pattern;
>> +int seq_start = 1;
>> +int seq_count = 1;
> 
> With this in, "make sparse" dies like this:
> 
>      SP t/helper/test-touch.c
> t/helper/test-touch.c:11:6: error: symbol 'seq_pattern' was not declared. Should it be static?
> t/helper/test-touch.c:12:5: error: symbol 'seq_start' was not declared. Should it be static?
> t/helper/test-touch.c:13:5: error: symbol 'seq_count' was not declared. Should it be static?
> 

I'll fix.  Thanks!
Jeff
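
For reference, a sketch of the kind of change sparse is asking for:
symbols used only inside one .c file get `static`, which gives them
internal linkage and silences the "was not declared" check. This is not
the actual test-touch.c; `seq_prefix` and `seq_name()` are hypothetical
stand-ins for the helper's pattern handling.

```c
#include <stdio.h>

/* File-local state: `static` keeps these symbols out of the global namespace. */
static const char *seq_prefix = "10_files/";
static int seq_start = 1;
static int seq_count = 10;

/*
 * Write the name of the i-th file of the sequence (0-based) into buf.
 * Returns the snprintf() result, or -1 if i is outside the sequence.
 */
static int seq_name(char *buf, size_t len, int i)
{
	if (i < 0 || i >= seq_count)
		return -1;
	return snprintf(buf, len, "%s%d", seq_prefix, seq_start + i);
}
```

The alternative would be to declare the variables in a header, but
that only makes sense for symbols shared across files.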


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-01 23:09       ` Ævar Arnfjörð Bjarmason
@ 2021-07-13 17:06         ` Jeff Hostetler
  2021-07-13 17:36           ` Elijah Newren
  2021-07-13 18:18           ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 17:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 7:09 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Change p7519 to use a single "test-tool touch" command to update
>> the mtime on a series of (thousands) files instead of invoking
>> thousands of commands to update a single file.
>>
>> This is primarily for Windows where process creation is so
>> very slow and reduces the test run time by minutes.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   t/perf/p7519-fsmonitor.sh | 14 ++++++--------
>>   1 file changed, 6 insertions(+), 8 deletions(-)
>>
>> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>> index 5eb5044a103..f74e6014a0a 100755
>> --- a/t/perf/p7519-fsmonitor.sh
>> +++ b/t/perf/p7519-fsmonitor.sh
>> @@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
>>   	fi &&
>>   
>>   	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
>> -	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
>> -	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
>> -	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
>> -	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
>> +	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
>> +	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
>> +	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
>> +	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&
>> +
>>   	git add 1_file 10_files 100_files 1000_files 10000_files &&
>>   	git commit -qm "Add files" &&
>>   
>> @@ -200,15 +201,12 @@ test_fsmonitor_suite() {
>>   	# Update the mtimes on upto 100k files to make status think
>>   	# that they are dirty.  For simplicity, omit any files with
>>   	# LFs (i.e. anything that ls-files thinks it needs to dquote).
>> -	# Then fully backslash-quote the paths to capture any
>> -	# whitespace so that they pass thru xargs properly.
>>   	#
>>   	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>>   		git ls-files | \
>>   			head -100000 | \
>>   			grep -v \" | \
>> -			sed '\''s/\(.\)/\\\1/g'\'' | \
>> -			xargs test-tool chmtime -300 &&
>> +			test-tool touch stdin &&
>>   		git status
>>   	'
> 
> Did you try to replace this with some variant of:
> 
>      test_seq 1 10000 | xargs touch
> 
> Which (depending on your xargs version) would invoke "touch" commands
> with however many argv items it thinks you can handle.
> 

a quick test on my Windows machine shows that

	test_seq 1 10000 | xargs touch

takes 3.1 seconds.

just a simple

	test_seq 1 10000 >/dev/null

takes 0.2 seconds.

using my test-tool helper cuts that time in half.

Jeff


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 17:06         ` Jeff Hostetler
@ 2021-07-13 17:36           ` Elijah Newren
  2021-07-13 17:47             ` Junio C Hamano
  2021-07-13 17:58             ` Jeff Hostetler
  2021-07-13 18:18           ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 237+ messages in thread
From: Elijah Newren @ 2021-07-13 17:36 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Derrick Stolee, Jeff Hostetler

On Tue, Jul 13, 2021 at 10:10 AM Jeff Hostetler <git@jeffhostetler.com> wrote:
>
> On 7/1/21 7:09 PM, Ævar Arnfjörð Bjarmason wrote:
> >
> > Did you try to replace this with some variant of:
> >
> >      test_seq 1 10000 | xargs touch
> >
> > Which (depending on your xargs version) would invoke "touch" commands
> > with however many argv items it thinks you can handle.
>
> a quick test on my Windows machine shows that
>
>         test_seq 1 10000 | xargs touch
>
> takes 3.1 seconds.
>
> just a simple
>
>         test_seq 1 10000 >/dev/null
>
> takes 0.2 seconds.
>
> using my test-tool helper cuts that time in half.

Yeah, test_seq is pretty bad; it's just a loop in shell.  Is there a
'seq' on windows, and does using it instead of test_seq make things
faster with Ævar's suggested command?

I'd really like to modify test_seq to use seq when it's available and
fall back to the looping-in-shell when we need to for various
platforms.

Maybe it'd even make sense to write a 'test-tool seq' and make
test_seq use that just so we can rip out that super lame shell
looping.


* Re: [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-12 19:23         ` Jeff Hostetler
@ 2021-07-13 17:46           ` Ævar Arnfjörð Bjarmason
  2021-07-16 15:45             ` Johannes Schindelin
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-13 17:46 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Mon, Jul 12 2021, Jeff Hostetler wrote:

> On 7/1/21 6:29 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>
>>> Create a manual page describing the `git fsmonitor--daemon` feature.
>>>
>>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>>> ---
>>>   Documentation/git-fsmonitor--daemon.txt | 75 +++++++++++++++++++++++++
>>>   1 file changed, 75 insertions(+)
>>>   create mode 100644 Documentation/git-fsmonitor--daemon.txt
>>>
>>> diff --git a/Documentation/git-fsmonitor--daemon.txt b/Documentation/git-fsmonitor--daemon.txt
>>> new file mode 100644
>>> index 00000000000..154e7684daa
>>> --- /dev/null
>>> +++ b/Documentation/git-fsmonitor--daemon.txt
>>> @@ -0,0 +1,75 @@
>>> +git-fsmonitor--daemon(1)
>>> +========================
>>> +
>>> +NAME
>>> +----
>>> +git-fsmonitor--daemon - A Built-in File System Monitor
>>> +
>>> +SYNOPSIS
>>> +--------
>>> +[verse]
>>> +'git fsmonitor--daemon' start
>>> +'git fsmonitor--daemon' run
>>> +'git fsmonitor--daemon' stop
>>> +'git fsmonitor--daemon' status
>>> +
>>> +DESCRIPTION
>>> +-----------
>>> +
>>> +A daemon to watch the working directory for file and directory
>>> +changes using platform-specific file system notification facilities.
>>> +
>>> +This daemon communicates directly with commands like `git status`
>>> +using the link:technical/api-simple-ipc.html[simple IPC] interface
>>> +instead of the slower linkgit:githooks[5] interface.
>>> +
>>> +This daemon is built into Git so that no third-party tools are
>>> +required.
>>> +
>>> +OPTIONS
>>> +-------
>>> +
>>> +start::
>>> +	Starts a daemon in the background.
>>> +
>>> +run::
>>> +	Runs a daemon in the foreground.
>>> +
>>> +stop::
>>> +	Stops the daemon running in the current working
>>> +	directory, if present.
>>> +
>>> +status::
>>> +	Exits with zero status if a daemon is watching the
>>> +	current working directory.
>>> +
>>> +REMARKS
>>> +-------
>>> +
>>> +This daemon is a long running process used to watch a single working
>>> +directory and maintain a list of the recently changed files and
>>> +directories.  Performance of commands such as `git status` can be
>>> +increased if they just ask for a summary of changes to the working
>>> +directory and can avoid scanning the disk.
>>> +
>>> +When `core.useBuiltinFSMonitor` is set to `true` (see
>>> +linkgit:git-config[1]) commands, such as `git status`, will ask the
>>> +daemon for changes and automatically start it (if necessary).
>>> +
>>> +For more information see the "File System Monitor" section in
>>> +linkgit:git-update-index[1].
>>> +
>>> +CAVEATS
>>> +-------
>>> +
>>> +The fsmonitor daemon does not currently know about submodules and does
>>> +not know to filter out file system events that happen within a
>>> +submodule.  If fsmonitor daemon is watching a super repo and a file is
>>> +modified within the working directory of a submodule, it will report
>>> +the change (as happening against the super repo).  However, the client
>>> +will properly ignore these extra events, so performance may be affected
>>> +but it will not cause an incorrect result.
>>> +
>>> +GIT
>>> +---
>>> +Part of the linkgit:git[1] suite
>> Later in the series we incrementally add features to the daemon, so
>> this is describing a state that doesn't exist yet at this point.
>> I think it would be better to start with a stub here and add
>> documentation as we add features, e.g. the patch that adds "start" should
>> add that to the synopsis + options etc.
>> See the outstanding ab/config-based-hooks-base for a small example
>> of that.
>> 
>
> I like to lead the series with the documentation that summarizes the
> purpose of the entire feature or patch series.  This gives the reviewer
> the context for the complete series that follows.  In the past, we've
> had discussions on the list about how hard it is to review a series when
> the foo.c comes (alphabetically) before foo.h in the patch and all
> the documentation is attached to the prototypes in the .h file so the
> reviewer needs to bounce around in the patch or series to read the
> intent and then go back to the beginning to read the code.  In that
> spirit, I think that having the complete man page come first provides
> necessary context and is helpful.
>
> The argument that the man-page should grow as the feature grows
> presumes that there is a meaningful cut-point mid-series where you
> would adopt the first portion and delay the second to a later release
> or something.  That division would not be useful/usable.

Isn't there such a meaningful cut-off point?

In 08/34[1] you add a skeleton of the daemon, so then the NAME/SYNOPSIS
(empty)/DESCRIPTION/REMARKS/GIT sections could be added, in 09/34[3] you
add stop/status commands, so then the SYNOPSIS/OPTIONS could be
updated/created with those, same for the addition of run/start in
13/34[5] and 14/34[6] respectively.

> And it just clutters up later commits in the series with man-page
> deltas.
>
> So I'd like to keep it as it is unless there are further objections.

I'll leave it to you to do with the feedback as you choose.

I suggested this because I think it's much easier to read patches that
are larger because they incrementally update docs or tests along with
code, than smaller ones that are e.g. "add docs" followed by
incrementally modifying the code.

That's because you can consider those atomically. If earlier doc changes
refer to later code changes you're left jumping back & forth and
wondering if the code you're reading that doesn't match the docs yet is
a bug, or if it's solved in some later change you're now needing to
mentally keep track of.

Which is not some theoretical concern b.t.w., but exactly what I found
myself doing when reading this series, hence the suggestion.

1. https://lore.kernel.org/git/f88db92d4259d1c29827e97e957daf6eda39c551.1625150864.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/877di9d5uz.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/02e21384ef0ca4909e0bda2c78fa63c06be22a50.1625150864.git.gitgitgadget@gmail.com/
4. https://lore.kernel.org/git/5d6646df93a17659af66f136295444d1bd834090.1625150864.git.gitgitgadget@gmail.com/
5. https://lore.kernel.org/git/5d6646df93a17659af66f136295444d1bd834090.1625150864.git.gitgitgadget@gmail.com/
6. https://lore.kernel.org/git/9fe902aad87f1192705fb69ea212a2d066d0286d.1625150864.git.gitgitgadget@gmail.com/


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 17:36           ` Elijah Newren
@ 2021-07-13 17:47             ` Junio C Hamano
  2021-07-13 17:50               ` Elijah Newren
  2021-07-13 17:58             ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-07-13 17:47 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Jeff Hostetler, Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Derrick Stolee, Jeff Hostetler

Elijah Newren <newren@gmail.com> writes:

> On Tue, Jul 13, 2021 at 10:10 AM Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>> a quick test on my Windows machine shows that
>>
>>         test_seq 1 10000 | xargs touch
>>
>> takes 3.1 seconds.
>>
>> just a simple
>>
>>         test_seq 1 10000 >/dev/null
>>
>> takes 0.2 seconds.
>>
>> using my test-tool helper cuts that time in half.
>
> Yeah, test_seq is pretty bad; it's just a loop in shell.  Is there a
> 'seq' on windows, and does using it instead of test_seq make things
> faster with Ævar's suggested command?

Unless I am misreading Jeff's message, I do not think that makes
sense.  Counting to 10000 in shell loop is trivial (0.2 seconds),
but letting touch be invoked 10000 times to create (or smudge mtime of,
but I suspect that is not what is going on here) 10000 files takes
3.1 seconds, and of course a native binary that creates 10000 files
with a single invocation would be faster.

> I'd really like to modify test_seq to use seq when it's available and
> fall back to the looping-in-shell when we need to for various
> platforms.

So, if I am reading Jeff correctly, that optimizes something that is
not a bottleneck.

> Maybe it'd even make sense to write a 'test-tool seq' and make
> test_seq use that just so we can rip out that super lame shell
> looping.


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 17:47             ` Junio C Hamano
@ 2021-07-13 17:50               ` Elijah Newren
  0 siblings, 0 replies; 237+ messages in thread
From: Elijah Newren @ 2021-07-13 17:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff Hostetler, Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Derrick Stolee, Jeff Hostetler

On Tue, Jul 13, 2021 at 10:47 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Tue, Jul 13, 2021 at 10:10 AM Jeff Hostetler <git@jeffhostetler.com> wrote:
> >>
> >> a quick test on my Windows machine shows that
> >>
> >>         test_seq 1 10000 | xargs touch
> >>
> >> takes 3.1 seconds.
> >>
> >> just a simple
> >>
> >>         test_seq 1 10000 >/dev/null
> >>
> >> takes 0.2 seconds.
> >>
> >> using my test-tool helper cuts that time in half.
> >
> > Yeah, test_seq is pretty bad; it's just a loop in shell.  Is there a
> > 'seq' on windows, and does using it instead of test_seq make things
> > faster with Ævar's suggested command?
>
> Unless I am misreading Jeff's message, I do not think that makes
> sense.  Counting to 10000 in shell loop is trivial (0.2 seconds),
> but letting touch be invoked 10000 times to create (or smudge mtime of,
> but I suspect that is not what is going on here) 10000 files takes
> 3.1 seconds, and of course a native binary that creates 10000 files
> with a single invocation would be faster.
>
> > I'd really like to modify test_seq to use seq when it's available and
> > fall back to the looping-in-shell when we need to for various
> > platforms.
>
> So, if I am reading Jeff correctly, that optimizes something that is
> not a bottleneck.

Oh, indeed.  Sorry, I misread.


* Re: [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-13 14:39         ` Jeff Hostetler
@ 2021-07-13 17:54           ` Ævar Arnfjörð Bjarmason
  2021-07-13 18:44             ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-13 17:54 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Tue, Jul 13 2021, Jeff Hostetler wrote:

> My response here is in addition to Dscho's remarks on this topic.
> He makes excellent points that I'll just #include here.  I do want
> to add my own $0.02 here.
>
> On 7/1/21 6:18 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> +#ifdef GIT_WINDOWS_NATIVE
>>> +/*
>>> + * Create a background process to run the daemon.  It should be completely
>>> + * disassociated from the terminal.
>>> + *
>>> + * Conceptually like `daemonize()` but different because Windows does not
>>> + * have `fork(2)`.  Spawn a normal Windows child process but without the
>>> + * limitations of `start_command()` and `finish_command()`.
>>> + *
>>> + * The child process execs the "git fsmonitor--daemon run" command.
>>> + *
>>> + * The current process returns so that the caller can wait for the child
>>> + * to startup before exiting.
>>> + */
>>> +static int spawn_background_fsmonitor_daemon(pid_t *pid)
>>> +{
>>> +	char git_exe[MAX_PATH];
>>> +	struct strvec args = STRVEC_INIT;
>>> +	int in, out;
>>> +
>>> +	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
>>> +
>>> +	in = open("/dev/null", O_RDONLY);
>>> +	out = open("/dev/null", O_WRONLY);
>>> +
>>> +	strvec_push(&args, git_exe);
>>> +	strvec_push(&args, "fsmonitor--daemon");
>>> +	strvec_push(&args, "run");
>>> +	strvec_pushf(&args, "--ipc-threads=%d", fsmonitor__ipc_threads);
>>> +
>>> +	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
>>> +	close(in);
>>> +	close(out);
>>> +
>>> +	strvec_clear(&args);
>>> +
>>> +	if (*pid < 0)
>>> +		return error(_("could not spawn fsmonitor--daemon in the background"));
>>> +
>>> +	return 0;
>>> +}
>>> +#else
>>> +/*
>>> + * Create a background process to run the daemon.  It should be completely
>>> + * disassociated from the terminal.
>>> + *
>>> + * This is adapted from `daemonize()`.  Use `fork()` to directly
>>> + * create and run the daemon in the child process.
>>> + *
>>> + * The fork-child can just call the run code; it does not need to exec
>>> + * it.
>>> + *
>>> + * The fork-parent returns the child PID so that we can wait for the
>>> + * child to startup before exiting.
>>> + */
>>> +static int spawn_background_fsmonitor_daemon(pid_t *pid)
>>> +{
>>> +	*pid = fork();
>>> +
>>> +	switch (*pid) {
>>> +	case 0:
>>> +		if (setsid() == -1)
>>> +			error_errno(_("setsid failed"));
>>> +		close(0);
>>> +		close(1);
>>> +		close(2);
>>> +		sanitize_stdfds();
>>> +
>>> +		return !!fsmonitor_run_daemon();
>>> +
>>> +	case -1:
>>> +		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
>>> +
>>> +	default:
>>> +		return 0;
>>> +	}
>>> +}
>>> +#endif
>> The spawn_background_fsmonitor_daemon() function here is almost the
>> same as daemonize(). I wonder if this & the Windows-specific one you
>> have here can't be refactored into an API from what's now in setup.c.
>> Then we could make builtin/gc.c and daemon.c use that, so Windows
>> could have background GC, and we'd have a more battle-tested central
>> codepath for this tricky bit.
>> 
>
> I'd rather not refactor all of this and add unnecessary generality
> and complexity just to save duplicating some of the code in daemonize().
>
> And I'd rather not destabilize existing commands like gc and daemon
> by changing the daemonize() layer on them.  If those commands need help,
> let's have a separate conversation _later_ about what help they need
> and if it makes sense to combine them.

Johannes suggested in
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2107052336480.8230@tvgsbejvaqbjf.bet/
(if I understand it correctly; I only skimmed the linked issue some
days ago) that even if such a refactoring were done, these two
functions are solving subtly different problems, or something. I.e. we
couldn't use it for daemonize().

That would be worth explaining in the code comments or commit message
at least, i.e. how they're solving subtly different problems. (Not
being able to run this or test on Windows, I haven't poked at it
myself.)

>> It seems to me like the only limitations on it are to have this return
>> slightly more general things (e.g. not set its own errors, return
>> structured data), and maybe some callback for what to do in the
>> child/parent.
>
> There are several issues here when trying to start a background process
> and we're already on the edge of the behavioral differences between
> Windows and Unix -- let's not make things more confusing with multiple
> callbacks, returning structures, custom errors, and etc.
>
> Also, since Windows doesn't do fork(), we don't have child/parent
> branches in the call, so this whole "just pretend it's all Unix"
> model doesn't work.

Fair enough, and I think I replied to that above.

> Even if we did pretend I'd still need ifdef'd callback routines to
> either call `fsmonitor_run_daemon()` or build a command line (or have
> blocks of functions that "just happen to never be called on one
> platform or the other").
>
>
> What I have here is an API that the primary (read: parent) calls
> and gets back a 0 or -1 (with error message).  And that's it.
> The primary can then wait for the child (whether from fork or
> CreateProcess) to become responsive or fail to start.  And then
> the primary can exit (with or without error).
>
> So I think we're good.  Yes, there is an ifdef here, but I think
> it is worth it.

FWIW what I was noting here & elsewhere is that yes, you need to ifdef
some of it, but the code you're proposing to add here is using a
different pattern than the one generally preferred in this codebase.

I.e. check out how we do it for threading: we intentionally compile the
"if (thread) {code}" clauses even on platforms we know don't have
threading. Ditto the code around PCRE in grep.c.

Similarly, here in e.g. spawn_background_fsmonitor_daemon just the
GetModuleFileNameA() and mingw_spawnvpe() are Windows-specifics (and
could be calls to some helper that *is* ifdef'd).

In this case it's not a big deal, but as a general pattern it helps to
e.g. avoid subtle syntax errors in nested ifdefs and the like, and
generally encourages keeping the ifdef'd code as small as possible.
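
To make the pattern concrete, here is a minimal sketch (not code from
the series): the only #ifdef reduces the platform difference to a
compile-time constant, and everything that branches on it is plain C
that every platform compiles. CAN_FORK, background_start_method() and
default_background_start_method() are hypothetical names.

```c
#include <stddef.h>

/* The only conditional compilation: one platform constant. */
#ifdef _WIN32
#define CAN_FORK 0
#else
#define CAN_FORK 1
#endif

/*
 * Hypothetical caller-side logic.  Both branches are compiled (and so
 * syntax-checked) everywhere; the compiler discards the dead one.
 */
static const char *background_start_method(int can_fork)
{
	if (can_fork)
		return "fork and run the daemon in the child";
	return "spawn a new 'fsmonitor--daemon run' process";
}

/* What a real caller would use on the current platform. */
static const char *default_background_start_method(void)
{
	return background_start_method(CAN_FORK);
}
```

In real code the calls that only exist on one platform would still
need a small ifdef'd helper (or a stub) behind each branch.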

>>> +/*
>>> + * This is adapted from `wait_or_whine()`.  Watch the child process and
>>> + * let it get started and begin listening for requests on the socket
>>> + * before reporting our success.
>>> + */
>>> +static int wait_for_background_startup(pid_t pid_child)
>>> +{
>>> +	int status;
>>> +	pid_t pid_seen;
>>> +	enum ipc_active_state s;
>>> +	time_t time_limit, now;
>>> +
>>> +	time(&time_limit);
>>> +	time_limit += fsmonitor__start_timeout_sec;
>>> +
>>> +	for (;;) {
>>> +		pid_seen = waitpid(pid_child, &status, WNOHANG);
>>> +
>>> +		if (pid_seen == -1)
>>> +			return error_errno(_("waitpid failed"));
>>> +		else if (pid_seen == 0) {
>>> +			/*
>>> +			 * The child is still running (this should be
>>> +			 * the normal case).  Try to connect to it on
>>> +			 * the socket and see if it is ready for
>>> +			 * business.
>>> +			 *
>>> +			 * If there is another daemon already running,
>>> +			 * our child will fail to start (possibly
>>> +			 * after a timeout on the lock), but we don't
>>> +			 * care (who responds) if the socket is live.
>>> +			 */
>>> +			s = fsmonitor_ipc__get_state();
>>> +			if (s == IPC_STATE__LISTENING)
>>> +				return 0;
>>> +
>>> +			time(&now);
>>> +			if (now > time_limit)
>>> +				return error(_("fsmonitor--daemon not online yet"));
>>> +		} else if (pid_seen == pid_child) {
>>> +			/*
>>> +			 * The new child daemon process shutdown while
>>> +			 * it was starting up, so it is not listening
>>> +			 * on the socket.
>>> +			 *
>>> +			 * Try to ping the socket in the odd chance
>>> +			 * that another daemon started (or was already
>>> +			 * running) while our child was starting.
>>> +			 *
>>> +			 * Again, we don't care who services the socket.
>>> +			 */
>>> +			s = fsmonitor_ipc__get_state();
>>> +			if (s == IPC_STATE__LISTENING)
>>> +				return 0;
>>> +
>>> +			/*
>>> +			 * We don't care about the WEXITSTATUS() nor
>>> +			 * any of the WIF*(status) values because
>>> +			 * `cmd_fsmonitor__daemon()` does the `!!result`
>>> +			 * trick on all function return values.
>>> +			 *
>>> +			 * So it is sufficient to just report the
>>> +			 * early shutdown as an error.
>>> +			 */
>>> +			return error(_("fsmonitor--daemon failed to start"));
>>> +		} else
>>> +			return error(_("waitpid is confused"));
>>> +	}
>>> +}
>> Ditto this. Could we extend the wait_or_whine() function (or some
>> extended version thereof) to do what you need with callbacks?
>> It seems the main difference is just being able to pass down a flag
>> for waitpid(), and the loop needing to check EINTR or not depending
>> on whether WNOHANG is passed.
>> For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
>> behavior with an adjusted wait_or_whine(). Wouldn't it be better to
>> report what exit status it exits with e.g. if the top-level process is
>> signalled? We do so in trace2 for other things we spawn...
>> 
>
> Again, I don't want to mix my usage here with the existing code
> and destabilize all existing callers.  Here we are spinning to give
> the child a chance to *start* and confirm that it is in a listening
> state and ready for connections.  We do not wait for the child to
> exit (unless it dies quickly without becoming ready).
>
> We want to end our wait as soon as we confirm that the child is
> ready and return.  All I really need from the system is `waitpid()`.

Will this code behave correctly if the daemon we start is signalled,
per the WIFSIGNALED() cases that the code this is derived from handles
but this code does not?

But sure, I just meant to point out that the flip side to "destabilize
all existing callers" is reviewing new code that may be subtly buggy,
and those subtle bugs (if any) would be smoked out if we were forced to
extend run-command.c, i.e. to use whatever feature(s) this needs for all
existing callers.


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 17:36           ` Elijah Newren
  2021-07-13 17:47             ` Junio C Hamano
@ 2021-07-13 17:58             ` Jeff Hostetler
  2021-07-13 18:07               ` Junio C Hamano
  1 sibling, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 17:58 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/13/21 1:36 PM, Elijah Newren wrote:
> On Tue, Jul 13, 2021 at 10:10 AM Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>> On 7/1/21 7:09 PM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> Did you try to replace this with some variant of:
>>>
>>>       test_seq 1 10000 | xargs touch
>>>
>>> Which (depending on your xargs version) would invoke "touch" commands
>>> with however many argv items it thinks you can handle.
>>
>> a quick test on my Windows machine shows that
>>
>>          test_seq 1 10000 | xargs touch
>>
>> takes 3.1 seconds.
>>
>> just a simple
>>
>>          test_seq 1 10000 >/dev/null
>>
>> takes 0.2 seconds.
>>
>> using my test-tool helper cuts that time in half.
> 
> Yeah, test_seq is pretty bad; it's just a loop in shell.  Is there a
> 'seq' on windows, and does using it instead of test_seq make things
> faster with Ævar's suggested command?
> 

The Git for Windows SDK bash environment does have a /usr/bin/seq
which appears to be from GNU coreutils 8.32.  (This is different
from the version that I have on my Mac (which doesn't have a version
number).)

Using it:

	seq 1 10000 >/dev/null

takes 0.04 seconds instead of 0.2.

However, it doesn't help the touch.

	seq 1 10000 | xargs touch

still takes ~3.1 seconds.

FWIW, the xargs is clustering the 10,000 files into ~4 command lines,
so there is a little bit of Windows process overhead, but not that
much.

	seq 1 10000 | xargs wc -l | grep total

> I'd really like to modify test_seq to use seq when it's available and
> fall back to the looping-in-shell when we need to for various
> platforms.
> 
> Maybe it'd even make sense to write a 'test-tool seq' and make
> test_seq use that just so we can rip out that super lame shell
> looping.
> 


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-01 14:47     ` [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch" Jeff Hostetler via GitGitGadget
  2021-07-01 23:09       ` Ævar Arnfjörð Bjarmason
@ 2021-07-13 18:04       ` Jeff Hostetler
  1 sibling, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 18:04 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget, git
  Cc: Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 10:47 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Change p7519 to use a single "test-tool touch" command to update
> the mtime on a series of (thousands) files instead of invoking
> thousands of commands to update a single file.
> 
> This is primarily for Windows where process creation is so
> very slow and reduces the test run time by minutes.
> 
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>   t/perf/p7519-fsmonitor.sh | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
> index 5eb5044a103..f74e6014a0a 100755
> --- a/t/perf/p7519-fsmonitor.sh
> +++ b/t/perf/p7519-fsmonitor.sh
> @@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
>   	fi &&
>   
>   	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
> -	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
> -	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
> -	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
> -	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
> +	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
> +	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
> +	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
> +	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&

The big win in taking *minutes* off of the run time of this
test was getting rid of the `for` loops and one `touch` invocation
per file.

So whether we keep my `test-tool touch` command or switch to
`test_seq` or `seq` is open for debate.  Mine seems quicker, but
it is more or less round off error in the larger picture considering
what we started with.
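
For readers who have not looked at the helper, the single-process idea
can be sketched roughly as follows. This is an illustration under
stated assumptions, not the code in t/helper/test-touch.c:
touch_sequence() is a hypothetical function, and `pattern` is assumed
to contain exactly one %d.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <utime.h>

/*
 * Create (or update the mtime of) `count` files named by `pattern`,
 * starting at `start`, all from one process instead of one child
 * process per file.  Returns `count` on success, or -1 at the first
 * failure.
 */
static int touch_sequence(const char *pattern, int start, int count)
{
	char path[4096];
	int k;

	for (k = 0; k < count; k++) {
		int fd;
		int n = snprintf(path, sizeof(path), pattern, start + k);

		if (n < 0 || (size_t)n >= sizeof(path))
			return -1;

		/* Create the file if it does not exist yet. */
		fd = open(path, O_WRONLY | O_CREAT, 0666);
		if (fd < 0)
			return -1;
		close(fd);

		/* A NULL times argument sets atime/mtime to "now". */
		if (utime(path, NULL) < 0)
			return -1;
	}
	return count;
}
```

Something like touch_sequence("10000_files/%d", 1, 10000) would then
replace ten thousand `touch` invocations with a loop of system calls.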

I'll play with this a bit.
Jeff



* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 17:58             ` Jeff Hostetler
@ 2021-07-13 18:07               ` Junio C Hamano
  2021-07-13 18:19                 ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-07-13 18:07 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Elijah Newren, Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Derrick Stolee, Jeff Hostetler

Jeff Hostetler <git@jeffhostetler.com> writes:

> FWIW, the xargs is clustering the 10,000 files into ~4 command lines,
> so there is a little bit of Windows process overhead, but not that
> much.
>
> 	seq 1 10000 | xargs wc -l | grep total
>
>> I'd really like to modify test_seq to use seq when it's available and
>> fall back to the looping-in-shell when we need to for various
>> platforms.
>> Maybe it'd even make sense to write a 'test-tool seq' and make
>> test_seq use that just so we can rip out that super lame shell
>> looping.
>> 

So what is lame in this picture is not the shell, or process overhead,
but I/O performance.

I've seen some noises about Windows file creation performance raised
as an issue when doing initial checkout followed by "git clone", and
an idea floated to create a bunch of open file handles for writing
in threads when checkout (really the caller that repeatedly calls
entry.c:write_entry() by iterating the in-core index) starts, and
write out the contents in parallel, as a workaround.  When I heard
it, I somehow thought it was meant as a not-so-funny joke, but from
the sounds of it, the I/O performance may be so horrible as to require
such a hack to be usable there.  Sigh...



* Re: [PATCH v3 17/34] fsmonitor--daemon: define token-ids
  2021-07-13 15:15         ` Jeff Hostetler
@ 2021-07-13 18:11           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-13 18:11 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Tue, Jul 13 2021, Jeff Hostetler wrote:

> On 7/1/21 6:58 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> +	if (!test_env_value) {
>>> +		struct timeval tv;
>>> +		struct tm tm;
>>> +		time_t secs;
>>> +
>>> +		gettimeofday(&tv, NULL);
>>> +		secs = tv.tv_sec;
>>> +		gmtime_r(&secs, &tm);
>>> +
>>> +		strbuf_addf(&token->token_id,
>>> +			    "%"PRIu64".%d.%4d%02d%02dT%02d%02d%02d.%06ldZ",
>>> +			    flush_count++,
>>> +			    getpid(),
>>> +			    tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
>>> +			    tm.tm_hour, tm.tm_min, tm.tm_sec,
>>> +			    (long)tv.tv_usec);
>> Just bikeshedding, but can we have tokens that mostly sort
>> numeric-wise
>> by time order? So time at the start, not the flush_count/getpid.
>
> As I described in a rather large comment in the code, tokens are opaque
> strings -- without a less-than / greater-than relationship -- just a
> random string that the daemon can use (along with a sequence number) to
> ensure that a later request is well-defined.
>
> Here I'm using a counter, pid, and date-stamp.  I'd prefer using a GUID
> or UUID just to drive that home, but I didn't want to add a new .lib or
> .a to the build if not necessary.
>
> Perhaps I should compute this portion as hex(hash(time())) to remove the
> temptation to look inside my opaque token ??

Why does it matter if someone looks inside your opaque token if the code
is treating it as opaque by just doing a strcmp(old,new) on it?

I just suggested this as a debugging aid, i.e. when you the human (as
opposed to the program) are looking at this behavior it's handy to look
at the token and see that your cookies don't match, and that they look
to be N seconds apart.

And furthermore, if git crashes or whatever you can now easily look up
what process crashed if you've got the leftover cookie, if you've also
got trace2 logs.
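
As a rough sketch of what "time at the start" could look like, the
following puts the timestamp first and the pid and flush count last.
The function name format_token_id(), its parameters and the exact
field order are assumptions made for this example, not what the patch
does.  The code comparing tokens can keep treating them as opaque;
only a human reading them benefits from the ordering.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical formatter: "<UTC timestamp>.<pid>.<flush count>".
 * Because the timestamp is fixed-width and comes first, tokens
 * created at different times sort in time order when compared as
 * plain strings.  Returns the length written, or -1 on error or
 * truncation.
 */
static int format_token_id(char *buf, size_t size, time_t secs, long usec,
			   long pid, unsigned long flush_count)
{
	struct tm tm;
	int len;

	assert(buf && size > 0);

	if (!gmtime_r(&secs, &tm))
		return -1;

	len = snprintf(buf, size,
		       "%04d%02d%02dT%02d%02d%02d.%06ldZ.%ld.%lu",
		       tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
		       tm.tm_hour, tm.tm_min, tm.tm_sec,
		       usec, pid, flush_count);
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

With the values a daemon would already have at hand (tv.tv_sec,
tv.tv_usec, getpid() and the flush counter), two leftover tokens can
then be lined up by eye to see roughly how far apart they were made.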

>> Maybe I'm missing something, but couldn't we just re-use the trace2
>> SID
>> + a more trivial trailer? It would have the nice property that you could
>> find the trace2 SID whenever you looked at such a token (could
>> e.g. split them by "/" too), and add the tv_usec, flush_count+whatever
>> else is needed to make it unique after the "/", no?
>> 
>
> I would rather keep Trace2 out of this.  The SID is another opaque
> string and I don't want to reach inside it.

For the purposes of the git.git codebase it's fine to reach inside of
it, especially for an "I'd like a near-enough-UUID, and I know the trace2
SID already does that per-program" use case, so you just need e.g. a
sequence counter within the program to ensure global uniqueness with
other git processes for such a cookie.


* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-13 15:46         ` Jeff Hostetler
@ 2021-07-13 18:15           ` Ævar Arnfjörð Bjarmason
  2021-07-16 15:55             ` Johannes Schindelin
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-13 18:15 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Tue, Jul 13 2021, Jeff Hostetler wrote:

> On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>
>>> Teach the win32 backend to register a watch on the working tree
>>> root directory (recursively).  Also watch the <gitdir> if it is
>>> not inside the working tree.  And to collect path change notifications
>>> into batches and publish.
>>>
>>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>>> ---
>>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
>> <bikeshed mode> Spying on the early history of this (looking for the
>> Linux backend) I saw that at some point we had just
>> compat/fsmonitor/linux.c, and presumably some of
>> compat/fsmonitor/{windows,win32,macos,darwin}.c.
>> At some point those filenames became much much longer.
>> 
>
> Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c"
> would cause confusion in the debugger (I've long since forgotten
> which).  Breaking at win32.c:30 was no longer unique.
>
> Also, if the Makefile sends all .o's to the root directory or a
> unified OBJS directory rather than to the subdir containing the .c,
> then we have another issue during linking...
>
> So, having been burned too many times, I prefer to make source
> filenames unique when possible.

A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve
that goal.

>> I've noticed you tend to prefer really long file and function names,
>> e.g. your borrowed daemonize() became
>> spawn_background_fsmonitor_daemon(), I think aiming for shorter
>> filenames & function names helps, e.g. these long names widen diffstats,
>> and many people who hack on the code stick religiously to 80 character
>> width terminals.
>> 
>
> I prefer self-documenting code.

Sure, I'm not saying daemonize() is an ideal name, just suggesting that
you can get both uniqueness & self-documentation without needing to split
across multiple lines in some common cases, and so stay within the "We
try to keep to at most 80 characters per line" rule in CodingGuidelines
in this series.


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 17:06         ` Jeff Hostetler
  2021-07-13 17:36           ` Elijah Newren
@ 2021-07-13 18:18           ` Ævar Arnfjörð Bjarmason
  2021-07-13 19:05             ` Jeff Hostetler
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-13 18:18 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Tue, Jul 13 2021, Jeff Hostetler wrote:

> On 7/1/21 7:09 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>
>>> Change p7519 to use a single "test-tool touch" command to update
>>> the mtime on a series of (thousands) files instead of invoking
>>> thousands of commands to update a single file.
>>>
>>> This is primarily for Windows where process creation is so
>>> very slow and reduces the test run time by minutes.
>>>
>>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>>> ---
>>>   t/perf/p7519-fsmonitor.sh | 14 ++++++--------
>>>   1 file changed, 6 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>>> index 5eb5044a103..f74e6014a0a 100755
>>> --- a/t/perf/p7519-fsmonitor.sh
>>> +++ b/t/perf/p7519-fsmonitor.sh
>>> @@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
>>>   	fi &&
>>>     	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
>>> -	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
>>> -	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
>>> -	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
>>> -	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
>>> +	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
>>> +	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
>>> +	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
>>> +	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&
>>> +
>>>   	git add 1_file 10_files 100_files 1000_files 10000_files &&
>>>   	git commit -qm "Add files" &&
>>>   @@ -200,15 +201,12 @@ test_fsmonitor_suite() {
>>>   	# Update the mtimes on upto 100k files to make status think
>>>   	# that they are dirty.  For simplicity, omit any files with
>>>   	# LFs (i.e. anything that ls-files thinks it needs to dquote).
>>> -	# Then fully backslash-quote the paths to capture any
>>> -	# whitespace so that they pass thru xargs properly.
>>>   	#
>>>   	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>>>   		git ls-files | \
>>>   			head -100000 | \
>>>   			grep -v \" | \
>>> -			sed '\''s/\(.\)/\\\1/g'\'' | \
>>> -			xargs test-tool chmtime -300 &&
>>> +			test-tool touch stdin &&
>>>   		git status
>>>   	'
>> Did you try to replace this with some variant of:
>>      test_seq 1 10000 | xargs touch
>> Which (depending on your xargs version) would invoke "touch"
>> commands
>> with however many argv items it thinks you can handle.
>> 
>
> a quick test on my Windows machine shows that
>
> 	test_seq 1 10000 | xargs touch
>
> takes 3.1 seconds.
>
> just a simple
>
> 	test_seq 1 10000 >/dev/null
>
> take 0.2 seconds.
>
> using my test-tool helper cuts that time in half.

There's what Elijah mentioned about test_seq, so maybe it's just that.

But what I was suggesting was using the xargs mode where it does N
arguments at a time.

Does this work for you, and does it cause xargs to invoke "touch" with
the relevant N number of arguments, and does it help with the
performance?

    test_seq 1 10000 | xargs touch
    test_seq 1 10000 | xargs -n 10 touch
    test_seq 1 10000 | xargs -n 100 touch
    test_seq 1 10000 | xargs -n 1000 touch

etc.

Also I didn't notice this before, but the -300 part of "chmtime -300"
was redundant before then? I.e. you're implicitly changing it to "=+0"
instead with your "touch" helper, are you not?


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 18:07               ` Junio C Hamano
@ 2021-07-13 18:19                 ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 18:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren, Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, Git Mailing List,
	Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/13/21 2:07 PM, Junio C Hamano wrote:
> Jeff Hostetler <git@jeffhostetler.com> writes:
> 
>> FWIW, the xargs is clustering the 10,000 files into ~4 command lines,
>> so there is a little bit of Windows process overhead, but not that
>> much.
>>
>> 	seq 1 10000 | xargs wc -l | grep total
>>
>>> I'd really like to modify test_seq to use seq when it's available and
>>> fall back to the looping-in-shell when we need to for various
>>> platforms.
>>> Maybe it'd even make sense to write a 'test-tool seq' and make
>>> test_seq use that just so we can rip out that super lame shell
>>> looping.
>>>
> 
> So what lame in this picture is not shell, or process overhead, but
> I/O performance.
> 
> I've seen some noises about Windows file creation performance raised
> as an issue when doing initial checkout followed by "git clone", and
> an idea floated to create a bunch of open file handles for writing
> in threads when checkout (really the caller that repeatedly calls
> entry.c:write_entry() by iterating the in-core index) starts, and
> write out the contents in parallel, as a workaround.  When I heard
> it, I somehow thought it was meant as a not-so-funny joke, but from
> the sounds of it, the I/O performance may be so horrible to require
> such a hack to be usable there.  Sigh...
> 

Yes, there are some things here (that I believe to be I/O related)
on Windows that I want to look at when I wrap up FSMonitor.  And
yes, some of them sound pretty stupid.

Jeff



* Re: [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-13 17:54           ` Ævar Arnfjörð Bjarmason
@ 2021-07-13 18:44             ` Jeff Hostetler
  2021-07-20 19:38               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 18:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler



On 7/13/21 1:54 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Jul 13 2021, Jeff Hostetler wrote:
> 
>> My response here is in addition to Dscho's remarks on this topic.
>> He makes excellent points that I'll just #include here.  I do want
>> to add my own $0.02 here.
>>
>> On 7/1/21 6:18 PM, Ævar Arnfjörð Bjarmason wrote:
>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>

>>>> +/*
>>>> + * This is adapted from `wait_or_whine()`.  Watch the child process and
>>>> + * let it get started and begin listening for requests on the socket
>>>> + * before reporting our success.
>>>> + */
>>>> +static int wait_for_background_startup(pid_t pid_child)
>>>> +{
>>>> +	int status;
>>>> +	pid_t pid_seen;
>>>> +	enum ipc_active_state s;
>>>> +	time_t time_limit, now;
>>>> +
>>>> +	time(&time_limit);
>>>> +	time_limit += fsmonitor__start_timeout_sec;
>>>> +
>>>> +	for (;;) {
>>>> +		pid_seen = waitpid(pid_child, &status, WNOHANG);
>>>> +
>>>> +		if (pid_seen == -1)
>>>> +			return error_errno(_("waitpid failed"));
>>>> +		else if (pid_seen == 0) {
>>>> +			/*
>>>> +			 * The child is still running (this should be
>>>> +			 * the normal case).  Try to connect to it on
>>>> +			 * the socket and see if it is ready for
>>>> +			 * business.
>>>> +			 *
>>>> +			 * If there is another daemon already running,
>>>> +			 * our child will fail to start (possibly
>>>> +			 * after a timeout on the lock), but we don't
>>>> +			 * care (who responds) if the socket is live.
>>>> +			 */
>>>> +			s = fsmonitor_ipc__get_state();
>>>> +			if (s == IPC_STATE__LISTENING)
>>>> +				return 0;
>>>> +
>>>> +			time(&now);
>>>> +			if (now > time_limit)
>>>> +				return error(_("fsmonitor--daemon not online yet"));
>>>> +		} else if (pid_seen == pid_child) {
>>>> +			/*
>>>> +			 * The new child daemon process shutdown while
>>>> +			 * it was starting up, so it is not listening
>>>> +			 * on the socket.
>>>> +			 *
>>>> +			 * Try to ping the socket in the odd chance
>>>> +			 * that another daemon started (or was already
>>>> +			 * running) while our child was starting.
>>>> +			 *
>>>> +			 * Again, we don't care who services the socket.
>>>> +			 */
>>>> +			s = fsmonitor_ipc__get_state();
>>>> +			if (s == IPC_STATE__LISTENING)
>>>> +				return 0;
>>>> +
>>>> +			/*
>>>> +			 * We don't care about the WEXITSTATUS() nor
>>>> +			 * any of the WIF*(status) values because
>>>> +			 * `cmd_fsmonitor__daemon()` does the `!!result`
>>>> +			 * trick on all function return values.
>>>> +			 *
>>>> +			 * So it is sufficient to just report the
>>>> +			 * early shutdown as an error.
>>>> +			 */
>>>> +			return error(_("fsmonitor--daemon failed to start"));
>>>> +		} else
>>>> +			return error(_("waitpid is confused"));
>>>> +	}
>>>> +}
>>> Ditto this. could we extend the wait_or_whine() function (or some
>>> extended version thereof) to do what you need with callbacks?
>>> It seems the main difference is just being able to pass down a flag
>>> for
>>> waitpid(), and the loop needing to check EINTR or not depending on
>>> whether WNOHANG is passed.
>>> For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
>>> behavior with an adjusted wait_or_whine(). Wouldn't it be better to
>>> report what exit status it exits with e.g. if the top-level process is
>>> signalled? We do so in trace2 for other things we spawn...
>>>
>>
>> Again, I don't want to mix my usage here with the existing code
>> and destabilize all existing callers.  Here we are spinning to give
>> the child a chance to *start* and confirm that it is in a listening
>> state and ready for connections.  We do not wait for the child to
>> exit (unless it dies quickly without becoming ready).
>>
>> We want to end our wait as soon as we confirm that the child is
>> ready and return.  All I really need from the system is `waitpid()`.
> 
> Will this code behave correctly if the daemon we start is signalled per
> the WIFSIGNALED() cases that the code this is derived from handles, but
> this does not?

We're only waiting until the child gets started and is able to receive
requests -- what happens to it after we have confirmed that it is ready
is not our concern (after all, the parent is about to exit anyway and
the child is going to continue on).

If waitpid() gives us a WIFSIGNALED (or any other WIF*() state) before
we have spoken to it, we will return a "failed to start".

But again, that signal would have to arrive immediately after we spawned
it and *before* we could talk to it.  If the child is signaled after we
confirmed it was ready, we don't care because the parent process will be
gone.

(If the child is signaled or is killed (or crashes or whatever), the
next Git command (like "status") that tries to talk to it will re-start
it implicitly -- the `git fsmonitor--daemon start` command here is an
explicit start.)
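
The shape of that wait can be reduced to a small standalone sketch.
This is not the code from the patch: wait_for_startup(), the
startup_result enum and the probe callback are hypothetical names,
with the callback standing in for the fsmonitor_ipc__get_state()
check.  The short sleep between polls is also an assumption of this
example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

enum startup_result {
	STARTUP_READY = 0,	/* someone is answering; we do not care who */
	STARTUP_EXITED,		/* child ended (exit or signal) before ready */
	STARTUP_TIMEOUT,
	STARTUP_ERROR
};

/* Returns non-zero once the service answers. */
typedef int (*ready_fn)(void *data);

/* Trivial probe for demonstration: reads a flag owned by the caller. */
static int probe_flag(void *data)
{
	return *(int *)data;
}

static enum startup_result wait_for_startup(pid_t child, int timeout_sec,
					    ready_fn is_ready, void *data)
{
	time_t limit = time(NULL) + timeout_sec;
	struct timespec pause = { 0, 50 * 1000 * 1000 }; /* 50ms */

	assert(is_ready);

	for (;;) {
		int status;
		/* WNOHANG: report the child's state without blocking. */
		pid_t seen = waitpid(child, &status, WNOHANG);

		if (seen == -1)
			return STARTUP_ERROR;

		/* Probe even if our child is gone; another may be serving. */
		if (is_ready(data))
			return STARTUP_READY;

		/* Any WIF*() state before readiness is a failed start. */
		if (seen == child)
			return STARTUP_EXITED;

		if (time(NULL) > limit)
			return STARTUP_TIMEOUT;

		nanosleep(&pause, NULL);
	}
}
```

The caller forks the daemon, then passes the child's pid and a probe;
as soon as the probe succeeds the parent returns and can exit, leaving
the child running.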


> 
> But sure, I just meant to point out that the flip side to "destabilize
> all existing callers" is reviewing new code that may be subtly buggy,
> and those subtle bugs (if any) would be smoked out if we were forced to
> extend run-command.c, i.e. to use whatever feature(s) this needs for all
> existing callers.
> 

That would/could have a massive footprint.  And I've already established
that my usage here is sufficiently different from existing uses that the
result would be a mess. IMHO.

Jeff


* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 18:18           ` Ævar Arnfjörð Bjarmason
@ 2021-07-13 19:05             ` Jeff Hostetler
  2021-07-20 19:18               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-13 19:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler



On 7/13/21 2:18 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Jul 13 2021, Jeff Hostetler wrote:
> 
>> On 7/1/21 7:09 PM, Ævar Arnfjörð Bjarmason wrote:
>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>
>>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>>
>>>> Change p7519 to use a single "test-tool touch" command to update
>>>> the mtime on a series of (thousands) files instead of invoking
>>>> thousands of commands to update a single file.
>>>>
>>>> This is primarily for Windows where process creation is so
>>>> very slow and reduces the test run time by minutes.
>>>>
>>>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>>>> ---
>>>>    t/perf/p7519-fsmonitor.sh | 14 ++++++--------
>>>>    1 file changed, 6 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>>>> index 5eb5044a103..f74e6014a0a 100755
>>>> --- a/t/perf/p7519-fsmonitor.sh
>>>> +++ b/t/perf/p7519-fsmonitor.sh
>>>> @@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
>>>>    	fi &&
>>>>      	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
>>>> -	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
>>>> -	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
>>>> -	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
>>>> -	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
>>>> +	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
>>>> +	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
>>>> +	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
>>>> +	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&
>>>> +
>>>>    	git add 1_file 10_files 100_files 1000_files 10000_files &&
>>>>    	git commit -qm "Add files" &&
>>>>    @@ -200,15 +201,12 @@ test_fsmonitor_suite() {
>>>>    	# Update the mtimes on upto 100k files to make status think
>>>>    	# that they are dirty.  For simplicity, omit any files with
>>>>    	# LFs (i.e. anything that ls-files thinks it needs to dquote).
>>>> -	# Then fully backslash-quote the paths to capture any
>>>> -	# whitespace so that they pass thru xargs properly.
>>>>    	#
>>>>    	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>>>>    		git ls-files | \
>>>>    			head -100000 | \
>>>>    			grep -v \" | \
>>>> -			sed '\''s/\(.\)/\\\1/g'\'' | \
>>>> -			xargs test-tool chmtime -300 &&
>>>> +			test-tool touch stdin &&
>>>>    		git status
>>>>    	'
>>> Did you try to replace this with some variant of:
>>>       test_seq 1 10000 | xargs touch
>>> Which (depending on your xargs version) would invoke "touch"
>>> commands
>>> with however many argv items it thinks you can handle.
>>>
>>
>> a quick test on my Windows machine shows that
>>
>> 	test_seq 1 10000 | xargs touch
>>
>> takes 3.1 seconds.
>>
>> just a simple
>>
>> 	test_seq 1 10000 >/dev/null
>>
>> take 0.2 seconds.
>>
>> using my test-tool helper cuts that time in half.
> 
> There's what Elijah mentioned about test_seq, so maybe it's just that.
> 
> But what I was suggesting was using the xargs mode where it does N
> arguments at a time.
> 
> Does this work for you, and does it cause xargs to invoke "touch" with
> the relevant N number of arguments, and does it help with the
> performance?
> 
>      test_seq 1 10000 | xargs touch
>      test_seq 1 10000 | xargs -n 10 touch
>      test_seq 1 10000 | xargs -n 100 touch
>      test_seq 1 10000 | xargs -n 1000 touch

The GFW SDK version of xargs does have `-n N` and it does work as
advertised.  And it does slow down things considerably.  Letting it
do ~2500 per command in 4 commands took the 3.1 seconds listed above.

Adding `-n 100` to it takes 5.7 seconds, so process creation overhead
is a factor here.


> 
> etc.
> 
> Also I didn't notice this before, but the -300 part of "chmtime -300"
> was redundant before then? I.e. you're implicitly changing it to "=+0"
> instead with your "touch" helper, are you not?
> 

Right. I'm changing it to the current time.

Jeff


* Re: [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-13 17:46           ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 15:45             ` Johannes Schindelin
  2021-07-16 17:04               ` Felipe Contreras
  0 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-16 15:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler

[-- Attachment #1: Type: text/plain, Size: 2570 bytes --]

Hi Ævar,

On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> [snip Ævar's suggestion to populate the manual page incrementally,
> interspersed with the commits that finalize implementing the respective
> functionality]
>
> I suggested this because I think it's much easier to read patches that
> are larger because they incrementally update docs or tests along with
> code, than smaller ones that are e.g. "add docs" followed by
> incrementally modifying the code.

My experience is the exact opposite of yours: shorter patches are easier
to read.

> That's because you can consider those atomically.

No, in a patch series you cannot consider any patch completely atomically.
Just like you don't consider any paragraph in any well-written book out of
context.

> If earlier doc changes refer to later code changes you're left jumping
> back & forth and wondering if the code you're reading that doesn't match
> the docs yet is a bug, or if it's solved in some later change you're now
> needing to mentally keep track of.

You only keep jumping back and forth when reviewing _patches_. We try to
do code review on this mailing list, which means that you have the code
locally and review it in the correct context. When you do that, a design
document is quite helpful. And the proposed manual page serves as such a
design document.

> Which is not some theoretical concern b.t.w., but exactly what I found
> myself doing when reading this series, hence the suggestion.

Part of the problem here seems to be that this patch series saw many
reviewer suggestions that forced it to increase in length. That was
probably not very helpful, after all.

The first iteration had 23 patches. Reviews forced it to grow to 28
patches in the second iteration. And that was still not enough, therefore
the third iteration consisted of 34 patches. And if you had had your way,
requiring Jeff to include a Linux backend in the same patch series, it
would have to increase in size again, and not just by a little.

This all sounds like we're truly falling into the trap of ignoring the
rule that the perfect is the enemy of the good.

I really would like to come back to a focused review that truly improves
the patches at hand, and avoids conflating the review of the actual patch
series with matters of personal taste (which is a recipe for
disagreement). After all, we are interested in getting this feature out to
users who need to work with very large worktrees, right? At least that's
my goal here.

Ciao,
Johannes


* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-01 22:45       ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 15:47         ` Johannes Schindelin
  2021-07-16 16:55           ` Ævar Arnfjörð Bjarmason
  2021-07-16 16:59           ` Felipe Contreras
  2021-07-19 16:54         ` Jeff Hostetler
  1 sibling, 2 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-16 15:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler

[-- Attachment #1: Type: text/plain, Size: 2517 bytes --]

Hi Ævar,

On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

>
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>
> > From: Jeff Hostetler <jeffhost@microsoft.com>
> >
> > Stub in empty backend for fsmonitor--daemon on Windows.
> >
> > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> > ---
> >  Makefile                                     | 13 ++++++
> >  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
> >  compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
> >  config.mak.uname                             |  2 +
> >  contrib/buildsystems/CMakeLists.txt          |  5 ++
> >  5 files changed, 90 insertions(+)
> >  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
> >  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
> >
> > diff --git a/Makefile b/Makefile
> > index c45caacf2c3..a2a6e1f20f6 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -467,6 +467,11 @@ all::
> >  # directory, and the JSON compilation database 'compile_commands.json' will be
> >  # created at the root of the repository.
> >  #
> > +# If your platform supports a built-in fsmonitor backend, set
> > +# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
> > +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
> > +# `fsmonitor_fs_listen__*()` routines.
> > +#
> >  # Define DEVELOPER to enable more compiler warnings. Compiler version
> >  # and family are auto detected, but could be overridden by defining
> >  # COMPILER_FEATURES (see config.mak.dev). You can still set
> > @@ -1929,6 +1934,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
> >  	COMPAT_OBJS += compat/access.o
> >  endif
> >
> > +ifdef FSMONITOR_DAEMON_BACKEND
> > +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
> > +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
> > +endif
> > +
> >  ifeq ($(TCLTK_PATH),)
> >  NO_TCLTK = NoThanks
> >  endif
> > @@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
> >  	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
> >  	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
> >  	@echo X=\'$(X)\' >>$@+
> > +ifdef FSMONITOR_DAEMON_BACKEND
> > +	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
> > +endif
>
> Why put this in an ifdef?

Why not? What benefit does this question bring to improving this patch
series?

Ciao,
Dscho


* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-01 22:49       ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 15:51         ` Johannes Schindelin
  2021-07-16 16:52           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-16 15:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler

[-- Attachment #1: Type: text/plain, Size: 2376 bytes --]

Hi Ævar,

On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>
> > From: Jeff Hostetler <jeffhost@microsoft.com>
> >
> > Stub in empty implementation of fsmonitor--daemon
> > backend for MacOS.
> >
> > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> > ---
> >  compat/fsmonitor/fsmonitor-fs-listen-macos.c | 20 ++++++++++++++++++++
> >  config.mak.uname                             |  2 ++
> >  contrib/buildsystems/CMakeLists.txt          |  3 +++
> >  3 files changed, 25 insertions(+)
> >  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c
> >
> > diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> > new file mode 100644
> > index 00000000000..b91058d1c4f
> > --- /dev/null
> > +++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
> > @@ -0,0 +1,20 @@
> > +#include "cache.h"
> > +#include "fsmonitor.h"
> > +#include "fsmonitor-fs-listen.h"
> > +
> > +int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
> > +{
> > +	return -1;
> > +}
> > +
> > +void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
> > +{
> > +}
> > +
> > +void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
> > +{
> > +}
> > +
> > +void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
> > +{
> > +}
> > diff --git a/config.mak.uname b/config.mak.uname
> > index fcd88b60b14..394355463e1 100644
> > --- a/config.mak.uname
> > +++ b/config.mak.uname
> > @@ -147,6 +147,8 @@ ifeq ($(uname_S),Darwin)
> >  			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
> >  		endif
> >  	endif
> > +	FSMONITOR_DAEMON_BACKEND = macos
>
> A rather trivial point, but can't we pick one of "macos" or "darwin"
> (I'd think going with the existing uname is better) and name the file
> after the uname (or lower-case thereof)?
>
> Makes these make rules more consistent too, we could just set this to
> "YesPlease" here, and then lower case the uname for the file
> compilation/include.

So you suggest that we name the new stuff after an `uname` that reflects a
name that is no longer relevant? I haven't seen a real Darwin system in
quite a long time, have you?

I don't find such a suggestion constructive, I have to admit.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-13 18:15           ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 15:55             ` Johannes Schindelin
  2021-07-16 16:27               ` Ævar Arnfjörð Bjarmason
  2021-07-16 16:55               ` Felipe Contreras
  0 siblings, 2 replies; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-16 15:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler

Hi Ævar,

On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

>
> On Tue, Jul 13 2021, Jeff Hostetler wrote:
>
> > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote:
> >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> >>
> >>> From: Jeff Hostetler <jeffhost@microsoft.com>
> >>>
> >>> Teach the win32 backend to register a watch on the working tree
> >>> root directory (recursively).  Also watch the <gitdir> if it is
> >>> not inside the working tree.  And to collect path change notifications
> >>> into batches and publish.
> >>>
> >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> >>> ---
> >>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
> >> <bikeshed mode> Spying on the early history of this (looking for the
> >> Linux backend) I saw that at some point we had just
> >> compat/fsmonitor/linux.c, and presumably some of
> >> compat/fsmonitor/{windows,win32,macos,darwin}.c.
> >> At some point those filenames became much much longer.
> >>
> >
> > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c"
> > would cause confusion in the debugger (I've long since forgotten
> > which).  Breaking at win32.c:30 was no longer unique.
> >
> > Also, if the Makefile sends all .o's to the root directory or a
> > unified OBJS directory rather than to the subdir containing the .c,
> > then we have another issue during linking...
> >
> > So, having been burned too many times, I prefer to make source
> > filenames unique when possible.
>
> A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve
> that goal.
>
> >> I've noticed you tend to prefer really long file and function names,
> >> e.g. your borrowed daemonize() became
> >> spawn_background_fsmonitor_daemon(), I think aiming for shorter
> >> filenames & function names helps, e.g. these long names widen diffstats,
> >> and many people who hack on the code stick religiously to 80 character
> >> width terminals.
> >>
> >
> > I prefer self-documenting code.
>
> Sure, I'm not saying daemonize() is an ideal name, just suggesting that
> you can both get uniqueness & self-documentation and not need to split
> to multiple lines in some common cases to stay within the "We try to
> keep to at most 80 characters per line" in CodingGuidelines in this
> series.

While you are entitled to have your taste, I have to point out that Jeff
is just as entitled to their taste, and I don't think that you can claim
that yours is better.

So I wonder what the intended outcome of this review is? To make the patch
better? Or to pit taste against taste?

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-16 15:55             ` Johannes Schindelin
@ 2021-07-16 16:27               ` Ævar Arnfjörð Bjarmason
  2021-07-17 12:45                 ` Eric Wong
  2021-07-16 16:55               ` Felipe Contreras
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-16 16:27 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, Eric Wong, SZEDER Gábor


On Fri, Jul 16 2021, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Tue, Jul 13 2021, Jeff Hostetler wrote:
>>
>> > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote:
>> >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> >>
>> >>> From: Jeff Hostetler <jeffhost@microsoft.com>
>> >>>
>> >>> Teach the win32 backend to register a watch on the working tree
>> >>> root directory (recursively).  Also watch the <gitdir> if it is
>> >>> not inside the working tree.  And to collect path change notifications
>> >>> into batches and publish.
>> >>>
>> >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> >>> ---
>> >>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
>> >> <bikeshed mode> Spying on the early history of this (looking for the
>> >> Linux backend) I saw that at some point we had just
>> >> compat/fsmonitor/linux.c, and presumably some of
>> >> compat/fsmonitor/{windows,win32,macos,darwin}.c.
>> >> At some point those filenames became much much longer.
>> >>
>> >
>> > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c"
>> > would cause confusion in the debugger (I've long since forgotten
>> > which).  Breaking at win32.c:30 was no longer unique.
>> >
>> > Also, if the Makefile sends all .o's to the root directory or a
>> > unified OBJS directory rather than to the subdir containing the .c,
>> > then we have another issue during linking...
>> >
>> > So, having been burned too many times, I prefer to make source
>> > filenames unique when possible.
>>
>> A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve
>> that goal.
>>
>> >> I've noticed you tend to prefer really long file and function names,
>> >> e.g. your borrowed daemonize() became
>> >> spawn_background_fsmonitor_daemon(), I think aiming for shorter
>> >> filenames & function names helps, e.g. these long names widen diffstats,
>> >> and many people who hack on the code stick religiously to 80 character
>> >> width terminals.
>> >>
>> >
>> > I prefer self-documenting code.
>>
>> Sure, I'm not saying daemonize() is an ideal name, just suggesting that
>> you can both get uniqueness & self-documentation and not need to split
>> to multiple lines in some common cases to stay within the "We try to
>> keep to at most 80 characters per line" in CodingGuidelines in this
>> series.
>
> While you are entitled to have your taste, I have to point out that Jeff
> is just as entitled to their taste, and I don't think that you can claim
> that yours is better.
>
> So I wonder what the intended outcome of this review is? To make the patch
> better? Or to pit taste against taste?

Neither, to address a misunderstanding.

Sure, if a reviewer points out "maybe change X to Y" and the reply is "I
like X better than Y", fair enough.

My reading of Jeff H.'s reply upthread was that he'd mistaken my
suggestion of a Y for a Z.

I.e. that shortening a name like fsmonitor-fs-listen-win32.c (X)
necessarily had to mean that we'd have a win32.c (Z), negatively
impacting some debugging workflows, as opposed to just a
shorter-but-unique name like fsmon-win32.c (Y).

Ditto for daemonize() (X/Z) and spawn_background_fsmonitor_daemon() (X).

I'm certain that with this reply we're thoroughly into the "respectfully
disagree" territory as opposed to having a misunderstanding.

I also take, and agree with, your implied point that there's no point in having
a yes/no/yes/no/yes argument on-list, and I did not mean to engage in
such a thing, only to clear up the misunderstanding, if any.

I'll only say that I don't think that something like long variable/file
etc. names is *just* a matter of taste, seeing as we have a fairly
strict "keep to at most 80 characters per line" as the 2nd item in the C
coding style (after "use tabs, not spaces").

That matter of taste for one developer objectively makes it harder to
stay within the bounds of the coding style for future maintenance.

We do have active contributors that I understand actually use terminals
of that size to work on this project (CC'd, but maybe I misrecall that
for one/both). I'm not one of those people, but I do find that
maintaining code with needlessly long identifiers in this codebase is
painful.

E.g. in a patch I just submitted I've been working on similarly long
identifiers in the refs code[1], and with say a long variable/type name
and using a long-named function you get to the point of needing to place
each individual argument of the function on its own line, or near enough
to that.

1. https://lore.kernel.org/git/patch-7.7-cb32b5c0526-20210716T142032Z-avarab@gmail.com/


^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-16 15:51         ` Johannes Schindelin
@ 2021-07-16 16:52           ` Ævar Arnfjörð Bjarmason
  2021-07-26 21:40             ` Johannes Schindelin
  0 siblings, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-16 16:52 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler


On Fri, Jul 16 2021, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>
>> > From: Jeff Hostetler <jeffhost@microsoft.com>
>> >
>> > Stub in empty implementation of fsmonitor--daemon
>> > backend for MacOS.
>> >
>> > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> > ---
>> >  compat/fsmonitor/fsmonitor-fs-listen-macos.c | 20 ++++++++++++++++++++
>> >  config.mak.uname                             |  2 ++
>> >  contrib/buildsystems/CMakeLists.txt          |  3 +++
>> >  3 files changed, 25 insertions(+)
>> >  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-macos.c
>> >
>> > diff --git a/compat/fsmonitor/fsmonitor-fs-listen-macos.c b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
>> > new file mode 100644
>> > index 00000000000..b91058d1c4f
>> > --- /dev/null
>> > +++ b/compat/fsmonitor/fsmonitor-fs-listen-macos.c
>> > @@ -0,0 +1,20 @@
>> > +#include "cache.h"
>> > +#include "fsmonitor.h"
>> > +#include "fsmonitor-fs-listen.h"
>> > +
>> > +int fsmonitor_fs_listen__ctor(struct fsmonitor_daemon_state *state)
>> > +{
>> > +	return -1;
>> > +}
>> > +
>> > +void fsmonitor_fs_listen__dtor(struct fsmonitor_daemon_state *state)
>> > +{
>> > +}
>> > +
>> > +void fsmonitor_fs_listen__stop_async(struct fsmonitor_daemon_state *state)
>> > +{
>> > +}
>> > +
>> > +void fsmonitor_fs_listen__loop(struct fsmonitor_daemon_state *state)
>> > +{
>> > +}
>> > diff --git a/config.mak.uname b/config.mak.uname
>> > index fcd88b60b14..394355463e1 100644
>> > --- a/config.mak.uname
>> > +++ b/config.mak.uname
>> > @@ -147,6 +147,8 @@ ifeq ($(uname_S),Darwin)
>> >  			MSGFMT = /usr/local/opt/gettext/bin/msgfmt
>> >  		endif
>> >  	endif
>> > +	FSMONITOR_DAEMON_BACKEND = macos
>>
>> A rather trivial point, but can't we pick one of "macos" or "darwin"
>> (I'd think going with the existing uname is better) and name the file
>> after the uname (or lower-case thereof)?
>>
>> Makes these make rules more consistent too, we could just set this to
>> "YesPlease" here, and then lower case the uname for the file
>> compilation/include.
>
> So you suggest that we name the new stuff after an `uname` that reflects a
> name that is no longer relevant? I haven't seen a real Darwin system in
> quite a long time, have you?

It's not current? On a Mac Mini M1, which got released this year:

    % uname -s
    Darwin

We then have the same in config.mak.uname, it seemed the most obvious
and consistent to carry that through to file inclusion.
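
A minimal sketch of that lower-casing idea, purely for illustration: the
shell variable names (uname_S, backend, src) are assumptions made up for
this example, and the series itself sets FSMONITOR_DAEMON_BACKEND by hand
rather than deriving it.

```shell
#!/bin/sh
# Hypothetical sketch: derive the backend file name from the lower-cased
# uname instead of a hand-picked name such as "macos".
uname_S=$(uname -s)

# Lower-case the kernel name, e.g. "Darwin" becomes "darwin".
backend=$(printf '%s' "$uname_S" | tr '[:upper:]' '[:lower:]')

# Build the source path using the pattern documented in the Makefile comment.
src="compat/fsmonitor/fsmonitor-fs-listen-$backend.c"
echo "$src"
```

On a system where `uname -s` prints Darwin, this would name
compat/fsmonitor/fsmonitor-fs-listen-darwin.c rather than the -macos.c
file the patch adds.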

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-16 15:47         ` Johannes Schindelin
@ 2021-07-16 16:55           ` Ævar Arnfjörð Bjarmason
  2021-07-17  5:13             ` Junio C Hamano
  2021-07-16 16:59           ` Felipe Contreras
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-16 16:55 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler


On Fri, Jul 16 2021, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>
>> > From: Jeff Hostetler <jeffhost@microsoft.com>
>> >
>> > Stub in empty backend for fsmonitor--daemon on Windows.
>> >
>> > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> > ---
>> >  Makefile                                     | 13 ++++++
>> >  compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
>> >  compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
>> >  config.mak.uname                             |  2 +
>> >  contrib/buildsystems/CMakeLists.txt          |  5 ++
>> >  5 files changed, 90 insertions(+)
>> >  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
>> >  create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
>> >
>> > diff --git a/Makefile b/Makefile
>> > index c45caacf2c3..a2a6e1f20f6 100644
>> > --- a/Makefile
>> > +++ b/Makefile
>> > @@ -467,6 +467,11 @@ all::
>> >  # directory, and the JSON compilation database 'compile_commands.json' will be
>> >  # created at the root of the repository.
>> >  #
>> > +# If your platform supports a built-in fsmonitor backend, set
>> > +# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
>> > +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
>> > +# `fsmonitor_fs_listen__*()` routines.
>> > +#
>> >  # Define DEVELOPER to enable more compiler warnings. Compiler version
>> >  # and family are auto detected, but could be overridden by defining
>> >  # COMPILER_FEATURES (see config.mak.dev). You can still set
>> > @@ -1929,6 +1934,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
>> >  	COMPAT_OBJS += compat/access.o
>> >  endif
>> >
>> > +ifdef FSMONITOR_DAEMON_BACKEND
>> > +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>> > +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>> > +endif
>> > +
>> >  ifeq ($(TCLTK_PATH),)
>> >  NO_TCLTK = NoThanks
>> >  endif
>> > @@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
>> >  	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>> >  	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
>> >  	@echo X=\'$(X)\' >>$@+
>> > +ifdef FSMONITOR_DAEMON_BACKEND
>> > +	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
>> > +endif
>>
>> Why put this in an ifdef?
>
> Why not? What benefit does this question bring to improving this patch
> series?

I think that when adding code to the Makefile it makes sense to follow
the prevailing pattern, unless there's a good reason to do otherwise,
e.g. on my build:
	
	$ grep "''" GIT-BUILD-OPTIONS 
	NO_CURL=''
	NO_EXPAT=''
	NO_PERL=''
	NO_PTHREADS=''
	NO_PYTHON=''
	NO_UNIX_SOCKETS=''
	X=''

Why does the FSMONITOR_DAEMON_BACKEND option require a nonexistent line
as opposed to an empty one?

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-16 15:55             ` Johannes Schindelin
  2021-07-16 16:27               ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 16:55               ` Felipe Contreras
  1 sibling, 0 replies; 237+ messages in thread
From: Felipe Contreras @ 2021-07-16 16:55 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler

Johannes Schindelin wrote:
> On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
> > On Tue, Jul 13 2021, Jeff Hostetler wrote:

> > > I prefer self-documenting code.
> >
> > Sure, I'm not saying daemonize() is an ideal name, just suggesting that
> > you can both get uniqueness & self-documentation and not need to split
> > to multiple lines in some common cases to stay within the "We try to
> > keep to at most 80 characters per line" in CodingGuidelines in this
> > series.
> 
> While you are entitled to have your taste, I have to point out that Jeff
> is just as entitled to their taste, and I don't think that you can claim
> that yours is better.
> 
> So I wonder what the intended outcome of this review is? To make the patch
> better? Or to pit taste against taste?

Unless you read minds you can't possibly know what the taste of other
people will be.

So you put forward what *you* think is better, and then find out if
others agree with your taste or not. If it turns out you are the only
one that thinks it's better, so be it.

For what it's worth, I agree with Ævar that daemonize() is better and I
find your statement "I prefer self-documenting code" a) not an argument,
b) not a valid argument if we fill in the dots, and c)
passive-aggressive.

Each one of us can only do one thing: express our opinion. What else can
we do?

Reviewers should not be chastised for expressing their opinion.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-16 15:47         ` Johannes Schindelin
  2021-07-16 16:55           ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 16:59           ` Felipe Contreras
  1 sibling, 0 replies; 237+ messages in thread
From: Felipe Contreras @ 2021-07-16 16:59 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler

Johannes Schindelin wrote:
> On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
> > On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:

> > > @@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
> > >  	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
> > >  	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
> > >  	@echo X=\'$(X)\' >>$@+
> > > +ifdef FSMONITOR_DAEMON_BACKEND
> > > +	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
> > > +endif
> >
> > Why put this in an ifdef?
> 
> Why not? What benefit does this question bring to improving this patch
> series?

This is a common debate tactic known as "shifting the burden of proof".

Ævar does not need to prove that your patch is undesirable, *you* have
to prove that it is desirable.

You have the burden of proof, so you should answer the question.

https://www.logicallyfallacious.com/logicalfallacies/Shifting-of-the-Burden-of-Proof

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 02/34] fsmonitor--daemon: man page
  2021-07-16 15:45             ` Johannes Schindelin
@ 2021-07-16 17:04               ` Felipe Contreras
  0 siblings, 0 replies; 237+ messages in thread
From: Felipe Contreras @ 2021-07-16 17:04 UTC (permalink / raw)
  To: Johannes Schindelin, Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler

Johannes Schindelin wrote:
> On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
> 
> > [snip Ævar's suggestion to populate the manual page incrementally,
> > interspersed with the commits that finalize implementing the respective
> > functionality]
> >
> > I suggested this because I think it's much easier to read patches that
> > are larger because they incrementally update docs or tests along with
> > code, than smaller ones that are e.g. "add docs" followed by
> > incrementally modifying the code.
> 
> My experience is the exact opposite of yours: shorter patches are easier
> to read.

It depends.

> > That's because you can consider those atomically.
> 
> No, in a patch series you cannot consider any patch completely atomically.
> Just like you don't consider any paragraph in any well-written book out of
> context.

But you do not put every sentence in a paragraph.

Sometimes a paragraph can contain a single sentence, or a single word
even. But other times, to properly convey what you are trying to say, you
need a pretty big paragraph.

> This all sounds like we're truly falling into the trap of ignoring the
> rule that the perfect is the enemy of the good.

Have you established that this is good enough?

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-16 16:55           ` Ævar Arnfjörð Bjarmason
@ 2021-07-17  5:13             ` Junio C Hamano
  2021-07-17  5:21               ` Junio C Hamano
  2021-07-17 21:43               ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 237+ messages in thread
From: Junio C Hamano @ 2021-07-17  5:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Jeff Hostetler via GitGitGadget, git,
	Jeff Hostetler, Derrick Stolee, Jeff Hostetler

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>> > +ifdef FSMONITOR_DAEMON_BACKEND
>>> > +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>>> > +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>>> > +endif
>>> > +
>>> >  ifeq ($(TCLTK_PATH),)
>>> >  NO_TCLTK = NoThanks
>>> >  endif
>>> ...
>>>
>>> Why put this in an ifdef?
>>
>> Why not? What benefit does this question bring to improving this patch
>> series?
>
> I think that when adding code to the Makefile it makes sense to follow
> the prevailing pattern, unless there's a good reason to do otherwise,
> e.g. on my build:
> 	
> 	$ grep "''" GIT-BUILD-OPTIONS 
> 	NO_CURL=''
> 	NO_EXPAT=''
> 	NO_PERL=''
> 	NO_PTHREADS=''
> 	NO_PYTHON=''
> 	NO_UNIX_SOCKETS=''
> 	X=''
>
> Why does the FSMONITOR_DAEMON_BACKEND option require a nonexistent line
> as opposed to an empty one?

I do not quite get the question.

#!/bin/sh
cat >make.file <<\EOF
all::
ifeq ($(FSMONITOR_DAEMON_BACKEND),)
	echo it is empty
endif
ifdef FSMONITOR_DAEMON_BACKEND
	echo it is undefined
endif
EOF

echo "unset???"
make -f make.file

echo "set to empty???"
make -f make.file FSMONITOR_DAEMON_BACKEND=

These two make invocations will give us the same result, showing
that "is it set to empty" and "is it unset" are the same.


^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-17  5:13             ` Junio C Hamano
@ 2021-07-17  5:21               ` Junio C Hamano
  2021-07-17 21:43               ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 237+ messages in thread
From: Junio C Hamano @ 2021-07-17  5:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Jeff Hostetler via GitGitGadget, git,
	Jeff Hostetler, Derrick Stolee, Jeff Hostetler

Junio C Hamano <gitster@pobox.com> writes:

> #!/bin/sh
> cat >make.file <<\EOF
> all::
> ifeq ($(FSMONITOR_DAEMON_BACKEND),)
> 	echo it is empty
> endif
> ifdef FSMONITOR_DAEMON_BACKEND

An obvious typo.  This must be "ifndef", of course.

> 	echo it is undefined
> endif
> EOF
>
> echo "unset???"
> make -f make.file
>
> echo "set to empty???"
> make -f make.file FSMONITOR_DAEMON_BACKEND=
>
> These two make invocations will give us the same result, showing
> that "is it set to empty" and "is it unset" are the same.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-16 16:27               ` Ævar Arnfjörð Bjarmason
@ 2021-07-17 12:45                 ` Eric Wong
  2021-07-19 22:35                   ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Eric Wong @ 2021-07-17 12:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Jeff Hostetler,
	Jeff Hostetler via GitGitGadget, git, Derrick Stolee,
	Jeff Hostetler, SZEDER Gábor

Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Fri, Jul 16 2021, Johannes Schindelin wrote:
> > Hi Ævar,
> >
> > On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
> >
> >>
> >> On Tue, Jul 13 2021, Jeff Hostetler wrote:
> >>
> >> > On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote:
> >> >> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> >> >>
> >> >>> From: Jeff Hostetler <jeffhost@microsoft.com>
> >> >>>
> >> >>> Teach the win32 backend to register a watch on the working tree
> >> >>> root directory (recursively).  Also watch the <gitdir> if it is
> >> >>> not inside the working tree.  And to collect path change notifications
> >> >>> into batches and publish.
> >> >>>
> >> >>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> >> >>> ---
> >> >>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 530 +++++++++++++++++++
> >> >> <bikeshed mode> Spying on the early history of this (looking for the
> >> >> Linux backend) I saw that at some point we had just
> >> >> compat/fsmonitor/linux.c, and presumably some of
> >> >> compat/fsmonitor/{windows,win32,macos,darwin}.c.
> >> >> At some point those filenames became much much longer.
> >> >>
> >> >
> >> > Once upon a time having "foo/bar/win32.c" and "abc/def/win32.c"
> >> > would cause confusion in the debugger (I've long since forgotten
> >> > which).  Breaking at win32.c:30 was no longer unique.
> >> >
> >> > Also, if the Makefile sends all .o's to the root directory or a
> >> > unified OBJS directory rather than to the subdir containing the .c,
> >> > then we have another issue during linking...
> >> >
> >> > So, having been burned too many times, I prefer to make source
> >> > filenames unique when possible.
> >>
> >> A much shorter name like compat/fsmonitor/fsmon-win32.c would achieve
> >> that goal.
> >>
> >> >> I've noticed you tend to prefer really long file and function names,
> >> >> e.g. your borrowed daemonize() became
> >> >> spawn_background_fsmonitor_daemon(), I think aiming for shorter
> >> >> filenames & function names helps, e.g. these long names widen diffstats,
> >> >> and many people who hack on the code stick religiously to 80 character
> >> >> width terminals.

At least "daemon"/"daemonize" already implies "background"; so
even if we have the extra function, "spawn_fsmon_daemon()" would
be enough info.

> >> >>
> >> >
> >> > I prefer self-documenting code.
> >>
> >> Sure, I'm not saying daemonize() is an ideal name, just suggesting that
> >> you can both get uniqueness & self-documentation and not need to split
> >> to multiple lines in some common cases to stay within the "We try to
> >> keep to at most 80 characters per line" in CodingGuidelines in this
> >> series.
> >
> > While you are entitled to have your taste, I have to point out that Jeff
> > is just as entitled to their taste, and I don't think that you can claim
> > that yours is better.
> >
> > So I wonder what the intended outcome of this review is? To make the patch
> > better? Or to pit taste against taste?
> 
> Neither, to address a misunderstanding.
> 
> Sure, if a reviewer points out "maybe change X to Y" and the reply is "I
> like X better than Y", fair enough.
> 
> My reading of Jeff H.'s reply upthread was that he'd mistaken my
> suggestion of a Y for a Z.
> 
> I.e. that shortening a name like fsmonitor-fs-listen-win32.c (X)
> necessarily had to mean that we'd have a win32.c (Z), negatively
> impacting some debugging workflows, as opposed to just a
> shorter-but-unique name like fsmon-win32.c (Y).

Short-as-possible-while-being-meaningful is a pretty important
usability thing in git.  There's a good reason git supports OID
prefix abbreviations, after all.

Not my area of expertise, but AFAIK git's rename detection is
affected by basename; and I've encountered debugger confusion
with non-unique basenames while debugging other codebases.

My brain works like a naive "strcmp"/"memcmp": long common
prefixes slow down my ability to differentiate filenames.

Having lots of common terms/prefixes on the screen works like
camouflage to me and slows down my ability to process things.
I suppose my eyes and cognitive abilities are below average;
and even worse due to the pandemic numbing my brain.

> Ditto for daemonize() (X/Z) and spawn_background_fsmonitor_daemon() (X).

(what I said above)

> I'm certain that with this reply we're thoroughly into the "respectfully
> disagree" territory as opposed to having a misunderstanding.
> 
> I also take, and agree with, your implied point that there's no point in having
> a yes/no/yes/no/yes argument on-list, and I did not mean to engage in
> such a thing, only to clear up the misunderstanding, if any.
> 
> I'll only say that I don't think that something like long variable/file
> etc. names is *just* a matter of taste, seeing as we have a fairly
> strict "keep to at most 80 characters per line" as the 2nd item in the C
> coding style (after "use tabs, not spaces").
> 
> That matter of taste for one developer objectively makes it harder to
> stay within the bounds of the coding style for future maintenance.
> 
> We do have active contributors that I understand actually use terminals
> of that size to work on this project (CC'd, but maybe I misrecall that
> for one/both). I'm not one of those people, but I do find that
> maintaining code with needlessly long identifiers in this codebase is
> painful.

Thanks for Cc-ing me.

Yes, I'm one of those developers.  Accessibility matters to me:
my eyesight certainly isn't getting better with age (nor do I
expect anyone else's to).  I need giant fonts to reduce eye and
neck strain.

Fwiw, newspaper publishers figured out line width
decades/centuries ago and wrap lines despite having large sheets
to work on.


I mostly work over mosh or ssh to reduce noise and heat locally.
There's no bandwidth for VNC or similar, and graphical stuff
tends to be unstable UI-wise anyways so I stick to the terminal.

Taste does have much to do with it: I favor stable, reliable
tools (e.g. POSIX, Perl5, git) that work well on both old and
new hardware.  I avoid mainstream "desktop" software since they
tend to have unstable UIs which break users' workflows while
requiring more powerful HW.

Complex graphics drivers tend to get unreliable, too, especially
when one is stuck with old HW that gets limited support from
vendors.  It's also difficult to fix complex drivers as a
hobbyist given the one-off HW/vendor-specific knowledge
required.

So we shouldn't expect a developer with old HW to have more
than a standard text terminal.  This is an accessibility problem
for developers lacking in finances.

This is also a problem for developers wishing to avoid backdoors+bugs
found in modern systems (IntelME, AMD-PSP, endless stream of CPU
bugs).


Back to health-related accessibility; I've also had joint
problems for many years, so shorter identifiers help reduce the
typing I need to do.  I mostly had that under control
pre-pandemic, but it's been a huge struggle to find adequate
replacements for activities I used to rely on to manage the
pain.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-17  5:13             ` Junio C Hamano
  2021-07-17  5:21               ` Junio C Hamano
@ 2021-07-17 21:43               ` Ævar Arnfjörð Bjarmason
  2021-07-19 19:58                 ` Junio C Hamano
  1 sibling, 1 reply; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-17 21:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Jeff Hostetler via GitGitGadget, git,
	Jeff Hostetler, Derrick Stolee, Jeff Hostetler


On Fri, Jul 16 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>>> > +ifdef FSMONITOR_DAEMON_BACKEND
>>>> > +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>>>> > +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>>>> > +endif
>>>> > +
>>>> >  ifeq ($(TCLTK_PATH),)
>>>> >  NO_TCLTK = NoThanks
>>>> >  endif
>>>> ...
>>>>
>>>> Why put this in an ifdef?
>>>
>>> Why not? What benefit does this question bring to improving this patch
>>> series?
>>
>> I think that when adding code to the Makefile it makes sense to follow
>> the prevailing pattern, unless there's a good reason to do otherwise,
>> e.g. on my build:
>> 	
>> 	$ grep "''" GIT-BUILD-OPTIONS 
>> 	NO_CURL=''
>> 	NO_EXPAT=''
>> 	NO_PERL=''
>> 	NO_PTHREADS=''
>> 	NO_PYTHON=''
>> 	NO_UNIX_SOCKETS=''
>> 	X=''
>>
>> Why does the FSMONITOR_DAEMON_BACKEND option require a nonexistent line
>> as opposed to an empty one?
>
> I do not quite get the question.
>
> #!/bin/sh
> cat >make.file <<\EOF
> all::
> ifeq ($(FSMONITOR_DAEMON_BACKEND),)
> 	echo it is empty
> endif
> ifndef FSMONITOR_DAEMON_BACKEND
> 	echo it is undefined
> endif
> EOF
>
> echo "unset???"
> make -f make.file
>
> echo "set to empty???"
> make -f make.file FSMONITOR_DAEMON_BACKEND=
>
> These two make invocations will give us the same result, showing
> that "is it set to empty" and "is it unset" are the same.

Indeed, which is why I'm pointing out that wrapping it in an ifdef is
pointless, which is why we don't do it for the other ones.

We do have a bunch of ifdef'd things there for perf etc., I'm not sure
if it matters or not for those.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-01 22:45       ` Ævar Arnfjörð Bjarmason
  2021-07-16 15:47         ` Johannes Schindelin
@ 2021-07-19 16:54         ` Jeff Hostetler
  2021-07-20 20:32           ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-19 16:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 6:45 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Stub in empty backend for fsmonitor--daemon on Windows.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>   Makefile                                     | 13 ++++++
>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
>>   compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
>>   config.mak.uname                             |  2 +
>>   contrib/buildsystems/CMakeLists.txt          |  5 ++
>>   5 files changed, 90 insertions(+)
>>   create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
>>   create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
>>
>> diff --git a/Makefile b/Makefile
>> index c45caacf2c3..a2a6e1f20f6 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -467,6 +467,11 @@ all::
>>   # directory, and the JSON compilation database 'compile_commands.json' will be
>>   # created at the root of the repository.
>>   #
>> +# If your platform supports a built-in fsmonitor backend, set
>> +# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
>> +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
>> +# `fsmonitor_fs_listen__*()` routines.
>> +#
>>   # Define DEVELOPER to enable more compiler warnings. Compiler version
>>   # and family are auto detected, but could be overridden by defining
>>   # COMPILER_FEATURES (see config.mak.dev). You can still set
>> @@ -1929,6 +1934,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
>>   	COMPAT_OBJS += compat/access.o
>>   endif
>>   
>> +ifdef FSMONITOR_DAEMON_BACKEND
>> +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>> +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>> +endif
>> +
>>   ifeq ($(TCLTK_PATH),)
>>   NO_TCLTK = NoThanks
>>   endif
>> @@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
>>   	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>>   	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
>>   	@echo X=\'$(X)\' >>$@+
>> +ifdef FSMONITOR_DAEMON_BACKEND
>> +	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
>> +endif
> 
> Why put this in an ifdef?
> 
> In 342e9ef2d9e (Introduce a performance testing framework, 2012-02-17)
> we started doing that for some perf/test options (which b.t.w., I don't
> really see the reason for, maybe it's some subtlety in how test-lib.sh
> picks those up).
> 
> But for all the other compile-time stuff we don't ifdef it, we just
> define it, and then you get an empty value or not.
> 
> This would AFAICT be the first build-time-for-the-C-program option we
> ifdef for writing a line to GIT-BUILD-OPTIONS.
> 

(I'm going to respond here on the original question rather than on any
of the follow up responses in an attempt at defusing things a bit.)

I added the ifdef because I thought it to be the *most conservative*
thing that I could do.  The output of the generated file on unsupported
platforms should be *identical* to what it was before my changes.  I
only alter the contents of the generated file on supported platforms.

Later, when the generated file is consumed, we don't need to worry about
the effect (if any) on incremental compiles -- we will know that it
won't be set -- just like it was not set in the original compile.

That change appears right before 12 other ifdef'd symbols also being
written to that generated file.  Most are test and perf, but some are
not.  But my point is that the pattern is present already.

The original question also references a 9.5 year old commit which
uses the same pattern as I've used here.  It also muddies the water
on why it was/wasn't needed back then.  And hints at possible
side-effects in some of our test scripts.  So it is clear that the
confusion/disagreements that we are having with the current patch
and whether or not to ifdef are not new.


So, is there value in being explicit and having the ifdef ??


There are well defined Make rules (and Junio gave us a very elegant
little script to demonstrate that), but the subtleties are there.
Especially with our use of generated files like `GIT-BUILD-OPTIONS`.
We have a mailing list full of experts and yet this question received
a lot more discussion than I thought possible or necessary, but it
took a test script to demonstrate that the results are the same and it
doesn't matter.  Perhaps the clarity is worth it for the price of a
simple ifdef.


So, how much time have we (collectively) wasted discussing this
subtlety ??


To summarize, I added the ifdef to make it explicitly clear that
I'm not altering behavior on unsupported platforms.  I can remove it
from V4 if desired or I can keep it.  (We all now know that it doesn't
functionally matter -- it does however, provide clarity.)


Sorry if this sounded like a rant,
Jeff

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-17 21:43               ` Ævar Arnfjörð Bjarmason
@ 2021-07-19 19:58                 ` Junio C Hamano
  0 siblings, 0 replies; 237+ messages in thread
From: Junio C Hamano @ 2021-07-19 19:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Jeff Hostetler via GitGitGadget, git,
	Jeff Hostetler, Derrick Stolee, Jeff Hostetler

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>>>>> Why put this in an ifdef?
>>>> ...
>>> Why does the FSMONITOR_DAEMON_BACKEND option require a nonexistent line
>>> as opposed to an empty one?
>>
>> I do not quite get the question.
>>
>> #!/bin/sh
>> cat >make.file <<\EOF
>> all::
>> ifeq ($(FSMONITOR_DAEMON_BACKEND),)
>> 	echo it is empty
>> endif
>> ifndef FSMONITOR_DAEMON_BACKEND
>> 	echo it is undefined
>> endif
>> EOF
>>
>> echo "unset???"
>> make -f make.file
>>
>> echo "set to empty???"
>> make -f make.file FSMONITOR_DAEMON_BACKEND=
>>
>> These two make invocations will give us the same result, showing
>> that "is it set to empty" and "is it unset" are the same.
>
> Indeed, which is why I'm pointing out that wrapping it in an ifdef is
> pointless, which is why we don't do it for the other ones.
>
> We do have a bunch of ifdef'd things there for perf etc., I'm not sure
> if it matters or not for those.

Sorry, but I still do not get the question.  There are a bunch of
ifndef in Makefile in addition to ifeq/ifneq and your question

    FSMONITOR_DAEMON_BACKEND option require a nonexistent line as
    opposed to an empty one?

is asking "why is it X" when X is not quite true.  I presume that
your "wrapping it in an ifdef" refers to a construct like this:

>>> > +ifdef FSMONITOR_DAEMON_BACKEND
>>> > +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>>> > +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>>> > +endif

but is your suggestion that it should be written like this instead?

>>> > +ifneq ($(FSMONITOR_DAEMON_BACKEND),)
>>> > +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>>> > +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>>> > +endif

I do not think the latter is any easier to follow (and we have many
ifdef and ifndef in our Makefile already).  Perhaps I will see what
you mean when I see your "better alternative", but so far, I am not
successfully guessing what it is.


^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 06/34] fsmonitor: config settings are repository-specific
  2021-07-01 16:46       ` Ævar Arnfjörð Bjarmason
@ 2021-07-19 20:36         ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-19 20:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 12:46 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
> In a reference to a discussion[1] about an earlier version of this patch
> you said:
> 
>      I'm going to ignore all of the thread responses to this patch
>      dealing with how we acquire config settings and macros and etc.
>      Those issues are completely independent of FSMonitor (which is
>      already way too big).
> 
> Since then the changes to repo-settings.c have become a lot larger, so
> let's take a look...
> 
> 1. https://lore.kernel.org/git/87mttkyrqq.fsf@evledraar.gmail.com/
> 2. https://lore.kernel.org/git/4552971c-0a23-c19a-6a23-cb5737e43b2a@jeffhostetler.com/

Yes, there was a large conversation about re-doing how config
values are acquired or whatever and it was nested inside of
the context of a completely unrelated topic.  It still is.

I would like to focus on FSMonitor using the existing config API.
I'm only looking up ~2 config values.  And that code is fairly
minor considering everything else in this patch series.

Yes, there's a bizarre initialization with memset(-1), but it
can wait.

Later (in a clean context) we can address and focus on the config
API and/or structure initialization that you're talking about here.

FWIW, in V4 I'll refactor the block of code that I added into the
body of prepare_repo_settings() to address your later concerns.
This version will hopefully be more readable.

> 
> 
>> diff --git a/repo-settings.c b/repo-settings.c
>> index 0cfe8b787db..faf197ff60a 100644
>> --- a/repo-settings.c
>> +++ b/repo-settings.c
>> @@ -5,10 +5,42 @@
>>   
>>   #define UPDATE_DEFAULT_BOOL(s,v) do { if (s == -1) { s = v; } } while(0)
>>   
>> +/*
>> + * Return 1 if the repo/workdir is incompatible with FSMonitor.
>> + */
>> +static int is_repo_incompatible_with_fsmonitor(struct repository *r)
>> +{
>> +	const char *const_strval;
>> +
>> +	/*
>> +	 * Bare repositories don't have a working directory and
>> +	 * therefore, nothing to watch.
>> +	 */
>> +	if (!r->worktree)
>> +		return 1;
> 
> Looking ahead in this series you end up using
> FSMONITOR_MODE_INCOMPATIBLE in two places in the codebase. In
> builtin/update-index.c to throw a "repository is incompatible with
> fsmonitor" error.
> 
> Can't that case just be replaced with setup_work_tree()? Other sub-modes
> of update-index already die implicitly on that, e.g.:
> 
> 	$ git update-index test
> 	fatal: this operation must be run in a work tree

I will refactor that static function in V4, but I want it to return
an indication of whether the repo is compatible only and let the
command print the error/die as is appropriate for the daemon and/or
update-index.  The daemon should not start on an incompatible repo.
Likewise, update-index should not enable the extension in the index.

We share some of that code with the client side code, like status,
which wants to talk to the hook or daemon if supported/present/allowed.
If the repo is incompatible, then status should just behave in the
classic manner.  So I don't want the detection code to print those
error messages or die.
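
A rough sketch of that split, with detection returning a reason code
and each caller deciding what to do with it.  All of the names here
(`fsm_reason`, `repo_info`, `fsm_check_compat()`, `fsm_reason_message()`)
are made up for illustration and are not necessarily what V4 will use:

```c
#include <stddef.h>

/* Hypothetical reason codes; 0 means "compatible". */
enum fsm_reason {
	FSM_REASON_OK = 0,
	FSM_REASON_BARE,
	FSM_REASON_VFS4GIT
};

/* Hypothetical stand-in for the parts of `struct repository` we inspect. */
struct repo_info {
	const char *worktree;          /* NULL for a bare repo */
	const char *virtualfilesystem; /* NULL if core.virtualfilesystem is unset */
};

/* Detection only: never prints and never dies. */
static enum fsm_reason fsm_check_compat(const struct repo_info *r)
{
	if (!r->worktree)
		return FSM_REASON_BARE;
	if (r->virtualfilesystem)
		return FSM_REASON_VFS4GIT;
	return FSM_REASON_OK;
}

/* Callers that want to complain can map the reason to a message. */
static const char *fsm_reason_message(enum fsm_reason reason)
{
	switch (reason) {
	case FSM_REASON_BARE:
		return "bare repository has no working directory to watch";
	case FSM_REASON_VFS4GIT:
		return "core.virtualfilesystem is set";
	default:
		return NULL;
	}
}
```

With something like this, the daemon and update-index would report the
message and refuse to proceed, while status would simply take the
classic code path whenever the reason is not FSM_REASON_OK.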

> 
> The other case is:
> 	
> 	+       prepare_repo_settings(the_repository);
> 	+       if (!the_repository->worktree)
> 	+               return error(_("fsmonitor-daemon does not support bare repos '%s'"),
> 	+                            xgetcwd());
> 	+       if (the_repository->settings.fsmonitor_mode == FSMONITOR_MODE_INCOMPATIBLE)
> 	+               return error(_("fsmonitor-daemon is incompatible with this repo '%s'"),
> 	+                            the_repository->worktree);
> 
> I.e. we just checked the_repository->worktree, but it's not that, but....

Yes, I currently have 2 types of repos for which I want to disable
both the hook and daemon versions of FSMonitor.  I'll update the
error messages to specify the reason why we are incompatible.

> 
>> +
>> +	/*
>> +	 * GVFS (aka VFS for Git) is incompatible with FSMonitor.
>> +	 *
>> +	 * Granted, core Git does not know anything about GVFS and
>> +	 * we shouldn't make assumptions about a downstream feature,
>> +	 * but users can install both versions.  And this can lead
>> +	 * to incorrect results from core Git commands.  So, without
>> +	 * bringing in any of the GVFS code, do a simple config test
>> +	 * for a published config setting.  (We do not look at the
>> +	 * various *_TEST_* environment variables.)
>> +	 */
>> +	if (!repo_config_get_value(r, "core.virtualfilesystem", &const_strval))
>> +		return 1;
> 
> I'm skeptical of us hardcoding a third-party software config
> variable. Can't GitVFS handle this somehow on its end?

Adding the test for a GVFS-specific config setting is questionable.
And perhaps we should move it to our downstream fork.

The value in putting it here for now (at least) is that it makes
clear the structure for supporting other types of incompatible
repos.

For example, perhaps we want to disallow repos that are on remote
file systems (since we might not be able to get FS events).  It
would be nice to be able to prevent the daemon from starting and/or
from status from trying to connect to a daemon that will be a
disappointment.  The code I have here serves as a model for adding
such additional restrictions.  And keeps us from trying to prematurely
collapse the code into a simple expression.

> 
> But just in terms of implementation it seems the end result of that is
> to emit a very confusing error to the user. Since we already checked for
> bare repos we run into this, and when we should really say "hey, maybe
> disable your core.virtualFileSystem setting", we instead say
> "your repo is incompatible".
> 

I'll update the error messages to make that clear.

>> +
>> +	return 0;
>> +}
>> +
>>   void prepare_repo_settings(struct repository *r)
>>   {
>>   	int value;
>>   	char *strval;
>> +	const char *const_strval;
> 
> Can be declared in the "else" below.
> 
>>   
>>   	if (r->settings.initialized)
>>   		return;
>> @@ -26,6 +58,22 @@ void prepare_repo_settings(struct repository *r)
>>   	UPDATE_DEFAULT_BOOL(r->settings.commit_graph_read_changed_paths, 1);
>>   	UPDATE_DEFAULT_BOOL(r->settings.gc_write_commit_graph, 1);
>>   
>> +	r->settings.fsmonitor_hook_path = NULL;
>> +	r->settings.fsmonitor_mode = FSMONITOR_MODE_DISABLED;
> 
> With the memset earlier (b.t.w. I've got a patch to fix all this bizarre
> behavior in repo-settings.c, but have been waiting on this series) we
> implicitly set it to FSMONITOR_MODE_UNSET (-1) with the memset, but then
> never use that ever.

I'm working around the bogus -1 value that the structure has
been initialized with and that I'm inheriting.  I'll completely
set my `fsmonitor_mode` variable so that I don't care what it
is initialized to.

I created the _UNSET value as a reminder that there is this memset(-1)
there and that must be attended to.

In my V4 I'll add comments to that effect.

> 
> Your code in update-index.c then for a check against
> "FSMONITOR_MODE_DISABLED" says "core.useBuiltinFSMonitor is unset;".
> 
>> +	if (is_repo_incompatible_with_fsmonitor(r))
>> +		r->settings.fsmonitor_mode = FSMONITOR_MODE_INCOMPATIBLE;
> 
> Style: should have {} braces on all arms.
> 
>> +	else if (!repo_config_get_bool(r, "core.usebuiltinfsmonitor", &value)
>> +		   && value)
>> +		r->settings.fsmonitor_mode = FSMONITOR_MODE_IPC;
> 
> Here you're conflating false with whether the variable is set at all. I
> guess that works out here since if it's false we want to fall through
> to...
> 
>> +	else {
> 
> ...ignoring it and looking at core.fsmonitor instead.

yes, if core.useBuiltinFSMonitor is true, we do not need to look
at the hook pathname.

> 
>> +		if (repo_config_get_pathname(r, "core.fsmonitor", &const_strval))
>> +			const_strval = getenv("GIT_TEST_FSMONITOR");
> 
> If it's not set we pay attention to GIT_TEST_FSMONITOR, so the behavior
> from the old git_config_get_fsmonitor(). So even if the env variable is
> set we want to take the config variable over it, correct?

GIT_TEST_FSMONITOR sets the hook path for testing and we're not using
the hook API at all.  So it is kind of ill-defined what that test env
var should do if you have IPC turned on, so I'm ignoring it.
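
To spell out the precedence in the hunk above, here is a small
standalone sketch.  It does not use the real config API; the struct,
the enum values and `choose_mode()` are hypothetical names, and the
inputs are assumed to have been looked up already:

```c
#include <stddef.h>

/* Hypothetical mode names, not the ones in the series. */
enum mode_sketch { MODE_DISABLED, MODE_INCOMPATIBLE, MODE_IPC, MODE_HOOK };

/* Hypothetical, already-looked-up inputs. */
struct fsm_inputs {
	int incompatible;           /* repo cannot use FSMonitor at all */
	int builtin_is_set;         /* core.useBuiltinFSMonitor present? */
	int builtin_value;          /* its boolean value, if present */
	const char *core_fsmonitor; /* core.fsmonitor, or NULL if unset */
	const char *test_env;       /* GIT_TEST_FSMONITOR, or NULL if unset */
};

static enum mode_sketch choose_mode(const struct fsm_inputs *in,
				    const char **hook_path)
{
	const char *path;

	*hook_path = NULL;
	if (in->incompatible)
		return MODE_INCOMPATIBLE;

	/* An unset and a false core.useBuiltinFSMonitor both fall through. */
	if (in->builtin_is_set && in->builtin_value)
		return MODE_IPC;

	/* The config value wins; the env var is only a fallback. */
	path = in->core_fsmonitor ? in->core_fsmonitor : in->test_env;
	if (path && *path) {
		*hook_path = path; /* borrowed; a real caller would copy it */
		return MODE_HOOK;
	}
	return MODE_DISABLED;
}
```

Note that an empty core.fsmonitor value still takes precedence over
the environment variable here and results in the disabled mode, which
is how I read the hunk.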

> 
>> +		if (const_strval && *const_strval) {
>> +			r->settings.fsmonitor_hook_path = strdup(const_strval);
> 
> We had a strbuf_detach()'d string in the case of
> repo_config_get_pathname(), but here we strdup() it again in case we
> were in the getenv() codepath. This code probably leaks memory now
> anyway, but perhaps it's better to split up the two so we make it easier
> to deal with who owns/frees what in the future.
> 

repo_config_get_pathname() hands back its result through a const char **
argument, which implies that it is not a detached buffer.  It is a deep
trek thru the config code, but it eventually gets to git_configset_get_*(),
which is documented as returning a pointer into a configset cache that
the caller should not free.  So dup'ing it is appropriate.

Similarly, getenv() is also returning a pointer to a buffer that the
caller does not own.  So dup'ing it here is also appropriate.
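
As a minimal illustration of the ownership rule (not code from the
series): whichever source the borrowed pointer came from, the settings
struct keeps its own heap copy.  `dup_borrowed()` is a hypothetical
helper that does what strdup() does, written with malloc()/memcpy()
only so that the sketch stays within standard C:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap copy of a borrowed string, or NULL for a NULL/empty
 * input or on allocation failure.  The caller owns the result and
 * must free() it; the borrowed input is left untouched.
 */
static char *dup_borrowed(const char *borrowed)
{
	size_t len;
	char *copy;

	if (!borrowed || !*borrowed)
		return NULL;
	len = strlen(borrowed) + 1; /* include the trailing NUL */
	copy = malloc(len);
	if (copy)
		memcpy(copy, borrowed, len);
	return copy;
}
```

The copy stays valid even if the config cache is cleared or the
environment changes later, which is the point of dup'ing in both
code paths.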

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon
  2021-07-01 22:36       ` Ævar Arnfjörð Bjarmason
@ 2021-07-19 20:56         ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-19 20:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 6:36 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
> A general comment on this series (including previous patches). We've
> usually tried to bend over backwards in git's codebase not to have big
> ifdef blocks, so we compile most code the same everywhere. We waste a
> bit of object code, but that's fine.
> 
> See 9c897c5c2ad (pack-objects: remove #ifdef NO_PTHREADS, 2018-11-03)
> for a good example of bad code being turned to good.
> 
> E.g. in this case:
> 
>> +#ifdef HAVE_FSMONITOR_DAEMON_BACKEND
>> +
>> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
>> +{
>> +	const char *subcmd;
>> +
>> +	struct option options[] = {
>> +		OPT_END()
>> +	};
>> +
>> +	if (argc < 2)
>> +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
>> +
>> +	if (argc == 2 && !strcmp(argv[1], "-h"))
>> +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
>> +
>> +	git_config(git_default_config, NULL);
>> +
>> +	subcmd = argv[1];
>> +	argv--;
>> +	argc++;
>> +
>> +	argc = parse_options(argc, argv, prefix, options,
>> +			     builtin_fsmonitor__daemon_usage, 0);
>> +
>> +	die(_("Unhandled subcommand '%s'"), subcmd);
>> +}
>> +
>> +#else
>> +int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix)
>> +{
>> +	struct option options[] = {
>> +		OPT_END()
>> +	};
>> +
>> +	if (argc == 2 && !strcmp(argv[1], "-h"))
>> +		usage_with_options(builtin_fsmonitor__daemon_usage, options);
>> +
>> +	die(_("fsmonitor--daemon not supported on this platform"));
>> +}
>> +#endif
> 
> This whole thing could really just be a
> -DHAVE_FSMONITOR_DAEMON_BACKEND=1 or -DHAVE_FSMONITOR_DAEMON_BACKEND=0
> somewhere (depending), and then somewhere in the middle of the first
> function:
> 
> 	if (!HAVE_FSMONITOR_DAEMON_BACKEND)
> 	    	die(_("fsmonitor--daemon not supported on this platform"));
> 

This whole file will be filled up with ~1500 lines of static functions
that only make sense when the daemon is supported and that make calls
to platform-specific backends.

I suppose we could stub in an empty backend (something like that in
11/34 and 12/34) and hack in all stuff in the makefile to link to it
in the unsupported case, but that seems like a lot of effort just to
avoid an ifdef here.

I mean, the intent of the #else block is quite clear and we're not
fooling the reader with a large source file of code that will never
be used on their platform.

We could consider splitting this source file into a supported and
unsupported version and have the makefile select the right .c file.
We'd have to move the usage and stuff to a shared header and etc.
That would eliminate the ifdef, but it would break the convention of
the source filename matching the command name.

I'm not sure it's worth the bother TBH.
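
For reference, the constant-instead-of-ifdef shape being suggested
would look roughly like the sketch below.  `HAVE_BACKEND_SKETCH` and
`daemon_precheck()` are hypothetical names, and the sketch assumes the
build passes -DHAVE_BACKEND_SKETCH=1 on supported platforms:

```c
#include <stddef.h>

/* Assumption: defined to 1 by the build on supported platforms. */
#ifndef HAVE_BACKEND_SKETCH
#define HAVE_BACKEND_SKETCH 0
#endif

/*
 * Return NULL if the daemon may run here, or an error message if it
 * may not.  Both branches are compiled on every platform; the
 * compiler is free to drop the dead one.
 */
static const char *daemon_precheck(void)
{
	if (!HAVE_BACKEND_SKETCH)
		return "fsmonitor--daemon not supported on this platform";
	return NULL;
}
```

The catch is the one described above: everything the supported branch
calls must still compile and link on unsupported platforms, which is
where the stub backend would come in.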

Jeff

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
  2021-07-17 12:45                 ` Eric Wong
@ 2021-07-19 22:35                   ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-19 22:35 UTC (permalink / raw)
  To: Eric Wong, Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, Jeff Hostetler via GitGitGadget, git,
	Derrick Stolee, Jeff Hostetler, SZEDER Gábor



On 7/17/21 8:45 AM, Eric Wong wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> On Fri, Jul 16 2021, Johannes Schindelin wrote:
>>> Hi Ævar,
>>>
>>> On Tue, 13 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>>
>>>> On Tue, Jul 13 2021, Jeff Hostetler wrote:
>>>>
>>>>> On 7/1/21 7:02 PM, Ævar Arnfjörð Bjarmason wrote:
>>>>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>>>>
>>>>>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>>>>>
...

Eric, welcome to the conversation and thanks for sharing your concerns.


For my upcoming V4 I've shortened the filenames of the various backends,
renamed the -macos one to -darwin, and shortened the names of the
fsm-listener API, and the names of those static functions associated
with starting the daemon in the background.

I think this covers all of the issues raised across several
patches in the series.

Jeff

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch"
  2021-07-13 19:05             ` Jeff Hostetler
@ 2021-07-20 19:18               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-20 19:18 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Tue, Jul 13 2021, Jeff Hostetler wrote:

> On 7/13/21 2:18 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, Jul 13 2021, Jeff Hostetler wrote:
>> 
>>> On 7/1/21 7:09 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>>
>>>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>>>
>>>>> Change p7519 to use a single "test-tool touch" command to update
>>>>> the mtime on a series of (thousands) files instead of invoking
>>>>> thousands of commands to update a single file.
>>>>>
>>>>> This is primarily for Windows where process creation is so
>>>>> very slow and reduces the test run time by minutes.
>>>>>
>>>>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>>>>> ---
>>>>>    t/perf/p7519-fsmonitor.sh | 14 ++++++--------
>>>>>    1 file changed, 6 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>>>>> index 5eb5044a103..f74e6014a0a 100755
>>>>> --- a/t/perf/p7519-fsmonitor.sh
>>>>> +++ b/t/perf/p7519-fsmonitor.sh
>>>>> @@ -119,10 +119,11 @@ test_expect_success "one time repo setup" '
>>>>>    	fi &&
>>>>>      	mkdir 1_file 10_files 100_files 1000_files 10000_files &&
>>>>> -	for i in $(test_seq 1 10); do touch 10_files/$i; done &&
>>>>> -	for i in $(test_seq 1 100); do touch 100_files/$i; done &&
>>>>> -	for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
>>>>> -	for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
>>>>> +	test-tool touch sequence --pattern="10_files/%d" --start=1 --count=10 &&
>>>>> +	test-tool touch sequence --pattern="100_files/%d" --start=1 --count=100 &&
>>>>> +	test-tool touch sequence --pattern="1000_files/%d" --start=1 --count=1000 &&
>>>>> +	test-tool touch sequence --pattern="10000_files/%d" --start=1 --count=10000 &&
>>>>> +
>>>>>    	git add 1_file 10_files 100_files 1000_files 10000_files &&
>>>>>    	git commit -qm "Add files" &&
>>>>>    @@ -200,15 +201,12 @@ test_fsmonitor_suite() {
>>>>>    	# Update the mtimes on upto 100k files to make status think
>>>>>    	# that they are dirty.  For simplicity, omit any files with
>>>>>    	# LFs (i.e. anything that ls-files thinks it needs to dquote).
>>>>> -	# Then fully backslash-quote the paths to capture any
>>>>> -	# whitespace so that they pass thru xargs properly.
>>>>>    	#
>>>>>    	test_perf_w_drop_caches "status (dirty) ($DESC)" '
>>>>>    		git ls-files | \
>>>>>    			head -100000 | \
>>>>>    			grep -v \" | \
>>>>> -			sed '\''s/\(.\)/\\\1/g'\'' | \
>>>>> -			xargs test-tool chmtime -300 &&
>>>>> +			test-tool touch stdin &&
>>>>>    		git status
>>>>>    	'
>>>> Did you try to replace this with some variant of:
>>>>       test_seq 1 10000 | xargs touch
>>>> Which (depending on your xargs version) would invoke "touch"
>>>> commands
>>>> with however many argv items it thinks you can handle.
>>>>
>>>
>>> a quick test on my Windows machine shows that
>>>
>>> 	test_seq 1 10000 | xargs touch
>>>
>>> takes 3.1 seconds.
>>>
>>> just a simple
>>>
>>> 	test_seq 1 10000 >/dev/null
>>>
>>> takes 0.2 seconds.
>>>
>>> using my test-tool helper cuts that time in half.
>> There's what Elijah mentioned about test_seq, so maybe it's just
>> that.
>> But what I was suggesting was using the xargs mode where it does N
>> arguments at a time.
>> Does this work for you, and does it cause xargs to invoke "touch"
>> with
>> the relevant N number of arguments, and does it help with the
>> performance?
>>      test_seq 1 10000 | xargs touch
>>      test_seq 1 10000 | xargs -n 10 touch
>>      test_seq 1 10000 | xargs -n 100 touch
>>      test_seq 1 10000 | xargs -n 1000 touch
>
> The GFW SDK version of xargs does have `-n N` and it does work as
> advertised.  And it does slow down things considerably.  Letting it
> do ~2500 per command in 4 commands took the 3.1 seconds listed above.
>
> Add a -n 100 to it takes 5.7 seconds, so process creation overhead
> is a factor here.

Doesn't -n 2500 being faster than -n 100 suggest the opposite of process
overhead being the deciding factor? With -n 2500 you'll invoke 4 touch
processes, so each takes 3.1/4 =~ 0.8s to run, whereas with -n 100 you
invoke 100 of them, so if the overall time is then 5.7 seconds that's
5.7/100 =~ 0.06s each.

Or am I misunderstanding you, or does some implicit parallelism kick in
with that version of xargs depending on -n?

>> etc.
>> Also I didn't notice this before, but the -300 part of "chmtime
>> -300"
>> was redundant before then? I.e. you're implicitly changing it to "=+0"
>> instead with your "touch" helper, are you not?
>> 
>
> Right. I'm changing it to the current time.

If that "while we're at it change the behavior of the test" is wanted I
think it should be called out in the commit message. Right now it looks
like it might be an unintentional regression in the test.

^ permalink raw reply	[flat|nested] 237+ messages in thread

* Re: [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command
  2021-07-13 18:44             ` Jeff Hostetler
@ 2021-07-20 19:38               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-20 19:38 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Tue, Jul 13 2021, Jeff Hostetler wrote:

> On 7/13/21 1:54 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, Jul 13 2021, Jeff Hostetler wrote:
>> 
>>> My response here is in addition to Dscho's remarks on this topic.
>>> He makes excellent points that I'll just #include here.  I do want
>>> to add my own $0.02 here.
>>>
>>> On 7/1/21 6:18 PM, Ævar Arnfjörð Bjarmason wrote:
>>>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>>>>
>
>>>>> +/*
>>>>> + * This is adapted from `wait_or_whine()`.  Watch the child process and
>>>>> + * let it get started and begin listening for requests on the socket
>>>>> + * before reporting our success.
>>>>> + */
>>>>> +static int wait_for_background_startup(pid_t pid_child)
>>>>> +{
>>>>> +	int status;
>>>>> +	pid_t pid_seen;
>>>>> +	enum ipc_active_state s;
>>>>> +	time_t time_limit, now;
>>>>> +
>>>>> +	time(&time_limit);
>>>>> +	time_limit += fsmonitor__start_timeout_sec;
>>>>> +
>>>>> +	for (;;) {
>>>>> +		pid_seen = waitpid(pid_child, &status, WNOHANG);
>>>>> +
>>>>> +		if (pid_seen == -1)
>>>>> +			return error_errno(_("waitpid failed"));
>>>>> +		else if (pid_seen == 0) {
>>>>> +			/*
>>>>> +			 * The child is still running (this should be
>>>>> +			 * the normal case).  Try to connect to it on
>>>>> +			 * the socket and see if it is ready for
>>>>> +			 * business.
>>>>> +			 *
>>>>> +			 * If there is another daemon already running,
>>>>> +			 * our child will fail to start (possibly
>>>>> +			 * after a timeout on the lock), but we don't
>>>>> +			 * care (who responds) if the socket is live.
>>>>> +			 */
>>>>> +			s = fsmonitor_ipc__get_state();
>>>>> +			if (s == IPC_STATE__LISTENING)
>>>>> +				return 0;
>>>>> +
>>>>> +			time(&now);
>>>>> +			if (now > time_limit)
>>>>> +				return error(_("fsmonitor--daemon not online yet"));
>>>>> +		} else if (pid_seen == pid_child) {
>>>>> +			/*
>>>>> +			 * The new child daemon process shutdown while
>>>>> +			 * it was starting up, so it is not listening
>>>>> +			 * on the socket.
>>>>> +			 *
>>>>> +			 * Try to ping the socket in the odd chance
>>>>> +			 * that another daemon started (or was already
>>>>> +			 * running) while our child was starting.
>>>>> +			 *
>>>>> +			 * Again, we don't care who services the socket.
>>>>> +			 */
>>>>> +			s = fsmonitor_ipc__get_state();
>>>>> +			if (s == IPC_STATE__LISTENING)
>>>>> +				return 0;
>>>>> +
>>>>> +			/*
>>>>> +			 * We don't care about the WEXITSTATUS() nor
>>>>> +			 * any of the WIF*(status) values because
>>>>> +			 * `cmd_fsmonitor__daemon()` does the `!!result`
>>>>> +			 * trick on all function return values.
>>>>> +			 *
>>>>> +			 * So it is sufficient to just report the
>>>>> +			 * early shutdown as an error.
>>>>> +			 */
>>>>> +			return error(_("fsmonitor--daemon failed to start"));
>>>>> +		} else
>>>>> +			return error(_("waitpid is confused"));
>>>>> +	}
>>>>> +}
>>>> Ditto this. Could we extend the wait_or_whine() function (or some
>>>> extended version thereof) to do what you need with callbacks?
>>>> It seems the main difference is just being able to pass down a flag
>>>> for waitpid(), and the loop needing to check EINTR or not depending
>>>> on whether WNOHANG is passed.
>>>> For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
>>>> behavior with an adjusted wait_or_whine(). Wouldn't it be better to
>>>> report what exit status it exits with e.g. if the top-level process is
>>>> signalled? We do so in trace2 for other things we spawn...
>>>>
>>>
>>> Again, I don't want to mix my usage here with the existing code
>>> and destabilize all existing callers.  Here we are spinning to give
>>> the child a chance to *start* and confirm that it is in a listening
>>> state and ready for connections.  We do not wait for the child to
>>> exit (unless it dies quickly without becoming ready).
>>>
>>> We want to end our wait as soon as we confirm that the child is
>>> ready and return.  All I really need from the system is `waitpid()`.
>> Will this code behave correctly if the daemon we start is signalled,
>> per the WIFSIGNALED() cases that the code this is derived from
>> handles but this does not?
>
> We're only waiting until the child gets started and is able to receive
> requests -- what happens to it after we have confirmed that it is ready
> is not our concern (after all, the parent is about to exit anyway and
> the child is going to continue on).
>
> If waitpid() gives us a WIFSIGNALED (or any other WIF*() state) before
> we have spoken to it, we will return a "failed to start".

So in wait_or_whine() and finish_command() we capture all of that in
trace2 logs, we explicitly don't want that in this case? We do concern
ourselves with the exact exit status/signal status etc. of children we
start in most other scenarios for trace2 logging purposes.
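
For reference, a minimal sketch of the status bookkeeping in question,
assuming POSIX waitpid() semantics. decode_wait_status() mirrors the
mapping visible in the wait_or_whine() hunk quoted in this thread (128
plus the signal number for a signalled child, the exit status
otherwise). run_child_and_decode() is a hypothetical helper added only
to exercise it; neither function exists in Git.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Turn a waitpid() status into one shell-style code: the exit status
 * for a normal exit, 128 + signal number for a child killed by a
 * signal, and -1 for anything else.
 */
int decode_wait_status(int status)
{
	if (WIFSIGNALED(status))
		return 128 + WTERMSIG(status);
	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1;
}

/*
 * Hypothetical helper: fork a child that either exits with
 * `exit_code` or, if `sig` is non-zero, kills itself with `sig`.
 * Reap it (retrying on EINTR, since this is a blocking wait) and
 * return the decoded status.
 */
int run_child_and_decode(int exit_code, int sig)
{
	int status;
	pid_t pid_seen;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (sig)
			raise(sig);
		_exit(exit_code);
	}

	while ((pid_seen = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
		;	/* nothing */
	if (pid_seen != pid)
		return -1;

	return decode_wait_status(status);
}
```

A caller that only wants "did it start?" can ignore the decoded value,
while a logging layer could still record it.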

> But again, that signal would have to arrive immediately after we spawned
> it and *before* we could talk to it.  If the child is signaled after we
> confirmed it was ready, we don't care because the parent process will be
> gone.
>
> (If the child is signaled or is killed (or crashes or whatever), the
> next Git command (like "status") that tries to talk to it will re-start
> it implicitly -- the `git fsmonitor--daemon start` command here is an
> explicit start.)
>
>
>> But sure, I just meant to point out that the flip side to
>> "destabilize all existing callers" is reviewing new code that may be
>> subtly buggy,
>> and those subtle bugs (if any) would be smoked out if we were forced to
>> extend run-command.c, i.e. to use whatever feature(s) this needs for all
>> existing callers.
>> 
>
> That would/could have a massive footprint.  And I've already established
> that my usage here is sufficiently different from existing uses that the
> result would be a mess. IMHO.

I hadn't seen this before but it seems pretty much exactly the same code
was already added (by you) in 36a7eb68760 (t0052: add simple-ipc tests
and t/helper/test-simple-ipc tool, 2021-03-22), perhaps splitting it
into a utility function for the two to use with a callback mechanism
would reduce the footprint?
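
One possible shape for such a shared helper, as a sketch only. The
names wait_for_startup(), startup_ready_fn, enum startup_result and
ready_after_countdown() are hypothetical and do not exist in Git. The
readiness probe is passed in as a callback, so that one caller could
plug in fsmonitor_ipc__get_state() and the test helper its own check.
The short nanosleep() between polls is an addition that is not present
in the quoted loop.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical callback: returns non-zero once the service is ready. */
typedef int (*startup_ready_fn)(void *data);

enum startup_result {
	STARTUP_READY = 0,	/* something answers on the socket */
	STARTUP_TIMEOUT,	/* child still running, not ready in time */
	STARTUP_CHILD_DIED,	/* child exited or was signalled first */
	STARTUP_ERROR		/* waitpid() failed or is confused */
};

/*
 * Poll the child with WNOHANG until `is_ready` reports success, the
 * child goes away, or `timeout_sec` elapses.  If the child was
 * reaped, its raw wait status is stored in `*status_out` so the
 * caller can decide whether to log it.
 */
enum startup_result wait_for_startup(pid_t pid_child, int timeout_sec,
				     startup_ready_fn is_ready, void *data,
				     int *status_out)
{
	time_t time_limit = time(NULL) + timeout_sec;
	struct timespec pause = { 0, 50 * 1000 * 1000 };	/* 50ms */
	int status = 0;

	for (;;) {
		pid_t pid_seen = waitpid(pid_child, &status, WNOHANG);

		if (pid_seen < 0)
			return STARTUP_ERROR;
		if (pid_seen != 0 && pid_seen != pid_child)
			return STARTUP_ERROR;

		/* We do not care who services the socket. */
		if (is_ready(data))
			return STARTUP_READY;

		if (pid_seen == pid_child) {
			if (status_out)
				*status_out = status;
			return STARTUP_CHILD_DIED;
		}

		if (time(NULL) > time_limit)
			return STARTUP_TIMEOUT;

		nanosleep(&pause, NULL);
	}
}

/* Stand-in probe for illustration: ready after `*data` failed polls. */
int ready_after_countdown(void *data)
{
	int *remaining = data;

	if (*remaining > 0) {
		(*remaining)--;
		return 0;
	}
	return 1;
}
```

Returning the raw status rather than discarding it leaves the question
of trace2 logging to the caller, which is the part under discussion.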

What I was suggesting was some continuation of the below.

(I stopped once I noticed the changes I was making to
builtin/fsmonitor--daemon.c didn't even compile (almost the entire file
is hidden behind a macro), but I've commented on that aspect
elsewhere. I.e. it's nice to be able to do tree-wide refactoring without
tripping over code hidden by ifdefs)

It passes all current tests for whatever that's worth, obviously not a
pretty API, and I'm not claiming it's correct.

But I think it's clear how the trace2/error handling part of it could be
further extracted into some utility, so this would just be a mode of
run-command.

Not saying you need to do it, but the comments about us explicitly not
caring at all about the exit state make me wonder if there's some reason
for why someone else would be tripping over some landmine if they did
that refactoring.

Anyway, just a thought. I see from other feedback that you seem to be
getting pretty exasperated with me.

I'm just trying to help this along, usually being able to piggy-back on
existing in-tree code and proving the correctness by passing all in-tree
tests with that piggy-backing helps more than hinders.

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 25f18f2726b..7365fff95f4 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -8,6 +8,7 @@
 #include "simple-ipc.h"
 #include "khash.h"
 #include "pkt-line.h"
+#include "run-command.h"
 
 static const char * const builtin_fsmonitor__daemon_usage[] = {
 	N_("git fsmonitor--daemon start [<options>]"),
@@ -1403,11 +1404,12 @@ static int wait_for_background_startup(pid_t pid_child)
 	time_limit += fsmonitor__start_timeout_sec;
 
 	for (;;) {
+		int saved_errno = 0;
+		code = wait_or_whine_extended(pid_child, &pid_seen, "TODO",
+					      0, WNOHANG, &saved_errno);
 		pid_seen = waitpid(pid_child, &status, WNOHANG);
 
-		if (pid_seen == -1)
-			return error_errno(_("waitpid failed"));
-		else if (pid_seen == 0) {
+		if (pid_seen == 0) {
 			/*
 			 * The child is still running (this should be
 			 * the normal case).  Try to connect to it on
@@ -1452,8 +1454,7 @@ static int wait_for_background_startup(pid_t pid_child)
 			 * early shutdown as an error.
 			 */
 			return error(_("fsmonitor--daemon failed to start"));
-		} else
-			return error(_("waitpid is confused"));
+		}
 	}
 }
 
diff --git a/run-command.c b/run-command.c
index aacc336f951..856e7d87c40 100644
--- a/run-command.c
+++ b/run-command.c
@@ -543,24 +543,28 @@ static inline void set_cloexec(int fd)
 		fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
 }
 
-static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
+int wait_or_whine_extended(pid_t pid, pid_t *waiting, const char *argv0,
+			   int in_signal, int waitpid_options, int *failed_errno)
 {
 	int status, code = -1;
-	pid_t waiting;
-	int failed_errno = 0;
 
-	while ((waiting = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
+	if (waitpid_options & WNOHANG)
+		*waiting = waitpid(pid, &status, waitpid_options);
+	else
+		while ((*waiting = waitpid(pid, &status, waitpid_options)) < 0 &&
+		       errno == EINTR)
 		;	/* nothing */
+
 	if (in_signal) {
 		if (WIFEXITED(status))
 			code = WEXITSTATUS(status);
 		return code;
 	}
 
-	if (waiting < 0) {
-		failed_errno = errno;
+	if (*waiting < 0) {
+		*failed_errno = errno;
 		error_errno("waitpid for %s failed", argv0);
-	} else if (waiting != pid) {
+	} else if (*waiting != pid) {
 		error("waitpid is confused (%s)", argv0);
 	} else if (WIFSIGNALED(status)) {
 		code = WTERMSIG(status);
@@ -574,14 +578,23 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
 		code += 128;
 	} else if (WIFEXITED(status)) {
 		code = WEXITSTATUS(status);
+	} else if (waitpid_options & WNOHANG && *waiting == 0) {
+		code = 0;
 	} else {
 		error("waitpid is confused (%s)", argv0);
 	}
 
-	clear_child_for_cleanup(pid);
+	return code;
+}
 
+static int wait_or_whine(pid_t pid, const char *argv0, int in_signal)
+{
+	pid_t ignore;
+	int failed_errno = 0;
+	int ret = wait_or_whine_extended(pid, &ignore, argv0, in_signal, 0, &failed_errno);
+	clear_child_for_cleanup(pid);
 	errno = failed_errno;
-	return code;
+	return ret;
 }
 
 static void trace_add_env(struct strbuf *dst, const char *const *deltaenv)
diff --git a/run-command.h b/run-command.h
index e321d23bbd2..fe39730f87a 100644
--- a/run-command.h
+++ b/run-command.h
@@ -182,6 +182,9 @@ void child_process_clear(struct child_process *);
 
 int is_executable(const char *name);
 
+int wait_or_whine_extended(pid_t pid, pid_t *waiting, const char *argv0,
+			   int in_signal, int waitpid_options, int *failed_errno);
+
 /**
  * Start a sub-process. Takes a pointer to a `struct child_process`
  * that specifies the details and returns pipe FDs (if requested).
diff --git a/t/helper/test-simple-ipc.c b/t/helper/test-simple-ipc.c
index 91345180750..44658a46713 100644
--- a/t/helper/test-simple-ipc.c
+++ b/t/helper/test-simple-ipc.c
@@ -9,6 +9,7 @@
 #include "parse-options.h"
 #include "thread-utils.h"
 #include "strvec.h"
+#include "run-command.h"
 
 #ifndef SUPPORTS_SIMPLE_IPC
 int cmd__simple_ipc(int argc, const char **argv)
@@ -349,7 +350,7 @@ static int spawn_server(pid_t *pid)
  */
 static int wait_for_server_startup(pid_t pid_child)
 {
-	int status;
+	int code;
 	pid_t pid_seen;
 	enum ipc_active_state s;
 	time_t time_limit, now;
@@ -358,12 +359,10 @@ static int wait_for_server_startup(pid_t pid_child)
 	time_limit += cl_args.max_wait_sec;
 
 	for (;;) {
-		pid_seen = waitpid(pid_child, &status, WNOHANG);
-
-		if (pid_seen == -1)
-			return error_errno(_("waitpid failed"));
-
-		else if (pid_seen == 0) {
+		int saved_errno = 0;
+		code = wait_or_whine_extended(pid_child, &pid_seen, "TODO",
+					      0, WNOHANG, &saved_errno);
+		if (pid_seen == 0) {
 			/*
 			 * The child is still running (this should be
 			 * the normal case).  Try to connect to it on
@@ -384,9 +383,7 @@ static int wait_for_server_startup(pid_t pid_child)
 				return error(_("daemon not online yet"));
 
 			continue;
-		}
-
-		else if (pid_seen == pid_child) {
+		} else if (pid_seen == pid_child) {
 			/*
 			 * The new child daemon process shutdown while
 			 * it was starting up, so it is not listening
@@ -412,10 +409,9 @@ static int wait_for_server_startup(pid_t pid_child)
 			 * early shutdown as an error.
 			 */
 			return error(_("daemon failed to start"));
+		} else if (code) {
+			BUG("??");
 		}
-
-		else
-			return error(_("waitpid is confused"));
 	}
 }
 



* Re: [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows
  2021-07-19 16:54         ` Jeff Hostetler
@ 2021-07-20 20:32           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-20 20:32 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff Hostetler via GitGitGadget, git, Johannes Schindelin,
	Derrick Stolee, Jeff Hostetler


On Mon, Jul 19 2021, Jeff Hostetler wrote:

> On 7/1/21 6:45 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>> 
>>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>>
>>> Stub in empty backend for fsmonitor--daemon on Windows.
>>>
>>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>>> ---
>>>   Makefile                                     | 13 ++++++
>>>   compat/fsmonitor/fsmonitor-fs-listen-win32.c | 21 +++++++++
>>>   compat/fsmonitor/fsmonitor-fs-listen.h       | 49 ++++++++++++++++++++
>>>   config.mak.uname                             |  2 +
>>>   contrib/buildsystems/CMakeLists.txt          |  5 ++
>>>   5 files changed, 90 insertions(+)
>>>   create mode 100644 compat/fsmonitor/fsmonitor-fs-listen-win32.c
>>>   create mode 100644 compat/fsmonitor/fsmonitor-fs-listen.h
>>>
>>> diff --git a/Makefile b/Makefile
>>> index c45caacf2c3..a2a6e1f20f6 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -467,6 +467,11 @@ all::
>>>   # directory, and the JSON compilation database 'compile_commands.json' will be
>>>   # created at the root of the repository.
>>>   #
>>> +# If your platform supports a built-in fsmonitor backend, set
>>> +# FSMONITOR_DAEMON_BACKEND to the "<name>" of the corresponding
>>> +# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
>>> +# `fsmonitor_fs_listen__*()` routines.
>>> +#
>>>   # Define DEVELOPER to enable more compiler warnings. Compiler version
>>>   # and family are auto detected, but could be overridden by defining
>>>   # COMPILER_FEATURES (see config.mak.dev). You can still set
>>> @@ -1929,6 +1934,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
>>>   	COMPAT_OBJS += compat/access.o
>>>   endif
>>>   +ifdef FSMONITOR_DAEMON_BACKEND
>>> +	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>>> +	COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
>>> +endif
>>> +
>>>   ifeq ($(TCLTK_PATH),)
>>>   NO_TCLTK = NoThanks
>>>   endif
>>> @@ -2793,6 +2803,9 @@ GIT-BUILD-OPTIONS: FORCE
>>>   	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>>>   	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
>>>   	@echo X=\'$(X)\' >>$@+
>>> +ifdef FSMONITOR_DAEMON_BACKEND
>>> +	@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
>>> +endif
>> Why put this in an ifdef?
>> In 342e9ef2d9e (Introduce a performance testing framework, 2012-02-17)
>> we started doing that for some perf/test options (which b.t.w., I don't
>> really see the reason for, maybe it's some subtlety in how test-lib.sh
>> picks those up).
>> But for all the other compile-time stuff we don't ifdef it, we just
>> define it, and then you get an empty value or not.
>> This would AFAICT be the first build-time-for-the-C-program option
>> we ifdef for writing a line to GIT-BUILD-OPTIONS.
>> 
>
> (I'm going to respond here on the original question rather than on any
> of the follow up responses in an attempt at defusing things a bit.)
>
> I added the ifdef because I thought it to be the *most conservative*
> thing that I could do.  The output of the generated file on unsupported
> platforms should be *identical* to what it was before my changes.  I
> only alter the contents of the generated file on supported platforms.
>
> Later, when the generated file is consumed, we don't need to worry about
> the effect (if any) on incremental compiles -- we will know that it
> won't be set -- just like it was not set in the original compile.

Okay, so when we added e.g. USE_LIBPCRE2 we added it to
GIT-BUILD-OPTIONS unconditionally, so if you pulled that commit you'd
trigger a rebuild on anything that cares about GIT-BUILD-OPTIONS (which
is almost everything).

But you'd like to have the line not added to avoid that one-off
recompile....

> That change appears right before 12 other ifdef'd symbols also being
> written to that generated file.  Most are test and perf, but some are
> not.  But my point is that the pattern is present already.
>
> The original question also references a 9.5 year old commit which
> uses the same pattern as I've used here.  It also muddies the water
> on why it was/wasn't needed back then.  And hints at possible
> side-effects in some of our test scripts.  So it is clear that the
> confusion/disagreements that we are having with the current patch
> and whether or not to ifdef are not new.
>
>
> So, is there value in being explicit and having the ifdef ??
>
>
> There are well defined Make rules (and Junio gave us a very elegant
> little script to demonstrate that), but the subtleties are there.
> Especially with our use of generated files like `GIT-BUILD-OPTIONS`.
> We have a mailing list full of experts and yet this question received
> a lot more discussion than I thought possible or necessary, but it
> took a test script to demonstrate that the results are the same and it
> doesn't matter.  Perhaps the clarity is worth it for the price of a
> simple ifdef.
>
>
> So, how much time have we (collectively) wasted discussing this
> subtlety ??
>
>
> To summarize, I added the ifdef to make it explicitly clear that
> I'm not altering behavior on unsupported platforms.  I can remove it
> from V4 if desired or I can keep it.  (We all now know that it doesn't
> functionally matter -- it does however, provide clarity.)
>
>
> Sorry if this sounded like a rant,

...I asked because I've looked at that ifdef soup around
GIT-BUILD-OPTIONS and wondered if I could make it go away, and before a
patch lands is a good time to ask "what's this pattern for?", as opposed
to inferring this after the fact.

For me it was just a minor curiosity, I didn't expect to start this big
discussion about it. I expected just a "oh, I just copy/pasted that from
the lines at the end" or something, which would be fair enough.

I really don't care which one we go for here. If you want to change it
fine, if not that's fine too.

I have noticed a pattern where you seem to really carefully consider why
you'd like X over Y. I.e. it wasn't just copy/pasting in this case if I
understand you correctly, but a carefully thought out decision to not do
it like the other C-level-GIT-BUILD-OPTIONS.

Okay, fair enough, but that decision then doesn't go into the commit
message, and then when I innocently ask about it...

...I guess I'll stop before this starts resembling a rant on my part :)

Anyway, I have also had really non-trivial comments on this fsmonitor
series, not just a few bikeshed comments. I.e. the un-addressed question
about the wildly different performance numbers we seem to have seen in
our respective testing:
https://lore.kernel.org/git/871r8c73ej.fsf@evledraar.gmail.com

I think that's much more interesting than this relatively light-reading
bikeshedding I had while giving this a read-through.


* Re: [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system
  2021-07-01 23:17       ` Ævar Arnfjörð Bjarmason
@ 2021-07-21 14:40         ` Jeff Hostetler
  0 siblings, 0 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-21 14:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jeff Hostetler via GitGitGadget
  Cc: git, Johannes Schindelin, Derrick Stolee, Jeff Hostetler



On 7/1/21 7:17 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Teach fsmonitor--daemon client threads to create a cookie file
>> inside the .git directory and then wait until FS events for the
>> cookie are observed by the FS listener thread.
>>
>> This helps address the racy nature of file system events by
>> blocking the client response until the kernel has drained any
>> event backlog.
>>
>> This is especially important on MacOS where kernel events are
>> only issued with a limited frequency.  See the `latency` argument
>> of `FSEventStreamCreate()`.  The kernel only signals every `latency`
>> seconds, but does not guarantee that the kernel queue is completely
>> drained, so we may have to wait more than one interval.  If we
>> increase the frequency, the system is more likely to drop events.
>> We avoid these issues by having each client thread create a unique
>> cookie file and then wait until it is seen in the event stream.
> 
> Is this a guaranteed property of any API fsmonitor might need to work
> with (linux, darwin, Windows) that if I perform a bunch of FS operations
> on my working tree, that if I finish up by touching this cookie file
> that that'll happen last?
> 
> I'd think that wouldn't be the case, i.e. on POSIX filesystems unless
> you run around fsyncing both files and directories you're not guaranteed
> that they're on disk, and even then the kernel might decide to sync your
> cookie earlier, won't it?
> 
> E.g. on Linux you can even have cross-FS watches, and mix & match
> different FS types. I'd expect to get events in whatever
> implementation-defined order the VFS layer + FS decided to sync them to
> disk in & get to firing off an event for me.
> 
> Or do these APIs all guarantee that a linear view of the world is
> presented to the API consumer?
> 


Theoretically, none of these APIs guarantee a complete linear ordering.
We receive events from the FS in the order that the FS decides to
perform the actual IO.  And the inner workings of the FS are private.
Even if we directly read the journal rather than listening for
notifications, we probably still don't know whether the FS reordered
the queue of things heading to disk.

However in practice, the events for the cookie files do tend to arrive
in order.  And the net effect is that the worker thread in the daemon
is sync'd up with IO activity that was initiated before the request.
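
The bookkeeping side of that idea can be sketched as follows. This is
an illustration under stated assumptions, not the daemon's actual
code: the ".cookie-" prefix, the fixed-size table and all function
names are hypothetical, and the real file creation and the blocking
wait between client and listener threads are left out.

```c
#include <stdio.h>
#include <string.h>

#define MAX_COOKIES 16
#define COOKIE_PREFIX ".cookie-"	/* hypothetical naming scheme */

struct cookie {
	char name[64];
	int in_use;
	int seen;
};

struct cookie_table {
	struct cookie slots[MAX_COOKIES];
	unsigned long next_seq;
};

/*
 * Client side: reserve a unique cookie name.  The client would then
 * create a file with this name inside the .git directory.  Returns
 * NULL if the table is full.
 */
const char *cookie_register(struct cookie_table *t, long client_id)
{
	int i;

	for (i = 0; i < MAX_COOKIES; i++) {
		struct cookie *c = &t->slots[i];

		if (c->in_use)
			continue;
		snprintf(c->name, sizeof(c->name), "%s%ld-%lu",
			 COOKIE_PREFIX, client_id, t->next_seq++);
		c->in_use = 1;
		c->seen = 0;
		return c->name;
	}
	return NULL;
}

/*
 * Listener side: called for every path in the event stream.  Returns
 * 1 if the path was a cookie (and so should not be reported to
 * clients as a modified file), 0 otherwise.
 */
int cookie_observe(struct cookie_table *t, const char *path)
{
	const char *base = strrchr(path, '/');
	int i;

	base = base ? base + 1 : path;
	if (strncmp(base, COOKIE_PREFIX, strlen(COOKIE_PREFIX)))
		return 0;

	for (i = 0; i < MAX_COOKIES; i++) {
		struct cookie *c = &t->slots[i];

		if (c->in_use && !strcmp(c->name, base))
			c->seen = 1;
	}
	return 1;
}

/*
 * Client side: has the event for our cookie arrived yet?  Releases
 * the slot when it has.
 */
int cookie_take_if_seen(struct cookie_table *t, const char *name)
{
	int i;

	for (i = 0; i < MAX_COOKIES; i++) {
		struct cookie *c = &t->slots[i];

		if (!c->in_use || strcmp(c->name, name))
			continue;
		if (!c->seen)
			return 0;
		c->in_use = 0;
		return 1;
	}
	return 0;
}
```

Once cookie_take_if_seen() succeeds, events queued ahead of the cookie
have in practice already been delivered, which is the tendency
described above rather than a guarantee.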


BTW Watchman also uses cookie files for this same reason.


It should also be noted that some operations are just racy.  If you're
doing a bunch of IO in one window and a 'git status' in another window,
your result will be racy -- status (without FSM) makes 2 passes on the
disk: the first to verify mtimes on items in the index and the second
to look for untracked files.  The status result may be "blurry" (for
lack of a better word).  So the same questions
     "does the FS reorder my IO?",
     "did status see the fully sync'd FS?",
etc. can also be asked in the normal (non FSM) case, right?

So it may be the case that having an fsmonitor (mine, Watchman, etc)
and the untracked-cache, we'll have less skew in status results
because the status process shouldn't have to do any scanning.
But I'm not sure I want to make that assertion yet.

Thanks,
Jeff



* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-16 16:52           ` Ævar Arnfjörð Bjarmason
@ 2021-07-26 21:40             ` Johannes Schindelin
  2021-07-26 23:26               ` Junio C Hamano
  0 siblings, 1 reply; 237+ messages in thread
From: Johannes Schindelin @ 2021-07-26 21:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler

Hi Ævar,

On Fri, 16 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> On Fri, Jul 16 2021, Johannes Schindelin wrote:
>
> > So you suggest that we name the new stuff after an `uname` that
> > reflects a name that is no longer relevant? I haven't seen a real
> > Darwin system in quite a long time, have you?
>
> It's not current? On a Mac Mini M1 which got released this year:
>
>     % uname -s
>     Darwin
>
> We then have the same in config.mak.uname, it seemed the most obvious
> and consistent to carry that through to file inclusion.

Sorry. I assumed that you knew that Darwin was the name for an open source
Operating System. See
https://en.wikipedia.org/wiki/Darwin_%28operating_system%29 for more
details.

Ciao,
Johannes


* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-26 21:40             ` Johannes Schindelin
@ 2021-07-26 23:26               ` Junio C Hamano
  2021-07-27 12:46                 ` Jeff Hostetler
  0 siblings, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-07-26 23:26 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, git, Jeff Hostetler,
	Derrick Stolee, Jeff Hostetler

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Ævar,
>
> On Fri, 16 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>
>> On Fri, Jul 16 2021, Johannes Schindelin wrote:
>>
>> > So you suggest that we name the new stuff after an `uname` that
>> > reflects a name that is no longer relevant? I haven't seen a real
>> > Darwin system in quite a long time, have you?
>>
>> It's not current? On a Mac Mini M1 which got released this year:
>>
>>     % uname -s
>>     Darwin
>>
>> We then have the same in config.mak.uname, it seemed the most obvious
>> and consistent to carry that through to file inclusion.
>
> Sorry. I assumed that you knew that Darwin was the name for an open source
> Operating System. See
> https://en.wikipedia.org/wiki/Darwin_%28operating_system%29 for more
> details.
>
> Ciao,
> Johannes

Sorry, but I do not see that you are being more constructive than
the other party, whom you blame to be not constructive, in this
exchange.

The part of the file that the patch applies to uses $(uname_S) to
implement platform specific special cases, and we are looking at

	ifeq ($(uname_S),Darwin)
		...
		FSMONITOR_DAEMON_BACKEND = macos
		...
	endif

I find it a fair question why the name used there has to be
different from the one we can automatically and mechanically
get out of "uname -s".

Then you respond that uname output is no longer relevant because
Darwin is a name that is no longer relevant?  And when asked why the
name is no longer relevant, you make a snide comment implying that
the other party does not know the name is an operating system?

What is going on here?

It does not really matter how "Darwin" is described in an
encyclopedia in the context of this discussion.  What matters is
that it is what the system's "uname -s" currently uses to identify
itself, and what we guard the section of makefile snippet with,
isn't it?

ci/lib.sh seems to have an attempt to unify/translate among these
names, and

 * on azure-pipelines, it wants to translate darwin to osx
 * on github-actions, it wants to translate macos to osx

Presumably that is because these two systems call the platform with
these two different names, and you want to pick a middle ground that
nobody uses to be neutral, or something?

Also, in contrib/vscode/init.sh, I see Darwin obtained from "uname -s"
gets translated to "macOS".

In any case, if your argument was "we picked macos because we use
the same token elsewhere, while trying to translate away from Darwin
as much as possible for such and such reasons", I would have found
it a productive exchange, but unfortunately that is not what I am
seeing here.


* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-26 23:26               ` Junio C Hamano
@ 2021-07-27 12:46                 ` Jeff Hostetler
  2021-07-27 15:56                   ` Ævar Arnfjörð Bjarmason
  2021-07-27 17:25                   ` Junio C Hamano
  0 siblings, 2 replies; 237+ messages in thread
From: Jeff Hostetler @ 2021-07-27 12:46 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, git, Derrick Stolee,
	Jeff Hostetler



On 7/26/21 7:26 PM, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
>> Hi Ævar,
>>
>> On Fri, 16 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>>
>>> On Fri, Jul 16 2021, Johannes Schindelin wrote:
>>>
...


I'm not sure that there is a "correct" answer here, but for the sake
of harmony, in V4 I'll set this to "darwin" and update the name of
the backend driver source file to match.  So that we are consistently
using 1 term throughout "Makefile" and "config.mak.uname".

	ifeq ($(uname_S),Darwin)
	...
		FSMONITOR_DAEMON_BACKEND = darwin
	endif



FWIW, I suspect that it is not worth the effort to directly set the
backend name from $(uname_S).  For example, on Windows we currently have
two different uname values depending on which compiler is being used.

	ifeq ($(uname_S),Windows)
	...
		FSMONITOR_DAEMON_BACKEND = win32
	endif

	ifneq (,$(findstring MINGW,$(uname_S)))
	...
		FSMONITOR_DAEMON_BACKEND = win32
	endif


Also, since the backend layer is highly platform-specific, it may be
a while (if ever) before we have universal coverage for all platforms.
Until then, we can simply set $FSMONITOR_DAEMON_BACKEND to a literal
value on a platform-by-platform basis as support is added.
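
The mapping above, restated as a small C function purely for
illustration. In Git the selection happens in config.mak.uname at
build time, not in C; fsmonitor_backend_for() is a hypothetical name,
and the table only covers the uname values mentioned in this message.

```c
#include <string.h>

/*
 * Map a `uname -s` value to a backend name, or NULL when no backend
 * is available.  Two different Windows uname values share one
 * backend, which is why the backend name cannot simply be derived
 * from the uname string.
 */
const char *fsmonitor_backend_for(const char *sysname)
{
	if (!strcmp(sysname, "Darwin"))
		return "darwin";
	if (!strcmp(sysname, "Windows"))
		return "win32";
	if (strstr(sysname, "MINGW"))
		return "win32";
	return NULL;	/* no backend yet on this platform */
}
```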


Thanks,
Jeff


* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-27 12:46                 ` Jeff Hostetler
@ 2021-07-27 15:56                   ` Ævar Arnfjörð Bjarmason
  2021-07-27 17:25                   ` Junio C Hamano
  1 sibling, 0 replies; 237+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-27 15:56 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Junio C Hamano, Johannes Schindelin,
	Jeff Hostetler via GitGitGadget, git, Derrick Stolee,
	Jeff Hostetler


On Tue, Jul 27 2021, Jeff Hostetler wrote:

> On 7/26/21 7:26 PM, Junio C Hamano wrote:
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> 
>>> Hi Ævar,
>>>
>>> On Fri, 16 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>> On Fri, Jul 16 2021, Johannes Schindelin wrote:
>>>>
> ...
>
>
> I'm not sure that there is a "correct" answer here, but for the sake
> of harmony, in V4 I'll set this to "darwin" and update the name of
> the backend driver source file to match.  So that we are consistently
> using 1 term throughout "Makefile" and "config.mak.uname".
>
> 	ifeq ($(uname_S),Darwin)
> 	...
> 		FSMONITOR_DAEMON_BACKEND = darwin
> 	endif
>
>
>
> FWIW, I suspect that it is not worth the effort to directly set the
> backend name from $(uname_S).  For example, on Windows we currently have
> two different uname values depending on which compiler is being used.
>
> 	ifeq ($(uname_S),Windows)
> 	...
> 		FSMONITOR_DAEMON_BACKEND = win32
> 	endif
>
> 	ifneq (,$(findstring MINGW,$(uname_S)))
> 	...
> 		FSMONITOR_DAEMON_BACKEND = win32
> 	endif
>
>
> Also, since the backend layer is highly platform-specific, it may be
> a while (if ever) before we have universal coverage for all platforms.
> Until then, we can simply set $FSMONITOR_DAEMON_BACKEND to a literal
> value on a platform-by-platform basis as support is added.

Re "harmony": For what it's worth I don't think you should change it on
my account.

I should probably have more explicitly said (but I've also been trying
to check the general verbosity of my E-Mails), that when I read a series
like this and have some general trivial comments like this, I mean them
as something like:

    Just a thought while reading this through, i.e. a person familiar
    with the general codebase but not necessarily your specific
    area. Maybe this suggestion makes things easier/simpler, but if you
    think not and decide not to take the suggestion that's fine too.

I.e. that, along with the implicit suggestion (which I'd say applies on
the list in general) that if someone is perplexed by a patch, that is by
default a comment on the commit message.

That person (i.e. me in this case) could also just be hopelessly
confused & nothing needs to change. When I get comments like that I
sometimes change things, sometimes not. You should do the same.

As noted in another reply on this general thread & what's cooking I seem
to have poked a bit of a hornet's nest here that I wasn't expecting to
poke. I'd not been following earlier rounds of this topic, and didn't
know that it had (seemingly) reached some phase of critical updates only
in the minds of its authors.



* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-27 12:46                 ` Jeff Hostetler
  2021-07-27 15:56                   ` Ævar Arnfjörð Bjarmason
@ 2021-07-27 17:25                   ` Junio C Hamano
  2021-07-27 17:45                     ` Felipe Contreras
  1 sibling, 1 reply; 237+ messages in thread
From: Junio C Hamano @ 2021-07-27 17:25 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Johannes Schindelin, Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, git, Derrick Stolee,
	Jeff Hostetler

Jeff Hostetler <git@jeffhostetler.com> writes:

> I'm not sure that there is a "correct" answer here, but for the sake
> of harmony, in V4 I'll set this to "darwin" and update the name of
> the backend driver source file to match.  So that we are consistently
> using 1 term throughout "Makefile" and "config.mak.uname".
>
> 	ifeq ($(uname_S),Darwin)
> 	...
> 		FSMONITOR_DAEMON_BACKEND = darwin
> 	endif
> ...

I do not think it would help "harmony" to change the name, but any
one of 'darwin', 'macos' and 'osx' would be fine.

It was irritating to see a simple "why is this particular name
chosen?" answered with such hostility, when even a "we have no
deep reasoning behind the choice of the name. It is only seen by
developers in names of the source files, and it does not make much
difference" would have been sufficient.  I somehow view that as
more problematic.

I guess the blame goes both ways, though.  We all have worked with
each other long enough to know which of your recipients are prone to
go overly defensive when asked questions, and we should know that it
would help to prepend "I am just being curious but..." to your
questions whose answers do not make a huge difference at the end
either way (or not asking such questions at all).

Thanks.

* Re: [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS
  2021-07-27 17:25                   ` Junio C Hamano
@ 2021-07-27 17:45                     ` Felipe Contreras
  0 siblings, 0 replies; 237+ messages in thread
From: Felipe Contreras @ 2021-07-27 17:45 UTC (permalink / raw)
  To: Junio C Hamano, Jeff Hostetler
  Cc: Johannes Schindelin, Ævar Arnfjörð Bjarmason,
	Jeff Hostetler via GitGitGadget, git, Derrick Stolee,
	Jeff Hostetler

Junio C Hamano wrote:
> I guess the blame goes both ways, though.  We all have worked with
> each other long enough to know which of your recipients are prone to
> go overly defensive when asked questions, and we should know that it
> would help to prepend "I am just being curious but..." to your
> questions whose answers do not make a huge difference at the end
> either way (or not asking such questions at all).

I disagree. It's not OK for contributors to react defensively when asked
questions, and in particular I don't think it's OK that some
contributors are punished for merely disagreeing, while others are given
a pass for snarling at feedback. This is a double standard.

The Git project should not play favorites, and all contributors should
be asked to assume good faith.

https://en.wikipedia.org/wiki/Wikipedia:Assume_good_faith

It should be implied that the feedback given is to try to improve the
project, it should not need to be stated.

-- 
Felipe Contreras


Thread overview: 237+ messages
2021-04-01 15:40 [PATCH 00/23] [RFC] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
2021-04-01 15:40 ` [PATCH 01/23] fsmonitor--daemon: man page and documentation Jeff Hostetler via GitGitGadget
2021-04-26 14:13   ` Derrick Stolee
2021-04-28 13:54     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 02/23] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-26 14:31   ` Derrick Stolee
2021-04-26 20:20     ` Eric Sunshine
2021-04-26 21:02       ` Derrick Stolee
2021-04-28 19:26     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 03/23] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
2021-04-01 15:40 ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
2021-04-26 14:56   ` Derrick Stolee
2021-04-27  9:20     ` Ævar Arnfjörð Bjarmason
2021-04-27 12:42       ` Derrick Stolee
2021-04-28  7:59         ` Ævar Arnfjörð Bjarmason
2021-04-28 16:26           ` [PATCH] repo-settings.c: simplify the setup Ævar Arnfjörð Bjarmason
2021-04-28 19:09             ` Nesting topics within other threads (was: [PATCH] repo-settings.c: simplify the setup) Derrick Stolee
2021-04-28 23:01               ` Ævar Arnfjörð Bjarmason
2021-05-05 16:12                 ` Johannes Schindelin
2021-04-29  5:12               ` Nesting topics within other threads Junio C Hamano
2021-04-29 12:14                 ` Ævar Arnfjörð Bjarmason
2021-04-29 20:14                   ` Jeff King
2021-04-30  0:07                   ` Junio C Hamano
2021-04-30 14:23     ` [PATCH 04/23] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Jeff Hostetler
2021-04-01 15:40 ` [PATCH 05/23] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
2021-04-26 15:08   ` Derrick Stolee
2021-04-26 15:45     ` Derrick Stolee
2021-04-30 14:31       ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 06/23] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
2021-04-26 15:12   ` Derrick Stolee
2021-04-30 14:33     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 07/23] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
2021-04-26 15:23   ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 08/23] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
2021-04-01 15:40 ` [PATCH 09/23] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
2021-04-26 15:47   ` Derrick Stolee
2021-04-26 16:12     ` Derrick Stolee
2021-04-30 15:18       ` Jeff Hostetler
2021-04-30 15:59     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 10/23] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
2021-04-26 19:17   ` Derrick Stolee
2021-04-26 20:11     ` Eric Sunshine
2021-04-26 20:24       ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 11/23] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
2021-04-26 19:49   ` Derrick Stolee
2021-04-26 20:01     ` Eric Sunshine
2021-04-26 20:03       ` Derrick Stolee
2021-04-30 16:17     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 12/23] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
2021-04-26 20:22   ` Derrick Stolee
2021-04-30 17:36     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 13/23] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
2021-04-27 17:22   ` Derrick Stolee
2021-04-27 17:41     ` Eric Sunshine
2021-04-30 19:32     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 14/23] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
2021-04-27 18:13   ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 15/23] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
2021-04-27 18:35   ` Derrick Stolee
2021-04-30 20:05     ` Jeff Hostetler
2021-04-01 15:40 ` [PATCH 16/23] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
2021-04-26 21:01   ` Derrick Stolee
2021-05-03 15:04     ` Jeff Hostetler
2021-05-13 18:52   ` Derrick Stolee
2021-04-01 15:40 ` [PATCH 17/23] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
2021-04-27 13:24   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 18/23] fsmonitor--daemon:: introduce client delay for testing Jeff Hostetler via GitGitGadget
2021-04-27 13:36   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 19/23] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
2021-04-27 14:23   ` Derrick Stolee
2021-05-03 21:59     ` Jeff Hostetler
2021-04-01 15:41 ` [PATCH 20/23] fsmonitor: force update index when fsmonitor token advances Jeff Hostetler via GitGitGadget
2021-04-27 14:52   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 21/23] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-27 15:41   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 22/23] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-27 15:45   ` Derrick Stolee
2021-04-01 15:41 ` [PATCH 23/23] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-04-27 15:51   ` Derrick Stolee
2021-04-16 22:44 ` [PATCH 00/23] [RFC] Builtin FSMonitor Feature Junio C Hamano
2021-04-20 15:27   ` Johannes Schindelin
2021-04-20 19:13     ` Jeff Hostetler
2021-04-21 13:17     ` Derrick Stolee
2021-04-27 18:49 ` FS Monitor Windows Performance (was [PATCH 00/23] [RFC] Builtin FSMonitor Feature) Derrick Stolee
2021-04-27 19:31 ` FS Monitor macOS " Derrick Stolee
2021-05-22 13:56 ` [PATCH v2 00/28] Builtin FSMonitor Feature Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 01/28] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 02/28] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 03/28] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 04/28] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-06-02 11:24     ` Johannes Schindelin
2021-06-14 21:23       ` Johannes Schindelin
2021-05-22 13:56   ` [PATCH v2 05/28] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 06/28] config: FSMonitor is repository-specific Johannes Schindelin via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 07/28] fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon via IPC Johannes Schindelin via GitGitGadget
2021-06-14 21:28     ` Johannes Schindelin
2021-05-22 13:56   ` [PATCH v2 08/28] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 09/28] fsmonitor--daemon: implement client command options Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 10/28] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
2021-06-11  6:32     ` Junio C Hamano
2021-05-22 13:56   ` [PATCH v2 11/28] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 12/28] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 13/28] fsmonitor--daemon: implement daemon command options Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 14/28] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 15/28] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 16/28] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 17/28] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 18/28] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 19/28] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
2021-05-22 13:56   ` [PATCH v2 20/28] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 21/28] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 22/28] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
2021-06-14 21:42     ` Johannes Schindelin
2021-05-22 13:57   ` [PATCH v2 23/28] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 24/28] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 25/28] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 26/28] p7519: add fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 27/28] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-05-22 13:57   ` [PATCH v2 28/28] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
2021-05-27  2:06   ` [PATCH v2 00/28] Builtin FSMonitor Feature Junio C Hamano
2021-06-02 11:28     ` Johannes Schindelin
2021-06-22 15:45     ` Jeff Hostetler
2021-07-01 14:47   ` [PATCH v3 00/34] " Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 01/34] simple-ipc: preparations for supporting binary messages Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 02/34] fsmonitor--daemon: man page Jeff Hostetler via GitGitGadget
2021-07-01 22:29       ` Ævar Arnfjörð Bjarmason
2021-07-05 22:00         ` Johannes Schindelin
2021-07-12 19:23         ` Jeff Hostetler
2021-07-13 17:46           ` Ævar Arnfjörð Bjarmason
2021-07-16 15:45             ` Johannes Schindelin
2021-07-16 17:04               ` Felipe Contreras
2021-07-01 14:47     ` [PATCH v3 03/34] fsmonitor--daemon: update fsmonitor documentation Jeff Hostetler via GitGitGadget
2021-07-01 22:31       ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 04/34] fsmonitor-ipc: create client routines for git-fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 05/34] help: include fsmonitor--daemon feature flag in version info Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 06/34] fsmonitor: config settings are repository-specific Jeff Hostetler via GitGitGadget
2021-07-01 16:46       ` Ævar Arnfjörð Bjarmason
2021-07-19 20:36         ` Jeff Hostetler
2021-07-01 14:47     ` [PATCH v3 07/34] fsmonitor: use IPC to query the builtin FSMonitor daemon Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 08/34] fsmonitor--daemon: add a built-in fsmonitor daemon Jeff Hostetler via GitGitGadget
2021-07-01 22:36       ` Ævar Arnfjörð Bjarmason
2021-07-19 20:56         ` Jeff Hostetler
2021-07-01 14:47     ` [PATCH v3 09/34] fsmonitor--daemon: implement 'stop' and 'status' commands Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 10/34] t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon Jeff Hostetler via GitGitGadget
2021-07-01 22:41       ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 11/34] fsmonitor-fs-listen-win32: stub in backend for Windows Jeff Hostetler via GitGitGadget
2021-07-01 22:45       ` Ævar Arnfjörð Bjarmason
2021-07-16 15:47         ` Johannes Schindelin
2021-07-16 16:55           ` Ævar Arnfjörð Bjarmason
2021-07-17  5:13             ` Junio C Hamano
2021-07-17  5:21               ` Junio C Hamano
2021-07-17 21:43               ` Ævar Arnfjörð Bjarmason
2021-07-19 19:58                 ` Junio C Hamano
2021-07-16 16:59           ` Felipe Contreras
2021-07-19 16:54         ` Jeff Hostetler
2021-07-20 20:32           ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 12/34] fsmonitor-fs-listen-macos: stub in backend for MacOS Jeff Hostetler via GitGitGadget
2021-07-01 22:49       ` Ævar Arnfjörð Bjarmason
2021-07-16 15:51         ` Johannes Schindelin
2021-07-16 16:52           ` Ævar Arnfjörð Bjarmason
2021-07-26 21:40             ` Johannes Schindelin
2021-07-26 23:26               ` Junio C Hamano
2021-07-27 12:46                 ` Jeff Hostetler
2021-07-27 15:56                   ` Ævar Arnfjörð Bjarmason
2021-07-27 17:25                   ` Junio C Hamano
2021-07-27 17:45                     ` Felipe Contreras
2021-07-01 14:47     ` [PATCH v3 13/34] fsmonitor--daemon: implement 'run' command Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 14/34] fsmonitor--daemon: implement 'start' command Jeff Hostetler via GitGitGadget
2021-07-01 22:18       ` Ævar Arnfjörð Bjarmason
2021-07-05 21:52         ` Johannes Schindelin
2021-07-13 14:39         ` Jeff Hostetler
2021-07-13 17:54           ` Ævar Arnfjörð Bjarmason
2021-07-13 18:44             ` Jeff Hostetler
2021-07-20 19:38               ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 15/34] fsmonitor: do not try to operate on bare repos Jeff Hostetler via GitGitGadget
2021-07-01 22:53       ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 16/34] fsmonitor--daemon: add pathname classification Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 17/34] fsmonitor--daemon: define token-ids Jeff Hostetler via GitGitGadget
2021-07-01 22:58       ` Ævar Arnfjörð Bjarmason
2021-07-13 15:15         ` Jeff Hostetler
2021-07-13 18:11           ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 18/34] fsmonitor--daemon: create token-based changed path cache Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 19/34] fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows Jeff Hostetler via GitGitGadget
2021-07-01 23:02       ` Ævar Arnfjörð Bjarmason
2021-07-13 15:46         ` Jeff Hostetler
2021-07-13 18:15           ` Ævar Arnfjörð Bjarmason
2021-07-16 15:55             ` Johannes Schindelin
2021-07-16 16:27               ` Ævar Arnfjörð Bjarmason
2021-07-17 12:45                 ` Eric Wong
2021-07-19 22:35                   ` Jeff Hostetler
2021-07-16 16:55               ` Felipe Contreras
2021-07-06 19:09       ` Johannes Schindelin
2021-07-13 15:18         ` Jeff Hostetler
2021-07-01 14:47     ` [PATCH v3 20/34] fsmonitor-fs-listen-macos: add macos header files for FSEvent Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 21/34] fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 22/34] fsmonitor--daemon: implement handle_client callback Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 23/34] t/helper/test-touch: add helper to touch a series of files Jeff Hostetler via GitGitGadget
2021-07-01 20:00       ` Junio C Hamano
2021-07-13 16:45         ` Jeff Hostetler
2021-07-01 14:47     ` [PATCH v3 24/34] t/perf/p7519: speed up test using "test-tool touch" Jeff Hostetler via GitGitGadget
2021-07-01 23:09       ` Ævar Arnfjörð Bjarmason
2021-07-13 17:06         ` Jeff Hostetler
2021-07-13 17:36           ` Elijah Newren
2021-07-13 17:47             ` Junio C Hamano
2021-07-13 17:50               ` Elijah Newren
2021-07-13 17:58             ` Jeff Hostetler
2021-07-13 18:07               ` Junio C Hamano
2021-07-13 18:19                 ` Jeff Hostetler
2021-07-13 18:18           ` Ævar Arnfjörð Bjarmason
2021-07-13 19:05             ` Jeff Hostetler
2021-07-20 19:18               ` Ævar Arnfjörð Bjarmason
2021-07-13 18:04       ` Jeff Hostetler
2021-07-01 14:47     ` [PATCH v3 25/34] t/perf: avoid copying builtin fsmonitor files into test repo Jeff Hostetler via GitGitGadget
2021-07-01 23:11       ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 26/34] t/perf/p7519: add fsmonitor--daemon test cases Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 27/34] t7527: create test for fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-07-01 23:15       ` Ævar Arnfjörð Bjarmason
2021-07-01 14:47     ` [PATCH v3 28/34] fsmonitor--daemon: periodically truncate list of modified files Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 29/34] fsmonitor--daemon: use a cookie file to sync with file system Jeff Hostetler via GitGitGadget
2021-07-01 23:17       ` Ævar Arnfjörð Bjarmason
2021-07-21 14:40         ` Jeff Hostetler
2021-07-01 14:47     ` [PATCH v3 30/34] fsmonitor: enhance existing comments Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 31/34] fsmonitor: force update index after large responses Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 32/34] t7527: test status with untracked-cache and fsmonitor--daemon Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 33/34] fsmonitor: handle shortname for .git Jeff Hostetler via GitGitGadget
2021-07-01 14:47     ` [PATCH v3 34/34] t7527: test FS event reporing on MacOS WRT case and Unicode Jeff Hostetler via GitGitGadget
2021-07-01 23:39       ` Ævar Arnfjörð Bjarmason
2021-07-01 17:40     ` [PATCH v3 00/34] Builtin FSMonitor Feature Ævar Arnfjörð Bjarmason
2021-07-01 18:29       ` Jeff Hostetler
2021-07-01 21:26         ` Ævar Arnfjörð Bjarmason
2021-07-02 19:06           ` Jeff Hostetler
2021-07-05 22:52             ` Ævar Arnfjörð Bjarmason
2021-07-05 21:35           ` Johannes Schindelin
2021-07-05 22:02             ` Ævar Arnfjörð Bjarmason
2021-07-06 13:12               ` Johannes Schindelin
2021-07-07  2:14                 ` Felipe Contreras
2021-07-07  1:53             ` Felipe Contreras
