Hi Ævar,

On Fri, 2 Jul 2021, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote:
>
> > +#ifdef GIT_WINDOWS_NATIVE
> > +/*
> > + * Create a background process to run the daemon.  It should be completely
> > + * disassociated from the terminal.
> > + *
> > + * Conceptually like `daemonize()` but different because Windows does not
> > + * have `fork(2)`.  Spawn a normal Windows child process but without the
> > + * limitations of `start_command()` and `finish_command()`.
> > + *
> > + * The child process execs the "git fsmonitor--daemon run" command.
> > + *
> > + * The current process returns so that the caller can wait for the child
> > + * to startup before exiting.
> > + */
> > +static int spawn_background_fsmonitor_daemon(pid_t *pid)
> > +{
> > +	char git_exe[MAX_PATH];
> > +	struct strvec args = STRVEC_INIT;
> > +	int in, out;
> > +
> > +	GetModuleFileNameA(NULL, git_exe, MAX_PATH);
> > +
> > +	in = open("/dev/null", O_RDONLY);
> > +	out = open("/dev/null", O_WRONLY);
> > +
> > +	strvec_push(&args, git_exe);
> > +	strvec_push(&args, "fsmonitor--daemon");
> > +	strvec_push(&args, "run");
> > +	strvec_pushf(&args, "--ipc-threads=%d", fsmonitor__ipc_threads);
> > +
> > +	*pid = mingw_spawnvpe(args.v[0], args.v, NULL, NULL, in, out, out);
> > +	close(in);
> > +	close(out);
> > +
> > +	strvec_clear(&args);
> > +
> > +	if (*pid < 0)
> > +		return error(_("could not spawn fsmonitor--daemon in the background"));
> > +
> > +	return 0;
> > +}
> > +#else
> > +/*
> > + * Create a background process to run the daemon.  It should be completely
> > + * disassociated from the terminal.
> > + *
> > + * This is adapted from `daemonize()`.  Use `fork()` to directly
> > + * create and run the daemon in the child process.
> > + *
> > + * The fork-child can just call the run code; it does not need to exec
> > + * it.
> > + *
> > + * The fork-parent returns the child PID so that we can wait for the
> > + * child to startup before exiting.
> > + */
> > +static int spawn_background_fsmonitor_daemon(pid_t *pid)
> > +{
> > +	*pid = fork();
> > +
> > +	switch (*pid) {
> > +	case 0:
> > +		if (setsid() == -1)
> > +			error_errno(_("setsid failed"));
> > +		close(0);
> > +		close(1);
> > +		close(2);
> > +		sanitize_stdfds();
> > +
> > +		return !!fsmonitor_run_daemon();
> > +
> > +	case -1:
> > +		return error_errno(_("could not spawn fsmonitor--daemon in the background"));
> > +
> > +	default:
> > +		return 0;
> > +	}
> > +}
> > +#endif
>
> The spawn_background_fsmonitor_daemon() function here is almost the same
> as daemonize().

Yes, the code comment above that function even says that it was adapted
from `daemonize()`.

And above that, of course, is a _completely different_ implementation that
works on Windows (you will notice that this is in stark contrast of
Windows, where the `daemonize()` function will simply fail with `ENOSYS`).

> I wonder if this & the Windows-specific one you have here can't be
> refactored into an API from what's now in setup.c.

Given that there is no `fork()` on Windows (which has been the subject of
many a message to this mailing list), I think the answer to this question
of yours is a resounding "No".

> Then we could make builtin/gc.c and daemon.c use that, so Windows could
> have background GC, and we'd have a more battle-tested central codepath
> for this tricky bit.

Please. Not _more_ sidetracks.

The issue of getting `git gc --auto` to daemonize on Windows is a rather
complicated one. I won't bore this list with the details, but link to
https://github.com/git-for-windows/git/issues/2221#issuecomment-542589590
(a ~950 word analysis of the problem).

> It seems to me like the only limitations on it are to have this return
> slightly more general things (e.g. not set its own errors, return
> structured data), and maybe some callback for what to do in the
> child/parent.

And one version doesn't `die()`. Nor does it call `exit(0)` in the parent
process. But it calls `fsmonitor_listen()` in the child process. And if
you wanted to refactor `daemonize()` to do all that, it would have to be
renamed (because it does no longer _necessarily_ daemonize), and it would
have to have a `gentle` flag, and it would somehow have to indicate in its
return value whether `0` means that the parent process returned
successfully or the client process. And soon we'll end up with a function
that is both longer and more unreadable than what we have right now.

>
> > +/*
> > + * This is adapted from `wait_or_whine()`.  Watch the child process and
> > + * let it get started and begin listening for requests on the socket
> > + * before reporting our success.
> > + */
> > +static int wait_for_background_startup(pid_t pid_child)
> > +{
> > +	int status;
> > +	pid_t pid_seen;
> > +	enum ipc_active_state s;
> > +	time_t time_limit, now;
> > +
> > +	time(&time_limit);
> > +	time_limit += fsmonitor__start_timeout_sec;
> > +
> > +	for (;;) {
> > +		pid_seen = waitpid(pid_child, &status, WNOHANG);
> > +
> > +		if (pid_seen == -1)
> > +			return error_errno(_("waitpid failed"));
> > +		else if (pid_seen == 0) {
> > +			/*
> > +			 * The child is still running (this should be
> > +			 * the normal case).  Try to connect to it on
> > +			 * the socket and see if it is ready for
> > +			 * business.
> > +			 *
> > +			 * If there is another daemon already running,
> > +			 * our child will fail to start (possibly
> > +			 * after a timeout on the lock), but we don't
> > +			 * care (who responds) if the socket is live.
> > +			 */
> > +			s = fsmonitor_ipc__get_state();
> > +			if (s == IPC_STATE__LISTENING)
> > +				return 0;
> > +
> > +			time(&now);
> > +			if (now > time_limit)
> > +				return error(_("fsmonitor--daemon not online yet"));
> > +		} else if (pid_seen == pid_child) {
> > +			/*
> > +			 * The new child daemon process shutdown while
> > +			 * it was starting up, so it is not listening
> > +			 * on the socket.
> > +			 *
> > +			 * Try to ping the socket in the odd chance
> > +			 * that another daemon started (or was already
> > +			 * running) while our child was starting.
> > +			 *
> > +			 * Again, we don't care who services the socket.
> > +			 */
> > +			s = fsmonitor_ipc__get_state();
> > +			if (s == IPC_STATE__LISTENING)
> > +				return 0;
> > +
> > +			/*
> > +			 * We don't care about the WEXITSTATUS() nor
> > +			 * any of the WIF*(status) values because
> > +			 * `cmd_fsmonitor__daemon()` does the `!!result`
> > +			 * trick on all function return values.
> > +			 *
> > +			 * So it is sufficient to just report the
> > +			 * early shutdown as an error.
> > +			 */
> > +			return error(_("fsmonitor--daemon failed to start"));
> > +		} else
> > +			return error(_("waitpid is confused"));
> > +	}
> > +}
>
> Ditto this. could we extend the wait_or_whine() function (or some
> extended version thereof) to do what you need with callbacks?
>
> It seems the main difference is just being able to pass down a flag for
> waitpid(), and the loop needing to check EINTR or not depending on
> whether WNOHANG is passed.

Given that over half of `wait_or_whine()` is concerned with signals, which
the `wait_for_background_startup()` function is not at all concerned with,
I see another main difference.

> For e.g. the "We don't care about the WEXITSTATUS()" you'd get that
> behavior with an adjusted wait_or_whine(). Wouldn't it be better to
> report what exit status it exits with e.g. if the top-level process is
> signalled? We do so in trace2 for other things we spawn...

The `wait_or_whine()` function also adjusts `atexit()` behavior, which we
would not want here.

Therefore, just like the suggestion about `daemonize()` above, it appears
to me as if the suggested refactoring would make the code dramatically
more complex and less readable.

In other words, it would be a refactoring for refactoring's sake.
Definitely not something I would suggest.

Ciao,
Johannes