From: Olaf Hering <olaf@aepfle.de>
To: xen-devel@lists.xenproject.org
Cc: Olaf Hering <olaf@aepfle.de>, Ian Jackson <iwj@xenproject.org>,
	Wei Liu <wl@xen.org>
Subject: [PATCH v20210601 38/38] hotplug/Linux: fix starting of xenstored with restarting systemd
Date: Tue,  1 Jun 2021 18:11:18 +0200
Message-ID: <20210601161118.18986-39-olaf@aepfle.de>
In-Reply-To: <20210601161118.18986-1-olaf@aepfle.de>

A hard-to-trigger race between an unrelated systemd service and
xenstored.service unveiled a bug in the way xenstored is launched
with systemd.

launch-xenstore may start either a daemon or a domain. In case a domain
is used, systemd-notify was called. If another service triggered a
restart of systemd while xenstored.service was executing, systemd may
temporarily lose track of services with Type=notify. As a result,
xenstored.service would be marked as failed and units that depend on it
would not be started. This breaks the entire Xen toolstack.

The chain of events is basically: xenstored.service sends the
notification to systemd; this is a one-way event. Then systemd may be
restarted by the other unit. During this time, xenstored.service is done
and exits. Once systemd is done with its restart, it collects the pending
notifications and children. If it does not find the unit which sent the
notification, it will declare it as failed.

A workaround for this scenario is to leave the child processes running
for a short time after sending the "READY=1" notification. If systemd
happens to restart it will still find the unit it launched.
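
The workaround described above can be sketched roughly as follows. This is
only an illustration, not code from this patch: the helper name
notify_and_linger and the NOTIFY_CMD override (which allows a dry run on a
system without systemd) are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: report readiness, then keep the sending process
# alive for a while so that a restarting systemd can still find it.
notify_and_linger () {
	linger_seconds=$1
	# NOTIFY_CMD is an assumption for dry runs; the default is the
	# real systemd-notify binary.
	${NOTIFY_CMD:-systemd-notify} --ready || return 1
	# Stay alive after the one-way "READY=1" notification was sent.
	sleep "$linger_seconds"
}
```

The linger time is a trade-off between startup latency and the width of the
race window that is covered. The domain case in the diff below uses a fixed
"sleep 9" after "systemd-notify --ready".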

Adjust the callers of launch-xenstore to specify the init system:
With systemd, do not fork xenstored, so that the pid is preserved. This
will also avoid the need for a sleep because the process which sent the
"READY=1" (the previously forked child) is still alive.
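
Why "exec" preserves the pid can be seen with a small experiment. The
function names below are made up for this illustration and are not part of
the patch. Each function prints two pids: that of the launching shell, and
that of the program it starts.

```shell
#!/bin/sh
# With exec, the started program replaces the launching shell, so both
# lines show the same pid.
pids_with_exec () {
	sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# Without exec, the started program is a child of the launching shell,
# so the second line shows a different pid.
pids_with_fork () {
	sh -c 'echo $$; sh -c "echo \$\$"; true'
}
```

Running pids_with_exec prints the same number twice, while pids_with_fork
prints two different numbers. In the exec case the process systemd launched
is the process that keeps running, so there is no short-lived intermediate
process for systemd to lose track of.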

Remove the --pid-file in the systemd case because the pid of the child
is known, and the file probably had little effect anyway due to the lack
of PIDFile= and Type=forking in the unit file.

Be verbose about xenstored startup only with sysv to avoid interleaved
output in the systemd journal. Do the same for the domain case, even if
it is not strictly needed because init-xenstore-domain has no output.

The upstream systemd commit which is supposed to fix this:
575b300b795b6 ("pid1: rework how we dispatch SIGCHLD and other signals")

Signed-off-by: Olaf Hering <olaf@aepfle.de>

--
v04:
- do mkdir unconditionally because init-xenstore-domain writes the domid to
  xenstored.pid
v03:
- remove run_xenstored function, follow style of shell built-in test function
v02:
- preserve Type=notify
---
 tools/hotplug/Linux/init.d/xencommons.in      |  2 +-
 tools/hotplug/Linux/launch-xenstore.in        | 40 ++++++++++++++-----
 .../Linux/systemd/xenstored.service.in        |  2 +-
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/tools/hotplug/Linux/init.d/xencommons.in b/tools/hotplug/Linux/init.d/xencommons.in
index 7fd6903b98..dcb0ce4b73 100644
--- a/tools/hotplug/Linux/init.d/xencommons.in
+++ b/tools/hotplug/Linux/init.d/xencommons.in
@@ -60,7 +60,7 @@ do_start () {
 	mkdir -m700 -p ${XEN_LOCK_DIR}
 	mkdir -p ${XEN_LOG_DIR}
 
-	@XEN_SCRIPT_DIR@/launch-xenstore || exit 1
+	@XEN_SCRIPT_DIR@/launch-xenstore 'sysv' || exit 1
 
 	echo Setting domain 0 name, domid and JSON config...
 	${LIBEXEC_BIN}/xen-init-dom0 ${XEN_DOM0_UUID}
diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in
index 019f9d6f4d..d40c66482a 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -15,6 +15,17 @@
 # License along with this library; If not, see <http://www.gnu.org/licenses/>.
 #
 
+initd=$1
+
+case "$initd" in
+	sysv) nonl='-n' ;;
+	systemd) nonl= ;;
+	*)
+	echo "first argument must be 'sysv' or 'systemd'"
+	exit 1
+	;;
+esac
+
 XENSTORED=@XENSTORED@
 
 . @XEN_SCRIPT_DIR@/hotplugpath.sh
@@ -44,14 +55,16 @@ timeout_xenstore () {
 	return 0
 }
 
-test_xenstore && exit 0
+mkdir -p @XEN_RUN_DIR@
+
+if test "$initd" = 'sysv' ; then
+	test_xenstore && exit 0
+fi
 
 test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 
 [ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
 
-/bin/mkdir -p @XEN_RUN_DIR@
-
 [ "$XENSTORETYPE" = "daemon" ] && {
 	[ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T @XEN_LOG_DIR@/xenstored-trace.log"
 	[ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
@@ -59,13 +72,15 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 		echo "No xenstored found"
 		exit 1
 	}
+	[ "$initd" = 'sysv' ] && {
+		echo $nonl Starting $XENSTORED...
+		$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+		timeout_xenstore $XENSTORED || exit 1
+		exit 0
+	}
 
-	echo -n Starting $XENSTORED...
-	$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
-
-	systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1
-
-	exit 0
+	exec $XENSTORED -N $XENSTORED_ARGS
+	exit 1
 }
 
 [ "$XENSTORETYPE" = "domain" ] && {
@@ -75,9 +90,12 @@ test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF
 	XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory $XENSTORE_DOMAIN_SIZE"
 	[ -z "$XENSTORE_MAX_DOMAIN_SIZE" ] || XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --maxmem $XENSTORE_MAX_DOMAIN_SIZE"
 
-	echo -n Starting $XENSTORE_DOMAIN_KERNEL...
+	echo $nonl Starting $XENSTORE_DOMAIN_KERNEL...
 	${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1
-	systemd-notify --ready 2>/dev/null
+	[ "$initd" = 'systemd' ] && {
+		systemd-notify --ready
+		sleep 9
+	}
 
 	exit 0
 }
diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in
index 80c1d408a5..c226eb3635 100644
--- a/tools/hotplug/Linux/systemd/xenstored.service.in
+++ b/tools/hotplug/Linux/systemd/xenstored.service.in
@@ -11,7 +11,7 @@ Type=notify
 NotifyAccess=all
 RemainAfterExit=true
 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities
-ExecStart=@XEN_SCRIPT_DIR@/launch-xenstore
+ExecStart=@XEN_SCRIPT_DIR@/launch-xenstore 'systemd'
 
 [Install]
 WantedBy=multi-user.target

