All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v10 00/25] tools/xenstore: support live update for xenstored
@ 2020-12-15 16:35 Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 01/25] tools/xenstore: switch barf[_perror]() to use syslog() Juergen Gross
                   ` (25 more replies)
  0 siblings, 26 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Ian Jackson, Wei Liu, Andrew Cooper,
	George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini

Today Xenstore is not restartable. This means a Xen server needing an
update of xenstored has to be rebooted in order to let this update
become effective.

This patch series is changing that: The internal state of xenstored
(the contents of Xenstore, all connections to various clients like
programs or other domains, and watches) is saved in a defined format
and a new binary is being activated consuming the old state. All
connections are being restored and the new Xenstore binary will
continue where the old one stopped.

This patch series has been under (secret) development for quite some
time. It hasn't been posted to xen-devel until now due to the various
Xenstore related security issues which have become public only today.

There will be a similar series for oxenstored posted.

Xenstore-stubdom is not yet supported, but I'm planning to start
working on that soon.

Changes in V10 (for the members of the security team):
- dropped patch 6 as requested by Andrew

Juergen Gross (24):
  tools/xenstore: switch barf[_perror]() to use syslog()
  tools/xenstore: make set_tdb_key() non-static
  tools/xenstore: remove unused cruft from xenstored_domain.c
  tools/libxenevtchn: add possibility to not close file descriptor on
    exec
  tools/xenstore: refactor XS_CONTROL handling
  tools/xenstore: add live update command to xenstore-control
  tools/xenstore: add basic live-update command parsing
  tools/xenstore: introduce live update status block
  tools/xenstore: save new binary for live update
  tools/xenstore: add command line handling for live update
  tools/xenstore: add the basic framework for doing the live update
  tools/xenstore: allow live update only with no transaction active
  docs: update the xenstore migration stream documentation
  tools/xenstore: add include file for state structure definitions
  tools/xenstore: dump the xenstore state for live update
  tools/xenstore: handle CLOEXEC flag for local files and pipes
  tools/xenstore: split off domain introduction from do_introduce()
  tools/xenstore: evaluate the live update flag when starting
  tools/xenstore: read internal state when doing live upgrade
  tools/xenstore: add reading global state for live update
  tools/xenstore: add read connection state for live update
  tools/xenstore: add read node state for live update
  tools/xenstore: add read watch state for live update
  tools/xenstore: activate new binary for live update

Julien Grall (1):
  tools/xenstore: handle dying domains in live update

 docs/designs/xenstore-migration.md      |  19 +-
 docs/misc/xenstore.txt                  |  21 +
 tools/include/xenevtchn.h               |  16 +-
 tools/libs/evtchn/Makefile              |   2 +-
 tools/libs/evtchn/core.c                |  45 +-
 tools/libs/evtchn/freebsd.c             |   9 +-
 tools/libs/evtchn/libxenevtchn.map      |   4 +
 tools/libs/evtchn/linux.c               |   9 +-
 tools/libs/evtchn/minios.c              |   6 +-
 tools/libs/evtchn/netbsd.c              |   2 +-
 tools/libs/evtchn/private.h             |   2 +-
 tools/libs/evtchn/solaris.c             |   2 +-
 tools/xenstore/Makefile                 |   3 +-
 tools/xenstore/include/xenstore_state.h | 131 +++++
 tools/xenstore/utils.c                  |  20 +
 tools/xenstore/utils.h                  |   6 +
 tools/xenstore/xenstore_control.c       | 332 ++++++++++++-
 tools/xenstore/xenstored_control.c      | 612 +++++++++++++++++++++++-
 tools/xenstore/xenstored_control.h      |   1 +
 tools/xenstore/xenstored_core.c         | 510 ++++++++++++++++++--
 tools/xenstore/xenstored_core.h         |  40 ++
 tools/xenstore/xenstored_domain.c       | 312 +++++++++---
 tools/xenstore/xenstored_domain.h       |  14 +-
 tools/xenstore/xenstored_posix.c        |  13 +-
 tools/xenstore/xenstored_transaction.c  |  11 +-
 tools/xenstore/xenstored_watch.c        | 171 +++++--
 tools/xenstore/xenstored_watch.h        |   5 +
 27 files changed, 2103 insertions(+), 215 deletions(-)
 create mode 100644 tools/xenstore/include/xenstore_state.h

-- 
2.26.2



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v10 01/25] tools/xenstore: switch barf[_perror]() to use syslog()
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 02/25] tools/xenstore: make set_tdb_key() non-static Juergen Gross
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

When xenstored crashes due to an unrecoverable condition it is calling
either barf() or barf_perror() to issue a message and then exit().

Make sure the message is visible somewhere by using syslog()
additionally to xprintf(), as the latter will be visible only with
tracing active.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V4:
- new patch
---
 tools/xenstore/utils.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/xenstore/utils.c b/tools/xenstore/utils.c
index a1ac12584a..633ce3b4fc 100644
--- a/tools/xenstore/utils.c
+++ b/tools/xenstore/utils.c
@@ -3,6 +3,7 @@
 #include <stdarg.h>
 #include <stdlib.h>
 #include <string.h>
+#include <syslog.h>
 #include <errno.h>
 #include <unistd.h>
 #include <fcntl.h>
@@ -35,6 +36,7 @@ void barf(const char *fmt, ...)
 	va_end(arglist);
 
  	if (bytes >= 0) {
+		syslog(LOG_CRIT, "%s\n", str);
 		xprintf("%s\n", str);
 		free(str);
 	}
@@ -54,6 +56,7 @@ void barf_perror(const char *fmt, ...)
 	va_end(arglist);
 
  	if (bytes >= 0) {
+		syslog(LOG_CRIT, "%s: %s\n", str, strerror(err));
 		xprintf("%s: %s\n", str, strerror(err));
 		free(str);
 	}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 02/25] tools/xenstore: make set_tdb_key() non-static
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 01/25] tools/xenstore: switch barf[_perror]() to use syslog() Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 03/25] tools/xenstore: remove unused cruft from xenstored_domain.c Juergen Gross
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant

set_tdb_key() can be used by destroy_node(), too. So remove the static
attribute and move it to xenstored_core.c.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
V5:
- new patch

V6:
- add comment (Julien Grall)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_core.c        | 14 +++++++++++---
 tools/xenstore/xenstored_core.h        |  2 ++
 tools/xenstore/xenstored_transaction.c |  6 ------
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 3082a36d3a..ab1c7835b8 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -352,6 +352,16 @@ static void initialize_fds(int *p_sock_pollfd_idx, int *ptimeout)
 	}
 }
 
+void set_tdb_key(const char *name, TDB_DATA *key)
+{
+	/*
+	 * Dropping const is fine here, as the key will never be modified
+	 * by TDB.
+	 */
+	key->dptr = (char *)name;
+	key->dsize = strlen(name);
+}
+
 /*
  * If it fails, returns NULL and sets errno.
  * Temporary memory allocations will be done with ctx.
@@ -985,9 +995,7 @@ static int destroy_node(void *_node)
 	if (streq(node->name, "/"))
 		corrupt(NULL, "Destroying root node!");
 
-	key.dptr = (void *)node->name;
-	key.dsize = strlen(node->name);
-
+	set_tdb_key(node->name, &key);
 	tdb_delete(tdb_ctx, key);
 
 	domain_entry_dec(talloc_parent(node), node);
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 4c6c3d6f20..fb59d862a2 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -220,6 +220,8 @@ extern xengnttab_handle **xgt_handle;
 
 int remember_string(struct hashtable *hash, const char *str);
 
+void set_tdb_key(const char *name, TDB_DATA *key);
+
 #endif /* _XENSTORED_CORE_H */
 
 /*
diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index 2881f3b2e4..52355f4ed8 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -168,12 +168,6 @@ struct transaction
 extern int quota_max_transaction;
 uint64_t generation;
 
-static void set_tdb_key(const char *name, TDB_DATA *key)
-{
-	key->dptr = (char *)name;
-	key->dsize = strlen(name);
-}
-
 static struct accessed_node *find_accessed_node(struct transaction *trans,
 						const char *name)
 {
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 03/25] tools/xenstore: remove unused cruft from xenstored_domain.c
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 01/25] tools/xenstore: switch barf[_perror]() to use syslog() Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 02/25] tools/xenstore: make set_tdb_key() non-static Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec Juergen Gross
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

domain->remote_port and restore_existing_connections() are useless and
can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V7:
- new patch
---
 tools/xenstore/xenstored_core.c   |  3 ---
 tools/xenstore/xenstored_domain.c | 11 -----------
 tools/xenstore/xenstored_domain.h |  3 ---
 3 files changed, 17 deletions(-)

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index ab1c7835b8..50986f8b29 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -2087,9 +2087,6 @@ int main(int argc, char *argv[])
 	if (!no_domain_init)
 		domain_init();
 
-	/* Restore existing connections. */
-	restore_existing_connections();
-
 	if (outputpid) {
 		printf("%ld\n", (long)getpid());
 		fflush(stdout);
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 7d348d57f3..ed8e83b06b 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -54,10 +54,6 @@ struct domain
 	/* Event channel port */
 	evtchn_port_t port;
 
-	/* The remote end of the event channel, used only to validate
-	   repeated domain introductions. */
-	evtchn_port_t remote_port;
-
 	/* Domain path in store. */
 	char *path;
 
@@ -382,7 +378,6 @@ static int new_domain(struct domain *domain, int port)
 	domain->conn->domain = domain;
 	domain->conn->id = domain->domid;
 
-	domain->remote_port = port;
 	domain->nbentry = 0;
 	domain->nbwatch = 0;
 
@@ -470,7 +465,6 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
 			xenevtchn_unbind(xce_handle, domain->port);
 		rc = xenevtchn_bind_interdomain(xce_handle, domid, port);
 		domain->port = (rc == -1) ? 0 : rc;
-		domain->remote_port = port;
 	}
 
 	domain_conn_reset(domain);
@@ -636,11 +630,6 @@ const char *get_implicit_path(const struct connection *conn)
 	return conn->domain->path;
 }
 
-/* Restore existing connections. */
-void restore_existing_connections(void)
-{
-}
-
 static int set_dom_perms_default(struct node_perms *perms)
 {
 	perms->num = 1;
diff --git a/tools/xenstore/xenstored_domain.h b/tools/xenstore/xenstored_domain.h
index 5e00087206..66e0a12654 100644
--- a/tools/xenstore/xenstored_domain.h
+++ b/tools/xenstore/xenstored_domain.h
@@ -47,9 +47,6 @@ void domain_init(void);
 /* Returns the implicit path of a connection (only domains have this) */
 const char *get_implicit_path(const struct connection *conn);
 
-/* Read existing connection information from store. */
-void restore_existing_connections(void);
-
 /* Can connection attached to domain read/write. */
 bool domain_can_read(struct connection *conn);
 bool domain_can_write(struct connection *conn);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (2 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 03/25] tools/xenstore: remove unused cruft from xenstored_domain.c Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 18:09   ` Andrew Cooper
  2020-12-15 16:35 ` [PATCH v10 05/25] tools/xenstore: refactor XS_CONTROL handling Juergen Gross
                   ` (21 subsequent siblings)
  25 siblings, 1 reply; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

Today the file descriptor for the access of the event channel driver
is being closed in case of exec(2). For the support of live update of
a daemon using libxenevtchn this can be problematic, so add a way to
keep that file descriptor open.

Add support of a flag XENEVTCHN_NO_CLOEXEC for xenevtchn_open() which
will result in _not_ setting O_CLOEXEC when opening the event channel
driver node.

The caller can then obtain the file descriptor via xenevtchn_fd().

Add an alternative open function xenevtchn_open_fd() which takes that
file descriptor as an additional parameter. This allows to allocate a
xenevtchn_handle and to associate it with that file descriptor.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V7:
- new patch

V8:
- some minor comments by Julien Grall addressed

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/include/xenevtchn.h          | 16 ++++++++++-
 tools/libs/evtchn/Makefile         |  2 +-
 tools/libs/evtchn/core.c           | 45 ++++++++++++++++++++++++------
 tools/libs/evtchn/freebsd.c        |  9 ++++--
 tools/libs/evtchn/libxenevtchn.map |  4 +++
 tools/libs/evtchn/linux.c          |  9 ++++--
 tools/libs/evtchn/minios.c         |  6 +++-
 tools/libs/evtchn/netbsd.c         |  2 +-
 tools/libs/evtchn/private.h        |  2 +-
 tools/libs/evtchn/solaris.c        |  2 +-
 10 files changed, 79 insertions(+), 18 deletions(-)

diff --git a/tools/include/xenevtchn.h b/tools/include/xenevtchn.h
index 91821ee56d..dadc46ea36 100644
--- a/tools/include/xenevtchn.h
+++ b/tools/include/xenevtchn.h
@@ -64,11 +64,25 @@ struct xentoollog_logger;
  *
  * Calling xenevtchn_close() is the only safe operation on a
  * xenevtchn_handle which has been inherited.
+ *
+ * Setting XENEVTCHN_NO_CLOEXEC allows to keep the file descriptor used
+ * for the event channel driver open across exec(2). In order to be able
+ * to use that file descriptor the new binary activated via exec(2) has
+ * to call xenevtchn_open_fd() with that file descriptor as parameter in
+ * order to associate it with a new handle. The file descriptor can be
+ * obtained via xenevtchn_fd() before calling exec(2).
  */
-/* Currently no flags are defined */
+
+/* Don't set O_CLOEXEC when opening event channel driver node. */
+#define XENEVTCHN_NO_CLOEXEC 0x01
+
 xenevtchn_handle *xenevtchn_open(struct xentoollog_logger *logger,
                                  unsigned open_flags);
 
+/* Flag XENEVTCHN_NO_CLOEXEC is ignored by xenevtchn_open_fd(). */
+xenevtchn_handle *xenevtchn_open_fd(struct xentoollog_logger *logger,
+                                    int fd, unsigned open_flags);
+
 /*
  * Close a handle previously allocated with xenevtchn_open().
  */
diff --git a/tools/libs/evtchn/Makefile b/tools/libs/evtchn/Makefile
index ad01a17b3d..b8c37b5b97 100644
--- a/tools/libs/evtchn/Makefile
+++ b/tools/libs/evtchn/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR    = 1
-MINOR    = 1
+MINOR    = 2
 
 SRCS-y                 += core.c
 SRCS-$(CONFIG_Linux)   += linux.c
diff --git a/tools/libs/evtchn/core.c b/tools/libs/evtchn/core.c
index aff6ecfaa0..fa1e44b6ea 100644
--- a/tools/libs/evtchn/core.c
+++ b/tools/libs/evtchn/core.c
@@ -13,6 +13,7 @@
  * License along with this library; If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <errno.h>
 #include <unistd.h>
 #include <stdlib.h>
 
@@ -28,10 +29,9 @@ static int all_restrict_cb(Xentoolcore__Active_Handle *ah, domid_t domid) {
     return xenevtchn_restrict(xce, domid);
 }
 
-xenevtchn_handle *xenevtchn_open(xentoollog_logger *logger, unsigned open_flags)
+static xenevtchn_handle *xenevtchn_alloc_handle(xentoollog_logger *logger)
 {
     xenevtchn_handle *xce = malloc(sizeof(*xce));
-    int rc;
 
     if (!xce) return NULL;
 
@@ -49,19 +49,48 @@ xenevtchn_handle *xenevtchn_open(xentoollog_logger *logger, unsigned open_flags)
         if (!xce->logger) goto err;
     }
 
-    rc = osdep_evtchn_open(xce);
-    if ( rc  < 0 ) goto err;
+    return xce;
+
+err:
+    xenevtchn_close(xce);
+    return NULL;
+}
+
+xenevtchn_handle *xenevtchn_open(xentoollog_logger *logger, unsigned open_flags)
+{
+    xenevtchn_handle *xce = xenevtchn_alloc_handle(logger);
+    int rc;
+
+    if (!xce) return NULL;
+
+    rc = osdep_evtchn_open(xce, open_flags);
+    if ( rc < 0 ) goto err;
 
     return xce;
 
 err:
-    xentoolcore__deregister_active_handle(&xce->tc_ah);
-    osdep_evtchn_close(xce);
-    xtl_logger_destroy(xce->logger_tofree);
-    free(xce);
+    xenevtchn_close(xce);
     return NULL;
 }
 
+xenevtchn_handle *xenevtchn_open_fd(struct xentoollog_logger *logger,
+                                    int fd, unsigned open_flags)
+{
+    xenevtchn_handle *xce;
+
+    if (open_flags & ~XENEVTCHN_NO_CLOEXEC) {
+        errno = EINVAL;
+        return NULL;
+    }
+
+    xce = xenevtchn_alloc_handle(logger);
+    if (!xce) return NULL;
+
+    xce->fd = fd;
+
+    return xce;
+}
+
 int xenevtchn_close(xenevtchn_handle *xce)
 {
     int rc;
diff --git a/tools/libs/evtchn/freebsd.c b/tools/libs/evtchn/freebsd.c
index 6564ed4c44..635c10f09f 100644
--- a/tools/libs/evtchn/freebsd.c
+++ b/tools/libs/evtchn/freebsd.c
@@ -31,9 +31,14 @@
 
 #define EVTCHN_DEV      "/dev/xen/evtchn"
 
-int osdep_evtchn_open(xenevtchn_handle *xce)
+int osdep_evtchn_open(xenevtchn_handle *xce, unsigned int open_flags)
 {
-    int fd = open(EVTCHN_DEV, O_RDWR|O_CLOEXEC);
+    int flags = O_RDWR;
+    int fd;
+
+    if ( !(open_flags & XENEVTCHN_NO_CLOEXEC) )
+        flags |= O_CLOEXEC;
+    fd = open(EVTCHN_DEV, flags);
     if ( fd == -1 )
         return -1;
     xce->fd = fd;
diff --git a/tools/libs/evtchn/libxenevtchn.map b/tools/libs/evtchn/libxenevtchn.map
index 33a38f953a..722fa026f9 100644
--- a/tools/libs/evtchn/libxenevtchn.map
+++ b/tools/libs/evtchn/libxenevtchn.map
@@ -21,3 +21,7 @@ VERS_1.1 {
 	global:
 		xenevtchn_restrict;
 } VERS_1.0;
+VERS_1.2 {
+	global:
+		xenevtchn_open_fd;
+} VERS_1.1;
diff --git a/tools/libs/evtchn/linux.c b/tools/libs/evtchn/linux.c
index 17e64aea32..2297488f88 100644
--- a/tools/libs/evtchn/linux.c
+++ b/tools/libs/evtchn/linux.c
@@ -34,9 +34,14 @@
 #define O_CLOEXEC 0
 #endif
 
-int osdep_evtchn_open(xenevtchn_handle *xce)
+int osdep_evtchn_open(xenevtchn_handle *xce, unsigned int open_flags)
 {
-    int fd = open("/dev/xen/evtchn", O_RDWR|O_CLOEXEC);
+    int flags = O_RDWR;
+    int fd;
+
+    if ( !(open_flags & XENEVTCHN_NO_CLOEXEC) )
+        flags |= O_CLOEXEC;
+    fd = open("/dev/xen/evtchn", flags);
     if ( fd == -1 )
         return -1;
     xce->fd = fd;
diff --git a/tools/libs/evtchn/minios.c b/tools/libs/evtchn/minios.c
index 9cd7636fc5..f5db021747 100644
--- a/tools/libs/evtchn/minios.c
+++ b/tools/libs/evtchn/minios.c
@@ -63,7 +63,11 @@ static void port_dealloc(struct evtchn_port_info *port_info) {
     free(port_info);
 }
 
-int osdep_evtchn_open(xenevtchn_handle *xce)
+/*
+ * XENEVTCHN_NO_CLOEXEC is being ignored, as there is no exec() call supported
+ * in Mini-OS.
+ */
+int osdep_evtchn_open(xenevtchn_handle *xce, unsigned int open_flags)
 {
     int fd = alloc_fd(FTYPE_EVTCHN);
     if ( fd == -1 )
diff --git a/tools/libs/evtchn/netbsd.c b/tools/libs/evtchn/netbsd.c
index 8b8545d2f9..7c73d1c599 100644
--- a/tools/libs/evtchn/netbsd.c
+++ b/tools/libs/evtchn/netbsd.c
@@ -31,7 +31,7 @@
 
 #define EVTCHN_DEV_NAME  "/dev/xenevt"
 
-int osdep_evtchn_open(xenevtchn_handle *xce)
+int osdep_evtchn_open(xenevtchn_handle *xce, unsigned int open_flags)
 {
     int fd = open(EVTCHN_DEV_NAME, O_NONBLOCK|O_RDWR);
     if ( fd == -1 )
diff --git a/tools/libs/evtchn/private.h b/tools/libs/evtchn/private.h
index 31e595bea2..bcac2a191d 100644
--- a/tools/libs/evtchn/private.h
+++ b/tools/libs/evtchn/private.h
@@ -14,7 +14,7 @@ struct xenevtchn_handle {
     Xentoolcore__Active_Handle tc_ah;
 };
 
-int osdep_evtchn_open(xenevtchn_handle *xce);
+int osdep_evtchn_open(xenevtchn_handle *xce, unsigned int open_flags);
 int osdep_evtchn_close(xenevtchn_handle *xce);
 int osdep_evtchn_restrict(xenevtchn_handle *xce, domid_t domid);
 
diff --git a/tools/libs/evtchn/solaris.c b/tools/libs/evtchn/solaris.c
index dd41f62a24..7e22f7906a 100644
--- a/tools/libs/evtchn/solaris.c
+++ b/tools/libs/evtchn/solaris.c
@@ -29,7 +29,7 @@
 
 #include "private.h"
 
-int osdep_evtchn_open(xenevtchn_handle *xce)
+int osdep_evtchn_open(xenevtchn_handle *xce, unsigned int open_flags)
 {
     int fd;
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 05/25] tools/xenstore: refactor XS_CONTROL handling
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (3 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control Juergen Gross
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall, Paul Durrant

In order to allow control commands with binary data refactor handling
of XS_CONTROL:

- get primary command first
- add maximum number of additional parameters to pass to command
  handler

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
V2:
- add comment regarding max_pars (Pawel Wieczorkiewicz)

V3:
- addressed Paul's comments

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c | 34 ++++++++++++++++++++----------
 tools/xenstore/xenstored_core.c    |  3 +--
 tools/xenstore/xenstored_core.h    |  1 +
 3 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 8d48ab4820..8d29db8270 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -30,6 +30,14 @@ struct cmd_s {
 	char *cmd;
 	int (*func)(void *, struct connection *, char **, int);
 	char *pars;
+	/*
+	 * max_pars can be used to limit the size of the parameter vector,
+	 * e.g. in case of large binary parts in the parameters.
+	 * The command is included in the count, so 1 means just the command
+	 * without any parameter.
+	 * 0 == no limit (the default)
+	 */
+	unsigned int max_pars;
 };
 
 static int do_control_check(void *ctx, struct connection *conn,
@@ -194,25 +202,29 @@ static int do_control_help(void *ctx, struct connection *conn,
 
 int do_control(struct connection *conn, struct buffered_data *in)
 {
-	int num;
-	int cmd;
-	char **vec;
+	unsigned int cmd, num, off;
+	char **vec = NULL;
 
 	if (conn->id != 0)
 		return EACCES;
 
-	num = xs_count_strings(in->buffer, in->used);
-	if (num < 1)
+	off = get_string(in, 0);
+	if (!off)
+		return EINVAL;
+	for (cmd = 0; cmd < ARRAY_SIZE(cmds); cmd++)
+		if (streq(in->buffer, cmds[cmd].cmd))
+			break;
+	if (cmd == ARRAY_SIZE(cmds))
 		return EINVAL;
+
+	num = xs_count_strings(in->buffer, in->used);
+	if (cmds[cmd].max_pars)
+		num = min(num, cmds[cmd].max_pars);
 	vec = talloc_array(in, char *, num);
 	if (!vec)
 		return ENOMEM;
-	if (get_strings(in, vec, num) != num)
+	if (get_strings(in, vec, num) < num)
 		return EIO;
 
-	for (cmd = 0; cmd < ARRAY_SIZE(cmds); cmd++)
-		if (streq(vec[0], cmds[cmd].cmd))
-			return cmds[cmd].func(in, conn, vec + 1, num - 1);
-
-	return EINVAL;
+	return cmds[cmd].func(in, conn, vec + 1, num - 1);
 }
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 50986f8b29..e1b92c3dc8 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -620,8 +620,7 @@ static struct buffered_data *new_buffer(void *ctx)
 /* Return length of string (including nul) at this offset.
  * If there is no nul, returns 0 for failure.
  */
-static unsigned int get_string(const struct buffered_data *data,
-			       unsigned int offset)
+unsigned int get_string(const struct buffered_data *data, unsigned int offset)
 {
 	const char *nul;
 
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index fb59d862a2..27826c125c 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -142,6 +142,7 @@ const char *onearg(struct buffered_data *in);
 /* Break input into vectors, return the number, fill in up to num of them. */
 unsigned int get_strings(struct buffered_data *data,
 			 char *vec[], unsigned int num);
+unsigned int get_string(const struct buffered_data *data, unsigned int offset);
 
 void send_reply(struct connection *conn, enum xsd_sockmsg_type type,
 		const void *data, unsigned int len);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (4 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 05/25] tools/xenstore: refactor XS_CONTROL handling Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2021-01-06 14:42   ` Edwin Torok
  2020-12-15 16:35 ` [PATCH v10 07/25] tools/xenstore: add basic live-update command parsing Juergen Gross
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Paul Durrant, Julien Grall

Add the "live-update" command to xenstore-control enabling updating
xenstored to a new version in a running Xen system.

With -c <arg> it is possible to pass a different command line to the
new instance of xenstored. This will replace the command line used
for the invocation of the just running xenstored instance.

The running xenstored (or xenstore-stubdom) needs to support live
updating, of course.

For now just add a small dummy handler to C xenstore denying any
live update action.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V2:
- add 0 byte after kernel chunk
- add comment regrading add_to_buf() semantics (Pawel Wieczorkiewicz)
- use %u for unsigned in format (Pawel Wieczorkiewicz)
- explain buffer size better (Pawel Wieczorkiewicz)
- add loop around "-s" option for client side retry in case of timeout

V3:
- add live-update command to docs/misc/xenstore.txt (Paul Durrant)
- fix indent (Paul Durrant)

V4:
- made several parameters const (Julien Grall)
- added more details to xenstore.txt (Julien Grall)

V5:
- set old_binary to NULL initially (Paul Durrant)

V6:
- use strerror(errno) in error message (Julien Grall)

V10:
- make binary specification mandatory (Andrew Cooper)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 docs/misc/xenstore.txt             |  21 ++
 tools/xenstore/Makefile            |   3 +-
 tools/xenstore/xenstore_control.c  | 332 +++++++++++++++++++++++++++--
 tools/xenstore/xenstored_control.c |  30 +++
 4 files changed, 369 insertions(+), 17 deletions(-)

diff --git a/docs/misc/xenstore.txt b/docs/misc/xenstore.txt
index 2081f20f55..1480742330 100644
--- a/docs/misc/xenstore.txt
+++ b/docs/misc/xenstore.txt
@@ -317,6 +317,27 @@ CONTROL			<command>|[<parameters>|]
 	Current commands are:
 	check
 		checks xenstored innards
+	live-update|<params>|+
+		perform a live-update of the Xenstore daemon, only to
+		be used via xenstore-control command.
+		<params> are implementation specific and are used for
+		different steps of the live-update processing. Currently
+		supported <params> are:
+		-f <file>  specify new daemon binary
+		-b <size>  specify size of new stubdom binary
+		-d <chunk-size> <binary-chunk>  transfer chunk of new
+			stubdom binary
+		-c <pars>  specify new command line to use
+		-s [-t <sec>] [-F]  start live update process (-t specifies
+			timeout in seconds to wait for active transactions
+			to finish, default is 60 seconds; -F will force
+			live update to happen even with running transactions
+			after timeout elapsed)
+		-a  abort live update handling
+		All sub-options will return "OK" in case of success or an
+		error string in case of failure. -s can return "BUSY" in case
+		of an active transaction, a retry of -s can be done in that
+		case.
 	log|on
 		turn xenstore logging on
 	log|off
diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..ab89e22d3a 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -11,6 +11,7 @@ CFLAGS += -include $(XEN_ROOT)/tools/config.h
 CFLAGS += -I./include
 CFLAGS += $(CFLAGS_libxenevtchn)
 CFLAGS += $(CFLAGS_libxenctrl)
+CFLAGS += $(CFLAGS_libxenguest)
 CFLAGS += $(CFLAGS_libxentoolcore)
 CFLAGS += -DXEN_LIB_STORED="\"$(XEN_LIB_STORED)\""
 CFLAGS += -DXEN_RUN_STORED="\"$(XEN_RUN_STORED)\""
@@ -81,7 +82,7 @@ xenstore: xenstore_client.o
 	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xenstore-control: xenstore_control.o
-	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
+	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o
 	$(CC) $^ $(LDFLAGS) -o $@ $(APPEND_LDFLAGS)
diff --git a/tools/xenstore/xenstore_control.c b/tools/xenstore/xenstore_control.c
index afa04495a7..5ca015a07d 100644
--- a/tools/xenstore/xenstore_control.c
+++ b/tools/xenstore/xenstore_control.c
@@ -1,9 +1,311 @@
+#define _GNU_SOURCE
+#include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <time.h>
+#include <xenctrl.h>
+#include <xenguest.h>
 
 #include "xenstore.h"
 
+/* Add a string plus terminating 0 byte to buf, returning new len. */
+static int add_to_buf(char **buf, const char *val, int len)
+{
+    int vallen = strlen(val) + 1;
+
+    if (len < 0)
+        return -1;
+
+    *buf = realloc(*buf, len + vallen);
+    if (!*buf)
+        return -1;
+
+    strcpy(*buf + len, val);
+
+    return len + vallen;
+}
+
+static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to)
+{
+    int len = 0;
+    char *buf = NULL, *ret;
+    time_t time_start;
+
+    if (asprintf(&ret, "%u", to) < 0)
+        return 1;
+    len = add_to_buf(&buf, "-s", len);
+    len = add_to_buf(&buf, "-t", len);
+    len = add_to_buf(&buf, ret, len);
+    free(ret);
+    if (force)
+        len = add_to_buf(&buf, "-F", len);
+    if (len < 0)
+        return 1;
+
+    for (time_start = time(NULL); time(NULL) - time_start < to;) {
+        ret = xs_control_command(xsh, "live-update", buf, len);
+        if (!ret)
+            goto err;
+        if (strcmp(ret, "BUSY"))
+            break;
+    }
+
+    if (strcmp(ret, "OK"))
+        goto err;
+
+    free(buf);
+    free(ret);
+
+    return 0;
+
+ err:
+    fprintf(stderr, "Starting live update failed:\n%s\n",
+            ret ? : strerror(errno));
+    free(buf);
+    free(ret);
+
+    return 3;
+}
+
+static int live_update_cmdline(struct xs_handle *xsh, const char *cmdline)
+{
+    int len = 0, rc = 0;
+    char *buf = NULL, *ret;
+
+    len = add_to_buf(&buf, "-c", len);
+    len = add_to_buf(&buf, cmdline, len);
+    if (len < 0)
+        return 1;
+
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Setting update binary failed:\n%s\n",
+                ret ? : strerror(errno));
+        rc = 3;
+    }
+    free(ret);
+
+    return rc;
+}
+
+static int send_kernel_blob(struct xs_handle *xsh, const char *binary)
+{
+    int rc = 0, len = 0;
+    xc_interface *xch;
+    struct xc_dom_image *dom;
+    char *ret, *buf = NULL;
+    size_t off, sz;
+#define BLOB_CHUNK_SZ 2048
+
+    xch = xc_interface_open(NULL, NULL, 0);
+    if (!xch) {
+        fprintf(stderr, "xc_interface_open() failed\n");
+        return 1;
+    }
+
+    dom = xc_dom_allocate(xch, NULL, NULL);
+    if (!dom) {
+        rc = 1;
+        goto out_close;
+    }
+
+    rc = xc_dom_kernel_file(dom, binary);
+    if (rc) {
+        rc = 1;
+        goto out_rel;
+    }
+
+    if (asprintf(&ret, "%zu", dom->kernel_size) < 0) {
+        rc = 1;
+        goto out_rel;
+    }
+    len = add_to_buf(&buf, "-b", len);
+    len = add_to_buf(&buf, ret, len);
+    free(ret);
+    if (len < 0) {
+        rc = 1;
+        goto out_rel;
+    }
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Starting live update failed:\n%s\n",
+                ret ? : strerror(errno));
+        rc = 3;
+    }
+    free(ret);
+    if (rc)
+        goto out_rel;
+
+    /* buf capable to hold "-d" <1..2048> BLOB_CHUNK_SZ and a terminating 0. */
+    buf = malloc(3 + 5 + BLOB_CHUNK_SZ + 1);
+    if (!buf) {
+        rc = 1;
+        goto out_rel;
+    }
+
+    strcpy(buf, "-d");
+    sz = BLOB_CHUNK_SZ;
+    for (off = 0; off < dom->kernel_size; off += BLOB_CHUNK_SZ) {
+        if (dom->kernel_size - off < BLOB_CHUNK_SZ)
+            sz = dom->kernel_size - off;
+        sprintf(buf + 3, "%zu", sz);
+        len = 3 + strlen(buf + 3) + 1;
+        memcpy(buf + len, dom->kernel_blob + off, sz);
+        buf[len + sz] = 0;
+        len += sz + 1;
+        ret = xs_control_command(xsh, "live-update", buf, len);
+        if (!ret || strcmp(ret, "OK")) {
+            fprintf(stderr, "Transfer of new binary failed:\n%s\n",
+                    ret ? : strerror(errno));
+            rc = 3;
+            free(ret);
+            break;
+        }
+        free(ret);
+    }
+
+    free(buf);
+
+ out_rel:
+    xc_dom_release(dom);
+
+ out_close:
+    xc_interface_close(xch);
+
+    return rc;
+}
+
+/*
+ * Live update of Xenstore stubdom
+ *
+ * Sequence of actions:
+ * 1. transfer new stubdom binary
+ *    a) specify size
+ *    b) transfer unpacked binary in chunks
+ * 2. transfer new cmdline (optional)
+ * 3. start update (includes flags)
+ */
+static int live_update_stubdom(struct xs_handle *xsh, const char *binary,
+                               const char *cmdline, bool force, unsigned int to)
+{
+    int rc;
+
+    rc = send_kernel_blob(xsh, binary);
+    if (rc)
+        goto abort;
+
+    if (cmdline) {
+        rc = live_update_cmdline(xsh, cmdline);
+        if (rc)
+            goto abort;
+    }
+
+    rc = live_update_start(xsh, force, to);
+    if (rc)
+        goto abort;
+
+    return 0;
+
+ abort:
+    xs_control_command(xsh, "live-update", "-a", 3);
+    return rc;
+}
+
+/*
+ * Live update of Xenstore daemon
+ *
+ * Sequence of actions:
+ * 1. transfer new binary filename
+ * 2. transfer new cmdline (optional)
+ * 3. start update (includes flags)
+ */
+static int live_update_daemon(struct xs_handle *xsh, const char *binary,
+                              const char *cmdline, bool force, unsigned int to)
+{
+    int len = 0, rc;
+    char *buf = NULL, *ret;
+
+    len = add_to_buf(&buf, "-f", len);
+    len = add_to_buf(&buf, binary, len);
+    if (len < 0)
+        return 1;
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Setting update binary failed:\n%s\n",
+                ret ? : strerror(errno));
+        free(ret);
+        return 3;
+    }
+    free(ret);
+
+    if (cmdline) {
+        rc = live_update_cmdline(xsh, cmdline);
+        if (rc)
+            goto abort;
+    }
+
+    rc = live_update_start(xsh, force, to);
+    if (rc)
+        goto abort;
+
+    return 0;
+
+ abort:
+    xs_control_command(xsh, "live-update", "-a", 3);
+    return rc;
+}
+
+static int live_update(struct xs_handle *xsh, int argc, char **argv)
+{
+    int rc = 0;
+    unsigned int i, to = 60;
+    char *binary = NULL, *cmdline = NULL, *val;
+    bool force = false;
+
+    for (i = 0; i < argc; i++) {
+        if (!strcmp(argv[i], "-c")) {
+            i++;
+            if (i == argc) {
+                fprintf(stderr, "Missing command line value\n");
+                rc = 2;
+                goto out;
+            }
+            cmdline = argv[i];
+        } else if (!strcmp(argv[i], "-t")) {
+            i++;
+            if (i == argc) {
+                fprintf(stderr, "Missing timeout value\n");
+                rc = 2;
+                goto out;
+            }
+            to = atoi(argv[i]);
+        } else if (!strcmp(argv[i], "-F"))
+            force = true;
+        else
+            binary = argv[i];
+    }
+
+    if (!binary) {
+        fprintf(stderr, "Missing binary specification\n");
+        rc = 2;
+        goto out;
+    }
+
+    val = xs_read(xsh, XBT_NULL, "/tool/xenstored/domid", &i);
+    if (val)
+        rc = live_update_stubdom(xsh, binary, cmdline, force, to);
+    else
+        rc = live_update_daemon(xsh, binary, cmdline, force, to);
+
+    free(val);
+
+ out:
+    return rc;
+}
 
 int main(int argc, char **argv)
 {
@@ -20,22 +322,6 @@ int main(int argc, char **argv)
         goto out;
     }
 
-    for (p = 2; p < argc; p++)
-        len += strlen(argv[p]) + 1;
-    if (len) {
-        par = malloc(len);
-        if (!par) {
-            fprintf(stderr, "Allocation error.\n");
-            rc = 1;
-            goto out;
-        }
-        len = 0;
-        for (p = 2; p < argc; p++) {
-            memcpy(par + len, argv[p], strlen(argv[p]) + 1);
-            len += strlen(argv[p]) + 1;
-        }
-    }
-
     xsh = xs_open(0);
     if (xsh == NULL) {
         fprintf(stderr, "Failed to contact Xenstored.\n");
@@ -43,6 +329,19 @@ int main(int argc, char **argv)
         goto out;
     }
 
+    if (!strcmp(argv[1], "live-update")) {
+        rc = live_update(xsh, argc - 2, argv + 2);
+        goto out_close;
+    }
+
+    for (p = 2; p < argc; p++)
+        len = add_to_buf(&par, argv[p], len);
+    if (len < 0) {
+        fprintf(stderr, "Allocation error.\n");
+        rc = 1;
+        goto out_close;
+    }
+
     ret = xs_control_command(xsh, argv[1], par, len);
     if (!ret) {
         rc = 3;
@@ -59,6 +358,7 @@ int main(int argc, char **argv)
     } else if (strlen(ret) > 0)
         printf("%s\n", ret);
 
+ out_close:
     xs_close(xsh);
 
  out:
diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 8d29db8270..00fda5acdb 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -149,11 +149,41 @@ static int do_control_print(void *ctx, struct connection *conn,
 	return 0;
 }
 
+static int do_control_lu(void *ctx, struct connection *conn,
+			 char **vec, int num)
+{
+	const char *resp;
+
+	resp = talloc_strdup(ctx, "NYI");
+	send_reply(conn, XS_CONTROL, resp, strlen(resp) + 1);
+	return 0;
+}
+
 static int do_control_help(void *, struct connection *, char **, int);
 
 static struct cmd_s cmds[] = {
 	{ "check", do_control_check, "" },
 	{ "log", do_control_log, "on|off" },
+
+	/*
+	 * The parameters are those of the xenstore-control utility!
+	 * Depending on environment (Mini-OS or daemon) the live-update
+	 * sequence is split into several sub-operations:
+	 * 1. Specification of new binary
+	 *    daemon:  -f <filename>
+	 *    Mini-OS: -b <binary-size>
+	 *             -d <size> <data-bytes> (multiple of those)
+	 * 2. New command-line (optional): -c <cmdline>
+	 * 3. Start of update: -s [-F] [-t <timeout>]
+	 * Any sub-operation needs to respond with the string "OK" in case
+	 * of success, any other response indicates failure.
+	 * A started live-update sequence can be aborted via "-a" (not
+	 * needed in case of failure for the first or last live-update
+	 * sub-operation).
+	 */
+	{ "live-update", do_control_lu,
+		"[-c <cmdline>] [-F] [-t <timeout>] <file>\n"
+		"    Default timeout is 60 seconds.", 4 },
 #ifdef __MINIOS__
 	{ "memreport", do_control_memreport, "" },
 #else
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 07/25] tools/xenstore: add basic live-update command parsing
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (5 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 08/25] tools/xenstore: introduce live update status block Juergen Gross
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

Add the basic parts for parsing the live-update control command.

For now only add the parameter evaluation and calling appropriate
functions. Those function only print a message for now and return
success.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V2:
- keep consistent style in lu_arch() (Pawel Wieczorkiewicz)
- fix handling of force flag (Pawel Wieczorkiewicz)
- use xprintf() instead of trace() for better stubdom diag
- add conn parameter to subfunctions

V4:
- make several parameters/variables const (Julien Grall)
- don't reject an option specified multiple times (Julien Grall)
- use syslog() for messages

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c | 105 ++++++++++++++++++++++++++++-
 1 file changed, 104 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 00fda5acdb..e3f0d34528 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -19,7 +19,9 @@
 #include <errno.h>
 #include <stdarg.h>
 #include <stdio.h>
+#include <stdlib.h>
 #include <string.h>
+#include <syslog.h>
 
 #include "utils.h"
 #include "talloc.h"
@@ -149,12 +151,113 @@ static int do_control_print(void *ctx, struct connection *conn,
 	return 0;
 }
 
+static const char *lu_abort(const void *ctx, struct connection *conn)
+{
+	syslog(LOG_INFO, "live-update: abort\n");
+	return NULL;
+}
+
+static const char *lu_cmdline(const void *ctx, struct connection *conn,
+			      const char *cmdline)
+{
+	syslog(LOG_INFO, "live-update: cmdline %s\n", cmdline);
+	return NULL;
+}
+
+#ifdef __MINIOS__
+static const char *lu_binary_alloc(const void *ctx, struct connection *conn,
+				   unsigned long size)
+{
+	syslog(LOG_INFO, "live-update: binary size %lu\n", size);
+	return NULL;
+}
+
+static const char *lu_binary_save(const void *ctx, struct connection *conn,
+				  unsigned int size, const char *data)
+{
+	return NULL;
+}
+
+static const char *lu_arch(const void *ctx, struct connection *conn,
+			   char **vec, int num)
+{
+	if (num == 2 && !strcmp(vec[0], "-b"))
+		return lu_binary_alloc(ctx, conn, atol(vec[1]));
+	if (num > 2 && !strcmp(vec[0], "-d"))
+		return lu_binary_save(ctx, conn, atoi(vec[1]), vec[2]);
+
+	errno = EINVAL;
+	return NULL;
+}
+#else
+static const char *lu_binary(const void *ctx, struct connection *conn,
+			     const char *filename)
+{
+	syslog(LOG_INFO, "live-update: binary %s\n", filename);
+	return NULL;
+}
+
+static const char *lu_arch(const void *ctx, struct connection *conn,
+			   char **vec, int num)
+{
+	if (num == 2 && !strcmp(vec[0], "-f"))
+		return lu_binary(ctx, conn, vec[1]);
+
+	errno = EINVAL;
+	return NULL;
+}
+#endif
+
+static const char *lu_start(const void *ctx, struct connection *conn,
+			    bool force, unsigned int to)
+{
+	syslog(LOG_INFO, "live-update: start, force=%d, to=%u\n", force, to);
+	return NULL;
+}
+
 static int do_control_lu(void *ctx, struct connection *conn,
 			 char **vec, int num)
 {
 	const char *resp;
+	const char *ret = NULL;
+	unsigned int i;
+	bool force = false;
+	unsigned int to = 0;
+
+	if (num < 1)
+		return EINVAL;
+
+	if (!strcmp(vec[0], "-a")) {
+		if (num == 1)
+			ret = lu_abort(ctx, conn);
+		else
+			return EINVAL;
+	} else if (!strcmp(vec[0], "-c")) {
+		if (num == 2)
+			ret = lu_cmdline(ctx, conn, vec[1]);
+		else
+			return EINVAL;
+	} else if (!strcmp(vec[0], "-s")) {
+		for (i = 1; i < num; i++) {
+			if (!strcmp(vec[i], "-F"))
+				force = true;
+			else if (!strcmp(vec[i], "-t") && i < num - 1) {
+				i++;
+				to = atoi(vec[i]);
+			} else
+				return EINVAL;
+		}
+		ret = lu_start(ctx, conn, force, to);
+	} else {
+		errno = 0;
+		ret = lu_arch(ctx, conn, vec, num);
+		if (errno)
+			return errno;
+	}
 
-	resp = talloc_strdup(ctx, "NYI");
+	if (!ret)
+		ret = "OK";
+	resp = talloc_strdup(ctx, ret);
 	send_reply(conn, XS_CONTROL, resp, strlen(resp) + 1);
 	return 0;
 }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 08/25] tools/xenstore: introduce live update status block
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (6 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 07/25] tools/xenstore: add basic live-update command parsing Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 09/25] tools/xenstore: save new binary for live update Juergen Gross
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

Live update of Xenstore is done in multiple steps. It needs a status
block holding the current state of live update and related data. It
is allocated as child of the connection live update was started over
in order to abort live update in case the connection is closed.

Allocation of the block is done in lu_binary[_alloc](), freeing in
lu_abort() (and for now in lu_start() as long as no real live-update
is happening).

Add tests in all live-update command handlers other than lu_abort()
and lu_binary[_alloc]() for being started via the same connection
as the begin of live-update.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V2:
- use talloc_zero() for allocating the status area (Julien Grall)

V4:
- const (Julien Grall)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c | 63 ++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index e3f0d34528..7854b7f46f 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -28,6 +28,34 @@
 #include "xenstored_core.h"
 #include "xenstored_control.h"
 
+struct live_update {
+	/* For verification the correct connection is acting. */
+	struct connection *conn;
+};
+
+static struct live_update *lu_status;
+
+static int lu_destroy(void *data)
+{
+	lu_status = NULL;
+
+	return 0;
+}
+
+static const char *lu_begin(struct connection *conn)
+{
+	if (lu_status)
+		return "live-update session already active.";
+
+	lu_status = talloc_zero(conn, struct live_update);
+	if (!lu_status)
+		return "Allocation failure.";
+	lu_status->conn = conn;
+	talloc_set_destructor(lu_status, lu_destroy);
+
+	return NULL;
+}
+
 struct cmd_s {
 	char *cmd;
 	int (*func)(void *, struct connection *, char **, int);
@@ -154,6 +182,13 @@ static int do_control_print(void *ctx, struct connection *conn,
 static const char *lu_abort(const void *ctx, struct connection *conn)
 {
 	syslog(LOG_INFO, "live-update: abort\n");
+
+	if (!lu_status)
+		return "No live-update session active.";
+
+	/* Destructor will do the real abort handling. */
+	talloc_free(lu_status);
+
 	return NULL;
 }
 
@@ -161,6 +196,10 @@ static const char *lu_cmdline(const void *ctx, struct connection *conn,
 			      const char *cmdline)
 {
 	syslog(LOG_INFO, "live-update: cmdline %s\n", cmdline);
+
+	if (!lu_status || lu_status->conn != conn)
+		return "Not in live-update session.";
+
 	return NULL;
 }
 
@@ -168,13 +207,23 @@ static const char *lu_cmdline(const void *ctx, struct connection *conn,
 static const char *lu_binary_alloc(const void *ctx, struct connection *conn,
 				   unsigned long size)
 {
+	const char *ret;
+
 	syslog(LOG_INFO, "live-update: binary size %lu\n", size);
+
+	ret = lu_begin(conn);
+	if (ret)
+		return ret;
+
 	return NULL;
 }
 
 static const char *lu_binary_save(const void *ctx, struct connection *conn,
 				  unsigned int size, const char *data)
 {
+	if (!lu_status || lu_status->conn != conn)
+		return "Not in live-update session.";
+
 	return NULL;
 }
 
@@ -193,7 +242,14 @@ static const char *lu_arch(const void *ctx, struct connection *conn,
 static const char *lu_binary(const void *ctx, struct connection *conn,
 			     const char *filename)
 {
+	const char *ret;
+
 	syslog(LOG_INFO, "live-update: binary %s\n", filename);
+
+	ret = lu_begin(conn);
+	if (ret)
+		return ret;
+
 	return NULL;
 }
 
@@ -212,6 +268,13 @@ static const char *lu_start(const void *ctx, struct connection *conn,
 			    bool force, unsigned int to)
 {
 	syslog(LOG_INFO, "live-update: start, force=%d, to=%u\n", force, to);
+
+	if (!lu_status || lu_status->conn != conn)
+		return "Not in live-update session.";
+
+	/* Will be replaced by real live-update later. */
+	talloc_free(lu_status);
+
 	return NULL;
 }
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 09/25] tools/xenstore: save new binary for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (7 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 08/25] tools/xenstore: introduce live update status block Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 10/25] tools/xenstore: add command line handling " Juergen Gross
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

Save the new binary name for the daemon case and the new kernel for
stubdom in order to support live update of Xenstore..

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
 tools/xenstore/xenstored_control.c | 41 +++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 7854b7f46f..95ac1a1648 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -1,5 +1,5 @@
 /*
-    Interactive commands for Xen Store Daemon.
+Interactive commands for Xen Store Daemon.
     Copyright (C) 2017 Juergen Gross, SUSE Linux GmbH
 
     This program is free software; you can redistribute it and/or modify
@@ -22,6 +22,9 @@
 #include <stdlib.h>
 #include <string.h>
 #include <syslog.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
 
 #include "utils.h"
 #include "talloc.h"
@@ -31,6 +34,14 @@
 struct live_update {
 	/* For verification the correct connection is acting. */
 	struct connection *conn;
+
+#ifdef __MINIOS__
+	void *kernel;
+	unsigned int kernel_size;
+	unsigned int kernel_off;
+#else
+	char *filename;
+#endif
 };
 
 static struct live_update *lu_status;
@@ -215,6 +226,13 @@ static const char *lu_binary_alloc(const void *ctx, struct connection *conn,
 	if (ret)
 		return ret;
 
+	lu_status->kernel = talloc_size(lu_status, size);
+	if (!lu_status->kernel)
+		return "Allocation failure.";
+
+	lu_status->kernel_size = size;
+	lu_status->kernel_off = 0;
+
 	return NULL;
 }
 
@@ -224,6 +242,12 @@ static const char *lu_binary_save(const void *ctx, struct connection *conn,
 	if (!lu_status || lu_status->conn != conn)
 		return "Not in live-update session.";
 
+	if (lu_status->kernel_off + size > lu_status->kernel_size)
+		return "Too much kernel data.";
+
+	memcpy(lu_status->kernel + lu_status->kernel_off, data, size);
+	lu_status->kernel_off += size;
+
 	return NULL;
 }
 
@@ -243,13 +267,23 @@ static const char *lu_binary(const void *ctx, struct connection *conn,
 			     const char *filename)
 {
 	const char *ret;
+	struct stat statbuf;
 
 	syslog(LOG_INFO, "live-update: binary %s\n", filename);
 
+	if (stat(filename, &statbuf))
+		return "File not accessible.";
+	if (!(statbuf.st_mode & (S_IXOTH | S_IXGRP | S_IXUSR)))
+		return "File not executable.";
+
 	ret = lu_begin(conn);
 	if (ret)
 		return ret;
 
+	lu_status->filename = talloc_strdup(lu_status, filename);
+	if (!lu_status->filename)
+		return "Allocation failure.";
+
 	return NULL;
 }
 
@@ -272,6 +306,11 @@ static const char *lu_start(const void *ctx, struct connection *conn,
 	if (!lu_status || lu_status->conn != conn)
 		return "Not in live-update session.";
 
+#ifdef __MINIOS__
+	if (lu_status->kernel_size != lu_status->kernel_off)
+		return "Kernel not complete.";
+#endif
+
 	/* Will be replaced by real live-update later. */
 	talloc_free(lu_status);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 10/25] tools/xenstore: add command line handling for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (8 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 09/25] tools/xenstore: save new binary for live update Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 11/25] tools/xenstore: add the basic framework for doing the " Juergen Gross
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall, Paul Durrant

Updating an instance of Xenstore via live update needs to hand over
the command line parameters to the updated instance. Those can be
either the parameters used by the updated instance or new ones when
supplied when starting the live update.

So when supplied store the new command line parameters in lu_status.

As it is related add a new option -U (or --live-update") to the command
line of xenstored which will be added when starting the new instance.
This enables to perform slightly different initializations when
started as a result of live update.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 tools/xenstore/xenstored_control.c | 6 ++++++
 tools/xenstore/xenstored_core.c    | 7 ++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 95ac1a1648..2e0827b9ef 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -42,6 +42,8 @@ struct live_update {
 #else
 	char *filename;
 #endif
+
+	char *cmdline;
 };
 
 static struct live_update *lu_status;
@@ -211,6 +213,10 @@ static const char *lu_cmdline(const void *ctx, struct connection *conn,
 	if (!lu_status || lu_status->conn != conn)
 		return "Not in live-update session.";
 
+	lu_status->cmdline = talloc_strdup(lu_status, cmdline);
+	if (!lu_status->cmdline)
+		return "Allocation failure.";
+
 	return NULL;
 }
 
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index e1b92c3dc8..0dddf24327 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1969,6 +1969,7 @@ static struct option options[] = {
 	{ "internal-db", 0, NULL, 'I' },
 	{ "verbose", 0, NULL, 'V' },
 	{ "watch-nb", 1, NULL, 'W' },
+	{ "live-update", 0, NULL, 'U' },
 	{ NULL, 0, NULL, 0 } };
 
 extern void dump_conn(struct connection *conn); 
@@ -1983,11 +1984,12 @@ int main(int argc, char *argv[])
 	bool dofork = true;
 	bool outputpid = false;
 	bool no_domain_init = false;
+	bool live_update = false;
 	const char *pidfile = NULL;
 	int timeout;
 
 
-	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:A:M:T:RVW:", options,
+	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:A:M:T:RVW:U", options,
 				  NULL)) != -1) {
 		switch (opt) {
 		case 'D':
@@ -2045,6 +2047,9 @@ int main(int argc, char *argv[])
 		case 'p':
 			priv_domid = strtol(optarg, NULL, 10);
 			break;
+		case 'U':
+			live_update = true;
+			break;
 		}
 	}
 	if (optind != argc)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 11/25] tools/xenstore: add the basic framework for doing the live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (9 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 10/25] tools/xenstore: add command line handling " Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 12/25] tools/xenstore: allow live update only with no transaction active Juergen Gross
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall, Paul Durrant

Add the main framework for executing the live update. This for now
only defines the basic execution steps with empty dummy functions.
This final step returning means failure, as in case of success the
new executable will have taken over.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
V4:
- const (Julien Grall)
---
 tools/xenstore/xenstored_control.c | 39 ++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 2e0827b9ef..940b717741 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -304,9 +304,27 @@ static const char *lu_arch(const void *ctx, struct connection *conn,
 }
 #endif
 
+static const char *lu_check_lu_allowed(const void *ctx, bool force,
+				       unsigned int to)
+{
+	return NULL;
+}
+
+static const char *lu_dump_state(const void *ctx, struct connection *conn)
+{
+	return NULL;
+}
+
+static const char *lu_activate_binary(const void *ctx)
+{
+	return "Not yet implemented.";
+}
+
 static const char *lu_start(const void *ctx, struct connection *conn,
 			    bool force, unsigned int to)
 {
+	const char *ret;
+
 	syslog(LOG_INFO, "live-update: start, force=%d, to=%u\n", force, to);
 
 	if (!lu_status || lu_status->conn != conn)
@@ -317,10 +335,27 @@ static const char *lu_start(const void *ctx, struct connection *conn,
 		return "Kernel not complete.";
 #endif
 
-	/* Will be replaced by real live-update later. */
+	/* Check for state to allow live update. */
+	ret = lu_check_lu_allowed(ctx, force, to);
+	if (ret) {
+		if (!strcmp(ret, "BUSY"))
+			return ret;
+		goto out;
+	}
+
+	/* Dump out internal state, including "OK" for live update. */
+	ret = lu_dump_state(ctx, conn);
+	if (ret)
+		goto out;
+
+	/* Perform the activation of new binary. */
+	ret = lu_activate_binary(ctx);
+	/* We will reach this point only in case of failure. */
+
+ out:
 	talloc_free(lu_status);
 
-	return NULL;
+	return ret;
 }
 
 static int do_control_lu(void *ctx, struct connection *conn,
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 12/25] tools/xenstore: allow live update only with no transaction active
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (10 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 11/25] tools/xenstore: add the basic framework for doing the " Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 13/25] docs: update the xenstore migration stream documentation Juergen Gross
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

In order to simplify live update state dumping only allow live update
to happen when no transaction is active.

A timeout is used to detect guests which have a transaction active for
longer periods of time. In case such a guest is detected when trying
to do a live update it will be reported and the update will fail.

The admin can then either use a longer timeout, or use the force flag
to just ignore the transactions of such a guest, or kill the guest
before retrying.

Transactions that have been active for a shorter time than the timeout
will end in the live update starting to respond "BUSY" without
aborting the complete live update process. The xenstore-control
program will then just repeat the live update starting until a
different result is returned.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
---
 tools/xenstore/xenstored_control.c     | 20 +++++++++++++++++++-
 tools/xenstore/xenstored_core.h        |  1 +
 tools/xenstore/xenstored_transaction.c |  5 +++++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 940b717741..af64a9a2d4 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -22,6 +22,7 @@ Interactive commands for Xen Store Daemon.
 #include <stdlib.h>
 #include <string.h>
 #include <syslog.h>
+#include <time.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
@@ -307,7 +308,24 @@ static const char *lu_arch(const void *ctx, struct connection *conn,
 static const char *lu_check_lu_allowed(const void *ctx, bool force,
 				       unsigned int to)
 {
-	return NULL;
+	char *ret = NULL;
+	struct connection *conn;
+	time_t now = time(NULL);
+	bool busy = false;
+
+	list_for_each_entry(conn, &connections, list) {
+		if (conn->ta_start_time - now >= to && !force) {
+			ret = talloc_asprintf(ctx, "%s\nDomain %u: %ld s",
+					      ret ? : "Domains with long running transactions:",
+					      conn->id,
+					      conn->ta_start_time - now);
+			if (!ret)
+				busy = true;
+		} else if (conn->ta_start_time)
+			busy = true;
+	}
+
+	return ret ? (const char *)ret : (busy ? "BUSY" : NULL);
 }
 
 static const char *lu_dump_state(const void *ctx, struct connection *conn)
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 27826c125c..a009b182fd 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -93,6 +93,7 @@ struct connection
 	struct list_head transaction_list;
 	uint32_t next_transaction_id;
 	unsigned int transaction_started;
+	time_t ta_start_time;
 
 	/* The domain I'm associated with, if any. */
 	struct domain *domain;
diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index 52355f4ed8..cd07fb0f21 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -473,6 +473,8 @@ int do_transaction_start(struct connection *conn, struct buffered_data *in)
 	list_add_tail(&trans->list, &conn->transaction_list);
 	talloc_steal(conn, trans);
 	talloc_set_destructor(trans, destroy_transaction);
+	if (!conn->transaction_started)
+		conn->ta_start_time = time(NULL);
 	conn->transaction_started++;
 	wrl_ntransactions++;
 
@@ -511,6 +513,8 @@ int do_transaction_end(struct connection *conn, struct buffered_data *in)
 	conn->transaction = NULL;
 	list_del(&trans->list);
 	conn->transaction_started--;
+	if (!conn->transaction_started)
+		conn->ta_start_time = 0;
 
 	/* Attach transaction to in for auto-cleanup */
 	talloc_steal(in, trans);
@@ -589,6 +593,7 @@ void conn_delete_all_transactions(struct connection *conn)
 	assert(conn->transaction == NULL);
 
 	conn->transaction_started = 0;
+	conn->ta_start_time = 0;
 }
 
 int check_transactions(struct hashtable *hash)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 13/25] docs: update the xenstore migration stream documentation
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (11 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 12/25] tools/xenstore: allow live update only with no transaction active Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 14/25] tools/xenstore: add include file for state structure definitions Juergen Gross
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Julien Grall

For live update of Xenstore some records defined in the migration
stream document need to be changed:

- Support of the read-only socket has been dropped from all Xenstore
  implementations, so ro-socket-fd in the global record can be removed.

- Some guests require the event channel to Xenstore to remain the same
  on Xenstore side, so Xenstore has to keep the event channel interface
  open across a live update. For this purpose an evtchn-fd needs to be
  added to the global record.

- With no read-only support the flags field in the connection record
  can be dropped.

- The evtchn field in the connection record needs to be switched to
  hold the port of the Xenstore side of the event channel.

- A flags field needs to be added to permission specifiers in order to
  be able to mark a permission as stale (XSA-322).

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V8:
- split off from following patch (Julien Grall)
---
 docs/designs/xenstore-migration.md | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
index 2ce2c836f5..1a5b94b31d 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -116,7 +116,7 @@ xenstored state that needs to be restored.
 +-------+-------+-------+-------+
 | rw-socket-fd                  |
 +-------------------------------+
-| ro-socket-fd                  |
+| evtchn-fd                     |
 +-------------------------------+
 ```
 
@@ -126,8 +126,8 @@ xenstored state that needs to be restored.
 | `rw-socket-fd` | The file descriptor of the socket accepting  |
 |                | read-write connections                       |
 |                |                                              |
-| `ro-socket-fd` | The file descriptor of the socket accepting  |
-|                | read-only connections                        |
+| `evtchn-fd`    | The file descriptor used to communicate with |
+|                | the event channel driver                     |
 
 xenstored will resume in the original process context. Hence `rw-socket-fd` and
 `ro-socket-fd` simply specify the file descriptors of the sockets. Sockets
@@ -147,7 +147,7 @@ the domain being migrated.
 ```
     0       1       2       3       4       5       6       7    octet
 +-------+-------+-------+-------+-------+-------+-------+-------+
-| conn-id                       | conn-type     | flags         |
+| conn-id                       | conn-type     |               |
 +-------------------------------+---------------+---------------+
 | conn-spec
 ...
@@ -169,9 +169,6 @@ the domain being migrated.
 |                | 0x0001: socket                               |
 |                | 0x0002 - 0xFFFF: reserved for future use     |
 |                |                                              |
-| `flags`        | A bit-wise OR of:                            |
-|                | 0001: read-only                              |
-|                |                                              |
 | `conn-spec`    | See below                                    |
 |                |                                              |
 | `in-data-len`  | The length (in octets) of any data read      |
@@ -216,7 +213,7 @@ For `shared ring` connections it is as follows:
 |           | operation [2] or DOMID_INVALID [3] otherwise      |
 |           |                                                   |
 | `evtchn`  | The port number of the interdomain channel used   |
-|           | by `domid` to communicate with xenstored          |
+|           | by xenstored to communicate with `domid`          |
 |           |                                                   |
 
 Since the ABI guarantees that entry 1 in `domid`'s grant table will always
@@ -386,7 +383,7 @@ A node permission specifier has the following format:
 ```
     0       1       2       3    octet
 +-------+-------+-------+-------+
-| perm  | pad   | domid         |
+| perm  | flags | domid         |
 +-------+-------+---------------+
 ```
 
@@ -395,6 +392,10 @@ A node permission specifier has the following format:
 | `perm`  | One of the ASCII values `w`, `r`, `b` or `n` as     |
 |         | specified for the `SET_PERMS` operation [2]         |
 |         |                                                     |
+| `flags` | A bit-wise OR of:                                   |
+|         | 0x01: stale permission, ignore when checking        |
+|         |       permissions                                   |
+|         |                                                     |
 | `domid` | The domain-id to which the permission relates       |
 
 Note that perm1 defines the domain owning the code. See [4] for more
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 14/25] tools/xenstore: add include file for state structure definitions
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (12 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 13/25] docs: update the xenstore migration stream documentation Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 15/25] tools/xenstore: dump the xenstore state for live update Juergen Gross
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

Add an include file containing all structures and defines needed for
dumping and restoring the internal Xenstore state.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
---
V4:
- drop mfn from connection record (rebase to V5 of state doc patch)
- add #ifdef __MINIOS__ (Julien Grall)
- correct comments (Julien Grall)
- add data_in_len

V5:
- add data_resp_len

V6:
- add flag byte to node permissions (Julien Grall)
- update migration document

V7:
- add evtchn_fd

V8:
- remove ro-socket and read-only connection flag
- split off documentation part

V9:
- add htobe32() macro if needed (Wei Liu)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/include/xenstore_state.h | 131 ++++++++++++++++++++++++
 1 file changed, 131 insertions(+)
 create mode 100644 tools/xenstore/include/xenstore_state.h

diff --git a/tools/xenstore/include/xenstore_state.h b/tools/xenstore/include/xenstore_state.h
new file mode 100644
index 0000000000..d2a9307400
--- /dev/null
+++ b/tools/xenstore/include/xenstore_state.h
@@ -0,0 +1,131 @@
+/*
+ * Xenstore internal state dump definitions.
+ * Copyright (C) Juergen Gross, SUSE Software Solutions Germany GmbH
+ *
+ * Used for live-update and migration, possibly across Xenstore implementations.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef XENSTORE_STATE_H
+#define XENSTORE_STATE_H
+
+#include <endian.h>
+#include <sys/types.h>
+
+#ifndef htobe32
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define htobe32(x) __builtin_bswap32(x)
+#else
+#define htobe32(x) (x)
+#endif
+#endif
+
+struct xs_state_preamble {
+    char ident[8];
+#define XS_STATE_IDENT    "xenstore"  /* To be used without the NUL byte. */
+    uint32_t version;                 /* Version in big endian format. */
+#define XS_STATE_VERSION  0x00000001
+    uint32_t flags;                   /* Endianess. */
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define XS_STATE_FLAGS    0x00000000  /* Little endian. */
+#else
+#define XS_STATE_FLAGS    0x00000001  /* Big endian. */
+#endif
+};
+
+/*
+ * Each record is starting with xs_state_record_header.
+ * All records have a length of a multiple of 8 bytes.
+ */
+
+/* Common record layout: */
+struct xs_state_record_header {
+    uint32_t type;
+#define XS_STATE_TYPE_END        0x00000000
+#define XS_STATE_TYPE_GLOBAL     0x00000001
+#define XS_STATE_TYPE_CONN       0x00000002
+#define XS_STATE_TYPE_WATCH      0x00000003
+#define XS_STATE_TYPE_TA         0x00000004
+#define XS_STATE_TYPE_NODE       0x00000005
+    uint32_t length;         /* Length of record in bytes. */
+};
+
+/* Global state of Xenstore: */
+struct xs_state_global {
+    int32_t socket_fd;      /* File descriptor for socket connections or -1. */
+    int32_t evtchn_fd;      /* File descriptor for event channel operations. */
+};
+
+/* Connection to Xenstore: */
+struct xs_state_connection {
+    uint32_t conn_id;       /* Used as reference in watch and TA records. */
+    uint16_t conn_type;
+#define XS_STATE_CONN_TYPE_RING   0
+#define XS_STATE_CONN_TYPE_SOCKET 1
+    uint16_t pad;
+    union {
+        struct {
+            uint16_t domid;  /* Domain-Id. */
+            uint16_t tdomid; /* Id of target domain or DOMID_INVALID. */
+            uint32_t evtchn; /* Event channel port. */
+        } ring;
+        int32_t socket_fd;   /* File descriptor for socket connections. */
+    } spec;
+    uint16_t data_in_len;    /* Number of unprocessed bytes read from conn. */
+    uint16_t data_resp_len;  /* Size of partial response pending for conn. */
+    uint32_t data_out_len;   /* Number of bytes not yet written to conn. */
+    uint8_t  data[];         /* Pending data (read, written) + 0-7 pad bytes. */
+};
+
+/* Watch: */
+struct xs_state_watch {
+    uint32_t conn_id;       /* Connection this watch is associated with. */
+    uint16_t path_length;   /* Number of bytes of path watched (incl. 0). */
+    uint16_t token_length;  /* Number of bytes of watch token (incl. 0). */
+    uint8_t data[];         /* Path bytes, token bytes, 0-7 pad bytes. */
+};
+
+/* Transaction: */
+struct xs_state_transaction {
+    uint32_t conn_id;       /* Connection this TA is associated with. */
+    uint32_t ta_id;         /* Transaction Id. */
+};
+
+/* Node (either XS_STATE_TYPE_NODE or XS_STATE_TYPE_TANODE[_MOD]): */
+struct xs_state_node_perm {
+    uint8_t access;         /* Access rights. */
+#define XS_STATE_NODE_PERM_NONE   'n'
+#define XS_STATE_NODE_PERM_READ   'r'
+#define XS_STATE_NODE_PERM_WRITE  'w'
+#define XS_STATE_NODE_PERM_BOTH   'b'
+    uint8_t flags;
+#define XS_STATE_NODE_PERM_IGNORE 0x01 /* Stale permission, ignore for check. */
+    uint16_t domid;         /* Domain-Id. */
+};
+struct xs_state_node {
+    uint32_t conn_id;       /* Connection in case of transaction or 0. */
+    uint32_t ta_id;         /* Transaction Id or 0. */
+    uint16_t path_len;      /* Length of path string including NUL byte. */
+    uint16_t data_len;      /* Length of node data. */
+    uint16_t ta_access;
+#define XS_STATE_NODE_TA_DELETED  0x0000
+#define XS_STATE_NODE_TA_READ     0x0001
+#define XS_STATE_NODE_TA_WRITTEN  0x0002
+    uint16_t perm_n;        /* Number of permissions (0 in TA: node deleted). */
+    /* Permissions (first is owner, has full access). */
+    struct xs_state_node_perm perms[];
+    /* Path and data follows, plus 0-7 pad bytes. */
+};
+#endif /* XENSTORE_STATE_H */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 15/25] tools/xenstore: dump the xenstore state for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (13 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 14/25] tools/xenstore: add include file for state structure definitions Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 16/25] tools/xenstore: handle CLOEXEC flag for local files and pipes Juergen Gross
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

Dump the complete Xenstore status to a file (daemon case) or memory
(stubdom case).

As we don't know the exact length of the needed area in advance we are
using an anonymous rather large mapping in stubdom case, which will
use only virtual address space until accessed. And as we are writing
the area in a sequential manner this is fine. As the initial size we
are choosing the double size of the memory allocated via talloc(),
which should be more than enough.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
---
V4:
- directly write to state file in daemon case
- add pending read data (Julien Grall)

V5:
- fix 0 length buffered data body (Julien Grall)
- simplify dump_state_buffered_data() (Paul Durrant)
- add data_resp_len field handling
- add comments (Paul Durrant)
- move dump_state_align() call out of dump_state_buffered_data()
  (Paul Durrant)
- move constant assignments out of loops (Paul Durrant)
- use set_tdb_key() (Paul Durrant)

V6:
- rename "first" to "Partial" (Paul Durrant)
- make sure data_resp_len is written (Paul Durrant)
- don't leak node memory in dump_state_nodes()
- add permission flag byte handling (Julien Grall)
- drop global state buffer (Julien Grall)
- add and correct comments (Julien Grall)
- add const (Julien Grall)
- add path buffer overrun check (Julien Grall)
- move get_watch_path() from later patch, correct dump_state_watches()
  to use it (Julien Grall)

V7:
- add glb.evtchn_fd, switch evtchn to local port
- remove definition of ROUNDUP() from utils.h due to rebase
- create state file with 0600 permissions

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/utils.c             |  17 +++
 tools/xenstore/utils.h             |   6 +
 tools/xenstore/xenstored_control.c | 102 +++++++++++++-
 tools/xenstore/xenstored_core.c    | 213 +++++++++++++++++++++++++++++
 tools/xenstore/xenstored_core.h    |  12 ++
 tools/xenstore/xenstored_domain.c  | 105 ++++++++++++++
 tools/xenstore/xenstored_domain.h  |   3 +
 tools/xenstore/xenstored_watch.c   |  57 +++++++-
 tools/xenstore/xenstored_watch.h   |   3 +
 9 files changed, 512 insertions(+), 6 deletions(-)

diff --git a/tools/xenstore/utils.c b/tools/xenstore/utils.c
index 633ce3b4fc..0d80cb6de8 100644
--- a/tools/xenstore/utils.c
+++ b/tools/xenstore/utils.c
@@ -62,3 +62,20 @@ void barf_perror(const char *fmt, ...)
 	}
 	exit(1);
 }
+
+const char *dump_state_align(FILE *fp)
+{
+	long len;
+	static char nul[8] = {};
+
+	len = ftell(fp);
+	if (len < 0)
+		return "Dump state align error";
+	len &= 7;
+	if (!len)
+		return NULL;
+
+	if (fwrite(nul, 8 - len, 1, fp) != 1)
+		return "Dump state align error";
+	return NULL;
+}
diff --git a/tools/xenstore/utils.h b/tools/xenstore/utils.h
index 6a1b5de9bd..df1cb9a3ba 100644
--- a/tools/xenstore/utils.h
+++ b/tools/xenstore/utils.h
@@ -3,6 +3,7 @@
 #include <stdbool.h>
 #include <string.h>
 #include <stdint.h>
+#include <stdio.h>
 
 #include <xen-tools/libs.h>
 
@@ -21,6 +22,11 @@ static inline bool strends(const char *a, const char *b)
 	return streq(a + strlen(a) - strlen(b), b);
 }
 
+/*
+ * Write NUL bytes for aligning state data to 8 bytes.
+ */
+const char *dump_state_align(FILE *fp);
+
 void barf(const char *fmt, ...) __attribute__((noreturn));
 void barf_perror(const char *fmt, ...) __attribute__((noreturn));
 
diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index af64a9a2d4..38550a559e 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -25,12 +25,21 @@ Interactive commands for Xen Store Daemon.
 #include <time.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <sys/mman.h>
+#include <fcntl.h>
 #include <unistd.h>
+#include <xenctrl.h>
 
 #include "utils.h"
 #include "talloc.h"
 #include "xenstored_core.h"
 #include "xenstored_control.h"
+#include "xenstored_domain.h"
+
+/* Mini-OS only knows about MAP_ANON. */
+#ifndef MAP_ANONYMOUS
+#define MAP_ANONYMOUS MAP_ANON
+#endif
 
 struct live_update {
 	/* For verification the correct connection is acting. */
@@ -40,6 +49,9 @@ struct live_update {
 	void *kernel;
 	unsigned int kernel_size;
 	unsigned int kernel_off;
+
+	void *dump_state;
+	unsigned long dump_size;
 #else
 	char *filename;
 #endif
@@ -51,6 +63,10 @@ static struct live_update *lu_status;
 
 static int lu_destroy(void *data)
 {
+#ifdef __MINIOS__
+	if (lu_status->dump_state)
+		munmap(lu_status->dump_state, lu_status->dump_size);
+#endif
 	lu_status = NULL;
 
 	return 0;
@@ -269,6 +285,31 @@ static const char *lu_arch(const void *ctx, struct connection *conn,
 	errno = EINVAL;
 	return NULL;
 }
+
+static FILE *lu_dump_open(const void *ctx)
+{
+	lu_status->dump_size = ROUNDUP(talloc_total_size(NULL) * 2,
+				       XC_PAGE_SHIFT);
+	lu_status->dump_state = mmap(NULL, lu_status->dump_size,
+				     PROT_READ | PROT_WRITE,
+				     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (lu_status->dump_state == MAP_FAILED)
+		return NULL;
+
+	return fmemopen(lu_status->dump_state, lu_status->dump_size, "w");
+}
+
+static void lu_dump_close(FILE *fp)
+{
+	size_t size;
+
+	size = ftell(fp);
+	size = ROUNDUP(size, XC_PAGE_SHIFT);
+	munmap(lu_status->dump_state + size, lu_status->dump_size - size);
+	lu_status->dump_size = size;
+
+	fclose(fp);
+}
 #else
 static const char *lu_binary(const void *ctx, struct connection *conn,
 			     const char *filename)
@@ -303,6 +344,27 @@ static const char *lu_arch(const void *ctx, struct connection *conn,
 	errno = EINVAL;
 	return NULL;
 }
+
+static FILE *lu_dump_open(const void *ctx)
+{
+	char *filename;
+	int fd;
+
+	filename = talloc_asprintf(ctx, "%s/state_dump", xs_daemon_rootdir());
+	if (!filename)
+		return NULL;
+
+	fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
+	if (fd < 0)
+		return NULL;
+
+	return fdopen(fd, "w");
+}
+
+static void lu_dump_close(FILE *fp)
+{
+	fclose(fp);
+}
 #endif
 
 static const char *lu_check_lu_allowed(const void *ctx, bool force,
@@ -330,7 +392,45 @@ static const char *lu_check_lu_allowed(const void *ctx, bool force,
 
 static const char *lu_dump_state(const void *ctx, struct connection *conn)
 {
-	return NULL;
+	FILE *fp;
+	const char *ret;
+	struct xs_state_record_header end;
+	struct xs_state_preamble pre;
+
+	fp = lu_dump_open(ctx);
+	if (!fp)
+		return "Dump state open error";
+
+	memcpy(pre.ident, XS_STATE_IDENT, sizeof(pre.ident));
+	pre.version = htobe32(XS_STATE_VERSION);
+	pre.flags = XS_STATE_FLAGS;
+	if (fwrite(&pre, sizeof(pre), 1, fp) != 1) {
+		ret = "Dump write error";
+		goto out;
+	}
+
+	ret = dump_state_global(fp);
+	if (ret)
+		goto out;
+	ret = dump_state_connections(fp, conn);
+	if (ret)
+		goto out;
+	ret = dump_state_special_nodes(fp);
+	if (ret)
+		goto out;
+	ret = dump_state_nodes(fp, ctx);
+	if (ret)
+		goto out;
+
+	end.type = XS_STATE_TYPE_END;
+	end.length = 0;
+	if (fwrite(&end, sizeof(end), 1, fp) != 1)
+		ret = "Dump write error";
+
+ out:
+	lu_dump_close(fp);
+
+	return ret;
 }
 
 static const char *lu_activate_binary(const void *ctx)
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 0dddf24327..064109c393 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -2219,6 +2219,219 @@ int main(int argc, char *argv[])
 	}
 }
 
+const char *dump_state_global(FILE *fp)
+{
+	struct xs_state_record_header head;
+	struct xs_state_global glb;
+
+	head.type = XS_STATE_TYPE_GLOBAL;
+	head.length = sizeof(glb);
+	if (fwrite(&head, sizeof(head), 1, fp) != 1)
+		return "Dump global state error";
+	glb.socket_fd = sock;
+	glb.evtchn_fd = xenevtchn_fd(xce_handle);
+	if (fwrite(&glb, sizeof(glb), 1, fp) != 1)
+		return "Dump global state error";
+
+	return NULL;
+}
+
+/* Called twice: first with fp == NULL to get length, then for writing data. */
+const char *dump_state_buffered_data(FILE *fp, const struct connection *c,
+				     const struct connection *conn,
+				     struct xs_state_connection *sc)
+{
+	unsigned int len = 0, used;
+	struct buffered_data *out, *in = c->in;
+	bool partial = true;
+
+	if (in && c != conn) {
+		len = in->inhdr ? in->used : sizeof(in->hdr);
+		if (fp && fwrite(&in->hdr, len, 1, fp) != 1)
+			return "Dump read data error";
+		if (!in->inhdr && in->used) {
+			len += in->used;
+			if (fp && fwrite(in->buffer, in->used, 1, fp) != 1)
+				return "Dump read data error";
+		}
+	}
+
+	if (sc) {
+		sc->data_in_len = len;
+		sc->data_resp_len = 0;
+	}
+
+	len = 0;
+
+	list_for_each_entry(out, &c->out_list, list) {
+		used = out->used;
+		if (out->inhdr) {
+			if (!used)
+				partial = false;
+			if (fp && fwrite(out->hdr.raw + out->used,
+				  sizeof(out->hdr) - out->used, 1, fp) != 1)
+				return "Dump buffered data error";
+			len += sizeof(out->hdr) - out->used;
+			used = 0;
+		}
+		if (fp && out->hdr.msg.len &&
+		    fwrite(out->buffer + used, out->hdr.msg.len - used,
+			   1, fp) != 1)
+			return "Dump buffered data error";
+		len += out->hdr.msg.len - used;
+		if (partial && sc)
+			sc->data_resp_len = len;
+		partial = false;
+	}
+
+	/* Add "OK" for live-update command. */
+	if (c == conn) {
+		struct xsd_sockmsg msg = conn->in->hdr.msg;
+
+		msg.len = sizeof("OK");
+		if (fp && fwrite(&msg, sizeof(msg), 1, fp) != 1)
+			return "Dump buffered data error";
+		len += sizeof(msg);
+		if (fp && fwrite("OK", msg.len, 1, fp) != 1)
+
+			return "Dump buffered data error";
+		len += msg.len;
+	}
+
+	if (sc)
+		sc->data_out_len = len;
+
+	return NULL;
+}
+
+const char *dump_state_node_perms(FILE *fp, struct xs_state_node *sn,
+				  const struct xs_permissions *perms,
+				  unsigned int n_perms)
+{
+	unsigned int p;
+
+	for (p = 0; p < n_perms; p++) {
+		switch ((int)perms[p].perms & ~XS_PERM_IGNORE) {
+		case XS_PERM_READ:
+			sn->perms[p].access = XS_STATE_NODE_PERM_READ;
+			break;
+		case XS_PERM_WRITE:
+			sn->perms[p].access = XS_STATE_NODE_PERM_WRITE;
+			break;
+		case XS_PERM_READ | XS_PERM_WRITE:
+			sn->perms[p].access = XS_STATE_NODE_PERM_BOTH;
+			break;
+		default:
+			sn->perms[p].access = XS_STATE_NODE_PERM_NONE;
+			break;
+		}
+		sn->perms[p].flags = (perms[p].perms & XS_PERM_IGNORE)
+				     ? XS_STATE_NODE_PERM_IGNORE : 0;
+		sn->perms[p].domid = perms[p].id;
+	}
+
+	if (fwrite(sn->perms, sizeof(*sn->perms), n_perms, fp) != n_perms)
+		return "Dump node permissions error";
+
+	return NULL;
+}
+
+static const char *dump_state_node_tree(FILE *fp, char *path)
+{
+	unsigned int pathlen, childlen, p = 0;
+	struct xs_state_record_header head;
+	struct xs_state_node sn;
+	TDB_DATA key, data;
+	const struct xs_tdb_record_hdr *hdr;
+	const char *child;
+	const char *ret;
+
+	pathlen = strlen(path) + 1;
+
+	set_tdb_key(path, &key);
+	data = tdb_fetch(tdb_ctx, key);
+	if (data.dptr == NULL)
+		return "Error reading node";
+
+	/* Clean up in case of failure. */
+	talloc_steal(path, data.dptr);
+
+	hdr = (void *)data.dptr;
+
+	head.type = XS_STATE_TYPE_NODE;
+	head.length = sizeof(sn);
+	sn.conn_id = 0;
+	sn.ta_id = 0;
+	sn.ta_access = 0;
+	sn.perm_n = hdr->num_perms;
+	sn.path_len = pathlen;
+	sn.data_len = hdr->datalen;
+	head.length += hdr->num_perms * sizeof(*sn.perms);
+	head.length += pathlen;
+	head.length += hdr->datalen;
+	head.length = ROUNDUP(head.length, 3);
+
+	if (fwrite(&head, sizeof(head), 1, fp) != 1)
+		return "Dump node state error";
+	if (fwrite(&sn, sizeof(sn), 1, fp) != 1)
+		return "Dump node state error";
+
+	ret = dump_state_node_perms(fp, &sn, hdr->perms, hdr->num_perms);
+	if (ret)
+		return ret;
+
+	if (fwrite(path, pathlen, 1, fp) != 1)
+		return "Dump node path error";
+	if (hdr->datalen &&
+	    fwrite(hdr->perms + hdr->num_perms, hdr->datalen, 1, fp) != 1)
+		return "Dump node data error";
+
+	ret = dump_state_align(fp);
+	if (ret)
+		return ret;
+
+	child = (char *)(hdr->perms + hdr->num_perms) + hdr->datalen;
+
+	/*
+	 * Use path for constructing children paths.
+	 * As we don't write out nodes without having written their parent
+	 * already we will never clobber a part of the path we'll need later.
+	 */
+	pathlen--;
+	if (path[pathlen - 1] != '/') {
+		path[pathlen] = '/';
+		pathlen++;
+	}
+	while (p < hdr->childlen) {
+		childlen = strlen(child) + 1;
+		if (pathlen + childlen > XENSTORE_ABS_PATH_MAX)
+			return "Dump node path length error";
+		strcpy(path + pathlen, child);
+		ret = dump_state_node_tree(fp, path);
+		if (ret)
+			return ret;
+		p += childlen;
+		child += childlen;
+	}
+
+	talloc_free(data.dptr);
+
+	return NULL;
+}
+
+const char *dump_state_nodes(FILE *fp, const void *ctx)
+{
+	char *path;
+
+	path = talloc_size(ctx, XENSTORE_ABS_PATH_MAX);
+	if (!path)
+		return "Path buffer allocation error";
+
+	strcpy(path, "/");
+
+	return dump_state_node_tree(fp, path);
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index a009b182fd..32b306e161 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -30,6 +30,7 @@
 #include <errno.h>
 
 #include "xenstore_lib.h"
+#include "xenstore_state.h"
 #include "list.h"
 #include "tdb.h"
 #include "hashtable.h"
@@ -41,6 +42,8 @@ typedef int32_t wrl_creditt;
 #define WRL_CREDIT_MAX (1000*1000*1000)
 /* ^ satisfies non-overflow condition for wrl_xfer_credit */
 
+struct xs_state_connection;
+
 struct buffered_data
 {
 	struct list_head list;
@@ -224,6 +227,15 @@ int remember_string(struct hashtable *hash, const char *str);
 
 void set_tdb_key(const char *name, TDB_DATA *key);
 
+const char *dump_state_global(FILE *fp);
+const char *dump_state_buffered_data(FILE *fp, const struct connection *c,
+				     const struct connection *conn,
+				     struct xs_state_connection *sc);
+const char *dump_state_nodes(FILE *fp, const void *ctx);
+const char *dump_state_node_perms(FILE *fp, struct xs_state_node *sn,
+				  const struct xs_permissions *perms,
+				  unsigned int n_perms);
+
 #endif /* _XENSTORED_CORE_H */
 
 /*
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index ed8e83b06b..919a4d98cf 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -1143,6 +1143,111 @@ void wrl_apply_debit_trans_commit(struct connection *conn)
 	wrl_apply_debit_actual(conn->domain);
 }
 
+const char *dump_state_connections(FILE *fp, struct connection *conn)
+{
+	const char *ret = NULL;
+	unsigned int conn_id = 1;
+	struct xs_state_connection sc;
+	struct xs_state_record_header head;
+	struct connection *c;
+
+	list_for_each_entry(c, &connections, list) {
+		head.type = XS_STATE_TYPE_CONN;
+		head.length = sizeof(sc);
+
+		sc.conn_id = conn_id++;
+		sc.pad = 0;
+		memset(&sc.spec, 0, sizeof(sc.spec));
+		if (c->domain) {
+			sc.conn_type = XS_STATE_CONN_TYPE_RING;
+			sc.spec.ring.domid = c->id;
+			sc.spec.ring.tdomid = c->target ? c->target->id
+						: DOMID_INVALID;
+			sc.spec.ring.evtchn = c->domain->port;
+		} else {
+			sc.conn_type = XS_STATE_CONN_TYPE_SOCKET;
+			sc.spec.socket_fd = c->fd;
+		}
+
+		ret = dump_state_buffered_data(NULL, c, conn, &sc);
+		if (ret)
+			return ret;
+		head.length += sc.data_in_len + sc.data_out_len;
+		head.length = ROUNDUP(head.length, 3);
+		if (fwrite(&head, sizeof(head), 1, fp) != 1)
+			return "Dump connection state error";
+		if (fwrite(&sc, offsetof(struct xs_state_connection, data),
+			   1, fp) != 1)
+			return "Dump connection state error";
+		ret = dump_state_buffered_data(fp, c, conn, NULL);
+		if (ret)
+			return ret;
+		ret = dump_state_align(fp);
+		if (ret)
+			return ret;
+
+		ret = dump_state_watches(fp, c, sc.conn_id);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
+static const char *dump_state_special_node(FILE *fp, const char *name,
+					   const struct node_perms *perms)
+{
+	struct xs_state_record_header head;
+	struct xs_state_node sn;
+	unsigned int pathlen;
+	const char *ret;
+
+	pathlen = strlen(name) + 1;
+
+	head.type = XS_STATE_TYPE_NODE;
+	head.length = sizeof(sn);
+
+	sn.conn_id = 0;
+	sn.ta_id = 0;
+	sn.ta_access = 0;
+	sn.perm_n = perms->num;
+	sn.path_len = pathlen;
+	sn.data_len = 0;
+	head.length += perms->num * sizeof(*sn.perms);
+	head.length += pathlen;
+	head.length = ROUNDUP(head.length, 3);
+	if (fwrite(&head, sizeof(head), 1, fp) != 1)
+		return "Dump special node error";
+	if (fwrite(&sn, sizeof(sn), 1, fp) != 1)
+		return "Dump special node error";
+
+	ret = dump_state_node_perms(fp, &sn, perms->p, perms->num);
+	if (ret)
+		return ret;
+
+	if (fwrite(name, pathlen, 1, fp) != 1)
+		return "Dump special node path error";
+
+	ret = dump_state_align(fp);
+
+	return ret;
+}
+
+const char *dump_state_special_nodes(FILE *fp)
+{
+	const char *ret;
+
+	ret = dump_state_special_node(fp, "@releaseDomain",
+				      &dom_release_perms);
+	if (ret)
+		return ret;
+
+	ret = dump_state_special_node(fp, "@introduceDomain",
+				      &dom_introduce_perms);
+
+	return ret;
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_domain.h b/tools/xenstore/xenstored_domain.h
index 66e0a12654..413b974375 100644
--- a/tools/xenstore/xenstored_domain.h
+++ b/tools/xenstore/xenstored_domain.h
@@ -97,4 +97,7 @@ void wrl_log_periodic(struct wrl_timestampt now);
 void wrl_apply_debit_direct(struct connection *conn);
 void wrl_apply_debit_trans_commit(struct connection *conn);
 
+const char *dump_state_connections(FILE *fp, struct connection *conn);
+const char *dump_state_special_nodes(FILE *fp);
+
 #endif /* _XENSTORED_DOMAIN_H */
diff --git a/tools/xenstore/xenstored_watch.c b/tools/xenstore/xenstored_watch.c
index 9ff20690c0..9248f08bd9 100644
--- a/tools/xenstore/xenstored_watch.c
+++ b/tools/xenstore/xenstored_watch.c
@@ -72,6 +72,19 @@ static bool is_child(const char *child, const char *parent)
 	return child[len] == '/' || child[len] == '\0';
 }
 
+static const char *get_watch_path(const struct watch *watch, const char *name)
+{
+	const char *path = name;
+
+	if (watch->relative_path) {
+		path += strlen(watch->relative_path);
+		if (*path == '/') /* Could be "" */
+			path++;
+	}
+
+	return path;
+}
+
 /*
  * Send a watch event.
  * Temporary memory allocations are done with ctx.
@@ -85,11 +98,7 @@ static void add_event(struct connection *conn,
 	unsigned int len;
 	char *data;
 
-	if (watch->relative_path) {
-		name += strlen(watch->relative_path);
-		if (*name == '/') /* Could be "" */
-			name++;
-	}
+	name = get_watch_path(watch, name);
 
 	len = strlen(name) + 1 + strlen(watch->token) + 1;
 	/* Don't try to send over-long events. */
@@ -291,6 +300,44 @@ void conn_delete_all_watches(struct connection *conn)
 	}
 }
 
+const char *dump_state_watches(FILE *fp, struct connection *conn,
+			       unsigned int conn_id)
+{
+	const char *ret = NULL;
+	struct watch *watch;
+	struct xs_state_watch sw;
+	struct xs_state_record_header head;
+	const char *path;
+
+	head.type = XS_STATE_TYPE_WATCH;
+
+	list_for_each_entry(watch, &conn->watches, list) {
+		head.length = sizeof(sw);
+
+		sw.conn_id = conn_id;
+		path = get_watch_path(watch, watch->node);
+		sw.path_length = strlen(path) + 1;
+		sw.token_length = strlen(watch->token) + 1;
+		head.length += sw.path_length + sw.token_length;
+		head.length = ROUNDUP(head.length, 3);
+		if (fwrite(&head, sizeof(head), 1, fp) != 1)
+			return "Dump watch state error";
+		if (fwrite(&sw, sizeof(sw), 1, fp) != 1)
+			return "Dump watch state error";
+
+		if (fwrite(path, sw.path_length, 1, fp) != 1)
+			return "Dump watch path error";
+		if (fwrite(watch->token, sw.token_length, 1, fp) != 1)
+			return "Dump watch token error";
+
+		ret = dump_state_align(fp);
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_watch.h b/tools/xenstore/xenstored_watch.h
index 03094374f3..3d81645f45 100644
--- a/tools/xenstore/xenstored_watch.h
+++ b/tools/xenstore/xenstored_watch.h
@@ -30,4 +30,7 @@ void fire_watches(struct connection *conn, const void *tmp, const char *name,
 
 void conn_delete_all_watches(struct connection *conn);
 
+const char *dump_state_watches(FILE *fp, struct connection *conn,
+			       unsigned int conn_id);
+
 #endif /* _XENSTORED_WATCH_H */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 16/25] tools/xenstore: handle CLOEXEC flag for local files and pipes
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (14 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 15/25] tools/xenstore: dump the xenstore state for live update Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 17/25] tools/xenstore: split off domain introduction from do_introduce() Juergen Gross
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

For support of live update the locally used files need to have the
"close on exec" flag set. Fortunately the used Xen libraries are
already doing this, so only the logging and tdb related files and
pipes are affected. openlog() has the close on exec attribute, too.

In order to be able to keep the event channels open specify the
XENEVTCHN_NO_CLOEXEC flag when calling xenevtchn_open().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <julien.grall@amazon.com>
---
V4:
- disable LU in case of O_CLOEXEC not supported (Julien Grall)

V5:
- add comment (Paul Durrant)

V7:
- set XENEVTCHN_NO_CLOEXEC

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c |  6 ++++++
 tools/xenstore/xenstored_core.c    |  6 ++++--
 tools/xenstore/xenstored_core.h    |  8 ++++++++
 tools/xenstore/xenstored_domain.c  |  2 +-
 tools/xenstore/xenstored_posix.c   | 12 ++++++++++++
 5 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 38550a559e..437276de8d 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -41,6 +41,7 @@ Interactive commands for Xen Store Daemon.
 #define MAP_ANONYMOUS MAP_ANON
 #endif
 
+#ifndef NO_LIVE_UPDATE
 struct live_update {
 	/* For verification the correct connection is acting. */
 	struct connection *conn;
@@ -85,6 +86,7 @@ static const char *lu_begin(struct connection *conn)
 
 	return NULL;
 }
+#endif
 
 struct cmd_s {
 	char *cmd;
@@ -209,6 +211,7 @@ static int do_control_print(void *ctx, struct connection *conn,
 	return 0;
 }
 
+#ifndef NO_LIVE_UPDATE
 static const char *lu_abort(const void *ctx, struct connection *conn)
 {
 	syslog(LOG_INFO, "live-update: abort\n");
@@ -522,6 +525,7 @@ static int do_control_lu(void *ctx, struct connection *conn,
 	send_reply(conn, XS_CONTROL, resp, strlen(resp) + 1);
 	return 0;
 }
+#endif
 
 static int do_control_help(void *, struct connection *, char **, int);
 
@@ -529,6 +533,7 @@ static struct cmd_s cmds[] = {
 	{ "check", do_control_check, "" },
 	{ "log", do_control_log, "on|off" },
 
+#ifndef NO_LIVE_UPDATE
 	/*
 	 * The parameters are those of the xenstore-control utility!
 	 * Depending on environment (Mini-OS or daemon) the live-update
@@ -548,6 +553,7 @@ static struct cmd_s cmds[] = {
 	{ "live-update", do_control_lu,
 		"[-c <cmdline>] [-F] [-t <timeout>] <file>\n"
 		"    Default timeout is 60 seconds.", 4 },
+#endif
 #ifdef __MINIOS__
 	{ "memreport", do_control_memreport, "" },
 #else
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 064109c393..0f4e10815a 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -197,7 +197,8 @@ void reopen_log(void)
 	if (tracefile) {
 		close_log();
 
-		tracefd = open(tracefile, O_WRONLY|O_CREAT|O_APPEND, 0600);
+		tracefd = open(tracefile,
+			       O_WRONLY | O_CREAT | O_APPEND | O_CLOEXEC, 0600);
 
 		if (tracefd < 0)
 			perror("Could not open tracefile");
@@ -1646,7 +1647,8 @@ static void setup_structure(void)
 	if (!(tdb_flags & TDB_INTERNAL))
 		unlink(tdbname);
 
-	tdb_ctx = tdb_open_ex(tdbname, 7919, tdb_flags, O_RDWR|O_CREAT|O_EXCL,
+	tdb_ctx = tdb_open_ex(tdbname, 7919, tdb_flags,
+			      O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC,
 			      0640, &tdb_logger, NULL);
 	if (!tdb_ctx)
 		barf_perror("Could not create tdb file %s", tdbname);
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 32b306e161..e40e0e6806 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -35,6 +35,14 @@
 #include "tdb.h"
 #include "hashtable.h"
 
+#ifndef O_CLOEXEC
+#define O_CLOEXEC 0
+/* O_CLOEXEC support is needed for Live Update in the daemon case. */
+#ifndef __MINIOS__
+#define NO_LIVE_UPDATE
+#endif
+#endif
+
 /* DEFAULT_BUFFER_SIZE should be large enough for each errno string. */
 #define DEFAULT_BUFFER_SIZE 16
 
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 919a4d98cf..38d250fbed 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -743,7 +743,7 @@ void domain_init(void)
 
 	talloc_set_destructor(xgt_handle, close_xgt_handle);
 
-	xce_handle = xenevtchn_open(NULL, 0);
+	xce_handle = xenevtchn_open(NULL, XENEVTCHN_NO_CLOEXEC);
 
 	if (xce_handle == NULL)
 		barf_perror("Failed to open evtchn device");
diff --git a/tools/xenstore/xenstored_posix.c b/tools/xenstore/xenstored_posix.c
index 1f9603fea2..ae3e63e07f 100644
--- a/tools/xenstore/xenstored_posix.c
+++ b/tools/xenstore/xenstored_posix.c
@@ -90,9 +90,21 @@ void finish_daemonize(void)
 
 void init_pipe(int reopen_log_pipe[2])
 {
+	int flags;
+	unsigned int i;
+
 	if (pipe(reopen_log_pipe)) {
 		barf_perror("pipe");
 	}
+
+	for (i = 0; i < 2; i++) {
+		flags = fcntl(reopen_log_pipe[i], F_GETFD);
+		if (flags < 0)
+			barf_perror("pipe get flags");
+		flags |= FD_CLOEXEC;
+		if (fcntl(reopen_log_pipe[i],  F_SETFD, flags) < 0)
+			barf_perror("pipe set flags");
+	}
 }
 
 void unmap_xenbus(void *interface)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 17/25] tools/xenstore: split off domain introduction from do_introduce()
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (15 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 16/25] tools/xenstore: handle CLOEXEC flag for local files and pipes Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 18/25] tools/xenstore: evaluate the live update flag when starting Juergen Gross
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

For live update the functionality to introduce a new domain similar to
the XS_INTRODUCE command is needed, so split that functionality off
into a dedicated function introduce_domain().

Switch initial dom0 initialization to use this function, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V8:
- new patch
---
 tools/xenstore/xenstored_domain.c | 95 ++++++++++++++++++-------------
 1 file changed, 55 insertions(+), 40 deletions(-)

diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 38d250fbed..71b078caf3 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -326,7 +326,7 @@ static struct domain *find_domain_struct(unsigned int domid)
 	return NULL;
 }
 
-static struct domain *alloc_domain(void *context, unsigned int domid)
+static struct domain *alloc_domain(const void *context, unsigned int domid)
 {
 	struct domain *domain;
 
@@ -347,6 +347,14 @@ static struct domain *alloc_domain(void *context, unsigned int domid)
 	return domain;
 }
 
+static struct domain *find_or_alloc_domain(const void *ctx, unsigned int domid)
+{
+	struct domain *domain;
+
+	domain = find_domain_struct(domid);
+	return domain ? : alloc_domain(ctx, domid);
+}
+
 static int new_domain(struct domain *domain, int port)
 {
 	int rc;
@@ -413,52 +421,41 @@ static void domain_conn_reset(struct domain *domain)
 	domain->interface->rsp_cons = domain->interface->rsp_prod = 0;
 }
 
-/* domid, gfn, evtchn, path */
-int do_introduce(struct connection *conn, struct buffered_data *in)
+static struct domain *introduce_domain(const void *ctx,
+				       unsigned int domid,
+				       evtchn_port_t port)
 {
 	struct domain *domain;
-	char *vec[3];
-	unsigned int domid;
-	evtchn_port_t port;
 	int rc;
 	struct xenstore_domain_interface *interface;
+	bool is_master_domain = (domid == xenbus_master_domid());
 
-	if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec))
-		return EINVAL;
-
-	domid = atoi(vec[0]);
-	/* Ignore the gfn, we don't need it. */
-	port = atoi(vec[2]);
-
-	/* Sanity check args. */
-	if (port <= 0)
-		return EINVAL;
-
-	domain = find_domain_struct(domid);
-
-	if (domain == NULL) {
-		/* Hang domain off "in" until we're finished. */
-		domain = alloc_domain(in, domid);
-		if (domain == NULL)
-			return ENOMEM;
-	}
+	domain = find_or_alloc_domain(ctx, domid);
+	if (!domain)
+		return NULL;
 
 	if (!domain->introduced) {
-		interface = map_interface(domid);
+		interface = is_master_domain ? xenbus_map()
+					     : map_interface(domid);
 		if (!interface)
-			return errno;
-		/* Hang domain off "in" until we're finished. */
+			return NULL;
 		if (new_domain(domain, port)) {
 			rc = errno;
-			unmap_interface(interface);
-			return rc;
+			if (is_master_domain)
+				unmap_xenbus(interface);
+			else
+				unmap_interface(interface);
+			errno = rc;
+			return NULL;
 		}
 		domain->interface = interface;
 
 		/* Now domain belongs to its connection. */
 		talloc_steal(domain->conn, domain);
 
-		fire_watches(NULL, in, "@introduceDomain", NULL, false, NULL);
+		if (!is_master_domain)
+			fire_watches(NULL, ctx, "@introduceDomain", NULL,
+				     false, NULL);
 	} else {
 		/* Use XS_INTRODUCE for recreating the xenbus event-channel. */
 		if (domain->port)
@@ -467,6 +464,32 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
 		domain->port = (rc == -1) ? 0 : rc;
 	}
 
+	return domain;
+}
+
+/* domid, gfn, evtchn, path */
+int do_introduce(struct connection *conn, struct buffered_data *in)
+{
+	struct domain *domain;
+	char *vec[3];
+	unsigned int domid;
+	evtchn_port_t port;
+
+	if (get_strings(in, vec, ARRAY_SIZE(vec)) < ARRAY_SIZE(vec))
+		return EINVAL;
+
+	domid = atoi(vec[0]);
+	/* Ignore the gfn, we don't need it. */
+	port = atoi(vec[2]);
+
+	/* Sanity check args. */
+	if (port <= 0)
+		return EINVAL;
+
+	domain = introduce_domain(in, domid, port);
+	if (!domain)
+		return errno;
+
 	domain_conn_reset(domain);
 
 	send_ack(conn, XS_INTRODUCE);
@@ -692,17 +715,9 @@ static int dom0_init(void)
 	if (port == -1)
 		return -1;
 
-	dom0 = alloc_domain(NULL, xenbus_master_domid());
+	dom0 = introduce_domain(NULL, xenbus_master_domid(), port);
 	if (!dom0)
 		return -1;
-	if (new_domain(dom0, port))
-		return -1;
-
-	dom0->interface = xenbus_map();
-	if (dom0->interface == NULL)
-		return -1;
-
-	talloc_steal(dom0->conn, dom0); 
 
 	xenevtchn_notify(xce_handle, dom0->port);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 18/25] tools/xenstore: evaluate the live update flag when starting
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (16 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 17/25] tools/xenstore: split off domain introduction from do_introduce() Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 19/25] tools/xenstore: read internal state when doing live upgrade Juergen Gross
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

In the live update case several initialization steps of xenstore must
be omitted or modified. Add the proper handling for that.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V4:
- set xprintf = trace in daemon case (Julien Grall)
- only update /tool/xenstored node contents

V7:
- some restructuring to enable keeping event channel fd

V8:
- pass evtfd to domain_init() as parameter (Julien Grall)
- call dom0_init() from main()

V10:
- remove support for remembering binary name (Andrew Cooper)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c |  5 ++++
 tools/xenstore/xenstored_control.h |  1 +
 tools/xenstore/xenstored_core.c    | 43 +++++++++++++++++++++---------
 tools/xenstore/xenstored_domain.c  | 26 +++++++++---------
 tools/xenstore/xenstored_domain.h  |  3 ++-
 tools/xenstore/xenstored_posix.c   |  1 -
 6 files changed, 50 insertions(+), 29 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 437276de8d..a539666410 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -436,6 +436,11 @@ static const char *lu_dump_state(const void *ctx, struct connection *conn)
 	return ret;
 }
 
+void lu_read_state(void)
+{
+	xprintf("live-update: read state\n");
+}
+
 static const char *lu_activate_binary(const void *ctx)
 {
 	return "Not yet implemented.";
diff --git a/tools/xenstore/xenstored_control.h b/tools/xenstore/xenstored_control.h
index 207e0a6fa3..aac61f0590 100644
--- a/tools/xenstore/xenstored_control.h
+++ b/tools/xenstore/xenstored_control.h
@@ -17,3 +17,4 @@
 */
 
 int do_control(struct connection *conn, struct buffered_data *in);
+void lu_read_state(void);
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 0f4e10815a..d6f8373ee0 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1637,9 +1637,10 @@ static void tdb_logger(TDB_CONTEXT *tdb, int level, const char * fmt, ...)
 	}
 }
 
-static void setup_structure(void)
+static void setup_structure(bool live_update)
 {
 	char *tdbname;
+
 	tdbname = talloc_strdup(talloc_autofree_context(), xs_daemon_tdb());
 	if (!tdbname)
 		barf_perror("Could not create tdbname");
@@ -1653,14 +1654,16 @@ static void setup_structure(void)
 	if (!tdb_ctx)
 		barf_perror("Could not create tdb file %s", tdbname);
 
-	manual_node("/", "tool");
-	manual_node("/tool", "xenstored");
-	manual_node("/tool/xenstored", NULL);
+	if (live_update)
+		manual_node("/", NULL);
+	else {
+		manual_node("/", "tool");
+		manual_node("/tool", "xenstored");
+	}
 
 	check_store();
 }
 
-
 static unsigned int hash_from_key_fn(void *k)
 {
 	char *str = k;
@@ -2066,7 +2069,8 @@ int main(int argc, char *argv[])
 
 	if (dofork) {
 		openlog("xenstored", 0, LOG_DAEMON);
-		daemonize();
+		if (!live_update)
+			daemonize();
 	}
 	if (pidfile)
 		write_pidfile(pidfile);
@@ -2081,17 +2085,20 @@ int main(int argc, char *argv[])
 	talloc_enable_null_tracking();
 
 #ifndef NO_SOCKETS
-	init_sockets();
+	if (!live_update)
+		init_sockets();
 #endif
 
 	init_pipe(reopen_log_pipe);
 
 	/* Setup the database */
-	setup_structure();
+	setup_structure(live_update);
 
 	/* Listen to hypervisor. */
-	if (!no_domain_init)
-		domain_init();
+	if (!no_domain_init && !live_update) {
+		domain_init(-1);
+		dom0_init();
+	}
 
 	if (outputpid) {
 		printf("%ld\n", (long)getpid());
@@ -2099,13 +2106,21 @@ int main(int argc, char *argv[])
 	}
 
 	/* redirect to /dev/null now we're ready to accept connections */
-	if (dofork)
+	if (dofork && !live_update)
 		finish_daemonize();
+#ifndef __MINIOS__
+	if (dofork)
+		xprintf = trace;
+#endif
 
 	signal(SIGHUP, trigger_reopen_log);
 	if (tracefile)
 		tracefile = talloc_strdup(NULL, tracefile);
 
+	/* Read state in case of live update. */
+	if (live_update)
+		lu_read_state();
+
 	/* Get ready to listen to the tools. */
 	initialize_fds(&sock_pollfd_idx, &timeout);
 
@@ -2113,8 +2128,10 @@ int main(int argc, char *argv[])
 	xenbus_notify_running();
 
 #if defined(XEN_SYSTEMD_ENABLED)
-	sd_notify(1, "READY=1");
-	fprintf(stderr, SD_NOTICE "xenstored is ready\n");
+	if (!live_update) {
+		sd_notify(1, "READY=1");
+		fprintf(stderr, SD_NOTICE "xenstored is ready\n");
+	}
 #endif
 
 	/* Main loop. */
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 71b078caf3..94dd501a3b 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -706,29 +706,23 @@ bool check_perms_special(const char *name, struct connection *conn)
 	return perm_for_conn(conn, p) & XS_PERM_READ;
 }
 
-static int dom0_init(void) 
-{ 
+void dom0_init(void)
+{
 	evtchn_port_t port;
 	struct domain *dom0;
 
 	port = xenbus_evtchn();
 	if (port == -1)
-		return -1;
+		barf_perror("Failed to initialize dom0 port");
 
 	dom0 = introduce_domain(NULL, xenbus_master_domid(), port);
 	if (!dom0)
-		return -1;
+		barf_perror("Failed to initialize dom0");
 
 	xenevtchn_notify(xce_handle, dom0->port);
-
-	if (set_dom_perms_default(&dom_release_perms) ||
-	    set_dom_perms_default(&dom_introduce_perms))
-		return -1;
-
-	return 0; 
 }
 
-void domain_init(void)
+void domain_init(int evtfd)
 {
 	int rc;
 
@@ -758,13 +752,17 @@ void domain_init(void)
 
 	talloc_set_destructor(xgt_handle, close_xgt_handle);
 
-	xce_handle = xenevtchn_open(NULL, XENEVTCHN_NO_CLOEXEC);
+	if (evtfd < 0)
+		xce_handle = xenevtchn_open(NULL, XENEVTCHN_NO_CLOEXEC);
+	else
+		xce_handle = xenevtchn_open_fd(NULL, evtfd, 0);
 
 	if (xce_handle == NULL)
 		barf_perror("Failed to open evtchn device");
 
-	if (dom0_init() != 0) 
-		barf_perror("Failed to initialize dom0 state"); 
+	if (set_dom_perms_default(&dom_release_perms) ||
+	    set_dom_perms_default(&dom_introduce_perms))
+		barf_perror("Failed to set special permissions");
 
 	if ((rc = xenevtchn_bind_virq(xce_handle, VIRQ_DOM_EXC)) == -1)
 		barf_perror("Failed to bind to domain exception virq port");
diff --git a/tools/xenstore/xenstored_domain.h b/tools/xenstore/xenstored_domain.h
index 413b974375..b20269b038 100644
--- a/tools/xenstore/xenstored_domain.h
+++ b/tools/xenstore/xenstored_domain.h
@@ -42,7 +42,8 @@ int do_get_domain_path(struct connection *conn, struct buffered_data *in);
 /* Allow guest to reset all watches */
 int do_reset_watches(struct connection *conn, struct buffered_data *in);
 
-void domain_init(void);
+void domain_init(int evtfd);
+void dom0_init(void);
 
 /* Returns the implicit path of a connection (only domains have this) */
 const char *get_implicit_path(const struct connection *conn);
diff --git a/tools/xenstore/xenstored_posix.c b/tools/xenstore/xenstored_posix.c
index ae3e63e07f..48c37ffe3e 100644
--- a/tools/xenstore/xenstored_posix.c
+++ b/tools/xenstore/xenstored_posix.c
@@ -85,7 +85,6 @@ void finish_daemonize(void)
 	dup2(devnull, STDOUT_FILENO);
 	dup2(devnull, STDERR_FILENO);
 	close(devnull);
-	xprintf = trace;
 }
 
 void init_pipe(int reopen_log_pipe[2])
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 19/25] tools/xenstore: read internal state when doing live upgrade
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (17 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 18/25] tools/xenstore: evaluate the live update flag when starting Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 20/25] tools/xenstore: add reading global state for live update Juergen Gross
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

When started due to a live upgrade read the internal state and apply
it to the data base and internal structures.

Add the main control functions for that.

For now only handle the daemon case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
---
V4:
- directly mmap dump state file (Julien Grall)
- use syslog() instead of xprintf() (Julien Grall)

V8:
- remove state file after reading it (Julien Grall)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c | 102 ++++++++++++++++++++++++++++-
 1 file changed, 101 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index a539666410..129d2b44bb 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -62,6 +62,14 @@ struct live_update {
 
 static struct live_update *lu_status;
 
+struct lu_dump_state {
+	void *buf;
+	unsigned int size;
+#ifndef __MINIOS__
+	int fd;
+#endif
+};
+
 static int lu_destroy(void *data)
 {
 #ifdef __MINIOS__
@@ -313,6 +321,14 @@ static void lu_dump_close(FILE *fp)
 
 	fclose(fp);
 }
+
+static void lu_get_dump_state(struct lu_dump_state *state)
+{
+}
+
+static void lu_close_dump_state(struct lu_dump_state *state)
+{
+}
 #else
 static const char *lu_binary(const void *ctx, struct connection *conn,
 			     const char *filename)
@@ -368,6 +384,50 @@ static void lu_dump_close(FILE *fp)
 {
 	fclose(fp);
 }
+
+static void lu_get_dump_state(struct lu_dump_state *state)
+{
+	char *filename;
+	struct stat statbuf;
+
+	state->size = 0;
+
+	filename = talloc_asprintf(NULL, "%s/state_dump", xs_daemon_rootdir());
+	if (!filename)
+		barf("Allocation failure");
+
+	state->fd = open(filename, O_RDONLY);
+	talloc_free(filename);
+	if (state->fd < 0)
+		return;
+	if (fstat(state->fd, &statbuf) != 0)
+		goto out_close;
+	state->size = statbuf.st_size;
+
+	state->buf = mmap(NULL, state->size, PROT_READ, MAP_PRIVATE,
+			  state->fd, 0);
+	if (state->buf == MAP_FAILED) {
+		state->size = 0;
+		goto out_close;
+	}
+
+	return;
+
+ out_close:
+	close(state->fd);
+}
+
+static void lu_close_dump_state(struct lu_dump_state *state)
+{
+	char *filename;
+
+	munmap(state->buf, state->size);
+	close(state->fd);
+
+	filename = talloc_asprintf(NULL, "%s/state_dump", xs_daemon_rootdir());
+	unlink(filename);
+	talloc_free(filename);
+}
 #endif
 
 static const char *lu_check_lu_allowed(const void *ctx, bool force,
@@ -438,7 +498,47 @@ static const char *lu_dump_state(const void *ctx, struct connection *conn)
 
 void lu_read_state(void)
 {
-	xprintf("live-update: read state\n");
+	struct lu_dump_state state;
+	struct xs_state_record_header *head;
+	void *ctx = talloc_new(NULL); /* Work context for subfunctions. */
+	struct xs_state_preamble *pre;
+
+	syslog(LOG_INFO, "live-update: read state\n");
+	lu_get_dump_state(&state);
+	if (state.size == 0)
+		barf_perror("No state found after live-update");
+
+	pre = state.buf;
+	if (memcmp(pre->ident, XS_STATE_IDENT, sizeof(pre->ident)) ||
+	    pre->version != htobe32(XS_STATE_VERSION) ||
+	    pre->flags != XS_STATE_FLAGS)
+		barf("Unknown record identifier");
+	for (head = state.buf + sizeof(*pre);
+	     head->type != XS_STATE_TYPE_END &&
+		(void *)head - state.buf < state.size;
+	     head = (void *)head + sizeof(*head) + head->length) {
+		switch (head->type) {
+		case XS_STATE_TYPE_GLOBAL:
+			break;
+		case XS_STATE_TYPE_CONN:
+			break;
+		case XS_STATE_TYPE_WATCH:
+			break;
+		case XS_STATE_TYPE_TA:
+			xprintf("live-update: ignore transaction record\n");
+			break;
+		case XS_STATE_TYPE_NODE:
+			break;
+		default:
+			xprintf("live-update: unknown state record %08x\n",
+				head->type);
+			break;
+		}
+	}
+
+	lu_close_dump_state(&state);
+
+	talloc_free(ctx);
 }
 
 static const char *lu_activate_binary(const void *ctx)
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 20/25] tools/xenstore: add reading global state for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (18 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 19/25] tools/xenstore: read internal state when doing live upgrade Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:35 ` [PATCH v10 21/25] tools/xenstore: add read connection " Juergen Gross
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall, Paul Durrant

Add reading the global state for live update.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 tools/xenstore/xenstored_control.c | 1 +
 tools/xenstore/xenstored_core.c    | 9 +++++++++
 tools/xenstore/xenstored_core.h    | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 129d2b44bb..f6c4ab3d8a 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -519,6 +519,7 @@ void lu_read_state(void)
 	     head = (void *)head + sizeof(*head) + head->length) {
 		switch (head->type) {
 		case XS_STATE_TYPE_GLOBAL:
+			read_state_global(ctx, head + 1);
 			break;
 		case XS_STATE_TYPE_CONN:
 			break;
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index d6f8373ee0..5922a03a98 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -2451,6 +2451,15 @@ const char *dump_state_nodes(FILE *fp, const void *ctx)
 	return dump_state_node_tree(fp, path);
 }
 
+void read_state_global(const void *ctx, const void *state)
+{
+	const struct xs_state_global *glb = state;
+
+	sock = glb->socket_fd;
+
+	domain_init(glb->evtchn_fd);
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index e40e0e6806..6c9d838f11 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -244,6 +244,8 @@ const char *dump_state_node_perms(FILE *fp, struct xs_state_node *sn,
 				  const struct xs_permissions *perms,
 				  unsigned int n_perms);
 
+void read_state_global(const void *ctx, const void *state);
+
 #endif /* _XENSTORED_CORE_H */
 
 /*
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 21/25] tools/xenstore: add read connection state for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (19 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 20/25] tools/xenstore: add reading global state for live update Juergen Gross
@ 2020-12-15 16:35 ` Juergen Gross
  2020-12-15 16:36 ` [PATCH v10 22/25] tools/xenstore: add read node " Juergen Gross
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

Add the needed functions for reading connection state for live update.

As the connection is identified by a unique connection id in the state
records we need to add this to struct connection. Add a new function
to return the connection based on a connection id.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V2:
- fixed condition in introduce_domain() (Julien Grall)

V4:
- set pending data msg type to XS_INVALID (Julien Grall)
- add buffered read data (Julien Grall)

V5:
- really read buffered read data (Julien Grall)
- drop conn parameter from introduce_domain() (Paul Durrant)
- split pending write data into individual buffers

V6:
- rename "first" to "partial" (Paul Durrant)

V7:
- use local port from connection data

V8:
- remove dom0 special handling

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c |   1 +
 tools/xenstore/xenstored_core.c    | 102 ++++++++++++++++++++++++++++-
 tools/xenstore/xenstored_core.h    |  10 +++
 tools/xenstore/xenstored_domain.c  |  60 +++++++++++++----
 tools/xenstore/xenstored_domain.h  |   2 +
 5 files changed, 162 insertions(+), 13 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index f6c4ab3d8a..4bf075ad79 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -522,6 +522,7 @@ void lu_read_state(void)
 			read_state_global(ctx, head + 1);
 			break;
 		case XS_STATE_TYPE_CONN:
+			read_state_connection(ctx, head + 1);
 			break;
 		case XS_STATE_TYPE_WATCH:
 			break;
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 5922a03a98..2ad1cc8d44 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1532,12 +1532,35 @@ struct connection *new_connection(connwritefn_t *write, connreadfn_t *read)
 	return new;
 }
 
+struct connection *get_connection_by_id(unsigned int conn_id)
+{
+	struct connection *conn;
+
+	list_for_each_entry(conn, &connections, list)
+		if (conn->conn_id == conn_id)
+			return conn;
+
+	return NULL;
+}
+
 #ifdef NO_SOCKETS
 static void accept_connection(int sock)
 {
 }
+
+int writefd(struct connection *conn, const void *data, unsigned int len)
+{
+	errno = EBADF;
+	return -1;
+}
+
+int readfd(struct connection *conn, void *data, unsigned int len)
+{
+	errno = EBADF;
+	return -1;
+}
 #else
-static int writefd(struct connection *conn, const void *data, unsigned int len)
+int writefd(struct connection *conn, const void *data, unsigned int len)
 {
 	int rc;
 
@@ -1553,7 +1576,7 @@ static int writefd(struct connection *conn, const void *data, unsigned int len)
 	return rc;
 }
 
-static int readfd(struct connection *conn, void *data, unsigned int len)
+int readfd(struct connection *conn, void *data, unsigned int len)
 {
 	int rc;
 
@@ -2460,6 +2483,81 @@ void read_state_global(const void *ctx, const void *state)
 	domain_init(glb->evtchn_fd);
 }
 
+static void add_buffered_data(struct buffered_data *bdata,
+			      struct connection *conn, const uint8_t *data,
+			      unsigned int len)
+{
+	bdata->hdr.msg.len = len;
+	if (len <= DEFAULT_BUFFER_SIZE)
+		bdata->buffer = bdata->default_buffer;
+	else
+		bdata->buffer = talloc_array(bdata, char, len);
+	if (!bdata->buffer)
+		barf("error restoring buffered data");
+
+	memcpy(bdata->buffer, data, len);
+
+	/* Queue for later transmission. */
+	list_add_tail(&bdata->list, &conn->out_list);
+}
+
+void read_state_buffered_data(const void *ctx, struct connection *conn,
+			      const struct xs_state_connection *sc)
+{
+	struct buffered_data *bdata;
+	const uint8_t *data;
+	unsigned int len;
+	bool partial = sc->data_resp_len;
+
+	if (sc->data_in_len) {
+		bdata = new_buffer(conn);
+		if (!bdata)
+			barf("error restoring read data");
+		if (sc->data_in_len < sizeof(bdata->hdr)) {
+			bdata->inhdr = true;
+			memcpy(&bdata->hdr, sc->data, sc->data_in_len);
+			bdata->used = sc->data_in_len;
+		} else {
+			bdata->inhdr = false;
+			memcpy(&bdata->hdr, sc->data, sizeof(bdata->hdr));
+			if (bdata->hdr.msg.len <= DEFAULT_BUFFER_SIZE)
+				bdata->buffer = bdata->default_buffer;
+			else
+				bdata->buffer = talloc_array(bdata, char,
+							bdata->hdr.msg.len);
+			if (!bdata->buffer)
+				barf("Error allocating in buffer");
+			bdata->used = sc->data_in_len - sizeof(bdata->hdr);
+			memcpy(bdata->buffer, sc->data + sizeof(bdata->hdr),
+			       bdata->used);
+		}
+
+		conn->in = bdata;
+	}
+
+	for (data = sc->data + sc->data_in_len;
+	     data < sc->data + sc->data_in_len + sc->data_out_len;
+	     data += len) {
+		bdata = new_buffer(conn);
+		if (!bdata)
+			barf("error restoring buffered data");
+		if (partial) {
+			bdata->inhdr = false;
+			/* Make trace look nice. */
+			bdata->hdr.msg.type = XS_INVALID;
+			len = sc->data_resp_len;
+			add_buffered_data(bdata, conn, data, len);
+			partial = false;
+			continue;
+		}
+
+		memcpy(&bdata->hdr, data, sizeof(bdata->hdr));
+		data += sizeof(bdata->hdr);
+		len = bdata->hdr.msg.len;
+		add_buffered_data(bdata, conn, data, len);
+	}
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 6c9d838f11..cb256626fe 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -118,6 +118,9 @@ struct connection
 	/* Methods for communicating over this connection: write can be NULL */
 	connwritefn_t *write;
 	connreadfn_t *read;
+
+	/* Support for live update: connection id. */
+	unsigned int conn_id;
 };
 extern struct list_head connections;
 
@@ -178,6 +181,7 @@ struct node *read_node(struct connection *conn, const void *ctx,
 		       const char *name);
 
 struct connection *new_connection(connwritefn_t *write, connreadfn_t *read);
+struct connection *get_connection_by_id(unsigned int conn_id);
 void check_store(void);
 void corrupt(struct connection *conn, const char *fmt, ...);
 enum xs_perm_type perm_for_conn(struct connection *conn,
@@ -229,6 +233,10 @@ void finish_daemonize(void);
 /* Open a pipe for signal handling */
 void init_pipe(int reopen_log_pipe[2]);
 
+int writefd(struct connection *conn, const void *data, unsigned int len);
+int readfd(struct connection *conn, void *data, unsigned int len);
+
+extern struct interface_funcs socket_funcs;
 extern xengnttab_handle **xgt_handle;
 
 int remember_string(struct hashtable *hash, const char *str);
@@ -245,6 +253,8 @@ const char *dump_state_node_perms(FILE *fp, struct xs_state_node *sn,
 				  unsigned int n_perms);
 
 void read_state_global(const void *ctx, const void *state);
+void read_state_buffered_data(const void *ctx, struct connection *conn,
+			      const struct xs_state_connection *sc);
 
 #endif /* _XENSTORED_CORE_H */
 
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 94dd501a3b..0cd8234bd1 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -355,7 +355,7 @@ static struct domain *find_or_alloc_domain(const void *ctx, unsigned int domid)
 	return domain ? : alloc_domain(ctx, domid);
 }
 
-static int new_domain(struct domain *domain, int port)
+static int new_domain(struct domain *domain, int port, bool restore)
 {
 	int rc;
 
@@ -369,11 +369,16 @@ static int new_domain(struct domain *domain, int port)
 
 	wrl_domain_new(domain);
 
-	/* Tell kernel we're interested in this event. */
-	rc = xenevtchn_bind_interdomain(xce_handle, domain->domid, port);
-	if (rc == -1)
-		return errno;
-	domain->port = rc;
+	if (restore)
+		domain->port = port;
+	else {
+		/* Tell kernel we're interested in this event. */
+		rc = xenevtchn_bind_interdomain(xce_handle, domain->domid,
+						port);
+		if (rc == -1)
+			return errno;
+		domain->port = rc;
+	}
 
 	domain->introduced = true;
 
@@ -423,7 +428,7 @@ static void domain_conn_reset(struct domain *domain)
 
 static struct domain *introduce_domain(const void *ctx,
 				       unsigned int domid,
-				       evtchn_port_t port)
+				       evtchn_port_t port, bool restore)
 {
 	struct domain *domain;
 	int rc;
@@ -439,7 +444,7 @@ static struct domain *introduce_domain(const void *ctx,
 					     : map_interface(domid);
 		if (!interface)
 			return NULL;
-		if (new_domain(domain, port)) {
+		if (new_domain(domain, port, restore)) {
 			rc = errno;
 			if (is_master_domain)
 				unmap_xenbus(interface);
@@ -453,7 +458,7 @@ static struct domain *introduce_domain(const void *ctx,
 		/* Now domain belongs to its connection. */
 		talloc_steal(domain->conn, domain);
 
-		if (!is_master_domain)
+		if (!is_master_domain && !restore)
 			fire_watches(NULL, ctx, "@introduceDomain", NULL,
 				     false, NULL);
 	} else {
@@ -486,7 +491,7 @@ int do_introduce(struct connection *conn, struct buffered_data *in)
 	if (port <= 0)
 		return EINVAL;
 
-	domain = introduce_domain(in, domid, port);
+	domain = introduce_domain(in, domid, port, false);
 	if (!domain)
 		return errno;
 
@@ -715,7 +720,7 @@ void dom0_init(void)
 	if (port == -1)
 		barf_perror("Failed to initialize dom0 port");
 
-	dom0 = introduce_domain(NULL, xenbus_master_domid(), port);
+	dom0 = introduce_domain(NULL, xenbus_master_domid(), port, false);
 	if (!dom0)
 		barf_perror("Failed to initialize dom0");
 
@@ -1261,6 +1266,39 @@ const char *dump_state_special_nodes(FILE *fp)
 	return ret;
 }
 
+void read_state_connection(const void *ctx, const void *state)
+{
+	const struct xs_state_connection *sc = state;
+	struct connection *conn;
+	struct domain *domain, *tdomain;
+
+	if (sc->conn_type == XS_STATE_CONN_TYPE_SOCKET) {
+		conn = new_connection(writefd, readfd);
+		if (!conn)
+			barf("error restoring connection");
+		conn->fd = sc->spec.socket_fd;
+	} else {
+		domain = introduce_domain(ctx, sc->spec.ring.domid,
+					  sc->spec.ring.evtchn, true);
+		if (!domain)
+			barf("domain allocation error");
+
+		if (sc->spec.ring.tdomid != DOMID_INVALID) {
+			tdomain = find_or_alloc_domain(ctx,
+						       sc->spec.ring.tdomid);
+			if (!tdomain)
+				barf("target domain allocation error");
+			talloc_reference(domain->conn, tdomain->conn);
+			domain->conn->target = tdomain->conn;
+		}
+		conn = domain->conn;
+	}
+
+	conn->conn_id = sc->conn_id;
+
+	read_state_buffered_data(ctx, conn, sc);
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_domain.h b/tools/xenstore/xenstored_domain.h
index b20269b038..8f3b4e0f8b 100644
--- a/tools/xenstore/xenstored_domain.h
+++ b/tools/xenstore/xenstored_domain.h
@@ -101,4 +101,6 @@ void wrl_apply_debit_trans_commit(struct connection *conn);
 const char *dump_state_connections(FILE *fp, struct connection *conn);
 const char *dump_state_special_nodes(FILE *fp);
 
+void read_state_connection(const void *ctx, const void *state);
+
 #endif /* _XENSTORED_DOMAIN_H */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 22/25] tools/xenstore: add read node state for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (20 preceding siblings ...)
  2020-12-15 16:35 ` [PATCH v10 21/25] tools/xenstore: add read connection " Juergen Gross
@ 2020-12-15 16:36 ` Juergen Gross
  2020-12-15 16:36 ` [PATCH v10 23/25] tools/xenstore: add read watch " Juergen Gross
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant, Julien Grall

Add the needed functions for reading node state for live update.

This requires some refactoring of current node handling in Xenstore in
order to avoid repeating the same code patterns multiple times.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
---
V4:
- drop local node handling (Julien Grall)

V5:
- use set_tdb_key (Paul Durrant)

V6:
- add permission flag handling (Julien Grall)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c |   1 +
 tools/xenstore/xenstored_core.c    | 105 ++++++++++++++++++++++++++---
 tools/xenstore/xenstored_core.h    |   1 +
 3 files changed, 96 insertions(+), 11 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 4bf075ad79..a978ccf17e 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -530,6 +530,7 @@ void lu_read_state(void)
 			xprintf("live-update: ignore transaction record\n");
 			break;
 		case XS_STATE_TYPE_NODE:
+			read_state_node(ctx, head + 1);
 			break;
 		default:
 			xprintf("live-update: unknown state record %08x\n",
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 2ad1cc8d44..649dfb534a 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -930,13 +930,30 @@ static char *basename(const char *name)
 	return strrchr(name, '/') + 1;
 }
 
-static struct node *construct_node(struct connection *conn, const void *ctx,
-				   const char *name)
+static int add_child(const void *ctx, struct node *parent, const char *name)
 {
 	const char *base;
 	unsigned int baselen;
+	char *children;
+
+	base = basename(name);
+	baselen = strlen(base) + 1;
+	children = talloc_array(ctx, char, parent->childlen + baselen);
+	if (!children)
+		return ENOMEM;
+	memcpy(children, parent->children, parent->childlen);
+	memcpy(children + parent->childlen, base, baselen);
+	parent->children = children;
+	parent->childlen += baselen;
+
+	return 0;
+}
+
+static struct node *construct_node(struct connection *conn, const void *ctx,
+				   const char *name)
+{
 	struct node *parent, *node;
-	char *children, *parentname = get_parent(ctx, name);
+	char *parentname = get_parent(ctx, name);
 
 	if (!parentname)
 		return NULL;
@@ -949,15 +966,8 @@ static struct node *construct_node(struct connection *conn, const void *ctx,
 		return NULL;
 
 	/* Add child to parent. */
-	base = basename(name);
-	baselen = strlen(base) + 1;
-	children = talloc_array(ctx, char, parent->childlen + baselen);
-	if (!children)
+	if (add_child(ctx, parent, name))
 		goto nomem;
-	memcpy(children, parent->children, parent->childlen);
-	memcpy(children + parent->childlen, base, baselen);
-	parent->children = children;
-	parent->childlen += baselen;
 
 	/* Allocate node */
 	node = talloc(ctx, struct node);
@@ -2558,6 +2568,79 @@ void read_state_buffered_data(const void *ctx, struct connection *conn,
 	}
 }
 
+void read_state_node(const void *ctx, const void *state)
+{
+	const struct xs_state_node *sn = state;
+	struct node *node, *parent;
+	TDB_DATA key;
+	char *name, *parentname;
+	unsigned int i;
+	struct connection conn = { .id = priv_domid };
+
+	name = (char *)(sn->perms + sn->perm_n);
+	node = talloc(ctx, struct node);
+	if (!node)
+		barf("allocation error restoring node");
+
+	node->name = name;
+	node->generation = ++generation;
+	node->datalen = sn->data_len;
+	node->data = name + sn->path_len;
+	node->childlen = 0;
+	node->children = NULL;
+	node->perms.num = sn->perm_n;
+	node->perms.p = talloc_array(node, struct xs_permissions,
+				     node->perms.num);
+	if (!node->perms.p)
+		barf("allocation error restoring node");
+	for (i = 0; i < node->perms.num; i++) {
+		switch (sn->perms[i].access) {
+		case 'r':
+			node->perms.p[i].perms = XS_PERM_READ;
+			break;
+		case 'w':
+			node->perms.p[i].perms = XS_PERM_WRITE;
+			break;
+		case 'b':
+			node->perms.p[i].perms = XS_PERM_READ | XS_PERM_WRITE;
+			break;
+		default:
+			node->perms.p[i].perms = XS_PERM_NONE;
+			break;
+		}
+		if (sn->perms[i].flags & XS_STATE_NODE_PERM_IGNORE)
+			node->perms.p[i].perms |= XS_PERM_IGNORE;
+		node->perms.p[i].id = sn->perms[i].domid;
+	}
+
+	if (strstarts(name, "@")) {
+		set_perms_special(&conn, name, &node->perms);
+		talloc_free(node);
+		return;
+	}
+
+	parentname = get_parent(node, name);
+	if (!parentname)
+		barf("allocation error restoring node");
+	parent = read_node(NULL, node, parentname);
+	if (!parent)
+		barf("read parent error restoring node");
+
+	if (add_child(node, parent, name))
+		barf("allocation error restoring node");
+
+	set_tdb_key(parentname, &key);
+	if (write_node_raw(NULL, &key, parent, true))
+		barf("write parent error restoring node");
+
+	set_tdb_key(name, &key);
+	if (write_node_raw(NULL, &key, node, true))
+		barf("write node error restoring node");
+	domain_entry_inc(&conn, node);
+
+	talloc_free(node);
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index cb256626fe..0af2f364bf 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -255,6 +255,7 @@ const char *dump_state_node_perms(FILE *fp, struct xs_state_node *sn,
 void read_state_global(const void *ctx, const void *state);
 void read_state_buffered_data(const void *ctx, struct connection *conn,
 			      const struct xs_state_connection *sc);
+void read_state_node(const void *ctx, const void *state);
 
 #endif /* _XENSTORED_CORE_H */
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 23/25] tools/xenstore: add read watch state for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (21 preceding siblings ...)
  2020-12-15 16:36 ` [PATCH v10 22/25] tools/xenstore: add read node " Juergen Gross
@ 2020-12-15 16:36 ` Juergen Gross
  2020-12-15 16:36 ` [PATCH v10 24/25] tools/xenstore: handle dying domains in " Juergen Gross
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Paul Durrant

Add reading the watch state records for live update.

This requires factoring out some of the add watch functionality into a
dedicated function.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
V4:
- add comment (Julien Grall)

V6:
- correct check_watch_path() (setting errno)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c |   2 +
 tools/xenstore/xenstored_watch.c   | 114 +++++++++++++++++++++--------
 tools/xenstore/xenstored_watch.h   |   2 +
 3 files changed, 88 insertions(+), 30 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index a978ccf17e..8a1e3b35fe 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -35,6 +35,7 @@ Interactive commands for Xen Store Daemon.
 #include "xenstored_core.h"
 #include "xenstored_control.h"
 #include "xenstored_domain.h"
+#include "xenstored_watch.h"
 
 /* Mini-OS only knows about MAP_ANON. */
 #ifndef MAP_ANONYMOUS
@@ -525,6 +526,7 @@ void lu_read_state(void)
 			read_state_connection(ctx, head + 1);
 			break;
 		case XS_STATE_TYPE_WATCH:
+			read_state_watch(ctx, head + 1);
 			break;
 		case XS_STATE_TYPE_TA:
 			xprintf("live-update: ignore transaction record\n");
diff --git a/tools/xenstore/xenstored_watch.c b/tools/xenstore/xenstored_watch.c
index 9248f08bd9..db89e0141f 100644
--- a/tools/xenstore/xenstored_watch.c
+++ b/tools/xenstore/xenstored_watch.c
@@ -205,6 +205,62 @@ static int destroy_watch(void *_watch)
 	return 0;
 }
 
+static int check_watch_path(struct connection *conn, const void *ctx,
+			    char **path, bool *relative)
+{
+	/* Check if valid event. */
+	if (strstarts(*path, "@")) {
+		*relative = false;
+		if (strlen(*path) > XENSTORE_REL_PATH_MAX)
+			goto inval;
+	} else {
+		*relative = !strstarts(*path, "/");
+		*path = canonicalize(conn, ctx, *path);
+		if (!*path)
+			return errno;
+		if (!is_valid_nodename(*path))
+			goto inval;
+	}
+
+	return 0;
+
+ inval:
+	errno = EINVAL;
+	return errno;
+}
+
+static struct watch *add_watch(struct connection *conn, char *path, char *token,
+			       bool relative)
+{
+	struct watch *watch;
+
+	watch = talloc(conn, struct watch);
+	if (!watch)
+		goto nomem;
+	watch->node = talloc_strdup(watch, path);
+	watch->token = talloc_strdup(watch, token);
+	if (!watch->node || !watch->token)
+		goto nomem;
+
+	if (relative)
+		watch->relative_path = get_implicit_path(conn);
+	else
+		watch->relative_path = NULL;
+
+	INIT_LIST_HEAD(&watch->events);
+
+	domain_watch_inc(conn);
+	list_add_tail(&watch->list, &conn->watches);
+	talloc_set_destructor(watch, destroy_watch);
+
+	return watch;
+
+ nomem:
+	talloc_free(watch);
+	errno = ENOMEM;
+	return NULL;
+}
+
 int do_watch(struct connection *conn, struct buffered_data *in)
 {
 	struct watch *watch;
@@ -214,19 +270,9 @@ int do_watch(struct connection *conn, struct buffered_data *in)
 	if (get_strings(in, vec, ARRAY_SIZE(vec)) != ARRAY_SIZE(vec))
 		return EINVAL;
 
-	if (strstarts(vec[0], "@")) {
-		relative = false;
-		if (strlen(vec[0]) > XENSTORE_REL_PATH_MAX)
-			return EINVAL;
-		/* check if valid event */
-	} else {
-		relative = !strstarts(vec[0], "/");
-		vec[0] = canonicalize(conn, in, vec[0]);
-		if (!vec[0])
-			return ENOMEM;
-		if (!is_valid_nodename(vec[0]))
-			return EINVAL;
-	}
+	errno = check_watch_path(conn, in, &(vec[0]), &relative);
+	if (errno)
+		return errno;
 
 	/* Check for duplicates. */
 	list_for_each_entry(watch, &conn->watches, list) {
@@ -238,26 +284,11 @@ int do_watch(struct connection *conn, struct buffered_data *in)
 	if (domain_watch(conn) > quota_nb_watch_per_domain)
 		return E2BIG;
 
-	watch = talloc(conn, struct watch);
+	watch = add_watch(conn, vec[0], vec[1], relative);
 	if (!watch)
-		return ENOMEM;
-	watch->node = talloc_strdup(watch, vec[0]);
-	watch->token = talloc_strdup(watch, vec[1]);
-	if (!watch->node || !watch->token) {
-		talloc_free(watch);
-		return ENOMEM;
-	}
-	if (relative)
-		watch->relative_path = get_implicit_path(conn);
-	else
-		watch->relative_path = NULL;
+		return errno;
 
-	INIT_LIST_HEAD(&watch->events);
-
-	domain_watch_inc(conn);
-	list_add_tail(&watch->list, &conn->watches);
 	trace_create(watch, "watch");
-	talloc_set_destructor(watch, destroy_watch);
 	send_ack(conn, XS_WATCH);
 
 	/* We fire once up front: simplifies clients and restart. */
@@ -338,6 +369,29 @@ const char *dump_state_watches(FILE *fp, struct connection *conn,
 	return ret;
 }
 
+void read_state_watch(const void *ctx, const void *state)
+{
+	const struct xs_state_watch *sw = state;
+	struct connection *conn;
+	char *path, *token;
+	bool relative;
+
+	conn = get_connection_by_id(sw->conn_id);
+	if (!conn)
+		barf("connection not found for read watch");
+
+	path = (char *)sw->data;
+	token = path + sw->path_length;
+
+	/* Don't check success, we want the relative information only. */
+	check_watch_path(conn, ctx, &path, &relative);
+	if (!path)
+		barf("allocation error for read watch");
+
+	if (!add_watch(conn, path, token, relative))
+		barf("error adding watch");
+}
+
 /*
  * Local variables:
  *  mode: C
diff --git a/tools/xenstore/xenstored_watch.h b/tools/xenstore/xenstored_watch.h
index 3d81645f45..0e693f0839 100644
--- a/tools/xenstore/xenstored_watch.h
+++ b/tools/xenstore/xenstored_watch.h
@@ -33,4 +33,6 @@ void conn_delete_all_watches(struct connection *conn);
 const char *dump_state_watches(FILE *fp, struct connection *conn,
 			       unsigned int conn_id);
 
+void read_state_watch(const void *ctx, const void *state);
+
 #endif /* _XENSTORED_WATCH_H */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 24/25] tools/xenstore: handle dying domains in live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (22 preceding siblings ...)
  2020-12-15 16:36 ` [PATCH v10 23/25] tools/xenstore: add read watch " Juergen Gross
@ 2020-12-15 16:36 ` Juergen Gross
  2020-12-15 16:36 ` [PATCH v10 25/25] tools/xenstore: activate new binary for " Juergen Gross
  2021-01-05 12:26 ` [PATCH v10 00/25] tools/xenstore: support live update for xenstored Wei Liu
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, Ian Jackson, Wei Liu, Juergen Gross, Paul Durrant

From: Julien Grall <jgrall@amazon.com>

A domain could just be dying when live updating Xenstore, so the case
of not being able to map the ring page or to connect to the event
channel must be handled gracefully.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>
---
V4:
- new patch (Julien, I hope adding the Sob: is okay?)

V10:
- removed "XXX..." comment (Julien Grall)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c |  7 +++++++
 tools/xenstore/xenstored_domain.c  | 25 +++++++++++++++++--------
 tools/xenstore/xenstored_domain.h  |  2 ++
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 8a1e3b35fe..dee55de264 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -544,6 +544,13 @@ void lu_read_state(void)
 	lu_close_dump_state(&state);
 
 	talloc_free(ctx);
+
+	/*
+	 * We may have missed the VIRQ_DOM_EXC notification and a domain may
+	 * have died while we were live-updating. So check all the domains are
+	 * still alive.
+	 */
+	check_domains(true);
 }
 
 static const char *lu_activate_binary(const void *ctx)
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 0cd8234bd1..dfda90c791 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -220,7 +220,7 @@ static bool get_domain_info(unsigned int domid, xc_dominfo_t *dominfo)
 	       dominfo->domid == domid;
 }
 
-static void domain_cleanup(void)
+void check_domains(bool restore)
 {
 	xc_dominfo_t dominfo;
 	struct domain *domain;
@@ -244,7 +244,14 @@ static void domain_cleanup(void)
 				domain->shutdown = true;
 				notify = 1;
 			}
-			if (!dominfo.dying)
+			/*
+			 * On Restore, we may have been unable to remap the
+			 * interface and the port. As we don't know whether
+			 * this was because of a dying domain, we need to
+			 * check if the interface and port are still valid.
+			 */
+			if (!dominfo.dying && domain->port &&
+			    domain->interface)
 				continue;
 		}
 		if (domain->conn) {
@@ -270,7 +277,7 @@ void handle_event(void)
 		barf_perror("Failed to read from event fd");
 
 	if (port == virq_port)
-		domain_cleanup();
+		check_domains(false);
 
 	if (xenevtchn_unmask(xce_handle, port) == -1)
 		barf_perror("Failed to write to event fd");
@@ -442,14 +449,16 @@ static struct domain *introduce_domain(const void *ctx,
 	if (!domain->introduced) {
 		interface = is_master_domain ? xenbus_map()
 					     : map_interface(domid);
-		if (!interface)
+		if (!interface && !restore)
 			return NULL;
 		if (new_domain(domain, port, restore)) {
 			rc = errno;
-			if (is_master_domain)
-				unmap_xenbus(interface);
-			else
-				unmap_interface(interface);
+			if (interface) {
+				if (is_master_domain)
+					unmap_xenbus(interface);
+				else
+					unmap_interface(interface);
+			}
 			errno = rc;
 			return NULL;
 		}
diff --git a/tools/xenstore/xenstored_domain.h b/tools/xenstore/xenstored_domain.h
index 8f3b4e0f8b..1cc1c03ed8 100644
--- a/tools/xenstore/xenstored_domain.h
+++ b/tools/xenstore/xenstored_domain.h
@@ -21,6 +21,8 @@
 
 void handle_event(void);
 
+void check_domains(bool restore);
+
 /* domid, mfn, eventchn, path */
 int do_introduce(struct connection *conn, struct buffered_data *in);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v10 25/25] tools/xenstore: activate new binary for live update
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (23 preceding siblings ...)
  2020-12-15 16:36 ` [PATCH v10 24/25] tools/xenstore: handle dying domains in " Juergen Gross
@ 2020-12-15 16:36 ` Juergen Gross
  2021-01-05 12:26 ` [PATCH v10 00/25] tools/xenstore: support live update for xenstored Wei Liu
  25 siblings, 0 replies; 34+ messages in thread
From: Juergen Gross @ 2020-12-15 16:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu

Add activation of the new binary for live update. The daemon case is
handled completely, while for stubdom we only add stubs.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V7:
- added unbinding dom0 and virq event channels

V8:
- no longer close dom0 evtchn (Julien Grall)

V10:
- remember original argc and argv (taken from deleted patch)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_control.c | 61 +++++++++++++++++++++++++++++-
 tools/xenstore/xenstored_core.c    |  5 +++
 tools/xenstore/xenstored_core.h    |  3 ++
 tools/xenstore/xenstored_domain.c  |  6 +++
 tools/xenstore/xenstored_domain.h  |  1 +
 5 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index dee55de264..a5d8185c41 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -16,6 +16,7 @@ Interactive commands for Xen Store Daemon.
     along with this program; If not, see <http://www.gnu.org/licenses/>.
 */
 
+#include <ctype.h>
 #include <errno.h>
 #include <stdarg.h>
 #include <stdio.h>
@@ -330,6 +331,11 @@ static void lu_get_dump_state(struct lu_dump_state *state)
 static void lu_close_dump_state(struct lu_dump_state *state)
 {
 }
+
+static char *lu_exec(const void *ctx, int argc, char **argv)
+{
+	return "NYI";
+}
 #else
 static const char *lu_binary(const void *ctx, struct connection *conn,
 			     const char *filename)
@@ -429,6 +435,14 @@ static void lu_close_dump_state(struct lu_dump_state *state)
 	unlink(filename);
 	talloc_free(filename);
 }
+
+static char *lu_exec(const void *ctx, int argc, char **argv)
+{
+	argv[0] = lu_status->filename;
+	execvp(argv[0], argv);
+
+	return "Error activating new binary.";
+}
 #endif
 
 static const char *lu_check_lu_allowed(const void *ctx, bool force,
@@ -555,7 +569,52 @@ void lu_read_state(void)
 
 static const char *lu_activate_binary(const void *ctx)
 {
-	return "Not yet implemented.";
+	int argc;
+	char **argv;
+	unsigned int i;
+
+	if (lu_status->cmdline) {
+		argc = 4;   /* At least one arg + progname + "-U" + NULL. */
+		for (i = 0; lu_status->cmdline[i]; i++)
+			if (isspace(lu_status->cmdline[i]))
+				argc++;
+		argv = talloc_array(ctx, char *, argc);
+		if (!argv)
+			return "Allocation failure.";
+
+		i = 0;
+		argc = 1;
+		argv[1] = strtok(lu_status->cmdline, " \t");
+		while (argv[argc]) {
+			if (!strcmp(argv[argc], "-U"))
+				i = 1;
+			argc++;
+			argv[argc] = strtok(NULL, " \t");
+		}
+
+		if (!i) {
+			argv[argc++] = "-U";
+			argv[argc] = NULL;
+		}
+	} else {
+		for (i = 0; i < orig_argc; i++)
+			if (!strcmp(orig_argv[i], "-U"))
+				break;
+
+		argc = orig_argc;
+		argv = talloc_array(ctx, char *, orig_argc + 2);
+		if (!argv)
+			return "Allocation failure.";
+
+		memcpy(argv, orig_argv, orig_argc * sizeof(*argv));
+		if (i == orig_argc)
+			argv[argc++] = "-U";
+		argv[argc] = NULL;
+	}
+
+	domain_deinit();
+
+	return lu_exec(ctx, argc, argv);
 }
 
 static const char *lu_start(const void *ctx, struct connection *conn,
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 649dfb534a..7174a9288a 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -72,6 +72,9 @@ static unsigned int nr_fds;
 
 static int sock = -1;
 
+int orig_argc;
+char **orig_argv;
+
 static bool verbose = false;
 LIST_HEAD(connections);
 int tracefd = -1;
@@ -2026,6 +2029,8 @@ int main(int argc, char *argv[])
 	const char *pidfile = NULL;
 	int timeout;
 
+	orig_argc = argc;
+	orig_argv = argv;
 
 	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:A:M:T:RVW:U", options,
 				  NULL)) != -1) {
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 0af2f364bf..91e036f49f 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -201,6 +201,9 @@ void dtrace_io(const struct connection *conn, const struct buffered_data *data,
 void reopen_log(void);
 void close_log(void);
 
+extern int orig_argc;
+extern char **orig_argv;
+
 extern char *tracefile;
 extern int tracefd;
 
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index dfda90c791..317427b7cb 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -783,6 +783,12 @@ void domain_init(int evtfd)
 	virq_port = rc;
 }
 
+void domain_deinit(void)
+{
+	if (virq_port)
+		xenevtchn_unbind(xce_handle, virq_port);
+}
+
 void domain_entry_inc(struct connection *conn, struct node *node)
 {
 	struct domain *d;
diff --git a/tools/xenstore/xenstored_domain.h b/tools/xenstore/xenstored_domain.h
index 1cc1c03ed8..dc97591713 100644
--- a/tools/xenstore/xenstored_domain.h
+++ b/tools/xenstore/xenstored_domain.h
@@ -46,6 +46,7 @@ int do_reset_watches(struct connection *conn, struct buffered_data *in);
 
 void domain_init(int evtfd);
 void dom0_init(void);
+void domain_deinit(void);
 
 /* Returns the implicit path of a connection (only domains have this) */
 const char *get_implicit_path(const struct connection *conn);
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec
  2020-12-15 16:35 ` [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec Juergen Gross
@ 2020-12-15 18:09   ` Andrew Cooper
  2020-12-16  6:06     ` Jürgen Groß
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Cooper @ 2020-12-15 18:09 UTC (permalink / raw)
  To: Juergen Gross, xen-devel; +Cc: Ian Jackson, Wei Liu, Julien Grall

On 15/12/2020 16:35, Juergen Gross wrote:
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Reviewed-by: Wei Liu <wl@xen.org>
> Reviewed-by: Julien Grall <jgrall@amazon.com>
> ---
> V7:
> - new patch
>
> V8:
> - some minor comments by Julien Grall addressed
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Various of your patches still have double SoB.  (Just as a note to be
careful to anyone committing...)

> diff --git a/tools/include/xenevtchn.h b/tools/include/xenevtchn.h
> index 91821ee56d..dadc46ea36 100644
> --- a/tools/include/xenevtchn.h
> +++ b/tools/include/xenevtchn.h
> @@ -64,11 +64,25 @@ struct xentoollog_logger;
>   *
>   * Calling xenevtchn_close() is the only safe operation on a
>   * xenevtchn_handle which has been inherited.
> + *
> + * Setting XENEVTCHN_NO_CLOEXEC allows to keep the file descriptor used
> + * for the event channel driver open across exec(2). In order to be able
> + * to use that file descriptor the new binary activated via exec(2) has
> + * to call xenevtchn_open_fd() with that file descriptor as parameter in
> + * order to associate it with a new handle. The file descriptor can be
> + * obtained via xenevtchn_fd() before calling exec(2).
>   */

More of the comment block than this needs adjusting in light of the
exec() changes.

> -/* Currently no flags are defined */
> +
> +/* Don't set O_CLOEXEC when opening event channel driver node. */
> +#define XENEVTCHN_NO_CLOEXEC 0x01
> +
>  xenevtchn_handle *xenevtchn_open(struct xentoollog_logger *logger,
>                                   unsigned open_flags);
>  
> +/* Flag XENEVTCHN_NO_CLOEXEC is ignored by xenevtchn_open_fd(). */
> +xenevtchn_handle *xenevtchn_open_fd(struct xentoollog_logger *logger,
> +                                    int fd, unsigned open_flags);
> +

I spotted this before, but didn't have time to reply.

This isn't "open fd".  It is "construct a xenevtchn_handle object around
an already-open fd".  As such, open_flags appears bogus because at no
point are we making an open() call.  (I'd argue that it irrespective of
other things, it wants naming xenevtchn_fdopen() for API familiarity.)

However, the root of the problem is actually the ambiguity in the name. 
These are not flags to the open() system call, but general flags for
xenevtchn.

Therefore, I recommend a prep patch which renames open_flags to just
flags, and while at it, changes to unsigned int rather than a naked
"unsigned" type.  There are no API/ABI implications for this, but it
will help massively with code clarity.

I'd also possibly go as far as to say that plumbing 'flags' down into
osdep ought to split out into a separate patch.  There is also a wild
mix of coding styles even within the hunks here.

Additionally, something in core.c should check for unknown flags and
reject them them with EINVAL.  It was buggy that this wasn't done
before, and really needs to be implemented before we start having cases
where people might plausibly pass something other than 0.

~Andrew

P.S. if you don't fancy doing all of this, my brain could do with a
break from the complicated work, and I can see about organising this
cleanup.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec
  2020-12-15 18:09   ` Andrew Cooper
@ 2020-12-16  6:06     ` Jürgen Groß
  2020-12-16 11:22       ` Andrew Cooper
  0 siblings, 1 reply; 34+ messages in thread
From: Jürgen Groß @ 2020-12-16  6:06 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Ian Jackson, Wei Liu, Julien Grall


[-- Attachment #1.1.1: Type: text/plain, Size: 3926 bytes --]

On 15.12.20 19:09, Andrew Cooper wrote:
> On 15/12/2020 16:35, Juergen Gross wrote:
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> Reviewed-by: Wei Liu <wl@xen.org>
>> Reviewed-by: Julien Grall <jgrall@amazon.com>
>> ---
>> V7:
>> - new patch
>>
>> V8:
>> - some minor comments by Julien Grall addressed
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
> 
> Various of your patches still have double SoB.  (Just as a note to be
> careful to anyone committing...)

Yeah, this is annoying.

They are all after the first "---" mark, so they shouldn't end up in
git AFAICS.

Why git is adding those duplicates I don't know. I had this problem
before, but some git update made it disappear. Now it is back, but
not for all patches. :-(

> 
>> diff --git a/tools/include/xenevtchn.h b/tools/include/xenevtchn.h
>> index 91821ee56d..dadc46ea36 100644
>> --- a/tools/include/xenevtchn.h
>> +++ b/tools/include/xenevtchn.h
>> @@ -64,11 +64,25 @@ struct xentoollog_logger;
>>    *
>>    * Calling xenevtchn_close() is the only safe operation on a
>>    * xenevtchn_handle which has been inherited.
>> + *
>> + * Setting XENEVTCHN_NO_CLOEXEC allows to keep the file descriptor used
>> + * for the event channel driver open across exec(2). In order to be able
>> + * to use that file descriptor the new binary activated via exec(2) has
>> + * to call xenevtchn_open_fd() with that file descriptor as parameter in
>> + * order to associate it with a new handle. The file descriptor can be
>> + * obtained via xenevtchn_fd() before calling exec(2).
>>    */
> 
> More of the comment block than this needs adjusting in light of the
> exec() changes.
> 
>> -/* Currently no flags are defined */
>> +
>> +/* Don't set O_CLOEXEC when opening event channel driver node. */
>> +#define XENEVTCHN_NO_CLOEXEC 0x01
>> +
>>   xenevtchn_handle *xenevtchn_open(struct xentoollog_logger *logger,
>>                                    unsigned open_flags);
>>   
>> +/* Flag XENEVTCHN_NO_CLOEXEC is ignored by xenevtchn_open_fd(). */
>> +xenevtchn_handle *xenevtchn_open_fd(struct xentoollog_logger *logger,
>> +                                    int fd, unsigned open_flags);
>> +
> 
> I spotted this before, but didn't have time to reply.
> 
> This isn't "open fd".  It is "construct a xenevtchn_handle object around
> an already-open fd".  As such, open_flags appears bogus because at no
> point are we making an open() call.  (I'd argue that it irrespective of
> other things, it wants naming xenevtchn_fdopen() for API familiarity.)

Okay.

> 
> However, the root of the problem is actually the ambiguity in the name.
> These are not flags to the open() system call, but general flags for
> xenevtchn.
> 
> Therefore, I recommend a prep patch which renames open_flags to just
> flags, and while at it, changes to unsigned int rather than a naked
> "unsigned" type.  There are no API/ABI implications for this, but it
> will help massively with code clarity.

Okay.

> 
> I'd also possibly go as far as to say that plumbing 'flags' down into
> osdep ought to split out into a separate patch.  There is also a wild
> mix of coding styles even within the hunks here.

Fine with me.

> 
> Additionally, something in core.c should check for unknown flags and
> reject them them with EINVAL.  It was buggy that this wasn't done
> before, and really needs to be implemented before we start having cases
> where people might plausibly pass something other than 0.

Are you sure this is safe? I'm not arguing against it, but we considered
to do that and didn't dare to.

> 
> ~Andrew
> 
> P.S. if you don't fancy doing all of this, my brain could do with a
> break from the complicated work, and I can see about organising this
> cleanup.
> 

I'm fine doing it. I'm sure you'll find some other no-brainer to work
on :-)


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec
  2020-12-16  6:06     ` Jürgen Groß
@ 2020-12-16 11:22       ` Andrew Cooper
  2020-12-16 11:33         ` Jürgen Groß
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Cooper @ 2020-12-16 11:22 UTC (permalink / raw)
  To: Jürgen Groß, xen-devel; +Cc: Ian Jackson, Wei Liu, Julien Grall

On 16/12/2020 06:06, Jürgen Groß wrote:
> On 15.12.20 19:09, Andrew Cooper wrote: 
>>
>> Additionally, something in core.c should check for unknown flags and
>> reject them them with EINVAL.  It was buggy that this wasn't done
>> before, and really needs to be implemented before we start having cases
>> where people might plausibly pass something other than 0.
>
> Are you sure this is safe? I'm not arguing against it, but we considered
> to do that and didn't dare to.

Well - you're already breaking things by adding meaning to bit 0 where
it was previously ignored.

But fundamentally - any caller passing non-zero to begin with is buggy,
and it will be less bad to fix up our input validation and given them a
clean EINVAL now.

The alternative is no error and some weird side effect when we implement
whichever bit they were settings.

~Andrew



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec
  2020-12-16 11:22       ` Andrew Cooper
@ 2020-12-16 11:33         ` Jürgen Groß
  0 siblings, 0 replies; 34+ messages in thread
From: Jürgen Groß @ 2020-12-16 11:33 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Ian Jackson, Wei Liu, Julien Grall


[-- Attachment #1.1.1: Type: text/plain, Size: 1012 bytes --]

On 16.12.20 12:22, Andrew Cooper wrote:
> On 16/12/2020 06:06, Jürgen Groß wrote:
>> On 15.12.20 19:09, Andrew Cooper wrote:
>>>
>>> Additionally, something in core.c should check for unknown flags and
>>> reject them them with EINVAL.  It was buggy that this wasn't done
>>> before, and really needs to be implemented before we start having cases
>>> where people might plausibly pass something other than 0.
>>
>> Are you sure this is safe? I'm not arguing against it, but we considered
>> to do that and didn't dare to.
> 
> Well - you're already breaking things by adding meaning to bit 0 where
> it was previously ignored.

Fair enough.

> But fundamentally - any caller passing non-zero to begin with is buggy,
> and it will be less bad to fix up our input validation and given them a
> clean EINVAL now.

Fine with me.

> The alternative is no error and some weird side effect when we implement
> whichever bit they were settings.

Hmm, yes. I'll add the check.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 00/25] tools/xenstore: support live update for xenstored
  2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
                   ` (24 preceding siblings ...)
  2020-12-15 16:36 ` [PATCH v10 25/25] tools/xenstore: activate new binary for " Juergen Gross
@ 2021-01-05 12:26 ` Wei Liu
  25 siblings, 0 replies; 34+ messages in thread
From: Wei Liu @ 2021-01-05 12:26 UTC (permalink / raw)
  To: Juergen Gross
  Cc: xen-devel, Ian Jackson, Wei Liu, Andrew Cooper, George Dunlap,
	Jan Beulich, Julien Grall, Stefano Stabellini

On Tue, Dec 15, 2020 at 05:35:38PM +0100, Juergen Gross wrote:
> Today Xenstore is not restartable. This means a Xen server needing an
> update of xenstored has to be rebooted in order to let this update
> become effective.
> 
> This patch series is changing that: The internal state of xenstored
> (the contents of Xenstore, all connections to various clients like
> programs or other domains, and watches) is saved in a defined format
> and a new binary is being activated consuming the old state. All
> connections are being restored and the new Xenstore binary will
> continue where the old one stopped.
> 
> This patch series has been under (secret) development for quite some
> time. It hasn't been posted to xen-devel until now due to the various
> Xenstore related security issues which have become public only today.
> 
> There will be a similar series for oxenstored posted.
> 
> Xenstore-stubdom is not yet supported, but I'm planning to start
> working on that soon.
> 
> Changes in V10 (for the members of the security team):
> - dropped patch 6 as requested by Andrew

I went through this series when it was posted to security@ and didn't
find that many issues. I guess I will wait for Andrew's comment to be
addressed and have a look again.

Wei.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control
  2020-12-15 16:35 ` [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control Juergen Gross
@ 2021-01-06 14:42   ` Edwin Torok
  2021-01-13 16:34     ` Jürgen Groß
  0 siblings, 1 reply; 34+ messages in thread
From: Edwin Torok @ 2021-01-06 14:42 UTC (permalink / raw)
  To: jgross, xen-devel
  Cc: jbeulich, julien, wl, iwj, sstabellini, jgrall, George Dunlap,
	paul, Andrew Cooper

On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote:

> [...] 
> +static int live_update_start(struct xs_handle *xsh, bool force,
> unsigned int to)
> +{
> +    int len = 0;
> +    char *buf = NULL, *ret;
> +    time_t time_start;
> +
> +    if (asprintf(&ret, "%u", to) < 0)
> +        return 1;
> +    len = add_to_buf(&buf, "-s", len);
> +    len = add_to_buf(&buf, "-t", len);
> +    len = add_to_buf(&buf, ret, len);
> +    free(ret);
> +    if (force)
> +        len = add_to_buf(&buf, "-F", len);
> +    if (len < 0)
> +        return 1;
> +
> +    for (time_start = time(NULL); time(NULL) - time_start < to;) {
> +        ret = xs_control_command(xsh, "live-update", buf, len);
> +        if (!ret)
> +            goto err;
> +        if (strcmp(ret, "BUSY"))
> +            break;
> +    }

We've discovered issues with this during testing: when a live-update is
not possible (e.g. guest with active transaction held open on purpose)
xenstored gets flooded with live-update requests until the timeout is
reached.

This is problematic for multiple reasons:
* flooding xenstored reduces its throughput, and makes it use 100% CPU.
Depending on the implementation and configuration it may also flood the
logs (which isn't fatal, the system still stays alive if you ENOSPC on
/var/log, but it makes it difficult to debug issues if a live update
gets you to ENOSPC as you may not see actual failures from after the
live update).
* flooding xenstored causes the evtchn to overflow and AFAICT neither
xenstored or oxenstored is capable of coping with that (oxenstored
infinite loops when that happens). IIUC this needs to be fixed in the
kernel, so it doesn't return EFBIG in the first place. As a workaround
I added a sleep(1) in this loop
* the timeout is hit on both client and server sides, but depending on
rounding almost always happens on the client side first which prevents
us from displaying a more informative error message from the server. As
a workaround I increased the client side timeout by 2s to cope with
rounding and give the server a chance to reply. Oxenstored can reply
with a list of domains preventing the live-update for example.

My workarounds looked like this:
@@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh,
bool force, unsigned int to)
         len = add_to_buf(&buf, "-F", len);
     if (len < 0)
         return 1;
+    /* +1 for rounding issues
+     * +1 to give oxenstored a chance to timeout and report back first
*/
+    to += 2;
 
     for (time_start = time(NULL); time(NULL) - time_start < to;) {
         ret = xs_control_command(xsh, "live-update", buf, len);
@@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh,
bool force, unsigned int to)
             goto err;
         if (strcmp(ret, "BUSY"))
             break;
+        /* TODO: use task ID for commands, avoid busy loop polling
here
+         * oxenstored checks BUSY condition internally on every main
loop iteration anyway.
+         * Avoid flooding xenstored with live-update requests.
+         * The flooding can also cause the evtchn to overflow in
xenstored which makes
+         * xenstored enter an infinite loop */
+        sleep(1);
     }

This also required various heuristics in oxenstored to differentiate
between a new live-update command and querying the status of an already
completed live-update command, otherwise I almost always ended up doing
2 live-updates in a row (first one queued up, returned BUSY, completed
after a while, and then another one from all the repeated live-update
requests).

I'd prefer it if there was a more asynchronous protocol available for
live-update:
* the live-update on its own queues up the live update request and
returns a generation ID. xenstored retries on its own during each of
its main loop iterations whether the conditions for live-update are now
met
* when a generation ID is specified with the live-update command it
acts as a query to return the status. A query for generation ID <
current ID return success, and for generation ID = current ID it
returns either BUSY, or a specific error if known (e.g. timeout reached
and list of domains preventing live update)
* the generation ID gets incremented every time a live update succeeds
 
This would eliminiate the need for a client-side timeout, since each of
these commands would be non-blocking.
It'd also avoid the busy-poll flood.

Obviously any change here has to be backwards compatible with the
already deployed xenstored and oxenstored which doesn't know about
generation IDs, but you can tell them apart based on the reply: if you
get back OK/BUSY/some error it is the old version, if you get back a
generation ID it is the new version.

I ran out of time to implement this before the embargo was up, should I
go ahead with prototyping a patch for this now, or do you see an
alternative way to make the live-update command more robust?


Best regards,
--Edwin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control
  2021-01-06 14:42   ` Edwin Torok
@ 2021-01-13 16:34     ` Jürgen Groß
  2021-01-13 17:42       ` Edwin Torok
  0 siblings, 1 reply; 34+ messages in thread
From: Jürgen Groß @ 2021-01-13 16:34 UTC (permalink / raw)
  To: Edwin Torok, xen-devel
  Cc: jbeulich, julien, wl, iwj, sstabellini, jgrall, George Dunlap,
	paul, Andrew Cooper


[-- Attachment #1.1.1: Type: text/plain, Size: 5728 bytes --]

On 06.01.21 15:42, Edwin Torok wrote:
> On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote:
> 
>> [...]
>> +static int live_update_start(struct xs_handle *xsh, bool force,
>> unsigned int to)
>> +{
>> +    int len = 0;
>> +    char *buf = NULL, *ret;
>> +    time_t time_start;
>> +
>> +    if (asprintf(&ret, "%u", to) < 0)
>> +        return 1;
>> +    len = add_to_buf(&buf, "-s", len);
>> +    len = add_to_buf(&buf, "-t", len);
>> +    len = add_to_buf(&buf, ret, len);
>> +    free(ret);
>> +    if (force)
>> +        len = add_to_buf(&buf, "-F", len);
>> +    if (len < 0)
>> +        return 1;
>> +
>> +    for (time_start = time(NULL); time(NULL) - time_start < to;) {
>> +        ret = xs_control_command(xsh, "live-update", buf, len);
>> +        if (!ret)
>> +            goto err;
>> +        if (strcmp(ret, "BUSY"))
>> +            break;
>> +    }
> 
> We've discovered issues with this during testing: when a live-update is
> not possible (e.g. guest with active transaction held open on purpose)
> xenstored gets flooded with live-update requests until the timeout is
> reached.
> 
> This is problematic for multiple reasons:
> * flooding xenstored reduces its throughput, and makes it use 100% CPU.
> Depending on the implementation and configuration it may also flood the
> logs (which isn't fatal, the system still stays alive if you ENOSPC on
> /var/log, but it makes it difficult to debug issues if a live update
> gets you to ENOSPC as you may not see actual failures from after the
> live update).
> * flooding xenstored causes the evtchn to overflow and AFAICT neither
> xenstored or oxenstored is capable of coping with that (oxenstored
> infinite loops when that happens). IIUC this needs to be fixed in the
> kernel, so it doesn't return EFBIG in the first place. As a workaround
> I added a sleep(1) in this loop
> * the timeout is hit on both client and server sides, but depending on
> rounding almost always happens on the client side first which prevents
> us from displaying a more informative error message from the server. As
> a workaround I increased the client side timeout by 2s to cope with
> rounding and give the server a chance to reply. Oxenstored can reply
> with a list of domains preventing the live-update for example.
> 
> My workarounds looked like this:
> @@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh,
> bool force, unsigned int to)
>           len = add_to_buf(&buf, "-F", len);
>       if (len < 0)
>           return 1;
> +    /* +1 for rounding issues
> +     * +1 to give oxenstored a chance to timeout and report back first
> */
> +    to += 2;
>   
>       for (time_start = time(NULL); time(NULL) - time_start < to;) {
>           ret = xs_control_command(xsh, "live-update", buf, len);
> @@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh,
> bool force, unsigned int to)
>               goto err;
>           if (strcmp(ret, "BUSY"))
>               break;
> +        /* TODO: use task ID for commands, avoid busy loop polling
> here
> +         * oxenstored checks BUSY condition internally on every main
> loop iteration anyway.
> +         * Avoid flooding xenstored with live-update requests.
> +         * The flooding can also cause the evtchn to overflow in
> xenstored which makes
> +         * xenstored enter an infinite loop */
> +        sleep(1);
>       }
> 
> This also required various heuristics in oxenstored to differentiate
> between a new live-update command and querying the status of an already
> completed live-update command, otherwise I almost always ended up doing
> 2 live-updates in a row (first one queued up, returned BUSY, completed
> after a while, and then another one from all the repeated live-update
> requests).
> 
> I'd prefer it if there was a more asynchronous protocol available for
> live-update:
> * the live-update on its own queues up the live update request and
> returns a generation ID. xenstored retries on its own during each of
> its main loop iterations whether the conditions for live-update are now
> met
> * when a generation ID is specified with the live-update command it
> acts as a query to return the status. A query for generation ID <
> current ID return success, and for generation ID = current ID it
> returns either BUSY, or a specific error if known (e.g. timeout reached
> and list of domains preventing live update)
> * the generation ID gets incremented every time a live update succeeds
>   
> This would eliminiate the need for a client-side timeout, since each of
> these commands would be non-blocking.
> It'd also avoid the busy-poll flood.
> 
> Obviously any change here has to be backwards compatible with the
> already deployed xenstored and oxenstored which doesn't know about
> generation IDs, but you can tell them apart based on the reply: if you
> get back OK/BUSY/some error it is the old version, if you get back a
> generation ID it is the new version.
> 
> I ran out of time to implement this before the embargo was up, should I
> go ahead with prototyping a patch for this now, or do you see an
> alternative way to make the live-update command more robust?

I think this can be made much easier:

The live-update should be handled completely in the daemon, so
returning only with success or failure. Returning BUSY wouldn't
occur this way, but either "OK" (after the successful LU) or a
failure reason (e.g. list of domains blocking) would be
returned.

I'll have a try with this approach.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control
  2021-01-13 16:34     ` Jürgen Groß
@ 2021-01-13 17:42       ` Edwin Torok
  0 siblings, 0 replies; 34+ messages in thread
From: Edwin Torok @ 2021-01-13 17:42 UTC (permalink / raw)
  To: jgross, xen-devel
  Cc: jbeulich, julien, wl, iwj, sstabellini, jgrall, paul,
	George Dunlap, Andrew Cooper

On Wed, 2021-01-13 at 17:34 +0100, Jürgen Groß wrote:
> On 06.01.21 15:42, Edwin Torok wrote:
> > [...]
> > 
> > I'd prefer it if there was a more asynchronous protocol available
> > for
> > live-update:
> > * the live-update on its own queues up the live update request and
> > returns a generation ID. xenstored retries on its own during each
> > of
> > its main loop iterations whether the conditions for live-update are
> > now
> > met
> > * when a generation ID is specified with the live-update command it
> > acts as a query to return the status. A query for generation ID <
> > current ID return success, and for generation ID = current ID it
> > returns either BUSY, or a specific error if known (e.g. timeout
> > reached
> > and list of domains preventing live update)
> > * the generation ID gets incremented every time a live update
> > succeeds
> >   
> > This would eliminiate the need for a client-side timeout, since
> > each of
> > these commands would be non-blocking.
> > It'd also avoid the busy-poll flood.
> > 
> > Obviously any change here has to be backwards compatible with the
> > already deployed xenstored and oxenstored which doesn't know about
> > generation IDs, but you can tell them apart based on the reply: if
> > you
> > get back OK/BUSY/some error it is the old version, if you get back
> > a
> > generation ID it is the new version.
> > 
> > I ran out of time to implement this before the embargo was up,
> > should I
> > go ahead with prototyping a patch for this now, or do you see an
> > alternative way to make the live-update command more robust?
> 
> I think this can be made much easier:
> 
> The live-update should be handled completely in the daemon, so
> returning only with success or failure. Returning BUSY wouldn't
> occur this way, but either "OK" (after the successful LU) or a
> failure reason (e.g. list of domains blocking) would be
> returned.
> 
> I'll have a try with this approach.
> 
> 

In oxenstored each request wants an immediate reply, so delaying the
reply to after the live-update is not trivial.
Should be possible to do though (e.g. mark the socket such that no
further xenstore protocol is processed on it, and "put back" the live
update request, to be replied by the new xenstored after the live
update completes, which would clear the 'do not process this socket'
flag), I'll be curious to see what your approach will look like.

Best regards,
--Edwin

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2021-01-13 17:43 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-15 16:35 [PATCH v10 00/25] tools/xenstore: support live update for xenstored Juergen Gross
2020-12-15 16:35 ` [PATCH v10 01/25] tools/xenstore: switch barf[_perror]() to use syslog() Juergen Gross
2020-12-15 16:35 ` [PATCH v10 02/25] tools/xenstore: make set_tdb_key() non-static Juergen Gross
2020-12-15 16:35 ` [PATCH v10 03/25] tools/xenstore: remove unused cruft from xenstored_domain.c Juergen Gross
2020-12-15 16:35 ` [PATCH v10 04/25] tools/libxenevtchn: add possibility to not close file descriptor on exec Juergen Gross
2020-12-15 18:09   ` Andrew Cooper
2020-12-16  6:06     ` Jürgen Groß
2020-12-16 11:22       ` Andrew Cooper
2020-12-16 11:33         ` Jürgen Groß
2020-12-15 16:35 ` [PATCH v10 05/25] tools/xenstore: refactor XS_CONTROL handling Juergen Gross
2020-12-15 16:35 ` [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control Juergen Gross
2021-01-06 14:42   ` Edwin Torok
2021-01-13 16:34     ` Jürgen Groß
2021-01-13 17:42       ` Edwin Torok
2020-12-15 16:35 ` [PATCH v10 07/25] tools/xenstore: add basic live-update command parsing Juergen Gross
2020-12-15 16:35 ` [PATCH v10 08/25] tools/xenstore: introduce live update status block Juergen Gross
2020-12-15 16:35 ` [PATCH v10 09/25] tools/xenstore: save new binary for live update Juergen Gross
2020-12-15 16:35 ` [PATCH v10 10/25] tools/xenstore: add command line handling " Juergen Gross
2020-12-15 16:35 ` [PATCH v10 11/25] tools/xenstore: add the basic framework for doing the " Juergen Gross
2020-12-15 16:35 ` [PATCH v10 12/25] tools/xenstore: allow live update only with no transaction active Juergen Gross
2020-12-15 16:35 ` [PATCH v10 13/25] docs: update the xenstore migration stream documentation Juergen Gross
2020-12-15 16:35 ` [PATCH v10 14/25] tools/xenstore: add include file for state structure definitions Juergen Gross
2020-12-15 16:35 ` [PATCH v10 15/25] tools/xenstore: dump the xenstore state for live update Juergen Gross
2020-12-15 16:35 ` [PATCH v10 16/25] tools/xenstore: handle CLOEXEC flag for local files and pipes Juergen Gross
2020-12-15 16:35 ` [PATCH v10 17/25] tools/xenstore: split off domain introduction from do_introduce() Juergen Gross
2020-12-15 16:35 ` [PATCH v10 18/25] tools/xenstore: evaluate the live update flag when starting Juergen Gross
2020-12-15 16:35 ` [PATCH v10 19/25] tools/xenstore: read internal state when doing live upgrade Juergen Gross
2020-12-15 16:35 ` [PATCH v10 20/25] tools/xenstore: add reading global state for live update Juergen Gross
2020-12-15 16:35 ` [PATCH v10 21/25] tools/xenstore: add read connection " Juergen Gross
2020-12-15 16:36 ` [PATCH v10 22/25] tools/xenstore: add read node " Juergen Gross
2020-12-15 16:36 ` [PATCH v10 23/25] tools/xenstore: add read watch " Juergen Gross
2020-12-15 16:36 ` [PATCH v10 24/25] tools/xenstore: handle dying domains in " Juergen Gross
2020-12-15 16:36 ` [PATCH v10 25/25] tools/xenstore: activate new binary for " Juergen Gross
2021-01-05 12:26 ` [PATCH v10 00/25] tools/xenstore: support live update for xenstored Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.