All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] oxenstored build enhancements
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Doug Goldstein, Andrew Cooper,
	George Dunlap, Ian Jackson, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Christian Lindig, David Scott

The patches were posted previously, this is a repost after the XSA series.

For convenience here is a tree with all patch series applied:
https://github.com/edwintorok/xen/pull/1

Edwin Török (2):
  automation/: add Ubuntu:focal container
  Makefile: add build-tools-oxenstored

 Makefile                                 |  6 +++
 automation/build/ubuntu/focal.dockerfile | 48 ++++++++++++++++++++++++
 automation/scripts/containerize          |  1 +
 tools/ocaml/Makefile                     |  8 ++++
 4 files changed, 63 insertions(+)
 create mode 100644 automation/build/ubuntu/focal.dockerfile

-- 
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 1/2] automation/: add Ubuntu:focal container
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (4 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Edwin Török, Doug Goldstein

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>

---
Changed since v1:
* dropped python-dev and markdown
---
 automation/build/ubuntu/focal.dockerfile | 48 ++++++++++++++++++++++++
 automation/scripts/containerize          |  1 +
 2 files changed, 49 insertions(+)
 create mode 100644 automation/build/ubuntu/focal.dockerfile

diff --git a/automation/build/ubuntu/focal.dockerfile b/automation/build/ubuntu/focal.dockerfile
new file mode 100644
index 0000000000..c1c1f8d58f
--- /dev/null
+++ b/automation/build/ubuntu/focal.dockerfile
@@ -0,0 +1,48 @@
+FROM ubuntu:20.04
+LABEL maintainer.name="The Xen Project " \
+      maintainer.email="xen-devel@lists.xenproject.org"
+
+ENV DEBIAN_FRONTEND=noninteractive
+ENV USER root
+
+RUN mkdir /build
+WORKDIR /build
+
+# build depends
+RUN apt-get update && \
+    apt-get --quiet --yes install \
+        build-essential \
+        zlib1g-dev \
+        libncurses5-dev \
+        libssl-dev \
+        python3-dev \
+        xorg-dev \
+        uuid-dev \
+        libyajl-dev \
+        libaio-dev \
+        libglib2.0-dev \
+        clang \
+        libpixman-1-dev \
+        pkg-config \
+        flex \
+        bison \
+        gettext \
+        acpica-tools \
+        bin86 \
+        bcc \
+        liblzma-dev \
+        libc6-dev-i386 \
+        libnl-3-dev \
+        ocaml-nox \
+        libfindlib-ocaml-dev \
+        libsystemd-dev \
+        transfig \
+        pandoc \
+        checkpolicy \
+        wget \
+        git \
+        nasm \
+        && \
+        apt-get autoremove -y && \
+        apt-get clean && \
+        rm -rf /var/lib/apt/lists* /tmp/* /var/tmp/*
diff --git a/automation/scripts/containerize b/automation/scripts/containerize
index c8c3c20fa2..da45baed4e 100755
--- a/automation/scripts/containerize
+++ b/automation/scripts/containerize
@@ -28,6 +28,7 @@ case "_${CONTAINER}" in
     _centos7) CONTAINER="${BASE}/centos:7" ;;
     _centos72) CONTAINER="${BASE}/centos:7.2" ;;
     _fedora) CONTAINER="${BASE}/fedora:29";;
+    _focal) CONTAINER="${BASE}/ubuntu:focal" ;;
     _jessie) CONTAINER="${BASE}/debian:jessie" ;;
     _stretch|_) CONTAINER="${BASE}/debian:stretch" ;;
     _unstable|_) CONTAINER="${BASE}/debian:unstable" ;;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 2/2] Makefile: add build-tools-oxenstored
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (5 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Christian Lindig, David Scott

As a convenience so that oxenstored patches can be compile-tested
using upstream's build-system before submitting upstream.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
Changed since V1:
* repost after XSA to avoid conflicts
---
 Makefile             | 6 ++++++
 tools/ocaml/Makefile | 8 ++++++++
 2 files changed, 14 insertions(+)

diff --git a/Makefile b/Makefile
index 9ad2602f63..96d32cfd50 100644
--- a/Makefile
+++ b/Makefile
@@ -62,6 +62,12 @@ build-xen:
 build-tools: build-tools-public-headers
 	$(MAKE) -C tools build
 
+.PHONY: build-tools-oxenstored
+build-tools-oxenstored: build-tools-public-headers
+	$(MAKE) -s -C tools/ocaml clean
+	$(MAKE) -s -C tools/libs
+	$(MAKE) -C tools/ocaml build-tools-oxenstored
+
 .PHONY: build-stubdom
 build-stubdom: mini-os-dir build-tools-public-headers
 	$(MAKE) -C stubdom build
diff --git a/tools/ocaml/Makefile b/tools/ocaml/Makefile
index 66f2d6b131..a7c04b6546 100644
--- a/tools/ocaml/Makefile
+++ b/tools/ocaml/Makefile
@@ -26,3 +26,11 @@ clean: subdirs-clean
 
 .PHONY: distclean
 distclean: subdirs-distclean
+
+.PHONY: build-tools-oxenstored
+build-tools-oxenstored:
+	$(MAKE) -s -C libs/eventchn
+	$(MAKE) -s -C libs/mmap
+	$(MAKE) -s -C libs/xb
+	$(MAKE) -s -C libs/xc
+	$(MAKE) -C xenstored
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 0/8] tools/ocaml/xenstored: implement live update
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Juergen Gross, Christian Lindig, David Scott

This is the oxenstored live update code that was shipped with the XSAs.
It reuses the existing text-based dump format to ease backporting to
old versions of Xen.
There is a separate series that implements the C xenstored migration binary
format on top of this.

Feedback welcome either on this PR or on the mailing list:
https://github.com/edwintorok/xen/pull/1

Edvin Torok (1):
  tools/ocaml/xenstored: Implement live update for socket connections

Edwin Török (6):
  Add workaround for xenstore-control flood issues
  docs/designs/xenstore-migration.md: clarify that deletes are recursive
  tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible
  tools/ocaml/xenstored: Automatically resume when possible
  tools/ocaml/xenstored: add cooperative live-update command
  tools/ocaml/xenstored: start live update process

Juergen Gross (1):
  tools/xenstore: add live update command to xenstore-control

 docs/designs/xenstore-migration.md   |   3 +-
 docs/misc/xenstore.txt               |  21 ++
 tools/ocaml/xenstored/connection.ml  |  56 ++++-
 tools/ocaml/xenstored/connections.ml |   8 +
 tools/ocaml/xenstored/logging.ml     |   3 +
 tools/ocaml/xenstored/parse_arg.ml   |   4 +
 tools/ocaml/xenstored/process.ml     | 139 +++++++++++
 tools/ocaml/xenstored/stdext.ml      |   6 +
 tools/ocaml/xenstored/store.ml       |   2 +-
 tools/ocaml/xenstored/utils.ml       |  12 +
 tools/ocaml/xenstored/xenstored.ml   | 126 +++++++---
 tools/xenstore/Makefile              |   3 +-
 tools/xenstore/xenstore_control.c    | 345 +++++++++++++++++++++++++--
 tools/xenstore/xenstored_core.c      |   7 +-
 14 files changed, 681 insertions(+), 54 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 1/8] tools/xenstore: add live update command to xenstore-control
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (6 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  2021-01-18  7:50   ` Jürgen Groß
  -1 siblings, 1 reply; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Paul Durrant, Julien Grall

From: Juergen Gross <jgross@suse.com>

Add the "live-update" command to xenstore-control enabling updating
xenstored to a new version in a running Xen system.

With -c <arg> it is possible to pass a different command line to the
new instance of xenstored. This will replace the command line used
for the invocation of the just running xenstored instance.

The running xenstored (or xenstore-stubdom) needs to support live
updating, of course.

For now just add a small dummy handler to C xenstore denying any
live update action.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
 docs/misc/xenstore.txt            |  21 ++
 tools/xenstore/Makefile           |   3 +-
 tools/xenstore/xenstore_control.c | 332 ++++++++++++++++++++++++++++--
 3 files changed, 339 insertions(+), 17 deletions(-)

diff --git a/docs/misc/xenstore.txt b/docs/misc/xenstore.txt
index 2081f20f55..1480742330 100644
--- a/docs/misc/xenstore.txt
+++ b/docs/misc/xenstore.txt
@@ -317,6 +317,27 @@ CONTROL			<command>|[<parameters>|]
 	Current commands are:
 	check
 		checks xenstored innards
+	live-update|<params>|+
+		perform a live-update of the Xenstore daemon, only to
+		be used via xenstore-control command.
+		<params> are implementation specific and are used for
+		different steps of the live-update processing. Currently
+		supported <params> are:
+		-f <file>  specify new daemon binary
+		-b <size>  specify size of new stubdom binary
+		-d <chunk-size> <binary-chunk>  transfer chunk of new
+			stubdom binary
+		-c <pars>  specify new command line to use
+		-s [-t <sec>] [-F]  start live update process (-t specifies
+			timeout in seconds to wait for active transactions
+			to finish, default is 60 seconds; -F will force
+			live update to happen even with running transactions
+			after timeout elapsed)
+		-a  abort live update handling
+		All sub-options will return "OK" in case of success or an
+		error string in case of failure. -s can return "BUSY" in case
+		of an active transaction, a retry of -s can be done in that
+		case.
 	log|on
 		turn xenstore logging on
 	log|off
diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..ab89e22d3a 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -11,6 +11,7 @@ CFLAGS += -include $(XEN_ROOT)/tools/config.h
 CFLAGS += -I./include
 CFLAGS += $(CFLAGS_libxenevtchn)
 CFLAGS += $(CFLAGS_libxenctrl)
+CFLAGS += $(CFLAGS_libxenguest)
 CFLAGS += $(CFLAGS_libxentoolcore)
 CFLAGS += -DXEN_LIB_STORED="\"$(XEN_LIB_STORED)\""
 CFLAGS += -DXEN_RUN_STORED="\"$(XEN_RUN_STORED)\""
@@ -81,7 +82,7 @@ xenstore: xenstore_client.o
 	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xenstore-control: xenstore_control.o
-	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
+	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o
 	$(CC) $^ $(LDFLAGS) -o $@ $(APPEND_LDFLAGS)
diff --git a/tools/xenstore/xenstore_control.c b/tools/xenstore/xenstore_control.c
index afa04495a7..5ca015a07d 100644
--- a/tools/xenstore/xenstore_control.c
+++ b/tools/xenstore/xenstore_control.c
@@ -1,9 +1,311 @@
+#define _GNU_SOURCE
+#include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <time.h>
+#include <xenctrl.h>
+#include <xenguest.h>
 
 #include "xenstore.h"
 
+/* Add a string plus terminating 0 byte to buf, returning new len. */
+static int add_to_buf(char **buf, const char *val, int len)
+{
+    int vallen = strlen(val) + 1;
+
+    if (len < 0)
+        return -1;
+
+    *buf = realloc(*buf, len + vallen);
+    if (!*buf)
+        return -1;
+
+    strcpy(*buf + len, val);
+
+    return len + vallen;
+}
+
+static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to)
+{
+    int len = 0;
+    char *buf = NULL, *ret;
+    time_t time_start;
+
+    if (asprintf(&ret, "%u", to) < 0)
+        return 1;
+    len = add_to_buf(&buf, "-s", len);
+    len = add_to_buf(&buf, "-t", len);
+    len = add_to_buf(&buf, ret, len);
+    free(ret);
+    if (force)
+        len = add_to_buf(&buf, "-F", len);
+    if (len < 0)
+        return 1;
+
+    for (time_start = time(NULL); time(NULL) - time_start < to;) {
+        ret = xs_control_command(xsh, "live-update", buf, len);
+        if (!ret)
+            goto err;
+        if (strcmp(ret, "BUSY"))
+            break;
+    }
+
+    if (strcmp(ret, "OK"))
+        goto err;
+
+    free(buf);
+    free(ret);
+
+    return 0;
+
+ err:
+    fprintf(stderr, "Starting live update failed:\n%s\n",
+            ret ? : strerror(errno));
+    free(buf);
+    free(ret);
+
+    return 3;
+}
+
+static int live_update_cmdline(struct xs_handle *xsh, const char *cmdline)
+{
+    int len = 0, rc = 0;
+    char *buf = NULL, *ret;
+
+    len = add_to_buf(&buf, "-c", len);
+    len = add_to_buf(&buf, cmdline, len);
+    if (len < 0)
+        return 1;
+
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Setting update binary failed:\n%s\n",
+                ret ? : strerror(errno));
+        rc = 3;
+    }
+    free(ret);
+
+    return rc;
+}
+
+static int send_kernel_blob(struct xs_handle *xsh, const char *binary)
+{
+    int rc = 0, len = 0;
+    xc_interface *xch;
+    struct xc_dom_image *dom;
+    char *ret, *buf = NULL;
+    size_t off, sz;
+#define BLOB_CHUNK_SZ 2048
+
+    xch = xc_interface_open(NULL, NULL, 0);
+    if (!xch) {
+        fprintf(stderr, "xc_interface_open() failed\n");
+        return 1;
+    }
+
+    dom = xc_dom_allocate(xch, NULL, NULL);
+    if (!dom) {
+        rc = 1;
+        goto out_close;
+    }
+
+    rc = xc_dom_kernel_file(dom, binary);
+    if (rc) {
+        rc = 1;
+        goto out_rel;
+    }
+
+    if (asprintf(&ret, "%zu", dom->kernel_size) < 0) {
+        rc = 1;
+        goto out_rel;
+    }
+    len = add_to_buf(&buf, "-b", len);
+    len = add_to_buf(&buf, ret, len);
+    free(ret);
+    if (len < 0) {
+        rc = 1;
+        goto out_rel;
+    }
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Starting live update failed:\n%s\n",
+                ret ? : strerror(errno));
+        rc = 3;
+    }
+    free(ret);
+    if (rc)
+        goto out_rel;
+
+    /* buf capable to hold "-d" <1..2048> BLOB_CHUNK_SZ and a terminating 0. */
+    buf = malloc(3 + 5 + BLOB_CHUNK_SZ + 1);
+    if (!buf) {
+        rc = 1;
+        goto out_rel;
+    }
+
+    strcpy(buf, "-d");
+    sz = BLOB_CHUNK_SZ;
+    for (off = 0; off < dom->kernel_size; off += BLOB_CHUNK_SZ) {
+        if (dom->kernel_size - off < BLOB_CHUNK_SZ)
+            sz = dom->kernel_size - off;
+        sprintf(buf + 3, "%zu", sz);
+        len = 3 + strlen(buf + 3) + 1;
+        memcpy(buf + len, dom->kernel_blob + off, sz);
+        buf[len + sz] = 0;
+        len += sz + 1;
+        ret = xs_control_command(xsh, "live-update", buf, len);
+        if (!ret || strcmp(ret, "OK")) {
+            fprintf(stderr, "Transfer of new binary failed:\n%s\n",
+                    ret ? : strerror(errno));
+            rc = 3;
+            free(ret);
+            break;
+        }
+        free(ret);
+    }
+
+    free(buf);
+
+ out_rel:
+    xc_dom_release(dom);
+
+ out_close:
+    xc_interface_close(xch);
+
+    return rc;
+}
+
+/*
+ * Live update of Xenstore stubdom
+ *
+ * Sequence of actions:
+ * 1. transfer new stubdom binary
+ *    a) specify size
+ *    b) transfer unpacked binary in chunks
+ * 2. transfer new cmdline (optional)
+ * 3. start update (includes flags)
+ */
+static int live_update_stubdom(struct xs_handle *xsh, const char *binary,
+                               const char *cmdline, bool force, unsigned int to)
+{
+    int rc;
+
+    rc = send_kernel_blob(xsh, binary);
+    if (rc)
+        goto abort;
+
+    if (cmdline) {
+        rc = live_update_cmdline(xsh, cmdline);
+        if (rc)
+            goto abort;
+    }
+
+    rc = live_update_start(xsh, force, to);
+    if (rc)
+        goto abort;
+
+    return 0;
+
+ abort:
+    xs_control_command(xsh, "live-update", "-a", 3);
+    return rc;
+}
+
+/*
+ * Live update of Xenstore daemon
+ *
+ * Sequence of actions:
+ * 1. transfer new binary filename
+ * 2. transfer new cmdline (optional)
+ * 3. start update (includes flags)
+ */
+static int live_update_daemon(struct xs_handle *xsh, const char *binary,
+                              const char *cmdline, bool force, unsigned int to)
+{
+    int len = 0, rc;
+    char *buf = NULL, *ret;
+
+    len = add_to_buf(&buf, "-f", len);
+    len = add_to_buf(&buf, binary, len);
+    if (len < 0)
+        return 1;
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Setting update binary failed:\n%s\n",
+                ret ? : strerror(errno));
+        free(ret);
+        return 3;
+    }
+    free(ret);
+
+    if (cmdline) {
+        rc = live_update_cmdline(xsh, cmdline);
+        if (rc)
+            goto abort;
+    }
+
+    rc = live_update_start(xsh, force, to);
+    if (rc)
+        goto abort;
+
+    return 0;
+
+ abort:
+    xs_control_command(xsh, "live-update", "-a", 3);
+    return rc;
+}
+
+static int live_update(struct xs_handle *xsh, int argc, char **argv)
+{
+    int rc = 0;
+    unsigned int i, to = 60;
+    char *binary = NULL, *cmdline = NULL, *val;
+    bool force = false;
+
+    for (i = 0; i < argc; i++) {
+        if (!strcmp(argv[i], "-c")) {
+            i++;
+            if (i == argc) {
+                fprintf(stderr, "Missing command line value\n");
+                rc = 2;
+                goto out;
+            }
+            cmdline = argv[i];
+        } else if (!strcmp(argv[i], "-t")) {
+            i++;
+            if (i == argc) {
+                fprintf(stderr, "Missing timeout value\n");
+                rc = 2;
+                goto out;
+            }
+            to = atoi(argv[i]);
+        } else if (!strcmp(argv[i], "-F"))
+            force = true;
+        else
+            binary = argv[i];
+    }
+
+    if (!binary) {
+        fprintf(stderr, "Missing binary specification\n");
+        rc = 2;
+        goto out;
+    }
+
+    val = xs_read(xsh, XBT_NULL, "/tool/xenstored/domid", &i);
+    if (val)
+        rc = live_update_stubdom(xsh, binary, cmdline, force, to);
+    else
+        rc = live_update_daemon(xsh, binary, cmdline, force, to);
+
+    free(val);
+
+ out:
+    return rc;
+}
 
 int main(int argc, char **argv)
 {
@@ -20,22 +322,6 @@ int main(int argc, char **argv)
         goto out;
     }
 
-    for (p = 2; p < argc; p++)
-        len += strlen(argv[p]) + 1;
-    if (len) {
-        par = malloc(len);
-        if (!par) {
-            fprintf(stderr, "Allocation error.\n");
-            rc = 1;
-            goto out;
-        }
-        len = 0;
-        for (p = 2; p < argc; p++) {
-            memcpy(par + len, argv[p], strlen(argv[p]) + 1);
-            len += strlen(argv[p]) + 1;
-        }
-    }
-
     xsh = xs_open(0);
     if (xsh == NULL) {
         fprintf(stderr, "Failed to contact Xenstored.\n");
@@ -43,6 +329,19 @@ int main(int argc, char **argv)
         goto out;
     }
 
+    if (!strcmp(argv[1], "live-update")) {
+        rc = live_update(xsh, argc - 2, argv + 2);
+        goto out_close;
+    }
+
+    for (p = 2; p < argc; p++)
+        len = add_to_buf(&par, argv[p], len);
+    if (len < 0) {
+        fprintf(stderr, "Allocation error.\n");
+        rc = 1;
+        goto out_close;
+    }
+
     ret = xs_control_command(xsh, argv[1], par, len);
     if (!ret) {
         rc = 3;
@@ -59,6 +358,7 @@ int main(int argc, char **argv)
     } else if (strlen(ret) > 0)
         printf("%s\n", ret);
 
+ out_close:
     xs_close(xsh);
 
  out:
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 2/8] Add workaround for xenstore-control flood issues
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (7 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Ian Jackson, Wei Liu, Juergen Gross,
	Pau Ruiz Safont, Christian Lindig

There are alternative fixes for this, e.g. do the entire live update
inside oxenstored and reply OK from the next oxenstored or an error.
This requires some asynchronous handling there.

Once that code is available we can revert this one.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>
---
Changed since V1:
* post publicly now that the XSA is out
---
 tools/xenstore/xenstore_control.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/xenstore/xenstore_control.c b/tools/xenstore/xenstore_control.c
index 5ca015a07d..611e8b4fdd 100644
--- a/tools/xenstore/xenstore_control.c
+++ b/tools/xenstore/xenstore_control.c
@@ -42,6 +42,10 @@ static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to)
         len = add_to_buf(&buf, "-F", len);
     if (len < 0)
         return 1;
+    /* +1 for rounding issues
+     * +1 to give oxenstored a chance to timeout and report back first
+     */
+    to += 2;
 
     for (time_start = time(NULL); time(NULL) - time_start < to;) {
         ret = xs_control_command(xsh, "live-update", buf, len);
@@ -49,6 +53,15 @@ static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to)
             goto err;
         if (strcmp(ret, "BUSY"))
             break;
+        /* TODO: use task ID for commands, avoid busy loop polling
+here
+         * oxenstored checks BUSY condition internally on every main
+loop iteration anyway.
+         * Avoid flooding xenstored with live-update requests.
+         * The flooding can also cause the evtchn to overflow in
+xenstored which makes
+         * xenstored enter an infinite loop */
+        sleep(1);
     }
 
     if (strcmp(ret, "OK"))
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 3/8] docs/designs/xenstore-migration.md: clarify that deletes are recursive
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (8 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  2021-01-22 13:04   ` Jürgen Groß
  -1 siblings, 1 reply; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
Changed since V1:
* post publicly now that the XSA is out
---
 docs/designs/xenstore-migration.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
index 2ce2c836f5..f44bc0c61d 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -365,7 +365,8 @@ record previously present).
 |              | 0x0001: read                                   |
 |              | 0x0002: written                                |
 |              |                                                |
-|              | The value will be zero for a deleted node      |
+|              | The value will be zero for a recursively       |
+|              | deleted node                                   |
 |              |                                                |
 | `perm-count` | The number (N) of node permission specifiers   |
 |              | (which will be 0 for a node deleted in a       |
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 4/8] tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (9 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  2021-01-18  7:51   ` Jürgen Groß
  -1 siblings, 1 reply; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu, Juergen Gross, Pau Ruiz Safont

Currently when oxenstored receives SIGTERM it dumps its state and quits.
It is possible to then restart it if --restart is given, however that is
not always safe:

* domains could have active transactions, and after a restart they would
either reuse transaction IDs of already open transactions, or get an
error back that the transaction doesn't exist

* there could be pending data to send to a VM still in oxenstored's
  queue which would be lost

* there could be pending input to be processed from a VM in oxenstored's
  queue which would be lost

Prevent shutting down oxenstored via SIGTERM in the above situations.
Also ignore domains marked as bad because oxenstored would never talk
to them again.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>

---
Changed since V1:
* post publicly now that the XSA is out
---
 tools/ocaml/xenstored/connection.ml  | 35 ++++++++++++++++++++++++++++
 tools/ocaml/xenstored/connections.ml |  8 +++++++
 tools/ocaml/xenstored/xenstored.ml   | 13 +++++++++--
 tools/xenstore/xenstored_core.c      |  7 +++++-
 4 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index fa0d3c4d92..bd02060cd0 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -290,6 +290,41 @@ let has_new_output con = Xenbus.Xb.has_new_output con.xb
 let peek_output con = Xenbus.Xb.peek_output con.xb
 let do_output con = Xenbus.Xb.output con.xb
 
+let is_bad con = match con.dom with None -> false | Some dom -> Domain.is_bad_domain dom
+
+(* oxenstored currently only dumps limited information about its state.
+   A live update is only possible if any of the state that is not dumped would be empty.
+   Compared to https://xenbits.xen.org/docs/unstable/designs/xenstore-migration.html:
+     * GLOBAL_DATA: not strictly needed, systemd is giving the socket FDs to us
+     * CONNECTION_DATA: PARTIAL
+       * for domains: PARTIAL, see Connection.dump -> Domain.dump, only if data and tdomid is empty
+       * for sockets (Dom0 toolstack): NO
+     * WATCH_DATA: OK, see Connection.dump
+     * TRANSACTION_DATA: NO
+     * NODE_DATA: OK (except for transactions), see Store.dump_fct and DB.to_channel
+
+   Also xenstored will never talk to a Domain once it is marked as bad,
+   so treat it as idle for live-update.
+
+   Restrictions below can be relaxed once xenstored learns to dump more
+   of its live state in a safe way *)
+let has_extra_connection_data con =
+	let has_in = has_input con in
+	let has_out = has_output con in
+	let has_socket = con.dom = None in
+	let has_nondefault_perms = make_perm con.dom <> con.perm in
+	has_in || has_out
+	|| has_socket (* dom0 sockets not dumped yet *)
+	|| has_nondefault_perms (* set_target not dumped yet *)
+
+let has_transaction_data con =
+	let n = number_of_transactions con in
+	dbg "%s: number of transactions = %d" (get_domstr con) n;
+	n > 0
+
+let prevents_live_update con = not (is_bad con)
+	&& (has_extra_connection_data con || has_transaction_data con)
+
 let has_more_work con =
 	has_more_input con || not (has_old_output con) && has_new_output con
 
diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
index 6ee3552ec2..82988f7e8d 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -194,3 +194,11 @@ let debug cons =
 	let anonymous = Hashtbl.fold (fun _ con accu -> Connection.debug con :: accu) cons.anonymous [] in
 	let domains = Hashtbl.fold (fun _ con accu -> Connection.debug con :: accu) cons.domains [] in
 	String.concat "" (domains @ anonymous)
+
+let filter ~f cons =
+	let fold _ v acc = if f v then v :: acc else acc in
+	[]
+	|> Hashtbl.fold fold cons.anonymous
+	|> Hashtbl.fold fold cons.domains
+
+let prevents_quit cons = filter ~f:Connection.prevents_live_update cons
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 39d6d767e4..6b5381962b 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -20,6 +20,7 @@ open Parse_arg
 open Stdext
 
 let error fmt = Logging.error "xenstored" fmt
+let warn fmt = Logging.warn "xenstored" fmt
 let debug fmt = Logging.debug "xenstored" fmt
 let info fmt = Logging.info "xenstored" fmt
 
@@ -312,7 +313,9 @@ let _ =
 	);
 
 	Sys.set_signal Sys.sighup (Sys.Signal_handle sighup_handler);
-	Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _ -> quit := true));
+	Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _ ->
+		 info "Received SIGTERM";
+		 quit := true));
 	Sys.set_signal Sys.sigusr1 (Sys.Signal_handle (fun _ -> sigusr1_handler store));
 	Sys.set_signal Sys.sigpipe Sys.Signal_ignore;
 
@@ -424,6 +427,12 @@ let _ =
 		);
 		let elapsed = Unix.gettimeofday () -. now in
 		debug "periodic_ops took %F seconds." elapsed;
+		if !quit then
+		(match Connections.prevents_quit cons with
+		| [] -> ()
+		| domains ->
+		    List.iter (fun con -> warn "%s prevents live update" (Connection.get_domstr con)) domains
+		);
 		delay_next_frequent_ops_by elapsed
 	in
 
@@ -475,7 +484,7 @@ let _ =
 		in
 
 	Systemd.sd_notify_ready ();
-	while not !quit
+	while not (!quit && Connections.prevents_quit cons = [])
 	do
 		try
 			main_loop ()
diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 50986f8b29..b9495365c4 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -1970,6 +1970,7 @@ static struct option options[] = {
 	{ "internal-db", 0, NULL, 'I' },
 	{ "verbose", 0, NULL, 'V' },
 	{ "watch-nb", 1, NULL, 'W' },
+	{ "live-update", 0, NULL, 'U' },
 	{ NULL, 0, NULL, 0 } };
 
 extern void dump_conn(struct connection *conn); 
@@ -1984,11 +1985,12 @@ int main(int argc, char *argv[])
 	bool dofork = true;
 	bool outputpid = false;
 	bool no_domain_init = false;
+	bool live_update = false;
 	const char *pidfile = NULL;
 	int timeout;
 
 
-	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:A:M:T:RVW:", options,
+	while ((opt = getopt_long(argc, argv, "DE:F:HNPS:t:A:M:T:RVW:U", options,
 				  NULL)) != -1) {
 		switch (opt) {
 		case 'D':
@@ -2046,6 +2048,9 @@ int main(int argc, char *argv[])
 		case 'p':
 			priv_domid = strtol(optarg, NULL, 10);
 			break;
+		case 'U':
+			live_update = true;
+			break;
 		}
 	}
 	if (optind != argc)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 5/8] tools/ocaml/xenstored: Automatically resume when possible
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (10 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu, Pau Ruiz Safont

When a `db` file exists use it to resume oxenstored.
It will contains a xenstore tree, domain reconnection info, and watches.

It is currently missing data about all active socket connections,
so a toolstack should ideally be stopped and restarted too.

Tell systemd about oxenstored's PID and allow it to restart on success.

This should make updating oxenstored as easy as:
`systemctl stop -s SIGTERM xenstored` on a suitable xenstored version.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>

---
Changed since V1:
* post publicly now that the XSA is out
---
 tools/ocaml/xenstored/xenstored.ml | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 6b5381962b..500d96753b 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -292,9 +292,8 @@ let _ =
 	List.iter (fun path ->
 		Store.write store Perms.Connection.full_rights path "") Store.Path.specials;
 
-	let filename = Paths.xen_run_stored ^ "/db" in
-	if cf.restart && Sys.file_exists filename then (
-		DB.from_file store domains cons filename;
+	if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
+		DB.from_file store domains cons Disk.xs_daemon_database;
 		Event.bind_dom_exc_virq eventchn
 	) else (
 		if !Disk.enable then (
@@ -320,7 +319,7 @@ let _ =
 	Sys.set_signal Sys.sigpipe Sys.Signal_ignore;
 
 	if cf.activate_access_log then begin
-		let post_rotate () = DB.to_file store cons (Paths.xen_run_stored ^ "/db") in
+		let post_rotate () = DB.to_file store cons Disk.xs_daemon_database in
 		Logging.init_access_log post_rotate
 	end;
 
@@ -494,5 +493,8 @@ let _ =
 				raise exc
 	done;
 	info "stopping xenstored";
-	DB.to_file store cons (Paths.xen_run_stored ^ "/db");
+		DB.to_file store cons Disk.xs_daemon_database;
+		(* unlink pidfile so that launch-xenstore works again *)
+		Unixext.unlink_safe pidfile;
+		(match cf.pidfile with Some pidfile -> Unixext.unlink_safe pidfile | None -> ());
 	()
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 6/8] tools/ocaml/xenstored: add cooperative live-update command
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (11 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu, Pau Ruiz Safont

See docs/misc/xenstore.txt for documentation on live-update command.
Validate that the binary exists and that the cmdline is valid,
to prevent typos from taking down xenstore
(if live-update fails there is no way back due to the use of execve).

Live update only proceeds if there are no active transactions,
and no unprocess input or unflushed output.
It is not yet possible to force the live-update.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>

---
Changed since V1:
* post publicly now that the XSA is out
---
 tools/ocaml/xenstored/process.ml | 112 +++++++++++++++++++++++++++++++
 tools/ocaml/xenstored/stdext.ml  |   6 ++
 2 files changed, 118 insertions(+)

diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index 437d2dcf9e..c3c5dc58c0 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -15,6 +15,7 @@
  *)
 
 let error fmt = Logging.error "process" fmt
+let warn fmt = Logging.warn "process" fmt
 let info fmt = Logging.info "process" fmt
 let debug fmt = Logging.debug "process" fmt
 
@@ -84,11 +85,122 @@ let create_implicit_path t perm path =
 		List.iter (fun s -> Transaction.mkdir ~with_watch:false t perm s) ret
 	)
 
+module LiveUpdate = struct
+type t =
+	{ binary: string
+		; cmdline: string list
+		; deadline: float
+	; force: bool
+	; pending: bool }
+
+let state =
+	ref
+		{ binary= Sys.executable_name
+		; cmdline= []
+		; deadline= 0.
+		; force= false
+		; pending= false }
+
+let debug = Printf.eprintf
+
+let args_of_t t = (t.binary, "--restart" :: t.cmdline)
+
+let string_of_t t =
+	let executable, rest = args_of_t t in
+	Filename.quote_command executable rest
+
+let launch_exn t =
+	let executable, rest = args_of_t t in
+	let args = Array.of_list (executable :: rest) in
+	Unix.execv args.(0) args
+
+let validate_exn t =
+	(* --help must be last to check validity of earlier arguments *)
+	let t = {t with cmdline= t.cmdline @ ["--help"]} in
+	let cmd = string_of_t t in
+	debug "Executing %s" cmd ;
+	match Unix.fork () with
+	| 0 ->
+		 ( try launch_exn t with _ -> exit 2 )
+	| pid -> (
+	match Unix.waitpid [] pid with
+		| _, Unix.WEXITED 0 ->
+				debug "Live update validated cmdline %s" cmd;
+		t
+	| _, Unix.WEXITED n ->
+		invalid_arg (Printf.sprintf "Command %s exited with code %d" cmd n)
+	| _, Unix.WSIGNALED n ->
+		invalid_arg
+		  (Printf.sprintf "Command %s killed by ocaml signal number %d" cmd n)
+	| _, Unix.WSTOPPED n ->
+		invalid_arg
+		  (Printf.sprintf "Command %s stopped by ocaml signal number %d" cmd n)
+	)
+
+let parse_live_update args =
+	try
+	(state :=
+		match args with
+		| ["-f"; file] ->
+			validate_exn {!state with binary= file}
+		| ["-a"] ->
+			debug "Live update aborted" ;
+			{!state with pending= false}
+		| "-c" :: cmdline ->
+			validate_exn {!state with cmdline}
+		| "-s" :: _ ->
+			let timeout = ref 60 in
+			let force = ref false in
+			Arg.parse_argv ~current:(ref 1) (Array.of_list args)
+				[ ( "-t"
+				, Arg.Set_int timeout
+				, "timeout in seconds to wait for active transactions to finish"
+				)
+			(*; ( "-F"
+				, Arg.Set force
+				, "force live update to happen even with running transactions \
+				   after timeout elapsed" )*) ]
+			(fun x -> raise (Arg.Bad x))
+			"live-update -s" ;
+			debug "Live update process queued" ;
+				{!state with deadline = Unix.gettimeofday () +. float !timeout
+				; force= !force; pending= true}
+		| _ ->
+			invalid_arg ("Unknown arguments: " ^ String.concat " " args)) ;
+	None
+	with
+	| Arg.Bad s | Arg.Help s | Invalid_argument s ->
+		Some s
+	| Unix.Unix_error (e, fn, args) ->
+		Some (Printf.sprintf "%s(%s): %s" fn args (Unix.error_message e))
+
+	let should_run cons =
+		let t = !state in
+		if t.pending then begin
+			match Connections.prevents_quit cons with
+			| [] -> true
+			| _ when Unix.gettimeofday () < t.deadline -> false
+			| l ->
+				 info "Live update timeout reached: %d active connections" (List.length l);
+				 List.iter (fun con -> warn "%s prevents live update" (Connection.get_domstr con)) l;
+				 if t.force then begin
+					 warn "Live update forced, some domain connections may break!";
+					 true
+				 end else begin
+					 warn "Live update aborted, try migrating or shutting down the domains/toolstack";
+					 state := { t with pending = false };
+					 false
+				end
+		end else false
+end
+
 (* packets *)
 let do_debug con t _domains cons data =
 	if not (Connection.is_dom0 con) && not !allow_debug
 	then None
 	else try match split None '\000' data with
+	| "live-update" :: params ->
+		LiveUpdate.parse_live_update params
 	| "print" :: msg :: _ ->
 		Logging.xb_op ~tid:0 ~ty:Xenbus.Xb.Op.Debug ~con:"=======>" msg;
 		None
diff --git a/tools/ocaml/xenstored/stdext.ml b/tools/ocaml/xenstored/stdext.ml
index 4f2f3a2c8c..e1567c4dfa 100644
--- a/tools/ocaml/xenstored/stdext.ml
+++ b/tools/ocaml/xenstored/stdext.ml
@@ -44,6 +44,12 @@ let default d v =
 let maybe f v =
 	match v with None -> () | Some x -> f x
 
+module Filename = struct
+	include Filename
+	let quote_command cmd args =
+		cmd :: args |> List.map quote |> String.concat " "
+end
+
 module String = struct include String
 
 let of_char c = String.make 1 c
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 7/8] tools/ocaml/xenstored: start live update process
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (12 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu, Pau Ruiz Safont

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>

---
Changed since V1:
* post publicly now that the XSA is out
---
 tools/ocaml/xenstored/logging.ml   |  3 +++
 tools/ocaml/xenstored/process.ml   |  8 +++++---
 tools/ocaml/xenstored/xenstored.ml | 29 ++++++++++++++++++++++-------
 3 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/tools/ocaml/xenstored/logging.ml b/tools/ocaml/xenstored/logging.ml
index 1ede131329..39c3036155 100644
--- a/tools/ocaml/xenstored/logging.ml
+++ b/tools/ocaml/xenstored/logging.ml
@@ -327,6 +327,9 @@ let end_transaction ~tid ~con =
 	if !access_log_transaction_ops && tid <> 0
 	then access_logging ~tid ~con (XbOp Xenbus.Xb.Op.Transaction_end) ~level:Debug
 
+let live_update () =
+	xb_op ~tid:0 ~con:"" ~ty:Xenbus.Xb.Op.Debug "Live update begin"
+
 let xb_answer ~tid ~con ~ty data =
 	let print, level = match ty with
 		| Xenbus.Xb.Op.Error when String.startswith "ENOENT" data -> !access_log_read_ops , Warn
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index c3c5dc58c0..3174d8ede5 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -112,6 +112,7 @@ let string_of_t t =
 let launch_exn t =
 	let executable, rest = args_of_t t in
 	let args = Array.of_list (executable :: rest) in
+	info "Launching %s, args: %s" executable (String.concat " " rest);
 	Unix.execv args.(0) args
 
 let validate_exn t =
@@ -151,7 +152,7 @@ let parse_live_update args =
 		| "-s" :: _ ->
 			let timeout = ref 60 in
 			let force = ref false in
-			Arg.parse_argv ~current:(ref 1) (Array.of_list args)
+			Arg.parse_argv ~current:(ref 0) (Array.of_list args)
 				[ ( "-t"
 				, Arg.Set_int timeout
 				, "timeout in seconds to wait for active transactions to finish"
@@ -166,7 +167,7 @@ let parse_live_update args =
 				{!state with deadline = Unix.gettimeofday () +. float !timeout
 				; force= !force; pending= true}
 		| _ ->
-			invalid_arg ("Unknown arguments: " ^ String.concat " " args)) ;
+			invalid_arg ("Unknown arguments: " ^ String.concat "," args)) ;
 	None
 	with
 	| Arg.Bad s | Arg.Help s | Invalid_argument s ->
@@ -200,7 +201,8 @@ let do_debug con t _domains cons data =
 	then None
 	else try match split None '\000' data with
 	| "live-update" :: params ->
-		LiveUpdate.parse_live_update params
+		let dropped_trailing_nul = params |> List.rev |> List.tl |> List.rev in
+		LiveUpdate.parse_live_update dropped_trailing_nul
 	| "print" :: msg :: _ ->
 		Logging.xb_op ~tid:0 ~ty:Xenbus.Xb.Op.Debug ~con:"=======>" msg;
 		None
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 500d96753b..22413271fb 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -311,6 +311,11 @@ let _ =
 		);
 	);
 
+	(* required for xenstore-control to detect availability of live-update *)
+	Store.mkdir store Perms.Connection.full_rights (Store.Path.of_string "/tool");
+	Store.write store Perms.Connection.full_rights
+		(Store.Path.of_string "/tool/xenstored") Sys.executable_name;
+
 	Sys.set_signal Sys.sighup (Sys.Signal_handle sighup_handler);
 	Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _ ->
 		 info "Received SIGTERM";
@@ -483,18 +488,28 @@ let _ =
 		in
 
 	Systemd.sd_notify_ready ();
+	let live_update = ref false in
 	while not (!quit && Connections.prevents_quit cons = [])
 	do
 		try
-			main_loop ()
+			main_loop ();
+			live_update := Process.LiveUpdate.should_run cons;
+			if !live_update || !quit then begin
+				(* don't initiate live update if saving state fails *)
+				DB.to_file store cons Disk.xs_daemon_database;
+				quit := true;
+			end
 		with exc ->
-			error "caught exception %s" (Printexc.to_string exc);
+			let bt = Printexc.get_backtrace () in
+			error "caught exception %s: %s" (Printexc.to_string exc) bt;
 			if cf.reraise_top_level then
 				raise exc
 	done;
 	info "stopping xenstored";
-		DB.to_file store cons Disk.xs_daemon_database;
-		(* unlink pidfile so that launch-xenstore works again *)
-		Unixext.unlink_safe pidfile;
-		(match cf.pidfile with Some pidfile -> Unixext.unlink_safe pidfile | None -> ());
-	()
+	(* unlink pidfile so that launch-xenstore works again *)
+	Unixext.unlink_safe pidfile;
+	(match cf.pidfile with Some pidfile -> Unixext.unlink_safe pidfile | None -> ());
+	if !live_update then begin
+		 Logging.live_update ();
+		 Process.LiveUpdate.launch_exn !Process.LiveUpdate.state
+	end
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 8/8] tools/ocaml/xenstored: Implement live update for socket connections
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (13 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edvin Torok, Christian Lindig, David Scott, Ian Jackson, Wei Liu,
	Edwin Török, Pau Ruiz Safont

From: Edvin Torok <edvint@eddie2.eng.citrite.net>

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>

---
Changed since V1
* post publicly now that the XSA is out
---
 tools/ocaml/xenstored/connection.ml | 25 +++++---
 tools/ocaml/xenstored/parse_arg.ml  |  4 ++
 tools/ocaml/xenstored/process.ml    | 51 ++++++++++++-----
 tools/ocaml/xenstored/store.ml      |  2 +-
 tools/ocaml/xenstored/utils.ml      | 12 ++++
 tools/ocaml/xenstored/xenstored.ml  | 88 +++++++++++++++++++++--------
 6 files changed, 138 insertions(+), 44 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index bd02060cd0..eb23c3af7a 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -281,6 +281,9 @@ let get_transaction con tid =
 
 let do_input con = Xenbus.Xb.input con.xb
 let has_input con = Xenbus.Xb.has_in_packet con.xb
+let has_partial_input con = match con.xb.Xenbus.Xb.partial_in with
+	| HaveHdr _ -> true
+	| NoHdr (n, _) -> n < Xenbus.Partial.header_size ()
 let pop_in con = Xenbus.Xb.get_in_packet con.xb
 let has_more_input con = Xenbus.Xb.has_more_input con.xb
 
@@ -309,12 +312,13 @@ let is_bad con = match con.dom with None -> false | Some dom -> Domain.is_bad_do
    Restrictions below can be relaxed once xenstored learns to dump more
    of its live state in a safe way *)
 let has_extra_connection_data con =
-	let has_in = has_input con in
+	let has_in = has_input con || has_partial_input con in
 	let has_out = has_output con in
 	let has_socket = con.dom = None in
 	let has_nondefault_perms = make_perm con.dom <> con.perm in
 	has_in || has_out
-	|| has_socket (* dom0 sockets not dumped yet *)
+	(* TODO: what about SIGTERM, should use systemd to store FDS
+	 || has_socket (* dom0 sockets not * dumped yet *) *)
 	|| has_nondefault_perms (* set_target not dumped yet *)
 
 let has_transaction_data con =
@@ -337,16 +341,21 @@ let stats con =
 	Hashtbl.length con.watches, con.stat_nb_ops
 
 let dump con chan =
-	match con.dom with
+	let id = match con.dom with
 	| Some dom ->
 		let domid = Domain.get_id dom in
 		(* dump domain *)
 		Domain.dump dom chan;
-		(* dump watches *)
-		List.iter (fun (path, token) ->
-			Printf.fprintf chan "watch,%d,%s,%s\n" domid (Utils.hexify path) (Utils.hexify token)
-			) (list_watches con);
-	| None -> ()
+		domid
+	| None ->
+		let fd = con |> get_fd |> Utils.FD.to_int in
+		Printf.fprintf chan "socket,%d\n" fd;
+		-fd
+	in
+	(* dump watches *)
+	List.iter (fun (path, token) ->
+		Printf.fprintf chan "watch,%d,%s,%s\n" id (Utils.hexify path) (Utils.hexify token)
+		) (list_watches con)
 
 let debug con =
 	let domid = get_domstr con in
diff --git a/tools/ocaml/xenstored/parse_arg.ml b/tools/ocaml/xenstored/parse_arg.ml
index 2c4b5a8528..7c0478e76a 100644
--- a/tools/ocaml/xenstored/parse_arg.ml
+++ b/tools/ocaml/xenstored/parse_arg.ml
@@ -24,6 +24,7 @@ type config =
 	pidfile: string option; (* old xenstored compatibility *)
 	tracefile: string option; (* old xenstored compatibility *)
 	restart: bool;
+	live_reload: bool;
 	disable_socket: bool;
 }
 
@@ -35,6 +36,7 @@ let do_argv =
 	and reraise_top_level = ref false
 	and config_file = ref ""
 	and restart = ref false
+	and live_reload = ref false
 	and disable_socket = ref false
 	in
 
@@ -52,6 +54,7 @@ let do_argv =
 		  ("--pid-file", Arg.Set_string pidfile, ""); (* for compatibility *)
 		  ("-T", Arg.Set_string tracefile, ""); (* for compatibility *)
 		  ("--restart", Arg.Set restart, "Read database on starting");
+		  ("--live", Arg.Set live_reload, "Read live dump on startup");
 		  ("--disable-socket", Arg.Unit (fun () -> disable_socket := true), "Disable socket");
 		] in
 	let usage_msg = "usage : xenstored [--config-file <filename>] [--no-domain-init] [--help] [--no-fork] [--reraise-top-level] [--restart] [--disable-socket]" in
@@ -65,5 +68,6 @@ let do_argv =
 		pidfile = if !pidfile <> "" then Some !pidfile else None;
 		tracefile = if !tracefile <> "" then Some !tracefile else None;
 		restart = !restart;
+		live_reload = !live_reload;
 		disable_socket = !disable_socket;
 	}
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index 3174d8ede5..dd50456ad5 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -91,19 +91,24 @@ type t =
 		; cmdline: string list
 		; deadline: float
 	; force: bool
+	; result: string list
 	; pending: bool }
 
 let state =
 	ref
 		{ binary= Sys.executable_name
-		; cmdline= []
+		; cmdline= (Sys.argv |> Array.to_list |> List.tl)
 		; deadline= 0.
 		; force= false
+		; result = []
 		; pending= false }
 
 let debug = Printf.eprintf
 
-let args_of_t t = (t.binary, "--restart" :: t.cmdline)
+let forced_args = ["--live"; "--restart"]
+let args_of_t t =
+	let filtered = List.filter (fun x -> not @@ List.mem x forced_args) t.cmdline in
+	(t.binary, forced_args @ filtered)
 
 let string_of_t t =
 	let executable, rest = args_of_t t in
@@ -117,12 +122,12 @@ let launch_exn t =
 
 let validate_exn t =
 	(* --help must be last to check validity of earlier arguments *)
-	let t = {t with cmdline= t.cmdline @ ["--help"]} in
-	let cmd = string_of_t t in
+	let t' = {t with cmdline= t.cmdline @ ["--help"]} in
+	let cmd = string_of_t t' in
 	debug "Executing %s" cmd ;
 	match Unix.fork () with
 	| 0 ->
-		 ( try launch_exn t with _ -> exit 2 )
+		 ( try launch_exn t' with _ -> exit 2 )
 	| pid -> (
 	match Unix.waitpid [] pid with
 		| _, Unix.WEXITED 0 ->
@@ -146,10 +151,14 @@ let parse_live_update args =
 			validate_exn {!state with binary= file}
 		| ["-a"] ->
 			debug "Live update aborted" ;
-			{!state with pending= false}
+			{!state with pending= false; result = []}
 		| "-c" :: cmdline ->
-			validate_exn {!state with cmdline}
+			validate_exn {!state with cmdline = !state.cmdline @ cmdline}
 		| "-s" :: _ ->
+			(match !state.pending, !state.result with
+			| true, _ -> !state (* no change to state, avoid resetting timeout *)
+			| false, _ :: _ -> !state (* we got a pending result to deliver *)
+			| false, [] ->
 			let timeout = ref 60 in
 			let force = ref false in
 			Arg.parse_argv ~current:(ref 0) (Array.of_list args)
@@ -165,10 +174,16 @@ let parse_live_update args =
 			"live-update -s" ;
 			debug "Live update process queued" ;
 				{!state with deadline = Unix.gettimeofday () +. float !timeout
-				; force= !force; pending= true}
+				; force= !force; pending= true})
 		| _ ->
 			invalid_arg ("Unknown arguments: " ^ String.concat "," args)) ;
-	None
+		match !state.pending, !state.result with
+		| true, _ -> Some "BUSY"
+		| false, (_ :: _ as result) ->
+				(* xenstore-control has read the result, clear it *)
+				state := { !state with result = [] };
+				Some (String.concat "\n" result)
+		| false, [] -> None
 	with
 	| Arg.Bad s | Arg.Help s | Invalid_argument s ->
 		Some s
@@ -182,17 +197,27 @@ let parse_live_update args =
 			| [] -> true
 			| _ when Unix.gettimeofday () < t.deadline -> false
 			| l ->
-				 info "Live update timeout reached: %d active connections" (List.length l);
-				 List.iter (fun con -> warn "%s prevents live update" (Connection.get_domstr con)) l;
+				 warn "timeout reached: have to wait, migrate or shutdown %d domains:" (List.length l);
+				 let msgs = List.rev_map (fun con -> Printf.sprintf "%s: %d tx, in: %b, out: %b, perm: %s"
+					 (Connection.get_domstr con)
+					 (Connection.number_of_transactions con)
+					 (Connection.has_input con)
+					 (Connection.has_output con)
+					 (Connection.get_perm con |> Perms.Connection.to_string)
+					) l in
+				 List.iter (warn "Live-update: %s") msgs;
 				 if t.force then begin
 					 warn "Live update forced, some domain connections may break!";
 					 true
 				 end else begin
-					 warn "Live update aborted, try migrating or shutting down the domains/toolstack";
-					 state := { t with pending = false };
+					 warn "Live update aborted (see above for domains preventing it)";
+					 state := { t with pending = false; result = msgs};
 					 false
 				end
 		end else false
+
+	let completed () =
+		state := { !state with result = ["OK"] }
 end
 
 (* packets *)
diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index e20767372f..a3be2e6bbe 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -366,7 +366,7 @@ let traversal root_node f =
 	let rec _traversal path node =
 		f path node;
 		let node_path = Path.of_path_and_name path (Symbol.to_string node.Node.name) in
-		List.iter (_traversal node_path) node.Node.children
+		List.iter (_traversal node_path) (List.rev node.Node.children)
 		in
 	_traversal [] root_node
 
diff --git a/tools/ocaml/xenstored/utils.ml b/tools/ocaml/xenstored/utils.ml
index eb79bf0146..6c1603c276 100644
--- a/tools/ocaml/xenstored/utils.ml
+++ b/tools/ocaml/xenstored/utils.ml
@@ -115,3 +115,15 @@ let path_validate path connection_path =
 	if len > !Define.path_max then raise Define.Invalid_path;
 
 	abs_path
+
+module FD : sig
+     type t = Unix.file_descr
+     val of_int: int -> t
+     val to_int : t -> int
+end = struct
+    type t = Unix.file_descr
+    (* This is like Obj.magic but just for these types,
+       and relies on Unix.file_descr = int *)
+    external to_int : t -> int = "%identity"
+    external of_int : int -> t = "%identity"
+end
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 22413271fb..5893af2caa 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -141,9 +141,12 @@ exception Bad_format of string
 
 let dump_format_header = "$xenstored-dump-format"
 
-let from_channel_f chan domain_f watch_f store_f =
+let from_channel_f chan global_f socket_f domain_f watch_f store_f =
 	let unhexify s = Utils.unhexify s in
-	let getpath s = Store.Path.of_string (Utils.unhexify s) in
+	let getpath s =
+		let u = Utils.unhexify s in
+		debug "Path: %s" u;
+		Store.Path.of_string u in
 	let header = input_line chan in
 	if header <> dump_format_header then
 		raise (Bad_format "header");
@@ -155,6 +158,12 @@ let from_channel_f chan domain_f watch_f store_f =
 			let l = String.split ',' line in
 			try
 				match l with
+				| "global" :: rw :: _ ->
+					(* there might be more parameters here,
+						e.g. a RO socket from a previous version: ignore it *)
+					global_f ~rw
+				| "socket" :: fd :: [] ->
+					socket_f ~fd:(int_of_string fd)
 				| "dom" :: domid :: mfn :: port :: []->
 					domain_f (int_of_string domid)
 					         (Nativeint.of_string mfn)
@@ -175,12 +184,28 @@ let from_channel_f chan domain_f watch_f store_f =
 		with End_of_file ->
 			quit := true
 	done;
-	()
+	info "Completed loading xenstore dump"
 
 let from_channel store cons doms chan =
 	(* don't let the permission get on our way, full perm ! *)
 	let op = Store.get_ops store Perms.Connection.full_rights in
-
+	let rwro = ref (None) in
+	let global_f ~rw =
+		let get_listen_sock sockfd =
+			let fd = sockfd |> int_of_string |> Utils.FD.of_int in
+			Unix.listen fd 1;
+			Some fd
+		in
+		rwro := get_listen_sock rw
+	in
+	let socket_f ~fd =
+		let ufd = Utils.FD.of_int fd in
+		let is_valid = try (Unix.fstat ufd).Unix.st_kind = Unix.S_SOCK with _ -> false in
+		if is_valid then
+			Connections.add_anonymous cons ufd
+		else
+			warn "Ignoring invalid socket FD %d" fd
+	in
 	let domain_f domid mfn port =
 		let ndom =
 			if domid > 0 then
@@ -190,28 +215,38 @@ let from_channel store cons doms chan =
 			in
 		Connections.add_domain cons ndom;
 		in
-	let watch_f domid path token =
-		let con = Connections.find_domain cons domid in
-		ignore (Connections.add_watch cons con path token)
+	let get_con id =
+		if id < 0 then Connections.find cons (Utils.FD.of_int (-id))
+		else Connections.find_domain cons id
+	in
+	let watch_f id path token =
+		ignore (Connections.add_watch cons (get_con id) path token)
 		in
 	let store_f path perms value =
 		op.Store.write path value;
 		op.Store.setperms path perms
 		in
-	from_channel_f chan domain_f watch_f store_f
+	from_channel_f chan global_f socket_f domain_f watch_f store_f;
+	!rwro
 
 let from_file store cons doms file =
+	info "Loading xenstore dump from %s" file;
 	let channel = open_in file in
 	finally (fun () -> from_channel store doms cons channel)
 	        (fun () -> close_in channel)
 
-let to_channel store cons chan =
+let to_channel store cons rw chan =
 	let hexify s = Utils.hexify s in
 
 	fprintf chan "%s\n" dump_format_header;
+	let fdopt = function None -> -1 | Some fd ->
+		(* systemd and utils.ml sets it close on exec *)
+		Unix.clear_close_on_exec fd;
+		Utils.FD.to_int fd in
+	fprintf chan "global,%d\n" (fdopt rw);
 
-	(* dump connections related to domains; domid, mfn, eventchn port, watches *)
-	Connections.iter_domains cons (fun con -> Connection.dump con chan);
+	(* dump connections related to domains: domid, mfn, eventchn port/ sockets, and watches *)
+	Connections.iter cons (fun con -> Connection.dump con chan);
 
 	(* dump the store *)
 	Store.dump_fct store (fun path node ->
@@ -224,9 +259,9 @@ let to_channel store cons chan =
 	()
 
 
-let to_file store cons file =
+let to_file store cons fds file =
 	let channel = open_out_gen [ Open_wronly; Open_creat; Open_trunc; ] 0o600 file in
-	finally (fun () -> to_channel store cons channel)
+	finally (fun () -> to_channel store cons fds channel)
 	        (fun () -> close_out channel)
 end
 
@@ -246,13 +281,13 @@ let _ =
 	);
 
 	let rw_sock =
-		if cf.disable_socket then
+		if cf.disable_socket || cf.live_reload then
 			None
 		else
 			Some (Unix.handle_unix_error Utils.create_unix_socket Define.xs_daemon_socket)
 		in
 
-	if cf.daemonize then
+	if cf.daemonize && not cf.live_reload then
 		Unixext.daemonize ()
 	else
 		printf "Xen Storage Daemon, version %d.%d\n%!"
@@ -292,10 +327,15 @@ let _ =
 	List.iter (fun path ->
 		Store.write store Perms.Connection.full_rights path "") Store.Path.specials;
 
+	let rw_sock =
 	if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
-		DB.from_file store domains cons Disk.xs_daemon_database;
-		Event.bind_dom_exc_virq eventchn
-	) else (
+		let rwro = DB.from_file store domains cons Disk.xs_daemon_database in
+		info "Live reload: database loaded";
+		Event.bind_dom_exc_virq eventchn;
+		Process.LiveUpdate.completed ();
+		rwro
+ 	) else (
+		info "No live reload: regular startup";
 		if !Disk.enable then (
 			info "reading store from disk";
 			Disk.read store
@@ -309,10 +349,13 @@ let _ =
 			Connections.add_domain cons (Domains.create0 domains);
 			Event.bind_dom_exc_virq eventchn
 		);
-	);
+		rw_sock
+	) in
 
 	(* required for xenstore-control to detect availability of live-update *)
-	Store.mkdir store Perms.Connection.full_rights (Store.Path.of_string "/tool");
+	let toolpath = Store.Path.of_string "/tool" in
+	if not (Store.path_exists store toolpath) then
+		Store.mkdir store Perms.Connection.full_rights toolpath;
 	Store.write store Perms.Connection.full_rights
 		(Store.Path.of_string "/tool/xenstored") Sys.executable_name;
 
@@ -324,7 +367,7 @@ let _ =
 	Sys.set_signal Sys.sigpipe Sys.Signal_ignore;
 
 	if cf.activate_access_log then begin
-		let post_rotate () = DB.to_file store cons Disk.xs_daemon_database in
+		let post_rotate () = DB.to_file store cons (None) Disk.xs_daemon_database in
 		Logging.init_access_log post_rotate
 	end;
 
@@ -367,6 +410,7 @@ let _ =
 	let ring_scan_checker dom =
 		(* no need to scan domains already marked as for processing *)
 		if not (Domain.get_io_credit dom > 0) then
+			debug "Looking up domid %d" (Domain.get_id dom);
 			let con = Connections.find_domain cons (Domain.get_id dom) in
 			if not (Connection.has_more_work con) then (
 				Process.do_output store cons domains con;
@@ -496,7 +540,7 @@ let _ =
 			live_update := Process.LiveUpdate.should_run cons;
 			if !live_update || !quit then begin
 				(* don't initiate live update if saving state fails *)
-				DB.to_file store cons Disk.xs_daemon_database;
+				DB.to_file store cons (rw_sock) Disk.xs_daemon_database;
 				quit := true;
 			end
 		with exc ->
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 0/2] tools/ocaml/libs/xc: domid control
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

For debugging/testing purposes we want to be able to control the domid
from the XAPI toolstack too. Xen supports this since a long time.

For convenience here is a tree with all patch series applied:
https://github.com/edwintorok/xen/pull/1


Edwin Török (2):
  tools/ocaml/xenstored: trim txhistory on xenbus reconnect
  tools/ocaml/libs/xc: backward compatible domid control at domain
    creation time

 tools/ocaml/libs/xc/xenctrl.ml      | 5 ++++-
 tools/ocaml/libs/xc/xenctrl.mli     | 4 ++--
 tools/ocaml/libs/xc/xenctrl_stubs.c | 6 +++---
 tools/ocaml/xenstored/connection.ml | 2 +-
 tools/ocaml/xenstored/history.ml    | 4 ++++
 tools/ocaml/xenstored/process.ml    | 4 ++--
 6 files changed, 16 insertions(+), 9 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 1/2] tools/ocaml/xenstored: trim txhistory on xenbus reconnect
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (14 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

There is a global history, containing transactions from the past 0.05s, which get trimmed whenever any transaction commits or aborts.
Destroying a domain will cause xenopsd to perform some transactions deleting the tree, so that is fine.
But I think that a domain can abuse the xenbus reconnect facility to cause a large history to be
recorded - provided that noone does any transactions on the system inbetween, which may be difficult
to achieve given squeezed's constant pinging.

The theoretical situation is like this:
- a domain starts a transaction, creates as large a tree as it can, commits it. Then repeatedly:
    - start a transaction, do nothing with it, start a transaction, delete part of the large tree, write some new unique data there, don't commit
    - cause a xenbus reconnect (I think this can be done by writing something to the ring). This causes all transactions/watches for the connection to be cleared, but NOT the history, there were no commits, so nobody trimmed the history, i.e. it the history can contain transactions from more than just 0.05s
    - loop back and start more transactions, you can keep this up indefinitely without hitting quotas

Now there is a periodic History.trim running every 0.05s, so I don't think you can do much damage
with it.
But lets be safe an trim the transaction history anyway on reconnect.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
Changed since V1:
* post publicly now that the XSA is out (not a security issue)
---
 tools/ocaml/xenstored/connection.ml | 2 +-
 tools/ocaml/xenstored/history.ml    | 4 ++++
 tools/ocaml/xenstored/process.ml    | 4 ++--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index eb23c3af7a..1cf24beafd 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -47,7 +47,7 @@ let mark_as_bad con =
 
 let initial_next_tid = 1
 
-let reconnect con =
+let do_reconnect con =
 	Xenbus.Xb.reconnect con.xb;
 	(* dom is the same *)
 	Hashtbl.clear con.transactions;
diff --git a/tools/ocaml/xenstored/history.ml b/tools/ocaml/xenstored/history.ml
index f39565bff5..3899353da8 100644
--- a/tools/ocaml/xenstored/history.ml
+++ b/tools/ocaml/xenstored/history.ml
@@ -53,6 +53,10 @@ let end_transaction txn con tid commit =
 	trim ~txn ();
 	success
 
+let reconnect con =
+	trim ();
+	Connection.do_reconnect con
+
 let push (x: history_record) =
 	let dom = x.con.Connection.dom in
 	match dom with
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index dd50456ad5..da8e9cdb26 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -705,7 +705,7 @@ let do_input store cons doms con =
 			Connection.do_input con
 		with Xenbus.Xb.Reconnect ->
 			info "%s requests a reconnect" (Connection.get_domstr con);
-			Connection.reconnect con;
+			History.reconnect con;
 			info "%s reconnection complete" (Connection.get_domstr con);
 			false
 		| Failure exp ->
@@ -744,7 +744,7 @@ let do_output _store _cons _doms con =
 			ignore (Connection.do_output con)
 		with Xenbus.Xb.Reconnect ->
 			info "%s requests a reconnect" (Connection.get_domstr con);
-			Connection.reconnect con;
+			History.reconnect con;
 			info "%s reconnection complete" (Connection.get_domstr con)
 	)
 
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v2 2/2] tools/ocaml/libs/xc: backward compatible domid control at domain creation time
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (15 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

One can specify the domid to use when creating the domain, but this was hardcoded to 0.

Keep the existing `domain_create` function, and make domid an optional argument.
When not specified default to 0.

A new version of xenopsd can choose to start using this, while old versions of xenopsd will keep
building and using the old API.
(The ABI will change, but that changes every time a function is introduced/removed or modified)

Controlling the domid can be useful during testing or migration.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changed since V1:
* introduced an optional ?domid for better backwards compatibility
* use CAMLparam3 because we have an additional parameter
---
 tools/ocaml/libs/xc/xenctrl.ml      | 5 ++++-
 tools/ocaml/libs/xc/xenctrl.mli     | 4 ++--
 tools/ocaml/libs/xc/xenctrl_stubs.c | 6 +++---
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index e878699b0a..e0a47c4769 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -179,9 +179,12 @@ let with_intf f =
 		handle := Some h;
 		f h
 
-external domain_create: handle -> domctl_create_config -> domid
+external domain_create_stub: handle -> domid -> domctl_create_config -> domid
        = "stub_xc_domain_create"
 
+let domain_create handle ?(domid=0) config =
+	domain_create_stub handle domid config
+
 external domain_sethandle: handle -> domid -> string -> unit
        = "stub_xc_domain_sethandle"
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index e64907df8e..84311fa33d 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -143,8 +143,8 @@ val get_handle: unit -> handle option
  * would invalidate the handle that with_intf passes to its argument. *)
 val close_handle: unit -> unit
 
-external domain_create : handle -> domctl_create_config -> domid
-  = "stub_xc_domain_create"
+val domain_create: handle -> ?domid:int -> domctl_create_config -> domid
+
 external domain_sethandle : handle -> domid -> string -> unit = "stub_xc_domain_sethandle"
 external domain_max_vcpus : handle -> domid -> int -> unit
   = "stub_xc_domain_max_vcpus"
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 94aba38a42..9a8dbe5579 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -175,9 +175,9 @@ static unsigned int ocaml_list_to_c_bitmap(value l)
 	return val;
 }
 
-CAMLprim value stub_xc_domain_create(value xch, value config)
+CAMLprim value stub_xc_domain_create(value xch, value wanted_domid, value config)
 {
-	CAMLparam2(xch, config);
+	CAMLparam3(xch, wanted_domid, config);
 	CAMLlocal2(l, arch_domconfig);
 
 	/* Mnemonics for the named fields inside domctl_create_config */
@@ -191,7 +191,7 @@ CAMLprim value stub_xc_domain_create(value xch, value config)
 #define VAL_MAX_MAPTRACK_FRAMES Field(config, 7)
 #define VAL_ARCH                Field(config, 8)
 
-	uint32_t domid = 0;
+	uint32_t domid = Int_val(wanted_domid);
 	int result;
 	struct xen_domctl_createdomain cfg = {
 		.ssidref = Int32_val(VAL_SSIDREF),
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 0/4] tools/ocaml/xenstored: optimizations
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Various speed optimizations that have already been posted,
but committing them was delayed to avoid conflicts with XSAs.
The XSAs are out, so these are ready to go now.

The switch to Maps may expose bugs in certain xenstored clients,
which previously relied on iteration order of the DIRECTORY response.

In our testing we found one such case, which turned out to be a bug
in a testsuite (it always dropped the 1st xenstore key).

For convenience here is a tree with all patch series applied:
https://github.com/edwintorok/xen/pull/1

Edwin Török (4):
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  tools/ocaml/xenstored: backport find_opt/update from 4.06
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/xenstored/connection.ml  |  3 --
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/history.ml     | 14 ------
 tools/ocaml/xenstored/stdext.ml      | 19 ++++++++
 tools/ocaml/xenstored/store.ml       | 51 +++++++++-----------
 tools/ocaml/xenstored/symbol.ml      | 70 +++++++---------------------
 tools/ocaml/xenstored/symbol.mli     | 22 +++------
 tools/ocaml/xenstored/trie.ml        | 61 +++++++++++-------------
 tools/ocaml/xenstored/trie.mli       | 26 +++++------
 tools/ocaml/xenstored/xenstored.ml   | 16 +------
 10 files changed, 109 insertions(+), 175 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (16 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

The code here is attempting to reduce memory usage by sharing common
substrings in the tree: it replaces strings with ints, and keeps a
string->int map that gets manually garbage collected using a hand-rolled
mark and sweep algorithm.

This is unnecessary: OCaml already has a mark-and-sweep Garbage
Collector runtime, and sharing of common strings in tree nodes
can be achieved through Weak references: if the string hasn't been seen
yet it gets added to the Weak reference table, and if it has we use the
entry from the table instead, thus storing a string only once.
When the string is no longer referenced OCaml's GC will drop it from the
weak table: there is no need to manually do a mark-and-sweep, or to tell
OCaml when to drop it.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>

---
Changed since V3:
* repost after XSA to avoid conflicts
---
 tools/ocaml/xenstored/connection.ml |  3 --
 tools/ocaml/xenstored/history.ml    | 14 ------
 tools/ocaml/xenstored/store.ml      | 11 ++---
 tools/ocaml/xenstored/symbol.ml     | 68 ++++++-----------------------
 tools/ocaml/xenstored/symbol.mli    | 21 ++-------
 tools/ocaml/xenstored/xenstored.ml  | 16 +------
 6 files changed, 24 insertions(+), 109 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 1cf24beafd..51041dde8e 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -334,9 +334,6 @@ let has_more_work con =
 
 let incr_ops con = con.stat_nb_ops <- con.stat_nb_ops + 1
 
-let mark_symbols con =
-	Hashtbl.iter (fun _ t -> Store.mark_symbols (Transaction.get_store t)) con.transactions
-
 let stats con =
 	Hashtbl.length con.watches, con.stat_nb_ops
 
diff --git a/tools/ocaml/xenstored/history.ml b/tools/ocaml/xenstored/history.ml
index 3899353da8..ba5c9cb571 100644
--- a/tools/ocaml/xenstored/history.ml
+++ b/tools/ocaml/xenstored/history.ml
@@ -22,20 +22,6 @@ type history_record = {
 
 let history : history_record list ref = ref []
 
-(* Called from periodic_ops to ensure we don't discard symbols that are still needed. *)
-(* There is scope for optimisation here, since in consecutive commits one commit's `after`
- * is the same thing as the next commit's `before`, but not all commits in history are
- * consecutive. *)
-let mark_symbols () =
-	(* There are gaps where dom0's commits are missing. Otherwise we could assume that
-	 * each element's `before` is the same thing as the next element's `after`
-	 * since the next element is the previous commit *)
-	List.iter (fun hist_rec ->
-			Store.mark_symbols hist_rec.before;
-			Store.mark_symbols hist_rec.after;
-		)
-		!history
-
 (* Keep only enough commit-history to protect the running transactions that we are still tracking *)
 (* There is scope for optimisation here, replacing List.filter with something more efficient,
  * probably on a different list-like structure. *)
diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index a3be2e6bbe..9c226e4ef7 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -46,18 +46,18 @@ let add_child node child =
 
 let exists node childname =
 	let childname = Symbol.of_string childname in
-	List.exists (fun n -> n.name = childname) node.children
+	List.exists (fun n -> Symbol.equal n.name childname) node.children
 
 let find node childname =
 	let childname = Symbol.of_string childname in
-	List.find (fun n -> n.name = childname) node.children
+	List.find (fun n -> Symbol.equal n.name childname) node.children
 
 let replace_child node child nchild =
 	(* this is the on-steroid version of the filter one-replace one *)
 	let rec replace_one_in_list l =
 		match l with
 		| []                               -> []
-		| h :: tl when h.name = child.name -> nchild :: tl
+		| h :: tl when Symbol.equal h.name child.name -> nchild :: tl
 		| h :: tl                          -> h :: replace_one_in_list tl
 		in
 	{ node with children = (replace_one_in_list node.children) }
@@ -67,7 +67,7 @@ let del_childname node childname =
 	let rec delete_one_in_list l =
 		match l with
 		| []                        -> raise Not_found
-		| h :: tl when h.name = sym -> tl
+		| h :: tl when Symbol.equal h.name sym -> tl
 		| h :: tl                   -> h :: delete_one_in_list tl
 		in
 	{ node with children = (delete_one_in_list node.children) }
@@ -489,9 +489,6 @@ let copy store = {
 	quota = Quota.copy store.quota;
 }
 
-let mark_symbols store =
-	Node.recurse (fun node -> Symbol.mark_as_used node.Node.name) store.root
-
 let incr_transaction_coalesce store =
 	store.stat_transaction_coalesce <- store.stat_transaction_coalesce + 1
 let incr_transaction_abort store =
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 4420c6a4d7..2b41d120f6 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -14,63 +14,23 @@
  * GNU Lesser General Public License for more details.
  *)
 
-type t = int
+module WeakTable = Weak.Make(struct
+    type t = string
+    let equal (x:string) (y:string) = (x = y)
+    let hash = Hashtbl.hash
+end)
 
-type 'a record = { data: 'a; mutable garbage: bool }
-let int_string_tbl : (int,string record) Hashtbl.t = Hashtbl.create 1024
-let string_int_tbl : (string,int) Hashtbl.t = Hashtbl.create 1024
+type t = string
 
-let created_counter = ref 0
-let used_counter = ref 0
+let tbl = WeakTable.create 1024
 
-let count = ref 0
-let rec fresh () =
-	if Hashtbl.mem int_string_tbl !count
-	then begin
-		incr count;
-		fresh ()
-	end else
-		!count
+let of_string s = WeakTable.merge tbl s
+let to_string s = s
 
-let new_record v = { data=v; garbage=false }
-
-let of_string name =
-	if Hashtbl.mem string_int_tbl name
-	then begin
-		incr used_counter;
-		Hashtbl.find string_int_tbl name
-	end else begin
-		let i = fresh () in
-		incr created_counter;
-		Hashtbl.add string_int_tbl name i;
-		Hashtbl.add int_string_tbl i (new_record name);
-		i
-	end
-
-let to_string i =
-	(Hashtbl.find int_string_tbl i).data
-
-let mark_all_as_unused () =
-	Hashtbl.iter (fun _ v -> v.garbage <- true) int_string_tbl
-
-let mark_as_used symb =
-	let record1 = Hashtbl.find int_string_tbl symb in
-		record1.garbage <- false
-
-let garbage () =
-	let records = Hashtbl.fold (fun symb record accu ->
-		if record.garbage then (symb, record.data) :: accu else accu
-	) int_string_tbl [] in
-	let remove (int,string) =
-		Hashtbl.remove int_string_tbl int;
-		Hashtbl.remove string_int_tbl string
-	in
-	created_counter := 0;
-	used_counter := 0;
-	List.iter remove records
+let equal a b =
+  (* compare using physical equality, both members have to be part of the above weak table *)
+  a == b
 
 let stats () =
-	Hashtbl.length string_int_tbl
-
-let created () = !created_counter
-let used () = !used_counter
+  let len, entries, _, _, _, _ = WeakTable.stats tbl in
+  len, entries
diff --git a/tools/ocaml/xenstored/symbol.mli b/tools/ocaml/xenstored/symbol.mli
index c3c9f6e2f8..586ab57507 100644
--- a/tools/ocaml/xenstored/symbol.mli
+++ b/tools/ocaml/xenstored/symbol.mli
@@ -29,24 +29,11 @@ val of_string : string -> t
 val to_string : t -> string
 (** Convert a symbol into a string. *)
 
-(** {6 Garbage Collection} *)
-
-(** Symbols need to be regulary garbage collected. The following steps should be followed:
--     mark all the knowns symbols as unused (with [mark_all_as_unused]);
--     mark all the symbols really usefull as used (with [mark_as_used]); and
--     finally, call [garbage] *)
-
-val mark_all_as_unused : unit -> unit
-val mark_as_used : t -> unit
-val garbage : unit -> unit
+val equal: t -> t -> bool
+(** Compare two symbols for equality *)
 
 (** {6 Statistics } *)
 
-val stats : unit -> int
-(** Get the number of used symbols. *)
+val stats : unit -> int * int
+(** Get the table size and number of entries. *)
 
-val created : unit -> int
-(** Returns the number of symbols created since the last GC. *)
-
-val used : unit -> int
-(** Returns the number of existing symbols used since the last GC *)
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 5893af2caa..885b397d71 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -431,18 +431,6 @@ let _ =
 
 	let periodic_ops now =
 		debug "periodic_ops starting";
-		(* we garbage collect the string->int dictionary after a sizeable amount of operations,
-		 * there's no need to be really fast even if we got loose
-		 * objects since names are often reuse.
-		 *)
-		if Symbol.created () > 1000 || Symbol.used () > 20000
-		then begin
-			Symbol.mark_all_as_unused ();
-			Store.mark_symbols store;
-			Connections.iter cons Connection.mark_symbols;
-			History.mark_symbols ();
-			Symbol.garbage ()
-		end;
 
 		(* scan all the xs rings as a safenet for ill-behaved clients *)
 		if !ring_scan_interval >= 0 && now > (!last_scan_time +. float !ring_scan_interval) then
@@ -460,11 +448,11 @@ let _ =
 			let (lanon, lanon_ops, lanon_watchs,
 			     ldom, ldom_ops, ldom_watchs) = Connections.stats cons in
 			let store_nodes, store_abort, store_coalesce = Store.stats store in
-			let symtbl_len = Symbol.stats () in
+			let symtbl_len, symtbl_entries = Symbol.stats () in
 
 			info "store stat: nodes(%d) t-abort(%d) t-coalesce(%d)"
 			     store_nodes store_abort store_coalesce;
-			info "sytbl stat: %d" symtbl_len;
+			info "sytbl stat: length(%d) entries(%d)" symtbl_len symtbl_entries;
 			info "  con stat: anonymous(%d, %d o, %d w) domains(%d, %d o, %d w)"
 			     lanon lanon_ops lanon_watchs ldom ldom_ops ldom_watchs;
 			info "  mem stat: minor(%.0f) promoted(%.0f) major(%.0f) heap(%d w, %d c) live(%d w, %d b) free(%d w, %d b)"
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 2/4] tools/ocaml/xenstored: backport find_opt/update from 4.06
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (17 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

We are currently on OCaml 4.02 as minimum version.
To make the followup optimizations compile backport these functions from
OCaml 4.06.

This implementation is less efficient than the one in the 4.06 standard
library which has access to the internals of the Map.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
Changed since V3:
* repost after XSA to avoid conflicts
---
 tools/ocaml/xenstored/stdext.ml | 19 +++++++++++++++++++
 tools/ocaml/xenstored/trie.ml   |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/tools/ocaml/xenstored/stdext.ml b/tools/ocaml/xenstored/stdext.ml
index e1567c4dfa..0640602449 100644
--- a/tools/ocaml/xenstored/stdext.ml
+++ b/tools/ocaml/xenstored/stdext.ml
@@ -50,6 +50,25 @@ module Filename = struct
 		cmd :: args |> List.map quote |> String.concat " "
 end
 
+module Map = struct
+module Make(Ord: Map.OrderedType) = struct
+
+include Map.Make(Ord)
+
+let find_opt k t = try Some (find k t) with Not_found -> None
+
+let update k f t =
+  let r = find_opt k t in
+  let r' = f r in
+  match r, r' with
+  | None, None -> t
+  | Some _, None -> remove k t
+  | Some r, Some r' when r == r' -> t
+  | _, Some r' -> add k r' t
+
+end
+end
+
 module String = struct include String
 
 let of_char c = String.make 1 c
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index dc42535092..f513f4e608 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -13,6 +13,8 @@
  * GNU Lesser General Public License for more details.
  *)
 
+open Stdext
+
 module Node =
 struct
 	type ('a,'b) t =  {
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (18 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

This changes the output of xenstore-ls to be sorted.
Previously the keys were listed in the order in which they were inserted
in.
docs/misc/xenstore.txt doesn't specify in what order keys are listed.

Map.update is used to retain semantics with replace_child:
only an existing child is replaced, if it wasn't part of the original
map we don't add it.
Similarly exception behaviour is retained for del_childname and related
functions.

Entries are stored in reverse sort order, so that upon Map.fold the
constructed list is sorted in ascending order and there is no need for a
List.rev.

This changes the semantics and is not suitable as is for a backport.
It reveals bugs in buggy clients that depend on xenstore entry order,
however those clients should be fixed.
(We found one such bug in our internal testsuite where the first
 xenstore entry from a subtree was always dropped, and changing the
 listing order changed what key got dropped making the test fail)

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changed since V3:
* repost after XSA to avoid conflicts
---
 tools/ocaml/xenstored/store.ml   | 48 +++++++++++++++-----------------
 tools/ocaml/xenstored/symbol.ml  |  4 +++
 tools/ocaml/xenstored/symbol.mli |  3 ++
 3 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 9c226e4ef7..5f155f45eb 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -16,17 +16,19 @@
  *)
 open Stdext
 
+module SymbolMap = Map.Make(Symbol)
+
 module Node = struct
 
 type t = {
 	name: Symbol.t;
 	perms: Perms.Node.t;
 	value: string;
-	children: t list;
+	children: t SymbolMap.t;
 }
 
 let create _name _perms _value =
-	{ name = Symbol.of_string _name; perms = _perms; value = _value; children = []; }
+	{ name = Symbol.of_string _name; perms = _perms; value = _value; children = SymbolMap.empty; }
 
 let get_owner node = Perms.Node.get_owner node.perms
 let get_children node = node.children
@@ -42,38 +44,34 @@ let set_value node nvalue =
 let set_perms node nperms = { node with perms = nperms }
 
 let add_child node child =
-	{ node with children = child :: node.children }
+	let children = SymbolMap.add child.name child node.children in
+	{ node with children }
 
 let exists node childname =
 	let childname = Symbol.of_string childname in
-	List.exists (fun n -> Symbol.equal n.name childname) node.children
+	SymbolMap.mem childname node.children
 
 let find node childname =
 	let childname = Symbol.of_string childname in
-	List.find (fun n -> Symbol.equal n.name childname) node.children
+	SymbolMap.find childname node.children
 
 let replace_child node child nchild =
-	(* this is the on-steroid version of the filter one-replace one *)
-	let rec replace_one_in_list l =
-		match l with
-		| []                               -> []
-		| h :: tl when Symbol.equal h.name child.name -> nchild :: tl
-		| h :: tl                          -> h :: replace_one_in_list tl
-		in
-	{ node with children = (replace_one_in_list node.children) }
+	{ node with
+	  children = SymbolMap.update child.name
+		     (function None -> None | Some _ -> Some nchild)
+		     node.children
+	}
 
 let del_childname node childname =
 	let sym = Symbol.of_string childname in
-	let rec delete_one_in_list l =
-		match l with
-		| []                        -> raise Not_found
-		| h :: tl when Symbol.equal h.name sym -> tl
-		| h :: tl                   -> h :: delete_one_in_list tl
-		in
-	{ node with children = (delete_one_in_list node.children) }
+	{ node with children =
+		SymbolMap.update sym
+		  (function None -> raise Not_found | Some _ -> None)
+		  node.children
+	}
 
 let del_all_children node =
-	{ node with children = [] }
+	{ node with children = SymbolMap.empty }
 
 (* check if the current node can be accessed by the current connection with rperm permissions *)
 let check_perm node connection request =
@@ -87,12 +85,12 @@ let check_owner node connection =
 		raise Define.Permission_denied;
 	end
 
-let rec recurse fct node = fct node; List.iter (recurse fct) node.children
+let rec recurse fct node = fct node; SymbolMap.iter (fun _ -> recurse fct) node.children
 
 (** [recurse_map f tree] applies [f] on each node in the tree recursively *)
 let recurse_map f =
 	let rec walk node =
-		f { node with children = List.rev_map walk node.children |> List.rev }
+		  f { node with children = SymbolMap.map walk node.children }
 	in
 	walk
 
@@ -336,7 +334,7 @@ let ls store perm path =
 				Node.check_perm cnode perm Perms.READ;
 				cnode.Node.children in
 			Path.apply store.root path do_ls in
-	List.rev (List.map (fun n -> Symbol.to_string n.Node.name) children)
+	SymbolMap.fold (fun k _ accu -> Symbol.to_string k :: accu) children []
 
 let getperms store perm path =
 	if path = [] then (
@@ -366,7 +364,7 @@ let traversal root_node f =
 	let rec _traversal path node =
 		f path node;
 		let node_path = Path.of_path_and_name path (Symbol.to_string node.Node.name) in
-		List.iter (_traversal node_path) (List.rev node.Node.children)
+		SymbolMap.iter (fun _ -> _traversal node_path) node.Node.children
 		in
 	_traversal [] root_node
 
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 2b41d120f6..301639f16f 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,6 +31,10 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the above weak table *)
   a == b
 
+let compare a b =
+  if equal a b then 0
+  else -(String.compare a b)
+
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
   len, entries
diff --git a/tools/ocaml/xenstored/symbol.mli b/tools/ocaml/xenstored/symbol.mli
index 586ab57507..dd0f014796 100644
--- a/tools/ocaml/xenstored/symbol.mli
+++ b/tools/ocaml/xenstored/symbol.mli
@@ -32,6 +32,9 @@ val to_string : t -> string
 val equal: t -> t -> bool
 (** Compare two symbols for equality *)
 
+val compare: t -> t -> int
+(** Compare two symbols *)
+
 (** {6 Statistics } *)
 
 val stats : unit -> int * int
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 4/4] tools/ocaml/xenstored: use more efficient tries
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (19 preceding siblings ...)
  (?)
@ 2021-01-15 22:28 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

No functional change, just an optimization.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
---
Changed since V3:
* repost after XSA to avoid conflicts
---
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/symbol.ml      |  6 +--
 tools/ocaml/xenstored/trie.ml        | 59 ++++++++++++----------------
 tools/ocaml/xenstored/trie.mli       | 26 ++++++------
 4 files changed, 43 insertions(+), 50 deletions(-)

diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
index 82988f7e8d..8a66eeec3a 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -21,7 +21,7 @@ type t = {
 	anonymous: (Unix.file_descr, Connection.t) Hashtbl.t;
 	domains: (int, Connection.t) Hashtbl.t;
 	ports: (Xeneventchn.t, Connection.t) Hashtbl.t;
-	mutable watches: (string, Connection.watch list) Trie.t;
+	mutable watches: Connection.watch list Trie.t;
 }
 
 let create () = {
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 301639f16f..72a84ebf80 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,9 +31,9 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the above weak table *)
   a == b
 
-let compare a b =
-  if equal a b then 0
-  else -(String.compare a b)
+(* the sort order is reversed here, so that Map.fold constructs a list
+   in ascending order *)
+let compare a b = String.compare b a
 
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index f513f4e608..ad2aed5123 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -15,24 +15,26 @@
 
 open Stdext
 
+module StringMap = Map.Make(String)
+
 module Node =
 struct
-	type ('a,'b) t =  {
-		key: 'a;
-		value: 'b option;
-		children: ('a,'b) t list;
+	type 'a t =  {
+		key: string;
+		value: 'a option;
+		children: 'a t StringMap.t;
 	}
 
 	let _create key value = {
 		key = key;
 		value = Some value;
-		children = [];
+		children = StringMap.empty;
 	}
 
 	let empty key = {
 		key = key;
 		value = None;
-		children = []
+		children = StringMap.empty;
 	}
 
 	let _get_key node = node.key
@@ -49,41 +51,31 @@ struct
 		{ node with children = children }
 
 	let _add_child node child =
-		{ node with children = child :: node.children }
+		{ node with children = StringMap.add child.key child node.children }
 end
 
-type ('a,'b) t = ('a,'b) Node.t list
+type 'a t = 'a Node.t StringMap.t
 
 let mem_node nodes key =
-	List.exists (fun n -> n.Node.key = key) nodes
+	StringMap.mem key nodes
 
 let find_node nodes key =
-	List.find (fun n -> n.Node.key = key) nodes
+	StringMap.find key nodes
 
 let replace_node nodes key node =
-	let rec aux = function
-		| []                            -> []
-		| h :: tl when h.Node.key = key -> node :: tl
-		| h :: tl                       -> h :: aux tl
-	in
-	aux nodes
+	StringMap.update key (function None -> None | Some _ -> Some node) nodes
 
 let remove_node nodes key =
-	let rec aux = function
-		| []                            -> raise Not_found
-		| h :: tl when h.Node.key = key -> tl
-		| h :: tl                       -> h :: aux tl
-	in
-	aux nodes
+	StringMap.update key (function None -> raise Not_found | Some _ -> None) nodes
 
-let create () = []
+let create () = StringMap.empty
 
 let rec iter f tree =
-	let aux node =
-		f node.Node.key node.Node.value;
+	let aux key node =
+		f key node.Node.value;
 		iter f node.Node.children
 	in
-	List.iter aux tree
+	StringMap.iter aux tree
 
 let rec map f tree =
 	let aux node =
@@ -94,13 +86,14 @@ let rec map f tree =
 		in
 		{ node with Node.value = value; Node.children = map f node.Node.children }
 	in
-	List.filter (fun n -> n.Node.value <> None || n.Node.children <> []) (List.map aux tree)
+	tree |> StringMap.map aux
+	|> StringMap.filter (fun _ n -> n.Node.value <> None || not (StringMap.is_empty n.Node.children) )
 
 let rec fold f tree acc =
-	let aux accu node =
-		fold f node.Node.children (f node.Node.key node.Node.value accu)
+	let aux key node accu =
+		fold f node.Node.children (f key node.Node.value accu)
 	in
-	List.fold_left aux acc tree
+	StringMap.fold aux tree acc
 
 (* return a sub-trie *)
 let rec sub_node tree = function
@@ -117,7 +110,7 @@ let rec sub_node tree = function
 
 let sub tree path =
 	try (sub_node tree path).Node.children
-	with Not_found -> []
+	with Not_found -> StringMap.empty
 
 let find tree path =
 	Node.get_value (sub_node tree path)
@@ -161,7 +154,7 @@ and set tree path value =
 				  replace_node tree h (set_node node t value)
 			  end else begin
 				  let node = Node.empty h in
-				  set_node node t value :: tree
+				  StringMap.add node.Node.key (set_node node t value) tree
 			  end
 
 let rec unset tree = function
@@ -176,7 +169,7 @@ let rec unset tree = function
 				  then Node.set_children (Node.empty h) children
 				  else Node.set_children node children
 			  in
-			  if children = [] && new_node.Node.value = None
+			  if StringMap.is_empty children && new_node.Node.value = None
 			  then remove_node tree h
 			  else replace_node tree h new_node
 		  end else
diff --git a/tools/ocaml/xenstored/trie.mli b/tools/ocaml/xenstored/trie.mli
index 5dc53c1cb1..27785154f5 100644
--- a/tools/ocaml/xenstored/trie.mli
+++ b/tools/ocaml/xenstored/trie.mli
@@ -15,46 +15,46 @@
 
 (** Basic Implementation of polymorphic tries (ie. prefix trees) *)
 
-type ('a, 'b) t
-(** The type of tries. ['a list] is the type of keys, ['b] the type of values.
+type 'a t
+(** The type of tries. ['a] the type of values.
 	Internally, a trie is represented as a labeled tree, where node contains values
-	of type ['a * 'b option]. *)
+	of type [string * 'a option]. *)
 
-val create : unit -> ('a,'b) t
+val create : unit -> 'a t
 (** Creates an empty trie. *)
 
-val mem : ('a,'b) t -> 'a list -> bool
+val mem : 'a t -> string list -> bool
 (** [mem t k] returns true if a value is associated with the key [k] in the trie [t].
 	Otherwise, it returns false. *)
 
-val find : ('a, 'b) t -> 'a list -> 'b
+val find : 'a t -> string list -> 'a
 (** [find t k] returns the value associated with the key [k] in the trie [t].
 	Returns [Not_found] if no values are associated with [k] in [t]. *)
 
-val set : ('a, 'b) t -> 'a list -> 'b -> ('a, 'b) t
+val set : 'a t -> string list -> 'a -> 'a t
 (** [set t k v] associates the value [v] with the key [k] in the trie [t]. *)
 
-val unset : ('a, 'b) t -> 'a list -> ('a, 'b) t
+val unset : 'a t -> string list -> 'a t
 (** [unset k v] removes the association of value [v] with the key [k] in the trie [t].
 	Moreover, it automatically clean the trie, ie. it removes recursively
 	every nodes of [t] containing no values and having no chil. *)
 
-val iter : ('a -> 'b option -> unit) -> ('a, 'b) t -> unit
+val iter : (string -> 'a option -> unit) -> 'a t -> unit
 (** [iter f t] applies the function [f] to every node of the trie [t].
 	As nodes of the trie [t] do not necessary contains a value, the second argument of
 	[f] is an option type. *)
 
-val iter_path : ('a -> 'b option -> unit) -> ('a, 'b) t -> 'a list -> unit
+val iter_path : (string -> 'a option -> unit) -> 'a t -> string list -> unit
 (** [iter_path f t p] iterates [f] over nodes associated with the path [p] in the trie [t].
 	If [p] is not a valid path of [t], it iterates on the longest valid prefix of [p]. *)
 
-val fold : ('a -> 'b option -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c
+val fold : (string -> 'a option -> 'c -> 'c) -> 'a t -> 'c -> 'c
 (** [fold f t x] fold [f] over every nodes of [t], with [x] as initial value. *)
 
-val map : ('b -> 'c option) -> ('a,'b) t -> ('a,'c) t
+val map : ('a -> 'b option) -> 'a t -> 'b t
 (** [map f t] maps [f] over every values stored in [t]. The return value of [f] is of type 'c option
 	as one may wants to remove value associated to a key. This function is not tail-recursive. *)
 
-val sub : ('a, 'b) t -> 'a list -> ('a,'b) t
+val sub : 'a t -> string list -> 'a t
 (** [sub t p] returns the sub-trie associated with the path [p] in the trie [t].
 	If [p] is not a valid path of [t], it returns an empty trie. *)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 0/5] tools/ocaml/xenstored: structured fuzz testing
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Test that live update functionality works correctly by writing a
Quickcheck-style property test that asserts that oxenstored state
is the same whether the live-update command was run or not.

Crowbar is used for supplying random numbers to quickcheck,
which optionally works with AFL as a driver, but can also be used
without external dependencies.

This is also called structured fuzz testing, because we don't fuzz
the xenstore commands or state directly, but generate some valid looking
trees and commands based on fuzzing the parameters of these commands.

Inspired by the 'qcstm' package and presentation at ICFP 2020,
but using 'crowbar' instead, due to its integration with AFL.

This is a work in progress, transaction live updates do not work yet,
and should be split out from this series.

For convenience here is a series with all patches applied,
review welcome either there or on the mailing list:
https://github.com/edwintorok/xen/pull/1

This series hasn't been reviewed yet, but has been in development during
the XSA series.
It (and myself) kept finding new XSAs and bugs to fix, which left the test itself in
an unfinished state, it is possible to improve it to find more bugs.

Edwin Török (5):
  tools/ocaml: add unit test skeleton with Dune build system
  tools/ocaml/xenstored: implement the live migration binary format
  tools/ocaml/xenstored: add binary dump format support
  tools/ocaml/xenstored: add support for binary format
  Add structured fuzzing unit test

 tools/ocaml/.gitignore                        |   2 +
 tools/ocaml/Makefile                          |  52 ++
 tools/ocaml/dune-project                      |   5 +
 tools/ocaml/libs/eventchn/dune                |   8 +
 tools/ocaml/libs/mmap/dune                    |   8 +
 tools/ocaml/libs/xb/dune                      |   7 +
 tools/ocaml/libs/xc/dune                      |   9 +
 tools/ocaml/libs/xs/dune                      |   4 +
 tools/ocaml/xen.opam                          |   0
 tools/ocaml/xenstore.opam                     |   0
 tools/ocaml/xenstored.opam                    |  18 +
 tools/ocaml/xenstored/Makefile                |   3 +-
 tools/ocaml/xenstored/connection.ml           |  63 +-
 tools/ocaml/xenstored/disk.ml                 | 318 ++++++++
 tools/ocaml/xenstored/dune                    |  19 +
 tools/ocaml/xenstored/parse_arg.ml            |   2 +-
 tools/ocaml/xenstored/perms.ml                |   2 +
 tools/ocaml/xenstored/poll.ml                 |   3 +-
 tools/ocaml/xenstored/process.ml              |  12 +-
 tools/ocaml/xenstored/store.ml                |   1 +
 tools/ocaml/xenstored/test/dune               |  11 +
 tools/ocaml/xenstored/test/generator.ml       | 189 +++++
 tools/ocaml/xenstored/test/model.ml           | 253 ++++++
 tools/ocaml/xenstored/test/old/arbitrary.ml   | 261 +++++++
 tools/ocaml/xenstored/test/old/gen_paths.ml   |  66 ++
 .../xenstored/test/old/xenstored_test.ml      | 527 +++++++++++++
 tools/ocaml/xenstored/test/pathtree.ml        |  40 +
 tools/ocaml/xenstored/test/testable.ml        | 364 +++++++++
 tools/ocaml/xenstored/test/types.ml           | 427 ++++++++++
 tools/ocaml/xenstored/test/xenctrl.ml         |  48 ++
 tools/ocaml/xenstored/test/xeneventchn.ml     |  50 ++
 tools/ocaml/xenstored/test/xenstored_test.ml  | 147 ++++
 tools/ocaml/xenstored/test/xs_protocol.ml     | 733 ++++++++++++++++++
 tools/ocaml/xenstored/transaction.ml          | 119 ++-
 tools/ocaml/xenstored/xenstored.ml            | 205 ++++-
 35 files changed, 3918 insertions(+), 58 deletions(-)
 create mode 100644 tools/ocaml/.gitignore
 create mode 100644 tools/ocaml/dune-project
 create mode 100644 tools/ocaml/libs/eventchn/dune
 create mode 100644 tools/ocaml/libs/mmap/dune
 create mode 100644 tools/ocaml/libs/xb/dune
 create mode 100644 tools/ocaml/libs/xc/dune
 create mode 100644 tools/ocaml/libs/xs/dune
 create mode 100644 tools/ocaml/xen.opam
 create mode 100644 tools/ocaml/xenstore.opam
 create mode 100644 tools/ocaml/xenstored.opam
 create mode 100644 tools/ocaml/xenstored/dune
 create mode 100644 tools/ocaml/xenstored/test/dune
 create mode 100644 tools/ocaml/xenstored/test/generator.ml
 create mode 100644 tools/ocaml/xenstored/test/model.ml
 create mode 100644 tools/ocaml/xenstored/test/old/arbitrary.ml
 create mode 100644 tools/ocaml/xenstored/test/old/gen_paths.ml
 create mode 100644 tools/ocaml/xenstored/test/old/xenstored_test.ml
 create mode 100644 tools/ocaml/xenstored/test/pathtree.ml
 create mode 100644 tools/ocaml/xenstored/test/testable.ml
 create mode 100644 tools/ocaml/xenstored/test/types.ml
 create mode 100644 tools/ocaml/xenstored/test/xenctrl.ml
 create mode 100644 tools/ocaml/xenstored/test/xeneventchn.ml
 create mode 100644 tools/ocaml/xenstored/test/xenstored_test.ml
 create mode 100644 tools/ocaml/xenstored/test/xs_protocol.ml

-- 
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v1 1/5] tools/ocaml: add unit test skeleton with Dune build system
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (20 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Based on initial work by Christian Lindig

Doing oxenstored development, especially fuzzing/unit tests requires
an incremental and fast build system.

Dune is the preferred upstream build system for OCaml, and has been in
use by the XAPI project for years.
Is is incremental and also generates editor integration files (.merlin).

Usage:
./xs-reconfigure.sh
cd tools/ocaml
make clean
make check

There are some other convenience targets as well:
make dune-clean
make dune-syntax-check
make dune-build-oxenstored

There are some files that are generated by Make, these are created
by a 'dune-pre' target, they are too closely tied to make and
cannot yet be generated by Dune itself.

The various Makefile targets are used as entrypoints into Dune
that set the needed env vars (for C include files and libraries)
and ensure that the generated files are available.

The unit tests do not require Xen to be available, so add mock
eventchn and xenctrl libraries for the unit test to use,
and copy the non-system specific modules from xenstored/ to
xenstored/test/.

Xenstored had to be split into Xenstored and Xenstored_main,
so that we can use the functions defined in Xenstored without
actually starting up the daemon in a unit test.
Similarly argument parsing had to be delayed until after daemon startup.

Also had to disable setrlimit when running as non-root in poll.ml.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/.gitignore                       |  2 +
 tools/ocaml/Makefile                         | 33 +++++++++++++
 tools/ocaml/dune-project                     |  5 ++
 tools/ocaml/libs/eventchn/dune               |  8 ++++
 tools/ocaml/libs/mmap/dune                   |  8 ++++
 tools/ocaml/libs/xb/dune                     |  7 +++
 tools/ocaml/libs/xc/dune                     |  9 ++++
 tools/ocaml/libs/xs/dune                     |  4 ++
 tools/ocaml/xen.opam                         |  0
 tools/ocaml/xenstore.opam                    |  0
 tools/ocaml/xenstored.opam                   | 18 +++++++
 tools/ocaml/xenstored/Makefile               |  3 +-
 tools/ocaml/xenstored/dune                   | 19 ++++++++
 tools/ocaml/xenstored/parse_arg.ml           |  2 +-
 tools/ocaml/xenstored/poll.ml                |  3 +-
 tools/ocaml/xenstored/test/dune              | 11 +++++
 tools/ocaml/xenstored/test/xenctrl.ml        | 48 +++++++++++++++++++
 tools/ocaml/xenstored/test/xeneventchn.ml    | 50 ++++++++++++++++++++
 tools/ocaml/xenstored/test/xenstored_test.ml |  2 +
 tools/ocaml/xenstored/xenstored.ml           |  4 +-
 20 files changed, 231 insertions(+), 5 deletions(-)
 create mode 100644 tools/ocaml/.gitignore
 create mode 100644 tools/ocaml/dune-project
 create mode 100644 tools/ocaml/libs/eventchn/dune
 create mode 100644 tools/ocaml/libs/mmap/dune
 create mode 100644 tools/ocaml/libs/xb/dune
 create mode 100644 tools/ocaml/libs/xc/dune
 create mode 100644 tools/ocaml/libs/xs/dune
 create mode 100644 tools/ocaml/xen.opam
 create mode 100644 tools/ocaml/xenstore.opam
 create mode 100644 tools/ocaml/xenstored.opam
 create mode 100644 tools/ocaml/xenstored/dune
 create mode 100644 tools/ocaml/xenstored/test/dune
 create mode 100644 tools/ocaml/xenstored/test/xenctrl.ml
 create mode 100644 tools/ocaml/xenstored/test/xeneventchn.ml
 create mode 100644 tools/ocaml/xenstored/test/xenstored_test.ml

diff --git a/tools/ocaml/.gitignore b/tools/ocaml/.gitignore
new file mode 100644
index 0000000000..655e32b07c
--- /dev/null
+++ b/tools/ocaml/.gitignore
@@ -0,0 +1,2 @@
+_build
+.merlin
diff --git a/tools/ocaml/Makefile b/tools/ocaml/Makefile
index a7c04b6546..53dd0a0f0d 100644
--- a/tools/ocaml/Makefile
+++ b/tools/ocaml/Makefile
@@ -34,3 +34,36 @@ build-tools-oxenstored:
 	$(MAKE) -s -C libs/xb
 	$(MAKE) -s -C libs/xc
 	$(MAKE) -C xenstored
+
+LIBRARY_PATH=$(XEN_libxenctrl):$(XEN_libxenguest):$(XEN_libxentoollog):$(XEN_libxencall):$(XEN_libxenevtchn):$(XEN_libxenforeignmemory):$(XEN_libxengnttab):$(XEN_libxendevicemodel):$(XEN_libxentoolcore)
+C_INCLUDE_PATH=$(XEN_libxenctrl)/include:$(XEN_libxengnttab)/include:$(XEN_libxenevtchn)/include:$(XEN_libxentoollog)/include:$(XEN_INCLUDE)
+
+# Files generated by the Makefile
+# These cannot be generated from dune, because dune cannot refer to files
+# in the parent directory (so it couldn't copy/use Config.mk)
+.PHONY: dune-pre
+dune-pre:
+	$(MAKE) -s -C ../../ build-tools-public-headers
+	$(MAKE) -s -C libs/xs paths.ml
+	$(MAKE) -s -C libs/xc xenctrl_abi_check.h
+	$(MAKE) -s -C xenstored paths.ml _paths.h
+
+.PHONY: check
+check: dune-pre
+	# --force isn't necessary here if the test is deterministic
+	OCAMLRUNPARAM=b C_INCLUDE_PATH=$(C_INCLUDE_PATH) dune runtest --profile=release --no-buffer --force
+
+# Convenience targets for development
+
+.PHONY: dune-clean
+dune-clean:
+	$(MAKE) clean
+	dune clean
+
+.PHONY: dune-syntax-check
+dune-syntax-check: dune-pre
+	LIBRARY_PATH=$(LIBRARY_PATH) C_INCLUDE_PATH=$(C_INCLUDE_PATH) dune build --profile=release @check
+
+.PHONY: build-oxenstored-dune
+dune-build-oxenstored: dune-pre
+	LD_LIBRARY_PATH=$(LIBRARY_PATH) LIBRARY_PATH=$(LIBRARY_PATH) C_INCLUDE_PATH=$(C_INCLUDE_PATH) dune build --profile=release @all
diff --git a/tools/ocaml/dune-project b/tools/ocaml/dune-project
new file mode 100644
index 0000000000..b41cfae68b
--- /dev/null
+++ b/tools/ocaml/dune-project
@@ -0,0 +1,5 @@
+(lang dune 2.0)
+
+(name xen)
+
+(formatting disabled)
diff --git a/tools/ocaml/libs/eventchn/dune b/tools/ocaml/libs/eventchn/dune
new file mode 100644
index 0000000000..e08bc76fdf
--- /dev/null
+++ b/tools/ocaml/libs/eventchn/dune
@@ -0,0 +1,8 @@
+(library
+ (foreign_stubs
+  (language c)
+  (names xeneventchn_stubs))
+ (name xeneventchn)
+ (public_name xen.eventchn)
+ (libraries unix)
+ (c_library_flags -lxenevtchn))
diff --git a/tools/ocaml/libs/mmap/dune b/tools/ocaml/libs/mmap/dune
new file mode 100644
index 0000000000..a47de44e47
--- /dev/null
+++ b/tools/ocaml/libs/mmap/dune
@@ -0,0 +1,8 @@
+(library
+ (foreign_stubs
+  (language c)
+  (names xenmmap_stubs))
+ (name xenmmap)
+ (public_name xen.mmap)
+ (libraries unix)
+ (install_c_headers mmap_stubs))
diff --git a/tools/ocaml/libs/xb/dune b/tools/ocaml/libs/xb/dune
new file mode 100644
index 0000000000..feb30adc01
--- /dev/null
+++ b/tools/ocaml/libs/xb/dune
@@ -0,0 +1,7 @@
+(library
+ (foreign_stubs
+  (language c)
+  (names xenbus_stubs xs_ring_stubs))
+ (name xenbus)
+ (public_name xen.bus)
+ (libraries unix xenmmap))
diff --git a/tools/ocaml/libs/xc/dune b/tools/ocaml/libs/xc/dune
new file mode 100644
index 0000000000..fb75ee8ff7
--- /dev/null
+++ b/tools/ocaml/libs/xc/dune
@@ -0,0 +1,9 @@
+(library
+ (foreign_stubs
+  (language c)
+  (names xenctrl_stubs))
+ (name xenctrl)
+ (public_name xen.ctrl)
+ (libraries unix xenmmap)
+ (c_library_flags -lxenctrl -lxenguest -lxencall -lxenforeignmemory
+   -lxengnttab))
diff --git a/tools/ocaml/libs/xs/dune b/tools/ocaml/libs/xs/dune
new file mode 100644
index 0000000000..c79ea75775
--- /dev/null
+++ b/tools/ocaml/libs/xs/dune
@@ -0,0 +1,4 @@
+(library
+ (name xenstore)
+ (public_name xen.store)
+ (libraries unix xenbus))
diff --git a/tools/ocaml/xen.opam b/tools/ocaml/xen.opam
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tools/ocaml/xenstore.opam b/tools/ocaml/xenstore.opam
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tools/ocaml/xenstored.opam b/tools/ocaml/xenstored.opam
new file mode 100644
index 0000000000..31775a3033
--- /dev/null
+++ b/tools/ocaml/xenstored.opam
@@ -0,0 +1,18 @@
+opam-version: "2.0"
+synopsis: "In-memory key-value store for the Xen hypervisor"
+maintainer: "lindig@gmail.com"
+authors: "lindig@gmail.com"
+license: "LGPL"
+homepage: "https://github.com/lindig/xen-ocaml-tools"
+bug-reports: "https://github.com/lindig/xen-ocaml-tools/issues"
+depends: [
+  "ocaml"
+  "dune" {build}
+  "base-unix"
+  "crowbar" {with-test}
+  "qcheck-core" {with-test}
+  "qcstm" {with-test}
+]
+build: ["dune" "build" "-p" name "-j" jobs]
+depexts: ["m4" "libxen-dev" "libsystemd-dev"] {os-distribution = "debian"}
+dev-repo: "git+https://github.com/lindig/xen-ocaml-tools.git"
diff --git a/tools/ocaml/xenstored/Makefile b/tools/ocaml/xenstored/Makefile
index 89ec3ec76a..9d2da206d8 100644
--- a/tools/ocaml/xenstored/Makefile
+++ b/tools/ocaml/xenstored/Makefile
@@ -56,7 +56,8 @@ OBJS = paths \
 	history \
 	parse_arg \
 	process \
-	xenstored
+	xenstored \
+	xenstored_main
 
 INTF = symbol.cmi trie.cmi syslog.cmi systemd.cmi poll.cmi
 
diff --git a/tools/ocaml/xenstored/dune b/tools/ocaml/xenstored/dune
new file mode 100644
index 0000000000..e59eb22638
--- /dev/null
+++ b/tools/ocaml/xenstored/dune
@@ -0,0 +1,19 @@
+(executable
+ (modes byte exe)
+ (name xenstored)
+ (modules (:standard \ syslog systemd))
+ (public_name xenstored)
+ (package xenstored)
+ (flags (:standard -w -52))
+ (libraries unix xen.bus xen.mmap xen.ctrl xen.eventchn xenstubs))
+
+(library
+ (foreign_stubs
+  (language c)
+  (names syslog_stubs systemd_stubs select_stubs)
+  (flags (-DHAVE_SYSTEMD)))
+ (modules syslog systemd)
+ (name xenstubs)
+ (wrapped false)
+ (libraries unix)
+ (c_library_flags -lsystemd))
diff --git a/tools/ocaml/xenstored/parse_arg.ml b/tools/ocaml/xenstored/parse_arg.ml
index 7c0478e76a..965cb9ebeb 100644
--- a/tools/ocaml/xenstored/parse_arg.ml
+++ b/tools/ocaml/xenstored/parse_arg.ml
@@ -28,7 +28,7 @@ type config =
 	disable_socket: bool;
 }
 
-let do_argv =
+let do_argv () =
 	let pidfile = ref "" and tracefile = ref "" (* old xenstored compatibility *)
 	and domain_init = ref true
 	and activate_access_log = ref true
diff --git a/tools/ocaml/xenstored/poll.ml b/tools/ocaml/xenstored/poll.ml
index 26f8620dfc..92e0717ed2 100644
--- a/tools/ocaml/xenstored/poll.ml
+++ b/tools/ocaml/xenstored/poll.ml
@@ -64,4 +64,5 @@ let poll_select in_fds out_fds exc_fds timeout =
 			a r
 
 let () =
-        set_fd_limit (get_sys_fs_nr_open ())
+        if Unix.geteuid () = 0 then
+          set_fd_limit (get_sys_fs_nr_open ())
diff --git a/tools/ocaml/xenstored/test/dune b/tools/ocaml/xenstored/test/dune
new file mode 100644
index 0000000000..2a3eb2b7df
--- /dev/null
+++ b/tools/ocaml/xenstored/test/dune
@@ -0,0 +1,11 @@
+(copy_files# ../*.ml{,i})
+
+(test
+ (modes native)
+ (ocamlopt_flags -afl-instrument)
+ (name xenstored_test)
+ (modules (:standard \ syslog systemd))
+ (package xenstored)
+ (flags (:standard -w -52))
+ ;;(action (run %{test} -v --seed 364172147))
+ (libraries unix xen.bus xen.mmap xenstubs crowbar xen.store fmt fmt.tty))
diff --git a/tools/ocaml/xenstored/test/xenctrl.ml b/tools/ocaml/xenstored/test/xenctrl.ml
new file mode 100644
index 0000000000..37d6da0a47
--- /dev/null
+++ b/tools/ocaml/xenstored/test/xenctrl.ml
@@ -0,0 +1,48 @@
+(*
+ * Copyright (C) 2006-2007 XenSource Ltd.
+ * Copyright (C) 2008      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *)
+
+(** *)
+type domid = int
+
+(* ** xenctrl.h ** *)
+
+
+type domaininfo =
+{
+	domid             : domid;
+	dying             : bool;
+	shutdown          : bool;
+	shutdown_code     : int;
+}
+
+exception Error of string
+
+type handle = unit
+
+let interface_open () = ()
+let interface_close () = ()
+
+let domain_getinfo () domid = {
+  domid = domid;
+  dying = false;
+  shutdown = false;
+  shutdown_code = 0;
+}
+
+let devzero = Unix.openfile "/dev/zero" [] 0
+let  nullmap () = Xenmmap.mmap devzero Xenmmap.RDWR Xenmmap.PRIVATE 4096 0
+
+let map_foreign_range _ _ _ _ = nullmap ()
diff --git a/tools/ocaml/xenstored/test/xeneventchn.ml b/tools/ocaml/xenstored/test/xeneventchn.ml
new file mode 100644
index 0000000000..6612722dc2
--- /dev/null
+++ b/tools/ocaml/xenstored/test/xeneventchn.ml
@@ -0,0 +1,50 @@
+(*
+ * Copyright (C) 2006-2007 XenSource Ltd.
+ * Copyright (C) 2008      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *)
+
+type handle = Unix.file_descr * int ref
+
+let devnull = Unix.openfile "/dev/null" [] 0
+let init () = devnull, ref 0
+let fd (h, _) = h
+
+type t = int
+
+type virq_t =
+  | Timer        (* #define VIRQ_TIMER      0 *)
+  | Debug        (* #define VIRQ_DEBUG      1 *)
+  | Console      (* #define VIRQ_CONSOLE    2 *)
+  | Dom_exc      (* #define VIRQ_DOM_EXC    3 *)
+  | Tbuf         (* #define VIRQ_TBUF       4 *)
+  | Reserved_5   (* Do not use this value as it's not defined *)
+  | Debugger     (* #define VIRQ_DEBUGGER   6 *)
+  | Xenoprof     (* #define VIRQ_XENOPROF   7 *)
+  | Con_ring     (* #define VIRQ_CON_RING   8 *)
+  | Pcpu_state   (* #define VIRQ_PCPU_STATE 9 *)
+  | Mem_event    (* #define VIRQ_MEM_EVENT  10 *)
+  | Xc_reserved  (* #define VIRQ_XC_RESERVED 11 *)
+  | Enomem       (* #define VIRQ_ENOMEM     12 *)
+  | Xenpmu       (* #define VIRQ_XENPMU     13 *)
+
+let notify _h _ = ()
+let bind_interdomain (_h, port) domid remote_port = incr port; !port
+let bind_virq (_h, port) _ = incr port; !port
+let bind_dom_exc_virq handle = bind_virq handle Dom_exc
+let unbind _ _ = ()
+let pending (_h, port) = !port
+let unmask _ _ = ()
+
+let to_int x = x
+let of_int x = x
diff --git a/tools/ocaml/xenstored/test/xenstored_test.ml b/tools/ocaml/xenstored/test/xenstored_test.ml
new file mode 100644
index 0000000000..e86b68e867
--- /dev/null
+++ b/tools/ocaml/xenstored/test/xenstored_test.ml
@@ -0,0 +1,2 @@
+open Xenstored
+let () = ()
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 885b397d71..e25b407303 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -265,8 +265,8 @@ let to_file store cons fds file =
 	        (fun () -> close_out channel)
 end
 
-let _ =
-	let cf = do_argv in
+let main () =
+	let cf = do_argv () in
 	let pidfile =
 		if Sys.file_exists (config_filename cf) then
 			parse_config (config_filename cf)
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 2/5] tools/ocaml/xenstored: implement the live migration binary format
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (21 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

This is implemented by C xenstored as live update dump format.
oxenstored already has its own (text-based) dump format, but for
compatibility implement one compatible with C xenstored.

This will also be useful in the future for non-cooperative guest live migration.

docs/designs/xenstore-migration.md documents the format

For now this always dumps integers in big endian order, because even old
versions of OCaml have support for that.
The binary format supports both little and big endian orders, so this
should be compatible.

To dump in little endian or native endian order we would
require OCaml 4.08+.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/xenstored/disk.ml | 318 ++++++++++++++++++++++++++++++++++
 1 file changed, 318 insertions(+)

diff --git a/tools/ocaml/xenstored/disk.ml b/tools/ocaml/xenstored/disk.ml
index 4739967b61..595fdab54a 100644
--- a/tools/ocaml/xenstored/disk.ml
+++ b/tools/ocaml/xenstored/disk.ml
@@ -155,3 +155,321 @@ let write store =
 		Unix.rename tfile xs_daemon_database
 	with exc ->
 		error "caught exn %s" (Printexc.to_string exc)
+
+	module BinaryOut = struct
+		let version = 0x1
+		let endian = 1
+		let padding = String.make 7 '\x00'
+
+		let write_header ch =
+			(* for testing endian order *)
+			output_binary_int ch 0x78656e73;
+			output_binary_int ch 0x746f7265;
+			output_binary_int ch version;
+			output_binary_int ch endian;
+			ch
+
+		let w8 = output_char
+		let w16 ch i =
+			assert (i >= 0 && i lsr 16 = 0);
+			output_byte ch (i lsr 8);
+			output_byte ch i
+
+		let w32 ch v =
+			assert (v >= 0 && v <= 0xFFFF_FFFF);
+			output_binary_int ch v
+
+		let pos = pos_out
+		let wpad ch =
+			let padto = 8 in
+			let padby = (padto - pos ch mod padto) mod padto in
+			if padby > 0 then
+				output_substring ch padding 0 padby
+
+		let wstring = output_string
+	end
+
+	module BinaryIn = struct
+		type t = in_channel
+
+		let read_header t =
+			let h = Bytes.make 8 '\x00' in
+			really_input t h 0 (Bytes.length h);
+			let ver = input_binary_int t in
+			let endian = input_binary_int t in
+			if Bytes.to_string h <> "xenstore" then
+				failwith "Header doesn't begin with 'xenstore'";
+			if ver <> BinaryOut.version then
+				failwith "Incompatible version";
+			if endian <> BinaryOut.endian then
+				failwith "Incompatible endianness"
+
+		let r8 = input_char
+
+		let r16 t = 
+			let r0 = input_byte t in
+			let r1 = input_byte t  in
+			(r0 lsl 8) lor r1
+
+		let r32 t =
+			(* read unsigned 32-bit int *)
+			let r = input_binary_int t land 0xFFFF_FFFF in
+			assert (r >= 0);
+			r
+
+		let rstring = really_input_string
+
+		let rpad t =
+			let padto = 8 in
+			let padby = (padto - pos_in t mod padto) mod padto in
+			if padby > 0 then
+				ignore (really_input_string t padby)
+	end
+
+module FD : sig
+     type t = Unix.file_descr
+     val of_int: int -> t
+     val to_int : t -> int
+end = struct
+    type t = Unix.file_descr
+    (* This is like Obj.magic but just for these types,
+       and relies on Unix.file_descr = int *)
+    external to_int : t -> int = "%identity"
+    external of_int : int -> t = "%identity"
+end
+
+module LiveRecord = struct
+	(* See docs/designs/xenstore-migration.md for binary format *)
+	module Type : sig
+		type t = private int
+		val end_ : t
+		val global_data : t
+		val connection_data : t
+		val watch_data : t
+		val transaction_data : t
+		val node_data: t
+	end = struct
+		type t = int
+		let end_ = 0x0
+		let global_data = 0x01
+		let connection_data = 0x02
+		let watch_data = 0x03
+		let transaction_data = 0x04
+		let node_data = 0x05
+	end
+
+	module I = BinaryIn
+	module O = BinaryOut
+
+	let write_expect msg expected actual =
+		if expected <> actual then
+			let m = Printf.sprintf "expected %d <> %d: %s" expected actual msg in
+			invalid_arg m
+
+	let write_record t (typ: Type.t) len f =
+		assert (O.pos t mod 8 = 0);
+		O.w32 t (typ :> int);
+		O.w32 t len;
+		let p0 = O.pos t in
+		f t;
+		let p1 = O.pos t in
+		write_expect "position and length" len (p1-p0);
+		O.wpad t
+
+	let write_end t =
+		write_record t Type.end_ 0 ignore
+
+	let read_expect t msg expected actual =
+		if expected <> actual then
+			let pos = pos_in t in
+			let m = Printf.sprintf "expected %d <> %d at ~%d: %s" expected actual pos msg in
+			invalid_arg m
+
+	let read_end t ~len f =
+		read_expect t "end" 0 len;
+		f ()
+
+	let write_global_data t ~rw_sock =
+		write_record t Type.global_data 8 @@ fun b ->
+		O.w32 b (FD.to_int rw_sock);
+		O.w32 b (-1)
+
+	let read_global_data t ~len f =
+		read_expect t "global_data" 8 len;
+		let rw_sock = FD.of_int (I.r32 t) in
+		let _ = FD.of_int (I.r32 t) in
+		f ~rw_sock
+
+	let conn_shared_ring = 0x0
+	let conn_socket = 0x1
+	let domid_invalid = 0x7FF4
+
+	(* oxenstored doesn't support readonly sockets yet *)
+	let flags_connection_readonly = 0x1l
+
+	type dom = { id: int; target: int; remote_port: int }
+	type conn = Socket of Unix.file_descr | Domain of dom
+
+	let write_connection_data t ~conid ~conn xb_pktin xb_partialout xb_pktout =
+		let in_data_len = Buffer.length xb_pktin in
+		let out_resp_len = String.length xb_partialout in
+		let out_data_len = Buffer.length xb_pktout in
+		let data_len = in_data_len + out_data_len in
+
+		write_record t Type.connection_data (32 + data_len) @@ fun b ->
+		assert (conid > 0);
+		O.w32 b conid;
+		O.w32 b (match conn with
+		| Socket _ -> conn_socket
+		| Domain _ -> conn_shared_ring
+		);
+		let flags = 0x0 in
+		O.w32 b flags;
+
+		(match conn with
+		| Socket fd ->
+			O.w32 b (FD.to_int fd);
+			O.w32 b 0 (* pad *)
+		| Domain dom ->
+			O.w16 b dom.id;
+			O.w16 b dom.target;
+			O.w32 b dom.remote_port
+			);
+
+		O.w32 b in_data_len;
+		O.w32 b out_resp_len;
+		O.w32 b out_data_len;
+		Buffer.output_buffer b xb_pktin;
+		O.wstring b xb_partialout;
+		Buffer.output_buffer b xb_pktout
+
+	let read_connection_data t ~len f =
+		let conid = I.r32 t in
+		assert (conid > 0);
+		let kind = I.r32 t in
+		let flags = I.r32 t in
+		read_expect t "flags" 0 flags;
+		let conn = (match kind with
+		| x when x = conn_socket ->
+			let fd = FD.of_int (I.r32 t) in
+			I.r32 t |> ignore;
+			Socket fd
+		| x when x = conn_shared_ring ->
+			let id = I.r16 t in
+			let target = I.r16 t in
+			let remote_port = I.r32 t in
+			Domain {id; target; remote_port }
+		| x ->
+			invalid_arg (Printf.sprintf "Unknown connection kind %x" x)
+		) in
+		let in_data_len = I.r32 t in
+		let out_resp_len = I.r32 t in
+		let out_data_len = I.r32 t in
+		let in_data = really_input_string t in_data_len in
+		let out_data = really_input_string t out_data_len in
+		f ~conid ~conn ~in_data ~out_data ~out_resp_len
+
+
+	let write_watch_data t ~conid ~wpath ~token =
+		let wpath_len = String.length wpath in
+		let token_len = String.length token in
+
+		write_record t Type.watch_data (12+wpath_len+token_len) @@ fun b ->
+		O.w32 b conid;
+		O.w32 b (String.length wpath);
+		O.w32 b (String.length token);
+		O.wstring b wpath;
+		O.wstring b token
+
+	let read_watch_data t ~len f =
+		let conid = I.r32 t in
+		let wpathlen = I.r32 t in
+		let tokenlen = I.r32 t in
+		let wpath = I.rstring t wpathlen in
+		let token = I.rstring t tokenlen in
+		f ~conid ~wpath ~token
+
+	let write_transaction_data t ~conid ~txid =
+		write_record t Type.transaction_data 8 @@ fun b ->
+		O.w32 b conid;
+		O.w32 b txid
+
+	let read_transaction_data t ~len f =
+		read_expect t "transaction" 8 len;
+		let conid = I.r32 t in
+		let txid = I.r32 t in
+		f ~conid ~txid
+
+	type access = R | W | RW | Del
+
+	let write_node_data t ~txidaccess ~path ~value ~perms =
+		let path_len = String.length path in
+		let value_len = String.length value in
+		let perms = Perms.Node.acls perms in
+		let len = 24 + (List.length perms)*4 + path_len + value_len in
+
+		write_record t Type.node_data len @@ fun b ->
+		O.w32 b (match txidaccess with None -> 0 | Some (conid, _, _) -> conid);
+		O.w32 b (match txidaccess with None -> 0 | Some (_, txid, _) -> txid);
+		O.w32 b path_len;
+		O.w32 b value_len;
+		O.w32 b (match txidaccess with
+		| None -> 0x0
+		| Some (_, _, Del) -> 0x0
+		| Some (_, _, R) -> 0x1
+		| Some (_, _, W) -> 0x2
+		| Some (_, _, RW) -> 0x3
+		);
+		O.w32 b (List.length perms);
+		List.iter (fun (domid, permty) ->
+			O.w8 b (Perms.char_of_permty permty);
+			O.w8 b '\x00';
+			O.w16 b domid;
+		) perms;
+		O.wstring b path;
+		O.wstring b value
+
+	let read_node_data t ~len f =
+		let conid = I.r32 t in
+		let txid = I.r32 t in
+		let path_len = I.r32 t in
+		let value_len = I.r32 t in
+		let txaccess = match conid, I.r32 t with
+		| 0, _ -> None
+		| _, 0 -> Some (conid, txid, Del)
+		| _, 1 -> Some (conid, txid, R)
+		| _, 2 -> Some (conid, txid, W)
+		| _, 3 -> Some (conid, txid, RW)
+		| _ -> invalid_arg "invalid access flag"
+		in
+		let a = Array.init (I.r32 t) (fun _ ->
+					let perm = Perms.permty_of_char (I.r8 t) in
+					I.r8 t |> ignore;
+					let domid = I.r16 t in
+					domid, perm
+		) in
+		let perms = match Array.to_list a with
+		| [] -> invalid_arg "Permission list cannot be empty";
+		| (owner, other) :: acls ->
+			Perms.Node.create owner other acls
+		in
+		let path = I.rstring t path_len in
+		let value = I.rstring t value_len in
+		f ~txaccess ~perms ~path ~value
+
+	let read_record t ~on_end ~on_global_data ~on_connection_data ~on_watch_data ~on_transaction_data ~on_node_data =
+		I.rpad t; (* if we fail to process a record (e.g. callback raises, ensure we resume at right place *)
+		let typ = I.r32 t in
+		let len = I.r32 t in
+		let p0 = pos_in t in
+		(match typ with
+		| x when x = (Type.end_ :> int) -> read_end t ~len on_end
+		| x when x = (Type.global_data :> int) -> read_global_data t ~len on_global_data
+		| x when x = (Type.connection_data :> int) -> read_connection_data t ~len on_connection_data
+		| x when x = (Type.watch_data :> int) -> read_watch_data t ~len on_watch_data
+		| x when x = (Type.transaction_data :> int) -> read_transaction_data t ~len on_transaction_data
+		| x when x = (Type.node_data :> int) -> read_node_data t ~len on_node_data
+		| x -> failwith (Printf.sprintf "Unknown record type: %x" x));
+		let p1 = pos_in t in
+		read_expect t "record length" len (p1-p0)
+end
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 3/5] tools/ocaml/xenstored: add binary dump format support
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (22 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/xenstored/connection.ml | 63 +++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 16 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 51041dde8e..1f9fe9e3b2 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -17,6 +17,7 @@
 exception End_of_file
 
 open Stdext
+module LR = Disk.LiveRecord
 
 let xenstore_payload_max = 4096 (* xen/include/public/io/xs_wire.h *)
 
@@ -77,6 +78,10 @@ let number_of_transactions con =
 
 let get_domain con = con.dom
 
+let get_id con = match con.dom with
+| None -> 2*LR.domid_invalid + con.anonid
+| Some dom -> 1 + Domain.get_id dom
+
 let anon_id_next = ref 1
 
 let get_domstr con =
@@ -279,6 +284,9 @@ let end_transaction con tid commit =
 let get_transaction con tid =
 	Hashtbl.find con.transactions tid
 
+let iter_transactions con f =
+	Hashtbl.iter f con.transactions
+
 let do_input con = Xenbus.Xb.input con.xb
 let has_input con = Xenbus.Xb.has_in_packet con.xb
 let has_partial_input con = match con.xb.Xenbus.Xb.partial_in with
@@ -337,22 +345,45 @@ let incr_ops con = con.stat_nb_ops <- con.stat_nb_ops + 1
 let stats con =
 	Hashtbl.length con.watches, con.stat_nb_ops
 
-let dump con chan =
-	let id = match con.dom with
-	| Some dom ->
-		let domid = Domain.get_id dom in
-		(* dump domain *)
-		Domain.dump dom chan;
-		domid
-	| None ->
-		let fd = con |> get_fd |> Utils.FD.to_int in
-		Printf.fprintf chan "socket,%d\n" fd;
-		-fd
-	in
-	(* dump watches *)
-	List.iter (fun (path, token) ->
-		Printf.fprintf chan "watch,%d,%s,%s\n" id (Utils.hexify path) (Utils.hexify token)
-		) (list_watches con)
+let serialize_pkt_in buf xb =
+	let open Xenbus.Xb in
+	Queue.iter (fun p -> Buffer.add_string buf (Packet.to_string p)) xb.pkt_in;
+	match xb.partial_in with
+	| NoHdr (to_read, hdrb) ->
+		(* see Xb.input *)
+		let used = Xenbus.Partial.header_size () - to_read in
+		Buffer.add_subbytes buf hdrb 0 used
+	| HaveHdr p ->
+		p |> Packet.of_partialpkt |> Packet.to_string |> Buffer.add_string buf
+
+let serialize_pkt_out buf xb =
+	let open Xenbus.Xb in
+	Buffer.add_string buf xb.partial_out;
+	Queue.iter (fun p -> Buffer.add_string buf (Packet.to_string p)) xb.pkt_out
+
+let dump con store chan =
+	let conid = get_id con in
+	let conn = match con.dom with
+	| None -> LR.Socket (get_fd con)
+	| Some dom -> LR.Domain {
+		id = Domain.get_id dom;
+		target = LR.domid_invalid;  (* TODO: we do not store this info *)
+		remote_port = Domain.get_remote_port dom
+	} in
+	let pkt_in = Buffer.create 4096 in
+	let pkt_out = Buffer.create 4096 in
+	serialize_pkt_in pkt_in con.xb;
+	serialize_pkt_out pkt_out con.xb;
+	LR.write_connection_data chan ~conid ~conn  pkt_in con.xb.partial_out pkt_out;
+
+	con |> list_watches
+	|> List.rev (* preserve order in dump/reload *)
+	|> List.iter (fun (wpath, token) ->
+		LR.write_watch_data chan ~conid ~wpath ~token
+	);
+	let conpath = get_path con in
+	iter_transactions con (fun _ txn ->
+		 Transaction.dump store conpath ~conid txn chan)
 
 let debug con =
 	let domid = get_domstr con in
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 4/5] tools/ocaml/xenstored: add support for binary format
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (23 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/xenstored/perms.ml     |   2 +
 tools/ocaml/xenstored/xenstored.ml | 201 ++++++++++++++++++++++++-----
 2 files changed, 173 insertions(+), 30 deletions(-)

diff --git a/tools/ocaml/xenstored/perms.ml b/tools/ocaml/xenstored/perms.ml
index e8a16221f8..61c1c60083 100644
--- a/tools/ocaml/xenstored/perms.ml
+++ b/tools/ocaml/xenstored/perms.ml
@@ -69,6 +69,8 @@ let remove_domid ~domid perm =
 
 let default0 = create 0 NONE []
 
+let acls t = (t.owner, t.other) :: t.acl
+
 let perm_of_string s =
 	let ty = permty_of_char s.[0]
 	and id = int_of_string (String.sub s 1 (String.length s - 1)) in
diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index e25b407303..9338190804 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -141,7 +141,8 @@ exception Bad_format of string
 
 let dump_format_header = "$xenstored-dump-format"
 
-let from_channel_f chan global_f socket_f domain_f watch_f store_f =
+(* for backwards compatibility with already released live-update *)
+let from_channel_f_compat chan global_f socket_f domain_f watch_f store_f =
 	let unhexify s = Utils.unhexify s in
 	let getpath s =
 		let u = Utils.unhexify s in
@@ -186,7 +187,7 @@ let from_channel_f chan global_f socket_f domain_f watch_f store_f =
 	done;
 	info "Completed loading xenstore dump"
 
-let from_channel store cons doms chan =
+let from_channel_compat ~live store cons doms chan =
 	(* don't let the permission get on our way, full perm ! *)
 	let op = Store.get_ops store Perms.Connection.full_rights in
 	let rwro = ref (None) in
@@ -226,43 +227,183 @@ let from_channel store cons doms chan =
 		op.Store.write path value;
 		op.Store.setperms path perms
 		in
-	from_channel_f chan global_f socket_f domain_f watch_f store_f;
+	from_channel_f_compat chan global_f socket_f domain_f watch_f store_f;
 	!rwro
 
-let from_file store cons doms file =
-	info "Loading xenstore dump from %s" file;
-	let channel = open_in file in
-	finally (fun () -> from_channel store doms cons channel)
+module LR = Disk.LiveRecord
+
+let from_channel_f_bin chan on_global_data on_connection_data on_watch_data on_transaction_data on_node_data =
+	Disk.BinaryIn.read_header chan;
+	let quit = ref false in
+	let on_end () = quit := true in
+	let errors = ref 0 in
+	while not !quit
+	do
+		try
+			LR.read_record chan ~on_end ~on_global_data ~on_connection_data ~on_watch_data ~on_transaction_data ~on_node_data
+		with exn ->
+			let bt = Printexc.get_backtrace () in
+			incr errors;
+			Logging.warn "xenstored" "restoring: ignoring faulty record (exception: %s): %s" (Printexc.to_string exn) bt
+	done;
+        info "Completed loading xenstore dump";
+	!errors
+
+
+let from_channel_bin ~live store cons doms chan =
+	(* don't let the permission get on our way, full perm ! *)
+	let maintx = Transaction.make ~internal:true Transaction.none store in
+	let fullperm = Perms.Connection.full_rights in
+	let fds = ref None in
+	let allcons = Hashtbl.create 1021 in
+	let contxid_to_op = Hashtbl.create 1021 in
+	let global_f ~rw_sock =
+		(* file descriptors are only valid on a live-reload, a cold restart won't have them *)
+		if live then
+			fds := Some rw_sock
+	in
+	let domain_f ~conid ~conn ~in_data ~out_data ~out_resp_len =
+		let con = match conn with
+		| LR.Domain { LR.id = 0; _ } ->
+			(* Dom0 is precreated *)
+			Connections.find_domain cons 0
+		| LR.Domain d ->
+			debug "Recreating domain %d, port %d" d.id d.remote_port; 
+			(* FIXME: gnttab *)
+			Domains.create doms d.id 0n d.remote_port
+			|> Connections.add_domain cons;
+			Connections.find_domain cons d.id
+		| LR.Socket fd ->
+			debug "Recreating open socket";
+			(* TODO: rw/ro flag *)
+			Connections.add_anonymous cons fd;
+			Connections.find cons fd
+		in
+		Hashtbl.add allcons conid con
+	in
+	let watch_f ~conid ~wpath ~token =
+		let con = Hashtbl.find allcons conid in
+		ignore (Connections.add_watch cons con wpath token);
+		()
+		in
+	let transaction_f ~conid ~txid =
+		let con = Hashtbl.find allcons conid in
+		con.Connection.next_tid <- txid;
+		let id = Connection.start_transaction con store in
+		assert (id = txid);
+		let txn = Connection.get_transaction con txid in
+		Hashtbl.add contxid_to_op (conid, txid) txn
+	in
+	let store_f ~txaccess ~perms ~path ~value =
+		let txn, op = match txaccess with
+		| None -> maintx, LR.W
+		| Some (conid, txid, op) ->
+			 let (txn, _) as r = Hashtbl.find contxid_to_op (conid, txid), op in
+     	 (* make sure this doesn't commit, even as RO *)
+			 Transaction.mark_failed txn;
+			 r
+        in
+	let get_con id =
+		if id < 0 then Connections.find cons (Utils.FD.of_int (-id))
+		else Connections.find_domain cons id
+	in
+	let watch_f id path token =
+		ignore (Connections.add_watch cons (get_con id) path token)
+		in
+		let path = Store.Path.of_string path in
+		try match op with
+		| LR.R ->
+			 Logging.debug "xenstored" "TR %s %S" (Store.Path.to_string path) value;
+			(* these are values read by the tx, potentially
+				 no write access here. Make the tree match. *)
+			Transaction.write txn fullperm path value; 
+			Transaction.setperms txn fullperm path perms;
+		| LR.W | LR.RW ->
+			 Logging.debug "xenstored" "TW %d %s %S" (Transaction.get_id txn) (Store.Path.to_string path) value;
+			 (* We started with empty tree, create parents.
+			    All the implicit mkdirs from the original tx should be explicit already for quota purposes.
+			 *)
+			 Process.create_implicit_path txn fullperm path;
+ 			 Transaction.write txn fullperm path value; 
+			 Transaction.setperms txn fullperm path perms;
+			 Logging.debug "xenstored" "TWdone %s %S" (Store.Path.to_string path) value;
+		| LR.Del ->
+			 Logging.debug "xenstored" "TDel %s " (Store.Path.to_string path);
+			Transaction.rm txn fullperm path
+		with Not_found|Define.Doesnt_exist|Define.Lookup_Doesnt_exist _ -> ()
+		in
+	(* make sure we got a quota entry for Dom0, so that setperms on / doesn't cause quota to be off-by-one *)
+	Transaction.mkdir maintx fullperm (Store.Path.of_string "/local");
+	let errors = from_channel_f_bin chan global_f domain_f watch_f transaction_f store_f in
+	(* do not fire any watches, but this makes a tx RO *)
+(*	Transaction.clear_wops maintx; *)
+	let errors = if not @@ Transaction.commit ~con:"live-update" maintx then begin
+		Logging.warn "xenstored" "live-update: failed to commit main transaction";
+		errors + 1
+	end else errors
+	in
+	!fds, errors
+
+let from_channel = from_channel_bin (* TODO: detect and accept text format *)
+
+let from_file ~live store cons doms file =
+	let channel = open_in_bin file in
+	finally (fun () -> from_channel_bin ~live store doms cons channel)
 	        (fun () -> close_in channel)
 
-let to_channel store cons rw chan =
-	let hexify s = Utils.hexify s in
+let to_channel rw_sock store cons chan =
+	let t = Disk.BinaryOut.write_header chan in
 
-	fprintf chan "%s\n" dump_format_header;
-	let fdopt = function None -> -1 | Some fd ->
-		(* systemd and utils.ml sets it close on exec *)
-		Unix.clear_close_on_exec fd;
-		Utils.FD.to_int fd in
-	fprintf chan "global,%d\n" (fdopt rw);
-
-	(* dump connections related to domains: domid, mfn, eventchn port/ sockets, and watches *)
-	Connections.iter cons (fun con -> Connection.dump con chan);
+	(match rw_sock with
+	| Some rw_sock ->
+		LR.write_global_data t ~rw_sock
+	| _ -> ());
 
 	(* dump the store *)
 	Store.dump_fct store (fun path node ->
-		let name, perms, value = Store.Node.unpack node in
-		let fullpath = Store.Path.to_string (Store.Path.of_path_and_name path name) in
-		let permstr = Perms.Node.to_string perms in
-		fprintf chan "store,%s,%s,%s\n" (hexify fullpath) (hexify permstr) (hexify value)
+		Transaction.write_node t None path node
 	);
+
+	(* dump connections related to domains and sockets; domid, mfn, eventchn port, watches *)
+	Connections.iter cons (fun con -> Connection.dump con store t);
+
+	LR.write_end t;
 	flush chan;
 	()
 
+let validate_f ch =
+	let conids = Hashtbl.create 1021 in
+	let txids = Hashtbl.create 1021 in
+	let global_f ~rw_sock = () in
+	let domain_f ~conid ~conn ~in_data ~out_data ~out_resp_len =
+		Hashtbl.add conids conid ()
+	in
+	let watch_f ~conid ~wpath ~token =
+		Hashtbl.find conids conid
+	in
+	let transaction_f ~conid ~txid =
+		Hashtbl.find conids conid;
+		Hashtbl.add txids (conid, txid) ()
+	in 
+	let store_f ~txaccess ~perms ~path ~value =
+		match txaccess with
+		| None -> ()
+		| Some (conid, txid, _) ->
+			Hashtbl.find conids conid;
+			Hashtbl.find txids (conid, txid)
+	in
+	let errors = from_channel_f_bin ch global_f domain_f watch_f transaction_f store_f in
+	if errors > 0 then
+		failwith (Printf.sprintf "Failed to re-read dump: %d errors" errors)
 
-let to_file store cons fds file =
-	let channel = open_out_gen [ Open_wronly; Open_creat; Open_trunc; ] 0o600 file in
-	finally (fun () -> to_channel store cons fds channel)
-	        (fun () -> close_out channel)
+let to_file fds store cons file =
+	let channel = open_out_gen [ Open_wronly; Open_creat; Open_trunc; Open_binary ] 0o600 file in
+	finally (fun () -> to_channel fds store cons channel)
+					(fun () -> close_out channel);
+	let channel = open_in_bin file in
+	finally (fun () -> validate_f channel)
+	        (fun () -> close_in channel)
+	
 end
 
 let main () =
@@ -329,8 +470,8 @@ let main () =
 
 	let rw_sock =
 	if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
-		let rwro = DB.from_file store domains cons Disk.xs_daemon_database in
-		info "Live reload: database loaded";
+		let rwro, errors = DB.from_file ~live:cf.live_reload store domains cons Disk.xs_daemon_database in
+		info "Live reload: database loaded (%d errors)" errors;
 		Event.bind_dom_exc_virq eventchn;
 		Process.LiveUpdate.completed ();
 		rwro
@@ -367,7 +508,7 @@ let main () =
 	Sys.set_signal Sys.sigpipe Sys.Signal_ignore;
 
 	if cf.activate_access_log then begin
-		let post_rotate () = DB.to_file store cons (None) Disk.xs_daemon_database in
+		let post_rotate () = DB.to_file None store cons Disk.xs_daemon_database in
 		Logging.init_access_log post_rotate
 	end;
 
@@ -528,7 +669,7 @@ let main () =
 			live_update := Process.LiveUpdate.should_run cons;
 			if !live_update || !quit then begin
 				(* don't initiate live update if saving state fails *)
-				DB.to_file store cons (rw_sock) Disk.xs_daemon_database;
+				DB.to_file rw_sock store cons Disk.xs_daemon_database;
 				quit := true;
 			end
 		with exc ->
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 5/5] Add structured fuzzing unit test
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (24 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Based on ideas from qcstm, implemented using Crowbar.

Quickcheck-style property tests that uses AFL for quickly
exploring various values that trigger bugs in the code.

This is structured/guided fuzzing: we read an arbitrary random number,
and use it to generate some valid looking xenstore trees and commands.

There are 2 instances of xenstored: one that runs the live update
command, and one that ignores it.
Live-update should be a no-op wrt to xenstored state: this is our
quicheck property.

When any mismatch is identified it prints the input
(tree+xenstore commands), and a diff of the output:
the internal xenstore tree state + quotas.

afl-cmin can be used to further minimize the testcase.
Crowbar (AFL persistent mode Quickcheck integration) is used due to
speed: this very easily gets us a multi-core parallelizable test.

Currently the Transaction tests fail, which is why live updates with
active transactions are rejected.

TODO: split out the non-working transaction code, and drop some obsolete
code.

There is also some incomplete code here that attempts to find functional
bugs in xenstored by interpeting xenstore commands in a simpler way and
comparing states.

This will build the fuzzer and run it single core for sanity test:
make container-fuzz-sanity-test

This will run it multicore (requires all dependencies installed on the host,
including ocaml-bun, the multi-core AFL runner):
make dune-oxenstored-fuzz

'make check' will also run the fuzzer but with input supplied by OCaml's
random number generator, and for a very small number of iterations
(few thousand). This doesn't require any external tools (no AFL, bun).

On failure it prints a base64 encoding of the fuzzer state that can be
used to reproduce the failure instantly, which is very useful for
debugging: one can iterate on the failed fuzzer state until it is fixed,
and then run the fuzzer again to find next failure.

The unit tests here require OCaml 4.06, but the rest of the codebase
doesn't (yet).

See https://lore.kernel.org/xen-devel/cbb2742191e9c1303fdfd95feef4d829ecf33a0d.camel@citrix.com/
for previous discussion of OCaml version.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/Makefile                          |  19 +
 tools/ocaml/xenstored/process.ml              |  12 +-
 tools/ocaml/xenstored/store.ml                |   1 +
 tools/ocaml/xenstored/test/generator.ml       | 189 +++++
 tools/ocaml/xenstored/test/model.ml           | 253 ++++++
 tools/ocaml/xenstored/test/old/arbitrary.ml   | 261 +++++++
 tools/ocaml/xenstored/test/old/gen_paths.ml   |  66 ++
 .../xenstored/test/old/xenstored_test.ml      | 527 +++++++++++++
 tools/ocaml/xenstored/test/pathtree.ml        |  40 +
 tools/ocaml/xenstored/test/testable.ml        | 364 +++++++++
 tools/ocaml/xenstored/test/types.ml           | 427 ++++++++++
 tools/ocaml/xenstored/test/xenstored_test.ml  | 149 +++-
 tools/ocaml/xenstored/test/xs_protocol.ml     | 733 ++++++++++++++++++
 tools/ocaml/xenstored/transaction.ml          | 119 ++-
 14 files changed, 3151 insertions(+), 9 deletions(-)
 create mode 100644 tools/ocaml/xenstored/test/generator.ml
 create mode 100644 tools/ocaml/xenstored/test/model.ml
 create mode 100644 tools/ocaml/xenstored/test/old/arbitrary.ml
 create mode 100644 tools/ocaml/xenstored/test/old/gen_paths.ml
 create mode 100644 tools/ocaml/xenstored/test/old/xenstored_test.ml
 create mode 100644 tools/ocaml/xenstored/test/pathtree.ml
 create mode 100644 tools/ocaml/xenstored/test/testable.ml
 create mode 100644 tools/ocaml/xenstored/test/types.ml
 create mode 100644 tools/ocaml/xenstored/test/xs_protocol.ml

diff --git a/tools/ocaml/Makefile b/tools/ocaml/Makefile
index 53dd0a0f0d..de375820a3 100644
--- a/tools/ocaml/Makefile
+++ b/tools/ocaml/Makefile
@@ -67,3 +67,22 @@ dune-syntax-check: dune-pre
 .PHONY: build-oxenstored-dune
 dune-build-oxenstored: dune-pre
 	LD_LIBRARY_PATH=$(LIBRARY_PATH) LIBRARY_PATH=$(LIBRARY_PATH) C_INCLUDE_PATH=$(C_INCLUDE_PATH) dune build --profile=release @all
+
+.PHONY: oxenstored-fuzz1 oxenstored-fuzz
+dune-oxenstored-fuzz: dune-pre
+	# --force is needed, otherwise it would cache a successful run
+	sh -c '. /etc/profile && C_INCLUDE_PATH=$(C_INCLUDE_PATH) dune build --profile=release --no-buffer --force @fuzz'
+
+dune-oxenstored-fuzz1: dune-pre
+	sh -c '. /etc/profile && C_INCLUDE_PATH=$(C_INCLUDE_PATH) dune build --profile=release --no-buffer --force @fuzz1'
+
+.PHONY: container-fuzz
+container-fuzz-sanity-test:
+	dune clean
+	podman build -t oxenstored-fuzz .
+	# if UID is 0 then we get EPERM on setrlimit from inside the container
+	# use containerize script which ensures that uid is not 0
+	# (podman/docker run would get us a uid of 0)
+	# Only do a sanity test with 1 core, actually doing fuzzing inside a container is a bad idea
+	# due to FUSE overlayfs overhead
+	CONTAINER=oxenstored-fuzz CONTAINER_NO_PULL=1 DOCKER_CMD=podman ../../automation/scripts/containerize make -C tools/ocaml dune-oxenstored-fuzz1
diff --git a/tools/ocaml/xenstored/process.ml b/tools/ocaml/xenstored/process.ml
index da8e9cdb26..ac3afb495f 100644
--- a/tools/ocaml/xenstored/process.ml
+++ b/tools/ocaml/xenstored/process.ml
@@ -172,7 +172,7 @@ let parse_live_update args =
 				   after timeout elapsed" )*) ]
 			(fun x -> raise (Arg.Bad x))
 			"live-update -s" ;
-			debug "Live update process queued" ;
+			(* debug "Live update process queued" ; *)
 				{!state with deadline = Unix.gettimeofday () +. float !timeout
 				; force= !force; pending= true})
 		| _ ->
@@ -452,6 +452,8 @@ let transaction_replay c t doms cons =
 		(fun () ->
 			try
 				Logging.start_transaction ~con ~tid;
+				if t.must_fail then
+					raise Transaction_again;
 				List.iter (perform_exn ~wlog:true replay_t) (Transaction.get_operations t); (* May throw EAGAIN *)
 
 				Logging.end_transaction ~con ~tid;
@@ -553,7 +555,7 @@ let do_introduce con t domains cons data =
 		| _                         -> raise Invalid_Cmd_Args;
 		in
 	let dom =
-		if Domains.exist domains domid then
+		if Domains.exist domains domid then begin
 			let edom = Domains.find domains domid in
 			if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con then begin
 				(* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
@@ -561,12 +563,16 @@ let do_introduce con t domains cons data =
 				Domain.bind_interdomain edom;
 			end;
 			edom
+		end
 		else try
 			let ndom = Domains.create domains domid mfn port in
 			Connections.add_domain cons ndom;
 			Connections.fire_spec_watches (Transaction.get_root t) cons Store.Path.introduce_domain;
 			ndom
-		with _ -> raise Invalid_Cmd_Args
+		with e ->
+			let bt = Printexc.get_backtrace () in
+			 Logging.debug "process" "do_introduce: %s (%s)" (Printexc.to_string e) bt;
+			 raise Invalid_Cmd_Args
 	in
 	if (Domain.get_remote_port dom) <> port || (Domain.get_mfn dom) <> mfn then
 		raise Domain_not_match
diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 5f155f45eb..a9c079a417 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -133,6 +133,7 @@ let of_path_and_name path name =
 	| _ -> path @ [name]
 
 let create path connection_path =
+	Logging.debug "store" "Path.create %S %S" path connection_path;
 	of_string (Utils.path_validate path connection_path)
 
 let to_string t =
diff --git a/tools/ocaml/xenstored/test/generator.ml b/tools/ocaml/xenstored/test/generator.ml
new file mode 100644
index 0000000000..6f7dc374f8
--- /dev/null
+++ b/tools/ocaml/xenstored/test/generator.ml
@@ -0,0 +1,189 @@
+module type S = sig
+  type cmd
+
+  type state
+
+  val init_state : state
+
+  val next_state : cmd -> state -> state
+
+  val precond : cmd -> state -> bool
+end
+
+module IntSet = Set.Make (Int)
+module IntMap = Map.Make (Int)
+
+module Pickable (K : sig
+  include Map.OrderedType
+
+  include Hashtbl.HashedType with type t := t
+end) =
+struct
+  (* allow picking a random value from a changing map keys.
+     Store a random value (hash of key) as first element of key,
+     then use find_first to pick an item related to the random element if any.
+     This should be more efficient than converting to a list and using List.nth to pick
+  *)
+  module Key = struct
+    type t = int * K.t
+
+    let of_key k = (K.hash k, k)
+
+    let compare (h, k) (h', k') =
+      match Int.compare h h' with 0 -> K.compare k k' | r -> r
+  end
+
+  module M = Map.Make (Key)
+
+  type 'a t = 'a M.t
+
+  let empty = M.empty
+
+  let singleton k v = M.singleton (Key.of_key k) v
+
+  let add k v m = M.add (Key.of_key k) v m
+
+  let find_opt k m = M.find_opt (Key.of_key k) m
+
+  let mem k m = M.mem (Key.of_key k) m
+
+  let remove k m = M.remove (Key.of_key k) m
+
+  let merge f m m' = M.merge f m m'
+
+  let is_empty = M.is_empty
+
+  let update k f m = M.update (Key.of_key k) f m
+
+  let choose rnd m =
+    (* function needs to be monotonic, so the hash has to be part of the key *)
+    let gte (keyhash, _) = Int.compare keyhash rnd >= 0 in
+    match M.find_first_opt gte m with
+    | Some ((_, k), _) ->
+        k
+    | None ->
+        snd @@ fst @@ M.min_binding m
+end
+
+module PickablePath = Pickable (struct
+  type t = string
+
+  let hash = Hashtbl.hash
+
+  let compare = String.compare
+
+  let equal = String.equal
+end)
+
+module PickableInt = Pickable (struct
+  include Int
+
+  let hash = Hashtbl.hash
+end)
+
+module PathObserver = struct
+  type state =
+    { seen: unit PickablePath.t
+    ; dom_txs: unit PickableInt.t PickableInt.t
+    ; next_tid: int }
+
+  let choose_path t rnd = PickablePath.choose rnd t.seen
+
+  let choose_domid t rnd = PickableInt.choose rnd t.dom_txs
+
+  let choose_txid_opt t domid rnd =
+    match PickableInt.find_opt domid t.dom_txs with
+    | None ->
+        0
+    | Some txs ->
+        if PickableInt.is_empty txs then 0 else PickableInt.choose rnd txs
+
+  let new_domid domid = PickableInt.singleton domid PickableInt.empty
+
+  let both _ _ _ = Some ()
+
+  let merge_txs _ s s' =
+    let s = Option.value ~default:PickableInt.empty s in
+    let s' = Option.value ~default:PickableInt.empty s' in
+    Some (PickableInt.merge both s s')
+
+  let init_state =
+    {seen= PickablePath.singleton "/" (); dom_txs= new_domid 0; next_tid= 1}
+
+  let with_path path t = {t with seen= PickablePath.add path () t.seen}
+
+  let split0 str =
+    match Process.split (Some 2) '\000' str with
+    | [x; y] ->
+        (x, y)
+    | _ ->
+        invalid_arg str
+
+  let next_state (domid, cmd) t =
+    let open Xenbus.Xb in
+    match cmd with
+    | {Xenbus.Packet.ty= Transaction_start; _} ->
+        let update = function
+          | None ->
+              None
+          | Some txs ->
+              Some (PickableInt.add t.next_tid () txs)
+        in
+        { t with
+          dom_txs= PickableInt.update domid update t.dom_txs
+        ; next_tid= t.next_tid + 1 }
+    | { Xenbus.Packet.ty=
+          Op.(
+            ( Rm
+            | Read
+            | Directory
+            | Getperms
+            | Setperms
+            | Unwatch
+            | Reset_watches
+            | Getdomainpath
+            | Isintroduced
+            | Set_target
+            | Debug ))
+      ; _ } ->
+        t
+    | {Xenbus.Packet.ty= Op.(Watchevent | Error | Resume | Invalid); _} ->
+        assert false
+    | {Xenbus.Packet.ty= Op.Transaction_end; tid; _} ->
+        let update = function
+          | None ->
+              None
+          | Some txs ->
+              Some (PickableInt.remove tid txs)
+        in
+        {t with dom_txs= PickableInt.update domid update t.dom_txs}
+    | {Xenbus.Packet.ty= Op.(Write | Mkdir | Watch); data} ->
+        let path, _ = split0 data in
+        with_path path t
+    | {Xenbus.Packet.ty= Introduce; data} ->
+        let domidstr, _ = split0 data in
+        let domid' = int_of_string domidstr in
+        if domid = 0 then
+          { t with
+            dom_txs= PickableInt.merge merge_txs t.dom_txs (new_domid domid') }
+        else t
+    | {Xenbus.Packet.ty= Release; data} ->
+        let domidstr, _ = split0 data in
+        let domid = int_of_string domidstr in
+        {t with dom_txs= PickableInt.remove domid t.dom_txs}
+
+  let precond (domid, cmd) t =
+    ( match PickableInt.find_opt domid t.dom_txs with
+    | None ->
+        false
+    | Some txs ->
+        let tid = cmd.Xenbus.Packet.tid in
+        tid = 0 || PickableInt.mem tid txs )
+    && Testable.Command.precond cmd t
+
+  let pp =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "domid" fst Fmt.int
+      ; Dump.field "cmd" snd Testable.Command.pp_dump ]
+end
diff --git a/tools/ocaml/xenstored/test/model.ml b/tools/ocaml/xenstored/test/model.ml
new file mode 100644
index 0000000000..4b5ae462fb
--- /dev/null
+++ b/tools/ocaml/xenstored/test/model.ml
@@ -0,0 +1,253 @@
+open Xs_protocol
+
+(* Conventions:
+Aim for correctness, use simplest data structure that unambigously represents state.
+
+E.g.:
+* a list when duplicates are allowed, order matters and the empty list is a valid value
+* a set when elements appearing multiple time have the same semantic meaning as them appearing once,
+and the order is unspecified or sorted
+* a map when a single key is mapped to a single value, and order is unspecified or sorted
+
+When we must retain the original order for queries, but semantically it doesn't matter
+then store both a canonical representation and the original order.
+
+*)
+
+let rec string_for_all_from s f pos =
+  pos = String.length s || (f s.[pos] && (string_for_all_from s f @@ (pos + 1)))
+
+type error = [`Msg of string]
+
+module Path : sig
+  (** a valid xenstore path *)
+  type t
+
+  val root : t
+
+  val of_string : string -> t option
+  (** [of_string path] parses [path].
+      @return [None] if the path is syntactically not valid *)
+
+  val to_string : t -> string
+  (** [to_string path] converts path to string. *)
+
+  (** [is_child parent child] returns true if [child] is a child of [parent].
+      A path is considered to be a child of itself *)
+  val is_child : t -> t -> bool
+end = struct
+  type t = string list
+
+  let is_valid_char = function
+    | '0' .. '9' | 'a' .. 'z' | 'A' .. 'Z' | '-' | '/' | '_' | '@' ->
+        true
+    | _ ->
+        false
+
+  let root = [""]
+
+  let nonempty s = String.length s > 0
+
+  let of_string s =
+    let n = String.length s in
+    if
+      n > 0 (* empty path is not permitted *)
+      && n < 1024
+      (* paths cannot exceed 1024 chars, FIXME: relative vs absolute *)
+      && string_for_all_from s is_valid_char 0
+    then
+      match String.split_on_char '/' s with
+      | [] ->
+          assert false
+      | [""; ""] ->
+          Some root
+      | _ :: tl as path ->
+          if List.for_all nonempty tl then Some path else None
+    else None
+
+  let to_string = String.concat "/"
+
+  let rec is_child p c =
+    match (p, c) with
+    | [], [] ->
+        true (* a path is a child of itself *)
+    | [], _ ->
+        true
+    | phd :: ptl, chd :: ctl when phd = chd ->
+        is_child ptl ctl
+    | _ ->
+        false
+end
+
+module PathMap = Map.Make (String)
+module DomidSet = Set.Make (Int)
+module DomidMap = Map.Make (Int)
+
+let preserve_order = ref true
+
+module CanonicalACL = struct
+  module RW = struct
+    type t = {read: bool; write: bool}
+
+    let of_perm = function
+      | ACL.NONE ->
+          {read= false; write= false}
+      | ACL.WRITE ->
+          {read= false; write= true}
+      | ACL.READ ->
+          {read= true; write= false}
+      | ACL.RDWR ->
+          {read= true; write= true}
+
+    let to_perm = function
+      | {read= false; write= false} ->
+          ACL.NONE
+      | {read= false; write= true} ->
+          ACL.WRITE
+      | {read= true; write= false} ->
+          ACL.READ
+      | {read= true; write= true} ->
+          ACL.RDWR
+
+    let full = {read= true; write= true}
+  end
+
+  module RWMap = struct
+    type t = {fallback: RW.t; map: RW.t DomidMap.t}
+
+    let lookup t domid =
+      (* other=RDWR can be overriden by explicitly revoking
+         permissions for a domain, so a read=false,write=false
+         in the DomidMap is not necessarily redundant
+      *)
+      DomidMap.find_opt domid t.map |> Option.value ~default:t.fallback
+
+    let create fallback owner =
+      (* owner always has full permissions, and cannot be overriden *)
+      {fallback; map= DomidMap.singleton owner RW.full}
+
+    let override t (domid, perm) =
+      let rw = RW.of_perm perm in
+      (* first entry wins, see perms.ml, also entries that are same as the fallback are
+         not necessarily redundant: (b1,b2,r2) means that domid 2 has rdwr,
+         but if we remove the seemingly redundant `b2` entry then the override would make it
+         just read which would be wrong. *)
+      if DomidMap.mem domid t.map then t
+      else {t with map= DomidMap.add domid rw t.map}
+  end
+
+  type t = {original: ACL.t; owner: ACL.domid; acl: RWMap.t}
+
+  let can_read t domid = (RWMap.lookup t.acl domid).read
+
+  let can_write t domid = (RWMap.lookup t.acl domid).write
+
+  let owner t = t.owner
+
+  let of_acl original =
+    let fallback = RW.of_perm original.ACL.other in
+    let owner = original.ACL.owner in
+    let acl =
+      let init = RWMap.create fallback owner in
+      List.fold_left RWMap.override init original.ACL.acl
+    in
+    {original; owner; acl}
+
+  let to_acl t =
+    if !preserve_order then t.original
+    else
+      ACL.
+        { owner= t.owner
+        ; other= RW.to_perm t.acl.fallback
+        ; acl= t.acl.map |> DomidMap.map RW.to_perm |> DomidMap.bindings }
+end
+
+module Store = struct
+  type node = {value: string; children: string list; acl: CanonicalACL.t}
+
+  type t = {paths: node PathMap.t}
+
+  let create () = {paths= PathMap.empty}
+
+  let parent x = failwith "TODO"
+
+  let add t key value =
+    let paths = PathMap.add key value t.paths in
+    {paths}
+
+  let remove t key =
+    let paths = PathMap.remove key t.paths in
+    {paths}
+end
+
+type t = Store.t
+
+let create () = Store.create ()
+
+let reply_enoent = Response.Error "ENOENT"
+
+let reply_eexist = Response.Error "EEXIST"
+
+let with_node_read t path f =
+  ( t
+  , match PathMap.find_opt path t.paths with
+    | None ->
+        reply_enoent
+    | Some n ->
+        f n )
+
+(* TODO: perm check *)
+let perform_path t domid path = function
+  | Request.Read ->
+      with_node_read t path @@ fun n -> Response.Read n.value
+  | Request.Directory ->
+      with_node_read t path @@ fun n -> Response.Directory n.children
+  | Request.Directory_part _ ->
+      (t, Response.Error "ENOTSUP")
+  | Request.Getperms ->
+      with_node_read t path @@ fun n -> Response.Getperms n.acl
+  | Request.Write value -> (
+    (* TODO: implicit mkdir *)
+    match PathMap.find_opt path t.paths with
+    | Some _ ->
+        (t, reply_eexist)
+    | None ->
+        let acl = ACL.{owner= domid; other= NONE; acl= []} in
+        let n = {value; children= []; acl} in
+        ({t with paths= PathMap.add path n t.paths}, Response.Write) )
+  | Request.Setperms acl -> (
+    match PathMap.find_opt path t.paths with
+    | Some _ ->
+        (t, reply_enoent)
+    | None ->
+        let update_node = function
+          | None ->
+              None
+          | Some n ->
+              Some {n with acl}
+        in
+        ( {t with paths= PathMap.update path update_node t.paths}
+        , Response.Setperms ) )
+  | Request.Mkdir -> (
+    (* TODO: implicit mkdir *)
+    match PathMap.find_opt path t.paths with
+    | Some _ ->
+        (t, reply_eexist)
+    | None ->
+        let acl = ACL.{owner= domid; other= NONE; acl= []} in
+        let n = {value= ""; children= []; acl} in
+        ({t with paths= PathMap.add path n t.paths}, Response.Mkdir) )
+  | Request.Rm -> (
+    match PathMap.find_opt path t.paths with
+    | None ->
+        (t, reply_enoent)
+    | Some _ ->
+        ({t with paths= PathMap.remove path t.paths}, Response.Rm) )
+
+let perform t domid = function
+  | Request.PathOp (path, op) ->
+      perform_path t domid path op
+  | Request.Getdomainpath domid ->
+      (t, Response.Getdomainpath (Printf.sprintf "/local/domain/%d" domid))
+  | _ ->
+      failwith "TODO"
diff --git a/tools/ocaml/xenstored/test/old/arbitrary.ml b/tools/ocaml/xenstored/test/old/arbitrary.ml
new file mode 100644
index 0000000000..6b0bf9864a
--- /dev/null
+++ b/tools/ocaml/xenstored/test/old/arbitrary.ml
@@ -0,0 +1,261 @@
+open QCheck
+
+(* See https://github.com/gasche/random-generator/blob/51351c16b587a1c4216d158e847dcfa6db15f009/random_generator.mli#L275-L325
+    for background on fueled generators for recursive data structures.
+   The difference here is that we build an N-ary tree, not a binary tree as in the example.
+   So we need to spread the fuel among elements of a list of random size.
+*)
+
+(** [spread fuel] creates an array of a random size, and spreads fuel among array elements.
+  Each array slot uses up at least 1 fuel itself.
+  For example the full list of possible arrays with [4] fuel is:
+  {[ [[|3|]; [|0; 2|]; [|1; 1|]; [|2; 0|]; [|0; 0; 0; 0|]] ]}
+*)
+let spread = function
+  | 0 ->
+      Gen.return [||]
+  | n when n < 0 ->
+      invalid_arg "negative fuel"
+  | n ->
+      Gen.(
+        1 -- n
+        >>= fun per_element ->
+        (* We got n fuel to divide up, such that most elements have [per_element] fuel.
+           Round up the number of elements *)
+        let m = (n + per_element - 1) / per_element in
+        (* each element uses up at least one fuel, this has to be subtracted before propagation *)
+        let a = Array.make m (per_element - 1) in
+        (* handle remainder *)
+        a.(0) <- n - (per_element * (m - 1)) - 1 ;
+        assert (Array.fold_left ( + ) m a = n) ;
+        (* ensure that remainder is in a random position *)
+        Gen.shuffle_a a >|= fun () -> a)
+
+(** [spread_l fuel sized_element] spreads [fuel] among list elements,
+    where each list element is created using [sized_element].
+    [sized_element] needs to create an element of exactly the requested size
+     (which may be a recursive element, that calls [spread_l] in turn).
+    Each list element consumes 1 fuel implicitly and sized_element is called with decreased fuel.
+ *)
+let spread_l fuel (sized_elem : 'a Gen.sized) =
+  Gen.(
+    spread fuel
+    >>= fun a ->
+    a |> Array.map sized_elem |> Gen.flatten_a |> Gen.map Array.to_list)
+
+module Tree = struct
+  (* For better shrinking put the (recursive) list first *)
+  type 'a t = Nodes of ('a t * 'a) list
+
+  (** [empty] the empty tree (of size 1) *)
+  let empty = Nodes []
+
+  (** [nodes subtree] tree constructor *)
+  let nodes children = Nodes children
+
+  (** [tree elem_gen] generates a random tree, with elements generated by [elem_gen] *)
+  let tree elem =
+    Gen.sized @@ Gen.fix
+    @@ fun self fuel ->
+    (* self is the generator for a subtree *)
+    let node fuel = Gen.(pair (self fuel) elem) in
+    (* using spread_l ensures that fuel decreases by at least 1, thus ensuring termination *)
+    Gen.map nodes @@ spread_l fuel node
+
+  (** [zero _] is a default implementation for [small] *)
+  let zero _ = 0
+
+  (** [small elem_size tree] returns the count of nodes in the tree and the sum of element sizes
+      as determined by [elem_size] *)
+  let rec small ?(elem_size = zero) (Nodes tree) =
+    List.fold_left
+      (fun acc (subtree, elem) ->
+        acc + elem_size elem + small ~elem_size subtree)
+      1 tree
+
+  (** [shrink ?elem tree] returns a list of potentially smaller trees based on [tree].
+   *)
+  let shrink ?(elem = Shrink.nil) =
+    (* Shrinking needs to generate smaller trees (as determined by [small]),
+       QCheck will keep iterating until it finds a smaller tree that still reproduces the bug.
+       It will then invoke the shrinker again on the smaller tree to attempt to shrink it further.
+       Once the tree shape cannot be shrunk further individual node elements will be shrunk.
+    *)
+    let rec tree (Nodes t) =
+      (* first try to shrink the subtree to a leaf,
+         and if that doesn't work then recursively shrink the subtree
+      *)
+      Iter.append (Iter.return empty)
+      @@ Iter.map nodes
+      @@ Shrink.list ~shrink:(Shrink.pair tree elem) t
+    in
+    tree
+
+  (** [make arb] creates a tree generator with elements generated by [arb].
+      The tree has a shrinker and size defined.
+   *)
+  let make arb =
+    let gen = tree @@ gen arb in
+    QCheck.make
+      ~small:(small ?elem_size:arb.small)
+      ~shrink:(shrink ?elem:arb.shrink) gen
+
+  (** [paths_of_tree ~join tree] return all paths through the tree,
+      with path elements joined using [join] *)
+  let paths_of_tree ~join t =
+    let rec paths_of_subtree (paths, path) (Nodes nodes) =
+      ListLabels.fold_left nodes ~init:paths ~f:(fun paths (tree, elem) ->
+          let path = elem :: path in
+          paths_of_subtree (join (List.rev path) :: paths, path) tree)
+    in
+    paths_of_subtree ([], []) t
+
+  let paths join arb =
+    make arb
+    (* we need to retain the tree, so that the shrinking is done on the tree,
+       and not on the paths *)
+    |> map_keep_input (paths_of_tree ~join)
+end
+
+module Case = struct
+  type ('a, 'b) t =
+    { case_tag: string
+    ; orig: 'a QCheck.arbitrary
+    ; map: 'a -> 'b
+    ; shrink: 'a -> 'b Iter.t
+    ; print: 'a Print.t
+    ; small: 'a -> int }
+
+  (** [make arb f] defines a new variant case with constructor arguments
+      generated by [arb] and constructor [f]. *)
+  let make case_tag orig map =
+    let shrink a =
+      match orig.QCheck.shrink with
+      | None ->
+          Iter.empty
+      | Some s ->
+          Iter.map map @@ s a
+    in
+    let small a = match orig.QCheck.small with None -> 0 | Some s -> s a in
+    let print a = match orig.QCheck.print with None -> "_" | Some p -> p a in
+    {case_tag; orig; map; shrink; small; print}
+
+  type 'a call =
+    { tag: string
+    ; shrink_lazy: 'a Iter.t Lazy.t
+    ; small_lazy: int Lazy.t
+    ; print: string Lazy.t }
+
+  (** [call tag case args] used by the implementation of [rev] to build a shrinker/small of appropriate type *)
+  let call t a =
+    { tag= t.case_tag
+    ; shrink_lazy= lazy (t.shrink a)
+    ; small_lazy= lazy (t.small a)
+    ; print= lazy (t.print a) }
+
+  (** [to_sum case] converts all variant cases to the same type so they can be put into a list *)
+  let to_sum t = Gen.map t.map @@ QCheck.gen t.orig
+end
+
+(** [sum ~print ~rev cases] defines an arbitrary for a sum type consisting of [cases]
+  variant case generators. [print] converts the sum type to a string.
+  [rev] matches on the sum type and should invoke [Case.call <variant-tag> <variant-case-def> <args>].
+
+  E.g.
+  {|
+  type t = A of int | B of float
+
+  let case_a = Case.make "A" int (fun i -> A i)
+
+  let case_b = Case.make "B" float (fun f -> B f)
+
+  let rev t =
+    match t with A i -> Case.call case_a i | B g -> Case.call case_b g
+
+  let x =
+    sum
+      ~print:(fun _ -> failwith "TODO")
+      [Case.to_sum case_a; Case.to_sum case_b]
+  |}
+ *)
+let sum ~rev lst =
+  let shrink b = Lazy.force (rev b).Case.shrink_lazy in
+  let small b = Lazy.force (rev b).Case.small_lazy in
+  let collect b = (rev b).Case.tag in
+  let print b = let r = rev b in r.Case.tag ^ " " ^ Lazy.force r.print in
+  QCheck.make ~shrink ~small ~collect ~print (Gen.oneof lst)
+
+(*
+let mk_packet op to_string arb =
+  Case.make arb (fun x -> Xenbus.Packet.create 0 0 op (to_string x))
+
+let read_packet =
+  mk_packet Xenbus.Xb.Op.Read Store.Path.to_string (list path_element)
+
+let write_packet =
+  mk_packet Xenbus.Xb.Op.Write
+    (fun (x, y) -> Store.Path.to_string x ^ "\x00" ^ y)
+    (pair (list path_element) binary)
+
+let packet =
+  sum ~print:Xenbus.Packet.to_string
+    [Case.to_sum read_packet; Case.to_sum write_packet]
+*)
+
+(** [binary] is a generator of strings containing \x00 characters. *)
+let binary =
+  (* increase frequency of '\x00' to 10%, otherwise it'd be ~1/256 *)
+  string_gen (Gen.frequency [(10, Gen.return '\x00'); (90, Gen.char)])
+  |> set_print String.escaped
+
+(** [path_chars] valid path characters according to Xenstore protocol. *)
+let path_chars =
+  List.init 256 Char.chr
+  |> List.filter Store.Path.char_is_valid
+  |> Array.of_list |> Gen.oneofa
+
+(** [path_element] a valid path element *)
+let path_element =
+   string_gen_of_size Gen.small_int path_chars
+
+type tree = string Tree.t
+
+let paths = Tree.paths Store.Path.to_string path_element
+
+let with_validate p =
+  map_same_type
+  @@ fun v ->
+  (* reject it in a way known to QCheck: precondition failed,
+     instead of testcase failed *)
+  assume @@ p v ;
+  v
+
+(** [non_nul string_arb] rejects strings generated by [string_arb] that contain '\x00'. *)
+let non_nul = with_validate @@ fun s -> not (String.contains s '\x00')
+
+(** [plus arb] generates a list of 1 or more elements generated by [arb] *)
+let plus arb = list_of_size Gen.(map succ small_int) arb
+
+(** [star arb] generates a list of 0 or more elements generated by [arb] *)
+let star arb = list_of_size Gen.small_int arb
+
+let reserved =
+  string_of_size Gen.(frequency [(90, Gen.return 0); (10, Gen.small_int)])
+
+(** According to xenstore protocol this could go up to 65535, but an actual domid
+    shouldn't go above this value *)
+let domid_first_reserved = 0x7FF0
+
+(** [new_domid] generates DomU domids *)
+let new_domid = 1 -- domid_first_reserved
+
+let permty =
+  let open Perms in
+  oneofl [READ; WRITE; RDWR; NONE]
+
+let perms domid =
+  map
+    (fun (domid, other, acls) -> Perms.Node.create domid other acls)
+    ~rev:(fun n ->
+      (Perms.Node.get_owner n, Perms.Node.get_other n, Perms.Node.get_acl n))
+  @@ triple domid permty (small_list (pair domid permty))
diff --git a/tools/ocaml/xenstored/test/old/gen_paths.ml b/tools/ocaml/xenstored/test/old/gen_paths.ml
new file mode 100644
index 0000000000..b50c5b7cad
--- /dev/null
+++ b/tools/ocaml/xenstored/test/old/gen_paths.ml
@@ -0,0 +1,66 @@
+open QCheck
+open Store
+
+type tree = Leaf | Nodes of (string * tree) list
+
+let nodes children = Nodes children
+let gen_tree = QCheck.Gen.(sized @@ fix
+  (fun self n ->
+    let children = frequency [1, pure 0; 2, int_bound n] >>= fun m ->
+    match m with
+    | 0 -> pure []
+    | _ -> list_repeat m (pair string (self (n/m)))
+    in
+    frequency
+     [ 1, pure Leaf
+     ; 2, map nodes children
+    ]
+    ))
+
+let rec paths_of_tree (acc, path) = function
+| Leaf -> acc
+| Nodes l ->
+  List.fold_left (fun acc (k, children) ->
+    let path = k :: path in
+    paths_of_tree (Store.Path.to_string (List.rev path) :: acc, path) children
+  ) acc l
+
+let gen_paths_choices =
+  Gen.map (fun tree ->
+  tree |> paths_of_tree ([], []) |> Array.of_list
+  ) gen_tree
+
+(*let arb_name = Gen.small_string
+
+let arb_permty = let open Perms in oneofl [ READ; WRITE; RDWR; NONE ]
+
+let arb_domid = oneofl [ 0; 1; 0x7FEF]
+
+let arb_perms =
+   map (fun (domid, other, acls) -> Perms.Node.create domid other acls)
+   ~rev:(fun n -> Perms.Node.get_owner n, Perms.Node.get_other n, Perms.Node.get_acl n)
+   @@ triple arb_domid arb_permty (small_list (pair arb_domid arb_permty))*)
+
+let arb_name = Gen.small_string
+let arb_value = Gen.small_string
+
+let node_of name value children =
+  List.fold_left (fun c acc -> Node.add_child acc c)
+  (Node.create name Perms.Node.default0 value ) children
+
+let g = QCheck.Gen.(sized @@ fix
+  (fun self n ->
+      frequency [1, pure 0; 2, int_bound n] >>= fun m ->
+      let children = match m with
+      | 0 -> pure []
+      | _ -> list_repeat m (self (n/m))
+      in
+      map3 node_of arb_name arb_value children
+    ))
+
+let paths_of_tree t =
+  let paths = ref [] in
+  Store.traversal t (fun path node ->
+    paths := (Store.Path.of_path_and_name path (Node.get_name node) |> Store.Path.to_string) :: !paths
+  );
+  !paths
diff --git a/tools/ocaml/xenstored/test/old/xenstored_test.ml b/tools/ocaml/xenstored/test/old/xenstored_test.ml
new file mode 100644
index 0000000000..84cfc45d4f
--- /dev/null
+++ b/tools/ocaml/xenstored/test/old/xenstored_test.ml
@@ -0,0 +1,527 @@
+open Stdext
+open QCheck
+open Arbitrary
+
+let () =
+  (* Logging.access_log_nb_files := 1 ;
+     Logging.access_log_transaction_ops := true ;
+     Logging.access_log_special_ops := true ;
+     Logging.access_log_destination := File "/tmp/log" ;
+     Logging.init_access_log ignore ;
+     Logging.set_xenstored_log_destination "/dev/stderr";
+     Logging.init_xenstored_log (); *)
+  Domains.xenstored_port := "xenstored-port" ;
+  let f = open_out !Domains.xenstored_port in
+  Printf.fprintf f "%d" 1 ;
+  close_out f ;
+  Domains.xenstored_kva := "/dev/zero"
+
+module Command = struct
+  type value = string
+
+  let value = binary
+
+  type token = string
+
+  type txid = int
+
+  type domid = Xenctrl.domid
+
+  type t =
+    | Read of Store.Path.t
+    | Write of Store.Path.t * value
+    | Mkdir of Store.Path.t
+    | Rm of Store.Path.t
+    | Directory of Store.Path.t
+    (* | Directory_part not implemented *)
+    | Get_perms of Store.Path.t
+    | Set_perms of Store.Path.t * Perms.Node.t
+    | Watch of Store.Path.t * token
+    | Unwatch of Store.Path.t * token
+    | Reset_watches
+    | Transaction_start
+    | Transaction_end of bool
+    | Introduce of domid * nativeint * int
+    | Release of int
+    | Get_domain_path of domid
+    | Is_domain_introduced of domid
+    | Set_target of domid * domid
+    | LiveUpdate
+
+  type state =
+    { store: Store.t
+    ; doms: Domains.domains
+    ; cons: Connections.t
+    ; domids: int array }
+
+  let path = list path_element
+
+  let token = printable_string
+
+  let domid state = oneofa ~print:Print.int state.domids
+
+  let cmd state =
+    let domid = domid state in
+    let cmd_read = Case.make "READ" path (fun path -> Read path) in
+    let cmd_write =
+      Case.make "WRITE" (pair path value) (fun (path, value) ->
+          Write (path, value))
+    in
+    let cmd_mkdir = Case.make "MKDIR" path (fun path -> Mkdir path) in
+    let cmd_rm = Case.make "RM" path (fun path -> Rm path) in
+    let cmd_directory =
+      Case.make "DIRECTORY" path (fun path -> Directory path)
+    in
+    let cmd_get_perms =
+      Case.make "GET_PERMS" path (fun path -> Get_perms path)
+    in
+    let cmd_set_perms =
+      Case.make "SET_PERMS"
+        (pair path (perms domid))
+        (fun (path, perms) -> Set_perms (path, perms))
+    in
+    let cmd_watch =
+      Case.make "WATCH" (pair path token) (fun (path, token) ->
+          Watch (path, token))
+    in
+    let cmd_unwatch =
+      Case.make "UNWATCH" (pair path token) (fun (path, token) ->
+          Unwatch (path, token))
+    in
+    let cmd_reset_watches =
+      Case.make "RESET_WATCHES" unit (fun () -> Reset_watches)
+    in
+    let cmd_tx_start =
+      Case.make "TRANSACTION_START" unit (fun () -> Transaction_start)
+    in
+    let cmd_tx_end =
+      Case.make "TRANSACTION_END" bool (fun commit -> Transaction_end commit)
+    in
+    let cmd_introduce =
+      Case.make "INTRODUCE" (triple domid int int) (fun (domid, gfn, port) ->
+          Introduce (domid, Nativeint.of_int gfn, port))
+    in
+    let cmd_release = Case.make "RELEASE" domid (fun domid -> Release domid) in
+    let cmd_get_domain_path =
+      Case.make "GET_DOMAIN_PATH" domid (fun domid -> Get_domain_path domid)
+    in
+    let cmd_is_domain_introduced =
+      Case.make "IS_DOMAIN_INTRODUCED" domid (fun domid ->
+          Is_domain_introduced domid)
+    in
+    let cmd_set_target =
+      Case.make "SET_TARGET" (pair domid domid) (fun (domid, tdomid) ->
+          Set_target (domid, tdomid))
+    in
+    let cmd_live_update =
+      Case.make "CONTROL live-update" unit (fun () -> LiveUpdate)
+    in
+    let rev = function
+      | Read a ->
+          Case.call cmd_read a
+      | Write (p, v) ->
+          Case.call cmd_write (p, v)
+      | Mkdir a ->
+          Case.call cmd_mkdir a
+      | Rm a ->
+          Case.call cmd_rm a
+      | Directory a ->
+          Case.call cmd_directory a
+      | Get_perms a ->
+          Case.call cmd_get_perms a
+      | Set_perms (p, v) ->
+          Case.call cmd_set_perms (p, v)
+      | Watch (p, t) ->
+          Case.call cmd_watch (p, t)
+      | Unwatch (p, t) ->
+          Case.call cmd_unwatch (p, t)
+      | Reset_watches ->
+          Case.call cmd_reset_watches ()
+      | Transaction_start ->
+          Case.call cmd_tx_start ()
+      | Transaction_end a ->
+          Case.call cmd_tx_end a
+      | Introduce (d, g, p) ->
+          Case.call cmd_introduce (d, Nativeint.to_int g, p)
+      | Release a ->
+          Case.call cmd_release a
+      | Get_domain_path a ->
+          Case.call cmd_get_domain_path a
+      | Is_domain_introduced a ->
+          Case.call cmd_is_domain_introduced a
+      | Set_target (d, t) ->
+          Case.call cmd_set_target (d, t)
+      | LiveUpdate ->
+          Case.call cmd_live_update ()
+    in
+    let open Case in
+    sum ~rev
+      [ to_sum cmd_read
+      ; to_sum cmd_write
+      ; to_sum cmd_mkdir
+      ; to_sum cmd_rm
+      ; to_sum cmd_directory
+      ; to_sum cmd_get_perms
+      ; to_sum cmd_set_perms
+      ; to_sum cmd_watch
+      ; to_sum cmd_unwatch
+      ; to_sum cmd_reset_watches
+      ; to_sum cmd_tx_start
+      ; to_sum cmd_tx_end
+      ; to_sum cmd_introduce
+      ; to_sum cmd_release
+      ; to_sum cmd_get_domain_path
+      ; to_sum cmd_is_domain_introduced
+      ; to_sum cmd_set_target
+      ; to_sum cmd_live_update ]
+
+  let run tid =
+    let open Xenstore.Queueop in
+    function
+    | Read p ->
+        read tid Store.Path.(to_string p)
+    | Write (p, v) ->
+        write tid Store.Path.(to_string p) v
+    | Mkdir p ->
+        mkdir tid Store.Path.(to_string p)
+    | Rm p ->
+        rm tid Store.Path.(to_string p)
+    | Directory p ->
+        directory tid Store.Path.(to_string p)
+    | Get_perms p ->
+        getperms tid Store.Path.(to_string p)
+    | Set_perms (p, v) ->
+        setperms tid Store.Path.(to_string p) Perms.Node.(to_string v)
+    | Watch (p, t) ->
+        watch Store.Path.(to_string p) t
+    | Unwatch (p, t) ->
+        unwatch Store.Path.(to_string p) t
+    | Reset_watches ->
+        let open Xenbus in
+        fun con -> Xb.queue con (Xb.Packet.create 0 0 Xb.Op.Reset_watches "")
+    | Transaction_start ->
+        transaction_start
+    | Transaction_end c ->
+        transaction_end tid c
+    | Release d ->
+        release d
+    | Get_domain_path d ->
+        getdomainpath d
+    | Is_domain_introduced d ->
+        let open Xenbus in
+        fun con ->
+          Xb.queue con
+            (Xb.Packet.create 0 0 Xb.Op.Isintroduced (string_of_int d))
+    | Set_target (d, t) ->
+        let open Xenbus in
+        fun con ->
+          Xb.queue con
+            (Xb.Packet.create 0 0 Xb.Op.Isintroduced
+               (String.concat "\x00" [string_of_int d; string_of_int t]))
+    | LiveUpdate ->
+        debug ["live-update"; "-s"]
+    | Introduce (d, g, p) ->
+        introduce d g p
+end
+
+module Spec = struct
+  type cmd = New | Cmd of Command.domid * int option * Command.t
+
+  type state =
+    { xb: Xenbus.Xb.t
+    ; cnt: int
+    ; cmdstate: Command.state ref option
+    ; failure: (exn * string) option }
+
+  type sut = state ref
+
+  let doms = Domains.init (Event.init ()) ignore
+
+  let dom0 = Domains.create0 doms
+
+  let new_state () =
+    let cons = Connections.create () in
+    Connections.add_domain cons dom0 ;
+    let store = Store.create () in
+    let con = Perms.Connection.create 0 in
+    Store.mkdir store con ["tool"] ;
+    {Command.store; doms; cons; domids= [|0|]}
+
+  let print = function
+    | New ->
+        "NEW"
+    | Cmd (d, t, c) ->
+        let s = new_state () in
+        let cmd = Command.cmd s in
+        (Option.get (triple (Command.domid s) (option int) cmd).print) (d, t, c)
+
+  let shrink = function
+    | New ->
+        Iter.empty
+    | Cmd (d, t, c) ->
+        let s = new_state () in
+        let cmd = Command.cmd s in
+        Iter.map (fun (d, t, c) -> Cmd (d, t, c))
+        @@ (Option.get (triple (Command.domid s) (option int) cmd).shrink)
+             (d, t, c)
+
+  let arb_cmd state =
+    ( match state.cmdstate with
+    | None ->
+        always New
+    | Some s ->
+        let cmd = Command.cmd !s in
+        QCheck.map
+          (fun (d, t, c) -> Cmd (d, t, c))
+          ~rev:(fun (Cmd (d, t, c)) -> (d, t, c))
+        @@ triple (Command.domid !s) (option int) cmd )
+    |> set_print print |> set_shrink shrink
+
+  (*    |> set_collect (fun (_, _, c) -> (Option.get cmd.QCheck.collect) c)*)
+
+  let init_state =
+    {cnt= 0; xb= Xenbus.Xb.open_fd Unix.stdout; cmdstate= None; failure= None}
+
+  let precond cmd s =
+    match (cmd, s.cmdstate) with
+    | New, None ->
+        true
+    | New, _ ->
+        false
+    | Cmd _, None ->
+        false
+    | Cmd (_, _, Command.Release 0), _ ->
+        false
+    | _ ->
+        true
+
+  let next_state cmd state =
+    { ( try
+          assume (precond cmd state) ;
+          match cmd with
+          | New ->
+              {state with cmdstate= Some (ref @@ new_state ())}
+          | Cmd (domid, tid, cmd) ->
+              let tid = match tid with None -> 0 | Some id -> 1 + id in
+              Command.run tid cmd state.xb ;
+              let s = !(Option.get state.cmdstate) in
+              let con = Connections.find_domain s.Command.cons domid in
+              Queue.clear con.xb.pkt_out ;
+              let run_packet packet =
+                let tid, rid, ty, data = Xenbus.Xb.Packet.unpack packet in
+                let req = {Packet.tid; Packet.rid; Packet.ty; Packet.data} in
+                Process.process_packet ~store:s.Command.store
+                  ~cons:s.Command.cons ~doms:s.Command.doms ~con ~req ;
+                Process.write_access_log ~ty ~tid
+                  ~con:(Connection.get_domstr con)
+                  ~data ;
+                let packet = Connection.peek_output con in
+                let tid, _rid, ty, data = Xenbus.Xb.Packet.unpack packet in
+                Process.write_answer_log ~ty ~tid
+                  ~con:(Connection.get_domstr con)
+                  ~data
+              in
+              Queue.iter run_packet state.xb.pkt_out ;
+              Queue.clear state.xb.pkt_out ;
+              state
+        with e ->
+          let bt = Printexc.get_backtrace () in
+          {state with failure= Some (e, bt)} )
+      with
+      cnt= state.cnt + 1 }
+
+  let init_sut () = ref init_state
+
+  let cleanup _ = ()
+
+  module P = struct
+    type t = string list
+
+    let compare = compare
+  end
+
+  module PathMap = Map.Make (P)
+
+  module DomidMap = Map.Make (struct
+    type t = Xenctrl.domid
+
+    let compare = compare
+  end)
+
+  module IntMap = Map.Make (struct
+    type t = int
+
+    let compare = compare
+  end)
+
+  module FDMap = Map.Make (struct
+    type t = Unix.file_descr
+
+    let compare = compare
+  end)
+
+  let map_of_store s =
+    let m = ref PathMap.empty in
+    Store.dump_fct s (fun path node -> m := PathMap.add path node !m) ;
+    !m
+
+  let node_equiv n n' =
+    Perms.equiv (Store.Node.get_perms n) (Store.Node.get_perms n')
+    && Store.Node.get_name n = Store.Node.get_name n'
+    && Store.Node.get_value n = Store.Node.get_value n'
+
+  let store_root_equiv s s' =
+    if not (PathMap.equal node_equiv (map_of_store s) (map_of_store s')) then
+      let b = Store.dump_store_buf s.root in
+      let b' = Store.dump_store_buf s'.root in
+      Test.fail_reportf "Store trees are not equivalent:\n %s\n <>\n %s"
+        (Buffer.contents b) (Buffer.contents b')
+    else true
+
+  let map_of_domid_table tbl = Hashtbl.fold DomidMap.add tbl DomidMap.empty
+
+  let map_of_quota q = map_of_domid_table q.Quota.cur
+
+  let store_quota_equiv root root' q q' =
+    let _ =
+      DomidMap.merge
+        (fun domid q q' ->
+          let q = Option.value ~default:(-1) q in
+          let q' = Option.value ~default:(-1) q' in
+          if q <> q' then
+            let b = Store.dump_store_buf root in
+            let b' = Store.dump_store_buf root' in
+            Test.fail_reportf "quota mismatch on %d: %d <> %d\n%s\n%s\n" domid q
+              q' (Buffer.contents b) (Buffer.contents b')
+          else Some q)
+        (map_of_quota q) (map_of_quota q')
+    in
+    true
+
+  let store_equiv s s' =
+    store_root_equiv s s'
+    && store_quota_equiv s.root s'.root (Store.get_quota s) (Store.get_quota s')
+
+  let map_of_domains d = map_of_domid_table d.Domains.table
+
+  let domain_equiv d d' =
+    Domain.get_id d = Domain.get_id d'
+    && Domain.get_remote_port d = Domain.get_remote_port d'
+
+  let domains_equiv d d' =
+    DomidMap.equal domain_equiv (map_of_domains d) (map_of_domains d')
+
+  let map_of_fd_table tbl = Hashtbl.fold FDMap.add tbl FDMap.empty
+
+  let map_of_int_table tbl = Hashtbl.fold IntMap.add tbl IntMap.empty
+
+  let list_of_queue q = Queue.fold (fun acc e -> e :: acc) [] q
+
+  let connection_equiv c c' =
+    let l = list_of_queue c.Connection.xb.pkt_out in
+    let l' = list_of_queue c'.Connection.xb.pkt_out in
+    if List.length l <> List.length l' || List.exists2 ( <> ) l l' then (
+      let print_packets l =
+        l
+        |> List.rev_map (fun p ->
+               let tid, rid, ty, data = Xenbus.Packet.unpack p in
+               let tystr = Xenbus.Xb.Op.to_string ty in
+               Printf.sprintf "tid=%d, rid=%d, ty=%s, data=%s" tid rid tystr
+                 (String.escaped data))
+        |> String.concat "\n"
+      in
+      let r = print_packets l in
+      let r' = print_packets l' in
+      Test.fail_reportf "Replies not equal:\n%s\n <>\n %s" r r' )
+    else
+      let n = Connection.number_of_transactions c in
+      let n' = Connection.number_of_transactions c' in
+      if n <> n' then Test.fail_reportf "TX count mismatch: %d <> %d" n n'
+      else true
+
+  let connections_equiv c c' =
+    FDMap.equal connection_equiv
+      (map_of_fd_table c.Connections.anonymous)
+      (map_of_fd_table c'.Connections.anonymous)
+    && IntMap.equal connection_equiv
+         (map_of_int_table c.Connections.domains)
+         (map_of_int_table c'.Connections.domains)
+
+  let dump_load s =
+    let tmp = Filename.temp_file "xenstored" "qcheck.dump" in
+    finally
+      (fun () ->
+        let fds = {Xenstored.DB.rw_sock= None; ro_sock= None} in
+        Xenstored.DB.to_file fds !s.Command.store !s.Command.cons tmp ;
+        s := new_state () ;
+        let _fds', errors =
+          Xenstored.DB.from_file ~live:true !s.Command.store !s.Command.doms
+            !s.Command.cons tmp
+        in
+        if errors > 0 then
+          Test.fail_reportf "Errors during live update: %d" errors)
+      (fun () -> Sys.remove tmp)
+
+  let run_cmd cmd state sut =
+    ( match state.failure with
+    | None ->
+        true
+    | Some (e, bt) ->
+        Test.fail_reportf "Exception %s, backtrace: %s" (Printexc.to_string e)
+          bt )
+    &&
+    match cmd with
+    | New ->
+        sut := next_state cmd !sut ;
+        true
+    | Cmd (0, _, Command.LiveUpdate) ->
+        let s = !sut.cmdstate in
+        let store1 = Store.copy !(Option.get s).store in
+        let doms1 = !(Option.get s).doms in
+        dump_load (Option.get s) ;
+        (* reply is expected not to be equivalent, since after live update we got an empty reply queue,
+           so don't compare connections
+        *)
+        store_equiv store1 !(Option.get s).store
+        && domains_equiv doms1 !(Option.get s).doms
+    | Cmd(_, _, cmd') -> (
+        (* TODO: also got same reply, and check for equivalence on the actual Live Update *)
+        sut := next_state cmd !sut ;
+        let ids = Hashtbl.create 47 in
+        Connections.iter !(Option.get state.cmdstate).cons (fun con ->
+            Hashtbl.add ids (Connection.get_id con) con.next_tid) ;
+        let state = next_state cmd state in
+        match (!sut.failure, state.cmdstate, !sut.cmdstate) with
+        | None, Some s, Some s' ->
+            let r = cmd' = Command.Transaction_start (* txid can change *) || 
+               connections_equiv !s.cons !s'.cons in
+            Connections.iter !(Option.get state.cmdstate).cons (fun con ->
+                let tid = Hashtbl.find ids (Connection.get_id con) in
+                if con.next_tid <> tid then (
+                  let (_ : bool) = Connection.end_transaction con tid None in
+                  () ;
+                  con.next_tid <- tid )) ;
+            r
+        | None, None, None ->
+            true
+        | None, None, _ ->
+            Test.fail_report "state uninit"
+        | None, _, None ->
+            Test.fail_report "sut uninit"
+        | Some (e, bt), _, _ ->
+            Test.fail_reportf "Exception %s, backtrace: %s"
+              (Printexc.to_string e) bt )
+end
+
+module States = QCSTM.Make (Spec)
+
+(* && watches_equiv c c' *)
+
+let test = States.agree_test ~count:100 ~name:"live-update"
+
+let test =
+  Test.make ~name:"live-update" ~count:100
+    (States.arb_cmds Spec.init_state)
+    States.agree_prop
+
+let () = QCheck_base_runner.run_tests_main [test]
diff --git a/tools/ocaml/xenstored/test/pathtree.ml b/tools/ocaml/xenstored/test/pathtree.ml
new file mode 100644
index 0000000000..50cbb0302d
--- /dev/null
+++ b/tools/ocaml/xenstored/test/pathtree.ml
@@ -0,0 +1,40 @@
+module M = Map.Make(String)
+type 'a t = { data: 'a; children: 'a t M.t }
+
+type 'a tree = 'a t
+let of_data data = { data; children = M.empty }
+
+let update key f t = { t with children = M.update key f t.children }
+let set t data = { t with data }
+
+module Cursor = struct
+  type 'a t = { tree: 'a tree; up: ('a t * M.key) option }
+
+  let of_tree tree = { tree; up = None }
+
+  let create parent key tree = { tree; up = Some (parent, key) }
+
+  let down cur k =
+    M.find_opt k cur.tree.children |> Option.map @@ create cur k
+
+  let down_implicit_create ~implicit cur k =
+    match down cur k with
+    | Some r -> r
+    | None -> cur.tree.data |> implicit |> of_data |> create cur k
+
+  let rec to_tree t = match t.up with
+    | None -> t.tree
+    | Some (parent, key) ->
+        to_tree { parent with tree = update key (fun _ -> Some t.tree) parent.tree }
+
+  let set cur data = { cur with tree = set cur.tree data }
+  let get cur = cur.tree.data
+
+  let rm_child cur key = { cur with tree = update key (fun _ -> None) cur.tree}
+
+  (* TODO: down with implicit create *)
+end
+
+
+
+let rec map f t = { data = f t.data; children = M.map (map f) t.children }
diff --git a/tools/ocaml/xenstored/test/testable.ml b/tools/ocaml/xenstored/test/testable.ml
new file mode 100644
index 0000000000..ec50b10391
--- /dev/null
+++ b/tools/ocaml/xenstored/test/testable.ml
@@ -0,0 +1,364 @@
+let is_output_devnull = Unix.stat "/dev/null" = Unix.fstat Unix.stdout
+
+let () =
+  if not is_output_devnull then (
+    Printexc.record_backtrace true ;
+    Fmt_tty.setup_std_outputs () ;
+    try
+      let cols =
+        let ch = Unix.open_process_in "tput cols" in
+        Stdext.finally
+          (fun () -> input_line ch |> int_of_string)
+          (fun () -> Unix.close_process_in ch)
+      in
+      Format.set_margin cols
+    with _ -> () )
+
+let devnull () = Unix.openfile "/dev/null" [] 0
+
+let xb = Xenbus.Xb.open_fd (devnull ())
+
+module Command = struct
+  type path = Store.Path.t
+
+  type value = string
+
+  type token = string
+
+  type domid = int
+
+  type t = Xenbus.Packet.t
+
+  open Xenstore.Queueop
+
+  let cmd f =
+    Queue.clear xb.pkt_out ;
+    let () = f xb in
+    let p = Xenbus.Xb.peek_output xb in
+    Queue.clear xb.pkt_out ; p
+
+  let pathcmd f pathgen tid state = cmd @@ f tid @@ pathgen state
+
+  let cmd_read gen tid state = pathcmd read gen tid state
+
+  let cmd_write pathgen v tid state = cmd @@ write tid (pathgen state) v
+
+  let cmd_mkdir g t s = pathcmd mkdir g t s
+
+  let cmd_rm g t s = pathcmd rm g t s
+
+  let cmd_directory g t s = pathcmd directory g t s
+
+  let cmd_getperms g t s = pathcmd getperms g t s
+
+  let cmd_setperms pathgen vgen tid state =
+    cmd @@ setperms tid (pathgen state) (Perms.Node.to_string @@ vgen state)
+
+  let cmd_watch pathgen token _ state = cmd @@ watch (pathgen state) token
+
+  let cmd_unwatch pathgen token _ state = cmd @@ unwatch (pathgen state) token
+
+  let cmd_reset_watches tid _state =
+    let open Xenbus in
+    cmd
+    @@ fun con ->
+    Xenbus.Xb.queue con
+      (Xenbus.Xb.Packet.create 0 0 Xenbus.Xb.Op.Reset_watches "")
+
+  let cmd_transaction_start _ _ = cmd @@ transaction_start
+
+  let cmd_transaction_end commit tid _ = cmd @@ transaction_end tid commit
+
+  let domcmd f idgen _ state = cmd @@ f @@ idgen state
+
+  let cmd_release idgen state = domcmd release idgen state
+
+  let cmd_getdomainpath i s = domcmd getdomainpath i s
+
+  let cmd_isintroduced i t s =
+    domcmd
+      (fun d con ->
+        let open Xenbus in
+        Xenbus.Xb.queue con
+          (Xenbus.Xb.Packet.create 0 0 Xenbus.Xb.Op.Isintroduced
+             (string_of_int d)))
+      i t s
+
+  let cmd_set_target idgen1 idgen2 _ state =
+    let d = idgen1 state in
+    let t = idgen2 state in
+    cmd
+    @@ fun con ->
+    Xenbus.Xb.queue con
+      (Xenbus.Xb.Packet.create 0 0 Xenbus.Xb.Op.Isintroduced
+         (String.concat "\x00" [string_of_int d; string_of_int t]))
+
+  let cmd_liveupdate _ _ = cmd @@ debug ["live-update"; "-s"]
+
+  let cmd_introduce id port _ state = cmd @@ introduce id 0n port
+
+  let pp_dump = Types.pp_dump_packet
+
+  let precond cmd _state =
+    match cmd with
+    | {Xenbus.Packet.ty= Xenbus.Xb.Op.Release; data= "0\000"} ->
+        false
+        (* can't release Dom0 in the tests, or we crash due to shared dom0 backend *)
+    | {ty= Xenbus.Xb.Op.Rm; data= ""} ->
+        (* this is expected to cause inconsistencies on pre-created paths like /local *)
+        false
+    | _ ->
+        true
+end
+
+let with_logger ~on_exn f =
+  if is_output_devnull then f ()
+  else
+    let old = (!Logging.xenstored_logger, !Logging.access_logger) in
+    let logs = ref [] in
+    let write ?(level = Logging.Debug) s =
+      let msg = Printf.sprintf "%s %s" (Logging.string_of_level level) s in
+      logs := msg :: !logs
+    in
+    let logger =
+      Some {Logging.stop= ignore; restart= ignore; rotate= ignore; write}
+    in
+    Logging.xenstored_logger := logger ;
+    Logging.access_logger := logger ;
+    Stdext.finally
+      (fun () ->
+        try f ()
+        with e ->
+          let bt = Printexc.get_raw_backtrace () in
+          on_exn e bt (List.rev !logs))
+      (fun () ->
+        Logging.xenstored_logger := fst old ;
+        Logging.access_logger := snd old)
+
+type t =
+  { store: Store.t
+  ; cons: Connections.t
+  ; doms: Domains.domains
+  ; mutable anon: Unix.file_descr option
+  ; live_update: bool
+  ; txidtbl: (int, int) Hashtbl.t }
+
+let () =
+  Logging.xenstored_log_level := Logging.Debug ;
+  Logging.access_log_special_ops := true ;
+  Logging.access_log_transaction_ops := true ;
+  let name, f = Filename.open_temp_file "xenstored" "port" in
+  Domains.xenstored_port := name ;
+  Stdext.finally (fun () -> Printf.fprintf f "%d" 1) (fun () -> close_out f) ;
+  Domains.xenstored_kva := "/dev/zero" ;
+  (* entries from a typical oxenstored.conf *)
+  Transaction.do_coalesce := true ;
+  Perms.activate := true ;
+  Quota.activate := true ;
+  Quota.maxent := 8192 ;
+  Quota.maxsize := 2048 ;
+  Define.maxwatch := 512 ;
+  Define.maxtransaction := 10 ;
+  Define.maxrequests := 1024
+
+(* we MUST NOT release dom0, or we crash,
+   this is shared between multiple tests, because
+   it keeps an FD open, and we want to avoid EMFILE
+*)
+
+let create ?(live_update = false) () =
+  let store = Store.create () in
+  let cons = Connections.create () in
+  let doms = Domains.init (Event.init ()) ignore in
+  let dom0 = Domains.create0 doms in
+  let txidtbl = Hashtbl.create 47 in
+  Connections.add_domain cons dom0 ;
+  {store; cons; doms; anon= None; live_update; txidtbl}
+
+let cleanup t = Connections.iter t.cons Connection.close
+
+let init t =
+  let local = Store.Path.of_string "/local" in
+  let con = Perms.Connection.create 0 in
+  Store.mkdir t.store con local ;
+  Store.mkdir t.store con (Store.Path.of_string "/tool") ;
+  let fd = devnull () in
+  t.anon <- Some fd ;
+  Connections.add_anonymous t.cons fd
+
+let dump_load s =
+  let tmp = Filename.temp_file "xenstored" "qcheck.dump" in
+  Stdext.finally
+    (fun () ->
+      Xenstored.DB.to_file None s.store s.cons tmp ;
+      let s' = create () in
+      (* preserve FD *)
+      s'.anon <- s.anon ;
+      s.anon <- None ;
+      let _fds', errors =
+        Xenstored.DB.from_file ~live:true s'.store s'.doms s'.cons tmp
+      in
+      if errors > 0 then
+        failwith (Printf.sprintf "Errors during live update: %d" errors) ;
+      s')
+    (fun () -> Sys.remove tmp)
+
+let is_live_update = function
+  | {Xenbus.Packet.ty= Xenbus.Xb.Op.Debug; data= "live-update\000-s\000"} ->
+      true
+  | _ ->
+      false
+
+let is_tx_start p = p.Xenbus.Packet.ty = Xenbus.Xb.Op.Transaction_start
+
+let with_tmpfile prefix write f =
+  let name, ch = Filename.open_temp_file prefix ".txt" in
+  Stdext.finally
+    (fun () ->
+      Stdext.finally (fun () -> write ch) (fun () -> close_out ch) ;
+      f name)
+    (fun () -> Sys.remove name)
+
+let with_pp_to_file prefix pp x f =
+  let write ch =
+    let ppf = Format.formatter_of_out_channel ch in
+    Format.pp_set_margin ppf @@ Format.get_margin () ;
+    pp ppf x ;
+    Fmt.flush ppf ()
+  in
+  with_tmpfile prefix write f
+
+let run_cmd_get_output ?(ok_codes = [0]) cmd =
+  let cmd = Array.of_list cmd in
+  let ch = Unix.open_process_args_in cmd.(0) cmd in
+  Stdext.finally
+    (fun () ->
+      let lines = ref [] in
+      try
+        while true do
+          lines := input_line ch :: !lines
+        done ;
+        assert false
+      with End_of_file -> List.rev !lines |> String.concat "\n")
+    (fun () ->
+      match Unix.close_process_in ch with
+      | Unix.WEXITED code when List.mem code ok_codes ->
+          ()
+      | status ->
+          Crowbar.failf "%a %a" (Fmt.array Fmt.string) cmd
+            Types.pp_process_status status)
+
+let call_diff x y =
+  let ok_codes = [0; 1] in
+  run_cmd_get_output ~ok_codes
+    [ "/usr/bin/git"
+    ; "diff"
+    ; "-U10000" (* we want to see the entire state, where possible *)
+    ; "--no-index"
+    ; ( "--word-diff="
+      ^ if Fmt.style_renderer Fmt.stdout = `Ansi_tty then "color" else "plain"
+      )
+    ; "--color-moved=dimmed-zebra"
+    ; x
+    ; y ]
+
+let check_eq_exn prefix ~pp ~eq x y =
+  if not @@ eq x y then
+    if is_output_devnull then failwith "different"
+    else
+      with_pp_to_file "expected" pp x
+      @@ fun xfile ->
+      with_pp_to_file "actual" pp y
+      @@ fun yfile ->
+      failwith
+      @@ Printf.sprintf "%s agrement: %s" prefix (call_diff xfile yfile)
+
+let run next_tid t (domid, cmd) =
+  let con =
+    match domid with
+    | 0 ->
+        Connections.find !t.cons (Option.get !t.anon)
+    | id ->
+        Connections.find_domain !t.cons domid
+  in
+  (* clear out any watch events, TODO: don't  *)
+  Connections.iter !t.cons (fun con -> Queue.clear con.xb.pkt_out) ;
+  (* TODO: use the global live update state that processing the command sets, but remember to reset it *)
+  if is_live_update cmd then
+    if !t.live_update then (
+      let t0 = !t in
+      let t' = dump_load t0 in
+      Connections.iter t0.cons (fun con ->
+          Connection.iter_transactions con
+          @@ fun _ tx ->
+             if tx.Transaction.operations <> [] then
+             Transaction.mark_failed tx) ;
+      check_eq_exn "store" ~pp:Types.pp_dump_store ~eq:Types.equal_store
+        t0.store t'.store ;
+      check_eq_exn "connections" ~pp:Types.pp_dump_connections
+        ~eq:Types.equal_connections t0.cons t'.cons ;
+      check_eq_exn "domains" ~pp:Types.pp_dump_domains ~eq:Types.equal_domains
+        t0.doms t'.doms ;
+      (* avoid double close on anonymous conn *)
+      Connections.iter_domains t0.cons Connection.close ;
+      t := {t' with txidtbl= !t.txidtbl} )
+    else begin
+      Logging.debug "testable" "BEFORE TXMARK";
+      Connections.iter !t.cons (fun con ->
+          Connection.iter_transactions con
+          @@ fun txid tx ->
+             Logging.debug "testable" "marking to fail %d" txid; 
+             if tx.Transaction.operations <> [] then
+             Transaction.mark_failed tx) 
+    end;
+  let run_packet packet =
+    let tid, rid, ty, data = Xenbus.Xb.Packet.unpack packet in
+    Logging.debug "testable" "tid: %d" tid ;
+    let tid = if tid <> 0 then Hashtbl.find !t.txidtbl tid else tid in
+    let req : Packet.request =
+      {Packet.tid; Packet.rid; Packet.ty; Packet.data}
+    in
+    Process.process_packet ~store:!t.store ~cons:!t.cons ~doms:!t.doms ~con ~req ;
+    Process.write_access_log ~ty ~tid ~con:(Connection.get_domstr con) ~data ;
+    let packet = Connection.peek_output con in
+    if ty = Xenbus.Xb.Op.Transaction_start then (
+      Logging.debug "testable" "Adding mapping for tid %d" next_tid ;
+      Hashtbl.add !t.txidtbl next_tid (con.Connection.next_tid - 1) ) ;
+    let tid, _rid, ty, data = Xenbus.Xb.Packet.unpack packet in
+    Process.write_answer_log ~ty ~tid ~con:(Connection.get_domstr con) ~data
+  in
+  (* TODO: also a Nodes command with multiple packets *)
+  run_packet cmd ; (* TODO: act on and clear watches? *)
+                   con
+
+let run2 next_tid t t' (domid, cmd) =
+  let con = run next_tid t (domid, cmd) in
+  let con' = run next_tid t' (domid, cmd) in
+  (* TODO: ignore txid mismatches on transactions *)
+  if not @@ is_tx_start cmd then
+    check_eq_exn "reply packets" ~pp:Types.pp_dump_xb ~eq:Types.equal_xb_pkts
+      con.xb con'.xb ;
+  Queue.clear con'.xb.pkt_out ;
+  Queue.clear con.xb.pkt_out
+
+module type S = sig
+  type cmd
+
+  type state
+
+  type sut
+
+  val init_state : state
+
+  val next_state : cmd -> state -> state
+
+  val init_sut : unit -> sut
+
+  val cleanup : sut -> unit
+
+  val run_cmd : cmd -> state -> sut -> bool
+
+  val precond : cmd -> state -> bool
+
+  val pp : cmd Fmt.t
+end
diff --git a/tools/ocaml/xenstored/test/types.ml b/tools/ocaml/xenstored/test/types.ml
new file mode 100644
index 0000000000..a85168cbcf
--- /dev/null
+++ b/tools/ocaml/xenstored/test/types.ml
@@ -0,0 +1,427 @@
+(*
+ * Copyright (C) Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *)
+
+let domid_first_reserved = 0x7FF0
+
+type 'a eq = 'a -> 'a -> bool
+
+let hashtable_equal (eq : 'a eq) h h' =
+  Hashtbl.length h = Hashtbl.length h'
+  && Hashtbl.fold
+       (fun k v acc ->
+         acc
+         && match Hashtbl.find_opt h' k with Some x -> eq v x | None -> false)
+       h true
+
+let list_equal (eq : 'a eq) l l' =
+  try List.for_all2 eq l l' with Invalid_argument _ -> false
+
+let queue_equal eq q q' =
+  Queue.length q = Queue.length q'
+  &&
+  let list_of_queue q = Queue.fold (fun acc e -> e :: acc) [] q in
+  list_equal eq (list_of_queue q) (list_of_queue q')
+
+let pp_process_status ppf = function
+  | Unix.WEXITED code ->
+      Fmt.pf ppf "exited with code %d" code
+  | Unix.WSIGNALED osig ->
+      Fmt.pf ppf "killed by signal %a" Fmt.Dump.signal osig
+  | Unix.WSTOPPED osig ->
+      Fmt.pf ppf "stopped by signal %a" Fmt.Dump.signal osig
+
+let pp_dump_ref dump =
+  Fmt.using ( ! ) Fmt.(dump |> Fmt.braces |> prefix (const string "ref"))
+
+let pp_file_descr = Fmt.using Disk.FD.to_int Fmt.int
+
+module Quota = struct
+  open Quota
+
+  let pp_dump =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "maxent" (fun q -> q.maxent) int
+      ; Dump.field "maxsize" (fun q -> q.maxsize) int
+      ; Dump.field "cur" (fun q -> q.cur) @@ Dump.hashtbl int int ]
+
+  let equal q q' =
+    q.maxent = q'.maxent && q.maxsize = q'.maxsize
+    && hashtable_equal Int.equal q.cur q'.cur
+end
+let pp_dump_quota = Quota.pp_dump
+let equal_quota = Quota.equal
+
+module Store = struct
+  open Store
+
+  module Node = struct
+    open Node
+
+    let pp_dump ppf t =
+      let buf = dump_store_buf t in
+      Fmt.lines ppf (Buffer.contents buf)
+
+    let rec equal n n' =
+      Symbol.equal n.name n'.name
+      && Perms.equiv n.perms n'.perms
+      && String.equal n.value n'.value
+      && SymbolMap.equal equal n.children n'.children
+  end
+
+  module Path = struct
+    open Path
+
+    let pp_dump = Fmt.using to_string Fmt.string
+
+    let equal p p' = list_equal String.equal p p'
+
+    let hash (p : t) = Hashtbl.hash p
+
+    let compare (p : t) (p' : t) = compare p p'
+  end
+
+  let pp_dump =
+    let open Fmt in
+    (* only print relevant fields, expected to stay same during live-update. *)
+    Dump.record
+      [ Dump.field "stat_transaction_coalesce"
+          (fun t -> t.stat_transaction_coalesce)
+          int
+      ; Dump.field "stat_transaction_abort"
+          (fun t -> t.stat_transaction_coalesce)
+          int
+      ; Dump.field "store" (fun t -> t.root) Node.pp_dump
+      ; Dump.field "quota" (fun t -> t.quota) Quota.pp_dump ]
+
+  let equal s s' =
+    (* ignore stats *)
+    Node.equal s.root s'.root && Quota.equal s.quota s'.quota
+end
+
+let pp_dump_store = Store.pp_dump
+let equal_store = Store.equal
+
+module Xb = struct
+  open Xenbus.Xb
+
+  module Op = struct
+    open Xenbus.Op
+
+    let pp_dump = Fmt.of_to_string to_string
+
+    let equal (op : t) (op' : t) = op = op'
+  end
+
+  module Packet = struct
+    open Xenbus.Packet
+
+    let pp_dump =
+      let open Fmt in
+      Dump.record
+        [ Dump.field "tid" get_tid int
+        ; Dump.field "rid" get_rid int
+        ; Dump.field "ty" get_ty Op.pp_dump
+        ; Dump.field "data" get_data Dump.string ]
+
+    let equal (p : t) (p' : t) =
+      (* ignore TXID, it can be different after a live-update *)
+      p.rid = p'.rid && p.ty = p'.ty && p.data = p'.data
+  end
+
+  module Partial = struct
+    open Xenbus.Partial
+
+    let pp_dump =
+      let open Fmt in
+      Dump.record
+        [ Dump.field "tid" (fun p -> p.tid) int
+        ; Dump.field "rid" (fun p -> p.rid) int
+        ; Dump.field "ty" (fun p -> p.ty) Op.pp_dump
+        ; Dump.field "len" (fun p -> p.len) int
+        ; Dump.field "buf" (fun p -> p.buf) Fmt.buffer ]
+
+    let equal p p' =
+      p.tid = p'.tid && p.rid = p'.rid && p.ty = p'.ty
+      && Buffer.contents p.buf = Buffer.contents p'.buf
+  end
+
+  let pp_dump_partial_buf ppf = function
+    | HaveHdr pkt ->
+        Fmt.pf ppf "HaveHdr %a" Partial.pp_dump pkt
+    | NoHdr (i, b) ->
+        Fmt.pf ppf "NoHdr(%d, %S)" i (Bytes.to_string b)
+
+  let equal_partial_buf buf buf' =
+    match (buf, buf') with
+    | HaveHdr pkt, HaveHdr pkt' ->
+        Partial.equal pkt pkt'
+    | NoHdr (i, b), NoHdr (i', b') ->
+        i = i' && b = b'
+    | HaveHdr _, NoHdr _ | NoHdr _, HaveHdr _ ->
+        false
+
+  let pp_backend ppf = function
+    | Fd {fd} ->
+        Fmt.prefix (Fmt.const Fmt.string "Fd ") pp_file_descr ppf fd
+    | Xenmmap _ ->
+        Fmt.const Fmt.string "Xenmmap _" ppf ()
+
+  let equal_backend b b' =
+    match (b, b') with
+    | Fd fd, Fd fd' ->
+        fd = fd'
+    | Xenmmap _, Xenmmap _ ->
+        true (* can't extract the FD to compare *)
+    | Fd _, Xenmmap _ | Xenmmap _, Fd _ ->
+        false
+
+  let pp_dump =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "backend" (fun x -> x.backend) pp_backend
+      ; Dump.field "pkt_in" (fun x -> x.pkt_in) @@ Dump.queue Packet.pp_dump
+      ; Dump.field "pkt_out" (fun x -> x.pkt_out) @@ Dump.queue Packet.pp_dump
+      ; Dump.field "partial_in" (fun x -> x.partial_in) pp_dump_partial_buf
+      ; Dump.field "partial_out" (fun x -> x.partial_out) Dump.string ]
+
+  let equal_pkts xb xb' =
+    let queue_eq = queue_equal Packet.equal in
+    queue_eq xb.pkt_in xb'.pkt_in
+    && queue_eq xb.pkt_out xb'.pkt_out
+    && xb.partial_in = xb'.partial_in
+    && xb.partial_out = xb'.partial_out
+
+  let equal xb xb' = equal_backend xb.backend xb'.backend && equal_pkts xb xb'
+end
+
+let pp_dump_packet = Xb.Packet.pp_dump
+let pp_dump_xb = Xb.pp_dump
+let equal_xb = Xb.equal
+let equal_xb_pkts = Xb.equal_pkts
+
+module Packet = struct
+  open Packet
+
+  let pp_dump_request =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "tid" (fun t -> t.tid) int
+      ; Dump.field "rid" (fun t -> t.rid) int
+      ; Dump.field "ty" (fun t -> t.ty) Xb.Op.pp_dump
+      ; Dump.field "data" (fun t -> t.data) Dump.string ]
+
+  let equal_req r r' =
+    r.tid = r'.tid && r.rid = r'.rid && r.ty = r'.ty && r.data = r'.data
+
+  let pp_dump_response ppf = function
+    | Reply str ->
+        Fmt.pf ppf "Reply %S" str
+    | Error str ->
+        Fmt.pf ppf "Error %S" str
+    | Ack _ ->
+        Fmt.string ppf "Ack"
+
+  let equal_response = response_equal
+end
+
+module Transaction = struct
+  open Transaction
+
+  let pp_dump_ty ppf = function
+    | Transaction.No ->
+        Fmt.string ppf "No"
+    | Full (id, orig, canonical) ->
+        Fmt.pf ppf "Full @[(%d, %a, %a)@]" id Store.pp_dump orig Store.pp_dump
+          canonical
+
+  let equal_ty t t' =
+    match (t, t') with
+    | Transaction.No, Transaction.No ->
+        true
+    | Transaction.Full _, Transaction.Full _ ->
+        (* We expect the trees not to be identical, so we ignore any differences here.
+           The reply comparison tests will find any mismatches in observable transaction state
+        *)
+        true
+    | Transaction.No, Transaction.Full _ | Transaction.Full _, Transaction.No ->
+        false
+
+  let equal_pathop (op, path) (op', path') =
+    op = op' && Store.Path.equal path path'
+
+  let pp_dump_op = Fmt.pair Packet.pp_dump_request Packet.pp_dump_response
+
+  let equal_op (req, reply) (req', reply') =
+    Packet.equal_req req req' && Packet.equal_response reply reply'
+
+  let pp_dump =
+    let open Fmt in
+    let open Transaction in
+    Dump.record
+      [ Dump.field "ty" (fun t -> t.ty) pp_dump_ty
+      ; Dump.field "start_count" (fun t -> t.start_count) int64
+      ; Dump.field "store" (fun t -> t.store) Store.pp_dump
+      ; Dump.field "quota" (fun t -> t.quota) Quota.pp_dump
+      ; Dump.field "must_fail" (fun t -> t.must_fail) Fmt.bool
+      ; Dump.field "paths" (fun t -> t.paths)
+        @@ Dump.list (pair Xb.Op.pp_dump Store.Path.pp_dump)
+      ; Dump.field "operations" (fun t -> t.operations)
+        @@ list (pair Packet.pp_dump_request Packet.pp_dump_response)
+      ; Dump.field "read_lowpath" (fun t -> t.read_lowpath)
+        @@ option Store.Path.pp_dump
+      ; Dump.field "write_lowpath" (fun t -> t.write_lowpath)
+        @@ option Store.Path.pp_dump ]
+
+  let equal t t' =
+    equal_ty t.ty t'.ty
+    (* ignored: quota at start of transaction, not relevant
+       && Quota.equal t.quota t'.quota *)
+    (*&& list_equal equal_pathop t.paths t'.paths *)
+    (*&& list_equal equal_op t.operations t'.operations*)
+    && t.must_fail = t'.must_fail
+    (* ignore lowpath, impossible to recreate from limited migration info *)
+    (*&& Option.equal Store.Path.equal t.read_lowpath t'.read_lowpath
+    && Option.equal Store.Path.equal t.write_lowpath t'.write_lowpath *)
+end
+
+module Connection = struct
+  open Connection
+
+  let pp_dump_watch =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "token" (fun w -> w.token) Dump.string
+      ; Dump.field "path" (fun w -> w.path) Dump.string
+      ; Dump.field "base" (fun w -> w.base) Dump.string
+      ; Dump.field "is_relative" (fun w -> w.is_relative) Fmt.bool ]
+
+  let pp_dump =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "xb" (fun c -> c.xb) Xb.pp_dump
+      ; Dump.field "transactions" (fun c -> c.transactions)
+        @@ Dump.hashtbl int Transaction.pp_dump
+      ; Dump.field "next_tid" (fun t -> t.next_tid) int
+      ; Dump.field "nb_watches" (fun c -> c.nb_watches) int
+      ; Dump.field "anonid" (fun c -> c.anonid) int
+      ; Dump.field "watches" (fun c -> c.watches)
+        @@ Dump.hashtbl Dump.string (Dump.list pp_dump_watch)
+      ; Dump.field "perm" (fun c -> c.perm)
+        @@ Fmt.using Perms.Connection.to_string Fmt.string ]
+
+  let equal c c' =
+    let watch_equal w w' =
+      (* avoid recursion, these must be physically equal *)
+      w.con == c && w'.con == c' && w.token = w'.token && w.path = w'.path
+      && w.base = w'.base
+      && w.is_relative = w'.is_relative
+    in
+    Xb.equal c.xb c'.xb
+    && hashtable_equal Transaction.equal c.transactions c'.transactions
+    (* next_tid ignored, not preserved *)
+    && hashtable_equal (list_equal watch_equal) c.watches c'.watches
+    && c.nb_watches = c'.nb_watches
+    (* anonid ignored, not preserved *)
+    (* && c.anonid = c'.anonid *) && c.perm = c'.perm
+
+  let equal_watch w w' =
+    equal w.con w'.con && w.token = w'.token && w.path = w'.path
+    && w.base = w'.base
+    && w.is_relative = w'.is_relative
+end
+
+module Trie = struct
+  open Trie
+
+  let pp_dump dump_elt =
+    Fmt.Dump.iter_bindings Trie.iter (Fmt.any "trie") Fmt.string
+      Fmt.(option dump_elt)
+
+  let plus1 _ _ acc = acc + 1
+
+  let length t = fold plus1 t 0
+
+  (* Trie.iter doesn't give full path so we can't compare the paths/values exactly.
+     They will be compared as part of the individual connections
+  *)
+  let equal _eq t t' = length t = length t'
+end
+
+module Connections = struct
+  open Connections
+
+  let pp_dump =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "anonymous" (fun t -> t.anonymous)
+        @@ Dump.hashtbl Fmt.(any "") Connection.pp_dump
+      ; Dump.field "domains" (fun t -> t.domains)
+        @@ Dump.hashtbl Fmt.int Connection.pp_dump
+      ; Dump.field "ports" (fun t -> t.ports)
+        @@ Dump.hashtbl
+             (Fmt.using Xeneventchn.to_int Fmt.int)
+             Connection.pp_dump
+      ; Dump.field "watches" (fun t -> t.watches)
+        @@ Trie.pp_dump (Dump.list Connection.pp_dump_watch) ]
+
+  let equal c c' =
+    hashtable_equal Connection.equal c.anonymous c'.anonymous
+    && hashtable_equal Connection.equal c.domains c'.domains
+    (* TODO: local port changes for now *)
+    (*&& hashtable_equal Connection.equal c.ports c'.ports *)
+    && Trie.equal (list_equal Connection.equal_watch) c.watches c'.watches
+end
+
+let pp_dump_connections = Connections.pp_dump
+let equal_connections = Connections.equal
+
+module Domain = struct
+  open Domain
+
+  let pp_dump =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "id" Domain.get_id int
+      ; Dump.field "remote_port" Domain.get_remote_port int
+      ; Dump.field "bad_client" Domain.is_bad_domain bool
+      ; Dump.field "io_credit" Domain.get_io_credit int
+      ; Dump.field "conflict_credit" (fun t -> t.conflict_credit) float
+      ; Dump.field "caused_conflicts" (fun t -> t.caused_conflicts) int64 ]
+
+  (* ignore stats fields *)
+  let equal t t' = t.id = t'.id && t.remote_port = t'.remote_port
+end
+
+module Domains = struct
+  open Domains
+
+  let pp_dump =
+    let open Fmt in
+    Dump.record
+      [ Dump.field "table" (fun t -> t.table)
+        @@ Dump.hashtbl Fmt.int Domain.pp_dump
+      ; Dump.field "doms_conflict_paused" (fun t -> t.doms_conflict_paused)
+        @@ Dump.queue (pp_dump_ref @@ Dump.option Domain.pp_dump)
+      ; Dump.field "doms_with_conflict_penalty" (fun t ->
+            t.doms_with_conflict_penalty)
+        @@ Dump.queue (pp_dump_ref @@ Dump.option Domain.pp_dump)
+      ; Dump.field "n_paused" (fun t -> t.n_paused) int
+      ; Dump.field "n_penalised" (fun t -> t.n_penalised) int ]
+
+  (* ignore statistic fields *)
+  let equal t t' = hashtable_equal Domain.equal t.table t'.table
+end
+let pp_dump_domains = Domains.pp_dump
+let equal_domains = Domains.equal
\ No newline at end of file
diff --git a/tools/ocaml/xenstored/test/xenstored_test.ml b/tools/ocaml/xenstored/test/xenstored_test.ml
index e86b68e867..acf3209087 100644
--- a/tools/ocaml/xenstored/test/xenstored_test.ml
+++ b/tools/ocaml/xenstored/test/xenstored_test.ml
@@ -1,2 +1,147 @@
-open Xenstored
-let () = ()
+open Testable
+open Generator
+module Cb = Crowbar
+
+let random_path = Cb.list Cb.bytes
+
+let value = Cb.bytes
+
+let token = Cb.bytes
+
+let permty =
+  [Perms.READ; Perms.WRITE; Perms.RDWR; Perms.NONE]
+  |> List.map Cb.const |> Cb.choose
+
+let new_domid = Cb.range ~min:1 Types.domid_first_reserved
+
+let port = Cb.range 0xFFFF_FFFF (*uint32_t*)
+
+let arb_cmd =
+  let open Command in
+  let path =
+    Cb.choose
+      [ Cb.map [Cb.int] (fun rnd state -> PathObserver.choose_path state rnd)
+      ; Cb.map [random_path] (fun x _ -> Store.Path.to_string x) ]
+  in
+  let domid =
+    Cb.map [Cb.int] (fun rnd state -> PathObserver.choose_domid state rnd)
+  in
+  let perms =
+    Cb.map [domid; permty; Cb.pair domid permty |> Cb.list]
+    @@ fun idgen owner other state ->
+    let other = List.map (fun (idgen, ty) -> (idgen state, ty)) other in
+    Perms.Node.create (idgen state) owner other
+  in
+  let guard' ~f gen state =
+    let v = gen state in
+    Cb.guard (f v) ;
+    v
+  in
+  let cmd =
+    let open Testable.Command in
+    Cb.choose
+      [ Cb.map [path] cmd_read
+      ; Cb.map [path; value] cmd_write
+      ; Cb.map [path] cmd_mkdir
+      ; Cb.map [path] (fun p -> cmd_rm @@ guard' ~f:(fun p -> p <> "/") p)
+      ; Cb.map [path] cmd_directory
+      ; Cb.map [path] cmd_getperms
+      ; Cb.map [path; perms] cmd_setperms
+      ; Cb.map [path; token] cmd_watch
+      ; Cb.map [path; token] cmd_unwatch
+      ; Cb.const cmd_reset_watches
+      ; Cb.const cmd_transaction_start
+      ; Cb.map [Cb.bool] cmd_transaction_end
+      ; Cb.map [new_domid; port] cmd_introduce
+      ; Cb.map [domid] (fun idgen ->
+            cmd_release @@ guard' ~f:(fun id -> id <> 0) idgen)
+      ; Cb.map [domid] cmd_getdomainpath
+      ; Cb.map [domid] cmd_isintroduced
+      ; Cb.map [domid; domid] cmd_set_target
+      ; Cb.const cmd_liveupdate ]
+  in
+  Cb.map [domid; Cb.int; cmd] (fun this rnd cmd state ->
+      let this = this state in
+      let txid = PathObserver.choose_txid_opt state this rnd in
+      let cmd = cmd txid state in
+      (this, cmd))
+
+(* based on QCSTM *)
+module Make (Spec : sig
+  include Testable.S
+
+  val arb_cmd : (state -> cmd) Crowbar.gen
+end) =
+struct
+  let arb_cmds =
+    Crowbar.with_printer (Fmt.Dump.list Spec.pp)
+    @@ Crowbar.map [Crowbar.list1 Spec.arb_cmd] (fun cmdgens ->
+           let cmds, _ =
+             List.fold_left
+               (fun (cmds, s) f ->
+                 let cmd = f s in
+                 Crowbar.check (Spec.precond cmd s) ;
+                 (cmd :: cmds, Spec.next_state cmd s))
+               ([], Spec.init_state) cmdgens
+           in
+           List.rev cmds)
+
+  let interp_agree sut cs =
+    List.fold_left
+      (fun s cmd ->
+        Crowbar.check
+          ( try Spec.run_cmd cmd s sut
+            with Failure msg -> Crowbar.failf "%a" Fmt.lines msg ) ;
+        Spec.next_state cmd s)
+      Spec.init_state cs
+
+  let agree_prop cs =
+    let on_exn e bt logs =
+      List.iter prerr_endline logs ;
+      Printexc.raise_with_backtrace e bt
+    in
+    Testable.with_logger ~on_exn (fun () ->
+        let sut = Spec.init_sut () in
+        Stdext.finally (fun () -> 
+        let (_ : Spec.state) = interp_agree sut cs in ())
+        (fun () -> 
+        Spec.cleanup sut))
+
+  let agree_test ~name = Crowbar.add_test ~name [arb_cmds] agree_prop
+end
+
+module LU = Make (struct
+  include PathObserver
+
+  type cmd = int * Testable.Command.t
+
+  type sut = Testable.t ref * Testable.t ref
+
+  let arb_cmd = arb_cmd
+
+  let init_sut () =
+    let sut1 = Testable.create () in
+    Testable.init sut1 ;
+    let sut2 = Testable.create ~live_update:true () in
+    Testable.init sut2 ;
+    let sut1 = ref sut1 in
+    let sut2 = ref sut2 in
+    (sut1, sut2)
+
+  let cleanup (sut1, sut2) =
+    Testable.cleanup !sut1 ; Testable.cleanup !sut2
+
+  let run_cmd cmd state (sut1, sut2) =
+    Testable.run2 state.next_tid sut1 sut2 cmd ;
+    true
+end)
+
+let () =
+  (* Crowbar runs at_exit, and after bisect's coverage dumper,
+     registering an at_exit here would run *before* Crowbar starts,
+     hence the nested at_exit which puts the bisect dumper in the proper place
+     to dump coverage *after* crowbar is finished.
+   *)
+  (* at_exit (fun () -> at_exit Bisect.Runtime.write_coverage_data);*)
+  print_endline "";
+  LU.agree_test ~name:"live-update-agree";
diff --git a/tools/ocaml/xenstored/test/xs_protocol.ml b/tools/ocaml/xenstored/test/xs_protocol.ml
new file mode 100644
index 0000000000..b5da2aff34
--- /dev/null
+++ b/tools/ocaml/xenstored/test/xs_protocol.ml
@@ -0,0 +1,733 @@
+(*
+ * Copyright (C) Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *)
+
+let ( |> ) f g = g f
+let ( ++ ) f g x = f (g x)
+
+module Op = struct
+  type t =
+    | Debug | Directory | Read | Getperms
+    | Watch | Unwatch | Transaction_start
+    | Transaction_end | Introduce | Release
+    | Getdomainpath | Write | Mkdir | Rm
+    | Setperms | Watchevent | Error | Isintroduced
+    | Resume | Set_target
+    | Reset_watches | Directory_part
+
+  let to_int32 = function
+    | Debug -> 0l
+    | Directory -> 1l
+    | Read -> 2l
+    | Getperms -> 3l
+    | Watch -> 4l
+    | Unwatch -> 5l
+    | Transaction_start -> 6l
+    | Transaction_end -> 7l
+    | Introduce -> 8l
+    | Release -> 9l
+    | Getdomainpath -> 10l
+    | Write -> 11l
+    | Mkdir -> 12l
+    | Rm -> 13l
+    | Setperms -> 14l
+    | Watchevent -> 15l
+    | Error -> 16l
+    | Isintroduced -> 17l
+    | Resume -> 18l
+    | Set_target -> 19l
+    | Reset_watches -> 21l (* 20 is reserved *)
+    | Directory_part -> 22l
+
+  (* The index of the value in the array is the integer representation used
+     by the wire protocol. Every element of t exists exactly once in the array. *)
+  let on_the_wire =
+    let a = Array.make 23 None in
+    ListLabels.iter
+      ~f:(fun v -> a.(v |> to_int32 |> Int32.to_int) <- Some v)
+      [ Debug; Directory; Read; Getperms; Watch; Unwatch; Transaction_start
+      ; Transaction_end; Introduce; Release; Getdomainpath; Write; Mkdir; Rm
+      ; Setperms; Watchevent; Error; Isintroduced; Resume; Set_target
+      ; Reset_watches; Directory_part ] ;
+    a
+
+  let of_int32 i =
+    let i = Int32.to_int i in
+    if i >= 0 && i < Array.length on_the_wire then on_the_wire.(i) else None
+
+  let to_string = function
+    | Debug             -> "debug"
+    | Directory         -> "directory"
+    | Read              -> "read"
+    | Getperms          -> "getperms"
+    | Watch             -> "watch"
+    | Unwatch           -> "unwatch"
+    | Transaction_start -> "transaction_start"
+    | Transaction_end   -> "transaction_end"
+    | Introduce         -> "introduce"
+    | Release           -> "release"
+    | Getdomainpath     -> "getdomainpath"
+    | Write             -> "write"
+    | Mkdir             -> "mkdir"
+    | Rm                -> "rm"
+    | Setperms          -> "setperms"
+    | Watchevent        -> "watchevent"
+    | Error             -> "error"
+    | Isintroduced      -> "isintroduced"
+    | Resume            -> "resume"
+    | Set_target        -> "set_target"
+    | Reset_watches     -> "reset_watches"
+    | Directory_part    -> "directory_part"
+end
+
+let split_string ?limit:(limit=max_int) c s =
+  let len = String.length s in
+  let next_c from =
+    try
+      Some (String.index_from s from c)
+    with
+    | Not_found -> None
+  in
+  let decr n = max 0 (n-1) in
+  let rec loop n from acc =
+    match decr n, next_c from with
+    | 0, _
+    | _, None ->
+      (* No further instances of c, or we've reached limit *)
+      String.sub s from (len - from) :: acc
+    | n', Some idx ->
+      let a = String.sub s from (idx - from) in
+      (loop[@tailcall]) n' (idx + 1) (a :: acc)
+  in loop limit 0 [] |> List.rev
+
+
+module ACL = struct
+  type perm =
+    | NONE
+    | READ
+    | WRITE
+    | RDWR
+
+  let char_of_perm = function
+    | READ -> 'r'
+    | WRITE -> 'w'
+    | RDWR -> 'b'
+    | NONE -> 'n'
+
+  let perm_of_char = function
+    | 'r' -> Some READ
+    | 'w' -> Some WRITE
+    | 'b' -> Some RDWR
+    | 'n' -> Some NONE
+    | _ -> None
+
+  type domid = int
+
+  type t = {
+    owner: domid;             (** domain which "owns", has full access *)
+    other: perm;              (** default permissions for all others... *)
+    acl: (domid * perm) list; (** ... unless overridden in the ACL *)
+  }
+
+  let to_string perms =
+    let string_of_perm (id, perm) = Printf.sprintf "%c%u" (char_of_perm perm) id in
+    String.concat "\000" (List.map string_of_perm ((perms.owner,perms.other) :: perms.acl))
+
+  let of_string s =
+    (* A perm is stored as '<c>domid' *)
+    let perm_of_char_exn x = match (perm_of_char x) with Some y -> y | None -> raise Not_found in
+    try
+      let perm_of_string s =
+        if String.length s < 2
+        then invalid_arg (Printf.sprintf "Permission string too short: '%s'" s);
+        int_of_string (String.sub s 1 (String.length s - 1)), perm_of_char_exn s.[0] in
+      let l = List.map perm_of_string (split_string '\000' s) in
+      match l with
+      | (owner, other) :: l -> Some { owner = owner; other = other; acl = l }
+      | [] -> Some { owner = 0; other = NONE; acl = [] }
+    with _ ->
+      None
+end
+
+type t = {
+  tid: int32;
+  rid: int32;
+  ty: Op.t;
+  len: int;
+  data: Buffer.t;
+}
+
+let sizeof_header = 16
+let get_header_ty v = Cstruct.LE.get_uint32 v 0
+let set_header_ty v x = Cstruct.LE.set_uint32 v 0 x
+let get_header_rid v = Cstruct.LE.get_uint32 v 4
+let set_header_rid v x = Cstruct.LE.set_uint32 v 4 x
+let get_header_tid v = Cstruct.LE.get_uint32 v 8
+let set_header_tid v x = Cstruct.LE.set_uint32 v 8 x
+let get_header_len v = Cstruct.LE.get_uint32 v 12
+let set_header_len v x = Cstruct.LE.set_uint32 v 12 x
+
+let to_bytes pkt =
+  let header = Cstruct.create sizeof_header in
+  let len = Int32.of_int (Buffer.length pkt.data) in
+  let ty = Op.to_int32 pkt.ty in
+  set_header_ty header ty;
+  set_header_rid header pkt.rid;
+  set_header_tid header pkt.tid;
+  set_header_len header len;
+  let result = Buffer.create 64 in
+  Buffer.add_bytes result (Cstruct.to_bytes header);
+  Buffer.add_buffer result pkt.data;
+  Buffer.to_bytes result
+
+let get_tid pkt = pkt.tid
+let get_ty pkt = pkt.ty
+let get_data pkt =
+  if pkt.len > 0 && Buffer.nth pkt.data (pkt.len - 1) = '\000' then
+    Buffer.sub pkt.data 0 (pkt.len - 1)
+  else
+    Buffer.contents pkt.data
+let get_rid pkt = pkt.rid
+
+module Parser = struct
+  (** Incrementally parse packets *)
+
+  let header_size = 16
+
+  let xenstore_payload_max = 4096 (* xen/include/public/io/xs_wire.h *)
+
+  let allow_oversize_packets = ref true
+
+  type state =
+    | Unknown_operation of int32
+    | Parser_failed of string
+    | Need_more_data of int
+    | Packet of t
+
+  type parse =
+    | ReadingHeader of int * bytes
+    | ReadingBody of t
+    | Finished of state
+
+  let start () = ReadingHeader (0, Bytes.make header_size '\000')
+
+  let state = function
+    | ReadingHeader(got_already, _) -> Need_more_data (header_size - got_already)
+    | ReadingBody pkt -> Need_more_data (pkt.len - (Buffer.length pkt.data))
+    | Finished r -> r
+
+  let parse_header str =
+    let header = Cstruct.create sizeof_header in
+    Cstruct.blit_from_string str 0 header 0 sizeof_header;
+    let ty = get_header_ty header in
+    let rid = get_header_rid header in
+    let tid = get_header_tid header in
+    let len = get_header_len header in
+
+    let len = Int32.to_int len in
+    (* A packet which is bigger than xenstore_payload_max is illegal.
+       This will leave the guest connection is a bad state and will
+       be hard to recover from without restarting the connection
+       (ie rebooting the guest) *)
+    let len = if !allow_oversize_packets then len else max 0 (min xenstore_payload_max len) in
+
+    begin match Op.of_int32 ty with
+      | Some ty ->
+        let t = {
+          tid = tid;
+          rid = rid;
+          ty = ty;
+          len = len;
+          data = Buffer.create len;
+        } in
+        if len = 0
+        then Finished (Packet t)
+        else ReadingBody t
+      | None -> Finished (Unknown_operation ty)
+    end
+
+  let input state (bytes : string) =
+    match state with
+    | ReadingHeader(got_already, (str : bytes)) ->
+      Bytes.blit_string bytes 0 str got_already (String.length bytes);
+      let got_already = got_already + (String.length bytes) in
+      if got_already < header_size
+      then ReadingHeader(got_already, str)
+      else parse_header (Bytes.to_string str)
+    | ReadingBody x ->
+      Buffer.add_string x.data bytes;
+      let needed = x.len - (Buffer.length x.data) in
+      if needed > 0
+      then ReadingBody x
+      else Finished (Packet x)
+    | Finished f -> Finished f
+end
+
+(* Should we switch to an explicit stream abstraction here? *)
+module type IO = sig
+  type 'a t
+  val return: 'a -> 'a t
+  val ( >>= ): 'a t -> ('a -> 'b t) -> 'b t
+
+  type channel
+  val read: channel -> bytes -> int -> int -> int t
+  val write: channel -> bytes -> int -> int -> unit t
+end
+
+exception Unknown_xenstore_operation of int32
+exception Response_parser_failed of string
+exception EOF
+
+type ('a, 'b) result =
+  | Ok of 'a
+  | Exception of 'b
+
+module PacketStream = functor(IO: IO) -> struct
+  let ( >>= ) = IO.( >>= )
+  let return = IO.return
+
+  type stream = {
+    channel: IO.channel;
+    mutable incoming_pkt: Parser.parse; (* incrementally parses the next packet *)
+  }
+
+  let make t = {
+    channel = t;
+    incoming_pkt = Parser.start ();
+  }
+
+  (* [recv client] returns a single Packet, or fails *)
+  let rec recv t =
+    let open Parser in match Parser.state t.incoming_pkt with
+    | Packet pkt ->
+      t.incoming_pkt <- start ();
+      return (Ok pkt)
+    | Need_more_data x ->
+      let buf = Bytes.make x '\000' in
+      IO.read t.channel buf 0 x
+      >>= (function
+          | 0 -> return (Exception EOF)
+          | n ->
+            let fragment = Bytes.sub_string buf 0 n in
+            t.incoming_pkt <- input t.incoming_pkt fragment;
+            recv t)
+    | Unknown_operation x -> return (Exception (Unknown_xenstore_operation x))
+    | Parser_failed x -> return (Exception (Response_parser_failed x))
+
+  (* [send client pkt] sends [pkt] and returns (), or fails *)
+  let send t request =
+    let req = to_bytes request in
+    IO.write t.channel req 0 (Bytes.length req)
+end
+
+module Token = struct
+  type t = string
+
+  (** [to_user_string x] returns the user-supplied part of the watch token *)
+  let to_user_string x = Scanf.sscanf x "%d:%s" (fun _ x -> x)
+
+  let to_debug_string x = x
+
+  let of_string x = x
+  let to_string x = x
+end
+
+let data_concat ls = (String.concat "\000" ls) ^ "\000"
+
+let create tid rid ty data =
+  let len = String.length data in
+  let b = Buffer.create len in
+  Buffer.add_string b data;
+  {
+    tid = tid;
+    rid = rid;
+    ty = ty;
+    len = len;
+    data = b;
+  }
+
+module Response = struct
+
+  type payload =
+    | Read of string
+    | Directory of string list
+    | Getperms of ACL.t
+    | Getdomainpath of string
+    | Transaction_start of int32
+    | Write
+    | Mkdir
+    | Rm
+    | Setperms
+    | Watch
+    | Unwatch
+    | Transaction_end
+    | Debug of string list
+    | Introduce
+    | Resume
+    | Release
+    | Set_target
+    | Reset_watches
+    | Directory_part of int * string list
+    | Isintroduced of bool
+    | Error of string
+    | Watchevent of string * string
+
+  let prettyprint_payload =
+    let open Printf in function
+      | Read x -> sprintf "Read %s" x
+      | Directory xs -> sprintf "Directory [ %s ]" (String.concat "; " xs)
+      | Getperms acl -> sprintf "Getperms %s" (ACL.to_string acl)
+      | Getdomainpath p -> sprintf "Getdomainpath %s" p
+      | Transaction_start x -> sprintf "Transaction_start %ld" x
+      | Write -> "Write"
+      | Mkdir -> "Mkdir"
+      | Rm -> "Rm"
+      | Setperms -> "Setperms"
+      | Watch -> "Watch"
+      | Unwatch -> "Unwatch"
+      | Transaction_end -> "Transaction_end"
+      | Debug xs -> sprintf "Debug [ %s ]" (String.concat "; " xs)
+      | Introduce -> "Introduce"
+      | Resume -> "Resume"
+      | Release -> "Release"
+      | Set_target -> "Set_target"
+      | Reset_watches -> "Reset_watches"
+      | Directory_part (gencnt, xs) ->
+          sprintf "Directory_part #%d [ %s ]" gencnt (String.concat "; " xs)
+      | Isintroduced x -> sprintf "Isintroduced %b" x
+      | Error x -> sprintf "Error %s" x
+      | Watchevent (x, y) -> sprintf "Watchevent %s %s" x y
+
+  let ty_of_payload = function
+    | Read _ -> Op.Read
+    | Directory _ -> Op.Directory
+    | Getperms _ -> Op.Getperms
+    | Getdomainpath _ -> Op.Getdomainpath
+    | Transaction_start _ -> Op.Transaction_start
+    | Debug _ -> Op.Debug
+    | Isintroduced _ -> Op.Isintroduced
+    | Watchevent (_, _) -> Op.Watchevent
+    | Error _ -> Op.Error
+    | Write -> Op.Write
+    | Mkdir -> Op.Mkdir
+    | Rm -> Op.Rm
+    | Setperms -> Op.Setperms
+    | Watch -> Op.Watch
+    | Unwatch -> Op.Unwatch
+    | Transaction_end -> Op.Transaction_end
+    | Introduce -> Op.Introduce
+    | Resume -> Op.Resume
+    | Release -> Op.Release
+    | Set_target -> Op.Set_target
+    | Reset_watches -> Op.Reset_watches
+    | Directory_part _ -> Op.Directory_part
+
+  let ok = "OK\000"
+
+  let data_of_payload = function
+    | Read x                   -> x
+    | Directory ls             -> if ls = [] then "" else data_concat ls
+    | Getperms perms           -> data_concat [ ACL.to_string perms ]
+    | Getdomainpath x          -> data_concat [ x ]
+    | Transaction_start tid    -> data_concat [ Int32.to_string tid ]
+    | Debug items              -> data_concat items
+    | Isintroduced b           -> data_concat [ if b then "T" else "F" ]
+    | Watchevent (path, token) -> data_concat [ path; token ]
+    | Error x                  -> data_concat [ x ]
+    | _                        -> ok
+
+  let print x tid rid =
+    create tid rid (ty_of_payload x) (data_of_payload x)
+end
+
+module Request = struct
+
+  type path_op =
+    | Read
+    | Directory
+    | Directory_part of int
+    | Getperms
+    | Write of string
+    | Mkdir
+    | Rm
+    | Setperms of ACL.t
+
+  type payload =
+    | PathOp of string * path_op
+    | Getdomainpath of int
+    | Transaction_start
+    | Watch of string * string
+    | Unwatch of string * string
+    | Transaction_end of bool
+    | Debug of string list
+    | Introduce of int * Nativeint.t * int
+    | Resume of int
+    | Release of int
+    | Set_target of int * int
+    | Reset_watches
+    | Isintroduced of int
+    | Error of string
+    | Watchevent of string
+
+  open Printf
+
+  let prettyprint_pathop x = function
+    | Read -> sprintf "Read %s" x
+    | Directory -> sprintf "Directory %s" x
+    | Directory_part off -> sprintf "Directory %s @%d" x off
+    | Getperms -> sprintf "Getperms %s" x
+    | Write v -> sprintf "Write %s %s" x v
+    | Mkdir -> sprintf "Mkdir %s" x
+    | Rm -> sprintf "Rm %s" x
+    | Setperms acl -> sprintf "Setperms %s %s" x (ACL.to_string acl)
+
+  let prettyprint_payload = function
+    | PathOp (path, op) -> prettyprint_pathop path op
+    | Getdomainpath x -> sprintf "Getdomainpath %d" x
+    | Transaction_start -> "Transaction_start"
+    | Watch (x, y) -> sprintf "Watch %s %s" x y
+    | Unwatch (x, y) -> sprintf "Unwatch %s %s" x y
+    | Transaction_end x -> sprintf "Transaction_end %b" x
+    | Debug xs -> sprintf "Debug [ %s ]" (String.concat "; " xs)
+    | Introduce (x, n, y) -> sprintf "Introduce %d %nu %d" x n y
+    | Resume x -> sprintf "Resume %d" x
+    | Release x -> sprintf "Release %d" x
+    | Set_target (x, y) -> sprintf "Set_target %d %d" x y
+    | Reset_watches -> "Reset_watches"
+    | Isintroduced x -> sprintf "Isintroduced %d" x
+    | Error x -> sprintf "Error %s" x
+    | Watchevent x -> sprintf "Watchevent %s" x
+
+  exception Parse_failure
+
+  let strings data = split_string '\000' data
+
+  let one_string data =
+    let args = split_string ~limit:2 '\000' data in
+    match args with
+    | x :: [] -> x
+    | _       ->
+      raise Parse_failure
+
+  let two_strings data =
+    let args = split_string ~limit:2 '\000' data in
+    match args with
+    | a :: b :: [] -> a, b
+    | a :: [] -> a, "" (* terminating NULL removed by get_data *)
+    | _            ->
+      raise Parse_failure
+
+  let acl x = match ACL.of_string x with
+    | Some x -> x
+    | None ->
+      raise Parse_failure
+
+  let domid s =
+    let v = ref 0 in
+    let is_digit c = c >= '0' && c <= '9' in
+    let len = String.length s in
+    let i = ref 0 in
+    while !i < len && not (is_digit s.[!i]) do incr i done;
+    while !i < len && is_digit s.[!i]
+    do
+      let x = (Char.code s.[!i]) - (Char.code '0') in
+      v := !v * 10 + x;
+      incr i
+    done;
+    !v
+
+  let bool = function
+    | "F" -> false
+    | "T" -> true
+    | _ ->
+      raise Parse_failure
+
+  let parse_exn request =
+    let data = get_data request in
+    match get_ty request with
+    | Op.Read -> PathOp (data |> one_string, Read)
+    | Op.Directory -> PathOp (data |> one_string, Directory)
+    | Op.Getperms -> PathOp (data |> one_string, Getperms)
+    | Op.Getdomainpath -> Getdomainpath (data |> one_string |> domid)
+    | Op.Transaction_start -> Transaction_start
+    | Op.Write ->
+      let path, value = two_strings data in
+      PathOp (path, Write value)
+    | Op.Mkdir -> PathOp (data |> one_string, Mkdir)
+    | Op.Rm -> PathOp (data |> one_string, Rm)
+    | Op.Setperms ->
+      let path, perms = two_strings data in
+      let perms = acl perms in
+      PathOp(path, Setperms perms)
+    | Op.Watch ->
+      let path, token = two_strings data in
+      Watch(path, token)
+    | Op.Unwatch ->
+      let path, token = two_strings data in
+      Unwatch(path, token)
+    | Op.Transaction_end -> Transaction_end(data |> one_string |> bool)
+    | Op.Debug -> Debug (strings data)
+    | Op.Introduce ->
+      begin match strings data with
+        | d :: mfn :: port :: _ ->
+          let d = domid d in
+          let mfn = Nativeint.of_string mfn in
+          let port = int_of_string port in
+          Introduce (d, mfn, port)
+        | _ ->
+          raise Parse_failure
+      end
+    | Op.Resume -> Resume (data |> one_string |> domid)
+    | Op.Release -> Release (data |> one_string |> domid)
+    | Op.Set_target ->
+      let mine, yours = two_strings data in
+      let mine = domid mine and yours = domid yours in
+      Set_target(mine, yours)
+    | Op.Reset_watches -> Reset_watches
+    | Op.Directory_part ->
+        let path, offstr = two_strings data in
+        PathOp (path, Directory_part (int_of_string offstr))
+    | Op.Isintroduced -> Isintroduced (data |> one_string |> domid)
+    | Op.Error -> Error(data |> one_string)
+    | Op.Watchevent -> Watchevent(data |> one_string)
+
+  let parse request =
+    try
+      Some (parse_exn request)
+    with _ -> None
+
+  let prettyprint request =
+    Printf.sprintf "tid = %ld; rid = %ld; payload = %s"
+      (get_tid request) (get_rid request)
+      (match parse request with
+       | None -> "None"
+       | Some x -> "Some " ^ (prettyprint_payload x))
+
+  let ty_of_payload = function
+    | PathOp(_, Directory) -> Op.Directory
+    | PathOp(_, Read) -> Op.Read
+    | PathOp(_, Getperms) -> Op.Getperms
+    | Debug _ -> Op.Debug
+    | Watch (_, _) -> Op.Watch
+    | Unwatch (_, _) -> Op.Unwatch
+    | Transaction_start -> Op.Transaction_start
+    | Transaction_end _ -> Op.Transaction_end
+    | Introduce(_, _, _) -> Op.Introduce
+    | Release _ -> Op.Release
+    | Resume _ -> Op.Resume
+    | Getdomainpath _ -> Op.Getdomainpath
+    | PathOp(_, Write _) -> Op.Write
+    | PathOp(_, Mkdir) -> Op.Mkdir
+    | PathOp(_, Rm) -> Op.Rm
+    | PathOp(_, Setperms _) -> Op.Setperms
+    | Set_target (_, _) -> Op.Set_target
+    | Reset_watches -> Op.Reset_watches
+    | PathOp(_, Directory_part _) -> Op.Directory_part
+    | Isintroduced _ -> Op.Isintroduced
+    | Error _ -> Op.Error
+    | Watchevent _ -> Op.Watchevent
+
+  let transactional_of_payload = function
+    | PathOp(_, _)
+    | Transaction_end _ -> true
+    | _ -> false
+
+  let data_of_payload = function
+    | PathOp(path, Write value) ->
+      path ^ "\000" ^ value (* no NULL at the end *)
+    | PathOp(path, Setperms perms) ->
+      data_concat [ path; ACL.to_string perms ]
+    | PathOp(path, _) -> data_concat [ path ]
+    | Debug commands -> data_concat commands
+    | Watch (path, token)
+    | Unwatch (path, token) -> data_concat [ path; token ]
+    | Transaction_start -> data_concat []
+    | Transaction_end commit -> data_concat [ if commit then "T" else "F" ]
+    | Introduce(domid, mfn, port) ->
+      data_concat [
+        Printf.sprintf "%u" domid;
+        Printf.sprintf "%nu" mfn;
+        string_of_int port;
+      ]
+    | Release domid
+    | Resume domid
+    | Getdomainpath domid
+    | Isintroduced domid ->
+      data_concat [ Printf.sprintf "%u" domid; ]
+    | Reset_watches -> data_concat []
+    | Set_target (mine, yours) ->
+      data_concat [ Printf.sprintf "%u" mine; Printf.sprintf "%u" yours; ]
+    | Error _ ->
+      failwith "Unimplemented: data_of_payload (Error)"
+    | Watchevent _ ->
+      failwith "Unimplemented: data_of_payload (Watchevent)"
+
+  let print x tid rid =
+    create
+      (if transactional_of_payload x then tid else 0l)
+      rid
+      (ty_of_payload x)
+      (data_of_payload x)
+end
+
+module Unmarshal = struct
+  let some x = Some x
+  let int_of_string_opt x = try Some(int_of_string x) with _ -> None
+  let int32_of_string_opt x = try Some(Int32.of_string x) with _ -> None
+  let unit_of_string_opt x = if x = "" then Some () else None
+  let ok x = if x = "OK" then Some () else None
+
+  let string = some ++ get_data
+  let list = some ++ split_string '\000' ++ get_data
+  let acl = ACL.of_string ++ get_data
+  let int = int_of_string_opt ++ get_data
+  let int32 = int32_of_string_opt ++ get_data
+  let unit = unit_of_string_opt ++ get_data
+  let ok = ok ++ get_data
+end
+
+exception Enoent of string
+exception Eagain
+exception Eexist
+exception Invalid
+exception Error of string
+
+let response hint sent received f = match get_ty sent, get_ty received with
+  | _, Op.Error ->
+    begin match get_data received with
+      | "ENOENT" -> raise (Enoent hint)
+      | "EAGAIN" -> raise Eagain
+      | "EINVAL" -> raise Invalid
+      | "EEXIST" -> raise Eexist
+      | s -> raise (Error s)
+    end
+  | x, y when x = y ->
+    begin match f received with
+      | None -> raise (Error (Printf.sprintf "failed to parse response (hint:%s) (payload:%s)" hint (get_data received)))
+      | Some z -> z
+    end
+  | x, y ->
+    raise (Error (Printf.sprintf "unexpected packet: expected %s; got %s" (Op.to_string x) (Op.to_string y)))
+
+type address =
+  | Unix of string
+  | Domain of int
+
+let string_of_address = function
+  | Unix x -> x
+  | Domain x -> string_of_int x
+
+let domain_of_address = function
+  | Unix _ -> 0
+  | Domain x -> x
+
diff --git a/tools/ocaml/xenstored/transaction.ml b/tools/ocaml/xenstored/transaction.ml
index 25bc8c3b4a..4ee77b6e14 100644
--- a/tools/ocaml/xenstored/transaction.ml
+++ b/tools/ocaml/xenstored/transaction.ml
@@ -82,6 +82,7 @@ type t = {
 	start_count: int64;
 	store: Store.t; (* This is the store that we change in write operations. *)
 	quota: Quota.t;
+	mutable must_fail: bool;
 	oldroot: Store.Node.t;
 	mutable paths: (Xenbus.Xb.Op.operation * Store.Path.t) list;
 	mutable operations: (Packet.request * Packet.response) list;
@@ -89,7 +90,7 @@ type t = {
 	mutable write_lowpath: Store.Path.t option;
 }
 let get_id t = match t.ty with No -> none | Full (id, _, _) -> id
-
+let mark_failed t = t.must_fail <- true
 let counter = ref 0L
 let failed_commits = ref 0L
 let failed_commits_no_culprit = ref 0L
@@ -117,6 +118,8 @@ let trim_short_running_transactions txn =
 		keep
 		!short_running_txns
 
+let invalid_op = Xenbus.Xb.Op.Invalid, []
+
 let make ?(internal=false) id store =
 	let ty = if id = none then No else Full(id, Store.copy store, store) in
 	let txn = {
@@ -129,6 +132,7 @@ let make ?(internal=false) id store =
 		operations = [];
 		read_lowpath = None;
 		write_lowpath = None;
+		must_fail = false;
 	} in
 	if id <> none && not internal then (
 		let now = Unix.gettimeofday () in
@@ -139,10 +143,11 @@ let make ?(internal=false) id store =
 let get_store t = t.store
 let get_paths t = t.paths
 
+let is_read_only t = t.paths = [] && not t.must_fail
 let get_root t = Store.get_root t.store
 
-let is_read_only t = t.paths = []
 let add_wop t ty path = t.paths <- (ty, path) :: t.paths
+let clear_wops t = t.paths <- []
 let add_operation ~perm t request response =
 	if !Define.maxrequests >= 0
 		&& not (Perms.Connection.is_dom0 perm)
@@ -151,7 +156,9 @@ let add_operation ~perm t request response =
 	t.operations <- (request, response) :: t.operations
 let get_operations t = List.rev t.operations
 let set_read_lowpath t path = t.read_lowpath <- get_lowest path t.read_lowpath
-let set_write_lowpath t path = t.write_lowpath <- get_lowest path t.write_lowpath
+let set_write_lowpath t path =
+	 Logging.debug "transaction" "set_writelowpath (%d) %s" (get_id t)  (Store.Path.to_string path);
+	 t.write_lowpath <- get_lowest path t.write_lowpath
 
 let path_exists t path = Store.path_exists t.store path
 
@@ -200,7 +207,7 @@ let commit ~con t =
 	let has_commited =
 	match t.ty with
 	| No                         -> true
-	| Full (_id, oldstore, cstore) ->       (* "cstore" meaning current canonical store *)
+	| Full (id, oldstore, cstore) ->       (* "cstore" meaning current canonical store *)
 		let commit_partial oldroot cstore store =
 			(* get the lowest path of the query and verify that it hasn't
 			   been modified by others transactions. *)
@@ -240,11 +247,16 @@ let commit ~con t =
 				(* we try a partial commit if possible *)
 				commit_partial oldroot cstore store
 			in
+		if t.must_fail then begin
+			Logging.info "transaction" "Transaction %d was marked to fail (by live-update)" id;
+			false
+		end else
 		if !test_eagain && Random.int 3 = 0 then
 			false
 		else
 			try_commit (Store.get_root oldstore) cstore t.store
 		in
+	Logging.info "transaction" "has_commited: %b" has_commited;
 	if has_commited && has_write_ops then
 		Disk.write t.store;
 	if not has_commited
@@ -252,3 +264,102 @@ let commit ~con t =
 	else if not !has_coalesced
 	then Logging.commit ~tid:(get_id t) ~con;
 	has_commited
+
+module LR = Disk.LiveRecord
+
+(* here instead of Store.ml to avoid dependency cycle *)
+let write_node ch txidaccess path node =
+	let value = Store.Node.get_value node in
+	let perms = Store.Node.get_perms node in
+	let path = Store.Path.of_path_and_name path (Symbol.to_string node.Store.Node.name) |> Store.Path.to_string in
+	LR.write_node_data ch ~txidaccess ~path ~value ~perms
+
+let split limit c s =
+	let limit = match limit with None -> 8 | Some x -> x in
+	String.split ~limit c s
+	
+exception Invalid_Cmd_Args
+let split_one_path data conpath =
+	let args = split (Some 2) '\000' data in
+	match args with
+	| path :: "" :: [] -> Store.Path.create path conpath
+	| _                -> raise Invalid_Cmd_Args
+	
+let dump base conpath ~conid txn ch =
+	(* TODO: implicit paths need to be converted to explicit *)
+	let txid = get_id txn in
+	LR.write_transaction_data ch ~conid ~txid;
+	let store = get_store txn in
+	let write_node_mkdir path =
+   let perms, value =	match Store.get_node store path with
+  | None -> Perms.Node.default0, "" (* need to dump mkdir anyway even if later deleted due to implicit path creation *)
+  | Some node -> Store.Node.get_perms node, Store.Node.get_value node (* not always "", e.g. on EEXIST *) in
+  LR.write_node_data ch ~txidaccess:(Some (conid, txid, LR.W)) ~path:(Store.Path.to_string path) ~value ~perms
+in
+	maybe (fun path ->
+		(* if there were any reads make sure the tree matches, remove all contents and write out subtree *)
+		match Store.get_node store path with
+		| None -> (* we've only read nodes that we ended up deleting, nothing to do *) ()
+		| Some node ->
+			write_node ch (Some (conid, txid, LR.Del)) (Store.Path.get_parent path) node;
+			let path = Store.Path.get_parent path in
+			Store.traversal node @@ fun path' node ->
+			write_node ch (Some (conid,txid, LR.R)) (List.append path path') node
+	) txn.read_lowpath;
+	(* we could do something similar for write_lowpath, but that would become 
+	 	 complicated to handle correctly wrt to permissions and quotas if there are nodes
+		 owned by other domains in the subtree.
+	*)
+	let ops = get_operations txn in
+	if ops <> [] then
+		(* mark that we had some operation, these could be failures, etc.
+			 we want to fail the transaction after a live-update,
+			 unless it is completely a no-op
+		 *)
+		let perms = Store.getperms store Perms.Connection.full_rights [] in
+		let value = Store.get_root store |> Store.Node.get_value in
+  	LR.write_node_data ch ~txidaccess:(Some (conid, txid, LR.R)) ~path:"/" ~value ~perms;
+	ListLabels.iter (fun (req, reply) ->
+		Logging.debug "transaction" "dumpop %s" (Xenbus.Xb.Op.to_string req.Packet.ty); 
+		let data = req.Packet.data in
+		let open Xenbus.Xb.Op in
+		match reply with
+		| Packet.Error _ -> ()
+		| _ ->
+		try match req.Packet.ty with
+| Debug
+| Watch
+| Unwatch
+| Transaction_start
+| Transaction_end
+| Introduce
+| Release
+| Watchevent
+| Getdomainpath
+| Error
+| Isintroduced
+| Resume
+| Set_target
+| Reset_watches
+| Invalid
+| Directory
+| (Read|Getperms) -> ()
+| (Write|Setperms) ->
+		(match (split (Some 2) '\000' data) with
+		| path :: _ :: _ ->
+	let path = Store.Path.create path conpath in
+	if req.Packet.ty = Write then
+  write_node_mkdir (Store.Path.get_parent path);(* implicit mkdir *)
+	(match Store.get_node store path with
+	| None -> ()
+	| Some node ->
+	write_node ch (Some (conid, txid, LR.W)) (Store.Path.get_parent path) node)
+	| _ -> raise Invalid_Cmd_Args)
+| Mkdir ->
+	let path = split_one_path data conpath in
+  write_node_mkdir  path;
+| Rm ->
+	let path = split_one_path data conpath |> Store.Path.to_string in
+	LR.write_node_data ch ~txidaccess:(Some (conid, txid, LR.Del)) ~path ~value:"" ~perms:Perms.Node.default0
+	with Invalid_Cmd_Args|Define.Invalid_path|Not_found-> ()
+	 ) ops
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 0/4] tools/ocaml/xenstored: bugfixes
@ 2021-01-15 22:28 ` Edwin Török
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini

Fix bugs discovered by structured fuzz testing.
See 'tools/ocaml/xenstored: structured fuzz testing' series for the tests.

These have been discovered while testing the XSA fixes, but they are not security related.

For convenience here is a tree with all patch series applied:
https://github.com/edwintorok/xen/pull/1

Edwin Török (4):
  tools/ocaml/libs/xb: do not crash after xenbus is unmapped
  tools/ocaml/xenstored: fix quota calculation for mkdir EEXIST
  tools/ocaml/xenstored: reject invalid watch paths early
  tools/ocaml/xenstored: mkdir conflicts were sometimes missed

 tools/ocaml/libs/xb/xs_ring_stubs.c  | 3 +++
 tools/ocaml/xenstored/connection.ml  | 5 ++---
 tools/ocaml/xenstored/connections.ml | 4 +++-
 tools/ocaml/xenstored/store.ml       | 1 +
 tools/ocaml/xenstored/transaction.ml | 2 +-
 5 files changed, 10 insertions(+), 5 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v1 1/4] tools/ocaml/libs/xb: do not crash after xenbus is unmapped
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (25 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Xenmmap.unmap sets the address to MAP_FAILED in xenmmap_stubs.c.
If due to a bug there were still references to the Xenbus and we attempt
to use it then we crash.
Raise an exception instead of crashing.

(My initial version of fuzz testing had such a bug)

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/libs/xb/xs_ring_stubs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c b/tools/ocaml/libs/xb/xs_ring_stubs.c
index 7537a23949..7a91fdee75 100644
--- a/tools/ocaml/libs/xb/xs_ring_stubs.c
+++ b/tools/ocaml/libs/xb/xs_ring_stubs.c
@@ -32,6 +32,7 @@
 #include <caml/fail.h>
 #include <caml/callback.h>
 
+#include <sys/mman.h>
 #include "mmap_stubs.h"
 
 #define GET_C_STRUCT(a) ((struct mmap_interface *) a)
@@ -166,6 +167,8 @@ CAMLprim value ml_interface_set_server_features(value interface, value v)
 {
 	CAMLparam2(interface, v);
 	struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)->addr;
+	if (intf == (void*)MAP_FAILED)
+		caml_failwith("Interface closed");
 
 	intf->server_features = Int_val(v);
 
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 2/4] tools/ocaml/xenstored: fix quota calculation for mkdir EEXIST
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (26 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Christian Lindig, David Scott

We increment the domain's quota on mkdir even when the node already
exists.
This results in a quota inconsistency after live update, where
reconstructing the tree from scratch results in a different quota.

Not a security issue because the domain uses up quota faster,
so it will only get a Quota error sooner than it should.

Discovered by the structured fuzzing test:
```
live-update-agree: FAIL

When given the input:

  [{ "domid" = 0;
    "cmd" = { "tid" = 0;
              "rid" = 0;
              "op" = MKDIR;
              "data" = "/" } }; { "domid" = 0;
                                  "cmd" = { "tid" = 0;
                                            "rid" = 0;
                                            "op" = DEBUG;
                                            "data" = "live-update\000-s" } }]

the test failed:

    store agrement: diff --git 1/tmp/expected5b4372.txt 2/tmp/actual1c18b5.txt
index ac39964836..af318026ec 100644
--- 1/tmp/expected5b4372.txt
+++ 2/tmp/actual1c18b5.txt
@@ -1,9 +1,9 @@
{ "stat_transaction_coalesce" = 0;
  "stat_transaction_abort" = 0;
  "store" = /{n0}
  /tool{n0}
  /local{n0}
  ;
  "quota" = { "maxent" = 8192;
  "maxsize" = 2048;
  "cur" = (hashtbl (0, +3+))-2-)) } }

Fatal error: exception Crowbar.TestFailure
```

This shows that the quota was 2 instead of 3 after a live update.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/xenstored/store.ml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index a9c079a417..1a9f71fa62 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -420,6 +420,7 @@ let mkdir store perm path =
 	(* It's upt to the mkdir logic to decide what to do with existing path *)
 	if not (existing || (Perms.Connection.is_dom0 perm)) then Quota.check store.quota owner 0;
 	store.root <- path_mkdir store perm path;
+	if not existing then
 	Quota.add_entry store.quota owner
 
 let rm store perm path =
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 3/4] tools/ocaml/xenstored: reject invalid watch paths early
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (27 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Watches on invalid paths were accepted, but they would never trigger.
The client also got no notification that its watch is bad and would
never trigger.

Found again by the structured fuzzer, due to an error on live update
reload: the invalid watch paths would get rejected during live update
and the list of watches would be different pre/post live update.

This was found by an older version of the fuzzer:
```
Test live-update failed (507 shrink steps):
[NEW; (0, None, WATCH ([""; ""], "")); (0, None, CONTROL live-update ())]
```

The testcase is watch on `//`, which is an invalid path.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/xenstored/connection.ml  | 5 ++---
 tools/ocaml/xenstored/connections.ml | 4 +++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 1f9fe9e3b2..c7f22e5ee9 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -163,18 +163,17 @@ let get_children_watches con path =
 let is_dom0 con =
 	Perms.Connection.is_dom0 (get_perm con)
 
-let add_watch con path token =
+let add_watch con (path, apath) token =
 	if !Quota.activate && !Define.maxwatch > 0 &&
 	   not (is_dom0 con) && con.nb_watches > !Define.maxwatch then
 		raise Quota.Limit_reached;
-	let apath = get_watch_path con path in
 	let l = get_watches con apath in
 	if List.exists (fun w -> w.token = token) l then
 		raise Define.Already_exist;
 	let watch = watch_create ~con ~token ~path in
 	Hashtbl.replace con.watches apath (watch :: l);
 	con.nb_watches <- con.nb_watches + 1;
-	apath, watch
+	watch
 
 let del_watch con path token =
 	let apath = get_watch_path con path in
diff --git a/tools/ocaml/xenstored/connections.ml b/tools/ocaml/xenstored/connections.ml
index 8a66eeec3a..3c7429fe7f 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -114,8 +114,10 @@ let key_of_path path =
 	"" :: Store.Path.to_string_list path
 
 let add_watch cons con path token =
-	let apath, watch = Connection.add_watch con path token in
+	let apath = Connection.get_watch_path con path in
+	(* fail on invalid paths early by calling key_of_str before adding watch *)
 	let key = key_of_str apath in
+	let watch = Connection.add_watch con (path, apath) token in
 	let watches =
  		if Trie.mem cons.watches key
  		then Trie.find cons.watches key
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v1 4/4] tools/ocaml/xenstored: mkdir conflicts were sometimes missed
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (28 preceding siblings ...)
  (?)
@ 2021-01-15 22:29 ` Edwin Török
  -1 siblings, 0 replies; 39+ messages in thread
From: Edwin Török @ 2021-01-15 22:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Edwin Török, Christian Lindig, David Scott,
	Ian Jackson, Wei Liu

Due to how set_write_lowpath was used here it didn't detect
create/delete conflicts.
When we create an entry we must mark our parent as modified
(this is what creating a new node via write does).

Otherwise we can have 2 transactions one creating, and another deleting
a node both succeeding depending on timing.
Or one transaction reading an entry, concluding it doesn't exist,
do some other work based on that information and successfully commit
even if another transaction creates the node via mkdir meanwhile.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
---
 tools/ocaml/xenstored/transaction.ml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ocaml/xenstored/transaction.ml b/tools/ocaml/xenstored/transaction.ml
index 4ee77b6e14..0466b04ae3 100644
--- a/tools/ocaml/xenstored/transaction.ml
+++ b/tools/ocaml/xenstored/transaction.ml
@@ -172,7 +172,7 @@ let write t perm path value =
 
 let mkdir ?(with_watch=true) t perm path =
 	Store.mkdir t.store perm path;
-	set_write_lowpath t path;
+	set_write_lowpath t (Store.Path.get_parent path);
 	if with_watch then
 		add_wop t Xenbus.Xb.Op.Mkdir path
 
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/8] tools/xenstore: add live update command to xenstore-control
  2021-01-15 22:28 ` [PATCH v2 1/8] tools/xenstore: add live update command to xenstore-control Edwin Török
@ 2021-01-18  7:50   ` Jürgen Groß
  2021-01-18  9:40     ` Edwin Torok
  0 siblings, 1 reply; 39+ messages in thread
From: Jürgen Groß @ 2021-01-18  7:50 UTC (permalink / raw)
  To: Edwin Török, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Paul Durrant,
	Julien Grall


[-- Attachment #1.1.1: Type: text/plain, Size: 1084 bytes --]

On 15.01.21 23:28, Edwin Török wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> Add the "live-update" command to xenstore-control enabling updating
> xenstored to a new version in a running Xen system.
> 
> With -c <arg> it is possible to pass a different command line to the
> new instance of xenstored. This will replace the command line used
> for the invocation of the just running xenstored instance.
> 
> The running xenstored (or xenstore-stubdom) needs to support live
> updating, of course.
> 
> For now just add a small dummy handler to C xenstore denying any
> live update action.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> Reviewed-by: Julien Grall <jgrall@amazon.com>

Instead of merging multiple patches of mine into a single one and
sending it here with my S-o-b I'd prefer a simple caveat for your
series to depend on my C-xenstore series.

This additionally reduces the risk to miss any modifications in
my series done in a later iteration than the one you have taken.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/8] tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible
  2021-01-15 22:28 ` [PATCH v2 4/8] tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible Edwin Török
@ 2021-01-18  7:51   ` Jürgen Groß
  2021-01-18  9:28     ` Edwin Torok
  0 siblings, 1 reply; 39+ messages in thread
From: Jürgen Groß @ 2021-01-18  7:51 UTC (permalink / raw)
  To: Edwin Török, xen-devel
  Cc: Christian Lindig, David Scott, Ian Jackson, Wei Liu, Pau Ruiz Safont


[-- Attachment #1.1.1: Type: text/plain, Size: 1404 bytes --]

On 15.01.21 23:28, Edwin Török wrote:
> Currently when oxenstored receives SIGTERM it dumps its state and quits.
> It is possible to then restart it if --restart is given, however that is
> not always safe:
> 
> * domains could have active transactions, and after a restart they would
> either reuse transaction IDs of already open transactions, or get an
> error back that the transaction doesn't exist
> 
> * there could be pending data to send to a VM still in oxenstored's
>    queue which would be lost
> 
> * there could be pending input to be processed from a VM in oxenstored's
>    queue which would be lost
> 
> Prevent shutting down oxenstored via SIGTERM in the above situations.
> Also ignore domains marked as bad because oxenstored would never talk
> to them again.
> 
> Signed-off-by: Edwin Török <edvin.torok@citrix.com>
> Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
> Reviewed-by: Christian Lindig <christian.lindig@citrix.com>
> 
> ---
> Changed since V1:
> * post publicly now that the XSA is out
> ---
>   tools/ocaml/xenstored/connection.ml  | 35 ++++++++++++++++++++++++++++
>   tools/ocaml/xenstored/connections.ml |  8 +++++++
>   tools/ocaml/xenstored/xenstored.ml   | 13 +++++++++--
>   tools/xenstore/xenstored_core.c      |  7 +++++-

I don't think you should modify tools/xenstore/xenstored_core.c in your
series.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/8] tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible
  2021-01-18  7:51   ` Jürgen Groß
@ 2021-01-18  9:28     ` Edwin Torok
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Torok @ 2021-01-18  9:28 UTC (permalink / raw)
  To: jgross, xen-devel; +Cc: wl, Pau Ruiz Safont, dave, Christian Lindig, iwj

On Mon, 2021-01-18 at 08:51 +0100, Jürgen Groß wrote:
> On 15.01.21 23:28, Edwin Török wrote:
> > Currently when oxenstored receives SIGTERM it dumps its state and
> > quits.
> > It is possible to then restart it if --restart is given, however
> > that is
> > not always safe:
> > 
> > * domains could have active transactions, and after a restart they
> > would
> > either reuse transaction IDs of already open transactions, or get
> > an
> > error back that the transaction doesn't exist
> > 
> > * there could be pending data to send to a VM still in oxenstored's
> >    queue which would be lost
> > 
> > * there could be pending input to be processed from a VM in
> > oxenstored's
> >    queue which would be lost
> > 
> > Prevent shutting down oxenstored via SIGTERM in the above
> > situations.
> > Also ignore domains marked as bad because oxenstored would never
> > talk
> > to them again.
> > 
> > Signed-off-by: Edwin Török <edvin.torok@citrix.com>
> > Reviewed-by: Pau Ruiz Safont <pau.safont@citrix.com>
> > Reviewed-by: Christian Lindig <christian.lindig@citrix.com>
> > 
> > ---
> > Changed since V1:
> > * post publicly now that the XSA is out
> > ---
> >   tools/ocaml/xenstored/connection.ml  | 35
> > ++++++++++++++++++++++++++++
> >   tools/ocaml/xenstored/connections.ml |  8 +++++++
> >   tools/ocaml/xenstored/xenstored.ml   | 13 +++++++++--
> >   tools/xenstore/xenstored_core.c      |  7 +++++-
> 
> I don't think you should modify tools/xenstore/xenstored_core.c in
> your
> series.
> 

Thanks for spotting, I think that hunk ended up in the wrong place
during a patchqueue rebase when solving conflicts, it should be gone
when I post a V3:
https://github.com/edwintorok/xen/pull/1/commits/4ec9ffcee83a9668431b220bef4abdcd9ac51175

Best regards,
--Edwin

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/8] tools/xenstore: add live update command to xenstore-control
  2021-01-18  7:50   ` Jürgen Groß
@ 2021-01-18  9:40     ` Edwin Torok
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Torok @ 2021-01-18  9:40 UTC (permalink / raw)
  To: jgross, xen-devel
  Cc: jbeulich, julien, wl, iwj, sstabellini, jgrall, George Dunlap,
	paul, Andrew Cooper

On Mon, 2021-01-18 at 08:50 +0100, Jürgen Groß wrote:
> On 15.01.21 23:28, Edwin Török wrote:
> > From: Juergen Gross <jgross@suse.com>
> > 
> > Add the "live-update" command to xenstore-control enabling updating
> > xenstored to a new version in a running Xen system.
> > 
> > With -c <arg> it is possible to pass a different command line to
> > the
> > new instance of xenstored. This will replace the command line used
> > for the invocation of the just running xenstored instance.
> > 
> > The running xenstored (or xenstore-stubdom) needs to support live
> > updating, of course.
> > 
> > For now just add a small dummy handler to C xenstore denying any
> > live update action.
> > 
> > Signed-off-by: Juergen Gross <jgross@suse.com>
> > Reviewed-by: Paul Durrant <paul@xen.org>
> > Reviewed-by: Julien Grall <jgrall@amazon.com>
> 
> Instead of merging multiple patches of mine into a single one and
> sending it here with my S-o-b I'd prefer a simple caveat for your
> series to depend on my C-xenstore series.

Yes, I should've added a link to that in my cover letter,
in the next revision I'll point to:
https://lore.kernel.org/xen-devel/20210115083000.14186-1-jgross@suse.com/

> 
> This additionally reduces the risk to miss any modifications in
> my series done in a later iteration than the one you have taken.

I've reordered the patches and changed the base, it should be gone the
next I do git format-patch:
https://github.com/edwintorok/xen/pull/1/commits

Will also remove
https://github.com/edwintorok/xen/pull/1/commits/47dd1d9b99b9210e94fbd4af28afee8088d5267
once I implemented the change in oxenstored to not return BUSY that we
discussed.

> 
> 
> Juergen


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 0/4] tools/ocaml/xenstored: optimizations
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (29 preceding siblings ...)
  (?)
@ 2021-01-21 11:15 ` Christian Lindig
  -1 siblings, 0 replies; 39+ messages in thread
From: Christian Lindig @ 2021-01-21 11:15 UTC (permalink / raw)
  To: Edwin Torok, xen-devel; +Cc: David Scott, Ian Jackson, Wei Liu

Acked-by: Christian Lindig <christian.lindig@citrix.com>

I am providing feedback on https://github.com/edwintorok/xen/pull/1.

In general: this is a large patch and therefore difficult to review for correctness. However:

* It comes with a lot of testing and was fuzz-tested
* It improves building OCaml xenstore
* It improves functional and performance problems

I welcome this work and think it ends a period of stagnation of this code base. This applies not just to this patch but the other patches sent out by Edwin in the context of this.


________________________________________
From: Edwin Török <edvin.torok@citrix.com>
Sent: 15 January 2021 22:28
To: xen-devel@lists.xenproject.org
Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
Subject: [PATCH v4 0/4] tools/ocaml/xenstored: optimizations

Various speed optimizations that have already been posted,
but committing them was delayed to avoid conflicts with XSAs.
The XSAs are out, so these are ready to go now.

The switch to Maps may expose bugs in certain xenstored clients,
which previously relied on iteration order of the DIRECTORY response.

In our testing we found one such case, which turned out to be a bug
in a testsuite (it always dropped the 1st xenstore key).

For convenience here is a tree with all patch series applied:
https://github.com/edwintorok/xen/pull/1

Edwin Török (4):
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  tools/ocaml/xenstored: backport find_opt/update from 4.06
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/xenstored/connection.ml  |  3 --
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/history.ml     | 14 ------
 tools/ocaml/xenstored/stdext.ml      | 19 ++++++++
 tools/ocaml/xenstored/store.ml       | 51 +++++++++-----------
 tools/ocaml/xenstored/symbol.ml      | 70 +++++++---------------------
 tools/ocaml/xenstored/symbol.mli     | 22 +++------
 tools/ocaml/xenstored/trie.ml        | 61 +++++++++++-------------
 tools/ocaml/xenstored/trie.mli       | 26 +++++------
 tools/ocaml/xenstored/xenstored.ml   | 16 +------
 10 files changed, 109 insertions(+), 175 deletions(-)

--
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 0/2] tools/ocaml/libs/xc: domid control
  2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
                   ` (30 preceding siblings ...)
  (?)
@ 2021-01-21 11:16 ` Christian Lindig
  -1 siblings, 0 replies; 39+ messages in thread
From: Christian Lindig @ 2021-01-21 11:16 UTC (permalink / raw)
  To: Edwin Torok, xen-devel; +Cc: David Scott, Ian Jackson, Wei Liu

Acked-by: Christian Lindig <christian.lindig@citrix.com>

I'm providing some feedback on https://github.com/edwintorok/xen/pull/1

________________________________________
From: Edwin Török <edvin.torok@citrix.com>
Sent: 15 January 2021 22:28
To: xen-devel@lists.xenproject.org
Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
Subject: [PATCH v2 0/2] tools/ocaml/libs/xc: domid control

For debugging/testing purposes we want to be able to control the domid
from the XAPI toolstack too. Xen supports this since a long time.

For convenience here is a tree with all patch series applied:
https://github.com/edwintorok/xen/pull/1


Edwin Török (2):
  tools/ocaml/xenstored: trim txhistory on xenbus reconnect
  tools/ocaml/libs/xc: backward compatible domid control at domain
    creation time

 tools/ocaml/libs/xc/xenctrl.ml      | 5 ++++-
 tools/ocaml/libs/xc/xenctrl.mli     | 4 ++--
 tools/ocaml/libs/xc/xenctrl_stubs.c | 6 +++---
 tools/ocaml/xenstored/connection.ml | 2 +-
 tools/ocaml/xenstored/history.ml    | 4 ++++
 tools/ocaml/xenstored/process.ml    | 4 ++--
 6 files changed, 16 insertions(+), 9 deletions(-)

--
2.29.2



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/8] docs/designs/xenstore-migration.md: clarify that deletes are recursive
  2021-01-15 22:28 ` [PATCH v2 3/8] docs/designs/xenstore-migration.md: clarify that deletes are recursive Edwin Török
@ 2021-01-22 13:04   ` Jürgen Groß
  2021-01-22 14:44     ` Edwin Torok
  0 siblings, 1 reply; 39+ messages in thread
From: Jürgen Groß @ 2021-01-22 13:04 UTC (permalink / raw)
  To: Edwin Török, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 1358 bytes --]

On 15.01.21 23:28, Edwin Török wrote:
> Signed-off-by: Edwin Török <edvin.torok@citrix.com>
> ---
> Changed since V1:
> * post publicly now that the XSA is out
> ---
>   docs/designs/xenstore-migration.md | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
> index 2ce2c836f5..f44bc0c61d 100644
> --- a/docs/designs/xenstore-migration.md
> +++ b/docs/designs/xenstore-migration.md
> @@ -365,7 +365,8 @@ record previously present).
>   |              | 0x0001: read                                   |
>   |              | 0x0002: written                                |
>   |              |                                                |
> -|              | The value will be zero for a deleted node      |
> +|              | The value will be zero for a recursively       |
> +|              | deleted node                                   |

I don't see the value in this modification.

The wording is ambiguous: is the value zero only for nodes which were
deleted due to recursion, or do you mean deletes are recursive?

Per docs/misc/xenstore.txt all deletes are recursive, so there is no
need to repeat that here. And a zero value only for the recursions makes
no sense at all.

So I'd nack this patch.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/8] docs/designs/xenstore-migration.md: clarify that deletes are recursive
  2021-01-22 13:04   ` Jürgen Groß
@ 2021-01-22 14:44     ` Edwin Torok
  0 siblings, 0 replies; 39+ messages in thread
From: Edwin Torok @ 2021-01-22 14:44 UTC (permalink / raw)
  To: jgross, xen-devel
  Cc: iwj, sstabellini, julien, jbeulich, George Dunlap, Andrew Cooper, wl

On Fri, 2021-01-22 at 14:04 +0100, Jürgen Groß wrote:
> On 15.01.21 23:28, Edwin Török wrote:
> > Signed-off-by: Edwin Török <edvin.torok@citrix.com>
> > ---
> > Changed since V1:
> > * post publicly now that the XSA is out
> > ---
> >   docs/designs/xenstore-migration.md | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/docs/designs/xenstore-migration.md
> > b/docs/designs/xenstore-migration.md
> > index 2ce2c836f5..f44bc0c61d 100644
> > --- a/docs/designs/xenstore-migration.md
> > +++ b/docs/designs/xenstore-migration.md
> > @@ -365,7 +365,8 @@ record previously present).
> >   |              | 0x0001: read                                   |
> >   |              | 0x0002: written                                |
> >   |              |                                                |
> > -|              | The value will be zero for a deleted node      |
> > +|              | The value will be zero for a recursively       |
> > +|              | deleted node                                   |
> 
> I don't see the value in this modification.
> 
> The wording is ambiguous: is the value zero only for nodes which were
> deleted due to recursion, or do you mean deletes are recursive?

I was looking at this from the point of view of generating the diff,
where you can optimize and reduce the size of the diff if you notice
that it is sufficient to add a record only for the topmost entry when
the entire subtree is deleted.

You are right that looking at it from the point of view of applying the
transaction record you would reuse the existing delete implementation
which is already recursive.

> 
> Per docs/misc/xenstore.txt all deletes are recursive, so there is no
> need to repeat that here. And a zero value only for the recursions
> makes
> no sense at all.
> 
> So I'd nack this patch.

We can drop it.
--Edwin
> 
> Juergen


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2021-01-22 14:44 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-15 22:28 [PATCH v2 0/2] oxenstored build enhancements Edwin Török
2021-01-15 22:29 ` [PATCH v1 0/4] tools/ocaml/xenstored: bugfixes Edwin Török
2021-01-15 22:28 ` [PATCH v1 0/5] tools/ocaml/xenstored: structured fuzz testing Edwin Török
2021-01-15 22:28 ` [PATCH v4 0/4] tools/ocaml/xenstored: optimizations Edwin Török
2021-01-15 22:28 ` [PATCH v2 0/2] tools/ocaml/libs/xc: domid control Edwin Török
2021-01-15 22:28 ` [PATCH v2 0/8] tools/ocaml/xenstored: implement live update Edwin Török
2021-01-15 22:28 ` [PATCH v2 1/2] automation/: add Ubuntu:focal container Edwin Török
2021-01-15 22:28 ` [PATCH v2 2/2] Makefile: add build-tools-oxenstored Edwin Török
2021-01-15 22:28 ` [PATCH v2 1/8] tools/xenstore: add live update command to xenstore-control Edwin Török
2021-01-18  7:50   ` Jürgen Groß
2021-01-18  9:40     ` Edwin Torok
2021-01-15 22:28 ` [PATCH v2 2/8] Add workaround for xenstore-control flood issues Edwin Török
2021-01-15 22:28 ` [PATCH v2 3/8] docs/designs/xenstore-migration.md: clarify that deletes are recursive Edwin Török
2021-01-22 13:04   ` Jürgen Groß
2021-01-22 14:44     ` Edwin Torok
2021-01-15 22:28 ` [PATCH v2 4/8] tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible Edwin Török
2021-01-18  7:51   ` Jürgen Groß
2021-01-18  9:28     ` Edwin Torok
2021-01-15 22:28 ` [PATCH v2 5/8] tools/ocaml/xenstored: Automatically resume when possible Edwin Török
2021-01-15 22:28 ` [PATCH v2 6/8] tools/ocaml/xenstored: add cooperative live-update command Edwin Török
2021-01-15 22:28 ` [PATCH v2 7/8] tools/ocaml/xenstored: start live update process Edwin Török
2021-01-15 22:28 ` [PATCH v2 8/8] tools/ocaml/xenstored: Implement live update for socket connections Edwin Török
2021-01-15 22:28 ` [PATCH v2 1/2] tools/ocaml/xenstored: trim txhistory on xenbus reconnect Edwin Török
2021-01-15 22:28 ` [PATCH v2 2/2] tools/ocaml/libs/xc: backward compatible domid control at domain creation time Edwin Török
2021-01-15 22:28 ` [PATCH v4 1/4] tools/ocaml/xenstored: replace hand rolled GC with weak GC references Edwin Török
2021-01-15 22:28 ` [PATCH v4 2/4] tools/ocaml/xenstored: backport find_opt/update from 4.06 Edwin Török
2021-01-15 22:28 ` [PATCH v4 3/4] tools/ocaml/xenstored: use more efficient node trees Edwin Török
2021-01-15 22:28 ` [PATCH v4 4/4] tools/ocaml/xenstored: use more efficient tries Edwin Török
2021-01-15 22:29 ` [PATCH v1 1/5] tools/ocaml: add unit test skeleton with Dune build system Edwin Török
2021-01-15 22:29 ` [PATCH v1 2/5] tools/ocaml/xenstored: implement the live migration binary format Edwin Török
2021-01-15 22:29 ` [PATCH v1 3/5] tools/ocaml/xenstored: add binary dump format support Edwin Török
2021-01-15 22:29 ` [PATCH v1 4/5] tools/ocaml/xenstored: add support for binary format Edwin Török
2021-01-15 22:29 ` [PATCH v1 5/5] Add structured fuzzing unit test Edwin Török
2021-01-15 22:29 ` [PATCH v1 1/4] tools/ocaml/libs/xb: do not crash after xenbus is unmapped Edwin Török
2021-01-15 22:29 ` [PATCH v1 2/4] tools/ocaml/xenstored: fix quota calculation for mkdir EEXIST Edwin Török
2021-01-15 22:29 ` [PATCH v1 3/4] tools/ocaml/xenstored: reject invalid watch paths early Edwin Török
2021-01-15 22:29 ` [PATCH v1 4/4] tools/ocaml/xenstored: mkdir conflicts were sometimes missed Edwin Török
2021-01-21 11:15 ` [PATCH v4 0/4] tools/ocaml/xenstored: optimizations Christian Lindig
2021-01-21 11:16 ` [PATCH v2 0/2] tools/ocaml/libs/xc: domid control Christian Lindig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.