* [Xen-devel] [PATCH 0/5] domain context infrastructure
@ 2020-03-27 18:50 Paul Durrant
  2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Paul Durrant @ 2020-03-27 18:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich,
	Daniel De Graaf

Paul Durrant (5):
  xen/common: introduce a new framework for save/restore of 'domain'
    context
  xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
  tools/misc: add xen-ctx to present domain context
  common/domain: add a domain context record for shared_info...
  tools/libxc: make use of domain context SHARED_INFO record...

 .gitignore                          |   1 +
 tools/flask/policy/modules/xen.if   |   4 +-
 tools/libxc/include/xenctrl.h       |  11 ++
 tools/libxc/xc_domain.c             |  52 ++++++
 tools/libxc/xc_sr_common.h          |   7 +-
 tools/libxc/xc_sr_common_x86.c      |  58 ++++++
 tools/libxc/xc_sr_common_x86.h      |   4 +
 tools/libxc/xc_sr_common_x86_pv.c   |  52 ++++++
 tools/libxc/xc_sr_common_x86_pv.h   |   3 +
 tools/libxc/xc_sr_restore_x86_pv.c  |  40 ++---
 tools/libxc/xc_sr_save_x86_pv.c     |  26 +--
 tools/libxc/xg_save_restore.h       |   1 +
 tools/misc/Makefile                 |   4 +
 tools/misc/xen-ctx.c                | 152 ++++++++++++++++
 xen/common/Makefile                 |   1 +
 xen/common/domain.c                 |  55 ++++++
 xen/common/domctl.c                 | 115 ++++++++++++
 xen/common/save.c                   | 262 ++++++++++++++++++++++++++++
 xen/include/public/domctl.h         |  41 ++++-
 xen/include/public/save.h           |  82 +++++++++
 xen/include/public/xen.h            |   3 +
 xen/include/xen/save.h              | 115 ++++++++++++
 xen/xsm/flask/hooks.c               |   6 +
 xen/xsm/flask/policy/access_vectors |   4 +
 24 files changed, 1047 insertions(+), 52 deletions(-)
 create mode 100644 tools/misc/xen-ctx.c
 create mode 100644 xen/common/save.c
 create mode 100644 xen/include/public/save.h
 create mode 100644 xen/include/xen/save.h
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
-- 
2.20.1



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
@ 2020-03-27 18:50 ` Paul Durrant
  2020-04-01 12:00   ` Julien Grall
  2020-04-01 14:50   ` Jan Beulich
  2020-03-27 18:50 ` [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext Paul Durrant
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 32+ messages in thread
From: Paul Durrant @ 2020-03-27 18:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich

Domain context is state held in the hypervisor that does not fall under
the category of 'HVM state' but is instead 'PV state': state common to
PV guests and enlightened HVM guests (i.e. those that have PV drivers),
such as event channel state, grant entry state, etc.

To allow enlightened HVM guests to be migrated without their co-operation
it will be necessary to transfer such state along with the domain's
memory image, architectural state, etc. This framework is introduced for
that purpose.

This patch adds the new public header and the low level implementation,
entered via the domain_save() or domain_load() functions. Subsequent
patches will introduce other parts of the framework, and code that will
make use of it within the current version of the libxc migration stream.

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Wei Liu <wl@xen.org>
---
 xen/common/Makefile       |   1 +
 xen/common/save.c         | 262 ++++++++++++++++++++++++++++++++++++++
 xen/include/public/save.h |  74 +++++++++++
 xen/include/xen/save.h    | 115 +++++++++++++++++
 4 files changed, 452 insertions(+)
 create mode 100644 xen/common/save.c
 create mode 100644 xen/include/public/save.h
 create mode 100644 xen/include/xen/save.h

diff --git a/xen/common/Makefile b/xen/common/Makefile
index e8cde65370..90553ba5d7 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -37,6 +37,7 @@ obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += rcupdate.o
 obj-y += rwlock.o
+obj-y += save.o
 obj-y += shutdown.o
 obj-y += softirq.o
 obj-y += sort.o
diff --git a/xen/common/save.c b/xen/common/save.c
new file mode 100644
index 0000000000..bef99452d8
--- /dev/null
+++ b/xen/common/save.c
@@ -0,0 +1,262 @@
+/*
+ * save.c: Save and restore PV guest state common to all domain types.
+ *
+ * Copyright Amazon.com Inc. or its affiliates.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/save.h>
+
+struct domain_context {
+    bool log;
+    struct domain_save_descriptor desc;
+    domain_copy_entry copy;
+    void *priv;
+};
+
+static struct {
+    const char *name;
+    bool per_vcpu;
+    domain_save_handler save;
+    domain_load_handler load;
+} handlers[DOMAIN_SAVE_CODE_MAX + 1];
+
+void __init domain_register_save_type(unsigned int tc, const char *name,
+                                      bool per_vcpu,
+                                      domain_save_handler save,
+                                      domain_load_handler load)
+{
+    BUG_ON(tc >= ARRAY_SIZE(handlers));
+
+    ASSERT(!handlers[tc].save);
+    ASSERT(!handlers[tc].load);
+
+    handlers[tc].name = name;
+    handlers[tc].per_vcpu = per_vcpu;
+    handlers[tc].save = save;
+    handlers[tc].load = load;
+}
+
+int domain_save_entry(struct domain_context *c, unsigned int tc,
+                      const char *name, const struct vcpu *v, void *src,
+                      size_t src_len)
+{
+    int rc;
+
+    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
+         tc != DOMAIN_SAVE_CODE(END) )
+        gdprintk(XENLOG_INFO, "%pv save: %s (%zu)\n", v, name, src_len);
+
+    if ( !IS_ALIGNED(src_len, 8) )
+        return -EINVAL;
+
+    BUG_ON(tc != c->desc.typecode);
+    BUG_ON(v->vcpu_id != c->desc.instance);
+    c->desc.length = src_len;
+
+    rc = c->copy(c->priv, &c->desc, sizeof(c->desc));
+    if ( rc )
+        return rc;
+
+    return c->copy(c->priv, src, src_len);
+}
+
+int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
+                unsigned long mask, bool dry_run)
+{
+    struct domain_context c = {
+        .copy = copy,
+        .priv = priv,
+        .log = !dry_run,
+    };
+    struct domain_save_header h = {
+        .magic = DOMAIN_SAVE_MAGIC,
+        .version = DOMAIN_SAVE_VERSION,
+    };
+    struct domain_save_header e;
+    unsigned int i;
+    int rc;
+
+    ASSERT(d != current->domain);
+
+    if ( d->is_dying )
+        return -EINVAL;
+
+    domain_pause(d);
+
+    c.desc.typecode = DOMAIN_SAVE_CODE(HEADER);
+
+    rc = DOMAIN_SAVE_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h));
+    if ( rc )
+        goto out;
+
+    for ( i = 0; i < ARRAY_SIZE(handlers); i++ )
+    {
+        domain_save_handler save = handlers[i].save;
+
+        if ( (mask && !test_bit(i, &mask)) || !save )
+            continue;
+
+        memset(&c.desc, 0, sizeof(c.desc));
+        c.desc.typecode = i;
+
+        if ( handlers[i].per_vcpu )
+        {
+            struct vcpu *v;
+
+            for_each_vcpu ( d, v )
+            {
+                c.desc.instance = v->vcpu_id;
+
+                rc = save(v, &c, dry_run);
+                if ( rc )
+                    goto out;
+            }
+        }
+        else
+        {
+            rc = save(d->vcpu[0], &c, dry_run);
+            if ( rc )
+                goto out;
+        }
+    }
+
+    memset(&c.desc, 0, sizeof(c.desc));
+    c.desc.typecode = DOMAIN_SAVE_CODE(END);
+
+    rc = DOMAIN_SAVE_ENTRY(END, &c, d->vcpu[0], &e, 0);
+
+ out:
+    domain_unpause(d);
+
+    return rc;
+}
+
+int domain_load_entry(struct domain_context *c, unsigned int tc,
+                      const char *name, const struct vcpu *v, void *dst,
+                      size_t dst_len, bool exact)
+{
+    int rc;
+
+    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
+         tc != DOMAIN_SAVE_CODE(END) )
+        gdprintk(XENLOG_INFO, "%pv load: %s (%zu)\n", v, name, dst_len);
+
+    BUG_ON(tc != c->desc.typecode);
+    BUG_ON(v->vcpu_id != c->desc.instance);
+
+    if ( (exact ?
+          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
+         !IS_ALIGNED(c->desc.length, 8) )
+        return -EINVAL;
+
+    rc = c->copy(c->priv, dst, c->desc.length);
+    if ( rc )
+        return rc;
+
+    if ( !exact )
+    {
+        dst += c->desc.length;
+        memset(dst, 0, dst_len - c->desc.length);
+    }
+
+    return 0;
+}
+
+int domain_load(struct domain *d, domain_copy_entry copy, void *priv,
+                unsigned long mask)
+{
+    struct domain_context c = {
+        .copy = copy,
+        .priv = priv,
+        .log = true,
+    };
+    struct domain_save_header h, e;
+    int rc;
+
+    ASSERT(d != current->domain);
+
+    if ( d->is_dying )
+        return -EINVAL;
+
+    rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
+    if ( rc )
+        return rc;
+
+    if ( c.desc.typecode != DOMAIN_SAVE_CODE(HEADER) ||
+         c.desc.instance != 0 )
+        return -EINVAL;
+
+    rc = DOMAIN_LOAD_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h), true);
+    if ( rc )
+        return rc;
+
+    if ( h.magic != DOMAIN_SAVE_MAGIC || h.version != DOMAIN_SAVE_VERSION )
+        return -EINVAL;
+
+    domain_pause(d);
+
+    for ( ;; )
+    {
+        unsigned int i;
+        domain_load_handler load;
+        struct vcpu *v;
+
+        rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
+        if ( rc )
+            break;
+
+        if ( c.desc.typecode == DOMAIN_SAVE_CODE(END) )
+        {
+            rc = DOMAIN_LOAD_ENTRY(END, &c, d->vcpu[0], &e, 0, true);
+            break;
+        }
+
+        rc = -EINVAL;
+        if ( c.desc.typecode >= ARRAY_SIZE(handlers) ||
+             c.desc.instance >= d->max_vcpus )
+            break;
+
+        i = c.desc.typecode;
+        load = handlers[i].load;
+        v = d->vcpu[c.desc.instance];
+
+        if ( mask && !test_bit(i, &mask) )
+        {
+            /* Sink the data */
+            rc = c.copy(c.priv, NULL, c.desc.length);
+            if ( rc )
+                break;
+
+            continue;
+        }
+
+        rc = load ? load(v, &c) : -EOPNOTSUPP;
+        if ( rc )
+            break;
+    }
+
+    domain_unpause(d);
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/save.h b/xen/include/public/save.h
new file mode 100644
index 0000000000..84981cd0f6
--- /dev/null
+++ b/xen/include/public/save.h
@@ -0,0 +1,74 @@
+/*
+ * save.h
+ *
+ * Structure definitions for common PV/HVM domain state that is held by
+ * Xen and must be saved along with the domain's memory.
+ *
+ * Copyright Amazon.com Inc. or its affiliates.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_SAVE_H__
+#define __XEN_PUBLIC_SAVE_H__
+
+#include "xen.h"
+
+/* Each entry is preceded by a descriptor */
+struct domain_save_descriptor {
+    uint16_t typecode;
+    uint16_t instance;
+    /*
+     * Entry length not including this descriptor. Entries must be padded
+     * to a multiple of 8 bytes to make sure descriptors remain correctly
+     * aligned.
+     */
+    uint32_t length;
+};
+
+/*
+ * Each entry has a type associated with it. DECLARE_DOMAIN_SAVE_TYPE
+ * binds these things together.
+ */
+#define DECLARE_DOMAIN_SAVE_TYPE(_x, _code, _type) \
+    struct __DOMAIN_SAVE_TYPE_##_x { _type t; char c[_code]; };
+
+#define DOMAIN_SAVE_TYPE(_x) \
+    typeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->t)
+#define DOMAIN_SAVE_CODE(_x) \
+    (sizeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->c))
+#define DOMAIN_SAVE_MASK(_x) (1u << DOMAIN_SAVE_CODE(_x))
+
+/* Terminating entry */
+struct domain_save_end {};
+DECLARE_DOMAIN_SAVE_TYPE(END, 0, struct domain_save_end);
+
+#define DOMAIN_SAVE_MAGIC   0x53415645
+#define DOMAIN_SAVE_VERSION 0x00000001
+
+/* Initial entry */
+struct domain_save_header {
+    uint32_t magic;             /* Must be DOMAIN_SAVE_MAGIC */
+    uint32_t version;           /* Save format version */
+};
+DECLARE_DOMAIN_SAVE_TYPE(HEADER, 1, struct domain_save_header);
+
+#define DOMAIN_SAVE_CODE_MAX 1
+
+#endif /* __XEN_PUBLIC_SAVE_H__ */
diff --git a/xen/include/xen/save.h b/xen/include/xen/save.h
new file mode 100644
index 0000000000..d5846f9e68
--- /dev/null
+++ b/xen/include/xen/save.h
@@ -0,0 +1,115 @@
+/*
+ * save.h: support routines for save/restore
+ *
+ * Copyright Amazon.com Inc. or its affiliates.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __XEN_SAVE_H__
+#define __XEN_SAVE_H__
+
+#include <xen/sched.h>
+#include <xen/types.h>
+#include <xen/init.h>
+
+#include <public/xen.h>
+#include <public/save.h>
+
+struct domain_context;
+
+int domain_save_entry(struct domain_context *c, unsigned int tc,
+                      const char *name, const struct vcpu *v, void *src,
+                      size_t src_len);
+
+#define DOMAIN_SAVE_ENTRY(_x, _c, _v, _src, _len)                        \
+        domain_save_entry((_c), DOMAIN_SAVE_CODE(_x), #_x, (_v), (_src), \
+                          (_len))
+
+int domain_load_entry(struct domain_context *c, unsigned int tc,
+                      const char *name, const struct vcpu *v, void *dest,
+                      size_t dest_len, bool exact);
+
+#define DOMAIN_LOAD_ENTRY(_x, _c, _v, _src, _len, _exact)                \
+        domain_load_entry((_c), DOMAIN_SAVE_CODE(_x), #_x, (_v), (_src), \
+                          (_len), (_exact))
+
+/*
+ * The 'dry_run' flag indicates that the caller of domain_save() (see
+ * below) is not trying to actually acquire the data, only the size
+ * of the data. The save handler can therefore limit work to only that
+ * which is necessary to call DOMAIN_SAVE_ENTRY() with an accurate value
+ * for '_len'.
+ */
+typedef int (*domain_save_handler)(const struct vcpu *v,
+                                   struct domain_context *h,
+                                   bool dry_run);
+typedef int (*domain_load_handler)(struct vcpu *v,
+                                   struct domain_context *h);
+
+void domain_register_save_type(unsigned int tc, const char *name,
+                               bool per_vcpu,
+                               domain_save_handler save,
+                               domain_load_handler load);
+
+#define DOMAIN_REGISTER_SAVE_RESTORE(_x, _per_vcpu, _save, _load) \
+static int __init __domain_register_##_x##_save_restore(void)     \
+{                                                                 \
+    domain_register_save_type(                                    \
+        DOMAIN_SAVE_CODE(_x),                                     \
+        #_x,                                                      \
+        (_per_vcpu),                                              \
+        &(_save),                                                 \
+        &(_load));                                                \
+                                                                  \
+    return 0;                                                     \
+}                                                                 \
+__initcall(__domain_register_##_x##_save_restore);
+
+/* Copy callback function */
+typedef int (*domain_copy_entry)(void *priv, void *data, size_t len);
+
+/*
+ * Entry points:
+ *
+ * int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
+ *                 unsigned long mask, bool dry_run);
+ * int domain_load(struct domain *d, domain_copy_entry copy, void *priv,
+ *                 unsigned long mask);
+ *
+ * copy:    This is a callback function provided by the caller that will be
+ *          used to write to (in the save case) or read from (in the load
+ *          case) the context buffer.
+ * priv:    This is a pointer that will be passed to the copy function to
+ *          allow it to identify the context buffer and the current state
+ *          of the save or load operation.
+ * mask:    This is a mask to determine which save record types should be
+ *          copied to or from the buffer.
+ *          If it is zero then all save record types will be copied.
+ *          If it is non-zero then only record types with codes
+ *          corresponding to set bits will be copied. I.e. to copy save
+ *          record 'type', set the bit in position DOMAIN_SAVE_CODE(type).
+ *          DOMAIN_SAVE_CODE(HEADER) and DOMAIN_SAVE_CODE(END) records must
+ *          always be present and thus will be copied regardless of whether
+ *          the bits in those positions are set or not.
+ * dry_run: See the comment above the domain_save_handler typedef.
+ *
+ * NOTE: A convenience macro, DOMAIN_SAVE_MASK(type), is defined to support
+ *       setting bits in the mask field.
+ */
+int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
+                unsigned long mask, bool dry_run);
+int domain_load(struct domain *d, domain_copy_entry copy, void *priv,
+                unsigned long mask);
+
+#endif /* __XEN_SAVE_H__ */
-- 
2.20.1




* [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
  2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
  2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
@ 2020-03-27 18:50 ` Paul Durrant
  2020-04-01 13:42   ` Julien Grall
  2020-04-01 13:46   ` Julien Grall
  2020-03-27 18:50 ` [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context Paul Durrant
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 32+ messages in thread
From: Paul Durrant @ 2020-03-27 18:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich,
	Daniel De Graaf

These domctls provide a mechanism to get and set domain context from
the toolstack.

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
---
 tools/flask/policy/modules/xen.if   |   4 +-
 tools/libxc/include/xenctrl.h       |  11 +++
 tools/libxc/xc_domain.c             |  52 +++++++++++++
 xen/common/domctl.c                 | 115 ++++++++++++++++++++++++++++
 xen/include/public/domctl.h         |  41 +++++++++-
 xen/xsm/flask/hooks.c               |   6 ++
 xen/xsm/flask/policy/access_vectors |   4 +
 7 files changed, 230 insertions(+), 3 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 8eb2293a52..2bc9db4f64 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -53,7 +53,7 @@ define(`create_domain_common', `
 	allow $1 $2:domain2 { set_cpu_policy settsc setscheduler setclaim
 			set_vnumainfo get_vnumainfo cacheflush
 			psr_cmt_op psr_alloc soft_reset
-			resource_map get_cpu_policy };
+			resource_map get_cpu_policy setcontext };
 	allow $1 $2:security check_context;
 	allow $1 $2:shadow enable;
 	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
@@ -97,7 +97,7 @@ define(`migrate_domain_out', `
 	allow $1 $2:hvm { gethvmc getparam };
 	allow $1 $2:mmu { stat pageinfo map_read };
 	allow $1 $2:domain { getaddrsize getvcpucontext pause destroy };
-	allow $1 $2:domain2 gettsc;
+	allow $1 $2:domain2 { gettsc getcontext };
 	allow $1 $2:shadow { enable disable logdirty };
 ')
 
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index fc6e57a1a0..5c0d0d27a4 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -867,6 +867,17 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
                              uint8_t *hvm_ctxt,
                              uint32_t size);
 
+int xc_domain_getcontext(xc_interface *xch,
+                         uint32_t domid,
+                         uint32_t mask,
+                         uint8_t *ctxt_buf,
+                         uint32_t size);
+int xc_domain_setcontext(xc_interface *xch,
+                         uint32_t domid,
+                         uint32_t mask,
+                         uint8_t *ctxt_buf,
+                         uint32_t size);
+
 /**
  * This function will return guest IO ABI protocol
  *
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 71829c2bce..15bcf671fc 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -537,6 +537,58 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
     return ret;
 }
 
+int xc_domain_getcontext(xc_interface *xch,
+                         uint32_t domid,
+                         uint32_t mask,
+                         uint8_t *ctxt_buf,
+                         uint32_t size)
+{
+    int ret;
+    DECLARE_DOMCTL;
+    DECLARE_HYPERCALL_BOUNCE(ctxt_buf, size, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( xc_hypercall_bounce_pre(xch, ctxt_buf) )
+        return -1;
+
+    domctl.cmd = XEN_DOMCTL_getdomaincontext;
+    domctl.domain = domid;
+    domctl.u.domaincontext.mask = mask;
+    domctl.u.domaincontext.size = size;
+    set_xen_guest_handle(domctl.u.domaincontext.buffer, ctxt_buf);
+
+    ret = do_domctl(xch, &domctl);
+
+    xc_hypercall_bounce_post(xch, ctxt_buf);
+
+    return !ret ? domctl.u.domaincontext.size : -1;
+}
+
+int xc_domain_setcontext(xc_interface *xch,
+                         uint32_t domid,
+                         uint32_t mask,
+                         uint8_t *ctxt_buf,
+                         uint32_t size)
+{
+    int ret;
+    DECLARE_DOMCTL;
+    DECLARE_HYPERCALL_BOUNCE(ctxt_buf, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( xc_hypercall_bounce_pre(xch, ctxt_buf) )
+        return -1;
+
+    domctl.cmd = XEN_DOMCTL_setdomaincontext;
+    domctl.domain = domid;
+    domctl.u.domaincontext.mask = mask;
+    domctl.u.domaincontext.size = size;
+    set_xen_guest_handle(domctl.u.domaincontext.buffer, ctxt_buf);
+
+    ret = do_domctl(xch, &domctl);
+
+    xc_hypercall_bounce_post(xch, ctxt_buf);
+
+    return ret;
+}
+
 int xc_vcpu_getcontext(xc_interface *xch,
                        uint32_t domid,
                        uint32_t vcpu,
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index a69b3b59a8..9c39519b04 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -25,6 +25,7 @@
 #include <xen/hypercall.h>
 #include <xen/vm_event.h>
 #include <xen/monitor.h>
+#include <xen/save.h>
 #include <asm/current.h>
 #include <asm/irq.h>
 #include <asm/page.h>
@@ -358,6 +359,111 @@ static struct vnuma_info *vnuma_init(const struct xen_domctl_vnuma *uinfo,
     return ERR_PTR(ret);
 }
 
+struct domctl_context
+{
+    void *buffer;
+    size_t len;
+    size_t cur;
+};
+
+static int accumulate_size(void *priv, void *data, size_t len)
+{
+    struct domctl_context *c = priv;
+
+    c->len += len;
+
+    return 0;
+}
+
+static int save_data(void *priv, void *data, size_t len)
+{
+    struct domctl_context *c = priv;
+
+    if ( c->len - c->cur < len )
+        return -ENOSPC;
+
+    memcpy(c->buffer + c->cur, data, len);
+    c->cur += len;
+
+    return 0;
+}
+
+static int getdomaincontext(struct domain *d,
+                            struct xen_domctl_domaincontext *dc)
+{
+    struct domctl_context c = { };
+    int rc;
+
+    if ( d == current->domain )
+        return -EPERM;
+
+    if ( guest_handle_is_null(dc->buffer) ) /* query for buffer size */
+    {
+        if ( dc->size )
+            return -EINVAL;
+
+        /* dry run to acquire buffer size */
+        rc = domain_save(d, accumulate_size, &c, dc->mask, true);
+        if ( rc )
+            return rc;
+
+        dc->size = c.len;
+        return 0;
+    }
+
+    c.len = dc->size;
+    c.buffer = xmalloc_bytes(c.len);
+    if ( !c.buffer )
+        return -ENOMEM;
+
+    rc = domain_save(d, save_data, &c, dc->mask, false);
+
+    dc->size = c.cur;
+    if ( !rc && copy_to_guest(dc->buffer, c.buffer, dc->size) )
+        rc = -EFAULT;
+
+    xfree(c.buffer);
+
+    return rc;
+}
+
+static int load_data(void *priv, void *data, size_t len)
+{
+    struct domctl_context *c = priv;
+
+    if ( c->len - c->cur < len )
+        return -ENODATA;
+
+    if ( data )
+        memcpy(data, c->buffer + c->cur, len);
+
+    c->cur += len;
+
+    return 0;
+}
+
+static int setdomaincontext(struct domain *d,
+                            const struct xen_domctl_domaincontext *dc)
+{
+    struct domctl_context c = { .len = dc->size };
+    int rc;
+
+    if ( d == current->domain )
+        return -EPERM;
+
+    c.buffer = xmalloc_bytes(c.len);
+    if ( !c.buffer )
+        return -ENOMEM;
+
+    rc = !copy_from_guest(c.buffer, dc->buffer, c.len) ?
+        domain_load(d, load_data, &c, dc->mask) : -EFAULT;
+
+    xfree(c.buffer);
+
+    return rc;
+}
+
 long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 {
     long ret = 0;
@@ -942,6 +1048,15 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             copyback = 1;
         break;
 
+    case XEN_DOMCTL_getdomaincontext:
+        ret = getdomaincontext(d, &op->u.domaincontext);
+        copyback = !ret;
+        break;
+
+    case XEN_DOMCTL_setdomaincontext:
+        ret = setdomaincontext(d, &op->u.domaincontext);
+        break;
+
     default:
         ret = arch_do_domctl(op, d, u_domctl);
         break;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 1ad34c35eb..24ed6852cf 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -38,7 +38,7 @@
 #include "hvm/save.h"
 #include "memory.h"
 
-#define XEN_DOMCTL_INTERFACE_VERSION 0x00000012
+#define XEN_DOMCTL_INTERFACE_VERSION 0x00000013
 
 /*
  * NB. xen_domctl.domain is an IN/OUT parameter for this operation.
@@ -1129,6 +1129,42 @@ struct xen_domctl_vuart_op {
                                  */
 };
 
+/*
+ * Get/Set domain PV context. The same struct xen_domctl_domaincontext
+ * is used for both commands but with slightly different field semantics
+ * as follows:
+ *
+ * XEN_DOMCTL_getdomaincontext
+ * ---------------------------
+ *
+ * buffer (IN):   The buffer into which the context data should be
+ *                copied, or NULL to query the buffer size that should
+ *                be allocated.
+ * size (IN/OUT): If 'buffer' is NULL then the value passed in must be
+ *                zero, and the value passed out will be the size of the
+ *                buffer to allocate.
+ *                If 'buffer' is non-NULL then the value passed in must
+ *                be the size of the buffer into which data may be copied.
+ * mask (IN):     See comment on domain_save/load() in xen/save.h.
+ *
+ * XEN_DOMCTL_setdomaincontext
+ * ---------------------------
+ *
+ * buffer (IN):   The buffer from which the context data should be
+ *                copied.
+ * size (IN):     The size of the buffer from which data may be copied.
+ *                This data must include DOMAIN_SAVE_CODE_HEADER at the
+ *                start and terminate with a DOMAIN_SAVE_CODE_END record.
+ *                Any data beyond the DOMAIN_SAVE_CODE_END record will be
+ *                ignored.
+ * mask (IN):     See comment on domain_save/load() in xen/save.h.
+ */
+struct xen_domctl_domaincontext {
+    uint32_t size;
+    uint32_t mask;
+    XEN_GUEST_HANDLE_64(uint8) buffer;
+};
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1210,6 +1246,8 @@ struct xen_domctl {
 #define XEN_DOMCTL_vuart_op                      81
 #define XEN_DOMCTL_get_cpu_policy                82
 #define XEN_DOMCTL_set_cpu_policy                83
+#define XEN_DOMCTL_getdomaincontext              84
+#define XEN_DOMCTL_setdomaincontext              85
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1270,6 +1308,7 @@ struct xen_domctl {
         struct xen_domctl_monitor_op        monitor_op;
         struct xen_domctl_psr_alloc         psr_alloc;
         struct xen_domctl_vuart_op          vuart_op;
+        struct xen_domctl_domaincontext     domaincontext;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 8af8602b46..d94d0fc125 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -744,6 +744,12 @@ static int flask_domctl(struct domain *d, int cmd)
     case XEN_DOMCTL_get_cpu_policy:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__GET_CPU_POLICY);
 
+    case XEN_DOMCTL_setdomaincontext:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SETCONTEXT);
+
+    case XEN_DOMCTL_getdomaincontext:
+        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__GETCONTEXT);
+
     default:
         return avc_unknown_permission("domctl", cmd);
     }
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index c055c14c26..fccfb9de82 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -245,6 +245,10 @@ class domain2
     resource_map
 # XEN_DOMCTL_get_cpu_policy
     get_cpu_policy
+# XEN_DOMCTL_setdomaincontext
+    setcontext
+# XEN_DOMCTL_getdomaincontext
+    getcontext
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context
  2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
  2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
  2020-03-27 18:50 ` [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext Paul Durrant
@ 2020-03-27 18:50 ` Paul Durrant
  2020-03-30 10:54   ` Jan Beulich
  2020-03-27 18:50 ` [Xen-devel] [PATCH 4/5] common/domain: add a domain context record for shared_info Paul Durrant
  2020-03-27 18:50 ` [Xen-devel] [PATCH 5/5] tools/libxc: make use of domain context SHARED_INFO record Paul Durrant
  4 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-03-27 18:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Paul Durrant

This tool is analogous to 'xen-hvmctx', which presents HVM context.
Subsequent patches will add 'dump' functions when new records are
introduced.

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
---
 .gitignore           |   1 +
 tools/misc/Makefile  |   4 ++
 tools/misc/xen-ctx.c | 144 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 149 insertions(+)
 create mode 100644 tools/misc/xen-ctx.c

diff --git a/.gitignore b/.gitignore
index 4ca679ddbc..72b807141f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -206,6 +206,7 @@ tools/misc/cpuperf/cpuperf-xen
 tools/misc/xc_shadow
 tools/misc/xen_cpuperf
 tools/misc/xen-cpuid
+tools/misc/xen-ctx
 tools/misc/xen-detect
 tools/misc/xen-diag
 tools/misc/xen-tmem-list-parse
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 63947bfadc..6347bb24e9 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -30,6 +30,7 @@ INSTALL_SBIN                   += xenpm
 INSTALL_SBIN                   += xenwatchdogd
 INSTALL_SBIN                   += xen-livepatch
 INSTALL_SBIN                   += xen-diag
+INSTALL_SBIN                   += xen-ctx
 INSTALL_SBIN += $(INSTALL_SBIN-y)
 
 # Everything to be installed in a private bin/
@@ -108,6 +109,9 @@ xen-livepatch: xen-livepatch.o
 xen-diag: xen-diag.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-ctx: xen-ctx.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 xen-lowmemd: xen-lowmemd.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenevtchn) $(LDLIBS_libxenctrl) $(LDLIBS_libxenstore) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-ctx.c b/tools/misc/xen-ctx.c
new file mode 100644
index 0000000000..c31dd5d8e9
--- /dev/null
+++ b/tools/misc/xen-ctx.c
@@ -0,0 +1,144 @@
+/*
+ * xen-ctx.c
+ *
+ * Print out domain save records in a human-readable way.
+ *
+ * Copyright Amazon.com Inc. or its affiliates.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include <xenctrl.h>
+#include <xen/xen.h>
+#include <xen/domctl.h>
+#include <xen/save.h>
+
+static void *buf = NULL;
+static size_t len, off;
+
+#define READ(_x) do {                                                       \
+    if ( len - off < sizeof (_x) )                                          \
+    {                                                                       \
+        fprintf(stderr, "Error: need another %zu bytes, only %zu available\n",\
+                sizeof(_x), len - off);                                     \
+        exit(1);                                                            \
+    }                                                                       \
+    memcpy(&(_x), buf + off, sizeof (_x));                                  \
+    off += sizeof (_x);                                                     \
+} while (0)
+
+static void dump_header(void)
+{
+    DOMAIN_SAVE_TYPE(HEADER) h;
+    READ(h);
+    printf("    HEADER: magic %#x, version %u\n",
+           h.magic, h.version);
+}
+
+static void dump_end(void)
+{
+    DOMAIN_SAVE_TYPE(END) e;
+    READ(e);
+    printf("    END\n");
+}
+
+int main(int argc, char **argv)
+{
+    uint32_t domid;
+    unsigned int entry;
+    xc_interface *xch;
+    int rc;
+
+    if ( argc != 2 || !argv[1] || (rc = atoi(argv[1])) < 0 )
+    {
+        fprintf(stderr, "usage: %s <domid>\n", argv[0]);
+        exit(1);
+    }
+    domid = rc;
+
+    xch = xc_interface_open(0,0,0);
+    if ( !xch )
+    {
+        fprintf(stderr, "Error: can't open libxc handle\n");
+        exit(1);
+    }
+
+    rc = xc_domain_getcontext(xch, domid, 0, 0, 0);
+    if ( rc < 0 )
+    {
+        fprintf(stderr, "Error: can't get record length for dom %u: %s\n",
+                domid, strerror(errno));
+        exit(1);
+    }
+    len = rc;
+
+    buf = malloc(len);
+    if ( !buf )
+    {
+        fprintf(stderr, "Error: can't allocate %zu bytes\n", len);
+        exit(1);
+    }
+
+    rc = xc_domain_getcontext(xch, domid, 0, buf, len);
+    if ( rc < 0 )
+    {
+        fprintf(stderr, "Error: can't get domain record for dom %u: %s\n",
+                domid, strerror(errno));
+        exit(1);
+    }
+    len = rc;
+    off = 0;
+
+    printf("Domain save records for d%u\n", domid);
+
+    entry = 0;
+    for (;;) {
+        struct domain_save_descriptor desc;
+
+        READ(desc);
+        printf("[%u] type %u instance %u, length %u\n", entry++,
+               desc.typecode, desc.instance, desc.length);
+
+        switch (desc.typecode)
+        {
+        case DOMAIN_SAVE_CODE(HEADER): dump_header(); break;
+        case DOMAIN_SAVE_CODE(END): dump_end(); return 0;
+        default:
+            printf("Unknown type %u: skipping\n", desc.typecode);
+            off += desc.length;
+            break;
+        }
+    }
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Xen-devel] [PATCH 4/5] common/domain: add a domain context record for shared_info...
  2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
                   ` (2 preceding siblings ...)
  2020-03-27 18:50 ` [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context Paul Durrant
@ 2020-03-27 18:50 ` Paul Durrant
  2020-04-01 14:27   ` Julien Grall
  2020-03-27 18:50 ` [Xen-devel] [PATCH 5/5] tools/libxc: make use of domain context SHARED_INFO record Paul Durrant
  4 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-03-27 18:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Paul Durrant,
	Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich

... and update xen-ctx to dump some information describing the record.

NOTE: To allow a sensible definition of the record in public/save.h
      this patch also adds a definition of the Xen ABI's de-facto page
      size into public/xen.h.

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
---
 tools/misc/xen-ctx.c      |  8 ++++++
 xen/common/domain.c       | 55 +++++++++++++++++++++++++++++++++++++++
 xen/include/public/save.h | 10 ++++++-
 xen/include/public/xen.h  |  3 +++
 4 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/tools/misc/xen-ctx.c b/tools/misc/xen-ctx.c
index c31dd5d8e9..8f9692843b 100644
--- a/tools/misc/xen-ctx.c
+++ b/tools/misc/xen-ctx.c
@@ -57,6 +57,13 @@ static void dump_header(void)
            h.magic, h.version);
 }
 
+static void dump_shared_info(void)
+{
+    DOMAIN_SAVE_TYPE(SHARED_INFO) s;
+    READ(s);
+    printf("    SHARED_INFO: field_width %u\n", s.field_width);
+}
+
 static void dump_end(void)
 {
     DOMAIN_SAVE_TYPE(END) e;
@@ -124,6 +131,7 @@ int main(int argc, char **argv)
         switch (desc.typecode)
         {
         case DOMAIN_SAVE_CODE(HEADER): dump_header(); break;
+        case DOMAIN_SAVE_CODE(SHARED_INFO): dump_shared_info(); break;
         case DOMAIN_SAVE_CODE(END): dump_end(); return 0;
         default:
             printf("Unknown type %u: skipping\n", desc.typecode);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3dcd73f67c..484f6bde13 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -33,6 +33,7 @@
 #include <xen/xenoprof.h>
 #include <xen/irq.h>
 #include <xen/argo.h>
+#include <xen/save.h>
 #include <asm/debugger.h>
 #include <asm/p2m.h>
 #include <asm/processor.h>
@@ -1646,6 +1647,60 @@ int continue_hypercall_on_cpu(
     return 0;
 }
 
+static int save_shared_info(const struct vcpu *v, struct domain_context *c,
+                            bool dry_run)
+{
+    struct domain *d = v->domain;
+    struct domain_shared_info_context ctxt = {};
+
+    if ( !dry_run )
+    {
+        memcpy(ctxt.buffer, d->shared_info, PAGE_SIZE);
+        ctxt.field_width = has_32bit_shinfo(d) ? 4 : 8;
+    }
+
+    return DOMAIN_SAVE_ENTRY(SHARED_INFO, c, v, &ctxt, sizeof(ctxt));
+}
+
+static int load_shared_info(struct vcpu *v, struct domain_context *c)
+{
+    struct domain *d = v->domain;
+    struct domain_shared_info_context ctxt;
+    unsigned int i;
+    int rc;
+
+    rc = DOMAIN_LOAD_ENTRY(SHARED_INFO, c, v, &ctxt, sizeof(ctxt), true);
+    if ( rc )
+        return rc;
+
+    for ( i = 0; i < ARRAY_SIZE(ctxt.pad); i++ )
+        if ( ctxt.pad[i] )
+            return -EINVAL;
+
+    switch ( ctxt.field_width )
+    {
+    case 4:
+        d->arch.has_32bit_shinfo = 1;
+        break;
+
+    case 8:
+        d->arch.has_32bit_shinfo = 0;
+        break;
+
+    default:
+        rc = -EINVAL;
+        break;
+    }
+
+    if ( !rc )
+        memcpy(d->shared_info, ctxt.buffer, PAGE_SIZE);
+
+    return rc;
+}
+
+DOMAIN_REGISTER_SAVE_RESTORE(SHARED_INFO, false, save_shared_info,
+                             load_shared_info);
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/public/save.h b/xen/include/public/save.h
index 84981cd0f6..ff804a7dbf 100644
--- a/xen/include/public/save.h
+++ b/xen/include/public/save.h
@@ -69,6 +69,14 @@ struct domain_save_header {
 };
 DECLARE_DOMAIN_SAVE_TYPE(HEADER, 1, struct domain_save_header);
 
-#define DOMAIN_SAVE_CODE_MAX 1
+struct domain_shared_info_context {
+    uint8_t buffer[XEN_PAGE_SIZE];
+    uint8_t field_width;
+    uint8_t pad[7];
+};
+
+DECLARE_DOMAIN_SAVE_TYPE(SHARED_INFO, 2, struct domain_shared_info_context);
+
+#define DOMAIN_SAVE_CODE_MAX 2
 
 #endif /* __XEN_PUBLIC_SAVE_H__ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 75b1619d0d..cbf603f289 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -37,6 +37,9 @@
 #error "Unsupported architecture"
 #endif
 
+/* The Xen ABI assumes a page size of 4k. */
+#define XEN_PAGE_SIZE 4096
+
 #ifndef __ASSEMBLY__
 /* Guest handles for primitive C types. */
 DEFINE_XEN_GUEST_HANDLE(char);
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Xen-devel] [PATCH 5/5] tools/libxc: make use of domain context SHARED_INFO record...
  2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
                   ` (3 preceding siblings ...)
  2020-03-27 18:50 ` [Xen-devel] [PATCH 4/5] common/domain: add a domain context record for shared_info Paul Durrant
@ 2020-03-27 18:50 ` Paul Durrant
  4 siblings, 0 replies; 32+ messages in thread
From: Paul Durrant @ 2020-03-27 18:50 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Wei Liu, Paul Durrant

... in the save/restore code.

This patch replaces direct mapping of the shared_info_frame (retrieved
using XEN_DOMCTL_getdomaininfo) with save/load of the domain context
SHARED_INFO record.

No modifications are made to the definition of the migration stream at
this point. Subsequent patches will define a record in the libxc domain
image format for passing domain context and convert the save/restore code
to use that.

Signed-off-by: Paul Durrant <paul@xen.org>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wl@xen.org>
---
 tools/libxc/xc_sr_common.h         |  7 +++-
 tools/libxc/xc_sr_common_x86.c     | 58 ++++++++++++++++++++++++++++++
 tools/libxc/xc_sr_common_x86.h     |  4 +++
 tools/libxc/xc_sr_common_x86_pv.c  | 52 +++++++++++++++++++++++++++
 tools/libxc/xc_sr_common_x86_pv.h  |  3 ++
 tools/libxc/xc_sr_restore_x86_pv.c | 40 ++++++++-------------
 tools/libxc/xc_sr_save_x86_pv.c    | 26 ++------------
 tools/libxc/xg_save_restore.h      |  1 +
 8 files changed, 142 insertions(+), 49 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 5dd51ccb15..df21b46dc7 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -287,6 +287,11 @@ struct xc_sr_context
     {
         struct /* x86 */
         {
+            struct {
+                void *buffer;
+                unsigned int len;
+            } domain_context;
+
             struct /* x86 PV guest. */
             {
                 /* 4 or 8; 32 or 64 bit domain */
@@ -314,7 +319,7 @@ struct xc_sr_context
                 /* The guest pfns containing the p2m leaves */
                 xen_pfn_t *p2m_pfns;
 
-                /* Read-only mapping of guests shared info page */
+                /* Pointer to shared_info (located in context buffer)  */
                 shared_info_any_t *shinfo;
 
                 /* p2m generation count for verifying validity of local p2m. */
diff --git a/tools/libxc/xc_sr_common_x86.c b/tools/libxc/xc_sr_common_x86.c
index 011684df97..7d297b75b5 100644
--- a/tools/libxc/xc_sr_common_x86.c
+++ b/tools/libxc/xc_sr_common_x86.c
@@ -42,6 +42,64 @@ int handle_x86_tsc_info(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     return 0;
 }
 
+int x86_get_context(struct xc_sr_context *ctx, uint32_t mask)
+{
+    xc_interface *xch = ctx->xch;
+    int rc;
+
+    if ( ctx->x86.domain_context.buffer )
+    {
+        ERROR("Domain context already present");
+        return -1;
+    }
+
+    rc = xc_domain_getcontext(xch, ctx->domid, mask, NULL, 0);
+    if ( rc < 0 )
+    {
+        PERROR("Unable to get size of domain context");
+        return -1;
+    }
+
+    ctx->x86.domain_context.buffer = malloc(rc);
+    if ( ctx->x86.domain_context.buffer == NULL )
+    {
+        PERROR("Unable to allocate memory for domain context");
+        return -1;
+    }
+
+    rc = xc_domain_getcontext(xch, ctx->domid, mask,
+                              ctx->x86.domain_context.buffer, rc);
+    if ( rc < 0 )
+    {
+        PERROR("Unable to get domain context");
+        return -1;
+    }
+
+    ctx->x86.domain_context.len = rc;
+
+    return 0;
+}
+
+int x86_set_context(struct xc_sr_context *ctx, uint32_t mask)
+{
+    xc_interface *xch = ctx->xch;
+
+    if ( !ctx->x86.domain_context.buffer )
+    {
+        ERROR("Domain context not present");
+        return -1;
+    }
+
+    return xc_domain_setcontext(xch, ctx->domid, mask,
+                                ctx->x86.domain_context.buffer,
+                                ctx->x86.domain_context.len);
+}
+
+void x86_cleanup(struct xc_sr_context *ctx)
+{
+    free(ctx->x86.domain_context.buffer);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxc/xc_sr_common_x86.h b/tools/libxc/xc_sr_common_x86.h
index ebc4355bd1..1746094081 100644
--- a/tools/libxc/xc_sr_common_x86.h
+++ b/tools/libxc/xc_sr_common_x86.h
@@ -14,6 +14,10 @@ int write_x86_tsc_info(struct xc_sr_context *ctx);
  */
 int handle_x86_tsc_info(struct xc_sr_context *ctx, struct xc_sr_record *rec);
 
+int x86_get_context(struct xc_sr_context *ctx, uint32_t mask);
+int x86_set_context(struct xc_sr_context *ctx, uint32_t mask);
+void x86_cleanup(struct xc_sr_context *ctx);
+
 #endif
 /*
  * Local variables:
diff --git a/tools/libxc/xc_sr_common_x86_pv.c b/tools/libxc/xc_sr_common_x86_pv.c
index d3d425cb82..3e6f130e56 100644
--- a/tools/libxc/xc_sr_common_x86_pv.c
+++ b/tools/libxc/xc_sr_common_x86_pv.c
@@ -182,6 +182,58 @@ int x86_pv_map_m2p(struct xc_sr_context *ctx)
     return rc;
 }
 
+int x86_pv_get_shinfo(struct xc_sr_context *ctx)
+{
+    unsigned int off = 0;
+    struct domain_save_descriptor *desc;
+    int rc;
+
+    rc = x86_get_context(ctx, DOMAIN_SAVE_MASK(SHARED_INFO));
+    if ( rc )
+        return rc;
+
+    do {
+        if ( ctx->x86.domain_context.len - off < sizeof(*desc) )
+        {
+            return -1;
+        }
+        desc = ctx->x86.domain_context.buffer + off;
+        off += sizeof(*desc);
+
+        switch (desc->typecode)
+        {
+        case DOMAIN_SAVE_CODE(SHARED_INFO):
+        {
+            DOMAIN_SAVE_TYPE(SHARED_INFO) *s;
+
+            if ( ctx->x86.domain_context.len - off < sizeof(*s) )
+                return -1;
+
+            s = ctx->x86.domain_context.buffer + off;
+            ctx->x86.pv.shinfo = (shared_info_any_t *)s->buffer;
+            /* fall through */
+        }
+        default:
+            off += (desc->length);
+            break;
+        }
+    } while ( desc->typecode != DOMAIN_SAVE_CODE(END) );
+
+    if ( !ctx->x86.pv.shinfo )
+        return -1;
+
+    return 0;
+}
+
+int x86_pv_set_shinfo(struct xc_sr_context *ctx)
+{
+    if ( !ctx->x86.pv.shinfo )
+        return -1;
+
+    return x86_set_context(ctx, DOMAIN_SAVE_MASK(SHARED_INFO));
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxc/xc_sr_common_x86_pv.h b/tools/libxc/xc_sr_common_x86_pv.h
index 2ed03309af..01442f48fb 100644
--- a/tools/libxc/xc_sr_common_x86_pv.h
+++ b/tools/libxc/xc_sr_common_x86_pv.h
@@ -97,6 +97,9 @@ int x86_pv_domain_info(struct xc_sr_context *ctx);
  */
 int x86_pv_map_m2p(struct xc_sr_context *ctx);
 
+int x86_pv_get_shinfo(struct xc_sr_context *ctx);
+int x86_pv_set_shinfo(struct xc_sr_context *ctx);
+
 #endif
 /*
  * Local variables:
diff --git a/tools/libxc/xc_sr_restore_x86_pv.c b/tools/libxc/xc_sr_restore_x86_pv.c
index 904ccc462a..4dbc7f0da5 100644
--- a/tools/libxc/xc_sr_restore_x86_pv.c
+++ b/tools/libxc/xc_sr_restore_x86_pv.c
@@ -864,8 +864,7 @@ static int handle_shared_info(struct xc_sr_context *ctx,
 {
     xc_interface *xch = ctx->xch;
     unsigned int i;
-    int rc = -1;
-    shared_info_any_t *guest_shinfo = NULL;
+    int rc;
     const shared_info_any_t *old_shinfo = rec->data;
 
     if ( !ctx->x86.pv.restore.seen_pv_info )
@@ -878,39 +877,30 @@ static int handle_shared_info(struct xc_sr_context *ctx,
     {
         ERROR("X86_PV_SHARED_INFO record wrong size: length %u"
               ", expected 4096", rec->length);
-        goto err;
+        return -1;
     }
 
-    guest_shinfo = xc_map_foreign_range(
-        xch, ctx->domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
-        ctx->dominfo.shared_info_frame);
-    if ( !guest_shinfo )
-    {
-        PERROR("Failed to map Shared Info at mfn %#lx",
-               ctx->dominfo.shared_info_frame);
-        goto err;
-    }
+    rc = x86_pv_get_shinfo(ctx);
+    if ( rc )
+        return rc;
 
-    MEMCPY_FIELD(guest_shinfo, old_shinfo, vcpu_info, ctx->x86.pv.width);
-    MEMCPY_FIELD(guest_shinfo, old_shinfo, arch, ctx->x86.pv.width);
+    MEMCPY_FIELD(ctx->x86.pv.shinfo, old_shinfo, vcpu_info,
+                 ctx->x86.pv.width);
+    MEMCPY_FIELD(ctx->x86.pv.shinfo, old_shinfo, arch, ctx->x86.pv.width);
 
-    SET_FIELD(guest_shinfo, arch.pfn_to_mfn_frame_list_list,
+    SET_FIELD(ctx->x86.pv.shinfo, arch.pfn_to_mfn_frame_list_list,
               0, ctx->x86.pv.width);
 
-    MEMSET_ARRAY_FIELD(guest_shinfo, evtchn_pending, 0, ctx->x86.pv.width);
+    MEMSET_ARRAY_FIELD(ctx->x86.pv.shinfo, evtchn_pending, 0,
+                       ctx->x86.pv.width);
     for ( i = 0; i < XEN_LEGACY_MAX_VCPUS; i++ )
-        SET_FIELD(guest_shinfo, vcpu_info[i].evtchn_pending_sel,
+        SET_FIELD(ctx->x86.pv.shinfo, vcpu_info[i].evtchn_pending_sel,
                   0, ctx->x86.pv.width);
 
-    MEMSET_ARRAY_FIELD(guest_shinfo, evtchn_mask, 0xff, ctx->x86.pv.width);
-
-    rc = 0;
+    MEMSET_ARRAY_FIELD(ctx->x86.pv.shinfo, evtchn_mask, 0xff,
+                       ctx->x86.pv.width);
 
- err:
-    if ( guest_shinfo )
-        munmap(guest_shinfo, PAGE_SIZE);
-
-    return rc;
+    return x86_pv_set_shinfo(ctx);
 }
 
 /* restore_ops function. */
diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
index f3ccf5bb4b..7c4fcffa92 100644
--- a/tools/libxc/xc_sr_save_x86_pv.c
+++ b/tools/libxc/xc_sr_save_x86_pv.c
@@ -9,25 +9,6 @@ static inline bool is_canonical_address(xen_vaddr_t vaddr)
     return ((int64_t)vaddr >> 47) == ((int64_t)vaddr >> 63);
 }
 
-/*
- * Maps the guests shared info page.
- */
-static int map_shinfo(struct xc_sr_context *ctx)
-{
-    xc_interface *xch = ctx->xch;
-
-    ctx->x86.pv.shinfo = xc_map_foreign_range(
-        xch, ctx->domid, PAGE_SIZE, PROT_READ, ctx->dominfo.shared_info_frame);
-    if ( !ctx->x86.pv.shinfo )
-    {
-        PERROR("Failed to map shared info frame at mfn %#lx",
-               ctx->dominfo.shared_info_frame);
-        return -1;
-    }
-
-    return 0;
-}
-
 /*
  * Copy a list of mfns from a guest, accounting for differences between guest
  * and toolstack width.  Can fail if truncation would occur.
@@ -1041,7 +1022,7 @@ static int x86_pv_setup(struct xc_sr_context *ctx)
     if ( rc )
         return rc;
 
-    rc = map_shinfo(ctx);
+    rc = x86_pv_get_shinfo(ctx);
     if ( rc )
         return rc;
 
@@ -1112,12 +1093,11 @@ static int x86_pv_cleanup(struct xc_sr_context *ctx)
     if ( ctx->x86.pv.p2m )
         munmap(ctx->x86.pv.p2m, ctx->x86.pv.p2m_frames * PAGE_SIZE);
 
-    if ( ctx->x86.pv.shinfo )
-        munmap(ctx->x86.pv.shinfo, PAGE_SIZE);
-
     if ( ctx->x86.pv.m2p )
         munmap(ctx->x86.pv.m2p, ctx->x86.pv.nr_m2p_frames * PAGE_SIZE);
 
+    x86_cleanup(ctx);
+
     return 0;
 }
 
diff --git a/tools/libxc/xg_save_restore.h b/tools/libxc/xg_save_restore.h
index 303081df0d..296b523963 100644
--- a/tools/libxc/xg_save_restore.h
+++ b/tools/libxc/xg_save_restore.h
@@ -19,6 +19,7 @@
 
 #include <xen/foreign/x86_32.h>
 #include <xen/foreign/x86_64.h>
+#include <xen/save.h>
 
 /*
 ** We process save/restore/migrate in batches of pages; the below
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context
  2020-03-27 18:50 ` [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context Paul Durrant
@ 2020-03-30 10:54   ` Jan Beulich
  2020-04-03 15:20     ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2020-03-30 10:54 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Ian Jackson, Wei Liu

On 27.03.2020 19:50, Paul Durrant wrote:
> This tool is analogous to 'xen-hvmctx', which presents HVM context.
> Subsequent patches will add 'dump' functions when new records are
> introduced.
> 
> Signed-off-by: Paul Durrant <paul@xen.org>
> ---
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Wei Liu <wl@xen.org>
> ---
>  .gitignore           |   1 +
>  tools/misc/Makefile  |   4 ++
>  tools/misc/xen-ctx.c | 144 +++++++++++++++++++++++++++++++++++++++++++

Is xen-ctx a good choice of a name, considering we already have not
only xen-hvmctx, but also xenctx? If the new functionality isn't a
good fit for either, perhaps its name would better reflect its
connection to save/restore records? xen-sr-dump looks pretty clumsy
to me, but still seems better than a name easily mixed up with
others.

Jan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
@ 2020-04-01 12:00   ` Julien Grall
  2020-04-01 12:07     ` Jan Beulich
  2020-04-03 15:55     ` Paul Durrant
  2020-04-01 14:50   ` Jan Beulich
  1 sibling, 2 replies; 32+ messages in thread
From: Julien Grall @ 2020-04-01 12:00 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich

Hi Paul,

On 27/03/2020 18:50, Paul Durrant wrote:
> Domain context is state held in the hypervisor that does not come under
> the category of 'HVM state' but is instead 'PV state' that is common
> between PV guests and enlightened HVM guests (i.e. those that have PV
> drivers) such as event channel state, grant entry state, etc.
> 
> To allow enlightened HVM guests to be migrated without their co-operation
> it will be necessary to transfer such state along with the domain's
> memory image, architectural state, etc. This framework is introduced for
> that purpose.
> 
> This patch adds the new public header and the low level implementation,
> entered via the domain_save() or domain_load() functions. Subsequent
> patches will introduce other parts of the framwork, and code that will

Typo: s/framwork/framework/

> make use of it within the current version of the libxc migration stream.
> 
> Signed-off-by: Paul Durrant <paul@xen.org>
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Wei Liu <wl@xen.org>
> ---
>   xen/common/Makefile       |   1 +
>   xen/common/save.c         | 262 ++++++++++++++++++++++++++++++++++++++
>   xen/include/public/save.h |  74 +++++++++++
>   xen/include/xen/save.h    | 115 +++++++++++++++++
>   4 files changed, 452 insertions(+)
>   create mode 100644 xen/common/save.c
>   create mode 100644 xen/include/public/save.h
>   create mode 100644 xen/include/xen/save.h
> 
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index e8cde65370..90553ba5d7 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -37,6 +37,7 @@ obj-y += radix-tree.o
>   obj-y += rbtree.o
>   obj-y += rcupdate.o
>   obj-y += rwlock.o
> +obj-y += save.o
>   obj-y += shutdown.o
>   obj-y += softirq.o
>   obj-y += sort.o
> diff --git a/xen/common/save.c b/xen/common/save.c
> new file mode 100644
> index 0000000000..bef99452d8
> --- /dev/null
> +++ b/xen/common/save.c
> @@ -0,0 +1,262 @@
> +/*
> + * save.c: Save and restore PV guest state common to all domain types.
> + *
> + * Copyright Amazon.com Inc. or its affiliates.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/save.h>
> +
> +struct domain_context {
> +    bool log;
> +    struct domain_save_descriptor desc;
> +    domain_copy_entry copy;

As your new framework is technically an extension of the existing one, it 
would be good to explain why we diverge in the definitions.

> +    void *priv;
> +};
> +
> +static struct {
> +    const char *name;
> +    bool per_vcpu;
> +    domain_save_handler save;
> +    domain_load_handler load;
> +} handlers[DOMAIN_SAVE_CODE_MAX + 1];
> +
> +void __init domain_register_save_type(unsigned int tc, const char *name,
> +                                      bool per_vcpu,
> +                                      domain_save_handler save,
> +                                      domain_load_handler load)
> +{
> +    BUG_ON(tc > ARRAY_SIZE(handlers));
> +
> +    ASSERT(!handlers[tc].save);
> +    ASSERT(!handlers[tc].load);
> +
> +    handlers[tc].name = name;
> +    handlers[tc].per_vcpu = per_vcpu;
> +    handlers[tc].save = save;
> +    handlers[tc].load = load;
> +}
> +
> +int domain_save_entry(struct domain_context *c, unsigned int tc,
> +                      const char *name, const struct vcpu *v, void *src,

I realize that 'src' is not const because of how you define copy; however, 
I would prefer to rework the interface so that the const can be kept in 
the definition. This may mean having to define two callbacks rather than 
one.

> +                      size_t src_len)

On a 64-bit architecture, size_t is 64-bit, but the record only stores a 
32-bit length. Would it make sense to add a sanity check in the code?

> +{
> +    int rc;
> +
> +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
> +         tc != DOMAIN_SAVE_CODE(END) )
> +        gdprintk(XENLOG_INFO, "%pv save: %s (%lu)\n", v, name, src_len);
> +
> +    if ( !IS_ALIGNED(src_len, 8) )

Why not add implicit padding? This would avoid chasing errors during 
saving because the length wasn't a multiple of 8.

> +        return -EINVAL;
> +
> +    BUG_ON(tc != c->desc.typecode);
> +    BUG_ON(v->vcpu_id != c->desc.instance);

Does the descriptor really need to be filled by domain_save()? Can't it 
be done here, so we can avoid the two BUG_ON()s?

> +    c->desc.length = src_len;
> +
> +    rc = c->copy(c->priv, &c->desc, sizeof(c->desc));
> +    if ( rc )
> +        return rc;
> +
> +    return c->copy(c->priv, src, src_len);
> +}
> +
> +int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
> +                unsigned long mask, bool dry_run)
> +{
> +    struct domain_context c = {
> +        .copy = copy,
> +        .priv = priv,
> +        .log = !dry_run,
> +    };
> +    struct domain_save_header h = {
> +        .magic = DOMAIN_SAVE_MAGIC,
> +        .version = DOMAIN_SAVE_VERSION,
> +    };
> +    struct domain_save_header e;
> +    unsigned int i;
> +    int rc;
> +
> +    ASSERT(d != current->domain);
> +
> +    if ( d->is_dying )
> +        return -EINVAL;
> +
> +    domain_pause(d);
> +
> +    c.desc.typecode = DOMAIN_SAVE_CODE(HEADER);
> +
> +    rc = DOMAIN_SAVE_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h));
> +    if ( rc )
> +        goto out;
> +
> +    for ( i = 0; i < ARRAY_SIZE(handlers); i++ )

AFAIU, with this solution, if there are dependencies between records, 
the dependencies will have to have a lower "index". I think we may 
have such a dependency with guest transparent migration, where we need 
to restore the event ABI before restoring the event channels.

Should we rely on the index for the dependency?

In any case, I think we want to document it.

> +    {
> +        domain_save_handler save = handlers[i].save;
> +
> +        if ( (mask && !test_bit(i, &mask)) || !save )

The type of mask suggests it is not possible to have more than 32 
different types of record if we want to be arch-agnostic. Even if we 
focus on 64-bit arches, this is only 64 records.

This is not very future proof, but might be OK if this is not exposed 
outside of the hypervisor (I haven't looked at the rest of the series 
yet). However, we at least need a BUILD_BUG_ON() in place. So please 
make sure DOMAIN_SAVE_CODE_MAX is always less than 64.

Also, what if there is a bit set in the mask and the record does not 
exist? Shouldn't we return an error?

> +            continue;
> +
> +        memset(&c.desc, 0, sizeof(c.desc));
> +        c.desc.typecode = i;
> +
> +        if ( handlers[i].per_vcpu )
> +        {
> +            struct vcpu *v;
> +
> +            for_each_vcpu ( d, v )
> +            {
> +                c.desc.instance = v->vcpu_id;
> +
> +                rc = save(v, &c, dry_run);
> +                if ( rc )
> +                    goto out;
> +            }
> +        }
> +        else
> +        {
> +            rc = save(d->vcpu[0], &c, dry_run);
> +            if ( rc )
> +                goto out;
> +        }
> +    }
> +
> +    memset(&c.desc, 0, sizeof(c.desc));
> +    c.desc.typecode = DOMAIN_SAVE_CODE(END);
> +
> +    rc = DOMAIN_SAVE_ENTRY(END, &c, d->vcpu[0], &e, 0);
> +
> + out:
> +    domain_unpause(d);
> +
> +    return rc;
> +}
> +
> +int domain_load_entry(struct domain_context *c, unsigned int tc,
> +                      const char *name, const struct vcpu *v, void *dst,
> +                      size_t dst_len, bool exact)
> +{
> +    int rc;
> +
> +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
> +         tc != DOMAIN_SAVE_CODE(END) )
> +        gdprintk(XENLOG_INFO, "%pv load: %s (%lu)\n", v, name, dst_len);
> +
> +    BUG_ON(tc != c->desc.typecode);
> +    BUG_ON(v->vcpu_id != c->desc.instance);

Is it really warranted to crash the host? What would happen if the 
values were wrong?

> +
> +    if ( (exact ?
> +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||

Using a ternary in an if is really confusing. How about:

dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||

I understand that there would be two checks for the exact case, but I 
think it is better than a ternary.

However, what is the purpose of the 'exact' param? If you set it to 
false, how do you know the size you read?

> +         !IS_ALIGNED(c->desc.length, 8) )
> +        return -EINVAL;
> +
> +    rc = c->copy(c->priv, dst, c->desc.length);
> +    if ( rc )
> +        return rc;
> +
> +    if ( !exact )
> +    {
> +        dst += c->desc.length;
> +        memset(dst, 0, dst_len - c->desc.length);
> +    }
> +
> +    return 0;
> +}
> +
> +int domain_load(struct domain *d,  domain_copy_entry copy, void *priv,
> +                unsigned long mask)
> +{
> +    struct domain_context c = {
> +        .copy = copy,
> +        .priv = priv,
> +        .log = true,
> +    };
> +    struct domain_save_header h, e;
> +    int rc;
> +
> +    ASSERT(d != current->domain);
> +
> +    if ( d->is_dying )
> +        return -EINVAL;

What protects you against the domain dying right after this check?

> +
> +    rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
> +    if ( rc )
> +        return rc;
> +
> +    if ( c.desc.typecode != DOMAIN_SAVE_CODE(HEADER) ||
> +         c.desc.instance != 0 )
> +        return -EINVAL;
> +
> +    rc = DOMAIN_LOAD_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h), true);
> +    if ( rc )
> +        return rc;
> +
> +    if ( h.magic != DOMAIN_SAVE_MAGIC || h.version != DOMAIN_SAVE_VERSION )
> +        return -EINVAL;
> +
> +    domain_pause(d);
> +
> +    for (;;)
> +    {
> +        unsigned int i;
> +        domain_load_handler load;
> +        struct vcpu *v;
> +
> +        rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
> +        if ( rc )
> +            break;
> +
> +        if ( c.desc.typecode == DOMAIN_SAVE_CODE(END) ) {
> +            rc = DOMAIN_LOAD_ENTRY(END, &c, d->vcpu[0], &e, 0, true);
> +            break;
> +        }
> +
> +        rc = -EINVAL;
> +        if ( c.desc.typecode >= ARRAY_SIZE(handlers) ||
> +             c.desc.instance >= d->max_vcpus )

Nothing in the documentation suggests that c.desc.instance should be 0 
when the record is for the domain.

> +            break;
> +
> +        i = c.desc.typecode;
> +        load = handlers[i].load;
> +        v = d->vcpu[c.desc.instance];
> +
> +        if ( mask && !test_bit(i, &mask) )
> +        {
> +            /* Sink the data */
> +            rc = c.copy(c.priv, NULL, c.desc.length);
> +            if ( rc )
> +                break;
> +
> +            continue;
> +        }
> +
> +        rc = load ? load(v, &c) : -EOPNOTSUPP;
> +        if ( rc )
> +            break;
> +    }
> +
> +    domain_unpause(d);
> +
> +    return rc;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/public/save.h b/xen/include/public/save.h
> new file mode 100644
> index 0000000000..84981cd0f6
> --- /dev/null
> +++ b/xen/include/public/save.h
> @@ -0,0 +1,74 @@
> +/*
> + * save.h
> + *
> + * Structure definitions for common PV/HVM domain state that is held by
> + * Xen and must be saved along with the domain's memory.
> + *
> + * Copyright Amazon.com Inc. or its affiliates.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef __XEN_PUBLIC_SAVE_H__
> +#define __XEN_PUBLIC_SAVE_H__
> +
> +#include "xen.h"
> +
> +/* Each entry is preceded by a descriptor */
> +struct domain_save_descriptor {
> +    uint16_t typecode;
> +    uint16_t instance;
> +    /*
> +     * Entry length not including this descriptor. Entries must be padded
> +     * to a multiple of 8 bytes to make sure descriptors remain correctly
> +     * aligned.
> +     */
> +    uint32_t length;
> +};
> +
> +/*
> + * Each entry has a type associated with it. DECLARE_DOMAIN_SAVE_TYPE
> + * binds these things together.
> + */
> +#define DECLARE_DOMAIN_SAVE_TYPE(_x, _code, _type) \
> +    struct __DOMAIN_SAVE_TYPE_##_x { _type t; char c[_code]; };
> +
> +#define DOMAIN_SAVE_TYPE(_x) \
> +    typeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->t)
> +#define DOMAIN_SAVE_CODE(_x) \
> +    (sizeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->c))
> +#define DOMAIN_SAVE_MASK(_x) (1u << DOMAIN_SAVE_CODE(_x))
> +
> +/* Terminating entry */
> +struct domain_save_end {};
> +DECLARE_DOMAIN_SAVE_TYPE(END, 0, struct domain_save_end);
> +
> +#define DOMAIN_SAVE_MAGIC   0x53415645
> +#define DOMAIN_SAVE_VERSION 0x00000001
> +
> +/* Initial entry */
> +struct domain_save_header {
> +    uint32_t magic;             /* Must be DOMAIN_SAVE_MAGIC */
> +    uint32_t version;           /* Save format version */

Would it make sense to have the version of Xen in the stream?

> +};
> +DECLARE_DOMAIN_SAVE_TYPE(HEADER, 1, struct domain_save_header);
> +
> +#define DOMAIN_SAVE_CODE_MAX 1
> +
> +#endif /* __XEN_PUBLIC_SAVE_H__ */
> diff --git a/xen/include/xen/save.h b/xen/include/xen/save.h
> new file mode 100644
> index 0000000000..d5846f9e68
> --- /dev/null
> +++ b/xen/include/xen/save.h
> @@ -0,0 +1,115 @@
> +/*
> + * save.h: support routines for save/restore
> + *
> + * Copyright Amazon.com Inc. or its affiliates.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __XEN_SAVE_H__
> +#define __XEN_SAVE_H__
> +
> +#include <xen/sched.h>
> +#include <xen/types.h>
> +#include <xen/init.h>
> +
> +#include <public/xen.h>
> +#include <public/save.h>
> +
> +struct domain_context;
> +
> +int domain_save_entry(struct domain_context *c, unsigned int tc,
> +                      const char *name, const struct vcpu *v, void *src,
> +                      size_t src_len);
> +
> +#define DOMAIN_SAVE_ENTRY(_x, _c, _v, _src, _len)                        \
> +        domain_save_entry((_c), DOMAIN_SAVE_CODE(_x), #_x, (_v), (_src), \
> +                          (_len));
> +
> +int domain_load_entry(struct domain_context *c, unsigned int tc,
> +                      const char *name, const struct vcpu *v, void *dest,
> +                      size_t dest_len, bool exact);
> +
> +#define DOMAIN_LOAD_ENTRY(_x, _c, _v, _src, _len, _exact)                \
> +        domain_load_entry((_c), DOMAIN_SAVE_CODE(_x), #_x, (_v), (_src), \
> +                          (_len), (_exact));
> +
> +/*
> + * The 'dry_run' flag indicates that the caller of domain_save() (see
> + * below) is not trying to actually acquire the data, only the size
> + * of the data. The save handler can therefore limit work to only that
> + * which is necessary to call DOMAIN_SAVE_ENTRY() with an accurate value
> + * for '_len'.
> + */
> +typedef int (*domain_save_handler)(const struct vcpu *v,
> +                                   struct domain_context *h,
> +                                   bool dry_run);
> +typedef int (*domain_load_handler)(struct vcpu *v,
> +                                   struct domain_context *h);
> +
> +void domain_register_save_type(unsigned int tc, const char *name,
> +                               bool per_vcpu,
> +                               domain_save_handler save,
> +                               domain_load_handler load);
> +
> +#define DOMAIN_REGISTER_SAVE_RESTORE(_x, _per_vcpu, _save, _load) \
> +static int __init __domain_register_##_x##_save_restore(void)     \
> +{                                                                 \
> +    domain_register_save_type(                                    \
> +        DOMAIN_SAVE_CODE(_x),                                     \
> +        #_x,                                                      \
> +        (_per_vcpu),                                              \
> +        &(_save),                                                 \
> +        &(_load));                                                \
> +                                                                  \
> +    return 0;                                                     \
> +}                                                                 \
> +__initcall(__domain_register_##_x##_save_restore);
> +
> +/* Copy callback function */
> +typedef int (*domain_copy_entry)(void *priv, void *data, size_t len);
> +
> +/*
> + * Entry points:
> + *
> + * int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
> + *                 unsigned long mask, bool dry_run);
> + * int domain_load(struct domain *d, domain_copy_entry copy, void *priv,
> + *                 unsigned long mask);
> + *
> + * copy:    This is a callback function provided by the caller that will be
> + *          used to write to (in the save case) or read from (in the load
> + *          case) the context buffer.
> + * priv:    This is a pointer that will be passed to the copy function to
> + *          allow it to identify the context buffer and the current state
> + *          of the save or load operation.
> + * mask:    This is a mask to determine which save record types should be
> + *          copied to or from the buffer.
> + *          If it is zero then all save record types will be copied.
> + *          If it is non-zero then only record types with codes
> + *          corresponding to set bits will be copied. I.e. to copy save
> + *          record 'type', set the bit in position DOMAIN_SAVE_CODE(type).
> + *          DOMAIN_SAVE_CODE(HEADER) and DOMAIN_SAVE_CODE(END) records must
> + *          always be present and thus will be copied regardless of whether
> + *          the bits in those positions are set or not.
> + * dry_run: See the comment concerning (*domain_save) above.
> + *
> + * NOTE: A convenience macro, DOMAIN_SAVE_MASK(type), is defined to support
> + *       setting bits in the mask field.
> + */
> +int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
> +                unsigned long mask, bool dry_run);
> +int domain_load(struct domain *d, domain_copy_entry copy, void *priv,
> +                unsigned long mask);
> +
> +#endif /* __XEN_SAVE_H__ */
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-01 12:00   ` Julien Grall
@ 2020-04-01 12:07     ` Jan Beulich
  2020-04-01 12:16       ` Julien Grall
  2020-04-03 15:55     ` Paul Durrant
  1 sibling, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2020-04-01 12:07 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Paul Durrant, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel

On 01.04.2020 14:00, Julien Grall wrote:
> On 27/03/2020 18:50, Paul Durrant wrote:
>> +    if ( (exact ?
>> +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
> 
> Using a ternary in an if is really confusing. How about:
> 
> dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||
> 
> I understand that there would be two checks for the exact case, but I think it is better than a ternary.

I'm of the opposite opinion, and hence with Paul. While the alternative
you suggest is still reasonable because of the special case here, I
find it confusing / more difficult to read / follow

    if ( (a && b) || (!a && c) )

(and I've seen quite a few instances of such over time) instead of

    if ( a ? b : c )

Jan



* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-01 12:07     ` Jan Beulich
@ 2020-04-01 12:16       ` Julien Grall
  2020-04-01 12:23         ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Julien Grall @ 2020-04-01 12:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Paul Durrant, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel

Hi Jan,

On 01/04/2020 13:07, Jan Beulich wrote:
> On 01.04.2020 14:00, Julien Grall wrote:
>> On 27/03/2020 18:50, Paul Durrant wrote:
>>> +    if ( (exact ?
>>> +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
>>
>> Using a ternary in an if is really confusing. How about:
>>
>> dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||
>>
>> I understand that there would be two checks for the exact case, but I think it is better than a ternary.
> 
> I'm of the opposite opinion, and hence with Paul. While the alternative
> you suggest is still reasonable because of the special case here, I
> find it confusing / more difficult to read / follow
> 
>      if ( (a && b) || (!a && c) )
> 
> (and I've seen quite a few instances of such over time) instead of
> 
>      if ( a ? b : c )

If the ternary was the only condition, and on a single line, then it 
would be okay. However, the if is split over 3 lines...

The more stuff you put in an if, the more chance you have of misreading 
it or making a mistake (you likely know what I am referring to here ;)).

So if you prefer the ternary, then we should at least write two ifs.

Cheers,

-- 
Julien Grall



* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-01 12:16       ` Julien Grall
@ 2020-04-01 12:23         ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2020-04-01 12:23 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Paul Durrant, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel

On 01.04.2020 14:16, Julien Grall wrote:
> Hi Jan,
> 
> On 01/04/2020 13:07, Jan Beulich wrote:
>> On 01.04.2020 14:00, Julien Grall wrote:
>>> On 27/03/2020 18:50, Paul Durrant wrote:
>>>> +    if ( (exact ?
>>>> +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
>>>
>>> Using a ternary in an if is really confusing. How about:
>>>
>>> dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||
>>>
>>> I understand that there would be two checks for the exact case, but I think it is better than a ternary.
>>
>> I'm of the opposite opinion, and hence with Paul. While the alternative
>> you suggest is still reasonable because of the special case here, I
>> find it confusing / more difficult to read / follow
>>
>>      if ( (a && b) || (!a && c) )
>>
>> (and I've seen quite a few instances of such over time) instead of
>>
>>      if ( a ? b : c )
> 
> If the ternary was the only condition, and on a single line, then it would be okay. However, the if is split over 3 lines...
> 
> The more stuff you put in an if, the more chance you have of misreading it or making a mistake (you likely know what I am referring to here ;)).
> 
> So if you prefer the ternary, then we should at least write two ifs.

Since it's ||, that would be fine (albeit not preferred) by me. If
it was &&, would you be suggesting two nested if()-s (which we
generally ask to be avoided in reviews)? I see nothing wrong
with e.g.

      if ( (a ? b : c) &&
           (x ? y : z) )

Nevertheless I agree that very large conditionals often are more
difficult to read.

Jan



* Re: [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
  2020-03-27 18:50 ` [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext Paul Durrant
@ 2020-04-01 13:42   ` Julien Grall
  2020-04-06  9:07     ` Paul Durrant
  2020-04-01 13:46   ` Julien Grall
  1 sibling, 1 reply; 32+ messages in thread
From: Julien Grall @ 2020-04-01 13:42 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Daniel De Graaf

Hi Paul,

On 27/03/2020 18:50, Paul Durrant wrote:
> These domctls provide a mechanism to get and set domain context from
> the toolstack.
> 
> Signed-off-by: Paul Durrant <paul@xen.org>
> ---
> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> ---
>   tools/flask/policy/modules/xen.if   |   4 +-
>   tools/libxc/include/xenctrl.h       |  11 +++
>   tools/libxc/xc_domain.c             |  52 +++++++++++++
>   xen/common/domctl.c                 | 115 ++++++++++++++++++++++++++++
>   xen/include/public/domctl.h         |  41 +++++++++-
>   xen/xsm/flask/hooks.c               |   6 ++
>   xen/xsm/flask/policy/access_vectors |   4 +
>   7 files changed, 230 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
> index 8eb2293a52..2bc9db4f64 100644
> --- a/tools/flask/policy/modules/xen.if
> +++ b/tools/flask/policy/modules/xen.if
> @@ -53,7 +53,7 @@ define(`create_domain_common', `
>   	allow $1 $2:domain2 { set_cpu_policy settsc setscheduler setclaim
>   			set_vnumainfo get_vnumainfo cacheflush
>   			psr_cmt_op psr_alloc soft_reset
> -			resource_map get_cpu_policy };
> +			resource_map get_cpu_policy setcontext };
>   	allow $1 $2:security check_context;
>   	allow $1 $2:shadow enable;
>   	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
> @@ -97,7 +97,7 @@ define(`migrate_domain_out', `
>   	allow $1 $2:hvm { gethvmc getparam };
>   	allow $1 $2:mmu { stat pageinfo map_read };
>   	allow $1 $2:domain { getaddrsize getvcpucontext pause destroy };
> -	allow $1 $2:domain2 gettsc;
> +	allow $1 $2:domain2 { gettsc getcontext };
>   	allow $1 $2:shadow { enable disable logdirty };
>   ')
>   
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index fc6e57a1a0..5c0d0d27a4 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -867,6 +867,17 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
>                                uint8_t *hvm_ctxt,
>                                uint32_t size);
>   
> +int xc_domain_getcontext(xc_interface *xch,
> +                         uint32_t domid,
> +                         uint32_t mask,
> +                         uint8_t *ctxt_buf,

Why do you use uint8_t rather than void here?

> +                         uint32_t size);
> +int xc_domain_setcontext(xc_interface *xch,
> +                         uint32_t domid,
> +                         uint32_t mask,
> +                         uint8_t *ctxt_buf,
> +                         uint32_t size);
> +
>   /**
>    * This function will return guest IO ABI protocol
>    *
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index 71829c2bce..15bcf671fc 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -537,6 +537,58 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
>       return ret;
>   }
>   
> +int xc_domain_getcontext(xc_interface *xch,
> +                         uint32_t domid,
> +                         uint32_t mask,
> +                         uint8_t *ctxt_buf,
> +                         uint32_t size)
> +{
> +    int ret;
> +    DECLARE_DOMCTL;
> +    DECLARE_HYPERCALL_BOUNCE(ctxt_buf, size, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
> +
> +    if ( xc_hypercall_bounce_pre(xch, ctxt_buf) )
> +        return -1;
> +
> +    domctl.cmd = XEN_DOMCTL_getdomaincontext;
> +    domctl.domain = domid;
> +    domctl.u.domaincontext.mask = mask;
> +    domctl.u.domaincontext.size = size;
> +    set_xen_guest_handle(domctl.u.domaincontext.buffer, ctxt_buf);
> +
> +    ret = do_domctl(xch, &domctl);
> +
> +    xc_hypercall_bounce_post(xch, ctxt_buf);
> +
> +    return !ret ? domctl.u.domaincontext.size : -1;

Looking at the domctl interface, size is a 32-bit unsigned value. 
However, the return here is a 32-bit signed value. That is asking for 
trouble if the size is too big.

> +}
> +
> +int xc_domain_setcontext(xc_interface *xch,
> +                         uint32_t domid,
> +                         uint32_t mask,
> +                         uint8_t *ctxt_buf,

The context is not meant to be modified. So this should really be const.

> +                         uint32_t size)
> +{
> +    int ret;
> +    DECLARE_DOMCTL;
> +    DECLARE_HYPERCALL_BOUNCE(ctxt_buf, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +
> +    if ( xc_hypercall_bounce_pre(xch, ctxt_buf) )
> +        return -1;
> +
> +    domctl.cmd = XEN_DOMCTL_setdomaincontext;
> +    domctl.domain = domid;
> +    domctl.u.domaincontext.mask = mask;
> +    domctl.u.domaincontext.size = size;
> +    set_xen_guest_handle(domctl.u.domaincontext.buffer, ctxt_buf);
> +
> +    ret = do_domctl(xch, &domctl);
> +
> +    xc_hypercall_bounce_post(xch, ctxt_buf);
> +
> +    return ret;
> +}
> +
>   int xc_vcpu_getcontext(xc_interface *xch,
>                          uint32_t domid,
>                          uint32_t vcpu,
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index a69b3b59a8..9c39519b04 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -25,6 +25,7 @@
>   #include <xen/hypercall.h>
>   #include <xen/vm_event.h>
>   #include <xen/monitor.h>
> +#include <xen/save.h>
>   #include <asm/current.h>
>   #include <asm/irq.h>
>   #include <asm/page.h>
> @@ -358,6 +359,111 @@ static struct vnuma_info *vnuma_init(const struct xen_domctl_vnuma *uinfo,
>       return ERR_PTR(ret);
>   }
>   
> +struct domctl_context
> +{
> +    void *buffer;
> +    size_t len;
> +    size_t cur;
> +};
> +
> +static int accumulate_size(void *priv, void *data, size_t len)
> +{
> +    struct domctl_context *c = priv;
> +
> +    c->len += len;

What prevents overflow?

> +
> +    return 0;
> +}
> +
> +static int save_data(void *priv, void *data, size_t len)
> +{
> +    struct domctl_context *c = priv;
> +
> +    if ( c->len - c->cur < len )
> +        return -ENOSPC;
> +
> +    memcpy(c->buffer + c->cur, data, len);
> +    c->cur += len;
> +
> +    return 0;
> +}
> +
> +static int getdomaincontext(struct domain *d,
> +                            struct xen_domctl_domaincontext *dc)
> +{
> +    struct domctl_context c = { };
> +    int rc;
> +
> +    if ( d == current->domain )
> +        return -EPERM;
> +
> +    if ( guest_handle_is_null(dc->buffer) ) /* query for buffer size */
> +
> +    {
> +        if ( dc->size )
> +            return -EINVAL;
> +
> +        /* dry run to acquire buffer size */
> +        rc = domain_save(d, accumulate_size, &c, dc->mask, true);
> +        if ( rc )
> +            return rc;
> +
> +        dc->size = c.len;

len is a size_t (so 64-bit on a 64-bit arch) value, but size is 32-bit. 
So do we return an error if the stream is going to be bigger than 4G?

> +        return 0;
> +    }
> +
> +    c.len = dc->size;
> +    c.buffer = xmalloc_bytes(c.len);
> +    if ( !c.buffer )
> +        return -ENOMEM;
> +
> +    rc = domain_save(d, save_data, &c, dc->mask, false);
> +
> +    dc->size = c.cur;
> +    if ( !rc && copy_to_guest(dc->buffer, c.buffer, dc->size) )
> +        rc = -EFAULT;
> +
> +    xfree(c.buffer);
> +
> +    return rc;
> +}
> +
> +static int load_data(void *priv, void *data, size_t len)
> +{
> +    struct domctl_context *c = priv;
> +
> +    if ( c->len - c->cur < len )
> +        return -ENODATA;
> +
> +    if ( data )
> +        memcpy(data, c->buffer + c->cur, len);
> +
> +    c->cur += len;
> +
> +    return 0;
> +}
> +
> +static int setdomaincontext(struct domain *d,
> +                            const struct xen_domctl_domaincontext *dc)
> +{
> +    struct domctl_context c = { .len = dc->size };
> +    int rc;
> +
> +    if ( d == current->domain )
> +        return -EPERM;
> +
> +    c.buffer = xmalloc_bytes(c.len);
> +    if ( !c.buffer )
> +        return -ENOMEM;
> +
> +    rc = !copy_from_guest(c.buffer, dc->buffer, c.len) ?
> +        domain_load(d, load_data, &c, dc->mask) : -EFAULT;
> +
> +    xfree(c.buffer);
> +
> +    return rc;
> +}
> +
>   long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>   {
>       long ret = 0;
> @@ -942,6 +1048,15 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>               copyback = 1;
>           break;
>   
> +    case XEN_DOMCTL_getdomaincontext:
> +        ret = getdomaincontext(d, &op->u.domaincontext);
> +        copyback = !ret;
> +        break;
> +
> +    case XEN_DOMCTL_setdomaincontext:
> +        ret = setdomaincontext(d, &op->u.domaincontext);
> +        break;
> +
>       default:
>           ret = arch_do_domctl(op, d, u_domctl);
>           break;
> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> index 1ad34c35eb..24ed6852cf 100644
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -38,7 +38,7 @@
>   #include "hvm/save.h"
>   #include "memory.h"
>   
> -#define XEN_DOMCTL_INTERFACE_VERSION 0x00000012
> +#define XEN_DOMCTL_INTERFACE_VERSION 0x00000013
>   
>   /*
>    * NB. xen_domctl.domain is an IN/OUT parameter for this operation.
> @@ -1129,6 +1129,42 @@ struct xen_domctl_vuart_op {
>                                    */
>   };
>   
> +/*
> + * Get/Set domain PV context. The same struct xen_domctl_domaincontext
> + * is used for both commands but with slightly different field semantics
> + * as follows:
> + *
> + * XEN_DOMCTL_getdomaincontext
> + * ---------------------------
> + *
> + * buffer (IN):   The buffer into which the context data should be
> + *                copied, or NULL to query the buffer size that should
> + *                be allocated.
> + * size (IN/OUT): If 'buffer' is NULL then the value passed in must be
> + *                zero, and the value passed out will be the size of the
> + *                buffer to allocate.
> + *                If 'buffer' is non-NULL then the value passed in must
> + *                be the size of the buffer into which data may be copied.
> + * mask (IN):     See comment on domain_save/load() in xen/save.h.
> + *
> + * XEN_DOMCTL_setdomaincontext
> + * ---------------------------
> + *
> + * buffer (IN):   The buffer from which the context data should be
> + *                copied.
> + * size (IN):     The size of the buffer from which data may be copied.
> + *                This data must include DOMAIN_SAVE_CODE_HEADER at the
> + *                start and terminate with a DOMAIN_SAVE_CODE_END record.
> + *                Any data beyond the DOMAIN_SAVE_CODE_END record will be
> + *                ignored.
> + * mask (IN):     See comment on domain_save/load() in xen/save.h.
> + */
> +struct xen_domctl_domaincontext {
> +    uint32_t size;
> +    uint32_t mask;
> +    XEN_GUEST_HANDLE_64(uint8) buffer;

Shouldn't it be a void handle?

Also, in the case of XEN_DOMCTL_setdomaincontext, the buffer is not 
meant to be modified by the hypervisor, so I would rather introduce two 
separate structures so that we can enforce the const.

> +};
> +
>   struct xen_domctl {
>       uint32_t cmd;
>   #define XEN_DOMCTL_createdomain                   1
> @@ -1210,6 +1246,8 @@ struct xen_domctl {
>   #define XEN_DOMCTL_vuart_op                      81
>   #define XEN_DOMCTL_get_cpu_policy                82
>   #define XEN_DOMCTL_set_cpu_policy                83
> +#define XEN_DOMCTL_getdomaincontext              84
> +#define XEN_DOMCTL_setdomaincontext              85
>   #define XEN_DOMCTL_gdbsx_guestmemio            1000
>   #define XEN_DOMCTL_gdbsx_pausevcpu             1001
>   #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
> @@ -1270,6 +1308,7 @@ struct xen_domctl {
>           struct xen_domctl_monitor_op        monitor_op;
>           struct xen_domctl_psr_alloc         psr_alloc;
>           struct xen_domctl_vuart_op          vuart_op;
> +        struct xen_domctl_domaincontext     domaincontext;
>           uint8_t                             pad[128];
>       } u;
>   };
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 8af8602b46..d94d0fc125 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -744,6 +744,12 @@ static int flask_domctl(struct domain *d, int cmd)
>       case XEN_DOMCTL_get_cpu_policy:
>           return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__GET_CPU_POLICY);
>   
> +    case XEN_DOMCTL_setdomaincontext:
> +        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__SETCONTEXT);
> +
> +    case XEN_DOMCTL_getdomaincontext:
> +        return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__GETCONTEXT);
> +
>       default:
>           return avc_unknown_permission("domctl", cmd);
>       }
> diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
> index c055c14c26..fccfb9de82 100644
> --- a/xen/xsm/flask/policy/access_vectors
> +++ b/xen/xsm/flask/policy/access_vectors
> @@ -245,6 +245,10 @@ class domain2
>       resource_map
>   # XEN_DOMCTL_get_cpu_policy
>       get_cpu_policy
> +# XEN_DOMCTL_setdomaincontext
> +    setcontext
> +# XEN_DOMCTL_getdomaincontext
> +    getcontext
>   }
>   
>   # Similar to class domain, but primarily contains domctls related to HVM domains
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
  2020-03-27 18:50 ` [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext Paul Durrant
  2020-04-01 13:42   ` Julien Grall
@ 2020-04-01 13:46   ` Julien Grall
  1 sibling, 0 replies; 32+ messages in thread
From: Julien Grall @ 2020-04-01 13:46 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich, Daniel De Graaf



On 27/03/2020 18:50, Paul Durrant wrote:
> + * XEN_DOMCTL_getdomaincontext
> + * ---------------------------
> + *
> + * buffer (IN):   The buffer into which the context data should be
> + *                copied, or NULL to query the buffer size that should
> + *                be allocated.
> + * size (IN/OUT): If 'buffer' is NULL then the value passed in must be
> + *                zero, and the value passed out will be the size of the
> + *                buffer to allocate.
> + *                If 'buffer' is non-NULL then the value passed in must
> + *                be the size of the buffer into which data may be copied.
> + * mask (IN):     See comment on domain_save/load() in xen/save.h.
> + *
> + * XEN_DOMCTL_setdomaincontext
> + * ---------------------------
> + *
> + * buffer (IN):   The buffer from which the context data should be
> + *                copied.
> + * size (IN):     The size of the buffer from which data may be copied.
> + *                This data must include DOMAIN_SAVE_CODE_HEADER at the
> + *                start and terminate with a DOMAIN_SAVE_CODE_END record.
> + *                Any data beyond the DOMAIN_SAVE_CODE_END record will be
> + *                ignored.
> + * mask (IN):     See comment on domain_save/load() in xen/save.h.
> + */
> +struct xen_domctl_domaincontext {
> +    uint32_t size;
> +    uint32_t mask;

I actually forgot to comment on the mask. We are restricting ourselves 
to 32 different types of record. I don't think this is going to fly very far.

But what is the expected use of 'mask'. Are we expecting the toolstack 
to say which records should be saved/restored?
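To make the ceiling concrete, here is an illustrative bit test (the mask==0 "all records" convention is an assumption about the save.h semantics, not taken from the patch): a uint32_t mask can only ever select typecodes below 32.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative only: one bit per typecode in a uint32_t mask caps the
 * record-type namespace at 32. */
static bool record_selected(uint32_t mask, unsigned int typecode)
{
    if ( mask == 0 )        /* assumed: 0 means "save everything" */
        return true;
    return typecode < 32 && (mask & (UINT32_C(1) << typecode));
}
```
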

Cheers,

-- 
Julien Grall



* Re: [PATCH 4/5] common/domain: add a domain context record for shared_info...
  2020-03-27 18:50 ` [Xen-devel] [PATCH 4/5] common/domain: add a domain context record for shared_info Paul Durrant
@ 2020-04-01 14:27   ` Julien Grall
  0 siblings, 0 replies; 32+ messages in thread
From: Julien Grall @ 2020-04-01 14:27 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Ian Jackson,
	George Dunlap, Jan Beulich

Hi Paul,

On 27/03/2020 18:50, Paul Durrant wrote:
> ... and update xen-ctx to dump some information describing the record.

The commit message should contain a little more context on why we are 
saving/restoring the shared_info.

> 
> NOTE: To allow a sensible definition of the record in public/save.h
>        this patch also adds a definition of the Xen ABI's de-facto page
>        size into public/xen.h.
> 
> Signed-off-by: Paul Durrant <paul@xen.org>
> ---
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> ---
>   tools/misc/xen-ctx.c      |  8 ++++++
>   xen/common/domain.c       | 55 +++++++++++++++++++++++++++++++++++++++
>   xen/include/public/save.h | 10 ++++++-
>   xen/include/public/xen.h  |  3 +++
>   4 files changed, 75 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/misc/xen-ctx.c b/tools/misc/xen-ctx.c
> index c31dd5d8e9..8f9692843b 100644
> --- a/tools/misc/xen-ctx.c
> +++ b/tools/misc/xen-ctx.c
> @@ -57,6 +57,13 @@ static void dump_header(void)
>              h.magic, h.version);
>   }
>   
> +static void dump_shared_info(void)
> +{
> +    DOMAIN_SAVE_TYPE(SHARED_INFO) s;
> +    READ(s);
> +    printf("    SHARED_INFO: field_width %u\n", s.field_width);
> +}
> +
>   static void dump_end(void)
>   {
>       DOMAIN_SAVE_TYPE(END) e;
> @@ -124,6 +131,7 @@ int main(int argc, char **argv)
>           switch (desc.typecode)
>           {
>           case DOMAIN_SAVE_CODE(HEADER): dump_header(); break;
> +        case DOMAIN_SAVE_CODE(SHARED_INFO): dump_shared_info(); break;
>           case DOMAIN_SAVE_CODE(END): dump_end(); return 0;
>           default:
>               printf("Unknown type %u: skipping\n", desc.typecode);
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 3dcd73f67c..484f6bde13 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -33,6 +33,7 @@
>   #include <xen/xenoprof.h>
>   #include <xen/irq.h>
>   #include <xen/argo.h>
> +#include <xen/save.h>
>   #include <asm/debugger.h>
>   #include <asm/p2m.h>
>   #include <asm/processor.h>
> @@ -1646,6 +1647,60 @@ int continue_hypercall_on_cpu(
>       return 0;
>   }
>   
> +static int save_shared_info(const struct vcpu *v, struct domain_context *c,
> +                            bool dry_run)
> +{
> +    struct domain *d = v->domain;
> +    struct domain_shared_info_context ctxt = {};

IMHO, storing 4K worth of data on the stack is a pretty bad idea. 
Looking at the code...

> +
> +    if ( !dry_run )
> +    {
> +        memcpy(ctxt.buffer, d->shared_info, PAGE_SIZE);

... it feels to me the content of the shared_info should be directly 
written in the stream.
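A sketch of that alternative, with a toy in-memory stream standing in for the framework's stream abstraction (`save_shared_info_direct` and its helpers are hypothetical, not the actual API): the page payload goes straight from its source into the stream, and only a small trailer is built on the stack.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Toy in-memory stream standing in for the domain context stream. */
struct stream {
    uint8_t buf[PAGE_SIZE + 64];
    size_t cur;
};

static void stream_write(struct stream *s, const void *src, size_t len)
{
    memcpy(s->buf + s->cur, src, len);
    s->cur += len;
}

/* Mirrors the non-buffer part of domain_shared_info_context. */
struct shared_info_trailer {
    uint8_t field_width;
    uint8_t pad[7];
};

static void save_shared_info_direct(struct stream *s,
                                    const uint8_t *shared_info,
                                    int has_32bit_shinfo)
{
    struct shared_info_trailer t = {
        .field_width = has_32bit_shinfo ? 4 : 8,
    };

    /* Write the page straight from its source: no 4K staging copy. */
    stream_write(s, shared_info, PAGE_SIZE);
    stream_write(s, &t, sizeof(t));
}
```
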

> +        ctxt.field_width = has_32bit_shinfo(d) ? 4 : 8;
> +    }
> +
> +    return DOMAIN_SAVE_ENTRY(SHARED_INFO, c, v, &ctxt, sizeof(ctxt));
> +}
> +
> +static int load_shared_info(struct vcpu *v, struct domain_context *c)
> +{
> +    struct domain *d = v->domain;
> +    struct domain_shared_info_context ctxt;
> +    unsigned int i;
> +    int rc;
> +
> +    rc = DOMAIN_LOAD_ENTRY(SHARED_INFO, c, v, &ctxt, sizeof(ctxt), true);
> +    if ( rc )
> +        return rc;
> +
> +    for ( i = 0; i < ARRAY_SIZE(ctxt.pad); i++ )
> +        if ( ctxt.pad[i] )
> +            return -EINVAL;
> +
> +    switch ( ctxt.field_width )
> +    {
> +    case 4:
> +        d->arch.has_32bit_shinfo = 1;

I don't think this will compile on Arm.

> +        break;
> +
> +    case 8:
> +        d->arch.has_32bit_shinfo = 0;
> +        break;
> +
> +    default:
> +        rc = -EINVAL;
> +        break;
> +    }
> +
> +    if ( !rc )
> +        memcpy(d->shared_info, ctxt.buffer, PAGE_SIZE);
> +
> +    return rc;
> +}
> +
> +DOMAIN_REGISTER_SAVE_RESTORE(SHARED_INFO, false, save_shared_info,
> +                             load_shared_info);
> +
>   /*
>    * Local variables:
>    * mode: C
> diff --git a/xen/include/public/save.h b/xen/include/public/save.h
> index 84981cd0f6..ff804a7dbf 100644
> --- a/xen/include/public/save.h
> +++ b/xen/include/public/save.h
> @@ -69,6 +69,14 @@ struct domain_save_header {
>   };
>   DECLARE_DOMAIN_SAVE_TYPE(HEADER, 1, struct domain_save_header);
>   
> -#define DOMAIN_SAVE_CODE_MAX 1
> +struct domain_shared_info_context {
> +    uint8_t buffer[XEN_PAGE_SIZE];

In the load/save code you are using PAGE_SIZE, but here you are using 
XEN_PAGE_SIZE. I don't think there is any promise that they will be the 
same value. In particular...

> +    uint8_t field_width;
> +    uint8_t pad[7];
> +};
> +
> +DECLARE_DOMAIN_SAVE_TYPE(SHARED_INFO, 2, struct domain_shared_info_context);
> +
> +#define DOMAIN_SAVE_CODE_MAX 2
>   
>   #endif /* __XEN_PUBLIC_SAVE_H__ */
> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
> index 75b1619d0d..cbf603f289 100644
> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -37,6 +37,9 @@
>   #error "Unsupported architecture"
>   #endif
>   
> +/* The Xen ABI assumes a page size of 4k. */
> +#define XEN_PAGE_SIZE 4096

... this is not correct. The Xen ABI is based on the page granularity 
used by Xen. It happens to be 4KB today, but only because no-one has yet 
built a 64KB/16KB hypervisor.

If someone tomorrow decides to add support for 64KB, then using 4KB in 
the ABI by default is not going to fly.

Imagine the guest only gives access to a 4KB region; how do you handle 
that if your minimum granularity in the hypervisor is 64KB?

In theory, it would not require much effort to get Xen on Arm built with 
64KB/16KB support, as long as the guest is using a page granularity 
higher than what Xen is using.

I would much prefer it if we encoded the page size in the stream. We can 
then use it to retrieve data based on the page size.

Regarding the shared_info itself: why do you need to know the size in 
the tools? Wouldn't it be enough to save the real size in 
domain_shared_info_context?
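One hypothetical shape for that (illustrative only, not a proposed ABI): carry the payload length in the record itself, so nothing in the stream format hard-codes a 4K page.

```c
#include <stdint.h>

/* Illustrative revision: the record states the page size used by the
 * saving hypervisor, and the payload follows as a variable-length tail. */
struct domain_shared_info_context {
    uint32_t size;         /* bytes of shared_info payload that follow */
    uint8_t  field_width;  /* 4 for 32-bit shinfo, 8 for 64-bit */
    uint8_t  pad[3];
    /* 'size' bytes of shared_info data follow this header */
};
```
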

Cheers,

-- 
Julien Grall



* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
  2020-04-01 12:00   ` Julien Grall
@ 2020-04-01 14:50   ` Jan Beulich
  2020-04-02  9:58     ` Paul Durrant
  1 sibling, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2020-04-01 14:50 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Julien Grall, Wei Liu, Andrew Cooper,
	Ian Jackson, George Dunlap, xen-devel

On 27.03.2020 19:50, Paul Durrant wrote:
> Domain context is state held in the hypervisor that does not come under
> the category of 'HVM state' but is instead 'PV state' that is common
> between PV guests and enlightened HVM guests (i.e. those that have PV
> drivers) such as event channel state, grant entry state, etc.

Without looking at the patch details yet, I'm having some difficulty
understanding how this is going to work in a safe/correct manner. I
suppose for LU the system is in a frozen enough state that
snapshotting and copying state like this is okay, but ...

> To allow enlightened HVM guests to be migrated without their co-operation
> it will be necessary to transfer such state along with the domain's
> memory image, architectural state, etc. This framework is introduced for
> that purpose.
> 
> This patch adds the new public header and the low level implementation,
> entered via the domain_save() or domain_load() functions. Subsequent
> patches will introduce other parts of the framwork, and code that will
> make use of it within the current version of the libxc migration stream.

... here you suggest (and patch 5 appears to match this) that this
is going to be used even in "normal" migration streams. All of the
items named are communication vehicles, and hence there are always
two sides that can influence the state. For event channels, the
other side (which isn't paused) or the hardware (for passed through
devices) might signal them, or it (just the former obviously) could
close their end, resulting in a state change also for the domain
being migrated. If this happens after the snapshot was taken, the
state change is lost.

Otoh I'm sure the case was considered, so perhaps I'm simply missing
some crucial aspect (which then could do with spelling out in the
description of the cover letter).

Jan



* RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-01 14:50   ` Jan Beulich
@ 2020-04-02  9:58     ` Paul Durrant
  2020-04-02 11:08       ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-02  9:58 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap',
	xen-devel

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 01 April 2020 15:51
> To: Paul Durrant <paul@xen.org>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
> <george.dunlap@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Julien Grall <julien@xen.org>;
> Stefano Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
> 
> On 27.03.2020 19:50, Paul Durrant wrote:
> > Domain context is state held in the hypervisor that does not come under
> > the category of 'HVM state' but is instead 'PV state' that is common
> > between PV guests and enlightened HVM guests (i.e. those that have PV
> > drivers) such as event channel state, grant entry state, etc.
> 
> Without looking at the patch details yet, I'm having some difficulty
> understanding how this is going to work in a safe/correct manner. I
> suppose for LU the system is in a frozen enough state that
> snapshotting and copying state like this is okay, but ...
> 
> > To allow enlightened HVM guests to be migrated without their co-operation
> > it will be necessary to transfer such state along with the domain's
> > memory image, architectural state, etc. This framework is introduced for
> > that purpose.
> >
> > This patch adds the new public header and the low level implementation,
> > entered via the domain_save() or domain_load() functions. Subsequent
> > patches will introduce other parts of the framework, and code that will
> > make use of it within the current version of the libxc migration stream.
> 
> ... here you suggest (and patch 5 appears to match this) that this
> is going to be used even in "normal" migration streams.

Well, 'transparent' (or non-cooperative) migration will only work in some cases but it definitely does work.

> All of the
> items named are communication vehicles, and hence there are always
> two sides that can influence the state. For event channels, the
> other side (which isn't paused) or the hardware (for passed through
> devices) might signal them, or it (just the former obviously) could
> close their end, resulting in a state change also for the domain
> being migrated. If this happens after the snapshot was taken, the
> state change is lost.

Indeed, which is why we *do* rely on co-operation from the other end of the event channels in the migration case. In the initial case it is likely we'll veto transparent migration unless all event channels are connected to either dom0 or Xen.
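The veto described above might look something like the following sketch (toy event-channel representation; the type and field names are invented for illustration):

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the interesting bits of an event channel. */
struct toy_evtchn {
    bool open;
    bool xen_bound;            /* bound to Xen itself (e.g. a VIRQ) */
    unsigned int remote_domid; /* peer domain if interdomain */
};

/* Transparent migration is permitted only if every open channel is
 * connected either to Xen or to dom0. */
static bool can_migrate_transparently(const struct toy_evtchn *c, size_t n)
{
    for ( size_t i = 0; i < n; i++ )
    {
        if ( !c[i].open || c[i].xen_bound )
            continue;
        if ( c[i].remote_domid != 0 ) /* 0 == dom0 */
            return false;
    }
    return true;
}
```
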

> 
> Otoh I'm sure the case was considered, so perhaps I'm simply missing
> some crucial aspect (which then could do with spelling out in the
> description of the cover letter).
> 

Does that need to be explained for a series that is just infrastructure? The overall design doc is now committed in the repo (although it may need some expansion in future), so I could point at that.
I don't think I'm giving anything away when I say that EC2's downstream code simply (ab)uses HVM save records for transferring the extra state; all I'm trying to do here is create something cleaner onto which I can re-base and upstream the EC2 code.

  Paul





* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-02  9:58     ` Paul Durrant
@ 2020-04-02 11:08       ` Jan Beulich
  2020-04-02 14:00         ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2020-04-02 11:08 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap',
	xen-devel

On 02.04.2020 11:58, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 01 April 2020 15:51
>> To: Paul Durrant <paul@xen.org>
>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
>> <george.dunlap@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Julien Grall <julien@xen.org>;
>> Stefano Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>
>> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
>>
>> On 27.03.2020 19:50, Paul Durrant wrote:
>>> Domain context is state held in the hypervisor that does not come under
>>> the category of 'HVM state' but is instead 'PV state' that is common
>>> between PV guests and enlightened HVM guests (i.e. those that have PV
>>> drivers) such as event channel state, grant entry state, etc.
>>
>> Without looking at the patch details yet, I'm having some difficulty
>> understanding how this is going to work in a safe/correct manner. I
>> suppose for LU the system is in a frozen enough state that
>> snapshotting and copying state like this is okay, but ...
>>
>>> To allow enlightened HVM guests to be migrated without their co-operation
>>> it will be necessary to transfer such state along with the domain's
>>> memory image, architectural state, etc. This framework is introduced for
>>> that purpose.
>>>
>>> This patch adds the new public header and the low level implementation,
>>> entered via the domain_save() or domain_load() functions. Subsequent
> >> patches will introduce other parts of the framework, and code that will
>>> make use of it within the current version of the libxc migration stream.
>>
>> ... here you suggest (and patch 5 appears to match this) that this
>> is going to be used even in "normal" migration streams.
> 
> Well, 'transparent' (or non-cooperative) migration will only work in some cases but it definitely does work.
> 
>> All of the
>> items named are communication vehicles, and hence there are always
>> two sides that can influence the state. For event channels, the
>> other side (which isn't paused) or the hardware (for passed through
>> devices) might signal them, or it (just the former obviously) could
>> close their end, resulting in a state change also for the domain
>> being migrated. If this happens after the snapshot was taken, the
>> state change is lost.
> 
> Indeed, which is why we *do* rely on co-operation from the other end
> of the event channels in the migration case. In the initial case it
> is likely we'll veto transparent migration unless all event channels
> are connected to either dom0 or Xen.

Co-operation for "normal" migration, iirc, consists of tearing down
and re-establishing everything. There's simply no risk of losing e.g.
events this way.

>> Otoh I'm sure the case was considered, so perhaps I'm simply missing
>> some crucial aspect (which then could do with spelling out in the
>> description of the cover letter).
>>
> 
> Does that need to be explained for a series that is just
> infrastructure?

I think so, yes - this infrastructure is pointless to introduce if
it doesn't allow fulfilling all requirements. Pointing at the design
doc (in the cover letter) may be enough if all aspects are covered
by what's there. I wouldn't have assumed that the use of this
infrastructure for co-operative migration would also be mentioned there.

Considering the situation with event channels (all closed), doing
what you do for the shared info page is probably going to be fine;
large parts of it are in a known state (or need re-filling on the
destination) anyway. What other plans do you have for non-LU
migration wrt this new infrastructure? If the shared info page is
all that's going to get migrated with its help, I'd wonder whether
the infrastructure wasn't better conditional upon a LU config
option, and the shared info migration was left as it is now.

> The overall design doc is now committed in the repo (although may
> need some expansion in future) so I could point at that.
> I don't think I'm giving anything away when I say that EC2's
> downstream code simply (ab)uses HVM save records for transferring
> the extra state; all I'm trying to do here is create something
> cleaner onto which I can re-base and upstream the EC2 code.

That was my guess, indeed.

Jan



* RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-02 11:08       ` Jan Beulich
@ 2020-04-02 14:00         ` Paul Durrant
  2020-04-03  8:38           ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-02 14:00 UTC (permalink / raw)
  To: 'Jan Beulich', 'Julien Grall'
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap',
	xen-devel

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 02 April 2020 12:08
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap'
> <george.dunlap@citrix.com>; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Julien Grall'
> <julien@xen.org>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
> 
> On 02.04.2020 11:58, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 01 April 2020 15:51
> >> To: Paul Durrant <paul@xen.org>
> >> Cc: xen-devel@lists.xenproject.org; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap
> >> <george.dunlap@citrix.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Julien Grall <julien@xen.org>;
> >> Stefano Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>
> >> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
> >>
> >> On 27.03.2020 19:50, Paul Durrant wrote:
> >>> Domain context is state held in the hypervisor that does not come under
> >>> the category of 'HVM state' but is instead 'PV state' that is common
> >>> between PV guests and enlightened HVM guests (i.e. those that have PV
> >>> drivers) such as event channel state, grant entry state, etc.
> >>
> >> Without looking at the patch details yet, I'm having some difficulty
> >> understanding how this is going to work in a safe/correct manner. I
> >> suppose for LU the system is in a frozen enough state that
> >> snapshotting and copying state like this is okay, but ...
> >>
> >>> To allow enlightened HVM guests to be migrated without their co-operation
> >>> it will be necessary to transfer such state along with the domain's
> >>> memory image, architectural state, etc. This framework is introduced for
> >>> that purpose.
> >>>
> >>> This patch adds the new public header and the low level implementation,
> >>> entered via the domain_save() or domain_load() functions. Subsequent
> >>> patches will introduce other parts of the framework, and code that will
> >>> make use of it within the current version of the libxc migration stream.
> >>
> >> ... here you suggest (and patch 5 appears to match this) that this
> >> is going to be used even in "normal" migration streams.
> >
> > Well, 'transparent' (or non-cooperative) migration will only work in some cases but it definitely
> does work.
> >
> >> All of the
> >> items named are communication vehicles, and hence there are always
> >> two sides that can influence the state. For event channels, the
> >> other side (which isn't paused) or the hardware (for passed through
> >> devices) might signal them, or it (just the former obviously) could
> >> close their end, resulting in a state change also for the domain
> >> being migrated. If this happens after the snapshot was taken, the
> >> state change is lost.
> >
> > Indeed, which is why we *do* rely on co-operation from the other end
> > of the event channels in the migration case. In the initial case it
> > is likely we'll veto transparent migration unless all event channels
> > are connected to either dom0 or Xen.
> 
> Co-operation for "normal" migration, iirc, consists of tearing down
> and re-establishing everything. There's simply no risk of losing e.g.
> events this way.

No, indeed, everything basically has to be re-established from scratch for normal migration. Transparent migration, as it is implemented at the moment, does rely on injecting potentially spurious events on resume. I think the alternative would be to have the PV backends send an event when they re-connect to a shared ring rather than having Xen do it.

> 
> >> Otoh I'm sure the case was considered, so perhaps I'm simply missing
> >> some crucial aspect (which then could do with spelling out in the
> >> description of the cover letter).
> >>
> >
> > Does that need to be explained for a series that is just
> > infrastructure?
> 
> I think so, yes - this infrastructure is pointless to introduce if
> it doesn't allow fulfilling all requirements. Pointing at the design
> doc (in the cover letter) may be enough if all aspects are covered
> by what's there. I wouldn't have assumed using this infrastructure
> also for co-operative migration to also be mentioned there.

No, I can mention the plan to use it to replace existing libxc migration records in the cover letter even though it is stated in the comment for patch #5.

> 
> Considering the situation with event channels (all closed), doing
> what you do for the shared info page is probably going to be fine;
> large parts of it are in a known state (or need re-filling on the
> destination) anyway. What other plans do you have for non-LU
> migration wrt this new infrastructure?

Well, as discussed above, event channels are one; then there's the grant table. The downstream code has a record for the wallclock, but I think the RTC covers that now that we've added the code to preserve the offset. We also have records for vcpu timers and runstate, but I'm not convinced we actually need those.

> If the shared info page is
> all that's going to get migrated with its help, I'd wonder whether
> the infrastructure wasn't better conditional upon a LU config
> option, and the shared info migration was left as it is now.
> 

The shared info is also migrated for HVM guests, so it would have meant moving the mapping code around anyway; thus it seems sensible to use the new domain context for that for PV guests in their normal migration stream too.

  Paul




* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-02 14:00         ` Paul Durrant
@ 2020-04-03  8:38           ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2020-04-03  8:38 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap',
	xen-devel

On 02.04.2020 16:00, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 02 April 2020 12:08
>>
>> Considering the situation with event channels (all closed), doing
>> what you do for the shared info page is probably going to be fine;
>> large parts of it are in a known state (or need re-filling on the
>> destination) anyway. What other plans do you have for non-LU
>> migration wrt this new infrastructure?
> 
> Well, as discussed above, event channels are one; then there's the grant table. The downstream code has a record for the wallclock, but I think the RTC covers that now that we've added the code to preserve the offset. We also have records for vcpu timers and runstate, but I'm not convinced we actually need those.

Timers may need preserving, but runstate may be avoidable. Depends
on whether the accumulation in time[4] is fine to start from zero
again after (transparent) migration.

>> If the shared info page is
>> all that's going to get migrated with its help, I'd wonder whether
>> the infrastructure wasn't better conditional upon a LU config
>> option, and the shared info migration was left as it is now.
> 
> The shared info is also migrated for HVM guests so it would have meant moving the mapping code around anyway, thus it seems sensible to use the new domain context for that for PV guests in their normal migration stream anyway.

Hmm, okay - I'll see to get to look at the actual code then.

Jan



* RE: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
  2020-03-30 10:54   ` Jan Beulich
@ 2020-04-03 15:20     ` Paul Durrant
  2020-04-03 15:29       ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-03 15:20 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, 'Ian Jackson', 'Wei Liu'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 30 March 2020 11:54
> To: Paul Durrant <paul@xen.org>
> Cc: xen-devel@lists.xenproject.org; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
> 
> On 27.03.2020 19:50, Paul Durrant wrote:
> > This tool is analogous to 'xen-hvmctx', which presents HVM context.
> > Subsequent patches will add 'dump' functions when new records are
> > introduced.
> >
> > Signed-off-by: Paul Durrant <paul@xen.org>
> > ---
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Wei Liu <wl@xen.org>
> > ---
> >  .gitignore           |   1 +
> >  tools/misc/Makefile  |   4 ++
> >  tools/misc/xen-ctx.c | 144 +++++++++++++++++++++++++++++++++++++++++++
> 
> Is xen-ctx a good choice of a name, considering we already have not
> only xen-hvmctx, but also xenctx? If the new functionality isn't a
> good fit for either, perhaps its name would better reflect its
> connection to save/restore records? xen-sr-dump looks pretty clumsy
> to me, but still seems better than a name easily mixed up with
> others.

How about xen-domctx?

  Paul

> 
> Jan




* Re: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
  2020-04-03 15:20     ` Paul Durrant
@ 2020-04-03 15:29       ` Jan Beulich
  2020-04-03 16:08         ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2020-04-03 15:29 UTC (permalink / raw)
  To: paul; +Cc: xen-devel, 'Ian Jackson', 'Wei Liu'

On 03.04.2020 17:20, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 30 March 2020 11:54
>> To: Paul Durrant <paul@xen.org>
>> Cc: xen-devel@lists.xenproject.org; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org>
>> Subject: Re: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
>>
>> On 27.03.2020 19:50, Paul Durrant wrote:
> >>> This tool is analogous to 'xen-hvmctx' which presents HVM context.
>>> Subsequent patches will add 'dump' functions when new records are
>>> introduced.
>>>
>>> Signed-off-by: Paul Durrant <paul@xen.org>
>>> ---
>>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>> Cc: Wei Liu <wl@xen.org>
>>> ---
>>>  .gitignore           |   1 +
>>>  tools/misc/Makefile  |   4 ++
>>>  tools/misc/xen-ctx.c | 144 +++++++++++++++++++++++++++++++++++++++++++
>>
>> Is xen-ctx a good choice of a name, considering we already have not
>> only xen-hvmctx, but also xenctx? If the new functionality isn't a
>> good fit for either, perhaps its name would better reflect its
>> connection to save/restore records? xen-sr-dump looks pretty clumsy
>> to me, but still seems better than a name easily mixed up with
>> others.
> 
> How about xen-domctx?

Hmm, maybe. Seeing this is about PV pieces, xen-pvctx might also be
an option.

Jan



* RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-01 12:00   ` Julien Grall
  2020-04-01 12:07     ` Jan Beulich
@ 2020-04-03 15:55     ` Paul Durrant
  2020-04-03 17:24       ` Julien Grall
  1 sibling, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-03 15:55 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

> -----Original Message-----
[snip]
> > +
> > +#include <xen/save.h>
> > +
> > +struct domain_context {
> > +    bool log;
> > +    struct domain_save_descriptor desc;
> > +    domain_copy_entry copy;
> 
> As your new framework is technically an extension of the existing one, it
> would be good to explain why we diverge in the definitions.
> 

I don't follow. What is diverging? I explain in the commit comment that this is a parallel framework. Do I need to justify why it is not a carbon copy of the HVM one?

> > +    void *priv;
> > +};
> > +
> > +static struct {
> > +    const char *name;
> > +    bool per_vcpu;
> > +    domain_save_handler save;
> > +    domain_load_handler load;
> > +} handlers[DOMAIN_SAVE_CODE_MAX + 1];
> > +
> > +void __init domain_register_save_type(unsigned int tc, const char *name,
> > +                                      bool per_vcpu,
> > +                                      domain_save_handler save,
> > +                                      domain_load_handler load)
> > +{
> > +    BUG_ON(tc > ARRAY_SIZE(handlers));
> > +
> > +    ASSERT(!handlers[tc].save);
> > +    ASSERT(!handlers[tc].load);
> > +
> > +    handlers[tc].name = name;
> > +    handlers[tc].per_vcpu = per_vcpu;
> > +    handlers[tc].save = save;
> > +    handlers[tc].load = load;
> > +}
> > +
> > +int domain_save_entry(struct domain_context *c, unsigned int tc,
> > +                      const char *name, const struct vcpu *v, void *src,
> 
> I realize that 'src' is not const because of how you define copy;
> however, I would prefer if we reworked the interface so that the const
> can be kept in the definition. This may mean having to define two
> callbacks rather than one.

That seems a bit ugly for the sake of a const but I guess I could create a union with a copy_in and copy_out. I'll see how that looks.
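Something like the following, perhaps (purely illustrative sketch; the callback names and the buffer sink are invented for the example, not taken from the patch):

```c
#include <string.h>
#include <stddef.h>

/* Separate callback types: the save path can then take a const source
 * while the load path keeps a mutable destination. */
typedef int (*copy_out_cb)(void *priv, const void *src, size_t len);
typedef int (*copy_in_cb)(void *priv, void *dst, size_t len);

struct ctx_sketch {
    union {
        copy_out_cb write; /* used when saving */
        copy_in_cb  read;  /* used when loading */
    } copy;
    void *priv;
};

/* Trivial in-memory sink standing in for the real toolstack copy helper. */
struct buf { unsigned char data[64]; size_t off; };

static int buf_write(void *priv, const void *src, size_t len)
{
    struct buf *b = priv;

    if ( b->off + len > sizeof(b->data) )
        return -1; /* would overflow the sink */
    memcpy(b->data + b->off, src, len);
    b->off += len;
    return 0;
}
```

The union keeps the context structure size unchanged while letting each direction use the const-correct signature.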

> 
> > +                      size_t src_len)
> 
> On 64-bit architecture, size_t would be 64-bit. But the record is only
> storing 32-bit. Would it make sense to add some sanity check in the code?
> 

True. Given this is new I think I'll just bump the length to 64-bits.

> > +{
> > +    int rc;
> > +
> > +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
> > +         tc != DOMAIN_SAVE_CODE(END) )
> > +        gdprintk(XENLOG_INFO, "%pv save: %s (%lu)\n", v, name, src_len);
> > +
> > +    if ( !IS_ALIGNED(src_len, 8) )
> 
> Why not add implicit padding? This would avoid having to chase errors
> during saving because the length wasn't a multiple of 8.
> 

I don't think implicit padding is worth it. This error should only happen if a badly defined save record type is added to the code so perhaps I ought to add an ASSERT here as well as deal with the error.
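I.e. something along these lines (standalone sketch of the semantics, not the actual patch code; IS_ALIGNED is approximated and the error value is illustrative):

```c
#include <stdint.h>

/* Approximation of Xen's IS_ALIGNED(); alignment must be a power of two. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

#define SKETCH_EINVAL 22

/* Reject record lengths that are not a multiple of 8. In a debug build
 * an ASSERT() would fire here too, since a misaligned length can only
 * mean a badly defined save record type was added to the code. */
static int check_entry_len(uint64_t len)
{
    if ( !IS_ALIGNED(len, 8) )
        return -SKETCH_EINVAL;
    return 0;
}
```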

> > +        return -EINVAL;
> > +
> > +    BUG_ON(tc != c->desc.typecode);
> > +    BUG_ON(v->vcpu_id != c->desc.instance);
> 
> Does the descriptor really need to be filled by domain_save()? Can't it
> be done here, so we can avoid the two BUG_ON()s?
> 

Yes it can but this serves as a sanity check that the save code is actually doing what it should. Hence why these are BUG_ON()s and not error exits.

> > +    c->desc.length = src_len;
> > +
> > +    rc = c->copy(c->priv, &c->desc, sizeof(c->desc));
> > +    if ( rc )
> > +        return rc;
> > +
> > +    return c->copy(c->priv, src, src_len);
> > +}
> > +
> > +int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
> > +                unsigned long mask, bool dry_run)
> > +{
> > +    struct domain_context c = {
> > +        .copy = copy,
> > +        .priv = priv,
> > +        .log = !dry_run,
> > +    };
> > +    struct domain_save_header h = {
> > +        .magic = DOMAIN_SAVE_MAGIC,
> > +        .version = DOMAIN_SAVE_VERSION,
> > +    };
> > +    struct domain_save_header e;
> > +    unsigned int i;
> > +    int rc;
> > +
> > +    ASSERT(d != current->domain);
> > +
> > +    if ( d->is_dying )
> > +        return -EINVAL;
> > +
> > +    domain_pause(d);
> > +
> > +    c.desc.typecode = DOMAIN_SAVE_CODE(HEADER);
> > +
> > +    rc = DOMAIN_SAVE_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h));
> > +    if ( rc )
> > +        goto out;
> > +
> > +    for ( i = 0; i < ARRAY_SIZE(handlers); i++ )
> 
> AFAIU, with this solution, if there are dependencies between records, the
> dependencies will have to have a lower "index". I think we may have some
> dependencies with guest transparent migration, such as needing to restore
> the event ABI before restoring the event channels.
> 

Is that just a downstream implementation detail though? I would hope that there are no ordering dependencies between records.

> Should we rely on the index for the dependency?
>

If we do need ordering dependencies then I guess it would need to be explicit.
 
> In any case, I think we want to document it.
>

I can certainly document that save handlers are invoked in code order.

> > +    {
> > +        domain_save_handler save = handlers[i].save;
> > +
> > +        if ( (mask && !test_bit(i, &mask)) || !save )
> 
> The type of mask suggests it is not possible to have more than 32
> different types of record if we wanted to be arch agnostic. Even if we
> focus on 64-bit arch, this is only 64 records.
> 
> This is not very future proof, but might be ok if this is not exposed
> outside of the hypervisor (I haven't looked at the rest of the series
> yet). However, we at least need a BUILD_BUG_ON() in place. So please
> make sure  DOMAIN_SAVE_CODE_MAX is always less than 64.
> 
> Also what if there is a bit set in the mask and the record is not
> existing? Shouldn't we return an error?
> 

TBH I think 32 will be enough... I would not expect this context space to grow very much, if at all, once transparent migration is working. I think I'll just drop the mask for now; it's not actually clear to me we'll make use of it... just seemed like a nice-to-have.
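That said, were the mask kept, the compile-time guard you ask for would only need to be something like this (standalone sketch; BUILD_BUG_ON and the MAX value are approximated here, not taken from the patch):

```c
/* Approximation of Xen's BUILD_BUG_ON() using a negative array size:
 * the expression fails to compile when the condition is true. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Illustrative stand-in for the real DOMAIN_SAVE_CODE_MAX. */
#define CODE_MAX_SKETCH 2

/* Refuses to build if typecodes could ever overflow a 32-bit mask. */
static inline void check_mask_width(void)
{
    BUILD_BUG_ON(CODE_MAX_SKETCH >= 32);
}
```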

> > +            continue;
> > +
> > +        memset(&c.desc, 0, sizeof(c.desc));
> > +        c.desc.typecode = i;
> > +
> > +        if ( handlers[i].per_vcpu )
> > +        {
> > +            struct vcpu *v;
> > +
> > +            for_each_vcpu ( d, v )
> > +            {
> > +                c.desc.instance = v->vcpu_id;
> > +
> > +                rc = save(v, &c, dry_run);
> > +                if ( rc )
> > +                    goto out;
> > +            }
> > +        }
> > +        else
> > +        {
> > +            rc = save(d->vcpu[0], &c, dry_run);
> > +            if ( rc )
> > +                goto out;
> > +        }
> > +    }
> > +
> > +    memset(&c.desc, 0, sizeof(c.desc));
> > +    c.desc.typecode = DOMAIN_SAVE_CODE(END);
> > +
> > +    rc = DOMAIN_SAVE_ENTRY(END, &c, d->vcpu[0], &e, 0);
> > +
> > + out:
> > +    domain_unpause(d);
> > +
> > +    return rc;
> > +}
> > +
> > +int domain_load_entry(struct domain_context *c, unsigned int tc,
> > +                      const char *name, const struct vcpu *v, void *dst,
> > +                      size_t dst_len, bool exact)
> > +{
> > +    int rc;
> > +
> > +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
> > +         tc != DOMAIN_SAVE_CODE(END) )
> > +        gdprintk(XENLOG_INFO, "%pv load: %s (%lu)\n", v, name, dst_len);
> > +
> > +    BUG_ON(tc != c->desc.typecode);
> > +    BUG_ON(v->vcpu_id != c->desc.instance);
> 
> Is it really warranted to crash the host? What would happen if the values
> were wrong?
> 

It would mean the code is fairly seriously buggy in that the load handler is trying to load a record other than the type it registered for, or for a vcpu other than the one it was passed.

> > +
> > +    if ( (exact ?
> > +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
> 
> Using ternary in if is really confusing. How about:
> 
> dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||
> 
> I understand that there would be two check for the exact case but I
> think it is better than a ternary.

I'm going to re-work this I think.

> 
> However what is the purpose of the param 'exact'? If you set it to false
> how do you know the size you read?

The point of the exact parameter is to indicate whether the caller can only consume a record of exactly the right length, or whether it can handle an undersized one which gets zero-padded. (It's the same as the zeroextend option in HVM records).
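As a standalone sketch of those semantics (illustrative only; the names and the restructured length check are mine, not the patch's):

```c
#include <string.h>
#include <stddef.h>

#define SKETCH_EINVAL 22

/* With exact != 0 the record length must match dst_len precisely;
 * with exact == 0 an undersized record is accepted and the tail of
 * the destination is zero-filled (zeroextend behaviour). */
static int load_into(void *dst, size_t dst_len,
                     const void *rec, size_t rec_len, int exact)
{
    if ( rec_len > dst_len || (exact && rec_len != dst_len) )
        return -SKETCH_EINVAL;

    memcpy(dst, rec, rec_len);
    if ( rec_len < dst_len )
        memset((unsigned char *)dst + rec_len, 0, dst_len - rec_len);

    return 0;
}
```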

> 
> > +         !IS_ALIGNED(c->desc.length, 8) )
> > +        return -EINVAL;
> > +
> > +    rc = c->copy(c->priv, dst, c->desc.length);
> > +    if ( rc )
> > +        return rc;
> > +
> > +    if ( !exact )
> > +    {
> > +        dst += c->desc.length;
> > +        memset(dst, 0, dst_len - c->desc.length);
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +int domain_load(struct domain *d,  domain_copy_entry copy, void *priv,
> > +                unsigned long mask)
> > +{
> > +    struct domain_context c = {
> > +        .copy = copy,
> > +        .priv = priv,
> > +        .log = true,
> > +    };
> > +    struct domain_save_header h, e;
> > +    int rc;
> > +
> > +    ASSERT(d != current->domain);
> > +
> > +    if ( d->is_dying )
> > +        return -EINVAL;
> 
> What protects you against the domain dying right after this check?
> 

Nothing. It's just to avoid doing pointless work if we can.

> > +
> > +    rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
> > +    if ( rc )
> > +        return rc;
> > +
> > +    if ( c.desc.typecode != DOMAIN_SAVE_CODE(HEADER) ||
> > +         c.desc.instance != 0 )
> > +        return -EINVAL;
> > +
> > +    rc = DOMAIN_LOAD_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h), true);
> > +    if ( rc )
> > +        return rc;
> > +
> > +    if ( h.magic != DOMAIN_SAVE_MAGIC || h.version != DOMAIN_SAVE_VERSION )
> > +        return -EINVAL;
> > +
> > +    domain_pause(d);
> > +
> > +    for (;;)
> > +    {
> > +        unsigned int i;
> > +        domain_load_handler load;
> > +        struct vcpu *v;
> > +
> > +        rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
> > +        if ( rc )
> > +            break;
> > +
> > +        if ( c.desc.typecode == DOMAIN_SAVE_CODE(END) ) {
> > +            rc = DOMAIN_LOAD_ENTRY(END, &c, d->vcpu[0], &e, 0, true);
> > +            break;
> > +        }
> > +
> > +        rc = -EINVAL;
> > +        if ( c.desc.typecode >= ARRAY_SIZE(handlers) ||
> > +             c.desc.instance >= d->max_vcpus )
> 
> Nothing in the documentation suggests that c.desc.instance should be 0
> when the record is for the domain.
> 

Ok. I'll put a comment somewhere.

> > +            break;
> > +
> > +        i = c.desc.typecode;
> > +        load = handlers[i].load;
> > +        v = d->vcpu[c.desc.instance];
> > +
> > +        if ( mask && !test_bit(i, &mask) )
> > +        {
> > +            /* Sink the data */
> > +            rc = c.copy(c.priv, NULL, c.desc.length);
> > +            if ( rc )
> > +                break;
> > +
> > +            continue;
> > +        }
> > +
> > +        rc = load ? load(v, &c) : -EOPNOTSUPP;
> > +        if ( rc )
> > +            break;
> > +    }
> > +
> > +    domain_unpause(d);
> > +
> > +    return rc;
> > +}
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/include/public/save.h b/xen/include/public/save.h
> > new file mode 100644
> > index 0000000000..84981cd0f6
> > --- /dev/null
> > +++ b/xen/include/public/save.h
> > @@ -0,0 +1,74 @@
> > +/*
> > + * save.h
> > + *
> > + * Structure definitions for common PV/HVM domain state that is held by
> > + * Xen and must be saved along with the domain's memory.
> > + *
> > + * Copyright Amazon.com Inc. or its affiliates.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_SAVE_H__
> > +#define __XEN_PUBLIC_SAVE_H__
> > +
> > +#include "xen.h"
> > +
> > +/* Each entry is preceded by a descriptor */
> > +struct domain_save_descriptor {
> > +    uint16_t typecode;
> > +    uint16_t instance;
> > +    /*
> > +     * Entry length not including this descriptor. Entries must be padded
> > +     * to a multiple of 8 bytes to make sure descriptors remain correctly
> > +     * aligned.
> > +     */
> > +    uint32_t length;
> > +};
> > +
> > +/*
> > + * Each entry has a type associated with it. DECLARE_DOMAIN_SAVE_TYPE
> > + * binds these things together.
> > + */
> > +#define DECLARE_DOMAIN_SAVE_TYPE(_x, _code, _type) \
> > +    struct __DOMAIN_SAVE_TYPE_##_x { _type t; char c[_code]; };
> > +
> > +#define DOMAIN_SAVE_TYPE(_x) \
> > +    typeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->t)
> > +#define DOMAIN_SAVE_CODE(_x) \
> > +    (sizeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->c))
> > +#define DOMAIN_SAVE_MASK(_x) (1u << DOMAIN_SAVE_CODE(_x))
> > +
> > +/* Terminating entry */
> > +struct domain_save_end {};
> > +DECLARE_DOMAIN_SAVE_TYPE(END, 0, struct domain_save_end);
> > +
> > +#define DOMAIN_SAVE_MAGIC   0x53415645
> > +#define DOMAIN_SAVE_VERSION 0x00000001
> > +
> > +/* Initial entry */
> > +struct domain_save_header {
> > +    uint32_t magic;             /* Must be DOMAIN_SAVE_MAGIC */
> > +    uint32_t version;           /* Save format version */
> 
> Would it make sense to have the version of Xen in the stream?
> 

Why? How would it help?

  Paul




* RE: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
  2020-04-03 15:29       ` Jan Beulich
@ 2020-04-03 16:08         ` Paul Durrant
  0 siblings, 0 replies; 32+ messages in thread
From: Paul Durrant @ 2020-04-03 16:08 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, 'Ian Jackson', 'Wei Liu'

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 03 April 2020 16:30
> To: paul@xen.org
> Cc: xen-devel@lists.xenproject.org; 'Ian Jackson' <ian.jackson@eu.citrix.com>; 'Wei Liu' <wl@xen.org>
> Subject: Re: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
> 
> On 03.04.2020 17:20, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 30 March 2020 11:54
> >> To: Paul Durrant <paul@xen.org>
> >> Cc: xen-devel@lists.xenproject.org; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu <wl@xen.org>
> >> Subject: Re: [PATCH 3/5] tools/misc: add xen-ctx to present domain context
> >>
> >> On 27.03.2020 19:50, Paul Durrant wrote:
> >>> This tool is analogous to 'xen-hvmctx' which presents HVM context.
> >>> Subsequent patches will add 'dump' functions when new records are
> >>> introduced.
> >>>
> >>> Signed-off-by: Paul Durrant <paul@xen.org>
> >>> ---
> >>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> >>> Cc: Wei Liu <wl@xen.org>
> >>> ---
> >>>  .gitignore           |   1 +
> >>>  tools/misc/Makefile  |   4 ++
> >>>  tools/misc/xen-ctx.c | 144 +++++++++++++++++++++++++++++++++++++++++++
> >>
> >> Is xen-ctx a good choice of a name, considering we already have not
> >> only xen-hvmctx, but also xenctx? If the new functionality isn't a
> >> good fit for either, perhaps its name would better reflect its
> >> connection to save/restore records? xen-sr-dump looks pretty clumsy
> >> to me, but still seems better than a name easily mixed up with
> >> others.
> >
> > How about xen-domctx?
> 
> Hmm, maybe. Seeing this is about PV pieces, xen-pvctx might also be
> an option.

Yes, that would work but it also implies it might only be valid for PV domains. I also prefer 'domctx' since it better matches the name I chose for the framework.

  Paul




* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-03 15:55     ` Paul Durrant
@ 2020-04-03 17:24       ` Julien Grall
  2020-04-06  8:27         ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Julien Grall @ 2020-04-03 17:24 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

Hi Paul,

On 03/04/2020 16:55, Paul Durrant wrote:
>> -----Original Message-----
> [snip]
>>> +
>>> +#include <xen/save.h>
>>> +
>>> +struct domain_context {
>>> +    bool log;
>>> +    struct domain_save_descriptor desc;
>>> +    domain_copy_entry copy;
>>
>> As your new framework is technically an extension of the existing one, it
>> would be good to explain why we diverge in the definitions.
>>
> 
> I don't follow. What is diverging? I explain in the commit comment that this is a parallel framework. Do I need to justify why it is not a carbon copy of the HVM one?

Well, they are both saving/restoring guest state. The only difference is
that the existing one focuses on HVM state.

So it would make sense long term to have only one hypercall and tell 
what you want to save. In fact, some of the improvement here would 
definitely make the HVM one nicer to use (at least in the context of LU).

From the commit message, it is not clear to me why a new framework is
needed, nor why the infrastructure is at the same time both different
from and similar to the existing one.

> 
>>> +    void *priv;
>>> +};
>>> +
>>> +static struct {
>>> +    const char *name;
>>> +    bool per_vcpu;
>>> +    domain_save_handler save;
>>> +    domain_load_handler load;
>>> +} handlers[DOMAIN_SAVE_CODE_MAX + 1];
>>> +
>>> +void __init domain_register_save_type(unsigned int tc, const char *name,
>>> +                                      bool per_vcpu,
>>> +                                      domain_save_handler save,
>>> +                                      domain_load_handler load)
>>> +{
>>> +    BUG_ON(tc > ARRAY_SIZE(handlers));
>>> +
>>> +    ASSERT(!handlers[tc].save);
>>> +    ASSERT(!handlers[tc].load);
>>> +
>>> +    handlers[tc].name = name;
>>> +    handlers[tc].per_vcpu = per_vcpu;
>>> +    handlers[tc].save = save;
>>> +    handlers[tc].load = load;
>>> +}
>>> +
>>> +int domain_save_entry(struct domain_context *c, unsigned int tc,
>>> +                      const char *name, const struct vcpu *v, void *src,
>>
>> I realize that 'src' is not const because of how you define copy;
>> however, I would prefer if we reworked the interface so that the const
>> can be kept in the definition. This may mean having to define two
>> callbacks rather than one.
> 
> That seems a bit ugly for the sake of a const but I guess I could create a union with a copy_in and copy_out. I'll see how that looks.

I would really like to map the Live-Update data as read-only (it is not
meant to be modified by the new Xen) and therefore adding const here would
enable us to catch most of the mistakes at build time rather than at
runtime.
>> Why not add implicit padding? This would avoid having to chase errors
>> during saving because the length wasn't a multiple of 8.
>>
> 
> I don't think implicit padding is worth it. This error should only happen if a badly defined save record type is added to the code so perhaps I ought to add an ASSERT here as well as deal with the error.

I wish I could say every error can be caught during review. 
Unfortunately this is not true.

If you define the size dynamically, then a misalignment may not be 
noticed until you go to prod. The implicit padding would at least allow 
you to be more resilient.

It may not be that bad, as you would not be able to save the guest. 
Anyway, this would be a nice feature to have but not a must.

> 
>>> +        return -EINVAL;
>>> +
>>> +    BUG_ON(tc != c->desc.typecode);
>>> +    BUG_ON(v->vcpu_id != c->desc.instance);
>>
>> Does the descriptor really need to be filled by domain_save()? Can't it
>> be done here, so we can avoid the two BUG_ON()s?
>>
> 
> Yes it can but this serves as a sanity check that the save code is actually doing what it should. Hence why these are BUG_ON()s and not error exits.

This does not really answer the question of why c->desc.instance and 
c->desc.typecode are not set here. What is the advantage of doing it 
earlier on and still passing the information in parameters? Why not just 
set them here?

> 
>>> +    c->desc.length = src_len;
>>> +
>>> +    rc = c->copy(c->priv, &c->desc, sizeof(c->desc));
>>> +    if ( rc )
>>> +        return rc;
>>> +
>>> +    return c->copy(c->priv, src, src_len);
>>> +}
>>> +
>>> +int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
>>> +                unsigned long mask, bool dry_run)
>>> +{
>>> +    struct domain_context c = {
>>> +        .copy = copy,
>>> +        .priv = priv,
>>> +        .log = !dry_run,
>>> +    };
>>> +    struct domain_save_header h = {
>>> +        .magic = DOMAIN_SAVE_MAGIC,
>>> +        .version = DOMAIN_SAVE_VERSION,
>>> +    };
>>> +    struct domain_save_header e;
>>> +    unsigned int i;
>>> +    int rc;
>>> +
>>> +    ASSERT(d != current->domain);
>>> +
>>> +    if ( d->is_dying )
>>> +        return -EINVAL;
>>> +
>>> +    domain_pause(d);
>>> +
>>> +    c.desc.typecode = DOMAIN_SAVE_CODE(HEADER);
>>> +
>>> +    rc = DOMAIN_SAVE_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h));
>>> +    if ( rc )
>>> +        goto out;
>>> +
>>> +    for ( i = 0; i < ARRAY_SIZE(handlers); i++ )
>>
>> AFAIU, with this solution, if there are dependencies between records, the
>> dependencies will have to have a lower "index". I think we may have some
>> dependencies with guest transparent migration, such as needing to restore
>> the event ABI before restoring the event channels.
>>
> 
> Is that just a downstream implementation detail though? I would hope that there are no ordering dependencies between records.

Until now, I never managed to avoid ordering between records on the Live 
Update side.

In my previous e-mail, I suggested there are dependencies between 
restoring the event ABI (such as control page...) and restoring the 
event channels. How do you suggest solving that without imposing an order 
between records?

>>> +    {
>>> +        domain_save_handler save = handlers[i].save;
>>> +
>>> +        if ( (mask && !test_bit(i, &mask)) || !save )
>>
>> The type of mask suggests it is not possible to have more than 32
>> different types of record if we wanted to be arch agnostic. Even if we
>> focus on 64-bit arch, this is only 64 records.
>>
>> This is not very future proof, but might be ok if this is not exposed
>> outside of the hypervisor (I haven't looked at the rest of the series
>> yet). However, we at least need a BUILD_BUG_ON() in place. So please
>> make sure  DOMAIN_SAVE_CODE_MAX is always less than 64.
>>
>> Also what if there is a bit set in the mask and the record is not
>> existing? Shouldn't we return an error?
>>
> 
> TBH I think 32 will be enough... I would not expect this context space to grow very much, if at all, once transparent migration is working. I think I'll just drop the mask for now; it's not actually clear to me we'll make use of it... just seemed like a nice-to-have.

So far for Live-Update we have 20 records (not counting the Guest 
Transparent one). We might be able to do without some, but we also 
haven't restored all the state yet.

> 
>>> +            continue;
>>> +
>>> +        memset(&c.desc, 0, sizeof(c.desc));
>>> +        c.desc.typecode = i;
>>> +
>>> +        if ( handlers[i].per_vcpu )
>>> +        {
>>> +            struct vcpu *v;
>>> +
>>> +            for_each_vcpu ( d, v )
>>> +            {
>>> +                c.desc.instance = v->vcpu_id;
>>> +
>>> +                rc = save(v, &c, dry_run);
>>> +                if ( rc )
>>> +                    goto out;
>>> +            }
>>> +        }
>>> +        else
>>> +        {
>>> +            rc = save(d->vcpu[0], &c, dry_run);
>>> +            if ( rc )
>>> +                goto out;
>>> +        }
>>> +    }
>>> +
>>> +    memset(&c.desc, 0, sizeof(c.desc));
>>> +    c.desc.typecode = DOMAIN_SAVE_CODE(END);
>>> +
>>> +    rc = DOMAIN_SAVE_ENTRY(END, &c, d->vcpu[0], &e, 0);
>>> +
>>> + out:
>>> +    domain_unpause(d);
>>> +
>>> +    return rc;
>>> +}
>>> +
>>> +int domain_load_entry(struct domain_context *c, unsigned int tc,
>>> +                      const char *name, const struct vcpu *v, void *dst,
>>> +                      size_t dst_len, bool exact)
>>> +{
>>> +    int rc;
>>> +
>>> +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
>>> +         tc != DOMAIN_SAVE_CODE(END) )
>>> +        gdprintk(XENLOG_INFO, "%pv load: %s (%lu)\n", v, name, dst_len);
>>> +
>>> +    BUG_ON(tc != c->desc.typecode);
>>> +    BUG_ON(v->vcpu_id != c->desc.instance);
>>
>> Is it really warranted to crash the host? What would happen if the values
>> were wrong?
>>
> 
> It would mean the code is fairly seriously buggy in that the load handler is trying to load a record other than the type it registered for, or for a vcpu other than the one it was passed.

I understand that, but is it warranted to crash the host? Couldn't you just 
return an error and continue to run?

> 
>>> +
>>> +    if ( (exact ?
>>> +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
>>
>> Using ternary in if is really confusing. How about:
>>
>> dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||
>>
>> I understand that there would be two check for the exact case but I
>> think it is better than a ternary.
> 
> I'm going to re-work this I think.
> 
>>
>> However what is the purpose of the param 'exact'? If you set it to false
>> how do you know the size you read?
> 
> The point of the exact parameter is to indicate whether the caller can only consume a record of exactly the right length, or whether it can handle an undersized one which gets zero-padded. (It's the same as the zeroextend option in HVM records).
> 
>>
>>> +         !IS_ALIGNED(c->desc.length, 8) )
>>> +        return -EINVAL;
>>> +
>>> +    rc = c->copy(c->priv, dst, c->desc.length);
>>> +    if ( rc )
>>> +        return rc;
>>> +
>>> +    if ( !exact )
>>> +    {
>>> +        dst += c->desc.length;
>>> +        memset(dst, 0, dst_len - c->desc.length);
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +int domain_load(struct domain *d,  domain_copy_entry copy, void *priv,
>>> +                unsigned long mask)
>>> +{
>>> +    struct domain_context c = {
>>> +        .copy = copy,
>>> +        .priv = priv,
>>> +        .log = true,
>>> +    };
>>> +    struct domain_save_header h, e;
>>> +    int rc;
>>> +
>>> +    ASSERT(d != current->domain);
>>> +
>>> +    if ( d->is_dying )
>>> +        return -EINVAL;
>>
>> What protects you against the domain dying right after this check?
>>
> 
> Nothing. It's just to avoid doing pointless work if we can.

I wasn't thinking about pointless work but whether the rest of the code 
is relying on a sound domain. Is it going to be fine?

[...]

>>> +/* Each entry is preceded by a descriptor */
>>> +struct domain_save_descriptor {
>>> +    uint16_t typecode;
>>> +    uint16_t instance;
>>> +    /*
>>> +     * Entry length not including this descriptor. Entries must be padded
>>> +     * to a multiple of 8 bytes to make sure descriptors remain correctly
>>> +     * aligned.
>>> +     */
>>> +    uint32_t length;
>>> +};
>>> +
>>> +/*
>>> + * Each entry has a type associated with it. DECLARE_DOMAIN_SAVE_TYPE
>>> + * binds these things together.
>>> + */
>>> +#define DECLARE_DOMAIN_SAVE_TYPE(_x, _code, _type) \
>>> +    struct __DOMAIN_SAVE_TYPE_##_x { _type t; char c[_code]; };
>>> +
>>> +#define DOMAIN_SAVE_TYPE(_x) \
>>> +    typeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->t)
>>> +#define DOMAIN_SAVE_CODE(_x) \
>>> +    (sizeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->c))
>>> +#define DOMAIN_SAVE_MASK(_x) (1u << DOMAIN_SAVE_CODE(_x))
>>> +
>>> +/* Terminating entry */
>>> +struct domain_save_end {};
>>> +DECLARE_DOMAIN_SAVE_TYPE(END, 0, struct domain_save_end);
>>> +
>>> +#define DOMAIN_SAVE_MAGIC   0x53415645
>>> +#define DOMAIN_SAVE_VERSION 0x00000001
>>> +
>>> +/* Initial entry */
>>> +struct domain_save_header {
>>> +    uint32_t magic;             /* Must be DOMAIN_SAVE_MAGIC */
>>> +    uint32_t version;           /* Save format version */
>>
>> Would it make sense to have the version of Xen in the stream?
>>
> 
> Why? How would it help?

Let's imagine in 4.14 we introduced a bug in the saving part. This is 
solved in 4.15 but somehow the version is not bumped. How would you 
differentiate the streams saved by Xen 4.14 so you can still migrate?

If you record the version of Xen in the record automatically, then you 
at least have a way to differentiate the two versions.
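For instance, the header could be extended along these lines (illustrative sketch; the extra fields are my suggestion, not part of the posted patch):

```c
#include <stdint.h>

struct domain_save_header_sketch {
    uint32_t magic;        /* DOMAIN_SAVE_MAGIC */
    uint32_t version;      /* save format version */
    uint16_t xen_major;    /* major version of the saving Xen */
    uint16_t xen_minor;    /* minor version of the saving Xen */
    uint32_t reserved;     /* keep the record a multiple of 8 bytes */
};
```

The restore side could then apply release-specific quirks even when the format version was (wrongly) left unbumped.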

Cheers,

-- 
Julien Grall



* RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-03 17:24       ` Julien Grall
@ 2020-04-06  8:27         ` Paul Durrant
  2020-04-06  9:08           ` Julien Grall
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-06  8:27 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 03 April 2020 18:24
> To: paul@xen.org; xen-devel@lists.xenproject.org
> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Ian
> Jackson' <ian.jackson@eu.citrix.com>; 'Jan Beulich' <jbeulich@suse.com>; 'Stefano Stabellini'
> <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
> 
> Hi Paul,
> 
> On 03/04/2020 16:55, Paul Durrant wrote:
> >> -----Original Message-----
> > [snip]
> >>> +
> >>> +#include <xen/save.h>
> >>> +
> >>> +struct domain_context {
> >>> +    bool log;
> >>> +    struct domain_save_descriptor desc;
> >>> +    domain_copy_entry copy;
> >>
> >> As your new framework is technically an extension of existing one, it
> >> would be good to explain why we diverge in the definitions.
> >>
> >
> > I don't follow. What is diverging? I explain in the commit comment that this is a parallel
> framework. Do I need to justify why it is not a carbon copy of the HVM one?
> 
> Well, they are both restoring/saving guest state. The only difference is
> the existing one is focusing on HVM state.
> 
> So it would make sense long term to have only one hypercall and tell
> what you want to save. In fact, some of the improvement here would
> definitely make the HVM one nicer to use (at least in the context of LU).
> 

I guess we could move the HVM save records over to the new framework, but it works for the moment so I don't want to bring it into scope now.

>  From the commit message, it is not clear to me why a new framework and
> why the infrastructure is at the same time different but not.
> 

An alternative would be to move the HVM save code into common code and then try to adapt it. I think that would result in more code churn and ultimately be harder to review. The extra infrastructure introduced here is fairly minimal and, for the moment, only targeting PV state. As I said above there's nothing stopping the HVM records being ported over later once any initial issues have been shaken out.

  Paul





^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
  2020-04-01 13:42   ` Julien Grall
@ 2020-04-06  9:07     ` Paul Durrant
  2020-04-06 12:46       ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-06  9:07 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich',
	'Daniel De Graaf'

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 01 April 2020 14:42
> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org
> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>; Ian Jackson <ian.jackson@eu.citrix.com>; Wei Liu
> <wl@xen.org>; Andrew Cooper <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Jan
> Beulich <jbeulich@suse.com>; Stefano Stabellini <sstabellini@kernel.org>
> Subject: Re: [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
> 
> Hi Paul,
> 
> On 27/03/2020 18:50, Paul Durrant wrote:
> > These domctls provide a mechanism to get and set domain context from
> > the toolstack.
> >
> > Signed-off-by: Paul Durrant <paul@xen.org>
> > ---
> > Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Wei Liu <wl@xen.org>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <george.dunlap@citrix.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien@xen.org>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > ---
> >   tools/flask/policy/modules/xen.if   |   4 +-
> >   tools/libxc/include/xenctrl.h       |  11 +++
> >   tools/libxc/xc_domain.c             |  52 +++++++++++++
> >   xen/common/domctl.c                 | 115 ++++++++++++++++++++++++++++
> >   xen/include/public/domctl.h         |  41 +++++++++-
> >   xen/xsm/flask/hooks.c               |   6 ++
> >   xen/xsm/flask/policy/access_vectors |   4 +
> >   7 files changed, 230 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
> > index 8eb2293a52..2bc9db4f64 100644
> > --- a/tools/flask/policy/modules/xen.if
> > +++ b/tools/flask/policy/modules/xen.if
> > @@ -53,7 +53,7 @@ define(`create_domain_common', `
> >   	allow $1 $2:domain2 { set_cpu_policy settsc setscheduler setclaim
> >   			set_vnumainfo get_vnumainfo cacheflush
> >   			psr_cmt_op psr_alloc soft_reset
> > -			resource_map get_cpu_policy };
> > +			resource_map get_cpu_policy setcontext };
> >   	allow $1 $2:security check_context;
> >   	allow $1 $2:shadow enable;
> >   	allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
> > @@ -97,7 +97,7 @@ define(`migrate_domain_out', `
> >   	allow $1 $2:hvm { gethvmc getparam };
> >   	allow $1 $2:mmu { stat pageinfo map_read };
> >   	allow $1 $2:domain { getaddrsize getvcpucontext pause destroy };
> > -	allow $1 $2:domain2 gettsc;
> > +	allow $1 $2:domain2 { gettsc getcontext };
> >   	allow $1 $2:shadow { enable disable logdirty };
> >   ')
> >
> > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> > index fc6e57a1a0..5c0d0d27a4 100644
> > --- a/tools/libxc/include/xenctrl.h
> > +++ b/tools/libxc/include/xenctrl.h
> > @@ -867,6 +867,17 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
> >                                uint8_t *hvm_ctxt,
> >                                uint32_t size);
> >
> > +int xc_domain_getcontext(xc_interface *xch,
> > +                         uint32_t domid,
> > +                         uint32_t mask,
> > +                         uint8_t *ctxt_buf,
> 
> Why do you use uint8_t rather than void here?
> 
> > +                         uint32_t size);
> > +int xc_domain_setcontext(xc_interface *xch,
> > +                         uint32_t domid,
> > +                         uint32_t mask,
> > +                         uint8_t *ctxt_buf,
> > +                         uint32_t size);
> > +
> >   /**
> >    * This function will return guest IO ABI protocol
> >    *
> > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> > index 71829c2bce..15bcf671fc 100644
> > --- a/tools/libxc/xc_domain.c
> > +++ b/tools/libxc/xc_domain.c
> > @@ -537,6 +537,58 @@ int xc_domain_hvm_setcontext(xc_interface *xch,
> >       return ret;
> >   }
> >
> > +int xc_domain_getcontext(xc_interface *xch,
> > +                         uint32_t domid,
> > +                         uint32_t mask,
> > +                         uint8_t *ctxt_buf,
> > +                         uint32_t size)
> > +{
> > +    int ret;
> > +    DECLARE_DOMCTL;
> > +    DECLARE_HYPERCALL_BOUNCE(ctxt_buf, size, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
> > +
> > +    if ( xc_hypercall_bounce_pre(xch, ctxt_buf) )
> > +        return -1;
> > +
> > +    domctl.cmd = XEN_DOMCTL_getdomaincontext;
> > +    domctl.domain = domid;
> > +    domctl.u.domaincontext.mask = mask;
> > +    domctl.u.domaincontext.size = size;
> > +    set_xen_guest_handle(domctl.u.domaincontext.buffer, ctxt_buf);
> > +
> > +    ret = do_domctl(xch, &domctl);
> > +
> > +    xc_hypercall_bounce_post(xch, ctxt_buf);
> > +
> > +    return !ret ? domctl.u.domaincontext.size : -1;
> 
> Looking at the domctl interface, size would be a 32-bit unsigned value.
> However the return is a 32-bit signed value. This is a call for trouble
> if the size is too big.
> 
> > +}
> > +
> > +int xc_domain_setcontext(xc_interface *xch,
> > +                         uint32_t domid,
> > +                         uint32_t mask,
> > +                         uint8_t *ctxt_buf,
> 
> The context is not meant to be modified. So this should really be const.
> 

All of the above were basically inherited via cut'n'paste from the HVM save code. I'll re-work it.

> > +                         uint32_t size)
> > +{
> > +    int ret;
> > +    DECLARE_DOMCTL;
> > +    DECLARE_HYPERCALL_BOUNCE(ctxt_buf, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> > +
> > +    if ( xc_hypercall_bounce_pre(xch, ctxt_buf) )
> > +        return -1;
> > +
> > +    domctl.cmd = XEN_DOMCTL_setdomaincontext;
> > +    domctl.domain = domid;
> > +    domctl.u.domaincontext.mask = mask;
> > +    domctl.u.domaincontext.size = size;
> > +    set_xen_guest_handle(domctl.u.domaincontext.buffer, ctxt_buf);
> > +
> > +    ret = do_domctl(xch, &domctl);
> > +
> > +    xc_hypercall_bounce_post(xch, ctxt_buf);
> > +
> > +    return ret;
> > +}
> > +
> >   int xc_vcpu_getcontext(xc_interface *xch,
> >                          uint32_t domid,
> >                          uint32_t vcpu,
> > diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> > index a69b3b59a8..9c39519b04 100644
> > --- a/xen/common/domctl.c
> > +++ b/xen/common/domctl.c
> > @@ -25,6 +25,7 @@
> >   #include <xen/hypercall.h>
> >   #include <xen/vm_event.h>
> >   #include <xen/monitor.h>
> > +#include <xen/save.h>
> >   #include <asm/current.h>
> >   #include <asm/irq.h>
> >   #include <asm/page.h>
> > @@ -358,6 +359,111 @@ static struct vnuma_info *vnuma_init(const struct xen_domctl_vnuma *uinfo,
> >       return ERR_PTR(ret);
> >   }
> >
> > +struct domctl_context
> > +{
> > +    void *buffer;
> > +    size_t len;
> > +    size_t cur;
> > +};
> > +
> > +static int accumulate_size(void *priv, void *data, size_t len)
> > +{
> > +    struct domctl_context *c = priv;
> > +
> > +    c->len += len;
> 
> What does prevent overflow?
> 

Good point. That needs to be checked.

> > +
> > +    return 0;
> > +}
> > +
> > +static int save_data(void *priv, void *data, size_t len)
> > +{
> > +    struct domctl_context *c = priv;
> > +
> > +    if ( c->len - c->cur < len )
> > +        return -ENOSPC;
> > +
> > +    memcpy(c->buffer + c->cur, data, len);
> > +    c->cur += len;
> > +
> > +    return 0;
> > +}
> > +
> > +static int getdomaincontext(struct domain *d,
> > +                            struct xen_domctl_domaincontext *dc)
> > +{
> > +    struct domctl_context c = { };
> > +    int rc;
> > +
> > +    if ( d == current->domain )
> > +        return -EPERM;
> > +
> > > +    if ( guest_handle_is_null(dc->buffer) ) /* query for buffer size */
> > > +    {
> > +        if ( dc->size )
> > +            return -EINVAL;
> > +
> > +        /* dry run to acquire buffer size */
> > +        rc = domain_save(d, accumulate_size, &c, dc->mask, true);
> > +        if ( rc )
> > +            return rc;
> > +
> > +        dc->size = c.len;
> 
> len is a size_t (so 64-bit on 64-bit arch) value, but size is 32-bit. So
> we return an error if the stream is going to be bigger than 4G?
> 

I'd hope a 4G stream is unlikely but I'll expand size to 64 bits.

> > +        return 0;
> > +    }
> > +
> > +    c.len = dc->size;
> > +    c.buffer = xmalloc_bytes(c.len);
> > +    if ( !c.buffer )
> > +        return -ENOMEM;
> > +
> > +    rc = domain_save(d, save_data, &c, dc->mask, false);
> > +
> > +    dc->size = c.cur;
> > +    if ( !rc && copy_to_guest(dc->buffer, c.buffer, dc->size) )
> > +        rc = -EFAULT;
> > +
> > +    xfree(c.buffer);
> > +
> > +    return rc;
> > +}
> > +
> > +static int load_data(void *priv, void *data, size_t len)
> > +{
> > +    struct domctl_context *c = priv;
> > +
> > +    if ( c->len - c->cur < len )
> > +        return -ENODATA;
> > +
> > +    if ( data )
> > +        memcpy(data, c->buffer + c->cur, len);
> > +
> > +    c->cur += len;
> > +
> > +    return 0;
> > +}
> > +
> > +static int setdomaincontext(struct domain *d,
> > +                            const struct xen_domctl_domaincontext *dc)
> > +{
> > +    struct domctl_context c = { .len = dc->size };
> > +    int rc;
> > +
> > +    if ( d == current->domain )
> > +        return -EPERM;
> > +
> > +    c.buffer = xmalloc_bytes(c.len);
> > +    if ( !c.buffer )
> > +        return -ENOMEM;
> > +
> > +    rc = !copy_from_guest(c.buffer, dc->buffer, c.len) ?
> > +        domain_load(d, load_data, &c, dc->mask) : -EFAULT;
> > +
> > +    xfree(c.buffer);
> > +
> > +    return rc;
> > +}
> > +
> >   long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> >   {
> >       long ret = 0;
> > @@ -942,6 +1048,15 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> >               copyback = 1;
> >           break;
> >
> > +    case XEN_DOMCTL_getdomaincontext:
> > +        ret = getdomaincontext(d, &op->u.domaincontext);
> > +        copyback = !ret;
> > +        break;
> > +
> > +    case XEN_DOMCTL_setdomaincontext:
> > +        ret = setdomaincontext(d, &op->u.domaincontext);
> > +        break;
> > +
> >       default:
> >           ret = arch_do_domctl(op, d, u_domctl);
> >           break;
> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 1ad34c35eb..24ed6852cf 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -38,7 +38,7 @@
> >   #include "hvm/save.h"
> >   #include "memory.h"
> >
> > -#define XEN_DOMCTL_INTERFACE_VERSION 0x00000012
> > +#define XEN_DOMCTL_INTERFACE_VERSION 0x00000013
> >
> >   /*
> >    * NB. xen_domctl.domain is an IN/OUT parameter for this operation.
> > @@ -1129,6 +1129,42 @@ struct xen_domctl_vuart_op {
> >                                    */
> >   };
> >
> > +/*
> > + * Get/Set domain PV context. The same struct xen_domctl_domaincontext
> > + * is used for both commands but with slightly different field semantics
> > + * as follows:
> > + *
> > + * XEN_DOMCTL_getdomaincontext
> > + * ---------------------------
> > + *
> > + * buffer (IN):   The buffer into which the context data should be
> > + *                copied, or NULL to query the buffer size that should
> > + *                be allocated.
> > + * size (IN/OUT): If 'buffer' is NULL then the value passed in must be
> > + *                zero, and the value passed out will be the size of the
> > + *                buffer to allocate.
> > + *                If 'buffer' is non-NULL then the value passed in must
> > + *                be the size of the buffer into which data may be copied.
> > + * mask (IN):     See comment on domain_save/load() in xen/save.h.
> > + *
> > + * XEN_DOMCTL_setdomaincontext
> > + * ---------------------------
> > + *
> > + * buffer (IN):   The buffer from which the context data should be
> > + *                copied.
> > + * size (IN):     The size of the buffer from which data may be copied.
> > + *                This data must include DOMAIN_SAVE_CODE_HEADER at the
> > + *                start and terminate with a DOMAIN_SAVE_CODE_END record.
> > + *                Any data beyond the DOMAIN_SAVE_CODE_END record will be
> > + *                ignored.
> > + * mask (IN):     See comment on domain_save/load() in xen/save.h.
> > + */
> > +struct xen_domctl_domaincontext {
> > +    uint32_t size;
> > +    uint32_t mask;
> > +    XEN_GUEST_HANDLE_64(uint8) buffer;
> 
> Should not it be a void handle?
>

Perhaps.
 
> Also, in the case of XEN_DOMCTL_setdomaincontext, the buffer is not
> meant to be modified by the hypervisor. So I would rather introduce two
> separate structure so we can enforce the const.
> 

Can handles be meaningfully const?

  Paul



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-06  8:27         ` Paul Durrant
@ 2020-04-06  9:08           ` Julien Grall
  2020-04-06  9:18             ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Julien Grall @ 2020-04-06  9:08 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

Hi Paul,

On 06/04/2020 09:27, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 03 April 2020 18:24
>> To: paul@xen.org; xen-devel@lists.xenproject.org
>> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Ian
>> Jackson' <ian.jackson@eu.citrix.com>; 'Jan Beulich' <jbeulich@suse.com>; 'Stefano Stabellini'
>> <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
>> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
>>
>> Hi Paul,
>>
>> On 03/04/2020 16:55, Paul Durrant wrote:
>>>> -----Original Message-----
>>> [snip]
>>>>> +
>>>>> +#include <xen/save.h>
>>>>> +
>>>>> +struct domain_context {
>>>>> +    bool log;
>>>>> +    struct domain_save_descriptor desc;
>>>>> +    domain_copy_entry copy;
>>>>
>>>> As your new framework is technically an extension of existing one, it
>>>> would be good to explain why we diverge in the definitions.
>>>>
>>>
>>> I don't follow. What is diverging? I explain in the commit comment that this is a parallel
>> framework. Do I need to justify why it is not a carbon copy of the HVM one?
>>
>> Well, they are both restoring/saving guest state. The only difference is
>> the existing one is focusing on HVM state.
>>
>> So it would make sense long term to have only one hypercall and tell
>> what you want to save. In fact, some of the improvement here would
>> definitely make the HVM one nicer to use (at least in the context of LU).
>>
> 
> I guess we could move the HVM save records over to the new framework, but it works for the moment so I don't want to bring it into scope now.

And I agree, hence why I say "long term" :).

> 
>>   From the commit message, it is not clear to me why a new framework and
>> why the infrastructure is at the same time different but not.
>>
> 
> An alternative would be to move the HVM save code into common code and then try to adapt it. I think that would result in more code churn and ultimately be harder to review. The extra infrastructure introduced here is fairly minimal and, for the moment, only targeting PV state. As I said above there's nothing stopping the HVM records being ported over later once any initial issues have been shaken out.

Code churn is always going to be necessary one day or another if we want to 
consolidate the two.

Having two frameworks is not without risks. There are a few unknowns to 
be answered:
   * Is there any dependency between the two? If yes, what is the ordering?
   * The name of the hypercall does not say anything about "PV". So a 
contributor could think it would be fine to save/restore new HVM state 
in the new generic hypercall. Is it going to be an issue? If so, how do 
we prevent it?
   * Are we going to deprecate the existing framework?

I am not suggesting we should not go with two frameworks, but the 
reasons and implications are not clear to me. Hence, why I think the 
commit message should be expanded with some rationale.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-06  9:08           ` Julien Grall
@ 2020-04-06  9:18             ` Paul Durrant
  2020-04-06  9:50               ` Julien Grall
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-06  9:18 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 06 April 2020 10:08
> To: paul@xen.org; xen-devel@lists.xenproject.org
> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Ian
> Jackson' <ian.jackson@eu.citrix.com>; 'Jan Beulich' <jbeulich@suse.com>; 'Stefano Stabellini'
> <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
> 
> Hi Paul,
> 
> On 06/04/2020 09:27, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Julien Grall <julien@xen.org>
> >> Sent: 03 April 2020 18:24
> >> To: paul@xen.org; xen-devel@lists.xenproject.org
> >> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Ian
> >> Jackson' <ian.jackson@eu.citrix.com>; 'Jan Beulich' <jbeulich@suse.com>; 'Stefano Stabellini'
> >> <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
> >> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
> >>
> >> Hi Paul,
> >>
> >> On 03/04/2020 16:55, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>> [snip]
> >>>>> +
> >>>>> +#include <xen/save.h>
> >>>>> +
> >>>>> +struct domain_context {
> >>>>> +    bool log;
> >>>>> +    struct domain_save_descriptor desc;
> >>>>> +    domain_copy_entry copy;
> >>>>
> >>>> As your new framework is technically an extension of existing one, it
> >>>> would be good to explain why we diverge in the definitions.
> >>>>
> >>>
> >>> I don't follow. What is diverging? I explain in the commit comment that this is a parallel
> >> framework. Do I need to justify why it is not a carbon copy of the HVM one?
> >>
> >> Well, they are both restoring/saving guest state. The only difference is
> >> the existing one is focusing on HVM state.
> >>
> >> So it would make sense long term to have only one hypercall and tell
> >> what you want to save. In fact, some of the improvement here would
> >> definitely make the HVM one nicer to use (at least in the context of LU).
> >>
> >
> > I guess we could move the HVM save records over to the new framework, but it works for the moment so
> I don't want to bring it into scope now.
> 
> And I agree, hence why I say "long term" :).
> 
> >
> >>   From the commit message, it is not clear to me why a new framework and
> >> why the infrastructure is at the same time different but not.
> >>
> >
> > An alternative would be to move the HVM save code into common code and then try to adapt it. I think
> that would result in more code churn and ultimately be harder to review. The extra infrastructure
> introduced here is fairly minimal and, for the moment, only targeting PV state. As I said above
> there's nothing stopping the HVM records being ported over later once any initial issues have been
> shaken out.
> 
> Code churn is always going to be necessary one day or another if we want to
> consolidate the two.
> 
> Having two frameworks is not without risks. There are a few unknowns to
> be answered:
>    * Is there any dependency between the two? If yes, what is the ordering?

There isn't any dependency at the moment, so need I say anything? If an ordering dependency is introduced by a future patch then surely that would be the time to call it out.

>    * The name of the hypercall does not say anything about "PV". So a
> contributor could think it would be fine to save/restore new HVM state
> in the new generic hypercall. Is it going to be an issue? If so, how do
> we prevent it?

The commit message says:

"Domain context is state held in the hypervisor that does not come under
the category of 'HVM state' but is instead 'PV state' that is common
between PV guests and enlightened HVM guests (i.e. those that have PV
drivers) such as event channel state, grant entry state, etc."

Do you think this should also appear in a comment? Do we really care? Nothing fundamentally prevents the mechanism being used for HVM state, but that may introduce an ordering dependency.

>    * Are we going to deprecate the existing framework?
> 

There is no intention as yet.

> I am not suggesting we should not go with two frameworks, but the
> reasons and implications are not clear to me. Hence, why I think the
> commit message should be expanded with some rationale.
> 

Ok, I can add a paragraph to try to explain a little more.

  Paul



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-06  9:18             ` Paul Durrant
@ 2020-04-06  9:50               ` Julien Grall
  2020-04-06 10:34                 ` Paul Durrant
  0 siblings, 1 reply; 32+ messages in thread
From: Julien Grall @ 2020-04-06  9:50 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

Hi Paul,

On 06/04/2020 10:18, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 06 April 2020 10:08
>> To: paul@xen.org; xen-devel@lists.xenproject.org
>> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Ian
>> Jackson' <ian.jackson@eu.citrix.com>; 'Jan Beulich' <jbeulich@suse.com>; 'Stefano Stabellini'
>> <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
>> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
>>
>> Hi Paul,
>>
>> On 06/04/2020 09:27, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Julien Grall <julien@xen.org>
>>>> Sent: 03 April 2020 18:24
>>>> To: paul@xen.org; xen-devel@lists.xenproject.org
>>>> Cc: 'Andrew Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Ian
>>>> Jackson' <ian.jackson@eu.citrix.com>; 'Jan Beulich' <jbeulich@suse.com>; 'Stefano Stabellini'
>>>> <sstabellini@kernel.org>; 'Wei Liu' <wl@xen.org>
>>>> Subject: Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
>>>>
>>>> Hi Paul,
>>>>
>>>> On 03/04/2020 16:55, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>> [snip]
>>>>>>> +
>>>>>>> +#include <xen/save.h>
>>>>>>> +
>>>>>>> +struct domain_context {
>>>>>>> +    bool log;
>>>>>>> +    struct domain_save_descriptor desc;
>>>>>>> +    domain_copy_entry copy;
>>>>>>
>>>>>> As your new framework is technically an extension of existing one, it
>>>>>> would be good to explain why we diverge in the definitions.
>>>>>>
>>>>>
>>>>> I don't follow. What is diverging? I explain in the commit comment that this is a parallel
>>>> framework. Do I need to justify why it is not a carbon copy of the HVM one?
>>>>
>>>> Well, they are both restoring/saving guest state. The only difference is
>>>> the existing one is focusing on HVM state.
>>>>
>>>> So it would make sense long term to have only one hypercall and tell
>>>> what you want to save. In fact, some of the improvement here would
>>>> definitely make the HVM one nicer to use (at least in the context of LU).
>>>>
>>>
>>> I guess we could move the HVM save records over to the new framework, but it works for the moment so
>> I don't want to bring it into scope now.
>>
>> And I agree, hence why I say "long term" :).
>>
>>>
>>>>    From the commit message, it is not clear to me why a new framework and
>>>> why the infrastructure is at the same time different but not.
>>>>
>>>
>>> An alternative would be to move the HVM save code into common code and then try to adapt it. I think
>> that would result in more code churn and ultimately be harder to review. The extra infrastructure
>> introduced here is fairly minimal and, for the moment, only targeting PV state. As I said above
>> there's nothing stopping the HVM records being ported over later once any initial issues have been
>> shaken out.
>>
>> Code churn is always going to be necessary one day or another if we want to
>> consolidate the two.
>>
>> Having two frameworks is not without risks. There are a few unknowns to
>> be answered:
>>     * Is there any dependency between the two? If yes, what is the ordering?
> 
> There isn't any dependency at the moment, so need I say anything? If an ordering dependency is introduced by a future patch then surely that would be the time to call it out.

If we don't spell out a dependency now, then the risk is we have to 
modify the toolstack at the same time as spelling out the dependency.

I am not sure whether this matters though. This would only affect the 
save part as the restore part should be just reading records one by one.

> 
>>     * The name of the hypercall does not say anything about "PV". So a
>> contributor could think it would be fine to save/restore new HVM state
>> in the new generic hypercall. Is it going to be an issue? If so, how do
>> we prevent it?
> 
> The commit message says:
> 
> "Domain context is state held in the hypervisor that does not come under
> the category of 'HVM state' but is instead 'PV state' that is common
> between PV guests and enlightened HVM guests (i.e. those that have PV
> drivers) such as event channel state, grant entry state, etc."

This does not seem to cover all the cases. For instance, where would you 
save IOREQ server information?

> 
> Do you think this should also appear in a comment? Do we really care? Nothing fundamentally prevents the mechanism being used for HVM state, but that may introduce an ordering dependency.

I don't particularly like the idea of letting the contributor decide whether 
state belongs in the "HVM context" or in the "Domain context".

This is going to result in unwanted dependencies and possibly 
bikeshedding on where new state should be saved.

My preference would be to mark the existing framework as deprecated and 
force all the new states to be saved as part of the new "Domain Context".

> 
>>     * Are we going to deprecate the existing framework?
>>
> 
> There is no intention as yet.
> 
>> I am not suggesting we should not go with two frameworks, but the
>> reasons and implications are not clear to me. Hence, why I think the
>> commit message should be expanded with some rationale.
>>
> 
> Ok, I can add a paragraph to try to explain a little more.

That would be appreciated. Thank you!

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-06  9:50               ` Julien Grall
@ 2020-04-06 10:34                 ` Paul Durrant
  2020-04-06 12:43                   ` Jan Beulich
  0 siblings, 1 reply; 32+ messages in thread
From: Paul Durrant @ 2020-04-06 10:34 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: 'Stefano Stabellini', 'Wei Liu',
	'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', 'Jan Beulich'

> -----Original Message-----
[snip]
> >>     * The name of the hypercall does not say anything about "PV". So a
> >> contributor could think it would be fine to save/restore new HVM state
> >> in the new generic hypercall. Is it going to be an issue? If so, how do
> >> we prevent it?
> >
> > The commit message says:
> >
> > "Domain context is state held in the hypervisor that does not come under
> > the category of 'HVM state' but is instead 'PV state' that is common
> > between PV guests and enlightened HVM guests (i.e. those that have PV
> > drivers) such as event channel state, grant entry state, etc."
> 
> This does not seem to cover all the cases. For instance, where would you
> save IOREQ server information?
> 

Ok, I agree that is ambiguous. I'd still call it PV state but of course it only relates to HVM guests.

> >
> > Do you think this should also appear in a comment? Do we really care? Nothing fundamentally prevents
> the mechanism being used for HVM state, but that may introduce an ordering dependency.
> 
> I don't particularly like the idea to let the contributor decide where
> "HVM context" or as part of the "Domain context".
> 
> This is  going to result to unwanted dependency and possibly
> bikeshedding on where they should be saved.
> 
> My preference would be to mark the existing framework as deprecated and
> force all the new states to be saved as part of the new "Domain Context".
> 

I'm ok with that. Does anyone have an opinion to the contrary?

> >
> >>     * Are we going to deprecate the existing framework?
> >>
> >
> > There is no intention as yet.
> >
> >> I am not suggesting we should not go with two frameworks, but the
> >> reasons and implications are not clear to me. Hence, why I think the
> >> commit message should be expanded with some rationale.
> >>
> >
> > Ok, I can add a paragraph to try to explain a little more.
> 
> That would be appreciated. Thank you!
> 

I'll mention the deprecation of the HVM context interface there too.

  Paul



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
  2020-04-06 10:34                 ` Paul Durrant
@ 2020-04-06 12:43                   ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2020-04-06 12:43 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap',
	xen-devel

On 06.04.2020 12:34, Paul Durrant wrote:
>>> Do you think this should also appear in a comment? Do we really care? Nothing fundamentally prevents
>> the mechanism being used for HVM state, but that may introduce an ordering dependency.
>>
>> I don't particularly like the idea of letting the contributor decide
>> whether state belongs in the "HVM context" or in the "Domain context".
>>
>> This is going to result in unwanted dependencies and possibly
>> bikeshedding on where they should be saved.
>>
>> My preference would be to mark the existing framework as deprecated and
>> force all the new states to be saved as part of the new "Domain Context".
> 
> I'm ok with that. Does anyone have an opinion to the contrary?

Well, not in principle, but I won't have a firm opinion until I've
actually looked at (relevant parts of) the patches.

Jan



* Re: [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext
  2020-04-06  9:07     ` Paul Durrant
@ 2020-04-06 12:46       ` Jan Beulich
  0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2020-04-06 12:46 UTC (permalink / raw)
  To: paul
  Cc: 'Stefano Stabellini', 'Julien Grall',
	'Wei Liu', 'Andrew Cooper', 'Ian Jackson',
	'George Dunlap', xen-devel, 'Daniel De Graaf'

On 06.04.2020 11:07, Paul Durrant wrote:
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 01 April 2020 14:42
>>
>> On 27/03/2020 18:50, Paul Durrant wrote:
>> Also, in the case of XEN_DOMCTL_setdomaincontext, the buffer is not
>> meant to be modified by the hypervisor. So I would rather introduce two
>> separate structure so we can enforce the const.
> 
> Can handles be meaningfully const?

Yes, see e.g. Julien's recent patch to force honoring const-ness
in some cases where it wasn't enforced so far. Luckily, no
violators appear to have crept in.

Jan



end of thread, other threads:[~2020-04-06 18:04 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
2020-04-01 12:00   ` Julien Grall
2020-04-01 12:07     ` Jan Beulich
2020-04-01 12:16       ` Julien Grall
2020-04-01 12:23         ` Jan Beulich
2020-04-03 15:55     ` Paul Durrant
2020-04-03 17:24       ` Julien Grall
2020-04-06  8:27         ` Paul Durrant
2020-04-06  9:08           ` Julien Grall
2020-04-06  9:18             ` Paul Durrant
2020-04-06  9:50               ` Julien Grall
2020-04-06 10:34                 ` Paul Durrant
2020-04-06 12:43                   ` Jan Beulich
2020-04-01 14:50   ` Jan Beulich
2020-04-02  9:58     ` Paul Durrant
2020-04-02 11:08       ` Jan Beulich
2020-04-02 14:00         ` Paul Durrant
2020-04-03  8:38           ` Jan Beulich
2020-03-27 18:50 ` [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext Paul Durrant
2020-04-01 13:42   ` Julien Grall
2020-04-06  9:07     ` Paul Durrant
2020-04-06 12:46       ` Jan Beulich
2020-04-01 13:46   ` Julien Grall
2020-03-27 18:50 ` [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context Paul Durrant
2020-03-30 10:54   ` Jan Beulich
2020-04-03 15:20     ` Paul Durrant
2020-04-03 15:29       ` Jan Beulich
2020-04-03 16:08         ` Paul Durrant
2020-03-27 18:50 ` [Xen-devel] [PATCH 4/5] common/domain: add a domain context record for shared_info Paul Durrant
2020-04-01 14:27   ` Julien Grall
2020-03-27 18:50 ` [Xen-devel] [PATCH 5/5] tools/libxc: make use of domain context SHARED_INFO record Paul Durrant
