All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/4] remove libxenctrl usage from xenstored
@ 2021-09-14 12:35 Juergen Gross
  2021-09-14 12:35 ` [PATCH RFC 1/4] xen: add a domain unique id to each domain Juergen Gross
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Juergen Gross @ 2021-09-14 12:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu,
	Daniel De Graaf, Daniel P. Smith, Volodymyr Babchuk,
	Roger Pau Monné

Xenstored is using libxenctrl for only one purpose: to get information
about state of domains.

This RFC patch series is removing that dependency by introducing a new
stable interface which can be used by xenstored instead.

I have chosen to add a new hypercall with sub-commands. This can be
changed, of course, but I thought this would be a good step towards
stable control interfaces for the Xen tools.

I have done only very little testing (basically normal domain creation
and deletion). In case my approach is being accepted I'll do a more
thorough test.

What is missing right now is some cleanup (e.g. the need to allow
the dominfo operation of the domctl hypercall for a stubdom based
Xenstore could be removed). For Mini-OS some more work is needed to
remove the dependency to libxenctrl for xenstore-stubdom, as Mini-OS
can't be configured to use libxenevent and libxengnttab, but not
libxenctrl today. Again, I'll do that cleanup in case the series is
generally accepted.

Juergen Gross (4):
  xen: add a domain unique id to each domain
  xen: add bitmap to indicate per-domain state changes
  xen: add new stable control hypercall
  tools/xenstore: use new stable interface instead of libxenctrl

 tools/flask/policy/modules/dom0.te  |   2 +-
 tools/xenstore/Makefile             |   3 +-
 tools/xenstore/xenstored_control.c  |  14 +-
 tools/xenstore/xenstored_domain.c   | 219 +++++++++++++++-------------
 xen/arch/arm/traps.c                |   1 +
 xen/arch/x86/hvm/hypercall.c        |   1 +
 xen/arch/x86/hypercall.c            |   1 +
 xen/arch/x86/pv/hypercall.c         |   1 +
 xen/common/Makefile                 |   1 +
 xen/common/control.c                |  52 +++++++
 xen/common/domain.c                 |  76 ++++++++++
 xen/common/event_channel.c          |  11 +-
 xen/include/Makefile                |   1 +
 xen/include/public/control.h        |  80 ++++++++++
 xen/include/public/xen.h            |   1 +
 xen/include/xen/event.h             |   6 +
 xen/include/xen/hypercall.h         |   5 +
 xen/include/xen/sched.h             |   7 +
 xen/include/xsm/dummy.h             |  14 ++
 xen/include/xsm/xsm.h               |   6 +
 xen/xsm/dummy.c                     |   1 +
 xen/xsm/flask/hooks.c               |   6 +
 xen/xsm/flask/policy/access_vectors |   2 +
 23 files changed, 402 insertions(+), 109 deletions(-)
 create mode 100644 xen/common/control.c
 create mode 100644 xen/include/public/control.h

-- 
2.26.2



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-09-14 12:35 [PATCH RFC 0/4] remove libxenctrl usage from xenstored Juergen Gross
@ 2021-09-14 12:35 ` Juergen Gross
  2021-11-22 10:01   ` Jan Beulich
  2021-11-22 11:42   ` Julien Grall
  2021-09-14 12:35 ` [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes Juergen Gross
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Juergen Gross @ 2021-09-14 12:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

Xenstore is referencing domains by their domid, but reuse of a domid
can lead to the situation that Xenstore can't tell whether a domain
with that domid has been deleted and created again without Xenstore
noticing the domain is a new one now.

Add a global domain creation unique id which is updated when creating
a new domain, and store that value in struct domain of the new domain.
The global unique id is initialized with the system time and updates
are done via the xorshift algorithm which is used for pseudo random
number generation, too (see https://en.wikipedia.org/wiki/Xorshift).

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/domain.c     | 16 ++++++++++++++++
 xen/include/xen/sched.h |  3 +++
 2 files changed, 19 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 6ee5d033b0..755349b93f 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
 
 bool __read_mostly vmtrace_available;
 
+/* Unique domain identifier, protected by domctl lock. */
+static uint64_t unique_id;
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
@@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
     free_domain_struct(d);
 }
 
+static uint64_t get_unique_id(void)
+{
+    uint64_t x = unique_id ? : NOW();
+
+    x ^= x << 13;
+    x ^= x >> 7;
+    x ^= x << 17;
+    unique_id = x;
+
+    return x;
+}
+
 static int sanitise_domain_config(struct xen_domctl_createdomain *config)
 {
     bool hvm = config->flags & XEN_DOMCTL_CDF_hvm;
@@ -552,6 +567,7 @@ struct domain *domain_create(domid_t domid,
 
     /* Sort out our idea of is_system_domain(). */
     d->domain_id = domid;
+    d->unique_id = get_unique_id();
 
     /* Debug sanity. */
     ASSERT(is_system_domain(d) ? config == NULL : config != NULL);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 28146ee404..b827c5af8d 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -362,6 +362,9 @@ struct domain
     domid_t          domain_id;
 
     unsigned int     max_vcpus;
+
+    uint64_t         unique_id;       /* Unique domain identifier */
+
     struct vcpu    **vcpu;
 
     shared_info_t   *shared_info;     /* shared data area */
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes
  2021-09-14 12:35 [PATCH RFC 0/4] remove libxenctrl usage from xenstored Juergen Gross
  2021-09-14 12:35 ` [PATCH RFC 1/4] xen: add a domain unique id to each domain Juergen Gross
@ 2021-09-14 12:35 ` Juergen Gross
  2021-09-23  9:16   ` Julien Grall
  2021-11-22 10:41   ` Jan Beulich
  2021-09-14 12:35 ` [PATCH RFC 3/4] xen: add new stable control hypercall Juergen Gross
  2021-09-14 12:36 ` [PATCH RFC 4/4] tools/xenstore: use new stable interface instead of libxenctrl Juergen Gross
  3 siblings, 2 replies; 25+ messages in thread
From: Juergen Gross @ 2021-09-14 12:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Andrew Cooper, George Dunlap, Ian Jackson,
	Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu

Add a bitmap with one bit per possible domid indicating the respective
domain has changed its state (created, deleted, dying, crashed,
shutdown).

Registering the VIRQ_DOM_EXC event will result in setting the bits for
all existing domains and resetting all other bits.

Resetting a bit will be done in a future patch.

This information is needed for Xenstore to keep track of all domains.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/common/domain.c        | 22 ++++++++++++++++++++++
 xen/common/event_channel.c |  2 ++
 xen/include/xen/sched.h    |  2 ++
 3 files changed, 26 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 755349b93f..14a427e2ef 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -87,6 +87,22 @@ bool __read_mostly vmtrace_available;
 /* Unique domain identifier, protected by domctl lock. */
 static uint64_t unique_id;
 
+static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
+
+void domain_reset_states(void)
+{
+    struct domain *d;
+
+    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
+
+    rcu_read_lock(&domlist_read_lock);
+
+    for_each_domain ( d )
+        set_bit(d->domain_id, dom_state_changed);
+
+    rcu_read_unlock(&domlist_read_lock);
+}
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
@@ -101,6 +117,7 @@ static void __domain_finalise_shutdown(struct domain *d)
             return;
 
     d->is_shut_down = 1;
+    set_bit(d->domain_id, dom_state_changed);
     if ( (d->shutdown_code == SHUTDOWN_suspend) && d->suspend_evtchn )
         evtchn_send(d, d->suspend_evtchn);
     else
@@ -720,6 +737,8 @@ struct domain *domain_create(domid_t domid,
         rcu_assign_pointer(domain_hash[DOMAIN_HASH(domid)], d);
         spin_unlock(&domlist_update_lock);
 
+        set_bit(d->domain_id, dom_state_changed);
+
         memcpy(d->handle, config->handle, sizeof(d->handle));
     }
 
@@ -954,6 +973,7 @@ int domain_kill(struct domain *d)
         /* Mem event cleanup has to go here because the rings 
          * have to be put before we call put_domain. */
         vm_event_cleanup(d);
+        set_bit(d->domain_id, dom_state_changed);
         put_domain(d);
         send_global_virq(VIRQ_DOM_EXC);
         /* fallthrough */
@@ -1141,6 +1161,8 @@ static void complete_domain_destroy(struct rcu_head *head)
 
     xfree(d->vcpu);
 
+    set_bit(d->domain_id, dom_state_changed);
+
     _domain_destroy(d);
 
     send_global_virq(VIRQ_DOM_EXC);
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index da88ad141a..e2a416052b 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1218,6 +1218,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         rc = evtchn_bind_virq(&bind_virq, 0);
         if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
             rc = -EFAULT; /* Cleaning up here would be a mess! */
+        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
+            domain_reset_states();
         break;
     }
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b827c5af8d..2ae26bc50e 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -730,6 +730,8 @@ void domain_resume(struct domain *d);
 
 int domain_soft_reset(struct domain *d, bool resuming);
 
+void domain_reset_states(void);
+
 int vcpu_start_shutdown_deferral(struct vcpu *v);
 void vcpu_end_shutdown_deferral(struct vcpu *v);
 
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-09-14 12:35 [PATCH RFC 0/4] remove libxenctrl usage from xenstored Juergen Gross
  2021-09-14 12:35 ` [PATCH RFC 1/4] xen: add a domain unique id to each domain Juergen Gross
  2021-09-14 12:35 ` [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes Juergen Gross
@ 2021-09-14 12:35 ` Juergen Gross
  2021-11-22 15:39   ` Jan Beulich
  2021-09-14 12:36 ` [PATCH RFC 4/4] tools/xenstore: use new stable interface instead of libxenctrl Juergen Gross
  3 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-09-14 12:35 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Daniel De Graaf, Daniel P. Smith, Ian Jackson,
	Wei Liu, Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Volodymyr Babchuk, Roger Pau Monné

The sysctl and domctl hypercalls are not stable, so tools using those
need to be in sync with the hypervisor.

In order to decouple (some) tools from the hypervisor add a new stable
hypercall __HYPERVISOR_control_op with (for now) two sub-options:

- XEN_CONTROL_OP_get_version for retrieving the max version of the new
  hypercall supported by the hypervisor
- XEN_CONTROL_OP_get_state_changed_domain for retrieving some state
  data of a domain which changed state (this is needed by Xenstore).
  The returned state just contains the domid, the domain unique id,
  and some flags (existing, shutdown, dying).

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/flask/policy/modules/dom0.te  |  2 +-
 xen/arch/arm/traps.c                |  1 +
 xen/arch/x86/hvm/hypercall.c        |  1 +
 xen/arch/x86/hypercall.c            |  1 +
 xen/arch/x86/pv/hypercall.c         |  1 +
 xen/common/Makefile                 |  1 +
 xen/common/control.c                | 52 +++++++++++++++++++
 xen/common/domain.c                 | 38 ++++++++++++++
 xen/common/event_channel.c          |  9 +++-
 xen/include/Makefile                |  1 +
 xen/include/public/control.h        | 80 +++++++++++++++++++++++++++++
 xen/include/public/xen.h            |  1 +
 xen/include/xen/event.h             |  6 +++
 xen/include/xen/hypercall.h         |  5 ++
 xen/include/xen/sched.h             |  2 +
 xen/include/xsm/dummy.h             | 14 +++++
 xen/include/xsm/xsm.h               |  6 +++
 xen/xsm/dummy.c                     |  1 +
 xen/xsm/flask/hooks.c               |  6 +++
 xen/xsm/flask/policy/access_vectors |  2 +
 20 files changed, 227 insertions(+), 3 deletions(-)
 create mode 100644 xen/common/control.c
 create mode 100644 xen/include/public/control.h

diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te
index 0a63ce15b6..5c5e4af56c 100644
--- a/tools/flask/policy/modules/dom0.te
+++ b/tools/flask/policy/modules/dom0.te
@@ -11,7 +11,7 @@ allow dom0_t xen_t:xen {
 	mtrr_del mtrr_read microcode physinfo quirk writeconsole readapic
 	writeapic privprofile nonprivprofile kexec firmware sleep frequency
 	getidle debug getcpuinfo heap pm_op mca_op lockprof cpupool_op
-	getscheduler setscheduler hypfs_op
+	getscheduler setscheduler hypfs_op control_op
 };
 allow dom0_t xen_t:xen2 {
 	resource_op psr_cmt_op psr_alloc pmu_ctrl get_symbol
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 219ab3c3fb..78802a5660 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1397,6 +1397,7 @@ static arm_hypercall_t arm_hypercall_table[] = {
 #ifdef CONFIG_IOREQ_SERVER
     HYPERCALL(dm_op, 3),
 #endif
+    HYPERCALL(control_op, 2),
 };
 
 #ifndef NDEBUG
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 261d8ee8a4..007fcf5cb0 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -154,6 +154,7 @@ static const struct {
 #ifdef CONFIG_HYPFS
     HYPERCALL(hypfs_op),
 #endif
+    HYPERCALL(control_op),
     HYPERCALL(arch_1)
 };
 
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 2370d31d3f..53523bac43 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -74,6 +74,7 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(hvm_op, 2),
     ARGS(dm_op, 3),
     ARGS(hypfs_op, 5),
+    ARGS(control_op, 2),
     ARGS(mca, 1),
     ARGS(arch_1, 1),
 };
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 9765e674cf..8f81c1e645 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -99,6 +99,7 @@ const pv_hypercall_table_t pv_hypercall_table[] = {
 #ifdef CONFIG_HYPFS
     HYPERCALL(hypfs_op),
 #endif
+    HYPERCALL(control_op),
     HYPERCALL(mca),
 #ifndef CONFIG_PV_SHIM_EXCLUSIVE
     HYPERCALL(arch_1),
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 54de70d422..3c44def563 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,6 +1,7 @@
 obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
 obj-$(CONFIG_HYPFS_CONFIG) += config_data.o
+obj-y += control.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
 obj-y += cpu.o
 obj-$(CONFIG_DEBUG_TRACE) += debugtrace.o
diff --git a/xen/common/control.c b/xen/common/control.c
new file mode 100644
index 0000000000..f92178d461
--- /dev/null
+++ b/xen/common/control.c
@@ -0,0 +1,52 @@
+/******************************************************************************
+ *
+ * control.c
+ *
+ * Entry point of the stable __HYPERVISOR_control_op hypercall.
+ */
+#include <xen/err.h>
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+#include <public/control.h>
+
+long do_control_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+    int ret = 0;
+
+    if ( xsm_control_op(XSM_OTHER, cmd) )
+        return -EPERM;
+
+    switch ( cmd )
+    {
+    case XEN_CONTROL_OP_get_version:
+        if ( !guest_handle_is_null(arg) )
+            return -EINVAL;
+
+        ret = XEN_CONTROL_VERSION;
+        break;
+
+    case XEN_CONTROL_OP_get_state_changed_domain:
+    {
+        struct xen_control_changed_domain info = { };
+
+        if ( get_global_virq_handler(VIRQ_DOM_EXC) != current->domain )
+            return -EPERM;
+
+        ret = domain_get_dom_state_changed(&info);
+        if ( ret )
+            break;
+
+        if ( copy_to_guest(arg, &info, 1) )
+            ret = -EFAULT;
+
+        break;
+    }
+
+    default:
+        ret = -EOPNOTSUPP;
+        break;
+    }
+
+    return ret;
+}
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 14a427e2ef..d6d729c485 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -36,6 +36,7 @@
 #include <asm/debugger.h>
 #include <asm/p2m.h>
 #include <asm/processor.h>
+#include <public/control.h>
 #include <public/sched.h>
 #include <public/sysctl.h>
 #include <public/vcpu.h>
@@ -103,6 +104,43 @@ void domain_reset_states(void)
     rcu_read_unlock(&domlist_read_lock);
 }
 
+int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
+{
+    unsigned int dom;
+    struct domain *d;
+
+    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
+            DOMID_FIRST_RESERVED )
+    {
+        d = rcu_lock_domain_by_id(dom);
+
+        if ( test_and_clear_bit(dom, dom_state_changed) )
+        {
+            info->domid = dom;
+            if ( d )
+            {
+                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
+                if ( d->is_shut_down )
+                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
+                if ( d->is_dying == DOMDYING_dead )
+                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
+                info->unique_id = d->unique_id;
+
+                rcu_unlock_domain(d);
+            }
+
+            return 0;
+        }
+
+        if ( d )
+        {
+            rcu_unlock_domain(d);
+        }
+    }
+
+    return -ENOENT;
+}
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e2a416052b..b5f377c76f 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -894,11 +894,16 @@ static struct domain *global_virq_handlers[NR_VIRQS] __read_mostly;
 
 static DEFINE_SPINLOCK(global_virq_handlers_lock);
 
-void send_global_virq(uint32_t virq)
+struct domain *get_global_virq_handler(uint32_t virq)
 {
     ASSERT(virq_is_global(virq));
 
-    send_guest_global_virq(global_virq_handlers[virq] ?: hardware_domain, virq);
+    return global_virq_handlers[virq] ?: hardware_domain;
+}
+
+void send_global_virq(uint32_t virq)
+{
+    send_guest_global_virq(get_global_virq_handler(virq), virq);
 }
 
 int set_global_virq_handler(struct domain *d, uint32_t virq)
diff --git a/xen/include/Makefile b/xen/include/Makefile
index 95daa8a289..adf61f40c3 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -4,6 +4,7 @@ compat-arch-$(CONFIG_X86) := x86_32
 
 headers-y := \
     compat/arch-$(compat-arch-y).h \
+    compat/control.h \
     compat/elfnote.h \
     compat/event_channel.h \
     compat/features.h \
diff --git a/xen/include/public/control.h b/xen/include/public/control.h
new file mode 100644
index 0000000000..0b2a032e96
--- /dev/null
+++ b/xen/include/public/control.h
@@ -0,0 +1,80 @@
+/******************************************************************************
+ * Xen Control Hypercall
+ *
+ * Copyright (c) 2021, SUSE Software Solutions Germany GmbH
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __XEN_PUBLIC_CONTROL_H__
+#define __XEN_PUBLIC_CONTROL_H__
+
+#include "xen.h"
+
+/*
+ * Definitions for the __HYPERVISOR_control_op hypercall.
+ */
+
+/* Highest version number of the control interface currently defined. */
+#define XEN_CONTROL_VERSION      1
+
+/*
+ * Hypercall operations.
+ */
+
+/*
+ * XEN_CONTROL_OP_get_version
+ *
+ * Read highest interface version supported by the hypervisor.
+ *
+ * arg: NULL
+ *
+ * Possible return values:
+ * >0: highest supported interface version
+ * <0: negative Xen errno value
+ */
+#define XEN_CONTROL_OP_get_version                  0
+
+/*
+ * XEN_CONTROL_OP_get_state_changed_domain
+ *
+ * Get information about a domain having changed state and reset the state
+ * change indicator for that domain. This function is usable only by a domain
+ * having registered the VIRQ_DOM_EXC event (normally Xenstore).
+ *
+ * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
+ *
+ * Possible return values:
+ * 0: success
+ * <0 : negative Xen errno value
+ */
+#define XEN_CONTROL_OP_get_state_changed_domain     1
+struct xen_control_changed_domain {
+    domid_t domid;
+    uint16_t state;
+#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
+#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
+#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
+    uint32_t pad1;           /* Returned as 0. */
+    uint64_t unique_id;      /* Unique domain identifier. */
+    uint64_t pad2[6];        /* Returned as 0. */
+};
+
+#endif /* __XEN_PUBLIC_CONTROL_H__ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index e373592c33..1344fbcc36 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -131,6 +131,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
 #define __HYPERVISOR_hypfs_op             42
+#define __HYPERVISOR_control_op           43
 
 /* Architecture-specific hypercall definitions. */
 #define __HYPERVISOR_arch_0               48
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 21c95e14fd..c770f29e8e 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -43,6 +43,12 @@ void send_guest_global_virq(struct domain *d, uint32_t virq);
  */
 int set_global_virq_handler(struct domain *d, uint32_t virq);
 
+/*
+ * get_global_virq_handler: Get domain handling a global virq.
+ *  @virq:     Virtual IRQ number (VIRQ_*), must be global
+ */
+struct domain *get_global_virq_handler(uint32_t virq);
+
 /*
  * send_guest_pirq:
  *  @d:        Domain to which physical IRQ should be sent
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index 3771487a30..1876e0b26f 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -160,6 +160,11 @@ do_hypfs_op(
     unsigned long arg4);
 #endif
 
+extern long
+do_control_op(
+    unsigned int cmd,
+    XEN_GUEST_HANDLE_PARAM(void) arg);
+
 #ifdef CONFIG_COMPAT
 
 extern int
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 2ae26bc50e..2795906a8f 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -24,6 +24,7 @@
 #include <xen/vpci.h>
 #include <xen/wait.h>
 #include <public/xen.h>
+#include <public/control.h>
 #include <public/domctl.h>
 #include <public/sysctl.h>
 #include <public/vcpu.h>
@@ -731,6 +732,7 @@ void domain_resume(struct domain *d);
 int domain_soft_reset(struct domain *d, bool resuming);
 
 void domain_reset_states(void);
+int domain_get_dom_state_changed(struct xen_control_changed_domain *info);
 
 int vcpu_start_shutdown_deferral(struct vcpu *v);
 void vcpu_end_shutdown_deferral(struct vcpu *v);
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 214b5408b1..e3c9b27acc 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -17,6 +17,7 @@
 
 #include <xen/sched.h>
 #include <xsm/xsm.h>
+#include <public/control.h>
 #include <public/hvm/params.h>
 
 /* Cannot use BUILD_BUG_ON here because the expressions we check are not
@@ -157,6 +158,19 @@ static XSM_INLINE int xsm_sysctl(XSM_DEFAULT_ARG int cmd)
     return xsm_default_action(action, current->domain, NULL);
 }
 
+static XSM_INLINE int xsm_control_op(XSM_DEFAULT_ARG uint32_t cmd)
+{
+    XSM_ASSERT_ACTION(XSM_OTHER);
+    switch ( cmd )
+    {
+    case XEN_CONTROL_OP_get_version:
+    case XEN_CONTROL_OP_get_state_changed_domain:
+        return xsm_default_action(XSM_XS_PRIV, current->domain, NULL);
+    default:
+        return xsm_default_action(XSM_PRIV, current->domain, NULL);
+    }
+}
+
 static XSM_INLINE int xsm_readconsole(XSM_DEFAULT_ARG uint32_t clear)
 {
     XSM_ASSERT_ACTION(XSM_HOOK);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 16b90be2c5..21965d3df9 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -125,6 +125,7 @@ struct xsm_operations {
 
     int (*page_offline)(uint32_t cmd);
     int (*hypfs_op)(void);
+    int (*control_op)(uint32_t cmd);
 
     long (*do_xsm_op) (XEN_GUEST_HANDLE_PARAM(void) op);
 #ifdef CONFIG_COMPAT
@@ -539,6 +540,11 @@ static inline int xsm_hypfs_op(xsm_default_t def)
     return xsm_ops->hypfs_op();
 }
 
+static inline int xsm_control_op(xsm_default_t def, uint32_t cmd)
+{
+    return xsm_ops->control_op(cmd);
+}
+
 static inline long xsm_do_xsm_op (XEN_GUEST_HANDLE_PARAM(void) op)
 {
     return xsm_ops->do_xsm_op(op);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index de44b10130..0190463523 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -104,6 +104,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
 
     set_to_dummy_if_null(ops, page_offline);
     set_to_dummy_if_null(ops, hypfs_op);
+    set_to_dummy_if_null(ops, control_op);
     set_to_dummy_if_null(ops, hvm_param);
     set_to_dummy_if_null(ops, hvm_control);
     set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 1465db125a..5966e35c5c 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1181,6 +1181,11 @@ static inline int flask_hypfs_op(void)
     return domain_has_xen(current->domain, XEN__HYPFS_OP);
 }
 
+static inline int flask_control_op(void)
+{
+    return domain_has_xen(current->domain, XEN__CONTROL_OP);
+}
+
 static int flask_add_to_physmap(struct domain *d1, struct domain *d2)
 {
     return domain_has_perm(d1, d2, SECCLASS_MMU, MMU__PHYSMAP);
@@ -1815,6 +1820,7 @@ static struct xsm_operations flask_ops = {
 
     .page_offline = flask_page_offline,
     .hypfs_op = flask_hypfs_op,
+    .control_op = flask_control_op,
     .hvm_param = flask_hvm_param,
     .hvm_control = flask_hvm_param,
     .hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 6359c7fc87..3ca91ac23c 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -69,6 +69,8 @@ class xen
     cpupool_op
 # hypfs hypercall
     hypfs_op
+# control hypercall
+    control_op
 # XEN_SYSCTL_scheduler_op with XEN_DOMCTL_SCHEDOP_getinfo, XEN_SYSCTL_sched_id, XEN_DOMCTL_SCHEDOP_getvcpuinfo
     getscheduler
 # XEN_SYSCTL_scheduler_op with XEN_DOMCTL_SCHEDOP_putinfo, XEN_DOMCTL_SCHEDOP_putvcpuinfo
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH RFC 4/4] tools/xenstore: use new stable interface instead of libxenctrl
  2021-09-14 12:35 [PATCH RFC 0/4] remove libxenctrl usage from xenstored Juergen Gross
                   ` (2 preceding siblings ...)
  2021-09-14 12:35 ` [PATCH RFC 3/4] xen: add new stable control hypercall Juergen Gross
@ 2021-09-14 12:36 ` Juergen Gross
  3 siblings, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2021-09-14 12:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross, Ian Jackson, Wei Liu, Julien Grall

Xenstore is using libxenctrl only for obtaining state information
about domains.

Use the new stable interface XEN_CONTROL_OP_get_state_changed_domain
instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/Makefile            |   3 +-
 tools/xenstore/xenstored_control.c |  14 +-
 tools/xenstore/xenstored_domain.c  | 219 ++++++++++++++++-------------
 3 files changed, 130 insertions(+), 106 deletions(-)

diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 292b478fa1..b78e29470b 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -11,6 +11,7 @@ CFLAGS += -include $(XEN_ROOT)/tools/config.h
 CFLAGS += -I./include
 CFLAGS += $(CFLAGS_libxenevtchn)
 CFLAGS += $(CFLAGS_libxenctrl)
+CFLAGS += $(CFLAGS_libxencall)
 CFLAGS += $(CFLAGS_libxenguest)
 CFLAGS += $(CFLAGS_libxentoolcore)
 CFLAGS += -DXEN_LIB_STORED="\"$(XEN_LIB_STORED)\""
@@ -70,7 +71,7 @@ endif
 $(XENSTORED_OBJS): CFLAGS += $(CFLAGS_libxengnttab)
 
 xenstored: $(XENSTORED_OBJS)
-	$(CC) $^ $(LDFLAGS) $(LDLIBS_libxenevtchn) $(LDLIBS_libxengnttab) $(LDLIBS_libxenctrl) $(LDLIBS_xenstored) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
+	$(CC) $^ $(LDFLAGS) $(LDLIBS_libxenevtchn) $(LDLIBS_libxengnttab) $(LDLIBS_libxencall) $(LDLIBS_xenstored) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xenstored.a: $(XENSTORED_OBJS)
 	$(AR) cr $@ $^
diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 7b4300ef77..d767aea0f9 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -594,6 +594,13 @@ void lu_read_state(void)
 	void *ctx = talloc_new(NULL); /* Work context for subfunctions. */
 	struct xs_state_preamble *pre;
 
+	/*
+	 * We may have missed the VIRQ_DOM_EXC notification and a domain may
+	 * have died while we were live-updating. So check all the domains are
+	 * still alive. This will pre-initialize all domain structures.
+	 */
+	check_domains();
+
 	syslog(LOG_INFO, "live-update: read state\n");
 	lu_get_dump_state(&state);
 	if (state.size == 0)
@@ -634,13 +641,6 @@ void lu_read_state(void)
 	lu_close_dump_state(&state);
 
 	talloc_free(ctx);
-
-	/*
-	 * We may have missed the VIRQ_DOM_EXC notification and a domain may
-	 * have died while we were live-updating. So check all the domains are
-	 * still alive.
-	 */
-	check_domains();
 }
 
 static const char *lu_activate_binary(const void *ctx)
diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index 47e9107c14..2801e4b24f 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -31,11 +31,12 @@
 #include "xenstored_transaction.h"
 #include "xenstored_watch.h"
 
+#include <xencall.h>
 #include <xenevtchn.h>
-#include <xenctrl.h>
+#include <xen/control.h>
 #include <xen/grant_table.h>
 
-static xc_interface **xc_handle;
+static xencall_handle **xc_handle;
 xengnttab_handle **xgt_handle;
 static evtchn_port_t virq_port;
 
@@ -50,6 +51,7 @@ struct domain
 
 	/* The id of this domain */
 	unsigned int domid;
+	uint64_t unique_id;
 
 	/* Event channel port */
 	evtchn_port_t port;
@@ -238,54 +240,114 @@ static int destroy_domain(void *_domain)
 	return 0;
 }
 
-static bool get_domain_info(unsigned int domid, xc_dominfo_t *dominfo)
+static bool get_domain_info(struct xen_control_changed_domain *dominfo)
 {
-	return xc_domain_getinfo(*xc_handle, domid, 1, dominfo) == 1 &&
-	       dominfo->domid == domid;
+	struct xen_control_changed_domain *buf;
+	int ret;
+
+	buf = xencall_alloc_buffer(*xc_handle, sizeof(*buf));
+	if (!buf)
+		return false;
+	ret = xencall2(*xc_handle, __HYPERVISOR_control_op,
+		       XEN_CONTROL_OP_get_state_changed_domain,
+		       (unsigned long)buf);
+	*dominfo = *buf;
+	xencall_free_buffer(*xc_handle, buf);
+
+	return ret == 0;
 }
 
-void check_domains(void)
+static struct domain *find_domain_struct(unsigned int domid)
+{
+	struct domain *i;
+
+	list_for_each_entry(i, &domains, list) {
+		if (i->domid == domid)
+			return i;
+	}
+	return NULL;
+}
+
+static struct domain *alloc_domain(unsigned int domid)
 {
-	xc_dominfo_t dominfo;
 	struct domain *domain;
+
+	domain = talloc_zero(talloc_autofree_context(), struct domain);
+	if (!domain) {
+		errno = ENOMEM;
+		return NULL;
+	}
+
+	domain->domid = domid;
+	domain->generation = generation;
+	domain->introduced = false;
+
+	talloc_set_destructor(domain, destroy_domain);
+
+	list_add(&domain->list, &domains);
+
+	return domain;
+}
+
+static void domain_drop(struct domain *domain)
+{
 	struct connection *conn;
+
+	/* domain is a talloc child of domain->conn. */
+	conn = domain->conn;
+	domain->conn = NULL;
+	talloc_unlink(talloc_autofree_context(), conn);
+}
+
+void check_domains(void)
+{
+	struct xen_control_changed_domain dominfo;
+	struct domain *domain;
 	int notify = 0;
-	bool dom_valid;
 
- again:
-	list_for_each_entry(domain, &domains, list) {
-		dom_valid = get_domain_info(domain->domid, &dominfo);
-		if (!domain->introduced) {
-			if (!dom_valid) {
+	while (get_domain_info(&dominfo)) {
+		domain = find_domain_struct(dominfo.domid);
+
+		if (!dominfo.state) {
+			if (domain && !domain->introduced)
 				talloc_free(domain);
-				goto again;
-			}
 			continue;
 		}
-		if (dom_valid) {
-			if ((dominfo.crashed || dominfo.shutdown)
-			    && !domain->shutdown) {
-				domain->shutdown = true;
-				notify = 1;
-			}
-			/*
-			 * On Restore, we may have been unable to remap the
-			 * interface and the port. As we don't know whether
-			 * this was because of a dying domain, we need to
-			 * check if the interface and port are still valid.
-			 */
-			if (!dominfo.dying && domain->port &&
-			    domain->interface)
+
+		if (domain && domain->unique_id &&
+		    domain->unique_id != dominfo.unique_id) {
+			if (domain->conn)
+				domain_drop(domain);
+			else
+				talloc_free(domain);
+			domain = NULL;
+		}
+
+		if (!domain) {
+			domain = alloc_domain(dominfo.domid);
+			if (!domain)
 				continue;
+			domain->unique_id = dominfo.unique_id;
 		}
-		if (domain->conn) {
-			/* domain is a talloc child of domain->conn. */
-			conn = domain->conn;
-			domain->conn = NULL;
-			talloc_unlink(talloc_autofree_context(), conn);
-			notify = 0; /* destroy_domain() fires the watch */
-			goto again;
+
+		if ((dominfo.state & XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN) &&
+		    !domain->shutdown) {
+			domain->shutdown = true;
+			notify = 1;
 		}
+
+		/*
+		 * On Restore, we may have been unable to remap the
+		 * interface and the port. As we don't know whether
+		 * this was because of a dying domain, we need to
+		 * check if the interface and port are still valid.
+		 */
+		if (!(dominfo.state & XEN_CONTROL_CHANGEDDOM_STATE_DYING) &&
+		    domain->port && domain->interface)
+			continue;
+
+		if (domain->conn)
+			domain_drop(domain);
 	}
 
 	if (notify)
@@ -323,46 +385,6 @@ static char *talloc_domain_path(void *context, unsigned int domid)
 	return talloc_asprintf(context, "/local/domain/%u", domid);
 }
 
-static struct domain *find_domain_struct(unsigned int domid)
-{
-	struct domain *i;
-
-	list_for_each_entry(i, &domains, list) {
-		if (i->domid == domid)
-			return i;
-	}
-	return NULL;
-}
-
-static struct domain *alloc_domain(const void *context, unsigned int domid)
-{
-	struct domain *domain;
-
-	domain = talloc(context, struct domain);
-	if (!domain) {
-		errno = ENOMEM;
-		return NULL;
-	}
-
-	domain->domid = domid;
-	domain->generation = generation;
-	domain->introduced = false;
-
-	talloc_set_destructor(domain, destroy_domain);
-
-	list_add(&domain->list, &domains);
-
-	return domain;
-}
-
-static struct domain *find_or_alloc_domain(const void *ctx, unsigned int domid)
-{
-	struct domain *domain;
-
-	domain = find_domain_struct(domid);
-	return domain ? : alloc_domain(ctx, domid);
-}
-
 static int new_domain(struct domain *domain, int port, bool restore)
 {
 	int rc;
@@ -443,9 +465,13 @@ static struct domain *introduce_domain(const void *ctx,
 	struct xenstore_domain_interface *interface;
 	bool is_master_domain = (domid == xenbus_master_domid());
 
-	domain = find_or_alloc_domain(ctx, domid);
-	if (!domain)
+	check_domains();
+
+	domain = find_domain_struct(domid);
+	if (!domain) {
+		errno = ENOENT;
 		return NULL;
+	}
 
 	if (!domain->introduced) {
 		interface = is_master_domain ? xenbus_map()
@@ -650,7 +676,7 @@ int do_reset_watches(struct connection *conn, struct buffered_data *in)
 
 static int close_xc_handle(void *_handle)
 {
-	xc_interface_close(*(xc_interface**)_handle);
+	xencall_close(*(xencall_handle **)_handle);
 	return 0;
 }
 
@@ -741,11 +767,11 @@ void domain_init(int evtfd)
 {
 	int rc;
 
-	xc_handle = talloc(talloc_autofree_context(), xc_interface*);
+	xc_handle = talloc(talloc_autofree_context(), xencall_handle *);
 	if (!xc_handle)
 		barf_perror("Failed to allocate domain handle");
 
-	*xc_handle = xc_interface_open(0,0,0);
+	*xc_handle = xencall_open(NULL, 0);
 	if (!*xc_handle)
 		barf_perror("Failed to open connection to hypervisor");
 
@@ -821,7 +847,6 @@ void domain_entry_inc(struct connection *conn, struct node *node)
  * count (used for testing whether a node permission is older than a domain).
  *
  * Return values:
- * -1: error
  *  0: domain has higher generation count (it is younger than a node with the
  *     given count), or domain isn't existing any longer
  *  1: domain is older than the node
@@ -829,7 +854,6 @@ void domain_entry_inc(struct connection *conn, struct node *node)
 static int chk_domain_generation(unsigned int domid, uint64_t gen)
 {
 	struct domain *d;
-	xc_dominfo_t dominfo;
 
 	if (!xc_handle && domid == 0)
 		return 1;
@@ -838,11 +862,9 @@ static int chk_domain_generation(unsigned int domid, uint64_t gen)
 	if (d)
 		return (d->generation <= gen) ? 1 : 0;
 
-	if (!get_domain_info(domid, &dominfo))
-		return 0;
-
-	d = alloc_domain(NULL, domid);
-	return d ? 1 : -1;
+	check_domains();
+	d = find_domain_struct(domid);
+	return (d && d->generation <= gen) ? 1 : 0;
 }
 
 /*
@@ -855,8 +877,6 @@ int domain_adjust_node_perms(struct node *node)
 	int ret;
 
 	ret = chk_domain_generation(node->perms.p[0].id, node->generation);
-	if (ret < 0)
-		return errno;
 
 	/* If the owner doesn't exist any longer give it to priv domain. */
 	if (!ret)
@@ -867,8 +887,6 @@ int domain_adjust_node_perms(struct node *node)
 			continue;
 		ret = chk_domain_generation(node->perms.p[i].id,
 					    node->generation);
-		if (ret < 0)
-			return errno;
 		if (!ret)
 			node->perms.p[i].perms |= XS_PERM_IGNORE;
 	}
@@ -1300,16 +1318,21 @@ void read_state_connection(const void *ctx, const void *state)
 	} else {
 		domain = introduce_domain(ctx, sc->spec.ring.domid,
 					  sc->spec.ring.evtchn, true);
-		if (!domain)
+		if (!domain) {
+			/* Domain vanished during LM? */
+			if (errno = ENOENT)
+				return;
 			barf("domain allocation error");
+		}
 
 		if (sc->spec.ring.tdomid != DOMID_INVALID) {
-			tdomain = find_or_alloc_domain(ctx,
-						       sc->spec.ring.tdomid);
-			if (!tdomain)
+			tdomain = find_domain_struct(sc->spec.ring.tdomid);
+			if (!tdomain && errno != ENOENT)
 				barf("target domain allocation error");
-			talloc_reference(domain->conn, tdomain->conn);
-			domain->conn->target = tdomain->conn;
+			if (tdomain) {
+				talloc_reference(domain->conn, tdomain->conn);
+				domain->conn->target = tdomain->conn;
+			}
 		}
 		conn = domain->conn;
 	}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes
  2021-09-14 12:35 ` [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes Juergen Gross
@ 2021-09-23  9:16   ` Julien Grall
  2021-11-22 10:41   ` Jan Beulich
  1 sibling, 0 replies; 25+ messages in thread
From: Julien Grall @ 2021-09-23  9:16 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu

Hi Juergen,

On 14/09/2021 17:35, Juergen Gross wrote:
> Add a bitmap with one bit per possible domid indicating the respective
> domain has changed its state (created, deleted, dying, crashed,
> shutdown).
> 
> Registering the VIRQ_DOM_EXC event will result in setting the bits for
> all existing domains and resetting all other bits.
> 
> Resetting a bit will be done in a future patch.
> 
> This information is needed for Xenstore to keep track of all domains.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   xen/common/domain.c        | 22 ++++++++++++++++++++++
>   xen/common/event_channel.c |  2 ++
>   xen/include/xen/sched.h    |  2 ++
>   3 files changed, 26 insertions(+)
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 755349b93f..14a427e2ef 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -87,6 +87,22 @@ bool __read_mostly vmtrace_available;
>   /* Unique domain identifier, protected by domctl lock. */
>   static uint64_t unique_id;
>   
> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
> +
> +void domain_reset_states(void)
> +{
> +    struct domain *d;
> +
> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
> +
> +    rcu_read_lock(&domlist_read_lock);
> +
> +    for_each_domain ( d )
> +        set_bit(d->domain_id, dom_state_changed);
> +
> +    rcu_read_unlock(&domlist_read_lock);
> +}
> +
>   static void __domain_finalise_shutdown(struct domain *d)
>   {
>       struct vcpu *v;
> @@ -101,6 +117,7 @@ static void __domain_finalise_shutdown(struct domain *d)
>               return;
>   
>       d->is_shut_down = 1;
> +    set_bit(d->domain_id, dom_state_changed);

Looking at the follow-up patch, I think you want a smp_wmb() before 
set_bit() otherwise this could be re-ordered on Arm (set_bit() doesn't 
guarantee any ordering). This is to avoid any issue if the new hypercall 
run concurrently.

>       if ( (d->shutdown_code == SHUTDOWN_suspend) && d->suspend_evtchn )
>           evtchn_send(d, d->suspend_evtchn);
>       else
> @@ -720,6 +737,8 @@ struct domain *domain_create(domid_t domid,
>           rcu_assign_pointer(domain_hash[DOMAIN_HASH(domid)], d);
>           spin_unlock(&domlist_update_lock);
>   
> +        set_bit(d->domain_id, dom_state_changed);

Here you should not need one because spin_unlock() contains one.

> +
>           memcpy(d->handle, config->handle, sizeof(d->handle));
>       }
>   
> @@ -954,6 +973,7 @@ int domain_kill(struct domain *d)
>           /* Mem event cleanup has to go here because the rings
>            * have to be put before we call put_domain. */
>           vm_event_cleanup(d);
> +        set_bit(d->domain_id, dom_state_changed);

You might want an smp_wmb() here as well.

>           put_domain(d);
>           send_global_virq(VIRQ_DOM_EXC);
>           /* fallthrough */
> @@ -1141,6 +1161,8 @@ static void complete_domain_destroy(struct rcu_head *head)
>   
>       xfree(d->vcpu);
>   
> +    set_bit(d->domain_id, dom_state_changed);
> +

I think this wants to be moved *after* _domain_destroy() with an 
smp_wmb() between the two. So you when the bit is set you know the 
domain is dead.

>       _domain_destroy(d);
>   
>       send_global_virq(VIRQ_DOM_EXC);
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index da88ad141a..e2a416052b 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -1218,6 +1218,8 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>           rc = evtchn_bind_virq(&bind_virq, 0);
>           if ( !rc && __copy_to_guest(arg, &bind_virq, 1) )
>               rc = -EFAULT; /* Cleaning up here would be a mess! */
> +        if ( !rc && bind_virq.virq == VIRQ_DOM_EXC )
> +            domain_reset_states();
>           break;
>       }
>   
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index b827c5af8d..2ae26bc50e 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -730,6 +730,8 @@ void domain_resume(struct domain *d);
>   
>   int domain_soft_reset(struct domain *d, bool resuming);
>   
> +void domain_reset_states(void);
> +
>   int vcpu_start_shutdown_deferral(struct vcpu *v);
>   void vcpu_end_shutdown_deferral(struct vcpu *v);
>   
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-09-14 12:35 ` [PATCH RFC 1/4] xen: add a domain unique id to each domain Juergen Gross
@ 2021-11-22 10:01   ` Jan Beulich
  2021-11-22 11:42   ` Julien Grall
  1 sibling, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2021-11-22 10:01 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel

On 14.09.2021 14:35, Juergen Gross wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>  
>  bool __read_mostly vmtrace_available;
>  
> +/* Unique domain identifier, protected by domctl lock. */
> +static uint64_t unique_id;

Unless a reason was given for this to live here, I think it wants to move ...

> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>      free_domain_struct(d);
>  }
>  
> +static uint64_t get_unique_id(void)
> +{

... here. Then
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes
  2021-09-14 12:35 ` [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes Juergen Gross
  2021-09-23  9:16   ` Julien Grall
@ 2021-11-22 10:41   ` Jan Beulich
  2021-11-22 12:46     ` Juergen Gross
  1 sibling, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-11-22 10:41 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel

On 14.09.2021 14:35, Juergen Gross wrote:
> Add a bitmap with one bit per possible domid indicating the respective
> domain has changed its state (created, deleted, dying, crashed,
> shutdown).
> 
> Registering the VIRQ_DOM_EXC event will result in setting the bits for
> all existing domains and resetting all other bits.

Generally I view VIRQ_DOM_EXC as overly restrictive already - what if both
a xenstore domain and a control domain want respective notification? Hence
similarly I'm not convinced a single, global instance will do here. Which
in turn raises the question whether the approach chosen may not take us
far enough, and we wouldn't instead want a more precise notification model
(i.e. not just "something has changed").

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -87,6 +87,22 @@ bool __read_mostly vmtrace_available;
>  /* Unique domain identifier, protected by domctl lock. */
>  static uint64_t unique_id;
>  
> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);

While not really making a difference to the size of the bitmap, afaict up
to DOMID_FIRST_RESERVED would do. Neither system domains nor idle ones
will reach, in particular, the set_bit() in domain_create(). And none of
them can be subject to destruction.

Also I think this could do with a brief comment as to what causes bits
to be set. This would avoid readers having to go hunt down all the
set_bit() (or the commit introducing the bitmap).

> +void domain_reset_states(void)
> +{
> +    struct domain *d;
> +
> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);

While this looks to be fine with the present updates of the bitmap, I
still wonder about the non-atomicity here vs the atomic updates
everywhere else. It feels like there's some locking needed to be future
proof. Since you come here from VIRQ_DOM_EXC binding, it could be that
domain's per-domain lock.

> @@ -1141,6 +1161,8 @@ static void complete_domain_destroy(struct rcu_head *head)
>  
>      xfree(d->vcpu);
>  
> +    set_bit(d->domain_id, dom_state_changed);
> +
>      _domain_destroy(d);
>  
>      send_global_virq(VIRQ_DOM_EXC);

Wouldn't this better be in domain_destroy() immediately after the domain
has been taken off the list, and hence is no longer "discoverable"?

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-09-14 12:35 ` [PATCH RFC 1/4] xen: add a domain unique id to each domain Juergen Gross
  2021-11-22 10:01   ` Jan Beulich
@ 2021-11-22 11:42   ` Julien Grall
  2021-11-22 12:48     ` Juergen Gross
  1 sibling, 1 reply; 25+ messages in thread
From: Julien Grall @ 2021-11-22 11:42 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu

Hi Juergen,

On 14/09/2021 13:35, Juergen Gross wrote:
> Xenstore is referencing domains by their domid, but reuse of a domid
> can lead to the situation that Xenstore can't tell whether a domain
> with that domid has been deleted and created again without Xenstore
> noticing the domain is a new one now.
> 
> Add a global domain creation unique id which is updated when creating
> a new domain, and store that value in struct domain of the new domain.
> The global unique id is initialized with the system time and updates
> are done via the xorshift algorithm which is used for pseudo random
> number generation, too (see https://en.wikipedia.org/wiki/Xorshift).
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   xen/common/domain.c     | 16 ++++++++++++++++
>   xen/include/xen/sched.h |  3 +++
>   2 files changed, 19 insertions(+)
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 6ee5d033b0..755349b93f 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>   
>   bool __read_mostly vmtrace_available;
>   
> +/* Unique domain identifier, protected by domctl lock. */
> +static uint64_t unique_id;
> +
>   static void __domain_finalise_shutdown(struct domain *d)
>   {
>       struct vcpu *v;
> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>       free_domain_struct(d);
>   }
>   
> +static uint64_t get_unique_id(void)

The implementation is assuming that domain cannot be created 
concurrently. The rest of domain_create() seems to be able to cope with 
concurrent call (even if domctl prevents this situation today).

So I think we would want to make this call safe as well. One possibility 
would be to (ab)use the domlist_update_lock (I think the uniq ID is only 
necessary for real domains).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes
  2021-11-22 10:41   ` Jan Beulich
@ 2021-11-22 12:46     ` Juergen Gross
  2021-11-22 13:39       ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-11-22 12:46 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 2812 bytes --]

On 22.11.21 11:41, Jan Beulich wrote:
> On 14.09.2021 14:35, Juergen Gross wrote:
>> Add a bitmap with one bit per possible domid indicating the respective
>> domain has changed its state (created, deleted, dying, crashed,
>> shutdown).
>>
>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>> all existing domains and resetting all other bits.
> 
> Generally I view VIRQ_DOM_EXC as overly restrictive already - what if both
> a xenstore domain and a control domain want respective notification? Hence

The general idea was that in this case the control domain should
register a related Xenstore watch.

> similarly I'm not convinced a single, global instance will do here. Which
> in turn raises the question whether the approach chosen may not take us
> far enough, and we wouldn't instead want a more precise notification model
> (i.e. not just "something has changed").

Yes, that would be the job of Xenstore. It would provide the more
fine grained watches for that purpose.

> 
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -87,6 +87,22 @@ bool __read_mostly vmtrace_available;
>>   /* Unique domain identifier, protected by domctl lock. */
>>   static uint64_t unique_id;
>>   
>> +static DECLARE_BITMAP(dom_state_changed, DOMID_MASK + 1);
> 
> While not really making a difference to the size of the bitmap, afaict up
> to DOMID_FIRST_RESERVED would do. Neither system domains nor idle ones
> will reach, in particular, the set_bit() in domain_create(). And none of
> them can be subject to destruction.

I'd be fine either way.

> Also I think this could do with a brief comment as to what causes bits
> to be set. This would avoid readers having to go hunt down all the
> set_bit() (or the commit introducing the bitmap).

Okay.

> 
>> +void domain_reset_states(void)
>> +{
>> +    struct domain *d;
>> +
>> +    bitmap_zero(dom_state_changed, DOMID_MASK + 1);
> 
> While this looks to be fine with the present updates of the bitmap, I
> still wonder about the non-atomicity here vs the atomic updates
> everywhere else. It feels like there's some locking needed to be future
> proof. Since you come here from VIRQ_DOM_EXC binding, it could be that
> domain's per-domain lock.

Fine with me.

> 
>> @@ -1141,6 +1161,8 @@ static void complete_domain_destroy(struct rcu_head *head)
>>   
>>       xfree(d->vcpu);
>>   
>> +    set_bit(d->domain_id, dom_state_changed);
>> +
>>       _domain_destroy(d);
>>   
>>       send_global_virq(VIRQ_DOM_EXC);
> 
> Wouldn't this better be in domain_destroy() immediately after the domain
> has been taken off the list, and hence is no longer "discoverable"?

In this case the call of send_global_virq() should be moved, too,
I guess?


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-11-22 11:42   ` Julien Grall
@ 2021-11-22 12:48     ` Juergen Gross
  2021-11-22 14:10       ` Julien Grall
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-11-22 12:48 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 2075 bytes --]

On 22.11.21 12:42, Julien Grall wrote:
> Hi Juergen,
> 
> On 14/09/2021 13:35, Juergen Gross wrote:
>> Xenstore is referencing domains by their domid, but reuse of a domid
>> can lead to the situation that Xenstore can't tell whether a domain
>> with that domid has been deleted and created again without Xenstore
>> noticing the domain is a new one now.
>>
>> Add a global domain creation unique id which is updated when creating
>> a new domain, and store that value in struct domain of the new domain.
>> The global unique id is initialized with the system time and updates
>> are done via the xorshift algorithm which is used for pseudo random
>> number generation, too (see https://en.wikipedia.org/wiki/Xorshift).
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   xen/common/domain.c     | 16 ++++++++++++++++
>>   xen/include/xen/sched.h |  3 +++
>>   2 files changed, 19 insertions(+)
>>
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index 6ee5d033b0..755349b93f 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>>   bool __read_mostly vmtrace_available;
>> +/* Unique domain identifier, protected by domctl lock. */
>> +static uint64_t unique_id;
>> +
>>   static void __domain_finalise_shutdown(struct domain *d)
>>   {
>>       struct vcpu *v;
>> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>>       free_domain_struct(d);
>>   }
>> +static uint64_t get_unique_id(void)
> 
> The implementation is assuming that domain cannot be created 
> concurrently. The rest of domain_create() seems to be able to cope with 
> concurrent call (even if domctl prevents this situation today).
> 
> So I think we would want to make this call safe as well. One possibility 
> would be to (ab)use the domlist_update_lock (I think the uniq ID is only 
> necessary for real domains).

In case this is thought to be needed, I'd rather use a cmpxchg operation
for updating unique_id.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes
  2021-11-22 12:46     ` Juergen Gross
@ 2021-11-22 13:39       ` Jan Beulich
  2021-11-25  6:30         ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-11-22 13:39 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel

On 22.11.2021 13:46, Juergen Gross wrote:
> On 22.11.21 11:41, Jan Beulich wrote:
>> On 14.09.2021 14:35, Juergen Gross wrote:
>>> Add a bitmap with one bit per possible domid indicating the respective
>>> domain has changed its state (created, deleted, dying, crashed,
>>> shutdown).
>>>
>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>> all existing domains and resetting all other bits.
>>
>> Generally I view VIRQ_DOM_EXC as overly restrictive already - what if both
>> a xenstore domain and a control domain want respective notification? Hence
> 
> The general idea was that in this case the control domain should
> register a related Xenstore watch.
> 
>> similarly I'm not convinced a single, global instance will do here. Which
>> in turn raises the question whether the approach chosen may not take us
>> far enough, and we wouldn't instead want a more precise notification model
>> (i.e. not just "something has changed").
> 
> Yes, that would be the job of Xenstore. It would provide the more
> fine grained watches for that purpose.

And the watch consumer still wouldn't have a way to distinguish two domain
instances using the same ID, would it?

>>> @@ -1141,6 +1161,8 @@ static void complete_domain_destroy(struct rcu_head *head)
>>>         xfree(d->vcpu);
>>>   +    set_bit(d->domain_id, dom_state_changed);
>>> +
>>>       _domain_destroy(d);
>>>         send_global_virq(VIRQ_DOM_EXC);
>>
>> Wouldn't this better be in domain_destroy() immediately after the domain
>> has been taken off the list, and hence is no longer "discoverable"?
> 
> In this case the call of send_global_virq() should be moved, too,
> I guess?

Possibly, albeit I'd see this as a separate change to make. It doesn't
seem outright wrong for the vIRQ to be sent late. But I agree with the
idea of keeping sending and bit-setting together (ideally, even if this
was to stay here, the two would better sit truly side by side).

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-11-22 12:48     ` Juergen Gross
@ 2021-11-22 14:10       ` Julien Grall
  2021-11-22 14:29         ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Julien Grall @ 2021-11-22 14:10 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu

Hi,

On 22/11/2021 12:48, Juergen Gross wrote:
> On 22.11.21 12:42, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 14/09/2021 13:35, Juergen Gross wrote:
>>> Xenstore is referencing domains by their domid, but reuse of a domid
>>> can lead to the situation that Xenstore can't tell whether a domain
>>> with that domid has been deleted and created again without Xenstore
>>> noticing the domain is a new one now.
>>>
>>> Add a global domain creation unique id which is updated when creating
>>> a new domain, and store that value in struct domain of the new domain.
>>> The global unique id is initialized with the system time and updates
>>> are done via the xorshift algorithm which is used for pseudo random
>>> number generation, too (see https://en.wikipedia.org/wiki/Xorshift).
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ---
>>>   xen/common/domain.c     | 16 ++++++++++++++++
>>>   xen/include/xen/sched.h |  3 +++
>>>   2 files changed, 19 insertions(+)
>>>
>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>> index 6ee5d033b0..755349b93f 100644
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>>>   bool __read_mostly vmtrace_available;
>>> +/* Unique domain identifier, protected by domctl lock. */
>>> +static uint64_t unique_id;
>>> +
>>>   static void __domain_finalise_shutdown(struct domain *d)
>>>   {
>>>       struct vcpu *v;
>>> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>>>       free_domain_struct(d);
>>>   }
>>> +static uint64_t get_unique_id(void)
>>
>> The implementation is assuming that domain cannot be created 
>> concurrently. The rest of domain_create() seems to be able to cope 
>> with concurrent call (even if domctl prevents this situation today).
>>
>> So I think we would want to make this call safe as well. One 
>> possibility would be to (ab)use the domlist_update_lock (I think the 
>> uniq ID is only necessary for real domains).
> 
> In case this is thought to be needed, I'd rather use a cmpxchg operation
> for updating unique_id.
I would be OK with cmpxchg(). But I would like to avoid cmpxchg() loop 
if possible.

Reading the commit message (and cover letter) again, I understand that 
you want a unique ID but it is not clear to me why this needs to be 
pseudo-randomly generated. IOW, given the expected use, can you clarify 
what would be the concern to use an atomic_inc_return() instead of xorshift?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-11-22 14:10       ` Julien Grall
@ 2021-11-22 14:29         ` Juergen Gross
  2021-11-22 14:37           ` Julien Grall
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-11-22 14:29 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 3031 bytes --]

On 22.11.21 15:10, Julien Grall wrote:
> Hi,
> 
> On 22/11/2021 12:48, Juergen Gross wrote:
>> On 22.11.21 12:42, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 14/09/2021 13:35, Juergen Gross wrote:
>>>> Xenstore is referencing domains by their domid, but reuse of a domid
>>>> can lead to the situation that Xenstore can't tell whether a domain
>>>> with that domid has been deleted and created again without Xenstore
>>>> noticing the domain is a new one now.
>>>>
>>>> Add a global domain creation unique id which is updated when creating
>>>> a new domain, and store that value in struct domain of the new domain.
>>>> The global unique id is initialized with the system time and updates
>>>> are done via the xorshift algorithm which is used for pseudo random
>>>> number generation, too (see https://en.wikipedia.org/wiki/Xorshift).
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>>   xen/common/domain.c     | 16 ++++++++++++++++
>>>>   xen/include/xen/sched.h |  3 +++
>>>>   2 files changed, 19 insertions(+)
>>>>
>>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>>> index 6ee5d033b0..755349b93f 100644
>>>> --- a/xen/common/domain.c
>>>> +++ b/xen/common/domain.c
>>>> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>>>>   bool __read_mostly vmtrace_available;
>>>> +/* Unique domain identifier, protected by domctl lock. */
>>>> +static uint64_t unique_id;
>>>> +
>>>>   static void __domain_finalise_shutdown(struct domain *d)
>>>>   {
>>>>       struct vcpu *v;
>>>> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>>>>       free_domain_struct(d);
>>>>   }
>>>> +static uint64_t get_unique_id(void)
>>>
>>> The implementation is assuming that domain cannot be created 
>>> concurrently. The rest of domain_create() seems to be able to cope 
>>> with concurrent call (even if domctl prevents this situation today).
>>>
>>> So I think we would want to make this call safe as well. One 
>>> possibility would be to (ab)use the domlist_update_lock (I think the 
>>> uniq ID is only necessary for real domains).
>>
>> In case this is thought to be needed, I'd rather use a cmpxchg operation
>> for updating unique_id.
> I would be OK with cmpxchg(). But I would like to avoid cmpxchg() loop 
> if possible.

The chances for that loop to be needed are zero today, as the domctl
lock is prohibiting concurrent use.

Another possibility would be to use an:

ASSERT(spin_is_locked(&domctl_lock));

which would need domctl_lock to be globally visible, though.

> 
> Reading the commit message (and cover letter) again, I understand that 
> you want a unique ID but it is not clear to me why this needs to be 
> pseudo-randomly generated. IOW, given the expected use, can you clarify 
> what would be the concern to use an atomic_inc_return() instead of 
> xorshift?

Jan had a concern regarding potential misuse of the unique id in case it
were kind of predictable.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-11-22 14:29         ` Juergen Gross
@ 2021-11-22 14:37           ` Julien Grall
  2021-11-25  6:32             ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Julien Grall @ 2021-11-22 14:37 UTC (permalink / raw)
  To: Juergen Gross, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu



On 22/11/2021 14:29, Juergen Gross wrote:
> On 22.11.21 15:10, Julien Grall wrote:
>> Hi,

Hi,

>> On 22/11/2021 12:48, Juergen Gross wrote:
>>> On 22.11.21 12:42, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 14/09/2021 13:35, Juergen Gross wrote:
>>>>> Xenstore is referencing domains by their domid, but reuse of a domid
>>>>> can lead to the situation that Xenstore can't tell whether a domain
>>>>> with that domid has been deleted and created again without Xenstore
>>>>> noticing the domain is a new one now.
>>>>>
>>>>> Add a global domain creation unique id which is updated when creating
>>>>> a new domain, and store that value in struct domain of the new domain.
>>>>> The global unique id is initialized with the system time and updates
>>>>> are done via the xorshift algorithm which is used for pseudo random
>>>>> number generation, too (see https://en.wikipedia.org/wiki/Xorshift).
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>> ---
>>>>>   xen/common/domain.c     | 16 ++++++++++++++++
>>>>>   xen/include/xen/sched.h |  3 +++
>>>>>   2 files changed, 19 insertions(+)
>>>>>
>>>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>>>> index 6ee5d033b0..755349b93f 100644
>>>>> --- a/xen/common/domain.c
>>>>> +++ b/xen/common/domain.c
>>>>> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>>>>>   bool __read_mostly vmtrace_available;
>>>>> +/* Unique domain identifier, protected by domctl lock. */
>>>>> +static uint64_t unique_id;
>>>>> +
>>>>>   static void __domain_finalise_shutdown(struct domain *d)
>>>>>   {
>>>>>       struct vcpu *v;
>>>>> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>>>>>       free_domain_struct(d);
>>>>>   }
>>>>> +static uint64_t get_unique_id(void)
>>>>
>>>> The implementation is assuming that domain cannot be created 
>>>> concurrently. The rest of domain_create() seems to be able to cope 
>>>> with concurrent call (even if domctl prevents this situation today).
>>>>
>>>> So I think we would want to make this call safe as well. One 
>>>> possibility would be to (ab)use the domlist_update_lock (I think the 
>>>> uniq ID is only necessary for real domains).
>>>
>>> In case this is thought to be needed, I'd rather use a cmpxchg operation
>>> for updating unique_id.
>> I would be OK with cmpxchg(). But I would like to avoid cmpxchg() loop 
>> if possible.
> 
> The chances for that loop to be needed are zero today, as the domctl
> lock is prohibiting concurrent use.

Well for domain created by the toolstack yes... For domain created by 
Xen no. Today, this is only happening at boot so you are safe (for now).

That said, I don't really see a reason to prevent concurrent 
domain_create() and as I wrote nothing in domain_create() seems to make 
this assumption. So I am not happy to build on that in this new helper.

> 
> Another possibility would be to use an:
> 
> ASSERT(spin_is_locked(&domctl_lock));

This ASSERT() would be incorrect as it could be hit for any domain 
created by Xen.

[...]

>> Reading the commit message (and cover letter) again, I understand that 
>> you want a unique ID but it is not clear to me why this needs to be 
>> pseudo-randomly generated. IOW, given the expected use, can you 
>> clarify what would be the concern to use an atomic_inc_return() 
>> instead of xorshift?
> 
> Jan had a concern regarding potential misuse of the unique id in case it
> were kind of predictable.

Fair enough. Can this be written down in the commit message?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-09-14 12:35 ` [PATCH RFC 3/4] xen: add new stable control hypercall Juergen Gross
@ 2021-11-22 15:39   ` Jan Beulich
  2021-11-25  6:55     ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-11-22 15:39 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel

On 14.09.2021 14:35, Juergen Gross wrote:
> The sysctl and domctl hypercalls are not stable, so tools using those
> need to be in sync with the hypervisor.
> 
> In order to decouple (some) tools from the hypervisor add a new stable
> hypercall __HYPERVISOR_control_op

I'm not convinced we need a new hypercall. New sub-ops of the existing ones
can be declared stable (and be made bypass the interface version checks).
If we want/need a new one, "control" is too generic: There's a reason we
currently have separate domctl and sysctl, and I think if we want new
hypercalls rather than new sub-ops, then we'd again want a global and a
per-domain one (unless the new one had provisions to be able to serve
both purposes).

> with (for now) two sub-options:
> 
> - XEN_CONTROL_OP_get_version for retrieving the max version of the new
>   hypercall supported by the hypervisor
> - XEN_CONTROL_OP_get_state_changed_domain for retrieving some state
>   data of a domain which changed state (this is needed by Xenstore).
>   The returned state just contains the domid, the domain unique id,
>   and some flags (existing, shutdown, dying).

If we go with a new hypercall, I think you want to split its introduction
(with just the version sub-op) from the addition of get_state_changed_dom.

> --- /dev/null
> +++ b/xen/common/control.c
> @@ -0,0 +1,52 @@
> +/******************************************************************************
> + *
> + * control.c
> + *
> + * Entry point of the stable __HYPERVISOR_control_op hypercall.
> + */
> +#include <xen/err.h>
> +#include <xen/event.h>
> +#include <xen/guest_access.h>
> +#include <xen/hypercall.h>
> +#include <public/control.h>
> +
> +long do_control_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> +{
> +    int ret = 0;
> +
> +    if ( xsm_control_op(XSM_OTHER, cmd) )
> +        return -EPERM;
> +
> +    switch ( cmd )
> +    {
> +    case XEN_CONTROL_OP_get_version:
> +        if ( !guest_handle_is_null(arg) )
> +            return -EINVAL;
> +
> +        ret = XEN_CONTROL_VERSION;
> +        break;
> +
> +    case XEN_CONTROL_OP_get_state_changed_domain:
> +    {
> +        struct xen_control_changed_domain info = { };
> +
> +        if ( get_global_virq_handler(VIRQ_DOM_EXC) != current->domain )
> +            return -EPERM;

The function result is stale by the time it gets made use of here. If this
is deemed not to be a problem, then I guess it wants saying so in the
description.

> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>      rcu_read_unlock(&domlist_read_lock);
>  }
>  
> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
> +{
> +    unsigned int dom;
> +    struct domain *d;
> +
> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
> +            DOMID_FIRST_RESERVED )

As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
is quite puzzling here.

> +    {
> +        d = rcu_lock_domain_by_id(dom);
> +
> +        if ( test_and_clear_bit(dom, dom_state_changed) )
> +        {
> +            info->domid = dom;
> +            if ( d )
> +            {
> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
> +                if ( d->is_shut_down )
> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
> +                if ( d->is_dying == DOMDYING_dead )
> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
> +                info->unique_id = d->unique_id;
> +
> +                rcu_unlock_domain(d);
> +            }
> +
> +            return 0;

With rapid creation of short lived domains, will the caller ever get to
see information on higher numbered domains (if, say, it gets "suitably"
preempted within its own environment)? IOW shouldn't there be a way for
the caller to specify a domid to start from?

> +        }
> +
> +        if ( d )
> +        {
> +            rcu_unlock_domain(d);
> +        }

Nit: Unnecessary braces.

> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -894,11 +894,16 @@ static struct domain *global_virq_handlers[NR_VIRQS] __read_mostly;
>  
>  static DEFINE_SPINLOCK(global_virq_handlers_lock);
>  
> -void send_global_virq(uint32_t virq)
> +struct domain *get_global_virq_handler(uint32_t virq)
>  {
>      ASSERT(virq_is_global(virq));
>  
> -    send_guest_global_virq(global_virq_handlers[virq] ?: hardware_domain, virq);
> +    return global_virq_handlers[virq] ?: hardware_domain;
> +}
> +
> +void send_global_virq(uint32_t virq)
> +{
> +    send_guest_global_virq(get_global_virq_handler(virq), virq);
>  }

Following my comment further up, I think external exposure of this requires
to finally eliminate the (pre-existing) risk of race here. I think
get_knownalive_domain() is all it takes to at least prevent the domain
disappearing behind our backs, with the extra reference transferred to the
caller. Yet we may want to additionally be assured that the domain in
question continues to be the one handling the respective vIRQ ...

> --- /dev/null
> +++ b/xen/include/public/control.h
> @@ -0,0 +1,80 @@
> +/******************************************************************************
> + * Xen Control Hypercall
> + *
> + * Copyright (c) 2021, SUSE Software Solutions Germany GmbH
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to
> + * deal in the Software without restriction, including without limitation the
> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#ifndef __XEN_PUBLIC_CONTROL_H__
> +#define __XEN_PUBLIC_CONTROL_H__
> +
> +#include "xen.h"
> +
> +/*
> + * Definitions for the __HYPERVISOR_control_op hypercall.
> + */
> +
> +/* Highest version number of the control interface currently defined. */
> +#define XEN_CONTROL_VERSION      1
> +
> +/*
> + * Hypercall operations.
> + */
> +
> +/*
> + * XEN_CONTROL_OP_get_version
> + *
> + * Read highest interface version supported by the hypervisor.
> + *
> + * arg: NULL
> + *
> + * Possible return values:
> + * >0: highest supported interface version
> + * <0: negative Xen errno value
> + */
> +#define XEN_CONTROL_OP_get_version                  0

What would a caller use the returned value for? I guess this follows
XEN_HYPFS_OP_get_version, but I'm less certain of the utility here.
Incompatible extensions are easy to make use separate sub-ops, unlike
possible extensions there to struct xen_hypfs_dir{,list}entry.

> +/*
> + * XEN_CONTROL_OP_get_state_changed_domain
> + *
> + * Get information about a domain having changed state and reset the state
> + * change indicator for that domain. This function is usable only by a domain
> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
> + *
> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
> + *
> + * Possible return values:
> + * 0: success
> + * <0 : negative Xen errno value
> + */
> +#define XEN_CONTROL_OP_get_state_changed_domain     1
> +struct xen_control_changed_domain {
> +    domid_t domid;
> +    uint16_t state;
> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
> +    uint32_t pad1;           /* Returned as 0. */
> +    uint64_t unique_id;      /* Unique domain identifier. */
> +    uint64_t pad2[6];        /* Returned as 0. */

I think the padding fields have to be zero on input, not just on return.
Unless you mean to mandate them to be OUT only now and forever. I also
wonder how the trailing padding plays up with the version sub-op: Do we
really need such double precaution?

Also - should we use uint64_aligned_t here?

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes
  2021-11-22 13:39       ` Jan Beulich
@ 2021-11-25  6:30         ` Juergen Gross
  0 siblings, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2021-11-25  6:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall,
	Stefano Stabellini, Wei Liu, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 2844 bytes --]

On 22.11.21 14:39, Jan Beulich wrote:
> On 22.11.2021 13:46, Juergen Gross wrote:
>> On 22.11.21 11:41, Jan Beulich wrote:
>>> On 14.09.2021 14:35, Juergen Gross wrote:
>>>> Add a bitmap with one bit per possible domid indicating the respective
>>>> domain has changed its state (created, deleted, dying, crashed,
>>>> shutdown).
>>>>
>>>> Registering the VIRQ_DOM_EXC event will result in setting the bits for
>>>> all existing domains and resetting all other bits.
>>>
>>> Generally I view VIRQ_DOM_EXC as overly restrictive already - what if both
>>> a xenstore domain and a control domain want respective notification? Hence
>>
>> The general idea was that in this case the control domain should
>> register a related Xenstore watch.
>>
>>> similarly I'm not convinced a single, global instance will do here. Which
>>> in turn raises the question whether the approach chosen may not take us
>>> far enough, and we wouldn't instead want a more precise notification model
>>> (i.e. not just "something has changed").
>>
>> Yes, that would be the job of Xenstore. It would provide the more
>> fine grained watches for that purpose.
> 
> And the watch consumer still wouldn't have a way to distinguish two domain
> instances using the same ID, would it?

My further plans include new watches for domain creation/destroy
which will include the domid in the watch path reported with the
watch event. So anyone registering for domain creation watches
would no longer need another way to distinguish domains with the
same domid by other means.

The main question remaining here is whether we are okay to let
Xenstore be the instance providing that functionality to the rest
of the stack, or if we want that functionality in the hypervisor,
with all the related needed access control (Xenstore allows to
set the usability of the create/destroy watches for other domains
today).

>>>> @@ -1141,6 +1161,8 @@ static void complete_domain_destroy(struct rcu_head *head)
>>>>          xfree(d->vcpu);
>>>>    +    set_bit(d->domain_id, dom_state_changed);
>>>> +
>>>>        _domain_destroy(d);
>>>>          send_global_virq(VIRQ_DOM_EXC);
>>>
>>> Wouldn't this better be in domain_destroy() immediately after the domain
>>> has been taken off the list, and hence is no longer "discoverable"?
>>
>> In this case the call of send_global_virq() should be moved, too,
>> I guess?
> 
> Possibly, albeit I'd see this as a separate change to make. It doesn't
> seem outright wrong for the vIRQ to be sent late. But I agree with the
> idea of keeping sending and bit-setting together (ideally, even if this
> was to stay here, the two would better sit truly side by side).

Okay, I'll prepend another patch moving the call of send_global_virq()
to domain_destroy().


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 1/4] xen: add a domain unique id to each domain
  2021-11-22 14:37           ` Julien Grall
@ 2021-11-25  6:32             ` Juergen Gross
  0 siblings, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2021-11-25  6:32 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Andrew Cooper, George Dunlap, Ian Jackson, Jan Beulich,
	Stefano Stabellini, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 3950 bytes --]

On 22.11.21 15:37, Julien Grall wrote:
> 
> 
> On 22/11/2021 14:29, Juergen Gross wrote:
>> On 22.11.21 15:10, Julien Grall wrote:
>>> Hi,
> 
> Hi,
> 
>>> On 22/11/2021 12:48, Juergen Gross wrote:
>>>> On 22.11.21 12:42, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 14/09/2021 13:35, Juergen Gross wrote:
>>>>>> Xenstore is referencing domains by their domid, but reuse of a domid
>>>>>> can lead to the situation that Xenstore can't tell whether a domain
>>>>>> with that domid has been deleted and created again without Xenstore
>>>>>> noticing the domain is a new one now.
>>>>>>
>>>>>> Add a global domain creation unique id which is updated when creating
>>>>>> a new domain, and store that value in struct domain of the new 
>>>>>> domain.
>>>>>> The global unique id is initialized with the system time and updates
>>>>>> are done via the xorshift algorithm which is used for pseudo random
>>>>>> number generation, too (see https://en.wikipedia.org/wiki/Xorshift).
>>>>>>
>>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>>> ---
>>>>>>   xen/common/domain.c     | 16 ++++++++++++++++
>>>>>>   xen/include/xen/sched.h |  3 +++
>>>>>>   2 files changed, 19 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>>>>> index 6ee5d033b0..755349b93f 100644
>>>>>> --- a/xen/common/domain.c
>>>>>> +++ b/xen/common/domain.c
>>>>>> @@ -84,6 +84,9 @@ vcpu_info_t dummy_vcpu_info;
>>>>>>   bool __read_mostly vmtrace_available;
>>>>>> +/* Unique domain identifier, protected by domctl lock. */
>>>>>> +static uint64_t unique_id;
>>>>>> +
>>>>>>   static void __domain_finalise_shutdown(struct domain *d)
>>>>>>   {
>>>>>>       struct vcpu *v;
>>>>>> @@ -473,6 +476,18 @@ static void _domain_destroy(struct domain *d)
>>>>>>       free_domain_struct(d);
>>>>>>   }
>>>>>> +static uint64_t get_unique_id(void)
>>>>>
>>>>> The implementation is assuming that domain cannot be created 
>>>>> concurrently. The rest of domain_create() seems to be able to cope 
>>>>> with concurrent call (even if domctl prevents this situation today).
>>>>>
>>>>> So I think we would want to make this call safe as well. One 
>>>>> possibility would be to (ab)use the domlist_update_lock (I think 
>>>>> the uniq ID is only necessary for real domains).
>>>>
>>>> In case this is thought to be needed, I'd rather use a cmpxchg 
>>>> operation
>>>> for updating unique_id.
>>> I would be OK with cmpxchg(). But I would like to avoid cmpxchg() 
>>> loop if possible.
>>
>> The chances for that loop to be needed are zero today, as the domctl
>> lock is prohibiting concurrent use.
> 
> Well for domain created by the toolstack yes... For domain created by 
> Xen no. Today, this is only happening at boot so you are safe (for now).
> 
> That said, I don't really see a reason to prevent concurrent 
> domain_create() and as I wrote nothing in domain_create() seems to make 
> this assumption. So I am not happy to build on that in this new helper.

Fair enough.

> 
>>
>> Another possibility would be to use an:
>>
>> ASSERT(spin_is_locked(&domctl_lock));
> 
> This ASSERT() would be incorrect as it could be hit for any domain 
> created by Xen.

Oh, right.

In the end it doesn't really hurt to add another lock. Its not as if
we'd expect much contention for it. :-)

> 
> [...]
> 
>>> Reading the commit message (and cover letter) again, I understand 
>>> that you want a unique ID but it is not clear to me why this needs to 
>>> be pseudo-randomly generated. IOW, given the expected use, can you 
>>> clarify what would be the concern to use an atomic_inc_return() 
>>> instead of xorshift?
>>
>> Jan had a concern regarding potential misuse of the unique id in case it
>> were kind of predictable.
> 
> Fair enough. Can this be written down in the commit message?

Yes, will do that.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-22 15:39   ` Jan Beulich
@ 2021-11-25  6:55     ` Juergen Gross
  2021-11-25  9:38       ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-11-25  6:55 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 9970 bytes --]

On 22.11.21 16:39, Jan Beulich wrote:
> On 14.09.2021 14:35, Juergen Gross wrote:
>> The sysctl and domctl hypercalls are not stable, so tools using those
>> need to be in sync with the hypervisor.
>>
>> In order to decouple (some) tools from the hypervisor add a new stable
>> hypercall __HYPERVISOR_control_op
> 
> I'm not convinced we need a new hypercall. New sub-ops of the existing ones
> can be declared stable (and be made bypass the interface version checks).
> If we want/need a new one, "control" is too generic: There's a reason we
> currently have separate domctl and sysctl, and I think if we want new
> hypercalls rather than new sub-ops, then we'd again want a global and a
> per-domain one (unless the new one had provisions to be able to serve
> both purposes).

It would be nice to settle on a plan for stable [dom|sys]ctl interfaces.

Andrew, I think you already have some plans for that. Mind to share your
thoughts?

> 
>> with (for now) two sub-options:
>>
>> - XEN_CONTROL_OP_get_version for retrieving the max version of the new
>>    hypercall supported by the hypervisor
>> - XEN_CONTROL_OP_get_state_changed_domain for retrieving some state
>>    data of a domain which changed state (this is needed by Xenstore).
>>    The returned state just contains the domid, the domain unique id,
>>    and some flags (existing, shutdown, dying).
> 
> If we go with a new hypercall, I think you want to split its introduction
> (with just the version sub-op) from the addition of get_state_changed_dom.
> 
>> --- /dev/null
>> +++ b/xen/common/control.c
>> @@ -0,0 +1,52 @@
>> +/******************************************************************************
>> + *
>> + * control.c
>> + *
>> + * Entry point of the stable __HYPERVISOR_control_op hypercall.
>> + */
>> +#include <xen/err.h>
>> +#include <xen/event.h>
>> +#include <xen/guest_access.h>
>> +#include <xen/hypercall.h>
>> +#include <public/control.h>
>> +
>> +long do_control_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>> +{
>> +    int ret = 0;
>> +
>> +    if ( xsm_control_op(XSM_OTHER, cmd) )
>> +        return -EPERM;
>> +
>> +    switch ( cmd )
>> +    {
>> +    case XEN_CONTROL_OP_get_version:
>> +        if ( !guest_handle_is_null(arg) )
>> +            return -EINVAL;
>> +
>> +        ret = XEN_CONTROL_VERSION;
>> +        break;
>> +
>> +    case XEN_CONTROL_OP_get_state_changed_domain:
>> +    {
>> +        struct xen_control_changed_domain info = { };
>> +
>> +        if ( get_global_virq_handler(VIRQ_DOM_EXC) != current->domain )
>> +            return -EPERM;
> 
> The function result is stale by the time it gets made use of here. If this
> is deemed not to be a problem, then I guess it wants saying so in the
> description.

I can add something to lock VIRQ_DOM_EXC to the hardware domain in case
it hasn't been registered yet.

> 
>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>       rcu_read_unlock(&domlist_read_lock);
>>   }
>>   
>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>> +{
>> +    unsigned int dom;
>> +    struct domain *d;
>> +
>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>> +            DOMID_FIRST_RESERVED )
> 
> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
> is quite puzzling here.

Okay, will change that.

> 
>> +    {
>> +        d = rcu_lock_domain_by_id(dom);
>> +
>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>> +        {
>> +            info->domid = dom;
>> +            if ( d )
>> +            {
>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>> +                if ( d->is_shut_down )
>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>> +                if ( d->is_dying == DOMDYING_dead )
>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>> +                info->unique_id = d->unique_id;
>> +
>> +                rcu_unlock_domain(d);
>> +            }
>> +
>> +            return 0;
> 
> With rapid creation of short lived domains, will the caller ever get to
> see information on higher numbered domains (if, say, it gets "suitably"
> preempted within its own environment)? IOW shouldn't there be a way for
> the caller to specify a domid to start from?

I'd rather have a local variable for the last reported domid and start
from that.

> 
>> +        }
>> +
>> +        if ( d )
>> +        {
>> +            rcu_unlock_domain(d);
>> +        }
> 
> Nit: Unnecessary braces.
> 
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -894,11 +894,16 @@ static struct domain *global_virq_handlers[NR_VIRQS] __read_mostly;
>>   
>>   static DEFINE_SPINLOCK(global_virq_handlers_lock);
>>   
>> -void send_global_virq(uint32_t virq)
>> +struct domain *get_global_virq_handler(uint32_t virq)
>>   {
>>       ASSERT(virq_is_global(virq));
>>   
>> -    send_guest_global_virq(global_virq_handlers[virq] ?: hardware_domain, virq);
>> +    return global_virq_handlers[virq] ?: hardware_domain;
>> +}
>> +
>> +void send_global_virq(uint32_t virq)
>> +{
>> +    send_guest_global_virq(get_global_virq_handler(virq), virq);
>>   }
> 
> Following my comment further up, I think external exposure of this requires
> to finally eliminate the (pre-existing) risk of race here. I think
> get_knownalive_domain() is all it takes to at least prevent the domain
> disappearing behind our backs, with the extra reference transferred to the
> caller. Yet we may want to additionally be assured that the domain in
> question continues to be the one handling the respective vIRQ ...

Okay. Will look into this.

> 
>> --- /dev/null
>> +++ b/xen/include/public/control.h
>> @@ -0,0 +1,80 @@
>> +/******************************************************************************
>> + * Xen Control Hypercall
>> + *
>> + * Copyright (c) 2021, SUSE Software Solutions Germany GmbH
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this software and associated documentation files (the "Software"), to
>> + * deal in the Software without restriction, including without limitation the
>> + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
>> + * sell copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
>> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>> + * DEALINGS IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#ifndef __XEN_PUBLIC_CONTROL_H__
>> +#define __XEN_PUBLIC_CONTROL_H__
>> +
>> +#include "xen.h"
>> +
>> +/*
>> + * Definitions for the __HYPERVISOR_control_op hypercall.
>> + */
>> +
>> +/* Highest version number of the control interface currently defined. */
>> +#define XEN_CONTROL_VERSION      1
>> +
>> +/*
>> + * Hypercall operations.
>> + */
>> +
>> +/*
>> + * XEN_CONTROL_OP_get_version
>> + *
>> + * Read highest interface version supported by the hypervisor.
>> + *
>> + * arg: NULL
>> + *
>> + * Possible return values:
>> + * >0: highest supported interface version
>> + * <0: negative Xen errno value
>> + */
>> +#define XEN_CONTROL_OP_get_version                  0
> 
> What would a caller use the returned value for? I guess this follows
> XEN_HYPFS_OP_get_version, but I'm less certain of the utility here.
> Incompatible extensions are easy to make use separate sub-ops, unlike
> possible extensions there to struct xen_hypfs_dir{,list}entry.

I think can be discussed finally when we have settled on a plan for
stable control interfaces.

> 
>> +/*
>> + * XEN_CONTROL_OP_get_state_changed_domain
>> + *
>> + * Get information about a domain having changed state and reset the state
>> + * change indicator for that domain. This function is usable only by a domain
>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>> + *
>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>> + *
>> + * Possible return values:
>> + * 0: success
>> + * <0 : negative Xen errno value
>> + */
>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>> +struct xen_control_changed_domain {
>> +    domid_t domid;
>> +    uint16_t state;
>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>> +    uint32_t pad1;           /* Returned as 0. */
>> +    uint64_t unique_id;      /* Unique domain identifier. */
>> +    uint64_t pad2[6];        /* Returned as 0. */
> 
> I think the padding fields have to be zero on input, not just on return.

I don't see why this would be needed, as this structure is only ever
copied to the caller, so "on input" just doesn't apply here.

> Unless you mean to mandate them to be OUT only now and forever. I also

The whole struct is OUT only.

> wonder how the trailing padding plays up with the version sub-op: Do we
> really need such double precaution?

I can remove it.

> Also - should we use uint64_aligned_t here?

Yes.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-25  6:55     ` Juergen Gross
@ 2021-11-25  9:38       ` Jan Beulich
  2021-11-25 10:12         ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-11-25  9:38 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel, Andrew Cooper

On 25.11.2021 07:55, Juergen Gross wrote:
> On 22.11.21 16:39, Jan Beulich wrote:
>> On 14.09.2021 14:35, Juergen Gross wrote:
>>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>>       rcu_read_unlock(&domlist_read_lock);
>>>   }
>>>   
>>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>>> +{
>>> +    unsigned int dom;
>>> +    struct domain *d;
>>> +
>>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>>> +            DOMID_FIRST_RESERVED )
>>
>> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
>> is quite puzzling here.
> 
> Okay, will change that.
> 
>>
>>> +    {
>>> +        d = rcu_lock_domain_by_id(dom);
>>> +
>>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>>> +        {
>>> +            info->domid = dom;
>>> +            if ( d )
>>> +            {
>>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>>> +                if ( d->is_shut_down )
>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>>> +                if ( d->is_dying == DOMDYING_dead )
>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>>> +                info->unique_id = d->unique_id;
>>> +
>>> +                rcu_unlock_domain(d);
>>> +            }
>>> +
>>> +            return 0;
>>
>> With rapid creation of short lived domains, will the caller ever get to
>> see information on higher numbered domains (if, say, it gets "suitably"
>> preempted within its own environment)? IOW shouldn't there be a way for
>> the caller to specify a domid to start from?
> 
> I'd rather have a local variable for the last reported domid and start
> from that.

Well, it probably doesn't matter much to have yet one more aspect making
this a single-consumer-only interface.

>>> +/*
>>> + * XEN_CONTROL_OP_get_state_changed_domain
>>> + *
>>> + * Get information about a domain having changed state and reset the state
>>> + * change indicator for that domain. This function is usable only by a domain
>>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>>> + *
>>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>>> + *
>>> + * Possible return values:
>>> + * 0: success
>>> + * <0 : negative Xen errno value
>>> + */
>>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>>> +struct xen_control_changed_domain {
>>> +    domid_t domid;
>>> +    uint16_t state;
>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>>> +    uint32_t pad1;           /* Returned as 0. */
>>> +    uint64_t unique_id;      /* Unique domain identifier. */
>>> +    uint64_t pad2[6];        /* Returned as 0. */
>>
>> I think the padding fields have to be zero on input, not just on return.
> 
> I don't see why this would be needed, as this structure is only ever
> copied to the caller, so "on input" just doesn't apply here.
> 
>> Unless you mean to mandate them to be OUT only now and forever. I also
> 
> The whole struct is OUT only.

Right now, yes. I wouldn't exclude "pad1" to become a flags field,
controlling some future behavioral aspect of the operation.

>> wonder how the trailing padding plays up with the version sub-op: Do we
>> really need such double precaution?
> 
> I can remove it.
> 
>> Also - should we use uint64_aligned_t here?
> 
> Yes.

But you realize this isn't straightforward, for the type not being
available in plain C89 (nor C99)? That's why it's presently used in
tools-only interfaces only, and the respective header are excluded
from the "is ANSI compatible" checking (memory.h and hvm/dm_op.h
have special but imo crude "precautions").

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-25  9:38       ` Jan Beulich
@ 2021-11-25 10:12         ` Juergen Gross
  2021-11-25 10:19           ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-11-25 10:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel, Andrew Cooper


[-- Attachment #1.1.1: Type: text/plain, Size: 5024 bytes --]

On 25.11.21 10:38, Jan Beulich wrote:
> On 25.11.2021 07:55, Juergen Gross wrote:
>> On 22.11.21 16:39, Jan Beulich wrote:
>>> On 14.09.2021 14:35, Juergen Gross wrote:
>>>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>>>        rcu_read_unlock(&domlist_read_lock);
>>>>    }
>>>>    
>>>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>>>> +{
>>>> +    unsigned int dom;
>>>> +    struct domain *d;
>>>> +
>>>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>>>> +            DOMID_FIRST_RESERVED )
>>>
>>> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
>>> is quite puzzling here.
>>
>> Okay, will change that.
>>
>>>
>>>> +    {
>>>> +        d = rcu_lock_domain_by_id(dom);
>>>> +
>>>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>>>> +        {
>>>> +            info->domid = dom;
>>>> +            if ( d )
>>>> +            {
>>>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>>>> +                if ( d->is_shut_down )
>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>>>> +                if ( d->is_dying == DOMDYING_dead )
>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>>>> +                info->unique_id = d->unique_id;
>>>> +
>>>> +                rcu_unlock_domain(d);
>>>> +            }
>>>> +
>>>> +            return 0;
>>>
>>> With rapid creation of short lived domains, will the caller ever get to
>>> see information on higher numbered domains (if, say, it gets "suitably"
>>> preempted within its own environment)? IOW shouldn't there be a way for
>>> the caller to specify a domid to start from?
>>
>> I'd rather have a local variable for the last reported domid and start
>> from that.
> 
> Well, it probably doesn't matter much to have yet one more aspect making
> this a single-consumer-only interface.

For making it an interface consumable by multiple users you'd need to
have a per-consumer data set, which would include the bitmap of changed
domains and could include the domid last tested.

As one consumer is Xenstore, and Xenstore can run either in a dedicated
domain or in dom0, I believe a multiple user capable interface would
even need to support multiple users in the same domain, which would be
even more complicated. So I continue to be rather hesitant to add this
additional complexity with only some vague idea of "might come handy in
the future".

> 
>>>> +/*
>>>> + * XEN_CONTROL_OP_get_state_changed_domain
>>>> + *
>>>> + * Get information about a domain having changed state and reset the state
>>>> + * change indicator for that domain. This function is usable only by a domain
>>>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>>>> + *
>>>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>>>> + *
>>>> + * Possible return values:
>>>> + * 0: success
>>>> + * <0 : negative Xen errno value
>>>> + */
>>>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>>>> +struct xen_control_changed_domain {
>>>> +    domid_t domid;
>>>> +    uint16_t state;
>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>>>> +    uint32_t pad1;           /* Returned as 0. */
>>>> +    uint64_t unique_id;      /* Unique domain identifier. */
>>>> +    uint64_t pad2[6];        /* Returned as 0. */
>>>
>>> I think the padding fields have to be zero on input, not just on return.
>>
>> I don't see why this would be needed, as this structure is only ever
>> copied to the caller, so "on input" just doesn't apply here.
>>
>>> Unless you mean to mandate them to be OUT only now and forever. I also
>>
>> The whole struct is OUT only.
> 
> Right now, yes. I wouldn't exclude "pad1" to become a flags field,
> controlling some future behavioral aspect of the operation.

Right now I don't see the need for that, see my reasoning above.

> 
>>> wonder how the trailing padding plays up with the version sub-op: Do we
>>> really need such double precaution?
>>
>> I can remove it.
>>
>>> Also - should we use uint64_aligned_t here?
>>
>> Yes.
> 
> But you realize this isn't straightforward, for the type not being
> available in plain C89 (nor C99)? That's why it's presently used in
> tools-only interfaces only, and the respective header are excluded
> from the "is ANSI compatible" checking (memory.h and hvm/dm_op.h
> have special but imo crude "precautions").

No, I didn't realize that. I just looked how it is used today and
agreed I should follow current usage.

But even with using a stable interface I'm not sure we need to make it
strictly ANSI compatible, as usage of this interface will still be
restricted to tools.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-25 10:12         ` Juergen Gross
@ 2021-11-25 10:19           ` Jan Beulich
  2021-11-25 10:33             ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-11-25 10:19 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel, Andrew Cooper

On 25.11.2021 11:12, Juergen Gross wrote:
> On 25.11.21 10:38, Jan Beulich wrote:
>> On 25.11.2021 07:55, Juergen Gross wrote:
>>> On 22.11.21 16:39, Jan Beulich wrote:
>>>> On 14.09.2021 14:35, Juergen Gross wrote:
>>>>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>>>>        rcu_read_unlock(&domlist_read_lock);
>>>>>    }
>>>>>    
>>>>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>>>>> +{
>>>>> +    unsigned int dom;
>>>>> +    struct domain *d;
>>>>> +
>>>>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>>>>> +            DOMID_FIRST_RESERVED )
>>>>
>>>> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
>>>> is quite puzzling here.
>>>
>>> Okay, will change that.
>>>
>>>>
>>>>> +    {
>>>>> +        d = rcu_lock_domain_by_id(dom);
>>>>> +
>>>>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>> +        {
>>>>> +            info->domid = dom;
>>>>> +            if ( d )
>>>>> +            {
>>>>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>>>>> +                if ( d->is_shut_down )
>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>>>>> +                if ( d->is_dying == DOMDYING_dead )
>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>>>>> +                info->unique_id = d->unique_id;
>>>>> +
>>>>> +                rcu_unlock_domain(d);
>>>>> +            }
>>>>> +
>>>>> +            return 0;
>>>>
>>>> With rapid creation of short lived domains, will the caller ever get to
>>>> see information on higher numbered domains (if, say, it gets "suitably"
>>>> preempted within its own environment)? IOW shouldn't there be a way for
>>>> the caller to specify a domid to start from?
>>>
>>> I'd rather have a local variable for the last reported domid and start
>>> from that.
>>
>> Well, it probably doesn't matter much to have yet one more aspect making
>> this a single-consumer-only interface.
> 
> For making it an interface consumable by multiple users you'd need to
> have a per-consumer data set, which would include the bitmap of changed
> domains and could include the domid last tested.
> 
> As one consumer is Xenstore, and Xenstore can run either in a dedicated
> domain or in dom0, I believe a multiple user capable interface would
> even need to support multiple users in the same domain, which would be
> even more complicated. So I continue to be rather hesitant to add this
> additional complexity with only some vague idea of "might come handy in
> the future".
> 
>>
>>>>> +/*
>>>>> + * XEN_CONTROL_OP_get_state_changed_domain
>>>>> + *
>>>>> + * Get information about a domain having changed state and reset the state
>>>>> + * change indicator for that domain. This function is usable only by a domain
>>>>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>>>>> + *
>>>>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>>>>> + *
>>>>> + * Possible return values:
>>>>> + * 0: success
>>>>> + * <0 : negative Xen errno value
>>>>> + */
>>>>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>>>>> +struct xen_control_changed_domain {
>>>>> +    domid_t domid;
>>>>> +    uint16_t state;
>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>>>>> +    uint32_t pad1;           /* Returned as 0. */
>>>>> +    uint64_t unique_id;      /* Unique domain identifier. */
>>>>> +    uint64_t pad2[6];        /* Returned as 0. */
>>>>
>>>> I think the padding fields have to be zero on input, not just on return.
>>>
>>> I don't see why this would be needed, as this structure is only ever
>>> copied to the caller, so "on input" just doesn't apply here.
>>>
>>>> Unless you mean to mandate them to be OUT only now and forever. I also
>>>
>>> The whole struct is OUT only.
>>
>> Right now, yes. I wouldn't exclude "pad1" to become a flags field,
>> controlling some future behavioral aspect of the operation.
> 
> Right now I don't see the need for that, see my reasoning above.

If your reference is to the single consumer aspect, then I don't see
why that would matter here. Future xenstore may want/need to make
use of such a future flag, yet older xenstore still wouldn't know
about it.

>>>> wonder how the trailing padding plays up with the version sub-op: Do we
>>>> really need such double precaution?
>>>
>>> I can remove it.
>>>
>>>> Also - should we use uint64_aligned_t here?
>>>
>>> Yes.
>>
>> But you realize this isn't straightforward, for the type not being
>> available in plain C89 (nor C99)? That's why it's presently used in
>> tools-only interfaces only, and the respective header are excluded
>> from the "is ANSI compatible" checking (memory.h and hvm/dm_op.h
>> have special but imo crude "precautions").
> 
> No, I didn't realize that. I just looked how it is used today and
> agreed I should follow current usage.
> 
> But even with using a stable interface I'm not sure we need to make it
> strictly ANSI compatible, as usage of this interface will still be
> restricted to tools.

True. Problem is that our present __XEN_TOOLS__ guards have effectively
dual meaning - "tools only" and "unstable". We merely need to be sure
everyone understands that this is changing. Perhaps when you add such a
guard here, it may want accompanying by a respective comment.

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-25 10:19           ` Jan Beulich
@ 2021-11-25 10:33             ` Juergen Gross
  2021-11-25 10:51               ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2021-11-25 10:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel, Andrew Cooper


[-- Attachment #1.1.1: Type: text/plain, Size: 6272 bytes --]

On 25.11.21 11:19, Jan Beulich wrote:
> On 25.11.2021 11:12, Juergen Gross wrote:
>> On 25.11.21 10:38, Jan Beulich wrote:
>>> On 25.11.2021 07:55, Juergen Gross wrote:
>>>> On 22.11.21 16:39, Jan Beulich wrote:
>>>>> On 14.09.2021 14:35, Juergen Gross wrote:
>>>>>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>>>>>         rcu_read_unlock(&domlist_read_lock);
>>>>>>     }
>>>>>>     
>>>>>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>>>>>> +{
>>>>>> +    unsigned int dom;
>>>>>> +    struct domain *d;
>>>>>> +
>>>>>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>>>>>> +            DOMID_FIRST_RESERVED )
>>>>>
>>>>> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
>>>>> is quite puzzling here.
>>>>
>>>> Okay, will change that.
>>>>
>>>>>
>>>>>> +    {
>>>>>> +        d = rcu_lock_domain_by_id(dom);
>>>>>> +
>>>>>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>>> +        {
>>>>>> +            info->domid = dom;
>>>>>> +            if ( d )
>>>>>> +            {
>>>>>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>>>>>> +                if ( d->is_shut_down )
>>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>>>>>> +                if ( d->is_dying == DOMDYING_dead )
>>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>>>>>> +                info->unique_id = d->unique_id;
>>>>>> +
>>>>>> +                rcu_unlock_domain(d);
>>>>>> +            }
>>>>>> +
>>>>>> +            return 0;
>>>>>
>>>>> With rapid creation of short lived domains, will the caller ever get to
>>>>> see information on higher numbered domains (if, say, it gets "suitably"
>>>>> preempted within its own environment)? IOW shouldn't there be a way for
>>>>> the caller to specify a domid to start from?
>>>>
>>>> I'd rather have a local variable for the last reported domid and start
>>>> from that.
>>>
>>> Well, it probably doesn't matter much to have yet one more aspect making
>>> this a single-consumer-only interface.
>>
>> For making it an interface consumable by multiple users you'd need to
>> have a per-consumer data set, which would include the bitmap of changed
>> domains and could include the domid last tested.
>>
>> As one consumer is Xenstore, and Xenstore can run either in a dedicated
>> domain or in dom0, I believe a multiple user capable interface would
>> even need to support multiple users in the same domain, which would be
>> even more complicated. So I continue to be rather hesitant to add this
>> additional complexity with only some vague idea of "might come handy in
>> the future".
>>
>>>
>>>>>> +/*
>>>>>> + * XEN_CONTROL_OP_get_state_changed_domain
>>>>>> + *
>>>>>> + * Get information about a domain having changed state and reset the state
>>>>>> + * change indicator for that domain. This function is usable only by a domain
>>>>>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>>>>>> + *
>>>>>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>>>>>> + *
>>>>>> + * Possible return values:
>>>>>> + * 0: success
>>>>>> + * <0 : negative Xen errno value
>>>>>> + */
>>>>>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>>>>>> +struct xen_control_changed_domain {
>>>>>> +    domid_t domid;
>>>>>> +    uint16_t state;
>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>>>>>> +    uint32_t pad1;           /* Returned as 0. */
>>>>>> +    uint64_t unique_id;      /* Unique domain identifier. */
>>>>>> +    uint64_t pad2[6];        /* Returned as 0. */
>>>>>
>>>>> I think the padding fields have to be zero on input, not just on return.
>>>>
>>>> I don't see why this would be needed, as this structure is only ever
>>>> copied to the caller, so "on input" just doesn't apply here.
>>>>
>>>>> Unless you mean to mandate them to be OUT only now and forever. I also
>>>>
>>>> The whole struct is OUT only.
>>>
>>> Right now, yes. I wouldn't exclude "pad1" to become a flags field,
>>> controlling some future behavioral aspect of the operation.
>>
>> Right now I don't see the need for that, see my reasoning above.
> 
> If your reference is to the single consumer aspect, then I don't see
> why that would matter here. Future xenstore may want/need to make
> use of such a future flag, yet older xenstore still wouldn't know
> about it.

I'm not sure it is a good idea to mix IN and OUT fields in such a struct
which is meant to return information only.

I'd rather add a new sub-op in this case taking another parameter for
specifying additional options or a struct prepending the needed IN
fields to above struct.

> 
>>>>> wonder how the trailing padding plays up with the version sub-op: Do we
>>>>> really need such double precaution?
>>>>
>>>> I can remove it.
>>>>
>>>>> Also - should we use uint64_aligned_t here?
>>>>
>>>> Yes.
>>>
>>> But you realize this isn't straightforward, for the type not being
>>> available in plain C89 (nor C99)? That's why it's presently used in
>>> tools-only interfaces only, and the respective header are excluded
>>> from the "is ANSI compatible" checking (memory.h and hvm/dm_op.h
>>> have special but imo crude "precautions").
>>
>> No, I didn't realize that. I just looked how it is used today and
>> agreed I should follow current usage.
>>
>> But even with using a stable interface I'm not sure we need to make it
>> strictly ANSI compatible, as usage of this interface will still be
>> restricted to tools.
> 
> True. Problem is that our present __XEN_TOOLS__ guards have effectively
> dual meaning - "tools only" and "unstable". We merely need to be sure
> everyone understands that this is changing. Perhaps when you add such a
> guard here, it may want accompanying by a respective comment.

I'd be fine with that.

Maybe we want a new guard "__XEN_INTERNAL__" for that (new) purpose?


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-25 10:33             ` Juergen Gross
@ 2021-11-25 10:51               ` Jan Beulich
  2021-11-25 11:00                 ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2021-11-25 10:51 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel, Andrew Cooper

On 25.11.2021 11:33, Juergen Gross wrote:
> On 25.11.21 11:19, Jan Beulich wrote:
>> On 25.11.2021 11:12, Juergen Gross wrote:
>>> On 25.11.21 10:38, Jan Beulich wrote:
>>>> On 25.11.2021 07:55, Juergen Gross wrote:
>>>>> On 22.11.21 16:39, Jan Beulich wrote:
>>>>>> On 14.09.2021 14:35, Juergen Gross wrote:
>>>>>>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>>>>>>         rcu_read_unlock(&domlist_read_lock);
>>>>>>>     }
>>>>>>>     
>>>>>>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>>>>>>> +{
>>>>>>> +    unsigned int dom;
>>>>>>> +    struct domain *d;
>>>>>>> +
>>>>>>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>>>>>>> +            DOMID_FIRST_RESERVED )
>>>>>>
>>>>>> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
>>>>>> is quite puzzling here.
>>>>>
>>>>> Okay, will change that.
>>>>>
>>>>>>
>>>>>>> +    {
>>>>>>> +        d = rcu_lock_domain_by_id(dom);
>>>>>>> +
>>>>>>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>>>> +        {
>>>>>>> +            info->domid = dom;
>>>>>>> +            if ( d )
>>>>>>> +            {
>>>>>>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>>>>>>> +                if ( d->is_shut_down )
>>>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>>>>>>> +                if ( d->is_dying == DOMDYING_dead )
>>>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>>>>>>> +                info->unique_id = d->unique_id;
>>>>>>> +
>>>>>>> +                rcu_unlock_domain(d);
>>>>>>> +            }
>>>>>>> +
>>>>>>> +            return 0;
>>>>>>
>>>>>> With rapid creation of short lived domains, will the caller ever get to
>>>>>> see information on higher numbered domains (if, say, it gets "suitably"
>>>>>> preempted within its own environment)? IOW shouldn't there be a way for
>>>>>> the caller to specify a domid to start from?
>>>>>
>>>>> I'd rather have a local variable for the last reported domid and start
>>>>> from that.
>>>>
>>>> Well, it probably doesn't matter much to have yet one more aspect making
>>>> this a single-consumer-only interface.
>>>
>>> For making it an interface consumable by multiple users you'd need to
>>> have a per-consumer data set, which would include the bitmap of changed
>>> domains and could include the domid last tested.
>>>
>>> As one consumer is Xenstore, and Xenstore can run either in a dedicated
>>> domain or in dom0, I believe a multiple user capable interface would
>>> even need to support multiple users in the same domain, which would be
>>> even more complicated. So I continue to be rather hesitant to add this
>>> additional complexity with only some vague idea of "might come handy in
>>> the future".
>>>
>>>>
>>>>>>> +/*
>>>>>>> + * XEN_CONTROL_OP_get_state_changed_domain
>>>>>>> + *
>>>>>>> + * Get information about a domain having changed state and reset the state
>>>>>>> + * change indicator for that domain. This function is usable only by a domain
>>>>>>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>>>>>>> + *
>>>>>>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>>>>>>> + *
>>>>>>> + * Possible return values:
>>>>>>> + * 0: success
>>>>>>> + * <0 : negative Xen errno value
>>>>>>> + */
>>>>>>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>>>>>>> +struct xen_control_changed_domain {
>>>>>>> +    domid_t domid;
>>>>>>> +    uint16_t state;
>>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>>>>>>> +    uint32_t pad1;           /* Returned as 0. */
>>>>>>> +    uint64_t unique_id;      /* Unique domain identifier. */
>>>>>>> +    uint64_t pad2[6];        /* Returned as 0. */
>>>>>>
>>>>>> I think the padding fields have to be zero on input, not just on return.
>>>>>
>>>>> I don't see why this would be needed, as this structure is only ever
>>>>> copied to the caller, so "on input" just doesn't apply here.
>>>>>
>>>>>> Unless you mean to mandate them to be OUT only now and forever. I also
>>>>>
>>>>> The whole struct is OUT only.
>>>>
>>>> Right now, yes. I wouldn't exclude "pad1" to become a flags field,
>>>> controlling some future behavioral aspect of the operation.
>>>
>>> Right now I don't see the need for that, see my reasoning above.
>>
>> If your reference is to the single consumer aspect, then I don't see
>> why that would matter here. Future xenstore may want/need to make
>> use of such a future flag, yet older xenstore still wouldn't know
>> about it.
> 
> I'm not sure it is a good idea to mix IN and OUT fields in such a struct
> which is meant to return information only.
> 
> I'd rather add a new sub-op in this case taking another parameter for
> specifying additional options or a struct prepending the needed IN
> fields to above struct.

Well, okay. May ask for a /* OUT */ comment then ahead of the first of
the struct fields?

>>>>>> wonder how the trailing padding plays up with the version sub-op: Do we
>>>>>> really need such double precaution?
>>>>>
>>>>> I can remove it.
>>>>>
>>>>>> Also - should we use uint64_aligned_t here?
>>>>>
>>>>> Yes.
>>>>
>>>> But you realize this isn't straightforward, for the type not being
>>>> available in plain C89 (nor C99)? That's why it's presently used in
>>>> tools-only interfaces only, and the respective header are excluded
>>>> from the "is ANSI compatible" checking (memory.h and hvm/dm_op.h
>>>> have special but imo crude "precautions").
>>>
>>> No, I didn't realize that. I just looked how it is used today and
>>> agreed I should follow current usage.
>>>
>>> But even with using a stable interface I'm not sure we need to make it
>>> strictly ANSI compatible, as usage of this interface will still be
>>> restricted to tools.
>>
>> True. Problem is that our present __XEN_TOOLS__ guards have effectively
>> dual meaning - "tools only" and "unstable". We merely need to be sure
>> everyone understands that this is changing. Perhaps when you add such a
>> guard here, it may want accompanying by a respective comment.
> 
> I'd be fine with that.
> 
> Maybe we want a new guard "__XEN_INTERNAL__" for that (new) purpose?

Not sure - this may result in undesirable code churn elsewhere.

Jan



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH RFC 3/4] xen: add new stable control hypercall
  2021-11-25 10:51               ` Jan Beulich
@ 2021-11-25 11:00                 ` Juergen Gross
  0 siblings, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2021-11-25 11:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Daniel De Graaf, Daniel P. Smith, Ian Jackson, Wei Liu,
	George Dunlap, Julien Grall, Stefano Stabellini,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel, Andrew Cooper


[-- Attachment #1.1.1: Type: text/plain, Size: 6878 bytes --]

On 25.11.21 11:51, Jan Beulich wrote:
> On 25.11.2021 11:33, Juergen Gross wrote:
>> On 25.11.21 11:19, Jan Beulich wrote:
>>> On 25.11.2021 11:12, Juergen Gross wrote:
>>>> On 25.11.21 10:38, Jan Beulich wrote:
>>>>> On 25.11.2021 07:55, Juergen Gross wrote:
>>>>>> On 22.11.21 16:39, Jan Beulich wrote:
>>>>>>> On 14.09.2021 14:35, Juergen Gross wrote:
>>>>>>>> @@ -103,6 +104,43 @@ void domain_reset_states(void)
>>>>>>>>          rcu_read_unlock(&domlist_read_lock);
>>>>>>>>      }
>>>>>>>>      
>>>>>>>> +int domain_get_dom_state_changed(struct xen_control_changed_domain *info)
>>>>>>>> +{
>>>>>>>> +    unsigned int dom;
>>>>>>>> +    struct domain *d;
>>>>>>>> +
>>>>>>>> +    while ( (dom = find_first_bit(dom_state_changed, DOMID_MASK + 1)) <
>>>>>>>> +            DOMID_FIRST_RESERVED )
>>>>>>>
>>>>>>> As per my comment on the earlier patch - the use of DOMID_MASK + 1 vs
>>>>>>> is quite puzzling here.
>>>>>>
>>>>>> Okay, will change that.
>>>>>>
>>>>>>>
>>>>>>>> +    {
>>>>>>>> +        d = rcu_lock_domain_by_id(dom);
>>>>>>>> +
>>>>>>>> +        if ( test_and_clear_bit(dom, dom_state_changed) )
>>>>>>>> +        {
>>>>>>>> +            info->domid = dom;
>>>>>>>> +            if ( d )
>>>>>>>> +            {
>>>>>>>> +                info->state = XEN_CONTROL_CHANGEDDOM_STATE_EXIST;
>>>>>>>> +                if ( d->is_shut_down )
>>>>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN;
>>>>>>>> +                if ( d->is_dying == DOMDYING_dead )
>>>>>>>> +                    info->state |= XEN_CONTROL_CHANGEDDOM_STATE_DYING;
>>>>>>>> +                info->unique_id = d->unique_id;
>>>>>>>> +
>>>>>>>> +                rcu_unlock_domain(d);
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>> +            return 0;
>>>>>>>
>>>>>>> With rapid creation of short lived domains, will the caller ever get to
>>>>>>> see information on higher numbered domains (if, say, it gets "suitably"
>>>>>>> preempted within its own environment)? IOW shouldn't there be a way for
>>>>>>> the caller to specify a domid to start from?
>>>>>>
>>>>>> I'd rather have a local variable for the last reported domid and start
>>>>>> from that.
>>>>>
>>>>> Well, it probably doesn't matter much to have yet one more aspect making
>>>>> this a single-consumer-only interface.
>>>>
>>>> For making it an interface consumable by multiple users you'd need to
>>>> have a per-consumer data set, which would include the bitmap of changed
>>>> domains and could include the domid last tested.
>>>>
>>>> As one consumer is Xenstore, and Xenstore can run either in a dedicated
>>>> domain or in dom0, I believe a multiple user capable interface would
>>>> even need to support multiple users in the same domain, which would be
>>>> even more complicated. So I continue to be rather hesitant to add this
>>>> additional complexity with only some vague idea of "might come handy in
>>>> the future".
>>>>
>>>>>
>>>>>>>> +/*
>>>>>>>> + * XEN_CONTROL_OP_get_state_changed_domain
>>>>>>>> + *
>>>>>>>> + * Get information about a domain having changed state and reset the state
>>>>>>>> + * change indicator for that domain. This function is usable only by a domain
>>>>>>>> + * having registered the VIRQ_DOM_EXC event (normally Xenstore).
>>>>>>>> + *
>>>>>>>> + * arg: XEN_GUEST_HANDLE(struct xen_control_changed_domain)
>>>>>>>> + *
>>>>>>>> + * Possible return values:
>>>>>>>> + * 0: success
>>>>>>>> + * <0 : negative Xen errno value
>>>>>>>> + */
>>>>>>>> +#define XEN_CONTROL_OP_get_state_changed_domain     1
>>>>>>>> +struct xen_control_changed_domain {
>>>>>>>> +    domid_t domid;
>>>>>>>> +    uint16_t state;
>>>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_EXIST     0x0001  /* Domain is existing. */
>>>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_SHUTDOWN  0x0002  /* Shutdown finished. */
>>>>>>>> +#define XEN_CONTROL_CHANGEDDOM_STATE_DYING     0x0004  /* Domain dying. */
>>>>>>>> +    uint32_t pad1;           /* Returned as 0. */
>>>>>>>> +    uint64_t unique_id;      /* Unique domain identifier. */
>>>>>>>> +    uint64_t pad2[6];        /* Returned as 0. */
>>>>>>>
>>>>>>> I think the padding fields have to be zero on input, not just on return.
>>>>>>
>>>>>> I don't see why this would be needed, as this structure is only ever
>>>>>> copied to the caller, so "on input" just doesn't apply here.
>>>>>>
>>>>>>> Unless you mean to mandate them to be OUT only now and forever. I also
>>>>>>
>>>>>> The whole struct is OUT only.
>>>>>
>>>>> Right now, yes. I wouldn't exclude "pad1" to become a flags field,
>>>>> controlling some future behavioral aspect of the operation.
>>>>
>>>> Right now I don't see the need for that, see my reasoning above.
>>>
>>> If your reference is to the single consumer aspect, then I don't see
>>> why that would matter here. Future xenstore may want/need to make
>>> use of such a future flag, yet older xenstore still wouldn't know
>>> about it.
>>
>> I'm not sure it is a good idea to mix IN and OUT fields in such a struct
>> which is meant to return information only.
>>
>> I'd rather add a new sub-op in this case taking another parameter for
>> specifying additional options or a struct prepending the needed IN
>> fields to above struct.
> 
> Well, okay. May ask for a /* OUT */ comment then ahead of the first of
> the struct fields?

Yes, will add it.

> 
>>>>>>> wonder how the trailing padding plays up with the version sub-op: Do we
>>>>>>> really need such double precaution?
>>>>>>
>>>>>> I can remove it.
>>>>>>
>>>>>>> Also - should we use uint64_aligned_t here?
>>>>>>
>>>>>> Yes.
>>>>>
>>>>> But you realize this isn't straightforward, for the type not being
>>>>> available in plain C89 (nor C99)? That's why it's presently used in
>>>>> tools-only interfaces only, and the respective header are excluded
>>>>> from the "is ANSI compatible" checking (memory.h and hvm/dm_op.h
>>>>> have special but imo crude "precautions").
>>>>
>>>> No, I didn't realize that. I just looked how it is used today and
>>>> agreed I should follow current usage.
>>>>
>>>> But even with using a stable interface I'm not sure we need to make it
>>>> strictly ANSI compatible, as usage of this interface will still be
>>>> restricted to tools.
>>>
>>> True. Problem is that our present __XEN_TOOLS__ guards have effectively
>>> dual meaning - "tools only" and "unstable". We merely need to be sure
>>> everyone understands that this is changing. Perhaps when you add such a
>>> guard here, it may want accompanying by a respective comment.
>>
>> I'd be fine with that.
>>
>> Maybe we want a new guard "__XEN_INTERNAL__" for that (new) purpose?
> 
> Not sure - this may result in undesirable code churn elsewhere.

I won't insist on it. :-)


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2021-11-25 11:01 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-14 12:35 [PATCH RFC 0/4] remove libxenctrl usage from xenstored Juergen Gross
2021-09-14 12:35 ` [PATCH RFC 1/4] xen: add a domain unique id to each domain Juergen Gross
2021-11-22 10:01   ` Jan Beulich
2021-11-22 11:42   ` Julien Grall
2021-11-22 12:48     ` Juergen Gross
2021-11-22 14:10       ` Julien Grall
2021-11-22 14:29         ` Juergen Gross
2021-11-22 14:37           ` Julien Grall
2021-11-25  6:32             ` Juergen Gross
2021-09-14 12:35 ` [PATCH RFC 2/4] xen: add bitmap to indicate per-domain state changes Juergen Gross
2021-09-23  9:16   ` Julien Grall
2021-11-22 10:41   ` Jan Beulich
2021-11-22 12:46     ` Juergen Gross
2021-11-22 13:39       ` Jan Beulich
2021-11-25  6:30         ` Juergen Gross
2021-09-14 12:35 ` [PATCH RFC 3/4] xen: add new stable control hypercall Juergen Gross
2021-11-22 15:39   ` Jan Beulich
2021-11-25  6:55     ` Juergen Gross
2021-11-25  9:38       ` Jan Beulich
2021-11-25 10:12         ` Juergen Gross
2021-11-25 10:19           ` Jan Beulich
2021-11-25 10:33             ` Juergen Gross
2021-11-25 10:51               ` Jan Beulich
2021-11-25 11:00                 ` Juergen Gross
2021-09-14 12:36 ` [PATCH RFC 4/4] tools/xenstore: use new stable interface instead of libxenctrl Juergen Gross

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.