All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/5] dom0less PV drivers
@ 2022-01-28 21:32 Stefano Stabellini
  2022-01-28 21:33 ` [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property Stefano Stabellini
                   ` (4 more replies)
  0 siblings, 5 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-01-28 21:32 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk

Hi all,

Currently dom0less guests cannot use PV drivers because they don't have
access to xenstore. Also, the hypervisor node in device tree is missing
so they don't detect that they are running on Xen (thus, they don't try
to enable PV interfaces.)

This patch series enables dom0less guests (on ARM) to use PV drivers.

Instead of initializing xenstore immediately at boot, dom0less guests
get access to xenstore later. They delay the initialization until they
receive a notification via the xenstore event channel (which is
available at boot.)

An example workflow is as follows:
- all domains start in parallel, dom0less guests are immediately running
- when dom0 is up and running, the init-dom0less application is called
- dom0less guests receive the notification and initialize xenstore
- now xl network-attach/disk-attach works as expected for dom0less domUs

The patch series introduces a new dom0less device tree option
"xen,enhanced" (in the Xen device tree) to specify whether PV interfaces
should be enabled/disabled for the dom0less guest.

Cheers,

Stefano


Luca Miccio (4):
      xen: make evtchn_alloc_unbound public
      xen/arm: configure dom0less domain for enabling xenstore after boot
      xenstored: send an evtchn notification on introduce_domain
      tools: add example application to initialize dom0less PV drivers

Stefano Stabellini (1):
      xen: introduce xen,enhanced dom0less property

 docs/misc/arm/device-tree/booting.txt |  18 +++
 tools/helpers/Makefile                |  13 ++
 tools/helpers/init-dom0less.c         | 269 ++++++++++++++++++++++++++++++++++
 tools/xenstore/xenstored_domain.c     |   3 +
 xen/arch/arm/domain_build.c           |  49 +++++++
 xen/arch/arm/include/asm/kernel.h     |   3 +
 xen/common/event_channel.c            |  13 +-
 xen/include/xen/event.h               |   3 +
 8 files changed, 366 insertions(+), 5 deletions(-)
 create mode 100644 tools/helpers/init-dom0less.c


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property
  2022-01-28 21:32 [PATCH v3 0/5] dom0less PV drivers Stefano Stabellini
@ 2022-01-28 21:33 ` Stefano Stabellini
  2022-01-29 17:58   ` Julien Grall
  2022-01-28 21:33 ` [PATCH v3 2/5] xen: make evtchn_alloc_unbound public Stefano Stabellini
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-01-28 21:33 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Stefano Stabellini, Bertrand Marquis

From: Stefano Stabellini <stefano.stabellini@xilinx.com>

Introduce a new "xen,enhanced" dom0less property to enable/disable PV
driver interfaces for dom0less guests. Currently only "enabled" and
"disabled" are supported property values (and empty). Leave the option
open to implement further possible values in the future (e.g.
"xenstore" to enable only xenstore.)

The configurable option is for domUs only. For dom0 we always set the
corresponding property in the Xen code to true (PV interfaces enabled.)

This patch only parses the property. Next patches will make use of it.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes in v3:
- improve commit message

Changes in v2:
- rename kinfo.enhanced to kinfo.dom0less_enhanced
- set kinfo.dom0less_enhanced to true for dom0
- handle -ENODATA in addition to -EILSEQ
---
 docs/misc/arm/device-tree/booting.txt | 18 ++++++++++++++++++
 xen/arch/arm/domain_build.c           |  8 ++++++++
 xen/arch/arm/include/asm/kernel.h     |  3 +++
 3 files changed, 29 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index 71895663a4..38c29fb3d8 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -169,6 +169,24 @@ with the following properties:
     Please note that the SPI used for the virtual pl011 could clash with the
     physical SPI of a physical device assigned to the guest.
 
+- xen,enhanced
+
+    A string property. Possible property values are:
+
+    - "enabled" (or missing property value)
+    Xen PV interfaces, including grant-table and xenstore, will be
+    enabled for the VM.
+
+    - "disabled"
+    Xen PV interfaces are disabled.
+
+    If the xen,enhanced property is present with no value, it defaults
+    to "enabled". If the xen,enhanced property is not present, PV
+    interfaces are disabled.
+
+    In the future other possible property values might be added to
+    enable only selected interfaces.
+
 - nr_spis
 
     Optional. A 32-bit integer specifying the number of SPIs (Shared
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 6931c022a2..9144d6c0b6 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2963,6 +2963,7 @@ static int __init construct_domU(struct domain *d,
                                  const struct dt_device_node *node)
 {
     struct kernel_info kinfo = {};
+    const char *dom0less_enhanced;
     int rc;
     u64 mem;
 
@@ -2978,6 +2979,12 @@ static int __init construct_domU(struct domain *d,
 
     kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
 
+    rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
+    if ( rc == -EILSEQ ||
+         rc == -ENODATA ||
+         (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
+        kinfo.dom0less_enhanced = true;
+
     if ( vcpu_create(d, 0) == NULL )
         return -ENOMEM;
 
@@ -3095,6 +3102,7 @@ static int __init construct_dom0(struct domain *d)
 
     kinfo.unassigned_mem = dom0_mem;
     kinfo.d = d;
+    kinfo.dom0less_enhanced = true;
 
     rc = kernel_probe(&kinfo, NULL);
     if ( rc < 0 )
diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h
index 874aa108a7..c4dc039b54 100644
--- a/xen/arch/arm/include/asm/kernel.h
+++ b/xen/arch/arm/include/asm/kernel.h
@@ -36,6 +36,9 @@ struct kernel_info {
     /* Enable pl011 emulation */
     bool vpl011;
 
+    /* Enable PV drivers */
+    bool dom0less_enhanced;
+
     /* GIC phandle */
     uint32_t phandle_gic;
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-01-28 21:32 [PATCH v3 0/5] dom0less PV drivers Stefano Stabellini
  2022-01-28 21:33 ` [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property Stefano Stabellini
@ 2022-01-28 21:33 ` Stefano Stabellini
  2022-01-31  9:32   ` Jan Beulich
  2022-03-15 14:10   ` Daniel P. Smith
  2022-01-28 21:33 ` [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot Stefano Stabellini
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-01-28 21:33 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Bertrand Marquis, Andrew Cooper,
	George Dunlap, Jan Beulich, Wei Liu

From: Luca Miccio <lucmiccio@gmail.com>

The xenstore event channel will be allocated for dom0less domains. It is
necessary to have access to the evtchn_alloc_unbound function to do
that, so make evtchn_alloc_unbound public.

Add a skip_xsm parameter to allow disabling the XSM check in
evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
originated from Xen before running any domains.)

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Wei Liu <wl@xen.org>
---
Changes v3:
- expose evtchn_alloc_unbound, assing a skip_xsm parameter
---
 xen/common/event_channel.c | 13 ++++++++-----
 xen/include/xen/event.h    |  3 +++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index da88ad141a..be57d00a15 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
     xsm_evtchn_close_post(chn);
 }
 
-static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
 {
     struct evtchn *chn;
     struct domain *d;
@@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
         ERROR_EXIT_DOM(port, d);
     chn = evtchn_from_port(d, port);
 
-    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
-    if ( rc )
-        goto out;
+    if ( !skip_xsm )
+    {
+        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
+        if ( rc )
+            goto out;
+    }
 
     evtchn_write_lock(chn);
 
@@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         struct evtchn_alloc_unbound alloc_unbound;
         if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 )
             return -EFAULT;
-        rc = evtchn_alloc_unbound(&alloc_unbound);
+        rc = evtchn_alloc_unbound(&alloc_unbound, false);
         if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) )
             rc = -EFAULT; /* Cleaning up here would be a mess! */
         break;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 21c95e14fd..0a2cdedf7d 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest);
 /* Free an event channel. */
 void evtchn_free(struct domain *d, struct evtchn *chn);
 
+/* Create a new event channel port */
+int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm);
+
 /* Allocate a specific event channel port. */
 int evtchn_allocate_port(struct domain *d, unsigned int port);
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot
  2022-01-28 21:32 [PATCH v3 0/5] dom0less PV drivers Stefano Stabellini
  2022-01-28 21:33 ` [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property Stefano Stabellini
  2022-01-28 21:33 ` [PATCH v3 2/5] xen: make evtchn_alloc_unbound public Stefano Stabellini
@ 2022-01-28 21:33 ` Stefano Stabellini
  2022-01-29 18:13   ` Julien Grall
  2022-01-28 21:33 ` [PATCH v3 4/5] xenstored: send an evtchn notification on introduce_domain Stefano Stabellini
  2022-01-28 21:33 ` [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers Stefano Stabellini
  4 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-01-28 21:33 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Bertrand Marquis

From: Luca Miccio <lucmiccio@gmail.com>

If "xen,enhanced" is enabled, then add to dom0less domains:

- the hypervisor node in device tree
- the xenstore event channel

The xenstore event channel is also used for the first notification to
let the guest know that xenstore has become available.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
CC: Julien Grall <julien@xen.org>
CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
CC: Bertrand Marquis <bertrand.marquis@arm.com>

---
Changes in v3:
- use evtchn_alloc_unbound

Changes in v2:
- set HVM_PARAM_STORE_PFN to ~0ULL at domain creation
- in alloc_xenstore_evtchn do not call _evtchn_alloc_unbound
---
 xen/arch/arm/domain_build.c | 41 +++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9144d6c0b6..8e030a7f05 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -27,6 +27,7 @@
 #include <asm/setup.h>
 #include <asm/cpufeature.h>
 #include <asm/domain_build.h>
+#include <xen/event.h>
 
 #include <xen/irq.h>
 #include <xen/grant_table.h>
@@ -2619,6 +2620,8 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
     int ret;
 
     kinfo->phandle_gic = GUEST_PHANDLE_GIC;
+    kinfo->gnttab_start = GUEST_GNTTAB_BASE;
+    kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
 
     addrcells = GUEST_ROOT_ADDRESS_CELLS;
     sizecells = GUEST_ROOT_SIZE_CELLS;
@@ -2693,6 +2696,13 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
             goto err;
     }
 
+    if ( kinfo->dom0less_enhanced )
+    {
+        ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
+        if ( ret )
+            goto err;
+    }
+
     ret = fdt_end_node(kinfo->fdt);
     if ( ret < 0 )
         goto err;
@@ -2959,6 +2969,25 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
     return 0;
 }
 
+static int __init alloc_xenstore_evtchn(struct domain *d)
+{
+    evtchn_alloc_unbound_t alloc;
+    int rc;
+
+    alloc.dom = d->domain_id;
+    alloc.remote_dom = hardware_domain->domain_id;
+    rc = evtchn_alloc_unbound(&alloc, true);
+    if ( rc )
+    {
+        printk("Failed allocating event channel for domain\n");
+        return rc;
+    }
+
+    d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
+
+    return 0;
+}
+
 static int __init construct_domU(struct domain *d,
                                  const struct dt_device_node *node)
 {
@@ -3014,7 +3043,19 @@ static int __init construct_domU(struct domain *d,
         return rc;
 
     if ( kinfo.vpl011 )
+    {
         rc = domain_vpl011_init(d, NULL);
+        if ( rc < 0 )
+            return rc;
+    }
+
+    if ( kinfo.dom0less_enhanced )
+    {
+        rc = alloc_xenstore_evtchn(d);
+        if ( rc < 0 )
+            return rc;
+        d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
+    }
 
     return rc;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 4/5] xenstored: send an evtchn notification on introduce_domain
  2022-01-28 21:32 [PATCH v3 0/5] dom0less PV drivers Stefano Stabellini
                   ` (2 preceding siblings ...)
  2022-01-28 21:33 ` [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot Stefano Stabellini
@ 2022-01-28 21:33 ` Stefano Stabellini
  2022-01-28 21:33 ` [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers Stefano Stabellini
  4 siblings, 0 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-01-28 21:33 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Bertrand Marquis

From: Luca Miccio <lucmiccio@gmail.com>

When xs_introduce_domain is called, send out a notification on the
xenstore event channel so that any (dom0less) domain waiting for the
xenstore interface to be ready can continue with the initialization.

The extra notification is harmless for domains that don't require it.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes in v2:
- drop the new late_init parameter
---
 tools/xenstore/xenstored_domain.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/xenstore/xenstored_domain.c b/tools/xenstore/xenstored_domain.c
index d03c7d93a9..7487129e47 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -461,6 +461,9 @@ static struct domain *introduce_domain(const void *ctx,
 		/* Now domain belongs to its connection. */
 		talloc_steal(domain->conn, domain);
 
+		/* Notify the domain that xenstore is available */
+		xenevtchn_notify(xce_handle, domain->port);
+
 		if (!is_master_domain && !restore)
 			fire_watches(NULL, ctx, "@introduceDomain", NULL,
 				     false, NULL);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-01-28 21:32 [PATCH v3 0/5] dom0less PV drivers Stefano Stabellini
                   ` (3 preceding siblings ...)
  2022-01-28 21:33 ` [PATCH v3 4/5] xenstored: send an evtchn notification on introduce_domain Stefano Stabellini
@ 2022-01-28 21:33 ` Stefano Stabellini
  2022-01-29 19:35   ` Julien Grall
  4 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-01-28 21:33 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Wei Liu, Anthony PERARD

From: Luca Miccio <lucmiccio@gmail.com>

Add an example application that can be run in dom0 to complete the
dom0less domains initialization so that they can get access to xenstore
and use PV drivers.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Juergen Gross <jgross@suse.com>
---
Changes in v3:
- handle xenstore errors
- add an in-code comment about xenstore entries
- less verbose output
- clean-up error path in main

Changes in v2:
- do not set HVM_PARAM_STORE_EVTCHN twice
- rename restore_xenstore to create_xenstore
- increase maxmem
---
 tools/helpers/Makefile        |  13 ++
 tools/helpers/init-dom0less.c | 269 ++++++++++++++++++++++++++++++++++
 2 files changed, 282 insertions(+)
 create mode 100644 tools/helpers/init-dom0less.c

diff --git a/tools/helpers/Makefile b/tools/helpers/Makefile
index 7f6c422440..8e42997052 100644
--- a/tools/helpers/Makefile
+++ b/tools/helpers/Makefile
@@ -10,6 +10,9 @@ ifeq ($(CONFIG_Linux),y)
 ifeq ($(CONFIG_X86),y)
 PROGS += init-xenstore-domain
 endif
+ifeq ($(CONFIG_ARM),y)
+PROGS += init-dom0less
+endif
 endif
 
 XEN_INIT_DOM0_OBJS = xen-init-dom0.o init-dom-json.o
@@ -26,6 +29,13 @@ $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenstore)
 $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenlight)
 $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += -include $(XEN_ROOT)/tools/config.h
 
+INIT_DOM0LESS_OBJS = init-dom0less.o init-dom-json.o
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenstore)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenlight)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenctrl)
+$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenevtchn)
+
 .PHONY: all
 all: $(PROGS)
 
@@ -35,6 +45,9 @@ xen-init-dom0: $(XEN_INIT_DOM0_OBJS)
 init-xenstore-domain: $(INIT_XENSTORE_DOMAIN_OBJS)
 	$(CC) $(LDFLAGS) -o $@ $(INIT_XENSTORE_DOMAIN_OBJS) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenlight) $(APPEND_LDFLAGS)
 
+init-dom0less: $(INIT_DOM0LESS_OBJS)
+	$(CC) $(LDFLAGS) -o $@ $(INIT_DOM0LESS_OBJS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenevtchn) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenlight) $(LDLIBS_libxenguest)  $(APPEND_LDFLAGS)
+
 .PHONY: install
 install: all
 	$(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)
diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
new file mode 100644
index 0000000000..b6a3831cb5
--- /dev/null
+++ b/tools/helpers/init-dom0less.c
@@ -0,0 +1,269 @@
+#include <stdbool.h>
+#include <syslog.h>
+#include <stdio.h>
+#include <err.h>
+#include <stdlib.h>
+#include <xenstore.h>
+#include <xenctrl.h>
+#include <xenguest.h>
+#include <libxl.h>
+#include <xenevtchn.h>
+
+#include "init-dom-json.h"
+
+#define NR_MAGIC_PAGES 4
+#define CONSOLE_PFN_OFFSET 0
+#define XENSTORE_PFN_OFFSET 1
+#define STR_MAX_LENGTH 64
+
+static int alloc_magic_pages(libxl_dominfo *info, struct xc_dom_image *dom)
+{
+    int rc, i;
+    const xen_pfn_t base = GUEST_MAGIC_BASE >> XC_PAGE_SHIFT;
+    xen_pfn_t p2m[NR_MAGIC_PAGES];
+
+    rc = xc_domain_setmaxmem(dom->xch, dom->guest_domid,
+                             info->max_memkb + NR_MAGIC_PAGES * 4);
+    if (rc < 0)
+        return rc;
+
+    for (i = 0; i < NR_MAGIC_PAGES; i++)
+        p2m[i] = base + i;
+
+    rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
+                                          NR_MAGIC_PAGES, 0, 0, p2m);
+    if (rc < 0)
+        return rc;
+
+    dom->xenstore_pfn = base + XENSTORE_PFN_OFFSET;
+
+    xc_clear_domain_page(dom->xch, dom->guest_domid, dom->xenstore_pfn);
+
+    xc_hvm_param_set(dom->xch, dom->guest_domid, HVM_PARAM_STORE_PFN,
+                     dom->xenstore_pfn);
+    return 0;
+}
+
+static bool do_xs_write_dom(struct xs_handle *xsh, xs_transaction_t t,
+                            domid_t domid, char *path, char *val)
+{
+    char full_path[STR_MAX_LENGTH];
+
+    snprintf(full_path, STR_MAX_LENGTH,
+             "/local/domain/%d/%s", domid, path);
+    return xs_write(xsh, t, full_path, val, strlen(val));
+}
+
+static bool do_xs_write_libxl(struct xs_handle *xsh, xs_transaction_t t,
+                              domid_t domid, char *path, char *val)
+{
+    char full_path[STR_MAX_LENGTH];
+
+    snprintf(full_path, STR_MAX_LENGTH,
+             "/libxl/%d/%s", domid, path);
+    return xs_write(xsh, t, full_path, val, strlen(val));
+}
+
+static bool do_xs_write_vm(struct xs_handle *xsh, xs_transaction_t t,
+                           libxl_uuid uuid, char *path, char *val)
+{
+    char full_path[STR_MAX_LENGTH];
+
+    snprintf(full_path, STR_MAX_LENGTH,
+             "/vm/" LIBXL_UUID_FMT "/%s", LIBXL_UUID_BYTES(uuid), path);
+    return xs_write(xsh, t, full_path, val, strlen(val));
+}
+
+/*
+ * The xenstore nodes are the xenstore nodes libxl writes at domain
+ * creation.
+ *
+ * The list was retrieved by running xenstore-ls on a corresponding
+ * domain started by xl/libxl.
+ */
+static int restore_xenstore(struct xs_handle *xsh,
+                            libxl_dominfo *info, libxl_uuid uuid,
+                            evtchn_port_t xenstore_port)
+{
+    domid_t domid;
+    int i;
+    char uuid_str[STR_MAX_LENGTH];
+    char dom_name_str[STR_MAX_LENGTH];
+    char vm_val_str[STR_MAX_LENGTH];
+    char id_str[STR_MAX_LENGTH];
+    char max_memkb_str[STR_MAX_LENGTH];
+    char cpu_str[STR_MAX_LENGTH];
+    char xenstore_port_str[STR_MAX_LENGTH];
+    char ring_ref_str[STR_MAX_LENGTH];
+    xs_transaction_t t;
+
+    domid = info->domid;
+    snprintf(id_str, STR_MAX_LENGTH, "%d", domid);
+    snprintf(dom_name_str, STR_MAX_LENGTH, "dom0less-%d", domid);
+    snprintf(uuid_str, STR_MAX_LENGTH, LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
+    snprintf(vm_val_str, STR_MAX_LENGTH,
+             "vm/" LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
+    snprintf(max_memkb_str, STR_MAX_LENGTH, "%lu", info->max_memkb);
+    snprintf(ring_ref_str, STR_MAX_LENGTH, "%lld",
+             (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET);
+    snprintf(xenstore_port_str, STR_MAX_LENGTH, "%d", xenstore_port);
+
+retry_transaction:
+    t = xs_transaction_start(xsh);
+    if (t == XBT_NULL)
+        return -errno;
+
+    /* /vm */
+    if (!do_xs_write_vm(xsh, t, uuid, "name", dom_name_str)) return -EIO;
+    if (!do_xs_write_vm(xsh, t, uuid, "uuid", uuid_str)) return -EIO;
+    if (!do_xs_write_vm(xsh, t, uuid, "start_time", "0")) return -EIO;
+
+    /* /domain */
+    if (!do_xs_write_dom(xsh, t, domid, "vm", vm_val_str)) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "name", dom_name_str)) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "cpu", "")) return -EIO;
+    for (i = 0; i < info->vcpu_max_id; i++) {
+        snprintf(cpu_str, STR_MAX_LENGTH, "cpu/%d/availability/", i);
+        if (!do_xs_write_dom(xsh, t, domid, cpu_str,
+                             (info->cpupool & (1 << i)) ? "online" : "offline"))
+            return -EIO;
+    }
+    if (!do_xs_write_dom(xsh, t, domid, "cpu/0", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "cpu/availability", "online")) return -EIO;
+
+    if (!do_xs_write_dom(xsh, t, domid, "memory", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "memory/static-max", max_memkb_str)) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "memory/videoram", "-1")) return -EIO;
+
+    if (!do_xs_write_dom(xsh, t, domid, "device", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "device/suspend", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "device/suspend/event-channel", "")) return -EIO;
+
+    if (!do_xs_write_dom(xsh, t, domid, "control", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control/shutdown", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control/feature-poweroff", "1")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control/feature-reboot", "1")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control/feature-suspend", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control/sysrq", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control/platform-feature-multiprocessor-suspend", "1")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "control", "platform-feature-xs_reset_watches")) return -EIO;
+
+    if (!do_xs_write_dom(xsh, t, domid, "domid", id_str)) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "data", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "drivers", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "feature", "")) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "attr", "")) return -EIO;
+
+    if (!do_xs_write_dom(xsh, t, domid, "store/port", xenstore_port_str)) return -EIO;
+    if (!do_xs_write_dom(xsh, t, domid, "store/ring-ref", ring_ref_str)) return -EIO;
+
+    if (!do_xs_write_libxl(xsh, t, domid, "type", "pvh")) return -EIO;
+    if (!do_xs_write_libxl(xsh, t, domid, "dm-version", "qemu_xen")) return -EIO;
+
+    if (!xs_transaction_end(xsh, t, false))
+        if (errno == EAGAIN)
+            goto retry_transaction;
+        else
+            return -errno;
+
+    return 0;
+}
+
+static int init_domain(struct xs_handle *xsh, libxl_dominfo *info)
+{
+    struct xc_dom_image dom;
+    libxl_uuid uuid;
+    uint64_t v;
+    int rc;
+
+    printf("Init dom0less domain: %d\n", info->domid);
+    dom.guest_domid = info->domid;
+    dom.xenstore_domid = 0;
+    dom.xch = xc_interface_open(0, 0, 0);
+
+    rc = xc_hvm_param_get(dom.xch, info->domid, HVM_PARAM_STORE_EVTCHN, &v);
+    if (rc != 0) {
+        printf("Failed to get HVM_PARAM_STORE_EVTCHN\n");
+        return 1;
+    }
+    dom.xenstore_evtchn = v;
+
+    /* Console won't be initialized but set its data for completeness */
+    dom.console_domid = 0;
+
+    /* Alloc magic pages */
+    if (alloc_magic_pages(info, &dom) != 0) {
+        printf("Error on alloc magic pages\n");
+        return 1;
+    }
+
+    xc_dom_gnttab_init(&dom);
+
+    libxl_uuid_generate(&uuid);
+    xc_domain_sethandle(dom.xch, info->domid, libxl_uuid_bytearray(&uuid));
+
+    rc = gen_stub_json_config(info->domid, &uuid);
+    if (rc)
+        err(1, "gen_stub_json_config");
+
+    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
+    if (rc)
+        err(1, "writing to xenstore");
+
+    xs_introduce_domain(xsh, info->domid,
+            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
+            dom.xenstore_evtchn);
+    return 0;
+}
+
+/* Check if domain has been configured in XS */
+static bool domain_exists(struct xs_handle *xsh, int domid)
+{
+    return xs_is_domain_introduced(xsh, domid);
+}
+
+int main(int argc, char **argv)
+{
+    libxl_dominfo *info = NULL;
+    libxl_ctx *ctx;
+    int nb_vm = 0, rc = 0, i;
+    struct xs_handle *xsh = NULL;
+
+    xsh = xs_daemon_open();
+    if (xsh == NULL) {
+        fprintf(stderr, "Could not contact XenStore");
+        rc = -errno;
+        goto out;
+    }
+
+    rc = libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, NULL);
+    if (rc) {
+        fprintf(stderr, "cannot init xl context\n");
+        goto out;
+    }
+
+    info = libxl_list_domain(ctx, &nb_vm);
+    if (!info) {
+        fprintf(stderr, "libxl_list_vm failed.\n");
+        rc = -1;
+        goto out;
+    }
+
+    for (i = 0; i < nb_vm; i++) {
+        domid_t domid = info[i].domid;
+
+        /* Don't need to check for Dom0 */
+        if (!domid)
+            continue;
+
+        printf("Checking domid: %u\n", domid);
+        if (!domain_exists(xsh, domid))
+            init_domain(xsh, &info[i]);
+        else
+            printf("Domain %d has already been initialized\n", domid);
+    }
+out:
+    libxl_dominfo_list_free(info, nb_vm);
+    xs_close(xsh);
+    return rc;
+}
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property
  2022-01-28 21:33 ` [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property Stefano Stabellini
@ 2022-01-29 17:58   ` Julien Grall
  2022-03-23  0:08     ` Stefano Stabellini
  0 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-01-29 17:58 UTC (permalink / raw)
  To: Stefano Stabellini, xen-devel
  Cc: jgross, Bertrand.Marquis, Volodymyr_Babchuk, Stefano Stabellini

Hi Stefano,

On 28/01/2022 21:33, Stefano Stabellini wrote:
> From: Stefano Stabellini <stefano.stabellini@xilinx.com>
> 
> Introduce a new "xen,enhanced" dom0less property to enable/disable PV
> driver interfaces for dom0less guests. Currently only "enabled" and
> "disabled" are supported property values (and empty). Leave the option
> open to implement further possible values in the future (e.g.
> "xenstore" to enable only xenstore.)
> 
> The configurable option is for domUs only. For dom0 we always set the
> corresponding property in the Xen code to true (PV interfaces enabled.)
> 
> This patch only parses the property. Next patches will make use of it.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> ---
> Changes in v3:
> - improve commit message
> 
> Changes in v2:
> - rename kinfo.enhanced to kinfo.dom0less_enhanced
> - set kinfo.dom0less_enhanced to true for dom0
> - handle -ENODATA in addition to -EILSEQ
> ---
>   docs/misc/arm/device-tree/booting.txt | 18 ++++++++++++++++++
>   xen/arch/arm/domain_build.c           |  8 ++++++++
>   xen/arch/arm/include/asm/kernel.h     |  3 +++
>   3 files changed, 29 insertions(+)
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
> index 71895663a4..38c29fb3d8 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -169,6 +169,24 @@ with the following properties:
>       Please note that the SPI used for the virtual pl011 could clash with the
>       physical SPI of a physical device assigned to the guest.
>   
> +- xen,enhanced

NIT: I find a bit strange this is added in the middle of the property. 
Can you either sort the property alphabtically or move this one to the end?

> +
> +    A string property. Possible property values are:
> +
> +    - "enabled" (or missing property value)
> +    Xen PV interfaces, including grant-table and xenstore, will be
> +    enabled for the VM.
> +
> +    - "disabled"
> +    Xen PV interfaces are disabled.
> +
> +    If the xen,enhanced property is present with no value, it defaults
> +    to "enabled". If the xen,enhanced property is not present, PV
> +    interfaces are disabled.
> +
> +    In the future other possible property values might be added to
> +    enable only selected interfaces.
> +
>   - nr_spis
>   
>       Optional. A 32-bit integer specifying the number of SPIs (Shared
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 6931c022a2..9144d6c0b6 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2963,6 +2963,7 @@ static int __init construct_domU(struct domain *d,
>                                    const struct dt_device_node *node)
>   {
>       struct kernel_info kinfo = {};
> +    const char *dom0less_enhanced;
>       int rc;
>       u64 mem;
>   
> @@ -2978,6 +2979,12 @@ static int __init construct_domU(struct domain *d,
>   
>       kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
>   
> +    rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced);
> +    if ( rc == -EILSEQ ||

I think the use an -EILSEQ wants an explanation. In a previous version, 
you wrote that the value would be returned when:

fdt set /chosen/domU0 xen,enhanced

But it is not clear why. Can you print pp->value, pp->length, 
strnlen(..) when this happens?


> +         rc == -ENODATA ||
> +         (rc == 0 && !strcmp(dom0less_enhanced, "enabled")) )
> +        kinfo.dom0less_enhanced = true;
> +
>       if ( vcpu_create(d, 0) == NULL )
>           return -ENOMEM;
>   
> @@ -3095,6 +3102,7 @@ static int __init construct_dom0(struct domain *d)
>   
>       kinfo.unassigned_mem = dom0_mem;
>       kinfo.d = d;
> +    kinfo.dom0less_enhanced = true;

This is a bit odd. The name suggests that this is a dom0less specific 
option. But then you are setting it to dom0.

Given that this variable is about enable PV drivers, I think this should 
be false for dom0.

>   
>       rc = kernel_probe(&kinfo, NULL);
>       if ( rc < 0 )
> diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h
> index 874aa108a7..c4dc039b54 100644
> --- a/xen/arch/arm/include/asm/kernel.h
> +++ b/xen/arch/arm/include/asm/kernel.h
> @@ -36,6 +36,9 @@ struct kernel_info {
>       /* Enable pl011 emulation */
>       bool vpl011;
>   
> +    /* Enable PV drivers */
> +    bool dom0less_enhanced;
> +
>       /* GIC phandle */
>       uint32_t phandle_gic;
>   

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot
  2022-01-28 21:33 ` [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot Stefano Stabellini
@ 2022-01-29 18:13   ` Julien Grall
  2022-03-23  1:18     ` Stefano Stabellini
  0 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-01-29 18:13 UTC (permalink / raw)
  To: Stefano Stabellini, xen-devel
  Cc: jgross, Bertrand.Marquis, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini

Hi Stefano,

On 28/01/2022 21:33, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> If "xen,enhanced" is enabled, then add to dom0less domains:
> 
> - the hypervisor node in device tree
> - the xenstore event channel
> 
> The xenstore event channel is also used for the first notification to
> let the guest know that xenstore has become available.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> 
> ---
> Changes in v3:
> - use evtchn_alloc_unbound
> 
> Changes in v2:
> - set HVM_PARAM_STORE_PFN to ~0ULL at domain creation
> - in alloc_xenstore_evtchn do not call _evtchn_alloc_unbound
> ---
>   xen/arch/arm/domain_build.c | 41 +++++++++++++++++++++++++++++++++++++
>   1 file changed, 41 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 9144d6c0b6..8e030a7f05 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -27,6 +27,7 @@
>   #include <asm/setup.h>
>   #include <asm/cpufeature.h>
>   #include <asm/domain_build.h>
> +#include <xen/event.h>
>   
>   #include <xen/irq.h>
>   #include <xen/grant_table.h>
> @@ -2619,6 +2620,8 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>       int ret;
>   
>       kinfo->phandle_gic = GUEST_PHANDLE_GIC;
> +    kinfo->gnttab_start = GUEST_GNTTAB_BASE;
> +    kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
>   
>       addrcells = GUEST_ROOT_ADDRESS_CELLS;
>       sizecells = GUEST_ROOT_SIZE_CELLS;
> @@ -2693,6 +2696,13 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo)
>               goto err;
>       }
>   
> +    if ( kinfo->dom0less_enhanced )
> +    {
> +        ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);

Looking at the code, I think the extended regions will not work properly 
because we are looking at the host memory layout. In the case of domU, 
we want to use the guest layout. Please have a look how it was done in 
libxl.

> +        if ( ret )
> +            goto err;
> +    }
> +
>       ret = fdt_end_node(kinfo->fdt);
>       if ( ret < 0 )
>           goto err;
> @@ -2959,6 +2969,25 @@ static int __init construct_domain(struct domain *d, struct kernel_info *kinfo)
>       return 0;
>   }
>   
> +static int __init alloc_xenstore_evtchn(struct domain *d)
> +{
> +    evtchn_alloc_unbound_t alloc;
> +    int rc;
> +
> +    alloc.dom = d->domain_id;
> +    alloc.remote_dom = hardware_domain->domain_id;

The first thing evtchn_alloc_unbound() will do is looking up the domain. 
This seems a bit pointless given that we have the domain in hand. 
Shouldn't we extend evtchn_alloc_unbound() to pass the domain?

> +    rc = evtchn_alloc_unbound(&alloc, true);
> +    if ( rc )
> +    {
> +        printk("Failed allocating event channel for domain\n");
> +        return rc;
> +    }
> +
> +    d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
> +
> +    return 0;
> +}
> +
>   static int __init construct_domU(struct domain *d,
>                                    const struct dt_device_node *node)
>   {
> @@ -3014,7 +3043,19 @@ static int __init construct_domU(struct domain *d,
>           return rc;
>   
>       if ( kinfo.vpl011 )
> +    {
>           rc = domain_vpl011_init(d, NULL);
> +        if ( rc < 0 )
> +            return rc;
> +    }
> +
> +    if ( kinfo.dom0less_enhanced )
> +    {
> +        rc = alloc_xenstore_evtchn(d);
> +        if ( rc < 0 )
> +            return rc;
> +        d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;

I think it would be easy to allocate the page right now. So what prevent 
us to do it right now?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-01-28 21:33 ` [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers Stefano Stabellini
@ 2022-01-29 19:35   ` Julien Grall
  2022-03-23  2:50     ` Stefano Stabellini
  0 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-01-29 19:35 UTC (permalink / raw)
  To: Stefano Stabellini, xen-devel
  Cc: jgross, Bertrand.Marquis, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Wei Liu, Anthony PERARD

Hi,

On 28/01/2022 21:33, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Add an example application that can be run in dom0 to complete the
> dom0less domains initialization so that they can get access to xenstore
> and use PV drivers.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> CC: Wei Liu <wl@xen.org>
> CC: Anthony PERARD <anthony.perard@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> ---
> Changes in v3:
> - handle xenstore errors
> - add an in-code comment about xenstore entries
> - less verbose output
> - clean-up error path in main
> 
> Changes in v2:
> - do not set HVM_PARAM_STORE_EVTCHN twice
> - rename restore_xenstore to create_xenstore
> - increase maxmem
> ---
>   tools/helpers/Makefile        |  13 ++
>   tools/helpers/init-dom0less.c | 269 ++++++++++++++++++++++++++++++++++

Should we document how this is meant to be used?

>   2 files changed, 282 insertions(+)
>   create mode 100644 tools/helpers/init-dom0less.c
> 
> diff --git a/tools/helpers/Makefile b/tools/helpers/Makefile
> index 7f6c422440..8e42997052 100644
> --- a/tools/helpers/Makefile
> +++ b/tools/helpers/Makefile
> @@ -10,6 +10,9 @@ ifeq ($(CONFIG_Linux),y)
>   ifeq ($(CONFIG_X86),y)
>   PROGS += init-xenstore-domain
>   endif
> +ifeq ($(CONFIG_ARM),y)
> +PROGS += init-dom0less
> +endif >   endif
>   
>   XEN_INIT_DOM0_OBJS = xen-init-dom0.o init-dom-json.o
> @@ -26,6 +29,13 @@ $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenstore)
>   $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenlight)
>   $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += -include $(XEN_ROOT)/tools/config.h
>   
> +INIT_DOM0LESS_OBJS = init-dom0less.o init-dom-json.o
> +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
> +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenstore)
> +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenlight)
> +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenctrl)
> +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenevtchn)
> +
>   .PHONY: all
>   all: $(PROGS)
>   
> @@ -35,6 +45,9 @@ xen-init-dom0: $(XEN_INIT_DOM0_OBJS)
>   init-xenstore-domain: $(INIT_XENSTORE_DOMAIN_OBJS)
>   	$(CC) $(LDFLAGS) -o $@ $(INIT_XENSTORE_DOMAIN_OBJS) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenlight) $(APPEND_LDFLAGS)
>   
> +init-dom0less: $(INIT_DOM0LESS_OBJS)
> +	$(CC) $(LDFLAGS) -o $@ $(INIT_DOM0LESS_OBJS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenevtchn) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenlight) $(LDLIBS_libxenguest)  $(APPEND_LDFLAGS)
> +
>   .PHONY: install
>   install: all
>   	$(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)
> diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
> new file mode 100644
> index 0000000000..b6a3831cb5
> --- /dev/null
> +++ b/tools/helpers/init-dom0less.c
> @@ -0,0 +1,269 @@
> +#include <stdbool.h>
> +#include <syslog.h>
> +#include <stdio.h>
> +#include <err.h>
> +#include <stdlib.h>
> +#include <xenstore.h>
> +#include <xenctrl.h>
> +#include <xenguest.h>
> +#include <libxl.h>
> +#include <xenevtchn.h>
> +
> +#include "init-dom-json.h"
> +
> +#define NR_MAGIC_PAGES 4

Why are we allocating 4 pages when only 2 (maybe 1) is necessary?

> +#define CONSOLE_PFN_OFFSET 0
> +#define XENSTORE_PFN_OFFSET 1
> +#define STR_MAX_LENGTH 64
> +
> +static int alloc_magic_pages(libxl_dominfo *info, struct xc_dom_image *dom)
> +{
> +    int rc, i;
> +    const xen_pfn_t base = GUEST_MAGIC_BASE >> XC_PAGE_SHIFT;
> +    xen_pfn_t p2m[NR_MAGIC_PAGES];
> +
> +    rc = xc_domain_setmaxmem(dom->xch, dom->guest_domid,
> +                             info->max_memkb + NR_MAGIC_PAGES * 4);

Please don't rely on the fact the page size will be 4KB in Xen. Instead, 
use XC_PAGE_*.

> +    if (rc < 0)
> +        return rc;
> +
> +    for (i = 0; i < NR_MAGIC_PAGES; i++)
> +        p2m[i] = base + i;
> +
> +    rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
> +                                          NR_MAGIC_PAGES, 0, 0, p2m);
> +    if (rc < 0)
> +        return rc;
> +
> +    dom->xenstore_pfn = base + XENSTORE_PFN_OFFSET;
> +
> +    xc_clear_domain_page(dom->xch, dom->guest_domid, dom->xenstore_pfn);

So you allocate 4 pages, use 2, but only clear 1. Can you explain why?

Also, should not you check the error return here and  ...

> +
> +    xc_hvm_param_set(dom->xch, dom->guest_domid, HVM_PARAM_STORE_PFN,
> +                     dom->xenstore_pfn);

here...?

Also, in theory, as soon as you set xc_hvm_param_set(), the guest may be 
able to start using Xenstore. So wouldn't it be better to set it once 
you know everything is in place (i.e. just before calling 
xs_introduce_domain())?

> +    return 0;
> +}
> +
> +static bool do_xs_write_dom(struct xs_handle *xsh, xs_transaction_t t,
> +                            domid_t domid, char *path, char *val)
> +{
> +    char full_path[STR_MAX_LENGTH];
> +
> +    snprintf(full_path, STR_MAX_LENGTH,
> +             "/local/domain/%d/%s", domid, path);
> +    return xs_write(xsh, t, full_path, val, strlen(val));

 From my understanding, xs_write() will create a node that will only be 
readable/writable by the domain executing this binary (i.e. dom0). IOW, 
the guest will not see the nodes.

So shouldn't you also set the permissions?

> +}
> +
> +static bool do_xs_write_libxl(struct xs_handle *xsh, xs_transaction_t t,
> +                              domid_t domid, char *path, char *val)
> +{
> +    char full_path[STR_MAX_LENGTH];
> +
> +    snprintf(full_path, STR_MAX_LENGTH,
> +             "/libxl/%d/%s", domid, path);

Please use %u when you refer to unsigned value. Also, I think it would 
be a good practice to check the return value of snprintf(). This would 
avoid any future surprise of value truncated by mistake.

The same is valid for all the other use below.

> +    return xs_write(xsh, t, full_path, val, strlen(val));
> +}
> +
> +static bool do_xs_write_vm(struct xs_handle *xsh, xs_transaction_t t,
> +                           libxl_uuid uuid, char *path, char *val)
> +{
> +    char full_path[STR_MAX_LENGTH];
> +
> +    snprintf(full_path, STR_MAX_LENGTH,
> +             "/vm/" LIBXL_UUID_FMT "/%s", LIBXL_UUID_BYTES(uuid), path);
> +    return xs_write(xsh, t, full_path, val, strlen(val));
> +}
> +
> +/*
> + * The xenstore nodes are the xenstore nodes libxl writes at domain
> + * creation.
> + *
> + * The list was retrieved by running xenstore-ls on a corresponding
> + * domain started by xl/libxl.
> + */
> +static int restore_xenstore(struct xs_handle *xsh,

As I wrote in v1, I think "restore" is misleading because the domain was 
never in Xenstore. So how about "create"? (Which BTW you agreed on back 
then).

> +                            libxl_dominfo *info, libxl_uuid uuid,
> +                            evtchn_port_t xenstore_port)
> +{
> +    domid_t domid;
> +    int i;

This is used as an iterator for a uint32_t value. So I think it should 
at least be unsigned int.

> +    char uuid_str[STR_MAX_LENGTH];
> +    char dom_name_str[STR_MAX_LENGTH];
> +    char vm_val_str[STR_MAX_LENGTH];
> +    char id_str[STR_MAX_LENGTH];
> +    char max_memkb_str[STR_MAX_LENGTH];
> +    char cpu_str[STR_MAX_LENGTH];
> +    char xenstore_port_str[STR_MAX_LENGTH];
> +    char ring_ref_str[STR_MAX_LENGTH];
> +    xs_transaction_t t;
> +
> +    domid = info->domid;
> +    snprintf(id_str, STR_MAX_LENGTH, "%d", domid);
> +    snprintf(dom_name_str, STR_MAX_LENGTH, "dom0less-%d", domid);
> +    snprintf(uuid_str, STR_MAX_LENGTH, LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
> +    snprintf(vm_val_str, STR_MAX_LENGTH,
> +             "vm/" LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
> +    snprintf(max_memkb_str, STR_MAX_LENGTH, "%lu", info->max_memkb);
> +    snprintf(ring_ref_str, STR_MAX_LENGTH, "%lld",
> +             (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET);
> +    snprintf(xenstore_port_str, STR_MAX_LENGTH, "%d", xenstore_port); > +
> +retry_transaction:
> +    t = xs_transaction_start(xsh);
> +    if (t == XBT_NULL)
> +        return -errno;
> +
> +    /* /vm */
> +    if (!do_xs_write_vm(xsh, t, uuid, "name", dom_name_str)) return -EIO;

You should terminate the transaction in case of an error.

> +    if (!do_xs_write_vm(xsh, t, uuid, "uuid", uuid_str)) return -EIO;
> +    if (!do_xs_write_vm(xsh, t, uuid, "start_time", "0")) return -EIO;

Wouldn't it be better to create based on the time now?

> +
> +    /* /domain */
> +    if (!do_xs_write_dom(xsh, t, domid, "vm", vm_val_str)) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "name", dom_name_str)) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "cpu", "")) return -EIO;
> +    for (i = 0; i < info->vcpu_max_id; i++) {
> +        snprintf(cpu_str, STR_MAX_LENGTH, "cpu/%d/availability/", i);
> +        if (!do_xs_write_dom(xsh, t, domid, cpu_str,
> +                             (info->cpupool & (1 << i)) ? "online" : "offline"))
> +            return -EIO;
> +    }
> +    if (!do_xs_write_dom(xsh, t, domid, "cpu/0", "")) return -EIO;

I am a bit confused. You created 0 above, so why do you need to create 
it here again?

> +    if (!do_xs_write_dom(xsh, t, domid, "cpu/availability", "online")) return -EIO;

I can't seem to find this node in xenstore and libxl.

> +
> +    if (!do_xs_write_dom(xsh, t, domid, "memory", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "memory/static-max", max_memkb_str)) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "memory/videoram", "-1")) return -EIO;

How about "memory/target"?

> +
> +    if (!do_xs_write_dom(xsh, t, domid, "device", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "device/suspend", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "device/suspend/event-channel", "")) return -EIO;
> +
> +    if (!do_xs_write_dom(xsh, t, domid, "control", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control/shutdown", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control/feature-poweroff", "1")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control/feature-reboot", "1")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control/feature-suspend", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control/sysrq", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control/platform-feature-multiprocessor-suspend", "1")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "control", "platform-feature-xs_reset_watches")) return -EIO;

It sounds like this wants to be "control/platform-feature...". If this 
hasn't been done, I would diff the libxl version and your version to 
check if all are the same (names, values, permissions).

> +
> +    if (!do_xs_write_dom(xsh, t, domid, "domid", id_str)) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "data", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "drivers", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "feature", "")) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "attr", "")) return -EIO;
> +
> +    if (!do_xs_write_dom(xsh, t, domid, "store/port", xenstore_port_str)) return -EIO;
> +    if (!do_xs_write_dom(xsh, t, domid, "store/ring-ref", ring_ref_str)) return -EIO;
> +
> +    if (!do_xs_write_libxl(xsh, t, domid, "type", "pvh")) return -EIO;
> +    if (!do_xs_write_libxl(xsh, t, domid, "dm-version", "qemu_xen")) return -EIO;
> +
> +    if (!xs_transaction_end(xsh, t, false))
> +        if (errno == EAGAIN)
> +            goto retry_transaction;
> +        else
> +            return -errno;
> +
> +    return 0;
> +}
> +
> +static int init_domain(struct xs_handle *xsh, libxl_dominfo *info)
> +{
> +    struct xc_dom_image dom;

I would initialize dom to 0 to avoid any undef behavior.

> +    libxl_uuid uuid;
> +    uint64_t v;
> +    int rc;
> +
> +    printf("Init dom0less domain: %d\n", info->domid);
> +    dom.guest_domid = info->domid;
> +    dom.xenstore_domid = 0;
> +    dom.xch = xc_interface_open(0, 0, 0);
> +
> +    rc = xc_hvm_param_get(dom.xch, info->domid, HVM_PARAM_STORE_EVTCHN, &v);
> +    if (rc != 0) {
> +        printf("Failed to get HVM_PARAM_STORE_EVTCHN\n");
> +        return 1;
> +    }
> +    dom.xenstore_evtchn = v;
> +
> +    /* Console won't be initialized but set its data for completeness */
> +    dom.console_domid = 0;

I find a bit odd you set the domid but not the event channel, page. Can 
you explain?

Actually, can you explain why only half of the structure is initialized?

> +
> +    /* Alloc magic pages */
> +    if (alloc_magic_pages(info, &dom) != 0) {
> +        printf("Error on alloc magic pages\n");
> +        return 1;
> +    }
> +
> +    xc_dom_gnttab_init(&dom);

This call as the risk to break the guest if the dom0 Linux doesn't 
support the acquire interface. This is because it will punch a hole in 
the domain memory where the grant-table may have already been mapped.

Also, this function could fails.

> +
> +    libxl_uuid_generate(&uuid);
> +    xc_domain_sethandle(dom.xch, info->domid, libxl_uuid_bytearray(&uuid));
> +
> +    rc = gen_stub_json_config(info->domid, &uuid);
> +    if (rc)
> +        err(1, "gen_stub_json_config");
> +
> +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
> +    if (rc)
> +        err(1, "writing to xenstore");
> +
> +    xs_introduce_domain(xsh, info->domid,
> +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
> +            dom.xenstore_evtchn);

xs_introduce_domain() can technically fails.

> +    return 0;
> +}
> +
> +/* Check if domain has been configured in XS */
> +static bool domain_exists(struct xs_handle *xsh, int domid)
> +{
> +    return xs_is_domain_introduced(xsh, domid);
> +}

Would not this lead to initialize a domain with PV driver disabled?

In addition to that, I think we should warn the users (maybe in some 
documentation) that this should not be called in parallel of scripts 
that may create new domain. Otherwise they may be picked up here as well...

> +
> +int main(int argc, char **argv)
> +{
> +    libxl_dominfo *info = NULL;
> +    libxl_ctx *ctx;
> +    int nb_vm = 0, rc = 0, i;
> +    struct xs_handle *xsh = NULL;
> +
> +    xsh = xs_daemon_open();

 From my understanding xs_daemon_open() is deprecated. Instead, you want 
to use xs_open(0).

However, libxl_ctx_alloc() is already opening a xenstore connection. So 
I think it would be better to rely on libxl and other the provided 
helpers (introduce new one if necessary) or just completely drop libxl 
from the equations.

> +    if (xsh == NULL) {
> +        fprintf(stderr, "Could not contact XenStore");
> +        rc = -errno;
> +        goto out;
> +    }
> +
> +    rc = libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, NULL);
> +    if (rc) {
> +        fprintf(stderr, "cannot init xl context\n");
> +        goto out;
> +    }
> +
> +    info = libxl_list_domain(ctx, &nb_vm);
> +    if (!info) {
> +        fprintf(stderr, "libxl_list_vm failed.\n");
> +        rc = -1;
> +        goto out;
> +    }
> +
> +    for (i = 0; i < nb_vm; i++) {
> +        domid_t domid = info[i].domid;
> +
> +        /* Don't need to check for Dom0 */
> +        if (!domid)
> +            continue;
> +
> +        printf("Checking domid: %u\n", domid);
> +        if (!domain_exists(xsh, domid))
> +            init_domain(xsh, &info[i]);

init_domain() could return an error. So shouldn't you check it?

If yes, then the question is whether you want to continue to handle the 
other domain or abort?

If the former, then what's the next steps if the domain is half 
initialized? So we try again?

> +        else
> +            printf("Domain %d has already been initialized\n", domid);
> +    }
> +out:
> +    libxl_dominfo_list_free(info, nb_vm);
> +    xs_close(xsh);
> +    return rc;
> +}

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-01-28 21:33 ` [PATCH v3 2/5] xen: make evtchn_alloc_unbound public Stefano Stabellini
@ 2022-01-31  9:32   ` Jan Beulich
  2022-03-15 14:10   ` Daniel P. Smith
  1 sibling, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2022-01-31  9:32 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Andrew Cooper, George Dunlap, Wei Liu,
	xen-devel

On 28.01.2022 22:33, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> The xenstore event channel will be allocated for dom0less domains. It is
> necessary to have access to the evtchn_alloc_unbound function to do
> that, so make evtchn_alloc_unbound public.
> 
> Add a skip_xsm parameter to allow disabling the XSM check in
> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> originated from Xen before running any domains.)

What I continue to be missing here are some words on why it is okay to
skip XSM checking in this case. After all an alternative would be to
put in place a suitable domain as "current" one for the check to actually
work.

Jan



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-01-28 21:33 ` [PATCH v3 2/5] xen: make evtchn_alloc_unbound public Stefano Stabellini
  2022-01-31  9:32   ` Jan Beulich
@ 2022-03-15 14:10   ` Daniel P. Smith
  2022-03-23  0:22     ` Stefano Stabellini
  1 sibling, 1 reply; 42+ messages in thread
From: Daniel P. Smith @ 2022-03-15 14:10 UTC (permalink / raw)
  To: Stefano Stabellini, xen-devel
  Cc: jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu

On 1/28/22 16:33, Stefano Stabellini wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> The xenstore event channel will be allocated for dom0less domains. It is
> necessary to have access to the evtchn_alloc_unbound function to do
> that, so make evtchn_alloc_unbound public.
> 
> Add a skip_xsm parameter to allow disabling the XSM check in
> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> originated from Xen before running any domains.)
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> CC: Julien Grall <julien@xen.org>
> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: George Dunlap <george.dunlap@citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Wei Liu <wl@xen.org>
> ---
> Changes v3:
> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
> ---
>  xen/common/event_channel.c | 13 ++++++++-----
>  xen/include/xen/event.h    |  3 +++
>  2 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> index da88ad141a..be57d00a15 100644
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>      xsm_evtchn_close_post(chn);
>  }
>  
> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>  {
>      struct evtchn *chn;
>      struct domain *d;
> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>          ERROR_EXIT_DOM(port, d);
>      chn = evtchn_from_port(d, port);
>  
> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> -    if ( rc )
> -        goto out;
> +    if ( !skip_xsm )
> +    {
> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> +        if ( rc )
> +            goto out;
> +    }

Please do not subvert the security framework because it causes an
inconvenience. As Jan recommended, work within the XSM check to allow
your access so that we may ensure it is done safely. If you need any
help making modifications to XSM, please do not hesitate to reach out as
I will gladly help.

>      evtchn_write_lock(chn);
>  
> @@ -1195,7 +1198,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          struct evtchn_alloc_unbound alloc_unbound;
>          if ( copy_from_guest(&alloc_unbound, arg, 1) != 0 )
>              return -EFAULT;
> -        rc = evtchn_alloc_unbound(&alloc_unbound);
> +        rc = evtchn_alloc_unbound(&alloc_unbound, false);
>          if ( !rc && __copy_to_guest(arg, &alloc_unbound, 1) )
>              rc = -EFAULT; /* Cleaning up here would be a mess! */
>          break;
> diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
> index 21c95e14fd..0a2cdedf7d 100644
> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -68,6 +68,9 @@ int evtchn_close(struct domain *d1, int port1, bool guest);
>  /* Free an event channel. */
>  void evtchn_free(struct domain *d, struct evtchn *chn);
>  
> +/* Create a new event channel port */
> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm);
> +
>  /* Allocate a specific event channel port. */
>  int evtchn_allocate_port(struct domain *d, unsigned int port);
>  


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property
  2022-01-29 17:58   ` Julien Grall
@ 2022-03-23  0:08     ` Stefano Stabellini
  2022-03-25 17:58       ` Julien Grall
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-23  0:08 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Stefano Stabellini

On Sat, 29 Jan 2022, Julien Grall wrote:
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 6931c022a2..9144d6c0b6 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -2963,6 +2963,7 @@ static int __init construct_domU(struct domain *d,
> >                                    const struct dt_device_node *node)
> >   {
> >       struct kernel_info kinfo = {};
> > +    const char *dom0less_enhanced;
> >       int rc;
> >       u64 mem;
> >   @@ -2978,6 +2979,12 @@ static int __init construct_domU(struct domain *d,
> >         kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
> >   +    rc = dt_property_read_string(node, "xen,enhanced",
> > &dom0less_enhanced);
> > +    if ( rc == -EILSEQ ||
> 
> I think the use an -EILSEQ wants an explanation. In a previous version, you
> wrote that the value would be returned when:
> 
> fdt set /chosen/domU0 xen,enhanced
> 
> But it is not clear why. Can you print pp->value, pp->length, strnlen(..) when
> this happens?

I added in dt_property_read_string:

printk("DEBUG %s %d value=%s value[0]=%d length=%u len=%lu\n",__func__,__LINE__,(char*)pp->value, *((char*)pp->value),pp->length, strlen(pp->value));

This is the output:
(XEN) DEBUG dt_property_read_string 205 value= value[0]=0 length=0 len=0



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-15 14:10   ` Daniel P. Smith
@ 2022-03-23  0:22     ` Stefano Stabellini
  2022-03-23  6:57       ` Jan Beulich
  2022-03-24 15:36       ` Daniel P. Smith
  0 siblings, 2 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-23  0:22 UTC (permalink / raw)
  To: Daniel P. Smith, jbeulich
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis, julien,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini,
	Andrew Cooper, George Dunlap, Wei Liu

On Tue, 15 Mar 2022, Daniel P. Smith wrote:
> On 1/28/22 16:33, Stefano Stabellini wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> > 
> > The xenstore event channel will be allocated for dom0less domains. It is
> > necessary to have access to the evtchn_alloc_unbound function to do
> > that, so make evtchn_alloc_unbound public.
> > 
> > Add a skip_xsm parameter to allow disabling the XSM check in
> > evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> > originated from Xen before running any domains.)
> > 
> > Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > CC: Julien Grall <julien@xen.org>
> > CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> > CC: Bertrand Marquis <bertrand.marquis@arm.com>
> > CC: Andrew Cooper <andrew.cooper3@citrix.com>
> > CC: George Dunlap <george.dunlap@citrix.com>
> > CC: Jan Beulich <jbeulich@suse.com>
> > CC: Wei Liu <wl@xen.org>
> > ---
> > Changes v3:
> > - expose evtchn_alloc_unbound, assing a skip_xsm parameter
> > ---
> >  xen/common/event_channel.c | 13 ++++++++-----
> >  xen/include/xen/event.h    |  3 +++
> >  2 files changed, 11 insertions(+), 5 deletions(-)
> > 
> > diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> > index da88ad141a..be57d00a15 100644
> > --- a/xen/common/event_channel.c
> > +++ b/xen/common/event_channel.c
> > @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
> >      xsm_evtchn_close_post(chn);
> >  }
> >  
> > -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> >  {
> >      struct evtchn *chn;
> >      struct domain *d;
> > @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >          ERROR_EXIT_DOM(port, d);
> >      chn = evtchn_from_port(d, port);
> >  
> > -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> > -    if ( rc )
> > -        goto out;
> > +    if ( !skip_xsm )
> > +    {
> > +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> > +        if ( rc )
> > +            goto out;
> > +    }
> 
> Please do not subvert the security framework because it causes an
> inconvenience. As Jan recommended, work within the XSM check to allow
> your access so that we may ensure it is done safely. If you need any
> help making modifications to XSM, please do not hesitate to reach out as
> I will gladly help.

Thank you!

First let me reply to Jan: this series is only introducing 1 more call
to evtchn_alloc_unbound, which is to allocate the special xenstore event
channel, the one configured via
d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].

It is not meant to be a generic function, and it is not meant to be
called more than once. It could (should?) be __init.

The existing XSM check in evtchn_alloc_unbound cannot work and should
not work: it is based on the current domain having enough privileges to
create the event channel. In this case, we have no current domain at
all. The current domain is Xen itself.

For these reasons, given [1], also not to subvert the security
framework as Daniel pointed out, I think I should go back to my own
implementation [2][3] based on get_free_port. That is my preference
because:

- the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound
- adding skip_xsm introduces a component of risk (unless we make it
  __init maybe?)
- using get_free_port is trivial and doesn't pose the same issues


Let's find all an agreement on how to move forward on this.


[1] https://marc.info/?l=xen-devel&m=164194128922838
[2] https://marc.info/?l=xen-devel&m=164203543615114
[3] https://marc.info/?l=xen-devel&m=164203544615129 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot
  2022-01-29 18:13   ` Julien Grall
@ 2022-03-23  1:18     ` Stefano Stabellini
  2022-03-25 18:46       ` Julien Grall
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-23  1:18 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini

On Sat, 29 Jan 2022, Julien Grall wrote:
> On 28/01/2022 21:33, Stefano Stabellini wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> > 
> > If "xen,enhanced" is enabled, then add to dom0less domains:
> > 
> > - the hypervisor node in device tree
> > - the xenstore event channel
> > 
> > The xenstore event channel is also used for the first notification to
> > let the guest know that xenstore has become available.
> > 
> > Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
> > CC: Julien Grall <julien@xen.org>
> > CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> > CC: Bertrand Marquis <bertrand.marquis@arm.com>
> > 
> > ---
> > Changes in v3:
> > - use evtchn_alloc_unbound
> > 
> > Changes in v2:
> > - set HVM_PARAM_STORE_PFN to ~0ULL at domain creation
> > - in alloc_xenstore_evtchn do not call _evtchn_alloc_unbound
> > ---
> >   xen/arch/arm/domain_build.c | 41 +++++++++++++++++++++++++++++++++++++
> >   1 file changed, 41 insertions(+)
> > 
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 9144d6c0b6..8e030a7f05 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -27,6 +27,7 @@
> >   #include <asm/setup.h>
> >   #include <asm/cpufeature.h>
> >   #include <asm/domain_build.h>
> > +#include <xen/event.h>
> >     #include <xen/irq.h>
> >   #include <xen/grant_table.h>
> > @@ -2619,6 +2620,8 @@ static int __init prepare_dtb_domU(struct domain *d,
> > struct kernel_info *kinfo)
> >       int ret;
> >         kinfo->phandle_gic = GUEST_PHANDLE_GIC;
> > +    kinfo->gnttab_start = GUEST_GNTTAB_BASE;
> > +    kinfo->gnttab_size = GUEST_GNTTAB_SIZE;
> >         addrcells = GUEST_ROOT_ADDRESS_CELLS;
> >       sizecells = GUEST_ROOT_SIZE_CELLS;
> > @@ -2693,6 +2696,13 @@ static int __init prepare_dtb_domU(struct domain *d,
> > struct kernel_info *kinfo)
> >               goto err;
> >       }
> >   +    if ( kinfo->dom0less_enhanced )
> > +    {
> > +        ret = make_hypervisor_node(d, kinfo, addrcells, sizecells);
> 
> Looking at the code, I think the extended regions will not work properly
> because we are looking at the host memory layout. In the case of domU, we want
> to use the guest layout. Please have a look how it was done in libxl.

Yeah you are right, I'll do that.


> > +        if ( ret )
> > +            goto err;
> > +    }
> > +
> >       ret = fdt_end_node(kinfo->fdt);
> >       if ( ret < 0 )
> >           goto err;
> > @@ -2959,6 +2969,25 @@ static int __init construct_domain(struct domain *d,
> > struct kernel_info *kinfo)
> >       return 0;
> >   }
> >   +static int __init alloc_xenstore_evtchn(struct domain *d)
> > +{
> > +    evtchn_alloc_unbound_t alloc;
> > +    int rc;
> > +
> > +    alloc.dom = d->domain_id;
> > +    alloc.remote_dom = hardware_domain->domain_id;
> 
> The first thing evtchn_alloc_unbound() will do is looking up the domain. This
> seems a bit pointless given that we have the domain in hand. Shouldn't we
> extend evtchn_alloc_unbound() to pass the domain?

That's a good point, but I actually think it is better to go back to
[2]. The evtchn_alloc_unbound discussion is still ongoing but I'll keep
this suggestion in mind.

[2] https://marc.info/?l=xen-devel&m=164203543615114


> > +    rc = evtchn_alloc_unbound(&alloc, true);
> > +    if ( rc )
> > +    {
> > +        printk("Failed allocating event channel for domain\n");
> > +        return rc;
> > +    }
> > +
> > +    d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
> > +
> > +    return 0;
> > +}
> > +
> >   static int __init construct_domU(struct domain *d,
> >                                    const struct dt_device_node *node)
> >   {
> > @@ -3014,7 +3043,19 @@ static int __init construct_domU(struct domain *d,
> >           return rc;
> >         if ( kinfo.vpl011 )
> > +    {
> >           rc = domain_vpl011_init(d, NULL);
> > +        if ( rc < 0 )
> > +            return rc;
> > +    }
> > +
> > +    if ( kinfo.dom0less_enhanced )
> > +    {
> > +        rc = alloc_xenstore_evtchn(d);
> > +        if ( rc < 0 )
> > +            return rc;
> > +        d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
> 
> I think it would be easy to allocate the page right now. So what prevent us to
> do it right now?

Because (as you noted as a comment to the following patch) as soon as
d->arch.hvm.params[HVM_PARAM_STORE_PFN] is set the guest can continue
with the initialization and will expect the right data to be set on the
page. In other words: it is not enough to have the pfn allocated, we
also need xenstore to initialize it. At that point, it is better to do
both later from init-dom0less.c.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-01-29 19:35   ` Julien Grall
@ 2022-03-23  2:50     ` Stefano Stabellini
  2022-03-28 16:25       ` Julien Grall
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-23  2:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini, Wei Liu,
	Anthony PERARD

On Sat, 29 Jan 2022, Julien Grall wrote:
> On 28/01/2022 21:33, Stefano Stabellini wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> > 
> > Add an example application that can be run in dom0 to complete the
> > dom0less domains initialization so that they can get access to xenstore
> > and use PV drivers.
> > 
> > Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > CC: Wei Liu <wl@xen.org>
> > CC: Anthony PERARD <anthony.perard@citrix.com>
> > CC: Juergen Gross <jgross@suse.com>
> > ---
> > Changes in v3:
> > - handle xenstore errors
> > - add an in-code comment about xenstore entries
> > - less verbose output
> > - clean-up error path in main
> > 
> > Changes in v2:
> > - do not set HVM_PARAM_STORE_EVTCHN twice
> > - rename restore_xenstore to create_xenstore
> > - increase maxmem
> > ---
> >   tools/helpers/Makefile        |  13 ++
> >   tools/helpers/init-dom0less.c | 269 ++++++++++++++++++++++++++++++++++
> 
> Should we document how this is meant to be used?

Good idea, I'll add it to docs/features/dom0less.pandoc


> >   2 files changed, 282 insertions(+)
> >   create mode 100644 tools/helpers/init-dom0less.c
> > 
> > diff --git a/tools/helpers/Makefile b/tools/helpers/Makefile
> > index 7f6c422440..8e42997052 100644
> > --- a/tools/helpers/Makefile
> > +++ b/tools/helpers/Makefile
> > @@ -10,6 +10,9 @@ ifeq ($(CONFIG_Linux),y)
> >   ifeq ($(CONFIG_X86),y)
> >   PROGS += init-xenstore-domain
> >   endif
> > +ifeq ($(CONFIG_ARM),y)
> > +PROGS += init-dom0less
> > +endif >   endif
> >     XEN_INIT_DOM0_OBJS = xen-init-dom0.o init-dom-json.o
> > @@ -26,6 +29,13 @@ $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS +=
> > $(CFLAGS_libxenstore)
> >   $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += $(CFLAGS_libxenlight)
> >   $(INIT_XENSTORE_DOMAIN_OBJS): CFLAGS += -include
> > $(XEN_ROOT)/tools/config.h
> >   +INIT_DOM0LESS_OBJS = init-dom0less.o init-dom-json.o
> > +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxentoollog)
> > +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenstore)
> > +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenlight)
> > +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenctrl)
> > +$(INIT_DOM0LESS_OBJS): CFLAGS += $(CFLAGS_libxenevtchn)
> > +
> >   .PHONY: all
> >   all: $(PROGS)
> >   @@ -35,6 +45,9 @@ xen-init-dom0: $(XEN_INIT_DOM0_OBJS)
> >   init-xenstore-domain: $(INIT_XENSTORE_DOMAIN_OBJS)
> >   	$(CC) $(LDFLAGS) -o $@ $(INIT_XENSTORE_DOMAIN_OBJS)
> > $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl)
> > $(LDLIBS_libxenguest) $(LDLIBS_libxenlight) $(APPEND_LDFLAGS)
> >   +init-dom0less: $(INIT_DOM0LESS_OBJS)
> > +	$(CC) $(LDFLAGS) -o $@ $(INIT_DOM0LESS_OBJS) $(LDLIBS_libxenctrl)
> > $(LDLIBS_libxenevtchn) $(LDLIBS_libxentoollog) $(LDLIBS_libxenstore)
> > $(LDLIBS_libxenlight) $(LDLIBS_libxenguest)  $(APPEND_LDFLAGS)
> > +
> >   .PHONY: install
> >   install: all
> >   	$(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)
> > diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
> > new file mode 100644
> > index 0000000000..b6a3831cb5
> > --- /dev/null
> > +++ b/tools/helpers/init-dom0less.c
> > @@ -0,0 +1,269 @@
> > +#include <stdbool.h>
> > +#include <syslog.h>
> > +#include <stdio.h>
> > +#include <err.h>
> > +#include <stdlib.h>
> > +#include <xenstore.h>
> > +#include <xenctrl.h>
> > +#include <xenguest.h>
> > +#include <libxl.h>
> > +#include <xenevtchn.h>
> > +
> > +#include "init-dom-json.h"
> > +
> > +#define NR_MAGIC_PAGES 4
> 
> Why are we allocating 4 pages when only 2 (maybe 1) is necessary?

Good point, I think we can only allocate 1


> > +#define CONSOLE_PFN_OFFSET 0
> > +#define XENSTORE_PFN_OFFSET 1
> > +#define STR_MAX_LENGTH 64
> > +
> > +static int alloc_magic_pages(libxl_dominfo *info, struct xc_dom_image *dom)
> > +{
> > +    int rc, i;
> > +    const xen_pfn_t base = GUEST_MAGIC_BASE >> XC_PAGE_SHIFT;
> > +    xen_pfn_t p2m[NR_MAGIC_PAGES];
> > +
> > +    rc = xc_domain_setmaxmem(dom->xch, dom->guest_domid,
> > +                             info->max_memkb + NR_MAGIC_PAGES * 4);
> 
> Please don't rely on the fact the page size will be 4KB in Xen. Instead, use
> XC_PAGE_*.

OK


> > +    if (rc < 0)
> > +        return rc;
> > +
> > +    for (i = 0; i < NR_MAGIC_PAGES; i++)
> > +        p2m[i] = base + i;
> > +
> > +    rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
> > +                                          NR_MAGIC_PAGES, 0, 0, p2m);
> > +    if (rc < 0)
> > +        return rc;
> > +
> > +    dom->xenstore_pfn = base + XENSTORE_PFN_OFFSET;
> > +
> > +    xc_clear_domain_page(dom->xch, dom->guest_domid, dom->xenstore_pfn);
> 
> So you allocate 4 pages, use 2, but only clear 1. Can you explain why?

We only need the xenstore page, I'll fix it


> Also, should not you check the error return here and  ...
> 
> > +
> > +    xc_hvm_param_set(dom->xch, dom->guest_domid, HVM_PARAM_STORE_PFN,
> > +                     dom->xenstore_pfn);
> 
> here...?

Yes


> Also, in theory, as soon as you set xc_hvm_param_set(), the guest may be able
> to start using Xenstore. So wouldn't it be better to set it once you know
> everything is in place (i.e. just before calling xs_introduce_domain())?

This is a very good point. I'll do that


> > +    return 0;
> > +}
> > +
> > +static bool do_xs_write_dom(struct xs_handle *xsh, xs_transaction_t t,
> > +                            domid_t domid, char *path, char *val)
> > +{
> > +    char full_path[STR_MAX_LENGTH];
> > +
> > +    snprintf(full_path, STR_MAX_LENGTH,
> > +             "/local/domain/%d/%s", domid, path);
> > +    return xs_write(xsh, t, full_path, val, strlen(val));
> 
> From my understanding, xs_write() will create a node that will only be
> readable/writable by the domain executing this binary (i.e. dom0). IOW, the
> guest will not see the nodes.
> 
> So shouldn't you also set the permissions?

Yes you are right, I'll do that


> > +}
> > +
> > +static bool do_xs_write_libxl(struct xs_handle *xsh, xs_transaction_t t,
> > +                              domid_t domid, char *path, char *val)
> > +{
> > +    char full_path[STR_MAX_LENGTH];
> > +
> > +    snprintf(full_path, STR_MAX_LENGTH,
> > +             "/libxl/%d/%s", domid, path);
> 
> Please use %u when you refer to unsigned value. Also, I think it would be a
> good practice to check the return value of snprintf(). This would avoid any
> future surprise of value truncated by mistake.
> 
> The same is valid for all the other use below.

OK


> > +    return xs_write(xsh, t, full_path, val, strlen(val));
> > +}
> > +
> > +static bool do_xs_write_vm(struct xs_handle *xsh, xs_transaction_t t,
> > +                           libxl_uuid uuid, char *path, char *val)
> > +{
> > +    char full_path[STR_MAX_LENGTH];
> > +
> > +    snprintf(full_path, STR_MAX_LENGTH,
> > +             "/vm/" LIBXL_UUID_FMT "/%s", LIBXL_UUID_BYTES(uuid), path);
> > +    return xs_write(xsh, t, full_path, val, strlen(val));
> > +}
> > +
> > +/*
> > + * The xenstore nodes are the xenstore nodes libxl writes at domain
> > + * creation.
> > + *
> > + * The list was retrieved by running xenstore-ls on a corresponding
> > + * domain started by xl/libxl.
> > + */
> > +static int restore_xenstore(struct xs_handle *xsh,
> 
> As I wrote in v1, I think "restore" is misleading because the domain was never
> in Xenstore. So how about "create"? (Which BTW you agreed on back then).
 
OK, sorry about that


> > +                            libxl_dominfo *info, libxl_uuid uuid,
> > +                            evtchn_port_t xenstore_port)
> > +{
> > +    domid_t domid;
> > +    int i;
> 
> This is used as an iterator for a uint32_t value. So I think it should at
> least be unsigned int.

Yes


> > +    char uuid_str[STR_MAX_LENGTH];
> > +    char dom_name_str[STR_MAX_LENGTH];
> > +    char vm_val_str[STR_MAX_LENGTH];
> > +    char id_str[STR_MAX_LENGTH];
> > +    char max_memkb_str[STR_MAX_LENGTH];
> > +    char cpu_str[STR_MAX_LENGTH];
> > +    char xenstore_port_str[STR_MAX_LENGTH];
> > +    char ring_ref_str[STR_MAX_LENGTH];
> > +    xs_transaction_t t;
> > +
> > +    domid = info->domid;
> > +    snprintf(id_str, STR_MAX_LENGTH, "%d", domid);
> > +    snprintf(dom_name_str, STR_MAX_LENGTH, "dom0less-%d", domid);
> > +    snprintf(uuid_str, STR_MAX_LENGTH, LIBXL_UUID_FMT,
> > LIBXL_UUID_BYTES(uuid));
> > +    snprintf(vm_val_str, STR_MAX_LENGTH,
> > +             "vm/" LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
> > +    snprintf(max_memkb_str, STR_MAX_LENGTH, "%lu", info->max_memkb);
> > +    snprintf(ring_ref_str, STR_MAX_LENGTH, "%lld",
> > +             (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET);
> > +    snprintf(xenstore_port_str, STR_MAX_LENGTH, "%d", xenstore_port); > +
> > +retry_transaction:
> > +    t = xs_transaction_start(xsh);
> > +    if (t == XBT_NULL)
> > +        return -errno;
> > +
> > +    /* /vm */
> > +    if (!do_xs_write_vm(xsh, t, uuid, "name", dom_name_str)) return -EIO;
> 
> You should terminate the transaction in case of an error.

OK


> > +    if (!do_xs_write_vm(xsh, t, uuid, "uuid", uuid_str)) return -EIO;
> > +    if (!do_xs_write_vm(xsh, t, uuid, "start_time", "0")) return -EIO;
> 
> Wouldn't it be better to create based on the time now?

Yes


> > +
> > +    /* /domain */
> > +    if (!do_xs_write_dom(xsh, t, domid, "vm", vm_val_str)) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "name", dom_name_str)) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "cpu", "")) return -EIO;
> > +    for (i = 0; i < info->vcpu_max_id; i++) {
> > +        snprintf(cpu_str, STR_MAX_LENGTH, "cpu/%d/availability/", i);
> > +        if (!do_xs_write_dom(xsh, t, domid, cpu_str,
> > +                             (info->cpupool & (1 << i)) ? "online" :
> > "offline"))
> > +            return -EIO;
> > +    }
> > +    if (!do_xs_write_dom(xsh, t, domid, "cpu/0", "")) return -EIO;
> 
> I am a bit confused. You created 0 above, so why do you need to create it here
> again?

I'll fix it


> > +    if (!do_xs_write_dom(xsh, t, domid, "cpu/availability", "online"))
> > return -EIO;
> 
> I can't seem to find this node in xenstore and libxl.

I'll fix it


> > +
> > +    if (!do_xs_write_dom(xsh, t, domid, "memory", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "memory/static-max",
> > max_memkb_str)) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "memory/videoram", "-1")) return
> > -EIO;
> 
> How about "memory/target"?

I'll add


> > +
> > +    if (!do_xs_write_dom(xsh, t, domid, "device", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "device/suspend", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "device/suspend/event-channel",
> > "")) return -EIO;
> > +
> > +    if (!do_xs_write_dom(xsh, t, domid, "control", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "control/shutdown", "")) return
> > -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "control/feature-poweroff", "1"))
> > return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "control/feature-reboot", "1"))
> > return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "control/feature-suspend", ""))
> > return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "control/sysrq", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid,
> > "control/platform-feature-multiprocessor-suspend", "1")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "control",
> > "platform-feature-xs_reset_watches")) return -EIO;
> 
> It sounds like this wants to be "control/platform-feature...". If this hasn't
> been done, I would diff the libxl version and your version to check if all are
> the same (names, values, permissions).

Yes, I'll fix it


> > +
> > +    if (!do_xs_write_dom(xsh, t, domid, "domid", id_str)) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "data", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "drivers", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "feature", "")) return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "attr", "")) return -EIO;
> > +
> > +    if (!do_xs_write_dom(xsh, t, domid, "store/port", xenstore_port_str))
> > return -EIO;
> > +    if (!do_xs_write_dom(xsh, t, domid, "store/ring-ref", ring_ref_str))
> > return -EIO;
> > +
> > +    if (!do_xs_write_libxl(xsh, t, domid, "type", "pvh")) return -EIO;
> > +    if (!do_xs_write_libxl(xsh, t, domid, "dm-version", "qemu_xen")) return
> > -EIO;
> > +
> > +    if (!xs_transaction_end(xsh, t, false))
> > +        if (errno == EAGAIN)
> > +            goto retry_transaction;
> > +        else
> > +            return -errno;
> > +
> > +    return 0;
> > +}
> > +
> > +static int init_domain(struct xs_handle *xsh, libxl_dominfo *info)
> > +{
> > +    struct xc_dom_image dom;
> 
> I would initialize dom to 0 to avoid any undef behavior.

OK


> > +    libxl_uuid uuid;
> > +    uint64_t v;
> > +    int rc;
> > +
> > +    printf("Init dom0less domain: %d\n", info->domid);
> > +    dom.guest_domid = info->domid;
> > +    dom.xenstore_domid = 0;
> > +    dom.xch = xc_interface_open(0, 0, 0);
> > +
> > +    rc = xc_hvm_param_get(dom.xch, info->domid, HVM_PARAM_STORE_EVTCHN,
> > &v);
> > +    if (rc != 0) {
> > +        printf("Failed to get HVM_PARAM_STORE_EVTCHN\n");
> > +        return 1;
> > +    }
> > +    dom.xenstore_evtchn = v;
> > +
> > +    /* Console won't be initialized but set its data for completeness */
> > +    dom.console_domid = 0;
> 
> I find a bit odd you set the domid but not the event channel, page. Can you
> explain?
> 
> Actually, can you explain why only half of the structure is initialized?
 
We are only using the struct xc_dom_image parameter for
xc_dom_gnttab_init, and nothing else. We only need very few fields in
it. Basically we could call xc_dom_gnttab_seed directly and then we
wouldn't need struct xc_dom_image at all. Only the needed fields are
currently populated. 


> > +
> > +    /* Alloc magic pages */
> > +    if (alloc_magic_pages(info, &dom) != 0) {
> > +        printf("Error on alloc magic pages\n");
> > +        return 1;
> > +    }
> > +
> > +    xc_dom_gnttab_init(&dom);
> 
> This call as the risk to break the guest if the dom0 Linux doesn't support the
> acquire interface. This is because it will punch a hole in the domain memory
> where the grant-table may have already been mapped.
> 
> Also, this function could fails.

I'll check for return errors. Dom0less is for fully static
configurations so I think it is OK to return error and abort if
something unexpected happens: dom0less' main reason for being is that
there is nothing unexpected :-)


> > +
> > +    libxl_uuid_generate(&uuid);
> > +    xc_domain_sethandle(dom.xch, info->domid, libxl_uuid_bytearray(&uuid));
> > +
> > +    rc = gen_stub_json_config(info->domid, &uuid);
> > +    if (rc)
> > +        err(1, "gen_stub_json_config");
> > +
> > +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
> > +    if (rc)
> > +        err(1, "writing to xenstore");
> > +
> > +    xs_introduce_domain(xsh, info->domid,
> > +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
> > +            dom.xenstore_evtchn);
> 
> xs_introduce_domain() can technically fails.

OK


> > +    return 0;
> > +}
> > +
> > +/* Check if domain has been configured in XS */
> > +static bool domain_exists(struct xs_handle *xsh, int domid)
> > +{
> > +    return xs_is_domain_introduced(xsh, domid);
> > +}
> 
> Would not this lead to initialize a domain with PV driver disabled?

I am not sure I understood your question, but I'll try to answer anyway.
This check is purely to distinguish dom0less guests, which needs further
initializations, from regular guests (e.g. xl guests) that don't need
any actions taken here.

Ideally init_dom0less should be called before creating any xl guests, but
this check should make it work regardless.


> In addition to that, I think we should warn the users (maybe in some
> documentation) that this should not be called in parallel of scripts that may
> create new domain. Otherwise they may be picked up here as well...

Good point.


> > +
> > +int main(int argc, char **argv)
> > +{
> > +    libxl_dominfo *info = NULL;
> > +    libxl_ctx *ctx;
> > +    int nb_vm = 0, rc = 0, i;
> > +    struct xs_handle *xsh = NULL;
> > +
> > +    xsh = xs_daemon_open();
> 
> From my understanding xs_daemon_open() is deprecated. Instead, you want to use
> xs_open(0).

OK


> However, libxl_ctx_alloc() is already opening a xenstore connection. So I
> think it would be better to rely on libxl and other the provided helpers
> (introduce new one if necessary) or just completely drop libxl from the
> equations.

Unfortunately today it is not possible to reuse ctx->xsh because it is
private to libxl. But your suggestion is an interesting optimization.
I'll write it down as TODO for now.


> > +    if (xsh == NULL) {
> > +        fprintf(stderr, "Could not contact XenStore");
> > +        rc = -errno;
> > +        goto out;
> > +    }
> > +
> > +    rc = libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0, NULL);
> > +    if (rc) {
> > +        fprintf(stderr, "cannot init xl context\n");
> > +        goto out;
> > +    }
> > +
> > +    info = libxl_list_domain(ctx, &nb_vm);
> > +    if (!info) {
> > +        fprintf(stderr, "libxl_list_vm failed.\n");
> > +        rc = -1;
> > +        goto out;
> > +    }
> > +
> > +    for (i = 0; i < nb_vm; i++) {
> > +        domid_t domid = info[i].domid;
> > +
> > +        /* Don't need to check for Dom0 */
> > +        if (!domid)
> > +            continue;
> > +
> > +        printf("Checking domid: %u\n", domid);
> > +        if (!domain_exists(xsh, domid))
> > +            init_domain(xsh, &info[i]);
> 
> init_domain() could return an error. So shouldn't you check it?
> 
> If yes, then the question is whether you want to continue to handle the other
> domain or abort?
> 
> If the former, then what's the next steps if the domain is half initialized?
> So we try again?

As mentioned I think we should abort and exit. Basically we should just
tell the engineer to fix its configuration.


> > +        else
> > +            printf("Domain %d has already been initialized\n", domid);
> > +    }
> > +out:
> > +    libxl_dominfo_list_free(info, nb_vm);
> > +    xs_close(xsh);
> > +    return rc;
> > +}



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-23  0:22     ` Stefano Stabellini
@ 2022-03-23  6:57       ` Jan Beulich
  2022-03-25  0:30         ` Stefano Stabellini
  2022-03-24 15:36       ` Daniel P. Smith
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Beulich @ 2022-03-23  6:57 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Andrew Cooper, George Dunlap,
	Wei Liu, Daniel P. Smith

On 23.03.2022 01:22, Stefano Stabellini wrote:
> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>
>>> The xenstore event channel will be allocated for dom0less domains. It is
>>> necessary to have access to the evtchn_alloc_unbound function to do
>>> that, so make evtchn_alloc_unbound public.
>>>
>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>> originated from Xen before running any domains.)
>>>
>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> CC: Julien Grall <julien@xen.org>
>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: George Dunlap <george.dunlap@citrix.com>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> CC: Wei Liu <wl@xen.org>
>>> ---
>>> Changes v3:
>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>> ---
>>>  xen/common/event_channel.c | 13 ++++++++-----
>>>  xen/include/xen/event.h    |  3 +++
>>>  2 files changed, 11 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>> index da88ad141a..be57d00a15 100644
>>> --- a/xen/common/event_channel.c
>>> +++ b/xen/common/event_channel.c
>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>      xsm_evtchn_close_post(chn);
>>>  }
>>>  
>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>  {
>>>      struct evtchn *chn;
>>>      struct domain *d;
>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>          ERROR_EXIT_DOM(port, d);
>>>      chn = evtchn_from_port(d, port);
>>>  
>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> -    if ( rc )
>>> -        goto out;
>>> +    if ( !skip_xsm )
>>> +    {
>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> +        if ( rc )
>>> +            goto out;
>>> +    }
>>
>> Please do not subvert the security framework because it causes an
>> inconvenience. As Jan recommended, work within the XSM check to allow
>> your access so that we may ensure it is done safely. If you need any
>> help making modifications to XSM, please do not hesitate to reach out as
>> I will gladly help.
> 
> Thank you!
> 
> First let me reply to Jan: this series is only introducing 1 more call
> to evtchn_alloc_unbound, which is to allocate the special xenstore event
> channel, the one configured via
> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
> 
> It is not meant to be a generic function, and it is not meant to be
> called more than once. It could (should?) be __init.

How that? Its pre-existing use doesn't disappear, and requires it to be
non-__init.

> The existing XSM check in evtchn_alloc_unbound cannot work and should
> not work: it is based on the current domain having enough privileges to
> create the event channel. In this case, we have no current domain at
> all. The current domain is Xen itself.

And DOM_XEN cannot be given the appropriate permission, perhaps
explicitly when using a real policy and by default in dummy and SILO
modes?

Jan

> For these reasons, given [1], also not to subvert the security
> framework as Daniel pointed out, I think I should go back to my own
> implementation [2][3] based on get_free_port. That is my preference
> because:
> 
> - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound
> - adding skip_xsm introduces a component of risk (unless we make it
>   __init maybe?)
> - using get_free_port is trivial and doesn't pose the same issues
> 
> 
> Let's find all an agreement on how to move forward on this.
> 
> 
> [1] https://marc.info/?l=xen-devel&m=164194128922838
> [2] https://marc.info/?l=xen-devel&m=164203543615114
> [3] https://marc.info/?l=xen-devel&m=164203544615129 
> 



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-23  0:22     ` Stefano Stabellini
  2022-03-23  6:57       ` Jan Beulich
@ 2022-03-24 15:36       ` Daniel P. Smith
  1 sibling, 0 replies; 42+ messages in thread
From: Daniel P. Smith @ 2022-03-24 15:36 UTC (permalink / raw)
  To: Stefano Stabellini, jbeulich
  Cc: xen-devel, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Andrew Cooper, George Dunlap,
	Wei Liu

On 3/22/22 20:22, Stefano Stabellini wrote:
> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>
>>> The xenstore event channel will be allocated for dom0less domains. It is
>>> necessary to have access to the evtchn_alloc_unbound function to do
>>> that, so make evtchn_alloc_unbound public.
>>>
>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>> originated from Xen before running any domains.)
>>>
>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> CC: Julien Grall <julien@xen.org>
>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: George Dunlap <george.dunlap@citrix.com>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> CC: Wei Liu <wl@xen.org>
>>> ---
>>> Changes v3:
>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>> ---
>>>  xen/common/event_channel.c | 13 ++++++++-----
>>>  xen/include/xen/event.h    |  3 +++
>>>  2 files changed, 11 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>> index da88ad141a..be57d00a15 100644
>>> --- a/xen/common/event_channel.c
>>> +++ b/xen/common/event_channel.c
>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>      xsm_evtchn_close_post(chn);
>>>  }
>>>  
>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>  {
>>>      struct evtchn *chn;
>>>      struct domain *d;
>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>          ERROR_EXIT_DOM(port, d);
>>>      chn = evtchn_from_port(d, port);
>>>  
>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> -    if ( rc )
>>> -        goto out;
>>> +    if ( !skip_xsm )
>>> +    {
>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>> +        if ( rc )
>>> +            goto out;
>>> +    }
>>
>> Please do not subvert the security framework because it causes an
>> inconvenience. As Jan recommended, work within the XSM check to allow
>> your access so that we may ensure it is done safely. If you need any
>> help making modifications to XSM, please do not hesitate to reach out as
>> I will gladly help.
> 
> Thank you!
> 
> First let me reply to Jan: this series is only introducing 1 more call
> to evtchn_alloc_unbound, which is to allocate the special xenstore event
> channel, the one configured via
> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
> 
> It is not meant to be a generic function, and it is not meant to be
> called more than once. It could (should?) be __init.
> 
> The existing XSM check in evtchn_alloc_unbound cannot work and should
> not work: it is based on the current domain having enough privileges to
> create the event channel. In this case, we have no current domain at
> all. The current domain is Xen itself.

I have already replicated this in hyperlaunch for PV construction where
I have constructed the event channel for both xenstore and the console.
For hyperlaunch the construction is under a single, fairly-tight
function where I have promoted the Idle Domain to is_privileged before
the creation/construction loop starts and then demote the Idle Domain on
the other side of the loop. Honestly this is not my preferred approach
but for the initial implementation I do have a moderate amount of
confidence regarding the risk that results. My current thinking is that
the more appropriate approach would be to introduce a new system domain,
Construct Domain??, to provide a separate context under which all the
hyperlaunch creation and construction logic would be done and then
destroyed as part of init finalization.

> For these reasons, given [1], also not to subvert the security
> framework as Daniel pointed out, I think I should go back to my own
> implementation [2][3] based on get_free_port. That is my preference
> because:
> 
> - the Xen codebase doesn't gain much by reusing evtchn_alloc_unbound
> - adding skip_xsm introduces a component of risk (unless we make it
>   __init maybe?)
> - using get_free_port is trivial and doesn't pose the same issues
> 
> 
> Let's find all an agreement on how to move forward on this.
> 
> 
> [1] https://marc.info/?l=xen-devel&m=164194128922838
> [2] https://marc.info/?l=xen-devel&m=164203543615114
> [3] https://marc.info/?l=xen-devel&m=164203544615129 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-23  6:57       ` Jan Beulich
@ 2022-03-25  0:30         ` Stefano Stabellini
  2022-03-25  8:17           ` Jan Beulich
                             ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-25  0:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis, julien,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini,
	Andrew Cooper, George Dunlap, Wei Liu, Daniel P. Smith

On Wed, 23 Mar 2022, Jan Beulich wrote:
> On 23.03.2022 01:22, Stefano Stabellini wrote:
> > On Tue, 15 Mar 2022, Daniel P. Smith wrote:
> >> On 1/28/22 16:33, Stefano Stabellini wrote:
> >>> From: Luca Miccio <lucmiccio@gmail.com>
> >>>
> >>> The xenstore event channel will be allocated for dom0less domains. It is
> >>> necessary to have access to the evtchn_alloc_unbound function to do
> >>> that, so make evtchn_alloc_unbound public.
> >>>
> >>> Add a skip_xsm parameter to allow disabling the XSM check in
> >>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
> >>> originated from Xen before running any domains.)
> >>>
> >>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> >>> CC: Julien Grall <julien@xen.org>
> >>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> >>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
> >>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >>> CC: George Dunlap <george.dunlap@citrix.com>
> >>> CC: Jan Beulich <jbeulich@suse.com>
> >>> CC: Wei Liu <wl@xen.org>
> >>> ---
> >>> Changes v3:
> >>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
> >>> ---
> >>>  xen/common/event_channel.c | 13 ++++++++-----
> >>>  xen/include/xen/event.h    |  3 +++
> >>>  2 files changed, 11 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
> >>> index da88ad141a..be57d00a15 100644
> >>> --- a/xen/common/event_channel.c
> >>> +++ b/xen/common/event_channel.c
> >>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
> >>>      xsm_evtchn_close_post(chn);
> >>>  }
> >>>  
> >>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> >>>  {
> >>>      struct evtchn *chn;
> >>>      struct domain *d;
> >>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> >>>          ERROR_EXIT_DOM(port, d);
> >>>      chn = evtchn_from_port(d, port);
> >>>  
> >>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> >>> -    if ( rc )
> >>> -        goto out;
> >>> +    if ( !skip_xsm )
> >>> +    {
> >>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
> >>> +        if ( rc )
> >>> +            goto out;
> >>> +    }
> >>
> >> Please do not subvert the security framework because it causes an
> >> inconvenience. As Jan recommended, work within the XSM check to allow
> >> your access so that we may ensure it is done safely. If you need any
> >> help making modifications to XSM, please do not hesitate to reach out as
> >> I will gladly help.
> > 
> > Thank you!
> > 
> > First let me reply to Jan: this series is only introducing 1 more call
> > to evtchn_alloc_unbound, which is to allocate the special xenstore event
> > channel, the one configured via
> > d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
> > 
> > It is not meant to be a generic function, and it is not meant to be
> > called more than once. It could (should?) be __init.
> 
> How that? Its pre-existing use doesn't disappear, and requires it to be
> non-__init.

Sorry I meant the new function (calling get_free_port) for the new
use-case could be __init. The new function could be added to
xen/common/event_channel.c or to xen/arch/arm/domain_build.c.


> > The existing XSM check in evtchn_alloc_unbound cannot work and should
> > not work: it is based on the current domain having enough privileges to
> > create the event channel. In this case, we have no current domain at
> > all. The current domain is Xen itself.
> 
> And DOM_XEN cannot be given the appropriate permission, perhaps
> explicitly when using a real policy and by default in dummy and SILO
> modes?

The issue is that the check is based on "current", not one of the
domains passed as an argument to evtchn_alloc_unbound. Otherwise,
passing DOMID_XEN would be straightforward.

We would need to use a hack (like Daniel wrote in the other email) to
set the idle_domain temporarily as a privileged domain only for the sake
of passing an irrelevant (irrelevant as in "not relevant to this case")
XSM check. That cannot be an improvement. Also, setting current to a
"fake" domain is not great either.

In the specific case of dom0less and this patch, this is the only
instance of this issue and could be solved very straightforwardly by
calling get_free_port directly as we discussed [1].

I know Julien had some reservations about that. Let's try to find a
technical solution that makes everyone happy.

Maybe, instead of exporting get_free_port, we could create a new
function in xen/common/event_channel.c and mark it as __init? This way:
- we don't need to expose get_free_port
- the new function would only be __init anyway, so no risk of runtime
  misuse

What do you think?

[1] https://marc.info/?l=xen-devel&m=164197327305903


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25  0:30         ` Stefano Stabellini
@ 2022-03-25  8:17           ` Jan Beulich
  2022-03-25 15:45           ` Daniel P. Smith
  2022-03-25 17:19           ` Julien Grall
  2 siblings, 0 replies; 42+ messages in thread
From: Jan Beulich @ 2022-03-25  8:17 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Andrew Cooper, George Dunlap,
	Wei Liu, Daniel P. Smith

On 25.03.2022 01:30, Stefano Stabellini wrote:
> Maybe, instead of exporting get_free_port, we could create a new
> function in xen/common/event_channel.c and mark it as __init? This way:
> - we don't need to expose get_free_port
> - the new function would only be __init anyway, so no risk of runtime
>   misuse
> 
> What do you think?

Maybe. Such a function would want to serve both your an Daniel's purpose
then.

Jan



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25  0:30         ` Stefano Stabellini
  2022-03-25  8:17           ` Jan Beulich
@ 2022-03-25 15:45           ` Daniel P. Smith
  2022-03-25 16:52             ` Jason Andryuk
  2022-03-25 16:55             ` Julien Grall
  2022-03-25 17:19           ` Julien Grall
  2 siblings, 2 replies; 42+ messages in thread
From: Daniel P. Smith @ 2022-03-25 15:45 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: xen-devel, jgross, Bertrand.Marquis, julien, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Andrew Cooper, George Dunlap,
	Wei Liu

On 3/24/22 20:30, Stefano Stabellini wrote:
> On Wed, 23 Mar 2022, Jan Beulich wrote:
>> On 23.03.2022 01:22, Stefano Stabellini wrote:
>>> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>>>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>>>
>>>>> The xenstore event channel will be allocated for dom0less domains. It is
>>>>> necessary to have access to the evtchn_alloc_unbound function to do
>>>>> that, so make evtchn_alloc_unbound public.
>>>>>
>>>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>>>> originated from Xen before running any domains.)
>>>>>
>>>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> CC: Julien Grall <julien@xen.org>
>>>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> CC: George Dunlap <george.dunlap@citrix.com>
>>>>> CC: Jan Beulich <jbeulich@suse.com>
>>>>> CC: Wei Liu <wl@xen.org>
>>>>> ---
>>>>> Changes v3:
>>>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>>>> ---
>>>>>  xen/common/event_channel.c | 13 ++++++++-----
>>>>>  xen/include/xen/event.h    |  3 +++
>>>>>  2 files changed, 11 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>>>> index da88ad141a..be57d00a15 100644
>>>>> --- a/xen/common/event_channel.c
>>>>> +++ b/xen/common/event_channel.c
>>>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>>>      xsm_evtchn_close_post(chn);
>>>>>  }
>>>>>  
>>>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>>>  {
>>>>>      struct evtchn *chn;
>>>>>      struct domain *d;
>>>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>>          ERROR_EXIT_DOM(port, d);
>>>>>      chn = evtchn_from_port(d, port);
>>>>>  
>>>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> -    if ( rc )
>>>>> -        goto out;
>>>>> +    if ( !skip_xsm )
>>>>> +    {
>>>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> +        if ( rc )
>>>>> +            goto out;
>>>>> +    }
>>>>
>>>> Please do not subvert the security framework because it causes an
>>>> inconvenience. As Jan recommended, work within the XSM check to allow
>>>> your access so that we may ensure it is done safely. If you need any
>>>> help making modifications to XSM, please do not hesitate to reach out as
>>>> I will gladly help.
>>>
>>> Thank you!
>>>
>>> First let me reply to Jan: this series is only introducing 1 more call
>>> to evtchn_alloc_unbound, which is to allocate the special xenstore event
>>> channel, the one configured via
>>> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
>>>
>>> It is not meant to be a generic function, and it is not meant to be
>>> called more than once. It could (should?) be __init.
>>
>> How that? Its pre-existing use doesn't disappear, and requires it to be
>> non-__init.
> 
> Sorry I meant the new function (calling get_free_port) for the new
> use-case could be __init. The new function could be added to
> xen/common/event_channel.c or to xen/arch/arm/domain_build.c.
> 
> 
>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>> not work: it is based on the current domain having enough privileges to
>>> create the event channel. In this case, we have no current domain at
>>> all. The current domain is Xen itself.
>>
>> And DOM_XEN cannot be given the appropriate permission, perhaps
>> explicitly when using a real policy and by default in dummy and SILO
>> modes?
> 
> The issue is that the check is based on "current", not one of the
> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
> passing DOMID_XEN would be straightforward.
> 
> We would need to use a hack (like Daniel wrote in the other email) to
> set the idle_domain temporarily as a privileged domain only for the sake
> of passing an irrelevant (irrelevant as in "not relevant to this case")
> XSM check. That cannot be an improvement. Also, setting current to a
> "fake" domain is not great either.

My suggestion was not to intended to be simply a hack but looking at the
larger issue instead of simply doing a targeted fix for this one
instnace. While I cannot give an example right off hand, the reality is,
at least for hyperlaunch, that we cannot say for certain there will not
be further resource allocations that is protected by the security
framework and will require preliminary handling by the construction
logic in the hypervisor. The low-complexity approach is to address each
one in a case-by-case fashion using direct calls that go around the
security framework. A more security conscience, and higher complexity,
approach would be to consider a least-privilege approach and look at
introducing the ability to do controlled switching of contexts, i.e.
moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
granted only the necessary privileges to do the resource allocations in
which it is limited.

This is also not the first time this issue has come up, I don't recall
the exact thread but several months ago someone ran into the issue they
need to make a call to a resource function and was blocked by XSM
because DOMID_IDLE has no privileges. The reality is that the idea of
monolithic high-privileged entities is being dropped in favor of
least-privilege, and where possible hardware enforced, constraint. This
can be seen with Intel de-privileging SMM and running SMI handlers in
constrained ring 3. Arm is gaining capability pointers, CHERI, that will
enable the possibility for constrained, least-privileged kernel
subsystems. Would it not be advantageous for Xen to start moving in such
a direction that would enable it to provide a new level of safety and
security for consumers of Xen?

Coming back to the short-term, I would advocate for introducing the
concept and abstraction of constrained context switching through a set
of function calls, which would likely be under XSM to allow policy
enforcement. Likely the introductory implementation would just mask the
fact that it is just setting `is_privileged` for DOMID_IDLE. Future
evolution of the capability could see the introduction of new
"contexts", whether they are represented by a domain could be determined
then, and the ability to do controlled switching based on policy.

Just wanted to put more of my thinking out there before a path is taken.

> In the specific case of dom0less and this patch, this is the only
> instance of this issue and could be solved very straightforwardly by
> calling get_free_port directly as we discussed [1].
> 
> I know Julien had some reservations about that. Let's try to find a
> technical solution that makes everyone happy.
> 
> Maybe, instead of exporting get_free_port, we could create a new
> function in xen/common/event_channel.c and mark it as __init? This way:
> - we don't need to expose get_free_port
> - the new function would only be __init anyway, so no risk of runtime
>   misuse
> 
> What do you think?
> 
> [1] https://marc.info/?l=xen-devel&m=164197327305903


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 15:45           ` Daniel P. Smith
@ 2022-03-25 16:52             ` Jason Andryuk
  2022-03-28 14:54               ` Daniel P. Smith
  2022-03-25 16:55             ` Julien Grall
  1 sibling, 1 reply; 42+ messages in thread
From: Jason Andryuk @ 2022-03-25 16:52 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: Stefano Stabellini, Jan Beulich, xen-devel, Juergen Gross,
	Bertrand.Marquis, Julien Grall, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Andrew Cooper, George Dunlap, Wei Liu

On Fri, Mar 25, 2022 at 11:46 AM Daniel P. Smith <dpsmith.dev@gmail.com> wrote:
>
> On 3/24/22 20:30, Stefano Stabellini wrote:
> > On Wed, 23 Mar 2022, Jan Beulich wrote:
> >> On 23.03.2022 01:22, Stefano Stabellini wrote:
> >>> The existing XSM check in evtchn_alloc_unbound cannot work and should
> >>> not work: it is based on the current domain having enough privileges to
> >>> create the event channel. In this case, we have no current domain at
> >>> all. The current domain is Xen itself.
> >>
> >> And DOM_XEN cannot be given the appropriate permission, perhaps
> >> explicitly when using a real policy and by default in dummy and SILO
> >> modes?
> >
> > The issue is that the check is based on "current", not one of the
> > domains passed as an argument to evtchn_alloc_unbound. Otherwise,
> > passing DOMID_XEN would be straightforward.
> >
> > We would need to use a hack (like Daniel wrote in the other email) to
> > set the idle_domain temporarily as a privileged domain only for the sake
> > of passing an irrelevant (irrelevant as in "not relevant to this case")
> > XSM check. That cannot be an improvement. Also, setting current to a
> > "fake" domain is not great either.
>
> My suggestion was not to intended to be simply a hack but looking at the
> larger issue instead of simply doing a targeted fix for this one
> instnace. While I cannot give an example right off hand, the reality is,
> at least for hyperlaunch, that we cannot say for certain there will not
> be further resource allocations that is protected by the security
> framework and will require preliminary handling by the construction
> logic in the hypervisor. The low-complexity approach is to address each
> one in a case-by-case fashion using direct calls that go around the
> security framework. A more security conscience, and higher complexity,
> approach would be to consider a least-privilege approach and look at
> introducing the ability to do controlled switching of contexts, i.e.
> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
> granted only the necessary privileges to do the resource allocations in
> which it is limited.
>
> This is also not the first time this issue has come up, I don't recall
> the exact thread but several months ago someone ran into the issue they
> need to make a call to a resource function and was blocked by XSM
> because DOMID_IDLE has no privileges. The reality is that the idea of
> monolithic high-privileged entities is being dropped in favor of
> least-privilege, and where possible hardware enforced, constraint. This
> can be seen with Intel de-privileging SMM and running SMI handlers in
> constrained ring 3. Arm is gaining capability pointers, CHERI, that will
> enable the possibility for constrained, least-privileged kernel
> subsystems. Would it not be advantageous for Xen to start moving in such
> a direction that would enable it to provide a new level of safety and
> security for consumers of Xen?
>
> Coming back to the short-term, I would advocate for introducing the
> concept and abstraction of constrained context switching through a set
> of function calls, which would likely be under XSM to allow policy
> enforcement. Likely the introductory implementation would just mask the
> fact that it is just setting `is_privileged` for DOMID_IDLE. Future
> evolution of the capability could see the introduction of new
> "contexts", whether they are represented by a domain could be determined
> then, and the ability to do controlled switching based on policy.

For the specific case of evtchn_alloc_unbound, Flask's
xsm_evtchn_unbound has a side effect of labeling the event channel.
So skipping the hook will have unintended consequences for Flask.

xsm_evtchn_unbound could be split in two to have an access piece and a
labeling piece.  The access piece is run at hypercall entry, and the
labeling is still done in evtchn_alloc_unbound.  For Flask, labeling
depends on the two domain endpoints, but not current.

More generally, it seems to me there are too many xsm checks in the
middle of functions/operations.  They are fine for a normal entry via
hypercall, but they interfere with Xen's internal operations.  Xen
shouldn't be restricted in its own operations.  The live update people
hit it with domain creation, and I just posted a patch for
unmap_domain_pirq.

It would be more obvious for auditing if each hypercall entrypoint
applied xsm checks.  Make the allow/deny decision as early as
possible.  Then a worker function would be easily callable for the
Xen-internal case.  The flip side of that is the xsm hook may need
sub-op specific data to make its decision, so it fits to put it in the
sub-op function.  It seems to me the location of hooks was determined
by where the data they need is already available.  Re-arranging hooks
may require some duplication.

The xsm controls should clearly apply to the DomUs and other entities
managed by Xen.  xsm restricting Xen itself seems wrong.  Having
internal operations get denied by xsm may unintentionally subvert Xen
itself.

Yes, moving checks outward makes for an un-restricted middle.  But the
security server running in the same address space isn't going to help
against code exec inside the hypervisor.

I think I view it like this:  Why is a xen internal operation subject
to a security check?  When would one ever want to deny a xen internal
operation?

Regards,
Jason


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 15:45           ` Daniel P. Smith
  2022-03-25 16:52             ` Jason Andryuk
@ 2022-03-25 16:55             ` Julien Grall
  1 sibling, 0 replies; 42+ messages in thread
From: Julien Grall @ 2022-03-25 16:55 UTC (permalink / raw)
  To: Daniel P. Smith, Stefano Stabellini, Jan Beulich
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Andrew Cooper, George Dunlap,
	Wei Liu

Hi Daniel,

On 25/03/2022 15:45, Daniel P. Smith wrote:
>>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>>> not work: it is based on the current domain having enough privileges to
>>>> create the event channel. In this case, we have no current domain at
>>>> all. The current domain is Xen itself.
>>>
>>> And DOM_XEN cannot be given the appropriate permission, perhaps
>>> explicitly when using a real policy and by default in dummy and SILO
>>> modes?
>>
>> The issue is that the check is based on "current", not one of the
>> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
>> passing DOMID_XEN would be straightforward.
>>
>> We would need to use a hack (like Daniel wrote in the other email) to
>> set the idle_domain temporarily as a privileged domain only for the sake
>> of passing an irrelevant (irrelevant as in "not relevant to this case")
>> XSM check. That cannot be an improvement. Also, setting current to a
>> "fake" domain is not great either.
> 
> My suggestion was not to intended to be simply a hack but looking at the
> larger issue instead of simply doing a targeted fix for this one
> instnace. While I cannot give an example right off hand, the reality is,
> at least for hyperlaunch, that we cannot say for certain there will not
> be further resource allocations that is protected by the security
> framework and will require preliminary handling by the construction
> logic in the hypervisor. The low-complexity approach is to address each
> one in a case-by-case fashion using direct calls that go around the
> security framework. A more security conscience, and higher complexity,
> approach would be to consider a least-privilege approach and look at
> introducing the ability to do controlled switching of contexts, i.e.
> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
> granted only the necessary privileges to do the resource allocations in
> which it is limited.
> 
> This is also not the first time this issue has come up, I don't recall
> the exact thread but several months ago someone ran into the issue they
> need to make a call to a resource function and was blocked by XSM
> because DOMID_IDLE has no privileges.

This was in the context of Live-Updating Xen. We are trying to re-use as 
much as possible the code used by the hypercalls. Those functions may 
contain xsm check that would fail because current would be an idle vCPU.

For the full context:

https://lore.kernel.org/xen-devel/bfd645cf42ef7786183be15c222ad04beed362c0.camel@xen.org/

> The reality is that the idea of
> monolithic high-privileged entities is being dropped in favor of
> least-privilege, and where possible hardware enforced, constraint. This
> can be seen with Intel de-privileging SMM and running SMI handlers in
> constrained ring 3. Arm is gaining capability pointers, CHERI, that will
> enable the possibility for constrained, least-privileged kernel
> subsystems. Would it not be advantageous for Xen to start moving in such
> a direction that would enable it to provide a new level of safety and
> security for consumers of Xen?
> 
> Coming back to the short-term, I would advocate for introducing the
> concept and abstraction of constrained context switching through a set
> of function calls, which would likely be under XSM to allow policy
> enforcement. Likely the introductory implementation would just mask the
> fact that it is just setting `is_privileged` for DOMID_IDLE. Future
> evolution of the capability could see the introduction of new
> "contexts", whether they are represented by a domain could be determined
> then, and the ability to do controlled switching based on policy.

+1 with this idea.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25  0:30         ` Stefano Stabellini
  2022-03-25  8:17           ` Jan Beulich
  2022-03-25 15:45           ` Daniel P. Smith
@ 2022-03-25 17:19           ` Julien Grall
  2022-03-25 21:05             ` Stefano Stabellini
  2 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-03-25 17:19 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Andrew Cooper, George Dunlap,
	Wei Liu, Daniel P. Smith

Hi Stefano,

On 25/03/2022 00:30, Stefano Stabellini wrote:
> On Wed, 23 Mar 2022, Jan Beulich wrote:
>> On 23.03.2022 01:22, Stefano Stabellini wrote:
>>> On Tue, 15 Mar 2022, Daniel P. Smith wrote:
>>>> On 1/28/22 16:33, Stefano Stabellini wrote:
>>>>> From: Luca Miccio <lucmiccio@gmail.com>
>>>>>
>>>>> The xenstore event channel will be allocated for dom0less domains. It is
>>>>> necessary to have access to the evtchn_alloc_unbound function to do
>>>>> that, so make evtchn_alloc_unbound public.
>>>>>
>>>>> Add a skip_xsm parameter to allow disabling the XSM check in
>>>>> evtchn_alloc_unbound (xsm_evtchn_unbound wouldn't work for a call
>>>>> originated from Xen before running any domains.)
>>>>>
>>>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>>>> CC: Julien Grall <julien@xen.org>
>>>>> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>>>> CC: Bertrand Marquis <bertrand.marquis@arm.com>
>>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>>> CC: George Dunlap <george.dunlap@citrix.com>
>>>>> CC: Jan Beulich <jbeulich@suse.com>
>>>>> CC: Wei Liu <wl@xen.org>
>>>>> ---
>>>>> Changes v3:
>>>>> - expose evtchn_alloc_unbound, assing a skip_xsm parameter
>>>>> ---
>>>>>   xen/common/event_channel.c | 13 ++++++++-----
>>>>>   xen/include/xen/event.h    |  3 +++
>>>>>   2 files changed, 11 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
>>>>> index da88ad141a..be57d00a15 100644
>>>>> --- a/xen/common/event_channel.c
>>>>> +++ b/xen/common/event_channel.c
>>>>> @@ -284,7 +284,7 @@ void evtchn_free(struct domain *d, struct evtchn *chn)
>>>>>       xsm_evtchn_close_post(chn);
>>>>>   }
>>>>>   
>>>>> -static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>> +int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
>>>>>   {
>>>>>       struct evtchn *chn;
>>>>>       struct domain *d;
>>>>> @@ -301,9 +301,12 @@ static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
>>>>>           ERROR_EXIT_DOM(port, d);
>>>>>       chn = evtchn_from_port(d, port);
>>>>>   
>>>>> -    rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> -    if ( rc )
>>>>> -        goto out;
>>>>> +    if ( !skip_xsm )
>>>>> +    {
>>>>> +        rc = xsm_evtchn_unbound(XSM_TARGET, d, chn, alloc->remote_dom);
>>>>> +        if ( rc )
>>>>> +            goto out;
>>>>> +    }
>>>>
>>>> Please do not subvert the security framework because it causes an
>>>> inconvenience. As Jan recommended, work within the XSM check to allow
>>>> your access so that we may ensure it is done safely. If you need any
>>>> help making modifications to XSM, please do not hesitate to reach out as
>>>> I will gladly help.
>>>
>>> Thank you!
>>>
>>> First let me reply to Jan: this series is only introducing 1 more call
>>> to evtchn_alloc_unbound, which is to allocate the special xenstore event
>>> channel, the one configured via
>>> d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN].
>>>
>>> It is not meant to be a generic function, and it is not meant to be
>>> called more than once. It could (should?) be __init.
>>
>> How that? Its pre-existing use doesn't disappear, and requires it to be
>> non-__init.
> 
> Sorry I meant the new function (calling get_free_port) for the new
> use-case could be __init. The new function could be added to
> xen/common/event_channel.c or to xen/arch/arm/domain_build.c.
> 
> 
>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>> not work: it is based on the current domain having enough privileges to
>>> create the event channel. In this case, we have no current domain at
>>> all. The current domain is Xen itself.
>>
>> And DOM_XEN cannot be given the appropriate permission, perhaps
>> explicitly when using a real policy and by default in dummy and SILO
>> modes?
> 
> The issue is that the check is based on "current", not one of the
> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
> passing DOMID_XEN would be straightforward.
> 
> We would need to use a hack (like Daniel wrote in the other email) to
> set the idle_domain temporarily as a privileged domain only for the sake
> of passing an irrelevant (irrelevant as in "not relevant to this case")
> XSM check. That cannot be an improvement. Also, setting current to a
> "fake" domain is not great either.
I agree they are not great. But this needs to be compared with the other 
proposals.

AFAIU, your proposal is to duplicate code. This brings other risks such 
as duplicated bug and more code to maintain.

In your case, you only need one helper. But in some other context (e.g. 
Live-Update and it looks like Hyperlaunch), we may need to re-use more 
hypercalls code.

So to me, the idea of switching to a "fake" domain or bypassing the 
check is more appealing. I have a preference for the "fake" domain here.

> 
> In the specific case of dom0less and this patch, this is the only
> instance of this issue and could be solved very straightforwardly by
> calling get_free_port directly as we discussed [1].

Exporting get_free_port() is a no-go for me. This can be easily mis-used 
and in fact you already did it in your proposal by not using the proper 
locking (I appreciate this is meant to be boot code only).

> 
> I know Julien had some reservations about that. Let's try to find a
> technical solution that makes everyone happy.
> 
> Maybe, instead of exporting get_free_port, we could create a new
> function in xen/common/event_channel.c and mark it as __init? This way:
> - we don't need to expose get_free_port
> - the new function would only be __init anyway, so no risk of runtime
>    misuse

I think the code duplication is short-sighted. I could possibly accept 
one instance, but I suspect the proposal [1] will end up to add more.
So IMHO we should try to resolve it generically now.

Cheers,

[1] 
<4836304496e6fbbea41348ed8cc9fcf6b0f3e893.1648049827.git.rahul.singh@arm.com>

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property
  2022-03-23  0:08     ` Stefano Stabellini
@ 2022-03-25 17:58       ` Julien Grall
  2022-04-01  0:33         ` Stefano Stabellini
  0 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-03-25 17:58 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Stefano Stabellini

Hi Stefano,

On 23/03/2022 00:08, Stefano Stabellini wrote:
> On Sat, 29 Jan 2022, Julien Grall wrote:
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index 6931c022a2..9144d6c0b6 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -2963,6 +2963,7 @@ static int __init construct_domU(struct domain *d,
>>>                                     const struct dt_device_node *node)
>>>    {
>>>        struct kernel_info kinfo = {};
>>> +    const char *dom0less_enhanced;
>>>        int rc;
>>>        u64 mem;
>>>    @@ -2978,6 +2979,12 @@ static int __init construct_domU(struct domain *d,
>>>          kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
>>>    +    rc = dt_property_read_string(node, "xen,enhanced",
>>> &dom0less_enhanced);
>>> +    if ( rc == -EILSEQ ||
>>
>> I think the use an -EILSEQ wants an explanation. In a previous version, you
>> wrote that the value would be returned when:
>>
>> fdt set /chosen/domU0 xen,enhanced
>>
>> But it is not clear why. Can you print pp->value, pp->length, strnlen(..) when
>> this happens?
> 
> I added in dt_property_read_string:
> 
> printk("DEBUG %s %d value=%s value[0]=%d length=%u len=%lu\n",__func__,__LINE__,(char*)pp->value, *((char*)pp->value),pp->length, strlen(pp->value));
> 
> This is the output:
> (XEN) DEBUG dt_property_read_string 205 value= value[0]=0 length=0 len=0

Thanks posting the log!

For convenience, I am copying the comment on top of 
dt_property_read_string() prototype:

  * Search for a property in a device tree node and retrieve a null
  * terminated string value (pointer to data, not a copy). Returns 0 on
  * success, -EINVAL if the property does not exist, -ENODATA if property
  * doest not have value, and -EILSEQ if the string is not
  * null-terminated with the length of the property data.

Per your log, the length is NULL so I would have assumed -ENODATA would 
be returned. Looking at the implementation:

     const struct dt_property *pp = dt_find_property(np, propname, NULL);

     if ( !pp )
         return -EINVAL;
     if ( !pp->value )
         return -ENODATA;
     if ( strnlen(pp->value, pp->length) >= pp->length )
         return -EILSEQ;

We consider that the property when pp->value is NULL. However, AFAICT, 
we never set pp->value to NULL (see unflatten_dt_node()).

So I think there is a bug in the implementation. I would keep the check 
!pp->value (for hardening purpose) and also return -ENODATA when 
!pp->length.

Most of our device-tree code is from Linux. Looking at v5.17, the bug 
seems to be present there too. This would want to be fixed there too.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot
  2022-03-23  1:18     ` Stefano Stabellini
@ 2022-03-25 18:46       ` Julien Grall
  2022-04-01  0:34         ` Stefano Stabellini
  0 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-03-25 18:46 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini

Hi Stefano,

On 23/03/2022 01:18, Stefano Stabellini wrote:
> On Sat, 29 Jan 2022, Julien Grall wrote:
>> On 28/01/2022 21:33, Stefano Stabellini wrote:
>>> +    rc = evtchn_alloc_unbound(&alloc, true);
>>> +    if ( rc )
>>> +    {
>>> +        printk("Failed allocating event channel for domain\n");
>>> +        return rc;
>>> +    }
>>> +
>>> +    d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>    static int __init construct_domU(struct domain *d,
>>>                                     const struct dt_device_node *node)
>>>    {
>>> @@ -3014,7 +3043,19 @@ static int __init construct_domU(struct domain *d,
>>>            return rc;
>>>          if ( kinfo.vpl011 )
>>> +    {
>>>            rc = domain_vpl011_init(d, NULL);
>>> +        if ( rc < 0 )
>>> +            return rc;
>>> +    }
>>> +
>>> +    if ( kinfo.dom0less_enhanced )
>>> +    {
>>> +        rc = alloc_xenstore_evtchn(d);
>>> +        if ( rc < 0 )
>>> +            return rc;
>>> +        d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
>>
>> I think it would be easy to allocate the page right now. So what prevent us to
>> do it right now?
> 
> Because (as you noted as a comment to the following patch) as soon as
> d->arch.hvm.params[HVM_PARAM_STORE_PFN] is set the guest can continue
> with the initialization and will expect the right data to be set on the
> page.

I think you misunderstood my question. From my understanding, at the 
moment, Linux would break with your proposal. So you need to modify the 
kernel in order to support what you are doing.

IOW, we have room to decide the approach here.

Xenstore protocol has a way to allow (re)connection (see 
docs/mics/xenstore-ring.txt). This feature looks quite suited to what 
you are trying to do here (we want to delay the connection).

The main advantage with this approach is the resources allocation for 
Xenstore will be done in the place and the work in Linux could be 
re-used for non-dom0less domain.

Have you explored it?

> In other words: it is not enough to have the pfn allocated, we
> also need xenstore to initialize it. At that point, it is better to do
> both later from init-dom0less.c.
See above. My main concern with your proposal is the allocation is split 
and this making more difficult to understand the initialization. Could 
you write some documentation how everything is meant to work?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 17:19           ` Julien Grall
@ 2022-03-25 21:05             ` Stefano Stabellini
  2022-03-28 14:56               ` Julien Grall
                                 ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-25 21:05 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Jan Beulich, xen-devel, jgross,
	Bertrand.Marquis, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Andrew Cooper, George Dunlap, Wei Liu,
	Daniel P. Smith

On Fri, 25 Mar 2022, Julien Grall wrote:
> So to me, the idea of switching to a "fake" domain or bypassing the check is
> more appealing. I have a preference for the "fake" domain here.

As a maintainer, I am not opposed to the "fake"/"contructor" domain
concept.  It all depends on how many instances of this issue we are
going to have.  This is the only one on xen-devel so far. I don't think
it is worth adding a constructor domain for one instance only.  But I
agree with you and Daniel that if we end up with several instances, then
the constructor domain approach is probably the best option overall.


As a contributor, sadly I won't be able to spend a lot of time on this
in the following months. If a significant rework is required, I don't
think I'll be able to do it, at least not for this Xen release (and it
would be nice to have dom0less PV drivers in the coming release.) If
Daniel is willing, I could add his "idle_domain is_priv" patch to this
series.  Not as clean as a proper constructor domain but it would work
and it would be simple. It could be evolved into a nicer constructor
domain later.

This is not my preference but I could do that if Julien and Jan prefer
this approach and if Daniel is happy to share his patch.


> AFAIU, your proposal is to duplicate code. This brings other risks such as
> duplicated bug and more code to maintain.

Got it. I'll make one last attempt at a proposal that doesn't involve
the fake constructor domain. The goal is to avoid code duplication while
providing a safe way forward to make progress with only a small amount
of changes. What if we:

- rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
- add a skip_xsm parameter to _evtchn_alloc_unbound
- introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
  false (same interface as today's evtchn_alloc_unbound)
- introduce an __init early_evtchn_alloc_unbound public function that
  sets skip_xsm to true

This way:
- we have a single implementation in _evtchn_alloc_unbound (no code
  duplication)
- the only function exposed that skips the XSM check is __init
- evtchn_alloc_unbound continue to have the XSM check same as today


E.g.:
static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
{
    ...
}

static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
{
    return _evtchn_alloc_unbound(alloc, false);    
}

int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
{
    return _evtchn_alloc_unbound(alloc, true);
}


Would this be good enough for now?


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 16:52             ` Jason Andryuk
@ 2022-03-28 14:54               ` Daniel P. Smith
  0 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Smith @ 2022-03-28 14:54 UTC (permalink / raw)
  To: Jason Andryuk
  Cc: Stefano Stabellini, Jan Beulich, xen-devel, Juergen Gross,
	Bertrand.Marquis, Julien Grall, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Andrew Cooper, George Dunlap, Wei Liu

On 3/25/22 12:52, Jason Andryuk wrote:
> On Fri, Mar 25, 2022 at 11:46 AM Daniel P. Smith <dpsmith.dev@gmail.com> wrote:
>>
>> On 3/24/22 20:30, Stefano Stabellini wrote:
>>> On Wed, 23 Mar 2022, Jan Beulich wrote:
>>>> On 23.03.2022 01:22, Stefano Stabellini wrote:
>>>>> The existing XSM check in evtchn_alloc_unbound cannot work and should
>>>>> not work: it is based on the current domain having enough privileges to
>>>>> create the event channel. In this case, we have no current domain at
>>>>> all. The current domain is Xen itself.
>>>>
>>>> And DOM_XEN cannot be given the appropriate permission, perhaps
>>>> explicitly when using a real policy and by default in dummy and SILO
>>>> modes?
>>>
>>> The issue is that the check is based on "current", not one of the
>>> domains passed as an argument to evtchn_alloc_unbound. Otherwise,
>>> passing DOMID_XEN would be straightforward.
>>>
>>> We would need to use a hack (like Daniel wrote in the other email) to
>>> set the idle_domain temporarily as a privileged domain only for the sake
>>> of passing an irrelevant (irrelevant as in "not relevant to this case")
>>> XSM check. That cannot be an improvement. Also, setting current to a
>>> "fake" domain is not great either.
>>
>> My suggestion was not to intended to be simply a hack but looking at the
>> larger issue instead of simply doing a targeted fix for this one
>> instnace. While I cannot give an example right off hand, the reality is,
>> at least for hyperlaunch, that we cannot say for certain there will not
>> be further resource allocations that is protected by the security
>> framework and will require preliminary handling by the construction
>> logic in the hypervisor. The low-complexity approach is to address each
>> one in a case-by-case fashion using direct calls that go around the
>> security framework. A more security conscience, and higher complexity,
>> approach would be to consider a least-privilege approach and look at
>> introducing the ability to do controlled switching of contexts, i.e.
>> moving `current` from DOMID_IDLE to DOMID_CONSTRUCT, to one that is
>> granted only the necessary privileges to do the resource allocations in
>> which it is limited.
>>
>> This is also not the first time this issue has come up, I don't recall
>> the exact thread but several months ago someone ran into the issue they
>> need to make a call to a resource function and was blocked by XSM
>> because DOMID_IDLE has no privileges. The reality is that the idea of
>> monolithic high-privileged entities is being dropped in favor of
>> least-privilege, and where possible hardware enforced, constraint. This
>> can be seen with Intel de-privileging SMM and running SMI handlers in
>> constrained ring 3. Arm is gaining capability pointers, CHERI, that will
>> enable the possibility for constrained, least-privileged kernel
>> subsystems. Would it not be advantageous for Xen to start moving in such
>> a direction that would enable it to provide a new level of safety and
>> security for consumers of Xen?
>>
>> Coming back to the short-term, I would advocate for introducing the
>> concept and abstraction of constrained context switching through a set
>> of function calls, which would likely be under XSM to allow policy
>> enforcement. Likely the introductory implementation would just mask the
>> fact that it is just setting `is_privileged` for DOMID_IDLE. Future
>> evolution of the capability could see the introduction of new
>> "contexts", whether they are represented by a domain could be determined
>> then, and the ability to do controlled switching based on policy.
> 
> For the specific case of evtchn_alloc_unbound, Flask's
> xsm_evtchn_unbound has a side effect of labeling the event channel.
> So skipping the hook will have unintended consequences for Flask.
> 
> xsm_evtchn_unbound could be split in two to have an access piece and a
> labeling piece.  The access piece is run at hypercall entry, and the
> labeling is still done in evtchn_alloc_unbound.  For Flask, labeling
> depends on the two domain endpoints, but not current.
> 
> More generally, it seems to me there are too many xsm checks in the
> middle of functions/operations.  They are fine for a normal entry via
> hypercall, but they interfere with Xen's internal operations.  Xen
> shouldn't be restricted in its own operations.  The live update people
> hit it with domain creation, and I just posted a patch for
> unmap_domain_pirq.
> 
> It would be more obvious for auditing if each hypercall entrypoint
> applied xsm checks.  Make the allow/deny decision as early as
> possible.  Then a worker function would be easily callable for the
> Xen-internal case.  The flip side of that is the xsm hook may need
> sub-op specific data to make its decision, so it fits to put it in the
> sub-op function.  It seems to me the location of hooks was determined
> by where the data they need is already available.  Re-arranging hooks
> may require some duplication.

While it can be inconvenient in some cases, one of the purposes of flask
is to provide fine(er) grained access control over resources and for Xen
the path to a resource is often through a multiplexed interface. As such
I would be very cautious in considering the addition of distance between
the time of check and the time of access to a resource.

It has been some time since it has occurred but I am aware that there
was an audit conducted to ensure all necessary resources were covered
and the placement of decision points were in the correct locations. With
that said, the code has evolved and it may be time for another audit
(though it would take some cycles to do such an audit).

> The xsm controls should clearly apply to the DomUs and other entities
> managed by Xen.  xsm restricting Xen itself seems wrong.  Having
> internal operations get denied by xsm may unintentionally subvert Xen
> itself.
> 
> Yes, moving checks outward makes for an un-restricted middle.  But the
> security server running in the same address space isn't going to help
> against code exec inside the hypervisor.
> 
> I think I view it like this:  Why is a xen internal operation subject
> to a security check?  When would one ever want to deny a xen internal
> operation?

In the past this was the standard of thinking but the reality of the
last few years has demonstrated this approach can no longer be relied
upon. If this approach was safe, then Intel would have had no reason to
put the work into providing the necessary mechanisms to de-privilege SMI
handlers, just to pick a recent example. Hardware changes are coming,
and in the case of OpenPOWER and Arm it is already here, that will
enable kernels and hypervisors to de-privilege sections of their
processing logic depending on the risk, for example ensuring code
processing input from a hypercall cannot reach memory used by the
security server.

My point here is that instead of cementing a historical approach into
the code base, that perhaps it would be worth considering using a more
flexible approach that provides the necessary modularity to enable the
introduction of this kind of capability, even if today it is merely a
facade, without having to rework the interfaces to remove all the
secondary direct paths.

V/r,
dps


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 21:05             ` Stefano Stabellini
@ 2022-03-28 14:56               ` Julien Grall
  2022-03-28 15:11               ` Daniel P. Smith
  2022-03-28 16:38               ` Daniel P. Smith
  2 siblings, 0 replies; 42+ messages in thread
From: Julien Grall @ 2022-03-28 14:56 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jan Beulich, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini,
	Andrew Cooper, George Dunlap, Wei Liu, Daniel P. Smith

Hi Stefano,

On 25/03/2022 21:05, Stefano Stabellini wrote:
> On Fri, 25 Mar 2022, Julien Grall wrote: > As a contributor, sadly I won't be able to spend a lot of time on this
> in the following months. If a significant rework is required, I don't
> think I'll be able to do it, at least not for this Xen release (and it
> would be nice to have dom0less PV drivers in the coming release.) If
> Daniel is willing, I could add his "idle_domain is_priv" patch to this
> series.  Not as clean as a proper constructor domain but it would work
> and it would be simple. It could be evolved into a nicer constructor
> domain later.
> 
> This is not my preference but I could do that if Julien and Jan prefer
> this approach and if Daniel is happy to share his patch.

This is still my preference because we are avoiding to push the problem 
to the unlucky person that will need to introduce another (or multiple) 
'skip_xsm'.

>> AFAIU, your proposal is to duplicate code. This brings other risks such as
>> duplicated bug and more code to maintain.
> 
> Got it. I'll make one last attempt at a proposal that doesn't involve
> the fake constructor domain. The goal is to avoid code duplication while
> providing a safe way forward to make progress with only a small amount
> of changes. What if we:
> 
> - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> - add a skip_xsm parameter to _evtchn_alloc_unbound
> - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
>    false (same interface as today's evtchn_alloc_unbound)
> - introduce an __init early_evtchn_alloc_unbound public function that
>    sets skip_xsm to true
> 
> This way:
> - we have a single implementation in _evtchn_alloc_unbound (no code
>    duplication)
> - the only function exposed that skips the XSM check is __init
> - evtchn_alloc_unbound continue to have the XSM check same as today
> 
> 
> E.g.:
> static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> {
>      ...
> }
> 
> static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>      return _evtchn_alloc_unbound(alloc, false);
> }
> 
> int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>      return _evtchn_alloc_unbound(alloc, true);
> }
> 
> 
> Would this be good enough for now?

I would be OK with that. Note that, I think we need to protect the use 
of skip_xsm with evaluate_nospec().

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 21:05             ` Stefano Stabellini
  2022-03-28 14:56               ` Julien Grall
@ 2022-03-28 15:11               ` Daniel P. Smith
  2022-03-28 16:38               ` Daniel P. Smith
  2 siblings, 0 replies; 42+ messages in thread
From: Daniel P. Smith @ 2022-03-28 15:11 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Jan Beulich, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini,
	Andrew Cooper, George Dunlap, Wei Liu

On 3/25/22 17:05, Stefano Stabellini wrote:
> On Fri, 25 Mar 2022, Julien Grall wrote:
>> So to me, the idea of switching to a "fake" domain or bypassing the check is
>> more appealing. I have a preference for the "fake" domain here.
> 
> As a maintainer, I am not opposed to the "fake"/"contructor" domain
> concept.  It all depends on how many instances of this issue we are
> going to have.  This is the only one on xen-devel so far. I don't think
> it is worth adding a constructor domain for one instance only.  But I
> agree with you and Daniel that if we end up with several instances, then
> the constructor domain approach is probably the best option overall.

The constructor domain still needs more discussion and would likely be
part of a larger approach that will require buy-in from several
maintainers and should be looking to solve a more general problem
internal access control of which domain construction within the
hypervisor is just one case. For this I would be glad to start a working
group, for which the start of can add to the next community call agenda.

> As a contributor, sadly I won't be able to spend a lot of time on this
> in the following months. If a significant rework is required, I don't
> think I'll be able to do it, at least not for this Xen release (and it
> would be nice to have dom0less PV drivers in the coming release.) If
> Daniel is willing, I could add his "idle_domain is_priv" patch to this
> series.  Not as clean as a proper constructor domain but it would work
> and it would be simple. It could be evolved into a nicer constructor
> domain later.
> 
> This is not my preference but I could do that if Julien and Jan prefer
> this approach and if Daniel is happy to share his patch.

I can look to spin out a general version of what I am doing, likely
exposed as an XSM call so it can be handled appropriately across policies.

>> AFAIU, your proposal is to duplicate code. This brings other risks such as
>> duplicated bug and more code to maintain.
> 
> Got it. I'll make one last attempt at a proposal that doesn't involve
> the fake constructor domain. The goal is to avoid code duplication while
> providing a safe way forward to make progress with only a small amount
> of changes. What if we:
> 
> - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> - add a skip_xsm parameter to _evtchn_alloc_unbound
> - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
>   false (same interface as today's evtchn_alloc_unbound)
> - introduce an __init early_evtchn_alloc_unbound public function that
>   sets skip_xsm to true
> 
> This way:
> - we have a single implementation in _evtchn_alloc_unbound (no code
>   duplication)
> - the only function exposed that skips the XSM check is __init
> - evtchn_alloc_unbound continue to have the XSM check same as today
> 
> 
> E.g.:
> static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> {
>     ...
> }
> 
> static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, false);    
> }
> 
> int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, true);
> }
> 
> 
> Would this be good enough for now?

I guest the question is if it is okay for this to exist until the new
XSM calls are found to be acceptable and then this is reverted/changed
to the XSM calls?

v/r
dps


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-03-23  2:50     ` Stefano Stabellini
@ 2022-03-28 16:25       ` Julien Grall
  2022-04-01  0:35         ` Stefano Stabellini
  0 siblings, 1 reply; 42+ messages in thread
From: Julien Grall @ 2022-03-28 16:25 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Wei Liu, Anthony PERARD

Hi Stefano,

On 23/03/2022 02:50, Stefano Stabellini wrote:
> On Sat, 29 Jan 2022, Julien Grall wrote:
>> On 28/01/2022 21:33, Stefano Stabellini wrote:
>>> +    libxl_uuid uuid;
>>> +    uint64_t v;
>>> +    int rc;
>>> +
>>> +    printf("Init dom0less domain: %d\n", info->domid);
>>> +    dom.guest_domid = info->domid;
>>> +    dom.xenstore_domid = 0;
>>> +    dom.xch = xc_interface_open(0, 0, 0);
>>> +
>>> +    rc = xc_hvm_param_get(dom.xch, info->domid, HVM_PARAM_STORE_EVTCHN,
>>> &v);
>>> +    if (rc != 0) {
>>> +        printf("Failed to get HVM_PARAM_STORE_EVTCHN\n");
>>> +        return 1;
>>> +    }
>>> +    dom.xenstore_evtchn = v;
>>> +
>>> +    /* Console won't be initialized but set its data for completeness */
>>> +    dom.console_domid = 0;
>>
>> I find a bit odd you set the domid but not the event channel, page. Can you
>> explain?
>>
>> Actually, can you explain why only half of the structure is initialized?
>   
> We are only using the struct xc_dom_image parameter for
> xc_dom_gnttab_init, and nothing else. We only need very few fields in
> it. Basically we could call xc_dom_gnttab_seed directly and then we
> wouldn't need struct xc_dom_image at all. Only the needed fields are
> currently populated.

I would prefer if we don't use xc_dom_image and call 
xc_dom_gnttab_seed(). This would:
   1) reduce the risk that one of the unitialized field is will be 
mistakenly use
   2) extra documentation (currently missing) to explain which fields is 
used.

>>> +
>>> +    /* Alloc magic pages */
>>> +    if (alloc_magic_pages(info, &dom) != 0) {
>>> +        printf("Error on alloc magic pages\n");
>>> +        return 1;
>>> +    }
>>> +
>>> +    xc_dom_gnttab_init(&dom);
>>
>> This call as the risk to break the guest if the dom0 Linux doesn't support the
>> acquire interface. This is because it will punch a hole in the domain memory
>> where the grant-table may have already been mapped.
>>
>> Also, this function could fails.
> 
> I'll check for return errors. Dom0less is for fully static
> configurations so I think it is OK to return error and abort if
> something unexpected happens: dom0less' main reason for being is that
> there is nothing unexpected :-)
Does this mean the caller will have to reboot the system if there is an 
error? IOW, we don't expect them to call ./init-dom0less twice.

>>> +
>>> +    libxl_uuid_generate(&uuid);
>>> +    xc_domain_sethandle(dom.xch, info->domid, libxl_uuid_bytearray(&uuid));
>>> +
>>> +    rc = gen_stub_json_config(info->domid, &uuid);
>>> +    if (rc)
>>> +        err(1, "gen_stub_json_config");
>>> +
>>> +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
>>> +    if (rc)
>>> +        err(1, "writing to xenstore");
>>> +
>>> +    xs_introduce_domain(xsh, info->domid,
>>> +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
>>> +            dom.xenstore_evtchn);
>>
>> xs_introduce_domain() can technically fails.
> 
> OK
> 
> 
>>> +    return 0;
>>> +}
>>> +
>>> +/* Check if domain has been configured in XS */
>>> +static bool domain_exists(struct xs_handle *xsh, int domid)
>>> +{
>>> +    return xs_is_domain_introduced(xsh, domid);
>>> +}
>>
>> Would not this lead to initialize a domain with PV driver disabled?
> 
> I am not sure I understood your question, but I'll try to answer anyway.
> This check is purely to distinguish dom0less guests, which needs further
> initializations, from regular guests (e.g. xl guests) that don't need
> any actions taken here.

Dom0less domUs can be divided in two categories based on whether they 
are xen aware (e.g. xen,enhanced is set).

Looking at this script, it seems to assume that all dom0less domUs are 
Xen aware. So it will end up to allocate Xenstore ring and call 
xs_introduce_domain(). I suspect the call will end up to fail because 
the event channel would be 0.

So did you try to use this script on a platform where there only xen 
aware domU and/or a mix?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-25 21:05             ` Stefano Stabellini
  2022-03-28 14:56               ` Julien Grall
  2022-03-28 15:11               ` Daniel P. Smith
@ 2022-03-28 16:38               ` Daniel P. Smith
  2022-03-28 23:23                 ` Stefano Stabellini
  2 siblings, 1 reply; 42+ messages in thread
From: Daniel P. Smith @ 2022-03-28 16:38 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Jan Beulich, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini,
	Andrew Cooper, George Dunlap, Wei Liu

On 3/25/22 17:05, Stefano Stabellini wrote:
> On Fri, 25 Mar 2022, Julien Grall wrote:
>> So to me, the idea of switching to a "fake" domain or bypassing the check is
>> more appealing. I have a preference for the "fake" domain here.
> 
> As a maintainer, I am not opposed to the "fake"/"contructor" domain
> concept.  It all depends on how many instances of this issue we are
> going to have.  This is the only one on xen-devel so far. I don't think
> it is worth adding a constructor domain for one instance only.  But I
> agree with you and Daniel that if we end up with several instances, then
> the constructor domain approach is probably the best option overall.
> 
> 
> As a contributor, sadly I won't be able to spend a lot of time on this
> in the following months. If a significant rework is required, I don't
> think I'll be able to do it, at least not for this Xen release (and it
> would be nice to have dom0less PV drivers in the coming release.) If
> Daniel is willing, I could add his "idle_domain is_priv" patch to this
> series.  Not as clean as a proper constructor domain but it would work
> and it would be simple. It could be evolved into a nicer constructor
> domain later.
> 
> This is not my preference but I could do that if Julien and Jan prefer
> this approach and if Daniel is happy to share his patch.
> 
> 
>> AFAIU, your proposal is to duplicate code. This brings other risks such as
>> duplicated bug and more code to maintain.
> 
> Got it. I'll make one last attempt at a proposal that doesn't involve
> the fake constructor domain. The goal is to avoid code duplication while
> providing a safe way forward to make progress with only a small amount
> of changes. What if we:
> 
> - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> - add a skip_xsm parameter to _evtchn_alloc_unbound
> - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
>   false (same interface as today's evtchn_alloc_unbound)
> - introduce an __init early_evtchn_alloc_unbound public function that
>   sets skip_xsm to true
> 
> This way:
> - we have a single implementation in _evtchn_alloc_unbound (no code
>   duplication)
> - the only function exposed that skips the XSM check is __init
> - evtchn_alloc_unbound continue to have the XSM check same as today
> 
> 
> E.g.:
> static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> {
>     ...
> }
> 
> static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, false);    
> }
> 
> int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> {
>     return _evtchn_alloc_unbound(alloc, true);
> }
> 
> 
> Would this be good enough for now?

Please see the RFC patch I just posted[1], IMHO I think this is a safer
approach for this specific instance.

[1]
https://lore.kernel.org/xen-devel/20220328203622.30961-1-dpsmith@apertussolutions.com/T/#t

v/r,
dps


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 2/5] xen: make evtchn_alloc_unbound public
  2022-03-28 16:38               ` Daniel P. Smith
@ 2022-03-28 23:23                 ` Stefano Stabellini
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-03-28 23:23 UTC (permalink / raw)
  To: Daniel P. Smith
  Cc: Stefano Stabellini, Julien Grall, Jan Beulich, xen-devel, jgross,
	Bertrand.Marquis, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Andrew Cooper, George Dunlap, Wei Liu

On Mon, 28 Mar 2022, Daniel P. Smith wrote:
> On 3/25/22 17:05, Stefano Stabellini wrote:
> > On Fri, 25 Mar 2022, Julien Grall wrote:
> >> So to me, the idea of switching to a "fake" domain or bypassing the check is
> >> more appealing. I have a preference for the "fake" domain here.
> > 
> > As a maintainer, I am not opposed to the "fake"/"contructor" domain
> > concept.  It all depends on how many instances of this issue we are
> > going to have.  This is the only one on xen-devel so far. I don't think
> > it is worth adding a constructor domain for one instance only.  But I
> > agree with you and Daniel that if we end up with several instances, then
> > the constructor domain approach is probably the best option overall.
> > 
> > 
> > As a contributor, sadly I won't be able to spend a lot of time on this
> > in the following months. If a significant rework is required, I don't
> > think I'll be able to do it, at least not for this Xen release (and it
> > would be nice to have dom0less PV drivers in the coming release.) If
> > Daniel is willing, I could add his "idle_domain is_priv" patch to this
> > series.  Not as clean as a proper constructor domain but it would work
> > and it would be simple. It could be evolved into a nicer constructor
> > domain later.
> > 
> > This is not my preference but I could do that if Julien and Jan prefer
> > this approach and if Daniel is happy to share his patch.
> > 
> > 
> >> AFAIU, your proposal is to duplicate code. This brings other risks such as
> >> duplicated bug and more code to maintain.
> > 
> > Got it. I'll make one last attempt at a proposal that doesn't involve
> > the fake constructor domain. The goal is to avoid code duplication while
> > providing a safe way forward to make progress with only a small amount
> > of changes. What if we:
> > 
> > - rename evtchn_alloc_unbound to _evtchn_alloc_unbound (still static)
> > - add a skip_xsm parameter to _evtchn_alloc_unbound
> > - introduce a wrapper evtchn_alloc_unbound that always set skip_xsm to
> >   false (same interface as today's evtchn_alloc_unbound)
> > - introduce an __init early_evtchn_alloc_unbound public function that
> >   sets skip_xsm to true
> > 
> > This way:
> > - we have a single implementation in _evtchn_alloc_unbound (no code
> >   duplication)
> > - the only function exposed that skips the XSM check is __init
> > - evtchn_alloc_unbound continue to have the XSM check same as today
> > 
> > 
> > E.g.:
> > static int _evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc, bool skip_xsm)
> > {
> >     ...
> > }
> > 
> > static int evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > {
> >     return _evtchn_alloc_unbound(alloc, false);    
> > }
> > 
> > int __init early_evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc)
> > {
> >     return _evtchn_alloc_unbound(alloc, true);
> > }
> > 
> > 
> > Would this be good enough for now?
> 
> Please see the RFC patch I just posted[1], IMHO I think this is a safer
> approach for this specific instance.
> 
> [1]
> https://lore.kernel.org/xen-devel/20220328203622.30961-1-dpsmith@apertussolutions.com/T/#t

I read it, the patch looks fine. I also tested it together with my
series and it solves the problem. With [1], it is just a matter of
making evtchn_alloc_unbound as is non-static.

If the other maintainers also agree with [1], then I'll just rebase on
it and limit my changes to exporting evtchn_alloc_unbound.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property
  2022-03-25 17:58       ` Julien Grall
@ 2022-04-01  0:33         ` Stefano Stabellini
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-04-01  0:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Stefano Stabellini

On Fri, 25 Mar 2022, Julien Grall wrote:
> On 23/03/2022 00:08, Stefano Stabellini wrote:
> > On Sat, 29 Jan 2022, Julien Grall wrote:
> > > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > > > index 6931c022a2..9144d6c0b6 100644
> > > > --- a/xen/arch/arm/domain_build.c
> > > > +++ b/xen/arch/arm/domain_build.c
> > > > @@ -2963,6 +2963,7 @@ static int __init construct_domU(struct domain *d,
> > > >                                     const struct dt_device_node *node)
> > > >    {
> > > >        struct kernel_info kinfo = {};
> > > > +    const char *dom0less_enhanced;
> > > >        int rc;
> > > >        u64 mem;
> > > >    @@ -2978,6 +2979,12 @@ static int __init construct_domU(struct domain
> > > > *d,
> > > >          kinfo.vpl011 = dt_property_read_bool(node, "vpl011");
> > > >    +    rc = dt_property_read_string(node, "xen,enhanced",
> > > > &dom0less_enhanced);
> > > > +    if ( rc == -EILSEQ ||
> > > 
> > > I think the use an -EILSEQ wants an explanation. In a previous version,
> > > you
> > > wrote that the value would be returned when:
> > > 
> > > fdt set /chosen/domU0 xen,enhanced
> > > 
> > > But it is not clear why. Can you print pp->value, pp->length, strnlen(..)
> > > when
> > > this happens?
> > 
> > I added in dt_property_read_string:
> > 
> > printk("DEBUG %s %d value=%s value[0]=%d length=%u
> > len=%lu\n",__func__,__LINE__,(char*)pp->value,
> > *((char*)pp->value),pp->length, strlen(pp->value));
> > 
> > This is the output:
> > (XEN) DEBUG dt_property_read_string 205 value= value[0]=0 length=0 len=0
> 
> Thanks posting the log!
> 
> For convenience, I am copying the comment on top of dt_property_read_string()
> prototype:
> 
>  * Search for a property in a device tree node and retrieve a null
>  * terminated string value (pointer to data, not a copy). Returns 0 on
>  * success, -EINVAL if the property does not exist, -ENODATA if property
>  * doest not have value, and -EILSEQ if the string is not
>  * null-terminated with the length of the property data.
> 
> Per your log, the length is NULL so I would have assumed -ENODATA would be
> returned. Looking at the implementation:
> 
>     const struct dt_property *pp = dt_find_property(np, propname, NULL);
> 
>     if ( !pp )
>         return -EINVAL;
>     if ( !pp->value )
>         return -ENODATA;
>     if ( strnlen(pp->value, pp->length) >= pp->length )
>         return -EILSEQ;
> 
> We consider that the property when pp->value is NULL. However, AFAICT, we
> never set pp->value to NULL (see unflatten_dt_node()).
> 
> So I think there is a bug in the implementation. I would keep the check
> !pp->value (for hardening purpose) and also return -ENODATA when !pp->length.
> 
> Most of our device-tree code is from Linux. Looking at v5.17, the bug seems to
> be present there too. This would want to be fixed there too.

I have added a patch to fix dt_property_read_string. I am about to send
it out as patch of v4 of the series. I'll also follow-up on Linux.

I am thinking of keeping the -EILSEQ in domain_build.c for hardening
purposes.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot
  2022-03-25 18:46       ` Julien Grall
@ 2022-04-01  0:34         ` Stefano Stabellini
  2022-04-01  8:36           ` Julien Grall
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-04-01  0:34 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini

On Fri, 25 Mar 2022, Julien Grall wrote:
> On 23/03/2022 01:18, Stefano Stabellini wrote:
> > On Sat, 29 Jan 2022, Julien Grall wrote:
> > > On 28/01/2022 21:33, Stefano Stabellini wrote:
> > > > +    rc = evtchn_alloc_unbound(&alloc, true);
> > > > +    if ( rc )
> > > > +    {
> > > > +        printk("Failed allocating event channel for domain\n");
> > > > +        return rc;
> > > > +    }
> > > > +
> > > > +    d->arch.hvm.params[HVM_PARAM_STORE_EVTCHN] = alloc.port;
> > > > +
> > > > +    return 0;
> > > > +}
> > > > +
> > > >    static int __init construct_domU(struct domain *d,
> > > >                                     const struct dt_device_node *node)
> > > >    {
> > > > @@ -3014,7 +3043,19 @@ static int __init construct_domU(struct domain
> > > > *d,
> > > >            return rc;
> > > >          if ( kinfo.vpl011 )
> > > > +    {
> > > >            rc = domain_vpl011_init(d, NULL);
> > > > +        if ( rc < 0 )
> > > > +            return rc;
> > > > +    }
> > > > +
> > > > +    if ( kinfo.dom0less_enhanced )
> > > > +    {
> > > > +        rc = alloc_xenstore_evtchn(d);
> > > > +        if ( rc < 0 )
> > > > +            return rc;
> > > > +        d->arch.hvm.params[HVM_PARAM_STORE_PFN] = ~0ULL;
> > > 
> > > I think it would be easy to allocate the page right now. So what prevent
> > > us to
> > > do it right now?
> > 
> > Because (as you noted as a comment to the following patch) as soon as
> > d->arch.hvm.params[HVM_PARAM_STORE_PFN] is set the guest can continue
> > with the initialization and will expect the right data to be set on the
> > page.
> 
> I think you misunderstood my question. From my understanding, at the moment,
> Linux would break with your proposal. So you need to modify the kernel in
> order to support what you are doing.

Linux does not break with this proposal. I wrote a longer explanation
[1] some time ago.

In short: the master branch and any supported versions of Linux boots
fine with this proposal without changes, however it wouldn't be able to
use PV drivers when started as dom0less kernel.

To be able to use the new feature, this patch is required [2].

Old unsupported and not updated Linux is the only one to break. You gave
an excellent suggestion on thread [1], which resulted in me writing
patch #1 "xen: introduce xen,enhanced dom0less property" to retain
compatibility with older unpatched and unsupported kernels.

[1] https://marc.info/?l=xen-devel&m=164142956915469
[2] https://marc.info/?l=xen-devel&m=164203595315414


> IOW, we have room to decide the approach here.
> 
> Xenstore protocol has a way to allow (re)connection (see
> docs/mics/xenstore-ring.txt). This feature looks quite suited to what you are
> trying to do here (we want to delay the connection).
> 
> The main advantage with this approach is the resources allocation for Xenstore
> will be done in the place and the work in Linux could be re-used for
> non-dom0less domain.
> 
> Have you explored it?

Luca (CCed) is the original author. I don't know if he explored that
approach. I have not looked into it in any details but I think there
might be challenges: in this case there is nothing on the shared page.
There are no feature bits as it has not been initialized (xenstored is
the one initializating it).

Keep in mind that Luca and I have done many tests on this approach, both
the Xen side, the Linux side (very many different kernel versions) and
complex configurations (both network and block PV drivers, DMA mastering
devices, etc.) If we changed approach now we would lose some of the
value of the past efforts. But if required, I'll try to schedule time to
do a proper research of your suggestion.


> > In other words: it is not enough to have the pfn allocated, we
> > also need xenstore to initialize it. At that point, it is better to do
> > both later from init-dom0less.c.
> See above. My main concern with your proposal is the allocation is split and
> this making more difficult to understand the initialization. Could you write
> some documentation how everything is meant to work?

I can document it a lot better for sure. I'll do that.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-03-28 16:25       ` Julien Grall
@ 2022-04-01  0:35         ` Stefano Stabellini
  2022-04-01 10:02           ` Julien Grall
  0 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-04-01  0:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini, Wei Liu,
	Anthony PERARD

On Mon, 28 Mar 2022, Julien Grall wrote:
> On 23/03/2022 02:50, Stefano Stabellini wrote:
> > On Sat, 29 Jan 2022, Julien Grall wrote:
> > > On 28/01/2022 21:33, Stefano Stabellini wrote:
> > > > +    libxl_uuid uuid;
> > > > +    uint64_t v;
> > > > +    int rc;
> > > > +
> > > > +    printf("Init dom0less domain: %d\n", info->domid);
> > > > +    dom.guest_domid = info->domid;
> > > > +    dom.xenstore_domid = 0;
> > > > +    dom.xch = xc_interface_open(0, 0, 0);
> > > > +
> > > > +    rc = xc_hvm_param_get(dom.xch, info->domid, HVM_PARAM_STORE_EVTCHN,
> > > > &v);
> > > > +    if (rc != 0) {
> > > > +        printf("Failed to get HVM_PARAM_STORE_EVTCHN\n");
> > > > +        return 1;
> > > > +    }
> > > > +    dom.xenstore_evtchn = v;
> > > > +
> > > > +    /* Console won't be initialized but set its data for completeness
> > > > */
> > > > +    dom.console_domid = 0;
> > > 
> > > I find a bit odd you set the domid but not the event channel, page. Can
> > > you
> > > explain?
> > > 
> > > Actually, can you explain why only half of the structure is initialized?
> >   We are only using the struct xc_dom_image parameter for
> > xc_dom_gnttab_init, and nothing else. We only need very few fields in
> > it. Basically we could call xc_dom_gnttab_seed directly and then we
> > wouldn't need struct xc_dom_image at all. Only the needed fields are
> > currently populated.
> 
> I would prefer if we don't use xc_dom_image and call xc_dom_gnttab_seed().
> This would:
>   1) reduce the risk that one of the unitialized field is will be mistakenly
> use
>   2) extra documentation (currently missing) to explain which fields is used.

I have done that, it is in v4


> > > > +
> > > > +    /* Alloc magic pages */
> > > > +    if (alloc_magic_pages(info, &dom) != 0) {
> > > > +        printf("Error on alloc magic pages\n");
> > > > +        return 1;
> > > > +    }
> > > > +
> > > > +    xc_dom_gnttab_init(&dom);
> > > 
> > > This call as the risk to break the guest if the dom0 Linux doesn't support
> > > the
> > > acquire interface. This is because it will punch a hole in the domain
> > > memory
> > > where the grant-table may have already been mapped.
> > > 
> > > Also, this function could fails.
> > 
> > I'll check for return errors. Dom0less is for fully static
> > configurations so I think it is OK to return error and abort if
> > something unexpected happens: dom0less' main reason for being is that
> > there is nothing unexpected :-)
> Does this mean the caller will have to reboot the system if there is an error?
> IOW, we don't expect them to call ./init-dom0less twice.

Yes, exactly. I think init-dom0less could even panic. My mental model is
that this is an "extension" of construct_domU. Over there we just panic
if something is wrong and here it would be similar. The user provided a
wrong config and should fix it.


> > > > +
> > > > +    libxl_uuid_generate(&uuid);
> > > > +    xc_domain_sethandle(dom.xch, info->domid,
> > > > libxl_uuid_bytearray(&uuid));
> > > > +
> > > > +    rc = gen_stub_json_config(info->domid, &uuid);
> > > > +    if (rc)
> > > > +        err(1, "gen_stub_json_config");
> > > > +
> > > > +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
> > > > +    if (rc)
> > > > +        err(1, "writing to xenstore");
> > > > +
> > > > +    xs_introduce_domain(xsh, info->domid,
> > > > +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
> > > > +            dom.xenstore_evtchn);
> > > 
> > > xs_introduce_domain() can technically fails.
> > 
> > OK
> > 
> > 
> > > > +    return 0;
> > > > +}
> > > > +
> > > > +/* Check if domain has been configured in XS */
> > > > +static bool domain_exists(struct xs_handle *xsh, int domid)
> > > > +{
> > > > +    return xs_is_domain_introduced(xsh, domid);
> > > > +}
> > > 
> > > Would not this lead to initialize a domain with PV driver disabled?
> > 
> > I am not sure I understood your question, but I'll try to answer anyway.
> > This check is purely to distinguish dom0less guests, which needs further
> > initializations, from regular guests (e.g. xl guests) that don't need
> > any actions taken here.
> 
> Dom0less domUs can be divided in two categories based on whether they are xen
> aware (e.g. xen,enhanced is set).
> 
> Looking at this script, it seems to assume that all dom0less domUs are Xen
> aware. So it will end up to allocate Xenstore ring and call
> xs_introduce_domain(). I suspect the call will end up to fail because the
> event channel would be 0.
> 
> So did you try to use this script on a platform where there only xen aware
> domU and/or a mix?

Good idea of asking for this test. I thought I already ran that test,
but I did it again to be sure. Everything works OK (although the
xenstore page allocation is unneeded). xs_introduce_domain does not
fail: I think that's because it is usually called on all domains by the
toolstack, even the ones without xenstore support in the kernel.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot
  2022-04-01  0:34         ` Stefano Stabellini
@ 2022-04-01  8:36           ` Julien Grall
  0 siblings, 0 replies; 42+ messages in thread
From: Julien Grall @ 2022-04-01  8:36 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini

Hi Stefano,

On 01/04/2022 01:34, Stefano Stabellini wrote:
>>> Because (as you noted as a comment to the following patch) as soon as
>>> d->arch.hvm.params[HVM_PARAM_STORE_PFN] is set the guest can continue
>>> with the initialization and will expect the right data to be set on the
>>> page.
>>
>> I think you misunderstood my question. From my understanding, at the moment,
>> Linux would break with your proposal. So you need to modify the kernel in
>> order to support what you are doing.
> 
> Linux does not break with this proposal. I wrote a longer explanation
> [1] some time ago.

I guess I should not have written "broken" here. But instead point out 
that...

> 
> In short: the master branch and any supported versions of Linux boots
> fine with this proposal without changes, however it wouldn't be able to
> use PV drivers when started as dom0less kernel.
> 
> To be able to use the new feature, this patch is required [2].

... without the extra patch is Linux, you would not be able to take 
advantage of this feature.

>> IOW, we have room to decide the approach here.
>>
>> Xenstore protocol has a way to allow (re)connection (see
>> docs/mics/xenstore-ring.txt). This feature looks quite suited to what you are
>> trying to do here (we want to delay the connection).
>>
>> The main advantage with this approach is the resources allocation for Xenstore
>> will be done in the place and the work in Linux could be re-used for
>> non-dom0less domain.
>>
>> Have you explored it?
> 
> Luca (CCed) is the original author. I don't know if he explored that
> approach. I have not looked into it in any details but I think there
> might be challenges: in this case there is nothing on the shared page.
> There are no feature bits as it has not been initialized (xenstored is
> the one initializating it).
I agree there is nothing today. But here, we have the liberty to 
initialize the feature bits in Xen.

> 
> Keep in mind that Luca and I have done many tests on this approach, both
> the Xen side, the Linux side (very many different kernel versions) and
> complex configurations (both network and block PV drivers, DMA mastering
> devices, etc.) If we changed approach now we would lose some of the
> value of the past efforts.

I appreciate the effort you put into testing this approach. However, 
this is an external interface that we will consider stable as soon as 
the two sides (Xen and Linux) are committed. So I want to make sure we 
are not putting ourself in a corner.

One major issue I can forsee with your approach is the xenstore 
initialization seem to only works for HVM. In theory, we may have the 
same need for PV (e.g. in the case of Hyperlaunch).

How would your proposal work for PV guest?

Note that I now that PV guest may not work without Xenstore. But I don't 
see any reason why we could not start them before Xenstored.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-04-01  0:35         ` Stefano Stabellini
@ 2022-04-01 10:02           ` Julien Grall
  2022-04-01 10:27             ` Julien Grall
                               ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Julien Grall @ 2022-04-01 10:02 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Wei Liu, Anthony PERARD

Hi Stefano,

On 01/04/2022 01:35, Stefano Stabellini wrote:
>>>>> +
>>>>> +    /* Alloc magic pages */
>>>>> +    if (alloc_magic_pages(info, &dom) != 0) {
>>>>> +        printf("Error on alloc magic pages\n");
>>>>> +        return 1;
>>>>> +    }
>>>>> +
>>>>> +    xc_dom_gnttab_init(&dom);
>>>>
>>>> This call as the risk to break the guest if the dom0 Linux doesn't support
>>>> the
>>>> acquire interface. This is because it will punch a hole in the domain
>>>> memory
>>>> where the grant-table may have already been mapped.
>>>>
>>>> Also, this function could fails.
>>>
>>> I'll check for return errors. Dom0less is for fully static
>>> configurations so I think it is OK to return error and abort if
>>> something unexpected happens: dom0less' main reason for being is that
>>> there is nothing unexpected :-)
>> Does this mean the caller will have to reboot the system if there is an error?
>> IOW, we don't expect them to call ./init-dom0less twice.
> 
> Yes, exactly. I think init-dom0less could even panic. My mental model is
> that this is an "extension" of construct_domU. Over there we just panic
> if something is wrong and here it would be similar. The user provided a
> wrong config and should fix it.

Ok. I think we should make explicit how it can be used.

>>>>> +
>>>>> +    libxl_uuid_generate(&uuid);
>>>>> +    xc_domain_sethandle(dom.xch, info->domid,
>>>>> libxl_uuid_bytearray(&uuid));
>>>>> +
>>>>> +    rc = gen_stub_json_config(info->domid, &uuid);
>>>>> +    if (rc)
>>>>> +        err(1, "gen_stub_json_config");
>>>>> +
>>>>> +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
>>>>> +    if (rc)
>>>>> +        err(1, "writing to xenstore");
>>>>> +
>>>>> +    xs_introduce_domain(xsh, info->domid,
>>>>> +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
>>>>> +            dom.xenstore_evtchn);
>>>>
>>>> xs_introduce_domain() can technically fails.
>>>
>>> OK
>>>
>>>
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +/* Check if domain has been configured in XS */
>>>>> +static bool domain_exists(struct xs_handle *xsh, int domid)
>>>>> +{
>>>>> +    return xs_is_domain_introduced(xsh, domid);
>>>>> +}
>>>>
>>>> Would not this lead to initialize a domain with PV driver disabled?
>>>
>>> I am not sure I understood your question, but I'll try to answer anyway.
>>> This check is purely to distinguish dom0less guests, which needs further
>>> initializations, from regular guests (e.g. xl guests) that don't need
>>> any actions taken here.
>>
>> Dom0less domUs can be divided in two categories based on whether they are xen
>> aware (e.g. xen,enhanced is set).
>>
>> Looking at this script, it seems to assume that all dom0less domUs are Xen
>> aware. So it will end up to allocate Xenstore ring and call
>> xs_introduce_domain(). I suspect the call will end up to fail because the
>> event channel would be 0.
>>
>> So did you try to use this script on a platform where there only xen aware
>> domU and/or a mix?
> 
> Good idea of asking for this test. I thought I already ran that test,
> but I did it again to be sure. Everything works OK (although the
> xenstore page allocation is unneeded). xs_introduce_domain does not
 > fail:

Are you sure? If I pass 0 as the 4th argument (event channel), the 
command will return EINVAL. However, looking at the code you are not 
checking the return for the call. So you will continue as if it were 
successful.

So you will end up to write nodes for a domain Xenstored is not aware 
and also set HVM_PARAM_STORE_PFN which may further confuse the guest as 
it may try to initialize Xenstored it discovers the page.

> I think that's because it is usually called on all domains by the
> toolstack, even the ones without xenstore support in the kernel.

The toolstack will always allocate the event channel irrespective to 
whether the guest will use Xenstore. So both the shared page and the 
event channel are always valid today.

With your series, this will change as the event channel will not be 
allocated when "xen,enhanced" is not set.

In your case, I think we may want to register the domain to xenstore but 
say there are no connection available for the domain. Juergen, what do 
you think?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-04-01 10:02           ` Julien Grall
@ 2022-04-01 10:27             ` Julien Grall
  2022-04-01 10:34             ` Juergen Gross
  2022-04-06  0:25             ` Stefano Stabellini
  2 siblings, 0 replies; 42+ messages in thread
From: Julien Grall @ 2022-04-01 10:27 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, jgross, Bertrand.Marquis, Volodymyr_Babchuk,
	Luca Miccio, Stefano Stabellini, Wei Liu, Anthony PERARD



On 01/04/2022 11:02, Julien Grall wrote:
> Hi Stefano,
> 
> On 01/04/2022 01:35, Stefano Stabellini wrote:
>>>>>> +
>>>>>> +    /* Alloc magic pages */
>>>>>> +    if (alloc_magic_pages(info, &dom) != 0) {
>>>>>> +        printf("Error on alloc magic pages\n");
>>>>>> +        return 1;
>>>>>> +    }
>>>>>> +
>>>>>> +    xc_dom_gnttab_init(&dom);
>>>>>
>>>>> This call as the risk to break the guest if the dom0 Linux doesn't 
>>>>> support
>>>>> the
>>>>> acquire interface. This is because it will punch a hole in the domain
>>>>> memory
>>>>> where the grant-table may have already been mapped.
>>>>>
>>>>> Also, this function could fails.
>>>>
>>>> I'll check for return errors. Dom0less is for fully static
>>>> configurations so I think it is OK to return error and abort if
>>>> something unexpected happens: dom0less' main reason for being is that
>>>> there is nothing unexpected :-)
>>> Does this mean the caller will have to reboot the system if there is 
>>> an error?
>>> IOW, we don't expect them to call ./init-dom0less twice.
>>
>> Yes, exactly. I think init-dom0less could even panic. My mental model is
>> that this is an "extension" of construct_domU. Over there we just panic
>> if something is wrong and here it would be similar. The user provided a
>> wrong config and should fix it.
> 
> Ok. I think we should make explicit how it can be used.
> 
>>>>>> +
>>>>>> +    libxl_uuid_generate(&uuid);
>>>>>> +    xc_domain_sethandle(dom.xch, info->domid,
>>>>>> libxl_uuid_bytearray(&uuid));
>>>>>> +
>>>>>> +    rc = gen_stub_json_config(info->domid, &uuid);
>>>>>> +    if (rc)
>>>>>> +        err(1, "gen_stub_json_config");
>>>>>> +
>>>>>> +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
>>>>>> +    if (rc)
>>>>>> +        err(1, "writing to xenstore");
>>>>>> +
>>>>>> +    xs_introduce_domain(xsh, info->domid,
>>>>>> +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + 
>>>>>> XENSTORE_PFN_OFFSET,
>>>>>> +            dom.xenstore_evtchn);
>>>>>
>>>>> xs_introduce_domain() can technically fails.
>>>>
>>>> OK
>>>>
>>>>
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/* Check if domain has been configured in XS */
>>>>>> +static bool domain_exists(struct xs_handle *xsh, int domid)
>>>>>> +{
>>>>>> +    return xs_is_domain_introduced(xsh, domid);
>>>>>> +}
>>>>>
>>>>> Would not this lead to initialize a domain with PV driver disabled?
>>>>
>>>> I am not sure I understood your question, but I'll try to answer 
>>>> anyway.
>>>> This check is purely to distinguish dom0less guests, which needs 
>>>> further
>>>> initializations, from regular guests (e.g. xl guests) that don't need
>>>> any actions taken here.
>>>
>>> Dom0less domUs can be divided in two categories based on whether they 
>>> are xen
>>> aware (e.g. xen,enhanced is set).
>>>
>>> Looking at this script, it seems to assume that all dom0less domUs 
>>> are Xen
>>> aware. So it will end up to allocate Xenstore ring and call
>>> xs_introduce_domain(). I suspect the call will end up to fail because 
>>> the
>>> event channel would be 0.
>>>
>>> So did you try to use this script on a platform where there only xen 
>>> aware
>>> domU and/or a mix?
>>
>> Good idea of asking for this test. I thought I already ran that test,
>> but I did it again to be sure. Everything works OK (although the
>> xenstore page allocation is unneeded). xs_introduce_domain does not
>  > fail:
> 
> Are you sure? If I pass 0 as the 4th argument (event channel), the 
> command will return EINVAL. However, looking at the code you are not 
> checking the return for the call. So you will continue as if it were 
> successful.
> 
> So you will end up to write nodes for a domain Xenstored is not aware 
> and also set HVM_PARAM_STORE_PFN which may further confuse the guest as 
> it may try to initialize Xenstored it discovers the page.

^ if it discovers

> 
>> I think that's because it is usually called on all domains by the
>> toolstack, even the ones without xenstore support in the kernel.
> 
> The toolstack will always allocate the event channel irrespective to 
> whether the guest will use Xenstore. So both the shared page and the 
> event channel are always valid today.
> 
> With your series, this will change as the event channel will not be 
> allocated when "xen,enhanced" is not set.
> 
> In your case, I think we may want to register the domain to xenstore but 
> say there are no connection available for the domain. Juergen, what do 
> you think?
> 
> Cheers,
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-04-01 10:02           ` Julien Grall
  2022-04-01 10:27             ` Julien Grall
@ 2022-04-01 10:34             ` Juergen Gross
  2022-04-06  0:28               ` Stefano Stabellini
  2022-04-06  0:25             ` Stefano Stabellini
  2 siblings, 1 reply; 42+ messages in thread
From: Juergen Gross @ 2022-04-01 10:34 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Bertrand.Marquis, Volodymyr_Babchuk, Luca Miccio,
	Stefano Stabellini, Wei Liu, Anthony PERARD


[-- Attachment #1.1.1: Type: text/plain, Size: 5056 bytes --]

On 01.04.22 12:02, Julien Grall wrote:
> Hi Stefano,
> 
> On 01/04/2022 01:35, Stefano Stabellini wrote:
>>>>>> +
>>>>>> +    /* Alloc magic pages */
>>>>>> +    if (alloc_magic_pages(info, &dom) != 0) {
>>>>>> +        printf("Error on alloc magic pages\n");
>>>>>> +        return 1;
>>>>>> +    }
>>>>>> +
>>>>>> +    xc_dom_gnttab_init(&dom);
>>>>>
>>>>> This call as the risk to break the guest if the dom0 Linux doesn't support
>>>>> the
>>>>> acquire interface. This is because it will punch a hole in the domain
>>>>> memory
>>>>> where the grant-table may have already been mapped.
>>>>>
>>>>> Also, this function could fails.
>>>>
>>>> I'll check for return errors. Dom0less is for fully static
>>>> configurations so I think it is OK to return error and abort if
>>>> something unexpected happens: dom0less' main reason for being is that
>>>> there is nothing unexpected :-)
>>> Does this mean the caller will have to reboot the system if there is an error?
>>> IOW, we don't expect them to call ./init-dom0less twice.
>>
>> Yes, exactly. I think init-dom0less could even panic. My mental model is
>> that this is an "extension" of construct_domU. Over there we just panic
>> if something is wrong and here it would be similar. The user provided a
>> wrong config and should fix it.
> 
> Ok. I think we should make explicit how it can be used.
> 
>>>>>> +
>>>>>> +    libxl_uuid_generate(&uuid);
>>>>>> +    xc_domain_sethandle(dom.xch, info->domid,
>>>>>> libxl_uuid_bytearray(&uuid));
>>>>>> +
>>>>>> +    rc = gen_stub_json_config(info->domid, &uuid);
>>>>>> +    if (rc)
>>>>>> +        err(1, "gen_stub_json_config");
>>>>>> +
>>>>>> +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
>>>>>> +    if (rc)
>>>>>> +        err(1, "writing to xenstore");
>>>>>> +
>>>>>> +    xs_introduce_domain(xsh, info->domid,
>>>>>> +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) + XENSTORE_PFN_OFFSET,
>>>>>> +            dom.xenstore_evtchn);
>>>>>
>>>>> xs_introduce_domain() can technically fails.
>>>>
>>>> OK
>>>>
>>>>
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/* Check if domain has been configured in XS */
>>>>>> +static bool domain_exists(struct xs_handle *xsh, int domid)
>>>>>> +{
>>>>>> +    return xs_is_domain_introduced(xsh, domid);
>>>>>> +}
>>>>>
>>>>> Would not this lead to initialize a domain with PV driver disabled?
>>>>
>>>> I am not sure I understood your question, but I'll try to answer anyway.
>>>> This check is purely to distinguish dom0less guests, which needs further
>>>> initializations, from regular guests (e.g. xl guests) that don't need
>>>> any actions taken here.
>>>
>>> Dom0less domUs can be divided in two categories based on whether they are xen
>>> aware (e.g. xen,enhanced is set).
>>>
>>> Looking at this script, it seems to assume that all dom0less domUs are Xen
>>> aware. So it will end up to allocate Xenstore ring and call
>>> xs_introduce_domain(). I suspect the call will end up to fail because the
>>> event channel would be 0.
>>>
>>> So did you try to use this script on a platform where there only xen aware
>>> domU and/or a mix?
>>
>> Good idea of asking for this test. I thought I already ran that test,
>> but I did it again to be sure. Everything works OK (although the
>> xenstore page allocation is unneeded). xs_introduce_domain does not
>  > fail:
> 
> Are you sure? If I pass 0 as the 4th argument (event channel), the command will 
> return EINVAL. However, looking at the code you are not checking the return for 
> the call. So you will continue as if it were successful.
> 
> So you will end up to write nodes for a domain Xenstored is not aware and also 
> set HVM_PARAM_STORE_PFN which may further confuse the guest as it may try to 
> initialize Xenstored it discovers the page.
> 
>> I think that's because it is usually called on all domains by the
>> toolstack, even the ones without xenstore support in the kernel.
> 
> The toolstack will always allocate the event channel irrespective to whether the 
> guest will use Xenstore. So both the shared page and the event channel are 
> always valid today.
> 
> With your series, this will change as the event channel will not be allocated 
> when "xen,enhanced" is not set.
> 
> In your case, I think we may want to register the domain to xenstore but say 
> there are no connection available for the domain. Juergen, what do you think?

In my opinion such a domain should not be registered with Xenstore.

At least C-xenstored should work mostly correctly. I think the
"introduced" status is tested everywhere it should be tested.

Basically this is similar to today's status of xenstore-stubdom: it
is never introduced, but Xenstore itself is happy with it existing. :-)

And even today the first nodes for a new domain are being created
before the domain is officially introduced.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-04-01 10:02           ` Julien Grall
  2022-04-01 10:27             ` Julien Grall
  2022-04-01 10:34             ` Juergen Gross
@ 2022-04-06  0:25             ` Stefano Stabellini
  2022-04-06  0:53               ` Stefano Stabellini
  2 siblings, 1 reply; 42+ messages in thread
From: Stefano Stabellini @ 2022-04-06  0:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini, Wei Liu,
	Anthony PERARD

On Fri, 1 Apr 2022, Julien Grall wrote:
> On 01/04/2022 01:35, Stefano Stabellini wrote:
> > > > > > +
> > > > > > +    /* Alloc magic pages */
> > > > > > +    if (alloc_magic_pages(info, &dom) != 0) {
> > > > > > +        printf("Error on alloc magic pages\n");
> > > > > > +        return 1;
> > > > > > +    }
> > > > > > +
> > > > > > +    xc_dom_gnttab_init(&dom);
> > > > > 
> > > > > This call as the risk to break the guest if the dom0 Linux doesn't
> > > > > support
> > > > > the
> > > > > acquire interface. This is because it will punch a hole in the domain
> > > > > memory
> > > > > where the grant-table may have already been mapped.
> > > > > 
> > > > > Also, this function could fails.
> > > > 
> > > > I'll check for return errors. Dom0less is for fully static
> > > > configurations so I think it is OK to return error and abort if
> > > > something unexpected happens: dom0less' main reason for being is that
> > > > there is nothing unexpected :-)
> > > Does this mean the caller will have to reboot the system if there is an
> > > error?
> > > IOW, we don't expect them to call ./init-dom0less twice.
> > 
> > Yes, exactly. I think init-dom0less could even panic. My mental model is
> > that this is an "extension" of construct_domU. Over there we just panic
> > if something is wrong and here it would be similar. The user provided a
> > wrong config and should fix it.
> 
> Ok. I think we should make explicit how it can be used.
> 
> > > > > > +
> > > > > > +    libxl_uuid_generate(&uuid);
> > > > > > +    xc_domain_sethandle(dom.xch, info->domid,
> > > > > > libxl_uuid_bytearray(&uuid));
> > > > > > +
> > > > > > +    rc = gen_stub_json_config(info->domid, &uuid);
> > > > > > +    if (rc)
> > > > > > +        err(1, "gen_stub_json_config");
> > > > > > +
> > > > > > +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
> > > > > > +    if (rc)
> > > > > > +        err(1, "writing to xenstore");
> > > > > > +
> > > > > > +    xs_introduce_domain(xsh, info->domid,
> > > > > > +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) +
> > > > > > XENSTORE_PFN_OFFSET,
> > > > > > +            dom.xenstore_evtchn);
> > > > > 
> > > > > xs_introduce_domain() can technically fails.
> > > > 
> > > > OK
> > > > 
> > > > 
> > > > > > +    return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/* Check if domain has been configured in XS */
> > > > > > +static bool domain_exists(struct xs_handle *xsh, int domid)
> > > > > > +{
> > > > > > +    return xs_is_domain_introduced(xsh, domid);
> > > > > > +}
> > > > > 
> > > > > Would not this lead to initialize a domain with PV driver disabled?
> > > > 
> > > > I am not sure I understood your question, but I'll try to answer anyway.
> > > > This check is purely to distinguish dom0less guests, which needs further
> > > > initializations, from regular guests (e.g. xl guests) that don't need
> > > > any actions taken here.
> > > 
> > > Dom0less domUs can be divided in two categories based on whether they are
> > > xen
> > > aware (e.g. xen,enhanced is set).
> > > 
> > > Looking at this script, it seems to assume that all dom0less domUs are Xen
> > > aware. So it will end up to allocate Xenstore ring and call
> > > xs_introduce_domain(). I suspect the call will end up to fail because the
> > > event channel would be 0.
> > > 
> > > So did you try to use this script on a platform where there only xen aware
> > > domU and/or a mix?
> > 
> > Good idea of asking for this test. I thought I already ran that test,
> > but I did it again to be sure. Everything works OK (although the
> > xenstore page allocation is unneeded). xs_introduce_domain does not
> > fail:
> 
> Are you sure? If I pass 0 as the 4th argument (event channel), the command
> will return EINVAL. However, looking at the code you are not checking the
> return for the call. So you will continue as if it were successful.

We are not passing 0 as the 4th argument, we are passing the event
channel previously set as HVM_PARAM_STORE_EVTCHN by Xen:

    rc = xc_hvm_param_get(xch, info->domid, HVM_PARAM_STORE_EVTCHN,
                          &xenstore_evtchn);

Also in my working version of the series I have a check for the return
value of xs_introduce_domain (as you requested in one of your previous
reviews). So xs_introduce_domain is actually working correctly and
returning success.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-04-01 10:34             ` Juergen Gross
@ 2022-04-06  0:28               ` Stefano Stabellini
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-04-06  0:28 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Julien Grall, Stefano Stabellini, xen-devel, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini, Wei Liu,
	Anthony PERARD

[-- Attachment #1: Type: text/plain, Size: 6034 bytes --]

On Fri, 1 Apr 2022, Juergen Gross wrote:
> On 01.04.22 12:02, Julien Grall wrote:
> > Hi Stefano,
> > 
> > On 01/04/2022 01:35, Stefano Stabellini wrote:
> > > > > > > +
> > > > > > > +    /* Alloc magic pages */
> > > > > > > +    if (alloc_magic_pages(info, &dom) != 0) {
> > > > > > > +        printf("Error on alloc magic pages\n");
> > > > > > > +        return 1;
> > > > > > > +    }
> > > > > > > +
> > > > > > > +    xc_dom_gnttab_init(&dom);
> > > > > > 
> > > > > > This call as the risk to break the guest if the dom0 Linux doesn't
> > > > > > support
> > > > > > the
> > > > > > acquire interface. This is because it will punch a hole in the
> > > > > > domain
> > > > > > memory
> > > > > > where the grant-table may have already been mapped.
> > > > > > 
> > > > > > Also, this function could fails.
> > > > > 
> > > > > I'll check for return errors. Dom0less is for fully static
> > > > > configurations so I think it is OK to return error and abort if
> > > > > something unexpected happens: dom0less' main reason for being is that
> > > > > there is nothing unexpected :-)
> > > > Does this mean the caller will have to reboot the system if there is an
> > > > error?
> > > > IOW, we don't expect them to call ./init-dom0less twice.
> > > 
> > > Yes, exactly. I think init-dom0less could even panic. My mental model is
> > > that this is an "extension" of construct_domU. Over there we just panic
> > > if something is wrong and here it would be similar. The user provided a
> > > wrong config and should fix it.
> > 
> > Ok. I think we should make explicit how it can be used.
> > 
> > > > > > > +
> > > > > > > +    libxl_uuid_generate(&uuid);
> > > > > > > +    xc_domain_sethandle(dom.xch, info->domid,
> > > > > > > libxl_uuid_bytearray(&uuid));
> > > > > > > +
> > > > > > > +    rc = gen_stub_json_config(info->domid, &uuid);
> > > > > > > +    if (rc)
> > > > > > > +        err(1, "gen_stub_json_config");
> > > > > > > +
> > > > > > > +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
> > > > > > > +    if (rc)
> > > > > > > +        err(1, "writing to xenstore");
> > > > > > > +
> > > > > > > +    xs_introduce_domain(xsh, info->domid,
> > > > > > > +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) +
> > > > > > > XENSTORE_PFN_OFFSET,
> > > > > > > +            dom.xenstore_evtchn);
> > > > > > 
> > > > > > xs_introduce_domain() can technically fails.
> > > > > 
> > > > > OK
> > > > > 
> > > > > 
> > > > > > > +    return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* Check if domain has been configured in XS */
> > > > > > > +static bool domain_exists(struct xs_handle *xsh, int domid)
> > > > > > > +{
> > > > > > > +    return xs_is_domain_introduced(xsh, domid);
> > > > > > > +}
> > > > > > 
> > > > > > Would not this lead to initialize a domain with PV driver disabled?
> > > > > 
> > > > > I am not sure I understood your question, but I'll try to answer
> > > > > anyway.
> > > > > This check is purely to distinguish dom0less guests, which needs
> > > > > further
> > > > > initializations, from regular guests (e.g. xl guests) that don't need
> > > > > any actions taken here.
> > > > 
> > > > Dom0less domUs can be divided in two categories based on whether they
> > > > are xen
> > > > aware (e.g. xen,enhanced is set).
> > > > 
> > > > Looking at this script, it seems to assume that all dom0less domUs are
> > > > Xen
> > > > aware. So it will end up to allocate Xenstore ring and call
> > > > xs_introduce_domain(). I suspect the call will end up to fail because
> > > > the
> > > > event channel would be 0.
> > > > 
> > > > So did you try to use this script on a platform where there only xen
> > > > aware
> > > > domU and/or a mix?
> > > 
> > > Good idea of asking for this test. I thought I already ran that test,
> > > but I did it again to be sure. Everything works OK (although the
> > > xenstore page allocation is unneeded). xs_introduce_domain does not
> >  > fail:
> > 
> > Are you sure? If I pass 0 as the 4th argument (event channel), the command
> > will return EINVAL. However, looking at the code you are not checking the
> > return for the call. So you will continue as if it were successful.
> > 
> > So you will end up to write nodes for a domain Xenstored is not aware and
> > also set HVM_PARAM_STORE_PFN which may further confuse the guest as it may
> > try to initialize Xenstored it discovers the page.
> > 
> > > I think that's because it is usually called on all domains by the
> > > toolstack, even the ones without xenstore support in the kernel.
> > 
> > The toolstack will always allocate the event channel irrespective to whether
> > the guest will use Xenstore. So both the shared page and the event channel
> > are always valid today.
> > 
> > With your series, this will change as the event channel will not be
> > allocated when "xen,enhanced" is not set.
> > 
> > In your case, I think we may want to register the domain to xenstore but say
> > there are no connection available for the domain. Juergen, what do you
> > think?
> 
> In my opinion such a domain should not be registered with Xenstore.
> 
> At least C-xenstored should work mostly correctly. I think the
> "introduced" status is tested everywhere it should be tested.
> 
> Basically this is similar to today's status of xenstore-stubdom: it
> is never introduced, but Xenstore itself is happy with it existing. :-)
> 
> And even today the first nodes for a new domain are being created
> before the domain is officially introduced.

We could skip introducing dom0less domains that don't have
"xen,enhanced" without issues. It is easy to distinguish dom0less
domains that have "xen,enhanced" from the ones that don't have it
because only the one with "xen,enhanced" have a valid event channel set
as HVM_PARAM_STORE_EVTCHN.

So if the event channel is not valid, we can simply skip the
xs_introduce_domain call (which would probably fail anyway as Julien
pointed out).

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers
  2022-04-06  0:25             ` Stefano Stabellini
@ 2022-04-06  0:53               ` Stefano Stabellini
  0 siblings, 0 replies; 42+ messages in thread
From: Stefano Stabellini @ 2022-04-06  0:53 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Julien Grall, xen-devel, jgross, Bertrand.Marquis,
	Volodymyr_Babchuk, Luca Miccio, Stefano Stabellini, Wei Liu,
	Anthony PERARD

On Tue, 5 Apr 2022, Stefano Stabellini wrote:
> On Fri, 1 Apr 2022, Julien Grall wrote:
> > On 01/04/2022 01:35, Stefano Stabellini wrote:
> > > > > > > +
> > > > > > > +    /* Alloc magic pages */
> > > > > > > +    if (alloc_magic_pages(info, &dom) != 0) {
> > > > > > > +        printf("Error on alloc magic pages\n");
> > > > > > > +        return 1;
> > > > > > > +    }
> > > > > > > +
> > > > > > > +    xc_dom_gnttab_init(&dom);
> > > > > > 
> > > > > > This call as the risk to break the guest if the dom0 Linux doesn't
> > > > > > support
> > > > > > the
> > > > > > acquire interface. This is because it will punch a hole in the domain
> > > > > > memory
> > > > > > where the grant-table may have already been mapped.
> > > > > > 
> > > > > > Also, this function could fails.
> > > > > 
> > > > > I'll check for return errors. Dom0less is for fully static
> > > > > configurations so I think it is OK to return error and abort if
> > > > > something unexpected happens: dom0less' main reason for being is that
> > > > > there is nothing unexpected :-)
> > > > Does this mean the caller will have to reboot the system if there is an
> > > > error?
> > > > IOW, we don't expect them to call ./init-dom0less twice.
> > > 
> > > Yes, exactly. I think init-dom0less could even panic. My mental model is
> > > that this is an "extension" of construct_domU. Over there we just panic
> > > if something is wrong and here it would be similar. The user provided a
> > > wrong config and should fix it.
> > 
> > Ok. I think we should make explicit how it can be used.
> > 
> > > > > > > +
> > > > > > > +    libxl_uuid_generate(&uuid);
> > > > > > > +    xc_domain_sethandle(dom.xch, info->domid,
> > > > > > > libxl_uuid_bytearray(&uuid));
> > > > > > > +
> > > > > > > +    rc = gen_stub_json_config(info->domid, &uuid);
> > > > > > > +    if (rc)
> > > > > > > +        err(1, "gen_stub_json_config");
> > > > > > > +
> > > > > > > +    rc = restore_xenstore(xsh, info, uuid, dom.xenstore_evtchn);
> > > > > > > +    if (rc)
> > > > > > > +        err(1, "writing to xenstore");
> > > > > > > +
> > > > > > > +    xs_introduce_domain(xsh, info->domid,
> > > > > > > +            (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT) +
> > > > > > > XENSTORE_PFN_OFFSET,
> > > > > > > +            dom.xenstore_evtchn);
> > > > > > 
> > > > > > xs_introduce_domain() can technically fails.
> > > > > 
> > > > > OK
> > > > > 
> > > > > 
> > > > > > > +    return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* Check if domain has been configured in XS */
> > > > > > > +static bool domain_exists(struct xs_handle *xsh, int domid)
> > > > > > > +{
> > > > > > > +    return xs_is_domain_introduced(xsh, domid);
> > > > > > > +}
> > > > > > 
> > > > > > Would not this lead to initialize a domain with PV driver disabled?
> > > > > 
> > > > > I am not sure I understood your question, but I'll try to answer anyway.
> > > > > This check is purely to distinguish dom0less guests, which needs further
> > > > > initializations, from regular guests (e.g. xl guests) that don't need
> > > > > any actions taken here.
> > > > 
> > > > Dom0less domUs can be divided in two categories based on whether they are
> > > > xen
> > > > aware (e.g. xen,enhanced is set).
> > > > 
> > > > Looking at this script, it seems to assume that all dom0less domUs are Xen
> > > > aware. So it will end up to allocate Xenstore ring and call
> > > > xs_introduce_domain(). I suspect the call will end up to fail because the
> > > > event channel would be 0.
> > > > 
> > > > So did you try to use this script on a platform where there only xen aware
> > > > domU and/or a mix?
> > > 
> > > Good idea of asking for this test. I thought I already ran that test,
> > > but I did it again to be sure. Everything works OK (although the
> > > xenstore page allocation is unneeded). xs_introduce_domain does not
> > > fail:
> > 
> > Are you sure? If I pass 0 as the 4th argument (event channel), the command
> > will return EINVAL. However, looking at the code you are not checking the
> > return for the call. So you will continue as if it were successful.
> 
> We are not passing 0 as the 4th argument, we are passing the event
> channel previously set as HVM_PARAM_STORE_EVTCHN by Xen:
> 
>     rc = xc_hvm_param_get(xch, info->domid, HVM_PARAM_STORE_EVTCHN,
>                           &xenstore_evtchn);
> 
> Also in my working version of the series I have a check for the return
> value of xs_introduce_domain (as you requested in one of your previous
> reviews). So xs_introduce_domain is actually working correctly and
> returning success.

Sorry I didn't read carefully enough the older messages. I re-run the
tests again and I can see the issue you were describing (I am puzzled on
why I didn't see it before as I did have a check on the return value as
I wrote -- probably a mistake in my setup.)

The problem goes away if we only call xs_introduce_domain for
xen,enhanced domains (when xenstore_evtchn is valid.)


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2022-04-06  0:53 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28 21:32 [PATCH v3 0/5] dom0less PV drivers Stefano Stabellini
2022-01-28 21:33 ` [PATCH v3 1/5] xen: introduce xen,enhanced dom0less property Stefano Stabellini
2022-01-29 17:58   ` Julien Grall
2022-03-23  0:08     ` Stefano Stabellini
2022-03-25 17:58       ` Julien Grall
2022-04-01  0:33         ` Stefano Stabellini
2022-01-28 21:33 ` [PATCH v3 2/5] xen: make evtchn_alloc_unbound public Stefano Stabellini
2022-01-31  9:32   ` Jan Beulich
2022-03-15 14:10   ` Daniel P. Smith
2022-03-23  0:22     ` Stefano Stabellini
2022-03-23  6:57       ` Jan Beulich
2022-03-25  0:30         ` Stefano Stabellini
2022-03-25  8:17           ` Jan Beulich
2022-03-25 15:45           ` Daniel P. Smith
2022-03-25 16:52             ` Jason Andryuk
2022-03-28 14:54               ` Daniel P. Smith
2022-03-25 16:55             ` Julien Grall
2022-03-25 17:19           ` Julien Grall
2022-03-25 21:05             ` Stefano Stabellini
2022-03-28 14:56               ` Julien Grall
2022-03-28 15:11               ` Daniel P. Smith
2022-03-28 16:38               ` Daniel P. Smith
2022-03-28 23:23                 ` Stefano Stabellini
2022-03-24 15:36       ` Daniel P. Smith
2022-01-28 21:33 ` [PATCH v3 3/5] xen/arm: configure dom0less domain for enabling xenstore after boot Stefano Stabellini
2022-01-29 18:13   ` Julien Grall
2022-03-23  1:18     ` Stefano Stabellini
2022-03-25 18:46       ` Julien Grall
2022-04-01  0:34         ` Stefano Stabellini
2022-04-01  8:36           ` Julien Grall
2022-01-28 21:33 ` [PATCH v3 4/5] xenstored: send an evtchn notification on introduce_domain Stefano Stabellini
2022-01-28 21:33 ` [PATCH v3 5/5] tools: add example application to initialize dom0less PV drivers Stefano Stabellini
2022-01-29 19:35   ` Julien Grall
2022-03-23  2:50     ` Stefano Stabellini
2022-03-28 16:25       ` Julien Grall
2022-04-01  0:35         ` Stefano Stabellini
2022-04-01 10:02           ` Julien Grall
2022-04-01 10:27             ` Julien Grall
2022-04-01 10:34             ` Juergen Gross
2022-04-06  0:28               ` Stefano Stabellini
2022-04-06  0:25             ` Stefano Stabellini
2022-04-06  0:53               ` Stefano Stabellini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.