* [PATCH v5 0/4] Xen ABI feature control
@ 2020-12-03 12:41 Paul Durrant
  2020-12-03 12:41 ` [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, Paul Durrant
                   ` (4 more replies)
  0 siblings, 5 replies; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 12:41 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Andrew Cooper, Anthony PERARD, Christian Lindig,
	David Scott, George Dunlap, Ian Jackson, Jan Beulich,
	Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Volodymyr Babchuk, Wei Liu

From: Paul Durrant <pdurrant@amazon.com>

This series was previously called "evtchn: Introduce a per-guest knob to
control FIFO ABI". It has been extensively re-worked and extended to cover
another ABI feature.

Paul Durrant (4):
  domctl: introduce a new domain create flag,
    XEN_DOMCTL_CDF_evtchn_fifo, ...
  domctl: introduce a new domain create flag,
    XEN_DOMCTL_CDF_evtchn_upcall, ...
  libxl: introduce a 'libxl_xen_abi_features' enumeration...
  xl: introduce a 'xen-abi-features' option...

 docs/man/xl.cfg.5.pod.in         | 50 ++++++++++++++++++++++++++++++++
 tools/include/libxl.h            | 10 +++++++
 tools/libs/light/libxl_arm.c     | 22 +++++++++-----
 tools/libs/light/libxl_create.c  | 31 ++++++++++++++++++++
 tools/libs/light/libxl_types.idl |  7 +++++
 tools/libs/light/libxl_x86.c     | 17 ++++++++++-
 tools/ocaml/libs/xc/xenctrl.ml   |  2 ++
 tools/ocaml/libs/xc/xenctrl.mli  |  2 ++
 tools/xl/xl_parse.c              | 50 ++++++++++++++++++++++++++++++--
 xen/arch/arm/domain.c            |  3 +-
 xen/arch/arm/domain_build.c      |  3 +-
 xen/arch/arm/setup.c             |  3 +-
 xen/arch/x86/domain.c            |  8 +++++
 xen/arch/x86/hvm/hvm.c           |  3 ++
 xen/arch/x86/setup.c             |  4 ++-
 xen/common/domain.c              |  3 +-
 xen/common/event_channel.c       | 24 +++++++++++++--
 xen/include/public/domctl.h      |  6 +++-
 18 files changed, 229 insertions(+), 19 deletions(-)
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Cc: Christian Lindig <christian.lindig@citrix.com>
Cc: David Scott <dave@recoil.org>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: Wei Liu <wl@xen.org>
-- 
2.20.1



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 12:41 [PATCH v5 0/4] Xen ABI feature control Paul Durrant
@ 2020-12-03 12:41 ` Paul Durrant
  2020-12-03 15:22   ` Jan Beulich
  2020-12-03 12:41 ` [PATCH v5 2/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_upcall, Paul Durrant
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 12:41 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Eslam Elnikety, Ian Jackson, Wei Liu,
	Anthony PERARD, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Christian Lindig, David Scott,
	Volodymyr Babchuk, Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

...to control the visibility of the FIFO event channel operations
(EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
the guest.

These operations were added to the public header in commit d2d50c2f308f
("evtchn: add FIFO-based event channel ABI") and the first implementation
appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
that, a guest issuing those operations would receive a return value of
-ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
running on an older (pre-4.4) Xen would fall back to using the 2-level event
channel interface upon seeing this return value.

Unfortunately, the uncontrollable appearance of these new operations in Xen 4.4
onwards has implications for hibernation of some Linux guests. During resume
from hibernation, there are two kernels involved: the "boot" kernel and the
"resume" kernel. The guest boot kernel may default to using FIFO operations and
instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
other hand, the resume kernel keeps assuming 2-level, because it was hibernated
on a version of Xen that did not support the FIFO operations.

To maintain compatibility it is necessary to make Xen behave as it did
before the new operations were added and hence the code in this patch ensures
that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
operations will again result in -ENOSYS being returned to the guest.

This patch also adds an extra log line into the 'e' key handler output to
call out which event channel ABI is in use by a domain.

NOTE: To maintain current behavior, until a tool-stack option is added to
      control the flag, it is unconditionally set for all domains. A
      subsequent patch will introduce such tool-stack control.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: Eslam Elnikety <elnikety@amazon.com>
---
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Christian Lindig <christian.lindig@citrix.com>
Cc: David Scott <dave@recoil.org>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v5:
 - Flip the sense of the flag from disabling to enabling, as requested by
   Andrew

v4:
 - New in v4
---
 tools/libs/light/libxl_create.c |  1 +
 tools/ocaml/libs/xc/xenctrl.ml  |  1 +
 tools/ocaml/libs/xc/xenctrl.mli |  1 +
 xen/arch/arm/domain.c           |  3 ++-
 xen/arch/arm/domain_build.c     |  3 ++-
 xen/arch/arm/setup.c            |  3 ++-
 xen/arch/x86/setup.c            |  3 ++-
 xen/common/domain.c             |  2 +-
 xen/common/event_channel.c      | 24 +++++++++++++++++++++---
 xen/include/public/domctl.h     |  4 +++-
 10 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 321a13e519b5..3ca9f00d6d83 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -607,6 +607,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
             .max_evtchn_port = b_info->event_channels,
             .max_grant_frames = b_info->max_grant_frames,
             .max_maptrack_frames = b_info->max_maptrack_frames,
+            .flags = XEN_DOMCTL_CDF_evtchn_fifo,
         };
 
         if (info->type != LIBXL_DOMAIN_TYPE_PV) {
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index e878699b0a1a..fa5c7b7eb0a2 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -65,6 +65,7 @@ type domain_create_flag =
 	| CDF_XS_DOMAIN
 	| CDF_IOMMU
 	| CDF_NESTED_VIRT
+	| CDF_EVTCHN_FIFO
 
 type domain_create_iommu_opts =
 	| IOMMU_NO_SHAREPT
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index e64907df8e7e..a872002d90cc 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -58,6 +58,7 @@ type domain_create_flag =
   | CDF_XS_DOMAIN
   | CDF_IOMMU
   | CDF_NESTED_VIRT
+  | CDF_EVTCHN_FIFO
 
 type domain_create_iommu_opts =
   | IOMMU_NO_SHAREPT
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 18cafcdda7b1..59f947370053 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     unsigned int max_vcpus;
 
     /* HVM and HAP must be set. IOMMU may or may not be */
-    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
+    if ( (config->flags &
+          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
          (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
     {
         dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index e824ba34b012..13d1e79f1463 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2478,7 +2478,8 @@ void __init create_domUs(void)
         struct domain *d;
         struct xen_domctl_createdomain d_cfg = {
             .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
-            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
+            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
+                     XEN_DOMCTL_CDF_evtchn_fifo,
             .max_evtchn_port = -1,
             .max_grant_frames = 64,
             .max_maptrack_frames = 1024,
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 7fcff9af2a7e..0267acfca16e 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
     struct bootmodule *xen_bootmodule;
     struct domain *dom0;
     struct xen_domctl_createdomain dom0_cfg = {
-        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
+        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
+                 XEN_DOMCTL_CDF_evtchn_fifo,
         .max_evtchn_port = -1,
         .max_grant_frames = gnttab_dom0_frames(),
         .max_maptrack_frames = -1,
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 30d6f375a3af..e558241c73da 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
                                          const char *loader)
 {
     struct xen_domctl_createdomain dom0_cfg = {
-        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
+        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
+                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
         .max_evtchn_port = -1,
         .max_grant_frames = -1,
         .max_maptrack_frames = -1,
diff --git a/xen/common/domain.c b/xen/common/domain.c
index f748806a450b..28592c7c8486 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
          ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
            XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
            XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
-           XEN_DOMCTL_CDF_nested_virt) )
+           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
     {
         dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
         return -EINVAL;
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index dbfba62a4934..91133bf3c263 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1188,10 +1188,27 @@ static long evtchn_set_priority(const struct evtchn_set_priority *set_priority)
     return ret;
 }
 
+static bool is_fifo_op(int cmd)
+{
+    switch ( cmd )
+    {
+    case EVTCHNOP_init_control:
+    case EVTCHNOP_expand_array:
+    case EVTCHNOP_set_priority:
+        return true;
+    default:
+        return false;
+    }
+}
+
 long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     long rc;
 
+    if ( !(current->domain->options & XEN_DOMCTL_CDF_evtchn_fifo) &&
+         is_fifo_op(cmd) )
+        return -ENOSYS;
+
     switch ( cmd )
     {
     case EVTCHNOP_alloc_unbound: {
@@ -1568,9 +1585,10 @@ static void domain_dump_evtchn_info(struct domain *d)
     unsigned int port;
     int irq;
 
-    printk("Event channel information for domain %d:\n"
-           "Polling vCPUs: {%*pbl}\n"
-           "    port [p/m/s]\n", d->domain_id, d->max_vcpus, d->poll_mask);
+    printk("Event channel information for %pd:\n", d);
+    printk("ABI: %s\n", d->evtchn_fifo ? "FIFO" : "2-level");
+    printk("Polling vCPUs: {%*pbl}\n"
+           "    port [p/m/s]\n", d->max_vcpus, d->poll_mask);
 
     spin_lock(&d->event_lock);
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 666aeb71bf1b..f7149c81a7c2 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -70,9 +70,11 @@ struct xen_domctl_createdomain {
 #define XEN_DOMCTL_CDF_iommu          (1U<<_XEN_DOMCTL_CDF_iommu)
 #define _XEN_DOMCTL_CDF_nested_virt   6
 #define XEN_DOMCTL_CDF_nested_virt    (1U << _XEN_DOMCTL_CDF_nested_virt)
+#define _XEN_DOMCTL_CDF_evtchn_fifo   7
+#define XEN_DOMCTL_CDF_evtchn_fifo    (1U << _XEN_DOMCTL_CDF_evtchn_fifo)
 
 /* Max XEN_DOMCTL_CDF_* constant.  Used for ABI checking. */
-#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_nested_virt
+#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_evtchn_fifo
 
     uint32_t flags;
 
-- 
2.20.1




* [PATCH v5 2/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_upcall, ...
  2020-12-03 12:41 [PATCH v5 0/4] Xen ABI feature control Paul Durrant
  2020-12-03 12:41 ` [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, Paul Durrant
@ 2020-12-03 12:41 ` Paul Durrant
  2020-12-03 12:41 ` [PATCH v5 3/4] libxl: introduce a 'libxl_xen_abi_features' enumeration Paul Durrant
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 12:41 UTC (permalink / raw)
  To: xen-devel
  Cc: Paul Durrant, Ian Jackson, Wei Liu, Anthony PERARD,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Christian Lindig, David Scott,
	Roger Pau Monné

From: Paul Durrant <pdurrant@amazon.com>

...to control the visibility of the per-vCPU upcall feature for HVM guests.

Commit 04447f4453c0 ("x86/hvm: add per-vcpu evtchn upcalls") added a mechanism
by which x86 HVM guests can register a vector for each vCPU which will be used
by Xen to signal event channels on that vCPU.

This facility (an HVMOP hypercall) appeared in an uncontrolled fashion, which
has implications for the behaviour of an OS when moving from an older Xen to a
newer Xen. For instance, the OS may be aware of the per-vCPU upcall feature
but its support for it may be buggy. In this case the OS will function
perfectly well on the older Xen, but fail (in a potentially non-obvious way)
on the newer Xen.

To maintain compatibility it is necessary to make Xen behave as it did
before the new hypercall was added and hence the code in this patch ensures
that, if XEN_DOMCTL_CDF_evtchn_upcall is not set, the hypercall will again
result in -ENOSYS being returned to the guest.

NOTE: To maintain current behavior, until a tool-stack option is added to
      control the flag, it is unconditionally set for x86 HVM domains. A
      subsequent patch will introduce such tool-stack control.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Christian Lindig <christian.lindig@citrix.com>
Cc: David Scott <dave@recoil.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>

v5:
 - New in v5
---
 tools/libs/light/libxl_x86.c    | 7 ++++++-
 tools/ocaml/libs/xc/xenctrl.ml  | 1 +
 tools/ocaml/libs/xc/xenctrl.mli | 1 +
 xen/arch/x86/domain.c           | 8 ++++++++
 xen/arch/x86/hvm/hvm.c          | 3 +++
 xen/arch/x86/setup.c            | 1 +
 xen/common/domain.c             | 3 ++-
 xen/include/public/domctl.h     | 4 +++-
 8 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 86d272999d67..f7217b422404 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -5,7 +5,12 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
                                       struct xen_domctl_createdomain *config)
 {
-    switch(d_config->c_info.type) {
+    libxl_domain_create_info *info = &d_config->c_info;
+
+    if (info->type == LIBXL_DOMAIN_TYPE_HVM)
+        config->flags |= XEN_DOMCTL_CDF_evtchn_upcall;
+
+    switch(info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
         config->arch.emulation_flags = (XEN_X86_EMU_ALL & ~XEN_X86_EMU_VPCI);
         break;
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index fa5c7b7eb0a2..04284c364108 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -66,6 +66,7 @@ type domain_create_flag =
 	| CDF_IOMMU
 	| CDF_NESTED_VIRT
 	| CDF_EVTCHN_FIFO
+	| CDF_EVTCHN_UPCALL
 
 type domain_create_iommu_opts =
 	| IOMMU_NO_SHAREPT
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index a872002d90cc..e40759464ae5 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -59,6 +59,7 @@ type domain_create_flag =
   | CDF_IOMMU
   | CDF_NESTED_VIRT
   | CDF_EVTCHN_FIFO
+  | CDF_EVTCHN_UPCALL
 
 type domain_create_iommu_opts =
   | IOMMU_NO_SHAREPT
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1b894d0124d7..e7f83880219d 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -662,11 +662,19 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
     }
 
     if ( !hvm )
+    {
+        if ( config->flags & XEN_DOMCTL_CDF_evtchn_upcall )
+        {
+            dprintk(XENLOG_INFO, "Per-vCPU event channel vector only supported for HVM guests\n");
+            return -EINVAL;
+        }
+
         /*
          * It is only meaningful for XEN_DOMCTL_CDF_oos_off to be clear
          * for HVM guests.
          */
         config->flags |= XEN_DOMCTL_CDF_oos_off;
+    }
 
     if ( nested_virt && !hap )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 54e32e4fe85c..7ffc42a7282e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4037,6 +4037,9 @@ static int hvmop_set_evtchn_upcall_vector(
     struct domain *d = current->domain;
     struct vcpu *v;
 
+    if ( !(d->options & XEN_DOMCTL_CDF_evtchn_upcall) )
+        return -ENOSYS;
+
     if ( !is_hvm_domain(d) )
         return -EINVAL;
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index e558241c73da..3ff9a25eede6 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -751,6 +751,7 @@ static struct domain *__init create_dom0(const module_t *image,
     if ( opt_dom0_pvh )
     {
         dom0_cfg.flags |= (XEN_DOMCTL_CDF_hvm |
+                           XEN_DOMCTL_CDF_evtchn_upcall |
                            ((hvm_hap_supported() && !opt_dom0_shadow) ?
                             XEN_DOMCTL_CDF_hap : 0));
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 28592c7c8486..1ff2603425a3 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -307,7 +307,8 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
          ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
            XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
            XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
-           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
+           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo |
+           XEN_DOMCTL_CDF_evtchn_upcall ) )
     {
         dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
         return -EINVAL;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index f7149c81a7c2..f5fe43a55662 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -72,9 +72,11 @@ struct xen_domctl_createdomain {
 #define XEN_DOMCTL_CDF_nested_virt    (1U << _XEN_DOMCTL_CDF_nested_virt)
 #define _XEN_DOMCTL_CDF_evtchn_fifo   7
 #define XEN_DOMCTL_CDF_evtchn_fifo    (1U << _XEN_DOMCTL_CDF_evtchn_fifo)
+#define _XEN_DOMCTL_CDF_evtchn_upcall 8
+#define XEN_DOMCTL_CDF_evtchn_upcall  (1U << _XEN_DOMCTL_CDF_evtchn_upcall)
 
 /* Max XEN_DOMCTL_CDF_* constant.  Used for ABI checking. */
-#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_evtchn_fifo
+#define XEN_DOMCTL_CDF_MAX XEN_DOMCTL_CDF_evtchn_upcall
 
     uint32_t flags;
 
-- 
2.20.1




* [PATCH v5 3/4] libxl: introduce a 'libxl_xen_abi_features' enumeration...
  2020-12-03 12:41 [PATCH v5 0/4] Xen ABI feature control Paul Durrant
  2020-12-03 12:41 ` [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, Paul Durrant
  2020-12-03 12:41 ` [PATCH v5 2/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_upcall, Paul Durrant
@ 2020-12-03 12:41 ` Paul Durrant
  2020-12-03 12:41 ` [PATCH v5 4/4] xl: introduce a 'xen-abi-features' option Paul Durrant
  2020-12-03 13:15 ` [PATCH v5 0/4] Xen ABI feature control Jürgen Groß
  4 siblings, 0 replies; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 12:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Ian Jackson, Wei Liu, Anthony PERARD

From: Paul Durrant <pdurrant@amazon.com>

... and bitmaps to enable or disable features.

This patch adds a new 'libxl_xen_abi_features' enumeration into the IDL which
specifies features of the Xen ABI that may be optionally enabled or disabled
via new 'feature_enable' and 'feature_disable' bitmaps added into
'libxl_domain_build_info'.

The initially defined features are enabled by default (for relevant
architectures) and so the corresponding flags in
'struct xen_domctl_createdomain' are set if the features are missing from
'feature_disable' rather than if they are present in 'feature_enable'.
Checks are, however, added to make sure that features are not explicitly
enabled in cases where they are not supported.

NOTE: A subsequent patch will add an option into xl.cfg(5) to control whether
      Xen ABI features are enabled or disabled.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>

v5:
 - New in v5
---
 tools/include/libxl.h            | 10 ++++++++++
 tools/libs/light/libxl_arm.c     | 22 +++++++++++++++-------
 tools/libs/light/libxl_create.c  | 32 +++++++++++++++++++++++++++++++-
 tools/libs/light/libxl_types.idl |  7 +++++++
 tools/libs/light/libxl_x86.c     | 14 ++++++++++++--
 5 files changed, 75 insertions(+), 10 deletions(-)

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index eaffccb30f37..b328a5621e6f 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -451,6 +451,16 @@
  */
 #define LIBXL_HAVE_VIRIDIAN_EX_PROCESSOR_MASKS 1
 
+/*
+ * LIBXL_HAVE_BUILDINFO_XEN_ABI_FEATURE indicates that the
+ * libxl_xen_abi_feature enumeration is defined and that
+ * libxl_domain_build_info has feature_enable and _disable bitmaps
+ * of the specified width. These bitmaps are used to enable or disable
+ * features of the Xen ABI (enumerated by the new type) for a domain.
+ */
+#define LIBXL_HAVE_BUILDINFO_XEN_ABI_FEATURE 1
+#define LIBXL_BUILDINFO_FEATURE_ENABLE_DISABLE_WIDTH 64
+
 /*
  * libxl ABI compatibility
  *
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index 66e8a065fe67..69676340a661 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -28,19 +28,27 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
     unsigned int i;
     uint32_t vuart_irq;
     bool vuart_enabled = false;
+    libxl_domain_build_info *b_info = &d_config->b_info;
+    libxl_xen_abi_feature f = LIBXL_XEN_ABI_FEATURE_EVTCHN_UPCALL;
+
+    if (libxl_bitmap_test(&b_info->feature_enable, f)) {
+        LOG(ERROR, "unsupported Xen ABI feature '%s'",
+            libxl_xen_abi_feature_to_string(f));
+        return ERROR_FAIL;
+    }
 
     /*
      * If pl011 vuart is enabled then increment the nr_spis to allow allocation
      * of SPI VIRQ for pl011.
      */
-    if (d_config->b_info.arch_arm.vuart == LIBXL_VUART_TYPE_SBSA_UART) {
+    if (b_info->arch_arm.vuart == LIBXL_VUART_TYPE_SBSA_UART) {
         nr_spis += (GUEST_VPL011_SPI - 32) + 1;
         vuart_irq = GUEST_VPL011_SPI;
         vuart_enabled = true;
     }
 
-    for (i = 0; i < d_config->b_info.num_irqs; i++) {
-        uint32_t irq = d_config->b_info.irqs[i];
+    for (i = 0; i < b_info->num_irqs; i++) {
+        uint32_t irq = b_info->irqs[i];
         uint32_t spi;
 
         /*
@@ -72,7 +80,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
     config->arch.nr_spis = nr_spis;
     LOG(DEBUG, " - Allocate %u SPIs", nr_spis);
 
-    switch (d_config->b_info.arch_arm.gic_version) {
+    switch (b_info->arch_arm.gic_version) {
     case LIBXL_GIC_VERSION_DEFAULT:
         config->arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE;
         break;
@@ -84,11 +92,11 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         break;
     default:
         LOG(ERROR, "Unknown GIC version %d",
-            d_config->b_info.arch_arm.gic_version);
+            b_info->arch_arm.gic_version);
         return ERROR_FAIL;
     }
 
-    switch (d_config->b_info.tee) {
+    switch (b_info->tee) {
     case LIBXL_TEE_TYPE_NONE:
         config->arch.tee_type = XEN_DOMCTL_CONFIG_TEE_NONE;
         break;
@@ -97,7 +105,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         break;
     default:
         LOG(ERROR, "Unknown TEE type %d",
-            d_config->b_info.tee);
+            b_info->tee);
         return ERROR_FAIL;
     }
 
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 3ca9f00d6d83..8cf7fd5f6d1b 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -587,6 +587,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
     struct xs_permissions noperm[1];
     xs_transaction_t t = 0;
     libxl_vminfo *vm_list;
+    libxl_xen_abi_feature f;
 
     /* convenience aliases */
     libxl_domain_create_info *info = &d_config->c_info;
@@ -607,9 +608,38 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
             .max_evtchn_port = b_info->event_channels,
             .max_grant_frames = b_info->max_grant_frames,
             .max_maptrack_frames = b_info->max_maptrack_frames,
-            .flags = XEN_DOMCTL_CDF_evtchn_fifo,
         };
 
+        libxl_for_each_set_bit(f, b_info->feature_enable) {
+            if (!libxl_xen_abi_feature_to_string(f)) { /* check validity */
+                LOGED(ERROR, *domid, "unknown Xen ABI feature enabled");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            if (libxl_bitmap_test(&b_info->feature_disable, f)) {
+                LOGED(ERROR, *domid, "Xen ABI feature '%s' both enabled and disabled",
+                      libxl_xen_abi_feature_to_string(f));
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            LOGD(DETAIL, *domid, "enable feature: '%s'",
+                 libxl_xen_abi_feature_to_string(f));
+        }
+
+        libxl_for_each_set_bit(f, b_info->feature_disable) {
+            if (!libxl_xen_abi_feature_to_string(f)) { /* check validity */
+                LOGED(ERROR, *domid, "unknown Xen ABI feature disabled");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            LOGD(DETAIL, *domid, "disable feature: '%s'",
+                 libxl_xen_abi_feature_to_string(f));
+        }
+
+        if (!libxl_bitmap_test(&b_info->feature_disable,
+                               LIBXL_XEN_ABI_FEATURE_EVTCHN_FIFO))
+            create.flags |= XEN_DOMCTL_CDF_evtchn_fifo;
+
         if (info->type != LIBXL_DOMAIN_TYPE_PV) {
             create.flags |= XEN_DOMCTL_CDF_hvm;
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 05324736b744..3c50724b64cd 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -477,6 +477,11 @@ libxl_tee_type = Enumeration("tee_type", [
     (1, "optee")
     ], init_val = "LIBXL_TEE_TYPE_NONE")
 
+libxl_xen_abi_feature = Enumeration("xen_abi_feature", [
+    (0, "evtchn_fifo"),
+    (1, "evtchn_upcall")
+    ])
+
 libxl_rdm_reserve = Struct("rdm_reserve", [
     ("strategy",    libxl_rdm_reserve_strategy),
     ("policy",      libxl_rdm_reserve_policy),
@@ -559,6 +564,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("apic",             libxl_defbool),
     ("dm_restrict",      libxl_defbool),
     ("tee",              libxl_tee_type),
+    ("feature_enable",   libxl_bitmap),
+    ("feature_disable",  libxl_bitmap),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware",         string),
                                        ("bios",             libxl_bios_type),
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index f7217b422404..39a9d3cbf9f8 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -6,9 +6,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       struct xen_domctl_createdomain *config)
 {
     libxl_domain_create_info *info = &d_config->c_info;
+    libxl_domain_build_info *b_info = &d_config->b_info;
+    libxl_xen_abi_feature f = LIBXL_XEN_ABI_FEATURE_EVTCHN_UPCALL;
 
-    if (info->type == LIBXL_DOMAIN_TYPE_HVM)
-        config->flags |= XEN_DOMCTL_CDF_evtchn_upcall;
+    if (info->type != LIBXL_DOMAIN_TYPE_HVM &&
+        libxl_bitmap_test(&b_info->feature_enable, f)) {
+        LOG(ERROR, "unsupported Xen ABI feature '%s'",
+            libxl_xen_abi_feature_to_string(f));
+        return ERROR_FAIL;
+    }
+
+    if (info->type == LIBXL_DOMAIN_TYPE_HVM &&
+        !libxl_bitmap_test(&b_info->feature_disable, f))
+            config->flags |= XEN_DOMCTL_CDF_evtchn_upcall;
 
     switch(info->type) {
     case LIBXL_DOMAIN_TYPE_HVM:
-- 
2.20.1




* [PATCH v5 4/4] xl: introduce a 'xen-abi-features' option...
  2020-12-03 12:41 [PATCH v5 0/4] Xen ABI feature control Paul Durrant
                   ` (2 preceding siblings ...)
  2020-12-03 12:41 ` [PATCH v5 3/4] libxl: introduce a 'libxl_xen_abi_features' enumeration Paul Durrant
@ 2020-12-03 12:41 ` Paul Durrant
  2020-12-03 13:15 ` [PATCH v5 0/4] Xen ABI feature control Jürgen Groß
  4 siblings, 0 replies; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 12:41 UTC (permalink / raw)
  To: xen-devel; +Cc: Paul Durrant, Ian Jackson, Wei Liu, Anthony PERARD

From: Paul Durrant <pdurrant@amazon.com>

... to control which features of the Xen ABI are enabled in
'libxl_domain_build_info', and hence exposed to the guest.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Ian Jackson <iwj@xenproject.org>
Cc: Wei Liu <wl@xen.org>
Cc: Anthony PERARD <anthony.perard@citrix.com>

v5:
 - New in v5
---
 docs/man/xl.cfg.5.pod.in | 50 ++++++++++++++++++++++++++++++++++++++++
 tools/xl/xl_parse.c      | 50 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 3f0f8de1e988..b42ab8ba9f60 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1649,6 +1649,56 @@ This feature is a B<technology preview>.
 
 =back
 
+=item B<xen_abi_features=[ "STRING", "STRING", ...]>
+
+The features of the Xen ABI exposed to the guest. The following features
+may be specified:
+
+=over 4
+
+=item B<evtchn_fifo>
+
+A new event channel ABI was introduced in Xen 4.4. Moving a guest from an
+earlier Xen to Xen 4.4 onwards may expose bugs in the guest support for
+this ABI. Disabling this feature hides the ABI from the guest and hence
+may be used as a workaround for such bugs.
+
+The feature is enabled by default.
+
+=item B<evtchn_upcall>
+
+B<x86 HVM only>. A new hypercall to specify per-VCPU interrupt vectors to use
+for event channel upcalls in HVM guests was added in Xen 4.6. Moving a guest
+from an earlier Xen to Xen 4.6 onwards may expose bugs in the guest support
+for this hypercall. Disabling this feature hides the hypercall from the
+guest and hence may be used as a workaround for such bugs.
+
+The feature is enabled by default for B<x86 HVM> guests. Note that it is
+considered an error to enable this feature for B<Arm> or B<x86 PV> guests.
+
+=item B<all>
+
+This is a special value that enables all available features.
+
+=back
+
+Features can be disabled by prefixing the name with '!'. So, for example,
+to enable all features except B<evtchn_upcall>, specify:
+
+=over 4
+
+B<xen_abi_features=[ "all", "!evtchn_upcall" ]>
+
+=back
+
+Or, to keep the default features but disable B<evtchn_fifo>, specify:
+
+=over 4
+
+B<xen_abi_features=[ "!evtchn_fifo" ]>
+
+=back
+
 =back
 
 =head2 Paravirtualised (PV) Guest Specific Options
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index cae8eb679c5a..566e09f938f4 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1216,8 +1216,9 @@ void parse_config_data(const char *config_source,
     XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
                    *usbctrls, *usbdevs, *p9devs, *vdispls, *pvcallsifs_devs;
     XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs,
-                   *mca_caps;
-    int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps;
+                   *mca_caps, *features;
+    int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps,
+        num_features;
     int pci_power_mgmt = 0;
     int pci_msitranslate = 0;
     int pci_permissive = 0;
@@ -2737,6 +2738,51 @@ skip_usbdev:
     xlu_cfg_get_defbool(config, "xend_suspend_evtchn_compat",
                         &c_info->xend_suspend_evtchn_compat, 0);
 
+    switch (xlu_cfg_get_list(config, "xen_abi_features",
+                             &features, &num_features, 1))
+    {
+    case 0: /* Success */
+        if (num_features) {
+            libxl_bitmap_alloc(ctx, &b_info->feature_enable,
+                               LIBXL_BUILDINFO_FEATURE_ENABLE_DISABLE_WIDTH);
+            libxl_bitmap_alloc(ctx, &b_info->feature_disable,
+                               LIBXL_BUILDINFO_FEATURE_ENABLE_DISABLE_WIDTH);
+        }
+        for (i = 0; i < num_features; i++) {
+            buf = xlu_cfg_get_listitem(features, i);
+
+            if (strcmp(buf, "all") == 0)
+                libxl_bitmap_set_any(&b_info->feature_enable);
+            else {
+                libxl_bitmap *s = &b_info->feature_enable;
+                libxl_bitmap *r = &b_info->feature_disable;
+                libxl_xen_abi_feature f;
+
+                if (*buf == '!') {
+                    s = &b_info->feature_disable;
+                    r = &b_info->feature_enable;
+                    buf++;
+                }
+
+                e = libxl_xen_abi_feature_from_string(buf, &f);
+                if (e) {
+                    fprintf(stderr,
+                            "xl: Unknown Xen ABI feature '%s'\n",
+                            buf);
+                    exit(-ERROR_FAIL);
+                }
+
+                libxl_bitmap_set(s, f);
+                libxl_bitmap_reset(r, f);
+            }
+        }
+        break;
+    case ESRCH: break; /* Option not present */
+    default:
+        fprintf(stderr, "xl: Unable to parse Xen ABI features.\n");
+        exit(-ERROR_FAIL);
+    }
+
     xlu_cfg_destroy(config);
 }
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/4] Xen ABI feature control
  2020-12-03 12:41 [PATCH v5 0/4] Xen ABI feature control Paul Durrant
                   ` (3 preceding siblings ...)
  2020-12-03 12:41 ` [PATCH v5 4/4] xl: introduce a 'xen-abi-features' option Paul Durrant
@ 2020-12-03 13:15 ` Jürgen Groß
  2020-12-03 13:51   ` Paul Durrant
  4 siblings, 1 reply; 28+ messages in thread
From: Jürgen Groß @ 2020-12-03 13:15 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Paul Durrant, Andrew Cooper, Anthony PERARD, Christian Lindig,
	David Scott, George Dunlap, Ian Jackson, Jan Beulich,
	Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Volodymyr Babchuk, Wei Liu


[-- Attachment #1.1.1: Type: text/plain, Size: 2754 bytes --]

On 03.12.20 13:41, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> This series was previously called "evtchn: Introduce a per-guest knob to
control FIFO ABI". It has been extensively re-worked and extended to cover
> another ABI feature.
> 
> Paul Durrant (4):
>    domctl: introduce a new domain create flag,
>      XEN_DOMCTL_CDF_evtchn_fifo, ...
>    domctl: introduce a new domain create flag,
>      XEN_DOMCTL_CDF_evtchn_upcall, ...
>    libxl: introduce a 'libxl_xen_abi_features' enumeration...
>    xl: introduce a 'xen-abi-features' option...
> 
>   docs/man/xl.cfg.5.pod.in         | 50 ++++++++++++++++++++++++++++++++
>   tools/include/libxl.h            | 10 +++++++
>   tools/libs/light/libxl_arm.c     | 22 +++++++++-----
>   tools/libs/light/libxl_create.c  | 31 ++++++++++++++++++++
>   tools/libs/light/libxl_types.idl |  7 +++++
>   tools/libs/light/libxl_x86.c     | 17 ++++++++++-
>   tools/ocaml/libs/xc/xenctrl.ml   |  2 ++
>   tools/ocaml/libs/xc/xenctrl.mli  |  2 ++
>   tools/xl/xl_parse.c              | 50 ++++++++++++++++++++++++++++++--
>   xen/arch/arm/domain.c            |  3 +-
>   xen/arch/arm/domain_build.c      |  3 +-
>   xen/arch/arm/setup.c             |  3 +-
>   xen/arch/x86/domain.c            |  8 +++++
>   xen/arch/x86/hvm/hvm.c           |  3 ++
>   xen/arch/x86/setup.c             |  4 ++-
>   xen/common/domain.c              |  3 +-
>   xen/common/event_channel.c       | 24 +++++++++++++--
>   xen/include/public/domctl.h      |  6 +++-
>   18 files changed, 229 insertions(+), 19 deletions(-)
> ---
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Anthony PERARD <anthony.perard@citrix.com>
> Cc: Christian Lindig <christian.lindig@citrix.com>
> Cc: David Scott <dave@recoil.org>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Ian Jackson <iwj@xenproject.org>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Julien Grall <julien@xen.org>
> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> Cc: Wei Liu <wl@xen.org>
> 

Do we want to add a create flag for each such feature, or would it be
better to set options like those via hypfs?

It would be fairly easy to add dynamic hypfs paths, e.g.:

/domain/<domid>/abi-features/evtchn-fifo
/domain/<domid>/abi-features/evtchn-upcall

which would have boolean type and could be set as long as the domain
hasn't been started.

xl support could even be rather generic, without the need to add coding
to xl for each new feature.

This is no objection to this series, but just an idea how to avoid
extending the use of unstable interfaces.

Thoughts?


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 0/4] Xen ABI feature control
  2020-12-03 13:15 ` [PATCH v5 0/4] Xen ABI feature control Jürgen Groß
@ 2020-12-03 13:51   ` Paul Durrant
  2020-12-03 13:58     ` Jürgen Groß
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 13:51 UTC (permalink / raw)
  To: 'Jürgen Groß', xen-devel
  Cc: 'Paul Durrant', 'Andrew Cooper',
	'Anthony PERARD', 'Christian Lindig',
	'David Scott', 'George Dunlap',
	'Ian Jackson', 'Jan Beulich',
	'Julien Grall', 'Roger Pau Monné',
	'Stefano Stabellini', 'Volodymyr Babchuk',
	'Wei Liu'

> -----Original Message-----
> From: Jürgen Groß <jgross@suse.com>
> Sent: 03 December 2020 13:15
> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org
> Cc: Paul Durrant <pdurrant@amazon.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Anthony PERARD
> <anthony.perard@citrix.com>; Christian Lindig <christian.lindig@citrix.com>; David Scott
> <dave@recoil.org>; George Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Jan
> Beulich <jbeulich@suse.com>; Julien Grall <julien@xen.org>; Roger Pau Monné <roger.pau@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Wei Liu
> <wl@xen.org>
> Subject: Re: [PATCH v5 0/4] Xen ABI feature control
> 
> On 03.12.20 13:41, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > This series was previously called "evtchn: Introduce a per-guest knob to
> control FIFO ABI". It has been extensively re-worked and extended to cover
> > another ABI feature.
> >
> > Paul Durrant (4):
> >    domctl: introduce a new domain create flag,
> >      XEN_DOMCTL_CDF_evtchn_fifo, ...
> >    domctl: introduce a new domain create flag,
> >      XEN_DOMCTL_CDF_evtchn_upcall, ...
> >    libxl: introduce a 'libxl_xen_abi_features' enumeration...
> >    xl: introduce a 'xen-abi-features' option...
> >
> >   docs/man/xl.cfg.5.pod.in         | 50 ++++++++++++++++++++++++++++++++
> >   tools/include/libxl.h            | 10 +++++++
> >   tools/libs/light/libxl_arm.c     | 22 +++++++++-----
> >   tools/libs/light/libxl_create.c  | 31 ++++++++++++++++++++
> >   tools/libs/light/libxl_types.idl |  7 +++++
> >   tools/libs/light/libxl_x86.c     | 17 ++++++++++-
> >   tools/ocaml/libs/xc/xenctrl.ml   |  2 ++
> >   tools/ocaml/libs/xc/xenctrl.mli  |  2 ++
> >   tools/xl/xl_parse.c              | 50 ++++++++++++++++++++++++++++++--
> >   xen/arch/arm/domain.c            |  3 +-
> >   xen/arch/arm/domain_build.c      |  3 +-
> >   xen/arch/arm/setup.c             |  3 +-
> >   xen/arch/x86/domain.c            |  8 +++++
> >   xen/arch/x86/hvm/hvm.c           |  3 ++
> >   xen/arch/x86/setup.c             |  4 ++-
> >   xen/common/domain.c              |  3 +-
> >   xen/common/event_channel.c       | 24 +++++++++++++--
> >   xen/include/public/domctl.h      |  6 +++-
> >   18 files changed, 229 insertions(+), 19 deletions(-)
> > ---
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: Anthony PERARD <anthony.perard@citrix.com>
> > Cc: Christian Lindig <christian.lindig@citrix.com>
> > Cc: David Scott <dave@recoil.org>
> > Cc: George Dunlap <george.dunlap@citrix.com>
> > Cc: Ian Jackson <iwj@xenproject.org>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Julien Grall <julien@xen.org>
> > Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> > Cc: Wei Liu <wl@xen.org>
> >
> 
> Do we want to add a create flag for each such feature, or would it be
> better to set options like those via hypfs?
> 
> It would be fairly easy to add dynamic hypfs paths, e.g.:
> 
> /domain/<domid>/abi-features/evtchn-fifo
> /domain/<domid>/abi-features/evtchn-upcall
> 
> which would have boolean type and could be set as long as the domain
> hasn't been started.
> 
> xl support could even be rather generic, without the need to add coding
> to xl for each new feature.
> 
> This is no objection to this series, but just an idea how to avoid
> extending the use of unstable interfaces.
> 
> Thoughts?
> 

I was not aware we could have something that was dynamic only before a domain is started.

We'd still want libxl to write the features rather than xl doing it directly I think as we still want it to be the owner of the default settings. Personally it still feels like this kind of setting does want to be an explicit part of domain creation, though using hypfs does sound like a neat idea.

  Paul

> 
> Juergen



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 0/4] Xen ABI feature control
  2020-12-03 13:51   ` Paul Durrant
@ 2020-12-03 13:58     ` Jürgen Groß
  0 siblings, 0 replies; 28+ messages in thread
From: Jürgen Groß @ 2020-12-03 13:58 UTC (permalink / raw)
  To: paul, xen-devel
  Cc: 'Paul Durrant', 'Andrew Cooper',
	'Anthony PERARD', 'Christian Lindig',
	'David Scott', 'George Dunlap',
	'Ian Jackson', 'Jan Beulich',
	'Julien Grall', 'Roger Pau Monné',
	'Stefano Stabellini', 'Volodymyr Babchuk',
	'Wei Liu'


[-- Attachment #1.1.1: Type: text/plain, Size: 4345 bytes --]

On 03.12.20 14:51, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jürgen Groß <jgross@suse.com>
>> Sent: 03 December 2020 13:15
>> To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org
>> Cc: Paul Durrant <pdurrant@amazon.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Anthony PERARD
>> <anthony.perard@citrix.com>; Christian Lindig <christian.lindig@citrix.com>; David Scott
>> <dave@recoil.org>; George Dunlap <george.dunlap@citrix.com>; Ian Jackson <iwj@xenproject.org>; Jan
>> Beulich <jbeulich@suse.com>; Julien Grall <julien@xen.org>; Roger Pau Monné <roger.pau@citrix.com>;
>> Stefano Stabellini <sstabellini@kernel.org>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Wei Liu
>> <wl@xen.org>
>> Subject: Re: [PATCH v5 0/4] Xen ABI feature control
>>
>> On 03.12.20 13:41, Paul Durrant wrote:
>>> From: Paul Durrant <pdurrant@amazon.com>
>>>
>>> This series was previously called "evtchn: Introduce a per-guest knob to
>> control FIFO ABI". It has been extensively re-worked and extended to cover
>>> another ABI feature.
>>>
>>> Paul Durrant (4):
>>>     domctl: introduce a new domain create flag,
>>>       XEN_DOMCTL_CDF_evtchn_fifo, ...
>>>     domctl: introduce a new domain create flag,
>>>       XEN_DOMCTL_CDF_evtchn_upcall, ...
>>>     libxl: introduce a 'libxl_xen_abi_features' enumeration...
>>>     xl: introduce a 'xen-abi-features' option...
>>>
>>>    docs/man/xl.cfg.5.pod.in         | 50 ++++++++++++++++++++++++++++++++
>>>    tools/include/libxl.h            | 10 +++++++
>>>    tools/libs/light/libxl_arm.c     | 22 +++++++++-----
>>>    tools/libs/light/libxl_create.c  | 31 ++++++++++++++++++++
>>>    tools/libs/light/libxl_types.idl |  7 +++++
>>>    tools/libs/light/libxl_x86.c     | 17 ++++++++++-
>>>    tools/ocaml/libs/xc/xenctrl.ml   |  2 ++
>>>    tools/ocaml/libs/xc/xenctrl.mli  |  2 ++
>>>    tools/xl/xl_parse.c              | 50 ++++++++++++++++++++++++++++++--
>>>    xen/arch/arm/domain.c            |  3 +-
>>>    xen/arch/arm/domain_build.c      |  3 +-
>>>    xen/arch/arm/setup.c             |  3 +-
>>>    xen/arch/x86/domain.c            |  8 +++++
>>>    xen/arch/x86/hvm/hvm.c           |  3 ++
>>>    xen/arch/x86/setup.c             |  4 ++-
>>>    xen/common/domain.c              |  3 +-
>>>    xen/common/event_channel.c       | 24 +++++++++++++--
>>>    xen/include/public/domctl.h      |  6 +++-
>>>    18 files changed, 229 insertions(+), 19 deletions(-)
>>> ---
>>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Cc: Anthony PERARD <anthony.perard@citrix.com>
>>> Cc: Christian Lindig <christian.lindig@citrix.com>
>>> Cc: David Scott <dave@recoil.org>
>>> Cc: George Dunlap <george.dunlap@citrix.com>
>>> Cc: Ian Jackson <iwj@xenproject.org>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Julien Grall <julien@xen.org>
>>> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
>>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>>> Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
>>> Cc: Wei Liu <wl@xen.org>
>>>
>>
>> Do we want to add a create flag for each such feature, or would it be
>> better to set options like those via hypfs?
>>
>> It would be fairly easy to add dynamic hypfs paths, e.g.:
>>
>> /domain/<domid>/abi-features/evtchn-fifo
>> /domain/<domid>/abi-features/evtchn-upcall
>>
>> which would have boolean type and could be set as long as the domain
>> hasn't been started.
>>
>> xl support could even be rather generic, without the need to add coding
>> to xl for each new feature.
>>
>> This is no objection to this series, but just an idea how to avoid
>> extending the use of unstable interfaces.
>>
>> Thoughts?
>>
> 
> I was not aware we could have something that was dynamic only before a domain is started.

Look at my current cpupool/hypfs series: the per-cpupool scheduling
granularity can be modified only if no cpu is assigned to the cpupool.

> 
> We'd still want libxl to write the features rather than xl doing it directly I think as we still want it to be the owner of the default settings. Personally it still feels like this kind of setting does want to be an explicit part of domain creation, though using hypfs does sound like a neat idea.

No problem of doing it in libxl. libxl is already using libxenhypfs.


Juergen

[-- Attachment #1.1.2: OpenPGP_0xB0DE9DD628BF132F.asc --]
[-- Type: application/pgp-keys, Size: 3135 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 12:41 ` [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, Paul Durrant
@ 2020-12-03 15:22   ` Jan Beulich
  2020-12-03 15:45     ` Paul Durrant
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2020-12-03 15:22 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Paul Durrant, Eslam Elnikety, Ian Jackson, Wei Liu,
	Anthony PERARD, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Christian Lindig, David Scott,
	Volodymyr Babchuk, Roger Pau Monné,
	xen-devel

On 03.12.2020 13:41, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> ...to control the visibility of the FIFO event channel operations
> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
> the guest.
> 
> These operations were added to the public header in commit d2d50c2f308f
> ("evtchn: add FIFO-based event channel ABI") and the first implementation
> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
> that, a guest issuing those operations would receive a return value of
> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
> running on an older (pre-4.4) Xen would fall back to using the 2-level event
> channel interface upon seeing this return value.
> 
> Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
> onwards has implications for hibernation of some Linux guests. During resume
> from hibernation, there are two kernels involved: the "boot" kernel and the
> "resume" kernel. The guest boot kernel may default to use FIFO operations and
> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
> other hand, the resume kernel keeps assuming 2-level, because it was hibernated
> on a version of Xen that did not support the FIFO operations.
> 
> To maintain compatibility it is necessary to make Xen behave as it did
> before the new operations were added and hence the code in this patch ensures
> that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
> operations will again result in -ENOSYS being returned to the guest.

I have to admit I'm now even more concerned of the control for such
going into Xen, the more with the now 2nd use in the subsequent patch.
The implication of this would seem to be that whenever we add new
hypercalls or sub-ops, a domain creation control would also need
adding determining whether that new sub-op is actually okay to use by
a guest. Or else I'd be keen to up front see criteria at least roughly
outlined by which it could be established whether such an override
control is needed.

I'm also not convinced such controls really want to be opt-in rather
than opt-out. While perhaps sensible as long as a feature is
experimental, not exposing stuff by default may mean slower adoption
of new (and hopefully better) functionality. I realize there's still
the option of having the tool stack default to enable, and just the
hypervisor defaulting to disable, but anyway.

> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>      unsigned int max_vcpus;
>  
>      /* HVM and HAP must be set. IOMMU may or may not be */
> -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
> +    if ( (config->flags &
> +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
>           (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
>      {
>          dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
>          struct domain *d;
>          struct xen_domctl_createdomain d_cfg = {
>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> +                     XEN_DOMCTL_CDF_evtchn_fifo,
>              .max_evtchn_port = -1,
>              .max_grant_frames = 64,
>              .max_maptrack_frames = 1024,
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>      struct bootmodule *xen_bootmodule;
>      struct domain *dom0;
>      struct xen_domctl_createdomain dom0_cfg = {
> -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> +                 XEN_DOMCTL_CDF_evtchn_fifo,
>          .max_evtchn_port = -1,
>          .max_grant_frames = gnttab_dom0_frames(),
>          .max_maptrack_frames = -1,
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
>                                           const char *loader)
>  {
>      struct xen_domctl_createdomain dom0_cfg = {
> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
> +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
> +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
>          .max_evtchn_port = -1,
>          .max_grant_frames = -1,
>          .max_maptrack_frames = -1,
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
>           ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>             XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
>             XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
> -           XEN_DOMCTL_CDF_nested_virt) )
> +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
>      {
>          dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
>          return -EINVAL;

All of the hunks above point out a scalability issue if we were to
follow this route for even just a fair part of new sub-ops, and I
suppose you've noticed this with the next patch presumably touching
all the same places again.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 15:22   ` Jan Beulich
@ 2020-12-03 15:45     ` Paul Durrant
  2020-12-03 15:56       ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 15:45 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 03 December 2020 15:23
> To: Paul Durrant <paul@xen.org>
> Cc: Paul Durrant <pdurrant@amazon.com>; Eslam Elnikety <elnikety@amazon.com>; Ian Jackson
> <iwj@xenproject.org>; Wei Liu <wl@xen.org>; Anthony PERARD <anthony.perard@citrix.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>; Julien Grall <julien@xen.org>;
> Stefano Stabellini <sstabellini@kernel.org>; Christian Lindig <christian.lindig@citrix.com>; David
> Scott <dave@recoil.org>; Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>; Roger Pau Monné
> <roger.pau@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo,
> ...
> 
> On 03.12.2020 13:41, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > ...to control the visibility of the FIFO event channel operations
> > (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
> > the guest.
> >
> > These operations were added to the public header in commit d2d50c2f308f
> > ("evtchn: add FIFO-based event channel ABI") and the first implementation
> > appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
> > EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
> > ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
> > that, a guest issuing those operations would receive a return value of
> > -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
> > running on an older (pre-4.4) Xen would fall back to using the 2-level event
> > channel interface upon seeing this return value.
> >
> > Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
> > onwards has implications for hibernation of some Linux guests. During resume
> > from hibernation, there are two kernels involved: the "boot" kernel and the
> > "resume" kernel. The guest boot kernel may default to use FIFO operations and
> > instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
> > other hand, the resume kernel keeps assuming 2-level, because it was hibernated
> > on a version of Xen that did not support the FIFO operations.
> >
> > To maintain compatibility it is necessary to make Xen behave as it did
> > before the new operations were added and hence the code in this patch ensures
> > that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
> > operations will again result in -ENOSYS being returned to the guest.
> 
> I have to admit I'm now even more concerned of the control for such
> going into Xen, the more with the now 2nd use in the subsequent patch.
> The implication of this would seem to be that whenever we add new
> hypercalls or sub-ops, a domain creation control would also need
> adding determining whether that new sub-op is actually okay to use by
> a guest. Or else I'd be keen to up front see criteria at least roughly
> outlined by which it could be established whether such an override
> control is needed.
> 

Ultimately I think any new hypercall (or related set of hypercalls) added to the ABI needs to be opt-in on a per-domain basis, so that we know that, from the moment a domain is first created, it will not see a change in its environment unless the VM administrator wants that to happen.

> I'm also not convinced such controls really want to be opt-in rather
> than opt-out.

They really need to be opt-in I think. From a cloud provider PoV it is important that nothing in a customer's environment changes unless we want it to. Otherwise we have no way to deploy an updated hypervisor version without risking crashing their VMs.

> While perhaps sensible as long as a feature is
> experimental, not exposing stuff by default may mean slower adoption
> of new (and hopefully better) functionality. I realize there's still
> the option of having the tool stack default to enable, and just the
> hypervisor defaulting to disable, but anyway.
> 

Ok. I don't see a problem in default-to-enable behaviour... but I guess we will need to add ABI features to the migration stream to fix things properly.

> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
> >      unsigned int max_vcpus;
> >
> >      /* HVM and HAP must be set. IOMMU may or may not be */
> > -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
> > +    if ( (config->flags &
> > +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
> >           (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
> >      {
> >          dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
> >          struct domain *d;
> >          struct xen_domctl_createdomain d_cfg = {
> >              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
> > -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> > +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> > +                     XEN_DOMCTL_CDF_evtchn_fifo,
> >              .max_evtchn_port = -1,
> >              .max_grant_frames = 64,
> >              .max_maptrack_frames = 1024,
> > --- a/xen/arch/arm/setup.c
> > +++ b/xen/arch/arm/setup.c
> > @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
> >      struct bootmodule *xen_bootmodule;
> >      struct domain *dom0;
> >      struct xen_domctl_createdomain dom0_cfg = {
> > -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> > +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> > +                 XEN_DOMCTL_CDF_evtchn_fifo,
> >          .max_evtchn_port = -1,
> >          .max_grant_frames = gnttab_dom0_frames(),
> >          .max_maptrack_frames = -1,
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
> >                                           const char *loader)
> >  {
> >      struct xen_domctl_createdomain dom0_cfg = {
> > -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
> > +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
> > +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
> >          .max_evtchn_port = -1,
> >          .max_grant_frames = -1,
> >          .max_maptrack_frames = -1,
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
> >           ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >             XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
> >             XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
> > -           XEN_DOMCTL_CDF_nested_virt) )
> > +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
> >      {
> >          dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
> >          return -EINVAL;
> 
> All of the hunks above point out a scalability issue if we were to
> follow this route for even just a fair part of new sub-ops, and I
> suppose you've noticed this with the next patch presumably touching
> all the same places again.

Indeed. This solution works for now but is probably not what we want in the long run. Same goes for the current way we control viridian features via an HVM param. It is good enough for now IMO since domctl is not a stable interface. Any ideas about how we might implement a better interface in the longer term?

  Paul

> 
> Jan



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 15:45     ` Paul Durrant
@ 2020-12-03 15:56       ` Jan Beulich
  2020-12-03 17:07         ` Paul Durrant
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2020-12-03 15:56 UTC (permalink / raw)
  To: paul
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

On 03.12.2020 16:45, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 03 December 2020 15:23
>>
>> On 03.12.2020 13:41, Paul Durrant wrote:
>>> From: Paul Durrant <pdurrant@amazon.com>
>>>
>>> ...to control the visibility of the FIFO event channel operations
>>> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
>>> the guest.
>>>
>>> These operations were added to the public header in commit d2d50c2f308f
>>> ("evtchn: add FIFO-based event channel ABI") and the first implementation
>>> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
>>> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
>>> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
>>> that, a guest issuing those operations would receive a return value of
>>> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
>>> running on an older (pre-4.4) Xen would fall back to using the 2-level event
>>> channel interface upon seeing this return value.
>>>
>>> Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
>>> onwards has implications for hibernation of some Linux guests. During resume
>>> from hibernation, there are two kernels involved: the "boot" kernel and the
>>> "resume" kernel. The guest boot kernel may default to use FIFO operations and
>>> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
>>> other hand, the resume kernel keeps assuming 2-level, because it was hibernated
>>> on a version of Xen that did not support the FIFO operations.
>>>
>>> To maintain compatibility it is necessary to make Xen behave as it did
>>> before the new operations were added and hence the code in this patch ensures
>>> that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
>>> operations will again result in -ENOSYS being returned to the guest.
>>
>> I have to admit I'm now even more concerned about the control for such
>> going into Xen, the more with the now 2nd use in the subsequent patch.
>> The implication of this would seem to be that whenever we add new
>> hypercalls or sub-ops, a domain creation control would also need
>> adding determining whether that new sub-op is actually okay to use by
>> a guest. Or else I'd be keen to up front see criteria at least roughly
>> outlined by which it could be established whether such an override
>> control is needed.
>>
> 
> Ultimately I think any new hypercall (or related set of hypercalls) added to the ABI needs to be opt-in on a per-domain basis, so that we know that from when a domain is first created it will not see a change in its environment unless the VM administrator wants that to happen.

A new hypercall appearing is a change to the guest's environment, yes,
but a backwards compatible one. I don't see how this would harm a guest.
This and ...

>> I'm also not convinced such controls really want to be opt-in rather
>> than opt-out.
> 
> They really need to be opt-in I think. From a cloud provider PoV it is important that nothing in a customer's environment changes unless we want it to. Otherwise we have no way to deploy an updated hypervisor version without risking crashing their VMs.

... this sound to me more like workarounds for buggy guests than
functionality the hypervisor _needs_ to have. (I can appreciate
the specific case here for the specific scenario you provide as
an exception.)

>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>      unsigned int max_vcpus;
>>>
>>>      /* HVM and HAP must be set. IOMMU may or may not be */
>>> -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
>>> +    if ( (config->flags &
>>> +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
>>>           (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
>>>      {
>>>          dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
>>>          struct domain *d;
>>>          struct xen_domctl_createdomain d_cfg = {
>>>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>>> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>>> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>> +                     XEN_DOMCTL_CDF_evtchn_fifo,
>>>              .max_evtchn_port = -1,
>>>              .max_grant_frames = 64,
>>>              .max_maptrack_frames = 1024,
>>> --- a/xen/arch/arm/setup.c
>>> +++ b/xen/arch/arm/setup.c
>>> @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>      struct bootmodule *xen_bootmodule;
>>>      struct domain *dom0;
>>>      struct xen_domctl_createdomain dom0_cfg = {
>>> -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>>> +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>> +                 XEN_DOMCTL_CDF_evtchn_fifo,
>>>          .max_evtchn_port = -1,
>>>          .max_grant_frames = gnttab_dom0_frames(),
>>>          .max_maptrack_frames = -1,
>>> --- a/xen/arch/x86/setup.c
>>> +++ b/xen/arch/x86/setup.c
>>> @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
>>>                                           const char *loader)
>>>  {
>>>      struct xen_domctl_createdomain dom0_cfg = {
>>> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
>>> +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
>>> +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
>>>          .max_evtchn_port = -1,
>>>          .max_grant_frames = -1,
>>>          .max_maptrack_frames = -1,
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>           ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>             XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
>>>             XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
>>> -           XEN_DOMCTL_CDF_nested_virt) )
>>> +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
>>>      {
>>>          dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
>>>          return -EINVAL;
>>
>> All of the hunks above point out a scalability issue if we were to
>> follow this route for even just a fair part of new sub-ops, and I
>> suppose you've noticed this with the next patch presumably touching
>> all the same places again.
> 
> Indeed. This solution works for now but is probably not what we want in the long run. Same goes for the current way we control viridian features via an HVM param. It is good enough for now IMO since domctl is not a stable interface. Any ideas about how we might implement a better interface in the longer term?

While it has other downsides, Jürgen's proposal doesn't have any
similar scalability issue afaics. Another possible model would
seem to be to key new hypercalls to hypervisor CPUID leaf bits,
and derive their availability from a guest's CPUID policy. Of
course this won't work when needing to retrofit guarding like
you want to do here.

Jan



* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 15:56       ` Jan Beulich
@ 2020-12-03 17:07         ` Paul Durrant
  2020-12-03 17:19           ` Jürgen Groß
  2020-12-04  7:53           ` Jan Beulich
  0 siblings, 2 replies; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 17:07 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 03 December 2020 15:57
> To: paul@xen.org
> Cc: 'Paul Durrant' <pdurrant@amazon.com>; 'Eslam Elnikety' <elnikety@amazon.com>; 'Ian Jackson'
> <iwj@xenproject.org>; 'Wei Liu' <wl@xen.org>; 'Anthony PERARD' <anthony.perard@citrix.com>; 'Andrew
> Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Julien Grall'
> <julien@xen.org>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'Christian Lindig'
> <christian.lindig@citrix.com>; 'David Scott' <dave@recoil.org>; 'Volodymyr Babchuk'
> <Volodymyr_Babchuk@epam.com>; 'Roger Pau Monné' <roger.pau@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo,
> ...
> 
> On 03.12.2020 16:45, Paul Durrant wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 03 December 2020 15:23
> >>
> >> On 03.12.2020 13:41, Paul Durrant wrote:
> >>> From: Paul Durrant <pdurrant@amazon.com>
> >>>
> >>> ...to control the visibility of the FIFO event channel operations
> >>> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
> >>> the guest.
> >>>
> >>> These operations were added to the public header in commit d2d50c2f308f
> >>> ("evtchn: add FIFO-based event channel ABI") and the first implementation
> >>> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
> >>> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
> >>> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
> >>> that, a guest issuing those operations would receive a return value of
> >>> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
> >>> running on an older (pre-4.4) Xen would fall back to using the 2-level event
> >>> channel interface upon seeing this return value.
> >>>
> >>> Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
> >>> onwards has implications for hibernation of some Linux guests. During resume
> >>> from hibernation, there are two kernels involved: the "boot" kernel and the
> >>> "resume" kernel. The guest boot kernel may default to use FIFO operations and
> >>> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
> >>> other hand, the resume kernel keeps assuming 2-level, because it was hibernated
> >>> on a version of Xen that did not support the FIFO operations.
> >>>
> >>> To maintain compatibility it is necessary to make Xen behave as it did
> >>> before the new operations were added and hence the code in this patch ensures
> >>> that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
> >>> operations will again result in -ENOSYS being returned to the guest.
> >>
> >> I have to admit I'm now even more concerned about the control for such
> >> going into Xen, the more with the now 2nd use in the subsequent patch.
> >> The implication of this would seem to be that whenever we add new
> >> hypercalls or sub-ops, a domain creation control would also need
> >> adding determining whether that new sub-op is actually okay to use by
> >> a guest. Or else I'd be keen to up front see criteria at least roughly
> >> outlined by which it could be established whether such an override
> >> control is needed.
> >>
> >
> > Ultimately I think any new hypercall (or related set of hypercalls) added to the ABI needs to be
> opt-in on a per-domain basis, so that we know that from when a domain is first created it will not see
> a change in its environment unless the VM administrator wants that to happen.
> 
> A new hypercall appearing is a change to the guest's environment, yes,
> but a backwards compatible one. I don't see how this would harm a guest.

Say we have a guest which is aware of the newer Xen and is coded to use the new hypercall *but* we start it on an older Xen where the new hypercall is not implemented. *Then* we migrate it to the newer Xen... the new hypercall suddenly becomes available and changes the guest behaviour. In the worst case, the guest is sufficiently confused that it crashes.

> This and ...
> 
> >> I'm also not convinced such controls really want to be opt-in rather
> >> than opt-out.
> >
> > They really need to be opt-in I think. From a cloud provider PoV it is important that nothing in a
> customer's environment changes unless we want it to. Otherwise we have no way to deploy an updated
> hypervisor version without risking crashing their VMs.
> 
> ... this sound to me more like workarounds for buggy guests than
> functionality the hypervisor _needs_ to have. (I can appreciate
> the specific case here for the specific scenario you provide as
> an exception.)

If we want to have a hypervisor that can be used in a cloud environment then Xen absolutely needs this capability.

> 
> >>> --- a/xen/arch/arm/domain.c
> >>> +++ b/xen/arch/arm/domain.c
> >>> @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
> >>>      unsigned int max_vcpus;
> >>>
> >>>      /* HVM and HAP must be set. IOMMU may or may not be */
> >>> -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
> >>> +    if ( (config->flags &
> >>> +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
> >>>           (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
> >>>      {
> >>>          dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
> >>> --- a/xen/arch/arm/domain_build.c
> >>> +++ b/xen/arch/arm/domain_build.c
> >>> @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
> >>>          struct domain *d;
> >>>          struct xen_domctl_createdomain d_cfg = {
> >>>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
> >>> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> >>> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >>> +                     XEN_DOMCTL_CDF_evtchn_fifo,
> >>>              .max_evtchn_port = -1,
> >>>              .max_grant_frames = 64,
> >>>              .max_maptrack_frames = 1024,
> >>> --- a/xen/arch/arm/setup.c
> >>> +++ b/xen/arch/arm/setup.c
> >>> @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
> >>>      struct bootmodule *xen_bootmodule;
> >>>      struct domain *dom0;
> >>>      struct xen_domctl_createdomain dom0_cfg = {
> >>> -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> >>> +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >>> +                 XEN_DOMCTL_CDF_evtchn_fifo,
> >>>          .max_evtchn_port = -1,
> >>>          .max_grant_frames = gnttab_dom0_frames(),
> >>>          .max_maptrack_frames = -1,
> >>> --- a/xen/arch/x86/setup.c
> >>> +++ b/xen/arch/x86/setup.c
> >>> @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
> >>>                                           const char *loader)
> >>>  {
> >>>      struct xen_domctl_createdomain dom0_cfg = {
> >>> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
> >>> +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
> >>> +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
> >>>          .max_evtchn_port = -1,
> >>>          .max_grant_frames = -1,
> >>>          .max_maptrack_frames = -1,
> >>> --- a/xen/common/domain.c
> >>> +++ b/xen/common/domain.c
> >>> @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
> >>>           ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >>>             XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
> >>>             XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
> >>> -           XEN_DOMCTL_CDF_nested_virt) )
> >>> +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
> >>>      {
> >>>          dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
> >>>          return -EINVAL;
> >>
> >> All of the hunks above point out a scalability issue if we were to
> >> follow this route for even just a fair part of new sub-ops, and I
> >> suppose you've noticed this with the next patch presumably touching
> >> all the same places again.
> >
> > Indeed. This solution works for now but is probably not what we want in the long run. Same goes for
> the current way we control viridian features via an HVM param. It is good enough for now IMO since
> domctl is not a stable interface. Any ideas about how we might implement a better interface in the
> longer term?
> 
> While it has other downsides, Jürgen's proposal doesn't have any
> similar scalability issue afaics. Another possible model would
> seem to be to key new hypercalls to hypervisor CPUID leaf bits,
> and derive their availability from a guest's CPUID policy. Of
> course this won't work when needing to retrofit guarding like
> you want to do here.
> 

Ok, I'll take a look at hypfs as an immediate solution, if that's preferred.

  Paul




* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 17:07         ` Paul Durrant
@ 2020-12-03 17:19           ` Jürgen Groß
  2020-12-03 18:44             ` Paul Durrant
  2020-12-04  7:53           ` Jan Beulich
  1 sibling, 1 reply; 28+ messages in thread
From: Jürgen Groß @ 2020-12-03 17:19 UTC (permalink / raw)
  To: paul, 'Jan Beulich'
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel


On 03.12.20 18:07, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 03 December 2020 15:57
>> To: paul@xen.org
>> Cc: 'Paul Durrant' <pdurrant@amazon.com>; 'Eslam Elnikety' <elnikety@amazon.com>; 'Ian Jackson'
>> <iwj@xenproject.org>; 'Wei Liu' <wl@xen.org>; 'Anthony PERARD' <anthony.perard@citrix.com>; 'Andrew
>> Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Julien Grall'
>> <julien@xen.org>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'Christian Lindig'
>> <christian.lindig@citrix.com>; 'David Scott' <dave@recoil.org>; 'Volodymyr Babchuk'
>> <Volodymyr_Babchuk@epam.com>; 'Roger Pau Monné' <roger.pau@citrix.com>; xen-devel@lists.xenproject.org
>> Subject: Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo,
>> ...
>>
>> On 03.12.2020 16:45, Paul Durrant wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 03 December 2020 15:23
>>>>
>>>> On 03.12.2020 13:41, Paul Durrant wrote:
>>>>> From: Paul Durrant <pdurrant@amazon.com>
>>>>>
>>>>> ...to control the visibility of the FIFO event channel operations
>>>>> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
>>>>> the guest.
>>>>>
>>>>> These operations were added to the public header in commit d2d50c2f308f
>>>>> ("evtchn: add FIFO-based event channel ABI") and the first implementation
>>>>> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
>>>>> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
>>>>> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
>>>>> that, a guest issuing those operations would receive a return value of
>>>>> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
>>>>> running on an older (pre-4.4) Xen would fall back to using the 2-level event
>>>>> channel interface upon seeing this return value.
>>>>>
>>>>> Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
>>>>> onwards has implications for hibernation of some Linux guests. During resume
>>>>> from hibernation, there are two kernels involved: the "boot" kernel and the
>>>>> "resume" kernel. The guest boot kernel may default to use FIFO operations and
>>>>> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
>>>>> other hand, the resume kernel keeps assuming 2-level, because it was hibernated
>>>>> on a version of Xen that did not support the FIFO operations.
>>>>>
>>>>> To maintain compatibility it is necessary to make Xen behave as it did
>>>>> before the new operations were added and hence the code in this patch ensures
>>>>> that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
>>>>> operations will again result in -ENOSYS being returned to the guest.
>>>>
>>>> I have to admit I'm now even more concerned about the control for such
>>>> going into Xen, the more with the now 2nd use in the subsequent patch.
>>>> The implication of this would seem to be that whenever we add new
>>>> hypercalls or sub-ops, a domain creation control would also need
>>>> adding determining whether that new sub-op is actually okay to use by
>>>> a guest. Or else I'd be keen to up front see criteria at least roughly
>>>> outlined by which it could be established whether such an override
>>>> control is needed.
>>>>
>>>
>>> Ultimately I think any new hypercall (or related set of hypercalls) added to the ABI needs to be
>> opt-in on a per-domain basis, so that we know that from when a domain is first created it will not see
>> a change in its environment unless the VM administrator wants that to happen.
>>
>> A new hypercall appearing is a change to the guest's environment, yes,
>> but a backwards compatible one. I don't see how this would harm a guest.
> 
> Say we have a guest which is aware of the newer Xen and is coded to use the new hypercall *but* we start it on an older Xen where the new hypercall is not implemented. *Then* we migrate it to the newer Xen... the new hypercall suddenly becomes available and changes the guest behaviour. In the worst case, the guest is sufficiently confused that it crashes.
> 
>> This and ...
>>
>>>> I'm also not convinced such controls really want to be opt-in rather
>>>> than opt-out.
>>>
>>> They really need to be opt-in I think. From a cloud provider PoV it is important that nothing in a
>> customer's environment changes unless we want it to. Otherwise we have no way to deploy an updated
>> hypervisor version without risking crashing their VMs.
>>
>> ... this sound to me more like workarounds for buggy guests than
>> functionality the hypervisor _needs_ to have. (I can appreciate
>> the specific case here for the specific scenario you provide as
>> an exception.)
> 
> If we want to have a hypervisor that can be used in a cloud environment then Xen absolutely needs this capability.
> 
>>
>>>>> --- a/xen/arch/arm/domain.c
>>>>> +++ b/xen/arch/arm/domain.c
>>>>> @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>       unsigned int max_vcpus;
>>>>>
>>>>>       /* HVM and HAP must be set. IOMMU may or may not be */
>>>>> -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
>>>>> +    if ( (config->flags &
>>>>> +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
>>>>>            (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
>>>>>       {
>>>>>           dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
>>>>>           struct domain *d;
>>>>>           struct xen_domctl_createdomain d_cfg = {
>>>>>               .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>>>>> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>>>>> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>>> +                     XEN_DOMCTL_CDF_evtchn_fifo,
>>>>>               .max_evtchn_port = -1,
>>>>>               .max_grant_frames = 64,
>>>>>               .max_maptrack_frames = 1024,
>>>>> --- a/xen/arch/arm/setup.c
>>>>> +++ b/xen/arch/arm/setup.c
>>>>> @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>>       struct bootmodule *xen_bootmodule;
>>>>>       struct domain *dom0;
>>>>>       struct xen_domctl_createdomain dom0_cfg = {
>>>>> -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>>>>> +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>>> +                 XEN_DOMCTL_CDF_evtchn_fifo,
>>>>>           .max_evtchn_port = -1,
>>>>>           .max_grant_frames = gnttab_dom0_frames(),
>>>>>           .max_maptrack_frames = -1,
>>>>> --- a/xen/arch/x86/setup.c
>>>>> +++ b/xen/arch/x86/setup.c
>>>>> @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
>>>>>                                            const char *loader)
>>>>>   {
>>>>>       struct xen_domctl_createdomain dom0_cfg = {
>>>>> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
>>>>> +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
>>>>> +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
>>>>>           .max_evtchn_port = -1,
>>>>>           .max_grant_frames = -1,
>>>>>           .max_maptrack_frames = -1,
>>>>> --- a/xen/common/domain.c
>>>>> +++ b/xen/common/domain.c
>>>>> @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>            ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>>>              XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
>>>>>              XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
>>>>> -           XEN_DOMCTL_CDF_nested_virt) )
>>>>> +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
>>>>>       {
>>>>>           dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
>>>>>           return -EINVAL;
>>>>
>>>> All of the hunks above point out a scalability issue if we were to
>>>> follow this route for even just a fair part of new sub-ops, and I
>>>> suppose you've noticed this with the next patch presumably touching
>>>> all the same places again.
>>>
>>> Indeed. This solution works for now but is probably not what we want in the long run. Same goes for
>> the current way we control viridian features via an HVM param. It is good enough for now IMO since
>> domctl is not a stable interface. Any ideas about how we might implement a better interface in the
>> longer term?
>>
>> While it has other downsides, Jürgen's proposal doesn't have any
>> similar scalability issue afaics. Another possible model would
>> seem to be to key new hypercalls to hypervisor CPUID leaf bits,
>> and derive their availability from a guest's CPUID policy. Of
>> course this won't work when needing to retrofit guarding like
>> you want to do here.
>>
> 
> Ok, I'll take a look at hypfs as an immediate solution, if that's preferred.

Paul, if you'd like me to add a few patches to my series doing the basic
framework (per-domain nodes, interfaces for setting struct domain fields
via hypfs), I can do that. You could then concentrate on the tools side.


Juergen


* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 17:19           ` Jürgen Groß
@ 2020-12-03 18:44             ` Paul Durrant
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Durrant @ 2020-12-03 18:44 UTC (permalink / raw)
  To: 'Jürgen Groß', 'Jan Beulich'
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

> -----Original Message-----
[snip]
> >>>>
> >>>> All of the hunks above point out a scalability issue if we were to
> >>>> follow this route for even just a fair part of new sub-ops, and I
> >>>> suppose you've noticed this with the next patch presumably touching
> >>>> all the same places again.
> >>>
> >>> Indeed. This solution works for now but is probably not what we want in the long run. Same goes
> for
> >> the current way we control viridian features via an HVM param. It is good enough for now IMO since
> >> domctl is not a stable interface. Any ideas about how we might implement a better interface in the
> >> longer term?
> >>
> >> While it has other downsides, Jürgen's proposal doesn't have any
> >> similar scalability issue afaics. Another possible model would
> >> seem to be to key new hypercalls to hypervisor CPUID leaf bits,
> >> and derive their availability from a guest's CPUID policy. Of
> >> course this won't work when needing to retrofit guarding like
> >> you want to do here.
> >>
> >
> > Ok, I'll take a look at hypfs as an immediate solution, if that's preferred.
> 
> Paul, if you'd like me to add a few patches to my series doing the basic
> framework (per-domain nodes, interfaces for setting struct domain fields
> via hypfs), I can do that. You could then concentrate on the tools side.
> 

That would be very helpful. Thanks Juergen.

  Paul




* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-03 17:07         ` Paul Durrant
  2020-12-03 17:19           ` Jürgen Groß
@ 2020-12-04  7:53           ` Jan Beulich
  2020-12-04  8:22             ` Paul Durrant
  1 sibling, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2020-12-04  7:53 UTC (permalink / raw)
  To: paul
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

On 03.12.2020 18:07, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 03 December 2020 15:57
>>
>> On 03.12.2020 16:45, Paul Durrant wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 03 December 2020 15:23
>>>>
>>>> On 03.12.2020 13:41, Paul Durrant wrote:
>>>>> From: Paul Durrant <pdurrant@amazon.com>
>>>>>
>>>>> ...to control the visibility of the FIFO event channel operations
>>>>> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
>>>>> the guest.
>>>>>
>>>>> These operations were added to the public header in commit d2d50c2f308f
>>>>> ("evtchn: add FIFO-based event channel ABI") and the first implementation
>>>>> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
>>>>> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
>>>>> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
>>>>> that, a guest issuing those operations would receive a return value of
>>>>> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
>>>>> running on an older (pre-4.4) Xen would fall back to using the 2-level event
>>>>> channel interface upon seeing this return value.
>>>>>
>>>>> Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
>>>>> onwards has implications for hibernation of some Linux guests. During resume
>>>>> from hibernation, there are two kernels involved: the "boot" kernel and the
>>>>> "resume" kernel. The guest boot kernel may default to use FIFO operations and
>>>>> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
>>>>> other hand, the resume kernel keeps assuming 2-level, because it was hibernated
>>>>> on a version of Xen that did not support the FIFO operations.
>>>>>
>>>>> To maintain compatibility it is necessary to make Xen behave as it did
>>>>> before the new operations were added and hence the code in this patch ensures
>>>>> that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
>>>>> operations will again result in -ENOSYS being returned to the guest.
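[The gating the commit message describes amounts to the following pattern. This is a sketch only, not the actual event_channel.c hunk, which is not quoted in this message; the flag's bit value and the dispatcher shape are illustrative, though the sub-op numbers match the public event_channel.h header.]

```c
#include <errno.h>
#include <stdint.h>

#define XEN_DOMCTL_CDF_evtchn_fifo (1u << 7)   /* illustrative bit position */

/* FIFO sub-op numbers, as in xen/include/public/event_channel.h. */
#define EVTCHNOP_init_control  11
#define EVTCHNOP_expand_array  12
#define EVTCHNOP_set_priority  13

struct domain { uint32_t cdf_flags; };

/*
 * Gate applied at the top of the event channel hypercall dispatcher:
 * if the domain was created without XEN_DOMCTL_CDF_evtchn_fifo, the
 * three FIFO sub-ops fail with -ENOSYS, restoring pre-4.4 behaviour.
 * All other sub-ops pass through unaffected (returning 0 here).
 */
int evtchn_op_gate(const struct domain *d, int cmd)
{
    switch ( cmd )
    {
    case EVTCHNOP_init_control:
    case EVTCHNOP_expand_array:
    case EVTCHNOP_set_priority:
        if ( !(d->cdf_flags & XEN_DOMCTL_CDF_evtchn_fifo) )
            return -ENOSYS;
        break;
    }
    return 0;
}
```

A guest probing EVTCHNOP_init_control then sees the same -ENOSYS it would have seen on pre-4.4 Xen, and falls back to the 2-level ABI.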
>>>>
>>>> I have to admit I'm now even more concerned about the control for such
>>>> going into Xen, the more with the now 2nd use in the subsequent patch.
>>>> The implication of this would seem to be that whenever we add new
>>>> hypercalls or sub-ops, a domain creation control would also need
>>>> adding determining whether that new sub-op is actually okay to use by
>>>> a guest. Or else I'd be keen to up front see criteria at least roughly
>>>> outlined by which it could be established whether such an override
>>>> control is needed.
>>>>
>>>
>>> Ultimately I think any new hypercall (or related set of hypercalls) added to the ABI needs to be
>> opt-in on a per-domain basis, so that we know that from when a domain is first created it will not see
>> a change in its environment unless the VM administrator wants that to happen.
>>
>> A new hypercall appearing is a change to the guest's environment, yes,
>> but a backwards compatible one. I don't see how this would harm a guest.
> 
> Say we have a guest which is aware of the newer Xen and is coded to use the new hypercall *but* we start it on an older Xen where the new hypercall is not implemented. *Then* we migrate it to the newer Xen... the new hypercall suddenly becomes available and changes the guest behaviour. In the worst case, the guest is sufficiently confused that it crashes.

If a guest recognizes a new hypercall is unavailable, why would it ever try
to make use of it again, unless it knows it may appear after having been
migrated (which implies it's prepared for the sudden appearance)? This to
me still looks like an attempt to work around guest bugs. And a halfhearted
one, as you cherry pick just two out of many additions to the original 3.0
ABI.

>> This and ...
>>
>>>> I'm also not convinced such controls really want to be opt-in rather
>>>> than opt-out.
>>>
>>> They really need to be opt-in I think. From a cloud provider PoV it is important that nothing in a
>> customer's environment changes unless we want it to. Otherwise we have no way to deploy an updated
>> hypervisor version without risking crashing their VMs.
>>
>> ... this sounds to me more like workarounds for buggy guests than
>> functionality the hypervisor _needs_ to have. (I can appreciate
>> the specific case here for the specific scenario you provide as
>> an exception.)
> 
> If we want to have a hypervisor that can be used in a cloud environment
> then Xen absolutely needs this capability.

As per above you can conclude that I'm still struggling to see the
"why" part here.

>>>>> --- a/xen/arch/arm/domain.c
>>>>> +++ b/xen/arch/arm/domain.c
>>>>> @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>      unsigned int max_vcpus;
>>>>>
>>>>>      /* HVM and HAP must be set. IOMMU may or may not be */
>>>>> -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
>>>>> +    if ( (config->flags &
>>>>> +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
>>>>>           (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
>>>>>      {
>>>>>          dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
>>>>>          struct domain *d;
>>>>>          struct xen_domctl_createdomain d_cfg = {
>>>>>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
>>>>> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>>>>> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>>> +                     XEN_DOMCTL_CDF_evtchn_fifo,
>>>>>              .max_evtchn_port = -1,
>>>>>              .max_grant_frames = 64,
>>>>>              .max_maptrack_frames = 1024,
>>>>> --- a/xen/arch/arm/setup.c
>>>>> +++ b/xen/arch/arm/setup.c
>>>>> @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>>>>>      struct bootmodule *xen_bootmodule;
>>>>>      struct domain *dom0;
>>>>>      struct xen_domctl_createdomain dom0_cfg = {
>>>>> -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
>>>>> +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>>> +                 XEN_DOMCTL_CDF_evtchn_fifo,
>>>>>          .max_evtchn_port = -1,
>>>>>          .max_grant_frames = gnttab_dom0_frames(),
>>>>>          .max_maptrack_frames = -1,
>>>>> --- a/xen/arch/x86/setup.c
>>>>> +++ b/xen/arch/x86/setup.c
>>>>> @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
>>>>>                                           const char *loader)
>>>>>  {
>>>>>      struct xen_domctl_createdomain dom0_cfg = {
>>>>> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
>>>>> +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
>>>>> +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
>>>>>          .max_evtchn_port = -1,
>>>>>          .max_grant_frames = -1,
>>>>>          .max_maptrack_frames = -1,
>>>>> --- a/xen/common/domain.c
>>>>> +++ b/xen/common/domain.c
>>>>> @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
>>>>>           ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
>>>>>             XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
>>>>>             XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
>>>>> -           XEN_DOMCTL_CDF_nested_virt) )
>>>>> +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
>>>>>      {
>>>>>          dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
>>>>>          return -EINVAL;
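[The validation pattern in the hunk above can be modelled standalone: masking off every known flag must leave zero, else the config is rejected with -EINVAL. Each new CDF flag must be threaded into this mask, which is the scalability concern Jan raises next. Bit positions here are illustrative, not guaranteed ABI values.]

```c
#include <errno.h>
#include <stdint.h>

#define XEN_DOMCTL_CDF_hvm          (1u << 0)
#define XEN_DOMCTL_CDF_hap          (1u << 1)
#define XEN_DOMCTL_CDF_s3_integrity (1u << 2)
#define XEN_DOMCTL_CDF_oos_off      (1u << 3)
#define XEN_DOMCTL_CDF_xs_domain    (1u << 4)
#define XEN_DOMCTL_CDF_iommu        (1u << 5)
#define XEN_DOMCTL_CDF_nested_virt  (1u << 6)
#define XEN_DOMCTL_CDF_evtchn_fifo  (1u << 7)

/*
 * Reject any flags bit not in the known-flags mask, as
 * sanitise_domain_config() does in the hunk above.
 */
int sanitise_flags(uint32_t flags)
{
    if ( flags &
         ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
           XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
           XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
        return -EINVAL;

    return 0;
}
```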
>>>>
>>>> All of the hunks above point out a scalability issue if we were to
>>>> follow this route for even just a fair part of new sub-ops, and I
>>>> suppose you've noticed this with the next patch presumably touching
>>>> all the same places again.
>>>
>>> Indeed. This solution works for now but is probably not what we want in the long run. Same goes for
>> the current way we control viridian features via an HVM param. It is good enough for now IMO since
>> domctl is not a stable interface. Any ideas about how we might implement a better interface in the
>> longer term?
>>
>> While it has other downsides, Jürgen's proposal doesn't have any
>> similar scalability issue afaics. Another possible model would
>> seem to be to key new hypercalls to hypervisor CPUID leaf bits,
>> and derive their availability from a guest's CPUID policy. Of
>> course this won't work when needing to retrofit guarding like
>> you want to do here.
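[The CPUID-keyed model suggested above could be sketched as follows. The feature leaf, the bit assignments, and the per-guest policy structure are all hypothetical; no such ABI existed at the time of this thread. The point of the shape is that a new gated sub-op costs one table entry rather than a new CDF flag threaded through every flags mask.]

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits in a hypervisor CPUID leaf (invented). */
#define HV_FEAT_EVTCHN_FIFO   (1u << 0)
#define HV_FEAT_EVTCHN_UPCALL (1u << 1)

/* Per-guest CPUID policy; the toolstack would set these at creation. */
struct cpuid_policy { uint32_t hv_features; };

/* Gated sub-ops mapped to the feature bit the policy must contain. */
static const struct { int subop; uint32_t feat; } gated[] = {
    { 11 /* EVTCHNOP_init_control */, HV_FEAT_EVTCHN_FIFO },
    { 12 /* EVTCHNOP_expand_array */, HV_FEAT_EVTCHN_FIFO },
    { 13 /* EVTCHNOP_set_priority */, HV_FEAT_EVTCHN_FIFO },
};

/*
 * A sub-op is available if it has no gate entry, or if its feature
 * bit is set in the guest's CPUID policy.
 */
bool subop_available(const struct cpuid_policy *p, int subop)
{
    for ( size_t i = 0; i < sizeof(gated) / sizeof(gated[0]); i++ )
        if ( gated[i].subop == subop )
            return p->hv_features & gated[i].feat;

    return true;
}
```

As noted in the message, this model fits new ABI additions going forward but cannot retroactively gate sub-ops that guests already see unconditionally.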
> 
> Ok, I'll take a look at hypfs as an immediate solution, if that's preferred.

Well, as said - there are also downsides with that approach. I'm
not convinced it should be just the three of us to determine which
one is better overall, the more that you don't seem to be convinced
anyway (and I'm not going to claim I am, in either direction). It
is my understanding that based on the claimed need for this (see
above), this may become very widely used functionality, and if so
we want to make sure we won't regret the route we went.

For starters, just to get a better overall picture, besides the two
overrides you add here, do you have any plans to retroactively add
further controls for past ABI additions? I don't suppose you have
plans to consistently do this for all of them? I ask because I
could see us going the route you've chosen for very few retroactive
additions of overrides, but use the CPUID approach (provided
there's a suitable Arm counterpart, and ideally some understanding
whether something similar could also be found for at least the two
planned new ports) for ABI additions going forward.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04  7:53           ` Jan Beulich
@ 2020-12-04  8:22             ` Paul Durrant
  2020-12-04  9:43               ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Durrant @ 2020-12-04  8:22 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 04 December 2020 07:53
> To: paul@xen.org
> Cc: 'Paul Durrant' <pdurrant@amazon.com>; 'Eslam Elnikety' <elnikety@amazon.com>; 'Ian Jackson'
> <iwj@xenproject.org>; 'Wei Liu' <wl@xen.org>; 'Anthony PERARD' <anthony.perard@citrix.com>; 'Andrew
> Cooper' <andrew.cooper3@citrix.com>; 'George Dunlap' <george.dunlap@citrix.com>; 'Julien Grall'
> <julien@xen.org>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'Christian Lindig'
> <christian.lindig@citrix.com>; 'David Scott' <dave@recoil.org>; 'Volodymyr Babchuk'
> <Volodymyr_Babchuk@epam.com>; 'Roger Pau Monné' <roger.pau@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo,
> ...
> 
> On 03.12.2020 18:07, Paul Durrant wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 03 December 2020 15:57
> >>
> >> On 03.12.2020 16:45, Paul Durrant wrote:
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 03 December 2020 15:23
> >>>>
> >>>> On 03.12.2020 13:41, Paul Durrant wrote:
> >>>>> From: Paul Durrant <pdurrant@amazon.com>
> >>>>>
> >>>>> ...to control the visibility of the FIFO event channel operations
> >>>>> (EVTCHNOP_init_control, EVTCHNOP_expand_array, and EVTCHNOP_set_priority) to
> >>>>> the guest.
> >>>>>
> >>>>> These operations were added to the public header in commit d2d50c2f308f
> >>>>> ("evtchn: add FIFO-based event channel ABI") and the first implementation
> >>>>> appeared in the two subsequent commits: edc8872aeb4a ("evtchn: implement
> >>>>> EVTCHNOP_set_priority and add the set_priority hook") and 88910061ec61
> >>>>> ("evtchn: add FIFO-based event channel hypercalls and port ops"). Prior to
> >>>>> that, a guest issuing those operations would receive a return value of
> >>>>> -ENOSYS (not implemented) from Xen. Guests aware of the FIFO operations but
> >>>>> running on an older (pre-4.4) Xen would fall back to using the 2-level event
> >>>>> channel interface upon seeing this return value.
> >>>>>
> >>>>> Unfortunately the uncontrollable appearance of these new operations in Xen 4.4
> >>>>> onwards has implications for hibernation of some Linux guests. During resume
> >>>>> from hibernation, there are two kernels involved: the "boot" kernel and the
> >>>>> "resume" kernel. The guest boot kernel may default to use FIFO operations and
> >>>>> instruct Xen via EVTCHNOP_init_control to switch from 2-level to FIFO. On the
> >>>>> other hand, the resume kernel keeps assuming 2-level, because it was hibernated
> >>>>> on a version of Xen that did not support the FIFO operations.
> >>>>>
> >>>>> To maintain compatibility it is necessary to make Xen behave as it did
> >>>>> before the new operations were added and hence the code in this patch ensures
> >>>>> that, if XEN_DOMCTL_CDF_evtchn_fifo is not set, the FIFO event channel
> >>>>> operations will again result in -ENOSYS being returned to the guest.
> >>>>
> >>>> I have to admit I'm now even more concerned about the control for such
> >>>> going into Xen, the more with the now 2nd use in the subsequent patch.
> >>>> The implication of this would seem to be that whenever we add new
> >>>> hypercalls or sub-ops, a domain creation control would also need
> >>>> adding determining whether that new sub-op is actually okay to use by
> >>>> a guest. Or else I'd be keen to up front see criteria at least roughly
> >>>> outlined by which it could be established whether such an override
> >>>> control is needed.
> >>>>
> >>>
> >>> Ultimately I think any new hypercall (or related set of hypercalls) added to the ABI needs to be
> >> opt-in on a per-domain basis, so that we know that from when a domain is first created it will not
> see
> >> a change in its environment unless the VM administrator wants that to happen.
> >>
> >> A new hypercall appearing is a change to the guest's environment, yes,
> >> but a backwards compatible one. I don't see how this would harm a guest.
> >
> > Say we have a guest which is aware of the newer Xen and is coded to use the new hypercall *but* we
> start it on an older Xen where the new hypercall is not implemented. *Then* we migrate it to the newer
> Xen... the new hypercall suddenly becomes available and changes the guest behaviour. In the worst
> case, the guest is sufficiently confused that it crashes.
> 
> If a guest recognizes a new hypercall is unavailable, why would it ever try
> to make use of it again, unless it knows it may appear after having been
> migrated (which implies it's prepared for the sudden appearance)? This to
> me still looks like an attempt to work around guest bugs. And a halfhearted
> one, as you cherry pick just two out of many additions to the original 3.0
> ABI.
> 

This is exactly an attempt to work around guest bugs, which is something a cloud provider needs to do where possible.

> >> This and ...
> >>
> >>>> I'm also not convinced such controls really want to be opt-in rather
> >>>> than opt-out.
> >>>
> >>> They really need to be opt-in I think. From a cloud provider PoV it is important that nothing in a
> >> customer's environment changes unless we want it to. Otherwise we have no way to deploy an updated
> >> hypervisor version without risking crashing their VMs.
> >>
> >> ... this sounds to me more like workarounds for buggy guests than
> >> functionality the hypervisor _needs_ to have. (I can appreciate
> >> the specific case here for the specific scenario you provide as
> >> an exception.)
> >
> > If we want to have a hypervisor that can be used in a cloud environment
> > then Xen absolutely needs this capability.
> 
> As per above you can conclude that I'm still struggling to see the
> "why" part here.
> 

Imagine you are a customer. You boot your OS and everything is just fine... you run your workload and all is good. You then shut down your VM and re-start it. Now it starts to crash. Who are you going to blame? You did nothing to your OS or application s/w, so you are going to blame the cloud provider of course.

Now imagine you are the cloud provider, running Xen. What you did was start to upgrade your hosts from an older version of Xen to a newer version of Xen, to pick up various bug fixes and make sure you are running a version that is within the security support envelope. You identify that your customer's problem is a bug in their OS that was latent on the old version of the hypervisor but is now manifesting on the new one because it has buggy support for a hypercall that was added between the two versions. How are you going to fix this issue, and get your customer up and running again? Of course you'd like your customer to upgrade their OS, but they can't even boot it to do that. You really need a solution that can restore the old VM environment, at least temporarily, for that customer.

> >>>>> --- a/xen/arch/arm/domain.c
> >>>>> +++ b/xen/arch/arm/domain.c
> >>>>> @@ -622,7 +622,8 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
> >>>>>      unsigned int max_vcpus;
> >>>>>
> >>>>>      /* HVM and HAP must be set. IOMMU may or may not be */
> >>>>> -    if ( (config->flags & ~XEN_DOMCTL_CDF_iommu) !=
> >>>>> +    if ( (config->flags &
> >>>>> +          ~(XEN_DOMCTL_CDF_iommu | XEN_DOMCTL_CDF_evtchn_fifo)) !=
> >>>>>           (XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap) )
> >>>>>      {
> >>>>>          dprintk(XENLOG_INFO, "Unsupported configuration %#x\n",
> >>>>> --- a/xen/arch/arm/domain_build.c
> >>>>> +++ b/xen/arch/arm/domain_build.c
> >>>>> @@ -2478,7 +2478,8 @@ void __init create_domUs(void)
> >>>>>          struct domain *d;
> >>>>>          struct xen_domctl_createdomain d_cfg = {
> >>>>>              .arch.gic_version = XEN_DOMCTL_CONFIG_GIC_NATIVE,
> >>>>> -            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> >>>>> +            .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >>>>> +                     XEN_DOMCTL_CDF_evtchn_fifo,
> >>>>>              .max_evtchn_port = -1,
> >>>>>              .max_grant_frames = 64,
> >>>>>              .max_maptrack_frames = 1024,
> >>>>> --- a/xen/arch/arm/setup.c
> >>>>> +++ b/xen/arch/arm/setup.c
> >>>>> @@ -805,7 +805,8 @@ void __init start_xen(unsigned long boot_phys_offset,
> >>>>>      struct bootmodule *xen_bootmodule;
> >>>>>      struct domain *dom0;
> >>>>>      struct xen_domctl_createdomain dom0_cfg = {
> >>>>> -        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap,
> >>>>> +        .flags = XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >>>>> +                 XEN_DOMCTL_CDF_evtchn_fifo,
> >>>>>          .max_evtchn_port = -1,
> >>>>>          .max_grant_frames = gnttab_dom0_frames(),
> >>>>>          .max_maptrack_frames = -1,
> >>>>> --- a/xen/arch/x86/setup.c
> >>>>> +++ b/xen/arch/x86/setup.c
> >>>>> @@ -738,7 +738,8 @@ static struct domain *__init create_dom0(const module_t *image,
> >>>>>                                           const char *loader)
> >>>>>  {
> >>>>>      struct xen_domctl_createdomain dom0_cfg = {
> >>>>> -        .flags = IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0,
> >>>>> +        .flags = XEN_DOMCTL_CDF_evtchn_fifo |
> >>>>> +                 (IS_ENABLED(CONFIG_TBOOT) ? XEN_DOMCTL_CDF_s3_integrity : 0),
> >>>>>          .max_evtchn_port = -1,
> >>>>>          .max_grant_frames = -1,
> >>>>>          .max_maptrack_frames = -1,
> >>>>> --- a/xen/common/domain.c
> >>>>> +++ b/xen/common/domain.c
> >>>>> @@ -307,7 +307,7 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
> >>>>>           ~(XEN_DOMCTL_CDF_hvm | XEN_DOMCTL_CDF_hap |
> >>>>>             XEN_DOMCTL_CDF_s3_integrity | XEN_DOMCTL_CDF_oos_off |
> >>>>>             XEN_DOMCTL_CDF_xs_domain | XEN_DOMCTL_CDF_iommu |
> >>>>> -           XEN_DOMCTL_CDF_nested_virt) )
> >>>>> +           XEN_DOMCTL_CDF_nested_virt | XEN_DOMCTL_CDF_evtchn_fifo) )
> >>>>>      {
> >>>>>          dprintk(XENLOG_INFO, "Unknown CDF flags %#x\n", config->flags);
> >>>>>          return -EINVAL;
> >>>>
> >>>> All of the hunks above point out a scalability issue if we were to
> >>>> follow this route for even just a fair part of new sub-ops, and I
> >>>> suppose you've noticed this with the next patch presumably touching
> >>>> all the same places again.
> >>>
> >>> Indeed. This solution works for now but is probably not what we want in the long run. Same goes
> for
> >> the current way we control viridian features via an HVM param. It is good enough for now IMO since
> >> domctl is not a stable interface. Any ideas about how we might implement a better interface in the
> >> longer term?
> >>
> >> While it has other downsides, Jürgen's proposal doesn't have any
> >> similar scalability issue afaics. Another possible model would
> >> seem to be to key new hypercalls to hypervisor CPUID leaf bits,
> >> and derive their availability from a guest's CPUID policy. Of
> >> course this won't work when needing to retrofit guarding like
> >> you want to do here.
> >
> > Ok, I'll take a look at hypfs as an immediate solution, if that's preferred.
> 
> Well, as said - there are also downsides with that approach. I'm
> not convinced it should be just the three of us to determine which
> one is better overall, the more that you don't seem to be convinced
> anyway (and I'm not going to claim I am, in either direction). It
> is my understanding that based on the claimed need for this (see
> above), this may become very widely used functionality, and if so
> we want to make sure we won't regret the route we went.
> 

Agreed, but we don't need the final top-to-bottom solution now. The only part of this that introduces a stable interface is the libxl part, and I think the 'feature' bitmap is reasonably future-proof (being modelled on the viridian feature bitmap, which has been extended over several releases since it was introduced).

> For starters, just to get a better overall picture, besides the two
> overrides you add here, do you have any plans to retroactively add
> further controls for past ABI additions?

I don't have any specific plans. The two I deal with here are the causes of observed differences in guest behaviour, one being an actual crash and the other affecting PV driver behaviour which may or may not be the cause of other crashes... but still something we need to have control over.

> I don't suppose you have
> plans to consistently do this for all of them? I ask because I
> could see us going the route you've chosen for very few retroactive
> additions of overrides, but use the CPUID approach (provided
> there's a suitable Arm counterpart, and ideally some understanding
> whether something similar could also be found for at least the two
> planned new ports) for ABI additions going forward.
> 

That sounds reasonable and I'd be perfectly happy to rip and replace the use of domain creation flags when we have something 'better'. We don't need to wait for that though... this series is perfectly functional and doesn't take us down any one-way streets (below libxl) AFAICT.

  Paul

> Jan



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04  8:22             ` Paul Durrant
@ 2020-12-04  9:43               ` Jan Beulich
  2020-12-04 11:45                 ` Julien Grall
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2020-12-04  9:43 UTC (permalink / raw)
  To: paul
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Julien Grall',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

On 04.12.2020 09:22, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 04 December 2020 07:53
>>
>> On 03.12.2020 18:07, Paul Durrant wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 03 December 2020 15:57
>>>>
>>>> ... this sounds to me more like workarounds for buggy guests than
>>>> functionality the hypervisor _needs_ to have. (I can appreciate
>>>> the specific case here for the specific scenario you provide as
>>>> an exception.)
>>>
>>> If we want to have a hypervisor that can be used in a cloud environment
>>> then Xen absolutely needs this capability.
>>
>> As per above you can conclude that I'm still struggling to see the
>> "why" part here.
>>
> 
> Imagine you are a customer. You boot your OS and everything is just fine... you run your workload and all is good. You then shut down your VM and re-start it. Now it starts to crash. Who are you going to blame? You did nothing to your OS or application s/w, so you are going to blame the cloud provider of course.

That's a situation OSes are in all the time. Buggy applications may
stop working on newer OS versions. It's still the application that's
in need of updating then. I guess OSes may choose to work around
some very common applications' bugs, but I'd then wonder on what
basis "very common" gets established. I dislike the underlying
asymmetry / inconsistency (if not unfairness) of such a model,
despite seeing that there may be business reasons leading people to
think they want something like this.

> Now imagine you are the cloud provider, running Xen. What you did was start to upgrade your hosts from an older version of Xen to a newer version of Xen, to pick up various bug fixes and make sure you are running a version that is within the security support envelope. You identify that your customer's problem is a bug in their OS that was latent on the old version of the hypervisor but is now manifesting on the new one because it has buggy support for a hypercall that was added between the two versions. How are you going to fix this issue, and get your customer up and running again? Of course you'd like your customer to upgrade their OS, but they can't even boot it to do that. You really need a solution that can restore the old VM environment, at least temporarily, for that customer.

Boot the guest on a not-yet-upgraded host again, to update its kernel?

>>>> While it has other downsides, Jürgen's proposal doesn't have any
>>>> similar scalability issue afaics. Another possible model would
>>>> seem to be to key new hypercalls to hypervisor CPUID leaf bits,
>>>> and derive their availability from a guest's CPUID policy. Of
>>>> course this won't work when needing to retrofit guarding like
>>>> you want to do here.
>>>
>>> Ok, I'll take a look at hypfs as an immediate solution, if that's preferred.
>>
>> Well, as said - there are also downsides with that approach. I'm
>> not convinced it should be just the three of us to determine which
>> one is better overall, the more that you don't seem to be convinced
>> anyway (and I'm not going to claim I am, in either direction). It
>> is my understanding that based on the claimed need for this (see
>> above), this may become very widely used functionality, and if so
>> we want to make sure we won't regret the route we went.
>>
> 
> Agreed, but we don't need the final top-to-bottom solution now. The only part of this that introduces a stable interface is the libxl part, and I think the 'feature' bitmap is reasonably future-proof (being modelled on the viridian feature bitmap, which has been extended over several releases since it was introduced).
> 
>> For starters, just to get a better overall picture, besides the two
>> overrides you add here, do you have any plans to retroactively add
>> further controls for past ABI additions?
> 
> I don't have any specific plans. The two I deal with here are the causes of observed differences in guest behaviour, one being an actual crash and the other affecting PV driver behaviour which may or may not be the cause of other crashes... but still something we need to have control over.

I.e. basically an arbitrary choice. This is again a symmetry /
consistency / fairness issue. If a guest admin pesters his cloud provider
enough, they may get the cloud provider to make hypervisor adjustments.
If another guest admin simply does the technically correct thing and
works out what needs fixing in the kernel, they may then figure they
simply didn't shout loud enough to avoid themselves needing to do much
work.

Anyway - I guess I'm about the only one seeing this from a purely
technical, non-business perspective. I suppose I'll continue looking at
the code for the purpose of finding issues (hopefully there aren't going
to be any), but will stay away from acking any parts of this. Whoever
agrees with the underlying concept can then provide their acks. (As said
elsewhere, in the particular case of the kexec issue with FIFO event
channels here, I could maybe see that as halfway acceptable
justification, albeit I did voice my concerns in that regard as well.
It's still working around a shortcoming in guest software.)

Nevertheless I'd like to gain clarity of what the plans are with future
ABI additions.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04  9:43               ` Jan Beulich
@ 2020-12-04 11:45                 ` Julien Grall
  2020-12-04 11:52                   ` Andrew Cooper
  2020-12-07  9:17                   ` Jan Beulich
  0 siblings, 2 replies; 28+ messages in thread
From: Julien Grall @ 2020-12-04 11:45 UTC (permalink / raw)
  To: Jan Beulich, paul
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel

Hi,

I haven't looked at the series yet. Just adding some thoughts on why one 
would want such an option.

On 04/12/2020 09:43, Jan Beulich wrote:
> On 04.12.2020 09:22, Paul Durrant wrote:
>>> From: Jan Beulich <jbeulich@suse.com>
>>> Sent: 04 December 2020 07:53
>>>
>>> On 03.12.2020 18:07, Paul Durrant wrote:
>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>> Sent: 03 December 2020 15:57
>>>>>
>>>>> ... this sounds to me more like workarounds for buggy guests than
>>>>> functionality the hypervisor _needs_ to have. (I can appreciate
>>>>> the specific case here for the specific scenario you provide as
>>>>> an exception.)
>>>>
>>>> If we want to have a hypervisor that can be used in a cloud environment
>>>> then Xen absolutely needs this capability.
>>>
>>> As per above you can conclude that I'm still struggling to see the
>>> "why" part here.
>>>
>>
>> Imagine you are a customer. You boot your OS and everything is just fine... you run your workload and all is good. You then shut down your VM and re-start it. Now it starts to crash. Who are you going to blame? You did nothing to your OS or application s/w, so you are going to blame the cloud provider of course.
> 
> That's a situation OSes are in all the time. Buggy applications may
> stop working on newer OS versions. It's still the application that's
> in need of updating then. I guess OSes may choose to work around
> some very common applications' bugs, but I'd then wonder on what
> basis "very common" gets established. I dislike the underlying
> asymmetry / inconsistency (if not unfairness) of such a model,
> despite seeing that there may be business reasons leading people to
> think they want something like this.

The discussion seems to be geared towards buggy guests so far. However, 
this is not the only reason that one may want to avoid exposing some 
features:

    1) In light of recent security issues (such as XSA-343), a knob to 
disable FIFO would be quite beneficial for vendors that don't need the 
feature.

    2) Fleet management purpose. You may have a fleet with multiple 
versions of Xen. You don't want your customer to start relying on 
features that may not be available on all the hosts; otherwise it 
complicates guest placement.

FAOD, I am sure there might be other features that need to be disabled. 
But we have to start somewhere :).

> 
>> Now imagine you are the cloud provider, running Xen. What you did was start to upgrade your hosts from an older version of Xen to a newer version of Xen, to pick up various bug fixes and make sure you are running a version that is within the security support envelope. You identify that your customer's problem is a bug in their OS that was latent on the old version of the hypervisor but is now manifesting on the new one because it has buggy support for a hypercall that was added between the two versions. How are you going to fix this issue, and get your customer up and running again? Of course you'd like your customer to upgrade their OS, but they can't even boot it to do that. You really need a solution that can restore the old VM environment, at least temporarily, for that customer.
> 
> Boot the guest on a not-yet-upgraded host again, to update its kernel?

You are making the assumption that the customer would have the choice to 
target a specific version of Xen. This may be undesirable for a cloud 
provider as suddenly your customer may want to stick to the old version 
of Xen.

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04 11:45                 ` Julien Grall
@ 2020-12-04 11:52                   ` Andrew Cooper
  2020-12-04 17:41                     ` Stefano Stabellini
  2020-12-07  9:17                   ` Jan Beulich
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Cooper @ 2020-12-04 11:52 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich, paul
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'George Dunlap',
	'Stefano Stabellini', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

On 04/12/2020 11:45, Julien Grall wrote:
> Hi,
>
> I haven't looked at the series yet. Just adding some thoughts on why
> one would want such option.
>
> On 04/12/2020 09:43, Jan Beulich wrote:
>> On 04.12.2020 09:22, Paul Durrant wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 04 December 2020 07:53
>>>>
>>>> On 03.12.2020 18:07, Paul Durrant wrote:
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 03 December 2020 15:57
>>>>>>
>>>>>> ... this sounds to me more like workarounds for buggy guests than
>>>>>> functionality the hypervisor _needs_ to have. (I can appreciate
>>>>>> the specific case here for the specific scenario you provide as
>>>>>> an exception.)
>>>>>
>>>>> If we want to have a hypervisor that can be used in a cloud
>>>>> environment
>>>>> then Xen absolutely needs this capability.
>>>>
>>>> As per above you can conclude that I'm still struggling to see the
>>>> "why" part here.
>>>>
>>>
>>> Imagine you are a customer. You boot your OS and everything is just
>>> fine... you run your workload and all is good. You then shut down
>>> your VM and re-start it. Now it starts to crash. Who are you going
>>> to blame? You did nothing to your OS or application s/w, so you are
>>> going to blame the cloud provider of course.
>>
>> That's a situation OSes are in all the time. Buggy applications may
>> stop working on newer OS versions. It's still the application that's
>> in need of updating then. I guess OSes may choose to work around
>> some very common applications' bugs, but I'd then wonder on what
>> basis "very common" gets established. I dislike the underlying
>> asymmetry / inconsistency (if not unfairness) of such a model,
>> despite seeing that there may be business reasons leading people to
>> think they want something like this.
>
> The discussion seems to be geared towards buggy guests so far. However,
> this is not the only reason one may want to avoid exposing some
> features:
>
>    1) From the recent security issues (such as XSA-343), a knob to
> disable FIFO would be quite beneficial for vendors that don't need
> the feature.
>
>    2) Fleet management purposes. You may have a fleet with multiple
> versions of Xen. You don't want your customer to start relying on
> features that may not be available on all the hosts; otherwise it
> complicates guest placement.
>
> FAOD, I am sure there might be other features that need to be
> disabled. But we have to start somewhere :).

Absolutely top of the list, importance wise, is so we can test different
configurations, without needing to rebuild the hypervisor (and to a
lesser extent, without having to reboot).

It is a mistake that events/grants/etc were ever available unilaterally
in HVM guests.  This is definitely a step in the right direction (but I
thought it would be too rude to ask Paul to make all of those CDF flags
at once).
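The gating mechanism under discussion can be modelled in miniature: the toolstack fixes a set of create-time (CDF) flags when the domain is built, and the relevant ABI entry point consults that immutable mask, so a guest denied the FIFO ABI simply falls back to the 2-level one. This is an illustrative sketch, not Xen's actual code; the flag names echo the patch titles, and everything else (bit positions, the error value) is invented.

```python
# Toy model of domain-create (CDF) feature flags: chosen once at creation,
# then consulted by ABI operations. Real Xen structures differ.

CDF_EVTCHN_FIFO = 1 << 0    # allow the FIFO event-channel ABI
CDF_EVTCHN_UPCALL = 1 << 1  # allow the per-vCPU upcall feature

class Domain:
    def __init__(self, options: int):
        self.options = options  # fixed for the domain's lifetime

def evtchn_init_control(d: Domain) -> int:
    """Guest attempt to switch to the FIFO event-channel ABI."""
    if not d.options & CDF_EVTCHN_FIFO:
        return -95  # -EOPNOTSUPP: guest falls back to the 2-level ABI
    return 0        # proceed with FIFO setup

legacy = Domain(options=0)                # feature withheld
modern = Domain(options=CDF_EVTCHN_FIFO)  # feature exposed

print(evtchn_init_control(legacy))  # -95
print(evtchn_init_control(modern))  # 0
```

The point Andrew makes falls out of the model: flipping one bit per domain changes the tested configuration, with no hypervisor rebuild or host reboot involved.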

~Andrew


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04 11:52                   ` Andrew Cooper
@ 2020-12-04 17:41                     ` Stefano Stabellini
  2020-12-04 17:45                       ` Andrew Cooper
  0 siblings, 1 reply; 28+ messages in thread
From: Stefano Stabellini @ 2020-12-04 17:41 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Julien Grall, Jan Beulich, paul, 'Paul Durrant',
	'Eslam Elnikety', 'Ian Jackson',
	'Wei Liu', 'Anthony PERARD',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel

On Fri, 4 Dec 2020, Andrew Cooper wrote:
> On 04/12/2020 11:45, Julien Grall wrote:
> > Hi,
> >
> > I haven't looked at the series yet. Just adding some thoughts on why
> > one would want such option.
> >
> > On 04/12/2020 09:43, Jan Beulich wrote:
> >> On 04.12.2020 09:22, Paul Durrant wrote:
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: 04 December 2020 07:53
> >>>>
> >>>> On 03.12.2020 18:07, Paul Durrant wrote:
> >>>>>> From: Jan Beulich <jbeulich@suse.com>
> >>>>>> Sent: 03 December 2020 15:57
> >>>>>>
> >>>>>> ... this sounds to me more like workarounds for buggy guests than
> >>>>>> functionality the hypervisor _needs_ to have. (I can appreciate
> >>>>>> the specific case here for the specific scenario you provide as
> >>>>>> an exception.)
> >>>>>
> >>>>> If we want to have a hypervisor that can be used in a cloud
> >>>>> environment
> >>>>> then Xen absolutely needs this capability.
> >>>>
> >>>> As per above you can conclude that I'm still struggling to see the
> >>>> "why" part here.
> >>>>
> >>>
> >>> Imagine you are a customer. You boot your OS and everything is just
> >>> fine... you run your workload and all is good. You then shut down
> >>> your VM and re-start it. Now it starts to crash. Who are you going
> >>> to blame? You did nothing to your OS or application s/w, so you are
> >>> going to blame the cloud provider of course.
> >>
> >> That's a situation OSes are in all the time. Buggy applications may
> >> stop working on newer OS versions. It's still the application that's
> >> in need of updating then. I guess OSes may choose to work around
> >> some very common applications' bugs, but I'd then wonder on what
> >> basis "very common" gets established. I dislike the underlying
> >> asymmetry / inconsistency (if not unfairness) of such a model,
> >> despite seeing that there may be business reasons leading people to
> >> think they want something like this.
> >
> > The discussion seems to be geared towards buggy guests so far. However,
> > this is not the only reason one may want to avoid exposing some
> > features:
> >
> >    1) From the recent security issues (such as XSA-343), a knob to
> > disable FIFO would be quite beneficial for vendors that don't need
> > the feature.
> >
> >    2) Fleet management purposes. You may have a fleet with multiple
> > versions of Xen. You don't want your customer to start relying on
> > features that may not be available on all the hosts; otherwise it
> > complicates guest placement.
> >
> > FAOD, I am sure there might be other features that need to be
> > disabled. But we have to start somewhere :).
> 
> Absolutely top of the list, importance wise, is so we can test different
> configurations, without needing to rebuild the hypervisor (and to a
> lesser extent, without having to reboot).
> 
> It is a mistake that events/grants/etc were ever available unilaterally
> in HVM guests.  This is definitely a step in the right direction (but I
> thought it would be too rude to ask Paul to make all of those CDF flags
> at once).

+1

For FuSa we'll need to be able to disable them at some point soon.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04 17:41                     ` Stefano Stabellini
@ 2020-12-04 17:45                       ` Andrew Cooper
  2020-12-04 18:33                         ` Durrant, Paul
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Cooper @ 2020-12-04 17:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Julien Grall, Jan Beulich, paul, 'Paul Durrant',
	'Eslam Elnikety', 'Ian Jackson',
	'Wei Liu', 'Anthony PERARD',
	'George Dunlap', 'Christian Lindig',
	'David Scott', 'Volodymyr Babchuk',
	'Roger Pau Monné',
	xen-devel

On 04/12/2020 17:41, Stefano Stabellini wrote:
>>> FAOD, I am sure there might be other features that need to be
>>> disabled. But we have to start somewhere :).
>> Absolutely top of the list, importance wise, is so we can test different
>> configurations, without needing to rebuild the hypervisor (and to a
>> lesser extent, without having to reboot).
>>
>> It is a mistake that events/grants/etc were ever available unilaterally
>> in HVM guests.  This is definitely a step in the right direction (but I
>> thought it would be too rude to ask Paul to make all of those CDF flags
>> at once).
> +1
>
> For FuSa we'll need to be able to disable them at some point soon.

FWIW, I have a proper plan for this stuff, which starts alongside the
fixed toolstack ABI, and will cover all aspects of optional
functionality in a domain.

~Andrew


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04 17:45                       ` Andrew Cooper
@ 2020-12-04 18:33                         ` Durrant, Paul
  2020-12-05  1:34                           ` Stefano Stabellini
  0 siblings, 1 reply; 28+ messages in thread
From: Durrant, Paul @ 2020-12-04 18:33 UTC (permalink / raw)
  To: Andrew Cooper, Stefano Stabellini
  Cc: Julien Grall, Jan Beulich, paul, Elnikety, Eslam,
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'George Dunlap',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel

> -----Original Message-----
> From: Andrew Cooper <andrew.cooper3@citrix.com>
> Sent: 04 December 2020 17:45
> To: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien@xen.org>; Jan Beulich <jbeulich@suse.com>; paul@xen.org; Durrant, Paul
> <pdurrant@amazon.co.uk>; Elnikety, Eslam <elnikety@amazon.com>; 'Ian Jackson' <iwj@xenproject.org>;
> 'Wei Liu' <wl@xen.org>; 'Anthony PERARD' <anthony.perard@citrix.com>; 'George Dunlap'
> <george.dunlap@citrix.com>; 'Christian Lindig' <christian.lindig@citrix.com>; 'David Scott'
> <dave@recoil.org>; 'Volodymyr Babchuk' <Volodymyr_Babchuk@epam.com>; 'Roger Pau Monné'
> <roger.pau@citrix.com>; xen-devel@lists.xenproject.org
> Subject: RE: [EXTERNAL] [PATCH v5 1/4] domctl: introduce a new domain create flag,
> XEN_DOMCTL_CDF_evtchn_fifo, ...
> 
> On 04/12/2020 17:41, Stefano Stabellini wrote:
> >>> FAOD, I am sure there might be other features that need to be
> >>> disabled. But we have to start somewhere :).
> >> Absolutely top of the list, importance wise, is so we can test different
> >> configurations, without needing to rebuild the hypervisor (and to a
> >> lesser extent, without having to reboot).
> >>
> >> It is a mistake that events/grants/etc were ever available unilaterally
> >> in HVM guests.  This is definitely a step in the right direction (but I
> >> thought it would be too rude to ask Paul to make all of those CDF flags
> >> at once).
> > +1
> >
> > For FuSa we'll need to be able to disable them at some point soon.
> 
> FWIW, I have a proper plan for this stuff, which starts alongside the
> fixed toolstack ABI, and will cover all aspects of optional
> functionality in a domain.
> 

OK. Can we live with this series as it stands until that point? There is some urgency to get at least these two things fixed.

  Paul

> ~Andrew

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04 18:33                         ` Durrant, Paul
@ 2020-12-05  1:34                           ` Stefano Stabellini
  0 siblings, 0 replies; 28+ messages in thread
From: Stefano Stabellini @ 2020-12-05  1:34 UTC (permalink / raw)
  To: Durrant, Paul
  Cc: Andrew Cooper, Stefano Stabellini, Julien Grall, Jan Beulich,
	paul, Elnikety, Eslam, 'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'George Dunlap',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel

On Fri, 4 Dec 2020, Durrant, Paul wrote:
> > -----Original Message-----
> > From: Andrew Cooper <andrew.cooper3@citrix.com>
> > Sent: 04 December 2020 17:45
> > To: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Julien Grall <julien@xen.org>; Jan Beulich <jbeulich@suse.com>; paul@xen.org; Durrant, Paul
> > <pdurrant@amazon.co.uk>; Elnikety, Eslam <elnikety@amazon.com>; 'Ian Jackson' <iwj@xenproject.org>;
> > 'Wei Liu' <wl@xen.org>; 'Anthony PERARD' <anthony.perard@citrix.com>; 'George Dunlap'
> > <george.dunlap@citrix.com>; 'Christian Lindig' <christian.lindig@citrix.com>; 'David Scott'
> > <dave@recoil.org>; 'Volodymyr Babchuk' <Volodymyr_Babchuk@epam.com>; 'Roger Pau Monné'
> > <roger.pau@citrix.com>; xen-devel@lists.xenproject.org
> > Subject: RE: [EXTERNAL] [PATCH v5 1/4] domctl: introduce a new domain create flag,
> > XEN_DOMCTL_CDF_evtchn_fifo, ...
> > 
> > On 04/12/2020 17:41, Stefano Stabellini wrote:
> > >>> FAOD, I am sure there might be other features that need to be
> > >>> disabled. But we have to start somewhere :).
> > >> Absolutely top of the list, importance wise, is so we can test different
> > >> configurations, without needing to rebuild the hypervisor (and to a
> > >> lesser extent, without having to reboot).
> > >>
> > >> It is a mistake that events/grants/etc were ever available unilaterally
> > >> in HVM guests.  This is definitely a step in the right direction (but I
> > >> thought it would be too rude to ask Paul to make all of those CDF flags
> > >> at once).
> > > +1
> > >
> > > For FuSa we'll need to be able to disable them at some point soon.
> > 
> > FWIW, I have a proper plan for this stuff, which starts alongside the
> > fixed toolstack ABI, and will cover all aspects of optional
> > functionality in a domain.
> > 
> 
> OK. Can we live with this series as it stands until that point? There is some urgency to get at least these two things fixed.

I am happy to take things one step at a time, and this is a good step
forward.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-04 11:45                 ` Julien Grall
  2020-12-04 11:52                   ` Andrew Cooper
@ 2020-12-07  9:17                   ` Jan Beulich
  2020-12-07 10:04                     ` Julien Grall
  1 sibling, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2020-12-07  9:17 UTC (permalink / raw)
  To: Julien Grall
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel, paul

On 04.12.2020 12:45, Julien Grall wrote:
> Hi,
> 
> I haven't looked at the series yet. Just adding some thoughts on why one 
> would want such option.
> 
> On 04/12/2020 09:43, Jan Beulich wrote:
>> On 04.12.2020 09:22, Paul Durrant wrote:
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: 04 December 2020 07:53
>>>>
>>>> On 03.12.2020 18:07, Paul Durrant wrote:
>>>>>> From: Jan Beulich <jbeulich@suse.com>
>>>>>> Sent: 03 December 2020 15:57
>>>>>>
>>>>>> ... this sounds to me more like workarounds for buggy guests than
>>>>>> functionality the hypervisor _needs_ to have. (I can appreciate
>>>>>> the specific case here for the specific scenario you provide as
>>>>>> an exception.)
>>>>>
>>>>> If we want to have a hypervisor that can be used in a cloud environment
>>>>> then Xen absolutely needs this capability.
>>>>
>>>> As per above you can conclude that I'm still struggling to see the
>>>> "why" part here.
>>>>
>>>
>>> Imagine you are a customer. You boot your OS and everything is just fine... you run your workload and all is good. You then shut down your VM and re-start it. Now it starts to crash. Who are you going to blame? You did nothing to your OS or application s/w, so you are going to blame the cloud provider of course.
>>
>> That's a situation OSes are in all the time. Buggy applications may
>> stop working on newer OS versions. It's still the application that's
>> in need of updating then. I guess OSes may choose to work around
>> some very common applications' bugs, but I'd then wonder on what
>> basis "very common" gets established. I dislike the underlying
>> asymmetry / inconsistency (if not unfairness) of such a model,
>> despite seeing that there may be business reasons leading people to
>> think they want something like this.
> 
> The discussion seems to be geared towards buggy guests so far. However,
> this is not the only reason one may want to avoid exposing some
> features:
>
>     1) From the recent security issues (such as XSA-343), a knob to
> disable FIFO would be quite beneficial for vendors that don't need the
> feature.

Except that this wouldn't have been suitable as a during-embargo
mitigation, given its observability by guests.

>     2) Fleet management purposes. You may have a fleet with multiple
> versions of Xen. You don't want your customer to start relying on
> features that may not be available on all the hosts; otherwise it
> complicates guest placement.

Guests incapable of running on older Xen are a problem in this regard
anyway, aren't they? And if they are capable, I don't see what
you're referring to.

> FAOD, I am sure there might be other features that need to be disabled. 
> But we have to start somewhere :).

If there is such a need, then yes, sure. But shouldn't we at least
gain rough agreement on how the future is going to look with
this? IOW, have in hand some at least roughly agreed criteria by
which we could decide which new ABI additions will need some form
of override going forward (also allowing us to judge which prior
additions may want to gain overrides in a retroactive fashion, or
in fact should have had them from the beginning)?

>>> Now imagine you are the cloud provider, running Xen. What you did was start to upgrade your hosts from an older version of Xen to a newer version of Xen, to pick up various bug fixes and make sure you are running a version that is within the security support envelope. You identify that your customer's problem is a bug in their OS that was latent on the old version of the hypervisor but is now manifesting on the new one because it has buggy support for a hypercall that was added between the two versions. How are you going to fix this issue, and get your customer up and running again? Of course you'd like your customer to upgrade their OS, but they can't even boot it to do that. You really need a solution that can restore the old VM environment, at least temporarily, for that customer.
>>
>> Boot the guest on a not-yet-upgraded host again, to update its kernel?
> 
> You are making the assumption that the customer would have the choice to 
> target a specific versions of Xen. This may be undesirable for a cloud 
> provider as suddenly your customer may want to stick on the old version 
> of Xen.

I've gone from you saying "You really need a solution that can restore
the old VM environment, at least temporarily, for that customer." The
"temporarily" to me implies that it is at least an option to tie a
certain guest to a certain Xen version for in-guest upgrading purposes.
If the deal with the customer doesn't include running on a certain Xen
version, I don't see how this could have non-temporary consequences to
the cloud provider.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-07  9:17                   ` Jan Beulich
@ 2020-12-07 10:04                     ` Julien Grall
  2020-12-07 10:07                       ` Julien Grall
  2020-12-07 10:15                       ` Jan Beulich
  0 siblings, 2 replies; 28+ messages in thread
From: Julien Grall @ 2020-12-07 10:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel, paul

Hi Jan,

On 07/12/2020 09:17, Jan Beulich wrote:
> On 04.12.2020 12:45, Julien Grall wrote:
>>      1) From the recent security issues (such as XSA-343), a knob to
>> disable FIFO would be quite beneficials for vendors that don't need the
>> feature.
> 
> Except that this wouldn't have been suitable as a during-embargo
> mitigation, given its observability by guests.

I think you are missing my point here. If that knob had been available 
before the security event, a vendor might already have decided to use it 
to disable the affected feature.

This vendor would not have needed to spend time hotpatching or 
rebooting its entire fleet.

>>      2) Fleet management purposes. You may have a fleet with multiple
>> versions of Xen. You don't want your customer to start relying on
>> features that may not be available on all the hosts; otherwise it
>> complicates guest placement.
> 
>> Guests incapable of running on older Xen are a problem in this regard
> anyway, aren't they? And if they are capable, I don't see what
> you're referring to.

It is not about guests that cannot run on older Xen. It is more about 
allowing a guest to use a feature that is not yet widely available in 
the fleet (you don't update all the hosts in a night...).

Imagine the guest owner really wants to use feature A, which is only 
available on the new Xen version. The owner may have noticed the feature 
on an existing running guest and would like to create a new guest that 
uses the feature.

It might be that there is no capacity available on hosts running the 
new Xen version, so the guest may start on a host running an older one.

I can assure you that the owner will contact the customer service of the 
cloud provider to ask why the feature is not available on the new guest.

With a knob available, a cloud provider has more flexibility as to when 
the feature can be exposed.

>> FAOD, I am sure there might be other features that need to be disabled.
>> But we have to start somewhere :).
> 
> If there is such a need, then yes, sure. But shouldn't we at least
> gain rough agreement on how the future is going to look with
> this? IOW, have in hand some at least roughly agreed criteria by
> which we could decide which new ABI additions will need some form
> of override going forward (also allowing us to judge which prior
> additions may want to gain overrides in a retroactive fashion, or
> in fact should have had them from the beginning)?

I think the answer is quite straightforward: Anything exposed to the 
non-privileged (I include stubdomain) guest should have a knob to 
disable it.

> 
>>>> Now imagine you are the cloud provider, running Xen. What you did was start to upgrade your hosts from an older version of Xen to a newer version of Xen, to pick up various bug fixes and make sure you are running a version that is within the security support envelope. You identify that your customer's problem is a bug in their OS that was latent on the old version of the hypervisor but is now manifesting on the new one because it has buggy support for a hypercall that was added between the two versions. How are you going to fix this issue, and get your customer up and running again? Of course you'd like your customer to upgrade their OS, but they can't even boot it to do that. You really need a solution that can restore the old VM environment, at least temporarily, for that customer.
>>>
>>> Boot the guest on a not-yet-upgraded host again, to update its kernel?
>>
>> You are making the assumption that the customer would have the choice to
>> target a specific versions of Xen. This may be undesirable for a cloud
>> provider as suddenly your customer may want to stick on the old version
>> of Xen.
> 
> I've gone from you saying "You really need a solution that can restore
> the old VM environment, at least temporarily, for that customer." The
> "temporarily" to me implies that it is at least an option to tie a
> certain guest to a certain Xen version for in-guest upgrading purposes.
>>
> If the deal with the customer doesn't include running on a certain Xen
> version, I don't see how this could have non-temporary consequences to
> the cloud provider.

I think by "you", you mean Paul and not me?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-07 10:04                     ` Julien Grall
@ 2020-12-07 10:07                       ` Julien Grall
  2020-12-07 10:15                       ` Jan Beulich
  1 sibling, 0 replies; 28+ messages in thread
From: Julien Grall @ 2020-12-07 10:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel, paul



On 07/12/2020 10:04, Julien Grall wrote:
> Hi Jan,
> 
> On 07/12/2020 09:17, Jan Beulich wrote:
>> On 04.12.2020 12:45, Julien Grall wrote:
>>>      1) From the recent security issues (such as XSA-343), a knob to
>>> disable FIFO would be quite beneficial for vendors that don't need the
>>> feature.
>>
>> Except that this wouldn't have been suitable as a during-embargo
>> mitigation, given its observability by guests.
> 
> I think you are missing my point here. If that knob had been available
> before the security event, a vendor might already have decided to use
> it to disable the affected feature.
> 
> This vendor would not have needed to spend time hotpatching or
> rebooting its entire fleet.
> 
>>>      2) Fleet management purposes. You may have a fleet with multiple
>>> versions of Xen. You don't want your customer to start relying on
>>> features that may not be available on all the hosts; otherwise it
>>> complicates guest placement.
>>
>> Guests incapable of running on older Xen are a problem in this regard
>> anyway, aren't they? And if they are capable, I don't see what
>> you're referring to.
> 
> It is not about guests that cannot run on older Xen. It is more about 
> allowing a guest to use a feature that is not yet widely available in 
> the fleet (you don't update all the hosts in a night...).
> 
> Imagine the guest owner really wants to use feature A, which is only
> available on the new Xen version. The owner may have noticed the
> feature on an existing running guest and would like to create a new
> guest that uses the feature.
> 
> It might be that there is no capacity available on hosts running the
> new Xen version, so the guest may start on a host running an older one.
> 
> I can assure you that the owner will contact the customer service of the 
> cloud provider to ask why the feature is not available on the new guest.
> 
> With a knob available, a cloud provider has more flexibility as to when
> the feature can be exposed.
> 
>>> FAOD, I am sure there might be other features that need to be disabled.
>>> But we have to start somewhere :).
>>
>> If there is such a need, then yes, sure. But shouldn't we at least
>> gain rough agreement on how the future is going to look with
>> this? IOW, have in hand some at least roughly agreed criteria by
>> which we could decide which new ABI additions will need some form
>> of override going forward (also allowing us to judge which prior
>> additions may want to gain overrides in a retroactive fashion, or
>> in fact should have had them from the beginning)?
> 
> I think the answer is quite straightforward: Anything exposed to the 
> non-privileged (I include stubdomain) guest should have a knob to 

s/include/exclude/

> disable it.
> 
>>
>>>>> Now imagine you are the cloud provider, running Xen. What you did 
>>>>> was start to upgrade your hosts from an older version of Xen to a 
>>>>> newer version of Xen, to pick up various bug fixes and make sure 
>>>>> you are running a version that is within the security support 
>>>>> envelope. You identify that your customer's problem is a bug in 
>>>>> their OS that was latent on the old version of the hypervisor but 
>>>>> is now manifesting on the new one because it has buggy support for 
>>>>> a hypercall that was added between the two versions. How are you 
>>>>> going to fix this issue, and get your customer up and running 
>>>>> again? Of course you'd like your customer to upgrade their OS, but 
>>>>> they can't even boot it to do that. You really need a solution that 
>>>>> can restore the old VM environment, at least temporarily, for that 
>>>>> customer.
>>>>
>>>> Boot the guest on a not-yet-upgraded host again, to update its kernel?
>>>
>>> You are making the assumption that the customer would have the choice to
>>> target a specific versions of Xen. This may be undesirable for a cloud
>>> provider as suddenly your customer may want to stick on the old version
>>> of Xen.
>>
>> I've gone from you saying "You really need a solution that can restore
>> the old VM environment, at least temporarily, for that customer." The
>> "temporarily" to me implies that it is at least an option to tie a
>> certain guest to a certain Xen version for in-guest upgrading purposes.
>  >
>> If the deal with the customer doesn't include running on a certain Xen
>> version, I don't see how this could have non-temporary consequences to
>> the cloud provider.
> 
> I think by "you", you mean Paul and not me?
> 
> Cheers,
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-07 10:04                     ` Julien Grall
  2020-12-07 10:07                       ` Julien Grall
@ 2020-12-07 10:15                       ` Jan Beulich
  2020-12-07 10:23                         ` Durrant, Paul
  1 sibling, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2020-12-07 10:15 UTC (permalink / raw)
  To: Julien Grall
  Cc: 'Paul Durrant', 'Eslam Elnikety',
	'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel, paul

On 07.12.2020 11:04, Julien Grall wrote:
> On 07/12/2020 09:17, Jan Beulich wrote:
>> On 04.12.2020 12:45, Julien Grall wrote:
>>> You are making the assumption that the customer would have the choice to
>>> target a specific versions of Xen. This may be undesirable for a cloud
>>> provider as suddenly your customer may want to stick on the old version
>>> of Xen.
>>
>> I've gone from you saying "You really need a solution that can restore
>> the old VM environment, at least temporarily, for that customer." The
>> "temporarily" to me implies that it is at least an option to tie a
>> certain guest to a certain Xen version for in-guest upgrading purposes.
>>
>> If the deal with the customer doesn't include running on a certain Xen
>> version, I don't see how this could have non-temporary consequences to
>> the cloud provider.
> 
> I think by "you", you mean Paul and not me?

Oh, right, I didn't pay attention to who wrote that text. Sorry.

Jan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, ...
  2020-12-07 10:15                       ` Jan Beulich
@ 2020-12-07 10:23                         ` Durrant, Paul
  0 siblings, 0 replies; 28+ messages in thread
From: Durrant, Paul @ 2020-12-07 10:23 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall
  Cc: Elnikety, Eslam, 'Ian Jackson', 'Wei Liu',
	'Anthony PERARD', 'Andrew Cooper',
	'George Dunlap', 'Stefano Stabellini',
	'Christian Lindig', 'David Scott',
	'Volodymyr Babchuk', 'Roger Pau Monné',
	xen-devel, paul

> -----Original Message-----
[snip]
> >> I've gone from you saying "You really need a solution that can restore
> >> the old VM environment, at least temporarily, for that customer." The
> >> "temporarily" to me implies that it is at least an option to tie a
> >> certain guest to a certain Xen version for in-guest upgrading purposes.
> >  >

Not necessarily.

> >> If the deal with the customer doesn't include running on a certain Xen
> >> version, I don't see how this could have non-temporary consequences to
> >> the cloud provider.
> >
> > I think by "you", you mean Paul and not me?
> 
> Oh, right, I didn't pay attention to who wrote that text. Sorry.
> 

By temporary I mean that we may want to time-limit turning off a certain part of the ABI because, whilst it is problematic for some customers, it could (and is likely to) have measurable benefits to others. Thus you keep the feature off only until any customers running OSes that have problems have upgraded their installations.

  Paul


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-12-07 10:24 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-03 12:41 [PATCH v5 0/4] Xen ABI feature control Paul Durrant
2020-12-03 12:41 ` [PATCH v5 1/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_fifo, Paul Durrant
2020-12-03 15:22   ` Jan Beulich
2020-12-03 15:45     ` Paul Durrant
2020-12-03 15:56       ` Jan Beulich
2020-12-03 17:07         ` Paul Durrant
2020-12-03 17:19           ` Jürgen Groß
2020-12-03 18:44             ` Paul Durrant
2020-12-04  7:53           ` Jan Beulich
2020-12-04  8:22             ` Paul Durrant
2020-12-04  9:43               ` Jan Beulich
2020-12-04 11:45                 ` Julien Grall
2020-12-04 11:52                   ` Andrew Cooper
2020-12-04 17:41                     ` Stefano Stabellini
2020-12-04 17:45                       ` Andrew Cooper
2020-12-04 18:33                         ` Durrant, Paul
2020-12-05  1:34                           ` Stefano Stabellini
2020-12-07  9:17                   ` Jan Beulich
2020-12-07 10:04                     ` Julien Grall
2020-12-07 10:07                       ` Julien Grall
2020-12-07 10:15                       ` Jan Beulich
2020-12-07 10:23                         ` Durrant, Paul
2020-12-03 12:41 ` [PATCH v5 2/4] domctl: introduce a new domain create flag, XEN_DOMCTL_CDF_evtchn_upcall, Paul Durrant
2020-12-03 12:41 ` [PATCH v5 3/4] libxl: introduce a 'libxl_xen_abi_features' enumeration Paul Durrant
2020-12-03 12:41 ` [PATCH v5 4/4] xl: introduce a 'xen-abi-features' option Paul Durrant
2020-12-03 13:15 ` [PATCH v5 0/4] Xen ABI feature control Jürgen Groß
2020-12-03 13:51   ` Paul Durrant
2020-12-03 13:58     ` Jürgen Groß
