All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] AMD Nested Virt Preparation
@ 2024-02-06  1:20 George Dunlap
  2024-02-06  1:20 ` [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield George Dunlap
                   ` (5 more replies)
  0 siblings, 6 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

This series lays the groundwork for revamp of the AMD nested virt
functionality.  The first five patches are clean-ups or reorganizations
of existing code.  The final patch is the first major step towards making
the feature supportable: allowing Xen to refuse nested virt support if certain
hardware features are not present.

George Dunlap (6):
  xen/hvm: Convert hap_capabilities into a bitfield
  svm: Improve type of cpu_has_svm_feature
  xen/hvm: Move other hvm_function_table booleans into the caps bitfield
  nestedsvm: Disable TscRateMSR
  nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts
  svm/nestedvm: Introduce nested capabilities bit

 docs/designs/nested-svm-cpu-features.md      | 110 +++++++++++++++++++
 xen/arch/x86/cpu-policy.c                    |   3 +-
 xen/arch/x86/domain.c                        |   6 +
 xen/arch/x86/hvm/hvm.c                       |  14 +--
 xen/arch/x86/hvm/svm/nestedhvm.h             |   1 +
 xen/arch/x86/hvm/svm/nestedsvm.c             |  22 +++-
 xen/arch/x86/hvm/svm/svm.c                   |  65 +----------
 xen/arch/x86/hvm/vlapic.c                    |   4 +-
 xen/arch/x86/hvm/vmx/vmcs.c                  |   6 +-
 xen/arch/x86/hvm/vmx/vmx.c                   |  19 ++--
 xen/arch/x86/include/asm/hvm/hvm.h           |  47 ++++----
 xen/arch/x86/include/asm/hvm/svm/nestedsvm.h |   5 -
 xen/arch/x86/include/asm/hvm/svm/svm.h       |   5 +-
 13 files changed, 191 insertions(+), 116 deletions(-)
 create mode 100644 docs/designs/nested-svm-cpu-features.md

-- 
2.25.1



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield
  2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
@ 2024-02-06  1:20 ` George Dunlap
  2024-02-19 13:36   ` Jan Beulich
  2024-02-06  1:20 ` [PATCH 2/6] svm: Improve type of cpu_has_svm_feature George Dunlap
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian

hvm_function_table is an internal structure; rather than manually
|-ing and &-ing bits, just make it a boolean bitfield and let the
compiler do all the work.  This makes everything easier to read, and
presumably allows the compiler more flexibility in producing efficient
code.

No functional change intended.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
Questions:

* Should hap_superpage_2m really be set unconditionally, or should we
  condition it on cpu_has_svm_npt?

* Do we really need to "!!cpu_has_svm_npt"?  If so, wouldn't it be
  better to put the "!!"  in the #define, rather than requiring the
  user to know that it's needed?

CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/hvm.c             |  8 ++++----
 xen/arch/x86/hvm/svm/svm.c         |  4 ++--
 xen/arch/x86/hvm/vmx/vmcs.c        |  4 ++--
 xen/arch/x86/hvm/vmx/vmx.c         |  8 ++------
 xen/arch/x86/include/asm/hvm/hvm.h | 19 +++++++------------
 5 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e8deeb0222..ae9d4c4756 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -174,17 +174,17 @@ static int __init cf_check hvm_enable(void)
     {
         printk("HVM: Hardware Assisted Paging (HAP) detected\n");
         printk("HVM: HAP page sizes: 4kB");
-        if ( fns->hap_capabilities & HVM_HAP_SUPERPAGE_2MB )
+        if ( fns->caps.hap_superpage_2mb )
         {
             printk(", 2MB%s", opt_hap_2mb ? "" : " [disabled]");
             if ( !opt_hap_2mb )
-                hvm_funcs.hap_capabilities &= ~HVM_HAP_SUPERPAGE_2MB;
+                hvm_funcs.caps.hap_superpage_2mb = false;
         }
-        if ( fns->hap_capabilities & HVM_HAP_SUPERPAGE_1GB )
+        if ( fns->caps.hap_superpage_1gb )
         {
             printk(", 1GB%s", opt_hap_1gb ? "" : " [disabled]");
             if ( !opt_hap_1gb )
-                hvm_funcs.hap_capabilities &= ~HVM_HAP_SUPERPAGE_1GB;
+                hvm_funcs.caps.hap_superpage_1gb = false;
         }
         printk("\n");
     }
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 65f437e958..5741287355 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2581,8 +2581,8 @@ const struct hvm_function_table * __init start_svm(void)
         printk(" - none\n");
 
     svm_function_table.hap_supported = !!cpu_has_svm_npt;
-    svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
-        (cpu_has_page1gb ? HVM_HAP_SUPERPAGE_1GB : 0);
+    svm_function_table.caps.hap_superpage_2mb = true;
+    svm_function_table.caps.hap_superpage_1gb = cpu_has_page1gb;
 
     return &svm_function_table;
 }
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4fe1213855..53f9d81aa9 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -113,8 +113,8 @@ static int cf_check parse_ept_param_runtime(const char *s)
     int val;
 
     if ( !cpu_has_vmx_ept || !hvm_funcs.hap_supported ||
-         !(hvm_funcs.hap_capabilities &
-           (HVM_HAP_SUPERPAGE_2MB | HVM_HAP_SUPERPAGE_1GB)) )
+         !(hvm_funcs.caps.hap_superpage_2mb ||
+           hvm_funcs.caps.hap_superpage_1gb) )
     {
         printk("VMX: EPT not available, or not in use - ignoring\n");
         return 0;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 1500dca603..9cfc0140b4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2989,12 +2989,8 @@ const struct hvm_function_table * __init start_vmx(void)
         vmx_function_table.hap_supported = 1;
         vmx_function_table.altp2m_supported = 1;
 
-        vmx_function_table.hap_capabilities = 0;
-
-        if ( cpu_has_vmx_ept_2mb )
-            vmx_function_table.hap_capabilities |= HVM_HAP_SUPERPAGE_2MB;
-        if ( cpu_has_vmx_ept_1gb )
-            vmx_function_table.hap_capabilities |= HVM_HAP_SUPERPAGE_1GB;
+        vmx_function_table.caps.hap_superpage_2mb = cpu_has_vmx_ept_2mb;
+        vmx_function_table.caps.hap_superpage_1gb = cpu_has_vmx_ept_1gb;
 
         setup_ept_dump();
     }
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 985c1c14c6..f50476f50f 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -61,14 +61,6 @@ enum hvm_intblk {
 #define HVM_INTR_SHADOW_SMI    0x00000004
 #define HVM_INTR_SHADOW_NMI    0x00000008
 
-/*
- * HAP super page capabilities:
- * bit0: if 2MB super page is allowed?
- * bit1: if 1GB super page is allowed?
- */
-#define HVM_HAP_SUPERPAGE_2MB   0x00000001
-#define HVM_HAP_SUPERPAGE_1GB   0x00000002
-
 #define HVM_EVENT_VECTOR_UNSET    (-1)
 #define HVM_EVENT_VECTOR_UPDATING (-2)
 
@@ -104,8 +96,11 @@ struct hvm_function_table {
     /* Hardware virtual interrupt delivery enable? */
     bool virtual_intr_delivery_enabled;
 
-    /* Indicate HAP capabilities. */
-    unsigned int hap_capabilities;
+    struct {
+        /* Indicate HAP capabilities. */
+        bool hap_superpage_1gb:1,
+            hap_superpage_2mb:1;
+    } caps;
 
     /*
      * Initialise/destroy HVM domain/vcpu resources
@@ -402,8 +397,8 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
     (hvm_paging_enabled(v) && ((v)->arch.hvm.guest_cr[4] & X86_CR4_PKS))
 
 /* Can we use superpages in the HAP p2m table? */
-#define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
-#define hap_has_2mb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_2MB))
+#define hap_has_1gb hvm_funcs.caps.hap_superpage_1gb
+#define hap_has_2mb hvm_funcs.caps.hap_superpage_2mb
 
 #define hvm_long_mode_active(v) (!!((v)->arch.hvm.guest_efer & EFER_LMA))
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/6] svm: Improve type of cpu_has_svm_feature
  2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
  2024-02-06  1:20 ` [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield George Dunlap
@ 2024-02-06  1:20 ` George Dunlap
  2024-02-19 14:03   ` Jan Beulich
  2024-02-19 15:24   ` Jan Beulich
  2024-02-06  1:20 ` [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield George Dunlap
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

The "effective type" of the cpu_has_svm_feature macro is effectively
an unsigned log with one bit set (or not); at least one place someone
felt compelled to do a !! to make sure that they got a boolean out of
it.

Ideally the whole of this would be folded into the cpufeature.h
infrastructure.  But for now, duplicate the more type-safe static
inlines in that file, and remove the !!.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/hvm/svm/svm.c             | 2 +-
 xen/arch/x86/include/asm/hvm/svm/svm.h | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 5741287355..40bc1ffbc6 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2580,7 +2580,7 @@ const struct hvm_function_table * __init start_svm(void)
     if ( !printed )
         printk(" - none\n");
 
-    svm_function_table.hap_supported = !!cpu_has_svm_npt;
+    svm_function_table.hap_supported = cpu_has_svm_npt;
     svm_function_table.caps.hap_superpage_2mb = true;
     svm_function_table.caps.hap_superpage_1gb = cpu_has_page1gb;
 
diff --git a/xen/arch/x86/include/asm/hvm/svm/svm.h b/xen/arch/x86/include/asm/hvm/svm/svm.h
index 687d35be40..9e9a148845 100644
--- a/xen/arch/x86/include/asm/hvm/svm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
@@ -38,7 +38,10 @@ extern u32 svm_feature_flags;
 #define SVM_FEATURE_SSS           19 /* NPT Supervisor Shadow Stacks */
 #define SVM_FEATURE_SPEC_CTRL     20 /* MSR_SPEC_CTRL virtualisation */
 
-#define cpu_has_svm_feature(f) (svm_feature_flags & (1u << (f)))
+static inline bool cpu_has_svm_feature(unsigned int feat)
+{
+    return svm_feature_flags & (1u << (feat));
+}
 #define cpu_has_svm_npt       cpu_has_svm_feature(SVM_FEATURE_NPT)
 #define cpu_has_svm_lbrv      cpu_has_svm_feature(SVM_FEATURE_LBRV)
 #define cpu_has_svm_svml      cpu_has_svm_feature(SVM_FEATURE_SVML)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield
  2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
  2024-02-06  1:20 ` [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield George Dunlap
  2024-02-06  1:20 ` [PATCH 2/6] svm: Improve type of cpu_has_svm_feature George Dunlap
@ 2024-02-06  1:20 ` George Dunlap
  2024-02-19 14:39   ` Jan Beulich
  2024-02-19 16:08   ` Jan Beulich
  2024-02-06  1:20 ` [PATCH 4/6] nestedsvm: Disable TscRateMSR George Dunlap
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian

Moving them all together has several advantages:
 * Collects them all in one part of the struct
 * The `caps` field means that we can drop the "_supported" suffix, as it's
   clear what is meant.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/hvm.c             |  6 +++---
 xen/arch/x86/hvm/svm/svm.c         |  2 +-
 xen/arch/x86/hvm/vlapic.c          |  4 ++--
 xen/arch/x86/hvm/vmx/vmcs.c        |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c         |  8 ++++----
 xen/arch/x86/include/asm/hvm/hvm.h | 29 ++++++++++++++---------------
 6 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ae9d4c4756..aa2f2d054a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -136,7 +136,7 @@ static struct notifier_block cpu_nfb = {
 
 static bool __init hap_supported(struct hvm_function_table *fns)
 {
-    if ( !fns->hap_supported )
+    if ( !fns->caps.hap )
     {
         printk("HVM: Hardware Assisted Paging (HAP) not detected\n");
         return false;
@@ -144,7 +144,7 @@ static bool __init hap_supported(struct hvm_function_table *fns)
 
     if ( !opt_hap_enabled )
     {
-        fns->hap_supported = 0;
+        fns->caps.hap = 0;
         printk("HVM: Hardware Assisted Paging (HAP) detected but disabled\n");
         return false;
     }
@@ -190,7 +190,7 @@ static int __init cf_check hvm_enable(void)
     }
 
     if ( !opt_altp2m_enabled )
-        hvm_funcs.altp2m_supported = 0;
+        hvm_funcs.caps.altp2m = 0;
 
     if ( opt_hvm_fep )
         warning_add(warning_hvm_fep);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 40bc1ffbc6..b551eac807 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2580,7 +2580,7 @@ const struct hvm_function_table * __init start_svm(void)
     if ( !printed )
         printk(" - none\n");
 
-    svm_function_table.hap_supported = cpu_has_svm_npt;
+    svm_function_table.caps.hap = cpu_has_svm_npt;
     svm_function_table.caps.hap_superpage_2mb = true;
     svm_function_table.caps.hap_superpage_1gb = cpu_has_page1gb;
 
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 71a4b954b0..dcbcf4a1fe 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1326,7 +1326,7 @@ int vlapic_has_pending_irq(struct vcpu *v)
     if ( irr == -1 )
         return -1;
 
-    if ( hvm_funcs.virtual_intr_delivery_enabled &&
+    if ( hvm_funcs.caps.virtual_intr_delivery &&
          !nestedhvm_vcpu_in_guestmode(v) )
         return irr;
 
@@ -1361,7 +1361,7 @@ int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool force_ack)
     int isr;
 
     if ( !force_ack &&
-         hvm_funcs.virtual_intr_delivery_enabled )
+         hvm_funcs.caps.virtual_intr_delivery )
         return 1;
 
     /* If there's no chance of using APIC assist then bail now. */
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 53f9d81aa9..aff69d5320 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -112,7 +112,7 @@ static int cf_check parse_ept_param_runtime(const char *s)
     struct domain *d;
     int val;
 
-    if ( !cpu_has_vmx_ept || !hvm_funcs.hap_supported ||
+    if ( !cpu_has_vmx_ept || !hvm_funcs.caps.hap ||
          !(hvm_funcs.caps.hap_superpage_2mb ||
            hvm_funcs.caps.hap_superpage_1gb) )
     {
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 9cfc0140b4..4bcf436d2c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2963,7 +2963,7 @@ const struct hvm_function_table * __init start_vmx(void)
         return NULL;
     }
 
-    vmx_function_table.singlestep_supported = cpu_has_monitor_trap_flag;
+    vmx_function_table.caps.singlestep = cpu_has_monitor_trap_flag;
 
     if ( cpu_has_vmx_dt_exiting )
         vmx_function_table.set_descriptor_access_exiting =
@@ -2986,8 +2986,8 @@ const struct hvm_function_table * __init start_vmx(void)
                 printk("VMX: Disabling executable EPT superpages due to CVE-2018-12207\n");
         }
 
-        vmx_function_table.hap_supported = 1;
-        vmx_function_table.altp2m_supported = 1;
+        vmx_function_table.caps.hap = 1;
+        vmx_function_table.caps.altp2m = 1;
 
         vmx_function_table.caps.hap_superpage_2mb = cpu_has_vmx_ept_2mb;
         vmx_function_table.caps.hap_superpage_1gb = cpu_has_vmx_ept_1gb;
@@ -3000,7 +3000,7 @@ const struct hvm_function_table * __init start_vmx(void)
         vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
         vmx_function_table.process_isr = vmx_process_isr;
         vmx_function_table.handle_eoi = vmx_handle_eoi;
-        vmx_function_table.virtual_intr_delivery_enabled = true;
+        vmx_function_table.caps.virtual_intr_delivery = true;
     }
 
     if ( cpu_has_vmx_posted_intr_processing )
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index f50476f50f..bbd83a8275 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -86,20 +86,19 @@ struct hvm_vcpu_nonreg_state {
 struct hvm_function_table {
     const char *name;
 
-    /* Support Hardware-Assisted Paging? */
-    bool hap_supported;
-
-    /* Necessary hardware support for alternate p2m's? */
-    bool altp2m_supported;
-    bool singlestep_supported;
-
-    /* Hardware virtual interrupt delivery enable? */
-    bool virtual_intr_delivery_enabled;
-
     struct {
         /* Indicate HAP capabilities. */
-        bool hap_superpage_1gb:1,
-            hap_superpage_2mb:1;
+        bool hap:1,
+             hap_superpage_1gb:1,
+             hap_superpage_2mb:1,
+
+            /* Altp2m capabilities */
+            altp2m:1,
+            singlestep:1,
+            
+            /* Hardware virtual interrupt delivery enable? */
+            virtual_intr_delivery;
+
     } caps;
 
     /*
@@ -642,18 +641,18 @@ static inline void hvm_enable_msr_interception(struct domain *d, uint32_t msr)
 
 static inline bool hvm_is_singlestep_supported(void)
 {
-    return hvm_funcs.singlestep_supported;
+    return hvm_funcs.caps.singlestep;
 }
 
 static inline bool hvm_hap_supported(void)
 {
-    return hvm_funcs.hap_supported;
+    return hvm_funcs.caps.hap;
 }
 
 /* returns true if hardware supports alternate p2m's */
 static inline bool hvm_altp2m_supported(void)
 {
-    return hvm_funcs.altp2m_supported;
+    return hvm_funcs.caps.altp2m;
 }
 
 /* updates the current hardware p2m */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
                   ` (2 preceding siblings ...)
  2024-02-06  1:20 ` [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield George Dunlap
@ 2024-02-06  1:20 ` George Dunlap
  2024-02-19 15:22   ` Jan Beulich
  2024-02-06  1:20 ` [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts George Dunlap
  2024-02-06  1:20 ` [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit George Dunlap
  5 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu

The primary purpose of TSC scaling, from our perspective, is to
maintain the fiction of an "invariant TSC" across migrates between
platforms with different clock speeds.

On AMD, the TscRateMSR CPUID bit is unconditionally enabled in the
"host cpuid", even if the hardware doesn't actually support it.
According to c/s fd14a1943c4 ("nestedsvm: Support TSC Rate MSR"),
testing showed that emulating TSC scaling in an L1 was more expensive
than emulating TSC scaling on an L0 (due to extra sets of vmexit /
vmenter).

However, the current implementation seems to be broken.

First of all, the final L2 scaling ratio should be a composition of
the L0 scaling ratio and the L1 scaling ratio; there's no indication
this is being done anywhere.

Secondly, it's not clear that the L1 tsc scaling ratio actually
affects the L0 tsc scaling ratio.  The stored value (ns_tscratio) is
used to affect the tsc *offset*, but doesn't seem to actually be
factored into d->hvm.tsc_scaling_ratio.  (Which shouldn't be
per-domain anyway, but per-vcpu.)  Having the *offset* scaled
according to the nested scaling without the actual RDTSC itself also
being scaled has got to produce inconsistent results.

For now, just disable the functionality entirely until we can
implement it properly:

- Don't set TSCRATEMSR in the host CPUID policy

- Remove MSR_AMD64_TSC_RATIO emulation handling, so that the guest
  guests a #GP if it tries to access them (as it should when
  TSCRATEMSR is clear)

- Remove ns_tscratio from struct nestedhvm, and all code that touches
  it

Unfortunately this means ripping out the scaling calculation stuff as
well, since it's only used in the nested case; it's there in the git
tree if we need it for reference when we re-introduce it.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: "Roger Pau Monné" <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/cpu-policy.c                    |  3 +-
 xen/arch/x86/hvm/svm/nestedsvm.c             |  2 -
 xen/arch/x86/hvm/svm/svm.c                   | 57 --------------------
 xen/arch/x86/include/asm/hvm/svm/nestedsvm.h |  5 --
 4 files changed, 1 insertion(+), 66 deletions(-)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 10079c26ae..d71abbc44a 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -407,8 +407,7 @@ static void __init calculate_host_policy(void)
                                (1u << SVM_FEATURE_PAUSEFILTER) |
                                (1u << SVM_FEATURE_DECODEASSISTS));
         /* Enable features which are always emulated. */
-        p->extd.raw[0xa].d |= ((1u << SVM_FEATURE_VMCBCLEAN) |
-                               (1u << SVM_FEATURE_TSCRATEMSR));
+        p->extd.raw[0xa].d |= (1u << SVM_FEATURE_VMCBCLEAN);
     }
 
     /* 0x000000ce  MSR_INTEL_PLATFORM_INFO */
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index ee9602f5c8..d02a59f184 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -146,8 +146,6 @@ int cf_check nsvm_vcpu_reset(struct vcpu *v)
     svm->ns_msr_hsavepa = INVALID_PADDR;
     svm->ns_ovvmcb_pa = INVALID_PADDR;
 
-    svm->ns_tscratio = DEFAULT_TSC_RATIO;
-
     svm->ns_cr_intercepts = 0;
     svm->ns_dr_intercepts = 0;
     svm->ns_exception_intercepts = 0;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b551eac807..34b9f603bc 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -777,43 +777,6 @@ static int cf_check svm_get_guest_pat(struct vcpu *v, u64 *gpat)
     return 1;
 }
 
-static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio)
-{
-    uint64_t mult, frac, scaled_host_tsc;
-
-    if ( ratio == DEFAULT_TSC_RATIO )
-        return host_tsc;
-
-    /*
-     * Suppose the most significant 32 bits of host_tsc and ratio are
-     * tsc_h and mult, and the least 32 bits of them are tsc_l and frac,
-     * then
-     *     host_tsc * ratio * 2^-32
-     *     = host_tsc * (mult * 2^32 + frac) * 2^-32
-     *     = host_tsc * mult + (tsc_h * 2^32 + tsc_l) * frac * 2^-32
-     *     = host_tsc * mult + tsc_h * frac + ((tsc_l * frac) >> 32)
-     *
-     * Multiplications in the last two terms are between 32-bit integers,
-     * so both of them can fit in 64-bit integers.
-     *
-     * Because mult is usually less than 10 in practice, it's very rare
-     * that host_tsc * mult can overflow a 64-bit integer.
-     */
-    mult = ratio >> 32;
-    frac = ratio & ((1ULL << 32) - 1);
-    scaled_host_tsc  = host_tsc * mult;
-    scaled_host_tsc += (host_tsc >> 32) * frac;
-    scaled_host_tsc += ((host_tsc & ((1ULL << 32) - 1)) * frac) >> 32;
-
-    return scaled_host_tsc;
-}
-
-static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
-    uint64_t ratio)
-{
-    return guest_tsc - scale_tsc(host_tsc, ratio);
-}
-
 static void cf_check svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
 {
     struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
@@ -832,18 +795,8 @@ static void cf_check svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
 
     if ( nestedhvm_vcpu_in_guestmode(v) )
     {
-        struct nestedsvm *svm = &vcpu_nestedsvm(v);
-
         n2_tsc_offset = vmcb_get_tsc_offset(n2vmcb) -
                         vmcb_get_tsc_offset(n1vmcb);
-        if ( svm->ns_tscratio != DEFAULT_TSC_RATIO )
-        {
-            uint64_t guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
-
-            n2_tsc_offset = svm_get_tsc_offset(guest_tsc,
-                                               guest_tsc + n2_tsc_offset,
-                                               svm->ns_tscratio);
-        }
         vmcb_set_tsc_offset(n1vmcb, offset);
     }
 
@@ -1921,10 +1874,6 @@ static int cf_check svm_msr_read_intercept(
         *msr_content = nsvm->ns_msr_hsavepa;
         break;
 
-    case MSR_AMD64_TSC_RATIO:
-        *msr_content = nsvm->ns_tscratio;
-        break;
-
     case MSR_AMD_OSVW_ID_LENGTH:
     case MSR_AMD_OSVW_STATUS:
         if ( !d->arch.cpuid->extd.osvw )
@@ -2103,12 +2052,6 @@ static int cf_check svm_msr_write_intercept(
             goto gpf;
         break;
 
-    case MSR_AMD64_TSC_RATIO:
-        if ( msr_content & TSC_RATIO_RSVD_BITS )
-            goto gpf;
-        nsvm->ns_tscratio = msr_content;
-        break;
-
     case MSR_IA32_MCx_MISC(4): /* Threshold register */
     case MSR_F10_MC4_MISC1 ... MSR_F10_MC4_MISC3:
         /*
diff --git a/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h b/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h
index 406fc082b1..45d658ad01 100644
--- a/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h
+++ b/xen/arch/x86/include/asm/hvm/svm/nestedsvm.h
@@ -18,11 +18,6 @@ struct nestedsvm {
      */
     uint64_t ns_ovvmcb_pa;
 
-    /* virtual tscratio holding the value l1 guest writes to the
-     * MSR_AMD64_TSC_RATIO MSR.
-     */
-    uint64_t ns_tscratio;
-
     /* Cached real intercepts of the l2 guest */
     uint32_t ns_cr_intercepts;
     uint32_t ns_dr_intercepts;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts
  2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
                   ` (3 preceding siblings ...)
  2024-02-06  1:20 ` [PATCH 4/6] nestedsvm: Disable TscRateMSR George Dunlap
@ 2024-02-06  1:20 ` George Dunlap
  2024-02-19 15:56   ` Jan Beulich
  2024-02-26 16:47   ` Andrew Cooper
  2024-02-06  1:20 ` [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit George Dunlap
  5 siblings, 2 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap

Changeset ef3e8db8068 ("x86/hvm: Corrections and improvements to
unhandled vmexit logging") introduced a printk to the default path of
the switch statement in nestedsvm_check_intercepts(), complaining of
an unknown exit reason.

Unfortunately, the "core" switch statement which is meant to handle
all vmexit reasons is in nsvm_vmcb_guest_intercepts_exitcode(); the
switch statement in nestedsvm_check_intercepts() is only meant to
superimpose on top of that some special-casing for how to interaction
between L1 and L0 vmexits.

Remove the printk, and add a comment to prevent future confusion.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
 xen/arch/x86/hvm/svm/nestedsvm.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index d02a59f184..a5319ab729 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1292,6 +1292,10 @@ nestedsvm_check_intercepts(struct vcpu *v, struct cpu_user_regs *regs,
     ASSERT(vcpu_nestedhvm(v).nv_vmexit_pending == 0);
     is_intercepted = nsvm_vmcb_guest_intercepts_exitcode(v, regs, exitcode);
 
+    /* 
+     * Handle specific interactions between things the guest and host
+     * may both want to intercept
+     */
     switch ( exitcode )
     {
     case VMEXIT_INVALID:
@@ -1347,8 +1351,6 @@ nestedsvm_check_intercepts(struct vcpu *v, struct cpu_user_regs *regs,
         /* Always let the guest handle VMMCALL/VMCALL */
         return NESTEDHVM_VMEXIT_INJECT;
     default:
-        gprintk(XENLOG_ERR, "Unexpected nested vmexit: reason %#"PRIx64"\n",
-                exitcode);
         break;
     }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit
  2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
                   ` (4 preceding siblings ...)
  2024-02-06  1:20 ` [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts George Dunlap
@ 2024-02-06  1:20 ` George Dunlap
  2024-02-06  1:34   ` George Dunlap
  2024-02-19 16:25   ` Jan Beulich
  5 siblings, 2 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:20 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Jun Nakajima, Kevin Tian

In order to make implementation and testing tractable, we will require
specific host functionality.  Add a nested_virt bit to hvm_funcs.caps,
and return an error if a domain is created with nested virt and this
bit isn't set.

For VMX, start with always enabling it if HAP is present; this
shouldn't change current behvior.

For SVM, require some basic functionality, adding a document
explaining the rationale.

NB that only SVM CPUID bits 0-7 have been considered.  Bits 10-16 may
be considered in a follow-up patch.

Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
CC: "Roger Pau Monné" <roger.pau@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 docs/designs/nested-svm-cpu-features.md | 110 ++++++++++++++++++++++++
 xen/arch/x86/domain.c                   |   6 ++
 xen/arch/x86/hvm/svm/nestedhvm.h        |   1 +
 xen/arch/x86/hvm/svm/nestedsvm.c        |  14 +++
 xen/arch/x86/hvm/svm/svm.c              |   2 +
 xen/arch/x86/hvm/vmx/vmx.c              |   3 +
 xen/arch/x86/include/asm/hvm/hvm.h      |  11 ++-
 7 files changed, 146 insertions(+), 1 deletion(-)

diff --git a/docs/designs/nested-svm-cpu-features.md b/docs/designs/nested-svm-cpu-features.md
new file mode 100644
index 0000000000..7ffb8daefd
--- /dev/null
+++ b/docs/designs/nested-svm-cpu-features.md
@@ -0,0 +1,110 @@
+# Nested SVM (AMD) CPUID requirements
+
+The first step in making nested SVM production-ready is to make sure
+that all features are implemented and well-tested.  To make this
+tractable, we will initially be limiting the "supported" range of
+nested virt to a specific subset of host and guest features.  This
+document describes the criteria for deciding on features, and the
+rationale behind each feature.
+
+For AMD, all virtualization-related features can be found in CPUID
+leaf 8000000A:edx
+
+# Criteria
+
+- Processor support: At a minimum we want to support processors from
+  the last 5 years.  All things being equal, older processors are
+  better.  Bits 0:7 were available in the very earliest processors;
+  and even through bit 15 we should be pretty good support-wise.
+
+- Faithfulness to hardware: We need the behavior of the "virtual cpu"
+  from the L1 hypervisor's perspective to be as close as possible to
+  the original hardware.  In particular, the behavior of the hardware
+  on error paths 1) is not easy to understand or test, 2) can be the
+  source of surprising vulnerabiliies.  (See XSA-7 for an example of a
+  case where subtle error-handling differences can open up a privilege
+  escalation.)  We should avoid emulating any bit of the hardware with
+  complex error paths if we can at all help it.
+
+- Cost of implementation: We want to minimize the cost of
+  implementation (where this includes bringing an existing sub-par
+  implementation up to speed).  All things being equal, we'll favor a
+  configuration which does not require any new implementation.
+
+- Performance: All things being equal, we'd prefer to choose a set of
+  L0 / L1 CPUID bits that are faster than slower.
+
+
+# Bits
+
+- 0 `NP` *Nested Paging*: Required both for L0 and L1.
+
+  Based primarily on faithfulness and performance, as well as
+  potential cost of implementation.  Available on earliest hardware,
+  so no compatibility issues.
+
+- 1 `LbrVirt` *LBR / debugging virtualization*: Require for L0 and L1.
+
+  For L0 this is required for performance: There's no way to tell the
+  guests not to use the LBR-related registers; and if the guest does,
+  then you have to save and restore all LBR-related registers on
+  context switch, which is prohibitive.  Furthermore, the additional
+  emulation risks a security-relevant difference to come up.
+
+  Providing it to L1 when we have it in L0 is basically free, and
+  already implemented.
+
+  Just require it and provide it.
+
+- 2 `SVML` *SVM Lock*: Not required for L0, not provided to L1
+
+  Seems to be aboult enabling an operating system to prevent "blue
+  pill" attacks against itself.
+
+  Xen doesn't use it, nor provide it; so it would need to be
+  implementend.  The best way to protect a guest OS is to leave nested
+  virt disabled in the tools.
+
+- 3 `NRIPS` NRIP Save: Require for both L0 and L1
+
+  If NRIPS is not present, the software interrupt injection
+  functionality can't be used; and Xen has to emulate it.  That's
+  another source of potential security issues.  If hardware supports
+  it, then providing it to guest is basically free.
+
+- 4 `TscRateMsr`: Not required by L0, not provided to L1
+
+  The main putative use for this would be trying to maintain an
+  invariant TSC across cores with different clock speeds, or after a
+  migrate.  Unlike others, this doesn't have an error path to worry
+  about compatibility-wise; and according to tests done when nestedSVM
+  was first implemented, it's actually faster to emliate TscRateMSR in
+  the L0 hypervisor than for L1 to attempt to emulate it itself.
+
+  However, using this properly in L0 will take some implementation
+  effort; and composing it properly with L1 will take even more
+  effort.  Just leave it off for now.
+
+ - 5 `VmcbClean`: VMCB Clean Bits: Not required by L0, provide to L1
+
+  This is a pure optimization, both on the side of the L0 and L1.  The
+  implementaiton for L1 is entirely Xen-side, so can be provided even
+  on hardware that doesn't provide it.  And it's purely an
+  optimization, so could be "implemented" by ignoring the bits
+  entirely.
+
+  As such, we don't need to require it for L0; and as it's already
+  implemented, no reason not to provide it to L1.  Before this feature
+  was available those bits were marked SBZ ("should be zero"); setting
+  them was already advertised to cause unpredictable behavior.
+
+- 6 `FlushByAsid`: Require for L0, provide to L1
+
+  This is cheap and easy to use for L0 and to provide to the L1;
+  there's no reson not to just pass it through.
+
+- 7 `DecodeAssists`: Require for L0, provide to L1
+
+  Using it in L0 reduces the chance that we'll make some sort of error
+  in the decode path.  And if hardware supports it, it's easy enough
+  to provide to the L1.
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index bda853e3c9..a25f498265 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -673,6 +673,12 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
          */
         config->flags |= XEN_DOMCTL_CDF_oos_off;
 
+    if ( nested_virt && !hvm_nested_virt_supported() )
+    {
+        dprintk(XENLOG_INFO, "Nested virt requested but not available\n");
+        return -EINVAL;        
+    }
+
     if ( nested_virt && !hap )
     {
         dprintk(XENLOG_INFO, "Nested virt not supported without HAP\n");
diff --git a/xen/arch/x86/hvm/svm/nestedhvm.h b/xen/arch/x86/hvm/svm/nestedhvm.h
index 43245e13de..31cf2af8e4 100644
--- a/xen/arch/x86/hvm/svm/nestedhvm.h
+++ b/xen/arch/x86/hvm/svm/nestedhvm.h
@@ -35,6 +35,7 @@ enum nestedhvm_vmexits
 nestedsvm_check_intercepts(struct vcpu *v, struct cpu_user_regs *regs,
     uint64_t exitcode);
 void svm_nested_features_on_efer_update(struct vcpu *v);
+void __init start_nested_svm(struct hvm_function_table *svm_function_table);
 
 /* Interface methods */
 void cf_check nsvm_vcpu_destroy(struct vcpu *v);
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index a5319ab729..92b063daa5 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -1666,3 +1666,17 @@ void svm_nested_features_on_efer_update(struct vcpu *v)
         }
     }
 }
+
+void __init start_nested_svm(struct hvm_function_table *svm_function_table)
+{
+    /* 
+     * Required host functionality to support nested virt.  See
+     * docs/designs/nested-svm-cpu-features.md for rationale.
+     */
+    svm_function_table->caps.nested_virt =
+        cpu_has_svm_nrips &&
+        cpu_has_svm_lbrv &&
+        cpu_has_svm_nrips &&
+        cpu_has_svm_flushbyasid &&
+        cpu_has_svm_decode;
+}
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 34b9f603bc..5c2e171777 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2527,6 +2527,8 @@ const struct hvm_function_table * __init start_svm(void)
     svm_function_table.caps.hap_superpage_2mb = true;
     svm_function_table.caps.hap_superpage_1gb = cpu_has_page1gb;
 
+    start_nested_svm(&svm_function_table);
+
     return &svm_function_table;
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4bcf436d2c..6b5ad4a509 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3021,6 +3021,9 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_tsc_scaling )
         vmx_function_table.tsc_scaling.ratio_frac_bits = 48;
 
+    /* TODO: Require hardware support before enabling nested virt */
+    vmx_function_table.caps.nested_virt = vmx_function_table.caps.hap;
+
     model_specific_lbr = get_model_specific_lbr();
     lbr_tsx_fixup_check();
     ler_to_fixup_check();
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index bbd83a8275..8a3df0eca7 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -97,7 +97,10 @@ struct hvm_function_table {
             singlestep:1,
             
             /* Hardware virtual interrupt delivery enable? */
-            virtual_intr_delivery;
+            virtual_intr_delivery,
+
+            /* Nested virt capabilities */
+            nested_virt:1;
 
     } caps;
 
@@ -655,6 +658,12 @@ static inline bool hvm_altp2m_supported(void)
     return hvm_funcs.caps.altp2m;
 }
 
+/* Returns true if we have the minimum hardware requirements for nested virt */
+static inline bool hvm_nested_virt_supported(void)
+{
+    return hvm_funcs.caps.nested_virt;
+}
+
 /* updates the current hardware p2m */
 static inline void altp2m_vcpu_update_p2m(struct vcpu *v)
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit
  2024-02-06  1:20 ` [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit George Dunlap
@ 2024-02-06  1:34   ` George Dunlap
  2024-02-19 16:25   ` Jan Beulich
  1 sibling, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-06  1:34 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Jan Beulich, Julien Grall, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	Jun Nakajima, Kevin Tian

On Tue, Feb 6, 2024 at 9:20 AM George Dunlap <george.dunlap@cloud.com> wrote:

> @@ -655,6 +658,12 @@ static inline bool hvm_altp2m_supported(void)
>      return hvm_funcs.caps.altp2m;
>  }
>
> +/* Returns true if we have the minimum hardware requirements for nested virt */
> +static inline bool hvm_nested_virt_supported(void)
> +{
> +    return hvm_funcs.caps.nested_virt;
> +}

NB this is missing the !CONFIG_HVM version of this function; I'll
include it in v2.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield
  2024-02-06  1:20 ` [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield George Dunlap
@ 2024-02-19 13:36   ` Jan Beulich
  2024-02-21  7:02     ` George Dunlap
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 13:36 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> hvm_function_table is an internal structure; rather than manually
> |-ing and &-ing bits, just make it a boolean bitfield and let the
> compiler do all the work.  This makes everything easier to read, and
> presumably allows the compiler more flexibility in producing efficient
> code.
> 
> No functional change intended.
> 
> Signed-off-by: George Dunlap <george.dunlap@cloud.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
> Questions:
> 
> * Should hap_superpage_2m really be set unconditionally, or should we
>   condition it on cpu_has_svm_npt?

That's HAP capabilities; there's not going to be any use of HAP when
there's no NPT (on an AMD system). IOW - all is fine as is, imo.

> * Do we really need to "!!cpu_has_svm_npt"?  If so, wouldn't it be
>   better to put the "!!"  in the #define, rather than requiring the
>   user to know that it's needed?

Considering that hap_supported is bool now, the !! can simply be
dropped. We've been doing so as code was touched anyway, not in a
concerted effort.

> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -113,8 +113,8 @@ static int cf_check parse_ept_param_runtime(const char *s)
>      int val;
>  
>      if ( !cpu_has_vmx_ept || !hvm_funcs.hap_supported ||
> -         !(hvm_funcs.hap_capabilities &
> -           (HVM_HAP_SUPERPAGE_2MB | HVM_HAP_SUPERPAGE_1GB)) )
> +         !(hvm_funcs.caps.hap_superpage_2mb ||
> +           hvm_funcs.caps.hap_superpage_1gb) )
>      {
>          printk("VMX: EPT not available, or not in use - ignoring\n");

Just to mention it: The conditional and the log message don't really
fit together. (I was first wondering what the 2mb/1gb checks had to
do here at all, but that's immediately clear when seeing that the
only sub-option here is "exec-sp".)

> @@ -104,8 +96,11 @@ struct hvm_function_table {
>      /* Hardware virtual interrupt delivery enable? */
>      bool virtual_intr_delivery_enabled;
>  
> -    /* Indicate HAP capabilities. */
> -    unsigned int hap_capabilities;
> +    struct {
> +        /* Indicate HAP capabilities. */
> +        bool hap_superpage_1gb:1,
> +            hap_superpage_2mb:1;

Nit: Would be nice imo if the two identifiers aligned vertically with
one another.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/6] svm: Improve type of cpu_has_svm_feature
  2024-02-06  1:20 ` [PATCH 2/6] svm: Improve type of cpu_has_svm_feature George Dunlap
@ 2024-02-19 14:03   ` Jan Beulich
  2024-02-21  7:09     ` George Dunlap
  2024-02-19 15:24   ` Jan Beulich
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 14:03 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> The "effective type" of the cpu_has_svm_feature macro is effectively
> an unsigned log with one bit set (or not); at least one place someone
> felt compelled to do a !! to make sure that they got a boolean out of
> it.
> 
> Ideally the whole of this would be folded into the cpufeature.h
> infrastructure.  But for now, duplicate the more type-safe static
> inlines in that file, and remove the !!.
> 
> Signed-off-by: George Dunlap <george.dunlap@cloud.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
albeit preferably with ...

> --- a/xen/arch/x86/include/asm/hvm/svm/svm.h
> +++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
> @@ -38,7 +38,10 @@ extern u32 svm_feature_flags;
>  #define SVM_FEATURE_SSS           19 /* NPT Supervisor Shadow Stacks */
>  #define SVM_FEATURE_SPEC_CTRL     20 /* MSR_SPEC_CTRL virtualisation */
>  
> -#define cpu_has_svm_feature(f) (svm_feature_flags & (1u << (f)))
> +static inline bool cpu_has_svm_feature(unsigned int feat)
> +{
> +    return svm_feature_flags & (1u << (feat));

... the inner pair of parentheses dropped.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield
  2024-02-06  1:20 ` [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield George Dunlap
@ 2024-02-19 14:39   ` Jan Beulich
  2024-02-19 16:08   ` Jan Beulich
  1 sibling, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 14:39 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> @@ -144,7 +144,7 @@ static bool __init hap_supported(struct hvm_function_table *fns)
>  
>      if ( !opt_hap_enabled )
>      {
> -        fns->hap_supported = 0;
> +        fns->caps.hap = 0;

As you touch such, would you mind switching to true/false instead of
1/0 at this occasion?

> @@ -3000,7 +3000,7 @@ const struct hvm_function_table * __init start_vmx(void)
>          vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
>          vmx_function_table.process_isr = vmx_process_isr;
>          vmx_function_table.handle_eoi = vmx_handle_eoi;
> -        vmx_function_table.virtual_intr_delivery_enabled = true;
> +        vmx_function_table.caps.virtual_intr_delivery = true;

I'm unsure about this one - it had "enabled" in its name for a good
reason. Then again its (somewhat more involved) derivation from
other capability bits (and a command line option) isn't fundamentally
different from that of, say, hap_supported. Hence I guess with the
other item taken care of
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-06  1:20 ` [PATCH 4/6] nestedsvm: Disable TscRateMSR George Dunlap
@ 2024-02-19 15:22   ` Jan Beulich
  2024-02-21  8:48     ` George Dunlap
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 15:22 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> For now, just disable the functionality entirely until we can
> implement it properly:
> 
> - Don't set TSCRATEMSR in the host CPUID policy

This goes too far: This way you would (in principle) also affect guests
with nesting disabled. According to the earlier parts of the description
there's also no issue with it in that case. What you want to make sure
it that in the HVM policy the bit isn't set.

While presently resolving to cpu_has_svm_feature(), I think
cpu_has_tsc_ratio really ought to resolve to the host policy field.
Of course then requiring the host policy to reflect reality rather than
having what is "always emulated". IOW ...

> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -407,8 +407,7 @@ static void __init calculate_host_policy(void)
>                                 (1u << SVM_FEATURE_PAUSEFILTER) |
>                                 (1u << SVM_FEATURE_DECODEASSISTS));
>          /* Enable features which are always emulated. */
> -        p->extd.raw[0xa].d |= ((1u << SVM_FEATURE_VMCBCLEAN) |
> -                               (1u << SVM_FEATURE_TSCRATEMSR));
> +        p->extd.raw[0xa].d |= (1u << SVM_FEATURE_VMCBCLEAN);

... this likely wants replacing altogether by not overriding what we
found in hardware, which would apparently mean moving the two bit
masks to the earlier "clamping" expression.

But then of course Andrew may know of reasons why all of this is done
in calculate_host_policy() in the first place, rather than in HVM
policy calculation.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/6] svm: Improve type of cpu_has_svm_feature
  2024-02-06  1:20 ` [PATCH 2/6] svm: Improve type of cpu_has_svm_feature George Dunlap
  2024-02-19 14:03   ` Jan Beulich
@ 2024-02-19 15:24   ` Jan Beulich
  2024-02-21  7:13     ` George Dunlap
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 15:24 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> --- a/xen/arch/x86/include/asm/hvm/svm/svm.h
> +++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
> @@ -38,7 +38,10 @@ extern u32 svm_feature_flags;
>  #define SVM_FEATURE_SSS           19 /* NPT Supervisor Shadow Stacks */
>  #define SVM_FEATURE_SPEC_CTRL     20 /* MSR_SPEC_CTRL virtualisation */
>  
> -#define cpu_has_svm_feature(f) (svm_feature_flags & (1u << (f)))
> +static inline bool cpu_has_svm_feature(unsigned int feat)
> +{
> +    return svm_feature_flags & (1u << (feat));
> +}
>  #define cpu_has_svm_npt       cpu_has_svm_feature(SVM_FEATURE_NPT)
>  #define cpu_has_svm_lbrv      cpu_has_svm_feature(SVM_FEATURE_LBRV)
>  #define cpu_has_svm_svml      cpu_has_svm_feature(SVM_FEATURE_SVML)

Having seen patch 4 now, I have to raise the question here as well: Why
do we need a separate variable (svm_feature_flags) when we could use
the host policy (provided it isn't abused; see comments on patch 4)?

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts
  2024-02-06  1:20 ` [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts George Dunlap
@ 2024-02-19 15:56   ` Jan Beulich
  2024-02-22 12:58     ` George Dunlap
  2024-02-26 16:47   ` Andrew Cooper
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 15:56 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> Changeset ef3e8db8068 ("x86/hvm: Corrections and improvements to
> unhandled vmexit logging") introduced a printk to the default path of
> the switch statement in nestedsvm_check_intercepts(), complaining of
> an unknown exit reason.
> 
> Unfortunately, the "core" switch statement which is meant to handle
> all vmexit reasons is in nsvm_vmcb_guest_intercepts_exitcode(); the
> switch statement in nestedsvm_check_intercepts() is only meant to
> superimpose on top of that some special-casing for how to interaction
> between L1 and L0 vmexits.
> 
> Remove the printk, and add a comment to prevent future confusion.
> 
> Signed-off-by: George Dunlap <george.dunlap@cloud.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

I wonder if a Fixes: tag is warranted here.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield
  2024-02-06  1:20 ` [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield George Dunlap
  2024-02-19 14:39   ` Jan Beulich
@ 2024-02-19 16:08   ` Jan Beulich
  2024-02-21  7:21     ` George Dunlap
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 16:08 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> --- a/xen/arch/x86/include/asm/hvm/hvm.h
> +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> @@ -86,20 +86,19 @@ struct hvm_vcpu_nonreg_state {
>  struct hvm_function_table {
>      const char *name;
>  
> -    /* Support Hardware-Assisted Paging? */
> -    bool hap_supported;
> -
> -    /* Necessary hardware support for alternate p2m's? */
> -    bool altp2m_supported;
> -    bool singlestep_supported;
> -
> -    /* Hardware virtual interrupt delivery enable? */
> -    bool virtual_intr_delivery_enabled;
> -
>      struct {
>          /* Indicate HAP capabilities. */
> -        bool hap_superpage_1gb:1,
> -            hap_superpage_2mb:1;
> +        bool hap:1,
> +             hap_superpage_1gb:1,
> +             hap_superpage_2mb:1,
> +
> +            /* Altp2m capabilities */
> +            altp2m:1,
> +            singlestep:1,
> +            
> +            /* Hardware virtual interrupt delivery enable? */
> +            virtual_intr_delivery;
> +
>      } caps;

Nit (spotted only while looking at patch 6): You're adding a stray blank
line at the end of the structure. Further I expect virtual_intr_delivery
would also want to be just a single bit?

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit
  2024-02-06  1:20 ` [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit George Dunlap
  2024-02-06  1:34   ` George Dunlap
@ 2024-02-19 16:25   ` Jan Beulich
  2024-02-22 14:33     ` George Dunlap
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-19 16:25 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	Jun Nakajima, Kevin Tian, xen-devel

On 06.02.2024 02:20, George Dunlap wrote:
> --- /dev/null
> +++ b/docs/designs/nested-svm-cpu-features.md
> @@ -0,0 +1,110 @@
> +# Nested SVM (AMD) CPUID requirements
> +
> +The first step in making nested SVM production-ready is to make sure
> +that all features are implemented and well-tested.  To make this
> +tractable, we will initially be limiting the "supported" range of
> +nested virt to a specific subset of host and guest features.  This
> +document describes the criteria for deciding on features, and the
> +rationale behind each feature.
> +
> +For AMD, all virtualization-related features can be found in CPUID
> +leaf 8000000A:edx
> +
> +# Criteria
> +
> +- Processor support: At a minimum we want to support processors from
> +  the last 5 years.  All things being equal, older processors are
> +  better.

Nit: Perhaps missing "covering"? Generally I hope newer processors are
"better".

>  Bits 0:7 were available in the very earliest processors;
> +  and even through bit 15 we should be pretty good support-wise.
> +
> +- Faithfulness to hardware: We need the behavior of the "virtual cpu"
> +  from the L1 hypervisor's perspective to be as close as possible to
> +  the original hardware.  In particular, the behavior of the hardware
> +  on error paths 1) is not easy to understand or test, 2) can be the
> +  source of surprising vulnerabiliies.  (See XSA-7 for an example of a
> +  case where subtle error-handling differences can open up a privilege
> +  escalation.)  We should avoid emulating any bit of the hardware with
> +  complex error paths if we can at all help it.
> +
> +- Cost of implementation: We want to minimize the cost of
> +  implementation (where this includes bringing an existing sub-par
> +  implementation up to speed).  All things being equal, we'll favor a
> +  configuration which does not require any new implementation.
> +
> +- Performance: All things being equal, we'd prefer to choose a set of
> +  L0 / L1 CPUID bits that are faster than slower.
> +
> +
> +# Bits
> +
> +- 0 `NP` *Nested Paging*: Required both for L0 and L1.
> +
> +  Based primarily on faithfulness and performance, as well as
> +  potential cost of implementation.  Available on earliest hardware,
> +  so no compatibility issues.
> +
> +- 1 `LbrVirt` *LBR / debugging virtualization*: Require for L0 and L1.
> +
> +  For L0 this is required for performance: There's no way to tell the
> +  guests not to use the LBR-related registers; and if the guest does,
> +  then you have to save and restore all LBR-related registers on
> +  context switch, which is prohibitive.

"prohibitive" is too strong imo; maybe "undesirable"?

> --- a/xen/arch/x86/hvm/svm/nestedhvm.h
> +++ b/xen/arch/x86/hvm/svm/nestedhvm.h
> @@ -35,6 +35,7 @@ enum nestedhvm_vmexits
>  nestedsvm_check_intercepts(struct vcpu *v, struct cpu_user_regs *regs,
>      uint64_t exitcode);
>  void svm_nested_features_on_efer_update(struct vcpu *v);
> +void __init start_nested_svm(struct hvm_function_table *svm_function_table);

No section placement attributes on declarations, please.

> --- a/xen/arch/x86/hvm/svm/nestedsvm.c
> +++ b/xen/arch/x86/hvm/svm/nestedsvm.c
> @@ -1666,3 +1666,17 @@ void svm_nested_features_on_efer_update(struct vcpu *v)
>          }
>      }
>  }
> +
> +void __init start_nested_svm(struct hvm_function_table *svm_function_table)
> +{
> +    /* 
> +     * Required host functionality to support nested virt.  See
> +     * docs/designs/nested-svm-cpu-features.md for rationale.
> +     */
> +    svm_function_table->caps.nested_virt =
> +        cpu_has_svm_nrips &&
> +        cpu_has_svm_lbrv &&
> +        cpu_has_svm_nrips &&

nrips twice? Was the earlier one meant to be npt?

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3021,6 +3021,9 @@ const struct hvm_function_table * __init start_vmx(void)
>      if ( cpu_has_vmx_tsc_scaling )
>          vmx_function_table.tsc_scaling.ratio_frac_bits = 48;
>  
> +    /* TODO: Require hardware support before enabling nested virt */
> +    vmx_function_table.caps.nested_virt = vmx_function_table.caps.hap;

This won't have the intended effect if hap_supported() ends up clearing
the bit (used as input here) due to a command line option override. I
wonder if instead this wants doing e.g. in a new hook to be called from
nestedhvm_setup(). On the SVM side the hook function would then be the
start_nested_svm() that you already introduce, with a caps.hap check
added.

Since you leave the other nested-related if() in place in
arch_sanitise_domain_config(), all ought to be well, but I think that
other if() really wants replacing by the one you presently add.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield
  2024-02-19 13:36   ` Jan Beulich
@ 2024-02-21  7:02     ` George Dunlap
  2024-02-21  7:23       ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-21  7:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On Mon, Feb 19, 2024 at 9:36 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > hvm_function_table is an internal structure; rather than manually
> > |-ing and &-ing bits, just make it a boolean bitfield and let the
> > compiler do all the work.  This makes everything easier to read, and
> > presumably allows the compiler more flexibility in producing efficient
> > code.
> >
> > No functional change intended.
> >
> > Signed-off-by: George Dunlap <george.dunlap@cloud.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> > ---
> > Questions:
> >
> > * Should hap_superpage_2m really be set unconditionally, or should we
> >   condition it on cpu_has_svm_npt?
>
> That's HAP capabilities; there's not going to be any use of HAP when
> there's no NPT (on an AMD system). IOW - all is fine as is, imo.

Basically there are two stances to take:

1. hap_superpage_2m always has meaning.  If there's no HAP, then there
are no HAP superpages, so we should say that there are no superpages.

2. hap_superpage_2m only has meaning if hap is enabled.  If there's no
HAP, then the question "are there superpages" is moot; nobody should
be paying attention to it.

The second is not without precedent, but is it really the best stance?
 It means that before reading hap_superpage_2m, you have to first
check hap; and it introduces a risk (no matter how small) that someone
will forget to check, and end up doing the wrong thing.

Not a huge deal, but overall my vote would be #1.  I may send a patch
at some point in the future.

> > * Do we really need to "!!cpu_has_svm_npt"?  If so, wouldn't it be
> >   better to put the "!!"  in the #define, rather than requiring the
> >   user to know that it's needed?
>
> Considering that hap_supported is bool now, the !! can simply be
> dropped. We've been doing so as code was touched anyway, not in a
> concerted effort.

Right, forgot to actually delete this para after adding a patch to address it.

> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -113,8 +113,8 @@ static int cf_check parse_ept_param_runtime(const char *s)
> >      int val;
> >
> >      if ( !cpu_has_vmx_ept || !hvm_funcs.hap_supported ||
> > -         !(hvm_funcs.hap_capabilities &
> > -           (HVM_HAP_SUPERPAGE_2MB | HVM_HAP_SUPERPAGE_1GB)) )
> > +         !(hvm_funcs.caps.hap_superpage_2mb ||
> > +           hvm_funcs.caps.hap_superpage_1gb) )
> >      {
> >          printk("VMX: EPT not available, or not in use - ignoring\n");
>
> Just to mention it: The conditional and the log message don't really
> fit together. (I was first wondering what the 2mb/1gb checks had to
> do here at all, but that's immediately clear when seeing that the
> only sub-option here is "exec-sp".)

So you mean basically that the checks & error message are poorly
factored, because there's only a single sub-option?  (i.e., if there
were options which didn't rely on superpages, the check would be
incorrect?)

Let me know if there's something concrete you'd like me to do here.

> > @@ -104,8 +96,11 @@ struct hvm_function_table {
> >      /* Hardware virtual interrupt delivery enable? */
> >      bool virtual_intr_delivery_enabled;
> >
> > -    /* Indicate HAP capabilities. */
> > -    unsigned int hap_capabilities;
> > +    struct {
> > +        /* Indicate HAP capabilities. */
> > +        bool hap_superpage_1gb:1,
> > +            hap_superpage_2mb:1;
>
> Nit: Would be nice imo if the two identifiers aligned vertically with
> one another.

Done.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/6] svm: Improve type of cpu_has_svm_feature
  2024-02-19 14:03   ` Jan Beulich
@ 2024-02-21  7:09     ` George Dunlap
  0 siblings, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-21  7:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On Mon, Feb 19, 2024 at 10:03 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > The "effective type" of the cpu_has_svm_feature macro is effectively
> > an unsigned log with one bit set (or not); at least one place someone
> > felt compelled to do a !! to make sure that they got a boolean out of
> > it.
> >
> > Ideally the whole of this would be folded into the cpufeature.h
> > infrastructure.  But for now, duplicate the more type-safe static
> > inlines in that file, and remove the !!.
> >
> > Signed-off-by: George Dunlap <george.dunlap@cloud.com>
>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> albeit preferably with ...
>
> > --- a/xen/arch/x86/include/asm/hvm/svm/svm.h
> > +++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
> > @@ -38,7 +38,10 @@ extern u32 svm_feature_flags;
> >  #define SVM_FEATURE_SSS           19 /* NPT Supervisor Shadow Stacks */
> >  #define SVM_FEATURE_SPEC_CTRL     20 /* MSR_SPEC_CTRL virtualisation */
> >
> > -#define cpu_has_svm_feature(f) (svm_feature_flags & (1u << (f)))
> > +static inline bool cpu_has_svm_feature(unsigned int feat)
> > +{
> > +    return svm_feature_flags & (1u << (feat));
>
> ... the inner pair of parentheses dropped.

Done, thanks.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/6] svm: Improve type of cpu_has_svm_feature
  2024-02-19 15:24   ` Jan Beulich
@ 2024-02-21  7:13     ` George Dunlap
  0 siblings, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-21  7:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On Mon, Feb 19, 2024 at 11:24 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > --- a/xen/arch/x86/include/asm/hvm/svm/svm.h
> > +++ b/xen/arch/x86/include/asm/hvm/svm/svm.h
> > @@ -38,7 +38,10 @@ extern u32 svm_feature_flags;
> >  #define SVM_FEATURE_SSS           19 /* NPT Supervisor Shadow Stacks */
> >  #define SVM_FEATURE_SPEC_CTRL     20 /* MSR_SPEC_CTRL virtualisation */
> >
> > -#define cpu_has_svm_feature(f) (svm_feature_flags & (1u << (f)))
> > +static inline bool cpu_has_svm_feature(unsigned int feat)
> > +{
> > +    return svm_feature_flags & (1u << (feat));
> > +}
> >  #define cpu_has_svm_npt       cpu_has_svm_feature(SVM_FEATURE_NPT)
> >  #define cpu_has_svm_lbrv      cpu_has_svm_feature(SVM_FEATURE_LBRV)
> >  #define cpu_has_svm_svml      cpu_has_svm_feature(SVM_FEATURE_SVML)
>
> Having seen patch 4 now, I have to raise the question here as well: Why
> do we need a separate variable (svm_feature_flags) when we could use
> the host policy (provided it isn't abused; see comments on patch 4)?

We certainly don't need an extra variable; in fact, moving all of
these into the host cpuid policy thing would make it easier, for
example, to test some of the guest creation restrictions: One could
use the Xen command-line parameters to disable some of the bits, then
try to create a nested SVM guest and verify that it fails as expected.

I would like to do that eventually, but this patch is already done and
improves the code, so I just kept it.

Let me know if you'd like me to simply drop this patch instead.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield
  2024-02-19 16:08   ` Jan Beulich
@ 2024-02-21  7:21     ` George Dunlap
  0 siblings, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-21  7:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On Tue, Feb 20, 2024 at 12:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > --- a/xen/arch/x86/include/asm/hvm/hvm.h
> > +++ b/xen/arch/x86/include/asm/hvm/hvm.h
> > @@ -86,20 +86,19 @@ struct hvm_vcpu_nonreg_state {
> >  struct hvm_function_table {
> >      const char *name;
> >
> > -    /* Support Hardware-Assisted Paging? */
> > -    bool hap_supported;
> > -
> > -    /* Necessary hardware support for alternate p2m's? */
> > -    bool altp2m_supported;
> > -    bool singlestep_supported;
> > -
> > -    /* Hardware virtual interrupt delivery enable? */
> > -    bool virtual_intr_delivery_enabled;
> > -
> >      struct {
> >          /* Indicate HAP capabilities. */
> > -        bool hap_superpage_1gb:1,
> > -            hap_superpage_2mb:1;
> > +        bool hap:1,
> > +             hap_superpage_1gb:1,
> > +             hap_superpage_2mb:1,
> > +
> > +            /* Altp2m capabilities */
> > +            altp2m:1,
> > +            singlestep:1,
> > +
> > +            /* Hardware virtual interrupt delivery enable? */
> > +            virtual_intr_delivery;
> > +
> >      } caps;
>
> Nit (spotted only while looking at patch 6): You're adding a stray blank
> line at the end of the structure. Further I expect virtual_intr_delivery
> would also want to be just a single bit?

Oh yes, good catch.  (I kind of feel like ":1" should be the default
for bools, but hey...)

I'll fix up this and the 0/1 => false/true thing.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield
  2024-02-21  7:02     ` George Dunlap
@ 2024-02-21  7:23       ` Jan Beulich
  2024-02-21  7:38         ` George Dunlap
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-21  7:23 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On 21.02.2024 08:02, George Dunlap wrote:
> On Mon, Feb 19, 2024 at 9:36 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 06.02.2024 02:20, George Dunlap wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmcs.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
>>> @@ -113,8 +113,8 @@ static int cf_check parse_ept_param_runtime(const char *s)
>>>      int val;
>>>
>>>      if ( !cpu_has_vmx_ept || !hvm_funcs.hap_supported ||
>>> -         !(hvm_funcs.hap_capabilities &
>>> -           (HVM_HAP_SUPERPAGE_2MB | HVM_HAP_SUPERPAGE_1GB)) )
>>> +         !(hvm_funcs.caps.hap_superpage_2mb ||
>>> +           hvm_funcs.caps.hap_superpage_1gb) )
>>>      {
>>>          printk("VMX: EPT not available, or not in use - ignoring\n");
>>
>> Just to mention it: The conditional and the log message don't really
>> fit together. (I was first wondering what the 2mb/1gb checks had to
>> do here at all, but that's immediately clear when seeing that the
>> only sub-option here is "exec-sp".)
> 
> So you mean basically that the checks & error message are poorly
> factored, because there's only a single sub-option?  (i.e., if there
> were options which didn't rely on superpages, the check would be
> incorrect?)

Right.

> Let me know if there's something concrete you'd like me to do here.

Nothing. I meant to express this by starting with "Just to mention it".
If you were eager, you might deal with this right away, but there's no
expectation to this effect. If and when it becomes a problem, it'll
then need sorting. And the message being potentially misleading is,
well, just an annoyance - as with many other things, finding it confusing
will likely lead to looking up where it's issued, then realizing that
what it says isn't what it means.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield
  2024-02-21  7:23       ` Jan Beulich
@ 2024-02-21  7:38         ` George Dunlap
  0 siblings, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-21  7:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, xen-devel

On Wed, Feb 21, 2024 at 3:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.02.2024 08:02, George Dunlap wrote:
> > On Mon, Feb 19, 2024 at 9:36 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 06.02.2024 02:20, George Dunlap wrote:
> >>> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> >>> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> >>> @@ -113,8 +113,8 @@ static int cf_check parse_ept_param_runtime(const char *s)
> >>>      int val;
> >>>
> >>>      if ( !cpu_has_vmx_ept || !hvm_funcs.hap_supported ||
> >>> -         !(hvm_funcs.hap_capabilities &
> >>> -           (HVM_HAP_SUPERPAGE_2MB | HVM_HAP_SUPERPAGE_1GB)) )
> >>> +         !(hvm_funcs.caps.hap_superpage_2mb ||
> >>> +           hvm_funcs.caps.hap_superpage_1gb) )
> >>>      {
> >>>          printk("VMX: EPT not available, or not in use - ignoring\n");
> >>
> >> Just to mention it: The conditional and the log message don't really
> >> fit together. (I was first wondering what the 2mb/1gb checks had to
> >> do here at all, but that's immediately clear when seeing that the
> >> only sub-option here is "exec-sp".)
> >
> > So you mean basically that the checks & error message are poorly
> > factored, because there's only a single sub-option?  (i.e., if there
> > were options which didn't rely on superpages, the check would be
> > incorrect?)
>
> Right.
>
> > Let me know if there's something concrete you'd like me to do here.
>
> Nothing. I meant to express this by starting with "Just to mention it".

Right, and when I said "let me know" I meant, "I'm going to ignore
this unless you say something, feel free to say nothing". :-D

I understood that you weren't asking for anything, but maybe coming
back to this after a few days you'd've had a simple fix.  I wouldn't
mind changing the text of the message, but I didn't feel like finding
a better text.  Reorganizing the checks (which seems closer to the
Right Thing) is off-topic for this patch of course.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-19 15:22   ` Jan Beulich
@ 2024-02-21  8:48     ` George Dunlap
  2024-02-21 10:52       ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-21  8:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On Mon, Feb 19, 2024 at 11:22 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > For now, just disable the functionality entirely until we can
> > implement it properly:
> >
> > - Don't set TSCRATEMSR in the host CPUID policy
>
> This goes too far: This way you would (in principle) also affect guests
> with nesting disabled. According to the earlier parts of the description
> there's also no issue with it in that case. What you want to make sure
> it that in the HVM policy the bit isn't set.
>
> While presently resolving to cpu_has_svm_feature(), I think
> cpu_has_tsc_ratio really ought to resolve to the host policy field.
> Of course then requiring the host policy to reflect reality rather than
> having what is "always emulated". IOW ...
>
> > --- a/xen/arch/x86/cpu-policy.c
> > +++ b/xen/arch/x86/cpu-policy.c
> > @@ -407,8 +407,7 @@ static void __init calculate_host_policy(void)
> >                                 (1u << SVM_FEATURE_PAUSEFILTER) |
> >                                 (1u << SVM_FEATURE_DECODEASSISTS));
> >          /* Enable features which are always emulated. */
> > -        p->extd.raw[0xa].d |= ((1u << SVM_FEATURE_VMCBCLEAN) |
> > -                               (1u << SVM_FEATURE_TSCRATEMSR));
> > +        p->extd.raw[0xa].d |= (1u << SVM_FEATURE_VMCBCLEAN);
>
> ... this likely wants replacing altogether by not overriding what we
> found in hardware, which would apparently mean moving the two bit
> masks to the earlier "clamping" expression.
>
> But then of course Andrew may know of reasons why all of this is done
> in calculate_host_policy() in the first place, rather than in HVM
> policy calculation.

It sounds like maybe you're confusing host_policy with
x86_capabilities?  From what I can tell:

*  the "basic" cpu_has_X macros resolve to boot_cpu_has(), which
resolves to cpu_has(&boot_cpu_data, ...), which is completely
independent of the cpu-policy.c:host_cpu_policy

* cpu-policy.c:host_cpu_policy only affects what is advertised to
guests, via {pv,hvm}_cpu_policy and featureset bits.  Most notably a
quick skim doesn't show any mechanism by which host_cpu_policy could
affect what features Xen itself decides to use.

Not sure exactly why the nested virt stuff is done at the
host_cpu_policy level rather than the hvm_cpu_policy level, but since
that's where it is, that's where we need to change it.

FWIW, as I said in response to your comment on 2/6, it would be nicer
if we moved svm_feature_flags into the "capabilities" section; but
that's a different set of work.

 -George


 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-21  8:48     ` George Dunlap
@ 2024-02-21 10:52       ` Jan Beulich
  2024-02-22  9:30         ` George Dunlap
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-21 10:52 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 21.02.2024 09:48, George Dunlap wrote:
> On Mon, Feb 19, 2024 at 11:22 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 06.02.2024 02:20, George Dunlap wrote:
>>> For now, just disable the functionality entirely until we can
>>> implement it properly:
>>>
>>> - Don't set TSCRATEMSR in the host CPUID policy
>>
>> This goes too far: This way you would (in principle) also affect guests
>> with nesting disabled. According to the earlier parts of the description
>> there's also no issue with it in that case. What you want to make sure
>> it that in the HVM policy the bit isn't set.
>>
>> While presently resolving to cpu_has_svm_feature(), I think
>> cpu_has_tsc_ratio really ought to resolve to the host policy field.
>> Of course then requiring the host policy to reflect reality rather than
>> having what is "always emulated". IOW ...
>>
>>> --- a/xen/arch/x86/cpu-policy.c
>>> +++ b/xen/arch/x86/cpu-policy.c
>>> @@ -407,8 +407,7 @@ static void __init calculate_host_policy(void)
>>>                                 (1u << SVM_FEATURE_PAUSEFILTER) |
>>>                                 (1u << SVM_FEATURE_DECODEASSISTS));
>>>          /* Enable features which are always emulated. */
>>> -        p->extd.raw[0xa].d |= ((1u << SVM_FEATURE_VMCBCLEAN) |
>>> -                               (1u << SVM_FEATURE_TSCRATEMSR));
>>> +        p->extd.raw[0xa].d |= (1u << SVM_FEATURE_VMCBCLEAN);
>>
>> ... this likely wants replacing altogether by not overriding what we
>> found in hardware, which would apparently mean moving the two bit
>> masks to the earlier "clamping" expression.
>>
>> But then of course Andrew may know of reasons why all of this is done
>> in calculate_host_policy() in the first place, rather than in HVM
>> policy calculation.
> 
> It sounds like maybe you're confusing host_policy with
> x86_capabilities?  From what I can tell:
> 
> *  the "basic" cpu_has_X macros resolve to boot_cpu_has(), which
> resolves to cpu_has(&boot_cpu_data, ...), which is completely
> independent of the cpu-policy.c:host_cpu_policy
> 
> * cpu-policy.c:host_cpu_policy only affects what is advertised to
> guests, via {pv,hvm}_cpu_policy and featureset bits.  Most notably a
> quick skim doesn't show any mechanism by which host_cpu_policy could
> affect what features Xen itself decides to use.

I'm not mixing the two, no; the two are still insufficiently disentangled.
There's really no reason (long term) to have both host policy and
x86_capabilities. Therefore I'd prefer if new code (including a basically
fundamental re-write as is going to be needed for nested) to avoid
needlessly further extending x86_capabilities. Unless of course there's
something fundamentally wrong with eliminating the redundancy, which
likely Andrew would be in the best position to point out.

> Not sure exactly why the nested virt stuff is done at the
> host_cpu_policy level rather than the hvm_cpu_policy level, but since
> that's where it is, that's where we need to change it.
> 
> FWIW, as I said in response to your comment on 2/6, it would be nicer
> if we moved svm_feature_flags into the "capabilities" section; but
> that's a different set of work.

Can as well reply here then, rather than there: If the movement from
host to HVM policy was done first, the patch here could more sanely go
on top, and patch 2 could then also go on top, converting the separate
variable to host policy accesses, quite possibly introducing a similar
wrapper as you introduce there right now.

But no, I'm not meaning to make this a requirement; this would merely be
an imo better approach. My ack there stands, in case you want to keep
(and commit) the change as is.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-21 10:52       ` Jan Beulich
@ 2024-02-22  9:30         ` George Dunlap
  2024-02-22  9:50           ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-22  9:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On Wed, Feb 21, 2024 at 6:52 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> But then of course Andrew may know of reasons why all of this is done
> >> in calculate_host_policy() in the first place, rather than in HVM
> >> policy calculation.
> >
> > It sounds like maybe you're confusing host_policy with
> > x86_capabilities?  From what I can tell:
> >
> > *  the "basic" cpu_has_X macros resolve to boot_cpu_has(), which
> > resolves to cpu_has(&boot_cpu_data, ...), which is completely
> > independent of the cpu-policy.c:host_cpu_policy
> >
> > * cpu-policy.c:host_cpu_policy only affects what is advertised to
> > guests, via {pv,hvm}_cpu_policy and featureset bits.  Most notably a
> > quick skim doesn't show any mechanism by which host_cpu_policy could
> > affect what features Xen itself decides to use.
>
> I'm not mixing the two, no; the two are still insufficiently disentangled.
> There's really no reason (long term) to have both host policy and
> x86_capabilities. Therefore I'd prefer if new code (including a basically
> fundamental re-write as is going to be needed for nested) to avoid
> needlessly further extending x86_capabilities. Unless of course there's
> something fundamentally wrong with eliminating the redundancy, which
> likely Andrew would be in the best position to point out.

So I don't know the history of how things got to be the way they are,
nor really much about the code but what I've gathered from skimming
through while creating this patch series.  But from that impression,
the only issue I really see with the current code is the confusing
naming.  The cpufeature.h code has this nice infrastructure to allow
you to, for instance, enable or disable certain bits on the
command-line; and the interface for querying all the different bits of
functionality is all nicely put in one place.  Moving the
svm_feature_flags into x86_capabilities would immediately allow SVM to
take advantage of this infrastructure; it's not clear to me how this
would be "needless".

Furthermore, it looks to me like host_cpu_policy is used as a starting
point for generating pv_cpu_policy and hvm_cpu_policy, both of which
are only used for guest cpuid generation.  Given that the format of
those policies is fixed, and there's a lot of "copy this bit from the
host policy wholesale", it seems like no matter what, you'd want a
host_cpu_policy.

And in any case -- all that is kind of moot.  *Right now*,
host_cpu_policy is only used for guest cpuid policy creation; *right
now*, the nested virt features of AMD are handled in the
host_cpu_policy; *right now*, we're advertising to guests bits which
are not properly virtualized; *right now* these bits are actually set
unconditionally, regardless of whether they're even available on the
hardware; *right now*, Xen uses svm_feature_flags to determine its own
use of TscRateMSR; so *right now*, removing this bit from
host_cpu_policy won't prevent Xen from using TscRateMSR itself.

(Unless my understanding of the code is wrong, in which case I'd
appreciate a correction.)

If at some point in the future x86_capabilities and host_cpu_policy
were merged somehow, whoever did the merging would have to untangle
the twiddling of these bits anyway.  What I'm changing in this patch
wouldn't make that any harder.

> > Not sure exactly why the nested virt stuff is done at the
> > host_cpu_policy level rather than the hvm_cpu_policy level, but since
> > that's where it is, that's where we need to change it.
> >
> > FWIW, as I said in response to your comment on 2/6, it would be nicer
> > if we moved svm_feature_flags into the "capabilities" section; but
> > that's a different set of work.
>
> Can as well reply here then, rather than there: If the movement from
> host to HVM policy was done first, the patch here could more sanely go
> on top, and patch 2 could then also go on top, converting the separate
> variable to host policy accesses, quite possibly introducing a similar
> wrapper as you introduce there right now.
>
> But no, I'm not meaning to make this a requirement; this would merely be
> an imo better approach. My ack there stands, in case you want to keep
> (and commit) the change as is.

I mean, I don't mind doing a bit more prep work, if I know that's the
direction we want to go in.  "Actually, since you're doing a bit of
clean-up anyway -- right now host_cpu_policy is only used to calculate
guest policy, but we'd like to move over to being the Source of Truth
for the host instead of x86_capabilities.  While you're here, would
you mind moving the nested virt policy stuff into hvm_cpu_policy
instead?"

I certainly wouldn't want to move svm_feature_flags into
host_cpu_policy while it's still got random other "guest-only" policy
bits in it; and auditing it for such policy bits is out of the scope
of this work.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-22  9:30         ` George Dunlap
@ 2024-02-22  9:50           ` Jan Beulich
  2024-02-22 11:00             ` George Dunlap
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2024-02-22  9:50 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 22.02.2024 10:30, George Dunlap wrote:
> On Wed, Feb 21, 2024 at 6:52 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> But then of course Andrew may know of reasons why all of this is done
>>>> in calculate_host_policy() in the first place, rather than in HVM
>>>> policy calculation.
>>>
>>> It sounds like maybe you're confusing host_policy with
>>> x86_capabilities?  From what I can tell:
>>>
>>> *  the "basic" cpu_has_X macros resolve to boot_cpu_has(), which
>>> resolves to cpu_has(&boot_cpu_data, ...), which is completely
>>> independent of the cpu-policy.c:host_cpu_policy
>>>
>>> * cpu-policy.c:host_cpu_policy only affects what is advertised to
>>> guests, via {pv,hvm}_cpu_policy and featureset bits.  Most notably a
>>> quick skim doesn't show any mechanism by which host_cpu_policy could
>>> affect what features Xen itself decides to use.
>>
>> I'm not mixing the two, no; the two are still insufficiently disentangled.
>> There's really no reason (long term) to have both host policy and
>> x86_capabilities. Therefore I'd prefer if new code (including a basically
>> fundamental re-write as is going to be needed for nested) to avoid
>> needlessly further extending x86_capabilities. Unless of course there's
>> something fundamentally wrong with eliminating the redundancy, which
>> likely Andrew would be in the best position to point out.
> 
> So I don't know the history of how things got to be the way they are,
> nor really much about the code but what I've gathered from skimming
> through while creating this patch series.  But from that impression,
> the only issue I really see with the current code is the confusing
> naming.  The cpufeature.h code has this nice infrastructure to allow
> you to, for instance, enable or disable certain bits on the
> command-line; and the interface for querying all the different bits of
> functionality is all nicely put in one place.  Moving the
> svm_feature_flags into x86_capabilities would immediately allow SVM to
> take advantage of this infrastructure; it's not clear to me how this
> would be "needless".
> 
> Furthermore, it looks to me like host_cpu_policy is used as a starting
> point for generating pv_cpu_policy and hvm_cpu_policy, both of which
> are only used for guest cpuid generation.  Given that the format of
> those policies is fixed, and there's a lot of "copy this bit from the
> host policy wholesale", it seems like no matter what, you'd want a
> host_cpu_policy.
> 
> And in any case -- all that is kind of moot.  *Right now*,
> host_cpu_policy is only used for guest cpuid policy creation; *right
> now*, the nested virt features of AMD are handled in the
> host_cpu_policy; *right now*, we're advertising to guests bits which
> are not properly virtualized; *right now* these bits are actually set
> unconditionally, regardless of whether they're even available on the
> hardware; *right now*, Xen uses svm_feature_flags to determine its own
> use of TscRateMSR; so *right now*, removing this bit from
> host_cpu_policy won't prevent Xen from using TscRateMSR itself.
> 
> (Unless my understanding of the code is wrong, in which case I'd
> appreciate a correction.)

There's nothing wrong afaics, just missing at least one aspect: Did you
see all the featureset <-> policy conversions in cpu-policy.c? That (to
me at least) clearly is a sign of unnecessary duplication of the same
data. This goes as far as seeding the host policy from the raw one, just
to then immediately run x86_cpu_featureset_to_policy(), thus overwriting
a fair part of what was first taken from the raw policy. That's necessary
right now, because setup_{force,clear}_cpu_cap() act on
boot_cpu_data.x86_capability[], not the host policy.

As to the "needless" further up, it's only as far as moving those bits
into x86_capability[] would further duplicate information, rather than
(for that piece at least) putting them into the policies right away. But
yes, if the goal is to have setup_{force,clear}_cpu_cap() be able to
control those bits as well, then going the intermediate step would be
unavoidable at this point in time.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-22  9:50           ` Jan Beulich
@ 2024-02-22 11:00             ` George Dunlap
  2024-02-22 11:26               ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-22 11:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On Thu, Feb 22, 2024 at 5:50 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 22.02.2024 10:30, George Dunlap wrote:
> > On Wed, Feb 21, 2024 at 6:52 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> But then of course Andrew may know of reasons why all of this is done
> >>>> in calculate_host_policy() in the first place, rather than in HVM
> >>>> policy calculation.
> >>>
> >>> It sounds like maybe you're confusing host_policy with
> >>> x86_capabilities?  From what I can tell:
> >>>
> >>> *  the "basic" cpu_has_X macros resolve to boot_cpu_has(), which
> >>> resolves to cpu_has(&boot_cpu_data, ...), which is completely
> >>> independent of the cpu-policy.c:host_cpu_policy
> >>>
> >>> * cpu-policy.c:host_cpu_policy only affects what is advertised to
> >>> guests, via {pv,hvm}_cpu_policy and featureset bits.  Most notably a
> >>> quick skim doesn't show any mechanism by which host_cpu_policy could
> >>> affect what features Xen itself decides to use.
> >>
> >> I'm not mixing the two, no; the two are still insufficiently disentangled.
> >> There's really no reason (long term) to have both host policy and
> >> x86_capabilities. Therefore I'd prefer if new code (including a basically
> >> fundamental re-write as is going to be needed for nested) to avoid
> >> needlessly further extending x86_capabilities. Unless of course there's
> >> something fundamentally wrong with eliminating the redundancy, which
> >> likely Andrew would be in the best position to point out.
> >
> > So I don't know the history of how things got to be the way they are,
> > nor really much about the code but what I've gathered from skimming
> > through while creating this patch series.  But from that impression,
> > the only issue I really see with the current code is the confusing
> > naming.  The cpufeature.h code has this nice infrastructure to allow
> > you to, for instance, enable or disable certain bits on the
> > command-line; and the interface for querying all the different bits of
> > functionality is all nicely put in one place.  Moving the
> > svm_feature_flags into x86_capabilities would immediately allow SVM to
> > take advantage of this infrastructure; it's not clear to me how this
> > would be "needless".
> >
> > Furthermore, it looks to me like host_cpu_policy is used as a starting
> > point for generating pv_cpu_policy and hvm_cpu_policy, both of which
> > are only used for guest cpuid generation.  Given that the format of
> > those policies is fixed, and there's a lot of "copy this bit from the
> > host policy wholesale", it seems like no matter what, you'd want a
> > host_cpu_policy.
> >
> > And in any case -- all that is kind of moot.  *Right now*,
> > host_cpu_policy is only used for guest cpuid policy creation; *right
> > now*, the nested virt features of AMD are handled in the
> > host_cpu_policy; *right now*, we're advertising to guests bits which
> > are not properly virtualized; *right now* these bits are actually set
> > unconditionally, regardless of whether they're even available on the
> > hardware; *right now*, Xen uses svm_feature_flags to determine its own
> > use of TscRateMSR; so *right now*, removing this bit from
> > host_cpu_policy won't prevent Xen from using TscRateMSR itself.
> >
> > (Unless my understanding of the code is wrong, in which case I'd
> > appreciate a correction.)
>
> There's nothing wrong afaics, just missing at least one aspect: Did you
> see all the featureset <-> policy conversions in cpu-policy.c? That (to
> me at least) clearly is a sign of unnecessary duplication of the same
> data. This goes as far as seeding the host policy from the raw one, just
> to then immediately run x86_cpu_featureset_to_policy(), thus overwriting
> a fair part of what was first taken from the raw policy. That's necessary
> right now, because setup_{force,clear}_cpu_cap() act on
> boot_cpu_data.x86_capability[], not the host policy.
>
> As to the "needless" further up, it's only as far as moving those bits
> into x86_capability[] would further duplicate information, rather than
> (for that piece at least) putting them into the policies right away. But
> yes, if the goal is to have setup_{force,clear}_cpu_cap() be able to
> control those bits as well, then going the intermediate step would be
> unavoidable at this point in time.

I'm still not sure of what needs to happen to move this forward.

As I said, I'm not opposed to doing some prep work; but I don't want
to randomly guess as to what kinds of clean-up needs to be done, only
to be told it was wrong (either by you when I post it or by Andy
sometime after it's checked in).

I could certainly move svm_feature_flags into host_cpu_policy, and
have cpu_svm_feature_* reference host_cpu_policy instead (after moving
the nested virt "guest policy" tweaks into hvm_cpu_policy); but as far
as I can tell, that would be the *very first* instance of Xen using
host_cpu_policy in that manner.  I'd like more clarity that this is
the long-term direction that things are going before then.

If you (plural) don't have time now to refresh your memory / make an
informed decision about what you want to happen, then please consider
just taking the patch as it is; it doesn't make future changes any
harder.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/6] nestedsvm: Disable TscRateMSR
  2024-02-22 11:00             ` George Dunlap
@ 2024-02-22 11:26               ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2024-02-22 11:26 UTC (permalink / raw)
  To: George Dunlap; +Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, xen-devel

On 22.02.2024 12:00, George Dunlap wrote:
> On Thu, Feb 22, 2024 at 5:50 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 22.02.2024 10:30, George Dunlap wrote:
>>> On Wed, Feb 21, 2024 at 6:52 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> But then of course Andrew may know of reasons why all of this is done
>>>>>> in calculate_host_policy() in the first place, rather than in HVM
>>>>>> policy calculation.
>>>>>
>>>>> It sounds like maybe you're confusing host_policy with
>>>>> x86_capabilities?  From what I can tell:
>>>>>
>>>>> *  the "basic" cpu_has_X macros resolve to boot_cpu_has(), which
>>>>> resolves to cpu_has(&boot_cpu_data, ...), which is completely
>>>>> independent of the cpu-policy.c:host_cpu_policy
>>>>>
>>>>> * cpu-policy.c:host_cpu_policy only affects what is advertised to
>>>>> guests, via {pv,hvm}_cpu_policy and featureset bits.  Most notably a
>>>>> quick skim doesn't show any mechanism by which host_cpu_policy could
>>>>> affect what features Xen itself decides to use.
>>>>
>>>> I'm not mixing the two, no; the two are still insufficiently disentangled.
>>>> There's really no reason (long term) to have both host policy and
>>>> x86_capabilities. Therefore I'd prefer if new code (including a basically
>>>> fundamental re-write as is going to be needed for nested) to avoid
>>>> needlessly further extending x86_capabilities. Unless of course there's
>>>> something fundamentally wrong with eliminating the redundancy, which
>>>> likely Andrew would be in the best position to point out.
>>>
>>> So I don't know the history of how things got to be the way they are,
>>> nor really much about the code but what I've gathered from skimming
>>> through while creating this patch series.  But from that impression,
>>> the only issue I really see with the current code is the confusing
>>> naming.  The cpufeature.h code has this nice infrastructure to allow
>>> you to, for instance, enable or disable certain bits on the
>>> command-line; and the interface for querying all the different bits of
>>> functionality is all nicely put in one place.  Moving the
>>> svm_feature_flags into x86_capabilities would immediately allow SVM to
>>> take advantage of this infrastructure; it's not clear to me how this
>>> would be "needless".
>>>
>>> Furthermore, it looks to me like host_cpu_policy is used as a starting
>>> point for generating pv_cpu_policy and hvm_cpu_policy, both of which
>>> are only used for guest cpuid generation.  Given that the format of
>>> those policies is fixed, and there's a lot of "copy this bit from the
>>> host policy wholesale", it seems like no matter what, you'd want a
>>> host_cpu_policy.
>>>
>>> And in any case -- all that is kind of moot.  *Right now*,
>>> host_cpu_policy is only used for guest cpuid policy creation; *right
>>> now*, the nested virt features of AMD are handled in the
>>> host_cpu_policy; *right now*, we're advertising to guests bits which
>>> are not properly virtualized; *right now* these bits are actually set
>>> unconditionally, regardless of whether they're even available on the
>>> hardware; *right now*, Xen uses svm_feature_flags to determine its own
>>> use of TscRateMSR; so *right now*, removing this bit from
>>> host_cpu_policy won't prevent Xen from using TscRateMSR itself.
>>>
>>> (Unless my understanding of the code is wrong, in which case I'd
>>> appreciate a correction.)
>>
>> There's nothing wrong afaics, just missing at least one aspect: Did you
>> see all the featureset <-> policy conversions in cpu-policy.c? That (to
>> me at least) clearly is a sign of unnecessary duplication of the same
>> data. This goes as far as seeding the host policy from the raw one, just
>> to then immediately run x86_cpu_featureset_to_policy(), thus overwriting
>> a fair part of what was first taken from the raw policy. That's necessary
>> right now, because setup_{force,clear}_cpu_cap() act on
>> boot_cpu_data.x86_capability[], not the host policy.
>>
>> As to the "needless" further up, it's only as far as moving those bits
>> into x86_capability[] would further duplicate information, rather than
>> (for that piece at least) putting them into the policies right away. But
>> yes, if the goal is to have setup_{force,clear}_cpu_cap() be able to
>> control those bits as well, then going the intermediate step would be
>> unavoidable at this point in time.
> 
> I'm still not sure of what needs to happen to move this forward.
> 
> As I said, I'm not opposed to doing some prep work; but I don't want
> to randomly guess as to what kinds of clean-up needs to be done, only
> to be told it was wrong (either by you when I post it or by Andy
> sometime after it's checked in).
> 
> I could certainly move svm_feature_flags into host_cpu_policy, and
> have cpu_svm_feature_* reference host_cpu_policy instead (after moving
> the nested virt "guest policy" tweaks into hvm_cpu_policy); but as far
> as I can tell, that would be the *very first* instance of Xen using
> host_cpu_policy in that manner.  I'd like more clarity that this is
> the long-term direction that things are going before then.
> 
> If you (plural) don't have time now to refresh your memory / make an
> informed decision about what you want to happen, then please consider
> just taking the patch as it is; it doesn't make future changes any
> harder.

I'd be willing to ack the patch as is if I understood the piece of code
in calculate_host_policy() that is being changed here. My take is that
in a prereq patch that wants adjusting, by moving it wherever applicable.
Without indications to the contrary, it living there is plain wrong, and
the patch here touching that function isn't quite right either, even if
only as a result of the original issue.

If without at least a little bit of input from Andrew we can't move
forward, then that's the way it is - we need to wait. As you likely know
I have dozens of patches in that state ...

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts
  2024-02-19 15:56   ` Jan Beulich
@ 2024-02-22 12:58     ` George Dunlap
  0 siblings, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-02-22 12:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Mon, Feb 19, 2024 at 11:56 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > Changeset ef3e8db8068 ("x86/hvm: Corrections and improvements to
> > unhandled vmexit logging") introduced a printk to the default path of
> > the switch statement in nestedsvm_check_intercepts(), complaining of
> > an unknown exit reason.
> >
> > Unfortunately, the "core" switch statement which is meant to handle
> > all vmexit reasons is in nsvm_vmcb_guest_intercepts_exitcode(); the
> > switch statement in nestedsvm_check_intercepts() is only meant to
> > superimpose on top of that some special-casing for how to interaction
> > between L1 and L0 vmexits.
> >
> > Remove the printk, and add a comment to prevent future confusion.
> >
> > Signed-off-by: George Dunlap <george.dunlap@cloud.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> I wonder if a Fixes: tag is warranted here.

Yes, I think probably so.


 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit
  2024-02-19 16:25   ` Jan Beulich
@ 2024-02-22 14:33     ` George Dunlap
  2024-02-22 15:05       ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: George Dunlap @ 2024-02-22 14:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	Jun Nakajima, Kevin Tian, xen-devel

On Tue, Feb 20, 2024 at 12:25 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 06.02.2024 02:20, George Dunlap wrote:
> > --- /dev/null
> > +++ b/docs/designs/nested-svm-cpu-features.md
> > @@ -0,0 +1,110 @@
> > +# Nested SVM (AMD) CPUID requirements
> > +
> > +The first step in making nested SVM production-ready is to make sure
> > +that all features are implemented and well-tested.  To make this
> > +tractable, we will initially be limiting the "supported" range of
> > +nested virt to a specific subset of host and guest features.  This
> > +document describes the criteria for deciding on features, and the
> > +rationale behind each feature.
> > +
> > +For AMD, all virtualization-related features can be found in CPUID
> > +leaf 8000000A:edx
> > +
> > +# Criteria
> > +
> > +- Processor support: At a minimum we want to support processors from
> > +  the last 5 years.  All things being equal, older processors are
> > +  better.
>
> Nit: Perhaps missing "covering"? Generally I hope newer processors are
> "better".

I've reworded it.

> > +  For L0 this is required for performance: There's no way to tell the
> > +  guests not to use the LBR-related registers; and if the guest does,
> > +  then you have to save and restore all LBR-related registers on
> > +  context switch, which is prohibitive.
>
> "prohibitive" is too strong imo; maybe "undesirable"?

Prohibitive sounds strong, but I don't think it really is; here's the
dictionary definition:

"(of a price or charge) so high as to prevent something being done or bought."

The cost of saving the registers on every context switch is so high as
to prevent us from wanting to do it.

I could change it to "too expensive".

> > --- a/xen/arch/x86/hvm/svm/nestedhvm.h
> > +++ b/xen/arch/x86/hvm/svm/nestedhvm.h
> > @@ -35,6 +35,7 @@ enum nestedhvm_vmexits
> >  nestedsvm_check_intercepts(struct vcpu *v, struct cpu_user_regs *regs,
> >      uint64_t exitcode);
> >  void svm_nested_features_on_efer_update(struct vcpu *v);
> > +void __init start_nested_svm(struct hvm_function_table *svm_function_table);
>
> No section placement attributes on declarations, please.

Ack.

> > --- a/xen/arch/x86/hvm/svm/nestedsvm.c
> > +++ b/xen/arch/x86/hvm/svm/nestedsvm.c
> > @@ -1666,3 +1666,17 @@ void svm_nested_features_on_efer_update(struct vcpu *v)
> >          }
> >      }
> >  }
> > +
> > +void __init start_nested_svm(struct hvm_function_table *svm_function_table)
> > +{
> > +    /*
> > +     * Required host functionality to support nested virt.  See
> > +     * docs/designs/nested-svm-cpu-features.md for rationale.
> > +     */
> > +    svm_function_table->caps.nested_virt =
> > +        cpu_has_svm_nrips &&
> > +        cpu_has_svm_lbrv &&
> > +        cpu_has_svm_nrips &&
>
> nrips twice? Was the earlier one meant to be npt?

Er, yes exactly.



> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -3021,6 +3021,9 @@ const struct hvm_function_table * __init start_vmx(void)
> >      if ( cpu_has_vmx_tsc_scaling )
> >          vmx_function_table.tsc_scaling.ratio_frac_bits = 48;
> >
> > +    /* TODO: Require hardware support before enabling nested virt */
> > +    vmx_function_table.caps.nested_virt = vmx_function_table.caps.hap;
>
> This won't have the intended effect if hap_supported() ends up clearing
> the bit (used as input here) due to a command line option override. I
> wonder if instead this wants doing e.g. in a new hook to be called from
> nestedhvm_setup(). On the SVM side the hook function would then be the
> start_nested_svm() that you already introduce, with a caps.hap check
> added.

I take it presmp_initcall()'s are guaranteed to run before __initcall()'s?

If so that's probably a good idea.

> Since you leave the other nested-related if() in place in
> arch_sanitise_domain_config(), all ought to be well, but I think that
> other if() really wants replacing by the one you presently add.

Ack.

I'll probably check in patches 1,2,3, and 5, and resend the other two,
unless you'd like to see all the changes again...

 -George




 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit
  2024-02-22 14:33     ` George Dunlap
@ 2024-02-22 15:05       ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2024-02-22 15:05 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	Jun Nakajima, Kevin Tian, xen-devel

On 22.02.2024 15:33, George Dunlap wrote:
> On Tue, Feb 20, 2024 at 12:25 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 06.02.2024 02:20, George Dunlap wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -3021,6 +3021,9 @@ const struct hvm_function_table * __init start_vmx(void)
>>>      if ( cpu_has_vmx_tsc_scaling )
>>>          vmx_function_table.tsc_scaling.ratio_frac_bits = 48;
>>>
>>> +    /* TODO: Require hardware support before enabling nested virt */
>>> +    vmx_function_table.caps.nested_virt = vmx_function_table.caps.hap;
>>
>> This won't have the intended effect if hap_supported() ends up clearing
>> the bit (used as input here) due to a command line option override. I
>> wonder if instead this wants doing e.g. in a new hook to be called from
>> nestedhvm_setup(). On the SVM side the hook function would then be the
>> start_nested_svm() that you already introduce, with a caps.hap check
>> added.
> 
> I take it presmp_initcall()'s are guaranteed to run before __initcall()'s?

Yes - the former happen ahead of AP bringup, the latter after.

>> Since you leave the other nested-related if() in place in
>> arch_sanitise_domain_config(), all ought to be well, but I think that
>> other if() really wants replacing by the one you presently add.
> 
> Ack.
> 
> I'll probably check in patches 1,2,3, and 5, and resend the other two,
> unless you'd like to see all the changes again...

No need imo to re-post anything that was agreed upon.

Jan


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts
  2024-02-06  1:20 ` [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts George Dunlap
  2024-02-19 15:56   ` Jan Beulich
@ 2024-02-26 16:47   ` Andrew Cooper
  2024-03-04  9:50     ` George Dunlap
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2024-02-26 16:47 UTC (permalink / raw)
  To: George Dunlap, xen-devel

On 06/02/2024 1:20 am, George Dunlap wrote:
> Changeset ef3e8db8068 ("x86/hvm: Corrections and improvements to
> unhandled vmexit logging") introduced a printk to the default path of
> the switch statement in nestedsvm_check_intercepts(), complaining of
> an unknown exit reason.
>
> Unfortunately, the "core" switch statement which is meant to handle
> all vmexit reasons is in nsvm_vmcb_guest_intercepts_exitcode(); the
> switch statement in nestedsvm_check_intercepts() is only meant to
> superimpose on top of that some special-casing for how to interaction
> between L1 and L0 vmexits.
>
> Remove the printk, and add a comment to prevent future confusion.
>
> Signed-off-by: George Dunlap <george.dunlap@cloud.com>

Erm...   The addition of this printk was very deliberate, to point out
where security fixes are needed.

It's not bogus in the slightest.  It is an error for exit reasons to not
be inspected for safety in this path.

Please revert this patch.

~Andrew


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts
  2024-02-26 16:47   ` Andrew Cooper
@ 2024-03-04  9:50     ` George Dunlap
  0 siblings, 0 replies; 33+ messages in thread
From: George Dunlap @ 2024-03-04  9:50 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

On Tue, Feb 27, 2024 at 12:47 AM Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
>
> On 06/02/2024 1:20 am, George Dunlap wrote:
> > Changeset ef3e8db8068 ("x86/hvm: Corrections and improvements to
> > unhandled vmexit logging") introduced a printk to the default path of
> > the switch statement in nestedsvm_check_intercepts(), complaining of
> > an unknown exit reason.
> >
> > Unfortunately, the "core" switch statement which is meant to handle
> > all vmexit reasons is in nsvm_vmcb_guest_intercepts_exitcode(); the
> > switch statement in nestedsvm_check_intercepts() is only meant to
> > superimpose on top of that some special-casing for how to interaction
> > between L1 and L0 vmexits.
> >
> > Remove the printk, and add a comment to prevent future confusion.
> >
> > Signed-off-by: George Dunlap <george.dunlap@cloud.com>
>
> Erm...   The addition of this printk was very deliberate, to point out
> where security fixes are needed.
>
> It's not bogus in the slightest.  It is an error for exit reasons to not
> be inspected for safety in this path.

I'm a bit at a loss how to respond to this.  As I wrote above, exit
reasons are inspected for safety in this path -- in
nsvm_vmcb_guest_intercepts_exitcode().  If you want the patch
reverted, you'll have to explain why that's not sufficient.

 -George


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2024-03-04  9:50 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-06  1:20 [PATCH 0/6] AMD Nested Virt Preparation George Dunlap
2024-02-06  1:20 ` [PATCH 1/6] xen/hvm: Convert hap_capabilities into a bitfield George Dunlap
2024-02-19 13:36   ` Jan Beulich
2024-02-21  7:02     ` George Dunlap
2024-02-21  7:23       ` Jan Beulich
2024-02-21  7:38         ` George Dunlap
2024-02-06  1:20 ` [PATCH 2/6] svm: Improve type of cpu_has_svm_feature George Dunlap
2024-02-19 14:03   ` Jan Beulich
2024-02-21  7:09     ` George Dunlap
2024-02-19 15:24   ` Jan Beulich
2024-02-21  7:13     ` George Dunlap
2024-02-06  1:20 ` [PATCH 3/6] xen/hvm: Move other hvm_function_table booleans into the caps bitfield George Dunlap
2024-02-19 14:39   ` Jan Beulich
2024-02-19 16:08   ` Jan Beulich
2024-02-21  7:21     ` George Dunlap
2024-02-06  1:20 ` [PATCH 4/6] nestedsvm: Disable TscRateMSR George Dunlap
2024-02-19 15:22   ` Jan Beulich
2024-02-21  8:48     ` George Dunlap
2024-02-21 10:52       ` Jan Beulich
2024-02-22  9:30         ` George Dunlap
2024-02-22  9:50           ` Jan Beulich
2024-02-22 11:00             ` George Dunlap
2024-02-22 11:26               ` Jan Beulich
2024-02-06  1:20 ` [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts George Dunlap
2024-02-19 15:56   ` Jan Beulich
2024-02-22 12:58     ` George Dunlap
2024-02-26 16:47   ` Andrew Cooper
2024-03-04  9:50     ` George Dunlap
2024-02-06  1:20 ` [PATCH 6/6] svm/nestedvm: Introduce nested capabilities bit George Dunlap
2024-02-06  1:34   ` George Dunlap
2024-02-19 16:25   ` Jan Beulich
2024-02-22 14:33     ` George Dunlap
2024-02-22 15:05       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.