All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Permit fault-less access to non-emulated MSRs
@ 2021-01-20 22:49 Boris Ostrovsky
  2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
                   ` (3 more replies)
  0 siblings, 4 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-20 22:49 UTC (permalink / raw)
  To: xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian, boris.ostrovsky


Disallowing accesses to MSRs that are not explicitly handled (done by
commit 84e848fd7a16 ("x86/hvm: disallow access to unknown MSRs")) caused
regression for Solaris guests who access MSR_RAPL_POWER_UNIT assuming
it's always there.

We can add special handling for this and other RAPL registers (and that's
what should happen for pre-4.14 releases) but a more general solution is to
provide an option to allow accesses to unhandled registers to proceed.

The new option is "ignore_msrs" and it can take three values
    never:   Issue a warning to the log and #GP to the guest. This is default.
    silent:  MSR reads return 0, MSR writes are ignored. No warnings to the log.
    verbose: Similar to silent but a warning is written.

v2:
* pass ignore_msrs in msr_policy's value filed, not flags
* use 0x400002ff as unhandled MSR
* make sure MSR policy is only consulted for guest's MSR acceses


Boris Ostrovsky (4):
  xl: Add support for ignore_msrs option
  x86: Introduce MSR_UNHANDLED
  x86: Allow non-faulting accesses to non-emulated MSRs if policy
    permits this
  tools/libs: Apply MSR policy to a guest

 docs/man/xl.cfg.5.pod.in               |  20 +++++-
 tools/include/xenctrl.h                |   2 +
 tools/libs/guest/Makefile              |   1 +
 tools/libs/guest/xg_msrs_x86.c         | 110 +++++++++++++++++++++++++++++++++
 tools/libs/light/libxl_dom.c           |   5 +-
 tools/libs/light/libxl_internal.h      |   2 +
 tools/libs/light/libxl_types.idl       |   7 +++
 tools/libs/light/libxl_x86.c           |   7 +++
 tools/xl/xl_parse.c                    |   7 +++
 xen/arch/x86/hvm/svm/svm.c             |  10 ++-
 xen/arch/x86/hvm/vmx/vmx.c             |  10 ++-
 xen/arch/x86/msr.c                     |  32 +++++++++-
 xen/arch/x86/pv/emul-priv-op.c         |  10 +--
 xen/arch/x86/x86_emulate/x86_emulate.h |   6 ++
 xen/include/asm-x86/msr.h              |   3 +
 xen/include/xen/lib/x86/msr.h          |  17 ++++-
 xen/lib/x86/msr.c                      |   2 +
 17 files changed, 229 insertions(+), 22 deletions(-)
 create mode 100644 tools/libs/guest/xg_msrs_x86.c

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-20 22:49 [PATCH v2 0/4] Permit fault-less access to non-emulated MSRs Boris Ostrovsky
@ 2021-01-20 22:49 ` Boris Ostrovsky
  2021-01-21 14:56   ` Wei Liu
                     ` (2 more replies)
  2021-01-20 22:49 ` [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED Boris Ostrovsky
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-20 22:49 UTC (permalink / raw)
  To: xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian, boris.ostrovsky

This option allows guest administrator specify what should happen when
guest accesses an MSR which is not explicitly emulated by the hypervisor.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
 tools/libs/light/libxl_types.idl |  7 +++++++
 tools/xl/xl_parse.c              |  7 +++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index c8e017f950de..96ce97c42cab 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
 See also "Virtual Machine Generation ID" by Microsoft:
 L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
 
-=back 
+=over
+
+=item B<ignore_msrs="STRING">
+
+Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.
+
+=over 4
+
+=item B<never>
+
+Issue a warning to the log and #GP to the guest. This is default.
+
+=item B<silent>
+
+MSR reads return 0, MSR writes are ignored. No warnings to the log.
+
+=item B<verbose>
+
+Similar to B<silent> but a warning is written.
 
 =head3 Guest Virtual Time Controls
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 05324736b744..7b5fef771ee8 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -477,6 +477,12 @@ libxl_tee_type = Enumeration("tee_type", [
     (1, "optee")
     ], init_val = "LIBXL_TEE_TYPE_NONE")
 
+libxl_ignore_msrs = Enumeration("ignore_msrs", [
+    (0, "never"),
+    (1, "silent"),
+    (2, "verbose"),
+    ], init_val = "LIBXL_IGNORE_MSRS_NEVER")
+
 libxl_rdm_reserve = Struct("rdm_reserve", [
     ("strategy",    libxl_rdm_reserve_strategy),
     ("policy",      libxl_rdm_reserve_policy),
@@ -559,6 +565,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("apic",             libxl_defbool),
     ("dm_restrict",      libxl_defbool),
     ("tee",              libxl_tee_type),
+    ("ignore_msrs",      libxl_ignore_msrs),
     ("u", KeyedUnion(None, libxl_domain_type, "type",
                 [("hvm", Struct(None, [("firmware",         string),
                                        ("bios",             libxl_bios_type),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 4ebf39620ae7..942086c3f41d 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -2732,6 +2732,13 @@ skip_usbdev:
         }
     }
 
+    if (!xlu_cfg_get_string(config, "ignore_msrs", &buf, 0)) {
+        if (libxl_ignore_msrs_from_string(buf, &b_info->ignore_msrs)) {
+           fprintf(stderr, "ERROR: invalid value \"%s\" for \"ignore_msrs\"\n", buf);
+           exit(1);
+        }
+    }
+
     parse_vkb_list(config, d_config);
 
     xlu_cfg_get_defbool(config, "xend_suspend_evtchn_compat",
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-01-20 22:49 [PATCH v2 0/4] Permit fault-less access to non-emulated MSRs Boris Ostrovsky
  2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
@ 2021-01-20 22:49 ` Boris Ostrovsky
  2021-01-22 11:51   ` Jan Beulich
  2021-02-18 10:51   ` Roger Pau Monné
  2021-01-20 22:49 ` [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this Boris Ostrovsky
  2021-01-20 22:49 ` [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest Boris Ostrovsky
  3 siblings, 2 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-20 22:49 UTC (permalink / raw)
  To: xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian, boris.ostrovsky

When toolstack updates MSR policy, this MSR offset (which is the last
index in the hypervisor MSR range) is used to indicate hypervisor
behavior when guest accesses an MSR which is not explicitly emulated.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
Changes in v2:
* Use 0x400002ff for MSR_UNHANDLED
* Pass ignore_msrs in msr_policy.value

 xen/arch/x86/msr.c            |  4 ++--
 xen/include/xen/lib/x86/msr.h | 17 ++++++++++++++++-
 xen/lib/x86/msr.c             |  2 ++
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index be8e36386250..433d16c80728 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -295,7 +295,7 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         }
 
         /* Fallthrough. */
-    case 0x40000200 ... 0x400002ff:
+    case 0x40000200 ... 0x400002fe:
         ret = guest_rdmsr_xen(v, msr, val);
         break;
 
@@ -514,7 +514,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         }
 
         /* Fallthrough. */
-    case 0x40000200 ... 0x400002ff:
+    case 0x40000200 ... 0x400002fe:
         ret = guest_wrmsr_xen(v, msr, val);
         break;
 
diff --git a/xen/include/xen/lib/x86/msr.h b/xen/include/xen/lib/x86/msr.h
index 48ba4a59c036..fbbb3b7ba870 100644
--- a/xen/include/xen/lib/x86/msr.h
+++ b/xen/include/xen/lib/x86/msr.h
@@ -2,8 +2,21 @@
 #ifndef XEN_LIB_X86_MSR_H
 #define XEN_LIB_X86_MSR_H
 
+/*
+ * Behavior on accesses to MSRs that are not handled by emulation:
+ *  0 = return #GP, warning emitted
+ *  1 = read as 0, writes are dropped, no warning
+ *  2 = read as 0, writes are dropped, warning emitted
+ */
+#define MSR_UNHANDLED_NEVER     0
+#define MSR_UNHANDLED_SILENT    1
+#define MSR_UNHANDLED_VERBOSE   2
+
+/* MSR that is not explicitly processed by emulation */
+#define MSR_UNHANDLED           0x400002ff
+
 /* Maximum number of MSRs written when serialising msr_policy. */
-#define MSR_MAX_SERIALISED_ENTRIES 2
+#define MSR_MAX_SERIALISED_ENTRIES 3
 
 /* MSR policy object for shared per-domain MSRs */
 struct msr_policy
@@ -45,6 +58,8 @@ struct msr_policy
             bool taa_no:1;
         };
     } arch_caps;
+
+    uint8_t ignore_msrs;
 };
 
 #ifdef __XEN__
diff --git a/xen/lib/x86/msr.c b/xen/lib/x86/msr.c
index 7d71e92a380a..178203803946 100644
--- a/xen/lib/x86/msr.c
+++ b/xen/lib/x86/msr.c
@@ -40,6 +40,7 @@ int x86_msr_copy_to_buffer(const struct msr_policy *p,
 
     COPY_MSR(MSR_INTEL_PLATFORM_INFO, p->platform_info.raw);
     COPY_MSR(MSR_ARCH_CAPABILITIES,   p->arch_caps.raw);
+    COPY_MSR(MSR_UNHANDLED,           p->ignore_msrs);
 
 #undef COPY_MSR
 
@@ -101,6 +102,7 @@ int x86_msr_copy_from_buffer(struct msr_policy *p,
 
         case MSR_INTEL_PLATFORM_INFO: ASSIGN(platform_info.raw); break;
         case MSR_ARCH_CAPABILITIES:   ASSIGN(arch_caps.raw);     break;
+        case MSR_UNHANDLED:           ASSIGN(ignore_msrs);       break;
 
 #undef ASSIGN
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-20 22:49 [PATCH v2 0/4] Permit fault-less access to non-emulated MSRs Boris Ostrovsky
  2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
  2021-01-20 22:49 ` [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED Boris Ostrovsky
@ 2021-01-20 22:49 ` Boris Ostrovsky
  2021-01-22 12:51   ` Jan Beulich
  2021-02-18 11:24   ` Roger Pau Monné
  2021-01-20 22:49 ` [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest Boris Ostrovsky
  3 siblings, 2 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-20 22:49 UTC (permalink / raw)
  To: xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian, boris.ostrovsky

Starting with commit 84e848fd7a16 ("x86/hvm: disallow access to unknown MSRs")
accesses to unhandled MSRs result in #GP sent to the guest. This caused a
regression for Solaris who tries to acccess MSR_RAPL_POWER_UNIT and (unlike,
for example, Linux) does not catch exceptions when accessing MSRs that
potentially may not be present.

Instead of special-casing RAPL registers we decide what to do when any
non-emulated MSR is accessed based on ignore_msrs field of msr_policy.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
Changes in v2:
* define x86_emul_guest_msr_access() and use it to determine whether emulated
  instruction is rd/wrmsr.
* Don't use ignore_msrs for MSR accesses that are not guest's rd/wrmsr.
* Clear @val for writes too in guest_unhandled_msr()

 xen/arch/x86/hvm/svm/svm.c             | 10 ++++------
 xen/arch/x86/hvm/vmx/vmx.c             | 10 ++++------
 xen/arch/x86/msr.c                     | 28 ++++++++++++++++++++++++++++
 xen/arch/x86/pv/emul-priv-op.c         | 10 ++++++----
 xen/arch/x86/x86_emulate/x86_emulate.h |  6 ++++++
 xen/include/asm-x86/msr.h              |  3 +++
 6 files changed, 51 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b819897a4a9f..7b59885b2619 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1965,8 +1965,8 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
-        goto gpf;
+        if ( guest_unhandled_msr(v, msr, msr_content, false, true) )
+            goto gpf;
     }
 
     HVM_DBG_LOG(DBG_LEVEL_MSR, "returns: ecx=%x, msr_value=%"PRIx64,
@@ -2151,10 +2151,8 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
         break;
 
     default:
-        gdprintk(XENLOG_WARNING,
-                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
-                 msr, msr_content);
-        goto gpf;
+        if ( guest_unhandled_msr(v, msr, &msr_content, true, true) )
+            goto gpf;
     }
 
     return X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2d4475ee3de2..87baca57d33f 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3017,8 +3017,8 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
             break;
         }
 
-        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
-        goto gp_fault;
+        if ( guest_unhandled_msr(curr, msr, msr_content, false, true) )
+            goto gp_fault;
     }
 
 done:
@@ -3319,10 +3319,8 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
              is_last_branch_msr(msr) )
             break;
 
-        gdprintk(XENLOG_WARNING,
-                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
-                 msr, msr_content);
-        goto gp_fault;
+        if ( guest_unhandled_msr(v, msr, &msr_content, true, true) )
+            goto gp_fault;
     }
 
     return X86EMUL_OKAY;
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 433d16c80728..a57d838f642b 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -164,6 +164,34 @@ int init_vcpu_msr_policy(struct vcpu *v)
     return 0;
 }
 
+/* Returns true if policy requires #GP to the guest. */
+bool guest_unhandled_msr(const struct vcpu *v, uint32_t msr, uint64_t *val,
+                         bool is_write, bool is_guest_msr_access)
+{
+    u8 ignore_msrs = v->domain->arch.msr->ignore_msrs;
+
+    /*
+     * Accesses to unimplemented MSRs as part of emulation of instructions
+     * other than guest's RDMSR/WRMSR should never succeed.
+     */
+    if ( !is_guest_msr_access )
+        ignore_msrs = MSR_UNHANDLED_NEVER;
+
+    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
+        *val = 0;
+
+    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
+    {
+        if ( is_write )
+            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
+                    " unimplemented\n", msr, *val);
+        else
+            gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
+    }
+
+    return (ignore_msrs == MSR_UNHANDLED_NEVER);
+}
+
 int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
 {
     const struct vcpu *curr = current;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index dbceed8a05fd..6b378dbe2239 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -984,7 +984,9 @@ static int read_msr(unsigned int reg, uint64_t *val,
         }
         /* fall through */
     default:
-        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg);
+        if ( !guest_unhandled_msr(curr, reg, val, false,
+                                  x86_emul_guest_msr_access(ctxt)) )
+            return X86EMUL_OKAY;
         break;
 
     normal:
@@ -1146,9 +1148,9 @@ static int write_msr(unsigned int reg, uint64_t val,
         }
         /* fall through */
     default:
-        gdprintk(XENLOG_WARNING,
-                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
-                 reg, val);
+        if ( !guest_unhandled_msr(curr, reg, &val, true,
+                                  x86_emul_guest_msr_access(ctxt)) )
+            return X86EMUL_OKAY;
         break;
 
     invalid:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index d8fb3a990933..06e6b7479f37 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
     ctxt->event = (struct x86_event){};
 }
 
+static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)
+{
+    return ctxt->opcode == X86EMUL_OPC(0x0f, 0x32) ||  /* RDMSR */
+           ctxt->opcode == X86EMUL_OPC(0x0f, 0x30);    /* WRMSR */
+}
+
 #endif /* __X86_EMULATE_H__ */
diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h
index 16f95e734428..e7d69ad5bf29 100644
--- a/xen/include/asm-x86/msr.h
+++ b/xen/include/asm-x86/msr.h
@@ -345,5 +345,8 @@ int init_vcpu_msr_policy(struct vcpu *v);
  */
 int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val);
 int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val);
+bool guest_unhandled_msr(const struct vcpu *v, uint32_t msr,
+                         uint64_t *val, bool is_write,
+                         bool is_guest_msr_access);
 
 #endif /* __ASM_MSR_H */
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest
  2021-01-20 22:49 [PATCH v2 0/4] Permit fault-less access to non-emulated MSRs Boris Ostrovsky
                   ` (2 preceding siblings ...)
  2021-01-20 22:49 ` [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this Boris Ostrovsky
@ 2021-01-20 22:49 ` Boris Ostrovsky
  2021-01-21 14:58   ` Wei Liu
                     ` (2 more replies)
  3 siblings, 3 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-20 22:49 UTC (permalink / raw)
  To: xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian, boris.ostrovsky

When creating a guest, if ignore_msrs option has been specified,
apply it to guest's MSR policy.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 tools/include/xenctrl.h           |   2 +
 tools/libs/guest/Makefile         |   1 +
 tools/libs/guest/xg_msrs_x86.c    | 110 ++++++++++++++++++++++++++++++++++++++
 tools/libs/light/libxl_dom.c      |   5 +-
 tools/libs/light/libxl_internal.h |   2 +
 tools/libs/light/libxl_x86.c      |   7 +++
 6 files changed, 125 insertions(+), 2 deletions(-)
 create mode 100644 tools/libs/guest/xg_msrs_x86.c

diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 3796425e1eca..1d6a38e73dcf 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -1835,6 +1835,8 @@ int xc_cpuid_apply_policy(xc_interface *xch,
                           const uint32_t *featureset,
                           unsigned int nr_features, bool pae, bool itsc,
                           bool nested_virt, const struct xc_xend_cpuid *xend);
+int xc_msr_apply_policy(xc_interface *xch, uint32_t domid,
+                        unsigned int ignore_msr);
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
 int xc_mca_op_inject_v2(xc_interface *xch, unsigned int flags,
                         xc_cpumap_t cpumap, unsigned int nr_cpus);
diff --git a/tools/libs/guest/Makefile b/tools/libs/guest/Makefile
index 1c729040b337..452155ea0385 100644
--- a/tools/libs/guest/Makefile
+++ b/tools/libs/guest/Makefile
@@ -56,6 +56,7 @@ SRCS-y                 += xg_dom_compat_linux.c
 
 SRCS-$(CONFIG_X86)     += xg_dom_x86.c
 SRCS-$(CONFIG_X86)     += xg_cpuid_x86.c
+SRCS-$(CONFIG_X86)     += xg_msrs_x86.c
 SRCS-$(CONFIG_ARM)     += xg_dom_arm.c
 
 ifeq ($(CONFIG_LIBXC_MINIOS),y)
diff --git a/tools/libs/guest/xg_msrs_x86.c b/tools/libs/guest/xg_msrs_x86.c
new file mode 100644
index 000000000000..464ce9292ad8
--- /dev/null
+++ b/tools/libs/guest/xg_msrs_x86.c
@@ -0,0 +1,110 @@
+/******************************************************************************
+ * xc_msrs_x86.c
+ *
+ * Update MSR policy of a domain.
+ *
+ * Copyright (c) 2021, Oracle and/or its affiliates.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xc_private.h"
+#include "xen/lib/x86/msr.h"
+
+
+
+int xc_msr_apply_policy(xc_interface *xch, uint32_t domid, unsigned int ignore_msr)
+{
+    int rc;
+    unsigned int nr_leaves, nr_msrs;
+    xen_msr_entry_t *msrs = NULL;
+    struct msr_policy *p = NULL;
+    xc_dominfo_t di;
+    unsigned int err_leaf, err_subleaf, err_msr;
+
+    if ( xc_domain_getinfo(xch, domid, 1, &di) != 1 ||
+         di.domid != domid )
+    {
+        ERROR("Failed to obtain d%d info", domid);
+        rc = -ESRCH;
+        goto out;
+    }
+
+    rc = xc_get_cpu_policy_size(xch, &nr_leaves, &nr_msrs);
+    if ( rc )
+    {
+        PERROR("Failed to obtain policy info size");
+        rc = -errno;
+        goto out;
+    }
+
+    rc = -ENOMEM;
+    if ( (msrs = calloc(nr_msrs, sizeof(*msrs))) == NULL ||
+         (p = calloc(1, sizeof(*p))) == NULL )
+        goto out;
+
+    /* Get the domain's default policy. */
+    nr_leaves = 0;
+    rc = xc_get_system_cpu_policy(xch, di.hvm ? XEN_SYSCTL_cpu_policy_hvm_default
+                                              : XEN_SYSCTL_cpu_policy_pv_default,
+                                  &nr_leaves, NULL, &nr_msrs, msrs);
+    if ( rc )
+    {
+        PERROR("Failed to obtain %s default policy", di.hvm ? "hvm" : "pv");
+        rc = -errno;
+        goto out;
+    }
+
+    rc = x86_msr_copy_from_buffer(p, msrs, nr_msrs, &err_msr);
+    if ( rc )
+    {
+        ERROR("Failed to deserialise MSR (err msr %#x) (%d = %s)",
+              err_msr, -rc, strerror(-rc));
+        goto out;
+    }
+
+    p->ignore_msrs = ignore_msr;
+
+    rc = x86_msr_copy_to_buffer(p, msrs, &nr_msrs);
+    if ( rc )
+    {
+        ERROR("Failed to serialise MSR (%d = %s)", -rc, strerror(-rc));
+        goto out;
+    }
+
+    nr_leaves = 0;
+    rc = xc_set_domain_cpu_policy(xch, domid, nr_leaves, NULL, nr_msrs, msrs,
+                                  &err_leaf, &err_subleaf, &err_msr);
+    if ( rc )
+    {
+        PERROR("Failed to set d%d's MSR policy (err leaf %#x, subleaf %#x, msr %#x)",
+               domid, err_leaf, err_subleaf, err_msr);
+        rc = -errno;
+    }
+
+out:
+    free(msrs);
+    free(p);
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
+
diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
index 19168572fd3e..1f2abf6679d7 100644
--- a/tools/libs/light/libxl_dom.c
+++ b/tools/libs/light/libxl_dom.c
@@ -383,9 +383,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     /* Construct a CPUID policy, but only for brand new domains.  Domains
      * being migrated-in/restored have CPUID handled during the
      * static_data_done() callback. */
-    if (!state->restore)
+    if (!state->restore) {
         libxl__cpuid_legacy(ctx, domid, false, info);
-
+        libxl__msr_policy(ctx, domid, info);
+    }
     return rc;
 }
 
diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h
index c79523ba9248..4f369e6a6f14 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -2054,6 +2054,8 @@ _hidden char *libxl__object_to_json(libxl_ctx *ctx, const char *type,
 
 _hidden void libxl__cpuid_legacy(libxl_ctx *ctx, uint32_t domid, bool retore,
                                  libxl_domain_build_info *info);
+_hidden void libxl__msr_policy(libxl_ctx *ctx, uint32_t domid,
+                               libxl_domain_build_info *info);
 
 /* Calls poll() again - useful to check whether a signaled condition
  * is still true.  Cannot fail.  Returns currently-true revents. */
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 86d272999d67..92ec1da77139 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -1,5 +1,6 @@
 #include "libxl_internal.h"
 #include "libxl_arch.h"
+#include "xen/lib/x86/msr.h"
 
 int libxl__arch_domain_prepare_config(libxl__gc *gc,
                                       libxl_domain_config *d_config,
@@ -838,6 +839,12 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc,
     return rc;
 }
 
+void libxl__msr_policy(libxl_ctx *ctx, uint32_t domid,
+                       libxl_domain_build_info *info)
+{
+    if (info->ignore_msrs != LIBXL_IGNORE_MSRS_NEVER)
+        xc_msr_apply_policy(ctx->xch, domid, info->ignore_msrs);
+}
 
 /*
  * Local variables:
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
@ 2021-01-21 14:56   ` Wei Liu
  2021-01-21 22:43     ` Boris Ostrovsky
  2021-01-22  9:52   ` Julien Grall
  2021-02-18 10:42   ` Roger Pau Monné
  2 siblings, 1 reply; 53+ messages in thread
From: Wei Liu @ 2021-01-21 14:56 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	roger.pau, jun.nakajima, kevin.tian

On Wed, Jan 20, 2021 at 05:49:09PM -0500, Boris Ostrovsky wrote:
> This option allows guest administrator specify what should happen when
> guest accesses an MSR which is not explicitly emulated by the hypervisor.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>  tools/libs/light/libxl_types.idl |  7 +++++++
>  tools/xl/xl_parse.c              |  7 +++++++
>  3 files changed, 33 insertions(+), 1 deletion(-)

It is mainly missing a #define LIBXL_HAVE_XXX in libxl.h.

Other than that, this patch looks good to me. If you end up resending
this series, please fix that issue.

If other patches are all reviewed, you can provide some text to be
merged into this patch.

Wei.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest
  2021-01-20 22:49 ` [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest Boris Ostrovsky
@ 2021-01-21 14:58   ` Wei Liu
  2021-01-22  9:56   ` Julien Grall
  2021-02-18 11:48   ` Roger Pau Monné
  2 siblings, 0 replies; 53+ messages in thread
From: Wei Liu @ 2021-01-21 14:58 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	roger.pau, jun.nakajima, kevin.tian

On Wed, Jan 20, 2021 at 05:49:12PM -0500, Boris Ostrovsky wrote:
> When creating a guest, if ignore_msrs option has been specified,
> apply it to guest's MSR policy.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Acked-by: Wei Liu <wl@xen.org>


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-21 14:56   ` Wei Liu
@ 2021-01-21 22:43     ` Boris Ostrovsky
  0 siblings, 0 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-21 22:43 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel, iwj, anthony.perard, jbeulich, andrew.cooper3,
	roger.pau, jun.nakajima, kevin.tian



On 1/21/21 9:56 AM, Wei Liu wrote:
> On Wed, Jan 20, 2021 at 05:49:09PM -0500, Boris Ostrovsky wrote:
>> This option allows guest administrator specify what should happen when
>> guest accesses an MSR which is not explicitly emulated by the hypervisor.
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>  docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>>  tools/libs/light/libxl_types.idl |  7 +++++++
>>  tools/xl/xl_parse.c              |  7 +++++++
>>  3 files changed, 33 insertions(+), 1 deletion(-)
> 
> It is mainly missing a #define LIBXL_HAVE_XXX in libxl.h.
> 
> Other than that, this patch looks good to me. If you end up resending
> this series, please fix that issue.
> 
> If other patches are all reviewed, you can provide some text to be
> merged into this patch.
> 
> Wei.
> 


Ah yes, I forgot about this.



diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 3433c950f9aa..f249740daf3f 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1310,6 +1310,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
  */
 #define LIBXL_HAVE_CREATEINFO_XEND_SUSPEND_EVTCHN_COMPAT
 
+/*
+ *  LIBXL_HAVE_IGNORE_MSRS indicates that the libxl_ignore_msrs field is
+ *  present in libxl_domain_build_info. This field describes hypervisor
+ *  behavior on guest accesses to unimplemented MSRs.
+ */
+#define LIBXL_HAVE_IGNORE_MSRS 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
  2021-01-21 14:56   ` Wei Liu
@ 2021-01-22  9:52   ` Julien Grall
  2021-01-22 18:28     ` Boris Ostrovsky
  2021-02-18 10:42   ` Roger Pau Monné
  2 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2021-01-22  9:52 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian

Hi Boris,

On 20/01/2021 22:49, Boris Ostrovsky wrote:
> This option allows guest administrator specify what should happen when
> guest accesses an MSR which is not explicitly emulated by the hypervisor.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>   docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>   tools/libs/light/libxl_types.idl |  7 +++++++
>   tools/xl/xl_parse.c              |  7 +++++++
>   3 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index c8e017f950de..96ce97c42cab 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
>   See also "Virtual Machine Generation ID" by Microsoft:
>   L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
>   
> -=back
> +=over
> +
> +=item B<ignore_msrs="STRING">
> +
> +Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.

The description of the feature looks very x86 focus. Yet, it seems to be 
defined as a generic one.

Could you clarify whether this is intended to be re-usable by other 
architectures?

> +
> +=over 4
> +
> +=item B<never>
> +
> +Issue a warning to the log and #GP to the guest. This is default.
> +
> +=item B<silent>
> +
> +MSR reads return 0, MSR writes are ignored. No warnings to the log.
> +
> +=item B<verbose>
> +
> +Similar to B<silent> but a warning is written.
>   
>   =head3 Guest Virtual Time Controls
>   
> diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
> index 05324736b744..7b5fef771ee8 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -477,6 +477,12 @@ libxl_tee_type = Enumeration("tee_type", [
>       (1, "optee")
>       ], init_val = "LIBXL_TEE_TYPE_NONE")
>   
> +libxl_ignore_msrs = Enumeration("ignore_msrs", [
> +    (0, "never"),
> +    (1, "silent"),
> +    (2, "verbose"),
> +    ], init_val = "LIBXL_IGNORE_MSRS_NEVER")
> +
>   libxl_rdm_reserve = Struct("rdm_reserve", [
>       ("strategy",    libxl_rdm_reserve_strategy),
>       ("policy",      libxl_rdm_reserve_policy),
> @@ -559,6 +565,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>       ("apic",             libxl_defbool),
>       ("dm_restrict",      libxl_defbool),
>       ("tee",              libxl_tee_type),
> +    ("ignore_msrs",      libxl_ignore_msrs),
>       ("u", KeyedUnion(None, libxl_domain_type, "type",
>                   [("hvm", Struct(None, [("firmware",         string),
>                                          ("bios",             libxl_bios_type),
> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> index 4ebf39620ae7..942086c3f41d 100644
> --- a/tools/xl/xl_parse.c
> +++ b/tools/xl/xl_parse.c
> @@ -2732,6 +2732,13 @@ skip_usbdev:
>           }
>       }
>   
> +    if (!xlu_cfg_get_string(config, "ignore_msrs", &buf, 0)) {
> +        if (libxl_ignore_msrs_from_string(buf, &b_info->ignore_msrs)) {
> +           fprintf(stderr, "ERROR: invalid value \"%s\" for \"ignore_msrs\"\n", buf);
> +           exit(1);
> +        }
> +    }
> +
>       parse_vkb_list(config, d_config);
>   
>       xlu_cfg_get_defbool(config, "xend_suspend_evtchn_compat",
> 

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest
  2021-01-20 22:49 ` [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest Boris Ostrovsky
  2021-01-21 14:58   ` Wei Liu
@ 2021-01-22  9:56   ` Julien Grall
  2021-01-22 18:35     ` Boris Ostrovsky
  2021-02-18 11:48   ` Roger Pau Monné
  2 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2021-01-22  9:56 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian

Hi Boris,

On 20/01/2021 22:49, Boris Ostrovsky wrote:
> diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
> index 19168572fd3e..1f2abf6679d7 100644
> --- a/tools/libs/light/libxl_dom.c
> +++ b/tools/libs/light/libxl_dom.c
> @@ -383,9 +383,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
>       /* Construct a CPUID policy, but only for brand new domains.  Domains
>        * being migrated-in/restored have CPUID handled during the
>        * static_data_done() callback. */
> -    if (!state->restore)
> +    if (!state->restore) {
>           libxl__cpuid_legacy(ctx, domid, false, info);
> -
> +        libxl__msr_policy(ctx, domid, info);

AFAICT, this is going to break compilation of the toolst on Arm because 
libxl__msr_policy().

However, I am a bit unsure whether we should define a stub for this on 
Arm. It feels to me it would be better to pass an extra boolean 
(restore) to libxl__arch_domain_create() and directly implement it there.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-01-20 22:49 ` [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED Boris Ostrovsky
@ 2021-01-22 11:51   ` Jan Beulich
  2021-01-22 18:56     ` Boris Ostrovsky
  2021-02-02 17:01     ` Boris Ostrovsky
  2021-02-18 10:51   ` Roger Pau Monné
  1 sibling, 2 replies; 53+ messages in thread
From: Jan Beulich @ 2021-01-22 11:51 UTC (permalink / raw)
  To: Boris Ostrovsky, andrew.cooper3
  Cc: iwj, wl, anthony.perard, roger.pau, jun.nakajima, kevin.tian, xen-devel

On 20.01.2021 23:49, Boris Ostrovsky wrote:
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -295,7 +295,7 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>          }
>  
>          /* Fallthrough. */
> -    case 0x40000200 ... 0x400002ff:
> +    case 0x40000200 ... 0x400002fe:
>          ret = guest_rdmsr_xen(v, msr, val);
>          break;
>  
> @@ -514,7 +514,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>          }
>  
>          /* Fallthrough. */
> -    case 0x40000200 ... 0x400002ff:
> +    case 0x40000200 ... 0x400002fe:
>          ret = guest_wrmsr_xen(v, msr, val);
>          break;

For both of these, we need some kind of connection to
MSR_UNHANDLED. Could be a BUILD_BUG_ON() or an explicit
"case MSR_UNHANDLED:" (preventing someone "correcting" the
apparent mistake) or yet something else.

> --- a/xen/include/xen/lib/x86/msr.h
> +++ b/xen/include/xen/lib/x86/msr.h
> @@ -2,8 +2,21 @@
>  #ifndef XEN_LIB_X86_MSR_H
>  #define XEN_LIB_X86_MSR_H
>  
> +/*
> + * Behavior on accesses to MSRs that are not handled by emulation:
> + *  0 = return #GP, warning emitted
> + *  1 = read as 0, writes are dropped, no warning
> + *  2 = read as 0, writes are dropped, warning emitted
> + */
> +#define MSR_UNHANDLED_NEVER     0
> +#define MSR_UNHANDLED_SILENT    1
> +#define MSR_UNHANDLED_VERBOSE   2
> +
> +/* MSR that is not explicitly processed by emulation */
> +#define MSR_UNHANDLED           0x400002ff

MSR indexes as well as definitions for MSR contents generally
live in asm-x86/msr-index.h. I think it would be better for
the above to also go there.

Additionally the comment on MSR_UNHANDLED should not only
say what will not happen for this index, but also what its
intended purpose is.

> @@ -45,6 +58,8 @@ struct msr_policy
>              bool taa_no:1;
>          };
>      } arch_caps;
> +
> +    uint8_t ignore_msrs;

Add a brief comment along the lines of /* MSR_UNHANDLED_* */
here to make the connection to intended values?

Also, Andrew, (I think I did say so before) - I definitely
would want your general consent with this model, as what gets
altered here is almost all relatively recent contributions
by you. Nor would I exclude the approach being controversial.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-20 22:49 ` [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this Boris Ostrovsky
@ 2021-01-22 12:51   ` Jan Beulich
  2021-01-22 19:52     ` Boris Ostrovsky
  2021-02-18 11:24   ` Roger Pau Monné
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-01-22 12:51 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel

On 20.01.2021 23:49, Boris Ostrovsky wrote:
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -164,6 +164,34 @@ int init_vcpu_msr_policy(struct vcpu *v)
>      return 0;
>  }
>  
> +/* Returns true if policy requires #GP to the guest. */
> +bool guest_unhandled_msr(const struct vcpu *v, uint32_t msr, uint64_t *val,
> +                         bool is_write, bool is_guest_msr_access)

Would you mind dropping the "_msr" infix from the last
parameter's name? We're anyway aware we're talking about MSR
accesses here, from the function name.

As a nit - while I'm okay with the uint64_t *, I think the
uint32_t would better be unsigned int - see ./CODING_STYLE.

Finally, this being a policy function and policy being per-
domain here, I think the first parameter wants to be const
struct domain *.

> +{
> +    u8 ignore_msrs = v->domain->arch.msr->ignore_msrs;

uint8_t please or, as per above, even better unsigned int.

> +
> +    /*
> +     * Accesses to unimplemented MSRs as part of emulation of instructions
> +     * other than guest's RDMSR/WRMSR should never succeed.
> +     */
> +    if ( !is_guest_msr_access )
> +        ignore_msrs = MSR_UNHANDLED_NEVER;

Wouldn't you better "return true" here? Such accesses also
shouldn't be logged imo (albeit I agree that's a change from
current behavior).

> +    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
> +        *val = 0;

I don't understand the conditional here, even more so with
the respective changelog entry. In any event you don't
want to clobber the value ahead of ...

> +    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
> +    {
> +        if ( is_write )
> +            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
> +                    " unimplemented\n", msr, *val);

... logging it.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
>      ctxt->event = (struct x86_event){};
>  }
>  
> +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)

The parameter wants to be pointer-to-const. In addition I wonder
whether this wouldn't better be a sibling to
x86_insn_is_cr_access() (without a "state" parameter, which
would be unused and unavailable to the callers), which may end
up finding further uses down the road.

> +{
> +    return ctxt->opcode == X86EMUL_OPC(0x0f, 0x32) ||  /* RDMSR */
> +           ctxt->opcode == X86EMUL_OPC(0x0f, 0x30);    /* WRMSR */
> +}

Personally I'd prefer if this was a single comparison:

    return (ctxt->opcode | 2) == X86EMUL_OPC(0x0f, 0x32);

But maybe nowadays' compilers are capable of this
transformation?

I notice you use this function only from PV priv-op emulation.
What about the call paths through hvmemul_{read,write}_msr()?
(It's also questionable whether the write paths need this -
the only MSR written outside of WRMSR emulation is
MSR_SHADOW_GS_BASE, which can't possibly reach the "unhandled"
logic anywhere. But maybe better to be future proof here in
case new MSR writes appear in the emulator, down the road.)

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-22  9:52   ` Julien Grall
@ 2021-01-22 18:28     ` Boris Ostrovsky
  2021-01-22 18:33       ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-22 18:28 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian



On 1/22/21 4:52 AM, Julien Grall wrote:
> Hi Boris,
> 
> On 20/01/2021 22:49, Boris Ostrovsky wrote:
>> This option allows guest administrator specify what should happen when
>> guest accesses an MSR which is not explicitly emulated by the hypervisor.
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>   docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>>   tools/libs/light/libxl_types.idl |  7 +++++++
>>   tools/xl/xl_parse.c              |  7 +++++++
>>   3 files changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
>> index c8e017f950de..96ce97c42cab 100644
>> --- a/docs/man/xl.cfg.5.pod.in
>> +++ b/docs/man/xl.cfg.5.pod.in
>> @@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
>>   See also "Virtual Machine Generation ID" by Microsoft:
>>   L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
>>   -=back
>> +=over
>> +
>> +=item B<ignore_msrs="STRING">
>> +
>> +Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.
> 
> The description of the feature looks very x86 focus. Yet, it seems to be defined as a generic one.
> 
> Could you clarify whether this is intended to be re-usable by other architectures?


x86 only. I'll add appropriate note.


-boris


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-22 18:28     ` Boris Ostrovsky
@ 2021-01-22 18:33       ` Julien Grall
  2021-01-22 18:39         ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Julien Grall @ 2021-01-22 18:33 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian



On 22/01/2021 18:28, Boris Ostrovsky wrote:
> 
> 
> On 1/22/21 4:52 AM, Julien Grall wrote:
>> Hi Boris,
>>
>> On 20/01/2021 22:49, Boris Ostrovsky wrote:
>>> This option allows guest administrator specify what should happen when
>>> guest accesses an MSR which is not explicitly emulated by the hypervisor.
>>>
>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>> ---
>>>    docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>>>    tools/libs/light/libxl_types.idl |  7 +++++++
>>>    tools/xl/xl_parse.c              |  7 +++++++
>>>    3 files changed, 33 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
>>> index c8e017f950de..96ce97c42cab 100644
>>> --- a/docs/man/xl.cfg.5.pod.in
>>> +++ b/docs/man/xl.cfg.5.pod.in
>>> @@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
>>>    See also "Virtual Machine Generation ID" by Microsoft:
>>>    L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
>>>    -=back
>>> +=over
>>> +
>>> +=item B<ignore_msrs="STRING">
>>> +
>>> +Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.
>>
>> The description of the feature looks very x86 focus. Yet, it seems to be defined as a generic one.
>>
>> Could you clarify whether this is intended to be re-usable by other architectures?
> 
> 
> x86 only. I'll add appropriate note.

Thanks. In which case, I would suggest to move the definition of 
ignore_msrs in the idle to an x86 specific structure. Maybe "arch_x86"?

I noticed that none exists today, but we could duplicate that "arch_arm" 
one.

Wei, Ian, what do you think?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest
  2021-01-22  9:56   ` Julien Grall
@ 2021-01-22 18:35     ` Boris Ostrovsky
  0 siblings, 0 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-22 18:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian



On 1/22/21 4:56 AM, Julien Grall wrote:
> Hi Boris,
> 
> On 20/01/2021 22:49, Boris Ostrovsky wrote:
>> diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
>> index 19168572fd3e..1f2abf6679d7 100644
>> --- a/tools/libs/light/libxl_dom.c
>> +++ b/tools/libs/light/libxl_dom.c
>> @@ -383,9 +383,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
>>       /* Construct a CPUID policy, but only for brand new domains.  Domains
>>        * being migrated-in/restored have CPUID handled during the
>>        * static_data_done() callback. */
>> -    if (!state->restore)
>> +    if (!state->restore) {
>>           libxl__cpuid_legacy(ctx, domid, false, info);
>> -
>> +        libxl__msr_policy(ctx, domid, info);
> 
> AFAICT, this is going to break compilation of the toolst on Arm because libxl__msr_policy().

Yes, it will ;-(

> 
> However, I am a bit unsure whether we should define a stub for this on Arm. It feels to me it would be better to pass an extra boolean (restore) to libxl__arch_domain_create() and directly implement it there.


Yes. And move libxl__cpuid_legacy call there too then.


-boris


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-22 18:33       ` Julien Grall
@ 2021-01-22 18:39         ` Boris Ostrovsky
  2021-01-22 20:42           ` Julien Grall
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-22 18:39 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian



On 1/22/21 1:33 PM, Julien Grall wrote:

> 
> Thanks. In which case, I would suggest to move the definition of ignore_msrs in the idle to an x86 specific structure. Maybe "arch_x86"?


I did consider this but left it in common area since there are already a bunch of fields there that are x86-specific.

If there is desire to create an x86 sub-struct then we should move all of them there.


-boris


> 
> I noticed that none exists today, but we could duplicate that "arch_arm" one.
> 
> Wei, Ian, what do you think?
> 
> Cheers,
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-01-22 11:51   ` Jan Beulich
@ 2021-01-22 18:56     ` Boris Ostrovsky
  2021-02-02 17:01     ` Boris Ostrovsky
  1 sibling, 0 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-22 18:56 UTC (permalink / raw)
  To: Jan Beulich, andrew.cooper3
  Cc: iwj, wl, anthony.perard, roger.pau, jun.nakajima, kevin.tian, xen-devel



On 1/22/21 6:51 AM, Jan Beulich wrote:
> On 20.01.2021 23:49, Boris Ostrovsky wrote:
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -295,7 +295,7 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
>>          }
>>  
>>          /* Fallthrough. */
>> -    case 0x40000200 ... 0x400002ff:
>> +    case 0x40000200 ... 0x400002fe:
>>          ret = guest_rdmsr_xen(v, msr, val);
>>          break;
>>  
>> @@ -514,7 +514,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>>          }
>>  
>>          /* Fallthrough. */
>> -    case 0x40000200 ... 0x400002ff:
>> +    case 0x40000200 ... 0x400002fe:
>>          ret = guest_wrmsr_xen(v, msr, val);
>>          break;
> 
> For both of these, we need some kind of connection to
> MSR_UNHANDLED. Could be a BUILD_BUG_ON() or an explicit
> "case MSR_UNHANDLED:" (preventing someone "correcting" the
> apparent mistake) or yet something else.


I actually meant to add a comment there but forgot.

I'll add an explicit 'case'.


> 
>> --- a/xen/include/xen/lib/x86/msr.h
>> +++ b/xen/include/xen/lib/x86/msr.h
>> @@ -2,8 +2,21 @@
>>  #ifndef XEN_LIB_X86_MSR_H
>>  #define XEN_LIB_X86_MSR_H
>>  
>> +/*
>> + * Behavior on accesses to MSRs that are not handled by emulation:
>> + *  0 = return #GP, warning emitted
>> + *  1 = read as 0, writes are dropped, no warning
>> + *  2 = read as 0, writes are dropped, warning emitted
>> + */
>> +#define MSR_UNHANDLED_NEVER     0
>> +#define MSR_UNHANDLED_SILENT    1
>> +#define MSR_UNHANDLED_VERBOSE   2
>> +
>> +/* MSR that is not explicitly processed by emulation */
>> +#define MSR_UNHANDLED           0x400002ff
> 
> MSR indexes as well as definitions for MSR contents generally
> live in asm-x86/msr-index.h. I think it would be better for
> the above to also go there.
> 


I didn't put it there because I felt that file is for "real" (including hypervisor range) MSRs. But I can move it to msr-index.h


-boris


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-22 12:51   ` Jan Beulich
@ 2021-01-22 19:52     ` Boris Ostrovsky
  2021-01-25 10:22       ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-22 19:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel



On 1/22/21 7:51 AM, Jan Beulich wrote:
> On 20.01.2021 23:49, Boris Ostrovsky wrote:


> 
>> +
>> +    /*
>> +     * Accesses to unimplemented MSRs as part of emulation of instructions
>> +     * other than guest's RDMSR/WRMSR should never succeed.
>> +     */
>> +    if ( !is_guest_msr_access )
>> +        ignore_msrs = MSR_UNHANDLED_NEVER;
> 
> Wouldn't you better "return true" here? Such accesses also
> shouldn't be logged imo (albeit I agree that's a change from
> current behavior).


Yes, that's why I didn't return here. We will be here in !is_guest_msr_access case most likely due to a bug in the emulator so I think we do want to see the error logged.


> 
>> +    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
>> +        *val = 0;
> 
> I don't understand the conditional here, even more so with
> the respective changelog entry. In any event you don't
> want to clobber the value ahead of ...
> 
>> +    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
>> +    {
>> +        if ( is_write )
>> +            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
>> +                    " unimplemented\n", msr, *val);
> 
> ... logging it.


True. I dropped !is_write from v1 without considering this.

As far as the conditional --- dropping it too would be a behavior change. 


> 
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
>>      ctxt->event = (struct x86_event){};
>>  }
>>  
>> +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)
> 
> The parameter wants to be pointer-to-const. In addition I wonder
> whether this wouldn't better be a sibling to
> x86_insn_is_cr_access() (without a "state" parameter, which
> would be unused and unavailable to the callers), which may end
> up finding further uses down the road.


"Sibling" in terms of name (yes, it would be) or something else?


> 
>> +{
>> +    return ctxt->opcode == X86EMUL_OPC(0x0f, 0x32) ||  /* RDMSR */
>> +           ctxt->opcode == X86EMUL_OPC(0x0f, 0x30);    /* WRMSR */
>> +}
> 
> Personally I'd prefer if this was a single comparison:
> 
>     return (ctxt->opcode | 2) == X86EMUL_OPC(0x0f, 0x32);
> 
> But maybe nowadays' compilers are capable of this
> transformation?

Here is what I've got (not an inline but shouldn't make much difference I'd think)

ffff82d040385960 <x86_emul_guest_msr_access_2>: # your code
ffff82d040385960:       8b 47 2c                mov    0x2c(%rdi),%eax
ffff82d040385963:       83 e0 fd                and    $0xfffffffd,%eax
ffff82d040385966:       3d 30 00 0f 00          cmp    $0xf0030,%eax
ffff82d04038596b:       0f 94 c0                sete   %al
ffff82d04038596e:       c3                      retq

ffff82d04038596f <x86_emul_guest_msr_access_1>: # my code
ffff82d04038596f:       8b 47 2c                mov    0x2c(%rdi),%eax
ffff82d040385972:       83 c8 02                or     $0x2,%eax
ffff82d040385975:       3d 32 00 0f 00          cmp    $0xf0032,%eax
ffff82d04038597a:       0f 94 c0                sete   %al
ffff82d04038597d:       c3                      retq


So it's a wash in terms of generated code.

> 
> I notice you use this function only from PV priv-op emulation.
> What about the call paths through hvmemul_{read,write}_msr()?
> (It's also questionable whether the write paths need this -
> the only MSR written outside of WRMSR emulation is
> MSR_SHADOW_GS_BASE, which can't possibly reach the "unhandled"
> logic anywhere. But maybe better to be future proof here in
> case new MSR writes appear in the emulator, down the road.)


Won't we end up in hvm_funcs.msr_write_intercept ops which do call it?


-boris


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-22 18:39         ` Boris Ostrovsky
@ 2021-01-22 20:42           ` Julien Grall
  0 siblings, 0 replies; 53+ messages in thread
From: Julien Grall @ 2021-01-22 20:42 UTC (permalink / raw)
  To: Boris Ostrovsky, xen-devel
  Cc: iwj, wl, anthony.perard, jbeulich, andrew.cooper3, roger.pau,
	jun.nakajima, kevin.tian



On 22/01/2021 18:39, Boris Ostrovsky wrote:
> 
> 
> On 1/22/21 1:33 PM, Julien Grall wrote:
> 
>>
>> Thanks. In which case, I would suggest to move the definition of ignore_msrs in the idle to an x86 specific structure. Maybe "arch_x86"?
> 
> 
> I did consider this but left it in common area since there are already a bunch of fields there that are x86-specific.

Most of them were introduced before Arm existed :).

> 
> If there is desire to create an x86 sub-struct then we should move all of them there.

The libxl interface is meant to be stable at the source level but not 
ABI stable. So we wouldn't be able to move the existing field in a new 
x86 without adding a compat layer (see for instance hvm.altp2m vs 
altp2m). I don't think this is worth it for them.

That said, we should try to treat each architecture equally. So I think 
it is not a good idea to continue to pursue adding an field in common 
area if the field is only meant to be used for a single arch.

If Ian or Wei thinks it is fine to define in common code, then so it be.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-22 19:52     ` Boris Ostrovsky
@ 2021-01-25 10:22       ` Jan Beulich
  2021-01-25 18:42         ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-01-25 10:22 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel

On 22.01.2021 20:52, Boris Ostrovsky wrote:
> On 1/22/21 7:51 AM, Jan Beulich wrote:
>> On 20.01.2021 23:49, Boris Ostrovsky wrote:
>>> +
>>> +    /*
>>> +     * Accesses to unimplemented MSRs as part of emulation of instructions
>>> +     * other than guest's RDMSR/WRMSR should never succeed.
>>> +     */
>>> +    if ( !is_guest_msr_access )
>>> +        ignore_msrs = MSR_UNHANDLED_NEVER;
>>
>> Wouldn't you better "return true" here? Such accesses also
>> shouldn't be logged imo (albeit I agree that's a change from
>> current behavior).
> 
> 
> Yes, that's why I didn't return here. We will be here in !is_guest_msr_access case most likely due to a bug in the emulator so I think we do want to see the error logged.

Why "most likely"?

>>> +    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
>>> +        *val = 0;
>>
>> I don't understand the conditional here, even more so with
>> the respective changelog entry. In any event you don't
>> want to clobber the value ahead of ...
>>
>>> +    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
>>> +    {
>>> +        if ( is_write )
>>> +            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
>>> +                    " unimplemented\n", msr, *val);
>>
>> ... logging it.
> 
> 
> True. I dropped !is_write from v1 without considering this.
> 
> As far as the conditional --- dropping it too would be a behavior change. 

Albeit an intentional one then? Plus I think I have trouble
seeing what behavior it would be that would change.

>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>> @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
>>>      ctxt->event = (struct x86_event){};
>>>  }
>>>  
>>> +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)
>>
>> The parameter wants to be pointer-to-const. In addition I wonder
>> whether this wouldn't better be a sibling to
>> x86_insn_is_cr_access() (without a "state" parameter, which
>> would be unused and unavailable to the callers), which may end
>> up finding further uses down the road.
> 
> 
> "Sibling" in terms of name (yes, it would be) or something else?

Name and (possible) purpose - a validate hook could want to
make use of this, for example.

>>> +{
>>> +    return ctxt->opcode == X86EMUL_OPC(0x0f, 0x32) ||  /* RDMSR */
>>> +           ctxt->opcode == X86EMUL_OPC(0x0f, 0x30);    /* WRMSR */
>>> +}
>>
>> Personally I'd prefer if this was a single comparison:
>>
>>     return (ctxt->opcode | 2) == X86EMUL_OPC(0x0f, 0x32);
>>
>> But maybe nowadays' compilers are capable of this
>> transformation?
> 
> Here is what I've got (not an inline but shouldn't make much difference I'd think)
> 
> ffff82d040385960 <x86_emul_guest_msr_access_2>: # your code
> ffff82d040385960:       8b 47 2c                mov    0x2c(%rdi),%eax
> ffff82d040385963:       83 e0 fd                and    $0xfffffffd,%eax
> ffff82d040385966:       3d 30 00 0f 00          cmp    $0xf0030,%eax
> ffff82d04038596b:       0f 94 c0                sete   %al
> ffff82d04038596e:       c3                      retq
> 
> ffff82d04038596f <x86_emul_guest_msr_access_1>: # my code
> ffff82d04038596f:       8b 47 2c                mov    0x2c(%rdi),%eax
> ffff82d040385972:       83 c8 02                or     $0x2,%eax
> ffff82d040385975:       3d 32 00 0f 00          cmp    $0xf0032,%eax
> ffff82d04038597a:       0f 94 c0                sete   %al
> ffff82d04038597d:       c3                      retq
> 
> 
> So it's a wash in terms of generated code.

True, albeit I guess you got "your code" and "my code" the
wrong way round, as I don't expect the compiler to
translate | into "and".

>> I notice you use this function only from PV priv-op emulation.
>> What about the call paths through hvmemul_{read,write}_msr()?
>> (It's also questionable whether the write paths need this -
>> the only MSR written outside of WRMSR emulation is
>> MSR_SHADOW_GS_BASE, which can't possibly reach the "unhandled"
>> logic anywhere. But maybe better to be future proof here in
>> case new MSR writes appear in the emulator, down the road.)
> 
> 
> Won't we end up in hvm_funcs.msr_write_intercept ops which do call it?

Of course we will - the boolean will very likely need
propagating (a possible alternative being a per-vCPU flag
indicating "in emulator").

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-25 10:22       ` Jan Beulich
@ 2021-01-25 18:42         ` Boris Ostrovsky
  2021-01-26  9:05           ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-25 18:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel

On 21-01-25 11:22:08, Jan Beulich wrote:
> On 22.01.2021 20:52, Boris Ostrovsky wrote:
> > On 1/22/21 7:51 AM, Jan Beulich wrote:
> >> On 20.01.2021 23:49, Boris Ostrovsky wrote:
> >>> +
> >>> +    /*
> >>> +     * Accesses to unimplemented MSRs as part of emulation of instructions
> >>> +     * other than guest's RDMSR/WRMSR should never succeed.
> >>> +     */
> >>> +    if ( !is_guest_msr_access )
> >>> +        ignore_msrs = MSR_UNHANDLED_NEVER;
> >>
> >> Wouldn't you better "return true" here? Such accesses also
> >> shouldn't be logged imo (albeit I agree that's a change from
> >> current behavior).
> > 
> > 
> > Yes, that's why I didn't return here. We will be here in !is_guest_msr_access case most likely due to a bug in the emulator so I think we do want to see the error logged.
> 
> Why "most likely"?


OK, definitely ;-) But I still think logging these accesses would be helpful.

> 
> >>> +    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
> >>> +        *val = 0;
> >>
> >> I don't understand the conditional here, even more so with
> >> the respective changelog entry. In any event you don't
> >> want to clobber the value ahead of ...
> >>
> >>> +    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
> >>> +    {
> >>> +        if ( is_write )
> >>> +            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
> >>> +                    " unimplemented\n", msr, *val);
> >>
> >> ... logging it.
> > 
> > 
> > True. I dropped !is_write from v1 without considering this.
> > 
> > As far as the conditional --- dropping it too would be a behavior change. 
> 
> Albeit an intentional one then? Plus I think I have trouble
> seeing what behavior it would be that would change.


Currently callers of, say, read_msr() don't expect the argument that they pass in to change. Granted, they shouldn't (and AFAICS don't) look at it but it's a change nonetheless.

> 
> >>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> >>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> >>> @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
> >>>      ctxt->event = (struct x86_event){};
> >>>  }
> >>>  
> >>> +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)
> >>
> >> The parameter wants to be pointer-to-const. In addition I wonder
> >> whether this wouldn't better be a sibling to
> >> x86_insn_is_cr_access() (without a "state" parameter, which
> >> would be unused and unavailable to the callers), which may end
> >> up finding further uses down the road.
> > 
> > 
> > "Sibling" in terms of name (yes, it would be) or something else?
> 
> Name and (possible) purpose - a validate hook could want to
> make use of this, for example.

A validate hook? 

> 
> >>> +{
> >>> +    return ctxt->opcode == X86EMUL_OPC(0x0f, 0x32) ||  /* RDMSR */
> >>> +           ctxt->opcode == X86EMUL_OPC(0x0f, 0x30);    /* WRMSR */
> >>> +}
> >>
> >> Personally I'd prefer if this was a single comparison:
> >>
> >>     return (ctxt->opcode | 2) == X86EMUL_OPC(0x0f, 0x32);
> >>
> >> But maybe nowadays' compilers are capable of this
> >> transformation?
> > 
> > Here is what I've got (not an inline but shouldn't make much difference I'd think)
> > 
> > ffff82d040385960 <x86_emul_guest_msr_access_2>: # your code
> > ffff82d040385960:       8b 47 2c                mov    0x2c(%rdi),%eax
> > ffff82d040385963:       83 e0 fd                and    $0xfffffffd,%eax
> > ffff82d040385966:       3d 30 00 0f 00          cmp    $0xf0030,%eax
> > ffff82d04038596b:       0f 94 c0                sete   %al
> > ffff82d04038596e:       c3                      retq
> > 
> > ffff82d04038596f <x86_emul_guest_msr_access_1>: # my code
> > ffff82d04038596f:       8b 47 2c                mov    0x2c(%rdi),%eax
> > ffff82d040385972:       83 c8 02                or     $0x2,%eax
> > ffff82d040385975:       3d 32 00 0f 00          cmp    $0xf0032,%eax
> > ffff82d04038597a:       0f 94 c0                sete   %al
> > ffff82d04038597d:       c3                      retq
> > 
> > 
> > So it's a wash in terms of generated code.
> 
> True, albeit I guess you got "your code" and "my code" the
> wrong way round, as I don't expect the compiler to
> translate | into "and".


Yes, looks like I did switch them.

> 
> >> I notice you use this function only from PV priv-op emulation.
> >> What about the call paths through hvmemul_{read,write}_msr()?
> >> (It's also questionable whether the write paths need this -
> >> the only MSR written outside of WRMSR emulation is
> >> MSR_SHADOW_GS_BASE, which can't possibly reach the "unhandled"
> >> logic anywhere. But maybe better to be future proof here in
> >> case new MSR writes appear in the emulator, down the road.)
> > 
> > 
> > Won't we end up in hvm_funcs.msr_write_intercept ops which do call it?
> 
> Of course we will - the boolean will very likely need
> propagating (a possible alternative being a per-vCPU flag
> indicating "in emulator").


Oh, I see what you mean. By per-vcpu flag you mean arch_vcpu field I assume?


-boris



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-25 18:42         ` Boris Ostrovsky
@ 2021-01-26  9:05           ` Jan Beulich
  2021-01-26 16:02             ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-01-26  9:05 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel

On 25.01.2021 19:42, Boris Ostrovsky wrote:
> On 21-01-25 11:22:08, Jan Beulich wrote:
>> On 22.01.2021 20:52, Boris Ostrovsky wrote:
>>> On 1/22/21 7:51 AM, Jan Beulich wrote:
>>>> On 20.01.2021 23:49, Boris Ostrovsky wrote:
>>>>> +
>>>>> +    /*
>>>>> +     * Accesses to unimplemented MSRs as part of emulation of instructions
>>>>> +     * other than guest's RDMSR/WRMSR should never succeed.
>>>>> +     */
>>>>> +    if ( !is_guest_msr_access )
>>>>> +        ignore_msrs = MSR_UNHANDLED_NEVER;
>>>>
>>>> Wouldn't you better "return true" here? Such accesses also
>>>> shouldn't be logged imo (albeit I agree that's a change from
>>>> current behavior).
>>>
>>>
>>> Yes, that's why I didn't return here. We will be here in !is_guest_msr_access case most likely due to a bug in the emulator so I think we do want to see the error logged.
>>
>> Why "most likely"?
> 
> 
> OK, definitely ;-)

Oops - I was thinking the other way around, considering such
to possibly be legitimate. It just so happens that curently
we have no such path.

> But I still think logging these accesses would be helpful.

Because of the above I continue to question this.

>>>>> +    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
>>>>> +        *val = 0;
>>>>
>>>> I don't understand the conditional here, even more so with
>>>> the respective changelog entry. In any event you don't
>>>> want to clobber the value ahead of ...
>>>>
>>>>> +    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
>>>>> +    {
>>>>> +        if ( is_write )
>>>>> +            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
>>>>> +                    " unimplemented\n", msr, *val);
>>>>
>>>> ... logging it.
>>>
>>>
>>> True. I dropped !is_write from v1 without considering this.
>>>
>>> As far as the conditional --- dropping it too would be a behavior change. 
>>
>> Albeit an intentional one then? Plus I think I have trouble
>> seeing what behavior it would be that would change.
> 
> 
> Currently callers of, say, read_msr() don't expect the argument that they pass in to change. Granted, they shouldn't (and AFAICS don't) look at it but it's a change nonetheless.

Hmm, I'm confused: The purpose of read_msr() is to change the
value pointed at by the passed in argument. And for write_msr()
the users of the hook pass the argument by value, i.e. wouldn't
observe the changed value (it would only possibly be
intermediate layers which might observe the change, but those
ought to not care).

>>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>>>> @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
>>>>>      ctxt->event = (struct x86_event){};
>>>>>  }
>>>>>  
>>>>> +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)
>>>>
>>>> The parameter wants to be pointer-to-const. In addition I wonder
>>>> whether this wouldn't better be a sibling to
>>>> x86_insn_is_cr_access() (without a "state" parameter, which
>>>> would be unused and unavailable to the callers), which may end
>>>> up finding further uses down the road.
>>>
>>>
>>> "Sibling" in terms of name (yes, it would be) or something else?
>>
>> Name and (possible) purpose - a validate hook could want to
>> make use of this, for example.
> 
> A validate hook? 

Quoting from struct x86_emulate_ops:

    /*
     * validate: Post-decode, pre-emulate hook to allow caller controlled
     * filtering.
     */
    int (*validate)(
        const struct x86_emulate_state *state,
        struct x86_emulate_ctxt *ctxt);

Granted to be directly usable the function would need to have a
"state" parameter. As that's unused, having it have one and
passing NULL in your case might be acceptable. But I also could
see arguments towards this not being a good idea.

>>>> I notice you use this function only from PV priv-op emulation.
>>>> What about the call paths through hvmemul_{read,write}_msr()?
>>>> (It's also questionable whether the write paths need this -
>>>> the only MSR written outside of WRMSR emulation is
>>>> MSR_SHADOW_GS_BASE, which can't possibly reach the "unhandled"
>>>> logic anywhere. But maybe better to be future proof here in
>>>> case new MSR writes appear in the emulator, down the road.)
>>>
>>>
>>> Won't we end up in hvm_funcs.msr_write_intercept ops which do call it?
>>
>> Of course we will - the boolean will very likely need
>> propagating (a possible alternative being a per-vCPU flag
>> indicating "in emulator").
> 
> 
> Oh, I see what you mean. By per-vcpu flag you mean arch_vcpu field I assume?

Yes, a boolean in one of the arch-specific per-vCPU structs.
Whether that's arch_vcpu or perhaps something HVM specific is
another question.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-26  9:05           ` Jan Beulich
@ 2021-01-26 16:02             ` Boris Ostrovsky
  2021-01-26 16:35               ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-01-26 16:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel

On 21-01-26 10:05:47, Jan Beulich wrote:
> On 25.01.2021 19:42, Boris Ostrovsky wrote:
> > On 21-01-25 11:22:08, Jan Beulich wrote:
> >> On 22.01.2021 20:52, Boris Ostrovsky wrote:
> >>> On 1/22/21 7:51 AM, Jan Beulich wrote:
> >>>> On 20.01.2021 23:49, Boris Ostrovsky wrote:
> >>>>> +
> >>>>> +    /*
> >>>>> +     * Accesses to unimplemented MSRs as part of emulation of instructions
> >>>>> +     * other than guest's RDMSR/WRMSR should never succeed.
> >>>>> +     */
> >>>>> +    if ( !is_guest_msr_access )
> >>>>> +        ignore_msrs = MSR_UNHANDLED_NEVER;
> >>>>
> >>>> Wouldn't you better "return true" here? Such accesses also
> >>>> shouldn't be logged imo (albeit I agree that's a change from
> >>>> current behavior).
> >>>
> >>>
> >>> Yes, that's why I didn't return here. We will be here in !is_guest_msr_access case most likely due to a bug in the emulator so I think we do want to see the error logged.
> >>
> >> Why "most likely"?
> > 
> > 
> > OK, definitely ;-)
> 
> Oops - I was thinking the other way around, considering such
> to possibly be legitimate. It just so happens that curently
> we have no such path.

I was imagining a case where we'd add have another "ops->read_msr(_regs.ecx, ...)"
in the emulator and some values of ecx may not be tested. (Or even passing
an explicit index that is not handled, although that is highly unlikely to
pass code review).

Yes, these cases do not currently exist. 

> 
> > But I still think logging these accesses would be helpful.
> 
> Because of the above I continue to question this.
> 
> >>>>> +    if ( unlikely(ignore_msrs != MSR_UNHANDLED_NEVER) )
> >>>>> +        *val = 0;
> >>>>
> >>>> I don't understand the conditional here, even more so with
> >>>> the respective changelog entry. In any event you don't
> >>>> want to clobber the value ahead of ...
> >>>>
> >>>>> +    if ( likely(ignore_msrs != MSR_UNHANDLED_SILENT) )
> >>>>> +    {
> >>>>> +        if ( is_write )
> >>>>> +            gdprintk(XENLOG_WARNING, "WRMSR 0x%08x val 0x%016"PRIx64
> >>>>> +                    " unimplemented\n", msr, *val);
> >>>>
> >>>> ... logging it.
> >>>
> >>>
> >>> True. I dropped !is_write from v1 without considering this.
> >>>
> >>> As far as the conditional --- dropping it too would be a behavior change. 
> >>
> >> Albeit an intentional one then? Plus I think I have trouble
> >> seeing what behavior it would be that would change.
> > 
> > 
> > Currently callers of, say, read_msr() don't expect the argument that they pass in to change. Granted, they shouldn't (and AFAICS don't) look at it but it's a change nonetheless.
> 
> Hmm, I'm confused: The purpose of read_msr() is to change the
> value pointed at by the passed in argument.

Only if MSR exists/handled. We add parameters that allow it to be set
to zero for unhandled case but default (which is existing behavior, i.e.
MSR_UNHANDLED_NEVER) would leave the (pointed to) argument unchanged.

>  And for write_msr()
> the users of the hook pass the argument by value, i.e. wouldn't
> observe the changed value (it would only possibly be
> intermediate layers which might observe the change, but those
> ought to not care).

Yes, this would only be relevant to reads but since I dropped is_write check
the conditional covers both reads and writes.

In any case, this (and the one above) are minor issues and I don't mind
making the change.

> 
> >>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> >>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> >>>>> @@ -850,4 +850,10 @@ static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
> >>>>>      ctxt->event = (struct x86_event){};
> >>>>>  }
> >>>>>  
> >>>>> +static inline bool x86_emul_guest_msr_access(struct x86_emulate_ctxt *ctxt)
> >>>>
> >>>> The parameter wants to be pointer-to-const. In addition I wonder
> >>>> whether this wouldn't better be a sibling to
> >>>> x86_insn_is_cr_access() (without a "state" parameter, which
> >>>> would be unused and unavailable to the callers), which may end
> >>>> up finding further uses down the road.
> >>>
> >>>
> >>> "Sibling" in terms of name (yes, it would be) or something else?
> >>
> >> Name and (possible) purpose - a validate hook could want to
> >> make use of this, for example.
> > 
> > A validate hook? 
> 
> Quoting from struct x86_emulate_ops:
> 
>     /*
>      * validate: Post-decode, pre-emulate hook to allow caller controlled
>      * filtering.
>      */
>     int (*validate)(
>         const struct x86_emulate_state *state,
>         struct x86_emulate_ctxt *ctxt);
> 
> Granted to be directly usable the function would need to have a
> "state" parameter. As that's unused, having it have one and
> passing NULL in your case might be acceptable. But I also could
> see arguments towards this not being a good idea.

I see. Where would we need to call this hook though? We are never directly
emulating MSR access (compared to, for example, CR accesses where
x86_insn_is_cr_access is used). And for PV we already validate it in
emul-priv-op.c():validate().

> 
> >>>> I notice you use this function only from PV priv-op emulation.
> >>>> What about the call paths through hvmemul_{read,write}_msr()?
> >>>> (It's also questionable whether the write paths need this -
> >>>> the only MSR written outside of WRMSR emulation is
> >>>> MSR_SHADOW_GS_BASE, which can't possibly reach the "unhandled"
> >>>> logic anywhere. But maybe better to be future proof here in
> >>>> case new MSR writes appear in the emulator, down the road.)
> >>>
> >>>
> >>> Won't we end up in hvm_funcs.msr_write_intercept ops which do call it?
> >>
> >> Of course we will - the boolean will very likely need
> >> propagating (a possible alternative being a per-vCPU flag
> >> indicating "in emulator").
> > 
> > 
> > Oh, I see what you mean. By per-vcpu flag you mean arch_vcpu field I assume?
> 
> Yes, a boolean in one of the arch-specific per-vCPU structs.
> Whether that's arch_vcpu or perhaps something HVM specific is
> another question.

Currently we only need this for HVM but using it for both will avoid the need to
check guest type. 


-boris


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-26 16:02             ` Boris Ostrovsky
@ 2021-01-26 16:35               ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2021-01-26 16:35 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: iwj, wl, anthony.perard, andrew.cooper3, roger.pau, jun.nakajima,
	kevin.tian, xen-devel

On 26.01.2021 17:02, Boris Ostrovsky wrote:
> On 21-01-26 10:05:47, Jan Beulich wrote:
>> On 25.01.2021 19:42, Boris Ostrovsky wrote:
>>> On 21-01-25 11:22:08, Jan Beulich wrote:
>>>> On 22.01.2021 20:52, Boris Ostrovsky wrote:
>>>>> "Sibling" in terms of name (yes, it would be) or something else?
>>>>
>>>> Name and (possible) purpose - a validate hook could want to
>>>> make use of this, for example.
>>>
>>> A validate hook? 
>>
>> Quoting from struct x86_emulate_ops:
>>
>>     /*
>>      * validate: Post-decode, pre-emulate hook to allow caller controlled
>>      * filtering.
>>      */
>>     int (*validate)(
>>         const struct x86_emulate_state *state,
>>         struct x86_emulate_ctxt *ctxt);
>>
>> Granted to be directly usable the function would need to have a
>> "state" parameter. As that's unused, having it have one and
>> passing NULL in your case might be acceptable. But I also could
>> see arguments towards this not being a good idea.
> 
> I see. Where would we need to call this hook though? We are never directly
> emulating MSR access (compared to, for example, CR accesses where
> x86_insn_is_cr_access is used). And for PV we already validate it in
> emul-priv-op.c():validate().

If you look at some of the functions used for this hook, you may
realize that what your function does could also fit a hypothetical
future hook. Hence I was suggesting to make the new function
suitable for such use right away (and in particular fit their
naming scheme). But it looks like this has ended up more confusing
than anything else, so never mind.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-01-22 11:51   ` Jan Beulich
  2021-01-22 18:56     ` Boris Ostrovsky
@ 2021-02-02 17:01     ` Boris Ostrovsky
  1 sibling, 0 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-02 17:01 UTC (permalink / raw)
  To: Jan Beulich, andrew.cooper3
  Cc: iwj, wl, anthony.perard, roger.pau, jun.nakajima, kevin.tian, xen-devel


On 1/22/21 6:51 AM, Jan Beulich wrote:
> Also, Andrew, (I think I did say so before) - I definitely
> would want your general consent with this model, as what gets
> altered here is almost all relatively recent contributions
> by you. Nor would I exclude the approach being controversial.
>

Andrew, ping?


-boris



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
  2021-01-21 14:56   ` Wei Liu
  2021-01-22  9:52   ` Julien Grall
@ 2021-02-18 10:42   ` Roger Pau Monné
  2021-02-18 11:54     ` Jan Beulich
  2 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-18 10:42 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian

On Wed, Jan 20, 2021 at 05:49:09PM -0500, Boris Ostrovsky wrote:
> This option allows guest administrator specify what should happen when
> guest accesses an MSR which is not explicitly emulated by the hypervisor.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>  tools/libs/light/libxl_types.idl |  7 +++++++
>  tools/xl/xl_parse.c              |  7 +++++++
>  3 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index c8e017f950de..96ce97c42cab 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
>  See also "Virtual Machine Generation ID" by Microsoft:
>  L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
>  
> -=back 
> +=over
> +
> +=item B<ignore_msrs="STRING">
> +
> +Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.
> +
> +=over 4
> +
> +=item B<never>
> +
> +Issue a warning to the log and #GP to the guest. This is default.
> +
> +=item B<silent>
> +
> +MSR reads return 0, MSR writes are ignored. No warnings to the log.
> +
> +=item B<verbose>
> +
> +Similar to B<silent> but a warning is written.

Would it make sense to allow for this option to be more fine-grained
in the future?

Not that you need to implement the full thing now, but maybe we could
have something like:

"
=item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>

Specify a list of MSR ranges that will be ignored by the hypervisor:
reads will return zeros and writes will be discarded without raising a
#GP.

Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
"

Then you can print the messages in the hypervisor using a guest log
level and modify it on demand in order to get more verbose output?

I don't think selecting whether the messages are printed or not from
xl is that helpful as the same could be achieved using guest_loglvl.

Also I think it will be fine to only implement:

ignore_msrs=[ "0-ffffffff" ]

Right now and return an error for any other combination, so that we
can get something in soon and expand it later.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-01-20 22:49 ` [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED Boris Ostrovsky
  2021-01-22 11:51   ` Jan Beulich
@ 2021-02-18 10:51   ` Roger Pau Monné
  2021-02-19 14:56     ` Boris Ostrovsky
  1 sibling, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-18 10:51 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian

On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
> When toolstack updates MSR policy, this MSR offset (which is the last
> index in the hypervisor MSR range) is used to indicate hypervisor
> behavior when guest accesses an MSR which is not explicitly emulated.

It's kind of weird to use an MSR to store this. I assume this is done
for migration reasons?

Isn't is possible to convey this data in the xl migration stream
instead of having to pack it with MSRs?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-01-20 22:49 ` [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this Boris Ostrovsky
  2021-01-22 12:51   ` Jan Beulich
@ 2021-02-18 11:24   ` Roger Pau Monné
  2021-02-18 11:57     ` Jan Beulich
  1 sibling, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-18 11:24 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian

On Wed, Jan 20, 2021 at 05:49:11PM -0500, Boris Ostrovsky wrote:
> Starting with commit 84e848fd7a16 ("x86/hvm: disallow access to unknown MSRs")
> accesses to unhandled MSRs result in #GP sent to the guest. This caused a
> regression for Solaris who tries to acccess MSR_RAPL_POWER_UNIT and (unlike,
> for example, Linux) does not catch exceptions when accessing MSRs that
> potentially may not be present.
> 
> Instead of special-casing RAPL registers we decide what to do when any
> non-emulated MSR is accessed based on ignore_msrs field of msr_policy.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
> Changes in v2:
> * define x86_emul_guest_msr_access() and use it to determine whether emulated
>   instruction is rd/wrmsr.
> * Don't use ignore_msrs for MSR accesses that are not guest's rd/wrmsr.
> * Clear @val for writes too in guest_unhandled_msr()
> 
>  xen/arch/x86/hvm/svm/svm.c             | 10 ++++------
>  xen/arch/x86/hvm/vmx/vmx.c             | 10 ++++------
>  xen/arch/x86/msr.c                     | 28 ++++++++++++++++++++++++++++
>  xen/arch/x86/pv/emul-priv-op.c         | 10 ++++++----
>  xen/arch/x86/x86_emulate/x86_emulate.h |  6 ++++++
>  xen/include/asm-x86/msr.h              |  3 +++
>  6 files changed, 51 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index b819897a4a9f..7b59885b2619 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1965,8 +1965,8 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>          break;
>  
>      default:
> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
> -        goto gpf;
> +        if ( guest_unhandled_msr(v, msr, msr_content, false, true) )
> +            goto gpf;
>      }
>  
>      HVM_DBG_LOG(DBG_LEVEL_MSR, "returns: ecx=%x, msr_value=%"PRIx64,
> @@ -2151,10 +2151,8 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
>          break;
>  
>      default:
> -        gdprintk(XENLOG_WARNING,
> -                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
> -                 msr, msr_content);
> -        goto gpf;
> +        if ( guest_unhandled_msr(v, msr, &msr_content, true, true) )
> +            goto gpf;
>      }
>  
>      return X86EMUL_OKAY;
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 2d4475ee3de2..87baca57d33f 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3017,8 +3017,8 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>              break;
>          }
>  
> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
> -        goto gp_fault;
> +        if ( guest_unhandled_msr(curr, msr, msr_content, false, true) )
> +            goto gp_fault;
>      }
>  
>  done:
> @@ -3319,10 +3319,8 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
>               is_last_branch_msr(msr) )
>              break;
>  
> -        gdprintk(XENLOG_WARNING,
> -                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
> -                 msr, msr_content);
> -        goto gp_fault;
> +        if ( guest_unhandled_msr(v, msr, &msr_content, true, true) )
> +            goto gp_fault;
>      }

I think this could be done in hvm_msr_read_intercept instead of having
to call guest_unhandled_msr from each vendor specific handler?

Oh, I see, that's likely done to differentiate between guest MSR
accesses and emulator ones? I'm not sure we really need to make a
difference between guests MSR accesses and emulator ones, surely in
the past they would be treated equally?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest
  2021-01-20 22:49 ` [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest Boris Ostrovsky
  2021-01-21 14:58   ` Wei Liu
  2021-01-22  9:56   ` Julien Grall
@ 2021-02-18 11:48   ` Roger Pau Monné
  2021-02-19 14:57     ` Boris Ostrovsky
  2 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-18 11:48 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian

On Wed, Jan 20, 2021 at 05:49:12PM -0500, Boris Ostrovsky wrote:
> When creating a guest, if ignore_msrs option has been specified,
> apply it to guest's MSR policy.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
>  tools/include/xenctrl.h           |   2 +
>  tools/libs/guest/Makefile         |   1 +
>  tools/libs/guest/xg_msrs_x86.c    | 110 ++++++++++++++++++++++++++++++++++++++
>  tools/libs/light/libxl_dom.c      |   5 +-
>  tools/libs/light/libxl_internal.h |   2 +
>  tools/libs/light/libxl_x86.c      |   7 +++
>  6 files changed, 125 insertions(+), 2 deletions(-)
>  create mode 100644 tools/libs/guest/xg_msrs_x86.c
> 
> diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
> index 3796425e1eca..1d6a38e73dcf 100644
> --- a/tools/include/xenctrl.h
> +++ b/tools/include/xenctrl.h
> @@ -1835,6 +1835,8 @@ int xc_cpuid_apply_policy(xc_interface *xch,
>                            const uint32_t *featureset,
>                            unsigned int nr_features, bool pae, bool itsc,
>                            bool nested_virt, const struct xc_xend_cpuid *xend);
> +int xc_msr_apply_policy(xc_interface *xch, uint32_t domid,
> +                        unsigned int ignore_msr);
>  int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
>  int xc_mca_op_inject_v2(xc_interface *xch, unsigned int flags,
>                          xc_cpumap_t cpumap, unsigned int nr_cpus);
> diff --git a/tools/libs/guest/Makefile b/tools/libs/guest/Makefile
> index 1c729040b337..452155ea0385 100644
> --- a/tools/libs/guest/Makefile
> +++ b/tools/libs/guest/Makefile
> @@ -56,6 +56,7 @@ SRCS-y                 += xg_dom_compat_linux.c
>  
>  SRCS-$(CONFIG_X86)     += xg_dom_x86.c
>  SRCS-$(CONFIG_X86)     += xg_cpuid_x86.c
> +SRCS-$(CONFIG_X86)     += xg_msrs_x86.c
>  SRCS-$(CONFIG_ARM)     += xg_dom_arm.c
>  
>  ifeq ($(CONFIG_LIBXC_MINIOS),y)
> diff --git a/tools/libs/guest/xg_msrs_x86.c b/tools/libs/guest/xg_msrs_x86.c
> new file mode 100644
> index 000000000000..464ce9292ad8
> --- /dev/null
> +++ b/tools/libs/guest/xg_msrs_x86.c
> @@ -0,0 +1,110 @@
> +/******************************************************************************
> + * xc_msrs_x86.c
> + *
> + * Update MSR policy of a domain.
> + *
> + * Copyright (c) 2021, Oracle and/or its affiliates.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation;
> + * version 2.1 of the License.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "xc_private.h"
> +#include "xen/lib/x86/msr.h"
> +
> +
> +
> +int xc_msr_apply_policy(xc_interface *xch, uint32_t domid, unsigned int ignore_msr)
> +{
> +    int rc;
> +    unsigned int nr_leaves, nr_msrs;
> +    xen_msr_entry_t *msrs = NULL;
> +    struct msr_policy *p = NULL;
> +    xc_dominfo_t di;
> +    unsigned int err_leaf, err_subleaf, err_msr;
> +
> +    if ( xc_domain_getinfo(xch, domid, 1, &di) != 1 ||
> +         di.domid != domid )
> +    {
> +        ERROR("Failed to obtain d%d info", domid);
> +        rc = -ESRCH;
> +        goto out;
> +    }
> +
> +    rc = xc_get_cpu_policy_size(xch, &nr_leaves, &nr_msrs);
> +    if ( rc )
> +    {
> +        PERROR("Failed to obtain policy info size");
> +        rc = -errno;
> +        goto out;
> +    }
> +
> +    rc = -ENOMEM;
> +    if ( (msrs = calloc(nr_msrs, sizeof(*msrs))) == NULL ||
> +         (p = calloc(1, sizeof(*p))) == NULL )
> +        goto out;
> +
> +    /* Get the domain's default policy. */
> +    nr_leaves = 0;
> +    rc = xc_get_system_cpu_policy(xch, di.hvm ? XEN_SYSCTL_cpu_policy_hvm_default
> +                                              : XEN_SYSCTL_cpu_policy_pv_default,
> +                                  &nr_leaves, NULL, &nr_msrs, msrs);
> +    if ( rc )
> +    {
> +        PERROR("Failed to obtain %s default policy", di.hvm ? "hvm" : "pv");
> +        rc = -errno;
> +        goto out;
> +    }

Why not use xc_get_domain_cpu_policy instead so that you can avoid the
call to xc_domain_getinfo?

It would also seem safer, as you won't be discarding any adjustments
made to the default policy by the hypervisor for this specific domain.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-02-18 10:42   ` Roger Pau Monné
@ 2021-02-18 11:54     ` Jan Beulich
  2021-02-18 15:52       ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-18 11:54 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian, Boris Ostrovsky

On 18.02.2021 11:42, Roger Pau Monné wrote:
> On Wed, Jan 20, 2021 at 05:49:09PM -0500, Boris Ostrovsky wrote:
>> This option allows guest administrator specify what should happen when
>> guest accesses an MSR which is not explicitly emulated by the hypervisor.
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> ---
>>  docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
>>  tools/libs/light/libxl_types.idl |  7 +++++++
>>  tools/xl/xl_parse.c              |  7 +++++++
>>  3 files changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
>> index c8e017f950de..96ce97c42cab 100644
>> --- a/docs/man/xl.cfg.5.pod.in
>> +++ b/docs/man/xl.cfg.5.pod.in
>> @@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
>>  See also "Virtual Machine Generation ID" by Microsoft:
>>  L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
>>  
>> -=back 
>> +=over
>> +
>> +=item B<ignore_msrs="STRING">
>> +
>> +Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.
>> +
>> +=over 4
>> +
>> +=item B<never>
>> +
>> +Issue a warning to the log and #GP to the guest. This is default.
>> +
>> +=item B<silent>
>> +
>> +MSR reads return 0, MSR writes are ignored. No warnings to the log.
>> +
>> +=item B<verbose>
>> +
>> +Similar to B<silent> but a warning is written.
> 
> Would it make sense to allow for this option to be more fine-grained
> in the future?

From an abstract perspective - maybe. But remember that this information
will need to be migrated with the guest. It would seem to me that
Boris'es approach is easier migration-wise.

> Not that you need to implement the full thing now, but maybe we could
> have something like:
> 
> "
> =item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>
> 
> Specify a list of MSR ranges that will be ignored by the hypervisor:
> reads will return zeros and writes will be discarded without raising a
> #GP.
> 
> Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
> c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
> "
> 
> Then you can print the messages in the hypervisor using a guest log
> level and modify it on demand in order to get more verbose output?

"Modify on demand"? Irrespective of what you mean with this, ...

> I don't think selecting whether the messages are printed or not from
> xl is that helpful as the same could be achieved using guest_loglvl.

... controlling this via guest_loglvl would affect various other
log messages' visibility.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-02-18 11:24   ` Roger Pau Monné
@ 2021-02-18 11:57     ` Jan Beulich
  2021-02-18 15:53       ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-18 11:57 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian, Boris Ostrovsky

On 18.02.2021 12:24, Roger Pau Monné wrote:
> On Wed, Jan 20, 2021 at 05:49:11PM -0500, Boris Ostrovsky wrote:
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -3017,8 +3017,8 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>>              break;
>>          }
>>  
>> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
>> -        goto gp_fault;
>> +        if ( guest_unhandled_msr(curr, msr, msr_content, false, true) )
>> +            goto gp_fault;
>>      }
>>  
>>  done:
>> @@ -3319,10 +3319,8 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
>>               is_last_branch_msr(msr) )
>>              break;
>>  
>> -        gdprintk(XENLOG_WARNING,
>> -                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
>> -                 msr, msr_content);
>> -        goto gp_fault;
>> +        if ( guest_unhandled_msr(v, msr, &msr_content, true, true) )
>> +            goto gp_fault;
>>      }
> 
> I think this could be done in hvm_msr_read_intercept instead of having
> to call guest_unhandled_msr from each vendor specific handler?
> 
> Oh, I see, that's likely done to differentiate between guest MSR
> accesses and emulator ones? I'm not sure we really need to make a
> difference between guests MSR accesses and emulator ones, surely in
> the past they would be treated equally?

We did discuss this before. Even if they were treated the same in
the past, that's not correct, and hence we shouldn't suppress the
distinction going forward. A guest explicitly asking to access an
MSR (via RDMSR/WRMSR) is entirely different from the emulator
perhaps just probing an MSR, falling back to some default behavior
if it's unavailable.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-02-18 11:54     ` Jan Beulich
@ 2021-02-18 15:52       ` Roger Pau Monné
  2021-02-18 15:57         ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-18 15:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian, Boris Ostrovsky

On Thu, Feb 18, 2021 at 12:54:13PM +0100, Jan Beulich wrote:
> On 18.02.2021 11:42, Roger Pau Monné wrote:
> > On Wed, Jan 20, 2021 at 05:49:09PM -0500, Boris Ostrovsky wrote:
> >> This option allows guest administrator specify what should happen when
> >> guest accesses an MSR which is not explicitly emulated by the hypervisor.
> >>
> >> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> >> ---
> >>  docs/man/xl.cfg.5.pod.in         | 20 +++++++++++++++++++-
> >>  tools/libs/light/libxl_types.idl |  7 +++++++
> >>  tools/xl/xl_parse.c              |  7 +++++++
> >>  3 files changed, 33 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> >> index c8e017f950de..96ce97c42cab 100644
> >> --- a/docs/man/xl.cfg.5.pod.in
> >> +++ b/docs/man/xl.cfg.5.pod.in
> >> @@ -2044,7 +2044,25 @@ Do not provide a VM generation ID.
> >>  See also "Virtual Machine Generation ID" by Microsoft:
> >>  L<https://docs.microsoft.com/en-us/windows/win32/hyperv_v2/virtual-machine-generation-identifier>
> >>  
> >> -=back 
> >> +=over
> >> +
> >> +=item B<ignore_msrs="STRING">
> >> +
> >> +Determine hypervisor behavior on accesses to MSRs that are not emulated by the hypervisor.
> >> +
> >> +=over 4
> >> +
> >> +=item B<never>
> >> +
> >> +Issue a warning to the log and #GP to the guest. This is default.
> >> +
> >> +=item B<silent>
> >> +
> >> +MSR reads return 0, MSR writes are ignored. No warnings to the log.
> >> +
> >> +=item B<verbose>
> >> +
> >> +Similar to B<silent> but a warning is written.
> > 
> > Would it make sense to allow for this option to be more fine-grained
> > in the future?
> 
> From an abstract perspective - maybe. But remember that this information
> will need to be migrated with the guest. It would seem to me that
> Boris'es approach is easier migration-wise.

I'm not an expert on migration, but I seem to recall there's already a
libxl blob that gets migrated that contains the domain configuration,
so having the MSR configuration there seems like a sensible thing to
do.

> > Not that you need to implement the full thing now, but maybe we could
> > have something like:
> > 
> > "
> > =item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>
> > 
> > Specify a list of MSR ranges that will be ignored by the hypervisor:
> > reads will return zeros and writes will be discarded without raising a
> > #GP.
> > 
> > Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
> > c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
> > "
> > 
> > Then you can print the messages in the hypervisor using a guest log
> > level and modify it on demand in order to get more verbose output?
> 
> "Modify on demand"? Irrespective of what you mean with this, ...
> 
> > I don't think selecting whether the messages are printed or not from
> > xl is that helpful as the same could be achieved using guest_loglvl.
> 
> ... controlling this via guest_loglvl would affect various other
> log messages' visibility.

Right, but do we really need this level of per-guest log control,
implemented in this way exclusively for MSRs?

We don't have a way for other parts of the code to have such
fine-grained control about what messages should be printed, and I
don't think MSR should be an exception. I assume this would be used to
detect which MSRs a guest is trying to access, and I would be fine
just using guest_loglvl to that end, the more that it can be modified
at run time now.

In any case I'm more worried about having a big switch to ignore all
unhandled MSRs rather than whether accesses should print a message or
not. Worse come to worse we could always add a new option afterwards
to selectively ignore MSR access, but that would be confusing for
users IMO.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this
  2021-02-18 11:57     ` Jan Beulich
@ 2021-02-18 15:53       ` Roger Pau Monné
  0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-18 15:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian, Boris Ostrovsky

On Thu, Feb 18, 2021 at 12:57:13PM +0100, Jan Beulich wrote:
> On 18.02.2021 12:24, Roger Pau Monné wrote:
> > On Wed, Jan 20, 2021 at 05:49:11PM -0500, Boris Ostrovsky wrote:
> >> --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> @@ -3017,8 +3017,8 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
> >>              break;
> >>          }
> >>  
> >> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
> >> -        goto gp_fault;
> >> +        if ( guest_unhandled_msr(curr, msr, msr_content, false, true) )
> >> +            goto gp_fault;
> >>      }
> >>  
> >>  done:
> >> @@ -3319,10 +3319,8 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
> >>               is_last_branch_msr(msr) )
> >>              break;
> >>  
> >> -        gdprintk(XENLOG_WARNING,
> >> -                 "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
> >> -                 msr, msr_content);
> >> -        goto gp_fault;
> >> +        if ( guest_unhandled_msr(v, msr, &msr_content, true, true) )
> >> +            goto gp_fault;
> >>      }
> > 
> > I think this could be done in hvm_msr_read_intercept instead of having
> > to call guest_unhandled_msr from each vendor specific handler?
> > 
> > Oh, I see, that's likely done to differentiate between guest MSR
> > accesses and emulator ones? I'm not sure we really need to make a
> > difference between guests MSR accesses and emulator ones, surely in
> > the past they would be treated equally?
> 
> We did discuss this before. Even if they were treated the same in
> the past, that's not correct, and hence we shouldn't suppress the
> distinction going forward. A guest explicitly asking to access an
> MSR (via RDMSR/WRMSR) is entirely different from the emulator
> perhaps just probing an MSR, falling back to some default behavior
> if it's unavailable.

Ack, then placing the calls to guest_unhandled_msr in vendor code
seems like the best option.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-02-18 15:52       ` Roger Pau Monné
@ 2021-02-18 15:57         ` Jan Beulich
  2021-02-19 14:50           ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-18 15:57 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian, Boris Ostrovsky

On 18.02.2021 16:52, Roger Pau Monné wrote:
> On Thu, Feb 18, 2021 at 12:54:13PM +0100, Jan Beulich wrote:
>> On 18.02.2021 11:42, Roger Pau Monné wrote:
>>> Not that you need to implement the full thing now, but maybe we could
>>> have something like:
>>>
>>> "
>>> =item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>
>>>
>>> Specify a list of MSR ranges that will be ignored by the hypervisor:
>>> reads will return zeros and writes will be discarded without raising a
>>> #GP.
>>>
>>> Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
>>> c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
>>> "
>>>
>>> Then you can print the messages in the hypervisor using a guest log
>>> level and modify it on demand in order to get more verbose output?
>>
>> "Modify on demand"? Irrespective of what you mean with this, ...
>>
>>> I don't think selecting whether the messages are printed or not from
>>> xl is that helpful as the same could be achieved using guest_loglvl.
>>
>> ... controlling this via guest_loglvl would affect various other
>> log messages' visibility.
> 
> Right, but do we really need this level of per-guest log control,
> implemented in this way exclusively for MSRs?
> 
> We don't have a way for other parts of the code to have such
> fine-grained control about what messages should be printed, and I
> don't think MSR should be an exception. I assume this would be used to
> detect which MSRs a guest is trying to access, and I would be fine
> just using guest_loglvl to that end, the more that it can be modified
> at run time now.

I can certainly see your point. The problem is that for guests
heavily accessing such MSRs, all other messages may disappear
due to rate limiting. That's not going to be helpful.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-02-18 15:57         ` Jan Beulich
@ 2021-02-19 14:50           ` Boris Ostrovsky
  2021-02-22 10:24             ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-19 14:50 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian


On 2/18/21 10:57 AM, Jan Beulich wrote:
> On 18.02.2021 16:52, Roger Pau Monné wrote:
>> On Thu, Feb 18, 2021 at 12:54:13PM +0100, Jan Beulich wrote:
>>> On 18.02.2021 11:42, Roger Pau Monné wrote:
>>>> Not that you need to implement the full thing now, but maybe we could
>>>> have something like:
>>>>
>>>> "
>>>> =item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>
>>>>
>>>> Specify a list of MSR ranges that will be ignored by the hypervisor:
>>>> reads will return zeros and writes will be discarded without raising a
>>>> #GP.
>>>>
>>>> Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
>>>> c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
>>>> "
>>>>
>>>> Then you can print the messages in the hypervisor using a guest log
>>>> level and modify it on demand in order to get more verbose output?
>>> "Modify on demand"? Irrespective of what you mean with this, ...
>>>
>>>> I don't think selecting whether the messages are printed or not from
>>>> xl is that helpful as the same could be achieved using guest_loglvl.
>>> ... controlling this via guest_loglvl would affect various other
>>> log messages' visibility.
>> Right, but do we really need this level of per-guest log control,
>> implemented in this way exclusively for MSRs?


In a multi-tenant environment we may need to figure out why a particular guest is failing to boot, without affecting behavior of other guests.


If we had per-guest log level in general then what you are suggesting would be the right thing to do IMO. Maybe that's what we should add?


-boris


>>
>> We don't have a way for other parts of the code to have such
>> fine-grained control about what messages should be printed, and I
>> don't think MSR should be an exception. I assume this would be used to
>> detect which MSRs a guest is trying to access, and I would be fine
>> just using guest_loglvl to that end, the more that it can be modified
>> at run time now.
> I can certainly see your point. The problem is that for guests
> heavily accessing such MSRs, all other messages may disappear
> due to rate limiting. That's not going to be helpful.





^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-18 10:51   ` Roger Pau Monné
@ 2021-02-19 14:56     ` Boris Ostrovsky
  2021-02-22 11:08       ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-19 14:56 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian


On 2/18/21 5:51 AM, Roger Pau Monné wrote:
> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>> When toolstack updates MSR policy, this MSR offset (which is the last
>> index in the hypervisor MSR range) is used to indicate hypervisor
>> behavior when guest accesses an MSR which is not explicitly emulated.
> It's kind of weird to use an MSR to store this. I assume this is done
> for migration reasons?


Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?


> Isn't is possible to convey this data in the xl migration stream
> instead of having to pack it with MSRs?


I haven't looked at this but again --- the feature itself has nothing to do with migration. The fact that folding it into policy makes migration of this information "just work" is just a nice side benefit of this implementation.


-boris





^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest
  2021-02-18 11:48   ` Roger Pau Monné
@ 2021-02-19 14:57     ` Boris Ostrovsky
  0 siblings, 0 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-19 14:57 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian


On 2/18/21 6:48 AM, Roger Pau Monné wrote:
>
>> +    /* Get the domain's default policy. */
>> +    nr_leaves = 0;
>> +    rc = xc_get_system_cpu_policy(xch, di.hvm ? XEN_SYSCTL_cpu_policy_hvm_default
>> +                                              : XEN_SYSCTL_cpu_policy_pv_default,
>> +                                  &nr_leaves, NULL, &nr_msrs, msrs);
>> +    if ( rc )
>> +    {
>> +        PERROR("Failed to obtain %s default policy", di.hvm ? "hvm" : "pv");
>> +        rc = -errno;
>> +        goto out;
>> +    }
> Why not use xc_get_domain_cpu_policy instead so that you can avoid the
> call to xc_domain_getinfo?


Yes, indeed.


-boris


>
> It would also seem safer, as you won't be discarding any adjustments
> made to the default policy by the hypervisor for this specific domain.
>
> Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-02-19 14:50           ` Boris Ostrovsky
@ 2021-02-22 10:24             ` Roger Pau Monné
  2021-02-22 10:33               ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-22 10:24 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Jan Beulich, xen-devel, iwj, wl, anthony.perard, andrew.cooper3,
	jun.nakajima, kevin.tian

On Fri, Feb 19, 2021 at 09:50:12AM -0500, Boris Ostrovsky wrote:
> 
> On 2/18/21 10:57 AM, Jan Beulich wrote:
> > On 18.02.2021 16:52, Roger Pau Monné wrote:
> >> On Thu, Feb 18, 2021 at 12:54:13PM +0100, Jan Beulich wrote:
> >>> On 18.02.2021 11:42, Roger Pau Monné wrote:
> >>>> Not that you need to implement the full thing now, but maybe we could
> >>>> have something like:
> >>>>
> >>>> "
> >>>> =item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>
> >>>>
> >>>> Specify a list of MSR ranges that will be ignored by the hypervisor:
> >>>> reads will return zeros and writes will be discarded without raising a
> >>>> #GP.
> >>>>
> >>>> Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
> >>>> c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
> >>>> "
> >>>>
> >>>> Then you can print the messages in the hypervisor using a guest log
> >>>> level and modify it on demand in order to get more verbose output?
> >>> "Modify on demand"? Irrespective of what you mean with this, ...
> >>>
> >>>> I don't think selecting whether the messages are printed or not from
> >>>> xl is that helpful as the same could be achieved using guest_loglvl.
> >>> ... controlling this via guest_loglvl would affect various other
> >>> log messages' visibility.
> >> Right, but do we really need this level of per-guest log control,
> >> implemented in this way exclusively for MSRs?
> 
> 
> In a multi-tenant environment we may need to figure out why a particular guest is failing to boot, without affecting behavior of other guests.
> 
> 
> If we had per-guest log level in general then what you are suggesting would be the right thing to do IMO. Maybe that's what we should add?

Yes, that would seem better IMO, but I don't think it's fair to ask
you to do that work.

Do you think it would be acceptable to untangle both, and try to get
the MSR stuff without any logging changes?

I know we would be addressing only one part of what the series
originally tried to achieve, but I would rather prefer to have a
generic way to set a per-guest log level rather than something
specific to MSR accesses.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 1/4] xl: Add support for ignore_msrs option
  2021-02-22 10:24             ` Roger Pau Monné
@ 2021-02-22 10:33               ` Jan Beulich
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Beulich @ 2021-02-22 10:33 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian, Boris Ostrovsky

On 22.02.2021 11:24, Roger Pau Monné wrote:
> On Fri, Feb 19, 2021 at 09:50:12AM -0500, Boris Ostrovsky wrote:
>>
>> On 2/18/21 10:57 AM, Jan Beulich wrote:
>>> On 18.02.2021 16:52, Roger Pau Monné wrote:
>>>> On Thu, Feb 18, 2021 at 12:54:13PM +0100, Jan Beulich wrote:
>>>>> On 18.02.2021 11:42, Roger Pau Monné wrote:
>>>>>> Not that you need to implement the full thing now, but maybe we could
>>>>>> have something like:
>>>>>>
>>>>>> "
>>>>>> =item B<ignore_msrs=[ "MSR_RANGE, "MSR_RANGE", ..]>
>>>>>>
>>>>>> Specify a list of MSR ranges that will be ignored by the hypervisor:
>>>>>> reads will return zeros and writes will be discarded without raising a
>>>>>> #GP.
>>>>>>
>>>>>> Each MSR_RANGE is given in hexadecimal format and may be a range, e.g.
>>>>>> c00102f0-c00102f1 (inclusive), or a single MSR, e.g. c00102f1.
>>>>>> "
>>>>>>
>>>>>> Then you can print the messages in the hypervisor using a guest log
>>>>>> level and modify it on demand in order to get more verbose output?
>>>>> "Modify on demand"? Irrespective of what you mean with this, ...
>>>>>
>>>>>> I don't think selecting whether the messages are printed or not from
>>>>>> xl is that helpful as the same could be achieved using guest_loglvl.
>>>>> ... controlling this via guest_loglvl would affect various other
>>>>> log messages' visibility.
>>>> Right, but do we really need this level of per-guest log control,
>>>> implemented in this way exclusively for MSRs?
>>
>>
>> In a multi-tenant environment we may need to figure out why a particular guest is failing to boot, without affecting behavior of other guests.
>>
>>
>> If we had per-guest log level in general then what you are suggesting would be the right thing to do IMO. Maybe that's what we should add?
> 
> Yes, that would seem better IMO, but I don't think it's fair to ask
> you to do that work.
> 
> Do you think it would be acceptable to untangle both, and try to get
> the MSR stuff without any logging changes?
> 
> I know we would be addressing only one part of what the series
> originally tried to achieve, but I would rather prefer to have a
> generic way to set a per-guest log level rather than something
> specific to MSR accesses.

TBH I'd see us go the other route: Follow Boris'es approach for
4.15, and switch the logging control to per-guest once that
ability is there, _and_ if we're really convinced we don't want
to have this extra level of control. The latter because I think
a domain could end up pretty chatty just because of MSR accesses,
and it might therefore be undesirable to also hide all other
potentially relevant output. Perhaps the per-domain log level
control needs to be finer grained than what "guest_loglvl="
currently permits, more like what "hvm_debug=" has.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-19 14:56     ` Boris Ostrovsky
@ 2021-02-22 11:08       ` Roger Pau Monné
  2021-02-22 21:19         ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-22 11:08 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian

On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
> 
> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
> > On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
> >> When toolstack updates MSR policy, this MSR offset (which is the last
> >> index in the hypervisor MSR range) is used to indicate hypervisor
> >> behavior when guest accesses an MSR which is not explicitly emulated.
> > It's kind of weird to use an MSR to store this. I assume this is done
> > for migration reasons?
> 
> 
> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?

I agree that using the msr_policy seems like the most suitable place
to convey this information between the toolstack and Xen. I wonder if
it would be fine to have fields in msr_policy that don't directly
translate into an MSR value?

But having such a list of ignored MSRs in msr_policy makes the whole
get/set logic a bit weird, as the user would have to provide a buffer
in order to get the list of ignored MSRs.

> 
> > Isn't is possible to convey this data in the xl migration stream
> > instead of having to pack it with MSRs?
> 
> 
> I haven't looked at this but again --- the feature itself has nothing to do with migration. The fact that folding it into policy makes migration of this information "just work" is just a nice side benefit of this implementation.

IMO it feels slightly weird that we have to use a MSR (MSR_UNHANDLED)
to store this option, seems like wasting an MSR index when there's
really no need for it to be stored in an MSR, as we don't expose it to
guests.

It would seem more natural for such option to live in arch_domain as a
rangeset for example.

Maybe introduce a new DOMCTL to set it?

#define XEN_DOMCTL_msr_set_ignore ...
struct xen_domctl_msr_set_ignore {
    uint32_t index;
    uint32_t size;
};

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-22 11:08       ` Roger Pau Monné
@ 2021-02-22 21:19         ` Boris Ostrovsky
  2021-02-23  7:57           ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-22 21:19 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, jbeulich, andrew.cooper3,
	jun.nakajima, kevin.tian


On 2/22/21 6:08 AM, Roger Pau Monné wrote:
> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>>>> When toolstack updates MSR policy, this MSR offset (which is the last
>>>> index in the hypervisor MSR range) is used to indicate hypervisor
>>>> behavior when guest accesses an MSR which is not explicitly emulated.
>>> It's kind of weird to use an MSR to store this. I assume this is done
>>> for migration reasons?
>>
>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
> I agree that using the msr_policy seems like the most suitable place
> to convey this information between the toolstack and Xen. I wonder if
> it would be fine to have fields in msr_policy that don't directly
> translate into an MSR value?


We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).


>
> But having such a list of ignored MSRs in msr_policy makes the whole
> get/set logic a bit weird, as the user would have to provide a buffer
> in order to get the list of ignored MSRs.


If we go with ranges/lists of ignored MSRs then we will need to have ignore_msrs as a rangeset in msr_policy, not as (current) uint8. And xen_msr_entry_t will need to have a range as opposed to single index. Or maybe I don't understand what you are referring to as get/set logic.


But I would like to make sure we really want to support such ranges, I am a little concerned about over-engineering this (especially wrt migration, I think it will need marshalling/unmarshalling)


>>> Isn't is possible to convey this data in the xl migration stream
>>> instead of having to pack it with MSRs?
>>
>> I haven't looked at this but again --- the feature itself has nothing to do with migration. The fact that folding it into policy makes migration of this information "just work" is just a nice side benefit of this implementation.
> IMO it feels slightly weird that we have to use a MSR (MSR_UNHANDLED)
> to store this option, seems like wasting an MSR index when there's
> really no need for it to be stored in an MSR, as we don't expose it to
> guests.
>
> It would seem more natural for such option to live in arch_domain as a
> rangeset for example.
>
> Maybe introduce a new DOMCTL to set it?
>
> #define XEN_DOMCTL_msr_set_ignore ...
> struct xen_domctl_msr_set_ignore {
>     uint32_t index;
>     uint32_t size;
> };


That will work too but this is adding 2 new domctls (I think we will need a "get" too) whereas with policy we use existing interface.


-boris



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-22 21:19         ` Boris Ostrovsky
@ 2021-02-23  7:57           ` Jan Beulich
  2021-02-23  9:34             ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-23  7:57 UTC (permalink / raw)
  To: Boris Ostrovsky, Roger Pau Monné
  Cc: xen-devel, iwj, wl, anthony.perard, andrew.cooper3, jun.nakajima,
	kevin.tian

On 22.02.2021 22:19, Boris Ostrovsky wrote:
> 
> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
>>>> It's kind of weird to use an MSR to store this. I assume this is done
>>>> for migration reasons?
>>>
>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
>> I agree that using the msr_policy seems like the most suitable place
>> to convey this information between the toolstack and Xen. I wonder if
>> it would be fine to have fields in msr_policy that don't directly
>> translate into an MSR value?
> 
> 
> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).

Which, just to clarify, was not the least because of the flags
field being per-entry, i.e. per MSR, while here we want a global
indicator.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23  7:57           ` Jan Beulich
@ 2021-02-23  9:34             ` Roger Pau Monné
  2021-02-23 10:15               ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-23  9:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Boris Ostrovsky, xen-devel, iwj, wl, anthony.perard,
	andrew.cooper3, jun.nakajima, kevin.tian

On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
> On 22.02.2021 22:19, Boris Ostrovsky wrote:
> > 
> > On 2/22/21 6:08 AM, Roger Pau Monné wrote:
> >> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
> >>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
> >>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
> >>>>> When toolstack updates MSR policy, this MSR offset (which is the last
> >>>>> index in the hypervisor MSR range) is used to indicate hypervisor
> >>>>> behavior when guest accesses an MSR which is not explicitly emulated.
> >>>> It's kind of weird to use an MSR to store this. I assume this is done
> >>>> for migration reasons?
> >>>
> >>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
> >> I agree that using the msr_policy seems like the most suitable place
> >> to convey this information between the toolstack and Xen. I wonder if
> >> it would be fine to have fields in msr_policy that don't directly
> >> translate into an MSR value?
> > 
> > 
> > We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
> 
> Which, just to clarify, was not the least because of the flags
> field being per-entry, i.e. per MSR, while here we want a global
> indicator.

We could exploit a bit the xen_msr_entry_t structure and use it
like:

typedef struct xen_msr_entry {
    uint32_t idx;
#define XEN_MSR_IGNORE (1u << 0)
    uint32_t flags;
    uint64_t val;
} xen_msr_entry_t;

Then use the idx and val fields to signal the start and size of the
range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
field?

xen_msr_entry_t = {
    .idx = 0,
    .val = 0xffffffff,
    .flags = XEN_MSR_IGNORE,
};

Would be equivalent to ignoring accesses to the whole MSR range.

This would allow selectively selecting which MSRs to ignore, while
not wasting a MSR itself to convey the information?

It would still need to be stored somewhere in the Xen internal domain
structure using a rangeset I think, which could be translated back and
forth into this xen_msr_entry_t format.

Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23  9:34             ` Roger Pau Monné
@ 2021-02-23 10:15               ` Jan Beulich
  2021-02-23 12:17                 ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-23 10:15 UTC (permalink / raw)
  To: Roger Pau Monné, andrew.cooper3
  Cc: Boris Ostrovsky, xen-devel, iwj, wl, anthony.perard,
	jun.nakajima, kevin.tian

On 23.02.2021 10:34, Roger Pau Monné wrote:
> On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
>> On 22.02.2021 22:19, Boris Ostrovsky wrote:
>>>
>>> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
>>>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
>>>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
>>>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>>>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
>>>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
>>>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
>>>>>> It's kind of weird to use an MSR to store this. I assume this is done
>>>>>> for migration reasons?
>>>>>
>>>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
>>>> I agree that using the msr_policy seems like the most suitable place
>>>> to convey this information between the toolstack and Xen. I wonder if
>>>> it would be fine to have fields in msr_policy that don't directly
>>>> translate into an MSR value?
>>>
>>>
>>> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
>>
>> Which, just to clarify, was not the least because of the flags
>> field being per-entry, i.e. per MSR, while here we want a global
>> indicator.
> 
> We could exploit a bit the xen_msr_entry_t structure and use it
> like:
> 
> typedef struct xen_msr_entry {
>     uint32_t idx;
> #define XEN_MSR_IGNORE (1u << 0)
>     uint32_t flags;
>     uint64_t val;
> } xen_msr_entry_t;
> 
> Then use the idx and val fields to signal the start and size of the
> range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
> field?
> 
> xen_msr_entry_t = {
>     .idx = 0,
>     .val = 0xffffffff,
>     .flags = XEN_MSR_IGNORE,
> };
> 
> Would be equivalent to ignoring accesses to the whole MSR range.
> 
> This would allow selectively selecting which MSRs to ignore, while
> not wasting a MSR itself to convey the information?

Hmm, yes, the added flexibility would be nice from an abstract pov
(not sure how relevant it would be to Solaris'es issue). But my
dislike of using a flag which is meaningless in ordinary entries
remains, as was voiced against Boris'es original version.

Andrew - afaict you've been completely silent on this thread so
far. What's your view?

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 10:15               ` Jan Beulich
@ 2021-02-23 12:17                 ` Roger Pau Monné
  2021-02-23 13:23                   ` Jan Beulich
  0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-23 12:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: andrew.cooper3, Boris Ostrovsky, xen-devel, iwj, wl,
	anthony.perard, jun.nakajima, kevin.tian

On Tue, Feb 23, 2021 at 11:15:31AM +0100, Jan Beulich wrote:
> On 23.02.2021 10:34, Roger Pau Monné wrote:
> > On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
> >> On 22.02.2021 22:19, Boris Ostrovsky wrote:
> >>>
> >>> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
> >>>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
> >>>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
> >>>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
> >>>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
> >>>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
> >>>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
> >>>>>> It's kind of weird to use an MSR to store this. I assume this is done
> >>>>>> for migration reasons?
> >>>>>
> >>>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
> >>>> I agree that using the msr_policy seems like the most suitable place
> >>>> to convey this information between the toolstack and Xen. I wonder if
> >>>> it would be fine to have fields in msr_policy that don't directly
> >>>> translate into an MSR value?
> >>>
> >>>
> >>> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
> >>
> >> Which, just to clarify, was not the least because of the flags
> >> field being per-entry, i.e. per MSR, while here we want a global
> >> indicator.
> > 
> > We could exploit a bit the xen_msr_entry_t structure and use it
> > like:
> > 
> > typedef struct xen_msr_entry {
> >     uint32_t idx;
> > #define XEN_MSR_IGNORE (1u << 0)
> >     uint32_t flags;
> >     uint64_t val;
> > } xen_msr_entry_t;
> > 
> > Then use the idx and val fields to signal the start and size of the
> > range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
> > field?
> > 
> > xen_msr_entry_t = {
> >     .idx = 0,
> >     .val = 0xffffffff,
> >     .flags = XEN_MSR_IGNORE,
> > };
> > 
> > Would be equivalent to ignoring accesses to the whole MSR range.
> > 
> > This would allow selectively selecting which MSRs to ignore, while
> > not wasting a MSR itself to convey the information?
> 
> Hmm, yes, the added flexibility would be nice from an abstract pov
> (not sure how relevant it would be to Solaris'es issue). But my
> dislike of using a flag which is meaningless in ordinary entries
> remains, as was voiced against Boris'es original version.

I understand the flags field is meaningless for regular MSRs, but I
don't see why it would be an issue to start using it for this specific
case of registering ranges of ignored MSRs. It certainly seems better
than hijacking an MSR index (MSR_UNHANDLED).

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 12:17                 ` Roger Pau Monné
@ 2021-02-23 13:23                   ` Jan Beulich
  2021-02-23 15:39                     ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-23 13:23 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: andrew.cooper3, Boris Ostrovsky, xen-devel, iwj, wl,
	anthony.perard, jun.nakajima, kevin.tian

On 23.02.2021 13:17, Roger Pau Monné wrote:
> On Tue, Feb 23, 2021 at 11:15:31AM +0100, Jan Beulich wrote:
>> On 23.02.2021 10:34, Roger Pau Monné wrote:
>>> On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
>>>> On 22.02.2021 22:19, Boris Ostrovsky wrote:
>>>>>
>>>>> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
>>>>>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
>>>>>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
>>>>>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>>>>>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
>>>>>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
>>>>>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
>>>>>>>> It's kind of weird to use an MSR to store this. I assume this is done
>>>>>>>> for migration reasons?
>>>>>>>
>>>>>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
>>>>>> I agree that using the msr_policy seems like the most suitable place
>>>>>> to convey this information between the toolstack and Xen. I wonder if
>>>>>> it would be fine to have fields in msr_policy that don't directly
>>>>>> translate into an MSR value?
>>>>>
>>>>>
>>>>> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
>>>>
>>>> Which, just to clarify, was not the least because of the flags
>>>> field being per-entry, i.e. per MSR, while here we want a global
>>>> indicator.
>>>
>>> We could exploit a bit the xen_msr_entry_t structure and use it
>>> like:
>>>
>>> typedef struct xen_msr_entry {
>>>     uint32_t idx;
>>> #define XEN_MSR_IGNORE (1u << 0)
>>>     uint32_t flags;
>>>     uint64_t val;
>>> } xen_msr_entry_t;
>>>
>>> Then use the idx and val fields to signal the start and size of the
>>> range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
>>> field?
>>>
>>> xen_msr_entry_t = {
>>>     .idx = 0,
>>>     .val = 0xffffffff,
>>>     .flags = XEN_MSR_IGNORE,
>>> };
>>>
>>> Would be equivalent to ignoring accesses to the whole MSR range.
>>>
>>> This would allow selectively selecting which MSRs to ignore, while
>>> not wasting a MSR itself to convey the information?
>>
>> Hmm, yes, the added flexibility would be nice from an abstract pov
>> (not sure how relevant it would be to Solaris'es issue). But my
>> dislike of using a flag which is meaningless in ordinary entries
>> remains, as was voiced against Boris'es original version.
> 
> I understand the flags field is meaningless for regular MSRs, but I
> don't see why it would be an issue to start using it for this specific
> case of registering ranges of ignored MSRs.

It's not an "issue", it is - as said - my dislike. However, the way
it is in your proposal it is perhaps indeed not as bad as in Boris'es
original one: The flag now designates the purpose of the entry, and
the other two fields still have a meaning. Hence I was wrong to state
that it's "meaningless" - it now is required to be clear for
"ordinary" entries.

In principle there could then also be multiple such entries / ranges.

> It certainly seems better than hijacking an MSR index (MSR_UNHANDLED).

Not sure.

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 13:23                   ` Jan Beulich
@ 2021-02-23 15:39                     ` Boris Ostrovsky
  2021-02-23 16:10                       ` Jan Beulich
  2021-02-23 16:11                       ` Roger Pau Monné
  0 siblings, 2 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-23 15:39 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: andrew.cooper3, xen-devel, iwj, wl, anthony.perard, jun.nakajima,
	kevin.tian


On 2/23/21 8:23 AM, Jan Beulich wrote:
> On 23.02.2021 13:17, Roger Pau Monné wrote:
>> On Tue, Feb 23, 2021 at 11:15:31AM +0100, Jan Beulich wrote:
>>> On 23.02.2021 10:34, Roger Pau Monné wrote:
>>>> On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
>>>>> On 22.02.2021 22:19, Boris Ostrovsky wrote:
>>>>>> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
>>>>>>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
>>>>>>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
>>>>>>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>>>>>>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
>>>>>>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
>>>>>>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
>>>>>>>>> It's kind of weird to use an MSR to store this. I assume this is done
>>>>>>>>> for migration reasons?
>>>>>>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
>>>>>>> I agree that using the msr_policy seems like the most suitable place
>>>>>>> to convey this information between the toolstack and Xen. I wonder if
>>>>>>> it would be fine to have fields in msr_policy that don't directly
>>>>>>> translate into an MSR value?
>>>>>>
>>>>>> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
>>>>> Which, just to clarify, was not the least because of the flags
>>>>> field being per-entry, i.e. per MSR, while here we want a global
>>>>> indicator.
>>>> We could exploit a bit the xen_msr_entry_t structure and use it
>>>> like:
>>>>
>>>> typedef struct xen_msr_entry {
>>>>     uint32_t idx;
>>>> #define XEN_MSR_IGNORE (1u << 0)
>>>>     uint32_t flags;
>>>>     uint64_t val;
>>>> } xen_msr_entry_t;
>>>>
>>>> Then use the idx and val fields to signal the start and size of the
>>>> range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
>>>> field?
>>>>
>>>> xen_msr_entry_t = {
>>>>     .idx = 0,
>>>>     .val = 0xffffffff,
>>>>     .flags = XEN_MSR_IGNORE,
>>>> };
>>>>
>>>> Would be equivalent to ignoring accesses to the whole MSR range.
>>>>
>>>> This would allow selectively selecting which MSRs to ignore, while
>>>> not wasting a MSR itself to convey the information?
>>> Hmm, yes, the added flexibility would be nice from an abstract pov
>>> (not sure how relevant it would be to Solaris'es issue). But my
>>> dislike of using a flag which is meaningless in ordinary entries
>>> remains, as was voiced against Boris'es original version.
>> I understand the flags field is meaningless for regular MSRs, but I
>> don't see why it would be an issue to start using it for this specific
>> case of registering ranges of ignored MSRs.
> It's not an "issue", it is - as said - my dislike. However, the way
> it is in your proposal it is perhaps indeed not as bad as in Boris'es
> original one: The flag now designates the purpose of the entry, and
> the other two fields still have a meaning. Hence I was wrong to state
> that it's "meaningless" - it now is required to be clear for
> "ordinary" entries.
>
> In principle there could then also be multiple such entries / ranges.


TBH I am not sold on usefulness of multiple ranges but if both of you feel it can be handy I'll do that, using Roger's proposal above. (Would it make sense to make val a union with, say, size or should the context be clear from flag's value?)


Before I do that though --- what was the conclusion on verbosity control?

 

-boris


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 15:39                     ` Boris Ostrovsky
@ 2021-02-23 16:10                       ` Jan Beulich
  2021-02-23 18:00                         ` Roger Pau Monné
  2021-02-23 16:11                       ` Roger Pau Monné
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Beulich @ 2021-02-23 16:10 UTC (permalink / raw)
  To: Boris Ostrovsky, Roger Pau Monné
  Cc: andrew.cooper3, xen-devel, iwj, wl, anthony.perard, jun.nakajima,
	kevin.tian

On 23.02.2021 16:39, Boris Ostrovsky wrote:
> 
> On 2/23/21 8:23 AM, Jan Beulich wrote:
>> On 23.02.2021 13:17, Roger Pau Monné wrote:
>>> On Tue, Feb 23, 2021 at 11:15:31AM +0100, Jan Beulich wrote:
>>>> On 23.02.2021 10:34, Roger Pau Monné wrote:
>>>>> On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
>>>>>> On 22.02.2021 22:19, Boris Ostrovsky wrote:
>>>>>>> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
>>>>>>>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
>>>>>>>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
>>>>>>>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
>>>>>>>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
>>>>>>>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
>>>>>>>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
>>>>>>>>>> It's kind of weird to use an MSR to store this. I assume this is done
>>>>>>>>>> for migration reasons?
>>>>>>>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
>>>>>>>> I agree that using the msr_policy seems like the most suitable place
>>>>>>>> to convey this information between the toolstack and Xen. I wonder if
>>>>>>>> it would be fine to have fields in msr_policy that don't directly
>>>>>>>> translate into an MSR value?
>>>>>>>
>>>>>>> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
>>>>>> Which, just to clarify, was not the least because of the flags
>>>>>> field being per-entry, i.e. per MSR, while here we want a global
>>>>>> indicator.
>>>>> We could exploit a bit the xen_msr_entry_t structure and use it
>>>>> like:
>>>>>
>>>>> typedef struct xen_msr_entry {
>>>>>     uint32_t idx;
>>>>> #define XEN_MSR_IGNORE (1u << 0)
>>>>>     uint32_t flags;
>>>>>     uint64_t val;
>>>>> } xen_msr_entry_t;
>>>>>
>>>>> Then use the idx and val fields to signal the start and size of the
>>>>> range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
>>>>> field?
>>>>>
>>>>> xen_msr_entry_t = {
>>>>>     .idx = 0,
>>>>>     .val = 0xffffffff,
>>>>>     .flags = XEN_MSR_IGNORE,
>>>>> };
>>>>>
>>>>> Would be equivalent to ignoring accesses to the whole MSR range.
>>>>>
>>>>> This would allow selectively selecting which MSRs to ignore, while
>>>>> not wasting a MSR itself to convey the information?
>>>> Hmm, yes, the added flexibility would be nice from an abstract pov
>>>> (not sure how relevant it would be to Solaris'es issue). But my
>>>> dislike of using a flag which is meaningless in ordinary entries
>>>> remains, as was voiced against Boris'es original version.
>>> I understand the flags field is meaningless for regular MSRs, but I
>>> don't see why it would be an issue to start using it for this specific
>>> case of registering ranges of ignored MSRs.
>> It's not an "issue", it is - as said - my dislike. However, the way
>> it is in your proposal it is perhaps indeed not as bad as in Boris'es
>> original one: The flag now designates the purpose of the entry, and
>> the other two fields still have a meaning. Hence I was wrong to state
>> that it's "meaningless" - it now is required to be clear for
>> "ordinary" entries.
>>
>> In principle there could then also be multiple such entries / ranges.
> 
> 
> TBH I am not sold on usefulness of multiple ranges but if both of
> you feel it can be handy I'll do that, using Roger's proposal above.

As indicated I really think an opinion from Andrew would be helpful.

> (Would it make sense to make val a union with, say, size or should
> the context be clear from flag's value?)

I'd recommend against this, unless you were to also name the upper
half of the 64-bit field so you can check that part to be zero.
Plus unions are generally not nice to be introduced into public
headers for already existing types.

> Before I do that though --- what was the conclusion on verbosity
> control?

Not sure - afaict the conclusion was that we still don't really
agree. Roger?

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 15:39                     ` Boris Ostrovsky
  2021-02-23 16:10                       ` Jan Beulich
@ 2021-02-23 16:11                       ` Roger Pau Monné
  2021-02-23 16:40                         ` Boris Ostrovsky
  1 sibling, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-23 16:11 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Jan Beulich, andrew.cooper3, xen-devel, iwj, wl, anthony.perard,
	jun.nakajima, kevin.tian

On Tue, Feb 23, 2021 at 10:39:48AM -0500, Boris Ostrovsky wrote:
> 
> On 2/23/21 8:23 AM, Jan Beulich wrote:
> > On 23.02.2021 13:17, Roger Pau Monné wrote:
> >> On Tue, Feb 23, 2021 at 11:15:31AM +0100, Jan Beulich wrote:
> >>> On 23.02.2021 10:34, Roger Pau Monné wrote:
> >>>> On Tue, Feb 23, 2021 at 08:57:00AM +0100, Jan Beulich wrote:
> >>>>> On 22.02.2021 22:19, Boris Ostrovsky wrote:
> >>>>>> On 2/22/21 6:08 AM, Roger Pau Monné wrote:
> >>>>>>> On Fri, Feb 19, 2021 at 09:56:32AM -0500, Boris Ostrovsky wrote:
> >>>>>>>> On 2/18/21 5:51 AM, Roger Pau Monné wrote:
> >>>>>>>>> On Wed, Jan 20, 2021 at 05:49:10PM -0500, Boris Ostrovsky wrote:
> >>>>>>>>>> When toolstack updates MSR policy, this MSR offset (which is the last
> >>>>>>>>>> index in the hypervisor MSR range) is used to indicate hypervisor
> >>>>>>>>>> behavior when guest accesses an MSR which is not explicitly emulated.
> >>>>>>>>> It's kind of weird to use an MSR to store this. I assume this is done
> >>>>>>>>> for migration reasons?
> >>>>>>>> Not really. It just seemed to me that MSR policy is the logical place to do that. Because it *is* a policy of how to deal with such accesses, isn't it?
> >>>>>>> I agree that using the msr_policy seems like the most suitable place
> >>>>>>> to convey this information between the toolstack and Xen. I wonder if
> >>>>>>> it would be fine to have fields in msr_policy that don't directly
> >>>>>>> translate into an MSR value?
> >>>>>>
> >>>>>> We have xen_msr_entry_t.flags that we can use when passing policy array back and forth. Then we can ignore xen_msr_entry_t.idx for that entry (although in earlier version of this series Jan preferred to use idx and leave flags alone).
> >>>>> Which, just to clarify, was not the least because of the flags
> >>>>> field being per-entry, i.e. per MSR, while here we want a global
> >>>>> indicator.
> >>>> We could exploit a bit the xen_msr_entry_t structure and use it
> >>>> like:
> >>>>
> >>>> typedef struct xen_msr_entry {
> >>>>     uint32_t idx;
> >>>> #define XEN_MSR_IGNORE (1u << 0)
> >>>>     uint32_t flags;
> >>>>     uint64_t val;
> >>>> } xen_msr_entry_t;
> >>>>
> >>>> Then use the idx and val fields to signal the start and size of the
> >>>> range to ignore accesses to when XEN_MSR_IGNORE is set in the flags
> >>>> field?
> >>>>
> >>>> xen_msr_entry_t = {
> >>>>     .idx = 0,
> >>>>     .val = 0xffffffff,
> >>>>     .flags = XEN_MSR_IGNORE,
> >>>> };
> >>>>
> >>>> Would be equivalent to ignoring accesses to the whole MSR range.
> >>>>
> >>>> This would allow selectively selecting which MSRs to ignore, while
> >>>> not wasting a MSR itself to convey the information?
> >>> Hmm, yes, the added flexibility would be nice from an abstract pov
> >>> (not sure how relevant it would be to Solaris'es issue). But my
> >>> dislike of using a flag which is meaningless in ordinary entries
> >>> remains, as was voiced against Boris'es original version.
> >> I understand the flags field is meaningless for regular MSRs, but I
> >> don't see why it would be an issue to start using it for this specific
> >> case of registering ranges of ignored MSRs.
> > It's not an "issue", it is - as said - my dislike. However, the way
> > it is in your proposal it is perhaps indeed not as bad as in Boris'es
> > original one: The flag now designates the purpose of the entry, and
> > the other two fields still have a meaning. Hence I was wrong to state
> > that it's "meaningless" - it now is required to be clear for
> > "ordinary" entries.
> >
> > In principle there could then also be multiple such entries / ranges.
> 
> 
> TBH I am not sold on usefulness of multiple ranges but if both of you feel it can be handy I'll do that, using Roger's proposal above. (Would it make sense to make val a union with, say, size or should the context be clear from flag's value?)

Doing an union with { val, size } would be fine for me, I'm also fine
with not doing it though.

> 
> Before I do that though --- what was the conclusion on verbosity control?

Ideally I would like to find a way to have a more generic interface to
change verbosity level on a per-guest basis, but I haven't looked at
all about how to do that, neither I think it would be acceptable to
put that burden on you.

Maybe we could introduce another flag to set whether ignored MSRs
should be logged, as that would be easier to implement?

I think in that case we could enforce that either all ranges have the
flag set or not, in order to prevent ending up tracking two different
ranges of ignored MSRs, one that triggers a message and another one
that doesn't.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 16:11                       ` Roger Pau Monné
@ 2021-02-23 16:40                         ` Boris Ostrovsky
  2021-02-23 18:02                           ` Roger Pau Monné
  0 siblings, 1 reply; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-23 16:40 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Jan Beulich, andrew.cooper3, xen-devel, iwj, wl, anthony.perard,
	jun.nakajima, kevin.tian


On 2/23/21 11:11 AM, Roger Pau Monné wrote:
> On Tue, Feb 23, 2021 at 10:39:48AM -0500, Boris Ostrovsky wrote:
>
>> Before I do that though --- what was the conclusion on verbosity control?
> Ideally I would like to find a way to have a more generic interface to
> change verbosity level on a per-guest basis, but I haven't looked at
> all about how to do that, neither I think it would be acceptable to
> put that burden on you.
>
> Maybe we could introduce another flag to set whether ignored MSRs
> should be logged, as that would be easier to implement?


Probably.


    msr_ignore=["verbose=<bool>", "<range>", "range>", ...]


-boris


>
> I think in that case we could enforce that either all ranges have the
> flag set or not, in order to prevent ending up tracking two different
> ranges of ignored MSRs, one that triggers a message and another one
> that doesn't.
>
> Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 16:10                       ` Jan Beulich
@ 2021-02-23 18:00                         ` Roger Pau Monné
  0 siblings, 0 replies; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-23 18:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Boris Ostrovsky, andrew.cooper3, xen-devel, iwj, wl,
	anthony.perard, jun.nakajima, kevin.tian

On Tue, Feb 23, 2021 at 05:10:16PM +0100, Jan Beulich wrote:
> On 23.02.2021 16:39, Boris Ostrovsky wrote:
> > Before I do that though --- what was the conclusion on verbosity
> > control?
> 
> Not sure - afaict the conclusion was that we still don't really
> agree. Roger?

As I said on my reply to Boris, I would really like to have a more
generic way of doing this kind of per-domain verbosity level, but I
don't think it would be fair to ask Boris to implement this. Xen
likely needs a way to print issues with MSRs accesses in 4.15 on a
per-domain basis, so going for a MSR specific solution would be
acceptable given the time frame. I think Boris made a proposal for a
solution on another email, let me go look at that.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 16:40                         ` Boris Ostrovsky
@ 2021-02-23 18:02                           ` Roger Pau Monné
  2021-02-23 18:45                             ` Boris Ostrovsky
  0 siblings, 1 reply; 53+ messages in thread
From: Roger Pau Monné @ 2021-02-23 18:02 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Jan Beulich, andrew.cooper3, xen-devel, iwj, wl, anthony.perard,
	jun.nakajima, kevin.tian

On Tue, Feb 23, 2021 at 11:40:07AM -0500, Boris Ostrovsky wrote:
> 
> On 2/23/21 11:11 AM, Roger Pau Monné wrote:
> > On Tue, Feb 23, 2021 at 10:39:48AM -0500, Boris Ostrovsky wrote:
> >
> >> Before I do that though --- what was the conclusion on verbosity control?
> > Ideally I would like to find a way to have a more generic interface to
> > change verbosity level on a per-guest basis, but I haven't looked at
> > all about how to do that, neither I think it would be acceptable to
> > put that burden on you.
> >
> > Maybe we could introduce another flag to set whether ignored MSRs
> > should be logged, as that would be easier to implement?
> 
> 
> Probably.
> 
> 
>     msr_ignore=["verbose=<bool>", "<range>", "range>", ...]

I think just adding a new option will be easier to parse in xl rather
than placing it in the list of MSR ranges:

msr_ignore=["<range>", "range>", ...]
msr_ignore_verbose=<boolean>

Seems easier to implement IMO.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
  2021-02-23 18:02                           ` Roger Pau Monné
@ 2021-02-23 18:45                             ` Boris Ostrovsky
  0 siblings, 0 replies; 53+ messages in thread
From: Boris Ostrovsky @ 2021-02-23 18:45 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Jan Beulich, andrew.cooper3, xen-devel, iwj, wl, anthony.perard,
	jun.nakajima, kevin.tian


On 2/23/21 1:02 PM, Roger Pau Monné wrote:
> On Tue, Feb 23, 2021 at 11:40:07AM -0500, Boris Ostrovsky wrote:
>> On 2/23/21 11:11 AM, Roger Pau Monné wrote:
>>> On Tue, Feb 23, 2021 at 10:39:48AM -0500, Boris Ostrovsky wrote:
>>>
>>>> Before I do that though --- what was the conclusion on verbosity control?
>>> Ideally I would like to find a way to have a more generic interface to
>>> change verbosity level on a per-guest basis, but I haven't looked at
>>> all about how to do that, neither I think it would be acceptable to
>>> put that burden on you.
>>>
>>> Maybe we could introduce another flag to set whether ignored MSRs
>>> should be logged, as that would be easier to implement?
>>
>> Probably.
>>
>>
>>     msr_ignore=["verbose=<bool>", "<range>", "range>", ...]
> I think just adding a new option will be easier to parse in xl rather
> than placing it in the list of MSR ranges:
>
> msr_ignore=["<range>", "range>", ...]
> msr_ignore_verbose=<boolean>
>
> Seems easier to implement IMO.


I haven't looked at what parsing support is available in xl. If I don't find anything useful (and I give up quickly ;-) ) I can use separate options.



-boris



^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2021-02-23 18:46 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20 22:49 [PATCH v2 0/4] Permit fault-less access to non-emulated MSRs Boris Ostrovsky
2021-01-20 22:49 ` [PATCH v2 1/4] xl: Add support for ignore_msrs option Boris Ostrovsky
2021-01-21 14:56   ` Wei Liu
2021-01-21 22:43     ` Boris Ostrovsky
2021-01-22  9:52   ` Julien Grall
2021-01-22 18:28     ` Boris Ostrovsky
2021-01-22 18:33       ` Julien Grall
2021-01-22 18:39         ` Boris Ostrovsky
2021-01-22 20:42           ` Julien Grall
2021-02-18 10:42   ` Roger Pau Monné
2021-02-18 11:54     ` Jan Beulich
2021-02-18 15:52       ` Roger Pau Monné
2021-02-18 15:57         ` Jan Beulich
2021-02-19 14:50           ` Boris Ostrovsky
2021-02-22 10:24             ` Roger Pau Monné
2021-02-22 10:33               ` Jan Beulich
2021-01-20 22:49 ` [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED Boris Ostrovsky
2021-01-22 11:51   ` Jan Beulich
2021-01-22 18:56     ` Boris Ostrovsky
2021-02-02 17:01     ` Boris Ostrovsky
2021-02-18 10:51   ` Roger Pau Monné
2021-02-19 14:56     ` Boris Ostrovsky
2021-02-22 11:08       ` Roger Pau Monné
2021-02-22 21:19         ` Boris Ostrovsky
2021-02-23  7:57           ` Jan Beulich
2021-02-23  9:34             ` Roger Pau Monné
2021-02-23 10:15               ` Jan Beulich
2021-02-23 12:17                 ` Roger Pau Monné
2021-02-23 13:23                   ` Jan Beulich
2021-02-23 15:39                     ` Boris Ostrovsky
2021-02-23 16:10                       ` Jan Beulich
2021-02-23 18:00                         ` Roger Pau Monné
2021-02-23 16:11                       ` Roger Pau Monné
2021-02-23 16:40                         ` Boris Ostrovsky
2021-02-23 18:02                           ` Roger Pau Monné
2021-02-23 18:45                             ` Boris Ostrovsky
2021-01-20 22:49 ` [PATCH v2 3/4] x86: Allow non-faulting accesses to non-emulated MSRs if policy permits this Boris Ostrovsky
2021-01-22 12:51   ` Jan Beulich
2021-01-22 19:52     ` Boris Ostrovsky
2021-01-25 10:22       ` Jan Beulich
2021-01-25 18:42         ` Boris Ostrovsky
2021-01-26  9:05           ` Jan Beulich
2021-01-26 16:02             ` Boris Ostrovsky
2021-01-26 16:35               ` Jan Beulich
2021-02-18 11:24   ` Roger Pau Monné
2021-02-18 11:57     ` Jan Beulich
2021-02-18 15:53       ` Roger Pau Monné
2021-01-20 22:49 ` [PATCH v2 4/4] tools/libs: Apply MSR policy to a guest Boris Ostrovsky
2021-01-21 14:58   ` Wei Liu
2021-01-22  9:56   ` Julien Grall
2021-01-22 18:35     ` Boris Ostrovsky
2021-02-18 11:48   ` Roger Pau Monné
2021-02-19 14:57     ` Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.