* [PATCH v2 0/8] x86: Protection Key Supervisor support
@ 2023-01-10 17:18 Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot Andrew Cooper
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

Only 14 months after v1...  This time with testing on real hardware (not that
any bugs were found) and against an updated XTF comprehensive pagewalk test.

  # time ./xtf-runner hvm64 pagetable-emulation~hap
  --- Xen Test Framework ---
  Environment: HVM 64bit (Long mode 4 levels)
  Test pagetable-emulation
    Info: Intel, Fam 6, Model 143, Stepping 3, paddr 46, vaddr 48
    Features: PSE PAE PGE PAT PSE36 PCID NX PAGE1G SMEP SMAP PKU PKS
    Paging mode heuristic: Hap
    Using physical addresses 0000000040000000 and 0000200000000000
  Test L1e
  Test L2e
  Test L2e Superpage
  Test L3e
  Test L3e Superpage
  Test L4e
  Test L4e Superpage
  Completed 28835840 tests
  Test result: SUCCESS

  Combined test results:
  test-hvm64-pagetable-emulation~hap       SUCCESS

  real                                     1m26.315s
  user                                     0m0.039s
  sys                                      0m0.086s

This test is going to get even more silly when CET-SS becomes supported.

Andrew Cooper (8):
  x86/boot: Sanitise PKRU on boot
  x86/prot-key: Enumeration for Protection Key Supervisor
  x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  x86: Initial support for WRMSRNS
  x86/hvm: Context switch MSR_PKRS
  x86/hvm: Enable guest access to MSR_PKRS
  x86/pagewalk: Support PKS
  x86/hvm: Support PKS for HAP guests

 tools/libs/light/libxl_cpuid.c              |  2 +
 tools/misc/xen-cpuid.c                      |  3 +-
 xen/arch/x86/cpu/common.c                   |  6 ++
 xen/arch/x86/cpuid.c                        |  9 +++
 xen/arch/x86/hvm/hvm.c                      |  5 +-
 xen/arch/x86/hvm/vmx/vmx.c                  | 26 +++++++++
 xen/arch/x86/include/asm/cpufeature.h       |  2 +
 xen/arch/x86/include/asm/guest_pt.h         |  5 ++
 xen/arch/x86/include/asm/hvm/hvm.h          |  3 +
 xen/arch/x86/include/asm/msr-index.h        |  2 +
 xen/arch/x86/include/asm/msr.h              | 21 +++++++
 xen/arch/x86/include/asm/processor.h        | 38 -------------
 xen/arch/x86/include/asm/prot-key.h         | 85 +++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/x86-defns.h        |  1 +
 xen/arch/x86/mm/guest_walk.c                | 16 ++++--
 xen/arch/x86/msr.c                          | 10 ++++
 xen/arch/x86/setup.c                        |  6 +-
 xen/arch/x86/smpboot.c                      |  4 ++
 xen/arch/x86/x86_emulate.c                  |  2 +
 xen/include/public/arch-x86/cpufeatureset.h |  2 +
 20 files changed, 201 insertions(+), 47 deletions(-)
 create mode 100644 xen/arch/x86/include/asm/prot-key.h

-- 
2.11.0




* [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-12 12:47   ` Jan Beulich
  2023-01-10 17:18 ` [PATCH v2 2/8] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

While the reset value of the register is 0, it might not be 0 after kexec etc.
If PKEY0.{WD,AD} have leaked in from an earlier context, construction of a PV
dom0 will explode.

Sequencing-wise, this must come after setting CR4.PKE, and before we touch any
user mappings.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

For sequencing, it could also come after setting XCR0.PKRU too, but then we'd
need to construct an empty XSAVE area to XRSTOR from, and that would be even
more horrible to arrange.
---
 xen/arch/x86/cpu/common.c             | 3 +++
 xen/arch/x86/include/asm/cpufeature.h | 1 +
 xen/arch/x86/setup.c                  | 2 +-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 0412dbc915e5..fe92f29c2dc6 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -936,6 +936,9 @@ void cpu_init(void)
 	write_debugreg(6, X86_DR6_DEFAULT);
 	write_debugreg(7, X86_DR7_DEFAULT);
 
+	if (cpu_has_pku)
+		wrpkru(0);
+
 	/*
 	 * If the platform is performing a Secure Launch via SKINIT, GIF is
 	 * clear to prevent external interrupts interfering with Secure
diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h
index a3ad9ebee4e9..044cfd9f882d 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -109,6 +109,7 @@
 
 /* CPUID level 0x00000007:0.ecx */
 #define cpu_has_avx512_vbmi     boot_cpu_has(X86_FEATURE_AVX512_VBMI)
+#define cpu_has_pku             boot_cpu_has(X86_FEATURE_PKU)
 #define cpu_has_avx512_vbmi2    boot_cpu_has(X86_FEATURE_AVX512_VBMI2)
 #define cpu_has_gfni            boot_cpu_has(X86_FEATURE_GFNI)
 #define cpu_has_vaes            boot_cpu_has(X86_FEATURE_VAES)
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 566422600d94..6deadcf74763 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1798,7 +1798,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( boot_cpu_has(X86_FEATURE_FSGSBASE) )
         set_in_cr4(X86_CR4_FSGSBASE);
 
-    if ( boot_cpu_has(X86_FEATURE_PKU) )
+    if ( cpu_has_pku )
         set_in_cr4(X86_CR4_PKE);
 
     if ( opt_invpcid && cpu_has_invpcid )
-- 
2.11.0




* [PATCH v2 2/8] x86/prot-key: Enumeration for Protection Key Supervisor
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 3/8] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

Protection Key Supervisor works in a very similar way to Protection Key User,
except that instead of a PKRU register accessed by the {RD,WR}PKRU
instructions, the supervisor protection settings live in MSR_PKRS and are
accessed using normal {RD,WR}MSR instructions.

PKS has the same problematic interactions with PV guests as PKU (even more so,
in fact, given the guest kernel's CPL), so we'll only support this for HVM
guests for now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 tools/libs/light/libxl_cpuid.c              | 1 +
 tools/misc/xen-cpuid.c                      | 2 +-
 xen/arch/x86/include/asm/cpufeature.h       | 1 +
 xen/arch/x86/include/asm/msr-index.h        | 2 ++
 xen/arch/x86/include/asm/x86-defns.h        | 1 +
 xen/include/public/arch-x86/cpufeatureset.h | 1 +
 6 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index 2aa23225f42c..cbd4e511e8ab 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -211,6 +211,7 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str)
         {"avx512-vpopcntdq",0x00000007,0,CPUID_REG_ECX, 14,  1},
         {"rdpid",        0x00000007,  0, CPUID_REG_ECX, 22,  1},
         {"cldemote",     0x00000007,  0, CPUID_REG_ECX, 25,  1},
+        {"pks",          0x00000007,  0, CPUID_REG_ECX, 31,  1},
 
         {"avx512-4vnniw",0x00000007,  0, CPUID_REG_EDX,  2,  1},
         {"avx512-4fmaps",0x00000007,  0, CPUID_REG_EDX,  3,  1},
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index d5833e9ce879..ea7ff320e0e4 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -134,7 +134,7 @@ static const char *const str_7c0[32] =
     /* 24 */                   [25] = "cldemote",
     /* 26 */                   [27] = "movdiri",
     [28] = "movdir64b",        [29] = "enqcmd",
-    [30] = "sgx-lc",
+    [30] = "sgx-lc",           [31] = "pks",
 };
 
 static const char *const str_e7d[32] =
diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h
index 044cfd9f882d..0a301013c3d9 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -121,6 +121,7 @@
 #define cpu_has_movdiri         boot_cpu_has(X86_FEATURE_MOVDIRI)
 #define cpu_has_movdir64b       boot_cpu_has(X86_FEATURE_MOVDIR64B)
 #define cpu_has_enqcmd          boot_cpu_has(X86_FEATURE_ENQCMD)
+#define cpu_has_pks             boot_cpu_has(X86_FEATURE_PKS)
 
 /* CPUID level 0x80000007.edx */
 #define cpu_has_hw_pstate       boot_cpu_has(X86_FEATURE_HW_PSTATE)
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 0a8852f3c246..7615d8087f46 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -148,6 +148,8 @@
 #define MSR_PL3_SSP                         0x000006a7
 #define MSR_INTERRUPT_SSP_TABLE             0x000006a8
 
+#define MSR_PKRS                            0x000006e1
+
 #define MSR_X2APIC_FIRST                    0x00000800
 #define MSR_X2APIC_LAST                     0x000008ff
 
diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h
index 42b5f382d438..fe1caba6f819 100644
--- a/xen/arch/x86/include/asm/x86-defns.h
+++ b/xen/arch/x86/include/asm/x86-defns.h
@@ -74,6 +74,7 @@
 #define X86_CR4_SMAP       0x00200000 /* enable SMAP */
 #define X86_CR4_PKE        0x00400000 /* enable PKE */
 #define X86_CR4_CET        0x00800000 /* Control-flow Enforcement Technology */
+#define X86_CR4_PKS        0x01000000 /* Protection Key Supervisor */
 
 /*
  * XSTATE component flags in XCR0
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 7915f5826f57..ad7e89dd4c40 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -227,6 +227,7 @@ XEN_CPUFEATURE(CLDEMOTE,      6*32+25) /*A  CLDEMOTE instruction */
 XEN_CPUFEATURE(MOVDIRI,       6*32+27) /*a  MOVDIRI instruction */
 XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /*a  MOVDIR64B instruction */
 XEN_CPUFEATURE(ENQCMD,        6*32+29) /*   ENQCMD{,S} instructions */
+XEN_CPUFEATURE(PKS,           6*32+31) /*   Protection Key for Supervisor */
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(HW_PSTATE,     7*32+ 7) /*   Hardware Pstates */
-- 
2.11.0




* [PATCH v2 3/8] x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 2/8] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 4/8] x86: Initial support for WRMSRNS Andrew Cooper
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

asm/processor.h is in desperate need of splitting up, and protection key
functionality is only used in the emulator and pagewalk.  Introduce a new
asm/prot-key.h and move the relevant content over.

Rename the PKRU_* constants to drop the user part and to use the architectural
terminology.

Drop the read_pkru_{ad,wd}() helpers entirely.  The pkru infix is about to
become wrong, and the sole user is shorter and easier to follow without the
helpers.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

v2:
 * Mask pk_ar
---
 xen/arch/x86/cpu/common.c            |  1 +
 xen/arch/x86/include/asm/processor.h | 38 ------------------------------------
 xen/arch/x86/include/asm/prot-key.h  | 31 +++++++++++++++++++++++++++++
 xen/arch/x86/mm/guest_walk.c         |  9 ++++++---
 xen/arch/x86/x86_emulate.c           |  2 ++
 5 files changed, 40 insertions(+), 41 deletions(-)
 create mode 100644 xen/arch/x86/include/asm/prot-key.h

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index fe92f29c2dc6..2bcdd08b2fb5 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -11,6 +11,7 @@
 #include <asm/io.h>
 #include <asm/mpspec.h>
 #include <asm/apic.h>
+#include <asm/prot-key.h>
 #include <asm/random.h>
 #include <asm/setup.h>
 #include <asm/shstk.h>
diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 60b902060914..b95d2483212a 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -374,44 +374,6 @@ static always_inline void set_in_cr4 (unsigned long mask)
     write_cr4(read_cr4() | mask);
 }
 
-static inline unsigned int rdpkru(void)
-{
-    unsigned int pkru;
-
-    asm volatile (".byte 0x0f,0x01,0xee"
-        : "=a" (pkru) : "c" (0) : "dx");
-
-    return pkru;
-}
-
-static inline void wrpkru(unsigned int pkru)
-{
-    asm volatile ( ".byte 0x0f, 0x01, 0xef"
-                   :: "a" (pkru), "d" (0), "c" (0) );
-}
-
-/* Macros for PKRU domain */
-#define PKRU_READ  (0)
-#define PKRU_WRITE (1)
-#define PKRU_ATTRS (2)
-
-/*
- * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
- * domain in pkru, pkeys is index to a defined domain, so the value of
- * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
- */
-static inline bool_t read_pkru_ad(uint32_t pkru, unsigned int pkey)
-{
-    ASSERT(pkey < 16);
-    return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
-}
-
-static inline bool_t read_pkru_wd(uint32_t pkru, unsigned int pkey)
-{
-    ASSERT(pkey < 16);
-    return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
-}
-
 static always_inline void __monitor(const void *eax, unsigned long ecx,
                                     unsigned long edx)
 {
diff --git a/xen/arch/x86/include/asm/prot-key.h b/xen/arch/x86/include/asm/prot-key.h
new file mode 100644
index 000000000000..63a2e22f3fa0
--- /dev/null
+++ b/xen/arch/x86/include/asm/prot-key.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021-2022 Citrix Systems Ltd.
+ */
+#ifndef ASM_PROT_KEY_H
+#define ASM_PROT_KEY_H
+
+#include <xen/types.h>
+
+#define PKEY_AD 1 /* Access Disable */
+#define PKEY_WD 2 /* Write Disable */
+
+#define PKEY_WIDTH 2 /* Two bits per protection key */
+
+static inline uint32_t rdpkru(void)
+{
+    uint32_t pkru;
+
+    asm volatile ( ".byte 0x0f,0x01,0xee"
+                   : "=a" (pkru) : "c" (0) : "dx" );
+
+    return pkru;
+}
+
+static inline void wrpkru(uint32_t pkru)
+{
+    asm volatile ( ".byte 0x0f,0x01,0xef"
+                   :: "a" (pkru), "d" (0), "c" (0) );
+}
+
+#endif /* ASM_PROT_KEY_H */
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 70dacc477f9a..161a61b8f5ca 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -26,7 +26,9 @@
 #include <xen/paging.h>
 #include <xen/domain_page.h>
 #include <xen/sched.h>
+
 #include <asm/page.h>
+#include <asm/prot-key.h>
 #include <asm/guest_pt.h>
 #include <asm/hvm/emulate.h>
 
@@ -413,10 +415,11 @@ guest_walk_tables(const struct vcpu *v, struct p2m_domain *p2m,
          guest_pku_enabled(v) )
     {
         unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
-        unsigned int pkru = rdpkru();
+        unsigned int pkr = rdpkru();
+        unsigned int pk_ar = (pkr >> (pkey * PKEY_WIDTH)) & (PKEY_AD | PKEY_WD);
 
-        if ( read_pkru_ad(pkru, pkey) ||
-             ((walk & PFEC_write_access) && read_pkru_wd(pkru, pkey) &&
+        if ( (pk_ar & PKEY_AD) ||
+             ((walk & PFEC_write_access) && (pk_ar & PKEY_WD) &&
               ((walk & PFEC_user_mode) || guest_wp_enabled(v))) )
         {
             gw->pfec |= PFEC_prot_key;
diff --git a/xen/arch/x86/x86_emulate.c b/xen/arch/x86/x86_emulate.c
index 720740f29b84..8c7d18521807 100644
--- a/xen/arch/x86/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate.c
@@ -12,8 +12,10 @@
 #include <xen/domain_page.h>
 #include <xen/err.h>
 #include <xen/event.h>
+
 #include <asm/x86_emulate.h>
 #include <asm/processor.h> /* current_cpu_info */
+#include <asm/prot-key.h>
 #include <asm/xstate.h>
 #include <asm/amd.h> /* cpu_has_amd_erratum() */
 #include <asm/debugreg.h>
-- 
2.11.0




* [PATCH v2 4/8] x86: Initial support for WRMSRNS
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
                   ` (2 preceding siblings ...)
  2023-01-10 17:18 ` [PATCH v2 3/8] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-12 12:58   ` Jan Beulich
  2023-01-10 17:18 ` [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS Andrew Cooper
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

WRMSR Non-Serialising is an optimisation intended for cases where an MSR needs
updating, but architectural serialising properties are not needed.

It is anticipated that this will apply to most if not all MSRs modified on
context switch paths.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

v2:
 * New
---
 tools/libs/light/libxl_cpuid.c              |  1 +
 tools/misc/xen-cpuid.c                      |  1 +
 xen/arch/x86/include/asm/msr.h              | 12 ++++++++++++
 xen/include/public/arch-x86/cpufeatureset.h |  1 +
 4 files changed, 15 insertions(+)

diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index cbd4e511e8ab..8da78773a886 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -235,6 +235,7 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str)
         {"fzrm",         0x00000007,  1, CPUID_REG_EAX, 10,  1},
         {"fsrs",         0x00000007,  1, CPUID_REG_EAX, 11,  1},
         {"fsrcs",        0x00000007,  1, CPUID_REG_EAX, 12,  1},
+        {"wrmsrns",      0x00000007,  1, CPUID_REG_EAX, 19,  1},
 
         {"intel-psfd",   0x00000007,  2, CPUID_REG_EDX,  0,  1},
         {"mcdt-no",      0x00000007,  2, CPUID_REG_EDX,  5,  1},
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index ea7ff320e0e4..f482c4e28f30 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -189,6 +189,7 @@ static const char *const str_7a1[32] =
 
     [10] = "fzrm",          [11] = "fsrs",
     [12] = "fsrcs",
+    /* 18 */                [19] = "wrmsrns",
 };
 
 static const char *const str_e21a[32] =
diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index dd1eee04a637..191e54068856 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -38,6 +38,18 @@ static inline void wrmsrl(unsigned int msr, __u64 val)
         wrmsr(msr, lo, hi);
 }
 
+/* Non-serialising WRMSR, when available.  Falls back to a serialising WRMSR. */
+static inline void wrmsr_ns(uint32_t msr, uint32_t lo, uint32_t hi)
+{
+    /*
+     * WRMSR is 2 bytes.  WRMSRNS is 3 bytes.  Pad WRMSR with a redundant CS
+     * prefix to avoid a trailing NOP.
+     */
+    alternative_input(".byte 0x2e; wrmsr",
+                      ".byte 0x0f,0x01,0xc6", X86_FEATURE_WRMSRNS,
+                      "c" (msr), "a" (lo), "d" (hi));
+}
+
 /* rdmsr with exception handling */
 #define rdmsr_safe(msr,val) ({\
     int rc_; \
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index ad7e89dd4c40..5444bc5d8374 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -281,6 +281,7 @@ XEN_CPUFEATURE(AVX512_BF16,  10*32+ 5) /*A  AVX512 BFloat16 Instructions */
 XEN_CPUFEATURE(FZRM,         10*32+10) /*A  Fast Zero-length REP MOVSB */
 XEN_CPUFEATURE(FSRS,         10*32+11) /*A  Fast Short REP STOSB */
 XEN_CPUFEATURE(FSRCS,        10*32+12) /*A  Fast Short REP CMPSB/SCASB */
+XEN_CPUFEATURE(WRMSRNS,      10*32+19) /*   WRMSR Non-Serialising */
 
 /* AMD-defined CPU features, CPUID level 0x80000021.eax, word 11 */
 XEN_CPUFEATURE(LFENCE_DISPATCH,    11*32+ 2) /*A  LFENCE always serializing */
-- 
2.11.0




* [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
                   ` (3 preceding siblings ...)
  2023-01-10 17:18 ` [PATCH v2 4/8] x86: Initial support for WRMSRNS Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-12 13:10   ` Jan Beulich
  2023-01-10 17:18 ` [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

Under PKS, MSR_PKRS is available based on the CPUID policy alone, and is
usable independently of CR4.PKS.  See the large comment in prot-key.h for
details of the context switching arrangement.

Use WRMSRNS right away, as we don't care about serialising properties when
context switching this MSR.

Sanitise MSR_PKRS on boot.  In anticipation of wanting to use PKS for Xen in
the future, arrange for the sanitisation to occur prior to potentially setting
CR4.PKS; if PKEY0.{AD,WD} leak in from a previous context, we will triple
fault immediately on setting CR4.PKS.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Kevin Tian <kevin.tian@intel.com>

v2:
 * Use WRMSRNS
 * Sanitise MSR_PKS on boot.
---
 xen/arch/x86/cpu/common.c           |  2 ++
 xen/arch/x86/hvm/vmx/vmx.c          |  9 +++++++
 xen/arch/x86/include/asm/msr.h      |  9 +++++++
 xen/arch/x86/include/asm/prot-key.h | 54 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/setup.c                |  4 +++
 xen/arch/x86/smpboot.c              |  4 +++
 6 files changed, 82 insertions(+)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 2bcdd08b2fb5..f44c907e8a43 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -58,6 +58,8 @@ static unsigned int forced_caps[NCAPINTS];
 
 DEFINE_PER_CPU(bool, full_gdt_loaded);
 
+DEFINE_PER_CPU(uint32_t, pkrs);
+
 void __init setup_clear_cpu_cap(unsigned int cap)
 {
 	const uint32_t *dfs;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 43a4865d1c76..b1f493f009fd 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -58,6 +58,7 @@
 #include <asm/event.h>
 #include <asm/mce.h>
 #include <asm/monitor.h>
+#include <asm/prot-key.h>
 #include <public/arch-x86/cpuid.h>
 
 static bool_t __initdata opt_force_ept;
@@ -536,6 +537,7 @@ static void vmx_restore_host_msrs(void)
 
 static void vmx_save_guest_msrs(struct vcpu *v)
 {
+    const struct cpuid_policy *cp = v->domain->arch.cpuid;
     struct vcpu_msrs *msrs = v->arch.msrs;
 
     /*
@@ -549,10 +551,14 @@ static void vmx_save_guest_msrs(struct vcpu *v)
         rdmsrl(MSR_RTIT_OUTPUT_MASK, msrs->rtit.output_mask);
         rdmsrl(MSR_RTIT_STATUS, msrs->rtit.status);
     }
+
+    if ( cp->feat.pks )
+        msrs->pkrs = rdpkrs_and_cache();
 }
 
 static void vmx_restore_guest_msrs(struct vcpu *v)
 {
+    const struct cpuid_policy *cp = v->domain->arch.cpuid;
     const struct vcpu_msrs *msrs = v->arch.msrs;
 
     write_gs_shadow(v->arch.hvm.vmx.shadow_gs);
@@ -569,6 +575,9 @@ static void vmx_restore_guest_msrs(struct vcpu *v)
         wrmsrl(MSR_RTIT_OUTPUT_MASK, msrs->rtit.output_mask);
         wrmsrl(MSR_RTIT_STATUS, msrs->rtit.status);
     }
+
+    if ( cp->feat.pks )
+        wrpkrs(msrs->pkrs);
 }
 
 void vmx_update_cpu_exec_control(struct vcpu *v)
diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index 191e54068856..7946b6b24c11 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -373,6 +373,15 @@ struct vcpu_msrs
         };
     } rtit;
 
+    /*
+     * 0x000006e1 - MSR_PKRS - Protection Key Supervisor.
+     *
+     * Exposed R/W to guests.  Xen doesn't use PKS yet, so only context
+     * switched per vcpu.  When in current context, live value is in hardware,
+     * and this value is stale.
+     */
+    uint32_t pkrs;
+
     /* 0x00000da0 - MSR_IA32_XSS */
     struct {
         uint64_t raw;
diff --git a/xen/arch/x86/include/asm/prot-key.h b/xen/arch/x86/include/asm/prot-key.h
index 63a2e22f3fa0..0dcd31b7ea68 100644
--- a/xen/arch/x86/include/asm/prot-key.h
+++ b/xen/arch/x86/include/asm/prot-key.h
@@ -5,8 +5,11 @@
 #ifndef ASM_PROT_KEY_H
 #define ASM_PROT_KEY_H
 
+#include <xen/percpu.h>
 #include <xen/types.h>
 
+#include <asm/msr.h>
+
 #define PKEY_AD 1 /* Access Disable */
 #define PKEY_WD 2 /* Write Disable */
 
@@ -28,4 +31,55 @@ static inline void wrpkru(uint32_t pkru)
                    :: "a" (pkru), "d" (0), "c" (0) );
 }
 
+/*
+ * Xen does not use PKS.
+ *
+ * Guest kernel use is expected to be one default key, except for tiny windows
+ * with a double write to switch to a non-default key in a permitted critical
+ * section.
+ *
+ * As such, we want MSR_PKRS un-intercepted.  Furthermore, as we only need it
+ * in Xen for emulation or migration purposes (i.e. possibly never in a
+ * domain's lifetime), we don't want to re-sync the hardware value on every
+ * vmexit.
+ *
+ * Therefore, we read and cache the guest value in ctxt_switch_from(), in the
+ * expectation that we can short-circuit the write in ctxt_switch_to().
+ * During regular operations in current context, the guest value is in
+ * hardware and the per-cpu cache is stale.
+ */
+DECLARE_PER_CPU(uint32_t, pkrs);
+
+static inline uint32_t rdpkrs(void)
+{
+    uint32_t pkrs, tmp;
+
+    rdmsr(MSR_PKRS, pkrs, tmp);
+
+    return pkrs;
+}
+
+static inline uint32_t rdpkrs_and_cache(void)
+{
+    return this_cpu(pkrs) = rdpkrs();
+}
+
+static inline void wrpkrs(uint32_t pkrs)
+{
+    uint32_t *this_pkrs = &this_cpu(pkrs);
+
+    if ( *this_pkrs != pkrs )
+    {
+        *this_pkrs = pkrs;
+
+        wrmsr_ns(MSR_PKRS, pkrs, 0);
+    }
+}
+
+static inline void wrpkrs_and_cache(uint32_t pkrs)
+{
+    this_cpu(pkrs) = pkrs;
+    wrmsr_ns(MSR_PKRS, pkrs, 0);
+}
+
 #endif /* ASM_PROT_KEY_H */
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 6deadcf74763..567a0a42ac50 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -54,6 +54,7 @@
 #include <asm/spec_ctrl.h>
 #include <asm/guest.h>
 #include <asm/microcode.h>
+#include <asm/prot-key.h>
 #include <asm/pv/domain.h>
 
 /* opt_nosmp: If true, secondary processors are ignored. */
@@ -1804,6 +1805,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( opt_invpcid && cpu_has_invpcid )
         use_invpcid = true;
 
+    if ( cpu_has_pks )
+        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
+
     init_speculation_mitigations();
 
     init_idle_domain();
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 52beed9d8d6d..b26758c2c89f 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -42,6 +42,7 @@
 #include <asm/microcode.h>
 #include <asm/msr.h>
 #include <asm/mtrr.h>
+#include <asm/prot-key.h>
 #include <asm/setup.h>
 #include <asm/spec_ctrl.h>
 #include <asm/time.h>
@@ -364,6 +365,9 @@ void start_secondary(void *unused)
 
     /* Full exception support from here on in. */
 
+    if ( cpu_has_pks )
+        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
+
     /* Safe to enable feature such as CR4.MCE with the IDT set up now. */
     write_cr4(mmu_cr4_features);
 
-- 
2.11.0




* [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
                   ` (4 preceding siblings ...)
  2023-01-10 17:18 ` [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-10 18:07   ` Andrew Cooper
  2023-01-12 13:26   ` Jan Beulich
  2023-01-10 17:18 ` [PATCH v2 7/8] x86/pagewalk: Support PKS Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 8/8] x86/hvm: Support PKS for HAP guests Andrew Cooper
  7 siblings, 2 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

Have guest_{rd,wr}msr(), via hvm_{get,set}_reg(), access either the live
register, or stashed state, depending on context.  Include MSR_PKRS for
migration, and let the guest have full access.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Kevin Tian <kevin.tian@intel.com>

v2:
 * Rebase over the get/set_reg() infrastructure.
---
 xen/arch/x86/hvm/hvm.c     |  1 +
 xen/arch/x86/hvm/vmx/vmx.c | 17 +++++++++++++++++
 xen/arch/x86/msr.c         | 10 ++++++++++
 3 files changed, 28 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 927a221660e8..c6c1eea18003 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1333,6 +1333,7 @@ static int cf_check hvm_load_cpu_xsave_states(
 static const uint32_t msrs_to_send[] = {
     MSR_SPEC_CTRL,
     MSR_INTEL_MISC_FEATURES_ENABLES,
+    MSR_PKRS,
     MSR_IA32_BNDCFGS,
     MSR_IA32_XSS,
     MSR_VIRT_SPEC_CTRL,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b1f493f009fd..57827779c305 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -657,6 +657,11 @@ static void cf_check vmx_cpuid_policy_changed(struct vcpu *v)
     else
         vmx_set_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
 
+    if ( cp->feat.pks )
+        vmx_clear_msr_intercept(v, MSR_PKRS, VMX_MSR_RW);
+    else
+        vmx_set_msr_intercept(v, MSR_PKRS, VMX_MSR_RW);
+
  out:
     vmx_vmcs_exit(v);
 
@@ -2455,6 +2460,7 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
 {
     const struct vcpu *curr = current;
     struct domain *d = v->domain;
+    const struct vcpu_msrs *msrs = v->arch.msrs;
     uint64_t val = 0;
     int rc;
 
@@ -2471,6 +2477,9 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
         }
         return val;
 
+    case MSR_PKRS:
+        return (v == curr) ? rdpkrs() : msrs->pkrs;
+
     case MSR_SHADOW_GS_BASE:
         if ( v != curr )
             return v->arch.hvm.vmx.shadow_gs;
@@ -2499,6 +2508,8 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
 
 static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
 {
+    const struct vcpu *curr = current;
+    struct vcpu_msrs *msrs = v->arch.msrs;
     struct domain *d = v->domain;
     int rc;
 
@@ -2514,6 +2525,12 @@ static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
             domain_crash(d);
         }
         return;
+
+    case MSR_PKRS:
+        msrs->pkrs = val;
+        if ( v == curr )
+            wrpkrs(val);
+        return;
     }
 
     /* Logic which maybe requires remote VMCS acquisition. */
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 317b154d244d..7ddf0078c3a2 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -325,6 +325,11 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = 0;
         break;
 
+    case MSR_PKRS:
+        if ( !cp->feat.pks )
+            goto gp_fault;
+        goto get_reg;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -616,6 +621,11 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
             break;
         goto gp_fault;
 
+    case MSR_PKRS:
+        if ( !cp->feat.pks || val != (uint32_t)val )
+            goto gp_fault;
+        goto set_reg;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 7/8] x86/pagewalk: Support PKS
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
                   ` (5 preceding siblings ...)
  2023-01-10 17:18 ` [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  2023-01-10 17:18 ` [PATCH v2 8/8] x86/hvm: Support PKS for HAP guests Andrew Cooper
  7 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

PKS is very similar to the existing PKU behaviour, operating on pagewalks for
any supervisor mapping.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/include/asm/guest_pt.h | 5 +++++
 xen/arch/x86/include/asm/hvm/hvm.h  | 3 +++
 xen/arch/x86/mm/guest_walk.c        | 9 +++++----
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/include/asm/guest_pt.h b/xen/arch/x86/include/asm/guest_pt.h
index 6647ccfb8520..6802db2a415a 100644
--- a/xen/arch/x86/include/asm/guest_pt.h
+++ b/xen/arch/x86/include/asm/guest_pt.h
@@ -282,6 +282,11 @@ static always_inline bool guest_pku_enabled(const struct vcpu *v)
     return !is_pv_vcpu(v) && hvm_pku_enabled(v);
 }
 
+static always_inline bool guest_pks_enabled(const struct vcpu *v)
+{
+    return !is_pv_vcpu(v) && hvm_pks_enabled(v);
+}
+
 /* Helpers for identifying whether guest entries have reserved bits set. */
 
 /* Bits reserved because of maxphysaddr, and (lack of) EFER.NX */
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index 93254651f2f5..65768c797ea7 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -407,6 +407,8 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
     ((v)->arch.hvm.guest_efer & EFER_NXE)
 #define hvm_pku_enabled(v) \
     (hvm_paging_enabled(v) && ((v)->arch.hvm.guest_cr[4] & X86_CR4_PKE))
+#define hvm_pks_enabled(v) \
+    (hvm_paging_enabled(v) && ((v)->arch.hvm.guest_cr[4] & X86_CR4_PKS))
 
 /* Can we use superpages in the HAP p2m table? */
 #define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
@@ -911,6 +913,7 @@ static inline void hvm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
 #define hvm_smap_enabled(v) ((void)(v), false)
 #define hvm_nx_enabled(v) ((void)(v), false)
 #define hvm_pku_enabled(v) ((void)(v), false)
+#define hvm_pks_enabled(v) ((void)(v), false)
 
 #define arch_vcpu_block(v) ((void)(v))
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 161a61b8f5ca..76b4e0425887 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -406,16 +406,17 @@ guest_walk_tables(const struct vcpu *v, struct p2m_domain *p2m,
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
     /*
      * If all access checks are thus far ok, check Protection Key for 64bit
-     * data accesses to user mappings.
+     * data accesses.
      *
      * N.B. In the case that the walk ended with a superpage, the fabricated
      * gw->l1e contains the appropriate leaf pkey.
      */
-    if ( (ar & _PAGE_USER) && !(walk & PFEC_insn_fetch) &&
-         guest_pku_enabled(v) )
+    if ( !(walk & PFEC_insn_fetch) &&
+         ((ar & _PAGE_USER) ? guest_pku_enabled(v)
+                            : guest_pks_enabled(v)) )
     {
         unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
-        unsigned int pkr = rdpkru();
+        unsigned int pkr = (ar & _PAGE_USER) ? rdpkru() : rdpkrs();
         unsigned int pk_ar = (pkr >> (pkey * PKEY_WIDTH)) & (PKEY_AD | PKEY_WD);
 
         if ( (pk_ar & PKEY_AD) ||
-- 
2.11.0




* [PATCH v2 8/8] x86/hvm: Support PKS for HAP guests
  2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
                   ` (6 preceding siblings ...)
  2023-01-10 17:18 ` [PATCH v2 7/8] x86/pagewalk: Support PKS Andrew Cooper
@ 2023-01-10 17:18 ` Andrew Cooper
  7 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 17:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

With all infrastructure in place, advertise the PKS CPUID bit to HAP guests,
and let them set CR4.PKS.

Experiment with a tweak to the layout of hvm_cr4_guest_valid_bits() so future
additions will be just a single added line.

The current context switching behaviour is tied to how VT-x works, so leave a
safety check in the short term.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/cpuid.c                        | 9 +++++++++
 xen/arch/x86/hvm/hvm.c                      | 4 +++-
 xen/include/public/arch-x86/cpufeatureset.h | 2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index acc2f606cea8..b22725c492e7 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -579,6 +579,15 @@ static void __init calculate_hvm_max_policy(void)
             __clear_bit(X86_FEATURE_XSAVES, hvm_featureset);
     }
 
+    /*
+     * Xen doesn't use PKS, so the guest support for it has opted to not use
+     * the VMCS load/save controls for efficiency reasons.  This depends on
+     * the exact vmentry/exit behaviour, so don't expose PKS in other
+     * situations until someone has cross-checked the behaviour for safety.
+     */
+    if ( !cpu_has_vmx )
+        __clear_bit(X86_FEATURE_PKS, hvm_featureset);
+
     guest_common_feature_adjustments(hvm_featureset);
 
     sanitise_featureset(hvm_featureset);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c6c1eea18003..606f0e864981 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -969,7 +969,9 @@ unsigned long hvm_cr4_guest_valid_bits(const struct domain *d)
             (p->feat.smep     ? X86_CR4_SMEP              : 0) |
             (p->feat.smap     ? X86_CR4_SMAP              : 0) |
             (p->feat.pku      ? X86_CR4_PKE               : 0) |
-            (cet              ? X86_CR4_CET               : 0));
+            (cet              ? X86_CR4_CET               : 0) |
+            (p->feat.pks      ? X86_CR4_PKS               : 0) |
+            0);
 }
 
 static int cf_check hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 5444bc5d8374..3b85bcca1537 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -227,7 +227,7 @@ XEN_CPUFEATURE(CLDEMOTE,      6*32+25) /*A  CLDEMOTE instruction */
 XEN_CPUFEATURE(MOVDIRI,       6*32+27) /*a  MOVDIRI instruction */
 XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /*a  MOVDIR64B instruction */
 XEN_CPUFEATURE(ENQCMD,        6*32+29) /*   ENQCMD{,S} instructions */
-XEN_CPUFEATURE(PKS,           6*32+31) /*   Protection Key for Supervisor */
+XEN_CPUFEATURE(PKS,           6*32+31) /*H  Protection Key for Supervisor */
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(HW_PSTATE,     7*32+ 7) /*   Hardware Pstates */
-- 
2.11.0




* Re: [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS
  2023-01-10 17:18 ` [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
@ 2023-01-10 18:07   ` Andrew Cooper
  2023-01-12 13:26   ` Jan Beulich
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-10 18:07 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Jan Beulich, Roger Pau Monne, Wei Liu, Kevin Tian

On 10/01/2023 5:18 pm, Andrew Cooper wrote:
> Have guest_{rd,wr}msr(), via hvm_{get,set}_reg(), access either the live
> register, or stashed state, depending on context.  Include MSR_PKRS for
> migration, and let the guest have full access.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Kevin Tian <kevin.tian@intel.com>
>
> v2:
>  * Rebase over the get/set_reg() infrastructure.
> ---
>  xen/arch/x86/hvm/hvm.c     |  1 +
>  xen/arch/x86/hvm/vmx/vmx.c | 17 +++++++++++++++++
>  xen/arch/x86/msr.c         | 10 ++++++++++
>  3 files changed, 28 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 927a221660e8..c6c1eea18003 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1333,6 +1333,7 @@ static int cf_check hvm_load_cpu_xsave_states(
>  static const uint32_t msrs_to_send[] = {
>      MSR_SPEC_CTRL,
>      MSR_INTEL_MISC_FEATURES_ENABLES,
> +    MSR_PKRS,
>      MSR_IA32_BNDCFGS,
>      MSR_IA32_XSS,
>      MSR_VIRT_SPEC_CTRL,

Needs the following hunk too:

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c6c1eea18003..86cab7aa2627 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1487,6 +1487,7 @@ static int cf_check hvm_load_cpu_msrs(struct
domain *d, hvm_domain_context_t *h)
 
         case MSR_SPEC_CTRL:
         case MSR_INTEL_MISC_FEATURES_ENABLES:
+        case MSR_PKRS:
         case MSR_IA32_BNDCFGS:
         case MSR_IA32_XSS:
         case MSR_VIRT_SPEC_CTRL:

for the receive side of migration to work.

~Andrew


* Re: [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot
  2023-01-10 17:18 ` [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot Andrew Cooper
@ 2023-01-12 12:47   ` Jan Beulich
  2023-01-12 17:07     ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-12 12:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 10.01.2023 18:18, Andrew Cooper wrote:
> While the reset value of the register is 0, it might not be after kexec/etc.
> If PKEY0.{WD,AD} have leaked in from an earlier context, construction of a PV
> dom0 will explode.
> 
> Sequencing wise, this must come after setting CR4.PKE, and before we touch any
> user mappings.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> 
> For sequencing, it could also come after setting XCR0.PKRU too, but then we'd
> need to construct an empty XSAVE area to XRSTOR from, and that would be even
> more horrible to arrange.

That would be ugly for other reasons as well, I think.

> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -936,6 +936,9 @@ void cpu_init(void)
>  	write_debugreg(6, X86_DR6_DEFAULT);
>  	write_debugreg(7, X86_DR7_DEFAULT);
>  
> +	if (cpu_has_pku)
> +		wrpkru(0);

What about the BSP during S3 resume? Shouldn't we play safe there too, just
in case?

Jan



* Re: [PATCH v2 4/8] x86: Initial support for WRMSRNS
  2023-01-10 17:18 ` [PATCH v2 4/8] x86: Initial support for WRMSRNS Andrew Cooper
@ 2023-01-12 12:58   ` Jan Beulich
  2023-01-12 13:58     ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-12 12:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 10.01.2023 18:18, Andrew Cooper wrote:
> WRMSR Non-Serialising is an optimisation intended for cases where an MSR needs
> updating, but architectural serialising properties are not needed.
> 
> It is anticipated that this will apply to most if not all MSRs modified on
> context switch paths.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

This will allow me to drop half of what the respective emulator patch
consists of, which I'm yet to post (but which in turn is sitting on
top of many other already posted emulator patches). Comparing with
that patch, one nit though:

> --- a/tools/misc/xen-cpuid.c
> +++ b/tools/misc/xen-cpuid.c
> @@ -189,6 +189,7 @@ static const char *const str_7a1[32] =
>  
>      [10] = "fzrm",          [11] = "fsrs",
>      [12] = "fsrcs",
> +    /* 18 */                [19] = "wrmsrns",
>  };

We commonly leave a blank line to indicate dis-contiguous entries.

> --- a/xen/arch/x86/include/asm/msr.h
> +++ b/xen/arch/x86/include/asm/msr.h
> @@ -38,6 +38,18 @@ static inline void wrmsrl(unsigned int msr, __u64 val)
>          wrmsr(msr, lo, hi);
>  }
>  
> +/* Non-serialising WRMSR, when available.  Falls back to a serialising WRMSR. */
> +static inline void wrmsr_ns(uint32_t msr, uint32_t lo, uint32_t hi)
> +{
> +    /*
> +     * WRMSR is 2 bytes.  WRMSRNS is 3 bytes.  Pad WRMSR with a redundant CS
> +     * prefix to avoid a trailing NOP.
> +     */
> +    alternative_input(".byte 0x2e; wrmsr",
> +                      ".byte 0x0f,0x01,0xc6", X86_FEATURE_WRMSRNS,
> +                      "c" (msr), "a" (lo), "d" (hi));
> +}

No wrmsrl_ns() and/or wrmsr_ns_safe() variants right away?

Do you have any indications towards a CS prefix being the least risky
one to use here (or in general)? Recognizing that segment prefixes have
gained alternative meaning in certain contexts, I would otherwise wonder
whether an address or operand size prefix wouldn't be more suitable.

Jan



* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-10 17:18 ` [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS Andrew Cooper
@ 2023-01-12 13:10   ` Jan Beulich
  2023-01-12 16:51     ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-12 13:10 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Kevin Tian, Xen-devel

On 10.01.2023 18:18, Andrew Cooper wrote:
> +static inline void wrpkrs(uint32_t pkrs)
> +{
> +    uint32_t *this_pkrs = &this_cpu(pkrs);
> +
> +    if ( *this_pkrs != pkrs )
> +    {
> +        *this_pkrs = pkrs;
> +
> +        wrmsr_ns(MSR_PKRS, pkrs, 0);
> +    }
> +}
> +
> +static inline void wrpkrs_and_cache(uint32_t pkrs)
> +{
> +    this_cpu(pkrs) = pkrs;
> +    wrmsr_ns(MSR_PKRS, pkrs, 0);
> +}

Just to confirm - there's no anticipation of uses of this in async
contexts, i.e. there's no concern about the ordering of cache vs hardware
writes?

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -54,6 +54,7 @@
>  #include <asm/spec_ctrl.h>
>  #include <asm/guest.h>
>  #include <asm/microcode.h>
> +#include <asm/prot-key.h>
>  #include <asm/pv/domain.h>
>  
>  /* opt_nosmp: If true, secondary processors are ignored. */
> @@ -1804,6 +1805,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>      if ( opt_invpcid && cpu_has_invpcid )
>          use_invpcid = true;
>  
> +    if ( cpu_has_pks )
> +        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */

Same question here as for PKRU wrt the BSP during S3 resume.

Jan



* Re: [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS
  2023-01-10 17:18 ` [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
  2023-01-10 18:07   ` Andrew Cooper
@ 2023-01-12 13:26   ` Jan Beulich
  2023-01-12 14:16     ` Andrew Cooper
  1 sibling, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-12 13:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Kevin Tian, Xen-devel

On 10.01.2023 18:18, Andrew Cooper wrote:
> @@ -2471,6 +2477,9 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
>          }
>          return val;
>  
> +    case MSR_PKRS:
> +        return (v == curr) ? rdpkrs() : msrs->pkrs;

Nothing here or ...

> @@ -2514,6 +2525,12 @@ static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
>              domain_crash(d);
>          }
>          return;
> +
> +    case MSR_PKRS:
> +        msrs->pkrs = val;
> +        if ( v == curr )
> +            wrpkrs(val);
> +        return;

... here is VMX or (if we were to support it, just as an abstract
consideration) HVM specific. Which makes me wonder why this needs
handling in [gs]et_reg() in the first place. I guess I'm still not
fully in sync with your longer term plans here ...

The other thing I'd like to understand (and having an answer to this
would have been better before re-applying my R-b to this re-based
logic) is towards the lack of feature checks here. hvm_get_reg()
can be called from other than guest_rdmsr(), for an example see
arch_get_info_guest().

Jan



* Re: [PATCH v2 4/8] x86: Initial support for WRMSRNS
  2023-01-12 12:58   ` Jan Beulich
@ 2023-01-12 13:58     ` Andrew Cooper
  2023-01-12 15:11       ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-12 13:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 12/01/2023 12:58 pm, Jan Beulich wrote:
> On 10.01.2023 18:18, Andrew Cooper wrote:
>> WRMSR Non-Serialising is an optimisation intended for cases where an MSR needs
>> updating, but architectural serialising properties are not needed.
>>
>> It is anticipated that this will apply to most if not all MSRs modified on
>> context switch paths.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> This will allow me to drop half of what the respective emulator patch
> consists of, which I'm yet to post (but which in turn is sitting on
> top of many other already posted emulator patches). Comparing with
> that patch, one nit though:

I did wonder if you had some stuff queued up.  I do need to get back to
reviewing.

>
>> --- a/tools/misc/xen-cpuid.c
>> +++ b/tools/misc/xen-cpuid.c
>> @@ -189,6 +189,7 @@ static const char *const str_7a1[32] =
>>  
>>      [10] = "fzrm",          [11] = "fsrs",
>>      [12] = "fsrcs",
>> +    /* 18 */                [19] = "wrmsrns",
>>  };
> We commonly leave a blank line to indicate dis-contiguous entries.

Oops yes.  Will fix.

>
>> --- a/xen/arch/x86/include/asm/msr.h
>> +++ b/xen/arch/x86/include/asm/msr.h
>> @@ -38,6 +38,18 @@ static inline void wrmsrl(unsigned int msr, __u64 val)
>>          wrmsr(msr, lo, hi);
>>  }
>>  
>> +/* Non-serialising WRMSR, when available.  Falls back to a serialising WRMSR. */
>> +static inline void wrmsr_ns(uint32_t msr, uint32_t lo, uint32_t hi)
>> +{
>> +    /*
>> +     * WRMSR is 2 bytes.  WRMSRNS is 3 bytes.  Pad WRMSR with a redundant CS
>> +     * prefix to avoid a trailing NOP.
>> +     */
>> +    alternative_input(".byte 0x2e; wrmsr",
>> +                      ".byte 0x0f,0x01,0xc6", X86_FEATURE_WRMSRNS,
>> +                      "c" (msr), "a" (lo), "d" (hi));
>> +}
> No wrmsrl_ns() and/or wrmsr_ns_safe() variants right away?

I still have a branch cleaning up MSR handling, which has been pending
since the Nanjing XenSummit, which makes some of those disappear.

But no - I wasn't planning to introduce helpers ahead of them being needed.

> Do you have any indications towards a CS prefix being the least risky
> one to use here (or in general)?

Yes.

Remember it's the prefix recommended for, and used by,
-mbranches-within-32B-boundaries to work around the Skylake jmp errata.

And based on this justification, its also the prefix we use for padding
on various jmp/call's for retpoline inlining purposes.

~Andrew


* Re: [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS
  2023-01-12 13:26   ` Jan Beulich
@ 2023-01-12 14:16     ` Andrew Cooper
  2023-01-12 15:16       ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-12 14:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 12/01/2023 1:26 pm, Jan Beulich wrote:
> On 10.01.2023 18:18, Andrew Cooper wrote:
>> @@ -2471,6 +2477,9 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
>>          }
>>          return val;
>>  
>> +    case MSR_PKRS:
>> +        return (v == curr) ? rdpkrs() : msrs->pkrs;
> Nothing here or ...
>
>> @@ -2514,6 +2525,12 @@ static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
>>              domain_crash(d);
>>          }
>>          return;
>> +
>> +    case MSR_PKRS:
>> +        msrs->pkrs = val;
>> +        if ( v == curr )
>> +            wrpkrs(val);
>> +        return;
> ... here is VMX or (if we were to support it, just as a abstract
> consideration) HVM specific. Which makes me wonder why this needs
> handling in [gs]et_reg() in the first place. I guess I'm still not
> fully in sync with your longer term plans here ...

If (when) AMD implement it, the AMD form will be vmcb->pkrs rather than
msrs->pkrs because, like all other paging controls, it will be swapped
automatically by VMRUN.

(I don't know this for certain, but I'm happy to bet on it, given a
decade of consistency in this regard.)

> The other thing I'd like to understand (and having an answer to this
> would have been better before re-applying my R-b to this re-based
> logic) is towards the lack of feature checks here. hvm_get_reg()
> can be called from other than guest_rdmsr(), for an example see
> arch_get_info_guest().

The point is to separate auditing logic (wants to be implemented only
once) from data shuffling logic (is the value in a register, or the MSR
lists, or VMCB/VMCS or struct vcpu, etc).  It is always the caller's
responsibility to confirm that REG exists, and that VAL is suitable for REG.

arch_get_info_guest() passes MSR_SHADOW_GS_BASE which exists
unilaterally (because we don't technically do !LM correctly.)


But this is all discussed in the comment by the function prototypes. 
I'm not sure how to make that any clearer than it already is.

~Andrew


* Re: [PATCH v2 4/8] x86: Initial support for WRMSRNS
  2023-01-12 13:58     ` Andrew Cooper
@ 2023-01-12 15:11       ` Jan Beulich
  2023-01-16 12:21         ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-12 15:11 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 12.01.2023 14:58, Andrew Cooper wrote:
> On 12/01/2023 12:58 pm, Jan Beulich wrote:
>> Do you have any indications towards a CS prefix being the least risky
>> one to use here (or in general)?
> 
> Yes.
> 
> Remember it's the prefix recommended for, and used by,
> -mbranches-within-32B-boundaries to work around the Skylake jmp errata.
> 
> And based on this justification, its also the prefix we use for padding
> on various jmp/call's for retpoline inlining purposes.

While I'm okay with the reply, I'd like to point out that in those cases
address or operand size prefix simply could not have been used, for the
insns in question having explicit operands which would be affected. Which
is unlike the case here.

Jan



* Re: [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS
  2023-01-12 14:16     ` Andrew Cooper
@ 2023-01-12 15:16       ` Jan Beulich
  0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2023-01-12 15:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 12.01.2023 15:16, Andrew Cooper wrote:
> On 12/01/2023 1:26 pm, Jan Beulich wrote:
>> The other thing I'd like to understand (and having an answer to this
>> would have been better before re-applying my R-b to this re-based
>> logic) is towards the lack of feature checks here. hvm_get_reg()
>> can be called from other than guest_rdmsr(), for an example see
>> arch_get_info_guest().
> 
> The point is to separate auditing logic (wants to be implemented only
> once) from data shuffling logic (is the value in a register, or the MSR
> lists, or VMCB/VMCS or struct vcpu, etc).  It is always the caller's
> responsibility to confirm that REG exists, and that VAL is suitable for REG.
> 
> arch_get_info_guest() passes MSR_SHADOW_GS_BASE which exists
> unilaterally (because we don't technically do !LM correctly.)
> 
> 
> But this is all discussed in the comment by the function prototypes. 
> I'm not sure how to make that any clearer than it already is.

Okay, and I'm sorry for having looked at the definitions without finding
any helpful comment, but not at the declarations. Certainly sufficient
to confirm that my R-b can remain as you already had it.

Jan



* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-12 13:10   ` Jan Beulich
@ 2023-01-12 16:51     ` Andrew Cooper
  2023-01-16 13:00       ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-12 16:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 12/01/2023 1:10 pm, Jan Beulich wrote:
> On 10.01.2023 18:18, Andrew Cooper wrote:
>> +static inline void wrpkrs(uint32_t pkrs)
>> +{
>> +    uint32_t *this_pkrs = &this_cpu(pkrs);
>> +
>> +    if ( *this_pkrs != pkrs )
>> +    {
>> +        *this_pkrs = pkrs;
>> +
>> +        wrmsr_ns(MSR_PKRS, pkrs, 0);
>> +    }
>> +}
>> +
>> +static inline void wrpkrs_and_cache(uint32_t pkrs)
>> +{
>> +    this_cpu(pkrs) = pkrs;
>> +    wrmsr_ns(MSR_PKRS, pkrs, 0);
>> +}
> Just to confirm - there's no anticipation of uses of this in async
> contexts, i.e. there's no concern about the ordering of cache vs hardware
> writes?

No.  The only thing modifying MSR_PKRS does is change how the pagewalk
works for the current thread (specifically, the determination of Access
Rights).  There is no relevance outside of the core, especially for
Xen's local copy of the register value.

What WRMSRNS does guarantee is that older instructions will complete
before the MSR gets updated, and that subsequent instructions won't
start, so WRMSRNS acts "atomically" with respect to instruction order.

Also remember that not all WRMSRs are serialising.  e.g. the X2APIC MSRs
are explicitly not, and this is an oversight in practice for
MSR_X2APIC_ICR at least.

>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -54,6 +54,7 @@
>>  #include <asm/spec_ctrl.h>
>>  #include <asm/guest.h>
>>  #include <asm/microcode.h>
>> +#include <asm/prot-key.h>
>>  #include <asm/pv/domain.h>
>>  
>>  /* opt_nosmp: If true, secondary processors are ignored. */
>> @@ -1804,6 +1805,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>      if ( opt_invpcid && cpu_has_invpcid )
>>          use_invpcid = true;
>>  
>> +    if ( cpu_has_pks )
>> +        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
> Same question here as for PKRU wrt the BSP during S3 resume.

I had reasoned not, but it turns out that I'm wrong.

It's important to reset the cache back to 0 here.  (Handling PKRU is
different - I'll follow up on the other email..)

~Andrew


* Re: [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot
  2023-01-12 12:47   ` Jan Beulich
@ 2023-01-12 17:07     ` Andrew Cooper
  2023-01-13  9:04       ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-12 17:07 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 12/01/2023 12:47 pm, Jan Beulich wrote:
> On 10.01.2023 18:18, Andrew Cooper wrote:
>> While the reset value of the register is 0, it might not be after kexec/etc.
>> If PKEY0.{WD,AD} have leaked in from an earlier context, construction of a PV
>> dom0 will explode.
>>
>> Sequencing wise, this must come after setting CR4.PKE, and before we touch any
>> user mappings.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>>
>> For sequencing, it could also come after setting XCR0.PKRU too, but then we'd
>> need to construct an empty XSAVE area to XRSTOR from, and that would be even
>> more horrible to arrange.
> That would be ugly for other reasons as well, I think.

Yeah - I absolutely don't want to go down this route.

>
>> --- a/xen/arch/x86/cpu/common.c
>> +++ b/xen/arch/x86/cpu/common.c
>> @@ -936,6 +936,9 @@ void cpu_init(void)
>>  	write_debugreg(6, X86_DR6_DEFAULT);
>>  	write_debugreg(7, X86_DR7_DEFAULT);
>>  
>> +	if (cpu_has_pku)
>> +		wrpkru(0);
> What about the BSP during S3 resume? Shouldn't we play safe there too, just
> in case?

Out of S3, I think it's reasonable to rely on proper reset values for
PKRU, and any issues of it being "wrong" should be fixed when we reload
d0v0's XSAVE state.


That said, I'm wanting to try and merge parts of the boot and S3 paths
because we're finding no end of errors/oversights, not least because we
have no automated testing of S3 suspend/resume.  Servers typically don't
implement it, and fixes either come from code inspection, or Qubes
noticing (which is absolutely better than nothing, but not a great
reflection on Xen).

But to merge these things, I first need to finish the work to make
microcode loading properly early, and then fix up some of the feature
detection paths, and cleanly separate feature detection from applying
the chosen configuration, at which point I hope the latter part will be
reusable on the S3 resume path.

I don't expect this work to happen imminently...

~Andrew


* Re: [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot
  2023-01-12 17:07     ` Andrew Cooper
@ 2023-01-13  9:04       ` Jan Beulich
  0 siblings, 0 replies; 27+ messages in thread
From: Jan Beulich @ 2023-01-13  9:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 12.01.2023 18:07, Andrew Cooper wrote:
> On 12/01/2023 12:47 pm, Jan Beulich wrote:
>> On 10.01.2023 18:18, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/cpu/common.c
>>> +++ b/xen/arch/x86/cpu/common.c
>>> @@ -936,6 +936,9 @@ void cpu_init(void)
>>>  	write_debugreg(6, X86_DR6_DEFAULT);
>>>  	write_debugreg(7, X86_DR7_DEFAULT);
>>>  
>>> +	if (cpu_has_pku)
>>> +		wrpkru(0);
>> What about the BSP during S3 resume? Shouldn't we play safe there too, just
>> in case?
> 
> Out of S3, I think it's reasonable to rely on proper reset values for
> PKRU, and any issues of it being "wrong" should be fixed when we
> reload d0v0's XSAVE state.

Fair enough:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v2 4/8] x86: Initial support for WRMSRNS
  2023-01-12 15:11       ` Jan Beulich
@ 2023-01-16 12:21         ` Andrew Cooper
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-16 12:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 12/01/2023 3:11 pm, Jan Beulich wrote:
> On 12.01.2023 14:58, Andrew Cooper wrote:
>> On 12/01/2023 12:58 pm, Jan Beulich wrote:
>>> Do you have any indications towards a CS prefix being the least risky
>>> one to use here (or in general)?
>> Yes.
>>
>> Remember it's the prefix recommended for, and used by,
>> -mbranches-within-32B-boundaries to work around the Skylake jmp errata.
>>
>> And based on this justification, it's also the prefix we use for padding
>> on various jmp/call's for retpoline inlining purposes.
> While I'm okay with the reply, I'd like to point out that in those cases
> address or operand size prefix simply could not have been used, for the
> insns in question having explicit operands which would be affected. Which
> is unlike the case here.

A CS prefix *is* the option which the architects deemed safest, and
subsequent workarounds use CS because it had previously passed
muster.

Various toolchain codegen options will result in `cs wrmsr` being emitted.

There are a subset of instructions for which other prefixes work for
padding purposes, but by not following the recommendation and choosing
CS, you're betting against the people who make new meanings for
instruction encodings.
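(To illustrate why the padding matters — byte values below are as I
read them from the manuals, so treat this as a sketch: wrmsr is 0f 30,
two bytes, while wrmsrns is 0f 01 c6, three bytes.  A CS prefix on
wrmsr makes the two forms the same length, so alternatives patching
can swap one for the other in place:)

```c
#include <assert.h>

/* `cs wrmsr`: CS segment-override prefix (2e) + wrmsr (0f 30). */
static const unsigned char cs_wrmsr[] = { 0x2e, 0x0f, 0x30 };

/* wrmsrns: 0f 01 c6, per the ISA extensions reference. */
static const unsigned char wrmsrns[]  = { 0x0f, 0x01, 0xc6 };
```

Equal lengths are the whole point: the patch site needs no nops and no
relocation when the non-serialising form is enabled at boot.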

~Andrew


* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-12 16:51     ` Andrew Cooper
@ 2023-01-16 13:00       ` Andrew Cooper
  2023-01-16 14:17         ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-16 13:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 12/01/2023 4:51 pm, Andrew Cooper wrote:
> On 12/01/2023 1:10 pm, Jan Beulich wrote:
>> On 10.01.2023 18:18, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/setup.c
>>> +++ b/xen/arch/x86/setup.c
>>> @@ -54,6 +54,7 @@
>>>  #include <asm/spec_ctrl.h>
>>>  #include <asm/guest.h>
>>>  #include <asm/microcode.h>
>>> +#include <asm/prot-key.h>
>>>  #include <asm/pv/domain.h>
>>>  
>>>  /* opt_nosmp: If true, secondary processors are ignored. */
>>> @@ -1804,6 +1805,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>>      if ( opt_invpcid && cpu_has_invpcid )
>>>          use_invpcid = true;
>>>  
>>> +    if ( cpu_has_pks )
>>> +        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
>> Same question here as for PKRU wrt the BSP during S3 resume.
> I had reasoned not, but it turns out that I'm wrong.
>
> It's important to reset the cache back to 0 here.  (Handling PKRU is
> different - I'll follow up on the other email..)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index d23335391c67..de9317e8c573 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -299,6 +299,13 @@ static int enter_state(u32 state)
 
     update_mcu_opt_ctrl();
 
+    /*
+     * Should be before restoring CR4, but that is earlier in asm.  We
+     * rely on MSR_PKRS actually being 0 out of S3 resume.
+     */
+    if ( cpu_has_pks )
+        wrpkrs_and_cache(0);
+
     /* (re)initialise SYSCALL/SYSENTER state, amongst other things. */
     percpu_traps_init();
 

I've folded this hunk, to sort out the S3 resume path.

As it's the final hunk before the entire series can be committed, I
shan't bother sending a v3 just for this.

~Andrew


* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-16 13:00       ` Andrew Cooper
@ 2023-01-16 14:17         ` Jan Beulich
  2023-01-16 14:57           ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-16 14:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 16.01.2023 14:00, Andrew Cooper wrote:
> On 12/01/2023 4:51 pm, Andrew Cooper wrote:
>> On 12/01/2023 1:10 pm, Jan Beulich wrote:
>>> On 10.01.2023 18:18, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/setup.c
>>>> +++ b/xen/arch/x86/setup.c
>>>> @@ -54,6 +54,7 @@
>>>>  #include <asm/spec_ctrl.h>
>>>>  #include <asm/guest.h>
>>>>  #include <asm/microcode.h>
>>>> +#include <asm/prot-key.h>
>>>>  #include <asm/pv/domain.h>
>>>>  
>>>>  /* opt_nosmp: If true, secondary processors are ignored. */
>>>> @@ -1804,6 +1805,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>>>      if ( opt_invpcid && cpu_has_invpcid )
>>>>          use_invpcid = true;
>>>>  
>>>> +    if ( cpu_has_pks )
>>>> +        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
>>> Same question here as for PKRU wrt the BSP during S3 resume.
>> I had reasoned not, but it turns out that I'm wrong.
>>
>> It's important to reset the cache back to 0 here.  (Handling PKRU is
>> different - I'll follow up on the other email..)
> 
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index d23335391c67..de9317e8c573 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -299,6 +299,13 @@ static int enter_state(u32 state)
>  
>      update_mcu_opt_ctrl();
>  
> +    /*
> +     * Should be before restoring CR4, but that is earlier in asm.  We
> +     * rely on MSR_PKRS actually being 0 out of S3 resume.
> +     */
> +    if ( cpu_has_pks )
> +        wrpkrs_and_cache(0);
> +
>      /* (re)initialise SYSCALL/SYSENTER state, amongst other things. */
>      percpu_traps_init();
>  
> 
> I've folded this hunk, to sort out the S3 resume path.

The comment is a little misleading imo - it looks to justify that nothing
needs doing. Could you add "..., but our cache needs clearing" to clarify
why, despite our relying on zero being in the register (which I find
problematic, considering that the doc doesn't even spell out reset state),
the write is needed?

> As it's the final hunk before the entire series can be committed, I
> shan't bother sending a v3 just for this.

If you're seeing reasons not to be concerned of the unspecified reset
state, then feel free to add my A-b (but not R-b) here.

Jan



* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-16 14:17         ` Jan Beulich
@ 2023-01-16 14:57           ` Andrew Cooper
  2023-01-16 15:04             ` Jan Beulich
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Cooper @ 2023-01-16 14:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 16/01/2023 2:17 pm, Jan Beulich wrote:
> On 16.01.2023 14:00, Andrew Cooper wrote:
>> On 12/01/2023 4:51 pm, Andrew Cooper wrote:
>>> On 12/01/2023 1:10 pm, Jan Beulich wrote:
>>>> On 10.01.2023 18:18, Andrew Cooper wrote:
>>>>> --- a/xen/arch/x86/setup.c
>>>>> +++ b/xen/arch/x86/setup.c
>>>>> @@ -54,6 +54,7 @@
>>>>>  #include <asm/spec_ctrl.h>
>>>>>  #include <asm/guest.h>
>>>>>  #include <asm/microcode.h>
>>>>> +#include <asm/prot-key.h>
>>>>>  #include <asm/pv/domain.h>
>>>>>  
>>>>>  /* opt_nosmp: If true, secondary processors are ignored. */
>>>>> @@ -1804,6 +1805,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>>>>      if ( opt_invpcid && cpu_has_invpcid )
>>>>>          use_invpcid = true;
>>>>>  
>>>>> +    if ( cpu_has_pks )
>>>>> +        wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
>>>> Same question here as for PKRU wrt the BSP during S3 resume.
>>> I had reasoned not, but it turns out that I'm wrong.
>>>
>>> It's important to reset the cache back to 0 here.  (Handling PKRU is
>>> different - I'll follow up on the other email..)
>> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
>> index d23335391c67..de9317e8c573 100644
>> --- a/xen/arch/x86/acpi/power.c
>> +++ b/xen/arch/x86/acpi/power.c
>> @@ -299,6 +299,13 @@ static int enter_state(u32 state)
>>  
>>      update_mcu_opt_ctrl();
>>  
>> +    /*
>> +     * Should be before restoring CR4, but that is earlier in asm.  We
>> +     * rely on MSR_PKRS actually being 0 out of S3 resume.
>> +     */
>> +    if ( cpu_has_pks )
>> +        wrpkrs_and_cache(0);
>> +
>>      /* (re)initialise SYSCALL/SYSENTER state, amongst other things. */
>>      percpu_traps_init();
>>  
>>
>> I've folded this hunk, to sort out the S3 resume path.
> The comment is a little misleading imo - it looks to justify that nothing
> needs doing. Could you add "..., but our cache needs clearing" to clarify
> why, despite our relying on zero being in the register (which I find
> problematic, considering that the doc doesn't even spell out reset state),
> the write is needed?

Xen doesn't actually set CR4.PKS at all (yet).

I'm just trying to do a reasonable job of leaving Xen in a position
where it doesn't explode the instant we want to start using PKS ourselves.

S3 resume is out of a full core poweroff.  Registers (which aren't
modified by firmware) will have their architectural reset values, and
for MSR_PKRS, that's 0.

I will add a comment about resetting the cache, because that is the
point of doing this operation.

If we do find firmware which really does mess around with MSR_PKRS (and
isn't restoring the pre-S3 value), then this clause needs moving down
into asm before we restore %cr4 fully, but TBH, I hope I've had time to
try and unify the boot and S3 resume paths a bit before then.
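(The cache-vs-register distinction can be modelled in plain C.  This
is only a sketch with the MSR access stubbed out: wrpkrs_and_cache()
follows the helper named in the patch, while the lazy wrpkrs() and
the stub are my illustration of why the cache must be re-zeroed even
though the register itself resets to 0:)

```c
#include <assert.h>
#include <stdint.h>

static uint32_t pkrs_cache;      /* per-CPU variable in real code */
static uint32_t fake_msr_pkrs;   /* stands in for MSR_PKRS */
static unsigned int wrmsr_count; /* counts (expensive) MSR writes */

static void stub_wrmsr_pks(uint32_t val)
{
    fake_msr_pkrs = val;
    wrmsr_count++;
}

/* Lazy write: skip the MSR access when the cache already matches. */
static void wrpkrs(uint32_t val)
{
    if ( pkrs_cache != val )
    {
        pkrs_cache = val;
        stub_wrmsr_pks(val);
    }
}

/*
 * Unconditional write that also refreshes the cache.  Out of S3, the
 * register is back at its reset value of 0, so a stale nonzero cache
 * would make a later lazy wrpkrs(0) incorrectly skip the write.
 */
static void wrpkrs_and_cache(uint32_t val)
{
    pkrs_cache = val;
    stub_wrmsr_pks(val);
}
```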

>> As it's the final hunk before the entire series can be committed, I
>> shan't bother sending a v3 just for this.
> If you're seeing reasons not to be concerned of the unspecified reset
> state, then feel free to add my A-b (but not R-b) here.

There are several reasons not to be concerned.  I guess I'll take your
ack then.  Thanks.

~Andrew


* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-16 14:57           ` Andrew Cooper
@ 2023-01-16 15:04             ` Jan Beulich
  2023-01-16 15:43               ` Andrew Cooper
  0 siblings, 1 reply; 27+ messages in thread
From: Jan Beulich @ 2023-01-16 15:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 16.01.2023 15:57, Andrew Cooper wrote:
> On 16/01/2023 2:17 pm, Jan Beulich wrote:
>> On 16.01.2023 14:00, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/acpi/power.c
>>> +++ b/xen/arch/x86/acpi/power.c
>>> @@ -299,6 +299,13 @@ static int enter_state(u32 state)
>>>  
>>>      update_mcu_opt_ctrl();
>>>  
>>> +    /*
>>> +     * Should be before restoring CR4, but that is earlier in asm.  We
>>> +     * rely on MSR_PKRS actually being 0 out of S3 resume.
>>> +     */
>>> +    if ( cpu_has_pks )
>>> +        wrpkrs_and_cache(0);
>>> +
>>>      /* (re)initialise SYSCALL/SYSENTER state, amongst other things. */
>>>      percpu_traps_init();
>>>  
>>>
>>> I've folded this hunk, to sort out the S3 resume path.
>> The comment is a little misleading imo - it looks to justify that nothing
>> needs doing. Could you add "..., but our cache needs clearing" to clarify
>> why, despite our relying on zero being in the register (which I find
>> problematic, considering that the doc doesn't even spell out reset state),
>> the write is needed?
> 
> Xen doesn't actually set CR4.PKS at all (yet).
> 
> I'm just trying to do a reasonable job of leaving Xen in a position
> where it doesn't explode the instant we want to start using PKS ourselves.
> 
> S3 resume is out of a full core poweroff.  Registers (which aren't
> modified by firmware) will have their architectural reset values, and
> for MSR_PKRS, that's 0.

And where have you found that to be spelled out? It is this lack of
specification (afaics) which is concerning me.

Jan



* Re: [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS
  2023-01-16 15:04             ` Jan Beulich
@ 2023-01-16 15:43               ` Andrew Cooper
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Cooper @ 2023-01-16 15:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 16/01/2023 3:04 pm, Jan Beulich wrote:
> On 16.01.2023 15:57, Andrew Cooper wrote:
>> On 16/01/2023 2:17 pm, Jan Beulich wrote:
>>> On 16.01.2023 14:00, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/acpi/power.c
>>>> +++ b/xen/arch/x86/acpi/power.c
>>>> @@ -299,6 +299,13 @@ static int enter_state(u32 state)
>>>>  
>>>>      update_mcu_opt_ctrl();
>>>>  
>>>> +    /*
>>>> +     * Should be before restoring CR4, but that is earlier in asm.  We
>>>> +     * rely on MSR_PKRS actually being 0 out of S3 resume.
>>>> +     */
>>>> +    if ( cpu_has_pks )
>>>> +        wrpkrs_and_cache(0);
>>>> +
>>>>      /* (re)initialise SYSCALL/SYSENTER state, amongst other things. */
>>>>      percpu_traps_init();
>>>>  
>>>>
>>>> I've folded this hunk, to sort out the S3 resume path.
>>> The comment is a little misleading imo - it looks to justify that nothing
>>> needs doing. Could you add "..., but our cache needs clearing" to clarify
>>> why, despite our relying on zero being in the register (which I find
>>> problematic, considering that the doc doesn't even spell out reset state),
>>> the write is needed?
>> Xen doesn't actually set CR4.PKS at all (yet).
>>
>> I'm just trying to do a reasonable job of leaving Xen in a position
>> where it doesn't explode the instant we want to start using PKS ourselves.
>>
>> S3 resume is out of a full core poweroff.  Registers (which aren't
>> modified by firmware) will have their architectural reset values, and
>> for MSR_PKRS, that's 0.
> And where have you found that to be spelled out? It is this lack of
> specification (afaics) which is concerning me.

I have a request for an update to table 10-1 already pending.  MSR_PKRS
isn't plausibly different from PKRU.

(And even if it is different, this still won't matter while Xen doesn't
use CR4.PKS itself.)

~Andrew


end of thread, other threads:[~2023-01-16 15:44 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-10 17:18 [PATCH v2 0/8] x86: Protection Key Supervisor support Andrew Cooper
2023-01-10 17:18 ` [PATCH v2 1/8] x86/boot: Sanitise PKRU on boot Andrew Cooper
2023-01-12 12:47   ` Jan Beulich
2023-01-12 17:07     ` Andrew Cooper
2023-01-13  9:04       ` Jan Beulich
2023-01-10 17:18 ` [PATCH v2 2/8] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
2023-01-10 17:18 ` [PATCH v2 3/8] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
2023-01-10 17:18 ` [PATCH v2 4/8] x86: Initial support for WRMSRNS Andrew Cooper
2023-01-12 12:58   ` Jan Beulich
2023-01-12 13:58     ` Andrew Cooper
2023-01-12 15:11       ` Jan Beulich
2023-01-16 12:21         ` Andrew Cooper
2023-01-10 17:18 ` [PATCH v2 5/8] x86/hvm: Context switch MSR_PKRS Andrew Cooper
2023-01-12 13:10   ` Jan Beulich
2023-01-12 16:51     ` Andrew Cooper
2023-01-16 13:00       ` Andrew Cooper
2023-01-16 14:17         ` Jan Beulich
2023-01-16 14:57           ` Andrew Cooper
2023-01-16 15:04             ` Jan Beulich
2023-01-16 15:43               ` Andrew Cooper
2023-01-10 17:18 ` [PATCH v2 6/8] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
2023-01-10 18:07   ` Andrew Cooper
2023-01-12 13:26   ` Jan Beulich
2023-01-12 14:16     ` Andrew Cooper
2023-01-12 15:16       ` Jan Beulich
2023-01-10 17:18 ` [PATCH v2 7/8] x86/pagewalk: Support PKS Andrew Cooper
2023-01-10 17:18 ` [PATCH v2 8/8] x86/hvm: Support PKS for HAP guests Andrew Cooper
