* [PATCH 0/6] x86: Support PKS
@ 2021-12-16  9:54 Andrew Cooper
  2021-12-16  9:54 ` [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

I found a spare half hour, and this turned out to go very smoothly.

It's tentatively RFC right now, because I'm still adding PKS support to the
XTF comprehensive pagewalk test, but the series is definitely fit for review
at this point.

Andrew Cooper (6):
  x86/prot-key: Enumeration for Protection Key Supervisor
  x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  x86/hvm: Context switch MSR_PKRS
  x86/hvm: Enable guest access to MSR_PKRS
  x86/pagewalk: Support PKS
  x86/hvm: Support PKS

 tools/libs/light/libxl_cpuid.c              |  1 +
 tools/misc/xen-cpuid.c                      |  2 +-
 xen/arch/x86/cpuid.c                        |  9 +++
 xen/arch/x86/hvm/hvm.c                      |  8 ++-
 xen/arch/x86/hvm/vmx/vmx.c                  | 14 +++++
 xen/arch/x86/include/asm/guest_pt.h         |  5 ++
 xen/arch/x86/include/asm/hvm/hvm.h          |  3 +
 xen/arch/x86/include/asm/msr-index.h        |  2 +
 xen/arch/x86/include/asm/msr.h              |  8 +++
 xen/arch/x86/include/asm/processor.h        | 38 ------------
 xen/arch/x86/include/asm/prot-key.h         | 93 +++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/x86-defns.h        |  1 +
 xen/arch/x86/mm/guest_walk.c                | 16 +++--
 xen/arch/x86/msr.c                          | 17 ++++++
 xen/arch/x86/x86_emulate.c                  |  2 +
 xen/include/public/arch-x86/cpufeatureset.h |  1 +
 16 files changed, 174 insertions(+), 46 deletions(-)
 create mode 100644 xen/arch/x86/include/asm/prot-key.h

-- 
2.11.0



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
@ 2021-12-16  9:54 ` Andrew Cooper
  2021-12-21 11:15   ` Jan Beulich
  2021-12-16  9:54 ` [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

Protection Key Supervisor works in a very similar way to Protection Key User,
except that instead of a PKRU register used by the {RD,WR}PKRU instructions,
the supervisor protection settings live in MSR_PKRS and are accessed using
normal {RD,WR}MSR instructions.

PKS has the same problematic interactions with PV guests as PKU (more so, in
fact, given the guest kernel's CPL), so we'll only support this for HVM guests
for now.
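As a purely illustrative aside, PKS is enumerated in CPUID.(EAX=7,ECX=0):ECX
bit 31, which is what the hunks below wire up; the helper below is invented
for the example and does not exist in Xen:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: PKS lives in CPUID.(EAX=7,ECX=0):ECX[31].
 * This helper name is hypothetical.
 */
static bool ecx_has_pks(uint32_t ecx)
{
    return ecx & (1u << 31);
}
```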

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 tools/libs/light/libxl_cpuid.c              | 1 +
 tools/misc/xen-cpuid.c                      | 2 +-
 xen/arch/x86/include/asm/msr-index.h        | 2 ++
 xen/arch/x86/include/asm/x86-defns.h        | 1 +
 xen/include/public/arch-x86/cpufeatureset.h | 1 +
 5 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index e1acf6648db4..efd01fd5c5b5 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -211,6 +211,7 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str)
         {"avx512-vpopcntdq",0x00000007,0,CPUID_REG_ECX, 14,  1},
         {"rdpid",        0x00000007,  0, CPUID_REG_ECX, 22,  1},
         {"cldemote",     0x00000007,  0, CPUID_REG_ECX, 25,  1},
+        {"pks",          0x00000007,  0, CPUID_REG_ECX, 31,  1},
 
         {"avx512-4vnniw",0x00000007,  0, CPUID_REG_EDX,  2,  1},
         {"avx512-4fmaps",0x00000007,  0, CPUID_REG_EDX,  3,  1},
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index fb36cac07baa..f5b67acacc48 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -133,7 +133,7 @@ static const char *const str_7c0[32] =
     /* 24 */                   [25] = "cldemote",
     /* 26 */                   [27] = "movdiri",
     [28] = "movdir64b",        [29] = "enqcmd",
-    [30] = "sgx-lc",
+    [30] = "sgx-lc",           [31] = "pks",
 };
 
 static const char *const str_e7d[32] =
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index ab68ef2681a9..3a1b4438e939 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -122,6 +122,8 @@
 #define MSR_PL3_SSP                         0x000006a7
 #define MSR_INTERRUPT_SSP_TABLE             0x000006a8
 
+#define MSR_PKRS                            0x000006e1
+
 #define MSR_X2APIC_FIRST                    0x00000800
 #define MSR_X2APIC_LAST                     0x00000bff
 
diff --git a/xen/arch/x86/include/asm/x86-defns.h b/xen/arch/x86/include/asm/x86-defns.h
index 28628807cb98..37bbb3594e88 100644
--- a/xen/arch/x86/include/asm/x86-defns.h
+++ b/xen/arch/x86/include/asm/x86-defns.h
@@ -74,6 +74,7 @@
 #define X86_CR4_SMAP       0x00200000 /* enable SMAP */
 #define X86_CR4_PKE        0x00400000 /* enable PKE */
 #define X86_CR4_CET        0x00800000 /* Control-flow Enforcement Technology */
+#define X86_CR4_PKS        0x01000000 /* Protection Key Supervisor */
 
 /*
  * XSTATE component flags in XCR0
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 647ee9e5e277..79a8f244d88a 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -244,6 +244,7 @@ XEN_CPUFEATURE(CLDEMOTE,      6*32+25) /*A  CLDEMOTE instruction */
 XEN_CPUFEATURE(MOVDIRI,       6*32+27) /*a  MOVDIRI instruction */
 XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /*a  MOVDIR64B instruction */
 XEN_CPUFEATURE(ENQCMD,        6*32+29) /*   ENQCMD{,S} instructions */
+XEN_CPUFEATURE(PKS,           6*32+31) /*   Protection Key for Supervisor */
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(HW_PSTATE,     7*32+ 7) /*   Hardware Pstates */
-- 
2.11.0




* [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
  2021-12-16  9:54 ` [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
@ 2021-12-16  9:54 ` Andrew Cooper
  2021-12-21 11:28   ` Jan Beulich
  2021-12-16  9:54 ` [PATCH 3/6] x86/hvm: Context switch MSR_PKRS Andrew Cooper
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

asm/processor.h is in desperate need of splitting up, and protection key
functionality is only used in the emulator and pagewalk.  Introduce a new
asm/prot-key.h and move the relevant content over.

Rename the PKRU_* constants to drop the user part and to use the architectural
terminology.

Drop the read_pkru_{ad,wd}() helpers entirely.  The pkru infix is about to
become wrong, and the sole user is shorter and easier to follow without the
helpers.

No functional change.
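For reference, the decode the sole remaining user now open-codes (two bits
per key: AD in bit 0 and WD in bit 1 of each field) can be sketched like
this, reusing the constants this patch introduces; the wrapper function is
invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_AD    1 /* Access Disable */
#define PKEY_WD    2 /* Write Disable */
#define PKEY_WIDTH 2 /* Two bits per protection key */

/* Extract the 2-bit AD/WD field for one key; mirrors the open-coded shift. */
static unsigned int pkr_field(uint32_t pkr, unsigned int pkey)
{
    return (pkr >> (pkey * PKEY_WIDTH)) & (PKEY_AD | PKEY_WD);
}
```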

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/include/asm/processor.h | 38 ------------------------------
 xen/arch/x86/include/asm/prot-key.h  | 45 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/guest_walk.c         |  9 +++++---
 xen/arch/x86/x86_emulate.c           |  2 ++
 4 files changed, 53 insertions(+), 41 deletions(-)
 create mode 100644 xen/arch/x86/include/asm/prot-key.h

diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 400b4fac5ed4..eb1687d0795c 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -367,44 +367,6 @@ static always_inline void set_in_cr4 (unsigned long mask)
     write_cr4(read_cr4() | mask);
 }
 
-static inline unsigned int rdpkru(void)
-{
-    unsigned int pkru;
-
-    asm volatile (".byte 0x0f,0x01,0xee"
-        : "=a" (pkru) : "c" (0) : "dx");
-
-    return pkru;
-}
-
-static inline void wrpkru(unsigned int pkru)
-{
-    asm volatile ( ".byte 0x0f, 0x01, 0xef"
-                   :: "a" (pkru), "d" (0), "c" (0) );
-}
-
-/* Macros for PKRU domain */
-#define PKRU_READ  (0)
-#define PKRU_WRITE (1)
-#define PKRU_ATTRS (2)
-
-/*
- * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
- * domain in pkru, pkeys is index to a defined domain, so the value of
- * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
- */
-static inline bool_t read_pkru_ad(uint32_t pkru, unsigned int pkey)
-{
-    ASSERT(pkey < 16);
-    return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
-}
-
-static inline bool_t read_pkru_wd(uint32_t pkru, unsigned int pkey)
-{
-    ASSERT(pkey < 16);
-    return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
-}
-
 static always_inline void __monitor(const void *eax, unsigned long ecx,
                                     unsigned long edx)
 {
diff --git a/xen/arch/x86/include/asm/prot-key.h b/xen/arch/x86/include/asm/prot-key.h
new file mode 100644
index 000000000000..084b248d81a5
--- /dev/null
+++ b/xen/arch/x86/include/asm/prot-key.h
@@ -0,0 +1,45 @@
+/******************************************************************************
+ * arch/x86/include/asm/prot-key.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (c) 2021 Citrix Systems Ltd.
+ */
+#ifndef ASM_PROT_KEY_H
+#define ASM_PROT_KEY_H
+
+#include <xen/types.h>
+
+#define PKEY_AD 1 /* Access Disable */
+#define PKEY_WD 2 /* Write Disable */
+
+#define PKEY_WIDTH 2 /* Two bits per protection key */
+
+static inline uint32_t rdpkru(void)
+{
+    uint32_t pkru;
+
+    asm volatile ( ".byte 0x0f,0x01,0xee"
+                   : "=a" (pkru) : "c" (0) : "dx" );
+
+    return pkru;
+}
+
+static inline void wrpkru(uint32_t pkru)
+{
+    asm volatile ( ".byte 0x0f, 0x01, 0xef"
+                   :: "a" (pkru), "d" (0), "c" (0) );
+}
+
+#endif /* ASM_PROT_KEY_H */
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index b9f607272c39..dc8fdde0212e 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -26,7 +26,9 @@
 #include <xen/paging.h>
 #include <xen/domain_page.h>
 #include <xen/sched.h>
+
 #include <asm/page.h>
+#include <asm/prot-key.h>
 #include <asm/guest_pt.h>
 #include <asm/hvm/emulate.h>
 
@@ -413,10 +415,11 @@ guest_walk_tables(const struct vcpu *v, struct p2m_domain *p2m,
          guest_pku_enabled(v) )
     {
         unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
-        unsigned int pkru = rdpkru();
+        unsigned int pkr = rdpkru();
+        unsigned int pk_ar = pkr >> (pkey * PKEY_WIDTH);
 
-        if ( read_pkru_ad(pkru, pkey) ||
-             ((walk & PFEC_write_access) && read_pkru_wd(pkru, pkey) &&
+        if ( (pk_ar & PKEY_AD) ||
+             ((walk & PFEC_write_access) && (pk_ar & PKEY_WD) &&
               ((walk & PFEC_user_mode) || guest_wp_enabled(v))) )
         {
             gw->pfec |= PFEC_prot_key;
diff --git a/xen/arch/x86/x86_emulate.c b/xen/arch/x86/x86_emulate.c
index 1e082e6f3b2d..551ad0f7b303 100644
--- a/xen/arch/x86/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate.c
@@ -12,8 +12,10 @@
 #include <xen/domain_page.h>
 #include <xen/err.h>
 #include <xen/event.h>
+
 #include <asm/x86_emulate.h>
 #include <asm/processor.h> /* current_cpu_info */
+#include <asm/prot-key.h>
 #include <asm/xstate.h>
 #include <asm/amd.h> /* cpu_has_amd_erratum() */
 #include <asm/debugreg.h>
-- 
2.11.0




* [PATCH 3/6] x86/hvm: Context switch MSR_PKRS
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
  2021-12-16  9:54 ` [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
  2021-12-16  9:54 ` [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
@ 2021-12-16  9:54 ` Andrew Cooper
  2021-12-21 11:56   ` Jan Beulich
  2021-12-16  9:54 ` [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

Under PKS, MSR_PKRS is available based on the CPUID policy alone, and is
usable independently of CR4.PKS.  See the large comment in prot-key.h for
details of the context switching arrangement.
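The short-circuit in wrpkrs() can be sketched in isolation as below; the
stub MSR write and the plain globals are stand-ins for the real per-CPU
plumbing, not Xen code:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins (assumptions) for the per-CPU cache and the real WRMSR. */
static uint32_t pkrs_cache;       /* mimics this_cpu(pkrs), initially 0 */
static unsigned int msr_writes;

static void stub_wrmsr_pkrs(uint32_t val)
{
    (void)val;
    msr_writes++;
}

/* Same shape as wrpkrs(): only hit the MSR when the cached value differs. */
static void wrpkrs_cached(uint32_t pkrs)
{
    if ( pkrs_cache != pkrs )
    {
        pkrs_cache = pkrs;
        stub_wrmsr_pkrs(pkrs);
    }
}
```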

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Kevin Tian <kevin.tian@intel.com>

At a guess, we're likely to see PKS on AMD eventually, hence not putting the
DEFINE_PER_CPU() in vmx.c, but I'm at a total loss to find anywhere better to
put it than hvm.c.  Suggestions welcome.
---
 xen/arch/x86/hvm/hvm.c              |  3 +++
 xen/arch/x86/hvm/vmx/vmx.c          |  9 +++++++
 xen/arch/x86/include/asm/msr.h      |  8 +++++++
 xen/arch/x86/include/asm/prot-key.h | 48 +++++++++++++++++++++++++++++++++++++
 4 files changed, 68 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 350dc396e37c..63eaa3c5a66b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -69,6 +69,7 @@
 #include <asm/hvm/vm_event.h>
 #include <asm/altp2m.h>
 #include <asm/mtrr.h>
+#include <asm/prot-key.h>
 #include <asm/apic.h>
 #include <asm/vm_event.h>
 #include <public/sched.h>
@@ -117,6 +118,8 @@ static const char __initconst warning_hvm_fep[] =
 static bool_t __initdata opt_altp2m_enabled = 0;
 boolean_param("altp2m", opt_altp2m_enabled);
 
+DEFINE_PER_CPU(uint32_t, pkrs);
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index a7a0d662342a..2e6af1e1c033 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -58,6 +58,7 @@
 #include <asm/event.h>
 #include <asm/mce.h>
 #include <asm/monitor.h>
+#include <asm/prot-key.h>
 #include <public/arch-x86/cpuid.h>
 
 static bool_t __initdata opt_force_ept;
@@ -525,6 +526,7 @@ static void vmx_restore_host_msrs(void)
 
 static void vmx_save_guest_msrs(struct vcpu *v)
 {
+    const struct cpuid_policy *cp = v->domain->arch.cpuid;
     struct vcpu_msrs *msrs = v->arch.msrs;
 
     /*
@@ -538,10 +540,14 @@ static void vmx_save_guest_msrs(struct vcpu *v)
         rdmsrl(MSR_RTIT_OUTPUT_MASK, msrs->rtit.output_mask);
         rdmsrl(MSR_RTIT_STATUS, msrs->rtit.status);
     }
+
+    if ( cp->feat.pks )
+        msrs->pkrs = rdpkrs_and_cache();
 }
 
 static void vmx_restore_guest_msrs(struct vcpu *v)
 {
+    const struct cpuid_policy *cp = v->domain->arch.cpuid;
     const struct vcpu_msrs *msrs = v->arch.msrs;
 
     write_gs_shadow(v->arch.hvm.vmx.shadow_gs);
@@ -558,6 +564,9 @@ static void vmx_restore_guest_msrs(struct vcpu *v)
         wrmsrl(MSR_RTIT_OUTPUT_MASK, msrs->rtit.output_mask);
         wrmsrl(MSR_RTIT_STATUS, msrs->rtit.status);
     }
+
+    if ( cp->feat.pks )
+        wrpkrs(msrs->pkrs);
 }
 
 void vmx_update_cpu_exec_control(struct vcpu *v)
diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index 1d3eca9063a2..2ee0b68100c9 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -338,6 +338,14 @@ struct vcpu_msrs
         };
     } rtit;
 
+    /*
+     * 0x000006e1 - MSR_PKRS - Protection Key Supervisor.
+     *
+     * Exposed R/W to guests.  Xen doesn't use PKS yet, so only context
+     * switched per vcpu.  When in current context, live value is in hardware.
+     */
+    uint32_t pkrs;
+
     /* 0x00000da0 - MSR_IA32_XSS */
     struct {
         uint64_t raw;
diff --git a/xen/arch/x86/include/asm/prot-key.h b/xen/arch/x86/include/asm/prot-key.h
index 084b248d81a5..4387c27b7ec5 100644
--- a/xen/arch/x86/include/asm/prot-key.h
+++ b/xen/arch/x86/include/asm/prot-key.h
@@ -19,8 +19,11 @@
 #ifndef ASM_PROT_KEY_H
 #define ASM_PROT_KEY_H
 
+#include <xen/percpu.h>
 #include <xen/types.h>
 
+#include <asm/msr.h>
+
 #define PKEY_AD 1 /* Access Disable */
 #define PKEY_WD 2 /* Write Disable */
 
@@ -42,4 +45,49 @@ static inline void wrpkru(uint32_t pkru)
                    :: "a" (pkru), "d" (0), "c" (0) );
 }
 
+/*
+ * Xen does not use PKS.
+ *
+ * Guest kernel use is expected to be one default key, except for tiny windows
+ * with a double write to switch to a non-default key in a permitted critical
+ * section.
+ *
+ * As such, we want MSR_PKRS un-intercepted.  Furthermore, as we only need it
+ * in Xen for emulation or migration purposes (i.e. possibly never in a
+ * domain's lifetime), we don't want to re-sync the hardware value on every
+ * vmexit.
+ *
+ * Therefore, we read and cache the guest value in ctxt_switch_from(), in the
+ * expectation that we can short-circuit the write in ctxt_switch_to().
+ * During regular operations in current context, the guest value is in
+ * hardware and the per-cpu cache is stale.
+ */
+DECLARE_PER_CPU(uint32_t, pkrs);
+
+static inline uint32_t rdpkrs(void)
+{
+    uint32_t pkrs, tmp;
+
+    rdmsr(MSR_PKRS, pkrs, tmp);
+
+    return pkrs;
+}
+
+static inline uint32_t rdpkrs_and_cache(void)
+{
+    return this_cpu(pkrs) = rdpkrs();
+}
+
+static inline void wrpkrs(uint32_t pkrs)
+{
+    uint32_t *this_pkrs = &this_cpu(pkrs);
+
+    if ( *this_pkrs != pkrs )
+    {
+        *this_pkrs = pkrs;
+
+        wrmsr(MSR_PKRS, pkrs, 0);
+    }
+}
+
 #endif /* ASM_PROT_KEY_H */
-- 
2.11.0




* [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
                   ` (2 preceding siblings ...)
  2021-12-16  9:54 ` [PATCH 3/6] x86/hvm: Context switch MSR_PKRS Andrew Cooper
@ 2021-12-16  9:54 ` Andrew Cooper
  2021-12-21 11:58   ` Jan Beulich
  2021-12-16  9:54 ` [PATCH 5/6] x86/pagewalk: Support PKS Andrew Cooper
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

Have guest_{rd,wr}msr() access either the live register or the stashed state,
depending on context.  Include MSR_PKRS for migration, and let the guest have
full access.
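A toy model of the routing, with invented helper names standing in for the
guest_{rd,wr}msr() internals in the hunks below:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: the 'current' vcpu reads live hardware, others the stash. */
static uint32_t hw_pkrs = 0x11; /* pretend live MSR_PKRS value */

static uint32_t read_guest_pkrs(bool is_current, uint32_t stashed)
{
    return is_current ? hw_pkrs : stashed;
}

/* Writes #GP unless the value fits in 32 bits (MSR_PKRS has no high half). */
static bool pkrs_write_ok(uint64_t val)
{
    return val == (uint32_t)val;
}
```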

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/hvm.c     |  1 +
 xen/arch/x86/hvm/vmx/vmx.c |  5 +++++
 xen/arch/x86/msr.c         | 17 +++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 63eaa3c5a66b..e75245f36dce 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1372,6 +1372,7 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
 static const uint32_t msrs_to_send[] = {
     MSR_SPEC_CTRL,
     MSR_INTEL_MISC_FEATURES_ENABLES,
+    MSR_PKRS,
     MSR_IA32_BNDCFGS,
     MSR_IA32_XSS,
     MSR_AMD64_DR0_ADDRESS_MASK,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2e6af1e1c033..2288ea54f0b5 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -632,6 +632,11 @@ static void vmx_cpuid_policy_changed(struct vcpu *v)
         vmx_clear_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
     else
         vmx_set_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
+
+    if ( cp->feat.pks )
+        vmx_clear_msr_intercept(v, MSR_PKRS, VMX_MSR_RW);
+    else
+        vmx_set_msr_intercept(v, MSR_PKRS, VMX_MSR_RW);
 }
 
 int vmx_guest_x86_mode(struct vcpu *v)
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index b834456c7b02..d2569a81b7ba 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -28,6 +28,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/viridian.h>
 #include <asm/msr.h>
+#include <asm/prot-key.h>
 #include <asm/setup.h>
 
 #include <public/hvm/params.h>
@@ -315,6 +316,13 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = 0;
         break;
 
+    case MSR_PKRS:
+        if ( !cp->feat.pks )
+            goto gp_fault;
+
+        *val = (v == curr) ? rdpkrs() : msrs->pkrs;
+        break;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -581,6 +589,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
             break;
         goto gp_fault;
 
+    case MSR_PKRS:
+        if ( !cp->feat.pks || val != (uint32_t)val )
+            goto gp_fault;
+
+        msrs->pkrs = val;
+        if ( v == curr )
+            wrmsr(MSR_PKRS, val, 0);
+        break;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
-- 
2.11.0




* [PATCH 5/6] x86/pagewalk: Support PKS
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
                   ` (3 preceding siblings ...)
  2021-12-16  9:54 ` [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
@ 2021-12-16  9:54 ` Andrew Cooper
  2021-12-21 12:07   ` Jan Beulich
  2021-12-16  9:54 ` [PATCH 6/6] x86/hvm: " Andrew Cooper
  2021-12-16 21:09 ` [PATCH 0/6] x86: " Andrew Cooper
  6 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

PKS is incredibly similar to the existing PKU behaviour, operating on
pagewalks for any supervisor mapping.
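The combined check can be modelled roughly as follows, with the PFEC/CR0
flag handling simplified to booleans; names are illustrative rather than
Xen's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKEY_AD    1
#define PKEY_WD    2
#define PKEY_WIDTH 2

/*
 * Simplified model of the pagewalk check: AD always faults; WD faults for
 * writes when the access is user mode or CR0.WP is set.
 */
static bool pk_faults(uint32_t pkr, unsigned int pkey,
                      bool write, bool user_or_wp)
{
    unsigned int pk_ar = pkr >> (pkey * PKEY_WIDTH);

    return (pk_ar & PKEY_AD) ||
           (write && (pk_ar & PKEY_WD) && user_or_wp);
}
```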

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/include/asm/guest_pt.h | 5 +++++
 xen/arch/x86/include/asm/hvm/hvm.h  | 3 +++
 xen/arch/x86/mm/guest_walk.c        | 9 +++++----
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/include/asm/guest_pt.h b/xen/arch/x86/include/asm/guest_pt.h
index 6647ccfb8520..6802db2a415a 100644
--- a/xen/arch/x86/include/asm/guest_pt.h
+++ b/xen/arch/x86/include/asm/guest_pt.h
@@ -282,6 +282,11 @@ static always_inline bool guest_pku_enabled(const struct vcpu *v)
     return !is_pv_vcpu(v) && hvm_pku_enabled(v);
 }
 
+static always_inline bool guest_pks_enabled(const struct vcpu *v)
+{
+    return !is_pv_vcpu(v) && hvm_pks_enabled(v);
+}
+
 /* Helpers for identifying whether guest entries have reserved bits set. */
 
 /* Bits reserved because of maxphysaddr, and (lack of) EFER.NX */
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index bd2cbb0e7baf..ffef7ed075a7 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -394,6 +394,8 @@ int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value);
     ((v)->arch.hvm.guest_efer & EFER_NXE)
 #define hvm_pku_enabled(v) \
     (hvm_paging_enabled(v) && ((v)->arch.hvm.guest_cr[4] & X86_CR4_PKE))
+#define hvm_pks_enabled(v) \
+    (hvm_paging_enabled(v) && ((v)->arch.hvm.guest_cr[4] & X86_CR4_PKS))
 
 /* Can we use superpages in the HAP p2m table? */
 #define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
@@ -868,6 +870,7 @@ static inline int hvm_vmtrace_get_option(
 #define hvm_smap_enabled(v) ((void)(v), false)
 #define hvm_nx_enabled(v) ((void)(v), false)
 #define hvm_pku_enabled(v) ((void)(v), false)
+#define hvm_pks_enabled(v) ((void)(v), false)
 
 #define arch_vcpu_block(v) ((void)(v))
 
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index dc8fdde0212e..8670d4990a11 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -406,16 +406,17 @@ guest_walk_tables(const struct vcpu *v, struct p2m_domain *p2m,
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
     /*
      * If all access checks are thus far ok, check Protection Key for 64bit
-     * data accesses to user mappings.
+     * data accesses.
      *
      * N.B. In the case that the walk ended with a superpage, the fabricated
      * gw->l1e contains the appropriate leaf pkey.
      */
-    if ( (ar & _PAGE_USER) && !(walk & PFEC_insn_fetch) &&
-         guest_pku_enabled(v) )
+    if ( !(walk & PFEC_insn_fetch) &&
+         ((ar & _PAGE_USER) ? guest_pku_enabled(v)
+                            : guest_pks_enabled(v)) )
     {
         unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
-        unsigned int pkr = rdpkru();
+        unsigned int pkr = (ar & _PAGE_USER) ? rdpkru() : rdpkrs();
         unsigned int pk_ar = pkr >> (pkey * PKEY_WIDTH);
 
         if ( (pk_ar & PKEY_AD) ||
-- 
2.11.0




* [PATCH 6/6] x86/hvm: Support PKS
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
                   ` (4 preceding siblings ...)
  2021-12-16  9:54 ` [PATCH 5/6] x86/pagewalk: Support PKS Andrew Cooper
@ 2021-12-16  9:54 ` Andrew Cooper
  2021-12-21 12:18   ` Jan Beulich
  2021-12-16 21:09 ` [PATCH 0/6] x86: " Andrew Cooper
  6 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16  9:54 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

With all infrastructure in place, advertise the PKS CPUID bit to guests, and
let them set CR4.PKS.

Experiment with a tweak to the layout of hvm_cr4_guest_valid_bits() so future
additions will be just a single added line.

The current context switching behaviour is tied to how VT-x works, so leave a
safety check in the short term.
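The layout tweak amounts to terminating the expression with a literal zero,
so each new feature contributes exactly one line; a cut-down sketch with
hypothetical parameters in place of the policy structure:

```c
#include <assert.h>
#include <stdbool.h>

#define X86_CR4_PKE 0x00400000ul
#define X86_CR4_PKS 0x01000000ul

/* The trailing "| 0" means each new feature is exactly one added line. */
static unsigned long cr4_valid_bits(bool pku, bool pks)
{
    return ((pku ? X86_CR4_PKE : 0) |
            (pks ? X86_CR4_PKS : 0) |
            0);
}
```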

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/cpuid.c                        | 9 +++++++++
 xen/arch/x86/hvm/hvm.c                      | 4 +++-
 xen/include/public/arch-x86/cpufeatureset.h | 2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 151944f65702..03653d3766f4 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -512,6 +512,15 @@ static void __init calculate_hvm_max_policy(void)
             __clear_bit(X86_FEATURE_XSAVES, hvm_featureset);
     }
 
+    /*
+     * Xen doesn't use PKS, so the guest support for it has opted to not use
+     * the VMCS load/save controls for efficiency reasons.  This depends on
+     * the exact vmentry/exit behaviour, so don't expose PKS in other
+     * situations until someone has cross-checked the behaviour for safety.
+     */
+    if ( !cpu_has_vmx )
+        __clear_bit(X86_FEATURE_PKS, hvm_featureset);
+
     guest_common_feature_adjustments(hvm_featureset);
 
     sanitise_featureset(hvm_featureset);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e75245f36dce..2552e7f45499 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1010,7 +1010,9 @@ unsigned long hvm_cr4_guest_valid_bits(const struct domain *d)
             (p->feat.smep     ? X86_CR4_SMEP              : 0) |
             (p->feat.smap     ? X86_CR4_SMAP              : 0) |
             (p->feat.pku      ? X86_CR4_PKE               : 0) |
-            (cet              ? X86_CR4_CET               : 0));
+            (cet              ? X86_CR4_CET               : 0) |
+            (p->feat.pks      ? X86_CR4_PKS               : 0) |
+            0);
 }
 
 static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 79a8f244d88a..92ec9eed3fd1 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -244,7 +244,7 @@ XEN_CPUFEATURE(CLDEMOTE,      6*32+25) /*A  CLDEMOTE instruction */
 XEN_CPUFEATURE(MOVDIRI,       6*32+27) /*a  MOVDIRI instruction */
 XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /*a  MOVDIR64B instruction */
 XEN_CPUFEATURE(ENQCMD,        6*32+29) /*   ENQCMD{,S} instructions */
-XEN_CPUFEATURE(PKS,           6*32+31) /*   Protection Key for Supervisor */
+XEN_CPUFEATURE(PKS,           6*32+31) /*H  Protection Key for Supervisor */
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(HW_PSTATE,     7*32+ 7) /*   Hardware Pstates */
-- 
2.11.0




* Re: [PATCH 0/6] x86: Support PKS
  2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
                   ` (5 preceding siblings ...)
  2021-12-16  9:54 ` [PATCH 6/6] x86/hvm: " Andrew Cooper
@ 2021-12-16 21:09 ` Andrew Cooper
  6 siblings, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2021-12-16 21:09 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Jan Beulich, Roger Pau Monné, Wei Liu, Kevin Tian

On 16/12/2021 09:54, Andrew Cooper wrote:
> I found a spare half hour, and this turned out to go very smoothly.
>
> It's tentatively RFC right now, because I'm still adding PKS support to the
> XTF comprehensive pagewalk test, but the series is definitely fit for review
> at this point.

I suppose it's worth expanding on this a little.  What I've proposed
here is the most efficient option, and it is very non-invasive but comes
with the downside that Xen can't set CR4.PKS.

It is tied to VT-x behaviour, so I've left a deliberate clobber so it
won't engage automatically if AMD add support on future CPUs.


If we want Xen to be able to use PKS, then a couple of things change.

1) PV32 needs inhibiting.  This is likely the case anyway, due to CET.
2) VT-x will need to start using the PKRS load/save controls
2a) Need new get/set_pkrs hvm_funcs accessors to abstract the
VMREAD/WRITE out of common code.
2b) guest_{rd,wr}msr() and pagewalk updated to cope
3) Whatever AMD needs (if applicable).


In terms of Xen using PKS, the first piece of low-hanging fruit is
removing access to the stubs by default, to prevent stray writes from
interfering with other CPUs.

Changing PKEY is a WRMSR, so not the fastest action in the world even if
it is well optimised in microcode, but modification of the stubs is not
a fastpath, so this would be entirely fine.
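A stub-level sketch of that double-write pattern, with fake MSR accessors
in place of the real rdmsr/wrmsr and an invented wrapper name:

```c
#include <assert.h>
#include <stdint.h>

/* Fake MSR accessors (assumptions) standing in for rdmsr/wrmsr of PKRS. */
static uint32_t fake_pkrs = 0x8; /* key 1 write-disabled by default */
static unsigned int patched;

static uint32_t rd(void) { return fake_pkrs; }
static void wr(uint32_t v) { fake_pkrs = v; }

static void do_patch(void) { patched++; }

/* The "double write" window: relax key 1, modify the stubs, restore. */
static void with_stubs_writable(void (*patch)(void))
{
    uint32_t old = rd();

    wr(old & ~(uint32_t)(2u << 2)); /* clear WD (bit 1 of key 1's field) */
    patch();
    wr(old);                        /* back to the restrictive default */
}
```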

~Andrew



* Re: [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor
  2021-12-16  9:54 ` [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
@ 2021-12-21 11:15   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-12-21 11:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 16.12.2021 10:54, Andrew Cooper wrote:
> Protection Key Supervisor works in a very similar way to Protection Key User,
> except that instead of a PKRU register used by the {RD,WR}PKRU instructions,
> the supervisor protection settings live in MSR_PKRS and are accessed using
> normal {RD,WR}MSR instructions.
> 
> PKS has the same problematic interactions with PV guests as PKU (more so, in
> fact, given the guest kernel's CPL), so we'll only support this for HVM
> guests for now.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  2021-12-16  9:54 ` [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
@ 2021-12-21 11:28   ` Jan Beulich
  2023-01-09 14:57     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-12-21 11:28 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 16.12.2021 10:54, Andrew Cooper wrote:
> asm/processor.h is in desperate need of splitting up, and protection key
> functionality is only used in the emulator and pagewalk.  Introduce a new
> asm/prot-key.h and move the relevant content over.
> 
> Rename the PKRU_* constants to drop the user part and to use the architectural
> terminology.
> 
> Drop the read_pkru_{ad,wd}() helpers entirely.  The pkru infix is about to
> become wrong, and the sole user is shorter and easier to follow without the
> helpers.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

This looks functionally correct, so in principle
Reviewed-by: Jan Beulich <jbeulich@suse.com>
But I have two remarks:

> --- /dev/null
> +++ b/xen/arch/x86/include/asm/prot-key.h
> @@ -0,0 +1,45 @@
> +/******************************************************************************
> + * arch/x86/include/asm/prot-key.h
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (c) 2021 Citrix Systems Ltd.
> + */
> +#ifndef ASM_PROT_KEY_H
> +#define ASM_PROT_KEY_H
> +
> +#include <xen/types.h>
> +
> +#define PKEY_AD 1 /* Access Disable */
> +#define PKEY_WD 2 /* Write Disable */
> +
> +#define PKEY_WIDTH 2 /* Two bits per protection key */
> +
> +static inline uint32_t rdpkru(void)
> +{
> +    uint32_t pkru;

I agree this wants to be uint32_t (i.e. unlike the original function),
but I don't see why the function's return type needs to be, the more
that the sole caller also uses unsigned int for the variable to store
the result in.

> +    asm volatile ( ".byte 0x0f,0x01,0xee"
> +                   : "=a" (pkru) : "c" (0) : "dx" );
> +
> +    return pkru;
> +}
> +
> +static inline void wrpkru(uint32_t pkru)

(To avoid an intermediate local variable, I can agree with using
uint32_t for the parameter type directly here.)

> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -26,7 +26,9 @@
>  #include <xen/paging.h>
>  #include <xen/domain_page.h>
>  #include <xen/sched.h>
> +
>  #include <asm/page.h>
> +#include <asm/prot-key.h>
>  #include <asm/guest_pt.h>
>  #include <asm/hvm/emulate.h>
>  
> @@ -413,10 +415,11 @@ guest_walk_tables(const struct vcpu *v, struct p2m_domain *p2m,
>           guest_pku_enabled(v) )
>      {
>          unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
> -        unsigned int pkru = rdpkru();
> +        unsigned int pkr = rdpkru();
> +        unsigned int pk_ar = pkr >> (pkey * PKEY_WIDTH);

This is correct only because below you only inspect the low two bits.
Since I don't think masking off the upper bits is really useful here,
I'd like to suggest to not call the variable "pk_ar". Perhaps
something as generic as "temp"?

Jan




* Re: [PATCH 3/6] x86/hvm: Context switch MSR_PKRS
  2021-12-16  9:54 ` [PATCH 3/6] x86/hvm: Context switch MSR_PKRS Andrew Cooper
@ 2021-12-21 11:56   ` Jan Beulich
  2023-01-09 16:55     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-12-21 11:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Kevin Tian, Xen-devel

On 16.12.2021 10:54, Andrew Cooper wrote:
> Under PKS, MSR_PKRS is available based on the CPUID policy alone, and is
> usable independently of CR4.PKS.  See the large comment in prot-key.h for
> details of the context switching arrangement.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Kevin Tian <kevin.tian@intel.com>
> 
> At a guess, we're likely to see PKS on AMD eventually, hence not putting the
> DEFINE_PER_CPU() in vmx.c, but I'm at a total loss to find anywhere better to
> put it than hvm.c.  Suggestions welcome.

That's fine for now imo. hvm.c is another candidate for splitting up,
at which point a better place may surface. (I would even be willing
to make an attempt in that direction, if only I knew the results
wouldn't end up stuck again, just like appears to have happened for
the p2m / physmap etc splitting.)

> @@ -42,4 +45,49 @@ static inline void wrpkru(uint32_t pkru)
>                     :: "a" (pkru), "d" (0), "c" (0) );
>  }
>  
> +/*
> + * Xen does not use PKS.
> + *
> + * Guest kernel use is expected to be one default key, except for tiny windows
> + * with a double write to switch to a non-default key in a permitted critical
> + * section.
> + *
> + * As such, we want MSR_PKRS un-intercepted.  Furthermore, as we only need it
> + * in Xen for emulation or migration purposes (i.e. possibly never in a
> + * domain's lifetime), we don't want to re-sync the hardware value on every
> + * vmexit.
> + *
> + * Therefore, we read and cache the guest value in ctxt_switch_from(), in the
> + * expectation that we can short-circuit the write in ctxt_switch_to().
> + * During regular operations in current context, the guest value is in
> + * hardware and the per-cpu cache is stale.
> + */
> +DECLARE_PER_CPU(uint32_t, pkrs);
> +
> +static inline uint32_t rdpkrs(void)
> +{
> +    uint32_t pkrs, tmp;
> +
> +    rdmsr(MSR_PKRS, pkrs, tmp);
> +
> +    return pkrs;
> +}
> +
> +static inline uint32_t rdpkrs_and_cache(void)
> +{
> +    return this_cpu(pkrs) = rdpkrs();
> +}
> +
> +static inline void wrpkrs(uint32_t pkrs)
> +{
> +    uint32_t *this_pkrs = &this_cpu(pkrs);
> +
> +    if ( *this_pkrs != pkrs )

For this to work, I think we need to clear PKRS during CPU init; I
admit I didn't peek ahead in the series to check whether you do so
later on in the series. At least the version of the SDM I'm looking
at doesn't even specify reset state of 0 for this MSR. But even if
it did, it would likely be as for PKRU - unchanged after INIT. Yet
INIT is all that CPUs go through when e.g. parking / unparking them.

Jan




* Re: [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS
  2021-12-16  9:54 ` [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
@ 2021-12-21 11:58   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-12-21 11:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Kevin Tian, Xen-devel

On 16.12.2021 10:54, Andrew Cooper wrote:
> Have guest_{rd,wr}msr() access either the live register, or stashed state,
> depending on context.  Include MSR_PKRS for migration, and let the guest have
> full access.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 5/6] x86/pagewalk: Support PKS
  2021-12-16  9:54 ` [PATCH 5/6] x86/pagewalk: Support PKS Andrew Cooper
@ 2021-12-21 12:07   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-12-21 12:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 16.12.2021 10:54, Andrew Cooper wrote:
> PKS is incredibly similar to the existing PKU behaviour, operating on
> pagewalks for any supervisor mapping.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 6/6] x86/hvm: Support PKS
  2021-12-16  9:54 ` [PATCH 6/6] x86/hvm: " Andrew Cooper
@ 2021-12-21 12:18   ` Jan Beulich
  2023-01-09 17:01     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-12-21 12:18 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 16.12.2021 10:54, Andrew Cooper wrote:
> With all infrastructure in place, advertise the PKS CPUID bit to guests, and
> let them set CR4.PKS.
> 
> Experiment with a tweak to the layout of hvm_cr4_guest_valid_bits() so future
> additions will be just a single added line.
> 
> The current context switching behaviour is tied to how VT-x works, so leave a
> safety check in the short term.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

I would like to ask though that you ...

> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -244,7 +244,7 @@ XEN_CPUFEATURE(CLDEMOTE,      6*32+25) /*A  CLDEMOTE instruction */
>  XEN_CPUFEATURE(MOVDIRI,       6*32+27) /*a  MOVDIRI instruction */
>  XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /*a  MOVDIR64B instruction */
>  XEN_CPUFEATURE(ENQCMD,        6*32+29) /*   ENQCMD{,S} instructions */
> -XEN_CPUFEATURE(PKS,           6*32+31) /*   Protection Key for Supervisor */
> +XEN_CPUFEATURE(PKS,           6*32+31) /*H  Protection Key for Supervisor */

... clarify this restriction of not covering shadow mode guests by
an adjustment to title or description. Aiui the sole reason for
the restriction is that shadow code doesn't propagate the key bits
from guest to shadow PTEs?

Jan




* Re: [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  2021-12-21 11:28   ` Jan Beulich
@ 2023-01-09 14:57     ` Andrew Cooper
  2023-01-09 16:30       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2023-01-09 14:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 21/12/2021 11:28 am, Jan Beulich wrote:
> On 16.12.2021 10:54, Andrew Cooper wrote:
>> --- /dev/null
>> +++ b/xen/arch/x86/include/asm/prot-key.h
>> @@ -0,0 +1,45 @@
>> +/******************************************************************************
>> + * arch/x86/include/asm/prot-key.h
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + * Copyright (c) 2021 Citrix Systems Ltd.
>> + */
>> +#ifndef ASM_PROT_KEY_H
>> +#define ASM_PROT_KEY_H
>> +
>> +#include <xen/types.h>
>> +
>> +#define PKEY_AD 1 /* Access Disable */
>> +#define PKEY_WD 2 /* Write Disable */
>> +
>> +#define PKEY_WIDTH 2 /* Two bits per protection key */
>> +
>> +static inline uint32_t rdpkru(void)
>> +{
>> +    uint32_t pkru;
> I agree this wants to be uint32_t (i.e. unlike the original function),
> but I don't see why the function's return type needs to be, the more
> that the sole caller also uses unsigned int for the variable to store
> the result in.

This is the thinnest-possible wrapper around an instruction which
architecturally returns exactly 32 bits of data.

It is literally the example that CODING_STYLE uses to demonstrate when
fixed width types should be used.

>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -26,7 +26,9 @@
>>  #include <xen/paging.h>
>>  #include <xen/domain_page.h>
>>  #include <xen/sched.h>
>> +
>>  #include <asm/page.h>
>> +#include <asm/prot-key.h>
>>  #include <asm/guest_pt.h>
>>  #include <asm/hvm/emulate.h>
>>  
>> @@ -413,10 +415,11 @@ guest_walk_tables(const struct vcpu *v, struct p2m_domain *p2m,
>>           guest_pku_enabled(v) )
>>      {
>>          unsigned int pkey = guest_l1e_get_pkey(gw->l1e);
>> -        unsigned int pkru = rdpkru();
>> +        unsigned int pkr = rdpkru();
>> +        unsigned int pk_ar = pkr >> (pkey * PKEY_WIDTH);
> This is correct only because below you only inspect the low two bits.
> Since I don't think masking off the upper bits is really useful here,
> I'd like to suggest to not call the variable "pk_ar". Perhaps
> something as generic as "temp"?

This variable being named pk_ar (or thereabouts) is very important for
clarity below.

I've masked them - seems the compiler is clever enough to undo that even
at -O1.

~Andrew


* Re: [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h
  2023-01-09 14:57     ` Andrew Cooper
@ 2023-01-09 16:30       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2023-01-09 16:30 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 09.01.2023 15:57, Andrew Cooper wrote:
> On 21/12/2021 11:28 am, Jan Beulich wrote:
>> On 16.12.2021 10:54, Andrew Cooper wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/x86/include/asm/prot-key.h
>>> @@ -0,0 +1,45 @@
>>> +/******************************************************************************
>>> + * arch/x86/include/asm/prot-key.h
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License as published by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>>> + *
>>> + * Copyright (c) 2021 Citrix Systems Ltd.
>>> + */
>>> +#ifndef ASM_PROT_KEY_H
>>> +#define ASM_PROT_KEY_H
>>> +
>>> +#include <xen/types.h>
>>> +
>>> +#define PKEY_AD 1 /* Access Disable */
>>> +#define PKEY_WD 2 /* Write Disable */
>>> +
>>> +#define PKEY_WIDTH 2 /* Two bits per protection key */
>>> +
>>> +static inline uint32_t rdpkru(void)
>>> +{
>>> +    uint32_t pkru;
>> I agree this wants to be uint32_t (i.e. unlike the original function),
>> but I don't see why the function's return type needs to be, the more
>> that the sole caller also uses unsigned int for the variable to store
>> the result in.
> 
> This is the thinnest-possible wrapper around an instruction which
> architecturally returns exactly 32 bits of data.
> 
> It is literally the example that CODING_STYLE uses to demonstrate when
> fixed width types should be used.

I don't read it like that, but I guess we're simply drawing the line in
different places (and agreeing on one place would be nice). To me using
uint32_t for a variable accessed by an asm() is what is meant. But that
then (to me) doesn't extend to the function return type here.

But no, I don't mean to block this change just because of this aspect.
We've got many far worse uses of types, which bother me more. It merely
would be nice if new code (regardless of the contributor) ended up all
consistent in this regard.

Jan



* Re: [PATCH 3/6] x86/hvm: Context switch MSR_PKRS
  2021-12-21 11:56   ` Jan Beulich
@ 2023-01-09 16:55     ` Andrew Cooper
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2023-01-09 16:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Kevin Tian, Xen-devel

On 21/12/2021 11:56 am, Jan Beulich wrote:
> On 16.12.2021 10:54, Andrew Cooper wrote:
>> @@ -42,4 +45,49 @@ static inline void wrpkru(uint32_t pkru)
>>                     :: "a" (pkru), "d" (0), "c" (0) );
>>  }
>>  
>> +/*
>> + * Xen does not use PKS.
>> + *
>> + * Guest kernel use is expected to be one default key, except for tiny windows
>> + * with a double write to switch to a non-default key in a permitted critical
>> + * section.
>> + *
>> + * As such, we want MSR_PKRS un-intercepted.  Furthermore, as we only need it
>> + * in Xen for emulation or migration purposes (i.e. possibly never in a
>> + * domain's lifetime), we don't want to re-sync the hardware value on every
>> + * vmexit.
>> + *
>> + * Therefore, we read and cache the guest value in ctxt_switch_from(), in the
>> + * expectation that we can short-circuit the write in ctxt_switch_to().
>> + * During regular operations in current context, the guest value is in
>> + * hardware and the per-cpu cache is stale.
>> + */
>> +DECLARE_PER_CPU(uint32_t, pkrs);
>> +
>> +static inline uint32_t rdpkrs(void)
>> +{
>> +    uint32_t pkrs, tmp;
>> +
>> +    rdmsr(MSR_PKRS, pkrs, tmp);
>> +
>> +    return pkrs;
>> +}
>> +
>> +static inline uint32_t rdpkrs_and_cache(void)
>> +{
>> +    return this_cpu(pkrs) = rdpkrs();
>> +}
>> +
>> +static inline void wrpkrs(uint32_t pkrs)
>> +{
>> +    uint32_t *this_pkrs = &this_cpu(pkrs);
>> +
>> +    if ( *this_pkrs != pkrs )
> For this to work, I think we need to clear PKRS during CPU init; I
> admit I didn't peek ahead in the series to check whether you do so
> later on in the series. At least the version of the SDM I'm looking
> at doesn't even specify reset state of 0 for this MSR. But even if
> it did, it would likely be as for PKRU - unchanged after INIT. Yet
> INIT is all that CPUs go through when e.g. parking / unparking them.

While trying to address this, I've noticed that we don't sanitise PKRU
during CPU init either.

This will explode in a fun way if e.g. we kexec into a new Xen, but
leave PKEY 0 with AD/WD, and try building a PV dom0.

As soon as we've fully context switched into a vCPU context, we'll pick
up the 0 from XSTATE and do the right thing by default.

~Andrew


* Re: [PATCH 6/6] x86/hvm: Support PKS
  2021-12-21 12:18   ` Jan Beulich
@ 2023-01-09 17:01     ` Andrew Cooper
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Cooper @ 2023-01-09 17:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Roger Pau Monne, Wei Liu, Xen-devel

On 21/12/2021 12:18 pm, Jan Beulich wrote:
> On 16.12.2021 10:54, Andrew Cooper wrote:
>> With all infrastructure in place, advertise the PKS CPUID bit to guests, and
>> let them set CR4.PKS.
>>
>> Experiment with a tweak to the layout of hvm_cr4_guest_valid_bits() so future
>> additions will be just a single added line.
>>
>> The current context switching behaviour is tied to how VT-x works, so leave a
>> safety check in the short term.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

>
> I would like to ask though that you ...
>
>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>> @@ -244,7 +244,7 @@ XEN_CPUFEATURE(CLDEMOTE,      6*32+25) /*A  CLDEMOTE instruction */
>>  XEN_CPUFEATURE(MOVDIRI,       6*32+27) /*a  MOVDIRI instruction */
>>  XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /*a  MOVDIR64B instruction */
>>  XEN_CPUFEATURE(ENQCMD,        6*32+29) /*   ENQCMD{,S} instructions */
>> -XEN_CPUFEATURE(PKS,           6*32+31) /*   Protection Key for Supervisor */
>> +XEN_CPUFEATURE(PKS,           6*32+31) /*H  Protection Key for Supervisor */
> ... clarify this restriction of not covering shadow mode guests by
> an adjustment to title or description. Aiui the sole reason for
> the restriction is that shadow code doesn't propagate the key bits
> from guest to shadow PTEs?

PKU is only exposed on HAP, so PKS really ought to match.

We indeed don't copy the leaf PKEY into the shadows.  While that ought
to be relatively easy to adjust, we would then have to make sh_page_fault()
cope with seeing PFEC_prot_key.

But honestly, there are far far more important things to spend time on.

~Andrew


end of thread, other threads:[~2023-01-09 17:02 UTC | newest]

Thread overview: 18+ messages
2021-12-16  9:54 [PATCH 0/6] x86: Support PKS Andrew Cooper
2021-12-16  9:54 ` [PATCH 1/6] x86/prot-key: Enumeration for Protection Key Supervisor Andrew Cooper
2021-12-21 11:15   ` Jan Beulich
2021-12-16  9:54 ` [PATCH 2/6] x86/prot-key: Split PKRU infrastructure out of asm/processor.h Andrew Cooper
2021-12-21 11:28   ` Jan Beulich
2023-01-09 14:57     ` Andrew Cooper
2023-01-09 16:30       ` Jan Beulich
2021-12-16  9:54 ` [PATCH 3/6] x86/hvm: Context switch MSR_PKRS Andrew Cooper
2021-12-21 11:56   ` Jan Beulich
2023-01-09 16:55     ` Andrew Cooper
2021-12-16  9:54 ` [PATCH 4/6] x86/hvm: Enable guest access to MSR_PKRS Andrew Cooper
2021-12-21 11:58   ` Jan Beulich
2021-12-16  9:54 ` [PATCH 5/6] x86/pagewalk: Support PKS Andrew Cooper
2021-12-21 12:07   ` Jan Beulich
2021-12-16  9:54 ` [PATCH 6/6] x86/hvm: " Andrew Cooper
2021-12-21 12:18   ` Jan Beulich
2023-01-09 17:01     ` Andrew Cooper
2021-12-16 21:09 ` [PATCH 0/6] x86: " Andrew Cooper
