All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 0/4] Make PAT handling less brittle
@ 2023-01-07 22:07 Demi Marie Obenour
  2023-01-07 22:07 ` [PATCH v7 1/4] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Demi Marie Obenour @ 2023-01-07 22:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan

While working on Qubes OS Marek found out that there were some PAT hacks
in the Linux i195 driver.  I decided to make Xen use Linux’s PAT to see
if it solved the graphics glitches that were observed; it did.  This
required a substantial amount of preliminary work that is useful even
without using Linux’s PAT.  Furthermore, it turned out that the graphics
glitches were due to a hardware bug, which means that Xen's PAT is
fundamentally incompatible with the use of current-generation Intel
integrated GPUs assigned to a PV guest (including dom0).

Patches 1 through 3 are the preliminary work.  Patch 3 does break ABI by
rejecting the unused PAT entries, but this will only impact buggy PV
guests and can be disabled with a Xen command-line option.  Patch 4
provides a new Kconfig option (CONFIG_LINUX_PAT) to use Linux's PAT
instead of Xen's default.

Only patches 3 and 4 actually change Xen’s observable behavior.  Patch 1
is strictly cleanup.  Patch 2 makes changing the PAT much less
error-prone, as problems with the PAT or with the associated _PAGE_*
constants will be detected at compile time.

Demi Marie Obenour (4):
  x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE
  x86/mm: Reject invalid cacheability in PV guests by default
  x86/mm: make code robust to future PAT changes
  x86: Allow using Linux's PAT

 docs/misc/xen-command-line.pandoc    |  11 ++
 xen/arch/x86/Kconfig                 |  33 ++++++
 xen/arch/x86/hvm/mtrr.c              |  18 ++--
 xen/arch/x86/include/asm/mtrr.h      |   2 -
 xen/arch/x86/include/asm/page.h      |  14 +++
 xen/arch/x86/include/asm/processor.h |  15 +++
 xen/arch/x86/include/asm/pv/domain.h |   7 ++
 xen/arch/x86/mm.c                    | 151 ++++++++++++++++++++++++++-
 xen/arch/x86/mm/shadow/multi.c       |   2 +-
 xen/arch/x86/pv/domain.c             |  18 +++-
 10 files changed, 255 insertions(+), 16 deletions(-)

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 1/4] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE
  2023-01-07 22:07 [PATCH v7 0/4] Make PAT handling less brittle Demi Marie Obenour
@ 2023-01-07 22:07 ` Demi Marie Obenour
  2023-01-07 22:07 ` [PATCH v7 2/4] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Demi Marie Obenour @ 2023-01-07 22:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan

No functional change intended.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v2:

- Keep MTRR_NUM_TYPES and adjust commit message accordingly
---
 xen/arch/x86/hvm/mtrr.c         | 18 +++++++++---------
 xen/arch/x86/include/asm/mtrr.h |  2 --
 xen/arch/x86/mm/shadow/multi.c  |  2 +-
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 093103f6c768cf64f880d1b20e1c14f5918c1250..05e978041d62fd0d559462de181a04bef8a5bca9 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -38,7 +38,7 @@ static const uint8_t pat_entry_2_pte_flags[8] = {
 
 /* Effective mm type lookup table, according to MTRR and PAT. */
 static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][X86_NUM_MT] = {
-#define RS MEMORY_NUM_TYPES
+#define RS MTRR_NUM_TYPES
 #define UC X86_MT_UC
 #define WB X86_MT_WB
 #define WC X86_MT_WC
@@ -66,9 +66,9 @@ static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][X86_NUM_MT] = {
  * Reverse lookup table, to find a pat type according to MTRR and effective
  * memory type. This table is dynamically generated.
  */
-static uint8_t __read_mostly mtrr_epat_tbl[MTRR_NUM_TYPES][MEMORY_NUM_TYPES] =
-    { [0 ... MTRR_NUM_TYPES-1] =
-        { [0 ... MEMORY_NUM_TYPES-1] = INVALID_MEM_TYPE }
+static uint8_t __read_mostly mtrr_epat_tbl[MTRR_NUM_TYPES][MTRR_NUM_TYPES] =
+    { [0 ... MTRR_NUM_TYPES - 1] =
+        { [0 ... MTRR_NUM_TYPES - 1] = INVALID_MEM_TYPE }
     };
 
 /* Lookup table for PAT entry of a given PAT value in host PAT. */
@@ -85,7 +85,7 @@ static int __init cf_check hvm_mtrr_pat_init(void)
         {
             unsigned int tmp = mm_type_tbl[i][j];
 
-            if ( tmp < MEMORY_NUM_TYPES )
+            if ( tmp < MTRR_NUM_TYPES )
                 mtrr_epat_tbl[i][tmp] = j;
         }
     }
@@ -317,11 +317,11 @@ static uint8_t effective_mm_type(struct mtrr_state *m,
                                  uint8_t gmtrr_mtype)
 {
     uint8_t mtrr_mtype, pat_value;
-   
+
     /* if get_pat_flags() gives a dedicated MTRR type,
      * just use it
-     */ 
-    if ( gmtrr_mtype == NO_HARDCODE_MEM_TYPE )
+     */
+    if ( gmtrr_mtype == MTRR_NUM_TYPES )
         mtrr_mtype = mtrr_get_type(m, gpa, 0);
     else
         mtrr_mtype = gmtrr_mtype;
@@ -346,7 +346,7 @@ uint32_t get_pat_flags(struct vcpu *v,
     /* 1. Get the effective memory type of guest physical address,
      * with the pair of guest MTRR and PAT
      */
-    guest_eff_mm_type = effective_mm_type(g, pat, gpaddr, 
+    guest_eff_mm_type = effective_mm_type(g, pat, gpaddr,
                                           gl1e_flags, gmtrr_mtype);
     /* 2. Get the memory type of host physical address, with MTRR */
     shadow_mtrr_type = mtrr_get_type(&mtrr_state, spaddr, 0);
diff --git a/xen/arch/x86/include/asm/mtrr.h b/xen/arch/x86/include/asm/mtrr.h
index e4f6ca6048334b2094a1836cc2f298453641232f..4b7f840a965954cc4b59698327a37e47026893a4 100644
--- a/xen/arch/x86/include/asm/mtrr.h
+++ b/xen/arch/x86/include/asm/mtrr.h
@@ -4,8 +4,6 @@
 #include <xen/mm.h>
 
 #define MTRR_NUM_TYPES       X86_MT_UCM
-#define MEMORY_NUM_TYPES     MTRR_NUM_TYPES
-#define NO_HARDCODE_MEM_TYPE MTRR_NUM_TYPES
 
 #define NORMAL_CACHE_MODE          0
 #define NO_FILL_CACHE_MODE         2
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index f5f7ff021bd9e057c5b6f6329de7acb5ef05d58f..1faf9940db6b0afefc5977c00c00fb6a39cd27d2 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -578,7 +578,7 @@ _sh_propagate(struct vcpu *v,
                             gflags,
                             gfn_to_paddr(target_gfn),
                             mfn_to_maddr(target_mfn),
-                            NO_HARDCODE_MEM_TYPE);
+                            MTRR_NUM_TYPES);
             }
     }
 
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v7 2/4] x86/mm: Reject invalid cacheability in PV guests by default
  2023-01-07 22:07 [PATCH v7 0/4] Make PAT handling less brittle Demi Marie Obenour
  2023-01-07 22:07 ` [PATCH v7 1/4] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
@ 2023-01-07 22:07 ` Demi Marie Obenour
  2023-01-09 13:53   ` Jan Beulich
  2023-01-07 22:07 ` [PATCH v7 3/4] x86/mm: make code robust to future PAT changes Demi Marie Obenour
  2023-01-07 22:07 ` [PATCH v7 4/4] x86: Allow using Linux's PAT Demi Marie Obenour
  3 siblings, 1 reply; 9+ messages in thread
From: Demi Marie Obenour @ 2023-01-07 22:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan

Setting cacheability flags that are not ones specified by Xen is a bug
in the guest.  By default, return -EINVAL if a guests attempts to do
this.  The invalid-cacheability= Xen command-line flag allows the
administrator to allow such attempts or to produce

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v6:
- Make invalid-cacheability= a subflag of pv=.
- Move check for invalid cacheability to get_page_from_l1e().

Changes since v5:
- Make parameters static and __ro_after_init.
- Replace boolean parameter allow_invalid_cacheability with string
  parameter invalid-cacheability.
- Move parameter definitions to near where they are used.
- Add documentation.

Changes since v4:
- Remove pointless BUILD_BUG_ON().
- Add comment explaining why an exception is being injected.

Changes since v3:
- Add Andrew Cooper’s Suggested-by
---
 docs/misc/xen-command-line.pandoc    | 11 ++++++
 xen/arch/x86/include/asm/pv/domain.h |  7 ++++
 xen/arch/x86/mm.c                    | 53 +++++++++++++++++++++++++++-
 xen/arch/x86/pv/domain.c             | 18 ++++++++--
 4 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 424b12cfb27d6ade2ec63eacb8afe5df82465451..0230a7bc17cbd4362a42ea64cea695f31f5e0f86 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1417,6 +1417,17 @@ detection of systems known to misbehave upon accesses to that port.
 ### idle_latency_factor (x86)
 > `= <integer>`
 
+### invalid-cacheability (x86)
+> `= allow | deny | trap`
+
+> Default: `deny` in release builds, otherwise `trap`
+
+Specify what happens when a PV guest tries to use one of the reserved entries in
+the PAT.  `deny` causes the attempt to be rejected with -EINVAL, `allow` allows
+the attempt, and `trap` causes a general protection fault to be raised.
+Currently, the reserved entries are marked as uncacheable in Xen's PAT, but this
+will change if new memory types are added, so guests must not rely on it.
+
 ### ioapic_ack (x86)
 > `= old | new`
 
diff --git a/xen/arch/x86/include/asm/pv/domain.h b/xen/arch/x86/include/asm/pv/domain.h
index 924508bbb4f0c199b3cd2306d9d8f0bd0ef399f9..1c9ce259ab4ee23ea5d057f5dfa964effb169032 100644
--- a/xen/arch/x86/include/asm/pv/domain.h
+++ b/xen/arch/x86/include/asm/pv/domain.h
@@ -71,6 +71,13 @@ void pv_vcpu_destroy(struct vcpu *v);
 int pv_vcpu_initialise(struct vcpu *v);
 void pv_domain_destroy(struct domain *d);
 int pv_domain_initialise(struct domain *d);
+extern __ro_after_init uint8_t invalid_cacheability;
+
+enum {
+    INVALID_CACHEABILITY_ALLOW,
+    INVALID_CACHEABILITY_DENY,
+    INVALID_CACHEABILITY_TRAP,
+};
 
 /*
  * Bits which a PV guest can toggle in its view of cr4.  Some are loaded into
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 3558ca215b02a517d55d75329d645ae5905424e4..a8f137925cba1846b97aee9321df6427f4dd1a94 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -879,6 +879,30 @@ get_page_from_l1e(
         return -EINVAL;
     }
 
+    if ( invalid_cacheability != INVALID_CACHEABILITY_ALLOW )
+    {
+        switch ( l1e.l1 & PAGE_CACHE_ATTRS )
+        {
+        case _PAGE_WB:
+        case _PAGE_UC:
+        case _PAGE_UCM:
+        case _PAGE_WC:
+        case _PAGE_WT:
+        case _PAGE_WP:
+            break;
+        default:
+            /*
+             * If we get here, a PV guest tried to use one of the
+             * reserved values in Xen's PAT.  This indicates a bug
+             * in the guest.  If requested by the user, inject #GP
+             * to cause the guest to log a stack trace.
+             */
+            if ( invalid_cacheability == INVALID_CACHEABILITY_TRAP )
+                pv_inject_hw_exception(TRAP_gp_fault, 0);
+            return -EINVAL;
+        }
+    }
+
     valid = mfn_valid(_mfn(mfn));
 
     if ( !valid ||
@@ -1324,6 +1348,31 @@ static int put_page_from_l4e(l4_pgentry_t l4e, mfn_t l4mfn, unsigned int flags)
     return put_pt_page(l4e_get_page(l4e), mfn_to_page(l4mfn), flags);
 }
 
+#ifdef NDEBUG
+#define INVALID_CACHEABILITY_DEFAULT INVALID_CACHEABILITY_DENY
+#else
+#define INVALID_CACHEABILITY_DEFAULT INVALID_CACHEABILITY_TRAP
+#endif
+
+__ro_after_init uint8_t invalid_cacheability =
+    INVALID_CACHEABILITY_DEFAULT;
+
+static int __init cf_check set_invalid_cacheability(const char *str)
+{
+    if (strcmp("allow", str) == 0)
+        invalid_cacheability = INVALID_CACHEABILITY_ALLOW;
+    else if (strcmp("deny", str) == 0)
+        invalid_cacheability = INVALID_CACHEABILITY_DENY;
+    else if (strcmp("trap", str) == 0)
+        invalid_cacheability = INVALID_CACHEABILITY_TRAP;
+    else
+        return -EINVAL;
+
+    return 0;
+}
+
+custom_param("invalid-cacheability", set_invalid_cacheability);
+
 static int promote_l1_table(struct page_info *page)
 {
     struct domain *d = page_get_owner(page);
@@ -1343,7 +1392,9 @@ static int promote_l1_table(struct page_info *page)
         }
         else
         {
-            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
+            l1_pgentry_t l1e = pl1e[i];
+
+            switch ( ret = get_page_from_l1e(l1e, d, d) )
             {
             default:
                 goto fail;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index f94f28c8e271549acb449ef2e129b928751f765d..40b424351fd99fe1fb0a5faa5b20bf4070bb1d4a 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -28,9 +28,21 @@ static int __init cf_check parse_pv(const char *s)
     do {
         ss = strchr(s, ',');
         if ( !ss )
-            ss = strchr(s, '\0');
-
-        if ( (val = parse_boolean("32", s, ss)) >= 0 )
+            ss += strlen(s);
+        if ( !strncmp("invalid-cacheability=", s,
+                      sizeof("invalid-cacheability=") - 1) )
+        {
+            const char *p = s + (sizeof("invalid-cacheability=") - 1);
+            if (ss - p == 5 && !memcmp(p, "allow", 5))
+                invalid_cacheability = INVALID_CACHEABILITY_ALLOW;
+            else if (ss - p == 4 && !memcmp(p, "deny", 4))
+                invalid_cacheability = INVALID_CACHEABILITY_DENY;
+            else if (ss - p == 4 && !memcmp(p, "trap", 4))
+                invalid_cacheability = INVALID_CACHEABILITY_TRAP;
+            else
+                rc = -EINVAL;
+        }
+        else if ( (val = parse_boolean("32", s, ss)) >= 0 )
         {
 #ifdef CONFIG_PV32
             opt_pv32 = val;
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v7 3/4] x86/mm: make code robust to future PAT changes
  2023-01-07 22:07 [PATCH v7 0/4] Make PAT handling less brittle Demi Marie Obenour
  2023-01-07 22:07 ` [PATCH v7 1/4] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
  2023-01-07 22:07 ` [PATCH v7 2/4] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
@ 2023-01-07 22:07 ` Demi Marie Obenour
  2023-01-09 14:07   ` Jan Beulich
  2023-01-07 22:07 ` [PATCH v7 4/4] x86: Allow using Linux's PAT Demi Marie Obenour
  3 siblings, 1 reply; 9+ messages in thread
From: Demi Marie Obenour @ 2023-01-07 22:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan

It may be desirable to change Xen's PAT for various reasons.  This
requires changes to several _PAGE_* macros as well.  Add static
assertions to check that XEN_MSR_PAT is consistent with the _PAGE_*
macros, and that _PAGE_WB is 0 as required by Linux.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
Changes since v6:
- Remove overly-specific comment.
- Remove double blank line.
- Check that unused entries are distinct from higher entries that are
  used.
- Have PTE_FLAGS_TO_CACHEATTR validate its argument.
- Do not check if an unsigned number is negative.
- Expand comment explaining why _PAGE_WB must be zero.

Changes since v5:
- Remove unhelpful comment.
- Move a BUILD_BUG_ON.
- Use fewer hardcoded constants in PTE_FLAGS_TO_CACHEATTR.
- Fix coding style.

Changes since v4:
- Add lots of comments explaining what the various BUILD_BUG_ON()s mean.

Changes since v3:
- Refactor some macros
- Avoid including a string literal in BUILD_BUG_ON

Additional checks on PAT values

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
 xen/arch/x86/include/asm/page.h |  2 +
 xen/arch/x86/mm.c               | 96 +++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+)

diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index b585235d064a567082582c8e92a4e8283fd949ca..c7d77ab2901aa5bdb03a719af810c6f8d8ba9d4e 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -338,6 +338,8 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 #define _PAGE_UC         (            _PAGE_PCD | _PAGE_PWT)
 #define _PAGE_WC         (_PAGE_PAT                        )
 #define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
+#define _PAGE_RSVD_1     (_PAGE_PAT | _PAGE_PCD            )
+#define _PAGE_RSVD_2     (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 
 /*
  * Debug option: Ensure that granted mappings are not implicitly unmapped.
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a8f137925cba1846b97aee9321df6427f4dd1a94..d69e9bea6c30bc782ab4c331f42502f6e61a028a 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -890,6 +890,8 @@ get_page_from_l1e(
         case _PAGE_WT:
         case _PAGE_WP:
             break;
+        case _PAGE_RSVD_1:
+        case _PAGE_RSVD_2:
         default:
             /*
              * If we get here, a PV guest tried to use one of the
@@ -6412,6 +6414,100 @@ static void __init __maybe_unused build_assertions(void)
      * using different PATs will not work.
      */
     BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
+
+    /*
+     * _PAGE_WB must be zero.  Linux PV guests assume that _PAGE_WB will be
+     * zero, and indeed Linux has a BUILD_BUG_ON validating that their version
+     * of _PAGE_WB *is* zero.  Furthermore, since _PAGE_WB is zero, it is quite
+     * likely to be omitted from various parts of Xen, and indeed L1 PTE
+     * validation code checks that ((l1f & PAGE_CACHE_ATTRs) == 0), not
+     * ((l1f & PAGE_CACHE_ATTRs) == _PAGE_WB).
+     */
+    BUILD_BUG_ON(_PAGE_WB);
+
+    /* _PAGE_RSVD_1 must be less than _PAGE_RSVD_2 */
+    BUILD_BUG_ON(_PAGE_RSVD_1 >= _PAGE_RSVD_2);
+
+#define PAT_ENTRY(v)                                                           \
+    (BUILD_BUG_ON_ZERO(((v) < 0) || ((v) > 7)) +                               \
+     (0xFF & (XEN_MSR_PAT >> (8 * (v)))))
+
+    /* Validate at compile-time that v is a valid value for a PAT entry */
+#define CHECK_PAT_ENTRY_VALUE(v)                                               \
+    BUILD_BUG_ON((v) > X86_NUM_MT || (v) == X86_MT_RSVD_2 ||                   \
+                 (v) == X86_MT_RSVD_3)
+
+    /* Validate at compile-time that PAT entry v is valid */
+#define CHECK_PAT_ENTRY(v) CHECK_PAT_ENTRY_VALUE(PAT_ENTRY(v))
+
+    /*
+     * If one of these trips, the corresponding entry in XEN_MSR_PAT is invalid.
+     * This would cause Xen to crash (with #GP) at startup.
+     */
+    CHECK_PAT_ENTRY(0);
+    CHECK_PAT_ENTRY(1);
+    CHECK_PAT_ENTRY(2);
+    CHECK_PAT_ENTRY(3);
+    CHECK_PAT_ENTRY(4);
+    CHECK_PAT_ENTRY(5);
+    CHECK_PAT_ENTRY(6);
+    CHECK_PAT_ENTRY(7);
+
+    /* Macro version of pte_flags_to_cacheattr(), for use in BUILD_BUG_ON()s */
+#define PTE_FLAGS_TO_CACHEATTR(pte_value)                                      \
+    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
+    (BUILD_BUG_ON_ZERO(((pte_value) & PAGE_CACHE_ATTRS) != (pte_value)) |      \
+     (((pte_value) & _PAGE_PAT) >> 5) |                                        \
+     (((pte_value) & (_PAGE_PCD | _PAGE_PWT)) >> 3))
+
+    CHECK_PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_RSVD_1));
+    CHECK_PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_RSVD_2));
+#define PAT_ENTRY_FROM_FLAGS(x) PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(x))
+
+    /* Validate at compile time that X does not duplicate a smaller PAT entry */
+#define CHECK_DUPLICATE_ENTRY(x, y)                                            \
+    BUILD_BUG_ON((x) >= (y) &&                                                 \
+                 (PAT_ENTRY_FROM_FLAGS(x) == PAT_ENTRY_FROM_FLAGS(y)))
+
+    /* Check that a PAT-related _PAGE_* macro is correct */
+#define CHECK_PAGE_VALUE(page_value) do {                                      \
+    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
+    BUILD_BUG_ON(((_PAGE_ ## page_value) & PAGE_CACHE_ATTRS) !=                \
+                 (_PAGE_ ## page_value));                                      \
+    /* Check that the _PAGE_* are consistent with XEN_MSR_PAT */               \
+    BUILD_BUG_ON(PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_ ## page_value)) !=    \
+                 (X86_MT_ ## page_value));                                     \
+    case _PAGE_ ## page_value:; /* ensure no duplicate values */               \
+    /*                                                                         \
+     * Check that the _PAGE_* entries do not duplicate a smaller reserved      \
+     * entry.                                                                  \
+     */                                                                        \
+    CHECK_DUPLICATE_ENTRY(_PAGE_ ## page_value, _PAGE_RSVD_1);                 \
+    CHECK_DUPLICATE_ENTRY(_PAGE_ ## page_value, _PAGE_RSVD_2);                 \
+    CHECK_PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_ ## page_value));             \
+} while ( false )
+
+    /*
+     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
+     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
+     * flags, with results that are unknown and possibly harmful.
+     */
+    switch (0) {
+    CHECK_PAGE_VALUE(WT);
+    CHECK_PAGE_VALUE(WB);
+    CHECK_PAGE_VALUE(WC);
+    CHECK_PAGE_VALUE(UC);
+    CHECK_PAGE_VALUE(UCM);
+    CHECK_PAGE_VALUE(WP);
+    case _PAGE_RSVD_1:
+    case _PAGE_RSVD_2:
+        break;
+    }
+#undef CHECK_PAT_ENTRY
+#undef CHECK_PAT_ENTRY_VALUE
+#undef CHECK_PAGE_VALUE
+#undef PAGE_FLAGS_TO_CACHEATTR
+#undef PAT_ENTRY
 }
 
 /*
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v7 4/4] x86: Allow using Linux's PAT
  2023-01-07 22:07 [PATCH v7 0/4] Make PAT handling less brittle Demi Marie Obenour
                   ` (2 preceding siblings ...)
  2023-01-07 22:07 ` [PATCH v7 3/4] x86/mm: make code robust to future PAT changes Demi Marie Obenour
@ 2023-01-07 22:07 ` Demi Marie Obenour
  2023-01-09 11:37   ` Jan Beulich
  3 siblings, 1 reply; 9+ messages in thread
From: Demi Marie Obenour @ 2023-01-07 22:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Demi Marie Obenour, Marek Marczykowski-Górecki,
	Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan

Due to a hardware errata, Intel integrated GPUs are incompatible with
Xen's PAT.  Using Linux's PAT is a workaround for this flaw.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
---
 xen/arch/x86/Kconfig                 | 33 ++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/page.h      | 12 ++++++++++
 xen/arch/x86/include/asm/processor.h | 15 +++++++++++++
 xen/arch/x86/mm.c                    |  2 ++
 4 files changed, 62 insertions(+)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 6a7825f4ba3c98e0496415123fde79ee62f771fa..18efccedfd08873cd169a54825b0ba4256a12942 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -227,6 +227,39 @@ config XEN_ALIGN_2M
 
 endchoice
 
+config LINUX_PAT
+	bool "Use Linux's PAT instead of Xen's default"
+	help
+	  Use Linux's Page Attribute Table instead of the default Xen value.
+
+	  The Page Attribute Table (PAT) maps three bits in the page table entry
+	  to the actual cacheability used by the processor.  Many Intel
+	  integrated GPUs have errata (bugs) that cause CPU access to GPU memory
+	  to ignore the topmost bit.  When using Xen's default PAT, this results
+	  in caches not being flushed and incorrect images being displayed.  The
+	  default PAT used by Linux does not cause this problem.
+
+	  If you say Y here, you will be able to use Intel integrated GPUs that
+	  are attached to your Linux dom0 or other Linux PV guests.  However,
+	  you will not be able to use non-Linux OSs in dom0, and attaching a PCI
+	  device to a non-Linux PV guest will result in unpredictable guest
+	  behavior.  If you say N here, you will be able to use a non-Linux
+	  dom0, and will be able to attach PCI devices to non-Linux PV guests.
+
+	  Note that saving a PV guest with an assigned PCI device on a machine
+	  with one PAT and restoring it on a machine with a different PAT won't
+	  work: the resulting guest may boot and even appear to work, but caches
+	  will not be flushed when needed, with unpredictable results.  HVM
+	  (including PVH and PVHVM) guests and guests without assigned PCI
+	  devices do not care what PAT Xen uses, and migration (even live)
+	  between hypervisors with different PATs will work fine.  Guests using
+	  PV Shim care about the PAT used by the PV Shim firmware, not the
+	  host’s PAT.  Also, non-default PAT values are incompatible with the
+	  (deprecated) qemu-traditional stubdomain.
+
+	  Say Y if you are building a hypervisor for a Linux distribution that
+	  supports Intel iGPUs.  Say N otherwise.
+
 config X2APIC_PHYSICAL
 	bool "x2APIC Physical Destination mode"
 	help
diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index c7d77ab2901aa5bdb03a719af810c6f8d8ba9d4e..03839eb2b78517332663daad2089677d7000852c 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -331,6 +331,7 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 
 #define PAGE_CACHE_ATTRS (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 
+#ifndef CONFIG_LINUX_PAT
 /* Memory types, encoded under Xen's choice of MSR_PAT. */
 #define _PAGE_WB         (                                0)
 #define _PAGE_WT         (                        _PAGE_PWT)
@@ -340,6 +341,17 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 #define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
 #define _PAGE_RSVD_1     (_PAGE_PAT | _PAGE_PCD            )
 #define _PAGE_RSVD_2     (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
+#else
+/* Memory types, encoded under Linux's choice of MSR_PAT. */
+#define _PAGE_WB         (                                0)
+#define _PAGE_WC         (                        _PAGE_PWT)
+#define _PAGE_UCM        (            _PAGE_PCD            )
+#define _PAGE_UC         (            _PAGE_PCD | _PAGE_PWT)
+#define _PAGE_RSVD_1     (_PAGE_PAT                        )
+#define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
+#define _PAGE_RSVD_2     (_PAGE_PAT | _PAGE_PCD            )
+#define _PAGE_WT         (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
+#endif
 
 /*
  * Debug option: Ensure that granted mappings are not implicitly unmapped.
diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 60b902060914584957db8afa5c7c1e6abdad4d13..413b59ab284990cca192fa1dc44b437f58bd282f 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -92,6 +92,20 @@
                           X86_EFLAGS_NT|X86_EFLAGS_DF|X86_EFLAGS_IF|    \
                           X86_EFLAGS_TF)
 
+#ifdef CONFIG_LINUX_PAT
+/*
+ * Host IA32_CR_PAT value to cover all memory types.  This is not the default
+ * MSR_PAT value, but is the same as the one used by Linux.
+ */
+#define XEN_MSR_PAT ((_AC(X86_MT_WB,  ULL) << 0x00) | \
+                     (_AC(X86_MT_WC,  ULL) << 0x08) | \
+                     (_AC(X86_MT_UCM, ULL) << 0x10) | \
+                     (_AC(X86_MT_UC,  ULL) << 0x18) | \
+                     (_AC(X86_MT_WB,  ULL) << 0x20) | \
+                     (_AC(X86_MT_WP,  ULL) << 0x28) | \
+                     (_AC(X86_MT_UCM, ULL) << 0x30) | \
+                     (_AC(X86_MT_WT,  ULL) << 0x38))
+#else
 /*
  * Host IA32_CR_PAT value to cover all memory types.  This is not the default
  * MSR_PAT value, and is an ABI with PV guests.
@@ -104,6 +118,7 @@
                      (_AC(X86_MT_WP,  ULL) << 0x28) | \
                      (_AC(X86_MT_UC,  ULL) << 0x30) | \
                      (_AC(X86_MT_UC,  ULL) << 0x38))
+#endif
 
 #ifndef __ASSEMBLY__
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d69e9bea6c30bc782ab4c331f42502f6e61a028a..042c1875a02092a3f19c293003ef12209d88a450 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6407,6 +6407,7 @@ unsigned long get_upper_mfn_bound(void)
 
 static void __init __maybe_unused build_assertions(void)
 {
+#ifndef CONFIG_LINUX_PAT
     /*
      * If this trips, any guests that blindly rely on the public API in xen.h
      * (instead of reading the PAT from Xen, as Linux 3.19+ does) will be
@@ -6414,6 +6415,7 @@ static void __init __maybe_unused build_assertions(void)
      * using different PATs will not work.
      */
     BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
+#endif
 
     /*
      * _PAGE_WB must be zero.  Linux PV guests assume that _PAGE_WB will be
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 4/4] x86: Allow using Linux's PAT
  2023-01-07 22:07 ` [PATCH v7 4/4] x86: Allow using Linux's PAT Demi Marie Obenour
@ 2023-01-09 11:37   ` Jan Beulich
  2023-01-09 16:14     ` Demi Marie Obenour
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2023-01-09 11:37 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan, xen-devel

On 07.01.2023 23:07, Demi Marie Obenour wrote:
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -227,6 +227,39 @@ config XEN_ALIGN_2M
>  
>  endchoice
>  
> +config LINUX_PAT
> +	bool "Use Linux's PAT instead of Xen's default"
> +	help
> +	  Use Linux's Page Attribute Table instead of the default Xen value.
> +
> +	  The Page Attribute Table (PAT) maps three bits in the page table entry
> +	  to the actual cacheability used by the processor.  Many Intel
> +	  integrated GPUs have errata (bugs) that cause CPU access to GPU memory
> +	  to ignore the topmost bit.  When using Xen's default PAT, this results
> +	  in caches not being flushed and incorrect images being displayed.  The
> +	  default PAT used by Linux does not cause this problem.
> +
> +	  If you say Y here, you will be able to use Intel integrated GPUs that
> +	  are attached to your Linux dom0 or other Linux PV guests.  However,
> +	  you will not be able to use non-Linux OSs in dom0, and attaching a PCI
> +	  device to a non-Linux PV guest will result in unpredictable guest
> +	  behavior.  If you say N here, you will be able to use a non-Linux
> +	  dom0, and will be able to attach PCI devices to non-Linux PV guests.
> +
> +	  Note that saving a PV guest with an assigned PCI device on a machine
> +	  with one PAT and restoring it on a machine with a different PAT won't
> +	  work: the resulting guest may boot and even appear to work, but caches
> +	  will not be flushed when needed, with unpredictable results.  HVM
> +	  (including PVH and PVHVM) guests and guests without assigned PCI
> +	  devices do not care what PAT Xen uses, and migration (even live)
> +	  between hypervisors with different PATs will work fine.  Guests using
> +	  PV Shim care about the PAT used by the PV Shim firmware, not the
> +	  host’s PAT.  Also, non-default PAT values are incompatible with the
> +	  (deprecated) qemu-traditional stubdomain.
> +
> +	  Say Y if you are building a hypervisor for a Linux distribution that
> +	  supports Intel iGPUs.  Say N otherwise.

I'm not convinced we want this; if other maintainers think differently,
then I don't mean to stand in the way though. If so, however,
- the above likely wants guarding by EXPERT and/or UNSUPPORTED
- the support status of using this setting wants to be made crystal
  clear, perhaps by an addition to ./SUPPORT.md.

Jan



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 2/4] x86/mm: Reject invalid cacheability in PV guests by default
  2023-01-07 22:07 ` [PATCH v7 2/4] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
@ 2023-01-09 13:53   ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2023-01-09 13:53 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan, xen-devel

On 07.01.2023 23:07, Demi Marie Obenour wrote:
> Setting cacheability flags that are not ones specified by Xen is a bug
> in the guest.  By default, return -EINVAL if a guests attempts to do
> this.  The invalid-cacheability= Xen command-line flag allows the
> administrator to allow such attempts or to produce

Like in v6: Unfinished sentence?

> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
> ---
> Changes since v6:
> - Make invalid-cacheability= a subflag of pv=.

While you did this, you've retained the standalone option, and documentation
also continues to describe that one instead of the new sub-option. You will
then also want to move where invalid_cacheability is defined, I think.

> @@ -1343,7 +1392,9 @@ static int promote_l1_table(struct page_info *page)
>          }
>          else
>          {
> -            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
> +            l1_pgentry_t l1e = pl1e[i];
> +
> +            switch ( ret = get_page_from_l1e(l1e, d, d) )
>              {
>              default:
>                  goto fail;

Stale (and now pointless) change?

> --- a/xen/arch/x86/pv/domain.c
> +++ b/xen/arch/x86/pv/domain.c
> @@ -28,9 +28,21 @@ static int __init cf_check parse_pv(const char *s)
>      do {
>          ss = strchr(s, ',');
>          if ( !ss )
> -            ss = strchr(s, '\0');
> -
> -        if ( (val = parse_boolean("32", s, ss)) >= 0 )
> +            ss += strlen(s);
> +        if ( !strncmp("invalid-cacheability=", s,
> +                      sizeof("invalid-cacheability=") - 1) )
> +        {
> +            const char *p = s + (sizeof("invalid-cacheability=") - 1);
> +            if (ss - p == 5 && !memcmp(p, "allow", 5))

Nit: Blank line please between declaration(s) and statement(s).

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 3/4] x86/mm: make code robust to future PAT changes
  2023-01-07 22:07 ` [PATCH v7 3/4] x86/mm: make code robust to future PAT changes Demi Marie Obenour
@ 2023-01-09 14:07   ` Jan Beulich
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2023-01-09 14:07 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Marek Marczykowski-Górecki, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan, xen-devel

On 07.01.2023 23:07, Demi Marie Obenour wrote:
> @@ -6412,6 +6414,100 @@ static void __init __maybe_unused build_assertions(void)
>       * using different PATs will not work.
>       */
>      BUILD_BUG_ON(XEN_MSR_PAT != 0x050100070406ULL);
> +
> +    /*
> +     * _PAGE_WB must be zero.  Linux PV guests assume that _PAGE_WB will be
> +     * zero, and indeed Linux has a BUILD_BUG_ON validating that their version
> +     * of _PAGE_WB *is* zero.  Furthermore, since _PAGE_WB is zero, it is quite
> +     * likely to be omitted from various parts of Xen, and indeed L1 PTE
> +     * validation code checks that ((l1f & PAGE_CACHE_ATTRs) == 0), not
> +     * ((l1f & PAGE_CACHE_ATTRs) == _PAGE_WB).
> +     */
> +    BUILD_BUG_ON(_PAGE_WB);
> +
> +    /* _PAGE_RSVD_1 must be less than _PAGE_RSVD_2 */
> +    BUILD_BUG_ON(_PAGE_RSVD_1 >= _PAGE_RSVD_2);
> +
> +#define PAT_ENTRY(v)                                                           \
> +    (BUILD_BUG_ON_ZERO(((v) < 0) || ((v) > 7)) +                               \
> +     (0xFF & (XEN_MSR_PAT >> (8 * (v)))))
> +
> +    /* Validate at compile-time that v is a valid value for a PAT entry */
> +#define CHECK_PAT_ENTRY_VALUE(v)                                               \
> +    BUILD_BUG_ON((v) > X86_NUM_MT || (v) == X86_MT_RSVD_2 ||                   \
> +                 (v) == X86_MT_RSVD_3)
> +
> +    /* Validate at compile-time that PAT entry v is valid */
> +#define CHECK_PAT_ENTRY(v) CHECK_PAT_ENTRY_VALUE(PAT_ENTRY(v))
> +
> +    /*
> +     * If one of these trips, the corresponding entry in XEN_MSR_PAT is invalid.
> +     * This would cause Xen to crash (with #GP) at startup.
> +     */
> +    CHECK_PAT_ENTRY(0);
> +    CHECK_PAT_ENTRY(1);
> +    CHECK_PAT_ENTRY(2);
> +    CHECK_PAT_ENTRY(3);
> +    CHECK_PAT_ENTRY(4);
> +    CHECK_PAT_ENTRY(5);
> +    CHECK_PAT_ENTRY(6);
> +    CHECK_PAT_ENTRY(7);
> +
> +    /* Macro version of pte_flags_to_cacheattr(), for use in BUILD_BUG_ON()s */
> +#define PTE_FLAGS_TO_CACHEATTR(pte_value)                                      \
> +    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
> +    (BUILD_BUG_ON_ZERO(((pte_value) & PAGE_CACHE_ATTRS) != (pte_value)) |      \

Slightly cheaper as BUILD_BUG_ON_ZERO((pte_value) & ~PAGE_CACHE_ATTRS)?

> +     (((pte_value) & _PAGE_PAT) >> 5) |                                        \
> +     (((pte_value) & (_PAGE_PCD | _PAGE_PWT)) >> 3))
> +
> +    CHECK_PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_RSVD_1));
> +    CHECK_PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_RSVD_2));

What do these two check that the 8 instances above don't already check?

> +#define PAT_ENTRY_FROM_FLAGS(x) PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(x))
> +
> +    /* Validate at compile time that X does not duplicate a smaller PAT entry */
> +#define CHECK_DUPLICATE_ENTRY(x, y)                                            \
> +    BUILD_BUG_ON((x) >= (y) &&                                                 \
> +                 (PAT_ENTRY_FROM_FLAGS(x) == PAT_ENTRY_FROM_FLAGS(y)))

Imo nothing says that the reserved entries come last. I'm therefore not
convinced of the usefulness of the two uses of this macro.

> +    /* Check that a PAT-related _PAGE_* macro is correct */
> +#define CHECK_PAGE_VALUE(page_value) do {                                      \
> +    /* Check that the _PAGE_* macros only use bits from PAGE_CACHE_ATTRS */    \
> +    BUILD_BUG_ON(((_PAGE_ ## page_value) & PAGE_CACHE_ATTRS) !=                \
> +                 (_PAGE_ ## page_value));                                      \
> +    /* Check that the _PAGE_* are consistent with XEN_MSR_PAT */               \
> +    BUILD_BUG_ON(PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_ ## page_value)) !=    \
> +                 (X86_MT_ ## page_value));                                     \
> +    case _PAGE_ ## page_value:; /* ensure no duplicate values */               \

Wouldn't this better come first in the macro? The semicolon looks unnecessary
in any event.

> +    /*                                                                         \
> +     * Check that the _PAGE_* entries do not duplicate a smaller reserved      \
> +     * entry.                                                                  \
> +     */                                                                        \
> +    CHECK_DUPLICATE_ENTRY(_PAGE_ ## page_value, _PAGE_RSVD_1);                 \
> +    CHECK_DUPLICATE_ENTRY(_PAGE_ ## page_value, _PAGE_RSVD_2);                 \
> +    CHECK_PAT_ENTRY(PTE_FLAGS_TO_CACHEATTR(_PAGE_ ## page_value));             \
> +} while ( false )
> +
> +    /*
> +     * If one of these trips, the corresponding _PAGE_* macro is inconsistent
> +     * with XEN_MSR_PAT.  This would cause Xen to use incorrect cacheability
> +     * flags, with results that are unknown and possibly harmful.
> +     */
> +    switch (0) {

Nit: Style.

> +    CHECK_PAGE_VALUE(WT);
> +    CHECK_PAGE_VALUE(WB);
> +    CHECK_PAGE_VALUE(WC);
> +    CHECK_PAGE_VALUE(UC);
> +    CHECK_PAGE_VALUE(UCM);
> +    CHECK_PAGE_VALUE(WP);

All of these are lacking "break" and hence are liable to trigger static checker
warnings.

> +    case _PAGE_RSVD_1:
> +    case _PAGE_RSVD_2:
> +        break;
> +    }
> +#undef CHECK_PAT_ENTRY
> +#undef CHECK_PAT_ENTRY_VALUE
> +#undef CHECK_PAGE_VALUE
> +#undef PAGE_FLAGS_TO_CACHEATTR

PTE_FLAGS_TO_CACHEATTR?

> +#undef PAT_ENTRY

You also #define more than these 5 macros now (but as per above e.g.
CHECK_DUPLICATE_ENTRY() may go away again).

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 4/4] x86: Allow using Linux's PAT
  2023-01-09 11:37   ` Jan Beulich
@ 2023-01-09 16:14     ` Demi Marie Obenour
  0 siblings, 0 replies; 9+ messages in thread
From: Demi Marie Obenour @ 2023-01-09 16:14 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Marek Marczykowski-Górecki, Andrew Cooper, George Dunlap,
	Julien Grall, Stefano Stabellini, Wei Liu, Roger Pau Monné,
	Tim Deegan, xen-devel

[-- Attachment #1: Type: text/plain, Size: 3177 bytes --]

On Mon, Jan 09, 2023 at 12:37:34PM +0100, Jan Beulich wrote:
> On 07.01.2023 23:07, Demi Marie Obenour wrote:
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -227,6 +227,39 @@ config XEN_ALIGN_2M
> >  
> >  endchoice
> >  
> > +config LINUX_PAT
> > +	bool "Use Linux's PAT instead of Xen's default"
> > +	help
> > +	  Use Linux's Page Attribute Table instead of the default Xen value.
> > +
> > +	  The Page Attribute Table (PAT) maps three bits in the page table entry
> > +	  to the actual cacheability used by the processor.  Many Intel
> > +	  integrated GPUs have errata (bugs) that cause CPU access to GPU memory
> > +	  to ignore the topmost bit.  When using Xen's default PAT, this results
> > +	  in caches not being flushed and incorrect images being displayed.  The
> > +	  default PAT used by Linux does not cause this problem.
> > +
> > +	  If you say Y here, you will be able to use Intel integrated GPUs that
> > +	  are attached to your Linux dom0 or other Linux PV guests.  However,
> > +	  you will not be able to use non-Linux OSs in dom0, and attaching a PCI
> > +	  device to a non-Linux PV guest will result in unpredictable guest
> > +	  behavior.  If you say N here, you will be able to use a non-Linux
> > +	  dom0, and will be able to attach PCI devices to non-Linux PV guests.
> > +
> > +	  Note that saving a PV guest with an assigned PCI device on a machine
> > +	  with one PAT and restoring it on a machine with a different PAT won't
> > +	  work: the resulting guest may boot and even appear to work, but caches
> > +	  will not be flushed when needed, with unpredictable results.  HVM
> > +	  (including PVH and PVHVM) guests and guests without assigned PCI
> > +	  devices do not care what PAT Xen uses, and migration (even live)
> > +	  between hypervisors with different PATs will work fine.  Guests using
> > +	  PV Shim care about the PAT used by the PV Shim firmware, not the
> > +	  host’s PAT.  Also, non-default PAT values are incompatible with the
> > +	  (deprecated) qemu-traditional stubdomain.
> > +
> > +	  Say Y if you are building a hypervisor for a Linux distribution that
> > +	  supports Intel iGPUs.  Say N otherwise.
> 
> I'm not convinced we want this; if other maintainers think differently,
> then I don't mean to stand in the way though. If so, however,
> - the above likely wants guarding by EXPERT and/or UNSUPPORTED

I considered this, but decided against it.  Recent Intel iGPUs are
simply incompatible with Xen’s default PAT, so anyone wanting to use Xen
in a desktop environment must say Y here.  Guarding this with EXPERT or
UNSUPPORTED will not prevent distribution maintainers from enabling it,
because the alternative is building a hypervisor that does not support
the hardware their users actually have.  Qubes OS is *already* shipping
a patch to use Linux’s PAT, so you don’t need to worry that this code
will go untested.  And if there was a vulnerability that requires
CONFIG_LINUX_PAT=y, I’d rather it not be dropped on Qubes users as a
0day.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-01-09 16:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-07 22:07 [PATCH v7 0/4] Make PAT handling less brittle Demi Marie Obenour
2023-01-07 22:07 ` [PATCH v7 1/4] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
2023-01-07 22:07 ` [PATCH v7 2/4] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
2023-01-09 13:53   ` Jan Beulich
2023-01-07 22:07 ` [PATCH v7 3/4] x86/mm: make code robust to future PAT changes Demi Marie Obenour
2023-01-09 14:07   ` Jan Beulich
2023-01-07 22:07 ` [PATCH v7 4/4] x86: Allow using Linux's PAT Demi Marie Obenour
2023-01-09 11:37   ` Jan Beulich
2023-01-09 16:14     ` Demi Marie Obenour

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.