* [PATCH 0/8] x86: Support for __ro_after_init
@ 2021-11-30 10:04 Andrew Cooper
  2021-11-30 10:04 ` [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0] Andrew Cooper
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

Also some prep cleanup, because the boot-time pagetable handling is seemingly
impossible to edit without finding more skeletons lying around...

Andrew Cooper (8):
  x86/boot: Drop incorrect mapping at l2_xenmap[0]
  x86/boot: Better describe the pagetable relocation loops
  x86/boot: Fix data placement around __high_start()
  x86/mm: Drop bogus cacheability logic in update_xen_mappings()
  x86/boot: Drop xen_virt_end
  x86/boot: Adjust .text/.rodata/etc permissions in one place
  x86/boot: Support __ro_after_init
  x86/boot: Check that permission restrictions have taken effect

 xen/arch/x86/boot/x86_64.S        |   8 --
 xen/arch/x86/livepatch.c          |   3 +-
 xen/arch/x86/mm.c                 |  24 +++---
 xen/arch/x86/setup.c              | 153 +++++++++++++++++++-------------------
 xen/arch/x86/smpboot.c            |   3 +-
 xen/arch/x86/xen.lds.S            |   6 ++
 xen/include/asm-x86/setup.h       |   3 +
 xen/include/asm-x86/x86_64/page.h |   2 -
 xen/include/xen/cache.h           |   2 +
 9 files changed, 103 insertions(+), 101 deletions(-)

-- 
2.11.0




* [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0]
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-11-30 10:33   ` Jan Beulich
  2021-11-30 10:04 ` [PATCH 2/8] x86/boot: Better describe the pagetable relocation loops Andrew Cooper
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

It has been 4 years since the default load address changed from 1M to 2M, and
_stext ceased residing in l2_xenmap[0].  We should not be inserting an unused
mapping.

To ensure we don't create mappings accidentally, loop from 0 and obey
_PAGE_PRESENT on all entries.

Fixes: 7ed93f3a0dff ("x86: change default load address from 1 MiB to 2 MiB")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

Previously posted on its own.
---
 xen/arch/x86/setup.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index da47cdea14a1..6f241048425c 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1279,16 +1279,12 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
             /* The only data mappings to be relocated are in the Xen area. */
             pl2e = __va(__pa(l2_xenmap));
-            /*
-             * Undo the temporary-hooking of the l1_directmap.  __2M_text_start
-             * is contained in this PTE.
-             */
+
             BUG_ON(using_2M_mapping() &&
                    l2_table_offset((unsigned long)_erodata) ==
                    l2_table_offset((unsigned long)_stext));
-            *pl2e++ = l2e_from_pfn(xen_phys_start >> PAGE_SHIFT,
-                                   PAGE_HYPERVISOR_RX | _PAGE_PSE);
-            for ( i = 1; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
+
+            for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
             {
                 unsigned int flags;
 
-- 
2.11.0




* [PATCH 2/8] x86/boot: Better describe the pagetable relocation loops
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
  2021-11-30 10:04 ` [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0] Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-12-02 11:43   ` Jan Beulich
  2021-11-30 10:04 ` [PATCH 3/8] x86/boot: Fix data placement around __high_start() Andrew Cooper
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

The first loop relocates all reachable non-leaf entries in idle_pg_table[],
which includes l2_xenmap[511]'s reference to l1_fixmap_x[].

The second loop relocates only leaf mappings in l2_xenmap[], so update the
skip condition to be opposite to the first loop.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/setup.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 6f241048425c..495b4b7d51fb 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1245,7 +1245,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             barrier();
             move_memory(e, XEN_IMG_OFFSET, _end - _start, 1);
 
-            /* Walk initial pagetables, relocating page directory entries. */
+            /* Walk idle_pg_table, relocating non-leaf entries. */
             pl4e = __va(__pa(idle_pg_table));
             for ( i = 0 ; i < L4_PAGETABLE_ENTRIES; i++, pl4e++ )
             {
@@ -1277,18 +1277,18 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                 }
             }
 
-            /* The only data mappings to be relocated are in the Xen area. */
-            pl2e = __va(__pa(l2_xenmap));
-
             BUG_ON(using_2M_mapping() &&
                    l2_table_offset((unsigned long)_erodata) ==
                    l2_table_offset((unsigned long)_stext));
 
+            /* Walk l2_xenmap[], relocating 2M superpage leaves. */
+            pl2e = __va(__pa(l2_xenmap));
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
             {
                 unsigned int flags;
 
                 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
+                     !(l2e_get_flags(*pl2e) & _PAGE_PSE) ||
                      (l2e_get_pfn(*pl2e) >= pte_update_limit) )
                     continue;
 
-- 
2.11.0




* [PATCH 3/8] x86/boot: Fix data placement around __high_start()
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
  2021-11-30 10:04 ` [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0] Andrew Cooper
  2021-11-30 10:04 ` [PATCH 2/8] x86/boot: Better describe the pagetable relocation loops Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-12-02 11:49   ` Jan Beulich
  2021-11-30 10:04 ` [PATCH 4/8] x86/mm: Drop bogus cacheability logic in update_xen_mappings() Andrew Cooper
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

multiboot_ptr should be in __initdata - it is only used on the BSP path.
Furthermore, the .align 8 then .long means that stack_start is misaligned.

Move both into setup.c, which lets the compiler handle the details correctly,
as well as providing proper debug information for them.

Declare stack_start in setup.h and avoid extern-ing it locally in smpboot.c.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/boot/x86_64.S  | 8 --------
 xen/arch/x86/setup.c        | 6 ++++++
 xen/arch/x86/smpboot.c      | 3 +--
 xen/include/asm-x86/setup.h | 2 ++
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S
index d61048c583b3..27f52e7a7708 100644
--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -67,14 +67,6 @@ ENTRY(__high_start)
         call    __start_xen
         BUG     /* __start_xen() shouldn't return. */
 
-        .data
-        .align 8
-multiboot_ptr:
-        .long   0
-
-GLOBAL(stack_start)
-        .quad   cpu0_stack + STACK_SIZE - CPUINFO_sizeof
-
         .section .data.page_aligned, "aw", @progbits
         .align PAGE_SIZE, 0
 /*
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 495b4b7d51fb..6613e56a2184 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -141,6 +141,12 @@ unsigned long __read_mostly xen_virt_end;
 char __section(".bss.stack_aligned") __aligned(STACK_SIZE)
     cpu0_stack[STACK_SIZE];
 
+/* Used by the BSP/AP paths to find the higher half stack mapping to use. */
+void *stack_start = cpu0_stack + STACK_SIZE - sizeof(struct cpu_info);
+
+/* Used by the boot asm to stash the relocated multiboot info pointer. */
+unsigned int __initdata multiboot_ptr;
+
 struct cpuinfo_x86 __read_mostly boot_cpu_data = { 0, 0, 0, 0, -1 };
 
 unsigned long __read_mostly mmu_cr4_features = XEN_MINIMAL_CR4;
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 329cfdb6c9f6..08c0f2d9df04 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -42,6 +42,7 @@
 #include <asm/microcode.h>
 #include <asm/msr.h>
 #include <asm/mtrr.h>
+#include <asm/setup.h>
 #include <asm/spec_ctrl.h>
 #include <asm/time.h>
 #include <asm/tboot.h>
@@ -419,8 +420,6 @@ void start_secondary(void *unused)
     startup_cpu_idle_loop();
 }
 
-extern void *stack_start;
-
 static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
 {
     unsigned long send_status = 0, accept_status = 0;
diff --git a/xen/include/asm-x86/setup.h b/xen/include/asm-x86/setup.h
index 24be46115df2..eb9d7b433c13 100644
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -12,6 +12,8 @@ extern char __2M_rwdata_start[], __2M_rwdata_end[];
 extern unsigned long xenheap_initial_phys_start;
 extern uint64_t boot_tsc_stamp;
 
+extern void *stack_start;
+
 void early_cpu_init(void);
 void early_time_init(void);
 
-- 
2.11.0




* [PATCH 4/8] x86/mm: Drop bogus cacheability logic in update_xen_mappings()
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
                   ` (2 preceding siblings ...)
  2021-11-30 10:04 ` [PATCH 3/8] x86/boot: Fix data placement around __high_start() Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-11-30 13:11   ` Andrew Cooper
  2021-11-30 10:04 ` [PATCH 5/8] x86/boot: Drop xen_virt_end Andrew Cooper
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

There is no circumstance under which any part of the Xen image in memory wants
to have any cacheability other than Write Back.

Furthermore, unmapping random pieces of Xen like that comes with a non-trivial
risk of a Triple Fault, or other fatal condition.  Even if it appears to work,
an NMI or interrupt has a far wider reach into Xen mappings than calling
map_pages_to_xen() thrice.

Simplify the safety checking to a BUG_ON().  It is substantially more correct
and robust than following either of the unlikely(alias) paths.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

I'm half tempted to drop the check entirely, but in that case it would be
better to inline the result into the two callers.
---
 xen/arch/x86/mm.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 4d799032dc82..9bd4e5cc1d2f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -785,24 +785,21 @@ bool is_iomem_page(mfn_t mfn)
 
 static int update_xen_mappings(unsigned long mfn, unsigned int cacheattr)
 {
-    int err = 0;
     bool alias = mfn >= PFN_DOWN(xen_phys_start) &&
          mfn < PFN_UP(xen_phys_start + xen_virt_end - XEN_VIRT_START);
-    unsigned long xen_va =
-        XEN_VIRT_START + ((mfn - PFN_DOWN(xen_phys_start)) << PAGE_SHIFT);
+
+    /*
+     * Something is catastrophically broken if someone is trying to change the
+     * cacheability of Xen in memory...
+     */
+    BUG_ON(alias);
 
     if ( boot_cpu_has(X86_FEATURE_XEN_SELFSNOOP) )
         return 0;
 
-    if ( unlikely(alias) && cacheattr )
-        err = map_pages_to_xen(xen_va, _mfn(mfn), 1, 0);
-    if ( !err )
-        err = map_pages_to_xen((unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
-                     PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));
-    if ( unlikely(alias) && !cacheattr && !err )
-        err = map_pages_to_xen(xen_va, _mfn(mfn), 1, PAGE_HYPERVISOR);
-
-    return err;
+    return map_pages_to_xen(
+        (unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
+        PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));
 }
 
 #ifndef NDEBUG
-- 
2.11.0




* [PATCH 5/8] x86/boot: Drop xen_virt_end
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
                   ` (3 preceding siblings ...)
  2021-11-30 10:04 ` [PATCH 4/8] x86/mm: Drop bogus cacheability logic in update_xen_mappings() Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-12-02 11:56   ` Jan Beulich
  2021-11-30 10:04 ` [PATCH 6/8] x86/boot: Adjust .text/.rodata/etc permissions in one place Andrew Cooper
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

The calculation in __start_xen() for xen_virt_end is an opencoding of
ROUNDUP(_end, 2M).  This is __2M_rwdata_end as provided by the linker script.

This corrects the bound calculations in arch_livepatch_init() and
update_xen_mappings() to not enforce 2M alignment when Xen is not compiled
with CONFIG_XEN_ALIGN_2M.

Furthermore, since 52975142d154 ("x86/boot: Create the l2_xenmap[] mappings
dynamically"), there have not been extraneous mappings to delete, meaning that
the call to destroy_xen_mappings() has been a no-op.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

This effectively reverts d0ae97d4136e ("x86-64: properly handle alias mappings
beyond _end"), because the preconditions for the change are no longer valid.
---
 xen/arch/x86/livepatch.c          | 3 ++-
 xen/arch/x86/mm.c                 | 3 ++-
 xen/arch/x86/setup.c              | 6 ------
 xen/include/asm-x86/x86_64/page.h | 2 --
 4 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c
index 49f0d902e5bb..d056b1ed8b41 100644
--- a/xen/arch/x86/livepatch.c
+++ b/xen/arch/x86/livepatch.c
@@ -17,6 +17,7 @@
 #include <asm/fixmap.h>
 #include <asm/nmi.h>
 #include <asm/livepatch.h>
+#include <asm/setup.h>
 
 static bool has_active_waitqueue(const struct vm_event_domain *ved)
 {
@@ -343,7 +344,7 @@ void __init arch_livepatch_init(void)
 {
     void *start, *end;
 
-    start = (void *)xen_virt_end;
+    start = (void *)__2M_rwdata_end;
     end = (void *)(XEN_VIRT_END - FIXADDR_X_SIZE - NR_CPUS * PAGE_SIZE);
 
     BUG_ON(end <= start);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 9bd4e5cc1d2f..fdc548a9cb4a 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -786,7 +786,8 @@ bool is_iomem_page(mfn_t mfn)
 static int update_xen_mappings(unsigned long mfn, unsigned int cacheattr)
 {
     bool alias = mfn >= PFN_DOWN(xen_phys_start) &&
-         mfn < PFN_UP(xen_phys_start + xen_virt_end - XEN_VIRT_START);
+        mfn < PFN_UP(xen_phys_start + (unsigned long)__2M_rwdata_end -
+                     XEN_VIRT_START);
 
     /*
      * Something is catastrophically broken if someone is trying to change the
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 6613e56a2184..0e0e706404a3 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -136,8 +136,6 @@ cpumask_t __read_mostly cpu_present_map;
 
 unsigned long __read_mostly xen_phys_start;
 
-unsigned long __read_mostly xen_virt_end;
-
 char __section(".bss.stack_aligned") __aligned(STACK_SIZE)
     cpu0_stack[STACK_SIZE];
 
@@ -1573,10 +1571,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     }
 #endif
 
-    xen_virt_end = ((unsigned long)_end + (1UL << L2_PAGETABLE_SHIFT) - 1) &
-                   ~((1UL << L2_PAGETABLE_SHIFT) - 1);
-    destroy_xen_mappings(xen_virt_end, XEN_VIRT_START + BOOTSTRAP_MAP_BASE);
-
     /*
      * If not using 2M mappings to gain suitable pagetable permissions
      * directly from the relocation above, remap the code/data
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index f9faf7f38348..cb1db107c424 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -23,8 +23,6 @@ static inline unsigned long canonicalise_addr(unsigned long addr)
 
 #include <xen/pdx.h>
 
-extern unsigned long xen_virt_end;
-
 /*
  * Note: These are solely for the use by page_{get,set}_owner(), and
  *       therefore don't need to handle the XEN_VIRT_{START,END} range.
-- 
2.11.0




* [PATCH 6/8] x86/boot: Adjust .text/.rodata/etc permissions in one place
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
                   ` (4 preceding siblings ...)
  2021-11-30 10:04 ` [PATCH 5/8] x86/boot: Drop xen_virt_end Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-12-02 12:15   ` Jan Beulich
  2021-11-30 10:04 ` [PATCH 7/8] x86/boot: Support __ro_after_init Andrew Cooper
  2021-11-30 10:04 ` [PATCH RFC 8/8] x86/boot: Check that permission restrictions have taken effect Andrew Cooper
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

At the moment, we have two locations selecting restricted permissions, not
very far apart on boot, dependent on opposite answers from using_2M_mapping().
The later location, however, can shatter superpages if needed, while the former
cannot.

Collect together all the permission adjustments at the slightly later point in
boot, as we may need to shatter a superpage to support __ro_after_init.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/setup.c | 74 ++++++++++++----------------------------------------
 1 file changed, 16 insertions(+), 58 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 0e0e706404a3..8329263430ed 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1281,55 +1281,16 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                 }
             }
 
-            BUG_ON(using_2M_mapping() &&
-                   l2_table_offset((unsigned long)_erodata) ==
-                   l2_table_offset((unsigned long)_stext));
-
             /* Walk l2_xenmap[], relocating 2M superpage leaves. */
             pl2e = __va(__pa(l2_xenmap));
             for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
             {
-                unsigned int flags;
-
                 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
                      !(l2e_get_flags(*pl2e) & _PAGE_PSE) ||
                      (l2e_get_pfn(*pl2e) >= pte_update_limit) )
                     continue;
 
-                if ( !using_2M_mapping() )
-                {
-                    *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
-                                            xen_phys_start);
-                    continue;
-                }
-
-                if ( i < l2_table_offset((unsigned long)&__2M_text_end) )
-                {
-                    flags = PAGE_HYPERVISOR_RX | _PAGE_PSE;
-                }
-                else if ( i >= l2_table_offset((unsigned long)&__2M_rodata_start) &&
-                          i <  l2_table_offset((unsigned long)&__2M_rodata_end) )
-                {
-                    flags = PAGE_HYPERVISOR_RO | _PAGE_PSE;
-                }
-                else if ( i >= l2_table_offset((unsigned long)&__2M_init_start) &&
-                          i <  l2_table_offset((unsigned long)&__2M_init_end) )
-                {
-                    flags = PAGE_HYPERVISOR_RWX | _PAGE_PSE;
-                }
-                else if ( (i >= l2_table_offset((unsigned long)&__2M_rwdata_start) &&
-                           i <  l2_table_offset((unsigned long)&__2M_rwdata_end)) )
-                {
-                    flags = PAGE_HYPERVISOR_RW | _PAGE_PSE;
-                }
-                else
-                {
-                    *pl2e = l2e_empty();
-                    continue;
-                }
-
-                *pl2e = l2e_from_paddr(
-                    l2e_get_paddr(*pl2e) + xen_phys_start, flags);
+                *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) + xen_phys_start);
             }
 
             /* Re-sync the stack and then switch to relocated pagetables. */
@@ -1572,31 +1533,28 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 #endif
 
     /*
-     * If not using 2M mappings to gain suitable pagetable permissions
-     * directly from the relocation above, remap the code/data
-     * sections with decreased permissions.
+     * Restrict permissions on the Xen code/data.
+     *
+     * Mark .text as RX.
      */
-    if ( !using_2M_mapping() )
-    {
-        /* Mark .text as RX (avoiding the first 2M superpage). */
-        modify_xen_mappings(XEN_VIRT_START + MB(2),
-                            (unsigned long)&__2M_text_end,
-                            PAGE_HYPERVISOR_RX);
+    modify_xen_mappings((unsigned long)&_start,
+                        (unsigned long)&__2M_text_end,
+                        PAGE_HYPERVISOR_RX);
 
-        /* Mark .rodata as RO. */
-        modify_xen_mappings((unsigned long)&__2M_rodata_start,
-                            (unsigned long)&__2M_rodata_end,
-                            PAGE_HYPERVISOR_RO);
+    /* Mark .rodata as RO. */
+    modify_xen_mappings((unsigned long)&__2M_rodata_start,
+                        (unsigned long)&__2M_rodata_end,
+                        PAGE_HYPERVISOR_RO);
 
-        /* Mark .data and .bss as RW. */
-        modify_xen_mappings((unsigned long)&__2M_rwdata_start,
-                            (unsigned long)&__2M_rwdata_end,
-                            PAGE_HYPERVISOR_RW);
+    /* Mark .data and .bss as RW. */
+    modify_xen_mappings((unsigned long)&__2M_rwdata_start,
+                        (unsigned long)&__2M_rwdata_end,
+                        PAGE_HYPERVISOR_RW);
 
+    if ( !IS_ALIGNED((unsigned long)&__2M_rwdata_end, MB(2)) )
         /* Drop the remaining mappings in the shattered superpage. */
         destroy_xen_mappings((unsigned long)&__2M_rwdata_end,
                              ROUNDUP((unsigned long)&__2M_rwdata_end, MB(2)));
-    }
 
     nr_pages = 0;
     for ( i = 0; i < e820.nr_map; i++ )
-- 
2.11.0




* [PATCH 7/8] x86/boot: Support __ro_after_init
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
                   ` (5 preceding siblings ...)
  2021-11-30 10:04 ` [PATCH 6/8] x86/boot: Adjust .text/.rodata/etc permissions in one place Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-12-02 13:10   ` Jan Beulich
  2021-11-30 10:04 ` [PATCH RFC 8/8] x86/boot: Check that permission restrictions have taken effect Andrew Cooper
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

For security hardening reasons, it is advantageous to make setup-once data
immutable after boot.  Borrow __ro_after_init from Linux.

On x86, place .data.ro_after_init at the start of .rodata, excluding it from
the early permission restrictions.  Re-apply RO restrictions to the whole of
.rodata in init_done(), attempting to reform the superpage if possible.

For architectures which don't implement __ro_after_init explicitly, variables
merge into .data.
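
For illustration only (hypothetical variable and function names, not part of
this patch), usage is just an annotation on a setup-once variable:

    /* Computed once during boot, immutable thereafter. */
    static unsigned long __ro_after_init example_limit;

    static void __init example_setup(unsigned long val)
    {
        example_limit = val;  /* Fine: .data.ro_after_init is still RW here. */
    }

Once init_done() has re-applied PAGE_HYPERVISOR_RO across .rodata, any further
write to example_limit takes a fatal page fault rather than silently
succeeding.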

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/setup.c        | 12 +++++++++++-
 xen/arch/x86/xen.lds.S      |  6 ++++++
 xen/include/asm-x86/setup.h |  1 +
 xen/include/xen/cache.h     |  2 ++
 4 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 8329263430ed..3bbc46f244b9 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -663,6 +663,11 @@ static void noreturn init_done(void)
     init_xenheap_pages(__pa(start), __pa(end));
     printk("Freed %lukB init memory\n", (end - start) >> 10);
 
+    /* Mark .rodata/ro_after_init as RO.  Maybe reform the superpage. */
+    modify_xen_mappings((unsigned long)&__2M_rodata_start,
+                        (unsigned long)&__2M_rodata_end,
+                        PAGE_HYPERVISOR_RO);
+
     startup_cpu_idle_loop();
 }
 
@@ -1541,8 +1546,13 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         (unsigned long)&__2M_text_end,
                         PAGE_HYPERVISOR_RX);
 
+    /* Mark .data.ro_after_init as RW.  Maybe shatters the .rodata superpage. */
+    modify_xen_mappings((unsigned long)&__ro_after_init_start,
+                        (unsigned long)&__ro_after_init_end,
+                        PAGE_HYPERVISOR_RW);
+
     /* Mark .rodata as RO. */
-    modify_xen_mappings((unsigned long)&__2M_rodata_start,
+    modify_xen_mappings((unsigned long)&__ro_after_init_end,
                         (unsigned long)&__2M_rodata_end,
                         PAGE_HYPERVISOR_RO);
 
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 87e344d4dd97..4db5b404e073 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -97,6 +97,12 @@ SECTIONS
   __2M_rodata_start = .;       /* Start of 2M superpages, mapped RO. */
   DECL_SECTION(.rodata) {
        _srodata = .;
+
+       __ro_after_init_start = .;
+       *(.data.ro_after_init)
+       . = ALIGN(PAGE_SIZE);
+       __ro_after_init_end = .;
+
        /* Bug frames table */
        __start_bug_frames = .;
        *(.bug_frames.0)
diff --git a/xen/include/asm-x86/setup.h b/xen/include/asm-x86/setup.h
index eb9d7b433c13..34edea405f85 100644
--- a/xen/include/asm-x86/setup.h
+++ b/xen/include/asm-x86/setup.h
@@ -6,6 +6,7 @@
 
 extern const char __2M_text_start[], __2M_text_end[];
 extern const char __2M_rodata_start[], __2M_rodata_end[];
+extern const char __ro_after_init_start[], __ro_after_init_end[];
 extern char __2M_init_start[], __2M_init_end[];
 extern char __2M_rwdata_start[], __2M_rwdata_end[];
 
diff --git a/xen/include/xen/cache.h b/xen/include/xen/cache.h
index 6ee174efa439..f52a0aedf768 100644
--- a/xen/include/xen/cache.h
+++ b/xen/include/xen/cache.h
@@ -15,4 +15,6 @@
 #define __cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
 #endif
 
+#define __ro_after_init __section(".data.ro_after_init")
+
 #endif /* __LINUX_CACHE_H */
-- 
2.11.0




* [PATCH RFC 8/8] x86/boot: Check that permission restrictions have taken effect
  2021-11-30 10:04 [PATCH 0/8] x86: Support for __ro_after_init Andrew Cooper
                   ` (6 preceding siblings ...)
  2021-11-30 10:04 ` [PATCH 7/8] x86/boot: Support __ro_after_init Andrew Cooper
@ 2021-11-30 10:04 ` Andrew Cooper
  2021-12-02 13:33   ` Jan Beulich
  7 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 10:04 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

RFC.  I don't know if this is something we'd want to keep or not.

Getting extable handling working for test_nx_data is proving tricky, and while
I can't spot anything that should stop the extable from working with NX
faults, from a security hardening perspective, there really ought to be
something that does.

(Spurious faults aside), there are no circumstances where an NX fault is
legitimate, and restricting extable's ability to interfere with the fatality
of an NX fault provides a better security posture.
---
 xen/arch/x86/setup.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 3bbc46f244b9..7cb530a7528f 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -668,6 +668,45 @@ static void noreturn init_done(void)
                         (unsigned long)&__2M_rodata_end,
                         PAGE_HYPERVISOR_RO);
 
+    if ( IS_ENABLED(CONFIG_DEBUG) )
+    {
+        static const char test_rodata = 1;
+        static char __ro_after_init test_ro_after_init = 1;
+
+#define PROBE(insn, c, p)                       \
+    ({                                          \
+        bool fault = 0;                         \
+        asm ( "1:" insn "[ptr]\n\t"             \
+              "2:\n\t"                          \
+              ".section .fixup,\"ax\"\n\t"      \
+              "3: movb $1, %[fault]\n\t"        \
+              "jmp 2b\n\t"                      \
+              ".previous"                       \
+              _ASM_EXTABLE(1b, 3b)              \
+              : [fault] "+r" (fault)            \
+              : [ptr] c (p)                     \
+            );                                  \
+        fault;                                  \
+    })
+
+        if ( !PROBE("notb %", "m", test_rodata) )
+            panic("No fault from test_rodata\n");
+
+        if ( !PROBE("notb %", "m", test_ro_after_init) )
+            panic("No fault from test_ro_after_init\n");
+
+        if ( !PROBE("notb %", "m", init_done) )
+            panic("No fault from modifying init_done\n");
+
+        if ( 0 /* RFC */ && cpu_has_nx )
+        {
+            static char test_nx_data[1] = { 0xc3 };
+
+            if ( !PROBE("call %c", "i", test_nx_data) )
+                panic("No fault from test_nx_data\n");
+        }
+    }
+
     startup_cpu_idle_loop();
 }
 
-- 
2.11.0




* Re: [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0]
  2021-11-30 10:04 ` [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0] Andrew Cooper
@ 2021-11-30 10:33   ` Jan Beulich
  2021-11-30 11:14     ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2021-11-30 10:33 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> It has been 4 years since the default load address changed from 1M to 2M, and
> _stext ceased residing in l2_xenmap[0].  We should not be inserting an unused
> mapping.
> 
> To ensure we don't create mappings accidentally, loop from 0 and obey
> _PAGE_PRESENT on all entries.
> 
> Fixes: 7ed93f3a0dff ("x86: change default load address from 1 MiB to 2 MiB")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

I guess this may be worth backporting despite not having any immediate
adverse effect.

Jan




* Re: [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0]
  2021-11-30 10:33   ` Jan Beulich
@ 2021-11-30 11:14     ` Andrew Cooper
  2021-11-30 11:22       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 11:14 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30/11/2021 10:33, Jan Beulich wrote:
> On 30.11.2021 11:04, Andrew Cooper wrote:
>> It has been 4 years since the default load address changed from 1M to 2M, and
>> _stext ceased residing in l2_xenmap[0].  We should not be inserting an unused
>> mapping.
>>
>> To ensure we don't create mappings accidentally, loop from 0 and obey
>> _PAGE_PRESENT on all entries.
>>
>> Fixes: 7ed93f3a0dff ("x86: change default load address from 1 MiB to 2 MiB")
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> I guess this may be worth backporting despite not having any immediate
> adverse effect.

I'd say so, yes.  I too can't see an adverse effect right now, but I'm
definitely wary of stray executable mappings lying around.


In principle, it would be nice to reclaim the 2M of space (which only
exists for the MB1 path IIRC), but then we're getting into a position
where xen_phys_start isn't really that any more.

A different alternative could be to use it for early memory allocations
in Xen and treat it like .data, similar to Linux's early BRK().  This
might simplify some of the relocation hoops we currently jump through.
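
As a minimal sketch of what that alternative might look like (hypothetical
names; bounds left to be initialised from xen_phys_start early in
__start_xen()):

    /* Bump allocator over the reclaimed hole below _start. */
    static unsigned long __initdata ebrk_cur, ebrk_end;

    void *__init early_brk_alloc(unsigned long size, unsigned long align)
    {
        unsigned long p = ROUNDUP(ebrk_cur, align);

        if ( p + size > ebrk_end )
            return NULL;

        ebrk_cur = p + size;

        return __va(p);
    }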

~Andrew



* Re: [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0]
  2021-11-30 11:14     ` Andrew Cooper
@ 2021-11-30 11:22       ` Jan Beulich
  2021-11-30 12:39         ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2021-11-30 11:22 UTC (permalink / raw)
  To: Andrew Cooper, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 12:14, Andrew Cooper wrote:
> On 30/11/2021 10:33, Jan Beulich wrote:
>> On 30.11.2021 11:04, Andrew Cooper wrote:
>>> It has been 4 years since the default load address changed from 1M to 2M, and
>>> _stext ceased residing in l2_xenmap[0].  We should not be inserting an unused
>>> mapping.
>>>
>>> To ensure we don't create mappings accidentally, loop from 0 and obey
>>> _PAGE_PRESENT on all entries.
>>>
>>> Fixes: 7ed93f3a0dff ("x86: change default load address from 1 MiB to 2 MiB")
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>
>> I guess this may be worth backporting despite not having any immediate
>> adverse effect.
> 
> I'd say so, yes.  I too can't see an adverse effect right now, but I'm
> definitely wary of stray executable mappings lying around.
> 
> 
> In principle, it would be nice to reclaim the 2M of space (which only
> exists for the MB1 path IIRC), but then we're getting into a position
> where xen_phys_start isn't really that any more.

Well, xen_phys_base might be slightly more accurate, but apart from that
I do think that we reclaim that space (as much as we did reclaim the 1Mb
before the change of the default load address):

    if ( efi_boot_mem_unused(&eb_start, &eb_end) )
    {
        reserve_e820_ram(&boot_e820, __pa(_stext), __pa(eb_start));
        reserve_e820_ram(&boot_e820, __pa(eb_end), __pa(__2M_rwdata_end));
    }
    else
        reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));

Jan




* Re: [PATCH 1/8] x86/boot: Drop incorrect mapping at l2_xenmap[0]
  2021-11-30 11:22       ` Jan Beulich
@ 2021-11-30 12:39         ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 12:39 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30/11/2021 11:22, Jan Beulich wrote:
> On 30.11.2021 12:14, Andrew Cooper wrote:
>> On 30/11/2021 10:33, Jan Beulich wrote:
>>> On 30.11.2021 11:04, Andrew Cooper wrote:
>>>> It has been 4 years since the default load address changed from 1M to 2M, and
>>>> _stext ceased residing in l2_xenmap[0].  We should not be inserting an unused
>>>> mapping.
>>>>
>>>> To ensure we don't create mappings accidentally, loop from 0 and obey
>>>> _PAGE_PRESENT on all entries.
>>>>
>>>> Fixes: 7ed93f3a0dff ("x86: change default load address from 1 MiB to 2 MiB")
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> I guess this may be worth backporting despite not having any immediate
>>> adverse effect.
>> I'd say so, yes.  I too can't see an adverse effect right now, but I'm
>> definitely wary of stray executable mappings lying around.
>>
>>
>> In principle, it would be nice to reclaim the 2M of space (which only
>> exists for the MB1 path IIRC), but then we're getting into a position
>> where xen_phys_start isn't really that any more.
> Well, xen_phys_base might be slightly more accurate, but apart from that
> I do think that we reclaim that space (as much as we did reclaim the 1Mb
> before the change of the default load address):
>
>     if ( efi_boot_mem_unused(&eb_start, &eb_end) )
>     {
>         reserve_e820_ram(&boot_e820, __pa(_stext), __pa(eb_start));
>         reserve_e820_ram(&boot_e820, __pa(eb_end), __pa(__2M_rwdata_end));
>     }
>     else
>         reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));

That means there are zero safety barriers between a bad function pointer
and executing arbitrary guest memory, doesn't it...

My "adverse effect" comment was under the impression that we just left
the range unused.

~Andrew



* Re: [PATCH 4/8] x86/mm: Drop bogus cacheability logic in update_xen_mappings()
  2021-11-30 10:04 ` [PATCH 4/8] x86/mm: Drop bogus cacheability logic in update_xen_mappings() Andrew Cooper
@ 2021-11-30 13:11   ` Andrew Cooper
  2021-11-30 14:56     ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 13:11 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 30/11/2021 10:04, Andrew Cooper wrote:
> There is no circumstance under which any part of the Xen image in memory wants
> to have any cacheability other than Write Back.
>
> Furthermore, unmapping random pieces of Xen like that comes with a non-trivial
> risk of a Triple Fault, or other fatal condition.  Even if it appears to work,
> an NMI or interrupt has a far wider reach into Xen mappings than calling
> map_pages_to_xen() thrice.
>
> Simplify the safety checking to a BUG_ON().  It is substantially more correct
> and robust than following either of the unlikely(alias) paths.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
>
> I'm half tempted to drop the check entirely, but in that case it would be
> better to inline the result into the two callers.
> ---
>  xen/arch/x86/mm.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 4d799032dc82..9bd4e5cc1d2f 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -785,24 +785,21 @@ bool is_iomem_page(mfn_t mfn)
>  
>  static int update_xen_mappings(unsigned long mfn, unsigned int cacheattr)
>  {
> -    int err = 0;
>      bool alias = mfn >= PFN_DOWN(xen_phys_start) &&
>           mfn < PFN_UP(xen_phys_start + xen_virt_end - XEN_VIRT_START);
> -    unsigned long xen_va =
> -        XEN_VIRT_START + ((mfn - PFN_DOWN(xen_phys_start)) << PAGE_SHIFT);
> +
> +    /*
> +     * Something is catastrophically broken if someone is trying to change the
> +     * cacheability of Xen in memory...
> +     */
> +    BUG_ON(alias);
>  
>      if ( boot_cpu_has(X86_FEATURE_XEN_SELFSNOOP) )
>          return 0;
>  
> -    if ( unlikely(alias) && cacheattr )
> -        err = map_pages_to_xen(xen_va, _mfn(mfn), 1, 0);
> -    if ( !err )
> -        err = map_pages_to_xen((unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
> -                     PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));
> -    if ( unlikely(alias) && !cacheattr && !err )
> -        err = map_pages_to_xen(xen_va, _mfn(mfn), 1, PAGE_HYPERVISOR);
> -
> -    return err;
> +    return map_pages_to_xen(
> +        (unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
> +        PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));

In light of the discussion on patch 1, this is no longer safe.  The
alias calculation includes 2M of arbitrary guest memory, and in
principle there are legitimate reasons for a guest to want to map RAM as
WC (e.g. GPU pagetables) or in the future, WP (in place RAM
encryption/decryption).

The gross hack fix would be "mfn >= PFN_DOWN(xen_phys_start + MB(2))",
but this is screaming for a helper.  xen_in_range() is part-way
there, but is an O(n) loop over the regions.

~Andrew



* Re: [PATCH 4/8] x86/mm: Drop bogus cacheability logic in update_xen_mappings()
  2021-11-30 13:11   ` Andrew Cooper
@ 2021-11-30 14:56     ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2021-11-30 14:56 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich, Roger Pau Monné, Wei Liu

On 30/11/2021 13:11, Andrew Cooper wrote:
> On 30/11/2021 10:04, Andrew Cooper wrote:
>> There is no circumstance under which any part of the Xen image in memory wants
>> to have any cacheability other than Write Back.
>>
>> Furthermore, unmapping random pieces of Xen like that comes with a non-trivial
>> risk of a Triple Fault, or other fatal condition.  Even if it appears to work,
>> an NMI or interrupt has a far wider reach into Xen mappings than calling
>> map_pages_to_xen() thrice.
>>
>> Simplify the safety checking to a BUG_ON().  It is substantially more correct
>> and robust than following either of the unlikely(alias) paths.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>>
>> I'm half tempted to drop the check entirely, but in that case it would be
>> better to inline the result into the two callers.
>> ---
>>  xen/arch/x86/mm.c | 21 +++++++++------------
>>  1 file changed, 9 insertions(+), 12 deletions(-)
>>
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index 4d799032dc82..9bd4e5cc1d2f 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -785,24 +785,21 @@ bool is_iomem_page(mfn_t mfn)
>>  
>>  static int update_xen_mappings(unsigned long mfn, unsigned int cacheattr)
>>  {
>> -    int err = 0;
>>      bool alias = mfn >= PFN_DOWN(xen_phys_start) &&
>>           mfn < PFN_UP(xen_phys_start + xen_virt_end - XEN_VIRT_START);
>> -    unsigned long xen_va =
>> -        XEN_VIRT_START + ((mfn - PFN_DOWN(xen_phys_start)) << PAGE_SHIFT);
>> +
>> +    /*
>> +     * Something is catastrophically broken if someone is trying to change the
>> +     * cacheability of Xen in memory...
>> +     */
>> +    BUG_ON(alias);
>>  
>>      if ( boot_cpu_has(X86_FEATURE_XEN_SELFSNOOP) )
>>          return 0;
>>  
>> -    if ( unlikely(alias) && cacheattr )
>> -        err = map_pages_to_xen(xen_va, _mfn(mfn), 1, 0);
>> -    if ( !err )
>> -        err = map_pages_to_xen((unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
>> -                     PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));
>> -    if ( unlikely(alias) && !cacheattr && !err )
>> -        err = map_pages_to_xen(xen_va, _mfn(mfn), 1, PAGE_HYPERVISOR);
>> -
>> -    return err;
>> +    return map_pages_to_xen(
>> +        (unsigned long)mfn_to_virt(mfn), _mfn(mfn), 1,
>> +        PAGE_HYPERVISOR | cacheattr_to_pte_flags(cacheattr));
> In light of the discussion on patch 1, this is no longer safe.  The
> alias calculation includes 2M of arbitrary guest memory, and in
> principle there are legitimate reasons for a guest to want to map RAM as
> WC (e.g. GPU pagetables) or in the future, WP (in place RAM
> encryption/decryption).
>
> The gross hack fix would be "mfn >= PFN_DOWN(xen_phys_start + MB(2))",
> but this is screaming for a helper.  xen_in_range() is part-way
> there, but is an O(n) loop over the regions.

Further problems.  There is also the init section in the middle of Xen
which contains arbitrary guest memory.  The old algorithm was worse for
that, because it would reinstate the nuked init mappings.
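
A minimal sketch of the kind of helper this wants (hypothetical name, not an
existing function; it assumes the low 2M below _start and the freed init
section are the only guest-owned holes within the image's physical
footprint):

    static bool mfn_aliases_xen_image(unsigned long mfn)
    {
        /* Below _start (the 2M hole), or beyond the end of .bss? */
        if ( mfn <  PFN_DOWN(xen_phys_start + MB(2)) ||
             mfn >= PFN_UP(xen_phys_start +
                           (unsigned long)__2M_rwdata_end - XEN_VIRT_START) )
            return false;

        /* The init section is handed back to the heap after boot. */
        if ( mfn >= PFN_DOWN(xen_phys_start +
                             (unsigned long)__2M_init_start - XEN_VIRT_START) &&
             mfn <  PFN_UP(xen_phys_start +
                           (unsigned long)__2M_init_end - XEN_VIRT_START) )
            return false;

        return true;
    }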

~Andrew



* Re: [PATCH 2/8] x86/boot: Better describe the pagetable relocation loops
  2021-11-30 10:04 ` [PATCH 2/8] x86/boot: Better describe the pagetable relocation loops Andrew Cooper
@ 2021-12-02 11:43   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2021-12-02 11:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> The first loop relocates all reachable non-leaf entries in idle_pg_table[],
> which includes l2_xenmap[511]'s reference to l1_fixmap_x[].
> 
> The second loop relocates only leaf mappings in l2_xenmap[], so update the
> skip condition to be opposite to the first loop.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH 3/8] x86/boot: Fix data placement around __high_start()
  2021-11-30 10:04 ` [PATCH 3/8] x86/boot: Fix data placement around __high_start() Andrew Cooper
@ 2021-12-02 11:49   ` Jan Beulich
  2021-12-02 14:06     ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2021-12-02 11:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> multiboot_ptr should be in __initdata - it is only used on the BSP path.
> Furthermore, the .align 8 then .long means that stack_start is misaligned.
> 
> Move both into setup.c, which lets the compiler handle the details correctly,
> as well as providing proper debug information for them.
> 
> Declare stack_start in setup.h and avoid extern-ing it locally in smpboot.c.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Nevertheless I'd like to state that defining a variable in C when all
its uses are in assembly seems a little odd to me.

Jan




* Re: [PATCH 5/8] x86/boot: Drop xen_virt_end
  2021-11-30 10:04 ` [PATCH 5/8] x86/boot: Drop xen_virt_end Andrew Cooper
@ 2021-12-02 11:56   ` Jan Beulich
  2021-12-02 14:07     ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2021-12-02 11:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> The calculation in __start_xen() for xen_virt_end is an opencoding of
> ROUNDUP(_end, 2M).  This is __2M_rwdata_end as provided by the linker script.
> 
> This corrects the bound calculations in arch_livepatch_init() and
> update_xen_mappings() to not enforce 2M alignment when Xen is not compiled
> with CONFIG_XEN_ALIGN_2M
> 
> Furthermore, since 52975142d154 ("x86/boot: Create the l2_xenmap[] mappings
> dynamically"), there have not been extraneous mappings to delete, meaning that
> the call to destroy_xen_mappings() has been a no-op.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

While there's a contextual conflict with patch 4, my understanding is
that this one is independent of that. On that basis
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH 6/8] x86/boot: Adjust .text/.rodata/etc permissions in one place
  2021-11-30 10:04 ` [PATCH 6/8] x86/boot: Adjust .text/.rodata/etc permissions in one place Andrew Cooper
@ 2021-12-02 12:15   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2021-12-02 12:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> At the moment, we have two locations selecting restricted permissions, not
> very far apart on boot, dependent on opposite answers from using_2M_mapping().
> The later location however can shatter superpages if needed, while the former
> cannot.
> 
> Collect together all the permission adjustments at the slightly later point in
> boot, as we may need to shatter a superpage to support __ro_after_init.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Again with the understanding that this doesn't really depend on patch 4:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH 7/8] x86/boot: Support __ro_after_init
  2021-11-30 10:04 ` [PATCH 7/8] x86/boot: Support __ro_after_init Andrew Cooper
@ 2021-12-02 13:10   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2021-12-02 13:10 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> For security hardening reasons, it is advantageous to make setup-once data
> immutable after boot.  Borrow __ro_after_init from Linux.
> 
> On x86, place .data.ro_after_init at the start of .rodata, excluding it from
> the early permission restrictions.  Re-apply RO restrictions to the whole of
> .rodata in init_done(), attempting to reform the superpage if possible.
> 
> For architectures which don't implement __ro_after_init explicitly, variables
> merge into .data.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH RFC 8/8] x86/boot: Check that permission restrictions have taken effect
  2021-11-30 10:04 ` [PATCH RFC 8/8] x86/boot: Check that permission restrictions have taken effect Andrew Cooper
@ 2021-12-02 13:33   ` Jan Beulich
  2021-12-06 18:12     ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2021-12-02 13:33 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 30.11.2021 11:04, Andrew Cooper wrote:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> 
> RFC.  I don't know if this is something we'd want to keep or not.
> 
> Getting extable handling working for test_nx_data is proving tricky, and while
> I can't spot anything that should stop the extable from working with NX
> faults, from a security hardening perspective, there really ought to
> be.
> 
> (Spurious faults aside), there are no circumstances where an NX fault is
> legitimate, and restricting extable's ability to interfere with the fatality
> of an NX fault provides a better security posture.

Gating the extable_fixup() invocation accordingly should be possible.
A respective check could live even earlier, but the window between
the !guest_mode() check and the function's invocation isn't very large
anyway.

We can't have both testability and such faults being uniformly fatal,
but since otoh we use pre_extable quite sparingly, how about forcing
the fixup to take that path by disabling interrupts around the test?

In any event this touches the insufficient selectiveness of the fixup
machinery again: Any kind of fault will be recovered from whenever a
fixup record is attached to an insn.

Jan




* Re: [PATCH 3/8] x86/boot: Fix data placement around __high_start()
  2021-12-02 11:49   ` Jan Beulich
@ 2021-12-02 14:06     ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2021-12-02 14:06 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 02/12/2021 11:49, Jan Beulich wrote:
> On 30.11.2021 11:04, Andrew Cooper wrote:
>> multiboot_ptr should be in __initdata - it is only used on the BSP path.
>> Furthermore, the .align 8 then .long means that stack_start is misaligned.
>>
>> Move both into setup.c, which lets the compiler handle the details correctly,
>> as well as providling proper debug information for them.
>>
>> Declare stack_start in setup.h and avoid extern-ing it locally in smpboot.c.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> Nevertheless I'd like to state that defining a variable in C when all
> its uses are in assembly seems a little odd to me.

I don't see it as odd, although I admit that I did try to see if I could
remove multiboot_ptr entirely first.  Xen is after all a single
freestanding binary.

Having the debug information (well - at a minimum, ELF size info) is
important for livepatch binary diffing, and nothing in asm by default
gets any of that.

Letting the compiler do this all for us is physically shorter, and less
prone to errors.

~Andrew



* Re: [PATCH 5/8] x86/boot: Drop xen_virt_end
  2021-12-02 11:56   ` Jan Beulich
@ 2021-12-02 14:07     ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2021-12-02 14:07 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 02/12/2021 11:56, Jan Beulich wrote:
> On 30.11.2021 11:04, Andrew Cooper wrote:
>> The calculation in __start_xen() for xen_virt_end is an opencoding of
>> ROUNDUP(_end, 2M).  This is __2M_rwdata_end as provided by the linker script.
>>
>> This corrects the bound calculations in arch_livepatch_init() and
>> update_xen_mappings() to not enforce 2M alignment when Xen is not compiled
>> with CONFIG_XEN_ALIGN_2M
>>
>> Furthermore, since 52975142d154 ("x86/boot: Create the l2_xenmap[] mappings
>> dynamically"), there have not been extraneous mappings to delete, meaning that
>> the call to destroy_xen_mappings() has been a no-op.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> While there's a contextual conflict with patch 4, my understanding is
> that this one is independent of that.

Correct.  I was in the middle of writing this patch when I discovered
the disaster which is the logic in patch 4.

>  On that basis
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

~Andrew



* Re: [PATCH RFC 8/8] x86/boot: Check that permission restrictions have taken effect
  2021-12-02 13:33   ` Jan Beulich
@ 2021-12-06 18:12     ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2021-12-06 18:12 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 02/12/2021 13:33, Jan Beulich wrote:
> On 30.11.2021 11:04, Andrew Cooper wrote:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>>
>> RFC.  I don't know if this is something we'd want to keep or not.
>>
>> Getting extable handling working for test_nx_data is proving tricky, and while
>> I can't spot anything that should stop the extable from working with NX
>> faults, from a security hardening perspective, there really ought to
>> be.
>>
>> (Spurious faults aside), there are no circumstances where an NX fault is
>> legitimate, and restricting extable's ability to interfere with the fatality
>> of an NX fault provides a better security posture.
> Gating the extable_fixup() invocation accordingly should be possible.
> A respective check could live even earlier, but the window between
> the !guest_mode() check and the function's invocation isn't very large
> anyway.
>
> Since we can't have both testability and such faults being uniformly
> fatal, but since otoh we use pre_extable quite sparingly, how about
> forcing the fixup to take that path by disabling interrupts around
> the test?

That feels like an abuse of an unrelated mechanism, not to mention fragile.

> In any event this touches the insufficient selectiveness of the fixup
> machinery again: Any kind of fault will be recovered from whenever a
> fixup record is attached to an insn.

There are multiple things here.  Yes, I agree that fixing up all faults
is suboptimal.

But even within #PF alone, things such as SMAP/SMEP faults are
programmer error with no hope of extable being able to adequately
resolve them, and should be fatal.

I actually like the approach that Linux has recently taken, by
describing a fixup type rather than arbitrary logic, in an attempt to
keep the number of special cases from getting out of hand.  This
approach is quite easy to filter into specific exceptions.
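
Roughly (hypothetical layout and names, not the actual Linux or Xen
definitions), each extable record would carry a type alongside the addresses:

    enum ex_type {
        EX_TYPE_DEFAULT,   /* resume at the continuation address */
        EX_TYPE_UACCESS,   /* only expected to fault on guest/user accesses */
    };

    struct exception_table_entry {
        int32_t addr;      /* faulting address, relative */
        int32_t cont;      /* continuation address, relative */
        uint8_t type;      /* enum ex_type */
    };

The #PF handler could then decline to apply any fixup at all for fault
classes which are always programmer error (NX, SMAP/SMEP), regardless of
whether a record exists for the faulting instruction.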

~Andrew

