* [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on
@ 2022-12-12  9:55 Julien Grall
  2022-12-12  9:55 ` [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
                   ` (18 more replies)
  0 siblings, 19 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Hi all,

Currently, Xen on Arm will switch TTBR whilst the MMU is on. This is
similar to replacing existing mappings with new ones. So we need to
follow a break-before-make sequence.

When switching the TTBR, we need to temporarily disable the MMU
before updating the TTBR. This means the page-tables must contain an
identity mapping.
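
As a rough arm64-flavoured sketch (illustration only, not the actual
implementation; the real logic is the reworked switch_ttbr() at the end of
this series), assuming x0 holds the new TTBR value and x2 the runtime
virtual address to continue at:

    mrs   x1, SCTLR_EL2
    and   x1, x1, #~1       /* clear SCTLR_EL2.M: MMU off (must already run from the ID map) */
    msr   SCTLR_EL2, x1
    isb
    msr   TTBR0_EL2, x0     /* install the new root page-table */
    isb
    tlbi  alle2             /* discard stale entries from the old tables */
    dsb   nsh
    orr   x1, x1, #1        /* set SCTLR_EL2.M: MMU back on */
    msr   SCTLR_EL2, x1
    isb
    br    x2                /* continue at the runtime virtual address */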

The current memory layout is not very flexible and has a higher chance
of clashing with the identity mapping.

On Arm64, we have plenty of unused virtual address space. Therefore, we
can simply reshuffle the layout to leave the first part of the virtual
address space empty.

On Arm32, the virtual address space is already quite full. Even if we
find space, it would be necessary to have a dynamic layout. So a
different approach will be necessary. The chosen one is to have
a temporary mapping that will be used to jump from the ID mapping
to the runtime mapping (or vice versa). The temporary mapping will
be overlapping with the domheap area as it should not be used when
switching on/off the MMU.

The Arm32 part is not yet addressed and will be handled in a follow-up
series.

After this series, most of the Xen page-table code should be compliant
with the Arm Arm. The last two issues I am aware of are:
 - domheap: Mappings are replaced without using the Break-Before-Make
   approach.
 - The cache is not cleaned/invalidated when updating the page-tables
   with Data cache off (like during early boot).

The long term plan is to get rid of boot_* page tables and then
directly use the runtime pages. This means for coloring, we will
need to build the pages in the relocated Xen rather than the current
Xen.

For convenience, I pushed a branch with everything applied:

https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
branch boot-pt-rework-v3

Cheers,

Julien Grall (18):
  xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
  xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB
    flush by VA
  xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush
  xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush
  xen/arm: Clean-up the memory layout
  xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>"
  xen/arm32: head: Jump to the runtime mapping in enable_mmu()
  xen/arm32: head: Introduce an helper to flush the TLBs
  xen/arm32: head: Remove restriction where to load Xen
  xen/arm32: head: Widen the use of the temporary mapping
  xen/arm: Enable use of dump_pt_walk() early during boot
  xen/arm64: Rework the memory layout
  xen/arm: mm: Allow xen_pt_update() to work with the current root table
  xen/arm: mm: Allow dump_hyp_walk() to work on the current root table
  xen/arm64: mm: Introduce helpers to prepare/enable/disable the
    identity mapping
  xen/arm: linker: Indent correctly _stext
  xen/arm: linker: The identitymap check should cover the whole
    .text.header
  xen/arm64: mm: Rework switch_ttbr()

 xen/arch/arm/arm32/head.S                 | 285 ++++++++++++++--------
 xen/arch/arm/arm64/Makefile               |   1 +
 xen/arch/arm/arm64/head.S                 |  55 +++--
 xen/arch/arm/arm64/mm.c                   | 160 ++++++++++++
 xen/arch/arm/domain_page.c                |   1 +
 xen/arch/arm/include/asm/arm32/flushtlb.h |  27 +-
 xen/arch/arm/include/asm/arm32/mm.h       |   4 +
 xen/arch/arm/include/asm/arm64/flushtlb.h |  58 +++--
 xen/arch/arm/include/asm/arm64/mm.h       |  12 +
 xen/arch/arm/include/asm/config.h         |  71 ++++--
 xen/arch/arm/include/asm/flushtlb.h       |  10 +-
 xen/arch/arm/include/asm/mm.h             |   2 +
 xen/arch/arm/include/asm/setup.h          |  11 +
 xen/arch/arm/mm.c                         | 105 ++++----
 xen/arch/arm/xen.lds.S                    |  12 +-
 15 files changed, 583 insertions(+), 231 deletions(-)
 create mode 100644 xen/arch/arm/arm64/mm.c

-- 
2.38.1



* [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  9:11   ` Michal Orzel
  2023-01-23 16:19   ` Ayan Kumar Halder
  2022-12-12  9:55 ` [PATCH v3 02/18] xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB flush by VA Julien Grall
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Per D5-4929 in ARM DDI 0487H.a:
"A DSB NSH is sufficient to ensure completion of TLB maintenance
 instructions that apply to a single PE. A DSB ISH is sufficient to
 ensure completion of TLB maintenance instructions that apply to PEs
 in the same Inner Shareable domain.
"

This means the barrier after local TLB flushes can be reduced to
non-shareable.
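
For example, with this change flush_xen_tlb_local() (generated by the
TLB_HELPER() macro below) roughly expands to:

    dsb  nshst    /* ensure prior page-table writes are observable by this PE */
    tlbi alle2
    /* with ARM64_WORKAROUND_REPEAT_TLBI, an extra DSB + TLBI is patched in here */
    dsb  nsh      /* wait for the TLB invalidation to complete */
    isb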

Note that the scope of the barrier in the workaround has not been
changed because Linux v6.1-rc8 is also using 'ish' and I couldn't
find anything in the Neoverse N1 suggesting that a 'nsh' would
be sufficient.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    I have used an older version of the Arm Arm because the explanation
    in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph
    about DSB in D8.13.8 is missing the shareability. But this is implied
    in B2.3.11:

    "If the required access types of the DSB is reads and writes, the
     following instructions issued by PEe before the DSB are complete for
     the required shareability domain:

     [...]

     — All TLB maintenance instructions.
    "

    Changes in v3:
        - Patch added
---
 xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++---------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 7c5431518741..39d429ace552 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,8 +12,9 @@
  * ARM64_WORKAROUND_REPEAT_TLBI:
  * Modification of the translation table for a virtual address might lead to
  * read-after-read ordering violation.
- * The workaround repeats TLBI+DSB operation for all the TLB flush operations.
- * While this is stricly not necessary, we don't want to take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
+ * operations. While this is strictly not necessary, we don't want to
+ * take any risk.
  *
  * For Xen page-tables the ISB will discard any instructions fetched
  * from the old mappings.
@@ -21,38 +22,42 @@
  * For the Stage-2 page-tables the ISB ensures the completion of the DSB
  * (and therefore the TLB invalidation) before continuing. So we know
  * the TLBs cannot contain an entry for a mapping we may have removed.
+ *
+ * Note that for local TLB flush, using non-shareable (nsh) is sufficient
+ * (see D5-4929 in ARM DDI 0487H.a). However, the memory barrier
+ * for the workaround is left as inner-shareable to match with Linux.
  */
-#define TLB_HELPER(name, tlbop)                  \
+#define TLB_HELPER(name, tlbop, sh)              \
 static inline void name(void)                    \
 {                                                \
     asm volatile(                                \
-        "dsb  ishst;"                            \
+        "dsb  "  # sh  "st;"                     \
         "tlbi "  # tlbop  ";"                    \
         ALTERNATIVE(                             \
             "nop; nop;",                         \
-            "dsb  ish;"                          \
+            "dsb  "  # sh  ";"                   \
             "tlbi "  # tlbop  ";",               \
             ARM64_WORKAROUND_REPEAT_TLBI,        \
             CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
-        "dsb  ish;"                              \
+        "dsb  "  # sh  ";"                       \
         "isb;"                                   \
         : : : "memory");                         \
 }
 
 /* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1);
+TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh);
 
 /* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is);
+TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish);
 
 /* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1);
+TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh);
 
 /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is);
+TLB_HELPER(flush_all_guests_tlb, alle1is, ish);
 
 /* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2);
+TLB_HELPER(flush_xen_tlb_local, alle2, nsh);
 
 /* Flush TLB of local processor for address va. */
 static inline void  __flush_xen_tlb_one_local(vaddr_t va)
-- 
2.38.1



* [PATCH v3 02/18] xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB flush by VA
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
  2022-12-12  9:55 ` [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  9:23   ` Michal Orzel
  2022-12-12  9:55 ` [PATCH v3 03/18] xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Looking at the Neoverse N1 errata document, it is not clear to me
why the TLBI repeat workaround is not applied for TLB flush by VA.

The TLB flush by VA helpers are used in flush_xen_tlb_range_va_local()
and flush_xen_tlb_range_va(). So if the range size is a fixed size smaller
than a PAGE_SIZE, the compiler may remove the loop and therefore
replicate the sequence described in erratum 1286807.

So the TLBI repeat workaround should also be applied for the TLB flush
by VA helpers.
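
Concretely, with this patch __flush_xen_tlb_one_local(va) roughly expands
to (x0 holding va >> PAGE_SHIFT):

    tlbi vae2, x0
    /* with ARM64_WORKAROUND_REPEAT_TLBI: */
    dsb  ish
    tlbi vae2, x0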

Fixes: 22e323d115d8 ("xen/arm: Add workaround for Cortex-A76/Neoverse-N1 erratum #1286807")
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    This was spotted while looking at reducing the scope of the memory
    barriers. I don't have any HW affected.

    Changes in v3:
        - Patch added
---
 xen/arch/arm/include/asm/arm64/flushtlb.h | 31 +++++++++++++++++------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 39d429ace552..5b033c0cb980 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -44,6 +44,27 @@ static inline void name(void)                    \
         : : : "memory");                         \
 }
 
+/*
+ * Flush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible for using the appropriate memory barriers before/after
+ * the sequence.
+ *
+ * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
+ */
+#define TLB_HELPER_VA(name, tlbop)               \
+static inline void name(vaddr_t va)              \
+{                                                \
+    asm volatile(                                \
+        "tlbi "  # tlbop  ", %0;"                \
+        ALTERNATIVE(                             \
+            "nop; nop;",                         \
+            "dsb  ish;"                          \
+            "tlbi "  # tlbop  ", %0;",           \
+            ARM64_WORKAROUND_REPEAT_TLBI,        \
+            CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+        : : "r" (va >> PAGE_SHIFT) : "memory");  \
+}
+
 /* Flush local TLBs, current VMID only. */
 TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh);
 
@@ -60,16 +81,10 @@ TLB_HELPER(flush_all_guests_tlb, alle1is, ish);
 TLB_HELPER(flush_xen_tlb_local, alle2, nsh);
 
 /* Flush TLB of local processor for address va. */
-static inline void  __flush_xen_tlb_one_local(vaddr_t va)
-{
-    asm volatile("tlbi vae2, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
-}
+TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2);
 
 /* Flush TLB of all processors in the inner-shareable domain for address va. */
-static inline void __flush_xen_tlb_one(vaddr_t va)
-{
-    asm volatile("tlbi vae2is, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
-}
+TLB_HELPER_VA(__flush_xen_tlb_one, vae2is);
 
 #endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
 /*
-- 
2.38.1



* [PATCH v3 03/18] xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
  2022-12-12  9:55 ` [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
  2022-12-12  9:55 ` [PATCH v3 02/18] xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB flush by VA Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13 10:48   ` Michal Orzel
  2022-12-12  9:55 ` [PATCH v3 04/18] xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush Julien Grall
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Per G5-9224 in ARM DDI 0487I.a:

"A DSB NSH is sufficient to ensure completion of TLB maintenance
 instructions that apply to a single PE. A DSB ISH is sufficient to
 ensure completion of TLB maintenance instructions that apply to PEs
 in the same Inner Shareable domain.
"

This is quoting the Armv8 specification because I couldn't find an
explicit statement in the Armv7 specification. Instead, I could find
bits in various places that confirm the same implementation.

Furthermore, Linux has been using 'nsh' since 2013 (62cbbc42e001
"ARM: tlb: reduce scope of barrier domains for TLB invalidation").

This means the barrier after local TLB flushes can be reduced to
non-shareable.
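
For example, flush_xen_tlb_local() then expands to the following sequence
(written with the CP32() accessor used elsewhere in the arm32 code):

    dsb   nshst                  @ ensure prior page-table writes have completed on this PE
    mcr   CP32(r0, TLBIALLH)     @ invalidate all hypervisor TLB entries on this PE
    dsb   nsh                    @ wait for the TLB invalidation to complete
    isb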

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    Changes in v3:
        - Patch added
---
 xen/arch/arm/include/asm/arm32/flushtlb.h | 27 +++++++++++++----------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 9085e6501153..7ae6a12f8155 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -15,30 +15,33 @@
  * For the Stage-2 page-tables the ISB ensures the completion of the DSB
  * (and therefore the TLB invalidation) before continuing. So we know
  * the TLBs cannot contain an entry for a mapping we may have removed.
+ *
+ * Note that for local TLB flush, using non-shareable (nsh) is sufficient
+ * (see G5-9224 in ARM DDI 0487I.a).
  */
-#define TLB_HELPER(name, tlbop) \
-static inline void name(void)   \
-{                               \
-    dsb(ishst);                 \
-    WRITE_CP32(0, tlbop);       \
-    dsb(ish);                   \
-    isb();                      \
+#define TLB_HELPER(name, tlbop, sh) \
+static inline void name(void)       \
+{                                   \
+    dsb(sh ## st);                  \
+    WRITE_CP32(0, tlbop);           \
+    dsb(sh);                        \
+    isb();                          \
 }
 
 /* Flush local TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb_local, TLBIALL);
+TLB_HELPER(flush_guest_tlb_local, TLBIALL, nsh);
 
 /* Flush inner shareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, TLBIALLIS);
+TLB_HELPER(flush_guest_tlb, TLBIALLIS, ish);
 
 /* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, TLBIALLNSNH);
+TLB_HELPER(flush_all_guests_tlb_local, TLBIALLNSNH, nsh);
 
 /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, TLBIALLNSNHIS);
+TLB_HELPER(flush_all_guests_tlb, TLBIALLNSNHIS, ish);
 
 /* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, TLBIALLH);
+TLB_HELPER(flush_xen_tlb_local, TLBIALLH, nsh);
 
 /* Flush TLB of local processor for address va. */
 static inline void __flush_xen_tlb_one_local(vaddr_t va)
-- 
2.38.1



* [PATCH v3 04/18] xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (2 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 03/18] xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13 11:15   ` Michal Orzel
  2022-12-12  9:55 ` [PATCH v3 05/18] xen/arm: Clean-up the memory layout Julien Grall
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, flush_xen_tlb_range_va{,_local}() are using a system-wide
memory barrier. This is quite expensive and unnecessary.

For the local version, a non-shareable barrier is sufficient.
For the SMP version, an inner-shareable barrier is sufficient.

Furthermore, the initial barrier only needs to be a store barrier.

For the full explanation of the sequence see asm/arm{32,64}/flushtlb.h.
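
As an illustration, on arm64 flush_xen_tlb_range_va() then roughly boils
down to (repeat-TLBI workaround omitted for brevity):

    dsb  ishst       /* prior page-table updates are visible to all PEs in the domain */
    tlbi vae2is, x0  /* x0 = va >> PAGE_SHIFT, repeated for each page in [va, va + size) */
    dsb  ish         /* the TLB invalidation has completed */
    isb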

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v3:
        - Patch added
---
 xen/arch/arm/include/asm/flushtlb.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index 125a141975e0..e45fb6d97b02 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -37,13 +37,14 @@ static inline void flush_xen_tlb_range_va_local(vaddr_t va,
 {
     vaddr_t end = va + size;
 
-    dsb(sy); /* Ensure preceding are visible */
+    /* See asm/arm{32,64}/flushtlb.h for the explanation of the sequence. */
+    dsb(nshst); /* Ensure prior page-tables updates have completed */
     while ( va < end )
     {
         __flush_xen_tlb_one_local(va);
         va += PAGE_SIZE;
     }
-    dsb(sy); /* Ensure completion of the TLB flush */
+    dsb(nsh); /* Ensure the TLB invalidation has completed */
     isb();
 }
 
@@ -56,13 +57,14 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
 {
     vaddr_t end = va + size;
 
-    dsb(sy); /* Ensure preceding are visible */
+    /* See asm/arm{32,64}/flushtlb.h for the explanation of the sequence. */
+    dsb(ishst); /* Ensure prior page-tables updates have completed */
     while ( va < end )
     {
         __flush_xen_tlb_one(va);
         va += PAGE_SIZE;
     }
-    dsb(sy); /* Ensure completion of the TLB flush */
+    dsb(ish); /* Ensure the TLB invalidation has completed */
     isb();
 }
 
-- 
2.38.1



* [PATCH v3 05/18] xen/arm: Clean-up the memory layout
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (3 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 04/18] xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13 10:57   ` Michal Orzel
  2022-12-12  9:55 ` [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>" Julien Grall
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

In a follow-up patch, the base address for the common mappings will
vary between arm32 and arm64. To avoid any duplication, define
every mapping in the common region from the previous one.

Take the opportunity to:
    * add missing *_SIZE for FIXMAP_VIRT_* and XEN_VIRT_*
    * switch to MB()/GB() to avoid hexadecimal (easier to read)
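
With the new definitions, the common mappings chain as follows and resolve
to the same addresses as the old hard-coded values:

    XEN_VIRT_START       = MB(2)                                          = 0x00200000
    FIXMAP_VIRT_START    = XEN_VIRT_START + XEN_VIRT_SIZE (2MB)           = 0x00400000
    BOOT_FDT_VIRT_START  = FIXMAP_VIRT_START + FIXMAP_VIRT_SIZE (2MB)     = 0x00600000
    LIVEPATCH_VMAP_START = BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE (4MB) = 0x00a00000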

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v3:
        - Switch more macros to use MB()/GB()
        - Remove duplicated sentence in the commit message

    Changes in v2:
        - Use _AT(vaddr_t, ...) to build on 32-bit.
        - Drop COMMON_VIRT_START
---
 xen/arch/arm/include/asm/config.h | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/include/asm/config.h b/xen/arch/arm/include/asm/config.h
index 0fefed1b8aa9..87851e677701 100644
--- a/xen/arch/arm/include/asm/config.h
+++ b/xen/arch/arm/include/asm/config.h
@@ -107,14 +107,19 @@
  *  Unused
  */
 
-#define XEN_VIRT_START         _AT(vaddr_t,0x00200000)
-#define FIXMAP_ADDR(n)        (_AT(vaddr_t,0x00400000) + (n) * PAGE_SIZE)
+#define XEN_VIRT_START          _AT(vaddr_t, MB(2))
+#define XEN_VIRT_SIZE           _AT(vaddr_t, MB(2))
 
-#define BOOT_FDT_VIRT_START    _AT(vaddr_t,0x00600000)
-#define BOOT_FDT_VIRT_SIZE     _AT(vaddr_t, MB(4))
+#define FIXMAP_VIRT_START       (XEN_VIRT_START + XEN_VIRT_SIZE)
+#define FIXMAP_VIRT_SIZE        _AT(vaddr_t, MB(2))
+
+#define FIXMAP_ADDR(n)          (FIXMAP_VIRT_START + (n) * PAGE_SIZE)
+
+#define BOOT_FDT_VIRT_START     (FIXMAP_VIRT_START + FIXMAP_VIRT_SIZE)
+#define BOOT_FDT_VIRT_SIZE      _AT(vaddr_t, MB(4))
 
 #ifdef CONFIG_LIVEPATCH
-#define LIVEPATCH_VMAP_START   _AT(vaddr_t,0x00a00000)
+#define LIVEPATCH_VMAP_START    (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
 #define LIVEPATCH_VMAP_SIZE    _AT(vaddr_t, MB(2))
 #endif
 
@@ -124,18 +129,18 @@
 
 #define CONFIG_SEPARATE_XENHEAP 1
 
-#define FRAMETABLE_VIRT_START  _AT(vaddr_t,0x02000000)
+#define FRAMETABLE_VIRT_START  _AT(vaddr_t, MB(32))
 #define FRAMETABLE_SIZE        MB(128-32)
 #define FRAMETABLE_NR          (FRAMETABLE_SIZE / sizeof(*frame_table))
 #define FRAMETABLE_VIRT_END    (FRAMETABLE_VIRT_START + FRAMETABLE_SIZE - 1)
 
-#define VMAP_VIRT_START        _AT(vaddr_t,0x10000000)
+#define VMAP_VIRT_START        _AT(vaddr_t, MB(256))
 #define VMAP_VIRT_SIZE         _AT(vaddr_t, GB(1) - MB(256))
 
-#define XENHEAP_VIRT_START     _AT(vaddr_t,0x40000000)
+#define XENHEAP_VIRT_START     _AT(vaddr_t, GB(1))
 #define XENHEAP_VIRT_SIZE      _AT(vaddr_t, GB(1))
 
-#define DOMHEAP_VIRT_START     _AT(vaddr_t,0x80000000)
+#define DOMHEAP_VIRT_START     _AT(vaddr_t, GB(2))
 #define DOMHEAP_VIRT_SIZE      _AT(vaddr_t, GB(2))
 
 #define DOMHEAP_ENTRIES        1024  /* 1024 2MB mapping slots */
-- 
2.38.1



* [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>"
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (4 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 05/18] xen/arm: Clean-up the memory layout Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  0:31   ` Stefano Stabellini
  2022-12-13 11:10   ` Michal Orzel
  2022-12-12  9:55 ` [PATCH v3 07/18] xen/arm32: head: Jump to the runtime mapping in enable_mmu() Julien Grall
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

"ldr rX, =<label>" is used to load a value from the literal pool. This
implies a memory access.

This can be avoided by using the macro mov_w, which encodes the value in
the immediates of two instructions.
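
For instance, assuming mov_w keeps its current definition as a movw/movt
pair:

    ldr   r0, =start              @ before: load the address from the literal pool (a memory access)

    movw  r0, #:lower16:start     @ after: "mov_w r0, start" builds the address...
    movt  r0, #:upper16:start     @ ...from the immediates of two instructions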

So replace all "ldr rX, =<label>" with "mov_w rX, <label>".

No functional changes intended.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    Changes in v3:
        * Patch added
---
 xen/arch/arm/arm32/head.S | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index a558c2a6876e..ce680be91be1 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -62,7 +62,7 @@
 .endm
 
 .macro load_paddr rb, sym
-        ldr   \rb, =\sym
+        mov_w \rb, \sym
         add   \rb, \rb, r10
 .endm
 
@@ -149,7 +149,7 @@ past_zImage:
         mov   r8, r2                 /* r8 := DTB base address */
 
         /* Find out where we are */
-        ldr   r0, =start
+        mov_w r0, start
         adr   r9, start              /* r9  := paddr (start) */
         sub   r10, r9, r0            /* r10 := phys-offset */
 
@@ -170,7 +170,7 @@ past_zImage:
         bl    enable_mmu
 
         /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
-        ldr   r0, =primary_switched
+        mov_w r0, primary_switched
         mov   pc, r0
 primary_switched:
         /*
@@ -190,7 +190,7 @@ primary_switched:
         /* Setup the arguments for start_xen and jump to C world */
         mov   r0, r10                /* r0 := Physical offset */
         mov   r1, r8                 /* r1 := paddr(FDT) */
-        ldr   r2, =start_xen
+        mov_w r2, start_xen
         b     launch
 ENDPROC(start)
 
@@ -198,7 +198,7 @@ GLOBAL(init_secondary)
         cpsid aif                    /* Disable all interrupts */
 
         /* Find out where we are */
-        ldr   r0, =start
+        mov_w r0, start
         adr   r9, start              /* r9  := paddr (start) */
         sub   r10, r9, r0            /* r10 := phys-offset */
 
@@ -227,7 +227,7 @@ GLOBAL(init_secondary)
 
 
         /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
-        ldr   r0, =secondary_switched
+        mov_w r0, secondary_switched
         mov   pc, r0
 secondary_switched:
         /*
@@ -236,7 +236,7 @@ secondary_switched:
          *
          * XXX: This is not compliant with the Arm Arm.
          */
-        ldr   r4, =init_ttbr         /* VA of HTTBR value stashed by CPU 0 */
+        mov_w r4, init_ttbr          /* VA of HTTBR value stashed by CPU 0 */
         ldrd  r4, r5, [r4]           /* Actual value */
         dsb
         mcrr  CP64(r4, r5, HTTBR)
@@ -254,7 +254,7 @@ secondary_switched:
 #endif
         PRINT("- Ready -\r\n")
         /* Jump to C world */
-        ldr   r2, =start_secondary
+        mov_w r2, start_secondary
         b     launch
 ENDPROC(init_secondary)
 
@@ -297,8 +297,8 @@ ENDPROC(check_cpu_mode)
  */
 zero_bss:
         PRINT("- Zero BSS -\r\n")
-        ldr   r0, =__bss_start       /* r0 := vaddr(__bss_start) */
-        ldr   r1, =__bss_end         /* r1 := vaddr(__bss_start) */
+        mov_w r0, __bss_start        /* r0 := vaddr(__bss_start) */
+        mov_w r1, __bss_end          /* r1 := vaddr(__bss_start) */
 
         mov   r2, #0
 1:      str   r2, [r0], #4
@@ -330,8 +330,8 @@ cpu_init:
 
 cpu_init_done:
         /* Set up memory attribute type tables */
-        ldr   r0, =MAIR0VAL
-        ldr   r1, =MAIR1VAL
+        mov_w r0, MAIR0VAL
+        mov_w r1, MAIR1VAL
         mcr   CP32(r0, HMAIR0)
         mcr   CP32(r1, HMAIR1)
 
@@ -341,10 +341,10 @@ cpu_init_done:
          * PT walks are write-back, write-allocate in both cache levels,
          * Full 32-bit address space goes through this table.
          */
-        ldr   r0, =(TCR_RES1|TCR_SH0_IS|TCR_ORGN0_WBWA|TCR_IRGN0_WBWA|TCR_T0SZ(0))
+        mov_w r0, (TCR_RES1|TCR_SH0_IS|TCR_ORGN0_WBWA|TCR_IRGN0_WBWA|TCR_T0SZ(0))
         mcr   CP32(r0, HTCR)
 
-        ldr   r0, =HSCTLR_SET
+        mov_w r0, HSCTLR_SET
         mcr   CP32(r0, HSCTLR)
         isb
 
@@ -452,7 +452,7 @@ ENDPROC(cpu_init)
  */
 create_page_tables:
         /* Prepare the page-tables for mapping Xen */
-        ldr   r0, =XEN_VIRT_START
+        mov_w r0, XEN_VIRT_START
         create_table_entry boot_pgtable, boot_second, r0, 1
         create_table_entry boot_second, boot_third, r0, 2
 
@@ -576,7 +576,7 @@ remove_identity_mapping:
         cmp   r1, #XEN_FIRST_SLOT
         beq   1f
         /* It is not in slot 0, remove the entry */
-        ldr   r0, =boot_pgtable      /* r0 := root table */
+        mov_w r0, boot_pgtable       /* r0 := root table */
         lsl   r1, r1, #3             /* r1 := Slot offset */
         strd  r2, r3, [r0, r1]
         b     identity_mapping_removed
@@ -590,7 +590,7 @@ remove_identity_mapping:
         cmp   r1, #XEN_SECOND_SLOT
         beq   identity_mapping_removed
         /* It is not in slot 1, remove the entry */
-        ldr   r0, =boot_second       /* r0 := second table */
+        mov_w r0, boot_second        /* r0 := second table */
         lsl   r1, r1, #3             /* r1 := Slot offset */
         strd  r2, r3, [r0, r1]
 
@@ -620,7 +620,7 @@ ENDPROC(remove_identity_mapping)
 setup_fixmap:
 #if defined(CONFIG_EARLY_PRINTK)
         /* Add UART to the fixmap table */
-        ldr   r0, =EARLY_UART_VIRTUAL_ADDRESS
+        mov_w r0, EARLY_UART_VIRTUAL_ADDRESS
         create_mapping_entry xen_fixmap, r0, r11, type=PT_DEV_L3
 #endif
         /* Map fixmap into boot_second */
@@ -643,7 +643,7 @@ ENDPROC(setup_fixmap)
  * Clobbers r3
  */
 launch:
-        ldr   r3, =init_data
+        mov_w r3, init_data
         add   r3, #INITINFO_stack    /* Find the boot-time stack */
         ldr   sp, [r3]
         add   sp, #STACK_SIZE        /* (which grows down from the top). */
-- 
2.38.1



* [PATCH v3 07/18] xen/arm32: head: Jump to the runtime mapping in enable_mmu()
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (5 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>" Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  0:46   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs Julien Grall
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, enable_mmu() will return to an address in the 1:1 mapping
and each path is responsible for switching to the runtime mapping.

In a follow-up patch, switching to the runtime mapping will become more
complex. So, to avoid more code/comment duplication, move the switch
into enable_mmu().

Lastly, take the opportunity to replace the load from the literal pool with
mov_w.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v3:
        - Fix typo in the commit message

    Changes in v2:
        - Patch added
---
 xen/arch/arm/arm32/head.S | 50 +++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index ce680be91be1..40c1d7502007 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -167,19 +167,11 @@ past_zImage:
         bl    check_cpu_mode
         bl    cpu_init
         bl    create_page_tables
-        bl    enable_mmu
 
-        /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
-        mov_w r0, primary_switched
-        mov   pc, r0
+        /* Address in the runtime mapping to jump to after the MMU is enabled */
+        mov_w lr, primary_switched
+        b     enable_mmu
 primary_switched:
-        /*
-         * The 1:1 map may clash with other parts of the Xen virtual memory
-         * layout. As it is not used anymore, remove it completely to
-         * avoid having to worry about replacing existing mapping
-         * afterwards.
-         */
-        bl    remove_identity_mapping
         bl    setup_fixmap
 #ifdef CONFIG_EARLY_PRINTK
         /* Use a virtual address to access the UART. */
@@ -223,12 +215,10 @@ GLOBAL(init_secondary)
         bl    check_cpu_mode
         bl    cpu_init
         bl    create_page_tables
-        bl    enable_mmu
 
-
-        /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
-        mov_w r0, secondary_switched
-        mov   pc, r0
+        /* Address in the runtime mapping to jump to after the MMU is enabled */
+        mov_w lr, secondary_switched
+        b     enable_mmu
 secondary_switched:
         /*
          * Non-boot CPUs need to move on to the proper pagetables, which were
@@ -523,9 +513,12 @@ virtphys_clash:
 ENDPROC(create_page_tables)
 
 /*
- * Turn on the Data Cache and the MMU. The function will return on the 1:1
- * mapping. In other word, the caller is responsible to switch to the runtime
- * mapping.
+ * Turn on the Data Cache and the MMU. The function will return
+ * to the virtual address provided in LR (e.g. the runtime mapping).
+ *
+ * Inputs:
+ *   r9 : paddr(start)
+ *   lr : Virtual address to return to
  *
  * Clobbers r0 - r3
  */
@@ -551,7 +544,24 @@ enable_mmu:
         dsb                          /* Flush PTE writes and finish reads */
         mcr   CP32(r0, HSCTLR)       /* now paging is enabled */
         isb                          /* Now, flush the icache */
-        mov   pc, lr
+
+        /*
+         * The MMU is turned on and we are in the 1:1 mapping. Switch
+         * to the runtime mapping.
+         */
+        mov_w r0, 1f
+        mov   pc, r0
+1:
+        /*
+         * The 1:1 map may clash with other parts of the Xen virtual memory
+         * layout. As it is not used anymore, remove it completely to
+         * avoid having to worry about replacing existing mapping
+         * afterwards.
+         *
+         * On return this will jump to the virtual address requested by
+         * the caller.
+         */
+        b     remove_identity_mapping
 ENDPROC(enable_mmu)
 
 /*
-- 
2.38.1



* [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (6 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 07/18] xen/arm32: head: Jump to the runtime mapping in enable_mmu() Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-14 14:24   ` Michal Orzel
  2022-12-12  9:55 ` [PATCH v3 09/18] xen/arm32: head: Remove restriction where to load Xen Julien Grall
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

The sequence for flushing the TLBs is 4 instructions long and often
requires an explanation of how it works.

So create a helper and use it in the boot code (switch_ttbr() is left
alone for now).

Note that in secondary_switched, we were also flushing the instruction
cache and branch predictor. Neither of them was necessary because:
    * We are only supporting IVIPT cache on arm32, so the instruction
      cache flush is only necessary when executable code is modified.
      None of the boot code is doing that.
    * The instruction cache is not invalidated and misprediction is not
      a problem at boot.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v3:
        * Fix typo
        * Update the documentation
        * Rename the argument from tmp1 to tmp
---
 xen/arch/arm/arm32/head.S | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 40c1d7502007..315abbbaebec 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -66,6 +66,20 @@
         add   \rb, \rb, r10
 .endm
 
+/*
+ * Flush local TLBs
+ *
+ * @tmp:    Scratch register
+ *
+ * See asm/arm32/flushtlb.h for the explanation of the sequence.
+ */
+.macro flush_xen_tlb_local tmp
+        dsb   nshst
+        mcr   CP32(\tmp, TLBIALLH)
+        dsb   nsh
+        isb
+.endm
+
 /*
  * Common register usage in this file:
  *   r0  -
@@ -232,11 +246,7 @@ secondary_switched:
         mcrr  CP64(r4, r5, HTTBR)
         dsb
         isb
-        mcr   CP32(r0, TLBIALLH)     /* Flush hypervisor TLB */
-        mcr   CP32(r0, ICIALLU)      /* Flush I-cache */
-        mcr   CP32(r0, BPIALL)       /* Flush branch predictor */
-        dsb                          /* Ensure completion of TLB+BP flush */
-        isb
+        flush_xen_tlb_local r0
 
 #ifdef CONFIG_EARLY_PRINTK
         /* Use a virtual address to access the UART. */
@@ -529,8 +539,7 @@ enable_mmu:
          * The state of the TLBs is unknown before turning on the MMU.
          * Flush them to avoid stale one.
          */
-        mcr   CP32(r0, TLBIALLH)     /* Flush hypervisor TLBs */
-        dsb   nsh
+        flush_xen_tlb_local r0
 
         /* Write Xen's PT's paddr into the HTTBR */
         load_paddr r0, boot_pgtable
@@ -605,12 +614,7 @@ remove_identity_mapping:
         strd  r2, r3, [r0, r1]
 
 identity_mapping_removed:
-        /* See asm/arm32/flushtlb.h for the explanation of the sequence. */
-        dsb   nshst
-        mcr   CP32(r0, TLBIALLH)
-        dsb   nsh
-        isb
-
+        flush_xen_tlb_local r0
         mov   pc, lr
 ENDPROC(remove_identity_mapping)
 
-- 
2.38.1



* [PATCH v3 09/18] xen/arm32: head: Remove restriction where to load Xen
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (7 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13 18:23   ` Julien Grall
  2022-12-12  9:55 ` [PATCH v3 10/18] xen/arm32: head: Widen the use of the temporary mapping Julien Grall
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, bootloaders can load Xen anywhere in memory except the
region 2MB - 4MB. While I am not aware of any issue, we have no way
to tell the bootloader to avoid that region.

In addition to that, in the future, Xen may grow over 2MB if we
enable features like UBSAN or GCOV. To avoid widening the restriction
on the load address, it would be better to get rid of it.

When the identity mapping is clashing with the Xen runtime mapping,
we need an extra indirection to be able to replace the identity
mapping with the Xen runtime mapping.

Reserve a new memory region that will be used to temporarily map Xen.
For convenience, the new area is re-using the same first slot as the
domheap which is used for per-cpu temporary mapping after a CPU has
booted.
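
Concretely, with the arm32 values from asm/config.h (DOMHEAP_VIRT_START =
GB(2), XEN_VIRT_START = MB(2)), this works out to:

    TEMPORARY_AREA_FIRST_SLOT = first_table_offset(GB(2))          = 2
    TEMPORARY_XEN_VIRT_START  = TEMPORARY_AREA_ADDR(XEN_VIRT_START)
                              = (2 << 30) | MB(2)                  = 0x80200000
    XEN_TEMPORARY_OFFSET      = 0x80200000 - 0x00200000            = 0x80000000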

Furthermore, directly map boot_second (which covers Xen and more)
to the temporary area. This avoids allocating an extra page-table
for the second level and will be helpful for follow-up patches (we will
want to use the fixmap whilst in the temporary mapping).

Lastly, some parts of the code now need to know whether the temporary
mapping was created. So reserve r12 to store this information.

Signed-off-by: Julien Grall <jgrall@amazon.com>
---
    Changes in v3:
        - Remove the ASSERT() in init_domheap_mappings() because it was
          bogus (secondary CPU root tables are initialized to the CPU0
          root table so the entry will be valid). Also, it is not
          related to this patch as the CPU0 root table are rebuilt
          during boot. The ASSERT() will be re-introduced later.

    Changes in v2:
        - Patch added
---
 xen/arch/arm/arm32/head.S         | 139 ++++++++++++++++++++++++++----
 xen/arch/arm/domain_page.c        |   1 +
 xen/arch/arm/include/asm/config.h |  14 +++
 xen/arch/arm/mm.c                 |  14 +++
 4 files changed, 153 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 315abbbaebec..50adb887ceaf 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -35,6 +35,9 @@
 #define XEN_FIRST_SLOT      first_table_offset(XEN_VIRT_START)
 #define XEN_SECOND_SLOT     second_table_offset(XEN_VIRT_START)
 
+/* Offset between the early boot xen mapping and the runtime xen mapping */
+#define XEN_TEMPORARY_OFFSET      (TEMPORARY_XEN_VIRT_START - XEN_VIRT_START)
+
 #if defined(CONFIG_EARLY_PRINTK) && defined(CONFIG_EARLY_PRINTK_INC)
 #include CONFIG_EARLY_PRINTK_INC
 #endif
@@ -94,7 +97,7 @@
  *   r9  - paddr(start)
  *   r10 - phys offset
  *   r11 - UART address
- *   r12 -
+ *   r12 - Temporary mapping created
  *   r13 - SP
  *   r14 - LR
  *   r15 - PC
@@ -445,6 +448,9 @@ ENDPROC(cpu_init)
  *   r9 : paddr(start)
  *   r10: phys offset
  *
+ * Output:
+ *   r12: Was a temporary mapping created?
+ *
  * Clobbers r0 - r4, r6
  *
  * Register usage within this function:
@@ -484,7 +490,11 @@ create_page_tables:
         /*
          * Setup the 1:1 mapping so we can turn the MMU on. Note that
          * only the first page of Xen will be part of the 1:1 mapping.
+         *
+         * In all the cases, we will link boot_third_id. So create the
+         * mapping in advance.
          */
+        create_mapping_entry boot_third_id, r9, r9
 
         /*
          * Find the first slot used. If the slot is not XEN_FIRST_SLOT,
@@ -501,8 +511,7 @@ create_page_tables:
         /*
          * Find the second slot used. If the slot is XEN_SECOND_SLOT, then the
          * 1:1 mapping will use its own set of page-tables from the
-         * third level. For slot XEN_SECOND_SLOT, Xen is not yet able to handle
-         * it.
+         * third level.
          */
         get_table_slot r1, r9, 2     /* r1 := second slot */
         cmp   r1, #XEN_SECOND_SLOT
@@ -513,13 +522,33 @@ create_page_tables:
 link_from_second_id:
         create_table_entry boot_second_id, boot_third_id, r9, 2
 link_from_third_id:
-        create_mapping_entry boot_third_id, r9, r9
+        /* Good news, we are not clashing with Xen virtual mapping */
+        mov   r12, #0                /* r12 := temporary mapping not created */
         mov   pc, lr
 
 virtphys_clash:
-        /* Identity map clashes with boot_third, which we cannot handle yet */
-        PRINT("- Unable to build boot page tables - virt and phys addresses clash. -\r\n")
-        b     fail
+        /*
+         * The identity map clashes with boot_third. Link boot_first_id and
+         * map Xen to a temporary mapping. See switch_to_runtime_mapping
+         * for more details.
+         */
+        PRINT("- Virt and Phys addresses clash  -\r\n")
+        PRINT("- Create temporary mapping -\r\n")
+
+        /*
+         * This will override the link to boot_second in XEN_FIRST_SLOT.
+         * The page-tables are not live yet. So no need to use
+         * break-before-make.
+         */
+        create_table_entry boot_pgtable, boot_second_id, r9, 1
+        create_table_entry boot_second_id, boot_third_id, r9, 2
+
+        /* Map boot_second (cover Xen mappings) to the temporary 1st slot */
+        mov_w r0, TEMPORARY_XEN_VIRT_START
+        create_table_entry boot_pgtable, boot_second, r0, 1
+
+        mov   r12, #1                /* r12 := temporary mapping created */
+        mov   pc, lr
 ENDPROC(create_page_tables)
 
 /*
@@ -528,9 +557,10 @@ ENDPROC(create_page_tables)
  *
  * Inputs:
  *   r9 : paddr(start)
+ *  r12 : Was the temporary mapping created?
  *   lr : Virtual address to return to
  *
- * Clobbers r0 - r3
+ * Clobbers r0 - r5
  */
 enable_mmu:
         PRINT("- Turning on paging -\r\n")
@@ -558,21 +588,79 @@ enable_mmu:
          * The MMU is turned on and we are in the 1:1 mapping. Switch
          * to the runtime mapping.
          */
-        mov_w r0, 1f
-        mov   pc, r0
+        mov   r5, lr                /* Save LR before overwriting it */
+        mov_w lr, 1f                /* Virtual address in the runtime mapping */
+        b     switch_to_runtime_mapping
 1:
+        mov   lr, r5                /* Restore LR */
         /*
-         * The 1:1 map may clash with other parts of the Xen virtual memory
-         * layout. As it is not used anymore, remove it completely to
-         * avoid having to worry about replacing existing mapping
-         * afterwards.
+         * At this point, either the 1:1 map or the temporary mapping
+         * will be present. The former may clash with other parts of the
+         * Xen virtual memory layout. As both of them are not used
+         * anymore, remove them completely to avoid having to worry
+         * about replacing existing mapping afterwards.
          *
          * On return this will jump to the virtual address requested by
          * the caller.
          */
-        b     remove_identity_mapping
+        teq   r12, #0
+        beq   remove_identity_mapping
+        b     remove_temporary_mapping
 ENDPROC(enable_mmu)
 
+/*
+ * Switch to the runtime mapping. The logic depends on whether the
+ * runtime virtual region is clashing with the physical address
+ *
+ *  - If it is not clashing, we can directly jump to the address in
+ *    the runtime mapping.
+ *  - If it is clashing, create_page_tables() would have mapped Xen to
+ *    a temporary virtual address. We need to switch to the temporary
+ *    mapping so we can remove the identity mapping and map Xen at the
+ *    correct position.
+ *
+ * Inputs
+ *    r9: paddr(start)
+ *   r12: Was a temporary mapping created?
+ *    lr: Address in the runtime mapping to jump to
+ *
+ * Clobbers r0 - r4
+ */
+switch_to_runtime_mapping:
+        /*
+         * Jump to the runtime mapping if the virt and phys are not
+         * clashing
+         */
+        teq   r12, #0
+        beq   ready_to_switch
+
+        /* We are still in the 1:1 mapping. Jump to the temporary Virtual address. */
+        mov_w r0, 1f
+        add   r0, r0, #XEN_TEMPORARY_OFFSET /* r0 := address in temporary mapping */
+        mov   pc, r0
+
+1:
+        /* Remove boot_second_id */
+        mov   r2, #0
+        mov   r3, #0
+        adr_l r0, boot_pgtable
+        get_table_slot r1, r9, 1            /* r1 := first slot */
+        lsl   r1, r1, #3                    /* r1 := first slot offset */
+        strd  r2, r3, [r0, r1]
+
+        flush_xen_tlb_local r0
+
+        /* Map boot_second into boot_pgtable */
+        mov_w r0, XEN_VIRT_START
+        create_table_entry boot_pgtable, boot_second, r0, 1
+
+        /* Ensure any page table updates are visible before continuing */
+        dsb   nsh
+
+ready_to_switch:
+        mov   pc, lr
+ENDPROC(switch_to_runtime_mapping)
+
 /*
  * Remove the 1:1 map from the page-tables. It is not easy to keep track
  * where the 1:1 map was mapped, so we will look for the top-level entry
@@ -618,6 +706,27 @@ identity_mapping_removed:
         mov   pc, lr
 ENDPROC(remove_identity_mapping)
 
+/*
+ * Remove the temporary mapping of Xen starting at TEMPORARY_XEN_VIRT_START.
+ *
+ * Clobbers r0 - r1
+ */
+remove_temporary_mapping:
+        /* r2:r3 := invalid page-table entry */
+        mov   r2, #0
+        mov   r3, #0
+
+        adr_l r0, boot_pgtable
+        mov_w r1, TEMPORARY_XEN_VIRT_START
+        get_table_slot r1, r1, 1     /* r1 := first slot */
+        lsl   r1, r1, #3             /* r1 := first slot offset */
+        strd  r2, r3, [r0, r1]
+
+        flush_xen_tlb_local r0
+
+        mov  pc, lr
+ENDPROC(remove_temporary_mapping)
+
 /*
  * Map the UART in the fixmap (when earlyprintk is used) and hook the
  * fixmap table in the page tables.
diff --git a/xen/arch/arm/domain_page.c b/xen/arch/arm/domain_page.c
index b7c02c919064..907fb93d4df0 100644
--- a/xen/arch/arm/domain_page.c
+++ b/xen/arch/arm/domain_page.c
@@ -60,6 +60,7 @@ bool init_domheap_mappings(unsigned int cpu)
     for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ )
     {
         lpae_t pte = mfn_to_xen_entry(mfn_add(mfn, i), MT_NORMAL);
+
         pte.pt.table = 1;
         write_pte(&root[first_idx + i], pte);
     }
diff --git a/xen/arch/arm/include/asm/config.h b/xen/arch/arm/include/asm/config.h
index 87851e677701..6c1b762e976d 100644
--- a/xen/arch/arm/include/asm/config.h
+++ b/xen/arch/arm/include/asm/config.h
@@ -148,6 +148,20 @@
 /* Number of domheap pagetable pages required at the second level (2MB mappings) */
 #define DOMHEAP_SECOND_PAGES (DOMHEAP_VIRT_SIZE >> FIRST_SHIFT)
 
+/*
+ * The temporary area is overlapping with the domheap area. This may
+ * be used to create an alias of the first slot containing Xen mappings
+ * when turning on/off the MMU.
+ */
+#define TEMPORARY_AREA_FIRST_SLOT    (first_table_offset(DOMHEAP_VIRT_START))
+
+/* Calculate the address in the temporary area */
+#define TEMPORARY_AREA_ADDR(addr)                           \
+     (((addr) & ~XEN_PT_LEVEL_MASK(1)) |                    \
+      (TEMPORARY_AREA_FIRST_SLOT << XEN_PT_LEVEL_SHIFT(1)))
+
+#define TEMPORARY_XEN_VIRT_START    TEMPORARY_AREA_ADDR(XEN_VIRT_START)
+
 #else /* ARM_64 */
 
 #define SLOT0_ENTRY_BITS  39
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 630175276f6a..1315a2c87db7 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -167,6 +167,9 @@ static void __init __maybe_unused build_assertions(void)
 #define CHECK_SAME_SLOT(level, virt1, virt2) \
     BUILD_BUG_ON(level##_table_offset(virt1) != level##_table_offset(virt2))
 
+#define CHECK_DIFFERENT_SLOT(level, virt1, virt2) \
+    BUILD_BUG_ON(level##_table_offset(virt1) == level##_table_offset(virt2))
+
 #ifdef CONFIG_ARM_64
     CHECK_SAME_SLOT(zeroeth, XEN_VIRT_START, FIXMAP_ADDR(0));
     CHECK_SAME_SLOT(zeroeth, XEN_VIRT_START, BOOT_FDT_VIRT_START);
@@ -174,7 +177,18 @@ static void __init __maybe_unused build_assertions(void)
     CHECK_SAME_SLOT(first, XEN_VIRT_START, FIXMAP_ADDR(0));
     CHECK_SAME_SLOT(first, XEN_VIRT_START, BOOT_FDT_VIRT_START);
 
+    /*
+     * For arm32, the temporary mapping will re-use the domheap
+     * first slot and the second slots will match.
+     */
+#ifdef CONFIG_ARM_32
+    CHECK_SAME_SLOT(first, TEMPORARY_XEN_VIRT_START, DOMHEAP_VIRT_START);
+    CHECK_DIFFERENT_SLOT(first, XEN_VIRT_START, TEMPORARY_XEN_VIRT_START);
+    CHECK_SAME_SLOT(second, XEN_VIRT_START, TEMPORARY_XEN_VIRT_START);
+#endif
+
 #undef CHECK_SAME_SLOT
+#undef CHECK_DIFFERENT_SLOT
 }
 
 void dump_pt_walk(paddr_t ttbr, paddr_t addr,
-- 
2.38.1



* [PATCH v3 10/18] xen/arm32: head: Widen the use of the temporary mapping
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (8 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 09/18] xen/arm32: head: Remove restriction where to load Xen Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-12  9:55 ` [PATCH v3 11/18] xen/arm: Enable use of dump_pt_walk() early during boot Julien Grall
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, the temporary mapping is only used when the virtual
runtime region of Xen is clashing with the physical region.

In follow-up patches, we will rework how secondary CPU bring-up works
and it will be convenient to use the fixmap area for accessing
the root page-table (it is per-cpu).

Rework the code to use the temporary mapping when the Xen physical address
is not overlapping with the temporary mapping.

This also has the advantage of simplifying the logic to identity map
Xen.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

Even if this patch is rewriting part of the previous patch, I decided
to keep them separated to help the review.

The "folow-up patches" are still in draft at the moment. I still haven't
find a way to split them nicely and not require too much more work
in the coloring side.

I have provided some medium-term goal in the cover letter.

    Changes in v3:
        - Resolve conflicts after switching from "ldr rX, <label>" to
          "mov_w rX, <label>" in a previous patch

    Changes in v2:
        - Patch added
---
 xen/arch/arm/arm32/head.S | 82 +++++++--------------------------------
 1 file changed, 15 insertions(+), 67 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 50adb887ceaf..2658625bc775 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -459,7 +459,6 @@ ENDPROC(cpu_init)
 create_page_tables:
         /* Prepare the page-tables for mapping Xen */
         mov_w r0, XEN_VIRT_START
-        create_table_entry boot_pgtable, boot_second, r0, 1
         create_table_entry boot_second, boot_third, r0, 2
 
         /* Setup boot_third: */
@@ -479,67 +478,37 @@ create_page_tables:
         cmp   r1, #(XEN_PT_LPAE_ENTRIES<<3) /* 512*8-byte entries per page */
         blo   1b
 
-        /*
-         * If Xen is loaded at exactly XEN_VIRT_START then we don't
-         * need an additional 1:1 mapping, the virtual mapping will
-         * suffice.
-         */
-        cmp   r9, #XEN_VIRT_START
-        moveq pc, lr
-
         /*
          * Setup the 1:1 mapping so we can turn the MMU on. Note that
          * only the first page of Xen will be part of the 1:1 mapping.
-         *
-         * In all the cases, we will link boot_third_id. So create the
-         * mapping in advance.
          */
+        create_table_entry boot_pgtable, boot_second_id, r9, 1
+        create_table_entry boot_second_id, boot_third_id, r9, 2
         create_mapping_entry boot_third_id, r9, r9
 
         /*
-         * Find the first slot used. If the slot is not XEN_FIRST_SLOT,
-         * then the 1:1 mapping will use its own set of page-tables from
-         * the second level.
+         * Find the first slot used. If the slot is not the same
+         * as TEMPORARY_AREA_FIRST_SLOT, then we will want to switch
+         * to the temporary mapping before jumping to the runtime
+         * virtual mapping.
          */
         get_table_slot r1, r9, 1     /* r1 := first slot */
-        cmp   r1, #XEN_FIRST_SLOT
-        beq   1f
-        create_table_entry boot_pgtable, boot_second_id, r9, 1
-        b     link_from_second_id
-
-1:
-        /*
-         * Find the second slot used. If the slot is XEN_SECOND_SLOT, then the
-         * 1:1 mapping will use its own set of page-tables from the
-         * third level.
-         */
-        get_table_slot r1, r9, 2     /* r1 := second slot */
-        cmp   r1, #XEN_SECOND_SLOT
-        beq   virtphys_clash
-        create_table_entry boot_second, boot_third_id, r9, 2
-        b     link_from_third_id
+        cmp   r1, #TEMPORARY_AREA_FIRST_SLOT
+        bne   use_temporary_mapping
 
-link_from_second_id:
-        create_table_entry boot_second_id, boot_third_id, r9, 2
-link_from_third_id:
-        /* Good news, we are not clashing with Xen virtual mapping */
+        mov_w r0, XEN_VIRT_START
+        create_table_entry boot_pgtable, boot_second, r0, 1
         mov   r12, #0                /* r12 := temporary mapping not created */
         mov   pc, lr
 
-virtphys_clash:
+use_temporary_mapping:
         /*
-         * The identity map clashes with boot_third. Link boot_first_id and
-         * map Xen to a temporary mapping. See switch_to_runtime_mapping
-         * for more details.
+         * The identity mapping is not using the first slot
+         * TEMPORARY_AREA_FIRST_SLOT. Create a temporary mapping.
+         * See switch_to_runtime_mapping for more details.
          */
-        PRINT("- Virt and Phys addresses clash  -\r\n")
         PRINT("- Create temporary mapping -\r\n")
 
-        /*
-         * This will override the link to boot_second in XEN_FIRST_SLOT.
-         * The page-tables are not live yet. So no need to use
-         * break-before-make.
-         */
         create_table_entry boot_pgtable, boot_second_id, r9, 1
         create_table_entry boot_second_id, boot_third_id, r9, 2
 
@@ -675,33 +644,12 @@ remove_identity_mapping:
         /* r2:r3 := invalid page-table entry */
         mov   r2, #0x0
         mov   r3, #0x0
-        /*
-         * Find the first slot used. Remove the entry for the first
-         * table if the slot is not XEN_FIRST_SLOT.
-         */
+        /* Find the first slot used and remove it */
         get_table_slot r1, r9, 1     /* r1 := first slot */
-        cmp   r1, #XEN_FIRST_SLOT
-        beq   1f
-        /* It is not in slot 0, remove the entry */
         mov_w r0, boot_pgtable       /* r0 := root table */
         lsl   r1, r1, #3             /* r1 := Slot offset */
         strd  r2, r3, [r0, r1]
-        b     identity_mapping_removed
-
-1:
-        /*
-         * Find the second slot used. Remove the entry for the first
-         * table if the slot is not XEN_SECOND_SLOT.
-         */
-        get_table_slot r1, r9, 2     /* r1 := second slot */
-        cmp   r1, #XEN_SECOND_SLOT
-        beq   identity_mapping_removed
-        /* It is not in slot 1, remove the entry */
-        mov_w r0, boot_second        /* r0 := second table */
-        lsl   r1, r1, #3             /* r1 := Slot offset */
-        strd  r2, r3, [r0, r1]
 
-identity_mapping_removed:
         flush_xen_tlb_local r0
         mov   pc, lr
 ENDPROC(remove_identity_mapping)
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 11/18] xen/arm: Enable use of dump_pt_walk() early during boot
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (9 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 10/18] xen/arm32: head: Widen the use of the temporary mapping Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:06   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 12/18] xen/arm64: Rework the memory layout Julien Grall
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, dump_pt_walk() is using map_domain_page() to map
the page tables.

map_domain_page() is only usable after init_domheap_mappings() is called
(arm32) or the xenheap has been initialized (arm64).

This means it can be hard to diagnose incorrect page-tables during
early boot. So update dump_pt_walk() to use xen_{,un}map_table() instead.

Note that the two helpers are moved earlier to avoid forward declaring
them.
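
As a usage sketch (not part of the patch, and assuming the usual
declaration of dump_hyp_walk() from asm/mm.h), a walk can now be dumped
before the domheap/xenheap is ready:

    #include <xen/init.h>
    #include <asm/mm.h>

    /*
     * Hypothetical early-boot debugging helper: the walk works while
     * system_state == SYS_STATE_early_boot because the page-tables are
     * now mapped through the PMAP rather than map_domain_page().
     */
    static void __init debug_early_va(vaddr_t va)
    {
        /* Prints the PTE at each level of the current EL2 page-tables */
        dump_hyp_walk(va);
    }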

Signed-off-by: Julien Grall <jgrall@amazon.com>
---
 xen/arch/arm/mm.c | 56 +++++++++++++++++++++++------------------------
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 1315a2c87db7..d0b1cf55f550 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -191,6 +191,30 @@ static void __init __maybe_unused build_assertions(void)
 #undef CHECK_DIFFERENT_SLOT
 }
 
+static lpae_t *xen_map_table(mfn_t mfn)
+{
+    /*
+     * During early boot, map_domain_page() may be unusable. Use the
+     * PMAP to map temporarily a page-table.
+     */
+    if ( system_state == SYS_STATE_early_boot )
+        return pmap_map(mfn);
+
+    return map_domain_page(mfn);
+}
+
+static void xen_unmap_table(const lpae_t *table)
+{
+    /*
+     * During early boot, xen_map_table() will not use map_domain_page()
+     * but the PMAP.
+     */
+    if ( system_state == SYS_STATE_early_boot )
+        pmap_unmap(table);
+    else
+        unmap_domain_page(table);
+}
+
 void dump_pt_walk(paddr_t ttbr, paddr_t addr,
                   unsigned int root_level,
                   unsigned int nr_root_tables)
@@ -230,7 +254,7 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
     else
         root_table = 0;
 
-    mapping = map_domain_page(mfn_add(root_mfn, root_table));
+    mapping = xen_map_table(mfn_add(root_mfn, root_table));
 
     for ( level = root_level; ; level++ )
     {
@@ -246,11 +270,11 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
             break;
 
         /* For next iteration */
-        unmap_domain_page(mapping);
-        mapping = map_domain_page(lpae_get_mfn(pte));
+        xen_unmap_table(mapping);
+        mapping = xen_map_table(lpae_get_mfn(pte));
     }
 
-    unmap_domain_page(mapping);
+    xen_unmap_table(mapping);
 }
 
 void dump_hyp_walk(vaddr_t addr)
@@ -713,30 +737,6 @@ void *ioremap(paddr_t pa, size_t len)
     return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
 }
 
-static lpae_t *xen_map_table(mfn_t mfn)
-{
-    /*
-     * During early boot, map_domain_page() may be unusable. Use the
-     * PMAP to map temporarily a page-table.
-     */
-    if ( system_state == SYS_STATE_early_boot )
-        return pmap_map(mfn);
-
-    return map_domain_page(mfn);
-}
-
-static void xen_unmap_table(const lpae_t *table)
-{
-    /*
-     * During early boot, xen_map_table() will not use map_domain_page()
-     * but the PMAP.
-     */
-    if ( system_state == SYS_STATE_early_boot )
-        pmap_unmap(table);
-    else
-        unmap_domain_page(table);
-}
-
 static int create_xen_table(lpae_t *entry)
 {
     mfn_t mfn;
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 12/18] xen/arm64: Rework the memory layout
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (10 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 11/18] xen/arm: Enable use of dump_pt_walk() early during boot Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:22   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 13/18] xen/arm: mm: Allow xen_pt_update() to work with the current root table Julien Grall
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Xen is currently not fully compliant with the Arm Arm because it will
switch the TTBR with the MMU on.

In order to be compliant, we need to disable the MMU before
switching the TTBR. The implication is the page-tables should
contain an identity mapping of the code switching the TTBR.

In most cases we expect Xen to be loaded in low memory. I am aware
of one platform (i.e. AMD Seattle) where the memory starts above 512GB.
To give us some slack, consider that Xen may be loaded in the first 2TB
of the physical address space.

The memory layout is reshuffled to keep the first two slots of the zeroeth
level free. Xen will now be loaded at (2TB + 2MB). This requires a slight
tweak of the boot code because XEN_VIRT_START cannot be used as an
immediate.

This reshuffle will make it trivial to create a 1:1 mapping when Xen is
loaded below 2TB.
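
For reference, a small standalone sketch of the SLOT0() arithmetic the new
layout is based on (not part of the patch); each zeroeth-level slot spans
1 << 39 bytes:

    #include <stdio.h>

    #define SLOT0_ENTRY_BITS  39
    #define SLOT0(slot)       ((unsigned long long)(slot) << SLOT0_ENTRY_BITS)
    #define MB(x)             ((unsigned long long)(x) << 20)

    int main(void)
    {
        /* Size of one zeroeth-level slot */
        printf("L0 slot size   : %llu GB\n", SLOT0(1) >> 30);
        /* Base of the slot Xen is linked in, plus the 2MB offset */
        printf("SLOT0(2) + 2MB : %#llx\n", SLOT0(2) + MB(2));
        return 0;
    }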

Signed-off-by: Julien Grall <jgrall@amazon.com>
---

    Changes in v2:
        - Reword the commit message
        - Load Xen at 2TB + 2MB
        - Update the documentation to reflect the new layout
---
 xen/arch/arm/arm64/head.S         |  3 ++-
 xen/arch/arm/include/asm/config.h | 34 +++++++++++++++++++++----------
 xen/arch/arm/mm.c                 | 11 +++++-----
 3 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index ad014716db6f..23c2c7491db2 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -607,7 +607,8 @@ create_page_tables:
          * need an additional 1:1 mapping, the virtual mapping will
          * suffice.
          */
-        cmp   x19, #XEN_VIRT_START
+        ldr   x0, =XEN_VIRT_START
+        cmp   x19, x0
         bne   1f
         ret
 1:
diff --git a/xen/arch/arm/include/asm/config.h b/xen/arch/arm/include/asm/config.h
index 6c1b762e976d..9fe6bfeeeb95 100644
--- a/xen/arch/arm/include/asm/config.h
+++ b/xen/arch/arm/include/asm/config.h
@@ -72,15 +72,12 @@
 #include <xen/page-size.h>
 
 /*
- * Common ARM32 and ARM64 layout:
+ * ARM32 layout:
  *   0  -   2M   Unmapped
  *   2M -   4M   Xen text, data, bss
  *   4M -   6M   Fixmap: special-purpose 4K mapping slots
  *   6M -  10M   Early boot mapping of FDT
- *   10M - 12M   Livepatch vmap (if compiled in)
- *
- * ARM32 layout:
- *   0  -  12M   <COMMON>
+ *  10M -  12M   Livepatch vmap (if compiled in)
  *
  *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
  * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
@@ -90,8 +87,17 @@
  *   2G -   4G   Domheap: on-demand-mapped
  *
  * ARM64 layout:
- * 0x0000000000000000 - 0x0000007fffffffff (512GB, L0 slot [0])
- *   0  -  12M   <COMMON>
+ * 0x0000000000000000 - 0x00001fffffffffff (2TB, L0 slots [0..1])
+ *
+ *  Reserved to identity map Xen
+ *
+ * 0x0000020000000000 - 0x000028fffffffff (512TB, L0 slot [2]
+ *  (Relative offsets)
+ *   0  -   2M   Unmapped
+ *   2M -   4M   Xen text, data, bss
+ *   4M -   6M   Fixmap: special-purpose 4K mapping slots
+ *   6M -  10M   Early boot mapping of FDT
+ *  10M -  12M   Livepatch vmap (if compiled in)
  *
  *   1G -   2G   VMAP: ioremap and early_ioremap
  *
@@ -107,7 +113,17 @@
  *  Unused
  */
 
+#ifdef CONFIG_ARM_32
 #define XEN_VIRT_START          _AT(vaddr_t, MB(2))
+#else
+
+#define SLOT0_ENTRY_BITS  39
+#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
+#define SLOT0_ENTRY_SIZE  SLOT0(1)
+
+#define XEN_VIRT_START          (SLOT0(2) + _AT(vaddr_t, MB(2)))
+#endif
+
 #define XEN_VIRT_SIZE           _AT(vaddr_t, MB(2))
 
 #define FIXMAP_VIRT_START       (XEN_VIRT_START + XEN_VIRT_SIZE)
@@ -164,10 +180,6 @@
 
 #else /* ARM_64 */
 
-#define SLOT0_ENTRY_BITS  39
-#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
-#define SLOT0_ENTRY_SIZE  SLOT0(1)
-
 #define VMAP_VIRT_START  GB(1)
 #define VMAP_VIRT_SIZE   GB(1)
 
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d0b1cf55f550..cc11f5c639e6 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -153,7 +153,7 @@ static void __init __maybe_unused build_assertions(void)
 #endif
     /* Page table structure constraints */
 #ifdef CONFIG_ARM_64
-    BUILD_BUG_ON(zeroeth_table_offset(XEN_VIRT_START));
+    BUILD_BUG_ON(zeroeth_table_offset(XEN_VIRT_START) < 2);
 #endif
     BUILD_BUG_ON(first_table_offset(XEN_VIRT_START));
 #ifdef CONFIG_ARCH_MAP_DOMAIN_PAGE
@@ -498,10 +498,11 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
     phys_offset = boot_phys_offset;
 
 #ifdef CONFIG_ARM_64
-    p = (void *) xen_pgtable;
-    p[0] = pte_of_xenaddr((uintptr_t)xen_first);
-    p[0].pt.table = 1;
-    p[0].pt.xn = 0;
+    pte = pte_of_xenaddr((uintptr_t)xen_first);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+    xen_pgtable[zeroeth_table_offset(XEN_VIRT_START)] = pte;
+
     p = (void *) xen_first;
 #else
     p = (void *) cpu0_pgtable;
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 13/18] xen/arm: mm: Allow xen_pt_update() to work with the current root table
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (11 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 12/18] xen/arm64: Rework the memory layout Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:24   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 14/18] xen/arm: mm: Allow dump_hyp_walk() to work on " Julien Grall
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, xen_pt_update() will only work on the runtime page tables.
In follow-up patches, we will also want to use the helper to update
the boot page tables.

All the existing callers of xen_pt_update() expect to modify the
current page-tables. Therefore, we can read the root physical address
directly from TTBR0_EL2.
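
As a rough usage sketch (not part of the patch), once the root is read back
from TTBR0_EL2, helpers built on top of xen_pt_update() can be called while
still running on the boot page-tables. The hypothetical snippet below
mirrors how map_pages_to_xen() is used later in the series:

    #include <xen/init.h>
    #include <xen/mm.h>

    /*
     * Create a 1:1 RX mapping of one page in whichever page-tables
     * TTBR0_EL2 currently points at (boot or runtime).
     */
    static int __init early_identity_map_one_page(paddr_t pa)
    {
        return map_pages_to_xen(pa, maddr_to_mfn(pa), 1, PAGE_HYPERVISOR_RX);
    }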

Signed-off-by: Julien Grall <jgrall@amazon.com>
---

    Changes in v2:
        - Patch added
---
 xen/arch/arm/mm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index cc11f5c639e6..26d6b70410c5 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1114,7 +1114,7 @@ static int xen_pt_update(unsigned long virt,
      *
      * XXX: Add a check.
      */
-    const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
+    const mfn_t root = maddr_to_mfn(READ_SYSREG64(TTBR0_EL2));
 
     /*
      * The hardware was configured to forbid mapping both writeable and
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 14/18] xen/arm: mm: Allow dump_hyp_walk() to work on the current root table
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (12 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 13/18] xen/arm: mm: Allow xen_pt_update() to work with the current root table Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:24   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping Julien Grall
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

dump_hyp_walk() is used to print the table walk in case of a data or
instruction abort.

Those aborts are not limited to runtime and could also happen during
early boot. However, the current implementation of dump_hyp_walk() checks
that the TTBR matches the runtime page tables.

Therefore, an early abort would result in a secondary abort and the
table walk would not be printed.

Given that the function is called in the abort path, there is no
reason for us to keep the BUG_ON() in any form. So drop it.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - Patch added
---
 xen/arch/arm/mm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 26d6b70410c5..0cf7ad4f0e8c 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -280,13 +280,11 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
 void dump_hyp_walk(vaddr_t addr)
 {
     uint64_t ttbr = READ_SYSREG64(TTBR0_EL2);
-    lpae_t *pgtable = THIS_CPU_PGTABLE;
 
     printk("Walking Hypervisor VA 0x%"PRIvaddr" "
            "on CPU%d via TTBR 0x%016"PRIx64"\n",
            addr, smp_processor_id(), ttbr);
 
-    BUG_ON( virt_to_maddr(pgtable) != ttbr );
     dump_pt_walk(ttbr, addr, HYP_PT_ROOT_LEVEL, 1);
 }
 
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (13 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 14/18] xen/arm: mm: Allow dump_hyp_walk() to work on " Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:41   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 16/18] xen/arm: linker: Indent correctly _stext Julien Grall
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

In follow-up patches we will need to have part of Xen identity mapped in
order to safely switch the TTBR.

On some platforms, the identity mapping may have to start at 0. If we always
kept the identity region mapped, a NULL pointer dereference would lead to
an access to a valid mapping.

It would be possible to relocate Xen to avoid clashing with address 0.
However, the identity mapping is only meant to be used in very limited
places. Therefore it would be better to keep the identity region invalid
most of the time.

Two new external helpers are introduced:
    - arch_setup_page_tables() will set up the page-tables so it is
      easy to create the mapping afterwards.
    - update_identity_mapping() will create/remove the identity mapping
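
For illustration, the intended usage pattern of update_identity_mapping()
is the following minimal sketch (the real caller only appears later in the
series):

    #include <xen/init.h>
    #include <asm/mm.h>

    void __init example_identity_mapping_user(void)
    {
        /* Make the identity range valid in the live page-tables */
        update_identity_mapping(true);

        /*
         * ... run the small piece of code that must execute from the
         * identity mapping (e.g. the TTBR switch) ...
         */

        /* Invalidate the identity range again so VA 0 stays unmapped */
        update_identity_mapping(false);
    }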

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - Remove the arm32 part
        - Use a different logic for the boot page tables and runtime
          one because Xen may be running in a different place.
---
 xen/arch/arm/arm64/Makefile         |   1 +
 xen/arch/arm/arm64/mm.c             | 121 ++++++++++++++++++++++++++++
 xen/arch/arm/include/asm/arm32/mm.h |   4 +
 xen/arch/arm/include/asm/arm64/mm.h |  12 +++
 xen/arch/arm/include/asm/setup.h    |  11 +++
 xen/arch/arm/mm.c                   |   6 +-
 6 files changed, 153 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/arm/arm64/mm.c

diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
index 6d507da0d44d..28481393e98f 100644
--- a/xen/arch/arm/arm64/Makefile
+++ b/xen/arch/arm/arm64/Makefile
@@ -10,6 +10,7 @@ obj-y += entry.o
 obj-y += head.o
 obj-y += insn.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
+obj-y += mm.o
 obj-y += smc.o
 obj-y += smpboot.o
 obj-y += traps.o
diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
new file mode 100644
index 000000000000..9eaf545ea9dd
--- /dev/null
+++ b/xen/arch/arm/arm64/mm.c
@@ -0,0 +1,121 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <xen/init.h>
+#include <xen/mm.h>
+
+#include <asm/setup.h>
+
+/* Override macros from asm/page.h to make them work with mfn_t */
+#undef virt_to_mfn
+#define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
+
+static DEFINE_PAGE_TABLE(xen_first_id);
+static DEFINE_PAGE_TABLE(xen_second_id);
+static DEFINE_PAGE_TABLE(xen_third_id);
+
+/*
+ * The identity mapping may start at physical address 0. So we don't want
+ * to keep it mapped longer than necessary.
+ *
+ * When this is called, we are still using the boot_pgtable.
+ *
+ * We need to prepare the identity mapping for both the boot page tables
+ * and runtime page tables.
+ *
+ * The logic to create the entry is slightly different because Xen may
+ * be running at a different location at runtime.
+ */
+static void __init prepare_boot_identity_mapping(void)
+{
+    paddr_t id_addr = virt_to_maddr(_start);
+    lpae_t pte;
+    DECLARE_OFFSETS(id_offsets, id_addr);
+
+    if ( id_offsets[0] != 0 )
+        panic("Cannot handled ID mapping above 512GB\n");
+
+    /* Link first ID table */
+    pte = mfn_to_xen_entry(virt_to_mfn(boot_first_id), MT_NORMAL);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+
+    write_pte(&boot_pgtable[id_offsets[0]], pte);
+
+    /* Link second ID table */
+    pte = mfn_to_xen_entry(virt_to_mfn(boot_second_id), MT_NORMAL);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+
+    write_pte(&boot_first_id[id_offsets[1]], pte);
+
+    /* Link third ID table */
+    pte = mfn_to_xen_entry(virt_to_mfn(boot_third_id), MT_NORMAL);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+
+    write_pte(&boot_second_id[id_offsets[2]], pte);
+
+    /* The mapping in the third table will be created at a later stage */
+}
+
+static void __init prepare_runtime_identity_mapping(void)
+{
+    paddr_t id_addr = virt_to_maddr(_start);
+    lpae_t pte;
+    DECLARE_OFFSETS(id_offsets, id_addr);
+
+    if ( id_offsets[0] != 0 )
+        panic("Cannot handled ID mapping above 512GB\n");
+
+    /* Link first ID table */
+    pte = pte_of_xenaddr((vaddr_t)xen_first_id);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+
+    write_pte(&xen_pgtable[id_offsets[0]], pte);
+
+    /* Link second ID table */
+    pte = pte_of_xenaddr((vaddr_t)xen_second_id);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+
+    write_pte(&xen_first_id[id_offsets[1]], pte);
+
+    /* Link third ID table */
+    pte = pte_of_xenaddr((vaddr_t)xen_third_id);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+
+    write_pte(&xen_second_id[id_offsets[2]], pte);
+
+    /* The mapping in the third table will be created at a later stage */
+}
+
+void __init arch_setup_page_tables(void)
+{
+    prepare_boot_identity_mapping();
+    prepare_runtime_identity_mapping();
+}
+
+void update_identity_mapping(bool enable)
+{
+    paddr_t id_addr = virt_to_maddr(_start);
+    int rc;
+
+    if ( enable )
+        rc = map_pages_to_xen(id_addr, maddr_to_mfn(id_addr), 1,
+                              PAGE_HYPERVISOR_RX);
+    else
+        rc = destroy_xen_mappings(id_addr, id_addr + PAGE_SIZE);
+
+    BUG_ON(rc);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/include/asm/arm32/mm.h b/xen/arch/arm/include/asm/arm32/mm.h
index 8bfc906e7178..856f2dbec4ad 100644
--- a/xen/arch/arm/include/asm/arm32/mm.h
+++ b/xen/arch/arm/include/asm/arm32/mm.h
@@ -18,6 +18,10 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
 
 bool init_domheap_mappings(unsigned int cpu);
 
+static inline void arch_setup_page_tables(void)
+{
+}
+
 #endif /* __ARM_ARM32_MM_H__ */
 
 /*
diff --git a/xen/arch/arm/include/asm/arm64/mm.h b/xen/arch/arm/include/asm/arm64/mm.h
index aa2adac63189..807d3b2321fd 100644
--- a/xen/arch/arm/include/asm/arm64/mm.h
+++ b/xen/arch/arm/include/asm/arm64/mm.h
@@ -1,6 +1,8 @@
 #ifndef __ARM_ARM64_MM_H__
 #define __ARM_ARM64_MM_H__
 
+extern DEFINE_PAGE_TABLE(xen_pgtable);
+
 /*
  * On ARM64, all the RAM is currently direct mapped in Xen.
  * Hence return always true.
@@ -10,6 +12,16 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
     return true;
 }
 
+void arch_setup_page_tables(void);
+
+/*
+ * Enable/disable the identity mapping
+ *
+ * Note that a nested call (e.g. enable=true, enable=true) is not
+ * supported.
+ */
+void update_identity_mapping(bool enable);
+
 #endif /* __ARM_ARM64_MM_H__ */
 
 /*
diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
index fdbf68aadcaa..e7a80fecec14 100644
--- a/xen/arch/arm/include/asm/setup.h
+++ b/xen/arch/arm/include/asm/setup.h
@@ -168,6 +168,17 @@ int map_range_to_domain(const struct dt_device_node *dev,
 
 extern const char __ro_after_init_start[], __ro_after_init_end[];
 
+extern DEFINE_BOOT_PAGE_TABLE(boot_pgtable);
+
+#ifdef CONFIG_ARM_64
+extern DEFINE_BOOT_PAGE_TABLE(boot_first_id);
+#endif
+extern DEFINE_BOOT_PAGE_TABLE(boot_second_id);
+extern DEFINE_BOOT_PAGE_TABLE(boot_third_id);
+
+/* Find where Xen will be residing at runtime and return an PT entry */
+lpae_t pte_of_xenaddr(vaddr_t);
+
 #endif
 /*
  * Local variables:
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 0cf7ad4f0e8c..39e0d9e03c9c 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -93,7 +93,7 @@ DEFINE_BOOT_PAGE_TABLE(boot_third);
 
 #ifdef CONFIG_ARM_64
 #define HYP_PT_ROOT_LEVEL 0
-static DEFINE_PAGE_TABLE(xen_pgtable);
+DEFINE_PAGE_TABLE(xen_pgtable);
 static DEFINE_PAGE_TABLE(xen_first);
 #define THIS_CPU_PGTABLE xen_pgtable
 #else
@@ -388,7 +388,7 @@ void flush_page_to_ram(unsigned long mfn, bool sync_icache)
         invalidate_icache();
 }
 
-static inline lpae_t pte_of_xenaddr(vaddr_t va)
+lpae_t pte_of_xenaddr(vaddr_t va)
 {
     paddr_t ma = va + phys_offset;
 
@@ -495,6 +495,8 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
 
     phys_offset = boot_phys_offset;
 
+    arch_setup_page_tables();
+
 #ifdef CONFIG_ARM_64
     pte = pte_of_xenaddr((uintptr_t)xen_first);
     pte.pt.table = 1;
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 16/18] xen/arm: linker: Indent correctly _stext
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (14 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:42   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 17/18] xen/arm: linker: The identitymap check should cover the whole .text.header Julien Grall
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

_stext is indented by one more space compared to the surrounding lines.
This doesn't seem warranted, so delete the extra space.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v3:
        - Patch added
---
 xen/arch/arm/xen.lds.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 92c298405259..ae8c3b4c6c5f 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -31,7 +31,7 @@ SECTIONS
   . = XEN_VIRT_START;
   _start = .;
   .text : {
-        _stext = .;            /* Text section */
+       _stext = .;             /* Text section */
        *(.text.header)
 
        *(.text.cold)
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 17/18] xen/arm: linker: The identitymap check should cover the whole .text.header
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (15 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 16/18] xen/arm: linker: Indent correctly _stext Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  1:44   ` Stefano Stabellini
  2022-12-12  9:55 ` [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr() Julien Grall
  2022-12-15 11:48 ` [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, we are only checking that some part of .text.header
is part of the identity mapping. However, this doesn't take into account
the literal pool, which will be located at the end of the section.

While we could try to avoid using a literal pool, in the near future we
will also want to use an identity mapping for switch_ttbr().

Not everything in .text.header needs to be part of the identity
mapping. But the section is below a page size (i.e. 4KB), so take a
shortcut and check that .text.header is smaller than a page.

With that, _end_boot can be removed as it is now unused. Take the
opportunity to avoid assuming that the page size is always 4KB in the
error message and comment.

Signed-off-by: Julien Grall <jgrall@amazon.com>
---

    Changes in v3:
        - Patch added
---
 xen/arch/arm/arm32/head.S |  2 --
 xen/arch/arm/arm64/head.S |  2 --
 xen/arch/arm/xen.lds.S    | 10 +++++++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 2658625bc775..e47f90f15b3d 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -730,8 +730,6 @@ fail:   PRINT("- Boot failed -\r\n")
         b     1b
 ENDPROC(fail)
 
-GLOBAL(_end_boot)
-
 /*
  * Switch TTBR
  * r1:r0       ttbr
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 23c2c7491db2..663f5813b12e 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -812,8 +812,6 @@ fail:   PRINT("- Boot failed -\r\n")
         b     1b
 ENDPROC(fail)
 
-GLOBAL(_end_boot)
-
 /*
  * Switch TTBR
  *
diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index ae8c3b4c6c5f..3f7ebd19f3ed 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -32,7 +32,9 @@ SECTIONS
   _start = .;
   .text : {
        _stext = .;             /* Text section */
+       _idmap_start = .;
        *(.text.header)
+       _idmap_end = .;
 
        *(.text.cold)
        *(.text.unlikely .text.*_unlikely .text.unlikely.*)
@@ -225,10 +227,12 @@ SECTIONS
 }
 
 /*
- * We require that Xen is loaded at a 4K boundary, so this ensures that any
- * code running on the boot time identity map cannot cross a section boundary.
+ * We require that Xen is loaded at a page boundary, so this ensures that any
+ * code running on the identity map cannot cross a section boundary.
  */
-ASSERT( _end_boot - start <= PAGE_SIZE, "Boot code is larger than 4K")
+ASSERT(IS_ALIGNED(_idmap_start, PAGE_SIZE), "_idmap_start should be page-aligned")
+ASSERT(_idmap_end - _idmap_start <= PAGE_SIZE, "Identity mapped code is larger than a page size")
+
 /*
  * __init_[begin|end] MUST be at word size boundary otherwise we cannot
  * write fault instructions in the space properly.
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr()
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (16 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 17/18] xen/arm: linker: The identitymap check should cover the whole .text.header Julien Grall
@ 2022-12-12  9:55 ` Julien Grall
  2022-12-13  2:00   ` Stefano Stabellini
  2022-12-15 11:48 ` [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
  18 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-12  9:55 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
still on.

Switching TTBR is like replacing existing mappings with new ones. So
we need to follow the break-before-make sequence.

In this case, it means the MMU needs to be switched off while the
TTBR is updated. In order to disable the MMU, we need to first
jump to an identity mapping.

Rename switch_ttbr() to switch_ttbr_id() and create a helper on
top to temporarily map the identity mapping and call switch_ttbr_id()
via the identity address.

switch_ttbr_id() is now reworked to temporarily turn off the MMU
before updating the TTBR.

We also need to make sure the helper switch_ttbr_id() is part of the
identity mapping; this is the case because it lives in .text.header, which
the previous patch ensured is fully covered by the identity mapping.

The arm32 code will use a different approach. So this issue is for now
only resolved on arm64.
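
As a rough sketch (not the implementation, which is in arm64/mm.c below),
the trick the new switch_ttbr() wrapper relies on is to call
switch_ttbr_id() through a function pointer built from its physical
address, so instruction fetches keep hitting the same physical location
while the MMU is turned off and back on:

    #include <xen/init.h>
    #include <asm/mm.h>

    extern void switch_ttbr_id(uint64_t ttbr);
    typedef void (switch_ttbr_fn)(uint64_t ttbr);

    static void __init call_through_idmap(uint64_t ttbr)
    {
        /* With the identity mapping enabled, VA == PA for this routine */
        switch_ttbr_fn *fn = (switch_ttbr_fn *)virt_to_maddr(switch_ttbr_id);

        /* The MMU is switched off and back on inside switch_ttbr_id() */
        fn(ttbr);
    }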

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    Changes in v2:
        - Remove the arm32 changes. This will be addressed differently
        - Re-instate the instruction cache flush. This is not strictly
          necessary but keep it for safety.
        - Use "dsb ish" rather than "dsb sy".

    TODO:
        * Handle the case where the runtime Xen is loaded at a different
          position for cache coloring. This will be dealt with separately.
---
 xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
 xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
 xen/arch/arm/include/asm/mm.h |  2 ++
 xen/arch/arm/mm.c             | 14 +++++-----
 4 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 663f5813b12e..1f69864492b6 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -816,30 +816,46 @@ ENDPROC(fail)
  * Switch TTBR
  *
  * x0    ttbr
- *
- * TODO: This code does not comply with break-before-make.
  */
-ENTRY(switch_ttbr)
-        dsb   sy                     /* Ensure the flushes happen before
-                                      * continuing */
-        isb                          /* Ensure synchronization with previous
-                                      * changes to text */
-        tlbi   alle2                 /* Flush hypervisor TLB */
-        ic     iallu                 /* Flush I-cache */
-        dsb    sy                    /* Ensure completion of TLB flush */
+ENTRY(switch_ttbr_id)
+        /* 1) Ensure any previous read/write have completed */
+        dsb    ish
+        isb
+
+        /* 2) Turn off MMU */
+        mrs    x1, SCTLR_EL2
+        bic    x1, x1, #SCTLR_Axx_ELx_M
+        msr    SCTLR_EL2, x1
+        isb
+
+        /*
+         * 3) Flush the TLBs.
+         * See asm/arm64/flushtlb.h for the explanation of the sequence.
+         */
+        dsb   nshst
+        tlbi  alle2
+        dsb   nsh
+        isb
+
+        /* 4) Update the TTBR */
+        msr   TTBR0_EL2, x0
         isb
 
-        msr    TTBR0_EL2, x0
+        /*
+         * 5) Flush I-cache
+         * This should not be necessary but it is kept for safety.
+         */
+        ic     iallu
+        isb
 
-        isb                          /* Ensure synchronization with previous
-                                      * changes to text */
-        tlbi   alle2                 /* Flush hypervisor TLB */
-        ic     iallu                 /* Flush I-cache */
-        dsb    sy                    /* Ensure completion of TLB flush */
+        /* 6) Turn on the MMU */
+        mrs   x1, SCTLR_EL2
+        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
+        msr   SCTLR_EL2, x1
         isb
 
         ret
-ENDPROC(switch_ttbr)
+ENDPROC(switch_ttbr_id)
 
 #ifdef CONFIG_EARLY_PRINTK
 /*
diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
index 9eaf545ea9dd..2ede4e75ae33 100644
--- a/xen/arch/arm/arm64/mm.c
+++ b/xen/arch/arm/arm64/mm.c
@@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
     lpae_t pte;
     DECLARE_OFFSETS(id_offsets, id_addr);
 
+    /*
+     * We will be re-using the boot ID tables. They may not have been
+     * zeroed but they should be unlinked. So it is fine to use
+     * clear_page().
+     */
+    clear_page(boot_first_id);
+    clear_page(boot_second_id);
+    clear_page(boot_third_id);
+
     if ( id_offsets[0] != 0 )
         panic("Cannot handled ID mapping above 512GB\n");
 
@@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
     BUG_ON(rc);
 }
 
+extern void switch_ttbr_id(uint64_t ttbr);
+
+typedef void (switch_ttbr_fn)(uint64_t ttbr);
+
+void __init switch_ttbr(uint64_t ttbr)
+{
+    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
+    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+    lpae_t pte;
+
+    /* Enable the identity mapping in the boot page tables */
+    update_identity_mapping(true);
+    /* Enable the identity mapping in the runtime page tables */
+    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
+    pte.pt.table = 1;
+    pte.pt.xn = 0;
+    pte.pt.ro = 1;
+    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
+
+    /* Switch TTBR */
+    fn(ttbr);
+
+    /*
+     * Disable the identity mapping in the runtime page tables.
+     * Note it is not necessary to disable it in the boot page tables
+     * because they are not going to be used by this CPU anymore.
+     */
+    update_identity_mapping(false);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 68adcac9fa8d..bff6923f3ea9 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -196,6 +196,8 @@ extern unsigned long total_pages;
 extern void setup_pagetables(unsigned long boot_phys_offset);
 /* Map FDT in boot pagetable */
 extern void *early_fdt_map(paddr_t fdt_paddr);
+/* Switch to a new root page-tables */
+extern void switch_ttbr(uint64_t ttbr);
 /* Remove early mappings */
 extern void remove_early_mappings(void);
 /* Allocate and initialise pagetables for a secondary CPU. Sets init_ttbr to the
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 39e0d9e03c9c..cf23ae02d1b7 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -476,8 +476,6 @@ static void xen_pt_enforce_wnx(void)
     flush_xen_tlb_local();
 }
 
-extern void switch_ttbr(uint64_t ttbr);
-
 /* Clear a translation table and clean & invalidate the cache */
 static void clear_table(void *table)
 {
@@ -550,13 +548,17 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
     ttbr = (uintptr_t) cpu0_pgtable + phys_offset;
 #endif
 
-    switch_ttbr(ttbr);
-
-    xen_pt_enforce_wnx();
-
+    /*
+     * This needs to be setup first so switch_ttbr() can enable the
+     * identity mapping.
+     */
 #ifdef CONFIG_ARM_32
     per_cpu(xen_pgtable, 0) = cpu0_pgtable;
 #endif
+
+    switch_ttbr(ttbr);
+
+    xen_pt_enforce_wnx();
 }
 
 static void clear_boot_pagetables(void)
-- 
2.38.1



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>"
  2022-12-12  9:55 ` [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>" Julien Grall
@ 2022-12-13  0:31   ` Stefano Stabellini
  2022-12-13 11:10   ` Michal Orzel
  1 sibling, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  0:31 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> "ldr rX, =<label>" is used to load a value from the literal pool. This
> implies a memory access.
> 
> This can be avoided by using the macro mov_w which encode the value in
> the immediate of two instructions.
> 
> So replace all "ldr rX, =<label>" with "mov_w rX, <label>".
> 
> No functional changes intended.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> 
>     Changes in v3:
>         * Patch added
> ---
>  xen/arch/arm/arm32/head.S | 38 +++++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index a558c2a6876e..ce680be91be1 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -62,7 +62,7 @@
>  .endm
>  
>  .macro load_paddr rb, sym
> -        ldr   \rb, =\sym
> +        mov_w \rb, \sym
>          add   \rb, \rb, r10
>  .endm
>  
> @@ -149,7 +149,7 @@ past_zImage:
>          mov   r8, r2                 /* r8 := DTB base address */
>  
>          /* Find out where we are */
> -        ldr   r0, =start
> +        mov_w r0, start
>          adr   r9, start              /* r9  := paddr (start) */
>          sub   r10, r9, r0            /* r10 := phys-offset */
>  
> @@ -170,7 +170,7 @@ past_zImage:
>          bl    enable_mmu
>  
>          /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
> -        ldr   r0, =primary_switched
> +        mov_w r0, primary_switched
>          mov   pc, r0
>  primary_switched:
>          /*
> @@ -190,7 +190,7 @@ primary_switched:
>          /* Setup the arguments for start_xen and jump to C world */
>          mov   r0, r10                /* r0 := Physical offset */
>          mov   r1, r8                 /* r1 := paddr(FDT) */
> -        ldr   r2, =start_xen
> +        mov_w r2, start_xen
>          b     launch
>  ENDPROC(start)
>  
> @@ -198,7 +198,7 @@ GLOBAL(init_secondary)
>          cpsid aif                    /* Disable all interrupts */
>  
>          /* Find out where we are */
> -        ldr   r0, =start
> +        mov_w r0, start
>          adr   r9, start              /* r9  := paddr (start) */
>          sub   r10, r9, r0            /* r10 := phys-offset */
>  
> @@ -227,7 +227,7 @@ GLOBAL(init_secondary)
>  
>  
>          /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
> -        ldr   r0, =secondary_switched
> +        mov_w r0, secondary_switched
>          mov   pc, r0
>  secondary_switched:
>          /*
> @@ -236,7 +236,7 @@ secondary_switched:
>           *
>           * XXX: This is not compliant with the Arm Arm.
>           */
> -        ldr   r4, =init_ttbr         /* VA of HTTBR value stashed by CPU 0 */
> +        mov_w r4, init_ttbr          /* VA of HTTBR value stashed by CPU 0 */
>          ldrd  r4, r5, [r4]           /* Actual value */
>          dsb
>          mcrr  CP64(r4, r5, HTTBR)
> @@ -254,7 +254,7 @@ secondary_switched:
>  #endif
>          PRINT("- Ready -\r\n")
>          /* Jump to C world */
> -        ldr   r2, =start_secondary
> +        mov_w r2, start_secondary
>          b     launch
>  ENDPROC(init_secondary)
>  
> @@ -297,8 +297,8 @@ ENDPROC(check_cpu_mode)
>   */
>  zero_bss:
>          PRINT("- Zero BSS -\r\n")
> -        ldr   r0, =__bss_start       /* r0 := vaddr(__bss_start) */
> -        ldr   r1, =__bss_end         /* r1 := vaddr(__bss_start) */
> +        mov_w r0, __bss_start        /* r0 := vaddr(__bss_start) */
> +        mov_w r1, __bss_end          /* r1 := vaddr(__bss_start) */
>  
>          mov   r2, #0
>  1:      str   r2, [r0], #4
> @@ -330,8 +330,8 @@ cpu_init:
>  
>  cpu_init_done:
>          /* Set up memory attribute type tables */
> -        ldr   r0, =MAIR0VAL
> -        ldr   r1, =MAIR1VAL
> +        mov_w r0, MAIR0VAL
> +        mov_w r1,MAIR1VAL
>          mcr   CP32(r0, HMAIR0)
>          mcr   CP32(r1, HMAIR1)
>  
> @@ -341,10 +341,10 @@ cpu_init_done:
>           * PT walks are write-back, write-allocate in both cache levels,
>           * Full 32-bit address space goes through this table.
>           */
> -        ldr   r0, =(TCR_RES1|TCR_SH0_IS|TCR_ORGN0_WBWA|TCR_IRGN0_WBWA|TCR_T0SZ(0))
> +        mov_w r0, (TCR_RES1|TCR_SH0_IS|TCR_ORGN0_WBWA|TCR_IRGN0_WBWA|TCR_T0SZ(0))
>          mcr   CP32(r0, HTCR)
>  
> -        ldr   r0, =HSCTLR_SET
> +        mov_w r0, HSCTLR_SET
>          mcr   CP32(r0, HSCTLR)
>          isb
>  
> @@ -452,7 +452,7 @@ ENDPROC(cpu_init)
>   */
>  create_page_tables:
>          /* Prepare the page-tables for mapping Xen */
> -        ldr   r0, =XEN_VIRT_START
> +        mov_w r0, XEN_VIRT_START
>          create_table_entry boot_pgtable, boot_second, r0, 1
>          create_table_entry boot_second, boot_third, r0, 2
>  
> @@ -576,7 +576,7 @@ remove_identity_mapping:
>          cmp   r1, #XEN_FIRST_SLOT
>          beq   1f
>          /* It is not in slot 0, remove the entry */
> -        ldr   r0, =boot_pgtable      /* r0 := root table */
> +        mov_w r0, boot_pgtable       /* r0 := root table */
>          lsl   r1, r1, #3             /* r1 := Slot offset */
>          strd  r2, r3, [r0, r1]
>          b     identity_mapping_removed
> @@ -590,7 +590,7 @@ remove_identity_mapping:
>          cmp   r1, #XEN_SECOND_SLOT
>          beq   identity_mapping_removed
>          /* It is not in slot 1, remove the entry */
> -        ldr   r0, =boot_second       /* r0 := second table */
> +        mov_w r0, boot_second        /* r0 := second table */
>          lsl   r1, r1, #3             /* r1 := Slot offset */
>          strd  r2, r3, [r0, r1]
>  
> @@ -620,7 +620,7 @@ ENDPROC(remove_identity_mapping)
>  setup_fixmap:
>  #if defined(CONFIG_EARLY_PRINTK)
>          /* Add UART to the fixmap table */
> -        ldr   r0, =EARLY_UART_VIRTUAL_ADDRESS
> +        mov_w r0, EARLY_UART_VIRTUAL_ADDRESS
>          create_mapping_entry xen_fixmap, r0, r11, type=PT_DEV_L3
>  #endif
>          /* Map fixmap into boot_second */
> @@ -643,7 +643,7 @@ ENDPROC(setup_fixmap)
>   * Clobbers r3
>   */
>  launch:
> -        ldr   r3, =init_data
> +        mov_w r3, init_data
>          add   r3, #INITINFO_stack    /* Find the boot-time stack */
>          ldr   sp, [r3]
>          add   sp, #STACK_SIZE        /* (which grows down from the top). */
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 07/18] xen/arm32: head: Jump to the runtime mapping in enable_mmu()
  2022-12-12  9:55 ` [PATCH v3 07/18] xen/arm32: head: Jump to the runtime mapping in enable_mmu() Julien Grall
@ 2022-12-13  0:46   ` Stefano Stabellini
  0 siblings, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  0:46 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, enable_mmu() will return to an address in the 1:1 mapping
> and each path is responsible to switch to the runtime mapping.
> 
> In a follow-up patch, the behavior to switch to the runtime mapping
> will become more complex. So to avoid more code/comment duplication,
> move the switch in enable_mmu().
> 
> Lastly, take the opportunity to replace load from literal pool with
> mov_w.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
>     Changes in v3:
>         - Fix typo in the commit message
> 
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/arm32/head.S | 50 +++++++++++++++++++++++----------------
>  1 file changed, 30 insertions(+), 20 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index ce680be91be1..40c1d7502007 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -167,19 +167,11 @@ past_zImage:
>          bl    check_cpu_mode
>          bl    cpu_init
>          bl    create_page_tables
> -        bl    enable_mmu
>  
> -        /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
> -        mov_w r0, primary_switched
> -        mov   pc, r0
> +        /* Address in the runtime mapping to jump to after the MMU is enabled */
> +        mov_w lr, primary_switched
> +        b     enable_mmu
>  primary_switched:
> -        /*
> -         * The 1:1 map may clash with other parts of the Xen virtual memory
> -         * layout. As it is not used anymore, remove it completely to
> -         * avoid having to worry about replacing existing mapping
> -         * afterwards.
> -         */
> -        bl    remove_identity_mapping
>          bl    setup_fixmap
>  #ifdef CONFIG_EARLY_PRINTK
>          /* Use a virtual address to access the UART. */
> @@ -223,12 +215,10 @@ GLOBAL(init_secondary)
>          bl    check_cpu_mode
>          bl    cpu_init
>          bl    create_page_tables
> -        bl    enable_mmu
>  
> -
> -        /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
> -        mov_w r0, secondary_switched
> -        mov   pc, r0
> +        /* Address in the runtime mapping to jump to after the MMU is enabled */
> +        mov_w lr, secondary_switched
> +        b     enable_mmu
>  secondary_switched:
>          /*
>           * Non-boot CPUs need to move on to the proper pagetables, which were
> @@ -523,9 +513,12 @@ virtphys_clash:
>  ENDPROC(create_page_tables)
>  
>  /*
> - * Turn on the Data Cache and the MMU. The function will return on the 1:1
> - * mapping. In other word, the caller is responsible to switch to the runtime
> - * mapping.
> + * Turn on the Data Cache and the MMU. The function will return
> + * to the virtual address provided in LR (e.g. the runtime mapping).
> + *
> + * Inputs:
> + *   r9 : paddr(start)
> + *   lr : Virtual address to return to
>   *
>   * Clobbers r0 - r3
>   */
> @@ -551,7 +544,24 @@ enable_mmu:
>          dsb                          /* Flush PTE writes and finish reads */
>          mcr   CP32(r0, HSCTLR)       /* now paging is enabled */
>          isb                          /* Now, flush the icache */
> -        mov   pc, lr
> +
> +        /*
> +         * The MMU is turned on and we are in the 1:1 mapping. Switch
> +         * to the runtime mapping.
> +         */
> +        mov_w r0, 1f
> +        mov   pc, r0
> +1:
> +        /*
> +         * The 1:1 map may clash with other parts of the Xen virtual memory
> +         * layout. As it is not used anymore, remove it completely to
> +         * avoid having to worry about replacing existing mapping
> +         * afterwards.
> +         *
> +         * On return this will jump to the virtual address requested by
> +         * the caller.
> +         */
> +        b     remove_identity_mapping
>  ENDPROC(enable_mmu)
>  
>  /*
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 11/18] xen/arm: Enable use of dump_pt_walk() early during boot
  2022-12-12  9:55 ` [PATCH v3 11/18] xen/arm: Enable use of dump_pt_walk() early during boot Julien Grall
@ 2022-12-13  1:06   ` Stefano Stabellini
  0 siblings, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:06 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, dump_pt_walk() is using map_domain_page() to map
> the page tables.
> 
> map_domain_page() is only usuable after init_domheap_mappings() is called
> (arm32) or the xenheap has been initialized (arm64).
> 
> This means it can be hard to diagnose incorrect page-tables during
> early boot. So update dump_pt_walk() to xen_{, un}map_table() instead.
> 
> Note that the two helpers are moved earlier to avoid forward declaring
> them.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  xen/arch/arm/mm.c | 56 +++++++++++++++++++++++------------------------
>  1 file changed, 28 insertions(+), 28 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 1315a2c87db7..d0b1cf55f550 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -191,6 +191,30 @@ static void __init __maybe_unused build_assertions(void)
>  #undef CHECK_DIFFERENT_SLOT
>  }
>  
> +static lpae_t *xen_map_table(mfn_t mfn)
> +{
> +    /*
> +     * During early boot, map_domain_page() may be unusable. Use the
> +     * PMAP to map temporarily a page-table.
> +     */
> +    if ( system_state == SYS_STATE_early_boot )
> +        return pmap_map(mfn);
> +
> +    return map_domain_page(mfn);
> +}
> +
> +static void xen_unmap_table(const lpae_t *table)
> +{
> +    /*
> +     * During early boot, xen_map_table() will not use map_domain_page()
> +     * but the PMAP.
> +     */
> +    if ( system_state == SYS_STATE_early_boot )
> +        pmap_unmap(table);
> +    else
> +        unmap_domain_page(table);
> +}
> +
>  void dump_pt_walk(paddr_t ttbr, paddr_t addr,
>                    unsigned int root_level,
>                    unsigned int nr_root_tables)
> @@ -230,7 +254,7 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
>      else
>          root_table = 0;
>  
> -    mapping = map_domain_page(mfn_add(root_mfn, root_table));
> +    mapping = xen_map_table(mfn_add(root_mfn, root_table));
>  
>      for ( level = root_level; ; level++ )
>      {
> @@ -246,11 +270,11 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
>              break;
>  
>          /* For next iteration */
> -        unmap_domain_page(mapping);
> -        mapping = map_domain_page(lpae_get_mfn(pte));
> +        xen_unmap_table(mapping);
> +        mapping = xen_map_table(lpae_get_mfn(pte));
>      }
>  
> -    unmap_domain_page(mapping);
> +    xen_unmap_table(mapping);
>  }
>  
>  void dump_hyp_walk(vaddr_t addr)
> @@ -713,30 +737,6 @@ void *ioremap(paddr_t pa, size_t len)
>      return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>  }
>  
> -static lpae_t *xen_map_table(mfn_t mfn)
> -{
> -    /*
> -     * During early boot, map_domain_page() may be unusable. Use the
> -     * PMAP to map temporarily a page-table.
> -     */
> -    if ( system_state == SYS_STATE_early_boot )
> -        return pmap_map(mfn);
> -
> -    return map_domain_page(mfn);
> -}
> -
> -static void xen_unmap_table(const lpae_t *table)
> -{
> -    /*
> -     * During early boot, xen_map_table() will not use map_domain_page()
> -     * but the PMAP.
> -     */
> -    if ( system_state == SYS_STATE_early_boot )
> -        pmap_unmap(table);
> -    else
> -        unmap_domain_page(table);
> -}
> -
>  static int create_xen_table(lpae_t *entry)
>  {
>      mfn_t mfn;
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 12/18] xen/arm64: Rework the memory layout
  2022-12-12  9:55 ` [PATCH v3 12/18] xen/arm64: Rework the memory layout Julien Grall
@ 2022-12-13  1:22   ` Stefano Stabellini
  2022-12-13 18:31     ` Julien Grall
  0 siblings, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:22 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Xen is currently not fully compliant with the Arm Arm because it will
> switch the TTBR with the MMU on.
> 
> In order to be compliant, we need to disable the MMU before
> switching the TTBR. The implication is the page-tables should
> contain an identity mapping of the code switching the TTBR.
> 
> In most of the case we expect Xen to be loaded in low memory. I am aware
> of one platform (i.e AMD Seattle) where the memory start above 512GB.
> To give us some slack, consider that Xen may be loaded in the first 2TB
> of the physical address space.
> 
> The memory layout is reshuffled to keep the first two slots of the zeroeth
> level free. Xen will now be loaded at (2TB + 2MB). This requires a slight
> tweak of the boot code because XEN_VIRT_START cannot be used as an
> immediate.
> 
> This reshuffle will make trivial to create a 1:1 mapping when Xen is
> loaded below 2TB.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> ---
> 
>     Changes in v2:
>         - Reword the commit message
>         - Load Xen at 2TB + 2MB
>         - Update the documentation to reflect the new layout
> ---
>  xen/arch/arm/arm64/head.S         |  3 ++-
>  xen/arch/arm/include/asm/config.h | 34 +++++++++++++++++++++----------
>  xen/arch/arm/mm.c                 | 11 +++++-----
>  3 files changed, 31 insertions(+), 17 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index ad014716db6f..23c2c7491db2 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -607,7 +607,8 @@ create_page_tables:
>           * need an additional 1:1 mapping, the virtual mapping will
>           * suffice.
>           */
> -        cmp   x19, #XEN_VIRT_START
> +        ldr   x0, =XEN_VIRT_START
> +        cmp   x19, x0
>          bne   1f
>          ret
>  1:
> diff --git a/xen/arch/arm/include/asm/config.h b/xen/arch/arm/include/asm/config.h
> index 6c1b762e976d..9fe6bfeeeb95 100644
> --- a/xen/arch/arm/include/asm/config.h
> +++ b/xen/arch/arm/include/asm/config.h
> @@ -72,15 +72,12 @@
>  #include <xen/page-size.h>
>  
>  /*
> - * Common ARM32 and ARM64 layout:
> + * ARM32 layout:
>   *   0  -   2M   Unmapped
>   *   2M -   4M   Xen text, data, bss
>   *   4M -   6M   Fixmap: special-purpose 4K mapping slots
>   *   6M -  10M   Early boot mapping of FDT
> - *   10M - 12M   Livepatch vmap (if compiled in)
> - *
> - * ARM32 layout:
> - *   0  -  12M   <COMMON>
> + *  10M -  12M   Livepatch vmap (if compiled in)
>   *
>   *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
>   * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
> @@ -90,8 +87,17 @@
>   *   2G -   4G   Domheap: on-demand-mapped
>   *
>   * ARM64 layout:
> - * 0x0000000000000000 - 0x0000007fffffffff (512GB, L0 slot [0])
> - *   0  -  12M   <COMMON>
> + * 0x0000000000000000 - 0x00001fffffffffff (2TB, L0 slots [0..1])
> + *

Extra blank line


> + *  Reserved to identity map Xen
> + *
> + * 0x0000020000000000 - 0x000028fffffffff (512TB, L0 slot [2]
> + *  (Relative offsets)
> + *   0  -   2M   Unmapped
> + *   2M -   4M   Xen text, data, bss
> + *   4M -   6M   Fixmap: special-purpose 4K mapping slots
> + *   6M -  10M   Early boot mapping of FDT
> + *  10M -  12M   Livepatch vmap (if compiled in)
>   *
>   *   1G -   2G   VMAP: ioremap and early_ioremap
>   *
> @@ -107,7 +113,17 @@
>   *  Unused
>   */
>  
> +#ifdef CONFIG_ARM_32
>  #define XEN_VIRT_START          _AT(vaddr_t, MB(2))
> +#else
> +
> +#define SLOT0_ENTRY_BITS  39
> +#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
> +#define SLOT0_ENTRY_SIZE  SLOT0(1)
> +
> +#define XEN_VIRT_START          (SLOT0(2) + _AT(vaddr_t, MB(2)))
> +#endif

Sorry for the silly question and I apologize if I got the math wrong.

1<<39 is 512GB, so:

slot0 is [0..512GB]
slot1 is [512GB..1TB]
slot2 is [1TB..1.5TB]
slot3 is [1.5TB..2TB]
slot4 is [2TB..2.5TB]

So, if we want Xen just above 2TB we should use slot4? Which would be
SLOT0(4) ?
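
For reference, spelling out the arithmetic implied by the quoted macros
(illustrative only, written as a plain C equivalent rather than the Xen
definitions):

    #define SLOT0_ENTRY_BITS  39
    #define SLOT0(slot)  ((unsigned long long)(slot) << SLOT0_ENTRY_BITS)

    /*
     * SLOT0(1) == 1ULL << 39 == 512GB  (the size covered by one L0 slot)
     * SLOT0(2) == 1TB                  (the start of L0 slot 2)
     * SLOT0(4) == 2TB                  (the start of L0 slot 4)
     */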


>  #define XEN_VIRT_SIZE           _AT(vaddr_t, MB(2))
>  
>  #define FIXMAP_VIRT_START       (XEN_VIRT_START + XEN_VIRT_SIZE)
> @@ -164,10 +180,6 @@
>  
>  #else /* ARM_64 */
>  
> -#define SLOT0_ENTRY_BITS  39
> -#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
> -#define SLOT0_ENTRY_SIZE  SLOT0(1)
> -
>  #define VMAP_VIRT_START  GB(1)
>  #define VMAP_VIRT_SIZE   GB(1)
>  
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index d0b1cf55f550..cc11f5c639e6 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -153,7 +153,7 @@ static void __init __maybe_unused build_assertions(void)
>  #endif
>      /* Page table structure constraints */
>  #ifdef CONFIG_ARM_64
> -    BUILD_BUG_ON(zeroeth_table_offset(XEN_VIRT_START));
> +    BUILD_BUG_ON(zeroeth_table_offset(XEN_VIRT_START) < 2);
>  #endif
>      BUILD_BUG_ON(first_table_offset(XEN_VIRT_START));
>  #ifdef CONFIG_ARCH_MAP_DOMAIN_PAGE
> @@ -498,10 +498,11 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
>      phys_offset = boot_phys_offset;
>  
>  #ifdef CONFIG_ARM_64
> -    p = (void *) xen_pgtable;
> -    p[0] = pte_of_xenaddr((uintptr_t)xen_first);
> -    p[0].pt.table = 1;
> -    p[0].pt.xn = 0;
> +    pte = pte_of_xenaddr((uintptr_t)xen_first);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +    xen_pgtable[zeroeth_table_offset(XEN_VIRT_START)] = pte;
> +
>      p = (void *) xen_first;
>  #else
>      p = (void *) cpu0_pgtable;
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 13/18] xen/arm: mm: Allow xen_pt_update() to work with the current root table
  2022-12-12  9:55 ` [PATCH v3 13/18] xen/arm: mm: Allow xen_pt_update() to work with the current root table Julien Grall
@ 2022-12-13  1:24   ` Stefano Stabellini
  0 siblings, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:24 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, xen_pt_update() will only work on the runtime page tables.
> In follow-up patches, we will also want to use the helper to update
> the boot page tables.
> 
> All the existing callers of xen_pt_update() expects to modify the
> current page-tables. Therefore, we can read the root physical address
> directly from TTBR0_EL2.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>

> ---
> 
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/mm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index cc11f5c639e6..26d6b70410c5 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1114,7 +1114,7 @@ static int xen_pt_update(unsigned long virt,
>       *
>       * XXX: Add a check.
>       */
> -    const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
> +    const mfn_t root = maddr_to_mfn(READ_SYSREG64(TTBR0_EL2));
>  
>      /*
>       * The hardware was configured to forbid mapping both writeable and
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 14/18] xen/arm: mm: Allow dump_hyp_walk() to work on the current root table
  2022-12-12  9:55 ` [PATCH v3 14/18] xen/arm: mm: Allow dump_hyp_walk() to work on " Julien Grall
@ 2022-12-13  1:24   ` Stefano Stabellini
  0 siblings, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:24 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> dump_hyp_walk() is used to print the tables walk in case of the data or
> instruction abort.
> 
> Those abort are not limited to the runtime and could happen at early
> boot. However, the current implementation of dump_hyp_walk() check
> that the TTBR matches the runtime page tables.
> 
> Therefore, early abort will result to a secondary abort and not
> print the table walks.
> 
> Given that the function is called in the abort path, there is no
> reason for us to keep the BUG_ON() in any form. So drop it.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v2:
>         - Patch added
> ---
>  xen/arch/arm/mm.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 26d6b70410c5..0cf7ad4f0e8c 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -280,13 +280,11 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
>  void dump_hyp_walk(vaddr_t addr)
>  {
>      uint64_t ttbr = READ_SYSREG64(TTBR0_EL2);
> -    lpae_t *pgtable = THIS_CPU_PGTABLE;
>  
>      printk("Walking Hypervisor VA 0x%"PRIvaddr" "
>             "on CPU%d via TTBR 0x%016"PRIx64"\n",
>             addr, smp_processor_id(), ttbr);
>  
> -    BUG_ON( virt_to_maddr(pgtable) != ttbr );
>      dump_pt_walk(ttbr, addr, HYP_PT_ROOT_LEVEL, 1);
>  }
>  
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping
  2022-12-12  9:55 ` [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping Julien Grall
@ 2022-12-13  1:41   ` Stefano Stabellini
  2023-01-12 22:03     ` Julien Grall
  0 siblings, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:41 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> In follow-up patches we will need to have part of Xen identity mapped in
> order to safely switch the TTBR.
> 
> On some platform, the identity mapping may have to start at 0. If we always
> keep the identity region mapped, NULL pointer dereference would lead to
> access to valid mapping.
> 
> It would be possible to relocate Xen to avoid clashing with address 0.
> However the identity mapping is only meant to be used in very limited
> places. Therefore it would be better to keep the identity region invalid
> for most of the time.
> 
> Two new external helpers are introduced:
>     - arch_setup_page_tables() will setup the page-tables so it is
>       easy to create the mapping afterwards.
>     - update_identity_mapping() will create/remove the identity mapping
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - Remove the arm32 part
>         - Use a different logic for the boot page tables and runtime
>           one because Xen may be running in a different place.
> ---
>  xen/arch/arm/arm64/Makefile         |   1 +
>  xen/arch/arm/arm64/mm.c             | 121 ++++++++++++++++++++++++++++
>  xen/arch/arm/include/asm/arm32/mm.h |   4 +
>  xen/arch/arm/include/asm/arm64/mm.h |  12 +++
>  xen/arch/arm/include/asm/setup.h    |  11 +++
>  xen/arch/arm/mm.c                   |   6 +-
>  6 files changed, 153 insertions(+), 2 deletions(-)
>  create mode 100644 xen/arch/arm/arm64/mm.c
> 
> diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
> index 6d507da0d44d..28481393e98f 100644
> --- a/xen/arch/arm/arm64/Makefile
> +++ b/xen/arch/arm/arm64/Makefile
> @@ -10,6 +10,7 @@ obj-y += entry.o
>  obj-y += head.o
>  obj-y += insn.o
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o
> +obj-y += mm.o
>  obj-y += smc.o
>  obj-y += smpboot.o
>  obj-y += traps.o
> diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
> new file mode 100644
> index 000000000000..9eaf545ea9dd
> --- /dev/null
> +++ b/xen/arch/arm/arm64/mm.c
> @@ -0,0 +1,121 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +
> +#include <asm/setup.h>
> +
> +/* Override macros from asm/page.h to make them work with mfn_t */
> +#undef virt_to_mfn
> +#define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
> +
> +static DEFINE_PAGE_TABLE(xen_first_id);
> +static DEFINE_PAGE_TABLE(xen_second_id);
> +static DEFINE_PAGE_TABLE(xen_third_id);
> +
> +/*
> + * The identity mapping may start at physical address 0. So we don't want
> + * to keep it mapped longer than necessary.
> + *
> + * When this is called, we are still using the boot_pgtable.
> + *
> + * We need to prepare the identity mapping for both the boot page tables
> + * and runtime page tables.
> + *
> + * The logic to create the entry is slightly different because Xen may
> + * be running at a different location at runtime.
> + */
> +static void __init prepare_boot_identity_mapping(void)
> +{
> +    paddr_t id_addr = virt_to_maddr(_start);
> +    lpae_t pte;
> +    DECLARE_OFFSETS(id_offsets, id_addr);
> +
> +    if ( id_offsets[0] != 0 )
> +        panic("Cannot handled ID mapping above 512GB\n");
> +
> +    /* Link first ID table */
> +    pte = mfn_to_xen_entry(virt_to_mfn(boot_first_id), MT_NORMAL);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +
> +    write_pte(&boot_pgtable[id_offsets[0]], pte);
> +
> +    /* Link second ID table */
> +    pte = mfn_to_xen_entry(virt_to_mfn(boot_second_id), MT_NORMAL);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +
> +    write_pte(&boot_first_id[id_offsets[1]], pte);
> +
> +    /* Link third ID table */
> +    pte = mfn_to_xen_entry(virt_to_mfn(boot_third_id), MT_NORMAL);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +
> +    write_pte(&boot_second_id[id_offsets[2]], pte);
> +
> +    /* The mapping in the third table will be created at a later stage */
> +}
> +
> +static void __init prepare_runtime_identity_mapping(void)
> +{
> +    paddr_t id_addr = virt_to_maddr(_start);
> +    lpae_t pte;
> +    DECLARE_OFFSETS(id_offsets, id_addr);
> +
> +    if ( id_offsets[0] != 0 )
> +        panic("Cannot handled ID mapping above 512GB\n");
> +
> +    /* Link first ID table */
> +    pte = pte_of_xenaddr((vaddr_t)xen_first_id);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +
> +    write_pte(&xen_pgtable[id_offsets[0]], pte);
> +
> +    /* Link second ID table */
> +    pte = pte_of_xenaddr((vaddr_t)xen_second_id);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +
> +    write_pte(&xen_first_id[id_offsets[1]], pte);
> +
> +    /* Link third ID table */
> +    pte = pte_of_xenaddr((vaddr_t)xen_third_id);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +
> +    write_pte(&xen_second_id[id_offsets[2]], pte);
> +
> +    /* The mapping in the third table will be created at a later stage */
> +}
> +
> +void __init arch_setup_page_tables(void)
> +{
> +    prepare_boot_identity_mapping();
> +    prepare_runtime_identity_mapping();
> +}
> +
> +void update_identity_mapping(bool enable)
> +{
> +    paddr_t id_addr = virt_to_maddr(_start);
> +    int rc;
> +
> +    if ( enable )
> +        rc = map_pages_to_xen(id_addr, maddr_to_mfn(id_addr), 1,
> +                              PAGE_HYPERVISOR_RX);
> +    else
> +        rc = destroy_xen_mappings(id_addr, id_addr + PAGE_SIZE);
> +
> +    BUG_ON(rc);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/include/asm/arm32/mm.h b/xen/arch/arm/include/asm/arm32/mm.h
> index 8bfc906e7178..856f2dbec4ad 100644
> --- a/xen/arch/arm/include/asm/arm32/mm.h
> +++ b/xen/arch/arm/include/asm/arm32/mm.h
> @@ -18,6 +18,10 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
>  
>  bool init_domheap_mappings(unsigned int cpu);
>  
> +static inline void arch_setup_page_tables(void)
> +{
> +}
> +
>  #endif /* __ARM_ARM32_MM_H__ */
>  
>  /*
> diff --git a/xen/arch/arm/include/asm/arm64/mm.h b/xen/arch/arm/include/asm/arm64/mm.h
> index aa2adac63189..807d3b2321fd 100644
> --- a/xen/arch/arm/include/asm/arm64/mm.h
> +++ b/xen/arch/arm/include/asm/arm64/mm.h
> @@ -1,6 +1,8 @@
>  #ifndef __ARM_ARM64_MM_H__
>  #define __ARM_ARM64_MM_H__
>  
> +extern DEFINE_PAGE_TABLE(xen_pgtable);
> +
>  /*
>   * On ARM64, all the RAM is currently direct mapped in Xen.
>   * Hence return always true.
> @@ -10,6 +12,16 @@ static inline bool arch_mfns_in_directmap(unsigned long mfn, unsigned long nr)
>      return true;
>  }
>  
> +void arch_setup_page_tables(void);
> +
> +/*
> + * Enable/disable the identity mapping
> + *
> + * Note that nested a call (e.g. enable=true, enable=true) is not
> + * supported.
> + */
> +void update_identity_mapping(bool enable);
> +
>  #endif /* __ARM_ARM64_MM_H__ */
>  
>  /*
> diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
> index fdbf68aadcaa..e7a80fecec14 100644
> --- a/xen/arch/arm/include/asm/setup.h
> +++ b/xen/arch/arm/include/asm/setup.h
> @@ -168,6 +168,17 @@ int map_range_to_domain(const struct dt_device_node *dev,
>  
>  extern const char __ro_after_init_start[], __ro_after_init_end[];
>  
> +extern DEFINE_BOOT_PAGE_TABLE(boot_pgtable);
> +
> +#ifdef CONFIG_ARM_64
> +extern DEFINE_BOOT_PAGE_TABLE(boot_first_id);
> +#endif
> +extern DEFINE_BOOT_PAGE_TABLE(boot_second_id);
> +extern DEFINE_BOOT_PAGE_TABLE(boot_third_id);

This is more a matter of taste, but I would either:
- declare all the BOOT_PAGE_TABLEs extern here, both ARM64 and ARM32,
  with #ifdefs
- or declare all the ARM64-only BOOT_PAGE_TABLEs in arm64/mm.h and all
  the ARM32-only BOOT_PAGE_TABLEs in arm32/mm.h

Right now we have a mix, as we have boot_first_id with an #ifdef here
and we have xen_pgtable in arm64/mm.h.

Also we are missing boot_second and boot_third. We might as well be
consistent and declare them all?
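
Something along these lines for the first option (just a sketch reusing
the names from this patch, untested):

    /* In asm/setup.h */
    extern DEFINE_BOOT_PAGE_TABLE(boot_pgtable);
    #ifdef CONFIG_ARM_64
    extern DEFINE_BOOT_PAGE_TABLE(boot_first_id);
    #endif
    extern DEFINE_BOOT_PAGE_TABLE(boot_second);
    extern DEFINE_BOOT_PAGE_TABLE(boot_second_id);
    extern DEFINE_BOOT_PAGE_TABLE(boot_third);
    extern DEFINE_BOOT_PAGE_TABLE(boot_third_id);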



> +/* Find where Xen will be residing at runtime and return an PT entry */
> +lpae_t pte_of_xenaddr(vaddr_t);
> +
>  #endif
>  /*
>   * Local variables:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 0cf7ad4f0e8c..39e0d9e03c9c 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -93,7 +93,7 @@ DEFINE_BOOT_PAGE_TABLE(boot_third);
>  
>  #ifdef CONFIG_ARM_64
>  #define HYP_PT_ROOT_LEVEL 0
> -static DEFINE_PAGE_TABLE(xen_pgtable);
> +DEFINE_PAGE_TABLE(xen_pgtable);
>  static DEFINE_PAGE_TABLE(xen_first);
>  #define THIS_CPU_PGTABLE xen_pgtable
>  #else
> @@ -388,7 +388,7 @@ void flush_page_to_ram(unsigned long mfn, bool sync_icache)
>          invalidate_icache();
>  }
>  
> -static inline lpae_t pte_of_xenaddr(vaddr_t va)
> +lpae_t pte_of_xenaddr(vaddr_t va)
>  {
>      paddr_t ma = va + phys_offset;
>  
> @@ -495,6 +495,8 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
>  
>      phys_offset = boot_phys_offset;
>  
> +    arch_setup_page_tables();
> +
>  #ifdef CONFIG_ARM_64
>      pte = pte_of_xenaddr((uintptr_t)xen_first);
>      pte.pt.table = 1;
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 16/18] xen/arm: linker: Indent correctly _stext
  2022-12-12  9:55 ` [PATCH v3 16/18] xen/arm: linker: Indent correctly _stext Julien Grall
@ 2022-12-13  1:42   ` Stefano Stabellini
  0 siblings, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:42 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> _stext is indented by one space more compare to the lines. This doesn't
> seem warrant, so delete the extra space.
> 
> Signed-off: Julien Grall <jgrall@amazon.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v3:
>         - Patch added
> ---
>  xen/arch/arm/xen.lds.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index 92c298405259..ae8c3b4c6c5f 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -31,7 +31,7 @@ SECTIONS
>    . = XEN_VIRT_START;
>    _start = .;
>    .text : {
> -        _stext = .;            /* Text section */
> +       _stext = .;             /* Text section */
>         *(.text.header)
>  
>         *(.text.cold)
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 17/18] xen/arm: linker: The identitymap check should cover the whole .text.header
  2022-12-12  9:55 ` [PATCH v3 17/18] xen/arm: linker: The identitymap check should cover the whole .text.header Julien Grall
@ 2022-12-13  1:44   ` Stefano Stabellini
  0 siblings, 0 replies; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  1:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, we are only checking that only some part of .text.header
> is part of the identity mapping. However, this doesn't take into account
> the litteral pool which will be located at the end of the section.
      ^ literal


> While we could try to avoid using a literal pool, in the near future we
> will also want to use an identity mapping for switch_ttbr().
> 
> Not everything in .text.header requires to be part of the identity
> mapping. But it is below a page size (i.e. 4KB) so take a shortcut and
> check that .text.header is smaller than a page size.
> 
> With that _end_boot can be removed as it is now unused. Take the
> pportunity to avoid assuming that a page size is always 4KB in the
  ^ opportunity

> error message and comment.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> 
>     Changes in v3:
>         - Patch added
> ---
>  xen/arch/arm/arm32/head.S |  2 --
>  xen/arch/arm/arm64/head.S |  2 --
>  xen/arch/arm/xen.lds.S    | 10 +++++++---
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index 2658625bc775..e47f90f15b3d 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -730,8 +730,6 @@ fail:   PRINT("- Boot failed -\r\n")
>          b     1b
>  ENDPROC(fail)
>  
> -GLOBAL(_end_boot)
> -
>  /*
>   * Switch TTBR
>   * r1:r0       ttbr
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 23c2c7491db2..663f5813b12e 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -812,8 +812,6 @@ fail:   PRINT("- Boot failed -\r\n")
>          b     1b
>  ENDPROC(fail)
>  
> -GLOBAL(_end_boot)
> -
>  /*
>   * Switch TTBR
>   *
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index ae8c3b4c6c5f..3f7ebd19f3ed 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -32,7 +32,9 @@ SECTIONS
>    _start = .;
>    .text : {
>         _stext = .;             /* Text section */
> +       _idmap_start = .;
>         *(.text.header)
> +       _idmap_end = .;
>  
>         *(.text.cold)
>         *(.text.unlikely .text.*_unlikely .text.unlikely.*)
> @@ -225,10 +227,12 @@ SECTIONS
>  }
>  
>  /*
> - * We require that Xen is loaded at a 4K boundary, so this ensures that any
> - * code running on the boot time identity map cannot cross a section boundary.
> + * We require that Xen is loaded at a page boundary, so this ensures that any
> + * code running on the identity map cannot cross a section boundary.
>   */
> -ASSERT( _end_boot - start <= PAGE_SIZE, "Boot code is larger than 4K")
> +ASSERT(IS_ALIGNED(_idmap_start, PAGE_SIZE), "_idmap_start should be page-aligned")
> +ASSERT(_idmap_end - _idmap_start <= PAGE_SIZE, "Identity mapped code is larger than a page size")
> +
>  /*
>   * __init_[begin|end] MUST be at word size boundary otherwise we cannot
>   * write fault instructions in the space properly.
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr()
  2022-12-12  9:55 ` [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr() Julien Grall
@ 2022-12-13  2:00   ` Stefano Stabellini
  2022-12-13 19:08     ` Julien Grall
  0 siblings, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13  2:00 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On Mon, 12 Dec 2022, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
> still on.
> 
> Switching TTBR is like replacing existing mappings with new ones. So
> we need to follow the break-before-make sequence.
> 
> In this case, it means the MMU needs to be switched off while the
> TTBR is updated. In order to disable the MMU, we need to first
> jump to an identity mapping.
> 
> Rename switch_ttbr() to switch_ttbr_id() and create an helper on
> top to temporary map the identity mapping and call switch_ttbr()
> via the identity address.
> 
> switch_ttbr_id() is now reworked to temporarily turn off the MMU
> before updating the TTBR.
> 
> We also need to make sure the helper switch_ttbr() is part of the
> identity mapping. So move _end_boot past it.
> 
> The arm32 code will use a different approach. So this issue is for now
> only resolved on arm64.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

This patch looks overall good to me, aside from the few minor comments
below. I would love for someone else, maybe from ARM, to review steps
1-6 and make sure they are the right sequence.


> ---
> 
>     Changes in v2:
>         - Remove the arm32 changes. This will be addressed differently
>         - Re-instate the instruct cache flush. This is not strictly
>           necessary but kept it for safety.
>         - Use "dsb ish"  rather than "dsb sy".
> 
>     TODO:
>         * Handle the case where the runtime Xen is loaded at a different
>           position for cache coloring. This will be dealt separately.
> ---
>  xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
>  xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
>  xen/arch/arm/include/asm/mm.h |  2 ++
>  xen/arch/arm/mm.c             | 14 +++++-----
>  4 files changed, 82 insertions(+), 23 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 663f5813b12e..1f69864492b6 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -816,30 +816,46 @@ ENDPROC(fail)
>   * Switch TTBR
>   *
>   * x0    ttbr
> - *
> - * TODO: This code does not comply with break-before-make.
>   */
> -ENTRY(switch_ttbr)
> -        dsb   sy                     /* Ensure the flushes happen before
> -                                      * continuing */
> -        isb                          /* Ensure synchronization with previous
> -                                      * changes to text */
> -        tlbi   alle2                 /* Flush hypervisor TLB */
> -        ic     iallu                 /* Flush I-cache */
> -        dsb    sy                    /* Ensure completion of TLB flush */
> +ENTRY(switch_ttbr_id)
> +        /* 1) Ensure any previous read/write have completed */
> +        dsb    ish
> +        isb
> +
> +        /* 2) Turn off MMU */
> +        mrs    x1, SCTLR_EL2
> +        bic    x1, x1, #SCTLR_Axx_ELx_M

do we need a "dsb   sy" here? We have one in enable_mmu.


> +        msr    SCTLR_EL2, x1
> +        isb
> +
> +        /*
> +         * 3) Flush the TLBs.
> +         * See asm/arm64/flushtlb.h for the explanation of the sequence.
> +         */
> +        dsb   nshst
> +        tlbi  alle2
> +        dsb   nsh
> +        isb
> +
> +        /* 4) Update the TTBR */
> +        msr   TTBR0_EL2, x0
>          isb
>  
> -        msr    TTBR0_EL2, x0
> +        /*
> +         * 5) Flush I-cache
> +         * This should not be necessary but it is kept for safety.
> +         */
> +        ic     iallu
> +        isb
>  
> -        isb                          /* Ensure synchronization with previous
> -                                      * changes to text */
> -        tlbi   alle2                 /* Flush hypervisor TLB */
> -        ic     iallu                 /* Flush I-cache */
> -        dsb    sy                    /* Ensure completion of TLB flush */
> +        /* 5) Turn on the MMU */

This should be 6)


> +        mrs   x1, SCTLR_EL2
> +        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
> +        msr   SCTLR_EL2, x1
>          isb
>  
>          ret
> -ENDPROC(switch_ttbr)
> +ENDPROC(switch_ttbr_id)
>  
>  #ifdef CONFIG_EARLY_PRINTK
>  /*
> diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
> index 9eaf545ea9dd..2ede4e75ae33 100644
> --- a/xen/arch/arm/arm64/mm.c
> +++ b/xen/arch/arm/arm64/mm.c
> @@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
>      lpae_t pte;
>      DECLARE_OFFSETS(id_offsets, id_addr);
>  
> +    /*
> +     * We will be re-using the boot ID tables. They may not have been
> +     * zeroed but they should be unlinked. So it is fine to use
> +     * clear_page().
> +     */
> +    clear_page(boot_first_id);
> +    clear_page(boot_second_id);
> +    clear_page(boot_third_id);

Could this code be in patch #15?


>      if ( id_offsets[0] != 0 )
>          panic("Cannot handled ID mapping above 512GB\n");
>  
> @@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
>      BUG_ON(rc);
>  }
>  
> +extern void switch_ttbr_id(uint64_t ttbr);
> +
> +typedef void (switch_ttbr_fn)(uint64_t ttbr);
> +
> +void __init switch_ttbr(uint64_t ttbr)
> +{
> +    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
> +    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
> +    lpae_t pte;
> +
> +    /* Enable the identity mapping in the boot page tables */

See below...

> +    update_identity_mapping(true);
> +    /* Enable the identity mapping in the runtime page tables */
> +    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
> +    pte.pt.table = 1;
> +    pte.pt.xn = 0;
> +    pte.pt.ro = 1;
> +    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
> +
> +    /* Switch TTBR */
> +    fn(ttbr);
> +
> +    /*
> +     * Disable the identity mapping in the runtime page tables.
> +     * Note it is not necessary to disable it in the boot page tables
> +     * because they are not going to be used by this CPU anymore.
> +     */

...is update_identity_mapping acting on the boot pagetables or the
runtime pagetables? The two comments make me think that
update_identity_mapping is enabling the mapping in the boot pagetables
and removing it from the runtime pagetables, which would be strangely
inconsistent.

> +    update_identity_mapping(false);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 68adcac9fa8d..bff6923f3ea9 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -196,6 +196,8 @@ extern unsigned long total_pages;
>  extern void setup_pagetables(unsigned long boot_phys_offset);
>  /* Map FDT in boot pagetable */
>  extern void *early_fdt_map(paddr_t fdt_paddr);
> +/* Switch to a new root page-tables */
> +extern void switch_ttbr(uint64_t ttbr);
>  /* Remove early mappings */
>  extern void remove_early_mappings(void);
>  /* Allocate and initialise pagetables for a secondary CPU. Sets init_ttbr to the
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 39e0d9e03c9c..cf23ae02d1b7 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -476,8 +476,6 @@ static void xen_pt_enforce_wnx(void)
>      flush_xen_tlb_local();
>  }
>  
> -extern void switch_ttbr(uint64_t ttbr);
> -
>  /* Clear a translation table and clean & invalidate the cache */
>  static void clear_table(void *table)
>  {
> @@ -550,13 +548,17 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
>      ttbr = (uintptr_t) cpu0_pgtable + phys_offset;
>  #endif
>  
> -    switch_ttbr(ttbr);
> -
> -    xen_pt_enforce_wnx();
> -
> +    /*
> +     * This needs to be setup first so switch_ttbr() can enable the
> +     * identity mapping.
> +     */
>  #ifdef CONFIG_ARM_32
>      per_cpu(xen_pgtable, 0) = cpu0_pgtable;
>  #endif
> +
> +    switch_ttbr(ttbr);
> +
> +    xen_pt_enforce_wnx();
>  }
>  
>  static void clear_boot_pagetables(void)
> -- 
> 2.38.1
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-12  9:55 ` [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
@ 2022-12-13  9:11   ` Michal Orzel
  2022-12-13  9:45     ` Julien Grall
  2023-01-23 16:19   ` Ayan Kumar Halder
  1 sibling, 1 reply; 48+ messages in thread
From: Michal Orzel @ 2022-12-13  9:11 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> Per D5-4929 in ARM DDI 0487H.a:
> "A DSB NSH is sufficient to ensure completion of TLB maintenance
>  instructions that apply to a single PE. A DSB ISH is sufficient to
>  ensure completion of TLB maintenance instructions that apply to PEs
>  in the same Inner Shareable domain.
> "
> 
> This means barrier after local TLB flushes could be reduced to
> non-shareable.
> 
> Note that the scope of the barrier in the workaround has not been
> changed because Linux v6.1-rc8 is also using 'ish' and I couldn't
> find anything in the Neoverse N1 suggesting that a 'nsh' would
> be sufficient.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
> 
>     I have used an older version of the Arm Arm because the explanation
>     in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph
>     about DSB in D8.13.8 is missing the shareability. But this is implied
>     in B2.3.11:
> 
>     "If the required access types of the DSB is reads and writes, the
>      following instructions issued by PEe before the DSB are complete for
>      the required shareability domain:
> 
>      [...]
> 
>      — All TLB maintenance instructions.
>     "
> 
>     Changes in v3:
>         - Patch added
> ---
>  xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++---------
>  1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
> index 7c5431518741..39d429ace552 100644
> --- a/xen/arch/arm/include/asm/arm64/flushtlb.h
> +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
> @@ -12,8 +12,9 @@
>   * ARM64_WORKAROUND_REPEAT_TLBI:
Before this line, in the same comment, we state DSB ISHST. This should also be changed
to reflect the change done by this patch.

>   * Modification of the translation table for a virtual address might lead to
>   * read-after-read ordering violation.
> - * The workaround repeats TLBI+DSB operation for all the TLB flush operations.
> - * While this is stricly not necessary, we don't want to take any risk.
> + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
> + * operations. While this is stricly not necessary, we don't want to
s/stricly/strictly/

> + * take any risk.
>   *
>   * For Xen page-tables the ISB will discard any instructions fetched
>   * from the old mappings.
> @@ -21,38 +22,42 @@
>   * For the Stage-2 page-tables the ISB ensures the completion of the DSB
>   * (and therefore the TLB invalidation) before continuing. So we know
>   * the TLBs cannot contain an entry for a mapping we may have removed.
> + *
> + * Note that for local TLB flush, using non-shareable (nsh) is sufficient
> + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in
s/Althougth/Although/

> + * for the workaround is left as inner-shareable to match with Linux.
So for the workaround we stay with DSB ISH. But ...

>   */
> -#define TLB_HELPER(name, tlbop)                  \
> +#define TLB_HELPER(name, tlbop, sh)              \
>  static inline void name(void)                    \
>  {                                                \
>      asm volatile(                                \
> -        "dsb  ishst;"                            \
> +        "dsb  "  # sh  "st;"                     \
>          "tlbi "  # tlbop  ";"                    \
>          ALTERNATIVE(                             \
>              "nop; nop;",                         \
> -            "dsb  ish;"                          \
> +            "dsb  "  # sh  ";"                   \
... you do not adhere to this.

>              "tlbi "  # tlbop  ";",               \
>              ARM64_WORKAROUND_REPEAT_TLBI,        \
>              CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
> -        "dsb  ish;"                              \
> +        "dsb  "  # sh  ";"                       \
>          "isb;"                                   \
>          : : : "memory");                         \
>  }
> 
>  /* Flush local TLBs, current VMID only. */
> -TLB_HELPER(flush_guest_tlb_local, vmalls12e1);
> +TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh);
> 
>  /* Flush innershareable TLBs, current VMID only */
> -TLB_HELPER(flush_guest_tlb, vmalls12e1is);
> +TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish);
> 
>  /* Flush local TLBs, all VMIDs, non-hypervisor mode */
> -TLB_HELPER(flush_all_guests_tlb_local, alle1);
> +TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh);
> 
>  /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
> -TLB_HELPER(flush_all_guests_tlb, alle1is);
> +TLB_HELPER(flush_all_guests_tlb, alle1is, ish);
> 
>  /* Flush all hypervisor mappings from the TLB of the local processor. */
> -TLB_HELPER(flush_xen_tlb_local, alle2);
> +TLB_HELPER(flush_xen_tlb_local, alle2, nsh);
> 
>  /* Flush TLB of local processor for address va. */
>  static inline void  __flush_xen_tlb_one_local(vaddr_t va)
> --
> 2.38.1
> 

With the remarks fixed:
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 02/18] xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB flush by VA
  2022-12-12  9:55 ` [PATCH v3 02/18] xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB flush by VA Julien Grall
@ 2022-12-13  9:23   ` Michal Orzel
  0 siblings, 0 replies; 48+ messages in thread
From: Michal Orzel @ 2022-12-13  9:23 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> Looking at the Neoverse N1 errata document, it is not clear to me
> why the TLBI repeat workaround is not applied for TLB flush by VA.
> 
> The TBL flush by VA helpers are used in flush_xen_tlb_range_va_local()
> and flush_xen_tlb_range_va(). So if the range size if a fixed size smaller
> than a PAGE_SIZE, it would be possible that the compiler remove the loop
> and therefore replicate the sequence described in the erratum 1286807.
> 
> So the TLBI repeat workaround should also be applied for the TLB flush
> by VA helpers.
> 
> Fixes: 22e323d115d8 ("xen/arm: Add workaround for Cortex-A76/Neoverse-N1 erratum #1286807")
> Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-13  9:11   ` Michal Orzel
@ 2022-12-13  9:45     ` Julien Grall
  2022-12-13  9:50       ` Michal Orzel
  0 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-13  9:45 UTC (permalink / raw)
  To: Michal Orzel, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk



On 13/12/2022 09:11, Michal Orzel wrote:
> Hi Julien,

Hi Michal,

> On 12/12/2022 10:55, Julien Grall wrote:
>>
>>
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Per D5-4929 in ARM DDI 0487H.a:
>> "A DSB NSH is sufficient to ensure completion of TLB maintenance
>>   instructions that apply to a single PE. A DSB ISH is sufficient to
>>   ensure completion of TLB maintenance instructions that apply to PEs
>>   in the same Inner Shareable domain.
>> "
>>
>> This means barrier after local TLB flushes could be reduced to
>> non-shareable.
>>
>> Note that the scope of the barrier in the workaround has not been
>> changed because Linux v6.1-rc8 is also using 'ish' and I couldn't
>> find anything in the Neoverse N1 suggesting that a 'nsh' would
>> be sufficient.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>
>>      I have used an older version of the Arm Arm because the explanation
>>      in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph
>>      about DSB in D8.13.8 is missing the shareability. But this is implied
>>      in B2.3.11:
>>
>>      "If the required access types of the DSB is reads and writes, the
>>       following instructions issued by PEe before the DSB are complete for
>>       the required shareability domain:
>>
>>       [...]
>>
>>       — All TLB maintenance instructions.
>>      "
>>
>>      Changes in v3:
>>          - Patch added
>> ---
>>   xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++---------
>>   1 file changed, 16 insertions(+), 11 deletions(-)
>>
>> diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
>> index 7c5431518741..39d429ace552 100644
>> --- a/xen/arch/arm/include/asm/arm64/flushtlb.h
>> +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
>> @@ -12,8 +12,9 @@
>>    * ARM64_WORKAROUND_REPEAT_TLBI:
> Before this line, in the same comment, we state DSB ISHST. This should also be changed
> to reflect the change done by this patch.

This is on purpose. I decided to keep the sequence as-is and instead add 
a paragraph explaining that 'nsh' is sufficient for local TLB flushes.

> 
>>    * Modification of the translation table for a virtual address might lead to
>>    * read-after-read ordering violation.
>> - * The workaround repeats TLBI+DSB operation for all the TLB flush operations.
>> - * While this is stricly not necessary, we don't want to take any risk.
>> + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
>> + * operations. While this is stricly not necessary, we don't want to
> s/stricly/strictly/
> 
>> + * take any risk.
>>    *
>>    * For Xen page-tables the ISB will discard any instructions fetched
>>    * from the old mappings.
>> @@ -21,38 +22,42 @@
>>    * For the Stage-2 page-tables the ISB ensures the completion of the DSB
>>    * (and therefore the TLB invalidation) before continuing. So we know
>>    * the TLBs cannot contain an entry for a mapping we may have removed.
>> + *
>> + * Note that for local TLB flush, using non-shareable (nsh) is sufficient
>> + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in
> s/Althougth/Although/
> 
>> + * for the workaround is left as inner-shareable to match with Linux.
> So for the workaround we stay with DSB ISH. But ...
> 
>>    */
>> -#define TLB_HELPER(name, tlbop)                  \
>> +#define TLB_HELPER(name, tlbop, sh)              \
>>   static inline void name(void)                    \
>>   {                                                \
>>       asm volatile(                                \
>> -        "dsb  ishst;"                            \
>> +        "dsb  "  # sh  "st;"                     \
>>           "tlbi "  # tlbop  ";"                    \
>>           ALTERNATIVE(                             \
>>               "nop; nop;",                         \
>> -            "dsb  ish;"                          \
>> +            "dsb  "  # sh  ";"                   \
> ... you do not adhere to this.

This is a leftover from my previous approach. I will drop it.

[...]

> 
> With the remarks fixed:
> Reviewed-by: Michal Orzel <michal.orzel@amd.com>

I am not planning to fix the first remark. Please let me know if your 
Reviewed-by tag stands.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-13  9:45     ` Julien Grall
@ 2022-12-13  9:50       ` Michal Orzel
  0 siblings, 0 replies; 48+ messages in thread
From: Michal Orzel @ 2022-12-13  9:50 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 13/12/2022 10:45, Julien Grall wrote:
> 
> 
> On 13/12/2022 09:11, Michal Orzel wrote:
>> Hi Julien,
> 
> Hi Michal,
> 
>> On 12/12/2022 10:55, Julien Grall wrote:
>>>
>>>
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> Per D5-4929 in ARM DDI 0487H.a:
>>> "A DSB NSH is sufficient to ensure completion of TLB maintenance
>>>   instructions that apply to a single PE. A DSB ISH is sufficient to
>>>   ensure completion of TLB maintenance instructions that apply to PEs
>>>   in the same Inner Shareable domain.
>>> "
>>>
>>> This means barrier after local TLB flushes could be reduced to
>>> non-shareable.
>>>
>>> Note that the scope of the barrier in the workaround has not been
>>> changed because Linux v6.1-rc8 is also using 'ish' and I couldn't
>>> find anything in the Neoverse N1 suggesting that a 'nsh' would
>>> be sufficient.
>>>
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> ---
>>>
>>>      I have used an older version of the Arm Arm because the explanation
>>>      in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph
>>>      about DSB in D8.13.8 is missing the shareability. But this is implied
>>>      in B2.3.11:
>>>
>>>      "If the required access types of the DSB is reads and writes, the
>>>       following instructions issued by PEe before the DSB are complete for
>>>       the required shareability domain:
>>>
>>>       [...]
>>>
>>>       — All TLB maintenance instructions.
>>>      "
>>>
>>>      Changes in v3:
>>>          - Patch added
>>> ---
>>>   xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++---------
>>>   1 file changed, 16 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
>>> index 7c5431518741..39d429ace552 100644
>>> --- a/xen/arch/arm/include/asm/arm64/flushtlb.h
>>> +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
>>> @@ -12,8 +12,9 @@
>>>    * ARM64_WORKAROUND_REPEAT_TLBI:
>> Before this line, in the same comment, we state DSB ISHST. This should also be changed
>> to reflect the change done by this patch.
> 
> This is on purpose. I decided to keep the sequence as-is and instead add
> a paragraph explaining that 'nsh' is sufficient for local TLB flushes.
> 
>>
>>>    * Modification of the translation table for a virtual address might lead to
>>>    * read-after-read ordering violation.
>>> - * The workaround repeats TLBI+DSB operation for all the TLB flush operations.
>>> - * While this is stricly not necessary, we don't want to take any risk.
>>> + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
>>> + * operations. While this is stricly not necessary, we don't want to
>> s/stricly/strictly/
>>
>>> + * take any risk.
>>>    *
>>>    * For Xen page-tables the ISB will discard any instructions fetched
>>>    * from the old mappings.
>>> @@ -21,38 +22,42 @@
>>>    * For the Stage-2 page-tables the ISB ensures the completion of the DSB
>>>    * (and therefore the TLB invalidation) before continuing. So we know
>>>    * the TLBs cannot contain an entry for a mapping we may have removed.
>>> + *
>>> + * Note that for local TLB flush, using non-shareable (nsh) is sufficient
>>> + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in
>> s/Althougth/Although/
>>
>>> + * for the workaround is left as inner-shareable to match with Linux.
>> So for the workaround we stay with DSB ISH. But ...
>>
>>>    */
>>> -#define TLB_HELPER(name, tlbop)                  \
>>> +#define TLB_HELPER(name, tlbop, sh)              \
>>>   static inline void name(void)                    \
>>>   {                                                \
>>>       asm volatile(                                \
>>> -        "dsb  ishst;"                            \
>>> +        "dsb  "  # sh  "st;"                     \
>>>           "tlbi "  # tlbop  ";"                    \
>>>           ALTERNATIVE(                             \
>>>               "nop; nop;",                         \
>>> -            "dsb  ish;"                          \
>>> +            "dsb  "  # sh  ";"                   \
>> ... you do not adhere to this.
> 
> This is a leftover from my previous approach. I will drop it.
> 
> [...]
> 
>>
>> With the remarks fixed:
>> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
> 
> I am not planning to fix the first remark. Please let me know if your
> Reviewed-by tag stands.
I'm ok with it so you can keep my tag.

> 
> Cheers,
> 
> --
> Julien Grall

~Michal


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 03/18] xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-12  9:55 ` [PATCH v3 03/18] xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
@ 2022-12-13 10:48   ` Michal Orzel
  0 siblings, 0 replies; 48+ messages in thread
From: Michal Orzel @ 2022-12-13 10:48 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> Per G5-9224 in ARM DDI 0487I.a:
> 
> "A DSB NSH is sufficient to ensure completion of TLB maintenance
>  instructions that apply to a single PE. A DSB ISH is sufficient to
>  ensure completion of TLB maintenance instructions that apply to PEs
>  in the same Inner Shareable domain.
> "
> 
> This is quoting the Armv8 specification because I couldn't find an
> explicit statement in the Armv7 specification. Instead, I could find
> bits in various places that confirm the same implementation.
> 
> Furthermore, Linux has been using 'nsh' since 2013 (62cbbc42e001
> "ARM: tlb: reduce scope of barrier domains for TLB invalidation").
> 
> This means barrier after local TLB flushes could be reduced to
> non-shareable.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 05/18] xen/arm: Clean-up the memory layout
  2022-12-12  9:55 ` [PATCH v3 05/18] xen/arm: Clean-up the memory layout Julien Grall
@ 2022-12-13 10:57   ` Michal Orzel
  2022-12-13 11:00     ` Michal Orzel
  0 siblings, 1 reply; 48+ messages in thread
From: Michal Orzel @ 2022-12-13 10:57 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> In a follow-up patch, the base address for the common mappings will
> vary between arm32 and arm64. To avoid any duplication, define
> every mapping in the common region from the previous one.
> 
> Take the opportunity to:
>     * add missing *_SIZE for FIXMAP_VIRT_* and XEN_VIRT_*
>     * switch to MB()/GB() to be avoid hexadecimal (easier to read)
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 05/18] xen/arm: Clean-up the memory layout
  2022-12-13 10:57   ` Michal Orzel
@ 2022-12-13 11:00     ` Michal Orzel
  0 siblings, 0 replies; 48+ messages in thread
From: Michal Orzel @ 2022-12-13 11:00 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk



On 13/12/2022 11:57, Michal Orzel wrote:
> 
> 
> Hi Julien,
> 
> On 12/12/2022 10:55, Julien Grall wrote:
>>
>>
>> From: Julien Grall <jgrall@amazon.com>
>>
>> In a follow-up patch, the base address for the common mappings will
>> vary between arm32 and arm64. To avoid any duplication, define
>> every mapping in the common region from the previous one.
>>
>> Take the opportunity to:
>>     * add missing *_SIZE for FIXMAP_VIRT_* and XEN_VIRT_*
>>     * switch to MB()/GB() to be avoid hexadecimal (easier to read)
I forgot about this one:
NIT: s/to be/to/

>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
> 
> ~Michal
> 
> 

~Michal


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>"
  2022-12-12  9:55 ` [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>" Julien Grall
  2022-12-13  0:31   ` Stefano Stabellini
@ 2022-12-13 11:10   ` Michal Orzel
  1 sibling, 0 replies; 48+ messages in thread
From: Michal Orzel @ 2022-12-13 11:10 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> "ldr rX, =<label>" is used to load a value from the literal pool. This
> implies a memory access.
> 
> This can be avoided by using the macro mov_w which encode the value in
> the immediate of two instructions.
> 
> So replace all "ldr rX, =<label>" with "mov_w rX, <label>".
> 
> No functional changes intended.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

with one small change.

> 
> ---
> 
>     Changes in v3:
>         * Patch added
> ---
>  xen/arch/arm/arm32/head.S | 38 +++++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index a558c2a6876e..ce680be91be1 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -62,7 +62,7 @@
>  .endm
> 
>  .macro load_paddr rb, sym
> -        ldr   \rb, =\sym
> +        mov_w \rb, \sym
>          add   \rb, \rb, r10
>  .endm
> 
> @@ -149,7 +149,7 @@ past_zImage:
>          mov   r8, r2                 /* r8 := DTB base address */
> 
>          /* Find out where we are */
> -        ldr   r0, =start
> +        mov_w r0, start
>          adr   r9, start              /* r9  := paddr (start) */
>          sub   r10, r9, r0            /* r10 := phys-offset */
> 
> @@ -170,7 +170,7 @@ past_zImage:
>          bl    enable_mmu
> 
>          /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
> -        ldr   r0, =primary_switched
> +        mov_w r0, primary_switched
>          mov   pc, r0
>  primary_switched:
>          /*
> @@ -190,7 +190,7 @@ primary_switched:
>          /* Setup the arguments for start_xen and jump to C world */
>          mov   r0, r10                /* r0 := Physical offset */
>          mov   r1, r8                 /* r1 := paddr(FDT) */
> -        ldr   r2, =start_xen
> +        mov_w r2, start_xen
>          b     launch
>  ENDPROC(start)
> 
> @@ -198,7 +198,7 @@ GLOBAL(init_secondary)
>          cpsid aif                    /* Disable all interrupts */
> 
>          /* Find out where we are */
> -        ldr   r0, =start
> +        mov_w r0, start
>          adr   r9, start              /* r9  := paddr (start) */
>          sub   r10, r9, r0            /* r10 := phys-offset */
> 
> @@ -227,7 +227,7 @@ GLOBAL(init_secondary)
> 
> 
>          /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
> -        ldr   r0, =secondary_switched
> +        mov_w r0, secondary_switched
>          mov   pc, r0
>  secondary_switched:
>          /*
> @@ -236,7 +236,7 @@ secondary_switched:
>           *
>           * XXX: This is not compliant with the Arm Arm.
>           */
> -        ldr   r4, =init_ttbr         /* VA of HTTBR value stashed by CPU 0 */
> +        mov_w r4, init_ttbr          /* VA of HTTBR value stashed by CPU 0 */
>          ldrd  r4, r5, [r4]           /* Actual value */
>          dsb
>          mcrr  CP64(r4, r5, HTTBR)
> @@ -254,7 +254,7 @@ secondary_switched:
>  #endif
>          PRINT("- Ready -\r\n")
>          /* Jump to C world */
> -        ldr   r2, =start_secondary
> +        mov_w r2, start_secondary
>          b     launch
>  ENDPROC(init_secondary)
> 
> @@ -297,8 +297,8 @@ ENDPROC(check_cpu_mode)
>   */
>  zero_bss:
>          PRINT("- Zero BSS -\r\n")
> -        ldr   r0, =__bss_start       /* r0 := vaddr(__bss_start) */
> -        ldr   r1, =__bss_end         /* r1 := vaddr(__bss_start) */
> +        mov_w r0, __bss_start        /* r0 := vaddr(__bss_start) */
> +        mov_w r1, __bss_end          /* r1 := vaddr(__bss_start) */
> 
>          mov   r2, #0
>  1:      str   r2, [r0], #4
> @@ -330,8 +330,8 @@ cpu_init:
> 
>  cpu_init_done:
>          /* Set up memory attribute type tables */
> -        ldr   r0, =MAIR0VAL
> -        ldr   r1, =MAIR1VAL
> +        mov_w r0, MAIR0VAL
> +        mov_w r1,MAIR1VAL
NIT: please separate arguments with a single space just like you did in every other place

~Michal


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 04/18] xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush
  2022-12-12  9:55 ` [PATCH v3 04/18] xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush Julien Grall
@ 2022-12-13 11:15   ` Michal Orzel
  0 siblings, 0 replies; 48+ messages in thread
From: Michal Orzel @ 2022-12-13 11:15 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> At the moment, flush_xen_tlb_range_va{,_local}() are using system
> wide memory barrier. This is quite expensive and unnecessary.
> 
> For the local version, a non-shareable barrier is sufficient.
> For the SMP version, a inner-shareable barrier is sufficient.
s/a/an/

> 
> Furthermore, the initial barrier only need to a store barrier.
s/need/needs/

> 
> For the full explanation of the sequence see asm/arm{32,64}/flushtlb.h.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

~Michal



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v3 09/18] xen/arm32: head: Remove restriction where to load Xen
  2022-12-12  9:55 ` [PATCH v3 09/18] xen/arm32: head: Remove restriction where to load Xen Julien Grall
@ 2022-12-13 18:23   ` Julien Grall
  0 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-13 18:23 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

On 12/12/2022 09:55, Julien Grall wrote:
>   /*
>    * Map the UART in the fixmap (when earlyprintk is used) and hook the
>    * fixmap table in the page tables.
> diff --git a/xen/arch/arm/domain_page.c b/xen/arch/arm/domain_page.c
> index b7c02c919064..907fb93d4df0 100644
> --- a/xen/arch/arm/domain_page.c
> +++ b/xen/arch/arm/domain_page.c
> @@ -60,6 +60,7 @@ bool init_domheap_mappings(unsigned int cpu)
>       for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ )
>       {
>           lpae_t pte = mfn_to_xen_entry(mfn_add(mfn, i), MT_NORMAL);
> +

While the newline is correct, this shouldn't have been part of this 
patch. So I have dropped it from this patch.

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 12/18] xen/arm64: Rework the memory layout
  2022-12-13  1:22   ` Stefano Stabellini
@ 2022-12-13 18:31     ` Julien Grall
  0 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-13 18:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

Hi Stefano,

On 13/12/2022 01:22, Stefano Stabellini wrote:
> On Mon, 12 Dec 2022, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Xen is currently not fully compliant with the Arm Arm because it will
>> switch the TTBR with the MMU on.
>>
>> In order to be compliant, we need to disable the MMU before
>> switching the TTBR. The implication is the page-tables should
>> contain an identity mapping of the code switching the TTBR.
>>
>> In most of the case we expect Xen to be loaded in low memory. I am aware
>> of one platform (i.e AMD Seattle) where the memory start above 512GB.
>> To give us some slack, consider that Xen may be loaded in the first 2TB
>> of the physical address space.
>>
>> The memory layout is reshuffled to keep the first two slots of the zeroeth
>> level free. Xen will now be loaded at (2TB + 2MB). This requires a slight
>> tweak of the boot code because XEN_VIRT_START cannot be used as an
>> immediate.
>>
>> This reshuffle will make trivial to create a 1:1 mapping when Xen is
>> loaded below 2TB.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>> ---
>>
>>      Changes in v2:
>>          - Reword the commit message
>>          - Load Xen at 2TB + 2MB
>>          - Update the documentation to reflect the new layout
>> ---
>>   xen/arch/arm/arm64/head.S         |  3 ++-
>>   xen/arch/arm/include/asm/config.h | 34 +++++++++++++++++++++----------
>>   xen/arch/arm/mm.c                 | 11 +++++-----
>>   3 files changed, 31 insertions(+), 17 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>> index ad014716db6f..23c2c7491db2 100644
>> --- a/xen/arch/arm/arm64/head.S
>> +++ b/xen/arch/arm/arm64/head.S
>> @@ -607,7 +607,8 @@ create_page_tables:
>>            * need an additional 1:1 mapping, the virtual mapping will
>>            * suffice.
>>            */
>> -        cmp   x19, #XEN_VIRT_START
>> +        ldr   x0, =XEN_VIRT_START
>> +        cmp   x19, x0
>>           bne   1f
>>           ret
>>   1:
>> diff --git a/xen/arch/arm/include/asm/config.h b/xen/arch/arm/include/asm/config.h
>> index 6c1b762e976d..9fe6bfeeeb95 100644
>> --- a/xen/arch/arm/include/asm/config.h
>> +++ b/xen/arch/arm/include/asm/config.h
>> @@ -72,15 +72,12 @@
>>   #include <xen/page-size.h>
>>   
>>   /*
>> - * Common ARM32 and ARM64 layout:
>> + * ARM32 layout:
>>    *   0  -   2M   Unmapped
>>    *   2M -   4M   Xen text, data, bss
>>    *   4M -   6M   Fixmap: special-purpose 4K mapping slots
>>    *   6M -  10M   Early boot mapping of FDT
>> - *   10M - 12M   Livepatch vmap (if compiled in)
>> - *
>> - * ARM32 layout:
>> - *   0  -  12M   <COMMON>
>> + *  10M -  12M   Livepatch vmap (if compiled in)
>>    *
>>    *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
>>    * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
>> @@ -90,8 +87,17 @@
>>    *   2G -   4G   Domheap: on-demand-mapped
>>    *
>>    * ARM64 layout:
>> - * 0x0000000000000000 - 0x0000007fffffffff (512GB, L0 slot [0])
>> - *   0  -  12M   <COMMON>
>> + * 0x0000000000000000 - 0x00001fffffffffff (2TB, L0 slots [0..1])
>> + *
> 
> Extra blank line

I have removed it.

> 
> 
>> + *  Reserved to identity map Xen
>> + *
>> + * 0x0000020000000000 - 0x000028fffffffff (512TB, L0 slot [2]
>> + *  (Relative offsets)
>> + *   0  -   2M   Unmapped
>> + *   2M -   4M   Xen text, data, bss
>> + *   4M -   6M   Fixmap: special-purpose 4K mapping slots
>> + *   6M -  10M   Early boot mapping of FDT
>> + *  10M -  12M   Livepatch vmap (if compiled in)
>>    *
>>    *   1G -   2G   VMAP: ioremap and early_ioremap
>>    *
>> @@ -107,7 +113,17 @@
>>    *  Unused
>>    */
>>   
>> +#ifdef CONFIG_ARM_32
>>   #define XEN_VIRT_START          _AT(vaddr_t, MB(2))
>> +#else
>> +
>> +#define SLOT0_ENTRY_BITS  39
>> +#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
>> +#define SLOT0_ENTRY_SIZE  SLOT0(1)
>> +
>> +#define XEN_VIRT_START          (SLOT0(2) + _AT(vaddr_t, MB(2)))
>> +#endif
> 
> Sorry for the silly question and I apologize if I got the math wrong.
> 
> 1<<39 is 512MB, so:

Looking at how you use it below, I am guessing you mean GB rather than MB.

> 
> slot0 is [0..512MB]
> slot1 is [512MB..1TB]
> slot2 is [1TB..1.5TB]
> slot3 is [1.5TB..2TB]
> slot4 is [2TB..2.5TB]
> 
> So, if we want Xen just above 2TB we should use slot4? Which would be
> SLOT0(4) ?

You are right. I will update the code.
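
For the record, a stand-alone sketch of the arithmetic (the macro names
mirror the quoted config.h; the snippet is only meant to illustrate the
slot sizes):

    #include <stdint.h>
    #include <stdio.h>

    #define SLOT0_ENTRY_BITS  39
    #define SLOT0(slot)       ((uint64_t)(slot) << SLOT0_ENTRY_BITS)

    int main(void)
    {
        /* One zeroeth-level slot covers 1 << 39 bytes, i.e. 512GB. */
        printf("slot size : %llu GB\n", (unsigned long long)(SLOT0(1) >> 30));
        /* The region starting at 2TB is therefore slot 4: 4 * 512GB = 2TB. */
        printf("slot4 base: %#llx\n", (unsigned long long)SLOT0(4));
        return 0;
    }

i.e. SLOT0(2) is 1TB, not 2TB, hence the switch to SLOT0(4).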

> 
> 
>>   #define XEN_VIRT_SIZE           _AT(vaddr_t, MB(2))
>>   
>>   #define FIXMAP_VIRT_START       (XEN_VIRT_START + XEN_VIRT_SIZE)
>> @@ -164,10 +180,6 @@
>>   
>>   #else /* ARM_64 */
>>   
>> -#define SLOT0_ENTRY_BITS  39
>> -#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
>> -#define SLOT0_ENTRY_SIZE  SLOT0(1)
>> -
>>   #define VMAP_VIRT_START  GB(1)
>>   #define VMAP_VIRT_SIZE   GB(1)
>>   
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index d0b1cf55f550..cc11f5c639e6 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -153,7 +153,7 @@ static void __init __maybe_unused build_assertions(void)
>>   #endif
>>       /* Page table structure constraints */
>>   #ifdef CONFIG_ARM_64
>> -    BUILD_BUG_ON(zeroeth_table_offset(XEN_VIRT_START));
>> +    BUILD_BUG_ON(zeroeth_table_offset(XEN_VIRT_START) < 2);
>>   #endif
>>       BUILD_BUG_ON(first_table_offset(XEN_VIRT_START));
>>   #ifdef CONFIG_ARCH_MAP_DOMAIN_PAGE
>> @@ -498,10 +498,11 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
>>       phys_offset = boot_phys_offset;
>>   
>>   #ifdef CONFIG_ARM_64
>> -    p = (void *) xen_pgtable;
>> -    p[0] = pte_of_xenaddr((uintptr_t)xen_first);
>> -    p[0].pt.table = 1;
>> -    p[0].pt.xn = 0;
>> +    pte = pte_of_xenaddr((uintptr_t)xen_first);
>> +    pte.pt.table = 1;
>> +    pte.pt.xn = 0;
>> +    xen_pgtable[zeroeth_table_offset(XEN_VIRT_START)] = pte;
>> +
>>       p = (void *) xen_first;
>>   #else
>>       p = (void *) cpu0_pgtable;
>> -- 
>> 2.38.1
>>

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr()
  2022-12-13  2:00   ` Stefano Stabellini
@ 2022-12-13 19:08     ` Julien Grall
  2022-12-13 22:56       ` Stefano Stabellini
  0 siblings, 1 reply; 48+ messages in thread
From: Julien Grall @ 2022-12-13 19:08 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

Hi Stefano,

On 13/12/2022 02:00, Stefano Stabellini wrote:
> On Mon, 12 Dec 2022, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
>> still on.
>>
>> Switching TTBR is like replacing existing mappings with new ones. So
>> we need to follow the break-before-make sequence.
>>
>> In this case, it means the MMU needs to be switched off while the
>> TTBR is updated. In order to disable the MMU, we need to first
>> jump to an identity mapping.
>>
>> Rename switch_ttbr() to switch_ttbr_id() and create an helper on
>> top to temporary map the identity mapping and call switch_ttbr()
>> via the identity address.
>>
>> switch_ttbr_id() is now reworked to temporarily turn off the MMU
>> before updating the TTBR.
>>
>> We also need to make sure the helper switch_ttbr() is part of the
>> identity mapping. So move _end_boot past it.
>>
>> The arm32 code will use a different approach. So this issue is for now
>> only resolved on arm64.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> This patch looks overall good to me, aside from the few minor comments
> below. I would love for someone else, maybe from ARM, reviewing steps
> 1-6 making sure they are the right sequence.
> 
> 
>> ---
>>
>>      Changes in v2:
>>          - Remove the arm32 changes. This will be addressed differently
>>          - Re-instate the instruct cache flush. This is not strictly
>>            necessary but kept it for safety.
>>          - Use "dsb ish"  rather than "dsb sy".
>>
>>      TODO:
>>          * Handle the case where the runtime Xen is loaded at a different
>>            position for cache coloring. This will be dealt separately.
>> ---
>>   xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
>>   xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
>>   xen/arch/arm/include/asm/mm.h |  2 ++
>>   xen/arch/arm/mm.c             | 14 +++++-----
>>   4 files changed, 82 insertions(+), 23 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>> index 663f5813b12e..1f69864492b6 100644
>> --- a/xen/arch/arm/arm64/head.S
>> +++ b/xen/arch/arm/arm64/head.S
>> @@ -816,30 +816,46 @@ ENDPROC(fail)
>>    * Switch TTBR
>>    *
>>    * x0    ttbr
>> - *
>> - * TODO: This code does not comply with break-before-make.
>>    */
>> -ENTRY(switch_ttbr)
>> -        dsb   sy                     /* Ensure the flushes happen before
>> -                                      * continuing */
>> -        isb                          /* Ensure synchronization with previous
>> -                                      * changes to text */
>> -        tlbi   alle2                 /* Flush hypervisor TLB */
>> -        ic     iallu                 /* Flush I-cache */
>> -        dsb    sy                    /* Ensure completion of TLB flush */
>> +ENTRY(switch_ttbr_id)
>> +        /* 1) Ensure any previous read/write have completed */
>> +        dsb    ish
>> +        isb
>> +
>> +        /* 2) Turn off MMU */
>> +        mrs    x1, SCTLR_EL2
>> +        bic    x1, x1, #SCTLR_Axx_ELx_M
> 
> do we need a "dsb   sy" here? we have in enable_mmu

Hmmm... The explanation of the dsb + isb in enable_mmu makes no sense. 
The isb doesn't flush the I-cache, it just flushes the pipeline.

For the dsb, I am not convinced it is necessary because we already have 
the 'dsb nsh' above and in any case the barrier seems to be too strong.

I guess that will be another patch... (probably at a lower priority).

Now back to your question of the 'dsb' here. There is already a 'dsb 
ish' above. So memory access before turning off the MMU should be 
completed. Also...

> 
> 
>> +        msr    SCTLR_EL2, x1
>> +        isb

... this isb will ensure the completion of the SCTLR write before the TLBs 
are flushed below. And there should be no memory access (other than 
instruction fetches) here. So I don't think a dsb is needed.

Would you mind explaining why you think one is needed?

>> +
>> +        /*
>> +         * 3) Flush the TLBs.
>> +         * See asm/arm64/flushtlb.h for the explanation of the sequence.
>> +         */
>> +        dsb   nshst
>> +        tlbi  alle2
>> +        dsb   nsh
>> +        isb
>> +
>> +        /* 4) Update the TTBR */
>> +        msr   TTBR0_EL2, x0
>>           isb
>>   
>> -        msr    TTBR0_EL2, x0
>> +        /*
>> +         * 5) Flush I-cache
>> +         * This should not be necessary but it is kept for safety.
>> +         */
>> +        ic     iallu
>> +        isb
>>   
>> -        isb                          /* Ensure synchronization with previous
>> -                                      * changes to text */
>> -        tlbi   alle2                 /* Flush hypervisor TLB */
>> -        ic     iallu                 /* Flush I-cache */
>> -        dsb    sy                    /* Ensure completion of TLB flush */
>> +        /* 5) Turn on the MMU */
> 
> This should be 6)

I will update it.

> 
> 
>> +        mrs   x1, SCTLR_EL2
>> +        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
>> +        msr   SCTLR_EL2, x1
>>           isb
>>   
>>           ret
>> -ENDPROC(switch_ttbr)
>> +ENDPROC(switch_ttbr_id)
>>   
>>   #ifdef CONFIG_EARLY_PRINTK
>>   /*
>> diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
>> index 9eaf545ea9dd..2ede4e75ae33 100644
>> --- a/xen/arch/arm/arm64/mm.c
>> +++ b/xen/arch/arm/arm64/mm.c
>> @@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
>>       lpae_t pte;
>>       DECLARE_OFFSETS(id_offsets, id_addr);
>>   
>> +    /*
>> +     * We will be re-using the boot ID tables. They may not have been
>> +     * zeroed but they should be unlinked. So it is fine to use
>> +     * clear_page().
>> +     */
>> +    clear_page(boot_first_id);
>> +    clear_page(boot_second_id);
>> +    clear_page(boot_third_id);
> 
> Could this code be in patch #15?

Yes, I can't remember why I decided to clear them in patch #18.

>>       if ( id_offsets[0] != 0 )
>>           panic("Cannot handled ID mapping above 512GB\n");
>>   
>> @@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
>>       BUG_ON(rc);
>>   }
>>   
>> +extern void switch_ttbr_id(uint64_t ttbr);
>> +
>> +typedef void (switch_ttbr_fn)(uint64_t ttbr);
>> +
>> +void __init switch_ttbr(uint64_t ttbr)
>> +{
>> +    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
>> +    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
>> +    lpae_t pte;
>> +
>> +    /* Enable the identity mapping in the boot page tables */
> 
> See below...
> 
>> +    update_identity_mapping(true);
>> +    /* Enable the identity mapping in the runtime page tables */
>> +    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
>> +    pte.pt.table = 1;
>> +    pte.pt.xn = 0;
>> +    pte.pt.ro = 1;
>> +    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
>> +
>> +    /* Switch TTBR */
>> +    fn(ttbr);
>> +
>> +    /*
>> +     * Disable the identity mapping in the runtime page tables.
>> +     * Note it is not necessary to disable it in the boot page tables
>> +     * because they are not going to be used by this CPU anymore.
>> +     */
> 
> ...is update_identity_mapping acting on the boot pagetables or the
> runtime pagetables? The two comments make me think that
> update_identity_mapping is enabling mapping in the boot pagetables and
> removing them from the runtime pagetable, which would be strangely
> inconsistent.

update_identity_mapping() is acting on the live page-tables (i.e. the 
one pointed by TTBR_EL2).

Before switch_ttbr(), this will be the boot page-tables. But after, this 
will be the runtime page-tables.
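
Put differently, annotating the flow being discussed (a sketch only, with
the page-table set that is live at each step noted in the comments):

    update_identity_mapping(true);   /* live == boot page-tables           */
    /* ... the ID mapping is also wired into the runtime tables above ...  */
    fn(ttbr);                        /* MMU off, TTBR0_EL2 updated, MMU on */
    update_identity_mapping(false);  /* live == runtime page-tables now,
                                      * so this removes the ID mapping
                                      * from the runtime tables            */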

Would the following comment on top of the declaration of 
update_identity_mapping() clarify it:

"Enable/disable the identity mapping in the live page-tables (i.e. the 
one pointed by TTBR_EL2)."

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr()
  2022-12-13 19:08     ` Julien Grall
@ 2022-12-13 22:56       ` Stefano Stabellini
  2022-12-13 23:01         ` Julien Grall
  0 siblings, 1 reply; 48+ messages in thread
From: Stefano Stabellini @ 2022-12-13 22:56 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, michal.orzel, Luca.Fancellu,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

On Tue, 13 Dec 2022, Julien Grall wrote:
> Hi Stefano,
> 
> On 13/12/2022 02:00, Stefano Stabellini wrote:
> > On Mon, 12 Dec 2022, Julien Grall wrote:
> > > From: Julien Grall <jgrall@amazon.com>
> > > 
> > > At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
> > > still on.
> > > 
> > > Switching TTBR is like replacing existing mappings with new ones. So
> > > we need to follow the break-before-make sequence.
> > > 
> > > In this case, it means the MMU needs to be switched off while the
> > > TTBR is updated. In order to disable the MMU, we need to first
> > > jump to an identity mapping.
> > > 
> > > Rename switch_ttbr() to switch_ttbr_id() and create an helper on
> > > top to temporary map the identity mapping and call switch_ttbr()
> > > via the identity address.
> > > 
> > > switch_ttbr_id() is now reworked to temporarily turn off the MMU
> > > before updating the TTBR.
> > > 
> > > We also need to make sure the helper switch_ttbr() is part of the
> > > identity mapping. So move _end_boot past it.
> > > 
> > > The arm32 code will use a different approach. So this issue is for now
> > > only resolved on arm64.
> > > 
> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > 
> > This patch looks overall good to me, aside from the few minor comments
> > below. I would love for someone else, maybe from ARM, reviewing steps
> > 1-6 making sure they are the right sequence.
> > 
> > 
> > > ---
> > > 
> > >      Changes in v2:
> > >          - Remove the arm32 changes. This will be addressed differently
> > >          - Re-instate the instruct cache flush. This is not strictly
> > >            necessary but kept it for safety.
> > >          - Use "dsb ish"  rather than "dsb sy".
> > > 
> > >      TODO:
> > >          * Handle the case where the runtime Xen is loaded at a different
> > >            position for cache coloring. This will be dealt separately.
> > > ---
> > >   xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
> > >   xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
> > >   xen/arch/arm/include/asm/mm.h |  2 ++
> > >   xen/arch/arm/mm.c             | 14 +++++-----
> > >   4 files changed, 82 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> > > index 663f5813b12e..1f69864492b6 100644
> > > --- a/xen/arch/arm/arm64/head.S
> > > +++ b/xen/arch/arm/arm64/head.S
> > > @@ -816,30 +816,46 @@ ENDPROC(fail)
> > >    * Switch TTBR
> > >    *
> > >    * x0    ttbr
> > > - *
> > > - * TODO: This code does not comply with break-before-make.
> > >    */
> > > -ENTRY(switch_ttbr)
> > > -        dsb   sy                     /* Ensure the flushes happen before
> > > -                                      * continuing */
> > > -        isb                          /* Ensure synchronization with
> > > previous
> > > -                                      * changes to text */
> > > -        tlbi   alle2                 /* Flush hypervisor TLB */
> > > -        ic     iallu                 /* Flush I-cache */
> > > -        dsb    sy                    /* Ensure completion of TLB flush */
> > > +ENTRY(switch_ttbr_id)
> > > +        /* 1) Ensure any previous read/write have completed */
> > > +        dsb    ish
> > > +        isb
> > > +
> > > +        /* 2) Turn off MMU */
> > > +        mrs    x1, SCTLR_EL2
> > > +        bic    x1, x1, #SCTLR_Axx_ELx_M
> > 
> > do we need a "dsb   sy" here? we have in enable_mmu
> 
> Hmmm... The explanation of the dsb + isb in enable_mmu makes no sense. The isb
> doesn't flush the I-cache, it just flushes the pipeline.
> 
> For the dsb, I am not convinced it is necessary because we already have the
> 'dsb nsh' above and in any case the barrier seems to be too strong.
> 
> I guess that will be another patch... (probably at a lower priority).
> 
> Now back to your question of the 'dsb' here. There is already a 'dsb ish'
> above. So memory access before turning off the MMU should be completed.
> Also...
> 
> > 
> > 
> > > +        msr    SCTLR_EL2, x1
> > > +        isb
> 
> ... this isb will ensure the completion of the SCTLR write before the TLBs are
> flushed below. And there should be no memory access (other than instruction
> fetches) here. So I don't think a dsb is needed.
> 
> Would you mind explaining why you think one is needed?

I am not at all sure whether it is needed or not, I was just noticing
that we have the "dsb sy" in enable_mmu and here we don't.

Thinking about it, the only reason for the additional dsb would be to
make sure that the two operations:

  mrs    x1, SCTLR_EL2
  bic    x1, x1, #SCTLR_Axx_ELx_M

are completed before disabling the MMU:

  msr    SCTLR_EL2, x1

Is that actually a requirement? I don't know.


> > > +
> > > +        /*
> > > +         * 3) Flush the TLBs.
> > > +         * See asm/arm64/flushtlb.h for the explanation of the sequence.
> > > +         */
> > > +        dsb   nshst
> > > +        tlbi  alle2
> > > +        dsb   nsh
> > > +        isb
> > > +
> > > +        /* 4) Update the TTBR */
> > > +        msr   TTBR0_EL2, x0
> > >           isb
> > >   -        msr    TTBR0_EL2, x0
> > > +        /*
> > > +         * 5) Flush I-cache
> > > +         * This should not be necessary but it is kept for safety.
> > > +         */
> > > +        ic     iallu
> > > +        isb
> > >   -        isb                          /* Ensure synchronization with
> > > previous
> > > -                                      * changes to text */
> > > -        tlbi   alle2                 /* Flush hypervisor TLB */
> > > -        ic     iallu                 /* Flush I-cache */
> > > -        dsb    sy                    /* Ensure completion of TLB flush */
> > > +        /* 5) Turn on the MMU */
> > 
> > This should be 6)
> 
> I will update it.
> 
> > 
> > 
> > > +        mrs   x1, SCTLR_EL2
> > > +        orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
> > > +        msr   SCTLR_EL2, x1
> > >           isb
> > >             ret
> > > -ENDPROC(switch_ttbr)
> > > +ENDPROC(switch_ttbr_id)
> > >     #ifdef CONFIG_EARLY_PRINTK
> > >   /*
> > > diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
> > > index 9eaf545ea9dd..2ede4e75ae33 100644
> > > --- a/xen/arch/arm/arm64/mm.c
> > > +++ b/xen/arch/arm/arm64/mm.c
> > > @@ -31,6 +31,15 @@ static void __init prepare_boot_identity_mapping(void)
> > >       lpae_t pte;
> > >       DECLARE_OFFSETS(id_offsets, id_addr);
> > >   +    /*
> > > +     * We will be re-using the boot ID tables. They may not have been
> > > +     * zeroed but they should be unlinked. So it is fine to use
> > > +     * clear_page().
> > > +     */
> > > +    clear_page(boot_first_id);
> > > +    clear_page(boot_second_id);
> > > +    clear_page(boot_third_id);
> > 
> > Could this code be in patch #15?
> 
> Yes, I can't remember why I decided to clear them in patch #18.
> 
> > >       if ( id_offsets[0] != 0 )
> > >           panic("Cannot handled ID mapping above 512GB\n");
> > >   @@ -111,6 +120,36 @@ void update_identity_mapping(bool enable)
> > >       BUG_ON(rc);
> > >   }
> > >   +extern void switch_ttbr_id(uint64_t ttbr);
> > > +
> > > +typedef void (switch_ttbr_fn)(uint64_t ttbr);
> > > +
> > > +void __init switch_ttbr(uint64_t ttbr)
> > > +{
> > > +    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
> > > +    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
> > > +    lpae_t pte;
> > > +
> > > +    /* Enable the identity mapping in the boot page tables */
> > 
> > See below...
> > 
> > > +    update_identity_mapping(true);
> > > +    /* Enable the identity mapping in the runtime page tables */
> > > +    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
> > > +    pte.pt.table = 1;
> > > +    pte.pt.xn = 0;
> > > +    pte.pt.ro = 1;
> > > +    write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
> > > +
> > > +    /* Switch TTBR */
> > > +    fn(ttbr);
> > > +
> > > +    /*
> > > +     * Disable the identity mapping in the runtime page tables.
> > > +     * Note it is not necessary to disable it in the boot page tables
> > > +     * because they are not going to be used by this CPU anymore.
> > > +     */
> > 
> > ...is update_identity_mapping acting on the boot pagetables or the
> > runtime pagetables? The two comments make me think that
> > update_identity_mapping is enabling mapping in the boot pagetables and
> > removing them from the runtime pagetable, which would be strangely
> > inconsistent.
> 
> update_identity_mapping() is acting on the live page-tables (i.e. the one
> pointed by TTBR_EL2).
> 
> Before switch_ttbr(), this will be the boot page-tables. But after, this will
> be the runtime page-tables.
> 
> Would the following comment on top of the declaration of
> update_identity_mapping() clarify it:
> 
> "Enable/disable the identity mapping in the live page-tables (i.e. the one
> pointed by TTBR_EL2)."

Thank you!



* Re: [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr()
  2022-12-13 22:56       ` Stefano Stabellini
@ 2022-12-13 23:01         ` Julien Grall
  0 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-13 23:01 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

Hi Stefano,

On 13/12/2022 22:56, Stefano Stabellini wrote:
> On Tue, 13 Dec 2022, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 13/12/2022 02:00, Stefano Stabellini wrote:
>>> On Mon, 12 Dec 2022, Julien Grall wrote:
>>>> From: Julien Grall <jgrall@amazon.com>
>>>>
>>>> At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
>>>> still on.
>>>>
>>>> Switching TTBR is like replacing existing mappings with new ones. So
>>>> we need to follow the break-before-make sequence.
>>>>
>>>> In this case, it means the MMU needs to be switched off while the
>>>> TTBR is updated. In order to disable the MMU, we need to first
>>>> jump to an identity mapping.
>>>>
>>>> Rename switch_ttbr() to switch_ttbr_id() and create an helper on
>>>> top to temporary map the identity mapping and call switch_ttbr()
>>>> via the identity address.
>>>>
>>>> switch_ttbr_id() is now reworked to temporarily turn off the MMU
>>>> before updating the TTBR.
>>>>
>>>> We also need to make sure the helper switch_ttbr() is part of the
>>>> identity mapping. So move _end_boot past it.
>>>>
>>>> The arm32 code will use a different approach. So this issue is for now
>>>> only resolved on arm64.
>>>>
>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> This patch looks overall good to me, aside from the few minor comments
>>> below. I would love for someone else, maybe from ARM, reviewing steps
>>> 1-6 making sure they are the right sequence.
>>>
>>>
>>>> ---
>>>>
>>>>       Changes in v2:
>>>>           - Remove the arm32 changes. This will be addressed differently
>>>>           - Re-instate the instruct cache flush. This is not strictly
>>>>             necessary but kept it for safety.
>>>>           - Use "dsb ish"  rather than "dsb sy".
>>>>
>>>>       TODO:
>>>>           * Handle the case where the runtime Xen is loaded at a different
>>>>             position for cache coloring. This will be dealt separately.
>>>> ---
>>>>    xen/arch/arm/arm64/head.S     | 50 +++++++++++++++++++++++------------
>>>>    xen/arch/arm/arm64/mm.c       | 39 +++++++++++++++++++++++++++
>>>>    xen/arch/arm/include/asm/mm.h |  2 ++
>>>>    xen/arch/arm/mm.c             | 14 +++++-----
>>>>    4 files changed, 82 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
>>>> index 663f5813b12e..1f69864492b6 100644
>>>> --- a/xen/arch/arm/arm64/head.S
>>>> +++ b/xen/arch/arm/arm64/head.S
>>>> @@ -816,30 +816,46 @@ ENDPROC(fail)
>>>>     * Switch TTBR
>>>>     *
>>>>     * x0    ttbr
>>>> - *
>>>> - * TODO: This code does not comply with break-before-make.
>>>>     */
>>>> -ENTRY(switch_ttbr)
>>>> -        dsb   sy                     /* Ensure the flushes happen before
>>>> -                                      * continuing */
>>>> -        isb                          /* Ensure synchronization with
>>>> previous
>>>> -                                      * changes to text */
>>>> -        tlbi   alle2                 /* Flush hypervisor TLB */
>>>> -        ic     iallu                 /* Flush I-cache */
>>>> -        dsb    sy                    /* Ensure completion of TLB flush */
>>>> +ENTRY(switch_ttbr_id)
>>>> +        /* 1) Ensure any previous read/write have completed */
>>>> +        dsb    ish
>>>> +        isb
>>>> +
>>>> +        /* 2) Turn off MMU */
>>>> +        mrs    x1, SCTLR_EL2
>>>> +        bic    x1, x1, #SCTLR_Axx_ELx_M
>>>
>>> do we need a "dsb   sy" here? we have in enable_mmu
>>
>> Hmmm... The explanation of the dsb + isb in enable_mmu makes no sense. The isb
>> doesn't flush the I-cache, it just flushes the pipeline.
>>
>> For the dsb, I am not convinced it is necessary because we already have the
>> 'dsb nsh' above and in any case the barrier seems to be too strong.
>>
>> I guess that will be another patch... (probably at a lower priority).
>>
>> Now back to your question of the 'dsb' here. There is already a 'dsb ish'
>> above. So memory access before turning off the MMU should be completed.
>> Also...
>>
>>>
>>>
>>>> +        msr    SCTLR_EL2, x1
>>>> +        isb
>>
>> ... this isb will ensure the completion of the SCTLR write before the TLBs are
>> flushed below. And there should be no memory access (other than instruction
>> fetches) here. So I don't think a dsb is needed.
>> 
>> Would you mind explaining why you think one is needed?
> 
> I am not at all sure whether it is needed or not, I was just noticing
> that we have the "dsb sy" in enable_mmu and here we don't.
> 
> Thinking about it, the only reason for the additional dsb would be to
> make sure that the two operations:
> 
>    mrs    x1, SCTLR_EL2
>    bic    x1, x1, #SCTLR_Axx_ELx_M
> 
> are completed before disabling the MMU:

That's not what a 'dsb' is for. It is used for memory ordering, and there 
are no memory accesses involved here.

If you want the operations to be completed, then this would be an 'isb'. 
Yet, this would not be necessary here as the next instruction cannot be 
re-ordered because of the register dependency.
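
To illustrate (a commented sketch of the instructions in question, not a
change to the patch):

    mrs    x1, SCTLR_EL2             /* read the current SCTLR_EL2 value   */
    bic    x1, x1, #SCTLR_Axx_ELx_M  /* clear the MMU enable bit in x1     */
    msr    SCTLR_EL2, x1             /* consumes x1, so it cannot be issued
                                      * before the bic has produced its
                                      * result; the register dependency
                                      * provides the ordering, no dsb needed */
    isb                              /* ensure the SCTLR_EL2 write takes
                                      * effect before later instructions    */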

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs
  2022-12-12  9:55 ` [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs Julien Grall
@ 2022-12-14 14:24   ` Michal Orzel
  2023-01-12 19:38     ` Julien Grall
  0 siblings, 1 reply; 48+ messages in thread
From: Michal Orzel @ 2022-12-14 14:24 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi Julien,

On 12/12/2022 10:55, Julien Grall wrote:
> 
> 
> From: Julien Grall <jgrall@amazon.com>
> 
> The sequence for flushing the TLBs is 4 instruction long and often
> requires an explanation how it works.
> 
> So create an helper and use it in the boot code (switch_ttbr() is left
Here and in title: s/an helper/a helper/

> alone for now).
Could you explain why?

> 
> Note that in secondary_switched, we were also flushing the instruction
> cache and branch predictor. Neither of them was necessary because:
>     * We are only supporting IVIPT cache on arm32, so the instruction
>       cache flush is only necessary when executable code is modified.
>       None of the boot code is doing that.
>     * The instruction cache is not invalidated and misprediction is not
>       a problem at boot.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Apart from that, the patch is good, so:
Reviewed-by: Michal Orzel <michal.orzel@amd.com>

> 
> ---
>     Changes in v3:
>         * Fix typo
>         * Update the documentation
>         * Rename the argument from tmp1 to tmp
> ---
>  xen/arch/arm/arm32/head.S | 30 +++++++++++++++++-------------
>  1 file changed, 17 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index 40c1d7502007..315abbbaebec 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -66,6 +66,20 @@
>          add   \rb, \rb, r10
>  .endm
> 
> +/*
> + * Flush local TLBs
> + *
> + * @tmp:    Scratch register
As you are respinning a series anyway, could you add just one space after @tmp:?

~Michal



* Re: [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on
  2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
                   ` (17 preceding siblings ...)
  2022-12-12  9:55 ` [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr() Julien Grall
@ 2022-12-15 11:48 ` Julien Grall
  18 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2022-12-15 11:48 UTC (permalink / raw)
  To: xen-devel
  Cc: michal.orzel, Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk

Hi,

On 12/12/2022 09:55, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Hi all,
> 
> Currently, Xen on Arm will switch TTBR whilst the MMU is on. This is
> similar to replacing existing mappings with new ones. So we need to
> follow a break-before-make sequence.
> 
> When switching the TTBR, we need to temporary disable the MMU
> before updating the TTBR. This means the page-tables must contain an
> identity mapping.
> 
> The current memory layout is not very flexible and has an higher chance
> to clash with the identity mapping.
> 
> On Arm64, we have plenty of unused virtual address space Therefore, we can
> simply reshuffle the layout to leave the first part of the virtual
> address space empty.
> 
> On Arm32, the virtual address space is already quite full. Even if we
> find space, it would be necessary to have a dynamic layout. So a
> different approach will be necessary. The chosen one is to have
> a temporary mapping that will be used to jumped from the ID mapping
> to the runtime mapping (or vice versa). The temporary mapping will
> be overlapping with the domheap area as it should not be used when
> switching on/off the MMU.
> 
> The Arm32 part is not yet addressed and will be handled in a follow-up
> series.
> 
> After this series, most of Xen page-table code should be compliant
> with the Arm Arm. The last two issues I am aware of are:
>   - domheap: Mappings are replaced without using the Break-Before-Make
>     approach.
>   - The cache is not cleaned/invalidated when updating the page-tables
>     with Data cache off (like during early boot).
> 
> The long term plan is to get rid of boot_* page tables and then
> directly use the runtime pages. This means for coloring, we will
> need to build the pages in the relocated Xen rather than the current
> Xen.
> 
> For convience, I pushed a branch with everything applied:
> 
> https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
> branch boot-pt-rework-v3
> 
> Cheers,
> 
> Julien Grall (18):
>    xen/arm: Enable use of dump_pt_walk() early during boot

[...]

>    xen/arm: mm: Allow xen_pt_update() to work with the current root table
>    xen/arm: mm: Allow dump_hyp_walk() to work on the current root table

I have pushed those 3 patches because they are ready.

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs
  2022-12-14 14:24   ` Michal Orzel
@ 2023-01-12 19:38     ` Julien Grall
  0 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2023-01-12 19:38 UTC (permalink / raw)
  To: Michal Orzel, xen-devel
  Cc: Luca.Fancellu, Julien Grall, Stefano Stabellini,
	Bertrand Marquis, Volodymyr Babchuk



On 14/12/2022 14:24, Michal Orzel wrote:
> Hi Julien,

Hi Michal,

> On 12/12/2022 10:55, Julien Grall wrote:
>>
>>
>> From: Julien Grall <jgrall@amazon.com>
>>
>> The sequence for flushing the TLBs is 4 instruction long and often
>> requires an explanation how it works.
>>
>> So create an helper and use it in the boot code (switch_ttbr() is left
> Here and in title: s/an helper/a helper/

Done.

>> alone for now).
> Could you explain why?

So we need to decide on the semantics we expect from switch_ttbr(). E.g. if 
Xen is relocated at a different position, should the caller take care of 
the instruction/branch predictor flush?

I have expanded it to "switch_ttbr() is left alone until we decided the 
semantic of the call".

>>
>> Note that in secondary_switched, we were also flushing the instruction
>> cache and branch predictor. Neither of them was necessary because:
>>      * We are only supporting IVIPT cache on arm32, so the instruction
>>        cache flush is only necessary when executable code is modified.
>>        None of the boot code is doing that.
>>      * The instruction cache is not invalidated and misprediction is not
>>        a problem at boot.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> Apart from that, the patch is good, so:
> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Thanks!

> 
>>
>> ---
>>      Changes in v3:
>>          * Fix typo
>>          * Update the documentation
>>          * Rename the argument from tmp1 to tmp
>> ---
>>   xen/arch/arm/arm32/head.S | 30 +++++++++++++++++-------------
>>   1 file changed, 17 insertions(+), 13 deletions(-)
>>
>> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
>> index 40c1d7502007..315abbbaebec 100644
>> --- a/xen/arch/arm/arm32/head.S
>> +++ b/xen/arch/arm/arm32/head.S
>> @@ -66,6 +66,20 @@
>>           add   \rb, \rb, r10
>>   .endm
>>
>> +/*
>> + * Flush local TLBs
>> + *
>> + * @tmp:    Scratch register
> As you are respinning a series anyway, could you add just one space after @tmp:?

Ok.

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping
  2022-12-13  1:41   ` Stefano Stabellini
@ 2023-01-12 22:03     ` Julien Grall
  0 siblings, 0 replies; 48+ messages in thread
From: Julien Grall @ 2023-01-12 22:03 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, michal.orzel, Luca.Fancellu, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

Hi,

On 13/12/2022 01:41, Stefano Stabellini wrote:
>> diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
>> index fdbf68aadcaa..e7a80fecec14 100644
>> --- a/xen/arch/arm/include/asm/setup.h
>> +++ b/xen/arch/arm/include/asm/setup.h
>> @@ -168,6 +168,17 @@ int map_range_to_domain(const struct dt_device_node *dev,
>>   
>>   extern const char __ro_after_init_start[], __ro_after_init_end[];
>>   
>> +extern DEFINE_BOOT_PAGE_TABLE(boot_pgtable);
>> +
>> +#ifdef CONFIG_ARM_64
>> +extern DEFINE_BOOT_PAGE_TABLE(boot_first_id);
>> +#endif
>> +extern DEFINE_BOOT_PAGE_TABLE(boot_second_id);
>> +extern DEFINE_BOOT_PAGE_TABLE(boot_third_id);
> 
> This is more a matter of taste but I would either:
> - define extern all BOOT_PAGE_TABLEs here both ARM64 and ARM32 with
>    #ifdefs

A grep of BOOT_PAGE_TABLE shows that they are all defined in setup.h.

> - or define all the ARM64 only BOOT_PAGE_TABLE in arm64/mm.h and all the
>    ARM32 only BOOT_PAGE_TABLE in arm32/mm.h >
> Right now we have a mix, as we have boot_first_id with a #ifdef here
> and we have xen_pgtable in arm64/mm.h
We are talking about two distinct sets of page-tables. One is used at 
runtime (i.e. xen_pgtable) and the others are for boot/SMP bring-up.

So adding the boot_* in setup.h is correct. As I wrote earlier, setup.h 
would need a split. But this is not something I really want to handle 
here...

> 
> Also we are missing boot_second and boot_third. We might as well be
> consistent and declare them all?

My plan is really to kill boot_second and boot_third. So I don't really 
want to export them right now (even temporarily).

In any case, I don't think such a change belongs in this patch (it is 
already complex enough).

>> +/* Find where Xen will be residing at runtime and return an PT entry */
>> +lpae_t pte_of_xenaddr(vaddr_t);
>> +
>>   #endif
>>   /*
>>    * Local variables:
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index 0cf7ad4f0e8c..39e0d9e03c9c 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -93,7 +93,7 @@ DEFINE_BOOT_PAGE_TABLE(boot_third);
>>   
>>   #ifdef CONFIG_ARM_64
>>   #define HYP_PT_ROOT_LEVEL 0
>> -static DEFINE_PAGE_TABLE(xen_pgtable);
>> +DEFINE_PAGE_TABLE(xen_pgtable);
>>   static DEFINE_PAGE_TABLE(xen_first);
>>   #define THIS_CPU_PGTABLE xen_pgtable
>>   #else
>> @@ -388,7 +388,7 @@ void flush_page_to_ram(unsigned long mfn, bool sync_icache)
>>           invalidate_icache();
>>   }
>>   
>> -static inline lpae_t pte_of_xenaddr(vaddr_t va)
>> +lpae_t pte_of_xenaddr(vaddr_t va)
>>   {
>>       paddr_t ma = va + phys_offset;
>>   
>> @@ -495,6 +495,8 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
>>   
>>       phys_offset = boot_phys_offset;
>>   
>> +    arch_setup_page_tables();
>> +
>>   #ifdef CONFIG_ARM_64
>>       pte = pte_of_xenaddr((uintptr_t)xen_first);
>>       pte.pt.table = 1;
>> -- 
>> 2.38.1
>>

Cheers,

-- 
Julien Grall



* Re: [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
  2022-12-12  9:55 ` [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
  2022-12-13  9:11   ` Michal Orzel
@ 2023-01-23 16:19   ` Ayan Kumar Halder
  1 sibling, 0 replies; 48+ messages in thread
From: Ayan Kumar Halder @ 2023-01-23 16:19 UTC (permalink / raw)
  To: xen-devel


On 12/12/2022 09:55, Julien Grall wrote:
>
>
> From: Julien Grall <jgrall@amazon.com>
>
> Per D5-4929 in ARM DDI 0487H.a:
> "A DSB NSH is sufficient to ensure completion of TLB maintenance
>   instructions that apply to a single PE. A DSB ISH is sufficient to
>   ensure completion of TLB maintenance instructions that apply to PEs
>   in the same Inner Shareable domain.
> "
>
> This means barrier after local TLB flushes could be reduced to
> non-shareable.
>
> Note that the scope of the barrier in the workaround has not been
> changed because Linux v6.1-rc8 is also using 'ish' and I couldn't
> find anything in the Neoverse N1 suggesting that a 'nsh' would
> be sufficient.
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> ---
>
>      I have used an older version of the Arm Arm because the explanation
>      in the latest (ARM DDI 0487I.a) is less obvious. I reckon the paragraph
>      about DSB in D8.13.8 is missing the shareability. But this is implied
>      in B2.3.11:
>
>      "If the required access types of the DSB is reads and writes, the
>       following instructions issued by PEe before the DSB are complete for
>       the required shareability domain:
>
>       [...]
>
>       — All TLB maintenance instructions.
>      "
>
>      Changes in v3:
>          - Patch added
> ---
>   xen/arch/arm/include/asm/arm64/flushtlb.h | 27 ++++++++++++++---------
>   1 file changed, 16 insertions(+), 11 deletions(-)
>
> diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
> index 7c5431518741..39d429ace552 100644
> --- a/xen/arch/arm/include/asm/arm64/flushtlb.h
> +++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
> @@ -12,8 +12,9 @@
>    * ARM64_WORKAROUND_REPEAT_TLBI:
>    * Modification of the translation table for a virtual address might lead to
>    * read-after-read ordering violation.
> - * The workaround repeats TLBI+DSB operation for all the TLB flush operations.
> - * While this is stricly not necessary, we don't want to take any risk.
> + * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
> + * operations. While this is stricly not necessary, we don't want to
> + * take any risk.
>    *
>    * For Xen page-tables the ISB will discard any instructions fetched
>    * from the old mappings.
> @@ -21,38 +22,42 @@
>    * For the Stage-2 page-tables the ISB ensures the completion of the DSB
>    * (and therefore the TLB invalidation) before continuing. So we know
>    * the TLBs cannot contain an entry for a mapping we may have removed.
> + *
> + * Note that for local TLB flush, using non-shareable (nsh) is sufficient
> + * (see D5-4929 in ARM DDI 0487H.a). Althougth, the memory barrier in
> + * for the workaround is left as inner-shareable to match with Linux.

Nit: It might be good to mention the Linux commit id.

>    */
> -#define TLB_HELPER(name, tlbop)                  \
> +#define TLB_HELPER(name, tlbop, sh)              \
>   static inline void name(void)                    \
>   {                                                \
>       asm volatile(                                \
> -        "dsb  ishst;"                            \
> +        "dsb  "  # sh  "st;"                     \
>           "tlbi "  # tlbop  ";"                    \
>           ALTERNATIVE(                             \
>               "nop; nop;",                         \
> -            "dsb  ish;"                          \
> +            "dsb  "  # sh  ";"                   \
>               "tlbi "  # tlbop  ";",               \
>               ARM64_WORKAROUND_REPEAT_TLBI,        \
>               CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
> -        "dsb  ish;"                              \
> +        "dsb  "  # sh  ";"                       \
>           "isb;"                                   \
>           : : : "memory");                         \
>   }
>
>   /* Flush local TLBs, current VMID only. */
> -TLB_HELPER(flush_guest_tlb_local, vmalls12e1);
> +TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh);
>
>   /* Flush innershareable TLBs, current VMID only */
> -TLB_HELPER(flush_guest_tlb, vmalls12e1is);
> +TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish);
>
>   /* Flush local TLBs, all VMIDs, non-hypervisor mode */
> -TLB_HELPER(flush_all_guests_tlb_local, alle1);
> +TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh);
>
>   /* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
> -TLB_HELPER(flush_all_guests_tlb, alle1is);
> +TLB_HELPER(flush_all_guests_tlb, alle1is, ish);
>
>   /* Flush all hypervisor mappings from the TLB of the local processor. */
> -TLB_HELPER(flush_xen_tlb_local, alle2);
> +TLB_HELPER(flush_xen_tlb_local, alle2, nsh);
>
>   /* Flush TLB of local processor for address va. */
>   static inline void  __flush_xen_tlb_one_local(vaddr_t va)
> --
> 2.38.1
Reviewed-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
>
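
For readers skimming the macro, a rough hand-expansion of
TLB_HELPER(flush_xen_tlb_local, alle2, nsh) after this patch (with the
ALTERNATIVE collapsed to the nops it patches at runtime; sketch only):

    static inline void flush_xen_tlb_local(void)
    {
        asm volatile(
            "dsb  nshst;"   /* order prior page-table writes, local PE only */
            "tlbi alle2;"   /* invalidate all EL2 TLB entries on this PE    */
            "nop; nop;"     /* patched to repeat the DSB + TLBI when
                             * ARM64_WORKAROUND_REPEAT_TLBI is active       */
            "dsb  nsh;"     /* wait for the local TLB invalidation          */
            "isb;"
            : : : "memory");
    }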



end of thread, other threads:[~2023-01-23 16:19 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
2022-12-12  9:55 [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
2022-12-12  9:55 ` [PATCH v3 01/18] xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
2022-12-13  9:11   ` Michal Orzel
2022-12-13  9:45     ` Julien Grall
2022-12-13  9:50       ` Michal Orzel
2023-01-23 16:19   ` Ayan Kumar Halder
2022-12-12  9:55 ` [PATCH v3 02/18] xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB flush by VA Julien Grall
2022-12-13  9:23   ` Michal Orzel
2022-12-12  9:55 ` [PATCH v3 03/18] xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush Julien Grall
2022-12-13 10:48   ` Michal Orzel
2022-12-12  9:55 ` [PATCH v3 04/18] xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush Julien Grall
2022-12-13 11:15   ` Michal Orzel
2022-12-12  9:55 ` [PATCH v3 05/18] xen/arm: Clean-up the memory layout Julien Grall
2022-12-13 10:57   ` Michal Orzel
2022-12-13 11:00     ` Michal Orzel
2022-12-12  9:55 ` [PATCH v3 06/18] xen/arm32: head: Replace "ldr rX, =<label>" with "mov_w rX, <label>" Julien Grall
2022-12-13  0:31   ` Stefano Stabellini
2022-12-13 11:10   ` Michal Orzel
2022-12-12  9:55 ` [PATCH v3 07/18] xen/arm32: head: Jump to the runtime mapping in enable_mmu() Julien Grall
2022-12-13  0:46   ` Stefano Stabellini
2022-12-12  9:55 ` [PATCH v3 08/18] xen/arm32: head: Introduce an helper to flush the TLBs Julien Grall
2022-12-14 14:24   ` Michal Orzel
2023-01-12 19:38     ` Julien Grall
2022-12-12  9:55 ` [PATCH v3 09/18] xen/arm32: head: Remove restriction where to load Xen Julien Grall
2022-12-13 18:23   ` Julien Grall
2022-12-12  9:55 ` [PATCH v3 10/18] xen/arm32: head: Widen the use of the temporary mapping Julien Grall
2022-12-12  9:55 ` [PATCH v3 11/18] xen/arm: Enable use of dump_pt_walk() early during boot Julien Grall
2022-12-13  1:06   ` Stefano Stabellini
2022-12-12  9:55 ` [PATCH v3 12/18] xen/arm64: Rework the memory layout Julien Grall
2022-12-13  1:22   ` Stefano Stabellini
2022-12-13 18:31     ` Julien Grall
2022-12-12  9:55 ` [PATCH v3 13/18] xen/arm: mm: Allow xen_pt_update() to work with the current root table Julien Grall
2022-12-13  1:24   ` Stefano Stabellini
2022-12-12  9:55 ` [PATCH v3 14/18] xen/arm: mm: Allow dump_hyp_walk() to work on " Julien Grall
2022-12-13  1:24   ` Stefano Stabellini
2022-12-12  9:55 ` [PATCH v3 15/18] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping Julien Grall
2022-12-13  1:41   ` Stefano Stabellini
2023-01-12 22:03     ` Julien Grall
2022-12-12  9:55 ` [PATCH v3 16/18] xen/arm: linker: Indent correctly _stext Julien Grall
2022-12-13  1:42   ` Stefano Stabellini
2022-12-12  9:55 ` [PATCH v3 17/18] xen/arm: linker: The identitymap check should cover the whole .text.header Julien Grall
2022-12-13  1:44   ` Stefano Stabellini
2022-12-12  9:55 ` [PATCH v3 18/18] xen/arm64: mm: Rework switch_ttbr() Julien Grall
2022-12-13  2:00   ` Stefano Stabellini
2022-12-13 19:08     ` Julien Grall
2022-12-13 22:56       ` Stefano Stabellini
2022-12-13 23:01         ` Julien Grall
2022-12-15 11:48 ` [PATCH v3 00/18] xen/arm: Don't switch TTBR while the MMU is on Julien Grall
