xen-devel.lists.xenproject.org archive mirror
* [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings
@ 2021-04-25 20:13 Julien Grall
  2021-04-25 20:13 ` [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS Julien Grall
                   ` (14 more replies)
  0 siblings, 15 replies; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Jan Beulich, Wei Liu, Andrew Cooper,
	Roger Pau Monné,
	Hongian Xia

From: Julien Grall <jgrall@amazon.com>

Hi all,

This series was originally sent as "xen/arm: mm: Add limited support
for superpages" [1] and has now grown enough to remove most of the
open-coded mappings in the boot code.

This will help to:
    1) Get better compliance with the Arm memory model
    2) Pave the way to support other page sizes (16KB, 64KB)

This is not fully finished (only boot-tested on Arm64), but I am sending
it early to get more testing and also to unblock some of the ongoing
work to support static memory allocation for dom0less (see [2]).

There are still a few TODOs:
    - Add support for setting the contiguous bits
    - Remove 1GB alignment in setup_xenheap_mappings()
    - Decide whether we want to provide a common PMAP

Cheers,

[1] <20201119190751.22345-1-julien@xen.org>
[2] <PA4PR08MB6253F49C13ED56811BA5B64E92479@PA4PR08MB6253.eurprd08.prod.outlook.com>

Julien Grall (14):
  xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS
  xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER,
    MASK}
  xen/arm: mm: Allow other mapping size in xen_pt_update_entry()
  xen/arm: mm: Avoid flushing the TLBs when mapping are inserted
  xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings()
  xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen()
  xen/arm32: mm: Check if the virtual address is shared before updating
    it
  xen/arm32: mm: Re-implement setup_xenheap_mappings() using
    map_pages_to_xen()
  xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  xen/arm: mm: Allow page-table allocation from the boot allocator
  xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table()
  xen/arm: mm: Rework setup_xenheap_mappings()
  xen/arm: mm: Re-implement setup_frame_table_mappings() with
    map_pages_to_xen()

Wei Liu (1):
  xen/arm: add Persistent Map (PMAP) infrastructure

 xen/arch/arm/Makefile        |   1 +
 xen/arch/arm/mm.c            | 362 +++++++++++++++++------------------
 xen/arch/arm/p2m.c           |  16 +-
 xen/arch/arm/pmap.c          | 101 ++++++++++
 xen/arch/arm/setup.c         |  10 +-
 xen/include/asm-arm/config.h |   6 +
 xen/include/asm-arm/lpae.h   |  83 ++++----
 xen/include/asm-arm/page.h   |   4 +
 xen/include/asm-arm/pmap.h   |  10 +
 9 files changed, 356 insertions(+), 237 deletions(-)
 create mode 100644 xen/arch/arm/pmap.c
 create mode 100644 xen/include/asm-arm/pmap.h

-- 
2.17.1



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-11 22:12   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers Julien Grall
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Commit 05031fa87357 "xen/arm: guest_walk: Only generate necessary
offsets/masks" introduced LPAE_ENTRIES_MASK_GS. In a follow-up patch,
we will use it to define LPAE_ENTRY_MASK.

This will lead to inconsistent naming. As LPAE_ENTRY_MASK is used in
many places, it is better to rename LPAE_ENTRIES_MASK_GS and avoid
some churn.

So rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/include/asm-arm/lpae.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index e94de2e7d8e8..4fb9a40a4ca9 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -180,7 +180,7 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
  */
 #define LPAE_SHIFT_GS(gs)         ((gs) - 3)
 #define LPAE_ENTRIES_GS(gs)       (_AC(1, U) << LPAE_SHIFT_GS(gs))
-#define LPAE_ENTRIES_MASK_GS(gs)  (LPAE_ENTRIES_GS(gs) - 1)
+#define LPAE_ENTRY_MASK_GS(gs)  (LPAE_ENTRIES_GS(gs) - 1)
 
 #define LEVEL_ORDER_GS(gs, lvl)   ((3 - (lvl)) * LPAE_SHIFT_GS(gs))
 #define LEVEL_SHIFT_GS(gs, lvl)   (LEVEL_ORDER_GS(gs, lvl) + (gs))
@@ -188,7 +188,7 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
 
 /* Offset in the table at level 'lvl' */
 #define LPAE_TABLE_INDEX_GS(gs, lvl, addr)   \
-    (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRIES_MASK_GS(gs))
+    (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRY_MASK_GS(gs))
 
 /* Generate an array @var containing the offset for each level from @addr */
 #define DECLARE_OFFSETS(var, addr)          \
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
  2021-04-25 20:13 ` [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-11 22:26   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK} Julien Grall
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Currently, the Xen PT helpers only work with 4KB page granularity
and open-code the generic helpers. To allow more flexibility, we can
re-use the generic helpers and pass Xen's page granularity
(PAGE_SHIFT).
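
As a rough illustration (not part of the patch), this is how the generic
helpers expand for Xen's current 4KB granularity (PAGE_SHIFT = 12),
matching the values the open-coded helpers used to produce:

    /*
     * Illustration only, gs = PAGE_SHIFT = 12 (4KB granularity):
     *   LPAE_SHIFT_GS(12)     = 12 - 3 = 9       -> 512 entries per table
     *   LEVEL_ORDER_GS(12, 3) = (3 - 3) * 9 = 0  -> 4KB mapping at level 3
     *   LEVEL_ORDER_GS(12, 2) = (3 - 2) * 9 = 9  -> 2MB mapping at level 2
     *   LEVEL_ORDER_GS(12, 1) = (3 - 1) * 9 = 18 -> 1GB mapping at level 1
     *   LEVEL_SHIFT_GS(12, 2) = 9 + 12 = 21      -> i.e. SECOND_SHIFT
     */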

As Xen PT helpers are used in both C and assembly, we need to move
the generic helpers definition outside of the !__ASSEMBLY__ section.

Note the aliases for each level are still kept for the time being so we
can avoid a massive patch to change all the callers.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/include/asm-arm/lpae.h | 71 +++++++++++++++++++++-----------------
 1 file changed, 40 insertions(+), 31 deletions(-)

diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index 4fb9a40a4ca9..310f5225e056 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -159,6 +159,17 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
 #define lpae_get_mfn(pte)    (_mfn((pte).walk.base))
 #define lpae_set_mfn(pte, mfn)  ((pte).walk.base = mfn_x(mfn))
 
+/* Generate an array @var containing the offset for each level from @addr */
+#define DECLARE_OFFSETS(var, addr)          \
+    const unsigned int var[4] = {           \
+        zeroeth_table_offset(addr),         \
+        first_table_offset(addr),           \
+        second_table_offset(addr),          \
+        third_table_offset(addr)            \
+    }
+
+#endif /* __ASSEMBLY__ */
+
 /*
  * AArch64 supports pages with different sizes (4K, 16K, and 64K).
  * Provide a set of generic helpers that will compute various
@@ -190,17 +201,6 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
 #define LPAE_TABLE_INDEX_GS(gs, lvl, addr)   \
     (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRY_MASK_GS(gs))
 
-/* Generate an array @var containing the offset for each level from @addr */
-#define DECLARE_OFFSETS(var, addr)          \
-    const unsigned int var[4] = {           \
-        zeroeth_table_offset(addr),         \
-        first_table_offset(addr),           \
-        second_table_offset(addr),          \
-        third_table_offset(addr)            \
-    }
-
-#endif /* __ASSEMBLY__ */
-
 /*
  * These numbers add up to a 48-bit input address space.
  *
@@ -211,26 +211,35 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
  * therefore 39-bits are sufficient.
  */
 
-#define LPAE_SHIFT      9
-#define LPAE_ENTRIES    (_AC(1,U) << LPAE_SHIFT)
-#define LPAE_ENTRY_MASK (LPAE_ENTRIES - 1)
-
-#define THIRD_SHIFT    (PAGE_SHIFT)
-#define THIRD_ORDER    (THIRD_SHIFT - PAGE_SHIFT)
-#define THIRD_SIZE     (_AT(paddr_t, 1) << THIRD_SHIFT)
-#define THIRD_MASK     (~(THIRD_SIZE - 1))
-#define SECOND_SHIFT   (THIRD_SHIFT + LPAE_SHIFT)
-#define SECOND_ORDER   (SECOND_SHIFT - PAGE_SHIFT)
-#define SECOND_SIZE    (_AT(paddr_t, 1) << SECOND_SHIFT)
-#define SECOND_MASK    (~(SECOND_SIZE - 1))
-#define FIRST_SHIFT    (SECOND_SHIFT + LPAE_SHIFT)
-#define FIRST_ORDER    (FIRST_SHIFT - PAGE_SHIFT)
-#define FIRST_SIZE     (_AT(paddr_t, 1) << FIRST_SHIFT)
-#define FIRST_MASK     (~(FIRST_SIZE - 1))
-#define ZEROETH_SHIFT  (FIRST_SHIFT + LPAE_SHIFT)
-#define ZEROETH_ORDER  (ZEROETH_SHIFT - PAGE_SHIFT)
-#define ZEROETH_SIZE   (_AT(paddr_t, 1) << ZEROETH_SHIFT)
-#define ZEROETH_MASK   (~(ZEROETH_SIZE - 1))
+#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
+#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
+#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
+
+#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
+#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
+#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
+#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
+
+/* Convenience aliases */
+#define THIRD_SHIFT         LEVEL_SHIFT(3)
+#define THIRD_ORDER         LEVEL_ORDER(3)
+#define THIRD_SIZE          LEVEL_SIZE(3)
+#define THIRD_MASK          LEVEL_MASK(3)
+
+#define SECOND_SHIFT        LEVEL_SHIFT(2)
+#define SECOND_ORDER        LEVEL_ORDER(2)
+#define SECOND_SIZE         LEVEL_SIZE(2)
+#define SECOND_MASK         LEVEL_MASK(2)
+
+#define FIRST_SHIFT         LEVEL_SHIFT(1)
+#define FIRST_ORDER         LEVEL_ORDER(1)
+#define FIRST_SIZE          LEVEL_SIZE(1)
+#define FIRST_MASK          LEVEL_MASK(1)
+
+#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
+#define ZEROETH_ORDER       LEVEL_ORDER(0)
+#define ZEROETH_SIZE        LEVEL_SIZE(0)
+#define ZEROETH_MASK        LEVEL_MASK(0)
 
 /* Calculate the offsets into the pagetables for a given VA */
 #define zeroeth_linear_offset(va) ((va) >> ZEROETH_SHIFT)
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK}
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
  2021-04-25 20:13 ` [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS Julien Grall
  2021-04-25 20:13 ` [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-11 22:33   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 04/15] xen/arm: mm: Allow other mapping size in xen_pt_update_entry() Julien Grall
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

The arrays level_orders and level_masks can be replaced with the
recently introduced macros LEVEL_ORDER and LEVEL_MASK.
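
For instance (illustration only), a level-2 lookup is unchanged by the
conversion:

    /*
     * Before: level_orders[2] == SECOND_ORDER == 9 (4KB granularity)
     * After:  LEVEL_ORDER(2)  == LEVEL_ORDER_GS(PAGE_SHIFT, 2) == 9
     * so LEVEL_ORDER(level) substitutes level_orders[level], and
     * LEVEL_MASK(level) substitutes level_masks[level].
     */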

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/arch/arm/p2m.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index ac5031262061..1b04c3534439 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -36,12 +36,6 @@ static unsigned int __read_mostly max_vmid = MAX_VMID_8_BIT;
  */
 unsigned int __read_mostly p2m_ipa_bits = 64;
 
-/* Helpers to lookup the properties of each level */
-static const paddr_t level_masks[] =
-    { ZEROETH_MASK, FIRST_MASK, SECOND_MASK, THIRD_MASK };
-static const uint8_t level_orders[] =
-    { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
-
 static mfn_t __read_mostly empty_root_mfn;
 
 static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
@@ -232,7 +226,7 @@ static lpae_t *p2m_get_root_pointer(struct p2m_domain *p2m,
      * we can't use (P2M_ROOT_LEVEL - 1) because the root level might be
      * 0. Yet we still want to check if all the unused bits are zeroed.
      */
-    root_table = gfn_x(gfn) >> (level_orders[P2M_ROOT_LEVEL] + LPAE_SHIFT);
+    root_table = gfn_x(gfn) >> (LEVEL_ORDER(P2M_ROOT_LEVEL) + LPAE_SHIFT);
     if ( root_table >= P2M_ROOT_PAGES )
         return NULL;
 
@@ -378,7 +372,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
     if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
     {
         for ( level = P2M_ROOT_LEVEL; level < 3; level++ )
-            if ( (gfn_x(gfn) & (level_masks[level] >> PAGE_SHIFT)) >
+            if ( (gfn_x(gfn) & (LEVEL_MASK(level) >> PAGE_SHIFT)) >
                  gfn_x(p2m->max_mapped_gfn) )
                 break;
 
@@ -421,7 +415,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
          * The entry may point to a superpage. Find the MFN associated
          * to the GFN.
          */
-        mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
+        mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << LEVEL_ORDER(level)) - 1));
 
         if ( valid )
             *valid = lpae_is_valid(entry);
@@ -432,7 +426,7 @@ out_unmap:
 
 out:
     if ( page_order )
-        *page_order = level_orders[level];
+        *page_order = LEVEL_ORDER(level);
 
     return mfn;
 }
@@ -806,7 +800,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
     /* Convenience aliases */
     mfn_t mfn = lpae_get_mfn(*entry);
     unsigned int next_level = level + 1;
-    unsigned int level_order = level_orders[next_level];
+    unsigned int level_order = LEVEL_ORDER(next_level);
 
     /*
      * This should only be called with target != level and the entry is
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 04/15] xen/arm: mm: Allow other mapping size in xen_pt_update_entry()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (2 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK} Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-11 22:42   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 05/15] xen/arm: mm: Avoid flushing the TLBs when mapping are inserted Julien Grall
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Julien Grall <julien.grall@arm.com>

At the moment, xen_pt_update_entry() only supports mappings at level 3
(i.e. 4KB mappings). While this is fine for most of the runtime helpers,
the boot code will require superpage mappings.

We don't want to allow superpage mappings by default as some of the
callers may expect small mappings (e.g. populate_pt_range()) or even
expect to unmap only part of a superpage.

To keep the code simple, a new flag _PAGE_BLOCK is introduced to
allow the caller to enable superpage mapping.

As the code doesn't support all the combinations, xen_pt_check_entry()
is extended to take into account the cases we don't support when
using block mappings:
    - Replacing a table with a mapping. This may happen if a region was
    first mapped with 4KB mappings and later replaced with a 2MB
    (or 1GB) mapping.
    - Removing/modifying a table. This may happen if a caller tries to
    remove a region with _PAGE_BLOCK set when it was created without it.

Note that the current restrictions mean that the caller must ensure that
_PAGE_BLOCK is consistently set/cleared across all the updates on a
given virtual region. This ought to be fine with the expected use-cases.

More rework will be necessary if we want to remove the restrictions.

Note that nr_mfns is now marked const as it is used for flushing the
TLBs and we don't want it to be modified.
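
To make the level selection concrete, here is a sketch (hypothetical
values, 4KB granularity) of what the new loop in xen_pt_update()
computes when _PAGE_BLOCK is set:

    /*
     * Illustration only: map 4MB at virt 0x40200000 to MFN 0x80200,
     * i.e. vfn = 0x40200, mfn = 0x80200, left = nr_mfns = 0x400.
     *
     *   mask = mfn_x(mfn) | vfn | left;
     *   mask & (BIT(FIRST_ORDER, UL) - 1)  != 0  -> not 1GB aligned
     *   mask & (BIT(SECOND_ORDER, UL) - 1) == 0  -> level = 2 (2MB block)
     *   order = LEVEL_ORDER(2) = 9, so each iteration maps 512 pages and
     *   the 4MB range is covered by two block entries.
     */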

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    Changes in v2:
        - Pass the target level rather than the order to
        xen_pt_update_entry()
        - Update some comments
        - Open-code paddr_to_pfn()
        - Add my AWS signed-off-by
---
 xen/arch/arm/mm.c          | 93 ++++++++++++++++++++++++++++++--------
 xen/include/asm-arm/page.h |  4 ++
 2 files changed, 79 insertions(+), 18 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 59f8a3f15fd1..8ebb36899314 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1060,9 +1060,10 @@ static int xen_pt_next_level(bool read_only, unsigned int level,
 }
 
 /* Sanity check of the entry */
-static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
+static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int level,
+                               unsigned int flags)
 {
-    /* Sanity check when modifying a page. */
+    /* Sanity check when modifying an entry. */
     if ( (flags & _PAGE_PRESENT) && mfn_eq(mfn, INVALID_MFN) )
     {
         /* We don't allow modifying an invalid entry. */
@@ -1072,6 +1073,13 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
             return false;
         }
 
+        /* We don't allow modifying a table entry */
+        if ( !lpae_is_mapping(entry, level) )
+        {
+            mm_printk("Modifying a table entry is not allowed.\n");
+            return false;
+        }
+
         /* We don't allow changing memory attributes. */
         if ( entry.pt.ai != PAGE_AI_MASK(flags) )
         {
@@ -1087,7 +1095,7 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
             return false;
         }
     }
-    /* Sanity check when inserting a page */
+    /* Sanity check when inserting a mapping */
     else if ( flags & _PAGE_PRESENT )
     {
         /* We should be here with a valid MFN. */
@@ -1096,18 +1104,28 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
         /* We don't allow replacing any valid entry. */
         if ( lpae_is_valid(entry) )
         {
-            mm_printk("Changing MFN for a valid entry is not allowed (%#"PRI_mfn" -> %#"PRI_mfn").\n",
-                      mfn_x(lpae_get_mfn(entry)), mfn_x(mfn));
+            if ( lpae_is_mapping(entry, level) )
+                mm_printk("Changing MFN for a valid entry is not allowed (%#"PRI_mfn" -> %#"PRI_mfn").\n",
+                          mfn_x(lpae_get_mfn(entry)), mfn_x(mfn));
+            else
+                mm_printk("Trying to replace a table with a mapping.\n");
             return false;
         }
     }
-    /* Sanity check when removing a page. */
+    /* Sanity check when removing a mapping. */
     else if ( (flags & (_PAGE_PRESENT|_PAGE_POPULATE)) == 0 )
     {
         /* We should be here with an invalid MFN. */
         ASSERT(mfn_eq(mfn, INVALID_MFN));
 
-        /* We don't allow removing page with contiguous bit set. */
+        /* We don't allow removing a table */
+        if ( lpae_is_table(entry, level) )
+        {
+            mm_printk("Removing a table is not allowed.\n");
+            return false;
+        }
+
+        /* We don't allow removing a mapping with contiguous bit set. */
         if ( entry.pt.contig )
         {
             mm_printk("Removing entry with contiguous bit set is not allowed.\n");
@@ -1125,13 +1143,13 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
     return true;
 }
 
+/* Update an entry at the level @target. */
 static int xen_pt_update_entry(mfn_t root, unsigned long virt,
-                               mfn_t mfn, unsigned int flags)
+                               mfn_t mfn, unsigned int target,
+                               unsigned int flags)
 {
     int rc;
     unsigned int level;
-    /* We only support 4KB mapping (i.e level 3) for now */
-    unsigned int target = 3;
     lpae_t *table;
     /*
      * The intermediate page tables are read-only when the MFN is not valid
@@ -1186,7 +1204,7 @@ static int xen_pt_update_entry(mfn_t root, unsigned long virt,
     entry = table + offsets[level];
 
     rc = -EINVAL;
-    if ( !xen_pt_check_entry(*entry, mfn, flags) )
+    if ( !xen_pt_check_entry(*entry, mfn, level, flags) )
         goto out;
 
     /* If we are only populating page-table, then we are done. */
@@ -1204,8 +1222,11 @@ static int xen_pt_update_entry(mfn_t root, unsigned long virt,
         {
             pte = mfn_to_xen_entry(mfn, PAGE_AI_MASK(flags));
 
-            /* Third level entries set pte.pt.table = 1 */
-            pte.pt.table = 1;
+            /*
+             * First and second level pages set pte.pt.table = 0, but
+             * third level entries set pte.pt.table = 1.
+             */
+            pte.pt.table = (level == 3);
         }
         else /* We are updating the permission => Copy the current pte. */
             pte = *entry;
@@ -1229,11 +1250,12 @@ static DEFINE_SPINLOCK(xen_pt_lock);
 
 static int xen_pt_update(unsigned long virt,
                          mfn_t mfn,
-                         unsigned long nr_mfns,
+                         const unsigned long nr_mfns,
                          unsigned int flags)
 {
     int rc = 0;
-    unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE;
+    unsigned long vfn = virt >> PAGE_SHIFT;
+    unsigned long left = nr_mfns;
 
     /*
      * For arm32, page-tables are different on each CPUs. Yet, they share
@@ -1265,14 +1287,49 @@ static int xen_pt_update(unsigned long virt,
 
     spin_lock(&xen_pt_lock);
 
-    for ( ; addr < addr_end; addr += PAGE_SIZE )
+    while ( left )
     {
-        rc = xen_pt_update_entry(root, addr, mfn, flags);
+        unsigned int order, level;
+        unsigned long mask;
+
+        /*
+         * Don't take into account the MFN when removing mapping (i.e
+         * MFN_INVALID) to calculate the correct target order.
+         *
+         * This loop relies on mfn, vfn, and nr_mfn to be all superpage
+         * aligned (mfn and vfn have to be architecturally), and it uses
+         * `mask` to check for that.
+         *
+         * XXX: Support superpage mappings if nr is not aligned to a
+         * superpage size.
+         */
+        mask = !mfn_eq(mfn, INVALID_MFN) ? mfn_x(mfn) : 0;
+        mask |= vfn | left;
+
+        /*
+         * Always use level 3 mapping unless the caller request block
+         * mapping.
+         */
+        if ( likely(!(flags & _PAGE_BLOCK)) )
+            level = 3;
+        else if ( !(mask & (BIT(FIRST_ORDER, UL) - 1)) )
+            level = 1;
+        else if ( !(mask & (BIT(SECOND_ORDER, UL) - 1)) )
+            level = 2;
+        else
+            level = 3;
+
+        order = LEVEL_ORDER(level);
+
+        rc = xen_pt_update_entry(root, pfn_to_paddr(vfn), mfn, level, flags);
         if ( rc )
             break;
 
+        vfn += 1U << order;
         if ( !mfn_eq(mfn, INVALID_MFN) )
-            mfn = mfn_add(mfn, 1);
+            mfn = mfn_add(mfn, 1U << order);
+
+        left -= (1U << order);
     }
 
     /*
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 131507a51712..7052a87ec0fe 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -69,6 +69,7 @@
  * [3:4] Permission flags
  * [5]   Page present
  * [6]   Only populate page tables
+ * [7]   Superpage mappings is allowed
  */
 #define PAGE_AI_MASK(x) ((x) & 0x7U)
 
@@ -82,6 +83,9 @@
 #define _PAGE_PRESENT    (1U << 5)
 #define _PAGE_POPULATE   (1U << 6)
 
+#define _PAGE_BLOCK_BIT     7
+#define _PAGE_BLOCK         (1U << _PAGE_BLOCK_BIT)
+
 /*
  * _PAGE_DEVICE and _PAGE_NORMAL are convenience defines. They are not
  * meant to be used outside of this header.
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 05/15] xen/arm: mm: Avoid flushing the TLBs when mapping are inserted
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (3 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 04/15] xen/arm: mm: Allow other mapping size in xen_pt_update_entry() Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-11 22:51   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 06/15] xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings() Julien Grall
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Currently, the function xen_pt_update() will flush the TLBs even when
mappings are inserted. This is a bit wasteful because we don't
allow mapping replacement. Even if we did, the flush would need to
happen earlier because mapping replacement should use Break-Before-Make
when updating the entry.

A single call to xen_pt_update() can perform a single action. IOW, it
is not possible to, for instance, mix inserting and removing mappings.
Therefore, we can use `flags` to determine what action is performed.

This change will particularly help to limit the impact of switching
the boot-time mappings to use xen_pt_update().
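
In other words (illustration only), the decision at the end of
xen_pt_update() becomes:

    /*
     * Insertion: (flags & _PAGE_PRESENT) && !mfn_eq(mfn, INVALID_MFN)
     *     -> no valid entry can have been replaced, TLB flush skipped.
     * Removal, permission change or table population:
     *     !(flags & _PAGE_PRESENT) || mfn_eq(mfn, INVALID_MFN)
     *     -> flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns).
     */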

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

    Changes in v2:
        - New patch
---
 xen/arch/arm/mm.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 8ebb36899314..1fe52b3af722 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1101,7 +1101,13 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int level,
         /* We should be here with a valid MFN. */
         ASSERT(!mfn_eq(mfn, INVALID_MFN));
 
-        /* We don't allow replacing any valid entry. */
+        /*
+         * We don't allow replacing any valid entry.
+         *
+         * Note that the function xen_pt_update() relies on this
+         * assumption and will skip the TLB flush. The function will need
+         * to be updated if the check is relaxed.
+         */
         if ( lpae_is_valid(entry) )
         {
             if ( lpae_is_mapping(entry, level) )
@@ -1333,11 +1339,16 @@ static int xen_pt_update(unsigned long virt,
     }
 
     /*
-     * Flush the TLBs even in case of failure because we may have
+     * The TLBs flush can be safely skipped when a mapping is inserted
+     * as we don't allow mapping replacement (see xen_pt_check_entry()).
+     *
+     * For all the other cases, the TLBs will be flushed unconditionally
+     * even if the mappings have failed. This is because we may have
      * partially modified the PT. This will prevent any unexpected
      * behavior afterwards.
      */
-    flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns);
+    if ( !(flags & _PAGE_PRESENT) || mfn_eq(mfn, INVALID_MFN) )
+        flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns);
 
     spin_unlock(&xen_pt_lock);
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 06/15] xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (4 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 05/15] xen/arm: mm: Avoid flushing the TLBs when mapping are inserted Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-11 22:58   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen() Julien Grall
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Julien Grall <julien.grall@arm.com>

Now that xen_pt_update_entry() is able to deal with different mapping
sizes, we can replace the open-coded page-table update with a call
to modify_xen_mappings().

As the function is not meant to fail, a BUG_ON() is added to check the
return value.
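
For clarity, a note on the flag semantics used here (assuming
modify_xen_mappings() forwards its flags to xen_pt_update() with an
invalid MFN, as in the current code):

    /*
     * _PAGE_PRESENT is not passed, so the call removes the entries
     * rather than changing their permissions; _PAGE_BLOCK allows the
     * 2MB entries backing the boot FDT slot to be torn down whole.
     */
    rc = modify_xen_mappings(BOOT_FDT_VIRT_START, BOOT_FDT_VIRT_END,
                             _PAGE_BLOCK);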

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - Stay consistent with how function name are used in the commit
        message
        - Add my AWS signed-off-by
---
 xen/arch/arm/mm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 1fe52b3af722..2cbfbe25240e 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -598,11 +598,11 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
 
 void __init remove_early_mappings(void)
 {
-    lpae_t pte = {0};
-    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START), pte);
-    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START + SZ_2M),
-              pte);
-    flush_xen_tlb_range_va(BOOT_FDT_VIRT_START, BOOT_FDT_SLOT_SIZE);
+    int rc;
+
+    rc = modify_xen_mappings(BOOT_FDT_VIRT_START, BOOT_FDT_VIRT_END,
+                             _PAGE_BLOCK);
+    BUG_ON(rc);
 }
 
 /*
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (5 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 06/15] xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings() Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-12 21:41   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it Julien Grall
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Julien Grall <julien.grall@arm.com>

Now that map_pages_to_xen() has been extended to support 2MB mappings,
we can replace the create_mappings() calls with map_pages_to_xen() calls.

The mapping can also be marked read-only as Xen has no business
modifying the host Device Tree.
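
As a worked illustration (hypothetical FDT address; base_paddr is assumed
to be the 2MB-aligned address below fdt_paddr, as in the existing code),
the slot arithmetic goes:

    /*
     * fdt_paddr  = 0x8a012000 (example)
     * base_paddr = fdt_paddr & SECOND_MASK = 0x8a000000 -> mapped RO at
     *              BOOT_FDT_VIRT_START with one 2MB block
     * offset     = fdt_paddr % SECOND_SIZE = 0x12000
     * fdt_virt   = BOOT_FDT_VIRT_START + offset
     * If offset plus the FDT total size crosses 2MB, a second read-only
     * 2MB block is mapped at BOOT_FDT_VIRT_START + SZ_2M.
     */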

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - Add my AWS signed-off-by
        - Fix typo in the commit message
---
 xen/arch/arm/mm.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 2cbfbe25240e..8fac24d80086 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -558,6 +558,7 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
     paddr_t offset;
     void *fdt_virt;
     uint32_t size;
+    int rc;
 
     /*
      * Check whether the physical FDT address is set and meets the minimum
@@ -573,8 +574,12 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
     /* The FDT is mapped using 2MB superpage */
     BUILD_BUG_ON(BOOT_FDT_VIRT_START % SZ_2M);
 
-    create_mappings(xen_second, BOOT_FDT_VIRT_START, paddr_to_pfn(base_paddr),
-                    SZ_2M >> PAGE_SHIFT, SZ_2M);
+    rc = map_pages_to_xen(BOOT_FDT_VIRT_START, maddr_to_mfn(base_paddr),
+                          SZ_2M >> PAGE_SHIFT,
+                          PAGE_HYPERVISOR_RO | _PAGE_BLOCK);
+    if ( rc )
+        panic("Unable to map the device-tree.\n");
+
 
     offset = fdt_paddr % SECOND_SIZE;
     fdt_virt = (void *)BOOT_FDT_VIRT_START + offset;
@@ -588,9 +593,12 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
 
     if ( (offset + size) > SZ_2M )
     {
-        create_mappings(xen_second, BOOT_FDT_VIRT_START + SZ_2M,
-                        paddr_to_pfn(base_paddr + SZ_2M),
-                        SZ_2M >> PAGE_SHIFT, SZ_2M);
+        rc = map_pages_to_xen(BOOT_FDT_VIRT_START + SZ_2M,
+                              maddr_to_mfn(base_paddr + SZ_2M),
+                              SZ_2M >> PAGE_SHIFT,
+                              PAGE_HYPERVISOR_RO | _PAGE_BLOCK);
+        if ( rc )
+            panic("Unable to map the device-tree\n");
     }
 
     return fdt_virt;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (6 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen() Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-12 22:00   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen() Julien Grall
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Only the first 2GB of the virtual address space is shared between all
the page-tables on Arm32.

There is a long-outstanding TODO in xen_pt_update() stating that the
function can only work with shared mappings. Nobody has ever called
the function with a private mapping; however, as we add more callers
there is a risk of messing things up.

Introduce a new define to mark the end of the shared mappings and use
it in xen_pt_update() to verify that the address is within the shared
area.

Note that on Arm64 all the mappings are shared. Some compilers may
complain about an always-true check, so the new define is not introduced
for arm64 and the code is protected with an #ifdef.
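
A small worked example (illustration only) of the new arm32 check, using
the SHARED_VIRT_END value added below:

    /*
     * SHARED_VIRT_END = 0x7fffffff: the first 2GB is common to every
     * arm32 page-table.  A hypothetical call with virt = 0x80000000
     * (DOMHEAP_VIRT_START) now fails the "virt > SHARED_VIRT_END" test
     * and xen_pt_update() returns -EINVAL instead of silently touching
     * a per-CPU private region.
     */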

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/arch/arm/mm.c            | 11 +++++++++--
 xen/include/asm-arm/config.h |  4 ++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 8fac24d80086..5c17cafff847 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1275,11 +1275,18 @@ static int xen_pt_update(unsigned long virt,
      * For arm32, page-tables are different on each CPUs. Yet, they share
      * some common mappings. It is assumed that only common mappings
      * will be modified with this function.
-     *
-     * XXX: Add a check.
      */
     const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
 
+#ifdef SHARED_VIRT_END
+    if ( virt > SHARED_VIRT_END ||
+         (SHARED_VIRT_END - virt) < nr_mfns )
+    {
+        mm_printk("Trying to map outside of the shared area.\n");
+        return -EINVAL;
+    }
+#endif
+
     /*
      * The hardware was configured to forbid mapping both writeable and
      * executable.
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index c7b77912013e..85d4a510ce8a 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -137,6 +137,10 @@
 
 #define XENHEAP_VIRT_START     _AT(vaddr_t,0x40000000)
 #define XENHEAP_VIRT_END       _AT(vaddr_t,0x7fffffff)
+
+/* The first 2GB is always shared between all the page-tables. */
+#define SHARED_VIRT_END        _AT(vaddr_t, 0x7fffffff)
+
 #define DOMHEAP_VIRT_START     _AT(vaddr_t,0x80000000)
 #define DOMHEAP_VIRT_END       _AT(vaddr_t,0xffffffff)
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (7 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-12 22:07   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap Julien Grall
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

Now that map_pages_to_xen() has been extended to support 2MB mappings,
we can replace the create_mappings() call with a map_pages_to_xen() call.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch

    TODOs:
        - add support for contiguous mapping
---
 xen/arch/arm/mm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 5c17cafff847..19ecf73542ce 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -806,7 +806,12 @@ void mmu_init_secondary_cpu(void)
 void __init setup_xenheap_mappings(unsigned long base_mfn,
                                    unsigned long nr_mfns)
 {
-    create_mappings(xen_second, XENHEAP_VIRT_START, base_mfn, nr_mfns, MB(32));
+    int rc;
+
+    rc = map_pages_to_xen(XENHEAP_VIRT_START, base_mfn, nr_mfns,
+                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
+    if ( rc )
+        panic("Unable to setup the xenheap mappings.\n");
 
     /* Record where the xenheap is, for translation routines. */
     xenheap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (8 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen() Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-12 22:44   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 11/15] xen/arm: mm: Allow page-table allocation from the boot allocator Julien Grall
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

xen_{un,}map_table() already use helpers to map/unmap pages on demand
(note this is currently a NOP on arm64), so switching to the domheap
doesn't have any disadvantage.

But this has the benefits of:
    - keeping the page tables unmapped if an arch decides to do so
    - reducing xenheap usage on arm32, where the xenheap can be pretty small

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/arch/arm/mm.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 19ecf73542ce..ae5a07ea956b 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -969,21 +969,6 @@ void *ioremap(paddr_t pa, size_t len)
     return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
 }
 
-static int create_xen_table(lpae_t *entry)
-{
-    void *p;
-    lpae_t pte;
-
-    p = alloc_xenheap_page();
-    if ( p == NULL )
-        return -ENOMEM;
-    clear_page(p);
-    pte = mfn_to_xen_entry(virt_to_mfn(p), MT_NORMAL);
-    pte.pt.table = 1;
-    write_pte(entry, pte);
-    return 0;
-}
-
 static lpae_t *xen_map_table(mfn_t mfn)
 {
     /*
@@ -1024,6 +1009,27 @@ static void xen_unmap_table(const lpae_t *table)
     unmap_domain_page(table);
 }
 
+static int create_xen_table(lpae_t *entry)
+{
+    struct page_info *pg;
+    void *p;
+    lpae_t pte;
+
+    pg = alloc_domheap_page(NULL, 0);
+    if ( pg == NULL )
+        return -ENOMEM;
+
+    p = xen_map_table(page_to_mfn(pg));
+    clear_page(p);
+    xen_unmap_table(p);
+
+    pte = mfn_to_xen_entry(page_to_mfn(pg), MT_NORMAL);
+    pte.pt.table = 1;
+    write_pte(entry, pte);
+
+    return 0;
+}
+
 #define XEN_TABLE_MAP_FAILED 0
 #define XEN_TABLE_SUPER_PAGE 1
 #define XEN_TABLE_NORMAL_PAGE 2
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 11/15] xen/arm: mm: Allow page-table allocation from the boot allocator
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (9 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-13 22:58   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure Julien Grall
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Julien Grall <julien.grall@arm.com>

At the moment, page-tables can only be allocated from the domheap. This
means it is not possible to create a mapping in the page-tables via
map_pages_to_xen() if a page-table needs to be allocated.

In order to avoid open-coding page-table updates in early boot, we need
to be able to allocate page-tables much earlier. Thankfully, we have the
boot allocator for those cases.

create_xen_table() is updated to cater for early-boot allocation by using
alloc_boot_pages().

Note that this is not sufficient to bootstrap the page-tables (i.e.
mapping before any memory is actually mapped). This will be addressed
separately.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/arch/arm/mm.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index ae5a07ea956b..d090fdfd5994 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1011,19 +1011,27 @@ static void xen_unmap_table(const lpae_t *table)
 
 static int create_xen_table(lpae_t *entry)
 {
-    struct page_info *pg;
+    mfn_t mfn;
     void *p;
     lpae_t pte;
 
-    pg = alloc_domheap_page(NULL, 0);
-    if ( pg == NULL )
-        return -ENOMEM;
+    if ( system_state != SYS_STATE_early_boot )
+    {
+        struct page_info *pg = alloc_domheap_page(NULL, 0);
+
+        if ( pg == NULL )
+            return -ENOMEM;
+
+        mfn = page_to_mfn(pg);
+    }
+    else
+        mfn = alloc_boot_pages(1, 1);
 
-    p = xen_map_table(page_to_mfn(pg));
+    p = xen_map_table(mfn);
     clear_page(p);
     xen_unmap_table(p);
 
-    pte = mfn_to_xen_entry(page_to_mfn(pg), MT_NORMAL);
+    pte = mfn_to_xen_entry(mfn, MT_NORMAL);
     pte.pt.table = 1;
     write_pte(entry, pte);
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (10 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 11/15] xen/arm: mm: Allow page-table allocation from the boot allocator Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-04-26  9:41   ` Xia, Hongyan
                     ` (2 more replies)
  2021-04-25 20:13 ` [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table() Julien Grall
                   ` (2 subsequent siblings)
  14 siblings, 3 replies; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis, Wei Liu,
	Stefano Stabellini, Julien Grall, Volodymyr Babchuk, Hongyan Xia,
	Julien Grall, Jan Beulich, Wei Liu, Andrew Cooper,
	Roger Pau Monné,
	Hongian Xia

From: Wei Liu <wei.liu2@citrix.com>

The basic idea is like Persistent Kernel Map (PKMAP) in Linux. We
pre-populate all the relevant page tables before the system is fully
set up.

We will need it on Arm in order to rework the arm64 version of
setup_xenheap_mappings() as we may need to use pages allocated from
the boot allocator before they are effectively mapped.

This infrastructure is not lock-protected, therefore it can only be used
before smpboot. After smpboot, map_domain_page() has to be used.

This is based on the x86 version [1] that was originally implemented
by Wei Liu.

Take the opportunity to switch the parameter attr from unsigned to
unsigned int.

[1] <e92da4ad6015b6089737fcccba3ec1d6424649a5.1588278317.git.hongyxia@amazon.com>
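
A minimal usage sketch (not part of the patch, but mirroring how the
later patches use it): temporarily map a freshly allocated page-table
page while only the boot allocator is available. This is only valid
before SYS_STATE_smp_boot:

    mfn_t mfn = alloc_boot_pages(1, 1);  /* early-boot page-table page   */
    lpae_t *table = pmap_map(mfn);       /* grab a free fixmap PMAP slot */
    clear_page(table);                   /* initialise it while mapped   */
    pmap_unmap(table);                   /* release the slot and flush   */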

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
[julien: Adapted for Arm]
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch

Cc: Jan Beulich <jbeulich@suse.com>
Cc: Wei Liu <wl@xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Hongian Xia <hongyax@amazon.com>

This is mostly a copy of the PMAP infrastructure currently discussed
on x86. The only difference is how the page-tables are updated.

I think we want to consider to provide a common infrastructure. But
I haven't done it yet to gather feedback on the overall series
first.
---
 xen/arch/arm/Makefile        |   1 +
 xen/arch/arm/mm.c            |   7 +--
 xen/arch/arm/pmap.c          | 101 +++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/config.h |   2 +
 xen/include/asm-arm/lpae.h   |   8 +++
 xen/include/asm-arm/pmap.h   |  10 ++++
 6 files changed, 123 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/arm/pmap.c
 create mode 100644 xen/include/asm-arm/pmap.h

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index ca75f1040dcc..2196ada24941 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -40,6 +40,7 @@ obj-y += percpu.o
 obj-y += platform.o
 obj-y += platform_hypercall.o
 obj-y += physdev.o
+obj-y += pmap.init.o
 obj-y += processor.o
 obj-y += psci.o
 obj-y += setup.o
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d090fdfd5994..5e713b599611 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -288,12 +288,7 @@ void dump_hyp_walk(vaddr_t addr)
     dump_pt_walk(ttbr, addr, HYP_PT_ROOT_LEVEL, 1);
 }
 
-/*
- * Standard entry type that we'll use to build Xen's own pagetables.
- * We put the same permissions at every level, because they're ignored
- * by the walker in non-leaf entries.
- */
-static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned attr)
+lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned int attr)
 {
     lpae_t e = (lpae_t) {
         .pt = {
diff --git a/xen/arch/arm/pmap.c b/xen/arch/arm/pmap.c
new file mode 100644
index 000000000000..702b1bde982d
--- /dev/null
+++ b/xen/arch/arm/pmap.c
@@ -0,0 +1,101 @@
+#include <xen/init.h>
+#include <xen/mm.h>
+
+#include <asm/bitops.h>
+#include <asm/flushtlb.h>
+#include <asm/pmap.h>
+
+/*
+ * To be able to use FIXMAP_PMAP_BEGIN.
+ * XXX: move fixmap definition in a separate header
+ */
+#include <xen/acpi.h>
+
+/*
+ * Simple mapping infrastructure to map / unmap pages in fixed map.
+ * This is used to set up the page table for mapcache, which is used
+ * by map domain page infrastructure.
+ *
+ * This structure is not protected by any locks, so it must not be used after
+ * smp bring-up.
+ */
+
+/* Bitmap to track which slot is used */
+static unsigned long __initdata inuse;
+
+/* XXX: Find an header to declare it */
+extern lpae_t xen_fixmap[LPAE_ENTRIES];
+
+void *__init pmap_map(mfn_t mfn)
+{
+    unsigned long flags;
+    unsigned int idx;
+    vaddr_t linear;
+    unsigned int slot;
+    lpae_t *entry, pte;
+
+    BUILD_BUG_ON(sizeof(inuse) * BITS_PER_LONG < NUM_FIX_PMAP);
+
+    ASSERT(system_state < SYS_STATE_smp_boot);
+
+    local_irq_save(flags);
+
+    idx = find_first_zero_bit(&inuse, NUM_FIX_PMAP);
+    if ( idx == NUM_FIX_PMAP )
+        panic("Out of PMAP slots\n");
+
+    __set_bit(idx, &inuse);
+
+    slot = idx + FIXMAP_PMAP_BEGIN;
+    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
+
+    linear = FIXMAP_ADDR(slot);
+    /*
+     * We cannot use set_fixmap() here. We use PMAP when there is no direct map,
+     * so map_pages_to_xen() called by set_fixmap() needs to map pages on
+     * demand, which then calls pmap() again, resulting in a loop. Modify the
+     * PTEs directly instead. The same is true for pmap_unmap().
+     */
+    entry = &xen_fixmap[third_table_offset(linear)];
+
+    ASSERT(!lpae_is_valid(*entry));
+
+    pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
+    pte.pt.table = 1;
+    write_pte(entry, pte);
+
+    local_irq_restore(flags);
+
+    return (void *)linear;
+}
+
+void __init pmap_unmap(const void *p)
+{
+    unsigned long flags;
+    unsigned int idx;
+    lpae_t *entry;
+    lpae_t pte = { 0 };
+    unsigned int slot = third_table_offset((vaddr_t)p);
+
+    ASSERT(system_state < SYS_STATE_smp_boot);
+    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
+
+    idx = slot - FIXMAP_PMAP_BEGIN;
+    local_irq_save(flags);
+
+    __clear_bit(idx, &inuse);
+    entry = &xen_fixmap[third_table_offset((vaddr_t)p)];
+    write_pte(entry, pte);
+    flush_xen_tlb_range_va_local((vaddr_t)p, PAGE_SIZE);
+
+    local_irq_restore(flags);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index 85d4a510ce8a..35050855b6e1 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -180,6 +180,8 @@
 #define FIXMAP_MISC     1  /* Ephemeral mappings of hardware */
 #define FIXMAP_ACPI_BEGIN  2  /* Start mappings of ACPI tables */
 #define FIXMAP_ACPI_END    (FIXMAP_ACPI_BEGIN + NUM_FIXMAP_ACPI_PAGES - 1)  /* End mappings of ACPI tables */
+#define FIXMAP_PMAP_BEGIN (FIXMAP_ACPI_END + 1)
+#define FIXMAP_PMAP_END (FIXMAP_PMAP_BEGIN + NUM_FIX_PMAP - 1)
 
 #define NR_hypercalls 64
 
diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
index 310f5225e056..81fd482ab2ce 100644
--- a/xen/include/asm-arm/lpae.h
+++ b/xen/include/asm-arm/lpae.h
@@ -4,6 +4,7 @@
 #ifndef __ASSEMBLY__
 
 #include <xen/page-defs.h>
+#include <xen/mm-frame.h>
 
 /*
  * WARNING!  Unlike the x86 pagetable code, where l1 is the lowest level and
@@ -168,6 +169,13 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
         third_table_offset(addr)            \
     }
 
+/*
+ * Standard entry type that we'll use to build Xen's own pagetables.
+ * We put the same permissions at every level, because they're ignored
+ * by the walker in non-leaf entries.
+ */
+lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned int attr);
+
 #endif /* __ASSEMBLY__ */
 
 /*
diff --git a/xen/include/asm-arm/pmap.h b/xen/include/asm-arm/pmap.h
new file mode 100644
index 000000000000..8e1dce93f8e4
--- /dev/null
+++ b/xen/include/asm-arm/pmap.h
@@ -0,0 +1,10 @@
+#ifndef __ASM_PMAP_H__
+#define __ASM_PMAP_H__
+
+/* Large enough for mapping 5 levels of page tables with some headroom */
+#define NUM_FIX_PMAP 8
+
+void *pmap_map(mfn_t mfn);
+void pmap_unmap(const void *p);
+
+#endif	/* __ASM_PMAP_H__ */
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (11 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-14 23:35   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings() Julien Grall
  2021-04-25 20:13 ` [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen() Julien Grall
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk

From: Julien Grall <jgrall@amazon.com>

During early boot, it is not possible to use xen_{,un}map_table()
if the page tables are not residing in the Xen binary.

This is a blocker to switch some of the helpers to use xen_pt_update()
as we may need to allocate extra page tables and access them before
the domheap has been initialized (see setup_xenheap_mappings()).

xen_{,un}map_table() are now updated to use the PMAP helpers for early
boot map/unmap. Note that the special case for page-tables residing
in the Xen binary has been dropped because it is "complex" and was
only added as a workaround in 8d4f1b8878e0 ("xen/arm: mm: Allow
generic xen page-tables helpers to be called early").

Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch
---
 xen/arch/arm/mm.c | 33 +++++++++------------------------
 1 file changed, 9 insertions(+), 24 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 5e713b599611..f5768f2d4a81 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -41,6 +41,7 @@
 #include <xen/sizes.h>
 #include <xen/libfdt/libfdt.h>
 
+#include <asm/pmap.h>
 #include <asm/setup.h>
 
 /* Override macros from asm/page.h to make them work with mfn_t */
@@ -967,27 +968,11 @@ void *ioremap(paddr_t pa, size_t len)
 static lpae_t *xen_map_table(mfn_t mfn)
 {
     /*
-     * We may require to map the page table before map_domain_page() is
-     * useable. The requirements here is it must be useable as soon as
-     * page-tables are allocated dynamically via alloc_boot_pages().
-     *
-     * We need to do the check on physical address rather than virtual
-     * address to avoid truncation on Arm32. Therefore is_kernel() cannot
-     * be used.
+     * During early boot, map_domain_page() may be unusable. Use the
+     * PMAP to map temporarily a page-table.
      */
     if ( system_state == SYS_STATE_early_boot )
-    {
-        if ( is_xen_fixed_mfn(mfn) )
-        {
-            /*
-             * It is fine to demote the type because the size of Xen
-             * will always fit in vaddr_t.
-             */
-            vaddr_t offset = mfn_to_maddr(mfn) - virt_to_maddr(&_start);
-
-            return (lpae_t *)(XEN_VIRT_START + offset);
-        }
-    }
+        return pmap_map(mfn);
 
     return map_domain_page(mfn);
 }
@@ -996,12 +981,12 @@ static void xen_unmap_table(const lpae_t *table)
 {
     /*
      * During early boot, xen_map_table() will not use map_domain_page()
-     * for page-tables residing in Xen binary. So skip the unmap part.
+     * but the PMAP.
      */
-    if ( system_state == SYS_STATE_early_boot && is_kernel(table) )
-        return;
-
-    unmap_domain_page(table);
+    if ( system_state == SYS_STATE_early_boot )
+        pmap_unmap(table);
+    else
+        unmap_domain_page(table);
 }
 
 static int create_xen_table(lpae_t *entry)
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (12 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table() Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-14 23:51   ` Stefano Stabellini
  2021-04-25 20:13 ` [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen() Julien Grall
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Julien Grall <julien.grall@arm.com>

A few issues have been reported with setup_xenheap_mappings() over the
last couple of years. The main ones are:
    - It will break on platforms supporting more than 512GB of RAM
      because the memory allocated by the boot allocator is not yet
      mapped.
    - Aligning all the regions to 1GB may lead to unexpected results
      because we may alias non-cacheable regions (such as device or
      reserved regions).

map_pages_to_xen() was recently reworked to allow superpage mappings and
deal with the use of page-tables before they are mapped.

Most of the code in setup_xenheap_mappings() is now replaced with a
single call to map_pages_to_xen().

This also requires re-ordering the steps in setup_mm() so the regions
are given to the boot allocator first, and the xenheap mappings are set
up afterwards.

Note that the 1GB alignment is not yet removed.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch

    TODO:
        - Remove the 1GB alignment
        - Add support for setting the contiguous bit
---
 xen/arch/arm/mm.c    | 60 ++++----------------------------------------
 xen/arch/arm/setup.c | 10 ++++++--
 2 files changed, 13 insertions(+), 57 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index f5768f2d4a81..c49403b687f5 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -143,17 +143,6 @@ static DEFINE_PAGE_TABLE(cpu0_pgtable);
 static DEFINE_PAGE_TABLES(cpu0_dommap, DOMHEAP_SECOND_PAGES);
 #endif
 
-#ifdef CONFIG_ARM_64
-/* The first page of the first level mapping of the xenheap. The
- * subsequent xenheap first level pages are dynamically allocated, but
- * we need this one to bootstrap ourselves. */
-static DEFINE_PAGE_TABLE(xenheap_first_first);
-/* The zeroeth level slot which uses xenheap_first_first. Used because
- * setup_xenheap_mappings otherwise relies on mfn_to_virt which isn't
- * valid for a non-xenheap mapping. */
-static __initdata int xenheap_first_first_slot = -1;
-#endif
-
 /* Common pagetable leaves */
 /* Second level page tables.
  *
@@ -818,9 +807,9 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
 void __init setup_xenheap_mappings(unsigned long base_mfn,
                                    unsigned long nr_mfns)
 {
-    lpae_t *first, pte;
     unsigned long mfn, end_mfn;
     vaddr_t vaddr;
+    int rc;
 
     /* Align to previous 1GB boundary */
     mfn = base_mfn & ~((FIRST_SIZE>>PAGE_SHIFT)-1);
@@ -846,49 +835,10 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
      */
     vaddr = (vaddr_t)__mfn_to_virt(base_mfn) & FIRST_MASK;
 
-    while ( mfn < end_mfn )
-    {
-        int slot = zeroeth_table_offset(vaddr);
-        lpae_t *p = &xen_pgtable[slot];
-
-        if ( p->pt.valid )
-        {
-            /* mfn_to_virt is not valid on the 1st 1st mfn, since it
-             * is not within the xenheap. */
-            first = slot == xenheap_first_first_slot ?
-                xenheap_first_first : mfn_to_virt(lpae_get_mfn(*p));
-        }
-        else if ( xenheap_first_first_slot == -1)
-        {
-            /* Use xenheap_first_first to bootstrap the mappings */
-            first = xenheap_first_first;
-
-            pte = pte_of_xenaddr((vaddr_t)xenheap_first_first);
-            pte.pt.table = 1;
-            write_pte(p, pte);
-
-            xenheap_first_first_slot = slot;
-        }
-        else
-        {
-            mfn_t first_mfn = alloc_boot_pages(1, 1);
-
-            clear_page(mfn_to_virt(first_mfn));
-            pte = mfn_to_xen_entry(first_mfn, MT_NORMAL);
-            pte.pt.table = 1;
-            write_pte(p, pte);
-            first = mfn_to_virt(first_mfn);
-        }
-
-        pte = mfn_to_xen_entry(_mfn(mfn), MT_NORMAL);
-        /* TODO: Set pte.pt.contig when appropriate. */
-        write_pte(&first[first_table_offset(vaddr)], pte);
-
-        mfn += FIRST_SIZE>>PAGE_SHIFT;
-        vaddr += FIRST_SIZE;
-    }
-
-    flush_xen_tlb_local();
+    rc = map_pages_to_xen(vaddr, _mfn(mfn), end_mfn - mfn,
+                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
+    if ( rc )
+        panic("Unable to setup the xenheap mappings.\n");
 }
 #endif
 
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 00aad1c194b9..0993a4bb52d4 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -761,8 +761,11 @@ static void __init setup_mm(void)
         ram_start = min(ram_start,bank_start);
         ram_end = max(ram_end,bank_end);
 
-        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
-
+        /*
+         * Add the region to the boot allocator first, so we can use
+         * some to allocate page-tables for setting up the xenheap
+         * mappings.
+         */
         s = bank_start;
         while ( s < bank_end )
         {
@@ -781,6 +784,9 @@ static void __init setup_mm(void)
             fw_unreserved_regions(s, e, init_boot_pages, 0);
             s = n;
         }
+
+        setup_xenheap_mappings(bank_start >> PAGE_SHIFT,
+                               bank_size >> PAGE_SHIFT);
     }
 
     total_pages += ram_size >> PAGE_SHIFT;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen()
  2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
                   ` (13 preceding siblings ...)
  2021-04-25 20:13 ` [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings() Julien Grall
@ 2021-04-25 20:13 ` Julien Grall
  2021-05-15  0:02   ` Stefano Stabellini
  14 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-04-25 20:13 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Julien Grall,
	Volodymyr Babchuk, Julien Grall

From: Julien Grall <julien.grall@arm.com>

Now that map_pages_to_xen() has been extended to support 2MB mappings,
we can replace the create_mappings() call with a map_pages_to_xen() call.

This has the advantage of removing the difference between the 32-bit and
64-bit code.

Lastly, remove create_mappings() as there are no more callers.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>

---
    Changes in v2:
        - New patch

    TODO:
        - Add support for setting the contiguous bit
---
 xen/arch/arm/mm.c | 64 +++++------------------------------------------
 1 file changed, 6 insertions(+), 58 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index c49403b687f5..5f8ae029dd6d 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -359,40 +359,6 @@ void clear_fixmap(unsigned map)
     BUG_ON(res != 0);
 }
 
-/* Create Xen's mappings of memory.
- * Mapping_size must be either 2MB or 32MB.
- * Base and virt must be mapping_size aligned.
- * Size must be a multiple of mapping_size.
- * second must be a contiguous set of second level page tables
- * covering the region starting at virt_offset. */
-static void __init create_mappings(lpae_t *second,
-                                   unsigned long virt_offset,
-                                   unsigned long base_mfn,
-                                   unsigned long nr_mfns,
-                                   unsigned int mapping_size)
-{
-    unsigned long i, count;
-    const unsigned long granularity = mapping_size >> PAGE_SHIFT;
-    lpae_t pte, *p;
-
-    ASSERT((mapping_size == MB(2)) || (mapping_size == MB(32)));
-    ASSERT(!((virt_offset >> PAGE_SHIFT) % granularity));
-    ASSERT(!(base_mfn % granularity));
-    ASSERT(!(nr_mfns % granularity));
-
-    count = nr_mfns / LPAE_ENTRIES;
-    p = second + second_linear_offset(virt_offset);
-    pte = mfn_to_xen_entry(_mfn(base_mfn), MT_NORMAL);
-    if ( granularity == 16 * LPAE_ENTRIES )
-        pte.pt.contig = 1;  /* These maps are in 16-entry contiguous chunks. */
-    for ( i = 0; i < count; i++ )
-    {
-        write_pte(p + i, pte);
-        pte.pt.base += 1 << LPAE_SHIFT;
-    }
-    flush_xen_tlb_local();
-}
-
 #ifdef CONFIG_DOMAIN_PAGE
 void *map_domain_page_global(mfn_t mfn)
 {
@@ -850,36 +816,18 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
     unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
     mfn_t base_mfn;
     const unsigned long mapping_size = frametable_size < MB(32) ? MB(2) : MB(32);
-#ifdef CONFIG_ARM_64
-    lpae_t *second, pte;
-    unsigned long nr_second;
-    mfn_t second_base;
-    int i;
-#endif
+    int rc;
 
     frametable_base_pdx = mfn_to_pdx(maddr_to_mfn(ps));
     /* Round up to 2M or 32M boundary, as appropriate. */
     frametable_size = ROUNDUP(frametable_size, mapping_size);
     base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 32<<(20-12));
 
-#ifdef CONFIG_ARM_64
-    /* Compute the number of second level pages. */
-    nr_second = ROUNDUP(frametable_size, FIRST_SIZE) >> FIRST_SHIFT;
-    second_base = alloc_boot_pages(nr_second, 1);
-    second = mfn_to_virt(second_base);
-    for ( i = 0; i < nr_second; i++ )
-    {
-        clear_page(mfn_to_virt(mfn_add(second_base, i)));
-        pte = mfn_to_xen_entry(mfn_add(second_base, i), MT_NORMAL);
-        pte.pt.table = 1;
-        write_pte(&xen_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);
-    }
-    create_mappings(second, 0, mfn_x(base_mfn), frametable_size >> PAGE_SHIFT,
-                    mapping_size);
-#else
-    create_mappings(xen_second, FRAMETABLE_VIRT_START, mfn_x(base_mfn),
-                    frametable_size >> PAGE_SHIFT, mapping_size);
-#endif
+    /* XXX: Handle contiguous bit */
+    rc = map_pages_to_xen(FRAMETABLE_VIRT_START, base_mfn,
+                          frametable_size >> PAGE_SHIFT, PAGE_HYPERVISOR_RW);
+    if ( rc )
+        panic("Unable to setup the frametable mappings.\n");
 
     memset(&frame_table[0], 0, nr_pdxs * sizeof(struct page_info));
     memset(&frame_table[nr_pdxs], -1,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure
  2021-04-25 20:13 ` [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure Julien Grall
@ 2021-04-26  9:41   ` Xia, Hongyan
  2021-07-03 17:57     ` Julien Grall
  2021-04-28 21:47   ` Wei Liu
  2021-05-14 23:25   ` Stefano Stabellini
  2 siblings, 1 reply; 59+ messages in thread
From: Xia, Hongyan @ 2021-04-26  9:41 UTC (permalink / raw)
  To: julien, xen-devel
  Cc: Penny.Zheng, Bertrand.Marquis, Wei.Chen, sstabellini, wei.liu2,
	andrew.cooper3, roger.pau, Volodymyr_Babchuk, hongyax,
	Henry.Wang, wl, Grall, Julien, jbeulich

On Sun, 2021-04-25 at 21:13 +0100, Julien Grall wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> The basic idea is like Persistent Kernel Map (PKMAP) in Linux. We
> pre-populate all the relevant page tables before the system is fully
> set up.
> 
> We will need it on Arm in order to rework the arm64 version of
> setup_xenheap_mappings() as we may need to use pages allocated from
> the boot allocator before they are effectively mapped.
> 
> This infrastructure is not lock-protected therefore can only be used
> before smpboot. After smpboot, map_domain_page() has to be used.
> 
> This is based on the x86 version [1] that was originally implemented
> by Wei Liu.
> 
> Take the opportunity to switch the parameter attr from unsigned to
> unsigned int.
> 
> [1] <
> e92da4ad6015b6089737fcccba3ec1d6424649a5.1588278317.git.hongyxia@amazon.com
> >
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> [julien: Adapted for Arm]
> Signed-off-by: Julien Grall <jgrall@amazon.com>

[...]

> diff --git a/xen/arch/arm/pmap.c b/xen/arch/arm/pmap.c
> new file mode 100644
> index 000000000000..702b1bde982d
> --- /dev/null
> +++ b/xen/arch/arm/pmap.c
> @@ -0,0 +1,101 @@
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +
> +#include <asm/bitops.h>
> +#include <asm/flushtlb.h>
> +#include <asm/pmap.h>
> +
> +/*
> + * To be able to use FIXMAP_PMAP_BEGIN.
> + * XXX: move fixmap definition in a separate header
> + */
> +#include <xen/acpi.h>
> +
> +/*
> + * Simple mapping infrastructure to map / unmap pages in fixed map.
> + * This is used to set up the page table for mapcache, which is used
> + * by map domain page infrastructure.
> + *
> + * This structure is not protected by any locks, so it must not be
> used after
> + * smp bring-up.
> + */
> +
> +/* Bitmap to track which slot is used */
> +static unsigned long __initdata inuse;
> +
> +/* XXX: Find an header to declare it */
> +extern lpae_t xen_fixmap[LPAE_ENTRIES];
> +
> +void *__init pmap_map(mfn_t mfn)
> +{
> +    unsigned long flags;
> +    unsigned int idx;
> +    vaddr_t linear;
> +    unsigned int slot;
> +    lpae_t *entry, pte;
> +
> +    BUILD_BUG_ON(sizeof(inuse) * BITS_PER_LONG < NUM_FIX_PMAP);

This seems wrong to me: sizeof() returns a number of bytes, so the check
should multiply by something like BITS_PER_BYTE rather than BITS_PER_LONG
to get the number of bits the bitmap can hold. As written, the check ends
up being too permissive.

I noticed this line was already present in the x86 version this was based
on, so probably my fault :( The x86 version needs to be fixed as well.
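
FWIW, the fix I have in mind is simply the following (assuming we have,
or add, a BITS_PER_BYTE constant; a plain 8 would do otherwise):

    /* The 'inuse' bitmap must be able to track all NUM_FIX_PMAP slots. */
    BUILD_BUG_ON(sizeof(inuse) * BITS_PER_BYTE < NUM_FIX_PMAP);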

> +
> +    ASSERT(system_state < SYS_STATE_smp_boot);
> +
> +    local_irq_save(flags);
> +
> +    idx = find_first_zero_bit(&inuse, NUM_FIX_PMAP);
> +    if ( idx == NUM_FIX_PMAP )
> +        panic("Out of PMAP slots\n");
> +
> +    __set_bit(idx, &inuse);
> +
> +    slot = idx + FIXMAP_PMAP_BEGIN;
> +    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
> +

From here...

> +    linear = FIXMAP_ADDR(slot);
> +    /*
> +     * We cannot use set_fixmap() here. We use PMAP when there is no
> direct map,
> +     * so map_pages_to_xen() called by set_fixmap() needs to map
> pages on
> +     * demand, which then calls pmap() again, resulting in a loop.
> Modify the
> +     * PTEs directly instead. The same is true for pmap_unmap().
> +     */
> +    entry = &xen_fixmap[third_table_offset(linear)];
> +
> +    ASSERT(!lpae_is_valid(*entry));
> +
> +    pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
> +    pte.pt.table = 1;
> +    write_pte(entry, pte);
> +

...to here, I wonder if we can move this chunk into arch code (something
like void *arch_write_pmap_slot(slot)). Such an arch function would hide
how the fixmap is handled and how the page-table entry is written, and
the rest could just be common code.
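
To make it more concrete, here is a rough sketch of what the Arm side
could look like (the name and exact signature are only placeholders, and
the mfn obviously needs to be passed as well):

    /* Arch hook: install @mfn in fixmap slot @slot and return its VA. */
    void *arch_write_pmap_slot(unsigned int slot, mfn_t mfn)
    {
        vaddr_t linear = FIXMAP_ADDR(slot);
        lpae_t *entry = &xen_fixmap[third_table_offset(linear)];
        lpae_t pte;

        ASSERT(!lpae_is_valid(*entry));

        /* Write the PTE directly; set_fixmap() cannot be used here. */
        pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
        pte.pt.table = 1;
        write_pte(entry, pte);

        return (void *)linear;
    }

The common pmap_map() would then only deal with the slot allocation and
the IRQ save/restore, and call this hook.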

> +    local_irq_restore(flags);
> +
> +    return (void *)linear;
> +}
> +
> +void __init pmap_unmap(const void *p)
> +{
> +    unsigned long flags;
> +    unsigned int idx;
> +    lpae_t *entry;
> +    lpae_t pte = { 0 };
> +    unsigned int slot = third_table_offset((vaddr_t)p);
> +
> +    ASSERT(system_state < SYS_STATE_smp_boot);
> +    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
> +
> +    idx = slot - FIXMAP_PMAP_BEGIN;
> +    local_irq_save(flags);
> +
> +    __clear_bit(idx, &inuse);
> +    entry = &xen_fixmap[third_table_offset((vaddr_t)p)];
> +    write_pte(entry, pte);
> +    flush_xen_tlb_range_va_local((vaddr_t)p, PAGE_SIZE);

And the same for the above: something like arch_clear_pmap(void *), with
the rest moved into common code.

From a quick glance, I don't think x86 and Arm share any useful TLB
flush helpers, so the TLB flush should probably be behind the arch hook
as well.
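
Roughly (again just a sketch, with a made-up name):

    /* Arch hook: tear down the fixmap entry backing @p. */
    void arch_clear_pmap(void *p)
    {
        lpae_t pte = { 0 };
        lpae_t *entry = &xen_fixmap[third_table_offset((vaddr_t)p)];

        write_pte(entry, pte);
        flush_xen_tlb_range_va_local((vaddr_t)p, PAGE_SIZE);
    }

with the common pmap_unmap() keeping the slot bookkeeping and the IRQ
save/restore around it.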

> +
> +    local_irq_restore(flags);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */

[...]

> diff --git a/xen/include/asm-arm/pmap.h b/xen/include/asm-arm/pmap.h
> new file mode 100644
> index 000000000000..8e1dce93f8e4
> --- /dev/null
> +++ b/xen/include/asm-arm/pmap.h
> @@ -0,0 +1,10 @@
> +#ifndef __ASM_PMAP_H__
> +#define __ARM_PMAP_H__

This line doesn't seem to match the #ifndef, but if the functions are
moved to common, this header can be moved to common as well.
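
In any case the two lines should presumably agree, i.e. something like
(guessing at the declarations, and the guard name would change again if
the header moves to common):

    #ifndef __ASM_PMAP_H__
    #define __ASM_PMAP_H__

    void *pmap_map(mfn_t mfn);
    void pmap_unmap(const void *p);

    #endif /* __ASM_PMAP_H__ */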

Hongyan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure
  2021-04-25 20:13 ` [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure Julien Grall
  2021-04-26  9:41   ` Xia, Hongyan
@ 2021-04-28 21:47   ` Wei Liu
  2021-05-14 23:25   ` Stefano Stabellini
  2 siblings, 0 replies; 59+ messages in thread
From: Wei Liu @ 2021-04-28 21:47 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Wei Liu, Stefano Stabellini, Volodymyr Babchuk, Hongyan Xia,
	Julien Grall, Jan Beulich, Wei Liu, Andrew Cooper,
	Roger Pau Monné,
	Hongian Xia

On Sun, Apr 25, 2021 at 09:13:15PM +0100, Julien Grall wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> The basic idea is like Persistent Kernel Map (PKMAP) in Linux. We
> pre-populate all the relevant page tables before the system is fully
> set up.
> 
> We will need it on Arm in order to rework the arm64 version of
> setup_xenheap_mappings() as we may need to use pages allocated from
> the boot allocator before they are effectively mapped.
> 
> This infrastructure is not lock-protected therefore can only be used
> before smpboot. After smpboot, map_domain_page() has to be used.
> 
> This is based on the x86 version [1] that was originally implemented
> by Wei Liu.
> 
> Take the opportunity to switch the parameter attr from unsigned to
> unsigned int.
> 
> [1] <e92da4ad6015b6089737fcccba3ec1d6424649a5.1588278317.git.hongyxia@amazon.com>
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> [julien: Adapted for Arm]
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - New patch
> 
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Hongian Xia <hongyax@amazon.com>
> 
> This is mostly a copy of the PMAP infrastructure currently discussed
> on x86. The only difference is how the page-tables are updated.
> 
> I think we want to consider to provide a common infrastructure. But
> I haven't done it yet to gather feedback on the overall series
> first.

+1. This infrastructure should be common code with arch hooks where
appropriate.
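
Something along these lines is what I have in mind (arch_pmap_map() is a
made-up name, purely to illustrate the split):

    /* common/pmap.c: slot management stays common. */
    void *__init pmap_map(mfn_t mfn)
    {
        unsigned long flags;
        unsigned int idx;
        void *va;

        ASSERT(system_state < SYS_STATE_smp_boot);

        local_irq_save(flags);

        idx = find_first_zero_bit(&inuse, NUM_FIX_PMAP);
        if ( idx == NUM_FIX_PMAP )
            panic("Out of PMAP slots\n");

        __set_bit(idx, &inuse);

        /* Only the actual page-table write is arch-specific. */
        va = arch_pmap_map(idx, mfn);

        local_irq_restore(flags);

        return va;
    }

with a matching arch hook on the unmap side.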

Wei.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS
  2021-04-25 20:13 ` [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS Julien Grall
@ 2021-05-11 22:12   ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-11 22:12 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Commit 05031fa87357 "xen/arm: guest_walk: Only generate necessary
> offsets/masks" introduced LPAE_ENTRIES_MASK_GS. In a follow-up patch,
> we will use it for to define LPAE_ENTRY_MASK.
> 
> This will lead to inconsistent naming. As LPAE_ENTRY_MASK is used in
> many places, it is better to rename LPAE_ENTRIES_MASK_GS and avoid
> some churn.
> 
> So rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/include/asm-arm/lpae.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index e94de2e7d8e8..4fb9a40a4ca9 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -180,7 +180,7 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>   */
>  #define LPAE_SHIFT_GS(gs)         ((gs) - 3)
>  #define LPAE_ENTRIES_GS(gs)       (_AC(1, U) << LPAE_SHIFT_GS(gs))
> -#define LPAE_ENTRIES_MASK_GS(gs)  (LPAE_ENTRIES_GS(gs) - 1)
> +#define LPAE_ENTRY_MASK_GS(gs)  (LPAE_ENTRIES_GS(gs) - 1)
>  
>  #define LEVEL_ORDER_GS(gs, lvl)   ((3 - (lvl)) * LPAE_SHIFT_GS(gs))
>  #define LEVEL_SHIFT_GS(gs, lvl)   (LEVEL_ORDER_GS(gs, lvl) + (gs))
> @@ -188,7 +188,7 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>  
>  /* Offset in the table at level 'lvl' */
>  #define LPAE_TABLE_INDEX_GS(gs, lvl, addr)   \
> -    (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRIES_MASK_GS(gs))
> +    (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRY_MASK_GS(gs))
>  
>  /* Generate an array @var containing the offset for each level from @addr */
>  #define DECLARE_OFFSETS(var, addr)          \
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-04-25 20:13 ` [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers Julien Grall
@ 2021-05-11 22:26   ` Stefano Stabellini
  2021-05-12 14:26     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-11 22:26 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Currently, Xen PT helpers are only working with 4KB page granularity
> and open-code the generic helpers. To allow more flexibility, we can
> re-use the generic helpers and pass Xen's page granularity
> (PAGE_SHIFT).
> 
> As Xen PT helpers are used in both C and assembly, we need to move
> the generic helpers definition outside of the !__ASSEMBLY__ section.
> 
> Note the aliases for each level are still kept for the time being so we
> can avoid a massive patch to change all the callers.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

The patch is OK as is. I have a couple of suggestions for improvement
below. If you feel like making them, good, otherwise I am also OK if you
don't want to change anything.


> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/include/asm-arm/lpae.h | 71 +++++++++++++++++++++-----------------
>  1 file changed, 40 insertions(+), 31 deletions(-)
> 
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index 4fb9a40a4ca9..310f5225e056 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -159,6 +159,17 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>  #define lpae_get_mfn(pte)    (_mfn((pte).walk.base))
>  #define lpae_set_mfn(pte, mfn)  ((pte).walk.base = mfn_x(mfn))
>  
> +/* Generate an array @var containing the offset for each level from @addr */
> +#define DECLARE_OFFSETS(var, addr)          \
> +    const unsigned int var[4] = {           \
> +        zeroeth_table_offset(addr),         \
> +        first_table_offset(addr),           \
> +        second_table_offset(addr),          \
> +        third_table_offset(addr)            \
> +    }
> +
> +#endif /* __ASSEMBLY__ */
> +
>  /*
>   * AArch64 supports pages with different sizes (4K, 16K, and 64K).
>   * Provide a set of generic helpers that will compute various
> @@ -190,17 +201,6 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>  #define LPAE_TABLE_INDEX_GS(gs, lvl, addr)   \
>      (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRY_MASK_GS(gs))
>  
> -/* Generate an array @var containing the offset for each level from @addr */
> -#define DECLARE_OFFSETS(var, addr)          \
> -    const unsigned int var[4] = {           \
> -        zeroeth_table_offset(addr),         \
> -        first_table_offset(addr),           \
> -        second_table_offset(addr),          \
> -        third_table_offset(addr)            \
> -    }
> -
> -#endif /* __ASSEMBLY__ */
> -
>  /*
>   * These numbers add up to a 48-bit input address space.
>   *
> @@ -211,26 +211,35 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>   * therefore 39-bits are sufficient.
>   */
>  
> -#define LPAE_SHIFT      9
> -#define LPAE_ENTRIES    (_AC(1,U) << LPAE_SHIFT)
> -#define LPAE_ENTRY_MASK (LPAE_ENTRIES - 1)
> -
> -#define THIRD_SHIFT    (PAGE_SHIFT)
> -#define THIRD_ORDER    (THIRD_SHIFT - PAGE_SHIFT)
> -#define THIRD_SIZE     (_AT(paddr_t, 1) << THIRD_SHIFT)
> -#define THIRD_MASK     (~(THIRD_SIZE - 1))
> -#define SECOND_SHIFT   (THIRD_SHIFT + LPAE_SHIFT)
> -#define SECOND_ORDER   (SECOND_SHIFT - PAGE_SHIFT)
> -#define SECOND_SIZE    (_AT(paddr_t, 1) << SECOND_SHIFT)
> -#define SECOND_MASK    (~(SECOND_SIZE - 1))
> -#define FIRST_SHIFT    (SECOND_SHIFT + LPAE_SHIFT)
> -#define FIRST_ORDER    (FIRST_SHIFT - PAGE_SHIFT)
> -#define FIRST_SIZE     (_AT(paddr_t, 1) << FIRST_SHIFT)
> -#define FIRST_MASK     (~(FIRST_SIZE - 1))
> -#define ZEROETH_SHIFT  (FIRST_SHIFT + LPAE_SHIFT)
> -#define ZEROETH_ORDER  (ZEROETH_SHIFT - PAGE_SHIFT)
> -#define ZEROETH_SIZE   (_AT(paddr_t, 1) << ZEROETH_SHIFT)
> -#define ZEROETH_MASK   (~(ZEROETH_SIZE - 1))

Should we add a one-line in-code comment saying that the definitions
below are for 4KB pages? It is not immediately obvious any longer.


> +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
> +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
> +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>
> +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
> +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
> +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
> +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))

I would avoid adding these 4 macros. It would be OK if they were just
used within this file, but lpae.h is a header: they could end up being used
anywhere in the xen/ code and they have a very generic name. My
suggestion would be to skip them and just do:

#define THIRD_SHIFT         LEVEL_SHIFT_GS(PAGE_SHIFT, 3)

etc.


> +/* Convenience aliases */
> +#define THIRD_SHIFT         LEVEL_SHIFT(3)
> +#define THIRD_ORDER         LEVEL_ORDER(3)
> +#define THIRD_SIZE          LEVEL_SIZE(3)
> +#define THIRD_MASK          LEVEL_MASK(3)
> +
> +#define SECOND_SHIFT        LEVEL_SHIFT(2)
> +#define SECOND_ORDER        LEVEL_ORDER(2)
> +#define SECOND_SIZE         LEVEL_SIZE(2)
> +#define SECOND_MASK         LEVEL_MASK(2)
> +
> +#define FIRST_SHIFT         LEVEL_SHIFT(1)
> +#define FIRST_ORDER         LEVEL_ORDER(1)
> +#define FIRST_SIZE          LEVEL_SIZE(1)
> +#define FIRST_MASK          LEVEL_MASK(1)
> +
> +#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
> +#define ZEROETH_ORDER       LEVEL_ORDER(0)
> +#define ZEROETH_SIZE        LEVEL_SIZE(0)
> +#define ZEROETH_MASK        LEVEL_MASK(0)
>  
>  /* Calculate the offsets into the pagetables for a given VA */
>  #define zeroeth_linear_offset(va) ((va) >> ZEROETH_SHIFT)
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK}
  2021-04-25 20:13 ` [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK} Julien Grall
@ 2021-05-11 22:33   ` Stefano Stabellini
  2021-05-12 14:39     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-11 22:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> The array level_orders and level_masks can be replaced with the
> recently introduced macros LEVEL_ORDER and LEVEL_MASK.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

So you actually planned to use LEVEL_ORDER and LEVEL_MASK in the xen/
code. I take back the previous comment :-)

Is the 4KB size "hiding" (for the lack of a better word) done on purpose?

Let me rephrase. Are you trying to consolidate info about pages being
4KB in xen/include/asm-arm/lpae.h ?

In any case:

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/arch/arm/p2m.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index ac5031262061..1b04c3534439 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -36,12 +36,6 @@ static unsigned int __read_mostly max_vmid = MAX_VMID_8_BIT;
>   */
>  unsigned int __read_mostly p2m_ipa_bits = 64;
>  
> -/* Helpers to lookup the properties of each level */
> -static const paddr_t level_masks[] =
> -    { ZEROETH_MASK, FIRST_MASK, SECOND_MASK, THIRD_MASK };
> -static const uint8_t level_orders[] =
> -    { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER };
> -
>  static mfn_t __read_mostly empty_root_mfn;
>  
>  static uint64_t generate_vttbr(uint16_t vmid, mfn_t root_mfn)
> @@ -232,7 +226,7 @@ static lpae_t *p2m_get_root_pointer(struct p2m_domain *p2m,
>       * we can't use (P2M_ROOT_LEVEL - 1) because the root level might be
>       * 0. Yet we still want to check if all the unused bits are zeroed.
>       */
> -    root_table = gfn_x(gfn) >> (level_orders[P2M_ROOT_LEVEL] + LPAE_SHIFT);
> +    root_table = gfn_x(gfn) >> (LEVEL_ORDER(P2M_ROOT_LEVEL) + LPAE_SHIFT);
>      if ( root_table >= P2M_ROOT_PAGES )
>          return NULL;
>  
> @@ -378,7 +372,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>      if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
>      {
>          for ( level = P2M_ROOT_LEVEL; level < 3; level++ )
> -            if ( (gfn_x(gfn) & (level_masks[level] >> PAGE_SHIFT)) >
> +            if ( (gfn_x(gfn) & (LEVEL_MASK(level) >> PAGE_SHIFT)) >
>                   gfn_x(p2m->max_mapped_gfn) )
>                  break;
>  
> @@ -421,7 +415,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn,
>           * The entry may point to a superpage. Find the MFN associated
>           * to the GFN.
>           */
> -        mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << level_orders[level]) - 1));
> +        mfn = mfn_add(mfn, gfn_x(gfn) & ((1UL << LEVEL_ORDER(level)) - 1));
>  
>          if ( valid )
>              *valid = lpae_is_valid(entry);
> @@ -432,7 +426,7 @@ out_unmap:
>  
>  out:
>      if ( page_order )
> -        *page_order = level_orders[level];
> +        *page_order = LEVEL_ORDER(level);
>  
>      return mfn;
>  }
> @@ -806,7 +800,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, lpae_t *entry,
>      /* Convenience aliases */
>      mfn_t mfn = lpae_get_mfn(*entry);
>      unsigned int next_level = level + 1;
> -    unsigned int level_order = level_orders[next_level];
> +    unsigned int level_order = LEVEL_ORDER(next_level);
>  
>      /*
>       * This should only be called with target != level and the entry is
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 04/15] xen/arm: mm: Allow other mapping size in xen_pt_update_entry()
  2021-04-25 20:13 ` [PATCH RFCv2 04/15] xen/arm: mm: Allow other mapping size in xen_pt_update_entry() Julien Grall
@ 2021-05-11 22:42   ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-11 22:42 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> At the moment, xen_pt_update_entry() only supports mapping at level 3
> (i.e. 4KB mappings). While this is fine for most of the runtime helpers,
> the boot code will require the use of superpage mappings.
> 
> We don't want to allow superpage mapping by default as some of the
> callers may expect small mappings (i.e populate_pt_range()) or even
> expect to unmap only a part of a superpage.
> 
> To keep the code simple, a new flag _PAGE_BLOCK is introduced to
> allow the caller to enable superpage mapping.
> 
> As the code doesn't support all the combinations, xen_pt_check_entry()
> is extended to take into account the cases we don't support when
> using block mapping:
>     - Replacing a table with a mapping. This may happen if a region was
>     first mapped with 4KB mappings and then later on replaced with a 2MB
>     (or 1GB) mapping.
>     - Removing/modifying a table. This may happen if a caller tries to
>     remove a region with _PAGE_BLOCK set when it was created without it.
> 
> Note that the current restrictions mean that the caller must ensure that
> _PAGE_BLOCK is consistently set/cleared across all the updates on a
> given virtual region. This ought to be fine with the expected use-cases.
> 
> More rework will be necessary if we wanted to remove the restrictions.
> 
> Note that nr_mfns is now marked const as it is used for flushing the
> TLBs and we don't want it to be modified.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> 
>     Changes in v2:
>         - Pass the target level rather than the order to
>         xen_pt_update_entry()
>         - Update some comments
>         - Open-code paddr_to_pfn()
>         - Add my AWS signed-off-by
> ---
>  xen/arch/arm/mm.c          | 93 ++++++++++++++++++++++++++++++--------
>  xen/include/asm-arm/page.h |  4 ++
>  2 files changed, 79 insertions(+), 18 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 59f8a3f15fd1..8ebb36899314 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1060,9 +1060,10 @@ static int xen_pt_next_level(bool read_only, unsigned int level,
>  }
>  
>  /* Sanity check of the entry */
> -static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
> +static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int level,
> +                               unsigned int flags)
>  {
> -    /* Sanity check when modifying a page. */
> +    /* Sanity check when modifying an entry. */
>      if ( (flags & _PAGE_PRESENT) && mfn_eq(mfn, INVALID_MFN) )
>      {
>          /* We don't allow modifying an invalid entry. */
> @@ -1072,6 +1073,13 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
>              return false;
>          }
>  
> +        /* We don't allow modifying a table entry */
> +        if ( !lpae_is_mapping(entry, level) )
> +        {
> +            mm_printk("Modifying a table entry is not allowed.\n");
> +            return false;
> +        }
> +
>          /* We don't allow changing memory attributes. */
>          if ( entry.pt.ai != PAGE_AI_MASK(flags) )
>          {
> @@ -1087,7 +1095,7 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
>              return false;
>          }
>      }
> -    /* Sanity check when inserting a page */
> +    /* Sanity check when inserting a mapping */
>      else if ( flags & _PAGE_PRESENT )
>      {
>          /* We should be here with a valid MFN. */
> @@ -1096,18 +1104,28 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
>          /* We don't allow replacing any valid entry. */
>          if ( lpae_is_valid(entry) )
>          {
> -            mm_printk("Changing MFN for a valid entry is not allowed (%#"PRI_mfn" -> %#"PRI_mfn").\n",
> -                      mfn_x(lpae_get_mfn(entry)), mfn_x(mfn));
> +            if ( lpae_is_mapping(entry, level) )
> +                mm_printk("Changing MFN for a valid entry is not allowed (%#"PRI_mfn" -> %#"PRI_mfn").\n",
> +                          mfn_x(lpae_get_mfn(entry)), mfn_x(mfn));
> +            else
> +                mm_printk("Trying to replace a table with a mapping.\n");
>              return false;
>          }
>      }
> -    /* Sanity check when removing a page. */
> +    /* Sanity check when removing a mapping. */
>      else if ( (flags & (_PAGE_PRESENT|_PAGE_POPULATE)) == 0 )
>      {
>          /* We should be here with an invalid MFN. */
>          ASSERT(mfn_eq(mfn, INVALID_MFN));
>  
> -        /* We don't allow removing page with contiguous bit set. */
> +        /* We don't allow removing a table */
> +        if ( lpae_is_table(entry, level) )
> +        {
> +            mm_printk("Removing a table is not allowed.\n");
> +            return false;
> +        }
> +
> +        /* We don't allow removing a mapping with contiguous bit set. */
>          if ( entry.pt.contig )
>          {
>              mm_printk("Removing entry with contiguous bit set is not allowed.\n");
> @@ -1125,13 +1143,13 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
>      return true;
>  }
>  
> +/* Update an entry at the level @target. */
>  static int xen_pt_update_entry(mfn_t root, unsigned long virt,
> -                               mfn_t mfn, unsigned int flags)
> +                               mfn_t mfn, unsigned int target,
> +                               unsigned int flags)
>  {
>      int rc;
>      unsigned int level;
> -    /* We only support 4KB mapping (i.e level 3) for now */
> -    unsigned int target = 3;
>      lpae_t *table;
>      /*
>       * The intermediate page tables are read-only when the MFN is not valid
> @@ -1186,7 +1204,7 @@ static int xen_pt_update_entry(mfn_t root, unsigned long virt,
>      entry = table + offsets[level];
>  
>      rc = -EINVAL;
> -    if ( !xen_pt_check_entry(*entry, mfn, flags) )
> +    if ( !xen_pt_check_entry(*entry, mfn, level, flags) )
>          goto out;
>  
>      /* If we are only populating page-table, then we are done. */
> @@ -1204,8 +1222,11 @@ static int xen_pt_update_entry(mfn_t root, unsigned long virt,
>          {
>              pte = mfn_to_xen_entry(mfn, PAGE_AI_MASK(flags));
>  
> -            /* Third level entries set pte.pt.table = 1 */
> -            pte.pt.table = 1;
> +            /*
> +             * First and second level pages set pte.pt.table = 0, but
> +             * third level entries set pte.pt.table = 1.
> +             */
> +            pte.pt.table = (level == 3);
>          }
>          else /* We are updating the permission => Copy the current pte. */
>              pte = *entry;
> @@ -1229,11 +1250,12 @@ static DEFINE_SPINLOCK(xen_pt_lock);
>  
>  static int xen_pt_update(unsigned long virt,
>                           mfn_t mfn,
> -                         unsigned long nr_mfns,
> +                         const unsigned long nr_mfns,
>                           unsigned int flags)
>  {
>      int rc = 0;
> -    unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE;
> +    unsigned long vfn = virt >> PAGE_SHIFT;
> +    unsigned long left = nr_mfns;
>  
>      /*
>       * For arm32, page-tables are different on each CPUs. Yet, they share
> @@ -1265,14 +1287,49 @@ static int xen_pt_update(unsigned long virt,
>  
>      spin_lock(&xen_pt_lock);
>  
> -    for ( ; addr < addr_end; addr += PAGE_SIZE )
> +    while ( left )
>      {
> -        rc = xen_pt_update_entry(root, addr, mfn, flags);
> +        unsigned int order, level;
> +        unsigned long mask;
> +
> +        /*
> +         * Don't take into account the MFN when removing mapping (i.e
> +         * MFN_INVALID) to calculate the correct target order.
> +         *
> +         * This loop relies on mfn, vfn, and nr_mfn to be all superpage
> +         * aligned (mfn and vfn have to be architecturally), and it uses
> +         * `mask` to check for that.
> +         *
> +         * XXX: Support superpage mappings if nr is not aligned to a
> +         * superpage size.
> +         */
> +        mask = !mfn_eq(mfn, INVALID_MFN) ? mfn_x(mfn) : 0;
> +        mask |= vfn | left;
> +
> +        /*
> +         * Always use level 3 mapping unless the caller request block
> +         * mapping.
> +         */
> +        if ( likely(!(flags & _PAGE_BLOCK)) )
> +            level = 3;
> +        else if ( !(mask & (BIT(FIRST_ORDER, UL) - 1)) )
> +            level = 1;
> +        else if ( !(mask & (BIT(SECOND_ORDER, UL) - 1)) )
> +            level = 2;
> +        else
> +            level = 3;
> +
> +        order = LEVEL_ORDER(level);
> +
> +        rc = xen_pt_update_entry(root, pfn_to_paddr(vfn), mfn, level, flags);
>          if ( rc )
>              break;
>  
> +        vfn += 1U << order;
>          if ( !mfn_eq(mfn, INVALID_MFN) )
> -            mfn = mfn_add(mfn, 1);
> +            mfn = mfn_add(mfn, 1U << order);
> +
> +        left -= (1U << order);
>      }
>  
>      /*
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 131507a51712..7052a87ec0fe 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -69,6 +69,7 @@
>   * [3:4] Permission flags
>   * [5]   Page present
>   * [6]   Only populate page tables
> + * [7]   Superpage mappings is allowed
>   */
>  #define PAGE_AI_MASK(x) ((x) & 0x7U)
>  
> @@ -82,6 +83,9 @@
>  #define _PAGE_PRESENT    (1U << 5)
>  #define _PAGE_POPULATE   (1U << 6)
>  
> +#define _PAGE_BLOCK_BIT     7
> +#define _PAGE_BLOCK         (1U << _PAGE_BLOCK_BIT)
> +
>  /*
>   * _PAGE_DEVICE and _PAGE_NORMAL are convenience defines. They are not
>   * meant to be used outside of this header.
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 05/15] xen/arm: mm: Avoid flushing the TLBs when mapping are inserted
  2021-04-25 20:13 ` [PATCH RFCv2 05/15] xen/arm: mm: Avoid flushing the TLBs when mapping are inserted Julien Grall
@ 2021-05-11 22:51   ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-11 22:51 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Currently, the function xen_pt_update() will flush the TLBs even when
> the mappings are inserted. This is a bit wasteful because we don't
> allow mapping replacement. Even if we were, the flush would need to
> happen earlier because mapping replacement should use Break-Before-Make
> when updating the entry.
> 
> A single call to xen_pt_update() can perform a single action. IOW, it
> is not possible to, for instance, mix inserting and removing mappings.
> Therefore, we can use `flags` to determine what action is performed.
> 
> This change will particularly help to limit the impact of switching
> boot-time mappings to use xen_pt_update().
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Nice improvement!

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> 
>     Changes in v2:
>         - New patch
> ---
>  xen/arch/arm/mm.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 8ebb36899314..1fe52b3af722 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1101,7 +1101,13 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int level,
>          /* We should be here with a valid MFN. */
>          ASSERT(!mfn_eq(mfn, INVALID_MFN));
>  
> -        /* We don't allow replacing any valid entry. */
> +        /*
> +         * We don't allow replacing any valid entry.
> +         *
> +         * Note that the function xen_pt_update() relies on this
> +         * assumption and will skip the TLB flush. The function will need
> +         * to be updated if the check is relaxed.
> +         */
>          if ( lpae_is_valid(entry) )
>          {
>              if ( lpae_is_mapping(entry, level) )
> @@ -1333,11 +1339,16 @@ static int xen_pt_update(unsigned long virt,
>      }
>  
>      /*
> -     * Flush the TLBs even in case of failure because we may have
> +     * The TLBs flush can be safely skipped when a mapping is inserted
> +     * as we don't allow mapping replacement (see xen_pt_check_entry()).
> +     *
> +     * For all the other cases, the TLBs will be flushed unconditionally
> +     * even if the mapping are failed. This is because we may have
>       * partially modified the PT. This will prevent any unexpected
>       * behavior afterwards.
>       */
> -    flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns);
> +    if ( !(flags & _PAGE_PRESENT) || mfn_eq(mfn, INVALID_MFN) )
> +        flush_xen_tlb_range_va(virt, PAGE_SIZE * nr_mfns);
>  
>      spin_unlock(&xen_pt_lock);
>  
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 06/15] xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings()
  2021-04-25 20:13 ` [PATCH RFCv2 06/15] xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings() Julien Grall
@ 2021-05-11 22:58   ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-11 22:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> Now that xen_pt_update_entry() is able to deal with different mapping
> sizes, we can replace the open-coding of the page-table update with a call
> to modify_xen_mappings().
> 
> As the function is not meant to fail, a BUG_ON() is added to check the
> return.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v2:
>         - Stay consistent with how function name are used in the commit
>         message
>         - Add my AWS signed-off-by
> ---
>  xen/arch/arm/mm.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 1fe52b3af722..2cbfbe25240e 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -598,11 +598,11 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
>  
>  void __init remove_early_mappings(void)
>  {
> -    lpae_t pte = {0};
> -    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START), pte);
> -    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START + SZ_2M),
> -              pte);
> -    flush_xen_tlb_range_va(BOOT_FDT_VIRT_START, BOOT_FDT_SLOT_SIZE);
> +    int rc;
> +
> +    rc = modify_xen_mappings(BOOT_FDT_VIRT_START, BOOT_FDT_VIRT_END,
> +                             _PAGE_BLOCK);
> +    BUG_ON(rc);
>  }
>  
>  /*
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-05-11 22:26   ` Stefano Stabellini
@ 2021-05-12 14:26     ` Julien Grall
  2021-05-12 21:30       ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-12 14:26 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 11/05/2021 23:26, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Currently, Xen PT helpers are only working with 4KB page granularity
>> and open-code the generic helpers. To allow more flexibility, we can
>> re-use the generic helpers and pass Xen's page granularity
>> (PAGE_SHIFT).
>>
>> As Xen PT helpers are used in both C and assembly, we need to move
>> the generic helpers definition outside of the !__ASSEMBLY__ section.
>>
>> Note the aliases for each level are still kept for the time being so we
>> can avoid a massive patch to change all the callers.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> The patch is OK as is. I have a couple of suggestions for improvement
> below. If you feel like making them, good, otherwise I am also OK if you
> don't want to change anything.
> 
> 
>> ---
>>      Changes in v2:
>>          - New patch
>> ---
>>   xen/include/asm-arm/lpae.h | 71 +++++++++++++++++++++-----------------
>>   1 file changed, 40 insertions(+), 31 deletions(-)
>>
>> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
>> index 4fb9a40a4ca9..310f5225e056 100644
>> --- a/xen/include/asm-arm/lpae.h
>> +++ b/xen/include/asm-arm/lpae.h
>> @@ -159,6 +159,17 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>>   #define lpae_get_mfn(pte)    (_mfn((pte).walk.base))
>>   #define lpae_set_mfn(pte, mfn)  ((pte).walk.base = mfn_x(mfn))
>>   
>> +/* Generate an array @var containing the offset for each level from @addr */
>> +#define DECLARE_OFFSETS(var, addr)          \
>> +    const unsigned int var[4] = {           \
>> +        zeroeth_table_offset(addr),         \
>> +        first_table_offset(addr),           \
>> +        second_table_offset(addr),          \
>> +        third_table_offset(addr)            \
>> +    }
>> +
>> +#endif /* __ASSEMBLY__ */
>> +
>>   /*
>>    * AArch64 supports pages with different sizes (4K, 16K, and 64K).
>>    * Provide a set of generic helpers that will compute various
>> @@ -190,17 +201,6 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>>   #define LPAE_TABLE_INDEX_GS(gs, lvl, addr)   \
>>       (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRY_MASK_GS(gs))
>>   
>> -/* Generate an array @var containing the offset for each level from @addr */
>> -#define DECLARE_OFFSETS(var, addr)          \
>> -    const unsigned int var[4] = {           \
>> -        zeroeth_table_offset(addr),         \
>> -        first_table_offset(addr),           \
>> -        second_table_offset(addr),          \
>> -        third_table_offset(addr)            \
>> -    }
>> -
>> -#endif /* __ASSEMBLY__ */
>> -
>>   /*
>>    * These numbers add up to a 48-bit input address space.
>>    *
>> @@ -211,26 +211,35 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>>    * therefore 39-bits are sufficient.
>>    */
>>   
>> -#define LPAE_SHIFT      9
>> -#define LPAE_ENTRIES    (_AC(1,U) << LPAE_SHIFT)
>> -#define LPAE_ENTRY_MASK (LPAE_ENTRIES - 1)
>> -
>> -#define THIRD_SHIFT    (PAGE_SHIFT)
>> -#define THIRD_ORDER    (THIRD_SHIFT - PAGE_SHIFT)
>> -#define THIRD_SIZE     (_AT(paddr_t, 1) << THIRD_SHIFT)
>> -#define THIRD_MASK     (~(THIRD_SIZE - 1))
>> -#define SECOND_SHIFT   (THIRD_SHIFT + LPAE_SHIFT)
>> -#define SECOND_ORDER   (SECOND_SHIFT - PAGE_SHIFT)
>> -#define SECOND_SIZE    (_AT(paddr_t, 1) << SECOND_SHIFT)
>> -#define SECOND_MASK    (~(SECOND_SIZE - 1))
>> -#define FIRST_SHIFT    (SECOND_SHIFT + LPAE_SHIFT)
>> -#define FIRST_ORDER    (FIRST_SHIFT - PAGE_SHIFT)
>> -#define FIRST_SIZE     (_AT(paddr_t, 1) << FIRST_SHIFT)
>> -#define FIRST_MASK     (~(FIRST_SIZE - 1))
>> -#define ZEROETH_SHIFT  (FIRST_SHIFT + LPAE_SHIFT)
>> -#define ZEROETH_ORDER  (ZEROETH_SHIFT - PAGE_SHIFT)
>> -#define ZEROETH_SIZE   (_AT(paddr_t, 1) << ZEROETH_SHIFT)
>> -#define ZEROETH_MASK   (~(ZEROETH_SIZE - 1))
> 
> Should we add a one-line in-code comment saying that the definitions
> below are for 4KB pages? It is not immediately obvious any longer.

Because they are not meant to be specific to 4KB pages; they are meant
to be defined in terms of Xen's page size.

Today, it is always 4KB, but I would like the Xen code to not rely on that.

I can clarify it in an in-code comment.
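
Something like:

    /*
     * The following helpers are defined in terms of Xen's page
     * granularity (PAGE_SHIFT). Today this is always 4KB, but the code
     * using them should not rely on that.
     */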

>> +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
>> +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
>> +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>>
>> +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
>> +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
>> +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
>> +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
> 
> I would avoid adding these 4 macros. It would be OK if they were just
> used within this file, but lpae.h is a header: they could end up being used
> anywhere in the xen/ code and they have a very generic name. My
> suggestion would be to skip them and just do:

Those macros will be used in follow-up patches. They are pretty useful
to avoid introducing static arrays with the different information for
each level.

Would prefixing them with XEN_ be better?
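
e.g.:

    #define XEN_LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
    #define XEN_LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
    #define XEN_LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
    #define XEN_LEVEL_MASK(lvl)     (~(XEN_LEVEL_SIZE(lvl) - 1))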

> #define THIRD_SHIFT         LEVEL_SHIFT_GS(PAGE_SHIFT, 3)
> 
> etc.
> 
> 
>> +/* Convenience aliases */
>> +#define THIRD_SHIFT         LEVEL_SHIFT(3)
>> +#define THIRD_ORDER         LEVEL_ORDER(3)
>> +#define THIRD_SIZE          LEVEL_SIZE(3)
>> +#define THIRD_MASK          LEVEL_MASK(3)
>> +
>> +#define SECOND_SHIFT        LEVEL_SHIFT(2)
>> +#define SECOND_ORDER        LEVEL_ORDER(2)
>> +#define SECOND_SIZE         LEVEL_SIZE(2)
>> +#define SECOND_MASK         LEVEL_MASK(2)
>> +
>> +#define FIRST_SHIFT         LEVEL_SHIFT(1)
>> +#define FIRST_ORDER         LEVEL_ORDER(1)
>> +#define FIRST_SIZE          LEVEL_SIZE(1)
>> +#define FIRST_MASK          LEVEL_MASK(1)
>> +
>> +#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
>> +#define ZEROETH_ORDER       LEVEL_ORDER(0)
>> +#define ZEROETH_SIZE        LEVEL_SIZE(0)
>> +#define ZEROETH_MASK        LEVEL_MASK(0)
>>   
>>   /* Calculate the offsets into the pagetables for a given VA */
>>   #define zeroeth_linear_offset(va) ((va) >> ZEROETH_SHIFT)
>> -- 
>> 2.17.1
>>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK}
  2021-05-11 22:33   ` Stefano Stabellini
@ 2021-05-12 14:39     ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-05-12 14:39 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 11/05/2021 23:33, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> The array level_orders and level_masks can be replaced with the
>> recently introduced macros LEVEL_ORDER and LEVEL_MASK.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> So you actually planned to use LEVEL_ORDER and LEVEL_MASK in the xen/
> code. I take back the previous comment :-)
> 
> Is the 4KB size "hiding" (for the lack of a better word) done on purpose?
> 
> Let me rephrase. Are you trying to consolidate info about pages being
> 4KB in xen/include/asm-arm/lpae.h ?

THIRD_ORDER, SECOND_ORDER... are already not very 4KB specific :). In
this case, what I am trying to do is completely remove the static
arrays so they don't need to be global (or duplicated) when adding
superpage support for Xen PT (see a follow-up patch).

This also has the added benefit of replacing a couple of loads with a
few instructions working on immediates.
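
For instance,

    *page_order = LEVEL_ORDER(level);

now boils down to '(3 - level) * LPAE_SHIFT', which the compiler can
compute from the value of 'level' already held in a register, instead of
loading level_orders[level] from memory.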

> 
> In any case:
> 
> Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you!

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-05-12 14:26     ` Julien Grall
@ 2021-05-12 21:30       ` Stefano Stabellini
  2021-05-12 22:16         ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-12 21:30 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk

On Wed, 12 May 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/05/2021 23:26, Stefano Stabellini wrote:
> > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > From: Julien Grall <jgrall@amazon.com>
> > > 
> > > Currently, Xen PT helpers are only working with 4KB page granularity
> > > and open-code the generic helpers. To allow more flexibility, we can
> > > re-use the generic helpers and pass Xen's page granularity
> > > (PAGE_SHIFT).
> > > 
> > > As Xen PT helpers are used in both C and assembly, we need to move
> > > the generic helpers definition outside of the !__ASSEMBLY__ section.
> > > 
> > > Note the aliases for each level are still kept for the time being so we
> > > can avoid a massive patch to change all the callers.
> > > 
> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > 
> > The patch is OK as is. I have a couple of suggestions for improvement
> > below. If you feel like making them, good, otherwise I am also OK if you
> > don't want to change anything.
> > 
> > 
> > > ---
> > >      Changes in v2:
> > >          - New patch
> > > ---
> > >   xen/include/asm-arm/lpae.h | 71 +++++++++++++++++++++-----------------
> > >   1 file changed, 40 insertions(+), 31 deletions(-)
> > > 
> > > diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> > > index 4fb9a40a4ca9..310f5225e056 100644
> > > --- a/xen/include/asm-arm/lpae.h
> > > +++ b/xen/include/asm-arm/lpae.h
> > > @@ -159,6 +159,17 @@ static inline bool lpae_is_superpage(lpae_t pte,
> > > unsigned int level)
> > >   #define lpae_get_mfn(pte)    (_mfn((pte).walk.base))
> > >   #define lpae_set_mfn(pte, mfn)  ((pte).walk.base = mfn_x(mfn))
> > >   +/* Generate an array @var containing the offset for each level from
> > > @addr */
> > > +#define DECLARE_OFFSETS(var, addr)          \
> > > +    const unsigned int var[4] = {           \
> > > +        zeroeth_table_offset(addr),         \
> > > +        first_table_offset(addr),           \
> > > +        second_table_offset(addr),          \
> > > +        third_table_offset(addr)            \
> > > +    }
> > > +
> > > +#endif /* __ASSEMBLY__ */
> > > +
> > >   /*
> > >    * AArch64 supports pages with different sizes (4K, 16K, and 64K).
> > >    * Provide a set of generic helpers that will compute various
> > > @@ -190,17 +201,6 @@ static inline bool lpae_is_superpage(lpae_t pte,
> > > unsigned int level)
> > >   #define LPAE_TABLE_INDEX_GS(gs, lvl, addr)   \
> > >       (((addr) >> LEVEL_SHIFT_GS(gs, lvl)) & LPAE_ENTRY_MASK_GS(gs))
> > >   -/* Generate an array @var containing the offset for each level from
> > > @addr */
> > > -#define DECLARE_OFFSETS(var, addr)          \
> > > -    const unsigned int var[4] = {           \
> > > -        zeroeth_table_offset(addr),         \
> > > -        first_table_offset(addr),           \
> > > -        second_table_offset(addr),          \
> > > -        third_table_offset(addr)            \
> > > -    }
> > > -
> > > -#endif /* __ASSEMBLY__ */
> > > -
> > >   /*
> > >    * These numbers add up to a 48-bit input address space.
> > >    *
> > > @@ -211,26 +211,35 @@ static inline bool lpae_is_superpage(lpae_t pte,
> > > unsigned int level)
> > >    * therefore 39-bits are sufficient.
> > >    */
> > >   -#define LPAE_SHIFT      9
> > > -#define LPAE_ENTRIES    (_AC(1,U) << LPAE_SHIFT)
> > > -#define LPAE_ENTRY_MASK (LPAE_ENTRIES - 1)
> > > -
> > > -#define THIRD_SHIFT    (PAGE_SHIFT)
> > > -#define THIRD_ORDER    (THIRD_SHIFT - PAGE_SHIFT)
> > > -#define THIRD_SIZE     (_AT(paddr_t, 1) << THIRD_SHIFT)
> > > -#define THIRD_MASK     (~(THIRD_SIZE - 1))
> > > -#define SECOND_SHIFT   (THIRD_SHIFT + LPAE_SHIFT)
> > > -#define SECOND_ORDER   (SECOND_SHIFT - PAGE_SHIFT)
> > > -#define SECOND_SIZE    (_AT(paddr_t, 1) << SECOND_SHIFT)
> > > -#define SECOND_MASK    (~(SECOND_SIZE - 1))
> > > -#define FIRST_SHIFT    (SECOND_SHIFT + LPAE_SHIFT)
> > > -#define FIRST_ORDER    (FIRST_SHIFT - PAGE_SHIFT)
> > > -#define FIRST_SIZE     (_AT(paddr_t, 1) << FIRST_SHIFT)
> > > -#define FIRST_MASK     (~(FIRST_SIZE - 1))
> > > -#define ZEROETH_SHIFT  (FIRST_SHIFT + LPAE_SHIFT)
> > > -#define ZEROETH_ORDER  (ZEROETH_SHIFT - PAGE_SHIFT)
> > > -#define ZEROETH_SIZE   (_AT(paddr_t, 1) << ZEROETH_SHIFT)
> > > -#define ZEROETH_MASK   (~(ZEROETH_SIZE - 1))
> > 
> > Should we add a one-line in-code comment saying that the definitions
> > below are for 4KB pages? It is not immediately obvious any longer.
> 
> Because they are not meant to be for 4KB pages. They are meant to be for Xen
> page size.
> 
> Today, it is always 4KB but I would like the Xen code to not rely on that.
> 
> I can clarify it in an in-code comment.

That would help I think


> > > +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
> > > +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
> > > +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
> > > 
> > > +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
> > > +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
> > > +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
> > > +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
> > 
> > I would avoid adding these 4 macros. It would be OK if they were just
> > used within this file but lpae.h is a header: they could end up be used
> > anywhere in the xen/ code and they have a very generic name. My
> > suggestion would be to skip them and just do:
> 
> Those macros will be used in follow-up patches. They are pretty useful to
> avoid introduce static array with the different information for each level.
> 
> Would prefix them with XEN_ be better?

Maybe. The concern I have is that there are multiple page granularities
(4kb, 16kb, etc) and multiple page sizes (4kb, 2mb, etc). If I just see
LEVEL_ORDER it is not immediately obvious what granularity and what size
we are talking about.

I think using a name that makes it clear that they are referring to Xen
pages, currently 4kb, would make sense. Or maybe an in-code comment
would be sufficient.

I don't have a great suggestion here so I'll leave it to you. I am also
OK to keep them as is.


> > #define THIRD_SHIFT         LEVEL_SHIFT_GS(PAGE_SHIFT, 3)
> > 
> > etc.
> > 
> > 
> > > +/* Convenience aliases */
> > > +#define THIRD_SHIFT         LEVEL_SHIFT(3)
> > > +#define THIRD_ORDER         LEVEL_ORDER(3)
> > > +#define THIRD_SIZE          LEVEL_SIZE(3)
> > > +#define THIRD_MASK          LEVEL_MASK(3)
> > > +
> > > +#define SECOND_SHIFT        LEVEL_SHIFT(2)
> > > +#define SECOND_ORDER        LEVEL_ORDER(2)
> > > +#define SECOND_SIZE         LEVEL_SIZE(2)
> > > +#define SECOND_MASK         LEVEL_MASK(2)
> > > +
> > > +#define FIRST_SHIFT         LEVEL_SHIFT(1)
> > > +#define FIRST_ORDER         LEVEL_ORDER(1)
> > > +#define FIRST_SIZE          LEVEL_SIZE(1)
> > > +#define FIRST_MASK          LEVEL_MASK(1)
> > > +
> > > +#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
> > > +#define ZEROETH_ORDER       LEVEL_ORDER(0)
> > > +#define ZEROETH_SIZE        LEVEL_SIZE(0)
> > > +#define ZEROETH_MASK        LEVEL_MASK(0)
> > >     /* Calculate the offsets into the pagetables for a given VA */
> > >   #define zeroeth_linear_offset(va) ((va) >> ZEROETH_SHIFT)
> > > -- 
> > > 2.17.1
> > > 
> 
> Cheers,
> 
> -- 
> Julien Grall
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen()
  2021-04-25 20:13 ` [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen() Julien Grall
@ 2021-05-12 21:41   ` Stefano Stabellini
  2021-05-12 22:18     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-12 21:41 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> Now that map_pages_to_xen() has been extended to support 2MB mappings,
> we can replace the create_mappings() calls by map_pages_to_xen() calls.
> 
> The mapping can also be marked read-only has Xen as no business to
> modify the host Device Tree.

I think that's good. Just FYI there is some work at Xilinx to make
changes to the device tree at runtime but we'll cross that bridge when
we come to it.

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - Add my AWS signed-off-by
>         - Fix typo in the commit message
> ---
>  xen/arch/arm/mm.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 2cbfbe25240e..8fac24d80086 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -558,6 +558,7 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
>      paddr_t offset;
>      void *fdt_virt;
>      uint32_t size;
> +    int rc;
>  
>      /*
>       * Check whether the physical FDT address is set and meets the minimum
> @@ -573,8 +574,12 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
>      /* The FDT is mapped using 2MB superpage */
>      BUILD_BUG_ON(BOOT_FDT_VIRT_START % SZ_2M);
>  
> -    create_mappings(xen_second, BOOT_FDT_VIRT_START, paddr_to_pfn(base_paddr),
> -                    SZ_2M >> PAGE_SHIFT, SZ_2M);
> +    rc = map_pages_to_xen(BOOT_FDT_VIRT_START, maddr_to_mfn(base_paddr),
> +                          SZ_2M >> PAGE_SHIFT,
> +                          PAGE_HYPERVISOR_RO | _PAGE_BLOCK);
> +    if ( rc )
> +        panic("Unable to map the device-tree.\n");
> +
>  
>      offset = fdt_paddr % SECOND_SIZE;
>      fdt_virt = (void *)BOOT_FDT_VIRT_START + offset;
> @@ -588,9 +593,12 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
>  
>      if ( (offset + size) > SZ_2M )
>      {
> -        create_mappings(xen_second, BOOT_FDT_VIRT_START + SZ_2M,
> -                        paddr_to_pfn(base_paddr + SZ_2M),
> -                        SZ_2M >> PAGE_SHIFT, SZ_2M);
> +        rc = map_pages_to_xen(BOOT_FDT_VIRT_START + SZ_2M,
> +                              maddr_to_mfn(base_paddr + SZ_2M),
> +                              SZ_2M >> PAGE_SHIFT,
> +                              PAGE_HYPERVISOR_RO | _PAGE_BLOCK);
> +        if ( rc )
> +            panic("Unable to map the device-tree\n");
>      }
>  
>      return fdt_virt;
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it
  2021-04-25 20:13 ` [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it Julien Grall
@ 2021-05-12 22:00   ` Stefano Stabellini
  2021-05-12 22:23     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-12 22:00 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Only the first 2GB of the virtual address space is shared between all
> the page-tables on Arm32.
> 
> There is a long outstanding TODO in xen_pt_update() stating that the
> function is can only work with shared mapping. Nobody has ever called
           ^ remove

> the function with private mapping, however as we add more callers
> there is a risk to mess things up.
> 
> Introduce a new define to mark the ened of the shared mappings and use
                                     ^end

> it in xen_pt_update() to verify if the address is correct.
> 
> Note that on Arm64, all the mappings are shared. Some compiler may
> complain about an always true check, so the new define is not introduced
> for arm64 and the code is protected with an #ifdef.
 
On arm64 we could maybe define SHARED_VIRT_END to an arbitrarily large
value, such as:

#define SHARED_VIRT_END (1UL<<48)

or:

#define SHARED_VIRT_END DIRECTMAP_VIRT_END

?


> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/arch/arm/mm.c            | 11 +++++++++--
>  xen/include/asm-arm/config.h |  4 ++++
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 8fac24d80086..5c17cafff847 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1275,11 +1275,18 @@ static int xen_pt_update(unsigned long virt,
>       * For arm32, page-tables are different on each CPUs. Yet, they share
>       * some common mappings. It is assumed that only common mappings
>       * will be modified with this function.
> -     *
> -     * XXX: Add a check.
>       */
>      const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
>  
> +#ifdef SHARED_VIRT_END
> +    if ( virt > SHARED_VIRT_END ||
> +         (SHARED_VIRT_END - virt) < nr_mfns )

The following would be sufficient, right?

    if ( virt + nr_mfns > SHARED_VIRT_END )


> +    {
> +        mm_printk("Trying to map outside of the shared area.\n");
> +        return -EINVAL;
> +    }
> +#endif
> +
>      /*
>       * The hardware was configured to forbid mapping both writeable and
>       * executable.
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index c7b77912013e..85d4a510ce8a 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -137,6 +137,10 @@
>  
>  #define XENHEAP_VIRT_START     _AT(vaddr_t,0x40000000)
>  #define XENHEAP_VIRT_END       _AT(vaddr_t,0x7fffffff)
> +
> +/* The first 2GB is always shared between all the page-tables. */
> +#define SHARED_VIRT_END        _AT(vaddr_t, 0x7fffffff)
> +
>  #define DOMHEAP_VIRT_START     _AT(vaddr_t,0x80000000)
>  #define DOMHEAP_VIRT_END       _AT(vaddr_t,0xffffffff)
>  
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen()
  2021-04-25 20:13 ` [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen() Julien Grall
@ 2021-05-12 22:07   ` Stefano Stabellini
  2021-05-13 17:55     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-12 22:07 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk


On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Now that map_pages_to_xen() has been extended to support 2MB mappings,
> we can replace the create_mappings() call by map_pages_to_xen() call.
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - New patch
> 
>     TODOs:
>         - add support for contiguous mapping
> ---
>  xen/arch/arm/mm.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 5c17cafff847..19ecf73542ce 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -806,7 +806,12 @@ void mmu_init_secondary_cpu(void)
>  void __init setup_xenheap_mappings(unsigned long base_mfn,
>                                     unsigned long nr_mfns)
>  {
> -    create_mappings(xen_second, XENHEAP_VIRT_START, base_mfn, nr_mfns, MB(32));
> +    int rc;
> +
> +    rc = map_pages_to_xen(XENHEAP_VIRT_START, base_mfn, nr_mfns,
> +                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
> +    if ( rc )
> +        panic("Unable to setup the xenheap mappings.\n");
>  
>      /* Record where the xenheap is, for translation routines. */
>      xenheap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE;

I get the following build error:

mm.c: In function ‘setup_xenheap_mappings’:
mm.c:811:47: error: incompatible type for argument 2 of ‘map_pages_to_xen’
     rc = map_pages_to_xen(XENHEAP_VIRT_START, base_mfn, nr_mfns,
                                               ^~~~~~~~
In file included from mm.c:24:0:
/local/repos/xen-upstream/xen/include/xen/mm.h:89:5: note: expected ‘mfn_t {aka struct <anonymous>}’ but argument is of type ‘long unsigned int’
 int map_pages_to_xen(
     ^~~~~~~~~~~~~~~~

I think base_mfn needs to be converted to mfn_t
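
For instance (untested; assuming the usual _mfn() typed-MFN helper):

    rc = map_pages_to_xen(XENHEAP_VIRT_START, _mfn(base_mfn), nr_mfns,
                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);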

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-05-12 21:30       ` Stefano Stabellini
@ 2021-05-12 22:16         ` Julien Grall
  2021-05-13 22:44           ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-12 22:16 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 12/05/2021 22:30, Stefano Stabellini wrote:
> On Wed, 12 May 2021, Julien Grall wrote:
>>>> +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
>>>> +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
>>>> +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>>>>
>>>> +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
>>>> +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
>>>> +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
>>>> +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
>>>
>>> I would avoid adding these 4 macros. It would be OK if they were just
>>> used within this file but lpae.h is a header: they could end up be used
>>> anywhere in the xen/ code and they have a very generic name. My
>>> suggestion would be to skip them and just do:
>>
>> Those macros will be used in follow-up patches. They are pretty useful to
>> avoid introduce static array with the different information for each level.
>>
>> Would prefix them with XEN_ be better?
> 
> Maybe. The concern I have is that there are multiple page granularities
> (4kb, 16kb, etc) and multiple page sizes (4kb, 2mb, etc). If I just see
> LEVEL_ORDER it is not immediately obvious what granularity and what size
> we are talking about.

I am a bit puzzled with your answer. AFAIU, you are happy with the 
existing macros (THIRD_*, SECOND_*) but not with the new macros.

In reality, there is no difference because THIRD_* doesn't tell you the 
exact size but only "this is a level 3 mapping".

So can you clarify what you are after? IOW is it reworking the current 
naming scheme?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen()
  2021-05-12 21:41   ` Stefano Stabellini
@ 2021-05-12 22:18     ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-05-12 22:18 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk, Julien Grall

Hi Stefano,

On 12/05/2021 22:41, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <julien.grall@arm.com>
>>
>> Now that map_pages_to_xen() has been extended to support 2MB mappings,
>> we can replace the create_mappings() calls by map_pages_to_xen() calls.
>>
>> The mapping can also be marked read-only has Xen as no business to
>> modify the host Device Tree.
> 
> I think that's good. Just FYI there is some work at Xilinx to make
> changes to the device tree at runtime but we'll cross that bridge when
> we come to it.

This particular mapping is only used during early boot. After the DT has 
been unflattened, this region is unmapped and the physical memory released.

So if the DT needs to be modified at runtime, then you would most likely 
want to modify the unflattened version.

> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you!

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it
  2021-05-12 22:00   ` Stefano Stabellini
@ 2021-05-12 22:23     ` Julien Grall
  2021-05-13 22:32       ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-12 22:23 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 12/05/2021 23:00, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Only the first 2GB of the virtual address space is shared between all
>> the page-tables on Arm32.
>>
>> There is a long outstanding TODO in xen_pt_update() stating that the
>> function is can only work with shared mapping. Nobody has ever called
>             ^ remove
> 
>> the function with private mapping, however as we add more callers
>> there is a risk to mess things up.
>>
>> Introduce a new define to mark the ened of the shared mappings and use
>                                       ^end
> 
>> it in xen_pt_update() to verify if the address is correct.
>>
>> Note that on Arm64, all the mappings are shared. Some compiler may
>> complain about an always true check, so the new define is not introduced
>> for arm64 and the code is protected with an #ifdef.
>   
> On arm64 we could maybe define SHARED_VIRT_END to an arbitrarely large
> value, such as:
> 
> #define SHARED_VIRT_END (1UL<<48)
> 
> or:
> 
> #define SHARED_VIRT_END DIRECTMAP_VIRT_END
> 
> ?

I thought about it but I didn't want to define it to a random value... I 
felt not defining it was better.

> 
> 
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>      Changes in v2:
>>          - New patch
>> ---
>>   xen/arch/arm/mm.c            | 11 +++++++++--
>>   xen/include/asm-arm/config.h |  4 ++++
>>   2 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index 8fac24d80086..5c17cafff847 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -1275,11 +1275,18 @@ static int xen_pt_update(unsigned long virt,
>>        * For arm32, page-tables are different on each CPUs. Yet, they share
>>        * some common mappings. It is assumed that only common mappings
>>        * will be modified with this function.
>> -     *
>> -     * XXX: Add a check.
>>        */
>>       const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
>>   
>> +#ifdef SHARED_VIRT_END
>> +    if ( virt > SHARED_VIRT_END ||
>> +         (SHARED_VIRT_END - virt) < nr_mfns )
> 
> The following would be sufficient, right?
> 
>      if ( virt + nr_mfns > SHARED_VIRT_END )

This would not protect against an overflow. So I think it is best if we 
keep my version.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  2021-04-25 20:13 ` [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap Julien Grall
@ 2021-05-12 22:44   ` Stefano Stabellini
  2021-05-13 18:09     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-12 22:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> xen_{un,}map_table() already uses the helper to map/unmap pages
> on-demand (note this is currently a NOP on arm64). So switching to
> domheap don't have any disavantage.
> 
> But this as the benefit:
>     - to keep the page tables unmapped if an arch decided to do so
>     - reduce xenheap use on arm32 which can be pretty small
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Thanks for the patch. It looks OK but let me ask a couple of questions
to clarify my doubts.

This change should have no impact to arm64, right?

For arm32, I wonder why we were using map_domain_page before in
xen_map_table: it wasn't necessary, was it? In fact, one could even say
that it was wrong?


> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/arch/arm/mm.c | 36 +++++++++++++++++++++---------------
>  1 file changed, 21 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 19ecf73542ce..ae5a07ea956b 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -969,21 +969,6 @@ void *ioremap(paddr_t pa, size_t len)
>      return ioremap_attr(pa, len, PAGE_HYPERVISOR_NOCACHE);
>  }
>  
> -static int create_xen_table(lpae_t *entry)
> -{
> -    void *p;
> -    lpae_t pte;
> -
> -    p = alloc_xenheap_page();
> -    if ( p == NULL )
> -        return -ENOMEM;
> -    clear_page(p);
> -    pte = mfn_to_xen_entry(virt_to_mfn(p), MT_NORMAL);
> -    pte.pt.table = 1;
> -    write_pte(entry, pte);
> -    return 0;
> -}
> -
>  static lpae_t *xen_map_table(mfn_t mfn)
>  {
>      /*
> @@ -1024,6 +1009,27 @@ static void xen_unmap_table(const lpae_t *table)
>      unmap_domain_page(table);
>  }
>  
> +static int create_xen_table(lpae_t *entry)
> +{
> +    struct page_info *pg;
> +    void *p;
> +    lpae_t pte;
> +
> +    pg = alloc_domheap_page(NULL, 0);
> +    if ( pg == NULL )
> +        return -ENOMEM;
> +
> +    p = xen_map_table(page_to_mfn(pg));
> +    clear_page(p);
> +    xen_unmap_table(p);
> +
> +    pte = mfn_to_xen_entry(page_to_mfn(pg), MT_NORMAL);
> +    pte.pt.table = 1;
> +    write_pte(entry, pte);
> +
> +    return 0;
> +}
> +
>  #define XEN_TABLE_MAP_FAILED 0
>  #define XEN_TABLE_SUPER_PAGE 1
>  #define XEN_TABLE_NORMAL_PAGE 2
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen()
  2021-05-12 22:07   ` Stefano Stabellini
@ 2021-05-13 17:55     ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-05-13 17:55 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 12/05/2021 23:07, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Now that map_pages_to_xen() has been extended to support 2MB mappings,
>> we can replace the create_mappings() call by map_pages_to_xen() call.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>      Changes in v2:
>>          - New patch
>>
>>      TODOs:
>>          - add support for contiguous mapping
>> ---
>>   xen/arch/arm/mm.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index 5c17cafff847..19ecf73542ce 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -806,7 +806,12 @@ void mmu_init_secondary_cpu(void)
>>   void __init setup_xenheap_mappings(unsigned long base_mfn,
>>                                      unsigned long nr_mfns)
>>   {
>> -    create_mappings(xen_second, XENHEAP_VIRT_START, base_mfn, nr_mfns, MB(32));
>> +    int rc;
>> +
>> +    rc = map_pages_to_xen(XENHEAP_VIRT_START, base_mfn, nr_mfns,
>> +                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
>> +    if ( rc )
>> +        panic("Unable to setup the xenheap mappings.\n");
>>   
>>       /* Record where the xenheap is, for translation routines. */
>>       xenheap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE;
> 
> I get the following build error:
> 
> mm.c: In function ‘setup_xenheap_mappings’:
> mm.c:811:47: error: incompatible type for argument 2 of ‘map_pages_to_xen’
>       rc = map_pages_to_xen(XENHEAP_VIRT_START, base_mfn, nr_mfns,
>                                                 ^~~~~~~~
> In file included from mm.c:24:0:
> /local/repos/xen-upstream/xen/include/xen/mm.h:89:5: note: expected ‘mfn_t {aka struct <anonymous>}’ but argument is of type ‘long unsigned int’
>   int map_pages_to_xen(
>       ^~~~~~~~~~~~~~~~
> 
> I think base_mfn needs to be converted to mfn_t

You are right. I think my scripts are build testing arm32 with debug=n. 
I will fix it and respin.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  2021-05-12 22:44   ` Stefano Stabellini
@ 2021-05-13 18:09     ` Julien Grall
  2021-05-13 22:27       ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-13 18:09 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 12/05/2021 23:44, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> xen_{un,}map_table() already uses the helper to map/unmap pages
>> on-demand (note this is currently a NOP on arm64). So switching to
>> domheap don't have any disavantage.
>>
>> But this as the benefit:
>>      - to keep the page tables unmapped if an arch decided to do so
>>      - reduce xenheap use on arm32 which can be pretty small
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> Thanks for the patch. It looks OK but let me ask a couple of questions
> to clarify my doubts.
> 
> This change should have no impact to arm64, right?
> 
> For arm32, I wonder why we were using map_domain_page before in
> xen_map_table: it wasn't necessary, was it? In fact, one could even say
> that it was wrong?
In xen_map_table() we need to be able to map pages from Xen binary, 
xenheap... On arm64, we would be able to use mfn_to_virt() because 
everything is mapped in Xen. But that's not the case on arm32. So we 
need a way to map anything easily.

The only difference between xenheap and domheap is that the former is always 
mapped while the latter may not be. So one can also view a xenheap page 
as a glorified domheap page.

I also don't really want to create yet another interface to map pages 
(we have vmap(), map_domain_page(), map_domain_global_page()...). So, 
when I implemented xen_map_table() a couple of years ago, I came to the 
conclusion that this is a convenient (ab)use of the interface.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  2021-05-13 18:09     ` Julien Grall
@ 2021-05-13 22:27       ` Stefano Stabellini
  2021-05-15  8:48         ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-13 22:27 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk

On Thu, 13 May 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 12/05/2021 23:44, Stefano Stabellini wrote:
> > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > From: Julien Grall <jgrall@amazon.com>
> > > 
> > > xen_{un,}map_table() already uses the helper to map/unmap pages
> > > on-demand (note this is currently a NOP on arm64). So switching to
> > > domheap don't have any disavantage.
> > > 
> > > But this as the benefit:
> > >      - to keep the page tables unmapped if an arch decided to do so
> > >      - reduce xenheap use on arm32 which can be pretty small
> > > 
> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > 
> > Thanks for the patch. It looks OK but let me ask a couple of questions
> > to clarify my doubts.
> > 
> > This change should have no impact to arm64, right?
> > 
> > For arm32, I wonder why we were using map_domain_page before in
> > xen_map_table: it wasn't necessary, was it? In fact, one could even say
> > that it was wrong?
> In xen_map_table() we need to be able to map pages from Xen binary, xenheap...
> On arm64, we would be able to use mfn_to_virt() because everything is mapped
> in Xen. But that's not the case on arm32. So we need a way to map anything
> easily.
> 
> The only difference between xenheap and domheap are the former is always
> mapped while the latter may not be. So one can also view a xenheap page as a
> glorified domheap.
> 
> I also don't really want to create yet another interface to map pages (we have
> vmap(), map_domain_page(), map_domain_global_page()...). So, when I
> implemented xen_map_table() a couple of years ago, I came to the conclusion
> that this is a convenient (ab)use of the interface.

Got it. Repeating to check if I see the full picture. On ARM64 there are
no changes. On ARM32, at runtime there are no changes mapping/unmapping
pages because xen_map_table is already mapping all pages using domheap,
even xenheap pages are mapped as domheap; so this patch makes no
difference in mapping/unmapping, correct?

The only difference is that on arm32 we are using domheap to allocate
the pages, which is a different (larger) pool.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it
  2021-05-12 22:23     ` Julien Grall
@ 2021-05-13 22:32       ` Stefano Stabellini
  2021-05-13 22:59         ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-13 22:32 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk

On Wed, 12 May 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 12/05/2021 23:00, Stefano Stabellini wrote:
> > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > From: Julien Grall <jgrall@amazon.com>
> > > 
> > > Only the first 2GB of the virtual address space is shared between all
> > > the page-tables on Arm32.
> > > 
> > > There is a long outstanding TODO in xen_pt_update() stating that the
> > > function is can only work with shared mapping. Nobody has ever called
> >             ^ remove
> > 
> > > the function with private mapping, however as we add more callers
> > > there is a risk to mess things up.
> > > 
> > > Introduce a new define to mark the ened of the shared mappings and use
> >                                       ^end
> > 
> > > it in xen_pt_update() to verify if the address is correct.
> > > 
> > > Note that on Arm64, all the mappings are shared. Some compiler may
> > > complain about an always true check, so the new define is not introduced
> > > for arm64 and the code is protected with an #ifdef.
> >   On arm64 we could maybe define SHARED_VIRT_END to an arbitrarely large
> > value, such as:
> > 
> > #define SHARED_VIRT_END (1UL<<48)
> > 
> > or:
> > 
> > #define SHARED_VIRT_END DIRECTMAP_VIRT_END
> > 
> > ?
> 
> I thought about it but I didn't want to define to a random value... I felt not
> define it was better.

Yeah, I see your point: any restrictions in addressing (e.g. 48bits)
are physical address restrictions. Here we are talking about virtual
address restriction, and I don't think there are actually any
restrictions there?  We could validly map something at
0xffff_ffff_ffff_ffff. So even (1<<48) which makes sense at the physical
level, doesn't make sense in terms of virtual addresses.


> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > > 
> > > ---
> > >      Changes in v2:
> > >          - New patch
> > > ---
> > >   xen/arch/arm/mm.c            | 11 +++++++++--
> > >   xen/include/asm-arm/config.h |  4 ++++
> > >   2 files changed, 13 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > > index 8fac24d80086..5c17cafff847 100644
> > > --- a/xen/arch/arm/mm.c
> > > +++ b/xen/arch/arm/mm.c
> > > @@ -1275,11 +1275,18 @@ static int xen_pt_update(unsigned long virt,
> > >        * For arm32, page-tables are different on each CPUs. Yet, they
> > > share
> > >        * some common mappings. It is assumed that only common mappings
> > >        * will be modified with this function.
> > > -     *
> > > -     * XXX: Add a check.
> > >        */
> > >       const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
> > >   +#ifdef SHARED_VIRT_END
> > > +    if ( virt > SHARED_VIRT_END ||
> > > +         (SHARED_VIRT_END - virt) < nr_mfns )
> > 
> > The following would be sufficient, right?
> > 
> >      if ( virt + nr_mfns > SHARED_VIRT_END )
> 
> This would not protect against an overflow. So I think it is best if we keep
> my version.

But there can be no overflow with the way SHARED_VIRT_END is defined.
Even if SHARED_VIRT_END was defined at (1<<48) there can be no overflow.
Only if we defined SHARED_VIRT_END as 0xffff_ffff_ffff_ffff we would
have an overflow, but you wrote above that your preference is not to do
that.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-05-12 22:16         ` Julien Grall
@ 2021-05-13 22:44           ` Stefano Stabellini
  2021-07-03 17:32             ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-13 22:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk

On Wed, 12 May 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 12/05/2021 22:30, Stefano Stabellini wrote:
> > On Wed, 12 May 2021, Julien Grall wrote:
> > > > > +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
> > > > > +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
> > > > > +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
> > > > > 
> > > > > +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
> > > > > +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
> > > > > +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
> > > > > +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
> > > > 
> > > > I would avoid adding these 4 macros. It would be OK if they were just
> > > > used within this file but lpae.h is a header: they could end up be used
> > > > anywhere in the xen/ code and they have a very generic name. My
> > > > suggestion would be to skip them and just do:
> > > 
> > > Those macros will be used in follow-up patches. They are pretty useful to
> > > avoid introduce static array with the different information for each
> > > level.
> > > 
> > > Would prefix them with XEN_ be better?
> > 
> > Maybe. The concern I have is that there are multiple page granularities
> > (4kb, 16kb, etc) and multiple page sizes (4kb, 2mb, etc). If I just see
> > LEVEL_ORDER it is not immediately obvious what granularity and what size
> > we are talking about.
> 
> I am a bit puzzled with your answer. AFAIU, you are happy with the existing
> macros (THIRD_*, SECOND_*) but not with the new macros.
>
> In reality, there is no difference because THIRD_* doesn't tell you the exact
> size but only "this is a level 3 mapping".
> 
> So can you clarify what you are after? IOW is it reworking the current naming
> scheme?

You are right -- there is no real difference between THIRD_*, SECOND_*
and LEVEL_*.

The original reason for my comments is that I hadn't read the following
patches, and the definition of the LEVEL_* macros is simple enough that
they could be open-coded.
definition of THIRD_*, SECOND_* a bit easier. So, at first, I was
wondering if they were needed at all.

Secondly, I realized that they were going to be used in *.c files by
other patches. That's why they are there. But I started thinking whether
we should find a way to make it a bit clearer that they are for Xen
pages, currently at 4KB granularity. THIRD_*, SECOND_*, etc. are already
generic names which don't convey the granularity or whether they are Xen
pages at all. But LEVEL_* seem even more generic.

As I mentioned, I don't have any good suggestions for changes to make
here, so unless you can come up with a good idea let's keep it as is.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 11/15] xen/arm: mm: Allow page-table allocation from the boot allocator
  2021-04-25 20:13 ` [PATCH RFCv2 11/15] xen/arm: mm: Allow page-table allocation from the boot allocator Julien Grall
@ 2021-05-13 22:58   ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-13 22:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> At the moment, page-table can only be allocated from domheap. This means
> it is not possible to create mapping in the page-tables via
> map_pages_to_xen() if page-table needs to be allocated.
> 
> In order to avoid open-coding page-tables update in early boot, we need
> to be able to allocate page-tables much earlier. Thankfully, we have the
> boot allocator for those cases.
> 
> create_xen_table() is updated to cater early boot allocation by using
> alloc_boot_pages().
> 
> Note, this is not sufficient to bootstrap the page-tables (i.e mapping
> before any memory is actually mapped). This will be addressed
> separately.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/arch/arm/mm.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index ae5a07ea956b..d090fdfd5994 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -1011,19 +1011,27 @@ static void xen_unmap_table(const lpae_t *table)
>  
>  static int create_xen_table(lpae_t *entry)
>  {
> -    struct page_info *pg;
> +    mfn_t mfn;
>      void *p;
>      lpae_t pte;
>  
> -    pg = alloc_domheap_page(NULL, 0);
> -    if ( pg == NULL )
> -        return -ENOMEM;
> +    if ( system_state != SYS_STATE_early_boot )
> +    {
> +        struct page_info *pg = alloc_domheap_page(NULL, 0);
> +
> +        if ( pg == NULL )
> +            return -ENOMEM;
> +
> +        mfn = page_to_mfn(pg);
> +    }
> +    else
> +        mfn = alloc_boot_pages(1, 1);
>  
> -    p = xen_map_table(page_to_mfn(pg));
> +    p = xen_map_table(mfn);
>      clear_page(p);
>      xen_unmap_table(p);
>  
> -    pte = mfn_to_xen_entry(page_to_mfn(pg), MT_NORMAL);
> +    pte = mfn_to_xen_entry(mfn, MT_NORMAL);
>      pte.pt.table = 1;
>      write_pte(entry, pte);
>  
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it
  2021-05-13 22:32       ` Stefano Stabellini
@ 2021-05-13 22:59         ` Julien Grall
  2021-05-14  1:04           ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-13 22:59 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei Chen, Henry.Wang, Penny.Zheng, Bertrand Marquis,
	Julien Grall, Volodymyr Babchuk


On Thu, 13 May 2021, 23:32 Stefano Stabellini, <sstabellini@kernel.org>
wrote:

> On Wed, 12 May 2021, Julien Grall wrote:
> > Hi Stefano,
> >
> > On 12/05/2021 23:00, Stefano Stabellini wrote:
> > > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > > From: Julien Grall <jgrall@amazon.com>
> > > >
> > > > Only the first 2GB of the virtual address space is shared between all
> > > > the page-tables on Arm32.
> > > >
> > > > There is a long outstanding TODO in xen_pt_update() stating that the
> > > > function is can only work with shared mapping. Nobody has ever called
> > >             ^ remove
> > >
> > > > the function with private mapping, however as we add more callers
> > > > there is a risk to mess things up.
> > > >
> > > > Introduce a new define to mark the ened of the shared mappings and
> use
> > >                                       ^end
> > >
> > > > it in xen_pt_update() to verify if the address is correct.
> > > >
> > > > Note that on Arm64, all the mappings are shared. Some compiler may
> > > > complain about an always true check, so the new define is not
> introduced
> > > > for arm64 and the code is protected with an #ifdef.
> > >   On arm64 we could maybe define SHARED_VIRT_END to an arbitrarely
> large
> > > value, such as:
> > >
> > > #define SHARED_VIRT_END (1UL<<48)
> > >
> > > or:
> > >
> > > #define SHARED_VIRT_END DIRECTMAP_VIRT_END
> > >
> > > ?
> >
> > I thought about it but I didn't want to define to a random value... I
> felt not
> > define it was better.
>
> Yeah, I see your point: any restrictions in addressing (e.g. 48bits)
> are physical address restrictions. Here we are talking about virtual
> address restriction, and I don't think there are actually any
> restrictions there?  We could validly map something at
> 0xffff_ffff_ffff_ffff. So even (1<<48) which makes sense at the physical
> level, doesn't make sense in terms of virtual addresses.
>

The limit for the virtual address is 2^64.


>
> > > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > > >
> > > > ---
> > > >      Changes in v2:
> > > >          - New patch
> > > > ---
> > > >   xen/arch/arm/mm.c            | 11 +++++++++--
> > > >   xen/include/asm-arm/config.h |  4 ++++
> > > >   2 files changed, 13 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > > > index 8fac24d80086..5c17cafff847 100644
> > > > --- a/xen/arch/arm/mm.c
> > > > +++ b/xen/arch/arm/mm.c
> > > > @@ -1275,11 +1275,18 @@ static int xen_pt_update(unsigned long virt,
> > > >        * For arm32, page-tables are different on each CPUs. Yet, they
> > > > share
> > > >        * some common mappings. It is assumed that only common
> mappings
> > > >        * will be modified with this function.
> > > > -     *
> > > > -     * XXX: Add a check.
> > > >        */
> > > >       const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
> > > >   +#ifdef SHARED_VIRT_END
> > > > +    if ( virt > SHARED_VIRT_END ||
> > > > +         (SHARED_VIRT_END - virt) < nr_mfns )
> > >
> > > The following would be sufficient, right?
> > >
> > >      if ( virt + nr_mfns > SHARED_VIRT_END )
> >
> > This would not protect against an overflow. So I think it is best if we
> keep
> > my version.
>
> But there can be no overflow with the way SHARED_VIRT_END is defined.

Even if SHARED_VIRT_END was defined at (1<<48) there can be no overflow.
> Only if we defined SHARED_VIRT_END as 0xffff_ffff_ffff_ffff we would
> have an overflow, but you wrote above that your preference is not to do
> that.
>

You can have an overflow regardless of the value of SHARED_VIRT_END.

Imagine virt = 2^64 - 1 and nr_mfns = 1. The addition would result in 0.

As a consequence the check would pass when it should not.

One can argue that the caller will always provide sane values. However,
given the simplicity of the check, it is not worth the risk if a
caller is buggy.

Now, the problem with SHARED_VIRT_END equal to 2^64 - 1 is not the
overflow but the compiler, which may throw an error/warning for an
always-true check. Hence the reason for not defining SHARED_VIRT_END on arm64.
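
To illustrate with a simplified standalone sketch (not the Xen code; it 
uses a generic byte length rather than nr_mfns):

    #include <stdbool.h>
    #include <stdio.h>

    #define SHARED_VIRT_END 0x7fffffffUL   /* arm32 value from the patch */

    /* virt + len can wrap around to a small value and wrongly pass. */
    static bool naive_in_range(unsigned long virt, unsigned long len)
    {
        return (virt + len) <= SHARED_VIRT_END;
    }

    /* No addition that can overflow, so a bogus virt is always caught. */
    static bool safe_in_range(unsigned long virt, unsigned long len)
    {
        return virt <= SHARED_VIRT_END && (SHARED_VIRT_END - virt) >= len;
    }

    int main(void)
    {
        unsigned long virt = ~0UL;   /* 2^N - 1, as in the example above */

        /* Prints "naive: 1 safe: 0": the naive form wrongly accepts it. */
        printf("naive: %d safe: %d\n", naive_in_range(virt, 1),
               safe_in_range(virt, 1));
        return 0;
    }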

Cheers,


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it
  2021-05-13 22:59         ` Julien Grall
@ 2021-05-14  1:04           ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-14  1:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei Chen, Henry.Wang, Penny.Zheng,
	Bertrand Marquis, Julien Grall, Volodymyr Babchuk


On Thu, 13 May 2021, Julien Grall wrote:
> On Thu, 13 May 2021, 23:32 Stefano Stabellini, <sstabellini@kernel.org> wrote:
>       On Wed, 12 May 2021, Julien Grall wrote:
>       > Hi Stefano,
>       >
>       > On 12/05/2021 23:00, Stefano Stabellini wrote:
>       > > On Sun, 25 Apr 2021, Julien Grall wrote:
>       > > > From: Julien Grall <jgrall@amazon.com>
>       > > >
>       > > > Only the first 2GB of the virtual address space is shared between all
>       > > > the page-tables on Arm32.
>       > > >
>       > > > There is a long outstanding TODO in xen_pt_update() stating that the
>       > > > function is can only work with shared mapping. Nobody has ever called
>       > >             ^ remove
>       > >
>       > > > the function with private mapping, however as we add more callers
>       > > > there is a risk to mess things up.
>       > > >
>       > > > Introduce a new define to mark the ened of the shared mappings and use
>       > >                                       ^end
>       > >
>       > > > it in xen_pt_update() to verify if the address is correct.
>       > > >
>       > > > Note that on Arm64, all the mappings are shared. Some compiler may
>       > > > complain about an always true check, so the new define is not introduced
>       > > > for arm64 and the code is protected with an #ifdef.
>       > >   On arm64 we could maybe define SHARED_VIRT_END to an arbitrarely large
>       > > value, such as:
>       > >
>       > > #define SHARED_VIRT_END (1UL<<48)
>       > >
>       > > or:
>       > >
>       > > #define SHARED_VIRT_END DIRECTMAP_VIRT_END
>       > >
>       > > ?
>       >
>       > I thought about it but I didn't want to define to a random value... I felt not
>       > define it was better.
> 
>       Yeah, I see your point: any restrictions in addressing (e.g. 48bits)
>       are physical address restrictions. Here we are talking about virtual
>       address restriction, and I don't think there are actually any
>       restrictions there?  We could validly map something at
>       0xffff_ffff_ffff_ffff. So even (1<<48) which makes sense at the physical
>       level, doesn't make sense in terms of virtual addresses.
> 
> 
> The limit for the virtual address is 2^64.
> 
> 
> 
>       > > > Signed-off-by: Julien Grall <jgrall@amazon.com>
>       > > >
>       > > > ---
>       > > >      Changes in v2:
>       > > >          - New patch
>       > > > ---
>       > > >   xen/arch/arm/mm.c            | 11 +++++++++--
>       > > >   xen/include/asm-arm/config.h |  4 ++++
>       > > >   2 files changed, 13 insertions(+), 2 deletions(-)
>       > > >
>       > > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>       > > > index 8fac24d80086..5c17cafff847 100644
>       > > > --- a/xen/arch/arm/mm.c
>       > > > +++ b/xen/arch/arm/mm.c
>       > > > @@ -1275,11 +1275,18 @@ static int xen_pt_update(unsigned long virt,
>       > > >        * For arm32, page-tables are different on each CPUs. Yet, they
>       > > > share
>       > > >        * some common mappings. It is assumed that only common mappings
>       > > >        * will be modified with this function.
>       > > > -     *
>       > > > -     * XXX: Add a check.
>       > > >        */
>       > > >       const mfn_t root = virt_to_mfn(THIS_CPU_PGTABLE);
>       > > >   +#ifdef SHARED_VIRT_END
>       > > > +    if ( virt > SHARED_VIRT_END ||
>       > > > +         (SHARED_VIRT_END - virt) < nr_mfns )
>       > >
>       > > The following would be sufficient, right?
>       > >
>       > >      if ( virt + nr_mfns > SHARED_VIRT_END )
>       >
>       > This would not protect against an overflow. So I think it is best if we keep
>       > my version.
> 
>       But there can be no overflow with the way SHARED_VIRT_END is defined.
> 
>       Even if SHARED_VIRT_END was defined at (1<<48) there can be no overflow.
>       Only if we defined SHARED_VIRT_END as 0xffff_ffff_ffff_ffff we would
>       have an overflow, but you wrote above that your preference is not to do
>       that.
> 
> 
> You can have an overflow regardless the value of SHARED_VIRT_END.
> 
> Imagine virt = 2^64 - 1 and nr_mfs = 1. The addition would result to 0.
> 
> As a consequence the check would pass when it should not.

Yes you are right, I don't know how I missed it!


> One can argue that the caller will always provide sane values. However given the simplicity of the check, this is not worth the trouble if
> a caller is buggy.
> 
> Now, the problem with SHARED_VIRT_END equals to 2^64 - 1 is not the overflow but the compiler that may throw an error/warning for always
> true check. Hence the reason of not defining SHARED_VIRT_END on arm64.

OK, all checks out.

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure
  2021-04-25 20:13 ` [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure Julien Grall
  2021-04-26  9:41   ` Xia, Hongyan
  2021-04-28 21:47   ` Wei Liu
@ 2021-05-14 23:25   ` Stefano Stabellini
  2021-05-15  8:54     ` Julien Grall
  2 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-14 23:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Wei Liu, Stefano Stabellini, Volodymyr Babchuk, Hongyan Xia,
	Julien Grall, Jan Beulich, Wei Liu, Andrew Cooper,
	Roger Pau Monné,
	Hongian Xia


On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Wei Liu <wei.liu2@citrix.com>
> 
> The basic idea is like Persistent Kernel Map (PKMAP) in Linux. We
> pre-populate all the relevant page tables before the system is fully
> set up.
> 
> We will need it on Arm in order to rework the arm64 version of
> xenheap_setup_mappings() as we may need to use pages allocated from
> the boot allocator before they are effectively mapped.
> 
> This infrastructure is not lock-protected therefore can only be used
> before smpboot. After smpboot, map_domain_page() has to be used.
> 
> This is based on the x86 version [1] that was originally implemented
> by Wei Liu.
> 
> Take the opportunity to switch the parameter attr from unsigned to
> unsigned int.
> 
> [1] <e92da4ad6015b6089737fcccba3ec1d6424649a5.1588278317.git.hongyxia@amazon.com>
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> [julien: Adapted for Arm]
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - New patch
> 
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Roger Pau Monné <roger.pau@citrix.com>
> Cc: Hongian Xia <hongyax@amazon.com>
> 
> This is mostly a copy of the PMAP infrastructure currently discussed
> on x86. The only difference is how the page-tables are updated.
> 
> I think we want to consider to provide a common infrastructure. But
> I haven't done it yet to gather feedback on the overall series
> first.
> ---
>  xen/arch/arm/Makefile        |   1 +
>  xen/arch/arm/mm.c            |   7 +--
>  xen/arch/arm/pmap.c          | 101 +++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/config.h |   2 +
>  xen/include/asm-arm/lpae.h   |   8 +++
>  xen/include/asm-arm/pmap.h   |  10 ++++
>  6 files changed, 123 insertions(+), 6 deletions(-)
>  create mode 100644 xen/arch/arm/pmap.c
>  create mode 100644 xen/include/asm-arm/pmap.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index ca75f1040dcc..2196ada24941 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -40,6 +40,7 @@ obj-y += percpu.o
>  obj-y += platform.o
>  obj-y += platform_hypercall.o
>  obj-y += physdev.o
> +obj-y += pmap.init.o
>  obj-y += processor.o
>  obj-y += psci.o
>  obj-y += setup.o
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index d090fdfd5994..5e713b599611 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -288,12 +288,7 @@ void dump_hyp_walk(vaddr_t addr)
>      dump_pt_walk(ttbr, addr, HYP_PT_ROOT_LEVEL, 1);
>  }
>  
> -/*
> - * Standard entry type that we'll use to build Xen's own pagetables.
> - * We put the same permissions at every level, because they're ignored
> - * by the walker in non-leaf entries.
> - */
> -static inline lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned attr)
> +lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned int attr)
>  {
>      lpae_t e = (lpae_t) {
>          .pt = {
> diff --git a/xen/arch/arm/pmap.c b/xen/arch/arm/pmap.c
> new file mode 100644
> index 000000000000..702b1bde982d
> --- /dev/null
> +++ b/xen/arch/arm/pmap.c
> @@ -0,0 +1,101 @@
> +#include <xen/init.h>
> +#include <xen/mm.h>
> +
> +#include <asm/bitops.h>
> +#include <asm/flushtlb.h>
> +#include <asm/pmap.h>
> +
> +/*
> + * To be able to use FIXMAP_PMAP_BEGIN.
> + * XXX: move fixmap definition in a separate header
> + */

yes please :-)


> +#include <xen/acpi.h>
> +
> +/*
> + * Simple mapping infrastructure to map / unmap pages in fixed map.
> + * This is used to set up the page table for mapcache, which is used
> + * by map domain page infrastructure.
> + *
> + * This structure is not protected by any locks, so it must not be used after
> + * smp bring-up.
> + */
> +
> +/* Bitmap to track which slot is used */
> +static unsigned long __initdata inuse;
> +
> +/* XXX: Find an header to declare it */

yep


> +extern lpae_t xen_fixmap[LPAE_ENTRIES];
> +
> +void *__init pmap_map(mfn_t mfn)
> +{
> +    unsigned long flags;
> +    unsigned int idx;
> +    vaddr_t linear;
> +    unsigned int slot;
> +    lpae_t *entry, pte;
> +
> +    BUILD_BUG_ON(sizeof(inuse) * BITS_PER_LONG < NUM_FIX_PMAP);
> +
> +    ASSERT(system_state < SYS_STATE_smp_boot);

One small concern here is that we have been using SYS_STATE_early_boot
to switch implementation of things like xen_map_table. Between
SYS_STATE_early_boot and SYS_STATE_smp_boot there is SYS_STATE_boot.

I guess I am wondering if instead of three potentially different mapping
functions (<= SYS_STATE_early_boot, < SYS_STATE_smp_boot, >=
SYS_STATE_smp_boot) we can get away with only two?
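
A minimal sketch of the question (hypothetical code, not taken from the
series, but using the helpers it introduces):

/*
 * Two possible cut-offs for the boot-time mapper:
 *  - only while system_state == SYS_STATE_early_boot (what patch #13 checks)
 *  - anywhere before SYS_STATE_smp_boot, as below
 */
static lpae_t *xen_map_table(mfn_t mfn)
{
    if ( system_state < SYS_STATE_smp_boot )
        return pmap_map(mfn);

    return map_domain_page(mfn);
}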



> +    local_irq_save(flags);
> +
> +    idx = find_first_zero_bit(&inuse, NUM_FIX_PMAP);
> +    if ( idx == NUM_FIX_PMAP )
> +        panic("Out of PMAP slots\n");
> +
> +    __set_bit(idx, &inuse);
> +
> +    slot = idx + FIXMAP_PMAP_BEGIN;
> +    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
> +
> +    linear = FIXMAP_ADDR(slot);
> +    /*
> +     * We cannot use set_fixmap() here. We use PMAP when there is no direct map,
> +     * so map_pages_to_xen() called by set_fixmap() needs to map pages on
> +     * demand, which then calls pmap() again, resulting in a loop. Modify the
> +     * PTEs directly instead. The same is true for pmap_unmap().
> +     */
> +    entry = &xen_fixmap[third_table_offset(linear)];
> +
> +    ASSERT(!lpae_is_valid(*entry));
> +
> +    pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
> +    pte.pt.table = 1;
> +    write_pte(entry, pte);
> +
> +    local_irq_restore(flags);
> +
> +    return (void *)linear;
> +}
> +
> +void __init pmap_unmap(const void *p)
> +{
> +    unsigned long flags;
> +    unsigned int idx;
> +    lpae_t *entry;
> +    lpae_t pte = { 0 };
> +    unsigned int slot = third_table_offset((vaddr_t)p);
> +
> +    ASSERT(system_state < SYS_STATE_smp_boot);
> +    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
> +
> +    idx = slot - FIXMAP_PMAP_BEGIN;
> +    local_irq_save(flags);
> +
> +    __clear_bit(idx, &inuse);
> +    entry = &xen_fixmap[third_table_offset((vaddr_t)p)];
> +    write_pte(entry, pte);
> +    flush_xen_tlb_range_va_local((vaddr_t)p, PAGE_SIZE);
> +
> +    local_irq_restore(flags);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index 85d4a510ce8a..35050855b6e1 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -180,6 +180,8 @@
>  #define FIXMAP_MISC     1  /* Ephemeral mappings of hardware */
>  #define FIXMAP_ACPI_BEGIN  2  /* Start mappings of ACPI tables */
>  #define FIXMAP_ACPI_END    (FIXMAP_ACPI_BEGIN + NUM_FIXMAP_ACPI_PAGES - 1)  /* End mappings of ACPI tables */
> +#define FIXMAP_PMAP_BEGIN (FIXMAP_ACPI_END + 1)
> +#define FIXMAP_PMAP_END (FIXMAP_PMAP_BEGIN + NUM_FIX_PMAP - 1)
>  
>  #define NR_hypercalls 64
>  
> diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h
> index 310f5225e056..81fd482ab2ce 100644
> --- a/xen/include/asm-arm/lpae.h
> +++ b/xen/include/asm-arm/lpae.h
> @@ -4,6 +4,7 @@
>  #ifndef __ASSEMBLY__
>  
>  #include <xen/page-defs.h>
> +#include <xen/mm-frame.h>
>  
>  /*
>   * WARNING!  Unlike the x86 pagetable code, where l1 is the lowest level and
> @@ -168,6 +169,13 @@ static inline bool lpae_is_superpage(lpae_t pte, unsigned int level)
>          third_table_offset(addr)            \
>      }
>  
> +/*
> + * Standard entry type that we'll use to build Xen's own pagetables.
> + * We put the same permissions at every level, because they're ignored
> + * by the walker in non-leaf entries.
> + */
> +lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned int attr);
> +
>  #endif /* __ASSEMBLY__ */
>  
>  /*
> diff --git a/xen/include/asm-arm/pmap.h b/xen/include/asm-arm/pmap.h
> new file mode 100644
> index 000000000000..8e1dce93f8e4
> --- /dev/null
> +++ b/xen/include/asm-arm/pmap.h
> @@ -0,0 +1,10 @@
> +#ifndef __ASM_PMAP_H__
> +#define __ARM_PMAP_H__

ASM/ARM
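
i.e. presumably the two just want to match the #endif comment; a sketch of
the consistent guard:

#ifndef __ASM_PMAP_H__
#define __ASM_PMAP_H__

/* ... */

#endif	/* __ASM_PMAP_H__ */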


> +/* Large enough for mapping 5 levels of page tables with some headroom */
> +#define NUM_FIX_PMAP 8
> +
> +void *pmap_map(mfn_t mfn);
> +void pmap_unmap(const void *p);

> +#endif	/* __ASM_PMAP_H__ */

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table()
  2021-04-25 20:13 ` [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table() Julien Grall
@ 2021-05-14 23:35   ` Stefano Stabellini
  2021-05-15  9:03     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-14 23:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> During early boot, it is not possible to use xen_{,un}map_table()
> if the page tables are not residing in the Xen binary.
> 
> This is a blocker to switch some of the helpers to use xen_pt_update()
> as we may need to allocate extra page tables and access them before
> the domheap has been initialized (see setup_xenheap_mappings()).
> 
> xen_{,un}map_table() are now updated to use the PMAP helpers for early
> boot map/unmap. Note that the special case for page-tables residing
> in Xen binary has been dropped because it is "complex" and was
> only added as a workaround in 8d4f1b8878e0 ("xen/arm: mm: Allow
> generic xen page-tables helpers to be called early").
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>

In terms of boot stages:

- SYS_STATE_early_boot --> use pmap_map
- greater than SYS_STATE_early_boot --> map_domain_page

While actually pmap would be able to work as far as SYS_STATE_boot, but
we don't need it. Is it worth simplifying it by changing the checks in
the previous patch to be against SYS_STATE_early_boot?

In any case:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>



> ---
>     Changes in v2:
>         - New patch
> ---
>  xen/arch/arm/mm.c | 33 +++++++++------------------------
>  1 file changed, 9 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 5e713b599611..f5768f2d4a81 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -41,6 +41,7 @@
>  #include <xen/sizes.h>
>  #include <xen/libfdt/libfdt.h>
>  
> +#include <asm/pmap.h>
>  #include <asm/setup.h>
>  
>  /* Override macros from asm/page.h to make them work with mfn_t */
> @@ -967,27 +968,11 @@ void *ioremap(paddr_t pa, size_t len)
>  static lpae_t *xen_map_table(mfn_t mfn)
>  {
>      /*
> -     * We may require to map the page table before map_domain_page() is
> -     * useable. The requirements here is it must be useable as soon as
> -     * page-tables are allocated dynamically via alloc_boot_pages().
> -     *
> -     * We need to do the check on physical address rather than virtual
> -     * address to avoid truncation on Arm32. Therefore is_kernel() cannot
> -     * be used.
> +     * During early boot, map_domain_page() may be unusable. Use the
> +     * PMAP to map temporarily a page-table.
>       */
>      if ( system_state == SYS_STATE_early_boot )
> -    {
> -        if ( is_xen_fixed_mfn(mfn) )
> -        {
> -            /*
> -             * It is fine to demote the type because the size of Xen
> -             * will always fit in vaddr_t.
> -             */
> -            vaddr_t offset = mfn_to_maddr(mfn) - virt_to_maddr(&_start);
> -
> -            return (lpae_t *)(XEN_VIRT_START + offset);
> -        }
> -    }
> +        return pmap_map(mfn);
>  
>      return map_domain_page(mfn);
>  }
> @@ -996,12 +981,12 @@ static void xen_unmap_table(const lpae_t *table)
>  {
>      /*
>       * During early boot, xen_map_table() will not use map_domain_page()
> -     * for page-tables residing in Xen binary. So skip the unmap part.
> +     * but the PMAP.
>       */
> -    if ( system_state == SYS_STATE_early_boot && is_kernel(table) )
> -        return;
> -
> -    unmap_domain_page(table);
> +    if ( system_state == SYS_STATE_early_boot )
> +        pmap_unmap(table);
> +    else
> +        unmap_domain_page(table);


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings()
  2021-04-25 20:13 ` [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings() Julien Grall
@ 2021-05-14 23:51   ` Stefano Stabellini
  2021-05-15  9:21     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-14 23:51 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> A few issues have been reported with setup_xenheap_mappings() over the
> last couple of years. The main ones are:
>     - It will break on platform supporting more than 512GB of RAM
>       because the memory allocated by the boot allocator is not yet
>       mapped.
>     - Aligning all the regions to 1GB may lead to unexpected result
>       because we may alias non-cacheable region (such as device or reserved
>       regions).
> 
> map_pages_to_xen() was recently reworked to allow superpage mappings and
> deal with the use of page-tables before they are mapped.
> 
> Most of the code in setup_xenheap_mappings() is now replaced with a
> single call to map_pages_to_xen().
> 
> This also requires re-ordering the steps in setup_mm() so the regions are
> given to the boot allocator first and then we set up the xenheap
> mappings.

I know this is paranoia but doesn't this introduce a requirement on the
size of the first bank in bootinfo.mem.bank[] ?

It should be at least 8KB?

I know it is unlikely but it is theoretically possible to have a first
bank of just 1KB.

I think we should write the requirement down with an in-code comment?
Or better maybe we should write a message about it in the panic below?


> Note that the 1GB alignment is not yet removed.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - New patch
> 
>     TODO:
>         - Remove the 1GB alignment
>         - Add support for setting the contiguous bit
> ---
>  xen/arch/arm/mm.c    | 60 ++++----------------------------------------
>  xen/arch/arm/setup.c | 10 ++++++--

I love it!


>  2 files changed, 13 insertions(+), 57 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index f5768f2d4a81..c49403b687f5 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -143,17 +143,6 @@ static DEFINE_PAGE_TABLE(cpu0_pgtable);
>  static DEFINE_PAGE_TABLES(cpu0_dommap, DOMHEAP_SECOND_PAGES);
>  #endif
>  
> -#ifdef CONFIG_ARM_64
> -/* The first page of the first level mapping of the xenheap. The
> - * subsequent xenheap first level pages are dynamically allocated, but
> - * we need this one to bootstrap ourselves. */
> -static DEFINE_PAGE_TABLE(xenheap_first_first);
> -/* The zeroeth level slot which uses xenheap_first_first. Used because
> - * setup_xenheap_mappings otherwise relies on mfn_to_virt which isn't
> - * valid for a non-xenheap mapping. */
> -static __initdata int xenheap_first_first_slot = -1;
> -#endif
> -
>  /* Common pagetable leaves */
>  /* Second level page tables.
>   *
> @@ -818,9 +807,9 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
>  void __init setup_xenheap_mappings(unsigned long base_mfn,
>                                     unsigned long nr_mfns)
>  {
> -    lpae_t *first, pte;
>      unsigned long mfn, end_mfn;
>      vaddr_t vaddr;
> +    int rc;
>  
>      /* Align to previous 1GB boundary */
>      mfn = base_mfn & ~((FIRST_SIZE>>PAGE_SHIFT)-1);
> @@ -846,49 +835,10 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
>       */
>      vaddr = (vaddr_t)__mfn_to_virt(base_mfn) & FIRST_MASK;
>  
> -    while ( mfn < end_mfn )
> -    {
> -        int slot = zeroeth_table_offset(vaddr);
> -        lpae_t *p = &xen_pgtable[slot];
> -
> -        if ( p->pt.valid )
> -        {
> -            /* mfn_to_virt is not valid on the 1st 1st mfn, since it
> -             * is not within the xenheap. */
> -            first = slot == xenheap_first_first_slot ?
> -                xenheap_first_first : mfn_to_virt(lpae_get_mfn(*p));
> -        }
> -        else if ( xenheap_first_first_slot == -1)
> -        {
> -            /* Use xenheap_first_first to bootstrap the mappings */
> -            first = xenheap_first_first;
> -
> -            pte = pte_of_xenaddr((vaddr_t)xenheap_first_first);
> -            pte.pt.table = 1;
> -            write_pte(p, pte);
> -
> -            xenheap_first_first_slot = slot;
> -        }
> -        else
> -        {
> -            mfn_t first_mfn = alloc_boot_pages(1, 1);
> -
> -            clear_page(mfn_to_virt(first_mfn));
> -            pte = mfn_to_xen_entry(first_mfn, MT_NORMAL);
> -            pte.pt.table = 1;
> -            write_pte(p, pte);
> -            first = mfn_to_virt(first_mfn);
> -        }
> -
> -        pte = mfn_to_xen_entry(_mfn(mfn), MT_NORMAL);
> -        /* TODO: Set pte.pt.contig when appropriate. */
> -        write_pte(&first[first_table_offset(vaddr)], pte);
> -
> -        mfn += FIRST_SIZE>>PAGE_SHIFT;
> -        vaddr += FIRST_SIZE;
> -    }
> -
> -    flush_xen_tlb_local();
> +    rc = map_pages_to_xen(vaddr, _mfn(mfn), end_mfn - mfn,
> +                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
> +    if ( rc )
> +        panic("Unable to setup the xenheap mappings.\n");

This is the panic I was talking about


>  }
>  #endif
>  
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 00aad1c194b9..0993a4bb52d4 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -761,8 +761,11 @@ static void __init setup_mm(void)
>          ram_start = min(ram_start,bank_start);
>          ram_end = max(ram_end,bank_end);
>  
> -        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
> -
> +        /*
> +         * Add the region to the boot allocator first, so we can use
> +         * some to allocate page-tables for setting up the xenheap
> +         * mappings.
> +         */
>          s = bank_start;
>          while ( s < bank_end )
>          {
> @@ -781,6 +784,9 @@ static void __init setup_mm(void)
>              fw_unreserved_regions(s, e, init_boot_pages, 0);
>              s = n;
>          }
> +
> +        setup_xenheap_mappings(bank_start >> PAGE_SHIFT,
> +                               bank_size >> PAGE_SHIFT);
>      }
>  
>      total_pages += ram_size >> PAGE_SHIFT;
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen()
  2021-04-25 20:13 ` [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen() Julien Grall
@ 2021-05-15  0:02   ` Stefano Stabellini
  2021-05-15  9:25     ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-15  0:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Stefano Stabellini, Volodymyr Babchuk,
	Julien Grall

On Sun, 25 Apr 2021, Julien Grall wrote:
> From: Julien Grall <julien.grall@arm.com>
> 
> Now that map_pages_to_xen() has been extended to support 2MB mappings,
> we can replace the create_mappings() call by map_pages_to_xen() call.
> 
> This has the advantage of removing the difference between 32-bit and 64-bit
> code.
> 
> Lastly, remove create_mappings() as there are no more callers.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
>     Changes in v2:
>         - New patch
> 
>     TODO:
>         - Add support for setting the contiguous bit
> ---
>  xen/arch/arm/mm.c | 64 +++++------------------------------------------
>  1 file changed, 6 insertions(+), 58 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index c49403b687f5..5f8ae029dd6d 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -359,40 +359,6 @@ void clear_fixmap(unsigned map)
>      BUG_ON(res != 0);
>  }
>  
> -/* Create Xen's mappings of memory.
> - * Mapping_size must be either 2MB or 32MB.
> - * Base and virt must be mapping_size aligned.
> - * Size must be a multiple of mapping_size.
> - * second must be a contiguous set of second level page tables
> - * covering the region starting at virt_offset. */
> -static void __init create_mappings(lpae_t *second,
> -                                   unsigned long virt_offset,
> -                                   unsigned long base_mfn,
> -                                   unsigned long nr_mfns,
> -                                   unsigned int mapping_size)
> -{
> -    unsigned long i, count;
> -    const unsigned long granularity = mapping_size >> PAGE_SHIFT;
> -    lpae_t pte, *p;
> -
> -    ASSERT((mapping_size == MB(2)) || (mapping_size == MB(32)));
> -    ASSERT(!((virt_offset >> PAGE_SHIFT) % granularity));
> -    ASSERT(!(base_mfn % granularity));
> -    ASSERT(!(nr_mfns % granularity));
> -
> -    count = nr_mfns / LPAE_ENTRIES;
> -    p = second + second_linear_offset(virt_offset);
> -    pte = mfn_to_xen_entry(_mfn(base_mfn), MT_NORMAL);
> -    if ( granularity == 16 * LPAE_ENTRIES )
> -        pte.pt.contig = 1;  /* These maps are in 16-entry contiguous chunks. */
> -    for ( i = 0; i < count; i++ )
> -    {
> -        write_pte(p + i, pte);
> -        pte.pt.base += 1 << LPAE_SHIFT;
> -    }
> -    flush_xen_tlb_local();
> -}
> -
>  #ifdef CONFIG_DOMAIN_PAGE
>  void *map_domain_page_global(mfn_t mfn)
>  {
> @@ -850,36 +816,18 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
>      unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
>      mfn_t base_mfn;
>      const unsigned long mapping_size = frametable_size < MB(32) ? MB(2) : MB(32);
> -#ifdef CONFIG_ARM_64
> -    lpae_t *second, pte;
> -    unsigned long nr_second;
> -    mfn_t second_base;
> -    int i;
> -#endif
> +    int rc;
>  
>      frametable_base_pdx = mfn_to_pdx(maddr_to_mfn(ps));
>      /* Round up to 2M or 32M boundary, as appropriate. */
>      frametable_size = ROUNDUP(frametable_size, mapping_size);
>      base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 32<<(20-12));
>  
> -#ifdef CONFIG_ARM_64
> -    /* Compute the number of second level pages. */
> -    nr_second = ROUNDUP(frametable_size, FIRST_SIZE) >> FIRST_SHIFT;
> -    second_base = alloc_boot_pages(nr_second, 1);
> -    second = mfn_to_virt(second_base);
> -    for ( i = 0; i < nr_second; i++ )
> -    {
> -        clear_page(mfn_to_virt(mfn_add(second_base, i)));
> -        pte = mfn_to_xen_entry(mfn_add(second_base, i), MT_NORMAL);
> -        pte.pt.table = 1;
> -        write_pte(&xen_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);
> -    }
> -    create_mappings(second, 0, mfn_x(base_mfn), frametable_size >> PAGE_SHIFT,
> -                    mapping_size);
> -#else
> -    create_mappings(xen_second, FRAMETABLE_VIRT_START, mfn_x(base_mfn),
> -                    frametable_size >> PAGE_SHIFT, mapping_size);
> -#endif
> +    /* XXX: Handle contiguous bit */
> +    rc = map_pages_to_xen(FRAMETABLE_VIRT_START, base_mfn,
> +                          frametable_size >> PAGE_SHIFT, PAGE_HYPERVISOR_RW);
> +    if ( rc )
> +        panic("Unable to setup the frametable mappings.\n");

This is a lot better.

I take that "XXX: Handle contiguous bit" refers to the lack of
_PAGE_BLOCK. Why can't we just | _PAGE_BLOCK like in other places?


>      memset(&frame_table[0], 0, nr_pdxs * sizeof(struct page_info));
>      memset(&frame_table[nr_pdxs], -1,



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  2021-05-13 22:27       ` Stefano Stabellini
@ 2021-05-15  8:48         ` Julien Grall
  2021-05-18  0:37           ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-15  8:48 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 13/05/2021 23:27, Stefano Stabellini wrote:
> On Thu, 13 May 2021, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 12/05/2021 23:44, Stefano Stabellini wrote:
>>> On Sun, 25 Apr 2021, Julien Grall wrote:
>>>> From: Julien Grall <jgrall@amazon.com>
>>>>
>>>> xen_{un,}map_table() already uses the helper to map/unmap pages
>>>> on-demand (note this is currently a NOP on arm64). So switching to
>>>> domheap doesn't have any disadvantage.
>>>>
>>>> But this has the benefits:
>>>>       - to keep the page tables unmapped if an arch decided to do so
>>>>       - reduce xenheap use on arm32 which can be pretty small
>>>>
>>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>>
>>> Thanks for the patch. It looks OK but let me ask a couple of questions
>>> to clarify my doubts.
>>>
>>> This change should have no impact to arm64, right?
>>>
>>> For arm32, I wonder why we were using map_domain_page before in
>>> xen_map_table: it wasn't necessary, was it? In fact, one could even say
>>> that it was wrong?
>> In xen_map_table() we need to be able to map pages from Xen binary, xenheap...
>> On arm64, we would be able to use mfn_to_virt() because everything is mapped
>> in Xen. But that's not the case on arm32. So we need a way to map anything
>> easily.
>>
>> The only difference between xenheap and domheap is that the former is always
>> mapped while the latter may not be. So one can also view a xenheap page as a
>> glorified domheap.
>>
>> I also don't really want to create yet another interface to map pages (we have
>> vmap(), map_domain_page(), map_domain_global_page()...). So, when I
>> implemented xen_map_table() a couple of years ago, I came to the conclusion
>> that this is a convenient (ab)use of the interface.
> 
> Got it. Repeating to check if I see the full picture. On ARM64 there are
> no changes. On ARM32, at runtime there are no changes mapping/unmapping
> pages because xen_map_table is already mapping all pages using domheap,
> even xenheap pages are mapped as domheap; so this patch makes no
> difference in mapping/unmapping, correct?

For arm32, it makes a slight difference when allocating a new page table 
(we didn't call map/unmap before) but this is not called often.

The main "drop" in performance happened when xen_{,map}_table() was 
introduced.

> 
> The only difference is that on arm32 we are using domheap to allocate
> the pages, which is a different (larger) pool.

Yes.

Would you be happy to give your acked-by/reviewed-by on this basis?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure
  2021-05-14 23:25   ` Stefano Stabellini
@ 2021-05-15  8:54     ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-05-15  8:54 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Wei Liu, Volodymyr Babchuk, Hongyan Xia, Julien Grall,
	Jan Beulich, Wei Liu, Andrew Cooper, Roger Pau Monné,
	Hongian Xia

Hi Stefano,

On 15/05/2021 00:25, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Wei Liu <wei.liu2@citrix.com>
>> +extern lpae_t xen_fixmap[LPAE_ENTRIES];
>> +
>> +void *__init pmap_map(mfn_t mfn)
>> +{
>> +    unsigned long flags;
>> +    unsigned int idx;
>> +    vaddr_t linear;
>> +    unsigned int slot;
>> +    lpae_t *entry, pte;
>> +
>> +    BUILD_BUG_ON(sizeof(inuse) * BITS_PER_LONG < NUM_FIX_PMAP);
>> +
>> +    ASSERT(system_state < SYS_STATE_smp_boot);
> 
> One small concern here is that we have been using SYS_STATE_early_boot
> to switch implementation of things like xen_map_table. Between
> SYS_STATE_early_boot and SYS_STATE_smp_boot there is SYS_STATE_boot.
> 
> I guess I am wondering if instead of three potentially different mapping
> functions (<= SYS_STATE_early_boot, < SYS_STATE_smp_boot, >=
> SYS_STATE_smp_boot) we can get away with only two?

This is more flexible than the existing method to map memory when state
== SYS_STATE_early_boot. If you look at the next patch (#13), you will
see that there will be only two methods to map memory.

[...]

>> diff --git a/xen/include/asm-arm/pmap.h b/xen/include/asm-arm/pmap.h
>> new file mode 100644
>> index 000000000000..8e1dce93f8e4
>> --- /dev/null
>> +++ b/xen/include/asm-arm/pmap.h
>> @@ -0,0 +1,10 @@
>> +#ifndef __ASM_PMAP_H__
>> +#define __ARM_PMAP_H__
> 
> ASM/ARM

I will fix it.

> 
> 
>> +/* Large enough for mapping 5 levels of page tables with some headroom */
>> +#define NUM_FIX_PMAP 8
>> +
>> +void *pmap_map(mfn_t mfn);
>> +void pmap_unmap(const void *p);
> 
>> +#endif	/* __ASM_PMAP_H__ */

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table()
  2021-05-14 23:35   ` Stefano Stabellini
@ 2021-05-15  9:03     ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-05-15  9:03 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 15/05/2021 00:35, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> During early boot, it is not possible to use xen_{,un}map_table()
>> if the page tables are not residing in the Xen binary.
>>
>> This is a blocker to switch some of the helpers to use xen_pt_update()
>> as we may need to allocate extra page tables and access them before
>> the domheap has been initialized (see setup_xenheap_mappings()).
>>
>> xen_{,un}map_table() are now updated to use the PMAP helpers for early
>> boot map/unmap. Note that the special case for page-tables residing
>> in Xen binary has been dropped because it is "complex" and was
>> only added as a workaround in 8d4f1b8878e0 ("xen/arm: mm: Allow
>> generic xen page-tables helpers to be called early").
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> In terms of boot stages:
> 
> - SYS_STATE_early_boot --> use pmap_map
> - greater than SYS_STATE_early_boot --> map_domain_page
> 
> While actually pmap would be able to work as far as SYS_STATE_boot, but
> we don't need it. Is it worth simplifying it by changing the checks in
> the previous patch to be against SYS_STATE_early_boot?

We need to differentiate between when this is used and when PMAP can be 
used. The ASSERT() is here to check that the PMAP code is not used 
outside of its limit.

In the current implementation, PMAP can be used at any time until we
start bringing up secondary CPUs. So the ASSERT() is correct because it
doesn't unnecessarily restrict its use.

Note, the code is going to be moved to common code in the next revision.

> 
> In any case:
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thank you!

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings()
  2021-05-14 23:51   ` Stefano Stabellini
@ 2021-05-15  9:21     ` Julien Grall
  2021-05-18  0:50       ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-15  9:21 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk, Julien Grall

Hi,

On 15/05/2021 00:51, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <julien.grall@arm.com>
>>
>> A few issues have been reported with setup_xenheap_mappings() over the
>> last couple of years. The main ones are:
>>      - It will break on platform supporting more than 512GB of RAM
>>        because the memory allocated by the boot allocator is not yet
>>        mapped.
>>      - Aligning all the regions to 1GB may lead to unexpected result
>>        because we may alias non-cacheable region (such as device or reserved
>>        regions).
>>
>> map_pages_to_xen() was recently reworked to allow superpage mappings and
>> deal with the use of page-tables before they are mapped.
>>
>> Most of the code in setup_xenheap_mappings() is now replaced with a
>> single call to map_pages_to_xen().
>>
>> This also requires re-ordering the steps in setup_mm() so the regions are
>> given to the boot allocator first and then we set up the xenheap
>> mappings.
> 
> I know this is paranoia but doesn't this introduce a requirement on the
> size of the first bank in bootinfo.mem.bank[] ?
> 
> It should be at least 8KB?

AFAIK, the current requirement is 4KB because of the 1GB mapping. Long 
term, it would be 8KB.

> 
> I know it is unlikely but it is theoretically possible to have a first
> bank of just 1KB.

All the page allocators are working at the page granularity level. I am 
not entirely sure whether the current Xen would ignore the region or break.

Note that setup_xenheap_mappings() is taking a base MFN and a number of 
pages to map. So this doesn't look to be a new problem here.

For the 8KB requirement, we can first give all the pages to the boot
allocator and then do the mapping.

> 
> I think we should write the requirement down with an in-code comment?
> Or better maybe we should write a message about it in the panic below?

I can write an in-code comment but I think writing it down in the panic()
would be wrong because there is no promise this is going to be the only
reason it fails (imagine there is a bug in the code...).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen()
  2021-05-15  0:02   ` Stefano Stabellini
@ 2021-05-15  9:25     ` Julien Grall
  2021-05-18  0:51       ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-05-15  9:25 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk, Julien Grall

Hi Stefano,

On 15/05/2021 01:02, Stefano Stabellini wrote:
> On Sun, 25 Apr 2021, Julien Grall wrote:
>> From: Julien Grall <julien.grall@arm.com>
>>
>> Now that map_pages_to_xen() has been extended to support 2MB mappings,
>> we can replace the create_mappings() call by map_pages_to_xen() call.
>>
>> This has the advantage of removing the difference between 32-bit and 64-bit
>> code.
>>
>> Lastly, remove create_mappings() as there are no more callers.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>      Changes in v2:
>>          - New patch
>>
>>      TODO:
>>          - Add support for setting the contiguous bit
>> ---
>>   xen/arch/arm/mm.c | 64 +++++------------------------------------------
>>   1 file changed, 6 insertions(+), 58 deletions(-)
>>
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index c49403b687f5..5f8ae029dd6d 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -359,40 +359,6 @@ void clear_fixmap(unsigned map)
>>       BUG_ON(res != 0);
>>   }
>>   
>> -/* Create Xen's mappings of memory.
>> - * Mapping_size must be either 2MB or 32MB.
>> - * Base and virt must be mapping_size aligned.
>> - * Size must be a multiple of mapping_size.
>> - * second must be a contiguous set of second level page tables
>> - * covering the region starting at virt_offset. */
>> -static void __init create_mappings(lpae_t *second,
>> -                                   unsigned long virt_offset,
>> -                                   unsigned long base_mfn,
>> -                                   unsigned long nr_mfns,
>> -                                   unsigned int mapping_size)
>> -{
>> -    unsigned long i, count;
>> -    const unsigned long granularity = mapping_size >> PAGE_SHIFT;
>> -    lpae_t pte, *p;
>> -
>> -    ASSERT((mapping_size == MB(2)) || (mapping_size == MB(32)));
>> -    ASSERT(!((virt_offset >> PAGE_SHIFT) % granularity));
>> -    ASSERT(!(base_mfn % granularity));
>> -    ASSERT(!(nr_mfns % granularity));
>> -
>> -    count = nr_mfns / LPAE_ENTRIES;
>> -    p = second + second_linear_offset(virt_offset);
>> -    pte = mfn_to_xen_entry(_mfn(base_mfn), MT_NORMAL);
>> -    if ( granularity == 16 * LPAE_ENTRIES )
>> -        pte.pt.contig = 1;  /* These maps are in 16-entry contiguous chunks. */
>> -    for ( i = 0; i < count; i++ )
>> -    {
>> -        write_pte(p + i, pte);
>> -        pte.pt.base += 1 << LPAE_SHIFT;
>> -    }
>> -    flush_xen_tlb_local();
>> -}
>> -
>>   #ifdef CONFIG_DOMAIN_PAGE
>>   void *map_domain_page_global(mfn_t mfn)
>>   {
>> @@ -850,36 +816,18 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
>>       unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
>>       mfn_t base_mfn;
>>       const unsigned long mapping_size = frametable_size < MB(32) ? MB(2) : MB(32);
>> -#ifdef CONFIG_ARM_64
>> -    lpae_t *second, pte;
>> -    unsigned long nr_second;
>> -    mfn_t second_base;
>> -    int i;
>> -#endif
>> +    int rc;
>>   
>>       frametable_base_pdx = mfn_to_pdx(maddr_to_mfn(ps));
>>       /* Round up to 2M or 32M boundary, as appropriate. */
>>       frametable_size = ROUNDUP(frametable_size, mapping_size);
>>       base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 32<<(20-12));
>>   
>> -#ifdef CONFIG_ARM_64
>> -    /* Compute the number of second level pages. */
>> -    nr_second = ROUNDUP(frametable_size, FIRST_SIZE) >> FIRST_SHIFT;
>> -    second_base = alloc_boot_pages(nr_second, 1);
>> -    second = mfn_to_virt(second_base);
>> -    for ( i = 0; i < nr_second; i++ )
>> -    {
>> -        clear_page(mfn_to_virt(mfn_add(second_base, i)));
>> -        pte = mfn_to_xen_entry(mfn_add(second_base, i), MT_NORMAL);
>> -        pte.pt.table = 1;
>> -        write_pte(&xen_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);
>> -    }
>> -    create_mappings(second, 0, mfn_x(base_mfn), frametable_size >> PAGE_SHIFT,
>> -                    mapping_size);
>> -#else
>> -    create_mappings(xen_second, FRAMETABLE_VIRT_START, mfn_x(base_mfn),
>> -                    frametable_size >> PAGE_SHIFT, mapping_size);
>> -#endif
>> +    /* XXX: Handle contiguous bit */
>> +    rc = map_pages_to_xen(FRAMETABLE_VIRT_START, base_mfn,
>> +                          frametable_size >> PAGE_SHIFT, PAGE_HYPERVISOR_RW);
>> +    if ( rc )
>> +        panic("Unable to setup the frametable mappings.\n");
> 
> This is a lot better.
> 
> I take that "XXX: Handle contiguous bit" refers to the lack of
> _PAGE_BLOCK. Why can't we just | _PAGE_BLOCK like in other places?

I forgot to add _PAGE_BLOCK, however this is unrelated to my comment.

Currently, the frametable is mapped using 2MB mappings and the contiguous
bit is set for each entry if the mapping is 32MB aligned.

_PAGE_BLOCK will only create 2MB mappings but will not set the contiguous
bit. This will increase the pressure on the TLBs (we would get 16 entries
rather than 1) on systems where the TLBs can take advantage of it.

So map_pages_to_xen() needs to gain support for the contiguous bit. I
haven't yet looked at it (hence the RFC state).
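
As a purely illustrative sketch (the support does not exist yet), the check
could mirror what create_mappings() used to do and only set pte.pt.contig
when a naturally aligned 16-entry run is being mapped, e.g. for 2MB entries:

/* Hypothetical helper: 16 consecutive 2MB entries form one 32MB contiguous run */
static bool can_set_contig_bit(vaddr_t vaddr, mfn_t mfn, unsigned long nr_left)
{
    unsigned long contig_pages = 16UL << SECOND_ORDER; /* pages covered by 32MB */

    return !(vaddr & (contig_pages * PAGE_SIZE - 1)) &&
           !(mfn_x(mfn) & (contig_pages - 1)) &&
           (nr_left >= contig_pages);
}

The page-table walk would then presumably set the bit on each of the 16
entries of such a run.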

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap
  2021-05-15  8:48         ` Julien Grall
@ 2021-05-18  0:37           ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-18  0:37 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk

On Sat, 15 May 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 13/05/2021 23:27, Stefano Stabellini wrote:
> > On Thu, 13 May 2021, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 12/05/2021 23:44, Stefano Stabellini wrote:
> > > > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > > > From: Julien Grall <jgrall@amazon.com>
> > > > > 
> > > > > xen_{un,}map_table() already uses the helper to map/unmap pages
> > > > > on-demand (note this is currently a NOP on arm64). So switching to
> > > > > domheap doesn't have any disadvantage.
> > > > > 
> > > > > But this has the benefits:
> > > > >       - to keep the page tables unmapped if an arch decided to do so
> > > > >       - reduce xenheap use on arm32 which can be pretty small
> > > > > 
> > > > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > > > 
> > > > Thanks for the patch. It looks OK but let me ask a couple of questions
> > > > to clarify my doubts.
> > > > 
> > > > This change should have no impact to arm64, right?
> > > > 
> > > > For arm32, I wonder why we were using map_domain_page before in
> > > > xen_map_table: it wasn't necessary, was it? In fact, one could even say
> > > > that it was wrong?
> > > In xen_map_table() we need to be able to map pages from Xen binary,
> > > xenheap...
> > > On arm64, we would be able to use mfn_to_virt() because everything is
> > > mapped
> > > in Xen. But that's not the case on arm32. So we need a way to map anything
> > > easily.
> > > 
> > > The only difference between xenheap and domheap is that the former is always
> > > mapped while the latter may not be. So one can also view a xenheap page as
> > > a
> > > glorified domheap.
> > > 
> > > I also don't really want to create yet another interface to map pages (we
> > > have
> > > vmap(), map_domain_page(), map_domain_global_page()...). So, when I
> > > implemented xen_map_table() a couple of years ago, I came to the
> > > conclusion
> > > that this is a convenient (ab)use of the interface.
> > 
> > Got it. Repeating to check if I see the full picture. On ARM64 there are
> > no changes. On ARM32, at runtime there are no changes mapping/unmapping
> > pages because xen_map_table is already mapping all pages using domheap,
> > even xenheap pages are mapped as domheap; so this patch makes no
> > difference in mapping/unmapping, correct?
> 
> For arm32, it makes a slight difference when allocating a new page table (we
> didn't call map/unmap before) but this is not called often.
> 
> The main "drop" in performance happened when xen_{,map}_table() was
> introduced.
> 
> > 
> > The only difference is that on arm32 we are using domheap to allocate
> > the pages, which is a different (larger) pool.
> 
> Yes.
> 
> Would you be happy to give your acked-by/reviewed-by on this basis?

Yes

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings()
  2021-05-15  9:21     ` Julien Grall
@ 2021-05-18  0:50       ` Stefano Stabellini
  2022-02-12 19:16         ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-18  0:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk, Julien Grall

On Sat, 15 May 2021, Julien Grall wrote:
> Hi,
> 
> On 15/05/2021 00:51, Stefano Stabellini wrote:
> > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > From: Julien Grall <julien.grall@arm.com>
> > > 
> > > A few issues have been reported with setup_xenheap_mappings() over the
> > > last couple of years. The main ones are:
> > >      - It will break on platform supporting more than 512GB of RAM
> > >        because the memory allocated by the boot allocator is not yet
> > >        mapped.
> > >      - Aligning all the regions to 1GB may lead to unexpected result
> > >        because we may alias non-cacheable region (such as device or
> > > reserved
> > >        regions).
> > > 
> > > map_pages_to_xen() was recently reworked to allow superpage mappings and
> > > deal with the use of page-tables before they are mapped.
> > > 
> > > Most of the code in setup_xenheap_mappings() is now replaced with a
> > > single call to map_pages_to_xen().
> > > 
> > > This also requires re-ordering the steps in setup_mm() so the regions are
> > > given to the boot allocator first and then we set up the xenheap
> > > mappings.
> > 
> > I know this is paranoia but doesn't this introduce a requirement on the
> > size of the first bank in bootinfo.mem.bank[] ?
> > 
> > It should be at least 8KB?
> 
> AFAIK, the current requirement is 4KB because of the 1GB mapping. Long term,
> it would be 8KB.
> 
> > 
> > I know it is unlikely but it is theoretically possible to have a first
> > bank of just 1KB.
> 
> All the page allocators are working at the page granularity level. I am not
> entirely sure whether the current Xen would ignore the region or break.
> 
> Note that setup_xenheap_mappings() is taking a base MFN and a number of pages
> to map. So this doesn't look to be a new problem here.

Yeah... the example of the first bank being 1KB is wrong because, like
you wrote, it wouldn't work before your patches either, and probably it
will never work.

Maybe a better example is a first bank of 4KB exactly.


> For the 8KB requirement, we can look at first all the pages to the boot
> allocator and then do the mapping.
> 
> > 
> > I think we should write the requirement down with an in-code comment?
> > Or better maybe we should write a message about it in the panic below?
> 
> I can write an in-code comment but I think writing down in the panic() would
> be wrong because there is no promise this is going to be the only reason it
> fails (imagine there is a bug in the code...).

Maybe it is sufficient to print out the error code (EINVAL / ENOMEM etc)
in the panic. If you see -12 you know what to look for.
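
For illustration, something along these lines (hypothetical, not part of the
patch; it simply reuses the map_pages_to_xen() call already shown above):

    rc = map_pages_to_xen(vaddr, _mfn(mfn), end_mfn - mfn,
                          PAGE_HYPERVISOR_RW | _PAGE_BLOCK);
    if ( rc )
        panic("Unable to setup the xenheap mappings: rc=%d\n", rc);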

Looking into it I realize that we are not returning -ENOMEM in case of
alloc_boot_pages failures in create_xen_table. It looks like we would
hit a BUG() in the implementation of alloc_boot_pages. Maybe that's good
enough as is then.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen()
  2021-05-15  9:25     ` Julien Grall
@ 2021-05-18  0:51       ` Stefano Stabellini
  0 siblings, 0 replies; 59+ messages in thread
From: Stefano Stabellini @ 2021-05-18  0:51 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk, Julien Grall

On Sat, 15 May 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 15/05/2021 01:02, Stefano Stabellini wrote:
> > On Sun, 25 Apr 2021, Julien Grall wrote:
> > > From: Julien Grall <julien.grall@arm.com>
> > > 
> > > Now that map_pages_to_xen() has been extended to support 2MB mappings,
> > > we can replace the create_mappings() call by map_pages_to_xen() call.
> > > 
> > > This has the advantage of removing the difference between 32-bit and 64-bit
> > > code.
> > > 
> > > Lastly, remove create_mappings() as there are no more callers.
> > > 
> > > Signed-off-by: Julien Grall <julien.grall@arm.com>
> > > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > > 
> > > ---
> > >      Changes in v2:
> > >          - New patch
> > > 
> > >      TODO:
> > >          - Add support for setting the contiguous bit
> > > ---
> > >   xen/arch/arm/mm.c | 64 +++++------------------------------------------
> > >   1 file changed, 6 insertions(+), 58 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > > index c49403b687f5..5f8ae029dd6d 100644
> > > --- a/xen/arch/arm/mm.c
> > > +++ b/xen/arch/arm/mm.c
> > > @@ -359,40 +359,6 @@ void clear_fixmap(unsigned map)
> > >       BUG_ON(res != 0);
> > >   }
> > >   -/* Create Xen's mappings of memory.
> > > - * Mapping_size must be either 2MB or 32MB.
> > > - * Base and virt must be mapping_size aligned.
> > > - * Size must be a multiple of mapping_size.
> > > - * second must be a contiguous set of second level page tables
> > > - * covering the region starting at virt_offset. */
> > > -static void __init create_mappings(lpae_t *second,
> > > -                                   unsigned long virt_offset,
> > > -                                   unsigned long base_mfn,
> > > -                                   unsigned long nr_mfns,
> > > -                                   unsigned int mapping_size)
> > > -{
> > > -    unsigned long i, count;
> > > -    const unsigned long granularity = mapping_size >> PAGE_SHIFT;
> > > -    lpae_t pte, *p;
> > > -
> > > -    ASSERT((mapping_size == MB(2)) || (mapping_size == MB(32)));
> > > -    ASSERT(!((virt_offset >> PAGE_SHIFT) % granularity));
> > > -    ASSERT(!(base_mfn % granularity));
> > > -    ASSERT(!(nr_mfns % granularity));
> > > -
> > > -    count = nr_mfns / LPAE_ENTRIES;
> > > -    p = second + second_linear_offset(virt_offset);
> > > -    pte = mfn_to_xen_entry(_mfn(base_mfn), MT_NORMAL);
> > > -    if ( granularity == 16 * LPAE_ENTRIES )
> > > -        pte.pt.contig = 1;  /* These maps are in 16-entry contiguous
> > > chunks. */
> > > -    for ( i = 0; i < count; i++ )
> > > -    {
> > > -        write_pte(p + i, pte);
> > > -        pte.pt.base += 1 << LPAE_SHIFT;
> > > -    }
> > > -    flush_xen_tlb_local();
> > > -}
> > > -
> > >   #ifdef CONFIG_DOMAIN_PAGE
> > >   void *map_domain_page_global(mfn_t mfn)
> > >   {
> > > @@ -850,36 +816,18 @@ void __init setup_frametable_mappings(paddr_t ps,
> > > paddr_t pe)
> > >       unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
> > >       mfn_t base_mfn;
> > >       const unsigned long mapping_size = frametable_size < MB(32) ? MB(2)
> > > : MB(32);
> > > -#ifdef CONFIG_ARM_64
> > > -    lpae_t *second, pte;
> > > -    unsigned long nr_second;
> > > -    mfn_t second_base;
> > > -    int i;
> > > -#endif
> > > +    int rc;
> > >         frametable_base_pdx = mfn_to_pdx(maddr_to_mfn(ps));
> > >       /* Round up to 2M or 32M boundary, as appropriate. */
> > >       frametable_size = ROUNDUP(frametable_size, mapping_size);
> > >       base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT,
> > > 32<<(20-12));
> > >   -#ifdef CONFIG_ARM_64
> > > -    /* Compute the number of second level pages. */
> > > -    nr_second = ROUNDUP(frametable_size, FIRST_SIZE) >> FIRST_SHIFT;
> > > -    second_base = alloc_boot_pages(nr_second, 1);
> > > -    second = mfn_to_virt(second_base);
> > > -    for ( i = 0; i < nr_second; i++ )
> > > -    {
> > > -        clear_page(mfn_to_virt(mfn_add(second_base, i)));
> > > -        pte = mfn_to_xen_entry(mfn_add(second_base, i), MT_NORMAL);
> > > -        pte.pt.table = 1;
> > > -
> > > write_pte(&xen_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);
> > > -    }
> > > -    create_mappings(second, 0, mfn_x(base_mfn), frametable_size >>
> > > PAGE_SHIFT,
> > > -                    mapping_size);
> > > -#else
> > > -    create_mappings(xen_second, FRAMETABLE_VIRT_START, mfn_x(base_mfn),
> > > -                    frametable_size >> PAGE_SHIFT, mapping_size);
> > > -#endif
> > > +    /* XXX: Handle contiguous bit */
> > > +    rc = map_pages_to_xen(FRAMETABLE_VIRT_START, base_mfn,
> > > +                          frametable_size >> PAGE_SHIFT,
> > > PAGE_HYPERVISOR_RW);
> > > +    if ( rc )
> > > +        panic("Unable to setup the frametable mappings.\n");
> > 
> > This is a lot better.
> > 
> > I take that "XXX: Handle contiguous bit" refers to the lack of
> > _PAGE_BLOCK. Why can't we just | _PAGE_BLOCK like in other places?
> 
> I forgot to add _PAGE_BLOCK, however this is unrelated to my comment.
> 
> Currently, the frametable is mapped using 2MB mapping and setting the
> contiguous bit for each entry if the mapping is 32MB aligned.
> 
> _PAGE_BLOCK will only create 2MB mapping but will not set the contiguous bit.
> This will increase the pressure on the TLBs (we would get 16 entry rather than
> 1) if on system where the TLBs can take advantange of it.
> 
> So map_pages_to_xen() needs to gain support for contiguous bit. I haven't yet
> looked at it (hence the RFC state).

I see, thanks for the explanation


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-05-13 22:44           ` Stefano Stabellini
@ 2021-07-03 17:32             ` Julien Grall
  2021-07-13 20:53               ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2021-07-03 17:32 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

Sorry for the late answer.

On 13/05/2021 23:44, Stefano Stabellini wrote:
> On Wed, 12 May 2021, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 12/05/2021 22:30, Stefano Stabellini wrote:
>>> On Wed, 12 May 2021, Julien Grall wrote:
>>>>>> +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
>>>>>> +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
>>>>>> +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>>>>>>
>>>>>> +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
>>>>>> +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
>>>>>> +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
>>>>>> +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
>>>>>
>>>>> I would avoid adding these 4 macros. It would be OK if they were just
>>>>> used within this file but lpae.h is a header: they could end up be used
>>>>> anywhere in the xen/ code and they have a very generic name. My
>>>>> suggestion would be to skip them and just do:
>>>>
>>>> Those macros will be used in follow-up patches. They are pretty useful to
>>>> avoid introducing static arrays with the different information for each
>>>> level.
>>>>
>>>> Would prefix them with XEN_ be better?
>>>
>>> Maybe. The concern I have is that there are multiple page granularities
>>> (4kb, 16kb, etc) and multiple page sizes (4kb, 2mb, etc). If I just see
>>> LEVEL_ORDER it is not immediately obvious what granularity and what size
>>> we are talking about.
>>
>> I am a bit puzzled with your answer. AFAIU, you are happy with the existing
>> macros (THIRD_*, SECOND_*) but not with the new macros.
>>
>> In reality, there is no difference because THIRD_* doesn't tell you the exact
>> size but only "this is a level 3 mapping".
>>
>> So can you clarify what you are after? IOW is it reworking the current naming
>> scheme?
> 
> You are right -- there is no real difference between THIRD_*, SECOND_*
> and LEVEL_*.
> 
> The original reason for my comments is that I hadn't read the following
> patches, and the definition of LEVEL_* macros is simple, they could be
> open coded. It looked like they were only going to be used to make the
> definition of THIRD_*, SECOND_* a bit easier. So, at first, I was
> wondering if they were needed at all.
> 
> Secondly, I realized that they were going to be used in *.c files by
> other patches. That's why they are there. But I started thinking whether
> we should find a way to make it a bit clearer that they are for Xen
> pages, currently at 4KB granularity. THIRD_*, SECOND_*, etc. are already
> generic names which don't convey the granularity or whether they are Xen
> pages at all. But LEVEL_* seem even more generic.
> 
> As I mentioned, I don't have any good suggestions for changes to make
> here, so unless you can come up with a good idea let's keep it as is.

I am thinking of using the following naming (diff on top of this patch):

-#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
-#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
-#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
+#define XEN_PT_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
+#define XEN_PT_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
+#define XEN_PT_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)

-#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
-#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
-#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
-#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
+#define XEN_PT_LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
+#define XEN_PT_LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
+#define XEN_PT_LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
+#define XEN_PT_LEVEL_MASK(lvl)     (~(XEN_PT_LEVEL_SIZE(lvl) - 1))

  /* Convenience aliases */
-#define THIRD_SHIFT         LEVEL_SHIFT(3)
-#define THIRD_ORDER         LEVEL_ORDER(3)
-#define THIRD_SIZE          LEVEL_SIZE(3)
-#define THIRD_MASK          LEVEL_MASK(3)
-
-#define SECOND_SHIFT        LEVEL_SHIFT(2)
-#define SECOND_ORDER        LEVEL_ORDER(2)
-#define SECOND_SIZE         LEVEL_SIZE(2)
-#define SECOND_MASK         LEVEL_MASK(2)
-
-#define FIRST_SHIFT         LEVEL_SHIFT(1)
-#define FIRST_ORDER         LEVEL_ORDER(1)
-#define FIRST_SIZE          LEVEL_SIZE(1)
-#define FIRST_MASK          LEVEL_MASK(1)
-
-#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
-#define ZEROETH_ORDER       LEVEL_ORDER(0)
-#define ZEROETH_SIZE        LEVEL_SIZE(0)
-#define ZEROETH_MASK        LEVEL_MASK(0)
+#define THIRD_SHIFT         XEN_PT_LEVEL_SHIFT(3)
+#define THIRD_ORDER         XEN_PT_LEVEL_ORDER(3)
+#define THIRD_SIZE          XEN_PT_LEVEL_SIZE(3)
+#define THIRD_MASK          XEN_PT_LEVEL_MASK(3)
+
+#define SECOND_SHIFT        XEN_PT_LEVEL_SHIFT(2)
+#define SECOND_ORDER        XEN_PT_LEVEL_ORDER(2)
+#define SECOND_SIZE         XEN_PT_LEVEL_SIZE(2)
+#define SECOND_MASK         XEN_PT_LEVEL_MASK(2)
+
+#define FIRST_SHIFT         XEN_PT_LEVEL_SHIFT(1)
+#define FIRST_ORDER         XEN_PT_LEVEL_ORDER(1)
+#define FIRST_SIZE          XEN_PT_LEVEL_SIZE(1)
+#define FIRST_MASK          XEN_PT_LEVEL_MASK(1)
+
+#define ZEROETH_SHIFT       XEN_PT_LEVEL_SHIFT(0)
+#define ZEROETH_ORDER       XEN_PT_LEVEL_ORDER(0)
+#define ZEROETH_SIZE        XEN_PT_LEVEL_SIZE(0)
+#define ZEROETH_MASK        XEN_PT_LEVEL_MASK(0)

I don't plan to modify the naming for ZEROETH*, FIRST*, SECOND*, THIRD*.

Let me know if you prefer it over the current naming.
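
Purely as an illustration of how the renamed helpers would read at a call
site (hypothetical snippet, not from the series):

    /* number of 4KB pages covered by one level-2 (2MB) entry */
    unsigned long nr_pages = 1UL << XEN_PT_LEVEL_ORDER(2); /* == 1UL << SECOND_ORDER */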

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure
  2021-04-26  9:41   ` Xia, Hongyan
@ 2021-07-03 17:57     ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-07-03 17:57 UTC (permalink / raw)
  To: Xia, Hongyan, xen-devel
  Cc: Penny.Zheng, Bertrand.Marquis, Wei.Chen, sstabellini, wei.liu2,
	andrew.cooper3, roger.pau, Volodymyr_Babchuk, hongyax,
	Henry.Wang, wl, Grall, Julien, jbeulich

Hi Hongyan,

On 26/04/2021 10:41, Xia, Hongyan wrote:
> On Sun, 2021-04-25 at 21:13 +0100, Julien Grall wrote:
>> From: Wei Liu <wei.liu2@citrix.com>
>>
>> The basic idea is like Persistent Kernel Map (PKMAP) in Linux. We
>> pre-populate all the relevant page tables before the system is fully
>> set up.
>>
>> We will need it on Arm in order to rework the arm64 version of
>> xenheap_setup_mappings() as we may need to use pages allocated from
>> the boot allocator before they are effectively mapped.
>>
>> This infrastructure is not lock-protected therefore can only be used
>> before smpboot. After smpboot, map_domain_page() has to be used.
>>
>> This is based on the x86 version [1] that was originally implemented
>> by Wei Liu.
>>
>> Take the opportunity to switch the parameter attr from unsigned to
>> unsigned int.
>>
>> [1] <
>> e92da4ad6015b6089737fcccba3ec1d6424649a5.1588278317.git.hongyxia@amazon.com
>>>
>>
>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
>> [julien: Adapted for Arm]
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> [...]
> 
>> diff --git a/xen/arch/arm/pmap.c b/xen/arch/arm/pmap.c
>> new file mode 100644
>> index 000000000000..702b1bde982d
>> --- /dev/null
>> +++ b/xen/arch/arm/pmap.c
>> @@ -0,0 +1,101 @@
>> +#include <xen/init.h>
>> +#include <xen/mm.h>
>> +
>> +#include <asm/bitops.h>
>> +#include <asm/flushtlb.h>
>> +#include <asm/pmap.h>
>> +
>> +/*
>> + * To be able to use FIXMAP_PMAP_BEGIN.
>> + * XXX: move fixmap definition in a separate header
>> + */
>> +#include <xen/acpi.h>
>> +
>> +/*
>> + * Simple mapping infrastructure to map / unmap pages in fixed map.
>> + * This is used to set up the page table for mapcache, which is used
>> + * by map domain page infrastructure.
>> + *
>> + * This structure is not protected by any locks, so it must not be
>> used after
>> + * smp bring-up.
>> + */
>> +
>> +/* Bitmap to track which slot is used */
>> +static unsigned long __initdata inuse;
>> +
>> +/* XXX: Find a header to declare it */
>> +extern lpae_t xen_fixmap[LPAE_ENTRIES];
>> +
>> +void *__init pmap_map(mfn_t mfn)
>> +{
>> +    unsigned long flags;
>> +    unsigned int idx;
>> +    vaddr_t linear;
>> +    unsigned int slot;
>> +    lpae_t *entry, pte;
>> +
>> +    BUILD_BUG_ON(sizeof(inuse) * BITS_PER_LONG < NUM_FIX_PMAP);
> 
> This seems wrong to me. It should multiply with something like
> BITS_PER_BYTE.

Good spot! I have updated my tree.
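
For reference, a sketch of what the corrected check presumably becomes (assuming BITS_PER_BYTE is usable in this context; otherwise a literal 8 would do):

/* 'inuse' is a single unsigned long, so it provides BITS_PER_LONG slots. */
BUILD_BUG_ON(sizeof(inuse) * BITS_PER_BYTE < NUM_FIX_PMAP);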

> 
> I noticed this line was already present before the Arm version so
> probably my fault :(, which also needs to be fixed.

This should be taken care of, as the next version will create the pmap in 
common code :).

> 
>> +
>> +    ASSERT(system_state < SYS_STATE_smp_boot);
>> +
>> +    local_irq_save(flags);
>> +
>> +    idx = find_first_zero_bit(&inuse, NUM_FIX_PMAP);
>> +    if ( idx == NUM_FIX_PMAP )
>> +        panic("Out of PMAP slots\n");
>> +
>> +    __set_bit(idx, &inuse);
>> +
>> +    slot = idx + FIXMAP_PMAP_BEGIN;
>> +    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
>> +
> 
>  From here...
> 
>> +    linear = FIXMAP_ADDR(slot);
>> +    /*
>> +     * We cannot use set_fixmap() here. We use PMAP when there is no
>> direct map,
>> +     * so map_pages_to_xen() called by set_fixmap() needs to map
>> pages on
>> +     * demand, which then calls pmap() again, resulting in a loop.
>> Modify the
>> +     * PTEs directly instead. The same is true for pmap_unmap().
>> +     */
>> +    entry = &xen_fixmap[third_table_offset(linear)];
>> +
>> +    ASSERT(!lpae_is_valid(*entry));
>> +
>> +    pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
>> +    pte.pt.table = 1;
>> +    write_pte(entry, pte);
>> +
> 
> ...to here, I wonder if we can move this chunk into arch (like void
> *arch_write_pmap_slot(slot)). Such an arch function hides how fixmap is
> handled and how page table entry is written behind arch, and the rest
> can just be common.

This is similar to what I had in mind. Let me give it a try for the next 
version.
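
As an illustration only, one possible shape for such an arch helper on Arm, reusing the fixmap code quoted above. The name arch_write_pmap_slot() is just the one suggested in the review (here taking the MFN as an extra parameter); the final interface may well differ:

static void *__init arch_write_pmap_slot(unsigned int slot, mfn_t mfn)
{
    lpae_t *entry, pte;
    vaddr_t linear = FIXMAP_ADDR(slot);

    /* Write the fixmap entry directly; set_fixmap() cannot be used here. */
    entry = &xen_fixmap[third_table_offset(linear)];
    ASSERT(!lpae_is_valid(*entry));

    pte = mfn_to_xen_entry(mfn, PAGE_HYPERVISOR_RW);
    pte.pt.table = 1;
    write_pte(entry, pte);

    return (void *)linear;
}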

> 
>> +    local_irq_restore(flags);
>> +
>> +    return (void *)linear;
>> +}
>> +
>> +void __init pmap_unmap(const void *p)
>> +{
>> +    unsigned long flags;
>> +    unsigned int idx;
>> +    lpae_t *entry;
>> +    lpae_t pte = { 0 };
>> +    unsigned int slot = third_table_offset((vaddr_t)p);
>> +
>> +    ASSERT(system_state < SYS_STATE_smp_boot);
>> +    ASSERT(slot >= FIXMAP_PMAP_BEGIN && slot <= FIXMAP_PMAP_END);
>> +
>> +    idx = slot - FIXMAP_PMAP_BEGIN;
>> +    local_irq_save(flags);
>> +
>> +    __clear_bit(idx, &inuse);
>> +    entry = &xen_fixmap[third_table_offset((vaddr_t)p)];
>> +    write_pte(entry, pte);
>> +    flush_xen_tlb_range_va_local((vaddr_t)p, PAGE_SIZE);
> 
> and the same for the above, something like arch_clear_pmap(void *) and
> the rest into common.
> 
>  From a quick glance, I don't think x86 and Arm share any useful TLB
> flush helpers? So the TLB flush probably should be behind arch as well.

We could potentially define flush_tlb_one_local() on Arm. But I am not 
sure it is worth it, because the page-table manipulation mostly happens 
in arch code so far, although this might change in the future.
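
For illustration, the matching unmap side as an arch helper, again reusing the code quoted above. arch_clear_pmap() is just the name suggested in the review; keeping the TLB flush here leaves it arch-specific, as discussed:

static void __init arch_clear_pmap(void *p)
{
    lpae_t pte = { 0 };
    lpae_t *entry = &xen_fixmap[third_table_offset((vaddr_t)p)];

    /* Clear the fixmap entry and flush the stale mapping locally. */
    write_pte(entry, pte);
    flush_xen_tlb_range_va_local((vaddr_t)p, PAGE_SIZE);
}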

> 
>> +
>> +    local_irq_restore(flags);
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
> 
> [...]
> 
>> diff --git a/xen/include/asm-arm/pmap.h b/xen/include/asm-arm/pmap.h
>> new file mode 100644
>> index 000000000000..8e1dce93f8e4
>> --- /dev/null
>> +++ b/xen/include/asm-arm/pmap.h
>> @@ -0,0 +1,10 @@
>> +#ifndef __ASM_PMAP_H__
>> +#define __ARM_PMAP_H__
> 
> This line doesn't seem to match the #ifndef, but if the functions are
> moved to common, this header can be moved to common as well.

Stefano pointed out the same. I have fixed it in my tree, but I will 
likely rename the guard as the header will be moved to common code.
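
For example, once the header moves to common code the guard could plausibly end up as something like the following (the guard name is only a guess, and the includes needed for mfn_t are omitted):

#ifndef __XEN_PMAP_H__
#define __XEN_PMAP_H__

void *pmap_map(mfn_t mfn);
void pmap_unmap(const void *p);

#endif /* __XEN_PMAP_H__ */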

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-07-03 17:32             ` Julien Grall
@ 2021-07-13 20:53               ` Stefano Stabellini
  2021-07-14 17:40                 ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2021-07-13 20:53 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng,
	Bertrand.Marquis, Julien Grall, Volodymyr Babchuk

On Sat, 3 Jul 2021, Julien Grall wrote:
> Hi Stefano,
> 
> Sorry for the late answer.
> 
> On 13/05/2021 23:44, Stefano Stabellini wrote:
> > On Wed, 12 May 2021, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 12/05/2021 22:30, Stefano Stabellini wrote:
> > > > On Wed, 12 May 2021, Julien Grall wrote:
> > > > > > > +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
> > > > > > > +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
> > > > > > > +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
> > > > > > > 
> > > > > > > +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
> > > > > > > +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
> > > > > > > +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
> > > > > > > +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
> > > > > > 
> > > > > > I would avoid adding these 4 macros. It would be OK if they were
> > > > > > just
> > > > > > used within this file but lpae.h is a header: they could end up be
> > > > > > used
> > > > > > anywhere in the xen/ code and they have a very generic name. My
> > > > > > suggestion would be to skip them and just do:
> > > > > 
> > > > > Those macros will be used in follow-up patches. They are pretty useful
> > > > > to
> > > > > avoid introduce static array with the different information for each
> > > > > level.
> > > > > 
> > > > > Would prefix them with XEN_ be better?
> > > > 
> > > > Maybe. The concern I have is that there are multiple page granularities
> > > > (4kb, 16kb, etc) and multiple page sizes (4kb, 2mb, etc). If I just see
> > > > LEVEL_ORDER it is not immediately obvious what granularity and what size
> > > > we are talking about.
> > > 
> > > I am a bit puzzled with your answer. AFAIU, you are happy with the
> > > existing
> > > macros (THIRD_*, SECOND_*) but not with the new macros.
> > > 
> > > In reality, there is no difference because THIRD_* doesn't tell you the
> > > exact
> > > size but only "this is a level 3 mapping".
> > > 
> > > So can you clarify what you are after? IOW is it reworking the current
> > > naming
> > > scheme?
> > 
> > You are right -- there is no real difference between THIRD_*, SECOND_*
> > and LEVEL_*.
> > 
> > The original reason for my comments is that I hadn't read the following
> > patches, and the definition of LEVEL_* macros is simple, they could be
> > open coded. It looked like they were only going to be used to make the
> > definition of THIRD_*, SECOND_* a bit easier. So, at first, I was
> > wondering if they were needed at all.
> > 
> > Secondly, I realized that they were going to be used in *.c files by
> > other patches. That's why they are there. But I started thinking whether
> > we should find a way to make it a bit clearer that they are for Xen
> > pages, currently at 4KB granularity. THIRD_*, SECOND_*, etc. are already
> > generic names which don't convey the granularity or whether they are Xen
> > pages at all. But LEVEL_* seem even more generic.
> > 
> > As I mentioned, I don't have any good suggestions for changes to make
> > here, so unless you can come up with a good idea let's keep it as is.
> 
> I am thinking to use the following naming (diff on top of this patch):
> 
> -#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
> -#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
> -#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
> +#define XEN_PT_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
> +#define XEN_PT_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
> +#define XEN_PT_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
> 
> -#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
> -#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
> -#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
> -#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
> +#define XEN_PT_LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
> +#define XEN_PT_LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
> +#define XEN_PT_LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
> +#define XEN_PT_LEVEL_MASK(lvl)    (~(XEN_PT_LEVEL_SIZE(lvl) - 1))
> 
>  /* Convenience aliases */
> -#define THIRD_SHIFT         LEVEL_SHIFT(3)
> -#define THIRD_ORDER         LEVEL_ORDER(3)
> -#define THIRD_SIZE          LEVEL_SIZE(3)
> -#define THIRD_MASK          LEVEL_MASK(3)
> -
> -#define SECOND_SHIFT        LEVEL_SHIFT(2)
> -#define SECOND_ORDER        LEVEL_ORDER(2)
> -#define SECOND_SIZE         LEVEL_SIZE(2)
> -#define SECOND_MASK         LEVEL_MASK(2)
> -
> -#define FIRST_SHIFT         LEVEL_SHIFT(1)
> -#define FIRST_ORDER         LEVEL_ORDER(1)
> -#define FIRST_SIZE          LEVEL_SIZE(1)
> -#define FIRST_MASK          LEVEL_MASK(1)
> -
> -#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
> -#define ZEROETH_ORDER       LEVEL_ORDER(0)
> -#define ZEROETH_SIZE        LEVEL_SIZE(0)
> -#define ZEROETH_MASK        LEVEL_MASK(0)
> +#define THIRD_SHIFT         XEN_PT_LEVEL_SHIFT(3)
> +#define THIRD_ORDER         XEN_PT_LEVEL_ORDER(3)
> +#define THIRD_SIZE          XEN_PT_LEVEL_SIZE(3)
> +#define THIRD_MASK          XEN_PT_LEVEL_MASK(3)
> +
> +#define SECOND_SHIFT        XEN_PT_LEVEL_SHIFT(2)
> +#define SECOND_ORDER        XEN_PT_LEVEL_ORDER(2)
> +#define SECOND_SIZE         XEN_PT_LEVEL_SIZE(2)
> +#define SECOND_MASK         XEN_PT_LEVEL_MASK(2)
> +
> +#define FIRST_SHIFT         XEN_PT_LEVEL_SHIFT(1)
> +#define FIRST_ORDER         XEN_PT_LEVEL_ORDER(1)
> +#define FIRST_SIZE          XEN_PT_LEVEL_SIZE(1)
> +#define FIRST_MASK          XEN_PT_LEVEL_MASK(1)
> +
> +#define ZEROETH_SHIFT       XEN_PT_LEVEL_SHIFT(0)
> +#define ZEROETH_ORDER       XEN_PT_LEVEL_ORDER(0)
> +#define ZEROETH_SIZE        XEN_PT_LEVEL_SIZE(0)
> +#define ZEROETH_MASK        XEN_PT_LEVEL_MASK(0)
> 
> I don't plan to modify the naming for ZEROETH*, FIRST*, SECOND*, THIRD*.
> 
> Let me know if you prefer it over the current naming.

Yes, I think it is better, thanks!


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers
  2021-07-13 20:53               ` Stefano Stabellini
@ 2021-07-14 17:40                 ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2021-07-14 17:40 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk

Hi Stefano,

On 13/07/2021 21:53, Stefano Stabellini wrote:
> On Sat, 3 Jul 2021, Julien Grall wrote:
>> Hi Stefano,
>>
>> Sorry for the late answer.
>>
>> On 13/05/2021 23:44, Stefano Stabellini wrote:
>>> On Wed, 12 May 2021, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 12/05/2021 22:30, Stefano Stabellini wrote:
>>>>> On Wed, 12 May 2021, Julien Grall wrote:
>>>>>>>> +#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
>>>>>>>> +#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
>>>>>>>> +#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>>>>>>>>
>>>>>>>> +#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
>>>>>>>> +#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
>>>>>>>> +#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
>>>>>>>> +#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
>>>>>>>
>>>>>>> I would avoid adding these 4 macros. It would be OK if they were
>>>>>>> just
>>>>>>> used within this file but lpae.h is a header: they could end up be
>>>>>>> used
>>>>>>> anywhere in the xen/ code and they have a very generic name. My
>>>>>>> suggestion would be to skip them and just do:
>>>>>>
>>>>>> Those macros will be used in follow-up patches. They are pretty useful
>>>>>> to
>>>>>> avoid introduce static array with the different information for each
>>>>>> level.
>>>>>>
>>>>>> Would prefix them with XEN_ be better?
>>>>>
>>>>> Maybe. The concern I have is that there are multiple page granularities
>>>>> (4kb, 16kb, etc) and multiple page sizes (4kb, 2mb, etc). If I just see
>>>>> LEVEL_ORDER it is not immediately obvious what granularity and what size
>>>>> we are talking about.
>>>>
>>>> I am a bit puzzled with your answer. AFAIU, you are happy with the
>>>> existing
>>>> macros (THIRD_*, SECOND_*) but not with the new macros.
>>>>
>>>> In reality, there is no difference because THIRD_* doesn't tell you the
>>>> exact
>>>> size but only "this is a level 3 mapping".
>>>>
>>>> So can you clarify what you are after? IOW is it reworking the current
>>>> naming
>>>> scheme?
>>>
>>> You are right -- there is no real difference between THIRD_*, SECOND_*
>>> and LEVEL_*.
>>>
>>> The original reason for my comments is that I hadn't read the following
>>> patches, and the definition of LEVEL_* macros is simple, they could be
>>> open coded. It looked like they were only going to be used to make the
>>> definition of THIRD_*, SECOND_* a bit easier. So, at first, I was
>>> wondering if they were needed at all.
>>>
>>> Secondly, I realized that they were going to be used in *.c files by
>>> other patches. That's why they are there. But I started thinking whether
>>> we should find a way to make it a bit clearer that they are for Xen
>>> pages, currently at 4KB granularity. THIRD_*, SECOND_*, etc. are already
>>> generic names which don't convey the granularity or whether they are Xen
>>> pages at all. But LEVEL_* seem even more generic.
>>>
>>> As I mentioned, I don't have any good suggestions for changes to make
>>> here, so unless you can come up with a good idea let's keep it as is.
>>
>> I am thinking to use the following naming (diff on top of this patch):
>>
>> -#define LPAE_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
>> -#define LPAE_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
>> -#define LPAE_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>> +#define XEN_PT_SHIFT          LPAE_SHIFT_GS(PAGE_SHIFT)
>> +#define XEN_PT_ENTRIES        LPAE_ENTRIES_GS(PAGE_SHIFT)
>> +#define XEN_PT_ENTRY_MASK     LPAE_ENTRY_MASK_GS(PAGE_SHIFT)
>>
>> -#define LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
>> -#define LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
>> -#define LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
>> -#define LEVEL_MASK(lvl)     (~(LEVEL_SIZE(lvl) - 1))
>> +#define XEN_PT_LEVEL_SHIFT(lvl)    LEVEL_SHIFT_GS(PAGE_SHIFT, lvl)
>> +#define XEN_PT_LEVEL_ORDER(lvl)    LEVEL_ORDER_GS(PAGE_SHIFT, lvl)
>> +#define XEN_PT_LEVEL_SIZE(lvl)     LEVEL_SIZE_GS(PAGE_SHIFT, lvl)
>> +#define XEN_PT_LEVEL_MASK(lvl)     (~(XEN_PT_LEVEL_SIZE(lvl) - 1))
>>
>>   /* Convenience aliases */
>> -#define THIRD_SHIFT         LEVEL_SHIFT(3)
>> -#define THIRD_ORDER         LEVEL_ORDER(3)
>> -#define THIRD_SIZE          LEVEL_SIZE(3)
>> -#define THIRD_MASK          LEVEL_MASK(3)
>> -
>> -#define SECOND_SHIFT        LEVEL_SHIFT(2)
>> -#define SECOND_ORDER        LEVEL_ORDER(2)
>> -#define SECOND_SIZE         LEVEL_SIZE(2)
>> -#define SECOND_MASK         LEVEL_MASK(2)
>> -
>> -#define FIRST_SHIFT         LEVEL_SHIFT(1)
>> -#define FIRST_ORDER         LEVEL_ORDER(1)
>> -#define FIRST_SIZE          LEVEL_SIZE(1)
>> -#define FIRST_MASK          LEVEL_MASK(1)
>> -
>> -#define ZEROETH_SHIFT       LEVEL_SHIFT(0)
>> -#define ZEROETH_ORDER       LEVEL_ORDER(0)
>> -#define ZEROETH_SIZE        LEVEL_SIZE(0)
>> -#define ZEROETH_MASK        LEVEL_MASK(0)
>> +#define THIRD_SHIFT         XEN_PT_LEVEL_SHIFT(3)
>> +#define THIRD_ORDER         XEN_PT_LEVEL_ORDER(3)
>> +#define THIRD_SIZE          XEN_PT_LEVEL_SIZE(3)
>> +#define THIRD_MASK          XEN_PT_LEVEL_MASK(3)
>> +
>> +#define SECOND_SHIFT        XEN_PT_LEVEL_SHIFT(2)
>> +#define SECOND_ORDER        XEN_PT_LEVEL_ORDER(2)
>> +#define SECOND_SIZE         XEN_PT_LEVEL_SIZE(2)
>> +#define SECOND_MASK         XEN_PT_LEVEL_MASK(2)
>> +
>> +#define FIRST_SHIFT         XEN_PT_LEVEL_SHIFT(1)
>> +#define FIRST_ORDER         XEN_PT_LEVEL_ORDER(1)
>> +#define FIRST_SIZE          XEN_PT_LEVEL_SIZE(1)
>> +#define FIRST_MASK          XEN_PT_LEVEL_MASK(1)
>> +
>> +#define ZEROETH_SHIFT       XEN_PT_LEVEL_SHIFT(0)
>> +#define ZEROETH_ORDER       XEN_PT_LEVEL_ORDER(0)
>> +#define ZEROETH_SIZE        XEN_PT_LEVEL_SIZE(0)
>> +#define ZEROETH_MASK        XEN_PT_LEVEL_MASK(0)
>>
>> I don't plan to modify the naming for ZEROETH*, FIRST*, SECOND*, THIRD*.
>>
>> Let me know if you prefer it over the current naming.
> 
> Yes, I think it is better, thanks!

Ok. I will try to respin the series soon.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings()
  2021-05-18  0:50       ` Stefano Stabellini
@ 2022-02-12 19:16         ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2022-02-12 19:16 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Wei.Chen, Henry.Wang, Penny.Zheng, Bertrand.Marquis,
	Julien Grall, Volodymyr Babchuk, Julien Grall

Hi,

Sorry for the late answer. I have finally picked up that series again and 
am now preparing a new version.

On 18/05/2021 01:50, Stefano Stabellini wrote:
> On Sat, 15 May 2021, Julien Grall wrote:
>> Hi,
>>
>> On 15/05/2021 00:51, Stefano Stabellini wrote:
>>> On Sun, 25 Apr 2021, Julien Grall wrote:
>>>> From: Julien Grall <julien.grall@arm.com>
>>>>
>>>> A few issues have been reported with setup_xenheap_mappings() over the
>>>> last couple of years. The main ones are:
>>>>       - It will break on platform supporting more than 512GB of RAM
>>>>         because the memory allocated by the boot allocator is not yet
>>>>         mapped.
>>>>       - Aligning all the regions to 1GB may lead to unexpected result
>>>>         because we may alias non-cacheable region (such as device or
>>>> reserved
>>>>         regions).
>>>>
>>>> map_pages_to_xen() was recently reworked to allow superpage mappings and
>>>> deal with the use of page-tables before they are mapped.
>>>>
>>>> Most of the code in setup_xenheap_mappings() is now replaced with a
>>>> single call to map_pages_to_xen().
>>>>
>>>> This also require to re-order the steps setup_mm() so the regions are
>>>> given to the boot allocator first and then we setup the xenheap
>>>> mappings.
>>>
>>> I know this is paranoia but doesn't this introduce a requirement on the
>>> size of the first bank in bootinfo.mem.bank[] ?
>>>
>>> It should be at least 8KB?
>>
>> AFAIK, the current requirement is 4KB because of the 1GB mapping. Long term,
>> it would be 8KB.
>>
>>>
>>> I know it is unlikely but it is theoretically possible to have a first
>>> bank of just 1KB.
>>
>> All the page allocators are working at the page granularity level. I am not
>> entirely sure whether the current Xen would ignore the region or break.
>>
>> Note that setup_xenheap_mappings() is taking a base MFN and a number of pages
>> to map. So this doesn't look to be a new problem here.
> 
> Yeah... the example of the first bank being 1KB is wrong because, like
> you wrote, it wouldn't work before your patches either, and probably it
> will never work.
> 
> Maybe a better example is a first bank of 4KB exactly.

I have done more testing with the 1GB alignment dropped. The 
restrictions are a bit more complicated.

Not all the memory in a bank will go to the boot allocator. This can 
happen if some of the memory has already been allocated for another 
purpose (e.g. modules, reserved areas...).

So the region needs enough free memory for the boot allocator to provide 
the page tables required to map the entire region. The amount needed 
depends on the size of the region.

So I will split the loop into two separate loops: the first loop will add 
all the available pages to the boot allocator, and the second loop will 
actually do the mapping, as sketched below.
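
A rough sketch of that structure for the arm64 setup_mm(), only to illustrate the ordering. It reuses existing names (bootinfo, struct membank, init_boot_pages(), setup_xenheap_mappings(), PFN_DOWN()) but glosses over skipping modules/reserved regions, so the real patch will differ:

unsigned int i;

/* Pass 1: hand the free pages of every bank to the boot allocator first. */
for ( i = 0; i < bootinfo.mem.nr_banks; i++ )
{
    const struct membank *bank = &bootinfo.mem.bank[i];

    /* In reality only the parts not covered by modules/reserved areas. */
    init_boot_pages(bank->start, bank->start + bank->size);
}

/* Pass 2: map each bank; page tables can now come from the boot allocator. */
for ( i = 0; i < bootinfo.mem.nr_banks; i++ )
{
    const struct membank *bank = &bootinfo.mem.bank[i];

    setup_xenheap_mappings(PFN_DOWN(bank->start), PFN_DOWN(bank->size));
}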

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2022-02-12 19:16 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-25 20:13 [PATCH RFCv2 00/15] xen/arm: mm: Remove open-coding mappings Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 01/15] xen/arm: lpae: Rename LPAE_ENTRIES_MASK_GS to LPAE_ENTRY_MASK_GS Julien Grall
2021-05-11 22:12   ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 02/15] xen/arm: lpae: Use the generic helpers to defined the Xen PT helpers Julien Grall
2021-05-11 22:26   ` Stefano Stabellini
2021-05-12 14:26     ` Julien Grall
2021-05-12 21:30       ` Stefano Stabellini
2021-05-12 22:16         ` Julien Grall
2021-05-13 22:44           ` Stefano Stabellini
2021-07-03 17:32             ` Julien Grall
2021-07-13 20:53               ` Stefano Stabellini
2021-07-14 17:40                 ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 03/15] xen/arm: p2m: Replace level_{orders, masks} arrays with LEVEL_{ORDER, MASK} Julien Grall
2021-05-11 22:33   ` Stefano Stabellini
2021-05-12 14:39     ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 04/15] xen/arm: mm: Allow other mapping size in xen_pt_update_entry() Julien Grall
2021-05-11 22:42   ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 05/15] xen/arm: mm: Avoid flushing the TLBs when mapping are inserted Julien Grall
2021-05-11 22:51   ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 06/15] xen/arm: mm: Don't open-code Xen PT update in remove_early_mappings() Julien Grall
2021-05-11 22:58   ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 07/15] xen/arm: mm: Re-implement early_fdt_map() using map_pages_to_xen() Julien Grall
2021-05-12 21:41   ` Stefano Stabellini
2021-05-12 22:18     ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 08/15] xen/arm32: mm: Check if the virtual address is shared before updating it Julien Grall
2021-05-12 22:00   ` Stefano Stabellini
2021-05-12 22:23     ` Julien Grall
2021-05-13 22:32       ` Stefano Stabellini
2021-05-13 22:59         ` Julien Grall
2021-05-14  1:04           ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 09/15] xen/arm32: mm: Re-implement setup_xenheap_mappings() using map_pages_to_xen() Julien Grall
2021-05-12 22:07   ` Stefano Stabellini
2021-05-13 17:55     ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 10/15] xen/arm: mm: Allocate xen page tables in domheap rather than xenheap Julien Grall
2021-05-12 22:44   ` Stefano Stabellini
2021-05-13 18:09     ` Julien Grall
2021-05-13 22:27       ` Stefano Stabellini
2021-05-15  8:48         ` Julien Grall
2021-05-18  0:37           ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 11/15] xen/arm: mm: Allow page-table allocation from the boot allocator Julien Grall
2021-05-13 22:58   ` Stefano Stabellini
2021-04-25 20:13 ` [PATCH RFCv2 12/15] xen/arm: add Persistent Map (PMAP) infrastructure Julien Grall
2021-04-26  9:41   ` Xia, Hongyan
2021-07-03 17:57     ` Julien Grall
2021-04-28 21:47   ` Wei Liu
2021-05-14 23:25   ` Stefano Stabellini
2021-05-15  8:54     ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 13/15] xen/arm: mm: Use the PMAP helpers in xen_{,un}map_table() Julien Grall
2021-05-14 23:35   ` Stefano Stabellini
2021-05-15  9:03     ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 14/15] xen/arm: mm: Rework setup_xenheap_mappings() Julien Grall
2021-05-14 23:51   ` Stefano Stabellini
2021-05-15  9:21     ` Julien Grall
2021-05-18  0:50       ` Stefano Stabellini
2022-02-12 19:16         ` Julien Grall
2021-04-25 20:13 ` [PATCH RFCv2 15/15] xen/arm: mm: Re-implement setup_frame_table_mappings() with map_pages_to_xen() Julien Grall
2021-05-15  0:02   ` Stefano Stabellini
2021-05-15  9:25     ` Julien Grall
2021-05-18  0:51       ` Stefano Stabellini
