* [PATCH v2 0/5] Make PDX compression optional
@ 2023-07-28  7:58 Alejandro Vallejo
  2023-07-28  7:58 ` [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps Alejandro Vallejo
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Alejandro Vallejo @ 2023-07-28  7:58 UTC (permalink / raw)
  To: Xen-devel
  Cc: Alejandro Vallejo, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
	George Dunlap, Jan Beulich, Wei Liu, Roger Pau Monné

Currently there's a CONFIG_HAS_PDX Kconfig option, but it's impossible to
disable it because the whole codebase performs unconditional
compression/decompression operations on addresses. This has the
unfortunate side effect that systems with no need for compression still
pay the performance cost of juggling bits on every pfn<->pdx conversion
(each of which reads several global variables). This series attempts to:

  * Leave the state of pdx and pdx compression documented
  * Factor out compression so it _can_ be removed through Kconfig
  * Make it so compression is disabled on x86 and enabled on both arm32
    and arm64 by default.
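
For illustration, a rough sketch of what the hot path looks like with and
without compression (using the mask/shift names from xen/include/xen/pdx.h;
see the patches below for the exact definitions):

    /* With compression: every conversion reads several globals. */
    static inline unsigned long pfn_to_pdx(unsigned long pfn)
    {
        return (pfn & pfn_pdx_bottom_mask) |
               ((pfn & pfn_top_mask) >> pfn_pdx_hole_shift);
    }

    /* With compression compiled out (patch 5): a plain identity. */
    #define pfn_to_pdx(pfn) (pfn)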

Series summary:

Patch 1 Clarifies in ARM code the discrepancies between the concept of
        "directmap" in arm32 and arm64 that are relevant to this series
        (i.e. why this series doesn't touch arm32 in directmap accesses).
Patch 2 Moves hard-coded compression-related logic to helper functions.
Patch 3 Refactors all instances of regions being validated for pdx
        compression conformance so it's done through a single helper.
Patch 4 Reorders pdx.[ch] (no functional change) to simplify the diff of
        patch 5.
Patch 5 Adds a new Kconfig option to compile out PDX compression and
        removes the old CONFIG_HAS_PDX, as it was not removable in
        practice.

Already committed:

v1/patch 1 Documents the current general understanding of the pdx concept
           and of pdx compression in particular
v1/patch 3 Marks the pdx compression globals as __ro_after_init

Alejandro Vallejo (5):
  arm/mm: Document the differences between arm32 and arm64 directmaps
  mm: Factor out the pdx compression logic in ma/va converters
  mm/pdx: Standardize region validation wrt pdx compression
  pdx: Reorder pdx.[ch]
  pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option

 xen/arch/arm/Kconfig                   |   1 -
 xen/arch/arm/include/asm/mm.h          |  29 +++++-
 xen/arch/x86/Kconfig                   |   1 -
 xen/arch/x86/domain.c                  |  19 ++--
 xen/arch/x86/include/asm/x86_64/page.h |  28 +++---
 xen/arch/x86/x86_64/mm.c               |   7 +-
 xen/common/Kconfig                     |  13 ++-
 xen/common/Makefile                    |   2 +-
 xen/common/efi/boot.c                  |  13 ++-
 xen/common/pdx.c                       |  75 +++++++++------
 xen/include/xen/pdx.h                  | 126 +++++++++++++++++++------
 11 files changed, 218 insertions(+), 96 deletions(-)

-- 
2.34.1




* [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps
  2023-07-28  7:58 [PATCH v2 0/5] Make PDX compression optional Alejandro Vallejo
@ 2023-07-28  7:58 ` Alejandro Vallejo
  2023-07-28  9:05   ` Julien Grall
  2023-07-28  7:59 ` [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters Alejandro Vallejo
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Alejandro Vallejo @ 2023-07-28  7:58 UTC (permalink / raw)
  To: Xen-devel
  Cc: Alejandro Vallejo, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk

The arm32 directmap merely covers the XENHEAP, whereas the arm64 directmap
currently covers anything in the frame table. These comments highlight why
arm32 doesn't need to account for PDX compression in its __va()
implementation while arm64 does.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Removed statement about "containing GiB" (Julien)
---
 xen/arch/arm/include/asm/mm.h | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 4262165ce2..5b530f0f40 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -280,6 +280,19 @@ static inline paddr_t __virt_to_maddr(vaddr_t va)
 #define virt_to_maddr(va)   __virt_to_maddr((vaddr_t)(va))
 
 #ifdef CONFIG_ARM_32
+/**
+ * Find the virtual address corresponding to a machine address
+ *
+ * Only memory backing the XENHEAP has a corresponding virtual address to
+ * be found. This is so we can save precious virtual space, as it's in
+ * short supply on arm32. This mapping is not subject to PDX compression
+ * because XENHEAP is known to be physically contiguous and can't hence
+ * jump over the PDX hole. This means we can avoid the roundtrips
+ * converting to/from pdx.
+ *
+ * @param ma Machine address
+ * @return Virtual address mapped to `ma`
+ */
 static inline void *maddr_to_virt(paddr_t ma)
 {
     ASSERT(is_xen_heap_mfn(maddr_to_mfn(ma)));
@@ -287,6 +300,19 @@ static inline void *maddr_to_virt(paddr_t ma)
     return (void *)(unsigned long) ma + XENHEAP_VIRT_START;
 }
 #else
+/**
+ * Find the virtual address corresponding to a machine address
+ *
+ * The directmap covers all conventional memory accessible by the
+ * hypervisor. This means it's subject to PDX compression.
+ *
+ * Note there's an extra offset applied (directmap_base_pdx) on top of the
+ * regular PDX compression logic. Its purpose is to skip over the initial
+ * range of non-existing memory, should there be one.
+ *
+ * @param ma Machine address
+ * @return Virtual address mapped to `ma`
+ */
 static inline void *maddr_to_virt(paddr_t ma)
 {
     ASSERT((mfn_to_pdx(maddr_to_mfn(ma)) - directmap_base_pdx) <
-- 
2.34.1




* [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters
  2023-07-28  7:58 [PATCH v2 0/5] Make PDX compression optional Alejandro Vallejo
  2023-07-28  7:58 ` [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps Alejandro Vallejo
@ 2023-07-28  7:59 ` Alejandro Vallejo
  2023-07-28  9:07   ` Julien Grall
  2023-07-31 15:15   ` Jan Beulich
  2023-07-28  7:59 ` [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression Alejandro Vallejo
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Alejandro Vallejo @ 2023-07-28  7:59 UTC (permalink / raw)
  To: Xen-devel
  Cc: Alejandro Vallejo, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
	George Dunlap, Jan Beulich, Wei Liu, Roger Pau Monné

This patch factors out the pdx compression logic hardcoded in both ports
for the maddr<->vaddr conversion functions.

Touches both x86 and arm ports.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Cast variable to u64 before shifting left to avoid overflow (Julien)
---
 xen/arch/arm/include/asm/mm.h          |  3 +--
 xen/arch/x86/include/asm/x86_64/page.h | 28 +++++++++++---------------
 xen/include/xen/pdx.h                  | 25 +++++++++++++++++++++++
 3 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 5b530f0f40..c0d7f0f181 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -319,8 +319,7 @@ static inline void *maddr_to_virt(paddr_t ma)
            (DIRECTMAP_SIZE >> PAGE_SHIFT));
     return (void *)(XENHEAP_VIRT_START -
                     (directmap_base_pdx << PAGE_SHIFT) +
-                    ((ma & ma_va_bottom_mask) |
-                     ((ma & ma_top_mask) >> pfn_pdx_hole_shift)));
+                    maddr_to_directmapoff(ma));
 }
 #endif
 
diff --git a/xen/arch/x86/include/asm/x86_64/page.h b/xen/arch/x86/include/asm/x86_64/page.h
index 53faa7875b..b589c93e77 100644
--- a/xen/arch/x86/include/asm/x86_64/page.h
+++ b/xen/arch/x86/include/asm/x86_64/page.h
@@ -36,26 +36,22 @@ static inline unsigned long __virt_to_maddr(unsigned long va)
 {
     ASSERT(va < DIRECTMAP_VIRT_END);
     if ( va >= DIRECTMAP_VIRT_START )
-        va -= DIRECTMAP_VIRT_START;
-    else
-    {
-        BUILD_BUG_ON(XEN_VIRT_END - XEN_VIRT_START != GB(1));
-        /* Signed, so ((long)XEN_VIRT_START >> 30) fits in an imm32. */
-        ASSERT(((long)va >> (PAGE_ORDER_1G + PAGE_SHIFT)) ==
-               ((long)XEN_VIRT_START >> (PAGE_ORDER_1G + PAGE_SHIFT)));
-
-        va += xen_phys_start - XEN_VIRT_START;
-    }
-    return (va & ma_va_bottom_mask) |
-           ((va << pfn_pdx_hole_shift) & ma_top_mask);
+        return directmapoff_to_maddr(va - DIRECTMAP_VIRT_START);
+
+    BUILD_BUG_ON(XEN_VIRT_END - XEN_VIRT_START != GB(1));
+    /* Signed, so ((long)XEN_VIRT_START >> 30) fits in an imm32. */
+    ASSERT(((long)va >> (PAGE_ORDER_1G + PAGE_SHIFT)) ==
+           ((long)XEN_VIRT_START >> (PAGE_ORDER_1G + PAGE_SHIFT)));
+
+    return xen_phys_start + va - XEN_VIRT_START;
 }
 
 static inline void *__maddr_to_virt(unsigned long ma)
 {
-    ASSERT(pfn_to_pdx(ma >> PAGE_SHIFT) < (DIRECTMAP_SIZE >> PAGE_SHIFT));
-    return (void *)(DIRECTMAP_VIRT_START +
-                    ((ma & ma_va_bottom_mask) |
-                     ((ma & ma_top_mask) >> pfn_pdx_hole_shift)));
+    /* Offset in the direct map, accounting for pdx compression */
+    size_t va_offset = maddr_to_directmapoff(ma);
+    ASSERT(va_offset < DIRECTMAP_SIZE);
+    return (void *)(DIRECTMAP_VIRT_START + va_offset);
 }
 
 /* read access (should only be used for debug printk's) */
diff --git a/xen/include/xen/pdx.h b/xen/include/xen/pdx.h
index de5439a5e5..d96f03d6e6 100644
--- a/xen/include/xen/pdx.h
+++ b/xen/include/xen/pdx.h
@@ -160,6 +160,31 @@ static inline unsigned long pdx_to_pfn(unsigned long pdx)
 #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
 #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
 
+/**
+ * Computes the offset into the direct map of an maddr
+ *
+ * @param ma Machine address
+ * @return Offset on the direct map where that
+ *         machine address can be accessed
+ */
+static inline unsigned long maddr_to_directmapoff(uint64_t ma)
+{
+    return ((ma & ma_top_mask) >> pfn_pdx_hole_shift) |
+           (ma & ma_va_bottom_mask);
+}
+
+/**
+ * Computes a machine address given a direct map offset
+ *
+ * @param offset Offset into the direct map
+ * @return Corresponding machine address of that virtual location
+ */
+static inline uint64_t directmapoff_to_maddr(unsigned long offset)
+{
+    return (((uint64_t)offset << pfn_pdx_hole_shift) & ma_top_mask) |
+           (offset & ma_va_bottom_mask);
+}
+
 /**
  * Initializes global variables with information about the compressible
  * range of the current memory regions.
-- 
2.34.1




* [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression
  2023-07-28  7:58 [PATCH v2 0/5] Make PDX compression optional Alejandro Vallejo
  2023-07-28  7:58 ` [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps Alejandro Vallejo
  2023-07-28  7:59 ` [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters Alejandro Vallejo
@ 2023-07-28  7:59 ` Alejandro Vallejo
  2023-07-28 16:19   ` Julien Grall
  2023-07-31 15:27   ` Jan Beulich
  2023-07-28  7:59 ` [PATCH v2 4/5] pdx: Reorder pdx.[ch] Alejandro Vallejo
  2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
  4 siblings, 2 replies; 29+ messages in thread
From: Alejandro Vallejo @ 2023-07-28  7:59 UTC (permalink / raw)
  To: Xen-devel
  Cc: Alejandro Vallejo, Jan Beulich, Andrew Cooper,
	Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini

Regions must occasionally be validated for pdx compression conformance. That
is, whether any of the machine addresses spanning the region have a bit set
in the pdx "hole" (which is expected to always contain zeroes). There are
a few such tests through the code, and they all check for different things.

This patch replaces all such occurrences with a call to a centralized
function that checks a region for validity.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * s/ocurrences/occurrences in commit message (Julien)
  * Use pfn_to_paddr()/paddr_to_pfn() (Julien)
  * Use (paddr_t,unsigned long) in pdx_is_region_compressible() (Julien, Jan)
---
 xen/arch/x86/x86_64/mm.c |  7 +++++--
 xen/common/efi/boot.c    | 13 ++++++++++---
 xen/common/pdx.c         | 10 ++++++++--
 xen/include/xen/pdx.h    |  9 +++++++++
 4 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 60db439af3..652e787934 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1159,6 +1159,9 @@ static int mem_hotadd_check(unsigned long spfn, unsigned long epfn)
 {
     unsigned long s, e, length, sidx, eidx;
 
+    paddr_t mem_base = pfn_to_paddr(spfn);
+    unsigned long mem_npages = epfn - spfn;
+
     if ( (spfn >= epfn) )
         return 0;
 
@@ -1168,7 +1171,7 @@ static int mem_hotadd_check(unsigned long spfn, unsigned long epfn)
     if ( (spfn | epfn) & ((1UL << PAGETABLE_ORDER) - 1) )
         return 0;
 
-    if ( (spfn | epfn) & pfn_hole_mask )
+    if ( !pdx_is_region_compressible(mem_base, mem_npages) )
         return 0;
 
     /* Make sure the new range is not present now */
@@ -1207,7 +1210,7 @@ static int mem_hotadd_check(unsigned long spfn, unsigned long epfn)
 
     length += (e - s) * sizeof(struct page_info);
 
-    if ((length >> PAGE_SHIFT) > (epfn - spfn))
+    if ((length >> PAGE_SHIFT) > mem_npages)
         return 0;
 
     return 1;
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 79a654af69..52a7239389 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -14,6 +14,7 @@
 #include <xen/multiboot.h>
 #include <xen/param.h>
 #include <xen/pci_regs.h>
+#include <xen/pdx.h>
 #include <xen/pfn.h>
 #if EFI_PAGE_SIZE != PAGE_SIZE
 # error Cannot use xen/pfn.h here!
@@ -1645,9 +1646,11 @@ static __init void copy_mapping(unsigned long mfn, unsigned long end,
 
 static bool __init cf_check ram_range_valid(unsigned long smfn, unsigned long emfn)
 {
+    paddr_t ram_base = pfn_to_paddr(smfn);
+    unsigned long ram_npages = emfn - smfn;
     unsigned long sz = pfn_to_pdx(emfn - 1) / PDX_GROUP_COUNT + 1;
 
-    return !(smfn & pfn_hole_mask) &&
+    return pdx_is_region_compressible(ram_base, ram_npages) &&
            find_next_bit(pdx_group_valid, sz,
                          pfn_to_pdx(smfn) / PDX_GROUP_COUNT) < sz;
 }
@@ -1660,6 +1663,8 @@ static bool __init cf_check rt_range_valid(unsigned long smfn, unsigned long emf
 
 void __init efi_init_memory(void)
 {
+    paddr_t mem_base;
+    unsigned long mem_npages;
     unsigned int i;
     l4_pgentry_t *efi_l4t;
     struct rt_extra {
@@ -1732,6 +1737,9 @@ void __init efi_init_memory(void)
         smfn = PFN_DOWN(desc->PhysicalStart);
         emfn = PFN_UP(desc->PhysicalStart + len);
 
+        mem_base = pfn_to_paddr(smfn);
+        mem_npages = emfn - smfn;
+
         if ( desc->Attribute & EFI_MEMORY_WB )
             prot |= _PAGE_WB;
         else if ( desc->Attribute & EFI_MEMORY_WT )
@@ -1759,8 +1767,7 @@ void __init efi_init_memory(void)
             prot |= _PAGE_NX;
 
         if ( pfn_to_pdx(emfn - 1) < (DIRECTMAP_SIZE >> PAGE_SHIFT) &&
-             !(smfn & pfn_hole_mask) &&
-             !((smfn ^ (emfn - 1)) & ~pfn_pdx_bottom_mask) )
+             pdx_is_region_compressible(mem_base, mem_npages))
         {
             if ( (unsigned long)mfn_to_virt(emfn - 1) >= HYPERVISOR_VIRT_END )
                 prot &= ~_PAGE_GLOBAL;
diff --git a/xen/common/pdx.c b/xen/common/pdx.c
index 99d4a90a50..3c88ceeb9c 100644
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -88,7 +88,7 @@ bool __mfn_valid(unsigned long mfn)
 }
 
 /* Sets all bits from the most-significant 1-bit down to the LSB */
-static uint64_t __init fill_mask(uint64_t mask)
+static uint64_t fill_mask(uint64_t mask)
 {
     while (mask & (mask + 1))
         mask |= mask + 1;
@@ -96,6 +96,12 @@ static uint64_t __init fill_mask(uint64_t mask)
     return mask;
 }
 
+bool pdx_is_region_compressible(paddr_t base, unsigned long npages)
+{
+    return !(paddr_to_pfn(base) & pfn_hole_mask) &&
+           !(pdx_region_mask(base, npages * PAGE_SIZE) & ~ma_va_bottom_mask);
+}
+
 /* We don't want to compress the low MAX_ORDER bits of the addresses. */
 uint64_t __init pdx_init_mask(uint64_t base_addr)
 {
@@ -103,7 +109,7 @@ uint64_t __init pdx_init_mask(uint64_t base_addr)
                          (uint64_t)1 << (MAX_ORDER + PAGE_SHIFT)) - 1);
 }
 
-uint64_t __init pdx_region_mask(uint64_t base, uint64_t len)
+uint64_t pdx_region_mask(uint64_t base, uint64_t len)
 {
     /*
      * We say a bit "moves" in a range if there exist 2 addresses in that
diff --git a/xen/include/xen/pdx.h b/xen/include/xen/pdx.h
index d96f03d6e6..8c6aec2aea 100644
--- a/xen/include/xen/pdx.h
+++ b/xen/include/xen/pdx.h
@@ -79,6 +79,15 @@ extern unsigned long pfn_top_mask, ma_top_mask;
                          (sizeof(*frame_table) & -sizeof(*frame_table)))
 extern unsigned long pdx_group_valid[];
 
+/**
+ * Validate a region's compatibility with the current compression runtime
+ *
+ * @param base Base address of the region
+ * @param npages Number of PAGE_SIZE-sized pages in the region
+ * @return True iff the region can be used with the current compression
+ */
+bool pdx_is_region_compressible(paddr_t base, unsigned long npages);
+
 /**
  * Calculates a mask covering "moving" bits of all addresses of a region
  *
-- 
2.34.1




* [PATCH v2 4/5] pdx: Reorder pdx.[ch]
  2023-07-28  7:58 [PATCH v2 0/5] Make PDX compression optional Alejandro Vallejo
                   ` (2 preceding siblings ...)
  2023-07-28  7:59 ` [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression Alejandro Vallejo
@ 2023-07-28  7:59 ` Alejandro Vallejo
  2023-07-28 16:20   ` Julien Grall
  2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
  4 siblings, 1 reply; 29+ messages in thread
From: Alejandro Vallejo @ 2023-07-28  7:59 UTC (permalink / raw)
  To: Xen-devel
  Cc: Alejandro Vallejo, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

The next patch compiles out compression-related chunks, and it's helpful to
have them grouped together beforehand.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Fix rebase error by which some function prototypes were left intact
    when they should've been removed
---
 xen/common/pdx.c      | 58 +++++++++++++++++++++---------------------
 xen/include/xen/pdx.h | 59 ++++++++++++++++++++++---------------------
 2 files changed, 59 insertions(+), 58 deletions(-)

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
index 3c88ceeb9c..d3d38965bd 100644
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -20,6 +20,35 @@
 #include <xen/bitops.h>
 #include <xen/nospec.h>
 
+/**
+ * Maximum (non-inclusive) usable pdx. Must be
+ * modifiable after init due to memory hotplug
+ */
+unsigned long __read_mostly max_pdx;
+
+unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
+    (FRAMETABLE_NR + PDX_GROUP_COUNT - 1) / PDX_GROUP_COUNT)] = { [0] = 1 };
+
+bool __mfn_valid(unsigned long mfn)
+{
+    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
+        return false;
+    return likely(!(mfn & pfn_hole_mask)) &&
+           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
+                           pdx_group_valid));
+}
+
+void set_pdx_range(unsigned long smfn, unsigned long emfn)
+{
+    unsigned long idx, eidx;
+
+    idx = pfn_to_pdx(smfn) / PDX_GROUP_COUNT;
+    eidx = (pfn_to_pdx(emfn - 1) + PDX_GROUP_COUNT) / PDX_GROUP_COUNT;
+
+    for ( ; idx < eidx; ++idx )
+        __set_bit(idx, pdx_group_valid);
+}
+
 /*
  * Diagram to make sense of the following variables. The masks and shifts
  * are done on mfn values in order to convert to/from pdx:
@@ -47,12 +76,6 @@
  * ones.
  */
 
-/**
- * Maximum (non-inclusive) usable pdx. Must be
- * modifiable after init due to memory hotplug
- */
-unsigned long __read_mostly max_pdx;
-
 /** Mask for the lower non-compressible bits of an mfn */
 unsigned long __ro_after_init pfn_pdx_bottom_mask = ~0UL;
 
@@ -75,18 +98,6 @@ unsigned long __ro_after_init pfn_hole_mask = 0;
 /** Number of bits of the "compressible" bit slice of an mfn */
 unsigned int __ro_after_init pfn_pdx_hole_shift = 0;
 
-unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
-    (FRAMETABLE_NR + PDX_GROUP_COUNT - 1) / PDX_GROUP_COUNT)] = { [0] = 1 };
-
-bool __mfn_valid(unsigned long mfn)
-{
-    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
-        return false;
-    return likely(!(mfn & pfn_hole_mask)) &&
-           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
-                           pdx_group_valid));
-}
-
 /* Sets all bits from the most-significant 1-bit down to the LSB */
 static uint64_t fill_mask(uint64_t mask)
 {
@@ -124,17 +135,6 @@ uint64_t pdx_region_mask(uint64_t base, uint64_t len)
     return fill_mask(base ^ (base + len - 1));
 }
 
-void set_pdx_range(unsigned long smfn, unsigned long emfn)
-{
-    unsigned long idx, eidx;
-
-    idx = pfn_to_pdx(smfn) / PDX_GROUP_COUNT;
-    eidx = (pfn_to_pdx(emfn - 1) + PDX_GROUP_COUNT) / PDX_GROUP_COUNT;
-
-    for ( ; idx < eidx; ++idx )
-        __set_bit(idx, pdx_group_valid);
-}
-
 void __init pfn_pdx_hole_setup(unsigned long mask)
 {
     unsigned int i, j, bottom_shift = 0, hole_shift = 0;
diff --git a/xen/include/xen/pdx.h b/xen/include/xen/pdx.h
index 8c6aec2aea..5a82b6bde2 100644
--- a/xen/include/xen/pdx.h
+++ b/xen/include/xen/pdx.h
@@ -70,15 +70,41 @@
 #ifdef CONFIG_HAS_PDX
 
 extern unsigned long max_pdx;
-extern unsigned long pfn_pdx_bottom_mask, ma_va_bottom_mask;
-extern unsigned int pfn_pdx_hole_shift;
-extern unsigned long pfn_hole_mask;
-extern unsigned long pfn_top_mask, ma_top_mask;
 
 #define PDX_GROUP_COUNT ((1 << PDX_GROUP_SHIFT) / \
                          (sizeof(*frame_table) & -sizeof(*frame_table)))
 extern unsigned long pdx_group_valid[];
 
+/**
+ * Mark [smfn, emfn) as allocatable in the frame table
+ *
+ * @param smfn Start mfn
+ * @param emfn End mfn
+ */
+void set_pdx_range(unsigned long smfn, unsigned long emfn);
+
+/**
+ * Invoked to determine if an mfn has an associated valid frame table entry
+ *
+ * In order for it to be legal it must pass bounds, grouping and
+ * compression sanity checks.
+ *
+ * @param mfn To-be-checked mfn
+ * @return True iff all checks pass
+ */
+bool __mfn_valid(unsigned long mfn);
+
+#define page_to_pdx(pg)  ((pg) - frame_table)
+#define pdx_to_page(pdx) gcc11_wrap(frame_table + (pdx))
+
+#define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
+#define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
+
+extern unsigned long pfn_pdx_bottom_mask, ma_va_bottom_mask;
+extern unsigned int pfn_pdx_hole_shift;
+extern unsigned long pfn_hole_mask;
+extern unsigned long pfn_top_mask, ma_top_mask;
+
 /**
  * Validate a region's compatibility with the current compression runtime
  *
@@ -120,28 +146,6 @@ uint64_t pdx_region_mask(uint64_t base, uint64_t len);
  */
 uint64_t pdx_init_mask(uint64_t base_addr);
 
-/**
- * Mark [smfn, emfn) as accesible in the frame table
- *
- * @param smfn Start mfn
- * @param emfn End mfn
- */
-void set_pdx_range(unsigned long smfn, unsigned long emfn);
-
-#define page_to_pdx(pg)  ((pg) - frame_table)
-#define pdx_to_page(pdx) gcc11_wrap(frame_table + (pdx))
-
-/**
- * Invoked to determine if an mfn has an associated valid frame table entry
- *
- * In order for it to be legal it must pass bounds, grouping and
- * compression sanity checks.
- *
- * @param mfn To-be-checked mfn
- * @return True iff all checks pass
- */
-bool __mfn_valid(unsigned long mfn);
-
 /**
  * Map pfn to its corresponding pdx
  *
@@ -166,9 +170,6 @@ static inline unsigned long pdx_to_pfn(unsigned long pdx)
            ((pdx << pfn_pdx_hole_shift) & pfn_top_mask);
 }
 
-#define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
-#define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
-
 /**
  * Computes the offset into the direct map of an maddr
  *
-- 
2.34.1




* [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28  7:58 [PATCH v2 0/5] Make PDX compression optional Alejandro Vallejo
                   ` (3 preceding siblings ...)
  2023-07-28  7:59 ` [PATCH v2 4/5] pdx: Reorder pdx.[ch] Alejandro Vallejo
@ 2023-07-28  7:59 ` Alejandro Vallejo
  2023-07-28 16:27   ` Julien Grall
                     ` (3 more replies)
  4 siblings, 4 replies; 29+ messages in thread
From: Alejandro Vallejo @ 2023-07-28  7:59 UTC (permalink / raw)
  To: Xen-devel
  Cc: Alejandro Vallejo, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk, Andrew Cooper,
	George Dunlap, Jan Beulich, Wei Liu, Roger Pau Monné

Adds a new compile-time flag to allow disabling pdx compression and
compiles out compression-related code/data. It also short-circuits the
pdx<->pfn conversion macros and creates stubs for masking functions.

While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
not removable in practice.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
---
 xen/arch/arm/Kconfig  |  1 -
 xen/arch/x86/Kconfig  |  1 -
 xen/arch/x86/domain.c | 19 +++++++++++++------
 xen/common/Kconfig    | 13 ++++++++++---
 xen/common/Makefile   |  2 +-
 xen/common/pdx.c      | 15 +++++++++++----
 xen/include/xen/pdx.h | 37 ++++++++++++++++++++++++++++++++++---
 7 files changed, 69 insertions(+), 19 deletions(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 439cc94f33..ea1949fbaa 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -14,7 +14,6 @@ config ARM
 	select HAS_ALTERNATIVE
 	select HAS_DEVICE_TREE
 	select HAS_PASSTHROUGH
-	select HAS_PDX
 	select HAS_PMAP
 	select HAS_UBSAN
 	select IOMMU_FORCE_PT_SHARE
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 92f3a627da..30df085d96 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -24,7 +24,6 @@ config X86
 	select HAS_PASSTHROUGH
 	select HAS_PCI
 	select HAS_PCI_MSI
-	select HAS_PDX
 	select HAS_SCHED_GRANULARITY
 	select HAS_UBSAN
 	select HAS_VPCI if HVM
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 5f66c2ae33..ee2830aad7 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -458,7 +458,7 @@ void domain_cpu_policy_changed(struct domain *d)
     }
 }
 
-#ifndef CONFIG_BIGMEM
+#if !defined(CONFIG_BIGMEM) && defined(CONFIG_PDX_COMPRESSION)
 /*
  * The hole may be at or above the 44-bit boundary, so we need to determine
  * the total bit count until reaching 32 significant (not squashed out) bits
@@ -485,13 +485,20 @@ static unsigned int __init noinline _domain_struct_bits(void)
 struct domain *alloc_domain_struct(void)
 {
     struct domain *d;
-#ifdef CONFIG_BIGMEM
-    const unsigned int bits = 0;
-#else
+
     /*
-     * We pack the PDX of the domain structure into a 32-bit field within
-     * the page_info structure. Hence the MEMF_bits() restriction.
+     * Without CONFIG_BIGMEM, we pack the PDX of the domain structure into
+     * a 32-bit field within the page_info structure. Hence the MEMF_bits()
+     * restriction. With PDX compression in place the number of bits must
+     * be calculated at runtime, but it's fixed otherwise.
+     *
+     * On systems with CONFIG_BIGMEM there's no packing, and so there's no
+     * such restriction.
      */
+#if defined(CONFIG_BIGMEM) || !defined(CONFIG_PDX_COMPRESSION)
+    const unsigned int bits = IS_ENABLED(CONFIG_BIGMEM) ? 0 :
+                                                          32 + PAGE_SHIFT;
+#else
     static unsigned int __read_mostly bits;
 
     if ( unlikely(!bits) )
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index dd8d7c3f1c..3a0afd8e83 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -23,6 +23,16 @@ config GRANT_TABLE
 
 	  If unsure, say Y.
 
+config PDX_COMPRESSION
+	bool "PDX (Page inDeX) compression support"
+	default ARM
+	help
+	  PDX compression is a technique that allows the hypervisor to
+	  represent physical addresses in a very space-efficient manner.
+	  This is very helpful in reducing memory wastage on systems with
+	  memory banks whose base addresses are far apart, but it carries
+	  a performance cost.
+
 config ALTERNATIVE_CALL
 	bool
 
@@ -53,9 +63,6 @@ config HAS_IOPORTS
 config HAS_KEXEC
 	bool
 
-config HAS_PDX
-	bool
-
 config HAS_PMAP
 	bool
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 46049eac35..0020cafb8a 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -29,7 +29,7 @@ obj-y += multicall.o
 obj-y += notifier.o
 obj-$(CONFIG_NUMA) += numa.o
 obj-y += page_alloc.o
-obj-$(CONFIG_HAS_PDX) += pdx.o
+obj-y += pdx.o
 obj-$(CONFIG_PERF_COUNTERS) += perfc.o
 obj-bin-$(CONFIG_HAS_PMAP) += pmap.init.o
 obj-y += preempt.o
diff --git a/xen/common/pdx.c b/xen/common/pdx.c
index d3d38965bd..a3b1ba9fbb 100644
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -31,11 +31,15 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
 
 bool __mfn_valid(unsigned long mfn)
 {
-    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
+    bool invalid = mfn >= max_page;
+#ifdef CONFIG_PDX_COMPRESSION
+    invalid |= mfn & pfn_hole_mask;
+#endif
+
+    if ( unlikely(evaluate_nospec(invalid)) )
         return false;
-    return likely(!(mfn & pfn_hole_mask)) &&
-           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
-                           pdx_group_valid));
+
+    return test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT, pdx_group_valid);
 }
 
 void set_pdx_range(unsigned long smfn, unsigned long emfn)
@@ -49,6 +53,8 @@ void set_pdx_range(unsigned long smfn, unsigned long emfn)
         __set_bit(idx, pdx_group_valid);
 }
 
+#ifdef CONFIG_PDX_COMPRESSION
+
 /*
  * Diagram to make sense of the following variables. The masks and shifts
  * are done on mfn values in order to convert to/from pdx:
@@ -175,6 +181,7 @@ void __init pfn_pdx_hole_setup(unsigned long mask)
     pfn_top_mask        = ~(pfn_pdx_bottom_mask | pfn_hole_mask);
     ma_top_mask         = pfn_top_mask << PAGE_SHIFT;
 }
+#endif /* CONFIG_PDX_COMPRESSION */
 
 
 /*
diff --git a/xen/include/xen/pdx.h b/xen/include/xen/pdx.h
index 5a82b6bde2..dfb475c8dc 100644
--- a/xen/include/xen/pdx.h
+++ b/xen/include/xen/pdx.h
@@ -67,8 +67,6 @@
  * region involved.
  */
 
-#ifdef CONFIG_HAS_PDX
-
 extern unsigned long max_pdx;
 
 #define PDX_GROUP_COUNT ((1 << PDX_GROUP_SHIFT) / \
@@ -100,6 +98,8 @@ bool __mfn_valid(unsigned long mfn);
 #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
 #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
 
+#ifdef CONFIG_PDX_COMPRESSION
+
 extern unsigned long pfn_pdx_bottom_mask, ma_va_bottom_mask;
 extern unsigned int pfn_pdx_hole_shift;
 extern unsigned long pfn_hole_mask;
@@ -205,8 +205,39 @@ static inline uint64_t directmapoff_to_maddr(unsigned long offset)
  *             position marks a potentially compressible bit.
  */
 void pfn_pdx_hole_setup(unsigned long mask);
+#else /* CONFIG_PDX_COMPRESSION */
+
+/* Without PDX compression we can skip some computations */
+
+/* pdx<->pfn == identity */
+#define pdx_to_pfn(x) (x)
+#define pfn_to_pdx(x) (x)
+
+/* directmap is indexed by maddr */
+#define maddr_to_directmapoff(x) (x)
+#define directmapoff_to_maddr(x) (x)
+
+static inline bool pdx_is_region_compressible(paddr_t base,
+                                              unsigned long npages)
+{
+    return true;
+}
+
+static inline uint64_t pdx_init_mask(uint64_t base_addr)
+{
+    return 0;
+}
+
+static inline uint64_t pdx_region_mask(uint64_t base, uint64_t len)
+{
+    return 0;
+}
+
+static inline void pfn_pdx_hole_setup(unsigned long mask)
+{
+}
 
-#endif /* HAS_PDX */
+#endif /* CONFIG_PDX_COMPRESSION */
 #endif /* __XEN_PDX_H__ */
 
 /*
-- 
2.34.1




* Re: [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps
  2023-07-28  7:58 ` [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps Alejandro Vallejo
@ 2023-07-28  9:05   ` Julien Grall
  2023-08-07 13:45     ` Julien Grall
  0 siblings, 1 reply; 29+ messages in thread
From: Julien Grall @ 2023-07-28  9:05 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

Hi Alex,

On 28/07/2023 08:58, Alejandro Vallejo wrote:
> The arm32 directmap merely covers the XENHEAP, whereas the arm64 directmap
> currently covers anything in the frame table. These comments highlight why
> arm32 doesn't need to account for PDX compression in its __va()
> implementation while arm64 does.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters
  2023-07-28  7:59 ` [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters Alejandro Vallejo
@ 2023-07-28  9:07   ` Julien Grall
  2023-07-31 15:15   ` Jan Beulich
  1 sibling, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-07-28  9:07 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Jan Beulich, Wei Liu,
	Roger Pau Monné

Hi,

On 28/07/2023 08:59, Alejandro Vallejo wrote:
> This patch factors out the pdx compression logic hardcoded in both ports
> for the maddr<->vaddr conversion functions.
> 
> Touches both x86 and arm ports.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression
  2023-07-28  7:59 ` [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression Alejandro Vallejo
@ 2023-07-28 16:19   ` Julien Grall
  2023-07-31 15:27   ` Jan Beulich
  1 sibling, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-07-28 16:19 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Stefano Stabellini

Hi,

On 28/07/2023 08:59, Alejandro Vallejo wrote:
> Regions must occasionally be validated for pdx compression conformance. That
> is, whether any of the machine addresses spanning the region have a bit set
> in the pdx "hole" (which is expected to always contain zeroes). There are
> a few such tests through the code, and they all check for different things.
> 
> This patch replaces all such occurrences with a call to a centralized
> function that checks a region for validity.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v2 4/5] pdx: Reorder pdx.[ch]
  2023-07-28  7:59 ` [PATCH v2 4/5] pdx: Reorder pdx.[ch] Alejandro Vallejo
@ 2023-07-28 16:20   ` Julien Grall
  0 siblings, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-07-28 16:20 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini, Wei Liu

Hi,

On 28/07/2023 08:59, Alejandro Vallejo wrote:
> The next patch compiles out compression-related chunks, and it's helpful to
> have them grouped together beforehand.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
@ 2023-07-28 16:27   ` Julien Grall
  2023-07-28 16:36   ` Andrew Cooper
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-07-28 16:27 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Jan Beulich, Wei Liu,
	Roger Pau Monné

Hi,

On 28/07/2023 08:59, Alejandro Vallejo wrote:
> diff --git a/xen/include/xen/pdx.h b/xen/include/xen/pdx.h
> index 5a82b6bde2..dfb475c8dc 100644
> --- a/xen/include/xen/pdx.h
> +++ b/xen/include/xen/pdx.h
> @@ -67,8 +67,6 @@
>    * region involved.
>    */
>   
> -#ifdef CONFIG_HAS_PDX
> -
>   extern unsigned long max_pdx;
>   
>   #define PDX_GROUP_COUNT ((1 << PDX_GROUP_SHIFT) / \
> @@ -100,6 +98,8 @@ bool __mfn_valid(unsigned long mfn);
>   #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
>   #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
>   
> +#ifdef CONFIG_PDX_COMPRESSION
> +
>   extern unsigned long pfn_pdx_bottom_mask, ma_va_bottom_mask;
>   extern unsigned int pfn_pdx_hole_shift;
>   extern unsigned long pfn_hole_mask;
> @@ -205,8 +205,39 @@ static inline uint64_t directmapoff_to_maddr(unsigned long offset)
>    *             position marks a potentially compressible bit.
>    */
>   void pfn_pdx_hole_setup(unsigned long mask);
> +#else /* CONFIG_PDX_COMPRESSION */

Looking at other places, we tend to put the reason the #else branch is 
executed. In this case, it is !CONFIG_PDX_COMPRESSION.
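I.e. something along these lines (sketch only):

    #ifdef CONFIG_PDX_COMPRESSION
    /* ... compression helpers ... */
    #else /* !CONFIG_PDX_COMPRESSION */
    /* ... stubs ... */
    #endif /* CONFIG_PDX_COMPRESSION */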

Other than that, the changes look good to me:

Reviewed-by: Julien Grall <jgrall@amazon.com>

Cheers,

-- 
Julien Grall



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
  2023-07-28 16:27   ` Julien Grall
@ 2023-07-28 16:36   ` Andrew Cooper
  2023-07-28 16:58     ` Andrew Cooper
  2023-07-31  9:09     ` Julien Grall
  2023-07-31 15:33   ` Jan Beulich
  2023-08-07 17:48   ` Julien Grall
  3 siblings, 2 replies; 29+ messages in thread
From: Andrew Cooper @ 2023-07-28 16:36 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, George Dunlap, Jan Beulich, Wei Liu,
	Roger Pau Monné

On 28/07/2023 8:59 am, Alejandro Vallejo wrote:
> Adds a new compile-time flag to allow disabling pdx compression and
> compiles out compression-related code/data. It also short-circuits the
> pdx<->pfn conversion macros and creates stubs for masking functions.
>
> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
> not removable in practice.
>
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> v2:
>   * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)

This series is now looking fine, except for the Kconfig aspect.

This is not something any user or developer should ever be queried
about.  The feedback on the documentation patches alone shows that it's
not understood well by the maintainers, even if the principle is accepted.

There is never any reason to have this active on x86.  Indeed, Julien's
quick metric shows how much performance we waste by having it enabled.

This patch ought to just rename HAS_PDX to PDX, and keep it selected by
the subset of architectures that still want it.

~Andrew



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28 16:36   ` Andrew Cooper
@ 2023-07-28 16:58     ` Andrew Cooper
  2023-07-31  8:00       ` Jan Beulich
  2023-07-31  9:09     ` Julien Grall
  1 sibling, 1 reply; 29+ messages in thread
From: Andrew Cooper @ 2023-07-28 16:58 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, George Dunlap, Jan Beulich, Wei Liu,
	Roger Pau Monné

On 28/07/2023 5:36 pm, Andrew Cooper wrote:
> On 28/07/2023 8:59 am, Alejandro Vallejo wrote:
>> Adds a new compile-time flag to allow disabling pdx compression and
>> compiles out compression-related code/data. It also short-circuits the
>> pdx<->pfn conversion macros and creates stubs for masking functions.
>>
>> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
>> not removable in practice.
>>
>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>> ---
>> v2:
>>   * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
> This series is now looking fine, except for the Kconfig aspect.
>
> This is not something any user or developer should ever be queried
> about.  The feedback on the documentation patches alone shows that it's
> not understood well by the maintainers, even if the principle is accepted.
>
> There is never any reason to have this active on x86.  Indeed, Julien's
> quick metric shows how much performance we waste by having it enabled.

Further to this, bloat-o-meter says net -30k of code and there are
plenty of fastpaths getting a several cacheline reduction from this.

~Andrew



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28 16:58     ` Andrew Cooper
@ 2023-07-31  8:00       ` Jan Beulich
  2023-07-31 17:38         ` Andrew Cooper
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2023-07-31  8:00 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, George Dunlap, Wei Liu, Roger Pau Monné,
	Alejandro Vallejo, Xen-devel

On 28.07.2023 18:58, Andrew Cooper wrote:
> On 28/07/2023 5:36 pm, Andrew Cooper wrote:
>> On 28/07/2023 8:59 am, Alejandro Vallejo wrote:
>>> Adds a new compile-time flag to allow disabling pdx compression and
>>> compiles out compression-related code/data. It also short-circuits the
>>> pdx<->pfn conversion macros and creates stubs for masking functions.
>>>
>>> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
>>> not removable in practice.
>>>
>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>>> ---
>>> v2:
>>>   * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
>> This series is now looking fine, except for the Kconfig aspect.
>>
>> This is not something any user or developer should ever be queried
>> about.  The feedback on the documentation patches alone shows that it's
>> not understood well by the maintainers, even if the principle is accepted.
>>
>> There is never any reason to have this active on x86.

We can of course continue to disagree here. At least with EXPERT=y
selecting this option ought to remain possible for x86. Whether or
not the original systems this scheme was developed for ever went
public, such systems did (do) exist, and hence running Xen sensibly
on them (without losing all memory except that on node 0) ought to
be possible.

>  Indeed, Julien's
>> quick metric shows how much performance we waste by having it enabled.
> 
> Further to this, bloat-o-meter says net -30k of code and there are
> plenty of fastpaths getting a several cacheline reduction from this.

A similar reduction was achieved by the BMI2-alt-patching series I
had put together, yet you weren't willing to come to consensus on
it.

Jan



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28 16:36   ` Andrew Cooper
  2023-07-28 16:58     ` Andrew Cooper
@ 2023-07-31  9:09     ` Julien Grall
  1 sibling, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-07-31  9:09 UTC (permalink / raw)
  To: Andrew Cooper, Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	George Dunlap, Jan Beulich, Wei Liu, Roger Pau Monné



On 28/07/2023 17:36, Andrew Cooper wrote:
> On 28/07/2023 8:59 am, Alejandro Vallejo wrote:
>> Adds a new compile-time flag to allow disabling pdx compression and
>> compiles out compression-related code/data. It also short-circuits the
>> pdx<->pfn conversion macros and creates stubs for masking functions.
>>
>> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
>> not removable in practice.
>>
>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>> ---
>> v2:
>>    * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
> 
> This series is now looking fine, except for the Kconfig aspect.
> 
> This is not something any user or developer should ever be queried
> about.  The feedback on the documentation patches alone shows that it's
> not understood well by the maintainers, even if the principle is accepted.
> 
> There is never any reason to have this active on x86. 

I can't really speak for x86. However, for Arm, I think it could be 
useful in case there is a system where the frametable size doesn't 
change with CONFIG_PDX_COMPRESSION=n and/or the integrator is happy to 
waste a bit more memory to gain performance.

Also, to reply to Jan's point, I don't think this is the sort of option 
that should be gated by EXPERT (again, at least on Arm). Xen ought to 
work fine with and without PDX compression enabled. From my 
understanding, the only difference is the amount of memory wasted vs 
the performance gained.

Cheers,

-- 
Julien Grall



* Re: [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters
  2023-07-28  7:59 ` [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters Alejandro Vallejo
  2023-07-28  9:07   ` Julien Grall
@ 2023-07-31 15:15   ` Jan Beulich
  2023-08-07 16:26     ` Alejandro Vallejo
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2023-07-31 15:15 UTC (permalink / raw)
  To: Alejandro Vallejo
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Xen-devel

On 28.07.2023 09:59, Alejandro Vallejo wrote:
> --- a/xen/include/xen/pdx.h
> +++ b/xen/include/xen/pdx.h
> @@ -160,6 +160,31 @@ static inline unsigned long pdx_to_pfn(unsigned long pdx)
>  #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
>  #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
>  
> +/**
> + * Computes the offset into the direct map of an maddr
> + *
> + * @param ma Machine address
> + * @return Offset on the direct map where that
> + *         machine address can be accessed
> + */
> +static inline unsigned long maddr_to_directmapoff(uint64_t ma)

Was there prior agreement to use uint64_t here and ...

> +{
> +    return ((ma & ma_top_mask) >> pfn_pdx_hole_shift) |
> +           (ma & ma_va_bottom_mask);
> +}
> +
> +/**
> + * Computes a machine address given a direct map offset
> + *
> + * @param offset Offset into the direct map
> + * @return Corresponding machine address of that virtual location
> + */
> +static inline uint64_t directmapoff_to_maddr(unsigned long offset)

... here, not paddr_t?

Also you use unsigned long for the offset here, but size_t for
maddr_to_directmapoff()'s return value in __maddr_to_virt().
Would be nice if this was consistent within the patch.

Especially since the names of the helper functions are longish,
I'm afraid I'm not fully convinced of the transformation. But I'm
also not meaning to stand in the way, if everyone else wants to
move in that direction.

Jan



* Re: [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression
  2023-07-28  7:59 ` [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression Alejandro Vallejo
  2023-07-28 16:19   ` Julien Grall
@ 2023-07-31 15:27   ` Jan Beulich
  2023-08-07 16:50     ` Alejandro Vallejo
  1 sibling, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2023-07-31 15:27 UTC (permalink / raw)
  To: Alejandro Vallejo
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Xen-devel

On 28.07.2023 09:59, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -1159,6 +1159,9 @@ static int mem_hotadd_check(unsigned long spfn, unsigned long epfn)
>  {
>      unsigned long s, e, length, sidx, eidx;
>  
> +    paddr_t mem_base = pfn_to_paddr(spfn);
> +    unsigned long mem_npages = epfn - spfn;
> +
>      if ( (spfn >= epfn) )
>          return 0;

While occasionally groups of declarations indeed want separating, the
rule of thumb is that the first blank line after declarations separates
them from statements. I don't see reason here to diverge from this.

> @@ -1660,6 +1663,8 @@ static bool __init cf_check rt_range_valid(unsigned long smfn, unsigned long emf
>  
>  void __init efi_init_memory(void)
>  {
> +    paddr_t mem_base;
> +    unsigned long mem_npages;

Why in the outermost scope when ...

> @@ -1732,6 +1737,9 @@ void __init efi_init_memory(void)
>          smfn = PFN_DOWN(desc->PhysicalStart);
>          emfn = PFN_UP(desc->PhysicalStart + len);
>  
> +        mem_base = pfn_to_paddr(smfn);
> +        mem_npages = emfn - smfn;
> +
>          if ( desc->Attribute & EFI_MEMORY_WB )
>              prot |= _PAGE_WB;
>          else if ( desc->Attribute & EFI_MEMORY_WT )
> @@ -1759,8 +1767,7 @@ void __init efi_init_memory(void)
>              prot |= _PAGE_NX;
>  
>          if ( pfn_to_pdx(emfn - 1) < (DIRECTMAP_SIZE >> PAGE_SHIFT) &&
> -             !(smfn & pfn_hole_mask) &&
> -             !((smfn ^ (emfn - 1)) & ~pfn_pdx_bottom_mask) )
> +             pdx_is_region_compressible(mem_base, mem_npages))
>          {
>              if ( (unsigned long)mfn_to_virt(emfn - 1) >= HYPERVISOR_VIRT_END )
>                  prot &= ~_PAGE_GLOBAL;

... you use the variables only in an inner one?
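I.e. (just a sketch) declaring them next to the existing smfn/emfn
calculations, inside the loop over the memory map:

    paddr_t mem_base = pfn_to_paddr(smfn);
    unsigned long mem_npages = emfn - smfn;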

> --- a/xen/common/pdx.c
> +++ b/xen/common/pdx.c
> @@ -88,7 +88,7 @@ bool __mfn_valid(unsigned long mfn)
>  }
>  
>  /* Sets all bits from the most-significant 1-bit down to the LSB */
> -static uint64_t __init fill_mask(uint64_t mask)
> +static uint64_t fill_mask(uint64_t mask)
>  {
>      while (mask & (mask + 1))
>          mask |= mask + 1;

I see why you want __init dropped here, but the function wasn't written
for "common use" and hence may want improving first when intended for
more frequent (post-init) use as well. Then again I wonder why the
original checks all got away without using this function ...
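For reference, all fill_mask() does is saturate everything below the
most significant set bit, e.g.:

    fill_mask(0x48)  /* 0b1001000 */  ==  0x7f  /* 0b1111111 */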

Jan



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
  2023-07-28 16:27   ` Julien Grall
  2023-07-28 16:36   ` Andrew Cooper
@ 2023-07-31 15:33   ` Jan Beulich
  2023-08-07 17:48   ` Julien Grall
  3 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2023-07-31 15:33 UTC (permalink / raw)
  To: Alejandro Vallejo
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Xen-devel

On 28.07.2023 09:59, Alejandro Vallejo wrote:
> --- a/xen/common/pdx.c
> +++ b/xen/common/pdx.c
> @@ -31,11 +31,15 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>  
>  bool __mfn_valid(unsigned long mfn)
>  {
> -    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
> +    bool invalid = mfn >= max_page;
> +#ifdef CONFIG_PDX_COMPRESSION
> +    invalid |= mfn & pfn_hole_mask;
> +#endif

Nit: Declaration(s) and statement(s) separated by a blank line please.
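I.e. something like:

    bool invalid = mfn >= max_page;

#ifdef CONFIG_PDX_COMPRESSION
    invalid |= mfn & pfn_hole_mask;
#endif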

> @@ -49,6 +53,8 @@ void set_pdx_range(unsigned long smfn, unsigned long emfn)
>          __set_bit(idx, pdx_group_valid);
>  }
>  
> +#ifdef CONFIG_PDX_COMPRESSION
> +
>  /*
>   * Diagram to make sense of the following variables. The masks and shifts
>   * are done on mfn values in order to convert to/from pdx:

Nit: With a blank line after #ifdef, 

> @@ -175,6 +181,7 @@ void __init pfn_pdx_hole_setup(unsigned long mask)
>      pfn_top_mask        = ~(pfn_pdx_bottom_mask | pfn_hole_mask);
>      ma_top_mask         = pfn_top_mask << PAGE_SHIFT;
>  }
> +#endif /* CONFIG_PDX_COMPRESSION */
>  
>  
>  /*

... we would typically also have one before #endif. In the case here
you could even leverage that there are already (wrongly) two consecutive
blank lines.
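I.e. roughly:

#ifdef CONFIG_PDX_COMPRESSION

/*
 * Diagram to make sense of the following variables. ...
 */
...

#endif /* CONFIG_PDX_COMPRESSION */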

> @@ -100,6 +98,8 @@ bool __mfn_valid(unsigned long mfn);
>  #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
>  #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
>  
> +#ifdef CONFIG_PDX_COMPRESSION
> +
>  extern unsigned long pfn_pdx_bottom_mask, ma_va_bottom_mask;
>  extern unsigned int pfn_pdx_hole_shift;
>  extern unsigned long pfn_hole_mask;
> @@ -205,8 +205,39 @@ static inline uint64_t directmapoff_to_maddr(unsigned long offset)
>   *             position marks a potentially compressible bit.
>   */
>  void pfn_pdx_hole_setup(unsigned long mask);
> +#else /* CONFIG_PDX_COMPRESSION */
> +
> +/* Without PDX compression we can skip some computations */

Same here for the #else then.

Jan



* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-31  8:00       ` Jan Beulich
@ 2023-07-31 17:38         ` Andrew Cooper
  2023-08-01  7:57           ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Cooper @ 2023-07-31 17:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, George Dunlap, Wei Liu, Roger Pau Monné,
	Alejandro Vallejo, Xen-devel

On 31/07/2023 9:00 am, Jan Beulich wrote:
> On 28.07.2023 18:58, Andrew Cooper wrote:
>> On 28/07/2023 5:36 pm, Andrew Cooper wrote:
>>> On 28/07/2023 8:59 am, Alejandro Vallejo wrote:
>>>> Adds a new compile-time flag to allow disabling pdx compression and
>>>> compiles out compression-related code/data. It also short-circuits the
>>>> pdx<->pfn conversion macros and creates stubs for masking functions.
>>>>
>>>> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
>>>> not removable in practice.
>>>>
>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>>>> ---
>>>> v2:
>>>>   * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
>>> This series is now looking fine, except for the Kconfig aspect.
>>>
>>> This is not something any user or developer should ever be queried
>>> about.  The feedback on the documentation patches alone show that it's
>>> not understood well by the maintainers, even if the principle is accepted.
>>>
>>> There is never any reason to have this active on x86.
> We can of course continue to disagree here. At least with EXPERT=y
> selecting this option ought to remain possible for x86. Whether or
> not the original systems this scheme was developed for ever went
> public, such systems did (do) exist, and hence running Xen sensibly
> on them (without losing all memory except that on node 0) ought to
> be possible.

There's one system which never made its way into production,
support-for-which in the no-op case is causing a 10% perf hit in at
least one metric, 17% in another.

... and you're seriously arguing that we should continue to take this perf
hit?

It is very likely that said machine(s) aren't even powered on these
days, and even if it is on, the vendor can take the overhead of turning
PDX compression on until such time as they make a production system.

Furthermore, it is unrealistic to think that such a machine will ever
make its way into production.  Linux has never had PDX compression, and
by-and-large if it doesn't run Linux, you can't sell it in the first place.


It is utterly unreasonable to be carrying this overhead in the first
place.  PDX compression *should not* have been committed on-by-default
in the first place.  (Yes, I know there was no Kconfig back then, and
the review process was non-existent, but someone really should have said
no).

It is equally unreasonable to offer people (under Expert or not) an
ability to shoot themselves in the foot like this.


If in the very unlikely case that such a system does come into
existence, we can consider re-enabling PDX compression (and by that, I
mean doing it in a less invasive way in the common case), but there's
very little chance this will ever be a path we need to take.

>>   Indeed, Julien's
>>> quick metric shows how much performance we waste by having it enabled.
>> Further to this, bloat-o-meter says net -30k of code and there are
>> plenty of fastpaths getting a several cacheline reduction from this.
> A similar reduction was achieved

Really?  You think that replacing the innermost shift and masking with
an alternative that has a shorter instruction sequence gets you the same
net reduction in code?

I do not find that claim credible.

>  by the BMI2-alt-patching series I
> had put together, yet you weren't willing to come to consensus on
> it.

You have AMD machines, and your patch was alleged to be a performance
improvement.  So the fact you didn't spot the problems with PEXT/PDEP on
all AMD hardware prior to Fam19h suggests there was insufficient testing
for an alleged performance improvement.

The patch you posted:

1) Added extra complexity via alternatives, and
2) Reduced performance on AMD systems prior to Fam19h.

in an area of code which is useless on all shipping x86 systems.

You literally micro-anti-optimised a no-op path to a more expensive (on
one vendor at least) no-op path, claiming it to be a performance
improvement.

There is no possible way any form of your patch can ever beat
Alejandro's work of just compiling-out the useless logic wholesale.

~Andrew


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-31 17:38         ` Andrew Cooper
@ 2023-08-01  7:57           ` Jan Beulich
  2023-08-07 16:06             ` Alejandro Vallejo
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2023-08-01  7:57 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, George Dunlap, Wei Liu, Roger Pau Monné,
	Alejandro Vallejo, Xen-devel

On 31.07.2023 19:38, Andrew Cooper wrote:
> On 31/07/2023 9:00 am, Jan Beulich wrote:
>> On 28.07.2023 18:58, Andrew Cooper wrote:
>>> On 28/07/2023 5:36 pm, Andrew Cooper wrote:
>>>> On 28/07/2023 8:59 am, Alejandro Vallejo wrote:
>>>>> Adds a new compile-time flag to allow disabling pdx compression and
>>>>> compiles out compression-related code/data. It also short-circuits the
>>>>> pdx<->pfn conversion macros and creates stubs for the masking functions.
>>>>>
>>>>> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
>>>>> not removable in practice.
>>>>>
>>>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>>>>> ---
>>>>> v2:
>>>>>   * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
>>>> This series is now looking fine, except for the Kconfig aspect.
>>>>
>>>> This is not something any user or developer should ever be queried
>>>> about.  The feedback on the documentation patches alone show that it's
>>>> not understood well by the maintainers, even if the principle is accepted.
>>>>
>>>> There is never any reason to have this active on x86.
>> We can of course continue to disagree here. At least with EXPERT=y
>> selecting this option ought to remain possible for x86. Whether or
>> not the original systems this scheme was developed for ever went
>> public, such systems did (do) exist, and hence running Xen sensibly
>> on them (without losing all memory except that on node 0) ought to
>> be possible.
> 
> There's one system which never made its way into production,
> support-for-which in the no-op case is causing a 10% perf hit in at
> least one metric, 17% in another.
> 
> ... and you're seriously arguing that we should continue to take this perf
> hit?

Have I asked anywhere to keep this enabled by default? Have I been the
one to first propose a way to reduce this overhead?

> It is very likely that said machine(s) aren't even powered on these
> days, and even if it is on, the vendor can take the overhead of turning
> PDX compression on until such time as they make a production system.
> 
> Furthermore, it is unrealistic to think that such a machine will ever
> make its way into production.  Linux has never had PDX compression, and
> by-and-large if it doesn't run Linux, you can't sell it in the first place.

I'm sure you recall that Linux has much more VA space for its directmap,
so aiui there simply was no immediate need there.

> It is utterly unreasonable to be carrying this overhead in the first
> place.  PDX compression *should not* have been committed on-by-default
> in the first place.  (Yes, I know there was no Kconfig back then, and
> the review process was non-existent, but someone really should have said
> no).

Are you suggesting no code should be committed supporting new hardware,
ahead of that hardware going public?

> It is equally unreasonable to offer people (under Expert or not) an
> ability to shoot themselves in the foot like this.

Shoot themselves in the foot? If you enable EXPERT, you better know
what you're doing.

>>>   Indeed, Julien's
>>>> quick metric shows how much performance we waste by having it enabled.
>>> Further to this, bloat-o-meter says net -30k of code and there are
>>> plenty of fastpaths getting a several cacheline reduction from this.
>> A similar reduction was achieved
> 
> Really?  You think that replacing the innermost shift and masking with
> an alternative that has a shorter instruction sequence gets you the same
> net reduction in code?
> 
> I do not find that claim credible.

Did you ever seriously look at those patches?

>>  by the BMI2-alt-patching series I
>> had put together, yet you weren't willing to come to consensus on
>> it.
> 
> You have AMD machines, and your patch was alleged to be a performance
> improvement.  So the fact you didn't spot the problems with PEXT/PDEP on
> all AMD hardware prior to Fam19h suggests there was insufficient testing
> for an alleged performance improvement.
> 
> The patch you posted:
> 
> 1) Added extra complexity via alternatives, and
> 2) Reduced performance on AMD systems prior to Fam19h.
> 
> in an area of code which is useless on all shipping x86 systems.
> 
> You literally micro-anti-optimised a no-op path to a more expensive (on
> one vendor at least) no-op path, claiming it to be a performance
> improvement.

You appear to forget the patch patching to MOV instructions ("x86: use
MOV for PFN/PDX conversion when possible"). Are you saying MOV has
performance problems on any CPU?

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps
  2023-07-28  9:05   ` Julien Grall
@ 2023-08-07 13:45     ` Julien Grall
  0 siblings, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-08-07 13:45 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk

On 28/07/2023 10:05, Julien Grall wrote:
> On 28/07/2023 08:58, Alejandro Vallejo wrote:
>> arm32 merely covers the XENHEAP, whereas arm64 currently covers 
>> anything in
>> the frame table. These comments highlight why arm32 doesn't need to 
>> account for PDX
>> compression in its __va() implementation while arm64 does.
>>
>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> 
> Reviewed-by: Julien Grall <jgrall@amazon.com>

I have committed the patch.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-08-01  7:57           ` Jan Beulich
@ 2023-08-07 16:06             ` Alejandro Vallejo
  2023-08-08  6:17               ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Alejandro Vallejo @ 2023-08-07 16:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Xen-devel

Hi,

A few days have passed. May I suggest taking a step back?

On Tue, Aug 01, 2023 at 09:57:57AM +0200, Jan Beulich wrote:
> On 31.07.2023 19:38, Andrew Cooper wrote:
> > There's one system which never made its way into production,
> > support-for-which in the no-op case is causing a 10% perf hit in at
> > least one metric, 17% in another.
> > 
> > ... and you're seriously arguing that we should continue to take this perf
> > hit?
> 
> Have I asked anywhere to keep this enabled by default? Have I been the
> one to first propose a way to reduce this overhead?
> 
> > It is very likely that said machine(s) aren't even powered on these
> > days, and even if it is on, the vendor can take the overhead of turning
> > PDX compression on until such time as they make a production system.
> > 
> > Furthermore, it is unrealistic to think that such a machine will ever
> > make its way into production.  Linux has never had PDX compression, and
> > by-and-large if it doesn't run Linux, you can't sell it in the first place.
> 
> I'm sure you recall that Linux has much more VA space for its directmap,
> so aiui there simply was no immediate need there.
> 
> > It is utterly unreasonable to be carrying this overhead in the first
> > place.  PDX compression *should not* have been committed on-by-default
> > in the first place.  (Yes, I know there was no Kconfig back then, and
> > the review process was non-existent, but someone really should have said
> > no).
> 
> Are you suggesting no code should be committed supporting new hardware,
> ahead of that hardware going public?
> 
> > It is equally unreasonable to offer people (under Expert or not) an
> > ability to shoot themselves in the foot like this.
> 
> Shoot themselves in the foot? If you enable EXPERT, you better know
> what you're doing.
> 
> >>>   Indeed, Julien's
> >>>> quick metric shows how much performance we waste by having it enabled.
> >>> Further to this, bloat-o-meter says net -30k of code and there are
> >>> plenty of fastpaths getting a several cacheline reduction from this.
> >> A similar reduction was achieved
> > 
> > Really?  You think that replacing the innermost shift and masking with
> > an alternative that has a shorter instruction sequence gets you the same
> > net reduction in code?
> > 
> > I do not find that claim credible.
> 
> Did you ever seriously look at those patches?
> 
> >>  by the BMI2-alt-patching series I
> >> had put together, yet you weren't willing to come to consensus on
> >> it.
> > 
> > You have AMD machines, and your patch was alleged to be a performance
> > improvement.  So the fact you didn't spot the problems with PEXT/PDEP on
> > all AMD hardware prior to Fam19h suggests there was insufficient testing
> > for an alleged performance improvement.
> > 
> > The patch you posted:
> > 
> > 1) Added extra complexity via alternatives, and
> > 2) Reduced performance on AMD systems prior to Fam19h.
> > 
> > in an area of code which is useless on all shipping x86 systems.
> > 
> > You literally micro-anti-optimised a no-op path to a more expensive (on
> > one vendor at least) no-op path, claiming it to be a performance
> > improvement.
> 
> You appear to forget the patch patching to MOV instructions ("x86: use
> MOV for PFN/PDX conversion when possible"). Are you saying MOV has
> performance problems on any CPU?
> 
> Jan

I think we can all agree that (a) the _current_ pdx code implies taking a
perf hit and (b) we can't just remove PDX compression because it's in
current use on certain systems. The contentious points seem to be strictly
about what the default should be and the means to reduce the perf hit when
it's on.

For the status:

I think it's clear it must remain at least on ARM. I believe Jan argued
about the choice of default in terms of "when" to turn it off, rather than
"if". Seeing that...

> On 18.07.2023 9:33, Jan Beulich wrote:
> I disagree with this choice of default for x86. To avoid surprising
> downstreams, this should at best be a two-step process: Keep the
> default as Y right now, and switch to N a couple of releases later.
... so I think a more productive discussion would be to follow up on that,
looking at the "why", "when", "how" and the risks involved, rather than
going in circles (more on this at the end of the email).

Now, for the perf hit reduction:

An alt-patching series can probably make it very close to the perf win of
this patch as long as it transforms the conversion hunks into no-ops when
there's no hole. I looked into the 2018 patch, and I don't think it tried
to go that far (afaics its purpose is to inline the compression parameters
into the code stream). I highly suspect it would still noticeably
underperform compared to this one, but I admit it's guesswork and I'd be
happy to be proven wrong through benchmarks.
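
For concreteness, this is roughly what the conversion boils down to in each
case. It's a standalone sketch: the shape of the compressed path mirrors
pfn_to_pdx() and the masks declared in xen/include/xen/pdx.h, but the hole
parameters below are made up purely for illustration, not taken from any
real memory map.

#include <stdio.h>

/* Stand-ins for Xen's runtime-initialised globals (invented values:
 * a 4-bit hole at bits 20-23). */
static unsigned long pfn_pdx_bottom_mask = (1UL << 20) - 1;
static unsigned long pfn_top_mask        = ~((1UL << 24) - 1);
static unsigned int  pfn_pdx_hole_shift  = 4;

/* With compression: every conversion reads the globals and does two ANDs,
 * a shift and an OR. */
static unsigned long pfn_to_pdx_compressed(unsigned long pfn)
{
    return (pfn & pfn_pdx_bottom_mask) |
           ((pfn & pfn_top_mask) >> pfn_pdx_hole_shift);
}

/* Without compression: the identity, which the compiler folds away. */
static unsigned long pfn_to_pdx_plain(unsigned long pfn)
{
    return pfn;
}

int main(void)
{
    unsigned long pfn = 0x1034567UL; /* hole bits 20-23 are zero */

    printf("compressed: %#lx\n", pfn_to_pdx_compressed(pfn)); /* 0x134567  */
    printf("plain:      %#lx\n", pfn_to_pdx_plain(pfn));      /* 0x1034567 */
    return 0;
}

Nothing clever, but it shows where the extra loads and bit-juggling come
from on every single pfn<->pdx conversion when compression is built in.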

Regardless, that's a lot more complexity than I was willing to tackle here,
seeing that its use is limited on x86.

Compiling out compression and alt-patching it are not necessarily
incompatible with each other, though. Alt-patching would, in fact, do
wonders for ARM, where the exact same binary might have to run on separate
processors with different memory map configurations. I do think someone
ought to do it in the long run as a performance improvement for that port
(if only because alt-patching is arch-specific), but I really struggle to
justify the dev effort of writing, benchmarking, testing and maintaining
all that infrastructure when there's no (known) machine I can buy to test
it on.

Summary:
  * While alt-patching is an attractive alternative, this series doesn't do
    that, and in the spirit of keeping things simple I'd really rather keep
    it that way. Does this sound reasonable?
  * For the topic of when to disable compression by default on x86, I
    genuinely think now's as good a time as any. If we were to do it in 2
    steps, any downstream project may very well not notice until 2 releases
    down the line, at which point they simply must turn compression back
    on, which is what they would have to do now anyway.
  * For the topic of whether the option should be selectable at all, I think
    it would be a mistake to preclude it because, while we don't know of
    physical memory maps with big holes on (publicly available) x86, Xen
    may itself be virtualized with arbitrary memory maps. Unlikely and
    far-fetched as it is, it's IMO worth being at least cautious about. Gating
    the feature on EXPERT=y and adding a big warning should be good enough
    to avoid foot-shootouts.

Thanks,
Alejandro


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters
  2023-07-31 15:15   ` Jan Beulich
@ 2023-08-07 16:26     ` Alejandro Vallejo
  2023-08-08  6:05       ` Jan Beulich
  0 siblings, 1 reply; 29+ messages in thread
From: Alejandro Vallejo @ 2023-08-07 16:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Xen-devel

On Mon, Jul 31, 2023 at 05:15:19PM +0200, Jan Beulich wrote:
> On 28.07.2023 09:59, Alejandro Vallejo wrote:
> > --- a/xen/include/xen/pdx.h
> > +++ b/xen/include/xen/pdx.h
> > @@ -160,6 +160,31 @@ static inline unsigned long pdx_to_pfn(unsigned long pdx)
> >  #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
> >  #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
> >  
> > +/**
> > + * Computes the offset into the direct map of a maddr
> > + *
> > + * @param ma Machine address
> > + * @return Offset on the direct map where that
> > + *         machine address can be accessed
> > + */
> > +static inline unsigned long maddr_to_directmapoff(uint64_t ma)
> 
> Was there prior agreement to use uint64_t here and ...
> 
> > +{
> > +    return ((ma & ma_top_mask) >> pfn_pdx_hole_shift) |
> > +           (ma & ma_va_bottom_mask);
> > +}
> > +
> > +/**
> > + * Computes a machine address given a direct map offset
> > + *
> > + * @param offset Offset into the direct map
> > + * @return Corresponding machine address of that virtual location
> > + */
> > +static inline uint64_t directmapoff_to_maddr(unsigned long offset)
> 
> ... here, not paddr_t?
The whole file uses uint64_t rather than paddr_t, so I added the new code
following the existing convention. I can just as well make it paddr_t; it's
not a problem.

> 
> Also you use unsigned long for the offset here, but size_t for
> maddr_to_directmapoff()'s return value in __maddr_to_virt().
> Would be nice if this was consistent within the patch.
That's fair. I can leave it as "unsigned long" (seeing that a previous nit
preferred that type over size_t).

> Especially since the names of the helper functions are longish,
> I'm afraid I'm not fully convinced of the transformation.
In what sense? If it's just naming style I'm happy to consider other names,
but taking compression logic out of non-pdx code is essential to being
able to compile it out.
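
To make the shape of the transformation concrete, here's a sketch of the x86
converter before and after (illustrative only: the _old/_new names are mine,
and the real hunk differs in details such as the ASSERT and exact types):

/* Before: the arch header open-codes the compression parameters. */
static inline void *__maddr_to_virt_old(unsigned long ma)
{
    return (void *)(DIRECTMAP_VIRT_START +
                    ((ma & ma_va_bottom_mask) |
                     ((ma & ma_top_mask) >> pfn_pdx_hole_shift)));
}

/* After: the arch header only knows "offset into the directmap"; whether
 * computing that offset involves compression is pdx.h's business. */
static inline void *__maddr_to_virt_new(unsigned long ma)
{
    return (void *)(DIRECTMAP_VIRT_START + maddr_to_directmapoff(ma));
}

The caller-visible behaviour is identical; the point is purely that the
compression-specific masking now sits behind a helper which patch 5 can
turn into the identity.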

> But I'm also not meaning to stand in the way, if everyone else wants to
> move in that direction.
> 
> Jan

Thanks,
Alejandro


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression
  2023-07-31 15:27   ` Jan Beulich
@ 2023-08-07 16:50     ` Alejandro Vallejo
  0 siblings, 0 replies; 29+ messages in thread
From: Alejandro Vallejo @ 2023-08-07 16:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné,
	Wei Liu, George Dunlap, Julien Grall, Stefano Stabellini,
	Xen-devel

On Mon, Jul 31, 2023 at 05:27:22PM +0200, Jan Beulich wrote:
> On 28.07.2023 09:59, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/x86_64/mm.c
> > +++ b/xen/arch/x86/x86_64/mm.c
> > @@ -1159,6 +1159,9 @@ static int mem_hotadd_check(unsigned long spfn, unsigned long epfn)
> >  {
> >      unsigned long s, e, length, sidx, eidx;
> >  
> > +    paddr_t mem_base = pfn_to_paddr(spfn);
> > +    unsigned long mem_npages = epfn - spfn;
> > +
> >      if ( (spfn >= epfn) )
> >          return 0;
> 
> While occasionally groups of declarations indeed want separating, the
> rule of thumb is that the first blank line after declarations separates
> them from statements. I don't see reason here to diverge from this.
Ack.

> 
> > @@ -1660,6 +1663,8 @@ static bool __init cf_check rt_range_valid(unsigned long smfn, unsigned long emf
> >  
> >  void __init efi_init_memory(void)
> >  {
> > +    paddr_t mem_base;
> > +    unsigned long mem_npages;
> 
> Why in the outermost scope when ...
> 
> > @@ -1732,6 +1737,9 @@ void __init efi_init_memory(void)
> >          smfn = PFN_DOWN(desc->PhysicalStart);
> >          emfn = PFN_UP(desc->PhysicalStart + len);
> >  
> > +        mem_base = pfn_to_paddr(smfn);
> > +        mem_npages = emfn - smfn;
> > +
> >          if ( desc->Attribute & EFI_MEMORY_WB )
> >              prot |= _PAGE_WB;
> >          else if ( desc->Attribute & EFI_MEMORY_WT )
> > @@ -1759,8 +1767,7 @@ void __init efi_init_memory(void)
> >              prot |= _PAGE_NX;
> >  
> >          if ( pfn_to_pdx(emfn - 1) < (DIRECTMAP_SIZE >> PAGE_SHIFT) &&
> > -             !(smfn & pfn_hole_mask) &&
> > -             !((smfn ^ (emfn - 1)) & ~pfn_pdx_bottom_mask) )
> > +             pdx_is_region_compressible(mem_base, mem_npages))
> >          {
> >              if ( (unsigned long)mfn_to_virt(emfn - 1) >= HYPERVISOR_VIRT_END )
> >                  prot &= ~_PAGE_GLOBAL;
> 
> ... you use the variables only in an inner one?
No good reason. I defaulted to the top of the function because C90 was
picky about declaring variables midway through a scope, and I didn't
consider moving them to the top of the inner scope instead. Ack.

> 
> > --- a/xen/common/pdx.c
> > +++ b/xen/common/pdx.c
> > @@ -88,7 +88,7 @@ bool __mfn_valid(unsigned long mfn)
> >  }
> >  
> >  /* Sets all bits from the most-significant 1-bit down to the LSB */
> > -static uint64_t __init fill_mask(uint64_t mask)
> > +static uint64_t fill_mask(uint64_t mask)
> >  {
> >      while (mask & (mask + 1))
> >          mask |= mask + 1;
> 
> I see why you want __init dropped here, but the function wasn't written
> for "common use" and hence may want improving first when intended for
> more frequent (post-init) use as well.
It's not so much that I want it as that I need it: mem_hotadd_check() uses
it, and that's not part of the init process. That's the only non-init use
of it, and it hardly qualifies as "common". IMO, a few extra cycles on the
memory hotplug path don't matter a whole lot.
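
For reference, since its cost is part of the question, a quick standalone
check of what fill_mask() does (the input value is arbitrary):

#include <inttypes.h>
#include <stdio.h>

/* Same loop as in xen/common/pdx.c: sets every bit from the top set bit
 * down to bit 0. */
static uint64_t fill_mask(uint64_t mask)
{
    while (mask & (mask + 1))
        mask |= mask + 1;

    return mask;
}

int main(void)
{
    /* The top set bit of 0x140000000 is bit 32, so bits 0-32 end up set. */
    printf("%#" PRIx64 "\n", fill_mask(0x140000000ULL)); /* 0x1ffffffff */
    return 0;
}

That's at most a few dozen iterations of cheap ALU ops, which is why I
don't see a problem with keeping it callable post-init.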

> Then again I wonder why original
> checking all got away without using this function ...
> 
> Jan
My guess is that most real systems don't exercise the corner cases, and so
the code was just silently broken, passing the checks all the time. I could
not find an explanation for the OR check in mem_hotadd_check(), so I just
attributed it to a bug introduced out of a lack of familiarity with pdx.

Thanks,
Alejandro


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
                     ` (2 preceding siblings ...)
  2023-07-31 15:33   ` Jan Beulich
@ 2023-08-07 17:48   ` Julien Grall
  2023-08-08  9:52     ` Alejandro Vallejo
  3 siblings, 1 reply; 29+ messages in thread
From: Julien Grall @ 2023-08-07 17:48 UTC (permalink / raw)
  To: Alejandro Vallejo, Xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Jan Beulich, Wei Liu,
	Roger Pau Monné

Hi Alex,

One more remark on the title: s/HAS_// as you renamed the Kconfig.

On 28/07/2023 08:59, Alejandro Vallejo wrote:
> Adds a new compile-time flag to allow disabling pdx compression and
> compiles out compression-related code/data. It also short-circuits the
> pdx<->pfn conversion macros and creates stubs for the masking functions.
> 
> While at it, removes the old arch-defined CONFIG_HAS_PDX flag, as it was
> not removable in practice.
> 
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> v2:
>    * Merged v1/patch2: Removal of CONFIG_HAS_PDX here (Jan)
> ---
>   xen/arch/arm/Kconfig  |  1 -
>   xen/arch/x86/Kconfig  |  1 -
>   xen/arch/x86/domain.c | 19 +++++++++++++------
>   xen/common/Kconfig    | 13 ++++++++++---
>   xen/common/Makefile   |  2 +-
>   xen/common/pdx.c      | 15 +++++++++++----
>   xen/include/xen/pdx.h | 37 ++++++++++++++++++++++++++++++++++---
>   7 files changed, 69 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 439cc94f33..ea1949fbaa 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -14,7 +14,6 @@ config ARM
>   	select HAS_ALTERNATIVE
>   	select HAS_DEVICE_TREE
>   	select HAS_PASSTHROUGH
> -	select HAS_PDX
>   	select HAS_PMAP
>   	select HAS_UBSAN
>   	select IOMMU_FORCE_PT_SHARE
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 92f3a627da..30df085d96 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -24,7 +24,6 @@ config X86
>   	select HAS_PASSTHROUGH
>   	select HAS_PCI
>   	select HAS_PCI_MSI
> -	select HAS_PDX
>   	select HAS_SCHED_GRANULARITY
>   	select HAS_UBSAN
>   	select HAS_VPCI if HVM
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 5f66c2ae33..ee2830aad7 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -458,7 +458,7 @@ void domain_cpu_policy_changed(struct domain *d)
>       }
>   }
>   
> -#ifndef CONFIG_BIGMEM
> +#if !defined(CONFIG_BIGMEM) && defined(CONFIG_PDX_COMPRESSION)
>   /*
>    * The hole may be at or above the 44-bit boundary, so we need to determine
>    * the total bit count until reaching 32 significant (not squashed out) bits
> @@ -485,13 +485,20 @@ static unsigned int __init noinline _domain_struct_bits(void)
>   struct domain *alloc_domain_struct(void)
>   {
>       struct domain *d;
> -#ifdef CONFIG_BIGMEM
> -    const unsigned int bits = 0;
> -#else
> +
>       /*
> -     * We pack the PDX of the domain structure into a 32-bit field within
> -     * the page_info structure. Hence the MEMF_bits() restriction.
> +     * Without CONFIG_BIGMEM, we pack the PDX of the domain structure into
> +     * a 32-bit field within the page_info structure. Hence the MEMF_bits()
> +     * restriction. With PDX compression in place the number of bits must
> +     * be calculated at runtime, but it's fixed otherwise.
> +     *
> +     * On systems with CONFIG_BIGMEM there's no packing, and so there's no
> +     * such restriction.
>        */
> +#if defined(CONFIG_BIGMEM) || !defined(CONFIG_PDX_COMPRESSION)
> +    const unsigned int bits = IS_ENABLED(CONFIG_BIGMEM) ? 0 :
> +                                                          32 + PAGE_SHIFT;
> +#else
>       static unsigned int __read_mostly bits;
>   
>       if ( unlikely(!bits) )
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index dd8d7c3f1c..3a0afd8e83 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -23,6 +23,16 @@ config GRANT_TABLE
>   
>   	  If unsure, say Y.
>   
> +config PDX_COMPRESSION
> +	bool "PDX (Page inDeX) compression support"
> +	default ARM
> +	help
> +	  PDX compression is a technique that allows the hypervisor to
> +	  represent physical addresses in a very space-efficient manner.
> +	  This is very helpful in reducing memory wastage on systems with
> +	  memory banks whose base addresses are far from each other, but it
> +	  carries a performance cost.
> +
>   config ALTERNATIVE_CALL
>   	bool
>   
> @@ -53,9 +63,6 @@ config HAS_IOPORTS
>   config HAS_KEXEC
>   	bool
>   
> -config HAS_PDX
> -	bool
> -
>   config HAS_PMAP
>   	bool
>   
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 46049eac35..0020cafb8a 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -29,7 +29,7 @@ obj-y += multicall.o
>   obj-y += notifier.o
>   obj-$(CONFIG_NUMA) += numa.o
>   obj-y += page_alloc.o
> -obj-$(CONFIG_HAS_PDX) += pdx.o
> +obj-y += pdx.o
>   obj-$(CONFIG_PERF_COUNTERS) += perfc.o
>   obj-bin-$(CONFIG_HAS_PMAP) += pmap.init.o
>   obj-y += preempt.o
> diff --git a/xen/common/pdx.c b/xen/common/pdx.c
> index d3d38965bd..a3b1ba9fbb 100644
> --- a/xen/common/pdx.c
> +++ b/xen/common/pdx.c
> @@ -31,11 +31,15 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>   
>   bool __mfn_valid(unsigned long mfn)
>   {
> -    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
> +    bool invalid = mfn >= max_page;
> +#ifdef CONFIG_PDX_COMPRESSION
> +    invalid |= mfn & pfn_hole_mask;
> +#endif
> +
> +    if ( unlikely(evaluate_nospec(invalid)) )
>           return false;
> -    return likely(!(mfn & pfn_hole_mask)) &&
> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
> -                           pdx_group_valid));
> +
> +    return test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT, pdx_group_valid);
>   }
>   
>   void set_pdx_range(unsigned long smfn, unsigned long emfn)
> @@ -49,6 +53,8 @@ void set_pdx_range(unsigned long smfn, unsigned long emfn)
>           __set_bit(idx, pdx_group_valid);
>   }
>   
> +#ifdef CONFIG_PDX_COMPRESSION
> +
>   /*
>    * Diagram to make sense of the following variables. The masks and shifts
>    * are done on mfn values in order to convert to/from pdx:
> @@ -175,6 +181,7 @@ void __init pfn_pdx_hole_setup(unsigned long mask)
>       pfn_top_mask        = ~(pfn_pdx_bottom_mask | pfn_hole_mask);
>       ma_top_mask         = pfn_top_mask << PAGE_SHIFT;
>   }
> +#endif /* CONFIG_PDX_COMPRESSION */
>   
>   
>   /*
> diff --git a/xen/include/xen/pdx.h b/xen/include/xen/pdx.h
> index 5a82b6bde2..dfb475c8dc 100644
> --- a/xen/include/xen/pdx.h
> +++ b/xen/include/xen/pdx.h
> @@ -67,8 +67,6 @@
>    * region involved.
>    */
>   
> -#ifdef CONFIG_HAS_PDX
> -
>   extern unsigned long max_pdx;
>   
>   #define PDX_GROUP_COUNT ((1 << PDX_GROUP_SHIFT) / \
> @@ -100,6 +98,8 @@ bool __mfn_valid(unsigned long mfn);
>   #define mfn_to_pdx(mfn) pfn_to_pdx(mfn_x(mfn))
>   #define pdx_to_mfn(pdx) _mfn(pdx_to_pfn(pdx))
>   
> +#ifdef CONFIG_PDX_COMPRESSION
> +
>   extern unsigned long pfn_pdx_bottom_mask, ma_va_bottom_mask;
>   extern unsigned int pfn_pdx_hole_shift;
>   extern unsigned long pfn_hole_mask;
> @@ -205,8 +205,39 @@ static inline uint64_t directmapoff_to_maddr(unsigned long offset)
>    *             position marks a potentially compressible bit.
>    */
>   void pfn_pdx_hole_setup(unsigned long mask);
> +#else /* CONFIG_PDX_COMPRESSION */
> +
> +/* Without PDX compression we can skip some computations */
> +
> +/* pdx<->pfn == identity */
> +#define pdx_to_pfn(x) (x)
> +#define pfn_to_pdx(x) (x)
> +
> +/* directmap is indexed by maddr */
> +#define maddr_to_directmapoff(x) (x)
> +#define directmapoff_to_maddr(x) (x)
> +
> +static inline bool pdx_is_region_compressible(unsigned long smfn,
> +                                              unsigned long emfn)
> +{
> +    return true;
> +}
> +
> +static inline uint64_t pdx_init_mask(uint64_t base_addr)
> +{
> +    return 0;
> +}
> +
> +static inline uint64_t pdx_region_mask(uint64_t base, uint64_t len)
> +{
> +    return 0;
> +}
> +
> +static inline void pfn_pdx_hole_setup(unsigned long mask)
> +{
> +}
>   
> -#endif /* HAS_PDX */
> +#endif /* CONFIG_PDX_COMPRESSION */
>   #endif /* __XEN_PDX_H__ */
>   
>   /*

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters
  2023-08-07 16:26     ` Alejandro Vallejo
@ 2023-08-08  6:05       ` Jan Beulich
  2023-08-08  7:58         ` Julien Grall
  0 siblings, 1 reply; 29+ messages in thread
From: Jan Beulich @ 2023-08-08  6:05 UTC (permalink / raw)
  To: Alejandro Vallejo
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Xen-devel

On 07.08.2023 18:26, Alejandro Vallejo wrote:
> On Mon, Jul 31, 2023 at 05:15:19PM +0200, Jan Beulich wrote:
>> Especially since the names of the helper functions are longish,
>> I'm afraid I'm not fully convinced of the transformation.
> In what sense? If it's just naming style I'm happy to consider other names,
> but taking compression logic out of non-pdx code is essential to being
> able to compile it out.

I understand that, but my dislike for long names of functions doing
basic transformations remains. I'm afraid I have no good alternative
suggestion, though, and while it's been a long time, this may also
have been one reason why I didn't go with helpers back at the time.

Plus of course I remain unconvinced that compiling out really is the
better option compared to patching out, as was done by my series.

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-08-07 16:06             ` Alejandro Vallejo
@ 2023-08-08  6:17               ` Jan Beulich
  0 siblings, 0 replies; 29+ messages in thread
From: Jan Beulich @ 2023-08-08  6:17 UTC (permalink / raw)
  To: Alejandro Vallejo
  Cc: Andrew Cooper, Stefano Stabellini, Julien Grall,
	Bertrand Marquis, Volodymyr Babchuk, George Dunlap, Wei Liu,
	Roger Pau Monné,
	Xen-devel

On 07.08.2023 18:06, Alejandro Vallejo wrote:
> An alt-patching series can probably make it very close to the perf win of
> this patch as long as it transforms the conversion hunks into no-ops when
> there's no hole. I looked into the 2018 patch, and I don't think it tried
> to go that far (afaics its purpose is to inline the compression parameters
> into the code stream). I highly suspect it would still noticeably
> underperform compared to this one, but I admit it's guesswork and I'd be
> happy to be proven wrong through benchmarks.

The alt-patching will certainly have some residual overhead, if for nothing
else then for the long clobber lists in some of the alternatives.

> Summary:
>   * While alt-patching is an attractive alternative, this series doesn't do
>     that, and in the spirit of keeping things simple I'd really rather keep
>     it that way. Does this sound reasonable?
>   * For the topic of when to disable compression by default on x86, I
>     genuinely think now's as good a time as any. If we were to do it in 2
>     steps, any downstream project may very well not notice until 2 releases
>     down the line, at which point they simply must turn compression back
>     on, which is what they would have to do now anyway.
>   * For the topic of whether the option should be selectable at all, I think
>     it would be a mistake to preclude it because, while we don't know of
>     physical memory maps with big holes on (publicly available) x86, Xen
>     may itself be virtualized with arbitrary memory maps. Unlikely and
>     far-fetched as it is, it's IMO worth being at least cautious about. Gating
>     the feature on EXPERT=y and adding a big warning should be good enough
>     to avoid foot-shootouts.

I could live with this as a compromise; really my main objection is to not
allowing the functionality at all on x86. Beyond that I'm obviously not
overly happy with how things are moving here, but so be it.

Jan


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters
  2023-08-08  6:05       ` Jan Beulich
@ 2023-08-08  7:58         ` Julien Grall
  0 siblings, 0 replies; 29+ messages in thread
From: Julien Grall @ 2023-08-08  7:58 UTC (permalink / raw)
  To: Jan Beulich, Alejandro Vallejo
  Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Andrew Cooper, George Dunlap, Wei Liu, Roger Pau Monné,
	Xen-devel

Hi Jan,

On 08/08/2023 07:05, Jan Beulich wrote:
> On 07.08.2023 18:26, Alejandro Vallejo wrote:
>> On Mon, Jul 31, 2023 at 05:15:19PM +0200, Jan Beulich wrote:
>>> Especially since the names of the helper functions are longish,
>>> I'm afraid I'm not fully convinced of the transformation.
>> In what sense? If it's just naming style I'm happy to consider other names,
>> but taking compression logic out of non-pdx code is essential to being
>> able to compile it out.
> 
> I understand that, but my dislike for long names of functions doing
> basic transformations remains. I'm afraid I have no good alternative
> suggestion, though, and while it's been a long time, this may also
> have been one reason why I didn't go with helpers back at the time.
> 
> Plus of course I remain unconvinced that compiling out really is the
> better option compared to patching out, as was done by my series.

I am with Alejandro here. The alt-patching approach is more complex and
would likely require arch-specific code.

I don't exactly see the benefit of such an approach given that there is no
known x86 platform where PDX compression might be useful. Even for Arm, I am
not aware of a platform where PDX compression could be disabled. I agree this
could change in the future, but that can be revisited if needed.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option
  2023-08-07 17:48   ` Julien Grall
@ 2023-08-08  9:52     ` Alejandro Vallejo
  0 siblings, 0 replies; 29+ messages in thread
From: Alejandro Vallejo @ 2023-08-08  9:52 UTC (permalink / raw)
  To: Julien Grall
  Cc: Xen-devel, Stefano Stabellini, Bertrand Marquis,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Jan Beulich,
	Wei Liu, Roger Pau Monné

On Mon, Aug 07, 2023 at 06:48:13PM +0100, Julien Grall wrote:
> Hi Alex,
> 
> One more remark in the title. s/HAS// as you renamed the Kconfig.
>
> [snip]
> 
> -- 
> Julien Grall

Oops. Good catch, cheers.


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-08-08  9:53 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-28  7:58 [PATCH v2 0/5] Make PDX compression optional Alejandro Vallejo
2023-07-28  7:58 ` [PATCH v2 1/5] arm/mm: Document the differences between arm32 and arm64 directmaps Alejandro Vallejo
2023-07-28  9:05   ` Julien Grall
2023-08-07 13:45     ` Julien Grall
2023-07-28  7:59 ` [PATCH v2 2/5] mm: Factor out the pdx compression logic in ma/va converters Alejandro Vallejo
2023-07-28  9:07   ` Julien Grall
2023-07-31 15:15   ` Jan Beulich
2023-08-07 16:26     ` Alejandro Vallejo
2023-08-08  6:05       ` Jan Beulich
2023-08-08  7:58         ` Julien Grall
2023-07-28  7:59 ` [PATCH v2 3/5] mm/pdx: Standardize region validation wrt pdx compression Alejandro Vallejo
2023-07-28 16:19   ` Julien Grall
2023-07-31 15:27   ` Jan Beulich
2023-08-07 16:50     ` Alejandro Vallejo
2023-07-28  7:59 ` [PATCH v2 4/5] pdx: Reorder pdx.[ch] Alejandro Vallejo
2023-07-28 16:20   ` Julien Grall
2023-07-28  7:59 ` [PATCH v2 5/5] pdx: Add CONFIG_HAS_PDX_COMPRESSION as a common Kconfig option Alejandro Vallejo
2023-07-28 16:27   ` Julien Grall
2023-07-28 16:36   ` Andrew Cooper
2023-07-28 16:58     ` Andrew Cooper
2023-07-31  8:00       ` Jan Beulich
2023-07-31 17:38         ` Andrew Cooper
2023-08-01  7:57           ` Jan Beulich
2023-08-07 16:06             ` Alejandro Vallejo
2023-08-08  6:17               ` Jan Beulich
2023-07-31  9:09     ` Julien Grall
2023-07-31 15:33   ` Jan Beulich
2023-08-07 17:48   ` Julien Grall
2023-08-08  9:52     ` Alejandro Vallejo
