* [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations
@ 2018-03-13 14:00 Jan Beulich
  2018-03-13 14:14 ` [PATCH v2 1/6] x86: remove page.h and processor.h inclusion from asm_defns.h Jan Beulich
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

1: remove page.h and processor.h inclusion from asm_defns.h
2: x86: fix OLDINSTR_2()
3: use PDEP for PTE flags insertion when available
4: use PDEP/PEXT for maddr/direct-map-offset conversion when available
5: use PDEP/PEXT for PFN/PDX conversion when available
6: use MOV for PFN/PDX conversion when possible

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Extensive re-base.



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [PATCH v2 1/6] x86: remove page.h and processor.h inclusion from asm_defns.h
  2018-03-13 14:00 [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations Jan Beulich
@ 2018-03-13 14:14 ` Jan Beulich
  2018-03-13 14:14 ` [PATCH v2 2/6] x86: fix OLDINSTR_2() Jan Beulich
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Subsequent changes require this (too wide anyway imo) dependency to be
dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -5,6 +5,7 @@
 #include <asm/desc.h>
 #include <asm/fixmap.h>
 #include <asm/page.h>
+#include <asm/processor.h>
 #include <asm/msr.h>
 #include <asm/cpufeature.h>
 #include <public/elfnote.h>
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -9,6 +9,7 @@
 #include <asm/asm_defns.h>
 #include <asm/apicdef.h>
 #include <asm/page.h>
+#include <asm/processor.h>
 #include <asm/desc.h>
 #include <public/xen.h>
 #include <irq_vectors.h>
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -11,6 +11,7 @@
 #include <asm/asm_defns.h>
 #include <asm/apicdef.h>
 #include <asm/page.h>
+#include <asm/processor.h>
 #include <public/xen.h>
 #include <irq_vectors.h>
 
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -7,9 +7,8 @@
 #include <asm/asm-offsets.h>
 #endif
 #include <asm/bug.h>
-#include <asm/page.h>
-#include <asm/processor.h>
 #include <asm/percpu.h>
+#include <asm/x86-defns.h>
 #include <xen/stringify.h>
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -259,6 +259,7 @@ int init_domain_cpuid_policy(struct doma
 /* Clamp the CPUID policy to reality. */
 void recalculate_cpuid_policy(struct domain *d);
 
+struct vcpu;
 void guest_cpuid(const struct vcpu *v, uint32_t leaf,
                  uint32_t subleaf, struct cpuid_leaf *res);
 
--- a/xen/include/asm-x86/msr.h
+++ b/xen/include/asm-x86/msr.h
@@ -10,6 +10,7 @@
 #include <xen/errno.h>
 #include <asm/asm_defns.h>
 #include <asm/cpufeature.h>
+#include <asm/processor.h>
 
 #define rdmsr(msr,val1,val2) \
      __asm__ __volatile__("rdmsr" \





* [PATCH v2 2/6] x86: fix OLDINSTR_2()
  2018-03-13 14:00 [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations Jan Beulich
  2018-03-13 14:14 ` [PATCH v2 1/6] x86: remove page.h and processor.h inclusion from asm_defns.h Jan Beulich
@ 2018-03-13 14:14 ` Jan Beulich
  2018-03-28  8:48   ` Wei Liu
  2018-03-13 14:15 ` [PATCH v2 3/6] x86: use PDEP for PTE flags insertion when available Jan Beulich
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Its as_max() invocation was wrongly parenthesized.
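
To illustrate why the placement of the parentheses matters, here is a
minimal sketch. It assumes as_max() is a two-parameter macro that builds
a string; AS_MAX_DEMO and the *_LEN strings are hypothetical stand-ins,
not the real Xen macros.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: the real macros emit assembler expressions. */
#define AS_MAX_DEMO(a, b) "max(" a ", " b ")"
#define REPL_LEN_1 "repl_len1"
#define REPL_LEN_2 "repl_len2"
#define ORIG_LEN   "orig_len"

/*
 * Fixed form: the macro receives two arguments, and "-" ORIG_LEN is
 * concatenated after its expansion.
 */
#define PAD_EXPR_FIXED AS_MAX_DEMO(REPL_LEN_1, REPL_LEN_2) "-" ORIG_LEN

/*
 * Broken form, kept as a comment because it does not compile once used:
 *   AS_MAX_DEMO((REPL_LEN_1, REPL_LEN_2) "-" ORIG_LEN)
 * The extra parentheses hide the comma, so the two-parameter macro is
 * handed a single argument.
 */

const char *pad_expr(void)
{
    return PAD_EXPR_FIXED;
}
```

With the fixed form, the subtraction applies to the result of the
maximum rather than ending up inside the macro's argument list.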

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.

--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -54,8 +54,8 @@ extern void alternative_instructions(voi
 
 #define OLDINSTR_2(oldinstr, n1, n2)                             \
     OLDINSTR(oldinstr,                                           \
-             as_max((alt_repl_len(n1),                           \
-                     alt_repl_len(n2)) "-" alt_orig_len))
+             as_max(alt_repl_len(n1),                            \
+                    alt_repl_len(n2)) "-" alt_orig_len)
 
 #define ALTINSTR_ENTRY(feature, num)                                    \
         " .long .LXEN%=_orig_s - .\n"             /* label           */ \





* [PATCH v2 3/6] x86: use PDEP for PTE flags insertion when available
  2018-03-13 14:00 [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations Jan Beulich
  2018-03-13 14:14 ` [PATCH v2 1/6] x86: remove page.h and processor.h inclusion from asm_defns.h Jan Beulich
  2018-03-13 14:14 ` [PATCH v2 2/6] x86: fix OLDINSTR_2() Jan Beulich
@ 2018-03-13 14:15 ` Jan Beulich
  2018-03-13 14:15 ` [PATCH v2 4/6] x86: use PDEP/PEXT for maddr/direct-map-offset conversion " Jan Beulich
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:15 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

This allows folding 5 instructions into a single one, reducing code size
quite a bit, especially when not counting the fallback functions
(which never need to be brought into the iCache, nor their mappings into
the iTLB, on systems supporting BMI2).
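
As a rough illustration of why a single PDEP can stand in for the
shift-and-mask sequence, the sketch below models PDEP in portable C and
compares it with the arithmetic of put_pte_flags_c(). The DEMO_* values
(52 physical address bits, 4 KiB pages) and the helper names are
assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values: 52 physical address bits, 4 KiB pages. */
#define DEMO_PADDR_MASK     ((UINT64_C(1) << 52) - 1)
#define DEMO_PAGE_MASK      (~UINT64_C(0xfff))
/* Bits 0-11 and 52-63: where the 24 flag bits live in a PTE. */
#define DEMO_PTE_FLAGS_MASK (~(DEMO_PADDR_MASK & DEMO_PAGE_MASK))

/* Portable model of PDEP: scatter the low bits of src into the set bits of mask. */
uint64_t pdep_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;

    for ( uint64_t bit = 1; mask != 0; bit <<= 1 )
    {
        uint64_t lowest = mask & (~mask + 1); /* lowest bit still set in mask */

        if ( src & bit )
            result |= lowest;
        mask &= mask - 1;                     /* consume that mask bit */
    }

    return result;
}

/* Same arithmetic as put_pte_flags_c() in the patch. */
uint64_t put_pte_flags_shift(unsigned int x)
{
    return (((uint64_t)x & ~UINT64_C(0xfff)) << 40) | (x & 0xfff);
}

/* What the PDEP alternative computes, using the software model above. */
uint64_t put_pte_flags_pdep(unsigned int x)
{
    return pdep_soft(x, DEMO_PTE_FLAGS_MASK);
}
```

Both helpers place flag bits 0-11 at PTE bits 0-11 and flag bits 12-23
at PTE bits 52-63; the mask encodes the layout that the shift version
spells out by hand.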

Make use of gcc's new V operand modifier, even if that results in a
slightly odd dependency in the sources (but I also didn't want to
introduce yet another manifest constant).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Avoid quoted symbols; use gcc's new V operand modifier instead.
    Re-base.
---
TBD: Also change get_pte_flags() (after having introduced test_pte_flags())?

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -224,6 +224,12 @@ void init_or_livepatch apply_alternative
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
             *(int32_t *)(buf + 1) += repl - orig;
+        /* RIP-relative addressing is easy to check for in VEX-encoded insns. */
+        else if ( a->repl_len >= 8 &&
+                  (*buf & ~1) == 0xc4 &&
+                  a->repl_len >= 9 - (*buf & 1) &&
+                  (buf[4 - (*buf & 1)] & ~0x38) == 0x05 )
+            *(int32_t *)(buf + 5 - (*buf & 1)) += repl - orig;
 
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -391,6 +391,15 @@ void __init arch_init_memory(void)
 #endif
 }
 
+const intpte_t pte_flags_mask = ~(PADDR_MASK & PAGE_MASK);
+
+#ifndef CONFIG_INDIRECT_THUNK /* V modifier unavailable? */
+intpte_t put_pte_flags_v(unsigned int flags)
+{
+    return put_pte_flags_c(flags);
+}
+#endif
+
 int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
 {
     uint64_t maddr = pfn_to_paddr(mfn);
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -66,6 +66,7 @@ SECTIONS
         _stext = .;            /* Text and read-only data */
        *(.text)
        *(.text.__x86_indirect_thunk_*)
+       *(.gnu.linkonce.t.*)
        *(.text.page_aligned)
 
        . = ALIGN(PAGE_SIZE);
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -34,6 +34,9 @@
 
 #ifndef __ASSEMBLY__
 
+#include <asm/alternative.h>
+#include <asm/asm_defns.h>
+#include <asm/cpufeature.h>
 #include <asm/types.h>
 
 #include <xen/pdx.h>
@@ -123,15 +126,53 @@ typedef l4_pgentry_t root_pgentry_t;
 
 /* Extract flags into 24-bit integer, or turn 24-bit flags into a pte mask. */
 #ifndef __ASSEMBLY__
+extern const intpte_t pte_flags_mask;
+intpte_t __attribute_const__ put_pte_flags_v(unsigned int x);
+
 static inline unsigned int get_pte_flags(intpte_t x)
 {
     return ((x >> 40) & ~0xfff) | (x & 0xfff);
 }
 
-static inline intpte_t put_pte_flags(unsigned int x)
+static inline intpte_t put_pte_flags_c(unsigned int x)
 {
     return (((intpte_t)x & ~0xfff) << 40) | (x & 0xfff);
 }
+
+static always_inline intpte_t put_pte_flags(unsigned int x)
+{
+    intpte_t pte;
+
+    if ( __builtin_constant_p(x) )
+        return put_pte_flags_c(x);
+
+#ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
+#define SYMNAME(pfx...) #pfx "put_pte_flags_%V[pte]_%V[flags]"
+    alternative_io("call " SYMNAME() "\n\t"
+                   LINKONCE_PROLOGUE(SYMNAME) "\n\t"
+                   "mov %[flags], %k[pte]\n\t"
+                   "and $0xfff000, %[flags]\n\t"
+                   "and $0x000fff, %k[pte]\n\t"
+                   "shl $40, %q[flags]\n\t"
+                   "or %q[flags], %[pte]\n\t"
+                   "ret\n\t"
+                   LINKONCE_EPILOGUE(SYMNAME),
+                   "pdep %[mask], %q[flags], %[pte]", X86_FEATURE_BMI2,
+                   ASM_OUTPUT2([pte] "=&r" (pte), [flags] "+r" (x)),
+                   [mask] "m" (pte_flags_mask));
+#undef SYMNAME
+#else
+    alternative_io("call put_pte_flags_v",
+                   /* pdep pte_flags_mask(%rip), %rdi, %rax */
+                   ".byte 0xc4, 0xe2, 0xc3, 0xf5, 0x05\n\t"
+                   ".long pte_flags_mask - 4 - .",
+                   X86_FEATURE_BMI2,
+                   ASM_OUTPUT2("=a" (pte), "+D" (x)), "m" (pte_flags_mask)
+                   : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
+#endif
+
+    return pte;
+}
 #endif
 
 /*
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -187,6 +187,20 @@ void ret_from_intr(void);
         UNLIKELY_END_SECTION "\n"          \
         ".Llikely." #tag ".%=:"
 
+#define LINKONCE_PROLOGUE(sym)                    \
+        ".ifndef " sym() "\n\t"                   \
+        ".pushsection " sym(.gnu.linkonce.t.) "," \
+                      "\"ax\",@progbits\n\t"      \
+        ".p2align 4\n"                            \
+        sym() ":"
+
+#define LINKONCE_EPILOGUE(sym)                    \
+        ".weak " sym() "\n\t"                     \
+        ".type " sym() ", @function\n\t"          \
+        ".size " sym() ", . - " sym() "\n\t"      \
+        ".popsection\n\t"                         \
+        ".endif"
+
 #endif
 
 /* "Raw" instruction opcodes */




* [PATCH v2 4/6] x86: use PDEP/PEXT for maddr/direct-map-offset conversion when available
  2018-03-13 14:00 [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations Jan Beulich
                   ` (2 preceding siblings ...)
  2018-03-13 14:15 ` [PATCH v2 3/6] x86: use PDEP for PTE flags insertion when available Jan Beulich
@ 2018-03-13 14:15 ` Jan Beulich
  2018-03-13 14:16 ` [PATCH v2 5/6] x86: use PDEP/PEXT for PFN/PDX " Jan Beulich
  2018-03-13 14:16 ` [PATCH v2 6/6] x86: use MOV for PFN/PDX conversion when possible Jan Beulich
  5 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:15 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Both conversions replace 6 instructions with a single one, further reducing
code size as well as cache and TLB footprint (in particular on systems
supporting BMI2).
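
A small sketch of the direct-map-offset to machine-address direction may
help here. It compares the shift-and-mask form used by the do2ma()
fallback with a software model of PDEP over the combined mask. The hole
position (address bits 28-31) and all DEMO_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 4 unused address bits (28-31) are squeezed out. */
#define DEMO_HOLE_SHIFT  4
#define DEMO_BOTTOM_MASK ((UINT64_C(1) << 28) - 1)
#define DEMO_TOP_MASK    (~UINT64_C(0) << 32)
/* Mirrors ma_real_mask = ma_top_mask | ma_va_bottom_mask. */
#define DEMO_REAL_MASK   (DEMO_TOP_MASK | DEMO_BOTTOM_MASK)

/* Portable model of PDEP: scatter the low bits of src into the set bits of mask. */
uint64_t pdep_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;

    for ( uint64_t bit = 1; mask != 0; bit <<= 1 )
    {
        uint64_t lowest = mask & (~mask + 1); /* lowest bit still set in mask */

        if ( src & bit )
            result |= lowest;
        mask &= mask - 1;                     /* consume that mask bit */
    }

    return result;
}

/* Shift-and-mask form, as in the do2ma() fallback. */
uint64_t do2ma_shift(uint64_t off)
{
    return (off & DEMO_BOTTOM_MASK) |
           ((off << DEMO_HOLE_SHIFT) & DEMO_TOP_MASK);
}

/* Single-operation form: deposit the offset bits around the hole. */
uint64_t do2ma_pdep(uint64_t off)
{
    return pdep_soft(off, DEMO_REAL_MASK);
}
```

The reverse direction works the same way with PEXT, which gathers the
bits selected by the mask back into a contiguous value.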

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Avoid quoted symbols; use gcc's new V operand modifier instead.
    Re-base.

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -393,11 +393,26 @@ void __init arch_init_memory(void)
 
 const intpte_t pte_flags_mask = ~(PADDR_MASK & PAGE_MASK);
 
+paddr_t __read_mostly ma_real_mask = ~0UL;
+
 #ifndef CONFIG_INDIRECT_THUNK /* V modifier unavailable? */
 intpte_t put_pte_flags_v(unsigned int flags)
 {
     return put_pte_flags_c(flags);
 }
+
+/* Conversion between machine address and direct map offset. */
+paddr_t do2ma(unsigned long off)
+{
+    return (off & ma_va_bottom_mask) |
+           ((off << pfn_pdx_hole_shift) & ma_top_mask);
+}
+
+unsigned long ma2do(paddr_t ma)
+{
+    return (ma & ma_va_bottom_mask) |
+           ((ma & ma_top_mask) >> pfn_pdx_hole_shift);
+}
 #endif
 
 int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -446,6 +446,8 @@ void __init srat_parse_regions(u64 addr)
 	}
 
 	pfn_pdx_hole_setup(mask >> PAGE_SHIFT);
+
+	ma_real_mask = ma_top_mask | ma_va_bottom_mask;
 }
 
 /* Use the information discovered above to actually set up the nodes. */
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -42,6 +42,10 @@
 #include <xen/pdx.h>
 
 extern unsigned long xen_virt_end;
+extern paddr_t ma_real_mask;
+
+paddr_t do2ma(unsigned long);
+unsigned long ma2do(paddr_t);
 
 /*
  * Note: These are solely for the use by page_{get,set}_owner(), and
@@ -52,8 +56,10 @@ extern unsigned long xen_virt_end;
 #define pdx_to_virt(pdx) ((void *)(DIRECTMAP_VIRT_START + \
                                    ((unsigned long)(pdx) << PAGE_SHIFT)))
 
-static inline unsigned long __virt_to_maddr(unsigned long va)
+static always_inline paddr_t __virt_to_maddr(unsigned long va)
 {
+    paddr_t ma;
+
     ASSERT(va < DIRECTMAP_VIRT_END);
     if ( va >= DIRECTMAP_VIRT_START )
         va -= DIRECTMAP_VIRT_START;
@@ -66,16 +72,77 @@ static inline unsigned long __virt_to_ma
 
         va += xen_phys_start - XEN_VIRT_START;
     }
-    return (va & ma_va_bottom_mask) |
-           ((va << pfn_pdx_hole_shift) & ma_top_mask);
+
+#ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
+#define SYMNAME(pfx...) #pfx "do2ma_%V[ma]_%V[off]"
+    alternative_io("call " SYMNAME() "\n\t"
+                   LINKONCE_PROLOGUE(SYMNAME) "\n\t"
+                   "mov %[shift], %%ecx\n\t"
+                   "mov %[off], %[ma]\n\t"
+                   "and %[bmask], %[ma]\n\t"
+                   "shl %%cl, %[off]\n\t"
+                   "and %[tmask], %[off]\n\t"
+                   "or %[off], %[ma]\n\t"
+                   "ret\n\t"
+                   LINKONCE_EPILOGUE(SYMNAME),
+                   "pdep %[mask], %[off], %[ma]", X86_FEATURE_BMI2,
+                   ASM_OUTPUT2([ma] "=&r" (ma), [off] "+r" (va)),
+                   [mask] "m" (ma_real_mask),
+                   [shift] "m" (pfn_pdx_hole_shift),
+                   [bmask] "m" (ma_va_bottom_mask),
+                   [tmask] "m" (ma_top_mask)
+                   : "ecx");
+#undef SYMNAME
+#else
+    alternative_io("call do2ma",
+                   /* pdep ma_real_mask(%rip), %rdi, %rax */
+                   ".byte 0xc4, 0xe2, 0xc3, 0xf5, 0x05\n\t"
+                   ".long ma_real_mask - 4 - .",
+                   X86_FEATURE_BMI2,
+                   ASM_OUTPUT2("=a" (ma), "+D" (va)), "m" (ma_real_mask)
+                   : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
+#endif
+
+    return ma;
 }
 
-static inline void *__maddr_to_virt(unsigned long ma)
+static always_inline void *__maddr_to_virt(paddr_t ma)
 {
+    unsigned long off;
+
     ASSERT(pfn_to_pdx(ma >> PAGE_SHIFT) < (DIRECTMAP_SIZE >> PAGE_SHIFT));
-    return (void *)(DIRECTMAP_VIRT_START +
-                    ((ma & ma_va_bottom_mask) |
-                     ((ma & ma_top_mask) >> pfn_pdx_hole_shift)));
+
+#ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
+#define SYMNAME(pfx...) #pfx "ma2do_%V[off]_%V[ma]"
+    alternative_io("call " SYMNAME() "\n\t"
+                   LINKONCE_PROLOGUE(SYMNAME) "\n\t"
+                   "mov %[tmask], %[off]\n\t"
+                   "mov %[shift], %%ecx\n\t"
+                   "and %[ma], %[off]\n\t"
+                   "and %[bmask], %[ma]\n\t"
+                   "shr %%cl, %[off]\n\t"
+                   "or %[ma], %[off]\n\t"
+                   "ret\n\t"
+                   LINKONCE_EPILOGUE(SYMNAME),
+                   "pext %[mask], %[ma], %[off]", X86_FEATURE_BMI2,
+                   ASM_OUTPUT2([off] "=&r" (off), [ma] "+r" (ma)),
+                   [mask] "m" (ma_real_mask),
+                   [shift] "m" (pfn_pdx_hole_shift),
+                   [bmask] "m" (ma_va_bottom_mask),
+                   [tmask] "m" (ma_top_mask)
+                   : "ecx");
+#undef SYMNAME
+#else
+    alternative_io("call ma2do",
+                   /* pext ma_real_mask(%rip), %rdi, %rax */
+                   ".byte 0xc4, 0xe2, 0xc2, 0xf5, 0x05\n\t"
+                   ".long ma_real_mask - 4 - .",
+                   X86_FEATURE_BMI2,
+                   ASM_OUTPUT2("=a" (off), "+D" (ma)), "m" (ma_real_mask)
+                   : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
+#endif
+
+    return (void *)DIRECTMAP_VIRT_START + off;
 }
 
 /* read access (should only be used for debug printk's) */




* [PATCH v2 5/6] x86: use PDEP/PEXT for PFN/PDX conversion when available
  2018-03-13 14:00 [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations Jan Beulich
                   ` (3 preceding siblings ...)
  2018-03-13 14:15 ` [PATCH v2 4/6] x86: use PDEP/PEXT for maddr/direct-map-offset conversion " Jan Beulich
@ 2018-03-13 14:16 ` Jan Beulich
  2018-03-13 14:16 ` [PATCH v2 6/6] x86: use MOV for PFN/PDX conversion when possible Jan Beulich
  5 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Both conversions replace 6 instructions with a single one, further reducing
code size as well as cache and TLB footprint (in particular on systems
supporting BMI2).
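
For the PFN to PDX direction, the following sketch models PEXT in
portable C and checks it against the generic shift-and-mask conversion,
including a round trip back to the PFN. The hole position (PFN bits
16-19) and the DEMO_* names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: PFN bits 16-19 are never set and get compressed away. */
#define DEMO_HOLE_SHIFT      4
#define DEMO_PFN_BOTTOM_MASK ((UINT64_C(1) << 16) - 1)
#define DEMO_PFN_TOP_MASK    (~UINT64_C(0) << 20)
/* Mirrors pfn_real_mask = pfn_top_mask | pfn_pdx_bottom_mask. */
#define DEMO_PFN_REAL_MASK   (DEMO_PFN_TOP_MASK | DEMO_PFN_BOTTOM_MASK)

/* Portable model of PEXT: gather the bits of src selected by mask into the low bits. */
uint64_t pext_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;

    for ( uint64_t bit = 1; mask != 0; bit <<= 1 )
    {
        uint64_t lowest = mask & (~mask + 1); /* lowest bit still set in mask */

        if ( src & lowest )
            result |= bit;
        mask &= mask - 1;                     /* consume that mask bit */
    }

    return result;
}

/* Same arithmetic as generic_pfn_to_pdx(). */
uint64_t pfn_to_pdx_shift(uint64_t pfn)
{
    return (pfn & DEMO_PFN_BOTTOM_MASK) |
           ((pfn & DEMO_PFN_TOP_MASK) >> DEMO_HOLE_SHIFT);
}

/* Same arithmetic as generic_pdx_to_pfn(). */
uint64_t pdx_to_pfn_shift(uint64_t pdx)
{
    return (pdx & DEMO_PFN_BOTTOM_MASK) |
           ((pdx << DEMO_HOLE_SHIFT) & DEMO_PFN_TOP_MASK);
}

/* Single-operation form of the PFN -> PDX conversion. */
uint64_t pfn_to_pdx_pext(uint64_t pfn)
{
    return pext_soft(pfn, DEMO_PFN_REAL_MASK);
}
```

In this layout the PFN 0x1230abcd compresses to the PDX 0x123abcd, and
expanding it again restores the original value.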

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Avoid quoted symbols; use gcc's new V operand modifier instead.
    Re-base.

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -394,6 +394,7 @@ void __init arch_init_memory(void)
 const intpte_t pte_flags_mask = ~(PADDR_MASK & PAGE_MASK);
 
 paddr_t __read_mostly ma_real_mask = ~0UL;
+unsigned long __read_mostly pfn_real_mask = ~0UL;
 
 #ifndef CONFIG_INDIRECT_THUNK /* V modifier unavailable? */
 intpte_t put_pte_flags_v(unsigned int flags)
@@ -413,6 +414,17 @@ unsigned long ma2do(paddr_t ma)
     return (ma & ma_va_bottom_mask) |
            ((ma & ma_top_mask) >> pfn_pdx_hole_shift);
 }
+
+/* Conversion between PDX and PFN. */
+unsigned long pdx2pfn(unsigned long pdx)
+{
+    return generic_pdx_to_pfn(pdx);
+}
+
+unsigned long pfn2pdx(unsigned long pfn)
+{
+    return generic_pfn_to_pdx(pfn);
+}
 #endif
 
 int page_is_ram_type(unsigned long mfn, unsigned long mem_type)
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -448,6 +448,7 @@ void __init srat_parse_regions(u64 addr)
 	pfn_pdx_hole_setup(mask >> PAGE_SHIFT);
 
 	ma_real_mask = ma_top_mask | ma_va_bottom_mask;
+	pfn_real_mask = pfn_top_mask | pfn_pdx_bottom_mask;
 }
 
 /* Use the information discovered above to actually set up the nodes. */
--- /dev/null
+++ b/xen/include/asm-arm/pdx.h
@@ -0,0 +1,16 @@
+#ifndef __ASM_ARM_PDX_H__
+#define __ASM_ARM_PDX_H__
+
+#define pdx_to_pfn generic_pdx_to_pfn
+#define pfn_to_pdx generic_pfn_to_pdx
+
+#endif /* __ASM_ARM_PDX_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--- /dev/null
+++ b/xen/include/asm-x86/pdx.h
@@ -0,0 +1,93 @@
+#ifndef __ASM_ARM_PDX_H__
+#define __ASM_ARM_PDX_H__
+
+#include <asm/alternative.h>
+#include <asm/asm_defns.h>
+#include <asm/cpufeature.h>
+
+extern unsigned long pfn_real_mask;
+
+static always_inline unsigned long pdx_to_pfn(unsigned long pdx)
+{
+    unsigned long pfn;
+
+#ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
+#define SYMNAME(pfx...) #pfx "pdx2pfn_%V[pfn]_%V[pdx]"
+    alternative_io("call " SYMNAME() "\n\t"
+                   LINKONCE_PROLOGUE(SYMNAME) "\n\t"
+                   "mov %[shift], %%ecx\n\t"
+                   "mov %[pdx], %[pfn]\n\t"
+                   "and %[bmask], %[pfn]\n\t"
+                   "shl %%cl, %[pdx]\n\t"
+                   "and %[tmask], %[pdx]\n\t"
+                   "or %[pdx], %[pfn]\n\t"
+                   "ret\n\t"
+                   LINKONCE_EPILOGUE(SYMNAME),
+                   "pdep %[mask], %[pdx], %[pfn]", X86_FEATURE_BMI2,
+                   ASM_OUTPUT2([pfn] "=&r" (pfn), [pdx] "+r" (pdx)),
+                   [mask] "m" (pfn_real_mask),
+                   [shift] "m" (pfn_pdx_hole_shift),
+                   [bmask] "m" (pfn_pdx_bottom_mask),
+                   [tmask] "m" (pfn_top_mask)
+                   : "ecx");
+#undef SYMNAME
+#else
+    alternative_io("call pdx2pfn",
+                   /* pdep pfn_real_mask(%rip), %rdi, %rax */
+                   ".byte 0xc4, 0xe2, 0xc3, 0xf5, 0x05\n\t"
+                   ".long pfn_real_mask - 4 - .",
+                   X86_FEATURE_BMI2,
+                   ASM_OUTPUT2("=a" (pfn), "+D" (pdx)), "m" (pfn_real_mask)
+                   : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
+#endif
+
+    return pfn;
+}
+
+static always_inline unsigned long pfn_to_pdx(unsigned long pfn)
+{
+    unsigned long pdx;
+
+#ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
+#define SYMNAME(pfx...) #pfx "pfn2pdx_%V[pdx]_%V[pfn]"
+    alternative_io("call " SYMNAME() "\n\t"
+                   LINKONCE_PROLOGUE(SYMNAME) "\n\t"
+                   "mov %[tmask], %[pdx]\n\t"
+                   "mov %[shift], %%ecx\n\t"
+                   "and %[pfn], %[pdx]\n\t"
+                   "and %[bmask], %[pfn]\n\t"
+                   "shr %%cl, %[pdx]\n\t"
+                   "or %[pfn], %[pdx]\n\t"
+                   "ret\n\t"
+                   LINKONCE_EPILOGUE(SYMNAME),
+                   "pext %[mask], %[pfn], %[pdx]", X86_FEATURE_BMI2,
+                   ASM_OUTPUT2([pdx] "=&r" (pdx), [pfn] "+r" (pfn)),
+                   [mask] "m" (pfn_real_mask),
+                   [shift] "m" (pfn_pdx_hole_shift),
+                   [bmask] "m" (pfn_pdx_bottom_mask),
+                   [tmask] "m" (pfn_top_mask)
+                   : "ecx");
+#undef SYMNAME
+#else
+    alternative_io("call pfn2pdx",
+                   /* pext pfn_real_mask(%rip), %rdi, %rax */
+                   ".byte 0xc4, 0xe2, 0xc2, 0xf5, 0x05\n\t"
+                   ".long pfn_real_mask - 4 - .",
+                   X86_FEATURE_BMI2,
+                   ASM_OUTPUT2("=a" (pdx), "+D" (pfn)), "m" (pfn_real_mask)
+                   : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
+#endif
+
+    return pdx;
+}
+
+#endif /* __ASM_ARM_PDX_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--- a/xen/include/xen/pdx.h
+++ b/xen/include/xen/pdx.h
@@ -23,13 +23,13 @@ extern void set_pdx_range(unsigned long
 
 bool __mfn_valid(unsigned long mfn);
 
-static inline unsigned long pfn_to_pdx(unsigned long pfn)
+static inline unsigned long generic_pfn_to_pdx(unsigned long pfn)
 {
     return (pfn & pfn_pdx_bottom_mask) |
            ((pfn & pfn_top_mask) >> pfn_pdx_hole_shift);
 }
 
-static inline unsigned long pdx_to_pfn(unsigned long pdx)
+static inline unsigned long generic_pdx_to_pfn(unsigned long pdx)
 {
     return (pdx & pfn_pdx_bottom_mask) |
            ((pdx << pfn_pdx_hole_shift) & pfn_top_mask);
@@ -37,6 +37,8 @@ static inline unsigned long pdx_to_pfn(u
 
 extern void pfn_pdx_hole_setup(unsigned long);
 
+#include <asm/pdx.h>
+
 #endif /* HAS_PDX */
 #endif /* __XEN_PDX_H__ */
 




* [PATCH v2 6/6] x86: use MOV for PFN/PDX conversion when possible
  2018-03-13 14:00 [PATCH v2 0/6] x86: improve PDX <-> PFN and alike translations Jan Beulich
                   ` (4 preceding siblings ...)
  2018-03-13 14:16 ` [PATCH v2 5/6] x86: use PDEP/PEXT for PFN/PDX " Jan Beulich
@ 2018-03-13 14:16 ` Jan Beulich
  5 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2018-03-13 14:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

... and (of course) also maddr / direct-map-offset ones.

Most x86 systems don't actually require the use of PDX compression. Now
that we have patching for the conversion code in place anyway, extend it
to use simple MOV when possible. Introduce a new pseudo-CPU-feature to
key the patching off of.
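
The sketch below shows why a plain MOV suffices when no compression is
in effect. It assumes the uncompressed defaults are an all-ones bottom
mask, a zero top mask and a zero hole shift; struct pdx_params and the
function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bundle of the three PDX compression parameters. */
struct pdx_params {
    unsigned long bottom_mask;
    unsigned long top_mask;
    unsigned int hole_shift;
};

/* Assumed defaults when no hole was found: nothing gets squeezed out. */
const struct pdx_params no_compression = { ~0UL, 0UL, 0 };

/* Same arithmetic as the generic shift-and-mask conversion. */
unsigned long pfn_to_pdx_generic(const struct pdx_params *p, unsigned long pfn)
{
    return (pfn & p->bottom_mask) | ((pfn & p->top_mask) >> p->hole_shift);
}

/* Mirrors the check in the patch: a zero shift means the mapping is 1:1. */
bool pfn_pdx_is_identity(const struct pdx_params *p)
{
    return p->hole_shift == 0;
}

unsigned long pfn_to_pdx_demo(const struct pdx_params *p, unsigned long pfn)
{
    /* Stand-in for the patched MOV: skip the masks and the shift entirely. */
    if ( pfn_pdx_is_identity(p) )
        return pfn;

    return pfn_to_pdx_generic(p, pfn);
}
```

In the patch the decision is made once, at boot, by setting the
pseudo-feature; the run-time branch above only stands in for that
patching.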

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Avoid quoted symbols; use gcc's new V operand modifier instead.
    Re-base.
---
This patch will only apply cleanly on top of "x86: NOP out XPTI
entry/exit code when it's not in use".

--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1410,6 +1410,9 @@ void __init noreturn __start_xen(unsigne
 
     numa_initmem_init(0, raw_max_page);
 
+    if ( !pfn_pdx_hole_shift )
+        setup_force_cpu_cap(X86_FEATURE_PFN_PDX_IDENT);
+
     if ( max_page - 1 > virt_to_mfn(HYPERVISOR_VIRT_END - 1) )
     {
         unsigned long limit = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
--- a/xen/include/asm-x86/cpufeatures.h
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -31,3 +31,4 @@ XEN_CPUFEATURE(XEN_IBRS_CLEAR,  (FSCAPIN
 XEN_CPUFEATURE(RSB_NATIVE,      (FSCAPINTS+0)*32+18) /* RSB overwrite needed for native */
 XEN_CPUFEATURE(RSB_VMEXIT,      (FSCAPINTS+0)*32+19) /* RSB overwrite needed for vmexit */
 XEN_CPUFEATURE(NO_XPTI,         (FSCAPINTS+0)*32+20) /* XPTI mitigation not in use */
+XEN_CPUFEATURE(PFN_PDX_IDENT,   (FSCAPINTS+0)*32+21) /* PFN <-> PDX mapping is 1:1 */
--- a/xen/include/asm-x86/pdx.h
+++ b/xen/include/asm-x86/pdx.h
@@ -13,7 +13,7 @@ static always_inline unsigned long pdx_t
 
 #ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
 #define SYMNAME(pfx...) #pfx "pdx2pfn_%V[pfn]_%V[pdx]"
-    alternative_io("call " SYMNAME() "\n\t"
+    alternative_io_2("call " SYMNAME() "\n\t"
                    LINKONCE_PROLOGUE(SYMNAME) "\n\t"
                    "mov %[shift], %%ecx\n\t"
                    "mov %[pdx], %[pfn]\n\t"
@@ -24,6 +24,7 @@ static always_inline unsigned long pdx_t
                    "ret\n\t"
                    LINKONCE_EPILOGUE(SYMNAME),
                    "pdep %[mask], %[pdx], %[pfn]", X86_FEATURE_BMI2,
+                   "mov %[pdx], %[pfn]", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2([pfn] "=&r" (pfn), [pdx] "+r" (pdx)),
                    [mask] "m" (pfn_real_mask),
                    [shift] "m" (pfn_pdx_hole_shift),
@@ -32,11 +33,12 @@ static always_inline unsigned long pdx_t
                    : "ecx");
 #undef SYMNAME
 #else
-    alternative_io("call pdx2pfn",
+    alternative_io_2("call pdx2pfn",
                    /* pdep pfn_real_mask(%rip), %rdi, %rax */
                    ".byte 0xc4, 0xe2, 0xc3, 0xf5, 0x05\n\t"
                    ".long pfn_real_mask - 4 - .",
                    X86_FEATURE_BMI2,
+                   "mov %%rdi, %%rax", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2("=a" (pfn), "+D" (pdx)), "m" (pfn_real_mask)
                    : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
 #endif
@@ -50,7 +52,7 @@ static always_inline unsigned long pfn_t
 
 #ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
 #define SYMNAME(pfx...) #pfx "pfn2pdx_%V[pdx]_%V[pfn]"
-    alternative_io("call " SYMNAME() "\n\t"
+    alternative_io_2("call " SYMNAME() "\n\t"
                    LINKONCE_PROLOGUE(SYMNAME) "\n\t"
                    "mov %[tmask], %[pdx]\n\t"
                    "mov %[shift], %%ecx\n\t"
@@ -61,6 +63,7 @@ static always_inline unsigned long pfn_t
                    "ret\n\t"
                    LINKONCE_EPILOGUE(SYMNAME),
                    "pext %[mask], %[pfn], %[pdx]", X86_FEATURE_BMI2,
+                   "mov %[pfn], %[pdx]", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2([pdx] "=&r" (pdx), [pfn] "+r" (pfn)),
                    [mask] "m" (pfn_real_mask),
                    [shift] "m" (pfn_pdx_hole_shift),
@@ -69,11 +72,12 @@ static always_inline unsigned long pfn_t
                    : "ecx");
 #undef SYMNAME
 #else
-    alternative_io("call pfn2pdx",
+    alternative_io_2("call pfn2pdx",
                    /* pext pfn_real_mask(%rip), %rdi, %rax */
                    ".byte 0xc4, 0xe2, 0xc2, 0xf5, 0x05\n\t"
                    ".long pfn_real_mask - 4 - .",
                    X86_FEATURE_BMI2,
+                   "mov %%rdi, %%rax", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2("=a" (pdx), "+D" (pfn)), "m" (pfn_real_mask)
                    : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
 #endif
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -75,7 +75,7 @@ static always_inline paddr_t __virt_to_m
 
 #ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
 #define SYMNAME(pfx...) #pfx "do2ma_%V[ma]_%V[off]"
-    alternative_io("call " SYMNAME() "\n\t"
+    alternative_io_2("call " SYMNAME() "\n\t"
                    LINKONCE_PROLOGUE(SYMNAME) "\n\t"
                    "mov %[shift], %%ecx\n\t"
                    "mov %[off], %[ma]\n\t"
@@ -86,6 +86,7 @@ static always_inline paddr_t __virt_to_m
                    "ret\n\t"
                    LINKONCE_EPILOGUE(SYMNAME),
                    "pdep %[mask], %[off], %[ma]", X86_FEATURE_BMI2,
+                   "mov %[off], %[ma]", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2([ma] "=&r" (ma), [off] "+r" (va)),
                    [mask] "m" (ma_real_mask),
                    [shift] "m" (pfn_pdx_hole_shift),
@@ -94,11 +95,12 @@ static always_inline paddr_t __virt_to_m
                    : "ecx");
 #undef SYMNAME
 #else
-    alternative_io("call do2ma",
+    alternative_io_2("call do2ma",
                    /* pdep ma_real_mask(%rip), %rdi, %rax */
                    ".byte 0xc4, 0xe2, 0xc3, 0xf5, 0x05\n\t"
                    ".long ma_real_mask - 4 - .",
                    X86_FEATURE_BMI2,
+                   "mov %%rdi, %%rax", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2("=a" (ma), "+D" (va)), "m" (ma_real_mask)
                    : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
 #endif
@@ -114,7 +116,7 @@ static always_inline void *__maddr_to_vi
 
 #ifdef CONFIG_INDIRECT_THUNK /* V modifier available? */
 #define SYMNAME(pfx...) #pfx "ma2do_%V[off]_%V[ma]"
-    alternative_io("call " SYMNAME() "\n\t"
+    alternative_io_2("call " SYMNAME() "\n\t"
                    LINKONCE_PROLOGUE(SYMNAME) "\n\t"
                    "mov %[tmask], %[off]\n\t"
                    "mov %[shift], %%ecx\n\t"
@@ -125,6 +127,7 @@ static always_inline void *__maddr_to_vi
                    "ret\n\t"
                    LINKONCE_EPILOGUE(SYMNAME),
                    "pext %[mask], %[ma], %[off]", X86_FEATURE_BMI2,
+                   "mov %[ma], %[off]", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2([off] "=&r" (off), [ma] "+r" (ma)),
                    [mask] "m" (ma_real_mask),
                    [shift] "m" (pfn_pdx_hole_shift),
@@ -133,11 +136,12 @@ static always_inline void *__maddr_to_vi
                    : "ecx");
 #undef SYMNAME
 #else
-    alternative_io("call ma2do",
+    alternative_io_2("call ma2do",
                    /* pext ma_real_mask(%rip), %rdi, %rax */
                    ".byte 0xc4, 0xe2, 0xc2, 0xf5, 0x05\n\t"
                    ".long ma_real_mask - 4 - .",
                    X86_FEATURE_BMI2,
+                   "mov %%rdi, %%rax", X86_FEATURE_PFN_PDX_IDENT,
                    ASM_OUTPUT2("=a" (off), "+D" (ma)), "m" (ma_real_mask)
                    : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11");
 #endif




* Re: [PATCH v2 2/6] x86: fix OLDINSTR_2()
  2018-03-13 14:14 ` [PATCH v2 2/6] x86: fix OLDINSTR_2() Jan Beulich
@ 2018-03-28  8:48   ` Wei Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Wei Liu @ 2018-03-28  8:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Mar 13, 2018 at 08:14:51AM -0600, Jan Beulich wrote:
> Its as_max() invocation was wrongly parenthesized.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
