* [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, Stefano Stabellini, Tim Deegan

Currently Xen maps all RAM as Outer-Shareable, since that seemed like
the most conservative option early on when we didn't really know what
Inner- vs. Outer-Shareable meant. However we have long suspected that
actually Inner-Shareable would be the correct type to use.

After reading the docs many times, getting more confused each time, I
finally got a reasonable explanation from a man (and a dog) down the
pub: Inner-Shareable == the processors in an SMP system, while
Outer-Shareable == devices. (NB: Not a random man, he knew what he was
talking about...). With that in mind switch all of Xen's memory mapping,
page table walks, TLB flushes and an appropriate subset of the barriers
to be inner shareable.
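
To make that concrete, the distinction shows up in the scope argument
given to the barrier instructions. A minimal sketch (the actual macros
are introduced by patches 7-10):

    /* Order accesses against the other CPUs only (inner shareable): */
    asm volatile("dmb ish" : : : "memory");

    /* Order accesses against the whole system, devices included: */
    asm volatile("dmb sy" : : : "memory");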

In addition I have switched barriers to use the correct read/write/any
variants for the accesses they order. Note that I have only tackled the generic
mb/rmb/wmb and smp_* barriers (mainly used by common code) here. There
are also quite a few open-coded full-system dsb's in the arch code which
I will look at separately (10 patches is quite enough for now). Since
those deal with e.g. pagetable updates, they will be fun ;-)
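
For reference, the end state of the generic barriers after this series
is roughly (sketch of xen/include/asm-arm/system.h as of patch 10):

    #define mb()        dsb("sy")     /* any-any,     full system */
    #define rmb()       dsb("ld")     /* reads only,  full system */
    #define wmb()       dsb("st")     /* writes only, full system */

    #define smp_mb()    dmb("ish")    /* any-any,     inner shareable */
    #define smp_rmb()   dmb("ishld")  /* arm64 only; 32-bit uses "ish" */
    #define smp_wmb()   dmb("ishst")  /* writes only, inner shareable */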

I have tested this series on Cortex A15 and AEMv8 fast models, in both
cases with 2 CPUs, and I can start a guest on each. I have not tested
on any real hardware at all.

These changes should result in a performance improvement, although,
with only models to go on, I haven't actually bothered to measure it.

I would appreciate anyone with access to real hardware giving it a go. I
have pushed the patches to:

        git://xenbits.xen.org/people/ianc/xen.git inner-shareable-v1
        
        
Ian Campbell (10):
      xen: arm: map memory as inner shareable.
      xen: arm: Only upgrade guest barriers to inner shareable.
      xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
      xen: arm: consolidate barrier definitions
      xen: use SMP barrier in common code dealing with shared memory protocols
      xen: arm: Use SMP barriers when that is all which is required.
      xen: arm: Use dmb for smp barriers
      xen: arm: add scope to dsb and dmb macros
      xen: arm: weaken SMP barriers to inner shareable.
      xen: arm: use more specific barriers for read and write barriers.

 xen/arch/arm/arm32/head.S            |    8 +++---
 xen/arch/arm/arm64/head.S            |    8 +++---
 xen/arch/arm/domain.c                |    2 +-
 xen/arch/arm/gic.c                   |    8 +++---
 xen/arch/arm/mm.c                    |   28 ++++++++++------------
 xen/arch/arm/platforms/vexpress.c    |    6 ++--
 xen/arch/arm/smpboot.c               |    6 ++--
 xen/arch/arm/time.c                  |    2 +-
 xen/arch/arm/traps.c                 |    2 +-
 xen/common/domain.c                  |    2 +-
 xen/common/domctl.c                  |    2 +-
 xen/common/grant_table.c             |    4 +-
 xen/common/page_alloc.c              |    2 +-
 xen/common/smp.c                     |    4 +-
 xen/common/spinlock.c                |    6 ++--
 xen/common/tmem_xen.c                |   10 ++++----
 xen/common/trace.c                   |    8 +++---
 xen/drivers/char/console.c           |    2 +-
 xen/drivers/video/arm_hdlcd.c        |    2 +-
 xen/include/asm-arm/arm32/flushtlb.h |   12 +++++-----
 xen/include/asm-arm/arm32/io.h       |    4 +-
 xen/include/asm-arm/arm32/page.h     |   12 +++++-----
 xen/include/asm-arm/arm32/system.h   |   16 -------------
 xen/include/asm-arm/arm64/flushtlb.h |    4 +-
 xen/include/asm-arm/arm64/io.h       |    4 +-
 xen/include/asm-arm/arm64/page.h     |   10 ++++----
 xen/include/asm-arm/arm64/system.h   |   17 -------------
 xen/include/asm-arm/page.h           |   42 +++++++++++++++++++++++++++++-----
 xen/include/asm-arm/system.h         |   21 +++++++++++++++++
 xen/include/xen/event.h              |    4 +-
 xen/xsm/flask/ss/sidtab.c            |    4 +-
 31 files changed, 139 insertions(+), 123 deletions(-)

Ian.


* [PATCH 01/10] xen: arm: map memory as inner shareable.
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel
  Cc: julien.grall, tim, Ian Campbell, Leif Lindholm, stefano.stabellini

The inner shareable domain contains all SMP processors, including different
clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
memory mappings. The outer shareable domain is for devices on buses which are
barrier-aware (e.g. AMBA4), while the system domain is for things behind
bridges which are not.

One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on
ARMv7, otherwise the shareability is UNPREDICTABLE. Therefore change the
prototype of mfn_to_xen_entry to take the attribute index so we can DTRT.
On ARMv8 the shareability is ignored and such memory is always treated as
Outer Shareable.
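
Condensing the hunk added below, the shareability chosen by
mfn_to_xen_entry() now boils down to (sketch):

    switch ( attr )
    {
    case BUFFERABLE:  /* Normal Inner-NC/Outer-NC: must be Outer */
    case UNCACHED:    /* shareability ignored for Device memory, */
    case DEV_SHARED:  /* Outer is as good as anything */
        e.pt.sh = LPAE_SH_OUTER;
        break;
    default:
        e.pt.sh = LPAE_SH_INNER;  /* Xen mappings are SMP coherent */
        break;
    }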

While I'm here, change all the dmb/dsb with an implicit sy to an explicit sy,
to make future changes simpler. Other than that the barriers, flushes etc are
not adjusted; they remain as they were (which is more than is now required).
I'll change those in a later patch.

Many thanks to Leif for explaining the difference between Inner- and
Outer-Shareable in words of two or fewer syllables; I hope I've replicated
that explanation properly above!

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
---
 xen/arch/arm/arm32/head.S          |    8 +++---
 xen/arch/arm/arm64/head.S          |    8 +++---
 xen/arch/arm/mm.c                  |   24 ++++++++++------------
 xen/include/asm-arm/arm32/system.h |    4 +-
 xen/include/asm-arm/page.h         |   38 ++++++++++++++++++++++++++++++++---
 5 files changed, 55 insertions(+), 27 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 0588d54..464c351 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -24,8 +24,8 @@
 
 #define ZIMAGE_MAGIC_NUMBER 0x016f2818
 
-#define PT_PT     0xe7f /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=1 P=1 */
-#define PT_MEM    0xe7d /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=0 P=1 */
+#define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
+#define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
 #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
 #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
 
@@ -206,10 +206,10 @@ skip_bss:
         mcr   CP32(r1, HMAIR1)
 
         /* Set up the HTCR:
-         * PT walks use Outer-Shareable accesses,
+         * PT walks use Inner-Shareable accesses,
          * PT walks are write-back, write-allocate in both cache levels,
          * Full 32-bit address space goes through this table. */
-        ldr   r0, =0x80002500
+        ldr   r0, =0x80003500
         mcr   CP32(r0, HTCR)
 
         /* Set up the HSCTLR:
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 21b7e4d..ffcb880 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -24,8 +24,8 @@
 #include <asm/page.h>
 #include <asm/asm_defns.h>
 
-#define PT_PT     0xe7f /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=1 P=1 */
-#define PT_MEM    0xe7d /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=0 P=1 */
+#define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
+#define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
 #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
 #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
 
@@ -178,10 +178,10 @@ skip_bss:
         /* Set up the HTCR:
          * PASize -- 4G
          * Top byte is used
-         * PT walks use Outer-Shareable accesses,
+         * PT walks use Inner-Shareable accesses,
          * PT walks are write-back, write-allocate in both cache levels,
          * Full 64-bit address space goes through this table. */
-        ldr   x0, =0x80802500
+        ldr   x0, =0x80803500
         msr   tcr_el2, x0
 
         /* Set up the HSCTLR:
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d1290cd..c5213f2 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -163,9 +163,8 @@ void dump_hyp_walk(vaddr_t addr)
 /* Map a 4k page in a fixmap entry */
 void set_fixmap(unsigned map, unsigned long mfn, unsigned attributes)
 {
-    lpae_t pte = mfn_to_xen_entry(mfn);
+    lpae_t pte = mfn_to_xen_entry(mfn, attributes);
     pte.pt.table = 1; /* 4k mappings always have this bit set */
-    pte.pt.ai = attributes;
     pte.pt.xn = 1;
     write_pte(xen_fixmap + third_table_offset(FIXMAP_ADDR(map)), pte);
     flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE);
@@ -212,7 +211,7 @@ void *map_domain_page(unsigned long mfn)
         if ( map[slot].pt.avail == 0 )
         {
             /* Commandeer this 2MB slot */
-            pte = mfn_to_xen_entry(slot_mfn);
+            pte = mfn_to_xen_entry(slot_mfn, WRITEALLOC);
             pte.pt.avail = 1;
             write_pte(map + slot, pte);
             break;
@@ -340,7 +339,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
 
     /* Map the destination in the boot misc area. */
     dest_va = BOOT_MISC_VIRT_START;
-    pte = mfn_to_xen_entry(xen_paddr >> PAGE_SHIFT);
+    pte = mfn_to_xen_entry(xen_paddr >> PAGE_SHIFT, WRITEALLOC);
     write_pte(xen_second + second_table_offset(dest_va), pte);
     flush_xen_data_tlb_range_va(dest_va, SECOND_SIZE);
 
@@ -387,7 +386,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
 
     /* Link in the fixmap pagetable */
     pte = mfn_to_xen_entry((((unsigned long) xen_fixmap) + phys_offset)
-                           >> PAGE_SHIFT);
+                           >> PAGE_SHIFT, WRITEALLOC);
     pte.pt.table = 1;
     write_pte(xen_second + second_table_offset(FIXMAP_ADDR(0)), pte);
     /*
@@ -402,7 +401,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
         unsigned long va = XEN_VIRT_START + (i << PAGE_SHIFT);
         if ( !is_kernel(va) )
             break;
-        pte = mfn_to_xen_entry(mfn);
+        pte = mfn_to_xen_entry(mfn, WRITEALLOC);
         pte.pt.table = 1; /* 4k mappings always have this bit set */
         if ( is_kernel_text(va) || is_kernel_inittext(va) )
         {
@@ -415,7 +414,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
         /* No flush required here as page table is not hooked in yet. */
     }
     pte = mfn_to_xen_entry((((unsigned long) xen_xenmap) + phys_offset)
-                           >> PAGE_SHIFT);
+                           >> PAGE_SHIFT, WRITEALLOC);
     pte.pt.table = 1;
     write_pte(xen_second + second_linear_offset(XEN_VIRT_START), pte);
     /* TLBFLUSH and ISB would be needed here, but wait until we set WXN */
@@ -467,7 +466,7 @@ int init_secondary_pagetables(int cpu)
     /* Initialise first pagetable from first level of boot tables, and
      * hook into the new root. */
     memcpy(first, boot_first, PAGE_SIZE);
-    pte = mfn_to_xen_entry(virt_to_mfn(first));
+    pte = mfn_to_xen_entry(virt_to_mfn(first), WRITEALLOC);
     pte.pt.table = 1;
     write_pte(root, pte);
 #endif
@@ -479,7 +478,7 @@ int init_secondary_pagetables(int cpu)
      * domheap mapping pages. */
     for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ )
     {
-        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES));
+        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES), WRITEALLOC);
         pte.pt.table = 1;
         write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
     }
@@ -525,7 +524,7 @@ static void __init create_mappings(unsigned long virt,
 
     count = nr_mfns / LPAE_ENTRIES;
     p = xen_second + second_linear_offset(virt);
-    pte = mfn_to_xen_entry(base_mfn);
+    pte = mfn_to_xen_entry(base_mfn, WRITEALLOC);
     pte.pt.hint = 1;  /* These maps are in 16-entry contiguous chunks. */
     for ( i = 0; i < count; i++ )
     {
@@ -595,7 +594,7 @@ static int create_xen_table(lpae_t *entry)
     if ( p == NULL )
         return -ENOMEM;
     clear_page(p);
-    pte = mfn_to_xen_entry(virt_to_mfn(p));
+    pte = mfn_to_xen_entry(virt_to_mfn(p), WRITEALLOC);
     pte.pt.table = 1;
     write_pte(entry, pte);
     return 0;
@@ -641,9 +640,8 @@ static int create_xen_entries(enum xenmap_operation op,
                            addr, mfn);
                     return -EINVAL;
                 }
-                pte = mfn_to_xen_entry(mfn);
+                pte = mfn_to_xen_entry(mfn, ai);
                 pte.pt.table = 1;
-                pte.pt.ai = ai;
                 write_pte(&third[third_table_offset(addr)], pte);
                 break;
             case REMOVE:
diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
index 60148cb..b3736f4 100644
--- a/xen/include/asm-arm/arm32/system.h
+++ b/xen/include/asm-arm/arm32/system.h
@@ -7,8 +7,8 @@
 #define wfi() __asm__ __volatile__ ("wfi" : : : "memory")
 
 #define isb() __asm__ __volatile__ ("isb" : : : "memory")
-#define dsb() __asm__ __volatile__ ("dsb" : : : "memory")
-#define dmb() __asm__ __volatile__ ("dmb" : : : "memory")
+#define dsb() __asm__ __volatile__ ("dsb sy" : : : "memory")
+#define dmb() __asm__ __volatile__ ("dmb sy" : : : "memory")
 
 #define mb()            dsb()
 #define rmb()           dsb()
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 41e9eff..cd38956 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -185,7 +185,7 @@ typedef union {
 /* Standard entry type that we'll use to build Xen's own pagetables.
  * We put the same permissions at every level, because they're ignored
  * by the walker in non-leaf entries. */
-static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
+static inline lpae_t mfn_to_xen_entry(unsigned long mfn, unsigned attr)
 {
     paddr_t pa = ((paddr_t) mfn) << PAGE_SHIFT;
     lpae_t e = (lpae_t) {
@@ -193,10 +193,9 @@ static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
             .xn = 1,              /* No need to execute outside .text */
             .ng = 1,              /* Makes TLB flushes easier */
             .af = 1,              /* No need for access tracking */
-            .sh = LPAE_SH_OUTER,  /* Xen mappings are globally coherent */
             .ns = 1,              /* Hyp mode is in the non-secure world */
             .user = 1,            /* See below */
-            .ai = WRITEALLOC,
+            .ai = attr,
             .table = 0,           /* Set to 1 for links and 4k maps */
             .valid = 1,           /* Mappings are present */
         }};;
@@ -205,6 +204,37 @@ static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
      * pagetables un User mode it's OK.  If this changes, remember
      * to update the hard-coded values in head.S too */
 
+    switch ( attr )
+    {
+    case BUFFERABLE:
+        /*
+         * ARM ARM: Overlaying the shareability attribute (B3-1376 to 1377)
+         *
+         * A memory region with a resultant memory type attribute of Normal,
+         * and a resultant cacheability attribute of Inner Non-cacheable,
+         * Outer Non-cacheable, must have a resultant shareability attribute
+         * of Outer Shareable, otherwise shareability is UNPREDICTABLE.
+         *
+         * On ARMv8 shareability is ignored and explicitly treated as Outer
+         * Shareable for Normal Inner Non-cacheable, Outer Non-cacheable.
+         */
+        e.pt.sh = LPAE_SH_OUTER;
+        break;
+    case UNCACHED:
+    case DEV_SHARED:
+        /* Shareability is ignored for non-Normal memory, Outer is as
+         * good as anything.
+         *
+         * On ARMv8 shareability is ignored and explicitly treated as Outer
+         * Shareable for any device memory type.
+         */
+        e.pt.sh = LPAE_SH_OUTER;
+        break;
+    default:
+        e.pt.sh = LPAE_SH_INNER;  /* Xen mappings are SMP coherent */
+        break;
+    }
+
     ASSERT(!(pa & ~PAGE_MASK));
     ASSERT(!(pa & ~PADDR_MASK));
 
@@ -219,7 +249,7 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
     lpae_t e = (lpae_t) {
         .p2m.xn = 0,
         .p2m.af = 1,
-        .p2m.sh = LPAE_SH_OUTER,
+        .p2m.sh = LPAE_SH_INNER,
         .p2m.write = 1,
         .p2m.read = 1,
         .p2m.mattr = mattr,
-- 
1.7.2.5


* [PATCH 02/10] xen: arm: Only upgrade guest barriers to inner shareable.
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

HCR_EL2.BSU upgrades the shareability of barrier instructions issued by the
guest to at least the configured domain. Now that Xen itself only relies on
inner shareable ordering, upgrading guest barriers to inner shareable is
sufficient; there is no need to force them all the way to outer shareable.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/arch/arm/traps.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 398d209..e40ae2e 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -62,7 +62,7 @@ void __cpuinit init_traps(void)
     WRITE_SYSREG((vaddr_t)hyp_traps_vector, VBAR_EL2);
 
     /* Setup hypervisor traps */
-    WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
+    WRITE_SYSREG(HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
     isb();
 }
 
-- 
1.7.2.5


* [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

Now that Xen maps memory and performs pagetable walks as inner shareable, we
don't need to push updates down any further than the inner shareable domain
when modifying page tables etc.
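
For example, on arm64 the pattern throughout this patch is (sketch;
the 32-bit CP15 ops gain the equivalent IS-suffixed encodings):

    /* before */
    asm volatile("tlbi vmalle1" : : : "memory");

    /* after: the -is qualifier scopes the operation to the inner
     * shareable domain */
    asm volatile("tlbi vmalle1is" : : : "memory");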

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/include/asm-arm/arm32/flushtlb.h |    4 ++--
 xen/include/asm-arm/arm32/page.h     |    8 ++++----
 xen/include/asm-arm/arm64/flushtlb.h |    4 ++--
 xen/include/asm-arm/arm64/page.h     |    6 +++---
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
index a258f58..14e8827 100644
--- a/xen/include/asm-arm/arm32/flushtlb.h
+++ b/xen/include/asm-arm/arm32/flushtlb.h
@@ -6,7 +6,7 @@ static inline void flush_tlb_local(void)
 {
     dsb();
 
-    WRITE_CP32((uint32_t) 0, TLBIALL);
+    WRITE_CP32((uint32_t) 0, TLBIALLIS);
 
     dsb();
     isb();
@@ -17,7 +17,7 @@ static inline void flush_tlb_all_local(void)
 {
     dsb();
 
-    WRITE_CP32((uint32_t) 0, TLBIALLNSNH);
+    WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
 
     dsb();
     isb();
diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
index 38bcffd..e573502 100644
--- a/xen/include/asm-arm/arm32/page.h
+++ b/xen/include/asm-arm/arm32/page.h
@@ -39,8 +39,8 @@ static inline void flush_xen_text_tlb(void)
     asm volatile (
         "isb;"                        /* Ensure synchronization with previous changes to text */
         STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
-        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
-        STORE_CP32(0, BPIALL)         /* Flush branch predictor */
+        STORE_CP32(0, ICIALLUIS)      /* Flush I-cache */
+        STORE_CP32(0, BPIALLIS)       /* Flush branch predictor */
         "dsb;"                        /* Ensure completion of TLB+BP flush */
         "isb;"
         : : "r" (r0) /*dummy*/ : "memory");
@@ -54,7 +54,7 @@ static inline void flush_xen_data_tlb(void)
 {
     register unsigned long r0 asm ("r0");
     asm volatile("dsb;" /* Ensure preceding are visible */
-                 STORE_CP32(0, TLBIALLH)
+                 STORE_CP32(0, TLBIALLHIS)
                  "dsb;" /* Ensure completion of the TLB flush */
                  "isb;"
                  : : "r" (r0) /* dummy */: "memory");
@@ -69,7 +69,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
     unsigned long end = va + size;
     dsb(); /* Ensure preceding are visible */
     while ( va < end ) {
-        asm volatile(STORE_CP32(0, TLBIMVAH)
+        asm volatile(STORE_CP32(0, TLBIMVAHIS)
                      : : "r" (va) : "memory");
         va += PAGE_SIZE;
     }
diff --git a/xen/include/asm-arm/arm64/flushtlb.h b/xen/include/asm-arm/arm64/flushtlb.h
index d0535a0..3a6d2cb 100644
--- a/xen/include/asm-arm/arm64/flushtlb.h
+++ b/xen/include/asm-arm/arm64/flushtlb.h
@@ -6,7 +6,7 @@ static inline void flush_tlb_local(void)
 {
     asm volatile(
         "dsb sy;"
-        "tlbi vmalle1;"
+        "tlbi vmalle1is;"
         "dsb sy;"
         "isb;"
         : : : "memory");
@@ -17,7 +17,7 @@ static inline void flush_tlb_all_local(void)
 {
     asm volatile(
         "dsb sy;"
-        "tlbi alle1;"
+        "tlbi alle1is;"
         "dsb sy;"
         "isb;"
         : : : "memory");
diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
index bd48fe3..28748d3 100644
--- a/xen/include/asm-arm/arm64/page.h
+++ b/xen/include/asm-arm/arm64/page.h
@@ -33,7 +33,7 @@ static inline void flush_xen_text_tlb(void)
     asm volatile (
         "isb;"       /* Ensure synchronization with previous changes to text */
         "tlbi   alle2;"                 /* Flush hypervisor TLB */
-        "ic     iallu;"                 /* Flush I-cache */
+        "ic     ialluis;"               /* Flush I-cache */
         "dsb    sy;"                    /* Ensure completion of TLB flush */
         "isb;"
         : : : "memory");
@@ -47,7 +47,7 @@ static inline void flush_xen_data_tlb(void)
 {
     asm volatile (
         "dsb    sy;"                    /* Ensure visibility of PTE writes */
-        "tlbi   alle2;"                 /* Flush hypervisor TLB */
+        "tlbi   alle2is;"               /* Flush hypervisor TLB */
         "dsb    sy;"                    /* Ensure completion of TLB flush */
         "isb;"
         : : : "memory");
@@ -62,7 +62,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
     unsigned long end = va + size;
     dsb(); /* Ensure preceding are visible */
     while ( va < end ) {
-        asm volatile("tlbi vae2, %0;"
+        asm volatile("tlbi vae2is, %0;"
                      : : "r" (va>>PAGE_SHIFT) : "memory");
         va += PAGE_SIZE;
     }
-- 
1.7.2.5


* [PATCH 04/10] xen: arm: consolidate barrier definitions
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

These are effectively identical on both 32- and 64-bit.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/include/asm-arm/arm32/system.h |   16 ----------------
 xen/include/asm-arm/arm64/system.h |   17 -----------------
 xen/include/asm-arm/system.h       |   16 ++++++++++++++++
 3 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
index b3736f4..9f233fe 100644
--- a/xen/include/asm-arm/arm32/system.h
+++ b/xen/include/asm-arm/arm32/system.h
@@ -2,22 +2,6 @@
 #ifndef __ASM_ARM32_SYSTEM_H
 #define __ASM_ARM32_SYSTEM_H
 
-#define sev() __asm__ __volatile__ ("sev" : : : "memory")
-#define wfe() __asm__ __volatile__ ("wfe" : : : "memory")
-#define wfi() __asm__ __volatile__ ("wfi" : : : "memory")
-
-#define isb() __asm__ __volatile__ ("isb" : : : "memory")
-#define dsb() __asm__ __volatile__ ("dsb sy" : : : "memory")
-#define dmb() __asm__ __volatile__ ("dmb sy" : : : "memory")
-
-#define mb()            dsb()
-#define rmb()           dsb()
-#define wmb()           mb()
-
-#define smp_mb()        mb()
-#define smp_rmb()       rmb()
-#define smp_wmb()       wmb()
-
 extern void __bad_xchg(volatile void *, int);
 
 static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h
index 4e41913..46f901c 100644
--- a/xen/include/asm-arm/arm64/system.h
+++ b/xen/include/asm-arm/arm64/system.h
@@ -2,23 +2,6 @@
 #ifndef __ASM_ARM64_SYSTEM_H
 #define __ASM_ARM64_SYSTEM_H
 
-#define sev()           asm volatile("sev" : : : "memory")
-#define wfe()           asm volatile("wfe" : : : "memory")
-#define wfi()           asm volatile("wfi" : : : "memory")
-
-#define isb()           asm volatile("isb" : : : "memory")
-#define dsb()           asm volatile("dsb sy" : : : "memory")
-#define dmb()           asm volatile("dmb sy" : : : "memory")
-
-#define mb()            dsb()
-#define rmb()           dsb()
-#define wmb()           mb()
-
-#define smp_mb()        mb()
-#define smp_rmb()       rmb()
-#define smp_wmb()       wmb()
-
-
 extern void __bad_xchg(volatile void *, int);
 
 static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 290d38d..e003624 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -8,6 +8,22 @@
 #define nop() \
     asm volatile ( "nop" )
 
+#define sev()           asm volatile("sev" : : : "memory")
+#define wfe()           asm volatile("wfe" : : : "memory")
+#define wfi()           asm volatile("wfi" : : : "memory")
+
+#define isb()           asm volatile("isb" : : : "memory")
+#define dsb()           asm volatile("dsb sy" : : : "memory")
+#define dmb()           asm volatile("dmb sy" : : : "memory")
+
+#define mb()            dsb()
+#define rmb()           dsb()
+#define wmb()           mb()
+
+#define smp_mb()        mb()
+#define smp_rmb()       rmb()
+#define smp_wmb()       wmb()
+
 #define xchg(ptr,x) \
         ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
 
-- 
1.7.2.5


* [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel
  Cc: keir, Ian Campbell, stefano.stabellini, tim, jbeulich, julien.grall

Xen currently makes no strong distinction between the SMP barriers (smp_mb
etc) and the regular barriers (mb etc). In Linux, from which we inherited
these names by importing code that uses them, the SMP barriers are intended
to be sufficient for implementing shared-memory protocols between processors
in an SMP system, while the standard barriers are useful for MMIO etc.

On x86, with its stronger ordering model, there is not much practical
difference here, but ARM has weaker barriers available which are suitable
for use as SMP barriers.

Therefore ensure that common code uses the SMP barriers when that is all which
is required.

On both ARM and x86 both types of barrier are currently identical so there is
no actual change. A future patch will change smp_mb to a weaker barrier on
ARM.
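
The pattern the converted call sites implement is the usual
producer/consumer one, e.g. (sketch, with placeholder names):

    /* Producer: */
    shared->data = value;
    smp_wmb();             /* make the data visible before the flag */
    shared->ready = 1;

    /* Consumer: */
    if ( shared->ready )
    {
        smp_rmb();         /* read the flag before reading the data */
        consume(shared->data);
    }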

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: jbeulich@suse.com
Cc: keir@xen.org
---
I'm not convinced some of these mb() couldn't actually be wmb(), but I didn't
think too hard about this or make any changes along those lines.
---
 xen/common/domain.c        |    2 +-
 xen/common/domctl.c        |    2 +-
 xen/common/grant_table.c   |    4 ++--
 xen/common/page_alloc.c    |    2 +-
 xen/common/smp.c           |    4 ++--
 xen/common/spinlock.c      |    6 +++---
 xen/common/tmem_xen.c      |   10 +++++-----
 xen/common/trace.c         |    8 ++++----
 xen/drivers/char/console.c |    2 +-
 xen/include/xen/event.h    |    4 ++--
 xen/xsm/flask/ss/sidtab.c  |    4 ++--
 11 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index fac3470..1380ea9 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -932,7 +932,7 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
     v->vcpu_info_mfn = page_to_mfn(page);
 
     /* Set new vcpu_info pointer /before/ setting pending flags. */
-    wmb();
+    smp_wmb();
 
     /*
      * Mark everything as being pending just to make sure nothing gets
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 9bd8f80..c653efb 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -533,7 +533,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 
             /* Install vcpu array /then/ update max_vcpus. */
             d->vcpu = vcpus;
-            wmb();
+            smp_wmb();
             d->max_vcpus = max;
         }
 
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 3f97328..eb50288 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -426,7 +426,7 @@ static int _set_status_v2(domid_t  domid,
 
     /* Make sure guest sees status update before checking if flags are
        still valid */
-    mb();
+    smp_mb();
 
     scombo.word = *(u32 *)shah;
     barrier();
@@ -1670,7 +1670,7 @@ gnttab_transfer(
             guest_physmap_add_page(e, sha->full_page.frame, mfn, 0);
             sha->full_page.frame = mfn;
         }
-        wmb();
+        smp_wmb();
         shared_entry_header(e->grant_table, gop.ref)->flags |=
             GTF_transfer_completed;
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 2162ef1..25a7d3d 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1472,7 +1472,7 @@ int assign_pages(
         ASSERT(page_get_owner(&pg[i]) == NULL);
         ASSERT((pg[i].count_info & ~(PGC_allocated | 1)) == 0);
         page_set_owner(&pg[i], d);
-        wmb(); /* Domain pointer must be visible before updating refcnt. */
+        smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
         pg[i].count_info = PGC_allocated | 1;
         page_list_add_tail(&pg[i], &d->page_list);
     }
diff --git a/xen/common/smp.c b/xen/common/smp.c
index b2b056b..482a203 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -89,12 +89,12 @@ void smp_call_function_interrupt(void)
     if ( call_data.wait )
     {
         (*func)(info);
-        mb();
+        smp_mb();
         cpumask_clear_cpu(cpu, &call_data.selected);
     }
     else
     {
-        mb();
+        smp_mb();
         cpumask_clear_cpu(cpu, &call_data.selected);
         (*func)(info);
     }
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index bfb9670..575cc6d 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -218,7 +218,7 @@ void _spin_barrier(spinlock_t *lock)
     u64      loop = 0;
 
     check_barrier(&lock->debug);
-    do { mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
+    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
     if ((loop > 1) && lock->profile)
     {
         lock->profile->time_block += NOW() - block;
@@ -226,9 +226,9 @@ void _spin_barrier(spinlock_t *lock)
     }
 #else
     check_barrier(&lock->debug);
-    do { mb(); } while ( _raw_spin_is_locked(&lock->raw) );
+    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
 #endif
-    mb();
+    smp_mb();
 }
 
 int _spin_trylock_recursive(spinlock_t *lock)
diff --git a/xen/common/tmem_xen.c b/xen/common/tmem_xen.c
index 736a8c3..54ec09f 100644
--- a/xen/common/tmem_xen.c
+++ b/xen/common/tmem_xen.c
@@ -173,7 +173,7 @@ EXPORT int tmh_copy_from_client(pfp_t *pfp,
             return -EFAULT;
         }
     }
-    mb();
+    smp_mb();
     if ( len == PAGE_SIZE && !tmem_offset && !pfn_offset && cli_va )
         tmh_copy_page(tmem_va, cli_va);
     else if ( (tmem_offset+len <= PAGE_SIZE) &&
@@ -216,7 +216,7 @@ EXPORT int tmh_compress_from_client(tmem_cli_mfn_t cmfn,
         return 0;
     else if ( copy_from_guest(scratch, clibuf, PAGE_SIZE) )
         return -EFAULT;
-    mb();
+    smp_mb();
     ret = lzo1x_1_compress(cli_va ?: scratch, PAGE_SIZE, dmem, out_len, wmem);
     ASSERT(ret == LZO_E_OK);
     *out_va = dmem;
@@ -260,7 +260,7 @@ EXPORT int tmh_copy_to_client(tmem_cli_mfn_t cmfn, pfp_t *pfp,
     unmap_domain_page(tmem_va);
     if ( cli_va )
         cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
-    mb();
+    smp_mb();
     return rc;
 }
 
@@ -289,7 +289,7 @@ EXPORT int tmh_decompress_to_client(tmem_cli_mfn_t cmfn, void *tmem_va,
         cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
     else if ( copy_to_guest(clibuf, scratch, PAGE_SIZE) )
         return -EFAULT;
-    mb();
+    smp_mb();
     return 1;
 }
 
@@ -311,7 +311,7 @@ EXPORT int tmh_copy_tze_to_client(tmem_cli_mfn_t cmfn, void *tmem_va,
     if ( len < PAGE_SIZE )
         memset((char *)cli_va+len,0,PAGE_SIZE-len);
     cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
-    mb();
+    smp_mb();
     return 1;
 }
 
diff --git a/xen/common/trace.c b/xen/common/trace.c
index fd4ac48..63ea0b7 100644
--- a/xen/common/trace.c
+++ b/xen/common/trace.c
@@ -255,7 +255,7 @@ static int alloc_trace_bufs(unsigned int pages)
     opt_tbuf_size = pages;
 
     printk("xentrace: initialised\n");
-    wmb(); /* above must be visible before tb_init_done flag set */
+    smp_wmb(); /* above must be visible before tb_init_done flag set */
     tb_init_done = 1;
 
     return 0;
@@ -414,7 +414,7 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc)
         int i;
 
         tb_init_done = 0;
-        wmb();
+        smp_wmb();
         /* Clear any lost-record info so we don't get phantom lost records next time we
          * start tracing.  Grab the lock to make sure we're not racing anyone.  After this
          * hypercall returns, no more records should be placed into the buffers. */
@@ -607,7 +607,7 @@ static inline void __insert_record(struct t_buf *buf,
         memcpy(next_page, (char *)rec + remaining, rec_size - remaining);
     }
 
-    wmb();
+    smp_wmb();
 
     next += rec_size;
     if ( next >= 2*data_size )
@@ -718,7 +718,7 @@ void __trace_var(u32 event, bool_t cycles, unsigned int extra,
         return;
 
     /* Read tb_init_done /before/ t_bufs. */
-    rmb();
+    smp_rmb();
 
     spin_lock_irqsave(&this_cpu(t_lock), flags);
 
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 7cd7bf6..b696b3e 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -648,7 +648,7 @@ void __init console_init_postirq(void)
     for ( i = conringc ; i != conringp; i++ )
         ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)];
     conring = ring;
-    wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
+    smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
     conring_size = opt_conring_size;
     spin_unlock_irq(&console_lock);
 
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 4ac39ad..6f60162 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -85,7 +85,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
         if ( condition )                                                \
             break;                                                      \
         set_bit(_VPF_blocked_in_xen, &current->pause_flags);            \
-        mb(); /* set blocked status /then/ re-evaluate condition */     \
+        smp_mb(); /* set blocked status /then/ re-evaluate condition */ \
         if ( condition )                                                \
         {                                                               \
             clear_bit(_VPF_blocked_in_xen, &current->pause_flags);      \
@@ -99,7 +99,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
     do {                                                                \
         set_bit(_VPF_blocked_in_xen, &current->pause_flags);            \
         raise_softirq(SCHEDULE_SOFTIRQ);                                \
-        mb(); /* set blocked status /then/ caller does his work */      \
+        smp_mb(); /* set blocked status /then/ caller does his work */  \
     } while ( 0 )
 
 #endif /* __XEN_EVENT_H__ */
diff --git a/xen/xsm/flask/ss/sidtab.c b/xen/xsm/flask/ss/sidtab.c
index 586033c..cd1360c 100644
--- a/xen/xsm/flask/ss/sidtab.c
+++ b/xen/xsm/flask/ss/sidtab.c
@@ -79,13 +79,13 @@ int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
     if ( prev )
     {
         newnode->next = prev->next;
-        wmb();
+        smp_wmb();
         prev->next = newnode;
     }
     else
     {
         newnode->next = s->htable[hvalue];
-        wmb();
+        smp_wmb();
         s->htable[hvalue] = newnode;
     }
 
-- 
1.7.2.5


* [PATCH 06/10] xen: arm: Use SMP barriers when that is all which is required.
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

As explained in the previous commit, SMP barriers can be used when all we
care about is synchronising against other processors.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/arch/arm/mm.c      |    2 +-
 xen/arch/arm/smpboot.c |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index c5213f2..3f049cb 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -776,7 +776,7 @@ void share_xen_page_with_guest(struct page_info *page,
     page->u.inuse.type_info |= PGT_validated | 1;
 
     page_set_owner(page, d);
-    wmb(); /* install valid domain ptr before updating refcnt. */
+    smp_wmb(); /* install valid domain ptr before updating refcnt. */
     ASSERT((page->count_info & ~PGC_xen_heap) == 0);
 
     /* Only add to the allocation list if the domain isn't dying. */
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 8011987..727e09f 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -170,11 +170,11 @@ void __cpuinit start_secondary(unsigned long boot_phys_offset,
 
     /* Run local notifiers */
     notify_cpu_starting(cpuid);
-    wmb();
+    smp_wmb();
 
     /* Now report this CPU is up */
     cpumask_set_cpu(cpuid, &cpu_online_map);
-    wmb();
+    smp_wmb();
 
     local_irq_enable();
 
-- 
1.7.2.5


* [PATCH 07/10] xen: arm: Use dmb for smp barriers
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

The full power of dsb is not required in this context: dmb only orders
memory accesses relative to other observers, whereas dsb also waits for
all outstanding accesses and maintenance operations to complete, which
CPU-to-CPU ordering does not need.

Also change wmb() to be dsb() directly instead of indirectly via mb(), for
clarity.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/include/asm-arm/system.h |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index e003624..89c61ef 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -18,11 +18,11 @@
 
 #define mb()            dsb()
 #define rmb()           dsb()
-#define wmb()           mb()
+#define wmb()           dsb()
 
-#define smp_mb()        mb()
-#define smp_rmb()       rmb()
-#define smp_wmb()       wmb()
+#define smp_mb()        dmb()
+#define smp_rmb()       dmb()
+#define smp_wmb()       dmb()
 
 #define xchg(ptr,x) \
         ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
-- 
1.7.2.5


* [PATCH 08/10] xen: arm: add scope to dsb and dmb macros
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

Everywhere currently passes "sy"stem, so no actual change.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/arch/arm/domain.c                |    2 +-
 xen/arch/arm/gic.c                   |    8 ++++----
 xen/arch/arm/mm.c                    |    2 +-
 xen/arch/arm/platforms/vexpress.c    |    6 +++---
 xen/arch/arm/smpboot.c               |    2 +-
 xen/arch/arm/time.c                  |    2 +-
 xen/drivers/video/arm_hdlcd.c        |    2 +-
 xen/include/asm-arm/arm32/flushtlb.h |    8 ++++----
 xen/include/asm-arm/arm32/io.h       |    4 ++--
 xen/include/asm-arm/arm32/page.h     |    4 ++--
 xen/include/asm-arm/arm64/io.h       |    4 ++--
 xen/include/asm-arm/arm64/page.h     |    4 ++--
 xen/include/asm-arm/page.h           |    4 ++--
 xen/include/asm-arm/system.h         |   16 ++++++++--------
 14 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 4c434a1..0bd8f6b 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -44,7 +44,7 @@ void idle_loop(void)
         local_irq_disable();
         if ( cpu_is_haltable(smp_processor_id()) )
         {
-            dsb();
+            dsb("sy");
             wfi();
         }
         local_irq_enable();
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 177560e..42095ee 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -432,7 +432,7 @@ void send_SGI_mask(const cpumask_t *cpumask, enum gic_sgi sgi)
 
     ASSERT(mask < 0x100); /* The target bitmap only supports 8 CPUs */
 
-    dsb();
+    dsb("sy");
 
     GICD[GICD_SGIR] = GICD_SGI_TARGET_LIST
         | (mask<<GICD_SGI_TARGET_SHIFT)
@@ -449,7 +449,7 @@ void send_SGI_self(enum gic_sgi sgi)
 {
     ASSERT(sgi < 16); /* There are only 16 SGIs */
 
-    dsb();
+    dsb("sy");
 
     GICD[GICD_SGIR] = GICD_SGI_TARGET_SELF
         | sgi;
@@ -459,7 +459,7 @@ void send_SGI_allbutself(enum gic_sgi sgi)
 {
    ASSERT(sgi < 16); /* There are only 16 SGIs */
 
-   dsb();
+   dsb("sy");
 
    GICD[GICD_SGIR] = GICD_SGI_TARGET_OTHERS
        | sgi;
@@ -546,7 +546,7 @@ static int __setup_irq(struct irq_desc *desc, unsigned int irq,
 
     desc->action  = new;
     desc->status &= ~IRQ_DISABLED;
-    dsb();
+    dsb("sy");
 
     desc->handler->startup(desc);
 
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 3f049cb..b287a9b 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -324,7 +324,7 @@ void __cpuinit setup_virt_paging(void)
 #define WRITE_TTBR(ttbr)                                                \
     flush_xen_text_tlb();                                               \
     WRITE_SYSREG64(ttbr, TTBR0_EL2);                                    \
-    dsb(); /* ensure memory accesses do not cross over the TTBR0 write */ \
+    dsb("sy"); /* ensure memory accesses do not cross over the TTBR0 write */ \
     /* flush_xen_text_tlb contains an initial isb which ensures the     \
      * write to TTBR0 has completed. */                                 \
     flush_xen_text_tlb()
diff --git a/xen/arch/arm/platforms/vexpress.c b/xen/arch/arm/platforms/vexpress.c
index 8fc30c4..6f6869e 100644
--- a/xen/arch/arm/platforms/vexpress.c
+++ b/xen/arch/arm/platforms/vexpress.c
@@ -46,7 +46,7 @@ static inline int vexpress_ctrl_start(uint32_t *syscfg, int write,
     /* wait for complete flag to be set */
     do {
         stat = syscfg[V2M_SYS_CFGSTAT/4];
-        dsb();
+        dsb("sy");
     } while ( !(stat & V2M_SYS_CFG_COMPLETE) );
 
     /* check error status and return error flag if set */
@@ -111,10 +111,10 @@ static void vexpress_reset(void)
 
     /* switch to slow mode */
     iowritel(sp810, 0x3);
-    dsb(); isb();
+    dsb("sy"); isb();
     /* writing any value to SCSYSSTAT reg will reset the system */
     iowritel(sp810 + 4, 0x1);
-    dsb(); isb();
+    dsb("sy"); isb();
 
     iounmap(sp810);
 }
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 727e09f..b88355f 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -211,7 +211,7 @@ void stop_cpu(void)
     local_irq_disable();
     cpu_is_dead = 1;
     /* Make sure the write happens before we sleep forever */
-    dsb();
+    dsb("sy");
     isb();
     while ( 1 )
         wfi();
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index 4ed7882..2c254f4 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -252,7 +252,7 @@ void udelay(unsigned long usecs)
     s_time_t deadline = get_s_time() + 1000 * (s_time_t) usecs;
     while ( get_s_time() - deadline < 0 )
         ;
-    dsb();
+    dsb("sy");
     isb();
 }
 
diff --git a/xen/drivers/video/arm_hdlcd.c b/xen/drivers/video/arm_hdlcd.c
index 72979ea..5c09b0e 100644
--- a/xen/drivers/video/arm_hdlcd.c
+++ b/xen/drivers/video/arm_hdlcd.c
@@ -77,7 +77,7 @@ void (*video_puts)(const char *) = vga_noop_puts;
 
 static void hdlcd_flush(void)
 {
-    dsb();
+    dsb("sy");
 }
 
 static int __init get_color_masks(const char* bpp, struct color_masks **masks)
diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
index 14e8827..2776375 100644
--- a/xen/include/asm-arm/arm32/flushtlb.h
+++ b/xen/include/asm-arm/arm32/flushtlb.h
@@ -4,22 +4,22 @@
 /* Flush local TLBs, current VMID only */
 static inline void flush_tlb_local(void)
 {
-    dsb();
+    dsb("sy");
 
     WRITE_CP32((uint32_t) 0, TLBIALLIS);
 
-    dsb();
+    dsb("sy");
     isb();
 }
 
 /* Flush local TLBs, all VMIDs, non-hypervisor mode */
 static inline void flush_tlb_all_local(void)
 {
-    dsb();
+    dsb("sy");
 
     WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
 
-    dsb();
+    dsb("sy");
     isb();
 }
 
diff --git a/xen/include/asm-arm/arm32/io.h b/xen/include/asm-arm/arm32/io.h
index ec7e0ff..cb0bc96 100644
--- a/xen/include/asm-arm/arm32/io.h
+++ b/xen/include/asm-arm/arm32/io.h
@@ -30,14 +30,14 @@ static inline uint32_t ioreadl(const volatile void __iomem *addr)
     asm volatile("ldr %1, %0"
                  : "+Qo" (*(volatile uint32_t __force *)addr),
                    "=r" (val));
-    dsb();
+    dsb("sy");
 
     return val;
 }
 
 static inline void iowritel(const volatile void __iomem *addr, uint32_t val)
 {
-    dsb();
+    dsb("sy");
     asm volatile("str %1, %0"
                  : "+Qo" (*(volatile uint32_t __force *)addr)
                  : "r" (val));
diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
index e573502..f8dfbd3 100644
--- a/xen/include/asm-arm/arm32/page.h
+++ b/xen/include/asm-arm/arm32/page.h
@@ -67,13 +67,13 @@ static inline void flush_xen_data_tlb(void)
 static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long size)
 {
     unsigned long end = va + size;
-    dsb(); /* Ensure preceding are visible */
+    dsb("sy"); /* Ensure preceding are visible */
     while ( va < end ) {
         asm volatile(STORE_CP32(0, TLBIMVAHIS)
                      : : "r" (va) : "memory");
         va += PAGE_SIZE;
     }
-    dsb(); /* Ensure completion of the TLB flush */
+    dsb("sy"); /* Ensure completion of the TLB flush */
     isb();
 }
 
diff --git a/xen/include/asm-arm/arm64/io.h b/xen/include/asm-arm/arm64/io.h
index ec041cd..0a100ad 100644
--- a/xen/include/asm-arm/arm64/io.h
+++ b/xen/include/asm-arm/arm64/io.h
@@ -24,14 +24,14 @@ static inline uint32_t ioreadl(const volatile void __iomem *addr)
     uint32_t val;
 
     asm volatile("ldr %w0, [%1]" : "=r" (val) : "r" (addr));
-    dsb();
+    dsb("sy");
 
     return val;
 }
 
 static inline void iowritel(const volatile void __iomem *addr, uint32_t val)
 {
-    dsb();
+    dsb("sy");
     asm volatile("str %w0, [%1]" : : "r" (val), "r" (addr));
 }
 
diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
index 28748d3..aca1590 100644
--- a/xen/include/asm-arm/arm64/page.h
+++ b/xen/include/asm-arm/arm64/page.h
@@ -60,13 +60,13 @@ static inline void flush_xen_data_tlb(void)
 static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long size)
 {
     unsigned long end = va + size;
-    dsb(); /* Ensure preceding are visible */
+    dsb("sy"); /* Ensure preceding are visible */
     while ( va < end ) {
         asm volatile("tlbi vae2is, %0;"
                      : : "r" (va>>PAGE_SHIFT) : "memory");
         va += PAGE_SIZE;
     }
-    dsb(); /* Ensure completion of the TLB flush */
+    dsb("sy"); /* Ensure completion of the TLB flush */
     isb();
 }
 
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index cd38956..eb007ac 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -284,10 +284,10 @@ extern size_t cacheline_bytes;
 static inline void flush_xen_dcache_va_range(void *p, unsigned long size)
 {
     void *end;
-    dsb();           /* So the CPU issues all writes to the range */
+    dsb("sy");           /* So the CPU issues all writes to the range */
     for ( end = p + size; p < end; p += cacheline_bytes )
         asm volatile (__flush_xen_dcache_one(0) : : "r" (p));
-    dsb();           /* So we know the flushes happen before continuing */
+    dsb("sy");           /* So we know the flushes happen before continuing */
 }
 
 /* Macro for flushing a single small item.  The predicate is always
diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 89c61ef..68efba9 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -13,16 +13,16 @@
 #define wfi()           asm volatile("wfi" : : : "memory")
 
 #define isb()           asm volatile("isb" : : : "memory")
-#define dsb()           asm volatile("dsb sy" : : : "memory")
-#define dmb()           asm volatile("dmb sy" : : : "memory")
+#define dsb(scope)      asm volatile("dsb " scope : : : "memory")
+#define dmb(scope)      asm volatile("dmb " scope : : : "memory")
 
-#define mb()            dsb()
-#define rmb()           dsb()
-#define wmb()           dsb()
+#define mb()            dsb("sy")
+#define rmb()           dsb("sy")
+#define wmb()           dsb("sy")
 
-#define smp_mb()        dmb()
-#define smp_rmb()       dmb()
-#define smp_wmb()       dmb()
+#define smp_mb()        dmb("sy")
+#define smp_rmb()       dmb("sy")
+#define smp_wmb()       dmb("sy")
 
 #define xchg(ptr,x) \
         ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
-- 
1.7.2.5


* [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable.
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

Since all processors are in the inner-shareable domain and we map everything
that way, this is sufficient.

The non-SMP barriers remain full system. Although in principle they could
become outer shareable barriers for some hardware, this would require us to
know which class a given device is in. Given the small number of device
drivers in Xen itself it's probably not worth worrying over, although maybe
someone will benchmark at some point.
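
For comparison, the device-facing accessors keep the full-system
scope, e.g. the 32-bit iowritel() as of the earlier "add scope" patch:

    static inline void iowritel(const volatile void __iomem *addr,
                                uint32_t val)
    {
        dsb("sy");   /* order prior accesses against the device write */
        asm volatile("str %1, %0"
                     : "+Qo" (*(volatile uint32_t __force *)addr)
                     : "r" (val));
    }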

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/include/asm-arm/system.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 68efba9..7c3e42d 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -20,9 +20,9 @@
 #define rmb()           dsb("sy")
 #define wmb()           dsb("sy")
 
-#define smp_mb()        dmb("sy")
-#define smp_rmb()       dmb("sy")
-#define smp_wmb()       dmb("sy")
+#define smp_mb()        dmb("ish")
+#define smp_rmb()       dmb("ish")
+#define smp_wmb()       dmb("ish")
 
 #define xchg(ptr,x) \
         ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread
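
The pattern these weaker barriers are expected to order is the usual
cross-CPU publish/consume handshake. A sketch, assuming the smp_* macros
above ('data' and 'flag' are hypothetical shared variables):

    static volatile int data, flag;

    void producer(void)              /* runs on CPU A */
    {
        data = 42;
        smp_wmb();                   /* dmb ish: stores ordered for other CPUs */
        flag = 1;
    }

    int consumer(void)               /* runs on CPU B */
    {
        while ( !flag )
            ;
        smp_rmb();                   /* dmb ish: loads ordered likewise */
        return data;                 /* observes 42 once flag is seen */
    }

Since both observers are CPUs in the inner-shareable domain, "ish" is
enough here; a device polling 'flag' would not be covered, which is why the
non-SMP barriers stay full system.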

* [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers.
  2013-06-28 16:10 [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers Ian Campbell
                   ` (8 preceding siblings ...)
  2013-06-28 16:10 ` [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable Ian Campbell
@ 2013-06-28 16:10 ` Ian Campbell
  2013-07-01 15:22   ` Stefano Stabellini
  2013-07-04 11:42   ` Tim Deegan
  2013-06-28 16:11 ` [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers Ian Campbell
  10 siblings, 2 replies; 44+ messages in thread
From: Ian Campbell @ 2013-06-28 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, tim, Ian Campbell, stefano.stabellini

Note that 32-bit does not provide a load variant of the inner shareable
barrier, so that remains a full any-any barrier.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/include/asm-arm/system.h |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index 7c3e42d..efaf645 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -17,12 +17,17 @@
 #define dmb(scope)      asm volatile("dmb " scope : : : "memory")
 
 #define mb()            dsb("sy")
-#define rmb()           dsb("sy")
-#define wmb()           dsb("sy")
+#define rmb()           dsb("ld")
+#define wmb()           dsb("st")
 
 #define smp_mb()        dmb("ish")
-#define smp_rmb()       dmb("ish")
-#define smp_wmb()       dmb("ish")
+#ifdef CONFIG_ARM_64
+#define smp_rmb()       dmb("ishld")
+#else
+#define smp_rmb()       dmb("ish") /* 32-bit has no ishld variant. */
+#endif
+
+#define smp_wmb()       dmb("ishst")
 
 #define xchg(ptr,x) \
         ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 44+ messages in thread
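
After this patch the write barriers differ in both scope and access type. A
sketch of the intended split, assuming the macros above (the ring/doorbell
device and both functions are hypothetical):

    #include <stdint.h>

    void notify_device(volatile uint32_t *ring_slot, volatile uint32_t *doorbell)
    {
        *ring_slot = 1;   /* descriptor the device will read... */
        wmb();            /* dsb st: stores complete, visible beyond the CPUs */
        *doorbell = 1;    /* ...before the doorbell write */
    }

    void publish_to_cpu(int *shared, volatile int *ready)
    {
        *shared = 1;
        smp_wmb();        /* dmb ishst: store-store, inner shareable only */
        *ready = 1;
    }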

* Re: [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers
  2013-06-28 16:10 [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers Ian Campbell
                   ` (9 preceding siblings ...)
  2013-06-28 16:10 ` [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers Ian Campbell
@ 2013-06-28 16:11 ` Ian Campbell
  10 siblings, 0 replies; 44+ messages in thread
From: Ian Campbell @ 2013-06-28 16:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, Stefano Stabellini, Tim Deegan

I forgot to add an "RFC" to the subject line. I'm sure I don't need to
point out that this isn't for 4.3!

On Fri, 2013-06-28 at 17:10 +0100, Ian Campbell wrote:
> Currently Xen maps all RAM as Outer-Shareable, since that seemed like
> the most conservative option early on when we didn't really know what
> Inner- vs. Outer-Shareable meant. However we have long suspected that
> actually Inner-Shareable would be the correct type to use.
> 
> After reading the docs many times, getting more confused each time, I
> finally got a reasonable explanation from a man (and a dog) down the
> pub: Inner-Shareable == the processors in an SMP system, while
> Outer-Shareable == devices. (NB: Not a random man, he knew what he was
> talking about...). With that in mind switch all of Xen's memory mapping,
> page table walks, TLB flushes and an appropriate subset of the barriers
> to be inner shareable.
> 
> In addition I have switched barriers to use the correct read/write/any
> variants for their types. Note that I have only tackled the generic
> mb/rmb/wmb and smp_* barriers (mainly used by common code) here. There
> are also quite a few open-coded full-system dsb's in the arch code which
> I will look at separately (10 patches is quite enough for now). Since
> those deal with e.g. pagetable updates they will be fun ;-)
> 
> I have tested this series on Cortex A15 and AEMv8 fast models in both
> cases with 2 CPUs and I can start a guest in both cases. I have not
> tested on any real hardware at all.
> 
> These changes should result in a performance improvement, although only
> having models to go on I haven't actually bothered to measure.
> 
> I would appreciate anyone with access to real hardware giving it a go. I
> have pushed the patches to:
> 
>         git://xenbits.xen.org/people/ianc/xen.git inner-shareable-v1
>         
>         
> Ian Campbell (10):
>       xen: arm: map memory as inner shareable.
>       xen: arm: Only upgrade guest barriers to inner shareable.
>       xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
>       xen: arm: consolidate barrier definitions
>       xen: use SMP barrier in common code dealing with shared memory protocols
>       xen: arm: Use SMP barriers when that is all which is required.
>       xen: arm: Use dmb for smp barriers
>       xen: arm: add scope to dsb and dmb macros
>       xen: arm: weaken SMP barriers to inner shareable.
>       xen: arm: use more specific barriers for read and write barriers.
> 
>  xen/arch/arm/arm32/head.S            |    8 +++---
>  xen/arch/arm/arm64/head.S            |    8 +++---
>  xen/arch/arm/domain.c                |    2 +-
>  xen/arch/arm/gic.c                   |    8 +++---
>  xen/arch/arm/mm.c                    |   28 ++++++++++------------
>  xen/arch/arm/platforms/vexpress.c    |    6 ++--
>  xen/arch/arm/smpboot.c               |    6 ++--
>  xen/arch/arm/time.c                  |    2 +-
>  xen/arch/arm/traps.c                 |    2 +-
>  xen/common/domain.c                  |    2 +-
>  xen/common/domctl.c                  |    2 +-
>  xen/common/grant_table.c             |    4 +-
>  xen/common/page_alloc.c              |    2 +-
>  xen/common/smp.c                     |    4 +-
>  xen/common/spinlock.c                |    6 ++--
>  xen/common/tmem_xen.c                |   10 ++++----
>  xen/common/trace.c                   |    8 +++---
>  xen/drivers/char/console.c           |    2 +-
>  xen/drivers/video/arm_hdlcd.c        |    2 +-
>  xen/include/asm-arm/arm32/flushtlb.h |   12 +++++-----
>  xen/include/asm-arm/arm32/io.h       |    4 +-
>  xen/include/asm-arm/arm32/page.h     |   12 +++++-----
>  xen/include/asm-arm/arm32/system.h   |   16 -------------
>  xen/include/asm-arm/arm64/flushtlb.h |    4 +-
>  xen/include/asm-arm/arm64/io.h       |    4 +-
>  xen/include/asm-arm/arm64/page.h     |   10 ++++----
>  xen/include/asm-arm/arm64/system.h   |   17 -------------
>  xen/include/asm-arm/page.h           |   42 +++++++++++++++++++++++++++++-----
>  xen/include/asm-arm/system.h         |   21 +++++++++++++++++
>  xen/include/xen/event.h              |    4 +-
>  xen/xsm/flask/ss/sidtab.c            |    4 +-
>  31 files changed, 139 insertions(+), 123 deletions(-)
> 
> Ian.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols
  2013-06-28 16:10 ` [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols Ian Campbell
@ 2013-06-28 16:15   ` Ian Campbell
  2013-06-28 16:20   ` Keir Fraser
  2013-07-04 11:26   ` Tim Deegan
  2 siblings, 0 replies; 44+ messages in thread
From: Ian Campbell @ 2013-06-28 16:15 UTC (permalink / raw)
  To: xen-devel; +Cc: julien.grall, keir, tim, jbeulich, stefano.stabellini

On Fri, 2013-06-28 at 17:10 +0100, Ian Campbell wrote:
> Xen currently makes no strong distinction between the SMP barriers (smp_mb
> etc) and the regular barriers (mb etc). In Linux, where we inherited these
> names from having imported Linux code which uses them, the SMP barriers are
> intended to be sufficient for implementing shared-memory protocols between
> processors in an SMP system, while the standard barriers are useful for MMIO
> etc.
> 
> On x86, with its stronger ordering model, there is not much practical
> difference here, but ARM has weaker barriers available which are suitable
> for use as SMP barriers.
> 
> Therefore ensure that common code uses the SMP barriers when that is all which
> is required.
> 
> On both ARM and x86 both types of barrier are currently identical so there is
> no actual change. A future patch will change smp_mb to a weaker barrier on
> ARM.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: jbeuich@suse.com

I knew something was up here, and double checked "eu" vs "ue", but
failed to notice the entire missing letter. Sorry Jan. CC line
corrected.

> Cc: keir@xen.org
> ---
> I'm not convinced some of these mb() couldn't actually be wmb(), but I didn't
> think too hard about this or make any changes along those lines.
> ---
>  xen/common/domain.c        |    2 +-
>  xen/common/domctl.c        |    2 +-
>  xen/common/grant_table.c   |    4 ++--
>  xen/common/page_alloc.c    |    2 +-
>  xen/common/smp.c           |    4 ++--
>  xen/common/spinlock.c      |    6 +++---
>  xen/common/tmem_xen.c      |   10 +++++-----
>  xen/common/trace.c         |    8 ++++----
>  xen/drivers/char/console.c |    2 +-
>  xen/include/xen/event.h    |    4 ++--
>  xen/xsm/flask/ss/sidtab.c  |    4 ++--
>  11 files changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index fac3470..1380ea9 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -932,7 +932,7 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset)
>      v->vcpu_info_mfn = page_to_mfn(page);
>  
>      /* Set new vcpu_info pointer /before/ setting pending flags. */
> -    wmb();
> +    smp_wmb();
>  
>      /*
>       * Mark everything as being pending just to make sure nothing gets
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 9bd8f80..c653efb 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -533,7 +533,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>  
>              /* Install vcpu array /then/ update max_vcpus. */
>              d->vcpu = vcpus;
> -            wmb();
> +            smp_wmb();
>              d->max_vcpus = max;
>          }
>  
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 3f97328..eb50288 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -426,7 +426,7 @@ static int _set_status_v2(domid_t  domid,
>  
>      /* Make sure guest sees status update before checking if flags are
>         still valid */
> -    mb();
> +    smp_mb();
>  
>      scombo.word = *(u32 *)shah;
>      barrier();
> @@ -1670,7 +1670,7 @@ gnttab_transfer(
>              guest_physmap_add_page(e, sha->full_page.frame, mfn, 0);
>              sha->full_page.frame = mfn;
>          }
> -        wmb();
> +        smp_wmb();
>          shared_entry_header(e->grant_table, gop.ref)->flags |=
>              GTF_transfer_completed;
>  
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 2162ef1..25a7d3d 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1472,7 +1472,7 @@ int assign_pages(
>          ASSERT(page_get_owner(&pg[i]) == NULL);
>          ASSERT((pg[i].count_info & ~(PGC_allocated | 1)) == 0);
>          page_set_owner(&pg[i], d);
> -        wmb(); /* Domain pointer must be visible before updating refcnt. */
> +        smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
>          pg[i].count_info = PGC_allocated | 1;
>          page_list_add_tail(&pg[i], &d->page_list);
>      }
> diff --git a/xen/common/smp.c b/xen/common/smp.c
> index b2b056b..482a203 100644
> --- a/xen/common/smp.c
> +++ b/xen/common/smp.c
> @@ -89,12 +89,12 @@ void smp_call_function_interrupt(void)
>      if ( call_data.wait )
>      {
>          (*func)(info);
> -        mb();
> +        smp_mb();
>          cpumask_clear_cpu(cpu, &call_data.selected);
>      }
>      else
>      {
> -        mb();
> +        smp_mb();
>          cpumask_clear_cpu(cpu, &call_data.selected);
>          (*func)(info);
>      }
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index bfb9670..575cc6d 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -218,7 +218,7 @@ void _spin_barrier(spinlock_t *lock)
>      u64      loop = 0;
>  
>      check_barrier(&lock->debug);
> -    do { mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
> +    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
>      if ((loop > 1) && lock->profile)
>      {
>          lock->profile->time_block += NOW() - block;
> @@ -226,9 +226,9 @@ void _spin_barrier(spinlock_t *lock)
>      }
>  #else
>      check_barrier(&lock->debug);
> -    do { mb(); } while ( _raw_spin_is_locked(&lock->raw) );
> +    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
>  #endif
> -    mb();
> +    smp_mb();
>  }
>  
>  int _spin_trylock_recursive(spinlock_t *lock)
> diff --git a/xen/common/tmem_xen.c b/xen/common/tmem_xen.c
> index 736a8c3..54ec09f 100644
> --- a/xen/common/tmem_xen.c
> +++ b/xen/common/tmem_xen.c
> @@ -173,7 +173,7 @@ EXPORT int tmh_copy_from_client(pfp_t *pfp,
>              return -EFAULT;
>          }
>      }
> -    mb();
> +    smp_mb();
>      if ( len == PAGE_SIZE && !tmem_offset && !pfn_offset && cli_va )
>          tmh_copy_page(tmem_va, cli_va);
>      else if ( (tmem_offset+len <= PAGE_SIZE) &&
> @@ -216,7 +216,7 @@ EXPORT int tmh_compress_from_client(tmem_cli_mfn_t cmfn,
>          return 0;
>      else if ( copy_from_guest(scratch, clibuf, PAGE_SIZE) )
>          return -EFAULT;
> -    mb();
> +    smp_mb();
>      ret = lzo1x_1_compress(cli_va ?: scratch, PAGE_SIZE, dmem, out_len, wmem);
>      ASSERT(ret == LZO_E_OK);
>      *out_va = dmem;
> @@ -260,7 +260,7 @@ EXPORT int tmh_copy_to_client(tmem_cli_mfn_t cmfn, pfp_t *pfp,
>      unmap_domain_page(tmem_va);
>      if ( cli_va )
>          cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
> -    mb();
> +    smp_mb();
>      return rc;
>  }
>  
> @@ -289,7 +289,7 @@ EXPORT int tmh_decompress_to_client(tmem_cli_mfn_t cmfn, void *tmem_va,
>          cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
>      else if ( copy_to_guest(clibuf, scratch, PAGE_SIZE) )
>          return -EFAULT;
> -    mb();
> +    smp_mb();
>      return 1;
>  }
>  
> @@ -311,7 +311,7 @@ EXPORT int tmh_copy_tze_to_client(tmem_cli_mfn_t cmfn, void *tmem_va,
>      if ( len < PAGE_SIZE )
>          memset((char *)cli_va+len,0,PAGE_SIZE-len);
>      cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
> -    mb();
> +    smp_mb();
>      return 1;
>  }
>  
> diff --git a/xen/common/trace.c b/xen/common/trace.c
> index fd4ac48..63ea0b7 100644
> --- a/xen/common/trace.c
> +++ b/xen/common/trace.c
> @@ -255,7 +255,7 @@ static int alloc_trace_bufs(unsigned int pages)
>      opt_tbuf_size = pages;
>  
>      printk("xentrace: initialised\n");
> -    wmb(); /* above must be visible before tb_init_done flag set */
> +    smp_wmb(); /* above must be visible before tb_init_done flag set */
>      tb_init_done = 1;
>  
>      return 0;
> @@ -414,7 +414,7 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc)
>          int i;
>  
>          tb_init_done = 0;
> -        wmb();
> +        smp_wmb();
>          /* Clear any lost-record info so we don't get phantom lost records next time we
>           * start tracing.  Grab the lock to make sure we're not racing anyone.  After this
>           * hypercall returns, no more records should be placed into the buffers. */
> @@ -607,7 +607,7 @@ static inline void __insert_record(struct t_buf *buf,
>          memcpy(next_page, (char *)rec + remaining, rec_size - remaining);
>      }
>  
> -    wmb();
> +    smp_wmb();
>  
>      next += rec_size;
>      if ( next >= 2*data_size )
> @@ -718,7 +718,7 @@ void __trace_var(u32 event, bool_t cycles, unsigned int extra,
>          return;
>  
>      /* Read tb_init_done /before/ t_bufs. */
> -    rmb();
> +    smp_rmb();
>  
>      spin_lock_irqsave(&this_cpu(t_lock), flags);
>  
> diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
> index 7cd7bf6..b696b3e 100644
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -648,7 +648,7 @@ void __init console_init_postirq(void)
>      for ( i = conringc ; i != conringp; i++ )
>          ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)];
>      conring = ring;
> -    wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
> +    smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
>      conring_size = opt_conring_size;
>      spin_unlock_irq(&console_lock);
>  
> diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
> index 4ac39ad..6f60162 100644
> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -85,7 +85,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
>          if ( condition )                                                \
>              break;                                                      \
>          set_bit(_VPF_blocked_in_xen, &current->pause_flags);            \
> -        mb(); /* set blocked status /then/ re-evaluate condition */     \
> +        smp_mb(); /* set blocked status /then/ re-evaluate condition */ \
>          if ( condition )                                                \
>          {                                                               \
>              clear_bit(_VPF_blocked_in_xen, &current->pause_flags);      \
> @@ -99,7 +99,7 @@ void notify_via_xen_event_channel(struct domain *ld, int lport);
>      do {                                                                \
>          set_bit(_VPF_blocked_in_xen, &current->pause_flags);            \
>          raise_softirq(SCHEDULE_SOFTIRQ);                                \
> -        mb(); /* set blocked status /then/ caller does his work */      \
> +        smp_mb(); /* set blocked status /then/ caller does his work */  \
>      } while ( 0 )
>  
>  #endif /* __XEN_EVENT_H__ */
> diff --git a/xen/xsm/flask/ss/sidtab.c b/xen/xsm/flask/ss/sidtab.c
> index 586033c..cd1360c 100644
> --- a/xen/xsm/flask/ss/sidtab.c
> +++ b/xen/xsm/flask/ss/sidtab.c
> @@ -79,13 +79,13 @@ int sidtab_insert(struct sidtab *s, u32 sid, struct context *context)
>      if ( prev )
>      {
>          newnode->next = prev->next;
> -        wmb();
> +        smp_wmb();
>          prev->next = newnode;
>      }
>      else
>      {
>          newnode->next = s->htable[hvalue];
> -        wmb();
> +        smp_wmb();
>          s->htable[hvalue] = newnode;
>      }
>  

^ permalink raw reply	[flat|nested] 44+ messages in thread
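
The distinction the commit message draws shows up clearly in the domctl.c
hunk: the array must be installed before its size is published, and a
reader needs the mirror-image read barrier. A sketch, assuming the smp_*
macros (the struct and the reader function are hypothetical stand-ins, not
actual Xen code):

    struct dom {
        void **vcpu;
        unsigned int max_vcpus;
    };

    void publish_vcpus(struct dom *d, void **vcpus, unsigned int max)
    {
        d->vcpu = vcpus;         /* install vcpu array /then/ update max_vcpus */
        smp_wmb();
        d->max_vcpus = max;
    }

    void *lookup_vcpu(struct dom *d, unsigned int id)
    {
        if ( id >= d->max_vcpus )
            return NULL;
        smp_rmb();               /* read max_vcpus /before/ the array */
        return d->vcpu[id];
    }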

* Re: [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols
  2013-06-28 16:10 ` [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols Ian Campbell
  2013-06-28 16:15   ` Ian Campbell
@ 2013-06-28 16:20   ` Keir Fraser
  2013-07-04 11:26   ` Tim Deegan
  2 siblings, 0 replies; 44+ messages in thread
From: Keir Fraser @ 2013-06-28 16:20 UTC (permalink / raw)
  To: Ian Campbell, xen-devel; +Cc: julien.grall, jbeuich, tim, stefano.stabellini

On 28/06/2013 17:10, "Ian Campbell" <ian.campbell@citrix.com> wrote:

> Xen currently makes no strong distinction between the SMP barriers (smp_mb
> etc) and the regular barriers (mb etc). In Linux, where we inherited these
> names from having imported Linux code which uses them, the SMP barriers are
> intended to be sufficient for implementing shared-memory protocols between
> processors in an SMP system, while the standard barriers are useful for MMIO
> etc.
> 
> On x86, with its stronger ordering model, there is not much practical
> difference here, but ARM has weaker barriers available which are suitable
> for use as SMP barriers.
> 
> Therefore ensure that common code uses the SMP barriers when that is all which
> is required.
> 
> On both ARM and x86 both types of barrier are currently identical so there is
> no actual change. A future patch will change smp_mb to a weaker barrier on
> ARM.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: jbeuich@suse.com
> Cc: keir@xen.org
> ---

Acked-by: Keir Fraser <keir@xen.org>

> I'm not convinced some of these mb() couldn't actually be wmb(), but I didn't
> think too hard about this or make any changes along those lines.

Would logically be a separate patch anyway.

 -- Keir

> ---
>  xen/common/domain.c        |    2 +-
>  xen/common/domctl.c        |    2 +-
>  xen/common/grant_table.c   |    4 ++--
>  xen/common/page_alloc.c    |    2 +-
>  xen/common/smp.c           |    4 ++--
>  xen/common/spinlock.c      |    6 +++---
>  xen/common/tmem_xen.c      |   10 +++++-----
>  xen/common/trace.c         |    8 ++++----
>  xen/drivers/char/console.c |    2 +-
>  xen/include/xen/event.h    |    4 ++--
>  xen/xsm/flask/ss/sidtab.c  |    4 ++--
>  11 files changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index fac3470..1380ea9 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -932,7 +932,7 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn,
> unsigned offset)
>      v->vcpu_info_mfn = page_to_mfn(page);
>  
>      /* Set new vcpu_info pointer /before/ setting pending flags. */
> -    wmb();
> +    smp_wmb();
>  
>      /*
>       * Mark everything as being pending just to make sure nothing gets
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 9bd8f80..c653efb 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -533,7 +533,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t)
> u_domctl)
>  
>              /* Install vcpu array /then/ update max_vcpus. */
>              d->vcpu = vcpus;
> -            wmb();
> +            smp_wmb();
>              d->max_vcpus = max;
>          }
>  
> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> index 3f97328..eb50288 100644
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -426,7 +426,7 @@ static int _set_status_v2(domid_t  domid,
>  
>      /* Make sure guest sees status update before checking if flags are
>         still valid */
> -    mb();
> +    smp_mb();
>  
>      scombo.word = *(u32 *)shah;
>      barrier();
> @@ -1670,7 +1670,7 @@ gnttab_transfer(
>              guest_physmap_add_page(e, sha->full_page.frame, mfn, 0);
>              sha->full_page.frame = mfn;
>          }
> -        wmb();
> +        smp_wmb();
>          shared_entry_header(e->grant_table, gop.ref)->flags |=
>              GTF_transfer_completed;
>  
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 2162ef1..25a7d3d 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1472,7 +1472,7 @@ int assign_pages(
>          ASSERT(page_get_owner(&pg[i]) == NULL);
>          ASSERT((pg[i].count_info & ~(PGC_allocated | 1)) == 0);
>          page_set_owner(&pg[i], d);
> -        wmb(); /* Domain pointer must be visible before updating refcnt. */
> +        smp_wmb(); /* Domain pointer must be visible before updating refcnt.
> */
>          pg[i].count_info = PGC_allocated | 1;
>          page_list_add_tail(&pg[i], &d->page_list);
>      }
> diff --git a/xen/common/smp.c b/xen/common/smp.c
> index b2b056b..482a203 100644
> --- a/xen/common/smp.c
> +++ b/xen/common/smp.c
> @@ -89,12 +89,12 @@ void smp_call_function_interrupt(void)
>      if ( call_data.wait )
>      {
>          (*func)(info);
> -        mb();
> +        smp_mb();
>          cpumask_clear_cpu(cpu, &call_data.selected);
>      }
>      else
>      {
> -        mb();
> +        smp_mb();
>          cpumask_clear_cpu(cpu, &call_data.selected);
>          (*func)(info);
>      }
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index bfb9670..575cc6d 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -218,7 +218,7 @@ void _spin_barrier(spinlock_t *lock)
>      u64      loop = 0;
>  
>      check_barrier(&lock->debug);
> -    do { mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
> +    do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
>      if ((loop > 1) && lock->profile)
>      {
>          lock->profile->time_block += NOW() - block;
> @@ -226,9 +226,9 @@ void _spin_barrier(spinlock_t *lock)
>      }
>  #else
>      check_barrier(&lock->debug);
> -    do { mb(); } while ( _raw_spin_is_locked(&lock->raw) );
> +    do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
>  #endif
> -    mb();
> +    smp_mb();
>  }
>  
>  int _spin_trylock_recursive(spinlock_t *lock)
> diff --git a/xen/common/tmem_xen.c b/xen/common/tmem_xen.c
> index 736a8c3..54ec09f 100644
> --- a/xen/common/tmem_xen.c
> +++ b/xen/common/tmem_xen.c
> @@ -173,7 +173,7 @@ EXPORT int tmh_copy_from_client(pfp_t *pfp,
>              return -EFAULT;
>          }
>      }
> -    mb();
> +    smp_mb();
>      if ( len == PAGE_SIZE && !tmem_offset && !pfn_offset && cli_va )
>          tmh_copy_page(tmem_va, cli_va);
>      else if ( (tmem_offset+len <= PAGE_SIZE) &&
> @@ -216,7 +216,7 @@ EXPORT int tmh_compress_from_client(tmem_cli_mfn_t cmfn,
>          return 0;
>      else if ( copy_from_guest(scratch, clibuf, PAGE_SIZE) )
>          return -EFAULT;
> -    mb();
> +    smp_mb();
>      ret = lzo1x_1_compress(cli_va ?: scratch, PAGE_SIZE, dmem, out_len,
> wmem);
>      ASSERT(ret == LZO_E_OK);
>      *out_va = dmem;
> @@ -260,7 +260,7 @@ EXPORT int tmh_copy_to_client(tmem_cli_mfn_t cmfn, pfp_t
> *pfp,
>      unmap_domain_page(tmem_va);
>      if ( cli_va )
>          cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
> -    mb();
> +    smp_mb();
>      return rc;
>  }
>  
> @@ -289,7 +289,7 @@ EXPORT int tmh_decompress_to_client(tmem_cli_mfn_t cmfn,
> void *tmem_va,
>          cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
>      else if ( copy_to_guest(clibuf, scratch, PAGE_SIZE) )
>          return -EFAULT;
> -    mb();
> +    smp_mb();
>      return 1;
>  }
>  
> @@ -311,7 +311,7 @@ EXPORT int tmh_copy_tze_to_client(tmem_cli_mfn_t cmfn,
> void *tmem_va,
>      if ( len < PAGE_SIZE )
>          memset((char *)cli_va+len,0,PAGE_SIZE-len);
>      cli_put_page(cli_va, cli_pfp, cli_mfn, 1);
> -    mb();
> +    smp_mb();
>      return 1;
>  }
>  
> diff --git a/xen/common/trace.c b/xen/common/trace.c
> index fd4ac48..63ea0b7 100644
> --- a/xen/common/trace.c
> +++ b/xen/common/trace.c
> @@ -255,7 +255,7 @@ static int alloc_trace_bufs(unsigned int pages)
>      opt_tbuf_size = pages;
>  
>      printk("xentrace: initialised\n");
> -    wmb(); /* above must be visible before tb_init_done flag set */
> +    smp_wmb(); /* above must be visible before tb_init_done flag set */
>      tb_init_done = 1;
>  
>      return 0;
> @@ -414,7 +414,7 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc)
>          int i;
>  
>          tb_init_done = 0;
> -        wmb();
> +        smp_wmb();
>          /* Clear any lost-record info so we don't get phantom lost records
> next time we
>           * start tracing.  Grab the lock to make sure we're not racing
> anyone.  After this
>           * hypercall returns, no more records should be placed into the
> buffers. */
> @@ -607,7 +607,7 @@ static inline void __insert_record(struct t_buf *buf,
>          memcpy(next_page, (char *)rec + remaining, rec_size - remaining);
>      }
>  
> -    wmb();
> +    smp_wmb();
>  
>      next += rec_size;
>      if ( next >= 2*data_size )
> @@ -718,7 +718,7 @@ void __trace_var(u32 event, bool_t cycles, unsigned int
> extra,
>          return;
>  
>      /* Read tb_init_done /before/ t_bufs. */
> -    rmb();
> +    smp_rmb();
>  
>      spin_lock_irqsave(&this_cpu(t_lock), flags);
>  
> diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
> index 7cd7bf6..b696b3e 100644
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -648,7 +648,7 @@ void __init console_init_postirq(void)
>      for ( i = conringc ; i != conringp; i++ )
>          ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)];
>      conring = ring;
> -    wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
> +    smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer.
> */
>      conring_size = opt_conring_size;
>      spin_unlock_irq(&console_lock);
>  
> diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
> index 4ac39ad..6f60162 100644
> --- a/xen/include/xen/event.h
> +++ b/xen/include/xen/event.h
> @@ -85,7 +85,7 @@ void notify_via_xen_event_channel(struct domain *ld, int
> lport);
>          if ( condition )                                                \
>              break;                                                      \
>          set_bit(_VPF_blocked_in_xen, &current->pause_flags);            \
> -        mb(); /* set blocked status /then/ re-evaluate condition */     \
> +        smp_mb(); /* set blocked status /then/ re-evaluate condition */ \
>          if ( condition )                                                \
>          {                                                               \
>              clear_bit(_VPF_blocked_in_xen, &current->pause_flags);      \
> @@ -99,7 +99,7 @@ void notify_via_xen_event_channel(struct domain *ld, int
> lport);
>      do {                                                                \
>          set_bit(_VPF_blocked_in_xen, &current->pause_flags);            \
>          raise_softirq(SCHEDULE_SOFTIRQ);                                \
> -        mb(); /* set blocked status /then/ caller does his work */      \
> +        smp_mb(); /* set blocked status /then/ caller does his work */  \
>      } while ( 0 )
>  
>  #endif /* __XEN_EVENT_H__ */
> diff --git a/xen/xsm/flask/ss/sidtab.c b/xen/xsm/flask/ss/sidtab.c
> index 586033c..cd1360c 100644
> --- a/xen/xsm/flask/ss/sidtab.c
> +++ b/xen/xsm/flask/ss/sidtab.c
> @@ -79,13 +79,13 @@ int sidtab_insert(struct sidtab *s, u32 sid, struct
> context *context)
>      if ( prev )
>      {
>          newnode->next = prev->next;
> -        wmb();
> +        smp_wmb();
>          prev->next = newnode;
>      }
>      else
>      {
>          newnode->next = s->htable[hvalue];
> -        wmb();
> +        smp_wmb();
>          s->htable[hvalue] = newnode;
>      }
>  

^ permalink raw reply	[flat|nested] 44+ messages in thread
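
Whether a given mb() could be relaxed to wmb() depends on the access
pattern at each site, which is why it is not a mechanical change. A sketch
of the two cases (the shared words are hypothetical stand-ins, loosely
modelled on the grant-table hunks above):

    #include <stdint.h>

    static volatile uint32_t status, flags, frame;

    uint32_t store_then_load(void)       /* cf. _set_status_v2() */
    {
        status = 1;
        smp_mb();       /* store->load: a write barrier would NOT order this */
        return flags;
    }

    void store_then_store(uint32_t mfn)  /* cf. gnttab_transfer() */
    {
        frame = mfn;
        smp_wmb();      /* store->store: the weaker barrier is enough */
        flags |= 0x8;
    }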

* Re: [PATCH 06/10] xen: arm: Use SMP barriers when that is all which is required.
  2013-06-28 16:10 ` [PATCH 06/10] xen: arm: Use SMP barriers when that is all which is required Ian Campbell
@ 2013-07-01 15:19   ` Stefano Stabellini
  2013-07-01 15:24     ` Ian Campbell
  0 siblings, 1 reply; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:19 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> As explained in the previous commit, SMP barriers can be used when all we
> care about is synchronising against other processors.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
>  xen/arch/arm/mm.c      |    2 +-
>  xen/arch/arm/smpboot.c |    4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index c5213f2..3f049cb 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -776,7 +776,7 @@ void share_xen_page_with_guest(struct page_info *page,
>      page->u.inuse.type_info |= PGT_validated | 1;
>  
>      page_set_owner(page, d);
> -    wmb(); /* install valid domain ptr before updating refcnt. */
> +    smp_wmb(); /* install valid domain ptr before updating refcnt. */
>      ASSERT((page->count_info & ~PGC_xen_heap) == 0);
>  
>      /* Only add to the allocation list if the domain isn't dying. */
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 8011987..727e09f 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -170,11 +170,11 @@ void __cpuinit start_secondary(unsigned long boot_phys_offset,
>  
>      /* Run local notifiers */
>      notify_cpu_starting(cpuid);
> -    wmb();
> +    smp_wmb();
>  
>      /* Now report this CPU is up */
>      cpumask_set_cpu(cpuid, &cpu_online_map);
> -    wmb();
> +    smp_wmb();
>  
>      local_irq_enable();

Did you miss a few mb() in smpboot.c?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/10] xen: arm: Use dmb for smp barriers
  2013-06-28 16:10 ` [PATCH 07/10] xen: arm: Use dmb for smp barriers Ian Campbell
@ 2013-07-01 15:20   ` Stefano Stabellini
  2013-07-04 11:31   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:20 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> The full power of dsb is not required in this context.
> 
> Also change wmb() to be dsb() directly instead of indirectly via mb(), for
> clarity.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/include/asm-arm/system.h |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index e003624..89c61ef 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -18,11 +18,11 @@
>  
>  #define mb()            dsb()
>  #define rmb()           dsb()
> -#define wmb()           mb()
> +#define wmb()           dsb()
>  
> -#define smp_mb()        mb()
> -#define smp_rmb()       rmb()
> -#define smp_wmb()       wmb()
> +#define smp_mb()        dmb()
> +#define smp_rmb()       dmb()
> +#define smp_wmb()       dmb()
>  
>  #define xchg(ptr,x) \
>          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> -- 
> 1.7.2.5
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread
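
The dmb/dsb split is about ordering versus completion: dmb only orders
memory accesses as seen by other observers, while dsb additionally waits
for the accesses (and things like TLB maintenance) to complete, which is
why sites such as stop_cpu() keep a full dsb. A sketch using the
no-argument macros as they stand at this point in the series (both
functions are hypothetical; 'cpu_is_dead' mirrors smpboot.c):

    void mark_dead_and_halt(volatile int *cpu_is_dead)
    {
        *cpu_is_dead = 1;
        dsb();           /* completion needed before sleeping forever */
        isb();
        for ( ;; )
            wfi();
    }

    void hand_off(int *shared, volatile int *flag)
    {
        *shared = 1;
        dmb();           /* ordering between CPUs is all that's required */
        *flag = 1;
    }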

* Re: [PATCH 08/10] xen: arm: add scope to dsb and dmb macros
  2013-06-28 16:10 ` [PATCH 08/10] xen: arm: add scope to dsb and dmb macros Ian Campbell
@ 2013-07-01 15:21   ` Stefano Stabellini
  2013-07-01 15:22     ` Stefano Stabellini
  2013-07-04 11:44   ` (no subject) Tim Deegan
  1 sibling, 1 reply; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:21 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> Everywhere currently passes "sy"stem, so no actual change.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/arch/arm/domain.c                |    2 +-
>  xen/arch/arm/gic.c                   |    8 ++++----
>  xen/arch/arm/mm.c                    |    2 +-
>  xen/arch/arm/platforms/vexpress.c    |    6 +++---
>  xen/arch/arm/smpboot.c               |    2 +-
>  xen/arch/arm/time.c                  |    2 +-
>  xen/drivers/video/arm_hdlcd.c        |    2 +-
>  xen/include/asm-arm/arm32/flushtlb.h |    8 ++++----
>  xen/include/asm-arm/arm32/io.h       |    4 ++--
>  xen/include/asm-arm/arm32/page.h     |    4 ++--
>  xen/include/asm-arm/arm64/io.h       |    4 ++--
>  xen/include/asm-arm/arm64/page.h     |    4 ++--
>  xen/include/asm-arm/page.h           |    4 ++--
>  xen/include/asm-arm/system.h         |   16 ++++++++--------
>  14 files changed, 34 insertions(+), 34 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 4c434a1..0bd8f6b 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -44,7 +44,7 @@ void idle_loop(void)
>          local_irq_disable();
>          if ( cpu_is_haltable(smp_processor_id()) )
>          {
> -            dsb();
> +            dsb("sy");
>              wfi();
>          }
>          local_irq_enable();
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 177560e..42095ee 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -432,7 +432,7 @@ void send_SGI_mask(const cpumask_t *cpumask, enum gic_sgi sgi)
>  
>      ASSERT(mask < 0x100); /* The target bitmap only supports 8 CPUs */
>  
> -    dsb();
> +    dsb("sy");
>  
>      GICD[GICD_SGIR] = GICD_SGI_TARGET_LIST
>          | (mask<<GICD_SGI_TARGET_SHIFT)
> @@ -449,7 +449,7 @@ void send_SGI_self(enum gic_sgi sgi)
>  {
>      ASSERT(sgi < 16); /* There are only 16 SGIs */
>  
> -    dsb();
> +    dsb("sy");
>  
>      GICD[GICD_SGIR] = GICD_SGI_TARGET_SELF
>          | sgi;
> @@ -459,7 +459,7 @@ void send_SGI_allbutself(enum gic_sgi sgi)
>  {
>     ASSERT(sgi < 16); /* There are only 16 SGIs */
>  
> -   dsb();
> +   dsb("sy");
>  
>     GICD[GICD_SGIR] = GICD_SGI_TARGET_OTHERS
>         | sgi;
> @@ -546,7 +546,7 @@ static int __setup_irq(struct irq_desc *desc, unsigned int irq,
>  
>      desc->action  = new;
>      desc->status &= ~IRQ_DISABLED;
> -    dsb();
> +    dsb("sy");
>  
>      desc->handler->startup(desc);
>  
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index 3f049cb..b287a9b 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -324,7 +324,7 @@ void __cpuinit setup_virt_paging(void)
>  #define WRITE_TTBR(ttbr)                                                \
>      flush_xen_text_tlb();                                               \
>      WRITE_SYSREG64(ttbr, TTBR0_EL2);                                    \
> -    dsb(); /* ensure memory accesses do not cross over the TTBR0 write */ \
> +    dsb("sy"); /* ensure memory accesses do not cross over the TTBR0 write */ \
>      /* flush_xen_text_tlb contains an initial isb which ensures the     \
>       * write to TTBR0 has completed. */                                 \
>      flush_xen_text_tlb()
> diff --git a/xen/arch/arm/platforms/vexpress.c b/xen/arch/arm/platforms/vexpress.c
> index 8fc30c4..6f6869e 100644
> --- a/xen/arch/arm/platforms/vexpress.c
> +++ b/xen/arch/arm/platforms/vexpress.c
> @@ -46,7 +46,7 @@ static inline int vexpress_ctrl_start(uint32_t *syscfg, int write,
>      /* wait for complete flag to be set */
>      do {
>          stat = syscfg[V2M_SYS_CFGSTAT/4];
> -        dsb();
> +        dsb("sy");
>      } while ( !(stat & V2M_SYS_CFG_COMPLETE) );
>  
>      /* check error status and return error flag if set */
> @@ -111,10 +111,10 @@ static void vexpress_reset(void)
>  
>      /* switch to slow mode */
>      iowritel(sp810, 0x3);
> -    dsb(); isb();
> +    dsb("sy"); isb();
>      /* writing any value to SCSYSSTAT reg will reset the system */
>      iowritel(sp810 + 4, 0x1);
> -    dsb(); isb();
> +    dsb("sy"); isb();
>  
>      iounmap(sp810);
>  }
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 727e09f..b88355f 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -211,7 +211,7 @@ void stop_cpu(void)
>      local_irq_disable();
>      cpu_is_dead = 1;
>      /* Make sure the write happens before we sleep forever */
> -    dsb();
> +    dsb("sy");
>      isb();
>      while ( 1 )
>          wfi();
> diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
> index 4ed7882..2c254f4 100644
> --- a/xen/arch/arm/time.c
> +++ b/xen/arch/arm/time.c
> @@ -252,7 +252,7 @@ void udelay(unsigned long usecs)
>      s_time_t deadline = get_s_time() + 1000 * (s_time_t) usecs;
>      while ( get_s_time() - deadline < 0 )
>          ;
> -    dsb();
> +    dsb("sy");
>      isb();
>  }
>  
> diff --git a/xen/drivers/video/arm_hdlcd.c b/xen/drivers/video/arm_hdlcd.c
> index 72979ea..5c09b0e 100644
> --- a/xen/drivers/video/arm_hdlcd.c
> +++ b/xen/drivers/video/arm_hdlcd.c
> @@ -77,7 +77,7 @@ void (*video_puts)(const char *) = vga_noop_puts;
>  
>  static void hdlcd_flush(void)
>  {
> -    dsb();
> +    dsb("sy");
>  }
>  
>  static int __init get_color_masks(const char* bpp, struct color_masks **masks)
> diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
> index 14e8827..2776375 100644
> --- a/xen/include/asm-arm/arm32/flushtlb.h
> +++ b/xen/include/asm-arm/arm32/flushtlb.h
> @@ -4,22 +4,22 @@
>  /* Flush local TLBs, current VMID only */
>  static inline void flush_tlb_local(void)
>  {
> -    dsb();
> +    dsb("sy");
>  
>      WRITE_CP32((uint32_t) 0, TLBIALLIS);
>  
> -    dsb();
> +    dsb("sy");
>      isb();
>  }
>  
>  /* Flush local TLBs, all VMIDs, non-hypervisor mode */
>  static inline void flush_tlb_all_local(void)
>  {
> -    dsb();
> +    dsb("sy");
>  
>      WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
>  
> -    dsb();
> +    dsb("sy");
>      isb();
>  }
>  
> diff --git a/xen/include/asm-arm/arm32/io.h b/xen/include/asm-arm/arm32/io.h
> index ec7e0ff..cb0bc96 100644
> --- a/xen/include/asm-arm/arm32/io.h
> +++ b/xen/include/asm-arm/arm32/io.h
> @@ -30,14 +30,14 @@ static inline uint32_t ioreadl(const volatile void __iomem *addr)
>      asm volatile("ldr %1, %0"
>                   : "+Qo" (*(volatile uint32_t __force *)addr),
>                     "=r" (val));
> -    dsb();
> +    dsb("sy");
>  
>      return val;
>  }
>  
>  static inline void iowritel(const volatile void __iomem *addr, uint32_t val)
>  {
> -    dsb();
> +    dsb("sy");
>      asm volatile("str %1, %0"
>                   : "+Qo" (*(volatile uint32_t __force *)addr)
>                   : "r" (val));
> diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
> index e573502..f8dfbd3 100644
> --- a/xen/include/asm-arm/arm32/page.h
> +++ b/xen/include/asm-arm/arm32/page.h
> @@ -67,13 +67,13 @@ static inline void flush_xen_data_tlb(void)
>  static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long size)
>  {
>      unsigned long end = va + size;
> -    dsb(); /* Ensure preceding are visible */
> +    dsb("sy"); /* Ensure preceding are visible */
>      while ( va < end ) {
>          asm volatile(STORE_CP32(0, TLBIMVAHIS)
>                       : : "r" (va) : "memory");
>          va += PAGE_SIZE;
>      }
> -    dsb(); /* Ensure completion of the TLB flush */
> +    dsb("sy"); /* Ensure completion of the TLB flush */
>      isb();
>  }
>  
> diff --git a/xen/include/asm-arm/arm64/io.h b/xen/include/asm-arm/arm64/io.h
> index ec041cd..0a100ad 100644
> --- a/xen/include/asm-arm/arm64/io.h
> +++ b/xen/include/asm-arm/arm64/io.h
> @@ -24,14 +24,14 @@ static inline uint32_t ioreadl(const volatile void __iomem *addr)
>      uint32_t val;
>  
>      asm volatile("ldr %w0, [%1]" : "=r" (val) : "r" (addr));
> -    dsb();
> +    dsb("sy");
>  
>      return val;
>  }
>  
>  static inline void iowritel(const volatile void __iomem *addr, uint32_t val)
>  {
> -    dsb();
> +    dsb("sy");
>      asm volatile("str %w0, [%1]" : : "r" (val), "r" (addr));
>  }
>  
> diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
> index 28748d3..aca1590 100644
> --- a/xen/include/asm-arm/arm64/page.h
> +++ b/xen/include/asm-arm/arm64/page.h
> @@ -60,13 +60,13 @@ static inline void flush_xen_data_tlb(void)
>  static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long size)
>  {
>      unsigned long end = va + size;
> -    dsb(); /* Ensure preceding are visible */
> +    dsb("sy"); /* Ensure preceding are visible */
>      while ( va < end ) {
>          asm volatile("tlbi vae2is, %0;"
>                       : : "r" (va>>PAGE_SHIFT) : "memory");
>          va += PAGE_SIZE;
>      }
> -    dsb(); /* Ensure completion of the TLB flush */
> +    dsb("sy"); /* Ensure completion of the TLB flush */
>      isb();
>  }
>  
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index cd38956..eb007ac 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -284,10 +284,10 @@ extern size_t cacheline_bytes;
>  static inline void flush_xen_dcache_va_range(void *p, unsigned long size)
>  {
>      void *end;
> -    dsb();           /* So the CPU issues all writes to the range */
> +    dsb("sy");           /* So the CPU issues all writes to the range */
>      for ( end = p + size; p < end; p += cacheline_bytes )
>          asm volatile (__flush_xen_dcache_one(0) : : "r" (p));
> -    dsb();           /* So we know the flushes happen before continuing */
> +    dsb("sy");           /* So we know the flushes happen before continuing */
>  }
>  
>  /* Macro for flushing a single small item.  The predicate is always
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index 89c61ef..68efba9 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -13,16 +13,16 @@
>  #define wfi()           asm volatile("wfi" : : : "memory")
>  
>  #define isb()           asm volatile("isb" : : : "memory")
> -#define dsb()           asm volatile("dsb sy" : : : "memory")
> -#define dmb()           asm volatile("dmb sy" : : : "memory")
> +#define dsb(scope)      asm volatile("dsb " scope : : : "memory")
> +#define dmb(scope)      asm volatile("dmb " scope : : : "memory")
>  
> -#define mb()            dsb()
> -#define rmb()           dsb()
> -#define wmb()           dsb()
> +#define mb()            dsb("sy")
> +#define rmb()           dsb("sy")
> +#define wmb()           dsb("sy")
>  
> -#define smp_mb()        dmb()
> -#define smp_rmb()       dmb()
> -#define smp_wmb()       dmb()
> +#define smp_mb()        dmb("sy")
> +#define smp_rmb()       dmb("sy")
> +#define smp_wmb()       dmb("sy")
>  
>  #define xchg(ptr,x) \
>          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> -- 
> 1.7.2.5
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable.
  2013-06-28 16:10 ` [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable Ian Campbell
@ 2013-07-01 15:21   ` Stefano Stabellini
  2013-07-01 15:22     ` Stefano Stabellini
  2013-07-04 11:35   ` Tim Deegan
  1 sibling, 1 reply; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:21 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> Since all processors are in the inner-shareable domain and we map everything
> that way, this is sufficient.
> 
> The non-SMP barriers remain full system. Although in principle they could
> become outer-shareable barriers on some hardware, this would require us to
> know which shareability class a given device belongs to. Given the small
> number of device drivers in Xen itself it's probably not worth worrying
> over, although maybe someone will benchmark it at some point.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/include/asm-arm/system.h |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index 68efba9..7c3e42d 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -20,9 +20,9 @@
>  #define rmb()           dsb("sy")
>  #define wmb()           dsb("sy")
>  
> -#define smp_mb()        dmb("sy")
> -#define smp_rmb()       dmb("sy")
> -#define smp_wmb()       dmb("sy")
> +#define smp_mb()        dmb("ish")
> +#define smp_rmb()       dmb("ish")
> +#define smp_wmb()       dmb("ish")
>  
>  #define xchg(ptr,x) \
>          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> -- 
> 1.7.2.5
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers.
  2013-06-28 16:10 ` [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers Ian Campbell
@ 2013-07-01 15:22   ` Stefano Stabellini
  2013-07-04 11:42   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:22 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> Note that 32-bit does not provide a load variant of the inner shareable
> barrier, so that remains a full any-any barrier.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>


Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>



>  xen/include/asm-arm/system.h |   13 +++++++++----
>  1 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index 7c3e42d..efaf645 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -17,12 +17,17 @@
>  #define dmb(scope)      asm volatile("dmb " scope : : : "memory")
>  
>  #define mb()            dsb("sy")
> -#define rmb()           dsb("sy")
> -#define wmb()           dsb("sy")
> +#define rmb()           dsb("ld")
> +#define wmb()           dsb("st")
>  
>  #define smp_mb()        dmb("ish")
> -#define smp_rmb()       dmb("ish")
> -#define smp_wmb()       dmb("ish")
> +#ifdef CONFIG_ARM_64
> +#define smp_rmb()       dmb("ishld")
> +#else
> +#define smp_rmb()       dmb("ish") /* 32-bit has no ishld variant. */
> +#endif
> +
> +#define smp_wmb()       dmb("ishst")
>  
>  #define xchg(ptr,x) \
>          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> -- 
> 1.7.2.5
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 08/10] xen: arm: add scope to dsb and dmb macros
  2013-07-01 15:21   ` Stefano Stabellini
@ 2013-07-01 15:22     ` Stefano Stabellini
  0 siblings, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:22 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: julien.grall, tim, Ian Campbell, xen-devel

On Mon, 1 Jul 2013, Stefano Stabellini wrote:
> On Fri, 28 Jun 2013, Ian Campbell wrote:
> > Everywhere currently passes "sy"stem, so no actual change.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

sorry I meant Acked-by

> 
> >  xen/arch/arm/domain.c                |    2 +-
> >  xen/arch/arm/gic.c                   |    8 ++++----
> >  xen/arch/arm/mm.c                    |    2 +-
> >  xen/arch/arm/platforms/vexpress.c    |    6 +++---
> >  xen/arch/arm/smpboot.c               |    2 +-
> >  xen/arch/arm/time.c                  |    2 +-
> >  xen/drivers/video/arm_hdlcd.c        |    2 +-
> >  xen/include/asm-arm/arm32/flushtlb.h |    8 ++++----
> >  xen/include/asm-arm/arm32/io.h       |    4 ++--
> >  xen/include/asm-arm/arm32/page.h     |    4 ++--
> >  xen/include/asm-arm/arm64/io.h       |    4 ++--
> >  xen/include/asm-arm/arm64/page.h     |    4 ++--
> >  xen/include/asm-arm/page.h           |    4 ++--
> >  xen/include/asm-arm/system.h         |   16 ++++++++--------
> >  14 files changed, 34 insertions(+), 34 deletions(-)
> > 
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index 4c434a1..0bd8f6b 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -44,7 +44,7 @@ void idle_loop(void)
> >          local_irq_disable();
> >          if ( cpu_is_haltable(smp_processor_id()) )
> >          {
> > -            dsb();
> > +            dsb("sy");
> >              wfi();
> >          }
> >          local_irq_enable();
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index 177560e..42095ee 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -432,7 +432,7 @@ void send_SGI_mask(const cpumask_t *cpumask, enum gic_sgi sgi)
> >  
> >      ASSERT(mask < 0x100); /* The target bitmap only supports 8 CPUs */
> >  
> > -    dsb();
> > +    dsb("sy");
> >  
> >      GICD[GICD_SGIR] = GICD_SGI_TARGET_LIST
> >          | (mask<<GICD_SGI_TARGET_SHIFT)
> > @@ -449,7 +449,7 @@ void send_SGI_self(enum gic_sgi sgi)
> >  {
> >      ASSERT(sgi < 16); /* There are only 16 SGIs */
> >  
> > -    dsb();
> > +    dsb("sy");
> >  
> >      GICD[GICD_SGIR] = GICD_SGI_TARGET_SELF
> >          | sgi;
> > @@ -459,7 +459,7 @@ void send_SGI_allbutself(enum gic_sgi sgi)
> >  {
> >     ASSERT(sgi < 16); /* There are only 16 SGIs */
> >  
> > -   dsb();
> > +   dsb("sy");
> >  
> >     GICD[GICD_SGIR] = GICD_SGI_TARGET_OTHERS
> >         | sgi;
> > @@ -546,7 +546,7 @@ static int __setup_irq(struct irq_desc *desc, unsigned int irq,
> >  
> >      desc->action  = new;
> >      desc->status &= ~IRQ_DISABLED;
> > -    dsb();
> > +    dsb("sy");
> >  
> >      desc->handler->startup(desc);
> >  
> > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > index 3f049cb..b287a9b 100644
> > --- a/xen/arch/arm/mm.c
> > +++ b/xen/arch/arm/mm.c
> > @@ -324,7 +324,7 @@ void __cpuinit setup_virt_paging(void)
> >  #define WRITE_TTBR(ttbr)                                                \
> >      flush_xen_text_tlb();                                               \
> >      WRITE_SYSREG64(ttbr, TTBR0_EL2);                                    \
> > -    dsb(); /* ensure memory accesses do not cross over the TTBR0 write */ \
> > +    dsb("sy"); /* ensure memory accesses do not cross over the TTBR0 write */ \
> >      /* flush_xen_text_tlb contains an initial isb which ensures the     \
> >       * write to TTBR0 has completed. */                                 \
> >      flush_xen_text_tlb()
> > diff --git a/xen/arch/arm/platforms/vexpress.c b/xen/arch/arm/platforms/vexpress.c
> > index 8fc30c4..6f6869e 100644
> > --- a/xen/arch/arm/platforms/vexpress.c
> > +++ b/xen/arch/arm/platforms/vexpress.c
> > @@ -46,7 +46,7 @@ static inline int vexpress_ctrl_start(uint32_t *syscfg, int write,
> >      /* wait for complete flag to be set */
> >      do {
> >          stat = syscfg[V2M_SYS_CFGSTAT/4];
> > -        dsb();
> > +        dsb("sy");
> >      } while ( !(stat & V2M_SYS_CFG_COMPLETE) );
> >  
> >      /* check error status and return error flag if set */
> > @@ -111,10 +111,10 @@ static void vexpress_reset(void)
> >  
> >      /* switch to slow mode */
> >      iowritel(sp810, 0x3);
> > -    dsb(); isb();
> > +    dsb("sy"); isb();
> >      /* writing any value to SCSYSSTAT reg will reset the system */
> >      iowritel(sp810 + 4, 0x1);
> > -    dsb(); isb();
> > +    dsb("sy"); isb();
> >  
> >      iounmap(sp810);
> >  }
> > diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> > index 727e09f..b88355f 100644
> > --- a/xen/arch/arm/smpboot.c
> > +++ b/xen/arch/arm/smpboot.c
> > @@ -211,7 +211,7 @@ void stop_cpu(void)
> >      local_irq_disable();
> >      cpu_is_dead = 1;
> >      /* Make sure the write happens before we sleep forever */
> > -    dsb();
> > +    dsb("sy");
> >      isb();
> >      while ( 1 )
> >          wfi();
> > diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
> > index 4ed7882..2c254f4 100644
> > --- a/xen/arch/arm/time.c
> > +++ b/xen/arch/arm/time.c
> > @@ -252,7 +252,7 @@ void udelay(unsigned long usecs)
> >      s_time_t deadline = get_s_time() + 1000 * (s_time_t) usecs;
> >      while ( get_s_time() - deadline < 0 )
> >          ;
> > -    dsb();
> > +    dsb("sy");
> >      isb();
> >  }
> >  
> > diff --git a/xen/drivers/video/arm_hdlcd.c b/xen/drivers/video/arm_hdlcd.c
> > index 72979ea..5c09b0e 100644
> > --- a/xen/drivers/video/arm_hdlcd.c
> > +++ b/xen/drivers/video/arm_hdlcd.c
> > @@ -77,7 +77,7 @@ void (*video_puts)(const char *) = vga_noop_puts;
> >  
> >  static void hdlcd_flush(void)
> >  {
> > -    dsb();
> > +    dsb("sy");
> >  }
> >  
> >  static int __init get_color_masks(const char* bpp, struct color_masks **masks)
> > diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
> > index 14e8827..2776375 100644
> > --- a/xen/include/asm-arm/arm32/flushtlb.h
> > +++ b/xen/include/asm-arm/arm32/flushtlb.h
> > @@ -4,22 +4,22 @@
> >  /* Flush local TLBs, current VMID only */
> >  static inline void flush_tlb_local(void)
> >  {
> > -    dsb();
> > +    dsb("sy");
> >  
> >      WRITE_CP32((uint32_t) 0, TLBIALLIS);
> >  
> > -    dsb();
> > +    dsb("sy");
> >      isb();
> >  }
> >  
> >  /* Flush local TLBs, all VMIDs, non-hypervisor mode */
> >  static inline void flush_tlb_all_local(void)
> >  {
> > -    dsb();
> > +    dsb("sy");
> >  
> >      WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
> >  
> > -    dsb();
> > +    dsb("sy");
> >      isb();
> >  }
> >  
> > diff --git a/xen/include/asm-arm/arm32/io.h b/xen/include/asm-arm/arm32/io.h
> > index ec7e0ff..cb0bc96 100644
> > --- a/xen/include/asm-arm/arm32/io.h
> > +++ b/xen/include/asm-arm/arm32/io.h
> > @@ -30,14 +30,14 @@ static inline uint32_t ioreadl(const volatile void __iomem *addr)
> >      asm volatile("ldr %1, %0"
> >                   : "+Qo" (*(volatile uint32_t __force *)addr),
> >                     "=r" (val));
> > -    dsb();
> > +    dsb("sy");
> >  
> >      return val;
> >  }
> >  
> >  static inline void iowritel(const volatile void __iomem *addr, uint32_t val)
> >  {
> > -    dsb();
> > +    dsb("sy");
> >      asm volatile("str %1, %0"
> >                   : "+Qo" (*(volatile uint32_t __force *)addr)
> >                   : "r" (val));
> > diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
> > index e573502..f8dfbd3 100644
> > --- a/xen/include/asm-arm/arm32/page.h
> > +++ b/xen/include/asm-arm/arm32/page.h
> > @@ -67,13 +67,13 @@ static inline void flush_xen_data_tlb(void)
> >  static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long size)
> >  {
> >      unsigned long end = va + size;
> > -    dsb(); /* Ensure preceding are visible */
> > +    dsb("sy"); /* Ensure preceding are visible */
> >      while ( va < end ) {
> >          asm volatile(STORE_CP32(0, TLBIMVAHIS)
> >                       : : "r" (va) : "memory");
> >          va += PAGE_SIZE;
> >      }
> > -    dsb(); /* Ensure completion of the TLB flush */
> > +    dsb("sy"); /* Ensure completion of the TLB flush */
> >      isb();
> >  }
> >  
> > diff --git a/xen/include/asm-arm/arm64/io.h b/xen/include/asm-arm/arm64/io.h
> > index ec041cd..0a100ad 100644
> > --- a/xen/include/asm-arm/arm64/io.h
> > +++ b/xen/include/asm-arm/arm64/io.h
> > @@ -24,14 +24,14 @@ static inline uint32_t ioreadl(const volatile void __iomem *addr)
> >      uint32_t val;
> >  
> >      asm volatile("ldr %w0, [%1]" : "=r" (val) : "r" (addr));
> > -    dsb();
> > +    dsb("sy");
> >  
> >      return val;
> >  }
> >  
> >  static inline void iowritel(const volatile void __iomem *addr, uint32_t val)
> >  {
> > -    dsb();
> > +    dsb("sy");
> >      asm volatile("str %w0, [%1]" : : "r" (val), "r" (addr));
> >  }
> >  
> > diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
> > index 28748d3..aca1590 100644
> > --- a/xen/include/asm-arm/arm64/page.h
> > +++ b/xen/include/asm-arm/arm64/page.h
> > @@ -60,13 +60,13 @@ static inline void flush_xen_data_tlb(void)
> >  static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long size)
> >  {
> >      unsigned long end = va + size;
> > -    dsb(); /* Ensure preceding are visible */
> > +    dsb("sy"); /* Ensure preceding are visible */
> >      while ( va < end ) {
> >          asm volatile("tlbi vae2is, %0;"
> >                       : : "r" (va>>PAGE_SHIFT) : "memory");
> >          va += PAGE_SIZE;
> >      }
> > -    dsb(); /* Ensure completion of the TLB flush */
> > +    dsb("sy"); /* Ensure completion of the TLB flush */
> >      isb();
> >  }
> >  
> > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> > index cd38956..eb007ac 100644
> > --- a/xen/include/asm-arm/page.h
> > +++ b/xen/include/asm-arm/page.h
> > @@ -284,10 +284,10 @@ extern size_t cacheline_bytes;
> >  static inline void flush_xen_dcache_va_range(void *p, unsigned long size)
> >  {
> >      void *end;
> > -    dsb();           /* So the CPU issues all writes to the range */
> > +    dsb("sy");           /* So the CPU issues all writes to the range */
> >      for ( end = p + size; p < end; p += cacheline_bytes )
> >          asm volatile (__flush_xen_dcache_one(0) : : "r" (p));
> > -    dsb();           /* So we know the flushes happen before continuing */
> > +    dsb("sy");           /* So we know the flushes happen before continuing */
> >  }
> >  
> >  /* Macro for flushing a single small item.  The predicate is always
> > diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> > index 89c61ef..68efba9 100644
> > --- a/xen/include/asm-arm/system.h
> > +++ b/xen/include/asm-arm/system.h
> > @@ -13,16 +13,16 @@
> >  #define wfi()           asm volatile("wfi" : : : "memory")
> >  
> >  #define isb()           asm volatile("isb" : : : "memory")
> > -#define dsb()           asm volatile("dsb sy" : : : "memory")
> > -#define dmb()           asm volatile("dmb sy" : : : "memory")
> > +#define dsb(scope)      asm volatile("dsb " scope : : : "memory")
> > +#define dmb(scope)      asm volatile("dmb " scope : : : "memory")
> >  
> > -#define mb()            dsb()
> > -#define rmb()           dsb()
> > -#define wmb()           dsb()
> > +#define mb()            dsb("sy")
> > +#define rmb()           dsb("sy")
> > +#define wmb()           dsb("sy")
> >  
> > -#define smp_mb()        dmb()
> > -#define smp_rmb()       dmb()
> > -#define smp_wmb()       dmb()
> > +#define smp_mb()        dmb("sy")
> > +#define smp_rmb()       dmb("sy")
> > +#define smp_wmb()       dmb("sy")
> >  
> >  #define xchg(ptr,x) \
> >          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> > -- 
> > 1.7.2.5
> > 
> 
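For readers following the series: the hunk above turns dsb() and dmb() into macros taking the barrier's shareability scope as a string literal, which the compiler pastes straight into the instruction. A minimal standalone sketch of how the new macros expand (the example function is invented for illustration, not code from the series):

#define dsb(scope)      asm volatile("dsb " scope : : : "memory")
#define dmb(scope)      asm volatile("dmb " scope : : : "memory")

static inline void barrier_examples(void)
{
    dsb("sy");   /* emits "dsb sy": full-system data synchronization barrier */
    dmb("ish");  /* emits "dmb ish": orders accesses within the inner-shareable
                  * domain only; later patches use this for the smp_* barriers */
}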


* Re: [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable.
  2013-07-01 15:21   ` Stefano Stabellini
@ 2013-07-01 15:22     ` Stefano Stabellini
  0 siblings, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:22 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: julien.grall, tim, Ian Campbell, xen-devel

On Mon, 1 Jul 2013, Stefano Stabellini wrote:
> On Fri, 28 Jun 2013, Ian Campbell wrote:
> > Since all processors are in the inner-shareable domain and we map everything
> > that way, this is sufficient.
> > 
> > The non-SMP barriers remain full system. Although in principle they could
> > become outer shareable barriers for some hardware, this would require us to
> > know which class a given device is. Given the small number of device drivers
> > in Xen itself it's probably not worth worrying over, although maybe someone
> > will benchmark at some point.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 

same here

> >  xen/include/asm-arm/system.h |    6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> > index 68efba9..7c3e42d 100644
> > --- a/xen/include/asm-arm/system.h
> > +++ b/xen/include/asm-arm/system.h
> > @@ -20,9 +20,9 @@
> >  #define rmb()           dsb("sy")
> >  #define wmb()           dsb("sy")
> >  
> > -#define smp_mb()        dmb("sy")
> > -#define smp_rmb()       dmb("sy")
> > -#define smp_wmb()       dmb("sy")
> > +#define smp_mb()        dmb("ish")
> > +#define smp_rmb()       dmb("ish")
> > +#define smp_wmb()       dmb("ish")
> >  
> >  #define xchg(ptr,x) \
> >          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> > -- 
> > 1.7.2.5
> > 
> 
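To illustrate what these weakened barriers protect, here is the classic message-passing pattern they order (a hedged sketch; the variables and functions are invented, not taken from Xen). With all RAM now mapped inner shareable, a dmb ish orders these accesses for every processor that can race on the data:

static int data;
static volatile int flag;

void producer(void)        /* runs on one CPU */
{
    data = 42;
    smp_wmb();             /* now dmb ish: data is ordered before flag */
    flag = 1;
}

int consumer(void)         /* runs on another CPU */
{
    while ( !flag )
        ;                  /* spin until the producer signals */
    smp_rmb();             /* now dmb ish: flag is read before data */
    return data;
}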


* Re: [PATCH 02/10] xen: arm: Only upgrade guest barriers to inner shareable.
  2013-06-28 16:10 ` [PATCH 02/10] xen: arm: Only upgrade guest barriers to " Ian Campbell
@ 2013-07-01 15:24   ` Stefano Stabellini
  2013-07-04 10:58   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/arch/arm/traps.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 398d209..e40ae2e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -62,7 +62,7 @@ void __cpuinit init_traps(void)
>      WRITE_SYSREG((vaddr_t)hyp_traps_vector, VBAR_EL2);
>  
>      /* Setup hypervisor traps */
> -    WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
> +    WRITE_SYSREG(HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
>      isb();
>  }
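The field being flipped here is HCR.BSU ("barrier shareability upgrade"), bits [11:10] of the Hyp Configuration Register: guest dmb/dsb instructions are upgraded to at least the named shareability domain. A hedged sketch of the encodings per the ARM ARM (the macro names follow the patch, but Xen's actual definitions are not shown in this hunk):

/* HCR.BSU, bits [11:10]: minimum shareability applied to guest barriers. */
#define HCR_BSU_NONE    (0UL << 10)   /* no upgrade */
#define HCR_BSU_INNER   (1UL << 10)   /* upgrade to inner shareable */
#define HCR_BSU_OUTER   (2UL << 10)   /* upgrade to outer shareable */
#define HCR_BSU_FULL    (3UL << 10)   /* upgrade to full system */

With memory now mapped inner shareable, upgrading guest barriers beyond the inner-shareable domain buys nothing, hence the change.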


* Re: [PATCH 06/10] xen: arm: Use SMP barriers when that is all which is required.
  2013-07-01 15:19   ` Stefano Stabellini
@ 2013-07-01 15:24     ` Ian Campbell
  2013-07-04 11:30       ` Tim Deegan
  0 siblings, 1 reply; 44+ messages in thread
From: Ian Campbell @ 2013-07-01 15:24 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: julien.grall, tim, xen-devel

On Mon, 2013-07-01 at 16:19 +0100, Stefano Stabellini wrote:
> On Fri, 28 Jun 2013, Ian Campbell wrote:
> > As explained in the previous commit SMP barriers can be used when all we care
> > about is synchronising against other processors.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > ---
> >  xen/arch/arm/mm.c      |    2 +-
> >  xen/arch/arm/smpboot.c |    4 ++--
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > index c5213f2..3f049cb 100644
> > --- a/xen/arch/arm/mm.c
> > +++ b/xen/arch/arm/mm.c
> > @@ -776,7 +776,7 @@ void share_xen_page_with_guest(struct page_info *page,
> >      page->u.inuse.type_info |= PGT_validated | 1;
> >  
> >      page_set_owner(page, d);
> > -    wmb(); /* install valid domain ptr before updating refcnt. */
> > +    smp_wmb(); /* install valid domain ptr before updating refcnt. */
> >      ASSERT((page->count_info & ~PGC_xen_heap) == 0);
> >  
> >      /* Only add to the allocation list if the domain isn't dying. */
> > diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> > index 8011987..727e09f 100644
> > --- a/xen/arch/arm/smpboot.c
> > +++ b/xen/arch/arm/smpboot.c
> > @@ -170,11 +170,11 @@ void __cpuinit start_secondary(unsigned long boot_phys_offset,
> >  
> >      /* Run local notifiers */
> >      notify_cpu_starting(cpuid);
> > -    wmb();
> > +    smp_wmb();
> >  
> >      /* Now report this CPU is up */
> >      cpumask_set_cpu(cpuid, &cpu_online_map);
> > -    wmb();
> > +    smp_wmb();
> >  
> >      local_irq_enable();
> 
> Did you miss a few mb() in smpboot.c?

The ones in __cpu_disable and __cpu_die?

I think I just wasn't 100% sure they weren't touching hardware
(i.e. some platform register to shut down a CPU), and since they aren't
performance critical I punted on them.

Looking at it again, the first half of that logic seems bogus (that
code goes nowhere near any peripheral).

Ian.
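The distinction under discussion here — does a barrier order against a device, or only against other processors? — is easiest to see side by side. A hedged sketch (the buffer, flag and doorbell register are invented for illustration):

#include <stdint.h>

static uint32_t shared_buf[16];
static volatile int shared_flag;

void signal_device(volatile uint32_t *mmio_doorbell)
{
    shared_buf[0] = 1;
    wmb();               /* dsb-based: the write must reach the device's domain */
    *mmio_doorbell = 1;  /* kick the device */
}

void signal_other_cpu(void)
{
    shared_buf[0] = 1;
    smp_wmb();           /* only other CPUs observe this: a dmb suffices */
    shared_flag = 1;
}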


* Re: [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
  2013-06-28 16:10 ` [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable Ian Campbell
@ 2013-07-01 15:25   ` Stefano Stabellini
  2013-07-04 11:07   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:25 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> Now that Xen maps memory and performs pagetable walks as inner shareable we
> don't need to push updates down so far when modifying page tables etc.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/include/asm-arm/arm32/flushtlb.h |    4 ++--
>  xen/include/asm-arm/arm32/page.h     |    8 ++++----
>  xen/include/asm-arm/arm64/flushtlb.h |    4 ++--
>  xen/include/asm-arm/arm64/page.h     |    6 +++---
>  4 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
> index a258f58..14e8827 100644
> --- a/xen/include/asm-arm/arm32/flushtlb.h
> +++ b/xen/include/asm-arm/arm32/flushtlb.h
> @@ -6,7 +6,7 @@ static inline void flush_tlb_local(void)
>  {
>      dsb();
>  
> -    WRITE_CP32((uint32_t) 0, TLBIALL);
> +    WRITE_CP32((uint32_t) 0, TLBIALLIS);
>  
>      dsb();
>      isb();
> @@ -17,7 +17,7 @@ static inline void flush_tlb_all_local(void)
>  {
>      dsb();
>  
> -    WRITE_CP32((uint32_t) 0, TLBIALLNSNH);
> +    WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
>  
>      dsb();
>      isb();
> diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
> index 38bcffd..e573502 100644
> --- a/xen/include/asm-arm/arm32/page.h
> +++ b/xen/include/asm-arm/arm32/page.h
> @@ -39,8 +39,8 @@ static inline void flush_xen_text_tlb(void)
>      asm volatile (
>          "isb;"                        /* Ensure synchronization with previous changes to text */
>          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> -        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
> -        STORE_CP32(0, BPIALL)         /* Flush branch predictor */
> +        STORE_CP32(0, ICIALLUIS)      /* Flush I-cache */
> +        STORE_CP32(0, BPIALLIS)       /* Flush branch predictor */
>          "dsb;"                        /* Ensure completion of TLB+BP flush */
>          "isb;"
>          : : "r" (r0) /*dummy*/ : "memory");
> @@ -54,7 +54,7 @@ static inline void flush_xen_data_tlb(void)
>  {
>      register unsigned long r0 asm ("r0");
>      asm volatile("dsb;" /* Ensure preceding are visible */
> -                 STORE_CP32(0, TLBIALLH)
> +                 STORE_CP32(0, TLBIALLHIS)
>                   "dsb;" /* Ensure completion of the TLB flush */
>                   "isb;"
>                   : : "r" (r0) /* dummy */: "memory");
> @@ -69,7 +69,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
>      unsigned long end = va + size;
>      dsb(); /* Ensure preceding are visible */
>      while ( va < end ) {
> -        asm volatile(STORE_CP32(0, TLBIMVAH)
> +        asm volatile(STORE_CP32(0, TLBIMVAHIS)
>                       : : "r" (va) : "memory");
>          va += PAGE_SIZE;
>      }
> diff --git a/xen/include/asm-arm/arm64/flushtlb.h b/xen/include/asm-arm/arm64/flushtlb.h
> index d0535a0..3a6d2cb 100644
> --- a/xen/include/asm-arm/arm64/flushtlb.h
> +++ b/xen/include/asm-arm/arm64/flushtlb.h
> @@ -6,7 +6,7 @@ static inline void flush_tlb_local(void)
>  {
>      asm volatile(
>          "dsb sy;"
> -        "tlbi vmalle1;"
> +        "tlbi vmalle1is;"
>          "dsb sy;"
>          "isb;"
>          : : : "memory");
> @@ -17,7 +17,7 @@ static inline void flush_tlb_all_local(void)
>  {
>      asm volatile(
>          "dsb sy;"
> -        "tlbi alle1;"
> +        "tlbi alle1is;"
>          "dsb sy;"
>          "isb;"
>          : : : "memory");
> diff --git a/xen/include/asm-arm/arm64/page.h b/xen/include/asm-arm/arm64/page.h
> index bd48fe3..28748d3 100644
> --- a/xen/include/asm-arm/arm64/page.h
> +++ b/xen/include/asm-arm/arm64/page.h
> @@ -33,7 +33,7 @@ static inline void flush_xen_text_tlb(void)
>      asm volatile (
>          "isb;"       /* Ensure synchronization with previous changes to text */
>          "tlbi   alle2;"                 /* Flush hypervisor TLB */
> -        "ic     iallu;"                 /* Flush I-cache */
> +        "ic     ialluis;"               /* Flush I-cache */
>          "dsb    sy;"                    /* Ensure completion of TLB flush */
>          "isb;"
>          : : : "memory");
> @@ -47,7 +47,7 @@ static inline void flush_xen_data_tlb(void)
>  {
>      asm volatile (
>          "dsb    sy;"                    /* Ensure visibility of PTE writes */
> -        "tlbi   alle2;"                 /* Flush hypervisor TLB */
> +        "tlbi   alle2is;"               /* Flush hypervisor TLB */
>          "dsb    sy;"                    /* Ensure completion of TLB flush */
>          "isb;"
>          : : : "memory");
> @@ -62,7 +62,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
>      unsigned long end = va + size;
>      dsb(); /* Ensure preceding are visible */
>      while ( va < end ) {
> -        asm volatile("tlbi vae2, %0;"
> +        asm volatile("tlbi vae2is, %0;"
>                       : : "r" (va>>PAGE_SHIFT) : "memory");
>          va += PAGE_SIZE;
>      }
> -- 
> 1.7.2.5
> 
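To see why the IS variants are now enough, consider the typical call pattern (illustrative only, using helpers from the hunks above rather than a real call site): update a hypervisor mapping, then invalidate the stale translation. Because the invalidate carries the IS suffix, the hardware broadcasts it to every CPU in the inner-shareable domain, so no IPI-based shootdown is needed:

/* Sketch: the symbols come from the patches above; the function is invented. */
static void update_mapping(lpae_t *entry, lpae_t pte, unsigned long va)
{
    write_pte(entry, pte);                       /* PT walks are inner shareable */
    flush_xen_data_tlb_range_va(va, PAGE_SIZE);  /* TLBIMVAHIS: broadcast flush */
}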


* Re: [PATCH 04/10] xen: arm: consolidate barrier definitions
  2013-06-28 16:10 ` [PATCH 04/10] xen: arm: consolidate barrier definitions Ian Campbell
@ 2013-07-01 15:25   ` Stefano Stabellini
  2013-07-04 11:07   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:25 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> These are effectively identical on both 32- and 64-bit.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/include/asm-arm/arm32/system.h |   16 ----------------
>  xen/include/asm-arm/arm64/system.h |   17 -----------------
>  xen/include/asm-arm/system.h       |   16 ++++++++++++++++
>  3 files changed, 16 insertions(+), 33 deletions(-)
> 
> diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
> index b3736f4..9f233fe 100644
> --- a/xen/include/asm-arm/arm32/system.h
> +++ b/xen/include/asm-arm/arm32/system.h
> @@ -2,22 +2,6 @@
>  #ifndef __ASM_ARM32_SYSTEM_H
>  #define __ASM_ARM32_SYSTEM_H
>  
> -#define sev() __asm__ __volatile__ ("sev" : : : "memory")
> -#define wfe() __asm__ __volatile__ ("wfe" : : : "memory")
> -#define wfi() __asm__ __volatile__ ("wfi" : : : "memory")
> -
> -#define isb() __asm__ __volatile__ ("isb" : : : "memory")
> -#define dsb() __asm__ __volatile__ ("dsb sy" : : : "memory")
> -#define dmb() __asm__ __volatile__ ("dmb sy" : : : "memory")
> -
> -#define mb()            dsb()
> -#define rmb()           dsb()
> -#define wmb()           mb()
> -
> -#define smp_mb()        mb()
> -#define smp_rmb()       rmb()
> -#define smp_wmb()       wmb()
> -
>  extern void __bad_xchg(volatile void *, int);
>  
>  static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
> diff --git a/xen/include/asm-arm/arm64/system.h b/xen/include/asm-arm/arm64/system.h
> index 4e41913..46f901c 100644
> --- a/xen/include/asm-arm/arm64/system.h
> +++ b/xen/include/asm-arm/arm64/system.h
> @@ -2,23 +2,6 @@
>  #ifndef __ASM_ARM64_SYSTEM_H
>  #define __ASM_ARM64_SYSTEM_H
>  
> -#define sev()           asm volatile("sev" : : : "memory")
> -#define wfe()           asm volatile("wfe" : : : "memory")
> -#define wfi()           asm volatile("wfi" : : : "memory")
> -
> -#define isb()           asm volatile("isb" : : : "memory")
> -#define dsb()           asm volatile("dsb sy" : : : "memory")
> -#define dmb()           asm volatile("dmb sy" : : : "memory")
> -
> -#define mb()            dsb()
> -#define rmb()           dsb()
> -#define wmb()           mb()
> -
> -#define smp_mb()        mb()
> -#define smp_rmb()       rmb()
> -#define smp_wmb()       wmb()
> -
> -
>  extern void __bad_xchg(volatile void *, int);
>  
>  static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index 290d38d..e003624 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -8,6 +8,22 @@
>  #define nop() \
>      asm volatile ( "nop" )
>  
> +#define sev()           asm volatile("sev" : : : "memory")
> +#define wfe()           asm volatile("wfe" : : : "memory")
> +#define wfi()           asm volatile("wfi" : : : "memory")
> +
> +#define isb()           asm volatile("isb" : : : "memory")
> +#define dsb()           asm volatile("dsb sy" : : : "memory")
> +#define dmb()           asm volatile("dmb sy" : : : "memory")
> +
> +#define mb()            dsb()
> +#define rmb()           dsb()
> +#define wmb()           mb()
> +
> +#define smp_mb()        mb()
> +#define smp_rmb()       rmb()
> +#define smp_wmb()       wmb()
> +
>  #define xchg(ptr,x) \
>          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
>  
> -- 
> 1.7.2.5
> 


* Re: [PATCH 01/10] xen: arm: map memory as inner shareable.
  2013-06-28 16:10 ` [PATCH 01/10] xen: arm: map memory as inner shareable Ian Campbell
@ 2013-07-01 15:39   ` Stefano Stabellini
  2013-07-01 15:42     ` Ian Campbell
  2013-07-02 14:09   ` Leif Lindholm
  2013-07-04 10:58   ` Tim Deegan
  2 siblings, 1 reply; 44+ messages in thread
From: Stefano Stabellini @ 2013-07-01 15:39 UTC (permalink / raw)
  To: Ian Campbell
  Cc: julien.grall, stefano.stabellini, tim, Leif Lindholm, xen-devel

On Fri, 28 Jun 2013, Ian Campbell wrote:
> The inner shareable domain contains all SMP processors, including different
> clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
> memory mappings. The outer shareable domain is for devices on busses which are
> barriers (e.g. AMBA4). While the system domain is for things behind bridges
> which do not.
> 
> One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
> Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on ARM
> v7. Therefore change the prototype of mfn_to_xen_entry to take the attribute
> index so we can DTRT. On ARMv8 the shareability is ignored and considered to
> always be Outer Shareable.
> 
> While I'm here change all the dmb/dsb with an implicit sy to an explicit sy,
> to make future changes simpler. Other than that don't adjust the barriers,
> flushes etc, those remain as they were (which is more than is now required).
> I'll change those in a later patch.
> 
> Many thanks to Leif for explaining the difference between Inner- and
> Outer-Shareable in words of two or less syllables, I hope I've replicated that
> explanation properly above!
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: Leif Lindholm <leif.lindholm@linaro.org>

It looks OK. I would have kept the dsb sy changes separate.


>  xen/arch/arm/arm32/head.S          |    8 +++---
>  xen/arch/arm/arm64/head.S          |    8 +++---
>  xen/arch/arm/mm.c                  |   24 ++++++++++------------
>  xen/include/asm-arm/arm32/system.h |    4 +-
>  xen/include/asm-arm/page.h         |   38 ++++++++++++++++++++++++++++++++---
>  5 files changed, 55 insertions(+), 27 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index 0588d54..464c351 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -24,8 +24,8 @@
>  
>  #define ZIMAGE_MAGIC_NUMBER 0x016f2818
>  
> -#define PT_PT     0xe7f /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=1 P=1 */
> -#define PT_MEM    0xe7d /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=0 P=1 */
> +#define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
> +#define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
>  #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
>  #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
>  
> @@ -206,10 +206,10 @@ skip_bss:
>          mcr   CP32(r1, HMAIR1)
>  
>          /* Set up the HTCR:
> -         * PT walks use Outer-Shareable accesses,
> +         * PT walks use Inner-Shareable accesses,
>           * PT walks are write-back, write-allocate in both cache levels,
>           * Full 32-bit address space goes through this table. */
> -        ldr   r0, =0x80002500
> +        ldr   r0, =0x80003500
>          mcr   CP32(r0, HTCR)
>  
>          /* Set up the HSCTLR:
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 21b7e4d..ffcb880 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -24,8 +24,8 @@
>  #include <asm/page.h>
>  #include <asm/asm_defns.h>
>  
> -#define PT_PT     0xe7f /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=1 P=1 */
> -#define PT_MEM    0xe7d /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=0 P=1 */
> +#define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
> +#define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
>  #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
>  #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
>  
> @@ -178,10 +178,10 @@ skip_bss:
>          /* Set up the HTCR:
>           * PASize -- 4G
>           * Top byte is used
> -         * PT walks use Outer-Shareable accesses,
> +         * PT walks use Inner-Shareable accesses,
>           * PT walks are write-back, write-allocate in both cache levels,
>           * Full 64-bit address space goes through this table. */
> -        ldr   x0, =0x80802500
> +        ldr   x0, =0x80803500
>          msr   tcr_el2, x0
>  
>          /* Set up the HSCTLR:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index d1290cd..c5213f2 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -163,9 +163,8 @@ void dump_hyp_walk(vaddr_t addr)
>  /* Map a 4k page in a fixmap entry */
>  void set_fixmap(unsigned map, unsigned long mfn, unsigned attributes)
>  {
> -    lpae_t pte = mfn_to_xen_entry(mfn);
> +    lpae_t pte = mfn_to_xen_entry(mfn, attributes);
>      pte.pt.table = 1; /* 4k mappings always have this bit set */
> -    pte.pt.ai = attributes;
>      pte.pt.xn = 1;
>      write_pte(xen_fixmap + third_table_offset(FIXMAP_ADDR(map)), pte);
>      flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE);
> @@ -212,7 +211,7 @@ void *map_domain_page(unsigned long mfn)
>          if ( map[slot].pt.avail == 0 )
>          {
>              /* Commandeer this 2MB slot */
> -            pte = mfn_to_xen_entry(slot_mfn);
> +            pte = mfn_to_xen_entry(slot_mfn, WRITEALLOC);
>              pte.pt.avail = 1;
>              write_pte(map + slot, pte);
>              break;
> @@ -340,7 +339,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>  
>      /* Map the destination in the boot misc area. */
>      dest_va = BOOT_MISC_VIRT_START;
> -    pte = mfn_to_xen_entry(xen_paddr >> PAGE_SHIFT);
> +    pte = mfn_to_xen_entry(xen_paddr >> PAGE_SHIFT, WRITEALLOC);
>      write_pte(xen_second + second_table_offset(dest_va), pte);
>      flush_xen_data_tlb_range_va(dest_va, SECOND_SIZE);
>  
> @@ -387,7 +386,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>  
>      /* Link in the fixmap pagetable */
>      pte = mfn_to_xen_entry((((unsigned long) xen_fixmap) + phys_offset)
> -                           >> PAGE_SHIFT);
> +                           >> PAGE_SHIFT, WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(xen_second + second_table_offset(FIXMAP_ADDR(0)), pte);
>      /*
> @@ -402,7 +401,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>          unsigned long va = XEN_VIRT_START + (i << PAGE_SHIFT);
>          if ( !is_kernel(va) )
>              break;
> -        pte = mfn_to_xen_entry(mfn);
> +        pte = mfn_to_xen_entry(mfn, WRITEALLOC);
>          pte.pt.table = 1; /* 4k mappings always have this bit set */
>          if ( is_kernel_text(va) || is_kernel_inittext(va) )
>          {
> @@ -415,7 +414,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>          /* No flush required here as page table is not hooked in yet. */
>      }
>      pte = mfn_to_xen_entry((((unsigned long) xen_xenmap) + phys_offset)
> -                           >> PAGE_SHIFT);
> +                           >> PAGE_SHIFT, WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(xen_second + second_linear_offset(XEN_VIRT_START), pte);
>      /* TLBFLUSH and ISB would be needed here, but wait until we set WXN */
> @@ -467,7 +466,7 @@ int init_secondary_pagetables(int cpu)
>      /* Initialise first pagetable from first level of boot tables, and
>       * hook into the new root. */
>      memcpy(first, boot_first, PAGE_SIZE);
> -    pte = mfn_to_xen_entry(virt_to_mfn(first));
> +    pte = mfn_to_xen_entry(virt_to_mfn(first), WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(root, pte);
>  #endif
> @@ -479,7 +478,7 @@ int init_secondary_pagetables(int cpu)
>       * domheap mapping pages. */
>      for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ )
>      {
> -        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES));
> +        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES), WRITEALLOC);
>          pte.pt.table = 1;
>          write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
>      }
> @@ -525,7 +524,7 @@ static void __init create_mappings(unsigned long virt,
>  
>      count = nr_mfns / LPAE_ENTRIES;
>      p = xen_second + second_linear_offset(virt);
> -    pte = mfn_to_xen_entry(base_mfn);
> +    pte = mfn_to_xen_entry(base_mfn, WRITEALLOC);
>      pte.pt.hint = 1;  /* These maps are in 16-entry contiguous chunks. */
>      for ( i = 0; i < count; i++ )
>      {
> @@ -595,7 +594,7 @@ static int create_xen_table(lpae_t *entry)
>      if ( p == NULL )
>          return -ENOMEM;
>      clear_page(p);
> -    pte = mfn_to_xen_entry(virt_to_mfn(p));
> +    pte = mfn_to_xen_entry(virt_to_mfn(p), WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(entry, pte);
>      return 0;
> @@ -641,9 +640,8 @@ static int create_xen_entries(enum xenmap_operation op,
>                             addr, mfn);
>                      return -EINVAL;
>                  }
> -                pte = mfn_to_xen_entry(mfn);
> +                pte = mfn_to_xen_entry(mfn, ai);
>                  pte.pt.table = 1;
> -                pte.pt.ai = ai;
>                  write_pte(&third[third_table_offset(addr)], pte);
>                  break;
>              case REMOVE:
> diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
> index 60148cb..b3736f4 100644
> --- a/xen/include/asm-arm/arm32/system.h
> +++ b/xen/include/asm-arm/arm32/system.h
> @@ -7,8 +7,8 @@
>  #define wfi() __asm__ __volatile__ ("wfi" : : : "memory")
>  
>  #define isb() __asm__ __volatile__ ("isb" : : : "memory")
> -#define dsb() __asm__ __volatile__ ("dsb" : : : "memory")
> -#define dmb() __asm__ __volatile__ ("dmb" : : : "memory")
> +#define dsb() __asm__ __volatile__ ("dsb sy" : : : "memory")
> +#define dmb() __asm__ __volatile__ ("dmb sy" : : : "memory")
>  
>  #define mb()            dsb()
>  #define rmb()           dsb()
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 41e9eff..cd38956 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -185,7 +185,7 @@ typedef union {
>  /* Standard entry type that we'll use to build Xen's own pagetables.
>   * We put the same permissions at every level, because they're ignored
>   * by the walker in non-leaf entries. */
> -static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
> +static inline lpae_t mfn_to_xen_entry(unsigned long mfn, unsigned attr)
>  {
>      paddr_t pa = ((paddr_t) mfn) << PAGE_SHIFT;
>      lpae_t e = (lpae_t) {
> @@ -193,10 +193,9 @@ static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
>              .xn = 1,              /* No need to execute outside .text */
>              .ng = 1,              /* Makes TLB flushes easier */
>              .af = 1,              /* No need for access tracking */
> -            .sh = LPAE_SH_OUTER,  /* Xen mappings are globally coherent */
>              .ns = 1,              /* Hyp mode is in the non-secure world */
>              .user = 1,            /* See below */
> -            .ai = WRITEALLOC,
> +            .ai = attr,
>              .table = 0,           /* Set to 1 for links and 4k maps */
>              .valid = 1,           /* Mappings are present */
>          }};;
> @@ -205,6 +204,37 @@ static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
>       * pagetables un User mode it's OK.  If this changes, remember
>       * to update the hard-coded values in head.S too */
>  
> +    switch ( attr )
> +    {
> +    case BUFFERABLE:
> +        /*
> +         * ARM ARM: Overlaying the shareability attribute (B3-1376 to 1377)
> +         *
> +         * A memory region with a resultant memory type attribute of Normal,
> +         * and a resultant cacheability attribute of Inner Non-cacheable,
> +         * Outer Non-cacheable, must have a resultant shareability attribute
> +         * of Outer Shareable, otherwise shareability is UNPREDICTABLE.
> +         *
> +         * On ARMv8 sharability is ignored and explicitly treated as Outer
> +         * Shareable for Normal Inner Non_cacheable, Outer Non-cacheable.
> +         */
> +        e.pt.sh = LPAE_SH_OUTER;
> +        break;
> +    case UNCACHED:
> +    case DEV_SHARED:
> +        /* Shareability is ignored for non-Normal memory, Outer is as
> +         * good as anything.
> +         *
> +         * On ARMv8 sharability is ignored and explicitly treated as Outer
> +         * Shareable for any device memory type.
> +         */
> +        e.pt.sh = LPAE_SH_OUTER;
> +        break;
> +    default:
> +        e.pt.sh = LPAE_SH_INNER;  /* Xen mappings are SMP coherent */
> +        break;
> +    }
> +
>      ASSERT(!(pa & ~PAGE_MASK));
>      ASSERT(!(pa & ~PADDR_MASK));
>  
> @@ -219,7 +249,7 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
>      lpae_t e = (lpae_t) {
>          .p2m.xn = 0,
>          .p2m.af = 1,
> -        .p2m.sh = LPAE_SH_OUTER,
> +        .p2m.sh = LPAE_SH_INNER,
>          .p2m.write = 1,
>          .p2m.read = 1,
>          .p2m.mattr = mattr,
> -- 
> 1.7.2.5
> 
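For reference when reading the switch statement above, the LPAE SH[1:0] encodings that LPAE_SH_OUTER and LPAE_SH_INNER presumably correspond to, sketched from the ARM ARM rather than copied from Xen's headers:

#define LPAE_SH_NON_SHAREABLE 0x0   /* 0b00: non-shareable */
/* 0b01 is reserved (shareability UNPREDICTABLE) */
#define LPAE_SH_OUTER         0x2   /* 0b10: outer shareable */
#define LPAE_SH_INNER         0x3   /* 0b11: inner shareable */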


* Re: [PATCH 01/10] xen: arm: map memory as inner shareable.
  2013-07-01 15:39   ` Stefano Stabellini
@ 2013-07-01 15:42     ` Ian Campbell
  0 siblings, 0 replies; 44+ messages in thread
From: Ian Campbell @ 2013-07-01 15:42 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: julien.grall, tim, Leif Lindholm, xen-devel

On Mon, 2013-07-01 at 16:39 +0100, Stefano Stabellini wrote:
> On Fri, 28 Jun 2013, Ian Campbell wrote:
> > The inner shareable domain contains all SMP processors, including different
> > clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
> > memory mappings. The outer shareable domain is for devices on busses which are
> > barriers (e.g. AMBA4). While the system domain is for things behind bridges
> > which do not.
> > 
> > One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
> > Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on ARM
> > v7. Therefore change the prototype of mfn_to_xen_entry to take the attribute
> > index so we can DTRT. On ARMv8 the sharability is ignored and considered to
> > always be Outer Shareable.
> > 
> > While I'm here change all the dmb/dsb with an implicit sy to an explicit sy,
> > to make future changes simpler. Other than that don't adjust the barriers,
> > flushes etc, those remain as they were (which is more than is now required).
> > I'll change those in a later patch.
> > 
> > Many thanks to Leif for explaining the difference between Inner- and
> > Outer-Shareable in words of two or less syllables, I hope I've replicated that
> > explanation properly above!
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Leif Lindholm <leif.lindholm@linaro.org>
> 
> It looks OK.

Thanks.

>  I would have kept the dsb sy changes separate.

So would I if I'd known then that it was going to end up being so many
patches. I'll split them out when I resend.


* Re: [PATCH 01/10] xen: arm: map memory as inner shareable.
  2013-06-28 16:10 ` [PATCH 01/10] xen: arm: map memory as inner shareable Ian Campbell
  2013-07-01 15:39   ` Stefano Stabellini
@ 2013-07-02 14:09   ` Leif Lindholm
  2013-07-02 14:26     ` Ian Campbell
  2013-07-04 10:58   ` Tim Deegan
  2 siblings, 1 reply; 44+ messages in thread
From: Leif Lindholm @ 2013-07-02 14:09 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Fri, Jun 28, 2013 at 05:10:47PM +0100, Ian Campbell wrote:
> The inner shareable domain contains all SMP processors, including different
> clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
> memory mappings. The outer shareable domain is for devices on busses which are
> barriers (e.g. AMBA4).

I think this should say something like "which are coherent and
barrier-aware".

And to be technically correct, the example should say "AMBA4 AXI with
ACE").

> While the system domain is for things behind bridges which do not.
 
And given the above ... -> which "are" not.

> One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
> Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on ARM
> v7. Therefore change the prototype of mfn_to_xen_entry to take the attribute
> index so we can DTRT. On ARMv8 the shareability is ignored and considered to
> always be Outer Shareable.
> 
> While I'm here change all the dmb/dsb with an implicit sy to an explicit sy,
> to make future changes simpler. Other than that don't adjust the barriers,
> flushes etc, those remain as they were (which is more than is now required).
> I'll change those in a later patch.
> 
> Many thanks to Leif for explaining the difference between Inner- and
> Outer-Shareable in words of two or less syllables, I hope I've replicated that
> explanation properly above!
 
Apart from my usual nitpicking, indeed :)

> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> Cc: Leif Lindholm <leif.lindholm@linaro.org>
> ---
>  xen/arch/arm/arm32/head.S          |    8 +++---
>  xen/arch/arm/arm64/head.S          |    8 +++---
>  xen/arch/arm/mm.c                  |   24 ++++++++++------------
>  xen/include/asm-arm/arm32/system.h |    4 +-
>  xen/include/asm-arm/page.h         |   38 ++++++++++++++++++++++++++++++++---
>  5 files changed, 55 insertions(+), 27 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index 0588d54..464c351 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -24,8 +24,8 @@
>  
>  #define ZIMAGE_MAGIC_NUMBER 0x016f2818
>  
> -#define PT_PT     0xe7f /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=1 P=1 */
> -#define PT_MEM    0xe7d /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=0 P=1 */
> +#define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
> +#define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
>  #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
>  #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
>  
> @@ -206,10 +206,10 @@ skip_bss:
>          mcr   CP32(r1, HMAIR1)
>  
>          /* Set up the HTCR:
> -         * PT walks use Outer-Shareable accesses,
> +         * PT walks use Inner-Shareable accesses,
>           * PT walks are write-back, write-allocate in both cache levels,
>           * Full 32-bit address space goes through this table. */
> -        ldr   r0, =0x80002500
> +        ldr   r0, =0x80003500
>          mcr   CP32(r0, HTCR)
>  
>          /* Set up the HSCTLR:
> diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
> index 21b7e4d..ffcb880 100644
> --- a/xen/arch/arm/arm64/head.S
> +++ b/xen/arch/arm/arm64/head.S
> @@ -24,8 +24,8 @@
>  #include <asm/page.h>
>  #include <asm/asm_defns.h>
>  
> -#define PT_PT     0xe7f /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=1 P=1 */
> -#define PT_MEM    0xe7d /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=111 T=0 P=1 */
> +#define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
> +#define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
>  #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
>  #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
>  
> @@ -178,10 +178,10 @@ skip_bss:
>          /* Set up the HTCR:
>           * PASize -- 4G
>           * Top byte is used
> -         * PT walks use Outer-Shareable accesses,
> +         * PT walks use Inner-Shareable accesses,
>           * PT walks are write-back, write-allocate in both cache levels,
>           * Full 64-bit address space goes through this table. */
> -        ldr   x0, =0x80802500
> +        ldr   x0, =0x80803500
>          msr   tcr_el2, x0
>  
>          /* Set up the HSCTLR:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index d1290cd..c5213f2 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -163,9 +163,8 @@ void dump_hyp_walk(vaddr_t addr)
>  /* Map a 4k page in a fixmap entry */
>  void set_fixmap(unsigned map, unsigned long mfn, unsigned attributes)
>  {
> -    lpae_t pte = mfn_to_xen_entry(mfn);
> +    lpae_t pte = mfn_to_xen_entry(mfn, attributes);
>      pte.pt.table = 1; /* 4k mappings always have this bit set */
> -    pte.pt.ai = attributes;
>      pte.pt.xn = 1;
>      write_pte(xen_fixmap + third_table_offset(FIXMAP_ADDR(map)), pte);
>      flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE);
> @@ -212,7 +211,7 @@ void *map_domain_page(unsigned long mfn)
>          if ( map[slot].pt.avail == 0 )
>          {
>              /* Commandeer this 2MB slot */
> -            pte = mfn_to_xen_entry(slot_mfn);
> +            pte = mfn_to_xen_entry(slot_mfn, WRITEALLOC);
>              pte.pt.avail = 1;
>              write_pte(map + slot, pte);
>              break;
> @@ -340,7 +339,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>  
>      /* Map the destination in the boot misc area. */
>      dest_va = BOOT_MISC_VIRT_START;
> -    pte = mfn_to_xen_entry(xen_paddr >> PAGE_SHIFT);
> +    pte = mfn_to_xen_entry(xen_paddr >> PAGE_SHIFT, WRITEALLOC);
>      write_pte(xen_second + second_table_offset(dest_va), pte);
>      flush_xen_data_tlb_range_va(dest_va, SECOND_SIZE);
>  
> @@ -387,7 +386,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>  
>      /* Link in the fixmap pagetable */
>      pte = mfn_to_xen_entry((((unsigned long) xen_fixmap) + phys_offset)
> -                           >> PAGE_SHIFT);
> +                           >> PAGE_SHIFT, WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(xen_second + second_table_offset(FIXMAP_ADDR(0)), pte);
>      /*
> @@ -402,7 +401,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>          unsigned long va = XEN_VIRT_START + (i << PAGE_SHIFT);
>          if ( !is_kernel(va) )
>              break;
> -        pte = mfn_to_xen_entry(mfn);
> +        pte = mfn_to_xen_entry(mfn, WRITEALLOC);
>          pte.pt.table = 1; /* 4k mappings always have this bit set */
>          if ( is_kernel_text(va) || is_kernel_inittext(va) )
>          {
> @@ -415,7 +414,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>          /* No flush required here as page table is not hooked in yet. */
>      }
>      pte = mfn_to_xen_entry((((unsigned long) xen_xenmap) + phys_offset)
> -                           >> PAGE_SHIFT);
> +                           >> PAGE_SHIFT, WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(xen_second + second_linear_offset(XEN_VIRT_START), pte);
>      /* TLBFLUSH and ISB would be needed here, but wait until we set WXN */
> @@ -467,7 +466,7 @@ int init_secondary_pagetables(int cpu)
>      /* Initialise first pagetable from first level of boot tables, and
>       * hook into the new root. */
>      memcpy(first, boot_first, PAGE_SIZE);
> -    pte = mfn_to_xen_entry(virt_to_mfn(first));
> +    pte = mfn_to_xen_entry(virt_to_mfn(first), WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(root, pte);
>  #endif
> @@ -479,7 +478,7 @@ int init_secondary_pagetables(int cpu)
>       * domheap mapping pages. */
>      for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ )
>      {
> -        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES));
> +        pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES), WRITEALLOC);
>          pte.pt.table = 1;
>          write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte);
>      }
> @@ -525,7 +524,7 @@ static void __init create_mappings(unsigned long virt,
>  
>      count = nr_mfns / LPAE_ENTRIES;
>      p = xen_second + second_linear_offset(virt);
> -    pte = mfn_to_xen_entry(base_mfn);
> +    pte = mfn_to_xen_entry(base_mfn, WRITEALLOC);
>      pte.pt.hint = 1;  /* These maps are in 16-entry contiguous chunks. */
>      for ( i = 0; i < count; i++ )
>      {
> @@ -595,7 +594,7 @@ static int create_xen_table(lpae_t *entry)
>      if ( p == NULL )
>          return -ENOMEM;
>      clear_page(p);
> -    pte = mfn_to_xen_entry(virt_to_mfn(p));
> +    pte = mfn_to_xen_entry(virt_to_mfn(p), WRITEALLOC);
>      pte.pt.table = 1;
>      write_pte(entry, pte);
>      return 0;
> @@ -641,9 +640,8 @@ static int create_xen_entries(enum xenmap_operation op,
>                             addr, mfn);
>                      return -EINVAL;
>                  }
> -                pte = mfn_to_xen_entry(mfn);
> +                pte = mfn_to_xen_entry(mfn, ai);
>                  pte.pt.table = 1;
> -                pte.pt.ai = ai;
>                  write_pte(&third[third_table_offset(addr)], pte);
>                  break;
>              case REMOVE:
> diff --git a/xen/include/asm-arm/arm32/system.h b/xen/include/asm-arm/arm32/system.h
> index 60148cb..b3736f4 100644
> --- a/xen/include/asm-arm/arm32/system.h
> +++ b/xen/include/asm-arm/arm32/system.h
> @@ -7,8 +7,8 @@
>  #define wfi() __asm__ __volatile__ ("wfi" : : : "memory")
>  
>  #define isb() __asm__ __volatile__ ("isb" : : : "memory")
> -#define dsb() __asm__ __volatile__ ("dsb" : : : "memory")
> -#define dmb() __asm__ __volatile__ ("dmb" : : : "memory")
> +#define dsb() __asm__ __volatile__ ("dsb sy" : : : "memory")
> +#define dmb() __asm__ __volatile__ ("dmb sy" : : : "memory")
>  
>  #define mb()            dsb()
>  #define rmb()           dsb()
> diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
> index 41e9eff..cd38956 100644
> --- a/xen/include/asm-arm/page.h
> +++ b/xen/include/asm-arm/page.h
> @@ -185,7 +185,7 @@ typedef union {
>  /* Standard entry type that we'll use to build Xen's own pagetables.
>   * We put the same permissions at every level, because they're ignored
>   * by the walker in non-leaf entries. */
> -static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
> +static inline lpae_t mfn_to_xen_entry(unsigned long mfn, unsigned attr)
>  {
>      paddr_t pa = ((paddr_t) mfn) << PAGE_SHIFT;
>      lpae_t e = (lpae_t) {
> @@ -193,10 +193,9 @@ static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
>              .xn = 1,              /* No need to execute outside .text */
>              .ng = 1,              /* Makes TLB flushes easier */
>              .af = 1,              /* No need for access tracking */
> -            .sh = LPAE_SH_OUTER,  /* Xen mappings are globally coherent */
>              .ns = 1,              /* Hyp mode is in the non-secure world */
>              .user = 1,            /* See below */
> -            .ai = WRITEALLOC,
> +            .ai = attr,
>              .table = 0,           /* Set to 1 for links and 4k maps */
>              .valid = 1,           /* Mappings are present */
>          }};;
> @@ -205,6 +204,37 @@ static inline lpae_t mfn_to_xen_entry(unsigned long mfn)
>       * pagetables un User mode it's OK.  If this changes, remember
>       * to update the hard-coded values in head.S too */
>  
> +    switch ( attr )
> +    {
> +    case BUFFERABLE:
> +        /*
> +         * ARM ARM: Overlaying the shareability attribute (B3-1376 to 1377)

It would be worth indicating the revision of the ARM ARM here (in this
instance DDI 0406C.b).

> +         *
> +         * A memory region with a resultant memory type attribute of Normal,
> +         * and a resultant cacheability attribute of Inner Non-cacheable,
> +         * Outer Non-cacheable, must have a resultant shareability attribute
> +         * of Outer Shareable, otherwise shareability is UNPREDICTABLE.
> +         *
> +         * On ARMv8 sharability is ignored and explicitly treated as Outer
> +         * Shareable for Normal Inner Non_cacheable, Outer Non-cacheable.
> +         */
> +        e.pt.sh = LPAE_SH_OUTER;
> +        break;
> +    case UNCACHED:
> +    case DEV_SHARED:
> +        /* Shareability is ignored for non-Normal memory, Outer is as
> +         * good as anything.
> +         *
> +         * On ARMv8 sharability is ignored and explicitly treated as Outer
> +         * Shareable for any device memory type.
> +         */
> +        e.pt.sh = LPAE_SH_OUTER;
> +        break;
> +    default:
> +        e.pt.sh = LPAE_SH_INNER;  /* Xen mappings are SMP coherent */
> +        break;
> +    }
> +
>      ASSERT(!(pa & ~PAGE_MASK));
>      ASSERT(!(pa & ~PADDR_MASK));
>  
> @@ -219,7 +249,7 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
>      lpae_t e = (lpae_t) {
>          .p2m.xn = 0,
>          .p2m.af = 1,
> -        .p2m.sh = LPAE_SH_OUTER,
> +        .p2m.sh = LPAE_SH_INNER,
>          .p2m.write = 1,
>          .p2m.read = 1,
>          .p2m.mattr = mattr,
> -- 
> 1.7.2.5
> 
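For readers decoding the HTCR hunk above (0x80002500 -> 0x80003500), a hedged breakdown of the HTCR fields involved, per ARM ARM DDI 0406C.b; the macro names are invented for illustration:

#define HTCR_RES1        (1u << 31)   /* bit 31 is reserved, set to one */
#define HTCR_SH0_OUTER   (2u << 12)   /* old: PT walks outer shareable */
#define HTCR_SH0_INNER   (3u << 12)   /* new: PT walks inner shareable */
#define HTCR_ORGN0_WBWA  (1u << 10)   /* outer write-back, write-allocate */
#define HTCR_IRGN0_WBWA  (1u << 8)    /* inner write-back, write-allocate */
/* T0SZ (bits [2:0]) == 0: full 32-bit address space through this table.
 * HTCR_RES1 | HTCR_SH0_INNER | HTCR_ORGN0_WBWA | HTCR_IRGN0_WBWA == 0x80003500 */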


* Re: [PATCH 01/10] xen: arm: map memory as inner shareable.
  2013-07-02 14:09   ` Leif Lindholm
@ 2013-07-02 14:26     ` Ian Campbell
  0 siblings, 0 replies; 44+ messages in thread
From: Ian Campbell @ 2013-07-02 14:26 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: julien.grall, stefano.stabellini, tim, xen-devel

On Tue, 2013-07-02 at 16:09 +0200, Leif Lindholm wrote:
> On Fri, Jun 28, 2013 at 05:10:47PM +0100, Ian Campbell wrote:
> > The inner shareable domain contains all SMP processors, including different
> > clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
> > memory mappings. The outer shareable domain is for devices on busses which are
> > barriers (e.g. AMBA4).
> 
> I think this should say something like "which are coherent and
> barrier-aware".
> 
> And to be technically correct, the example should say "AMBA4 AXI with
> ACE").
> 
> > While the system domain is for things behind bridges which do not.
>  
> And given the above ... -> which "are" not.
> 
> > One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
> > Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on ARM
> > v7. Therefore change the prototype of mfn_to_xen_entry to take the attribute
> > index so we can DTRT. On ARMv8 the sharability is ignored and considered to
> > always be Outer Shareable.
> > 
> > While I'm here change all the dmb/dsb with an implicit sy to an explicit sy,
> > to make future changes simpler. Other than that don't adjust the barriers,
> > flushes etc, those remain as they were (which is more than is now required).
> > I'll change those in a later patch.
> > 
> > Many thanks to Leif for explaining the difference between Inner- and
> > Outer-Shareable in words of two or less syllables, I hope I've replicated that
> > explanation properly above!
>  
> Apart from my usual nitpicking, indeed :)

Thanks, I was sure I was playing a bit fast and loose with the
specifics!

> [...]
> > +    switch ( attr )
> > +    {
> > +    case BUFFERABLE:
> > +        /*
> > +         * ARM ARM: Overlaying the shareability attribute (B3-1376 to 1377)
> 
> It would be worth to indicate the revision of the ARM ARM here (in this
> instance DDI 0406C.b).

Good idea, thanks.

Ian.


* Re: [PATCH 01/10] xen: arm: map memory as inner shareable.
  2013-06-28 16:10 ` [PATCH 01/10] xen: arm: map memory as inner shareable Ian Campbell
  2013-07-01 15:39   ` Stefano Stabellini
  2013-07-02 14:09   ` Leif Lindholm
@ 2013-07-04 10:58   ` Tim Deegan
  2013-07-04 11:03     ` Ian Campbell
  2 siblings, 1 reply; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 10:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, Leif Lindholm, xen-devel

At 17:10 +0100 on 28 Jun (1372439447), Ian Campbell wrote:
> @@ -219,7 +249,7 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
>      lpae_t e = (lpae_t) {
>          .p2m.xn = 0,
>          .p2m.af = 1,
> -        .p2m.sh = LPAE_SH_OUTER,
> +        .p2m.sh = LPAE_SH_INNER,
>          .p2m.write = 1,
>          .p2m.read = 1,
>          .p2m.mattr = mattr,

I think we need to leave MMIO mappings outer-shareable (or full-system
as appropriate), which means adding the same mechanism that you did for
the xen PT mappings. 

Otherwise, this looks fine.

Cheers,

Tim.
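A rough sketch of the fix Tim is asking for, purely as an illustration (MATTR_DEV is an assumed name for Xen's device-memory attribute value, and this is not the actual follow-up patch): derive the p2m shareability from the memory attribute, mirroring what mfn_to_xen_entry() now does for Xen's own mappings.

/* In mfn_to_p2m_entry(), instead of unconditionally picking one domain: */
e.p2m.sh = (mattr == MATTR_DEV) ? LPAE_SH_OUTER   /* MMIO: keep outer shareable */
                                : LPAE_SH_INNER;  /* normal RAM: inner is enough */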


* Re: [PATCH 02/10] xen: arm: Only upgrade guest barriers to inner shareable.
  2013-06-28 16:10 ` [PATCH 02/10] xen: arm: Only upgrade guest barriers to " Ian Campbell
  2013-07-01 15:24   ` Stefano Stabellini
@ 2013-07-04 10:58   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 10:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

At 17:10 +0100 on 28 Jun (1372439448), Ian Campbell wrote:
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>

> ---
>  xen/arch/arm/traps.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 398d209..e40ae2e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -62,7 +62,7 @@ void __cpuinit init_traps(void)
>      WRITE_SYSREG((vaddr_t)hyp_traps_vector, VBAR_EL2);
>  
>      /* Setup hypervisor traps */
> -    WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
> +    WRITE_SYSREG(HCR_PTW|HCR_BSU_INNER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
>      isb();
>  }


* Re: [PATCH 01/10] xen: arm: map memory as inner shareable.
  2013-07-04 10:58   ` Tim Deegan
@ 2013-07-04 11:03     ` Ian Campbell
  0 siblings, 0 replies; 44+ messages in thread
From: Ian Campbell @ 2013-07-04 11:03 UTC (permalink / raw)
  To: Tim Deegan; +Cc: julien.grall, stefano.stabellini, Leif Lindholm, xen-devel

On Thu, 2013-07-04 at 11:58 +0100, Tim Deegan wrote:
> At 17:10 +0100 on 28 Jun (1372439447), Ian Campbell wrote:
> > @@ -219,7 +249,7 @@ static inline lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr)
> >      lpae_t e = (lpae_t) {
> >          .p2m.xn = 0,
> >          .p2m.af = 1,
> > -        .p2m.sh = LPAE_SH_OUTER,
> > +        .p2m.sh = LPAE_SH_INNER,
> >          .p2m.write = 1,
> >          .p2m.read = 1,
> >          .p2m.mattr = mattr,
> 
> I think we need to leave MMIO mappings outer-shareable (or full-system
> as appropriate), which means adding the same mechanism that you did for
> the xen PT mappings. 

Yes, good catch.

Thanks.

> 
> Otherwise, this looks fine.
> 
> Cheers,
> 
> Tim.
> 


* Re: [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
  2013-06-28 16:10 ` [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable Ian Campbell
  2013-07-01 15:25   ` Stefano Stabellini
@ 2013-07-04 11:07   ` Tim Deegan
  2013-07-04 11:19     ` Tim Deegan
  1 sibling, 1 reply; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:07 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

At 17:10 +0100 on 28 Jun (1372439449), Ian Campbell wrote:
> Now that Xen maps memory and performs pagetable walks as inner shareable we
> don't need to push updates down so far when modifying page tables etc.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

> --- a/xen/include/asm-arm/arm32/page.h
> +++ b/xen/include/asm-arm/arm32/page.h
> @@ -39,8 +39,8 @@ static inline void flush_xen_text_tlb(void)
>      asm volatile (
>          "isb;"                        /* Ensure synchronization with previous changes to text */
>          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> -        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
> -        STORE_CP32(0, BPIALL)         /* Flush branch predictor */
> +        STORE_CP32(0, ICIALLUIS)      /* Flush I-cache */
> +        STORE_CP32(0, BPIALLIS)       /* Flush branch predictor */
>          "dsb;"                        /* Ensure completion of TLB+BP flush */
>          "isb;"
>          : : "r" (r0) /*dummy*/ : "memory");
> @@ -54,7 +54,7 @@ static inline void flush_xen_data_tlb(void)
>  {
>      register unsigned long r0 asm ("r0");
>      asm volatile("dsb;" /* Ensure preceding are visible */
> -                 STORE_CP32(0, TLBIALLH)
> +                 STORE_CP32(0, TLBIALLHIS)
>                   "dsb;" /* Ensure completion of the TLB flush */
>                   "isb;"
>                   : : "r" (r0) /* dummy */: "memory");
> @@ -69,7 +69,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
>      unsigned long end = va + size;
>      dsb(); /* Ensure preceding are visible */
>      while ( va < end ) {
> -        asm volatile(STORE_CP32(0, TLBIMVAH)
> +        asm volatile(STORE_CP32(0, TLBIMVAHIS)
>                       : : "r" (va) : "memory");
>          va += PAGE_SIZE;
>      }

That's OK for actual Xen data mappings, map_domain_page() &c., but now
set_fixmap() and clear_fixmap() need to use a stronger flush whenever
they map device memory.  The same goes for create_xen_entries() when
ai != WRITEALLOC.
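
To make the concern concrete, here is a sketch of the split being
requested, assuming a hypothetical full-system helper (the WRITEALLOC
check mirrors the existing attribute-index parameter of
create_xen_entries(); note this request is revisited later in the
thread):

    /* Inner-shareable flushes are enough for normal memory, but
     * device/MMIO mappings would keep the full-system variant. */
    static void flush_after_update(unsigned long va, unsigned long size,
                                   unsigned int ai)
    {
        if ( ai == WRITEALLOC )
            flush_xen_data_tlb_range_va(va, size);    /* now ...IS ops */
        else
            flush_xen_data_tlb_range_va_sy(va, size); /* hypothetical
                                                         full-system variant */
    }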

> diff --git a/xen/include/asm-arm/arm64/flushtlb.h b/xen/include/asm-arm/arm64/flushtlb.h
> index d0535a0..3a6d2cb 100644
> --- a/xen/include/asm-arm/arm64/flushtlb.h
> +++ b/xen/include/asm-arm/arm64/flushtlb.h
> @@ -6,7 +6,7 @@ static inline void flush_tlb_local(void)
>  {
>      asm volatile(
>          "dsb sy;"
> -        "tlbi vmalle1;"
> +        "tlbi vmalle1is;"
>          "dsb sy;"
>          "isb;"
>          : : : "memory");
> @@ -17,7 +17,7 @@ static inline void flush_tlb_all_local(void)
>  {
>      asm volatile(
>          "dsb sy;"
> -        "tlbi alle1;"
> +        "tlbi alle1is;"
>          "dsb sy;"
>          "isb;"
>          : : : "memory");

Might these need to be stronger if we're using them on context switch
and guests have MMIO/outer-shareable mappings?

Tim.


* Re: [PATCH 04/10] xen: arm: consolidate barrier definitions
  2013-06-28 16:10 ` [PATCH 04/10] xen: arm: consolidate barrier definitions Ian Campbell
  2013-07-01 15:25   ` Stefano Stabellini
@ 2013-07-04 11:07   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:07 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

At 17:10 +0100 on 28 Jun (1372439450), Ian Campbell wrote:
> These are effectively identical on both 32- and 64-bit.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
  2013-07-04 11:07   ` Tim Deegan
@ 2013-07-04 11:19     ` Tim Deegan
  2013-07-04 11:21       ` Tim Deegan
  0 siblings, 1 reply; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:19 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, xen-devel, stefano.stabellini

At 12:07 +0100 on 04 Jul (1372939621), Tim Deegan wrote:
> At 17:10 +0100 on 28 Jun (1372439449), Ian Campbell wrote:
> > Now that Xen maps memory and performs pagetable walks as inner shareable we
> > don't need to push updates down so far when modifying page tables etc.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> > --- a/xen/include/asm-arm/arm32/page.h
> > +++ b/xen/include/asm-arm/arm32/page.h
> > @@ -39,8 +39,8 @@ static inline void flush_xen_text_tlb(void)
> >      asm volatile (
> >          "isb;"                        /* Ensure synchronization with previous changes to text */
> >          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> > -        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
> > -        STORE_CP32(0, BPIALL)         /* Flush branch predictor */
> > +        STORE_CP32(0, ICIALLUIS)      /* Flush I-cache */
> > +        STORE_CP32(0, BPIALLIS)       /* Flush branch predictor */
> >          "dsb;"                        /* Ensure completion of TLB+BP flush */
> >          "isb;"
> >          : : "r" (r0) /*dummy*/ : "memory");
> > @@ -54,7 +54,7 @@ static inline void flush_xen_data_tlb(void)
> >  {
> >      register unsigned long r0 asm ("r0");
> >      asm volatile("dsb;" /* Ensure preceding are visible */
> > -                 STORE_CP32(0, TLBIALLH)
> > +                 STORE_CP32(0, TLBIALLHIS)
> >                   "dsb;" /* Ensure completion of the TLB flush */
> >                   "isb;"
> >                   : : "r" (r0) /* dummy */: "memory");
> > @@ -69,7 +69,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
> >      unsigned long end = va + size;
> >      dsb(); /* Ensure preceding are visible */
> >      while ( va < end ) {
> > -        asm volatile(STORE_CP32(0, TLBIMVAH)
> > +        asm volatile(STORE_CP32(0, TLBIMVAHIS)
> >                       : : "r" (va) : "memory");
> >          va += PAGE_SIZE;
> >      }
> 
> That's OK for actual Xen data mappings, map_domain_page() &c., but now
> set_fixmap() and clear_fixmap() need to use a stronger flush whenever
> they map device memory.  The same goes for create_xen_entries() when
> ai != WRITEALLOC.

Ian has pointed out that this is actually making the flushes _stronger_
(and that in general the TLB flush operations need a bit of attention).

So I suggest that we drop the TLB-flush parts of this patch for now, and
address that whole area separately.  In the meantime, the cache-flush
parts are Acked-by: Tim Deegan <tim@xen.org>.

Cheers,

Tim.


* Re: [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable.
  2013-07-04 11:19     ` Tim Deegan
@ 2013-07-04 11:21       ` Tim Deegan
  0 siblings, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:21 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, xen-devel, stefano.stabellini

At 12:19 +0100 on 04 Jul (1372940342), Tim Deegan wrote:
> At 12:07 +0100 on 04 Jul (1372939621), Tim Deegan wrote:
> > At 17:10 +0100 on 28 Jun (1372439449), Ian Campbell wrote:
> > > Now that Xen maps memory and performs pagetable walks as inner shareable we
> > > don't need to push updates down so far when modifying page tables etc.
> > > 
> > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > 
> > > --- a/xen/include/asm-arm/arm32/page.h
> > > +++ b/xen/include/asm-arm/arm32/page.h
> > > @@ -39,8 +39,8 @@ static inline void flush_xen_text_tlb(void)
> > >      asm volatile (
> > >          "isb;"                        /* Ensure synchronization with previous changes to text */
> > >          STORE_CP32(0, TLBIALLH)       /* Flush hypervisor TLB */
> > > -        STORE_CP32(0, ICIALLU)        /* Flush I-cache */
> > > -        STORE_CP32(0, BPIALL)         /* Flush branch predictor */
> > > +        STORE_CP32(0, ICIALLUIS)      /* Flush I-cache */
> > > +        STORE_CP32(0, BPIALLIS)       /* Flush branch predictor */
> > >          "dsb;"                        /* Ensure completion of TLB+BP flush */
> > >          "isb;"
> > >          : : "r" (r0) /*dummy*/ : "memory");
> > > @@ -54,7 +54,7 @@ static inline void flush_xen_data_tlb(void)
> > >  {
> > >      register unsigned long r0 asm ("r0");
> > >      asm volatile("dsb;" /* Ensure preceding are visible */
> > > -                 STORE_CP32(0, TLBIALLH)
> > > +                 STORE_CP32(0, TLBIALLHIS)
> > >                   "dsb;" /* Ensure completion of the TLB flush */
> > >                   "isb;"
> > >                   : : "r" (r0) /* dummy */: "memory");
> > > @@ -69,7 +69,7 @@ static inline void flush_xen_data_tlb_range_va(unsigned long va, unsigned long s
> > >      unsigned long end = va + size;
> > >      dsb(); /* Ensure preceding are visible */
> > >      while ( va < end ) {
> > > -        asm volatile(STORE_CP32(0, TLBIMVAH)
> > > +        asm volatile(STORE_CP32(0, TLBIMVAHIS)
> > >                       : : "r" (va) : "memory");
> > >          va += PAGE_SIZE;
> > >      }
> > 
> > That's OK for actual Xen data mappings, map_domain_page() &c., but now
> > set_fixmap() and clear_fixmap() need to use a stronger flush whenever
> > they map device memory.  The same goes for create_xen_entries() when
> > ai != WRITEALLOC.
> 
> Ian has pointed out that this is actually making the flushes _stronger_
> (and that in general the TLB flush operations need a bit of attention).
> 
> So I suggest that we drop the TLB-flush parts of this patch for now, and
> address that whole area separately.  In the meantime, the cache-flush
> parts are Acked-by: Tim Deegan <tim@xen.org>.

Wait, no - that made no sense. :)  I retract the ack, and let's start
again with the TLB-flush ops.  I don't think anything else in this
series relies on this patch.

Tim.


* Re: [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols
  2013-06-28 16:10 ` [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols Ian Campbell
  2013-06-28 16:15   ` Ian Campbell
  2013-06-28 16:20   ` Keir Fraser
@ 2013-07-04 11:26   ` Tim Deegan
  2 siblings, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:26 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, keir, jbeuich, xen-devel

At 17:10 +0100 on 28 Jun (1372439451), Ian Campbell wrote:
> Xen currently makes no strong distinction between the SMP barriers (smp_mb
> etc) and the regular barrier (mb etc). In Linux, where we inherited these
> names from having imported Linux code which uses them, the SMP barriers are
> intended to be sufficient for implementing shared-memory protocols between
> processors in an SMP system while the standard barriers are useful for MMIO
> etc.
> 
> On x86 with the stronger ordering model there is not much practical difference
> here but ARM has weaker barriers available which are suitable for use as SMP
> barriers.
> 
> Therefore ensure that common code uses the SMP barriers when that is all which
> is required.
> 
> On both ARM and x86 both types of barrier are currently identical so there is
> no actual change. A future patch will change smp_mb to a weaker barrier on
> ARM.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Reviewed-by: Tim Deegan <tim@xen.org>
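
As a concrete illustration of the shared-memory protocols the commit
message has in mind, here is a minimal producer/consumer ring in the
style of the Xen ring protocols; the structure and field names are
illustrative, not taken from the patch:

    /* Illustrative shared ring between two CPUs; names are made up. */
    #define RING_SIZE 32
    struct shared_ring {
        unsigned int req_prod;          /* written by producer */
        unsigned long ring[RING_SIZE];  /* request slots */
    };

    /* Producer: publish the request before the index. */
    static void produce(struct shared_ring *s, unsigned int prod,
                        unsigned long req)
    {
        s->ring[prod % RING_SIZE] = req;
        smp_wmb();            /* request visible before index update */
        s->req_prod = prod + 1;
    }

    /* Consumer: read the index before the entries it covers. */
    static unsigned long consume(struct shared_ring *s)
    {
        unsigned int p = s->req_prod;
        smp_rmb();            /* index read before the request data */
        return s->ring[(p - 1) % RING_SIZE];
    }

No device access is involved, so the SMP barriers are sufficient;
mb()/rmb()/wmb() remain for code that orders against MMIO.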


* Re: [PATCH 06/10] xen: arm: Use SMP barriers when that is all which is required.
  2013-07-01 15:24     ` Ian Campbell
@ 2013-07-04 11:30       ` Tim Deegan
  0 siblings, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:30 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, xen-devel, Stefano Stabellini

At 16:24 +0100 on 01 Jul (1372695881), Ian Campbell wrote:
> On Mon, 2013-07-01 at 16:19 +0100, Stefano Stabellini wrote:
> > On Fri, 28 Jun 2013, Ian Campbell wrote:
> > > As explained in the previous commit SMP barriers can be used when all we care
> > > about is synchronising against other processors.
> > > 
> > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > > ---
> > >  xen/arch/arm/mm.c      |    2 +-
> > >  xen/arch/arm/smpboot.c |    4 ++--
> > >  2 files changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > > index c5213f2..3f049cb 100644
> > > --- a/xen/arch/arm/mm.c
> > > +++ b/xen/arch/arm/mm.c
> > > @@ -776,7 +776,7 @@ void share_xen_page_with_guest(struct page_info *page,
> > >      page->u.inuse.type_info |= PGT_validated | 1;
> > >  
> > >      page_set_owner(page, d);
> > > -    wmb(); /* install valid domain ptr before updating refcnt. */
> > > +    smp_wmb(); /* install valid domain ptr before updating refcnt. */
> > >      ASSERT((page->count_info & ~PGC_xen_heap) == 0);
> > >  
> > >      /* Only add to the allocation list if the domain isn't dying. */
> > > diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> > > index 8011987..727e09f 100644
> > > --- a/xen/arch/arm/smpboot.c
> > > +++ b/xen/arch/arm/smpboot.c
> > > @@ -170,11 +170,11 @@ void __cpuinit start_secondary(unsigned long boot_phys_offset,
> > >  
> > >      /* Run local notifiers */
> > >      notify_cpu_starting(cpuid);
> > > -    wmb();
> > > +    smp_wmb();
> > >  
> > >      /* Now report this CPU is up */
> > >      cpumask_set_cpu(cpuid, &cpu_online_map);
> > > -    wmb();
> > > +    smp_wmb();
> > >  
> > >      local_irq_enable();
> > 
> > Did you miss a few mb() in smpboot.c?
> 
> The ones in __cpu_disable and __cpu_die?
> 
> I think I just wasn't 100% sure they weren't touching hardware
> (i.e. some platform register to shut down a CPU), and since they weren't
> performance critical I punted on them.
> 
> Looking at it again, the first half of that logic seems to be bogus (that
> code goes nowhere near any peripheral).

Yes, I think they can both be smp_mb().  With or without that change,

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 07/10] xen: arm: Use dmb for smp barriers
  2013-06-28 16:10 ` [PATCH 07/10] xen: arm: Use dmb for smp barriers Ian Campbell
  2013-07-01 15:20   ` Stefano Stabellini
@ 2013-07-04 11:31   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:31 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

At 17:10 +0100 on 28 Jun (1372439453), Ian Campbell wrote:
> The full power of dsb is not required in this context.
> 
> Also change wmb() to be dsb() directly instead of indirectly via mb(), for
> clarity.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>
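
For reference, a summary sketch of the shape this patch gives the
barriers, before the later patches add scopes and inner-shareable
variants (the exact definitions are in the patch itself; this is a
paraphrase, not a quote):

    #define mb()            dsb()
    #define wmb()           dsb()   /* now dsb() directly, not via mb() */

    #define smp_mb()        dmb()   /* dmb suffices between processors */
    #define smp_rmb()       dmb()
    #define smp_wmb()       dmb()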


* (no subject)
@ 2013-07-04 11:32 Tim Deegan
  0 siblings, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:32 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

Bcc: Tim Deegan <tjd-xen@phlegethon.org>
Subject: Re: [Xen-devel] [PATCH 08/10] xen: arm: add scope to dsb and dmb macros
Reply-To: 
In-Reply-To: <1372435856-14040-8-git-send-email-ian.campbell@citrix.com>

At 17:10 +0100 on 28 Jun (1372439454), Ian Campbell wrote:
> Everywhere currently passes "sy"stem, so no actual change.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable.
  2013-06-28 16:10 ` [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable Ian Campbell
  2013-07-01 15:21   ` Stefano Stabellini
@ 2013-07-04 11:35   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:35 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

At 17:10 +0100 on 28 Jun (1372439455), Ian Campbell wrote:
> Since all processors are in the inner-shareable domain and we map everything
> that way this is sufficient.
> 
> The non-SMP barriers remain full system. Although in principal they could

s/principal/principle/. 

> become outer shareable barriers for some hardware this would require us to
> know which class a given device is. Given the small number of device drivers
> in Xen itself it's probably not worth worrying over, although maybe someone
> will benchmark at some point.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers.
  2013-06-28 16:10 ` [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers Ian Campbell
  2013-07-01 15:22   ` Stefano Stabellini
@ 2013-07-04 11:42   ` Tim Deegan
  2013-07-04 11:46     ` Ian Campbell
  1 sibling, 1 reply; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:42 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, stefano.stabellini, xen-devel

At 17:10 +0100 on 28 Jun (1372439456), Ian Campbell wrote:
> Note that 32-bit does not provide a load variant of the inner shareable
> barrier, so that remains a full any-any barrier.
> 
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
>  xen/include/asm-arm/system.h |   13 +++++++++----
>  1 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> index 7c3e42d..efaf645 100644
> --- a/xen/include/asm-arm/system.h
> +++ b/xen/include/asm-arm/system.h
> @@ -17,12 +17,17 @@
>  #define dmb(scope)      asm volatile("dmb " scope : : : "memory")
>  
>  #define mb()            dsb("sy")
> -#define rmb()           dsb("sy")
> -#define wmb()           dsb("sy")
> +#define rmb()           dsb("ld")

This doesn't exist on arm32; it'll have to be dsb("sy") there, just like
you've done for smp_rmb() below.

With that change, Acked-by: Tim Deegan <tim@xen.org>

> +#define wmb()           dsb("st")
>  
>  #define smp_mb()        dmb("ish")
> -#define smp_rmb()       dmb("ish")
> -#define smp_wmb()       dmb("ish")
> +#ifdef CONFIG_ARM_64
> +#define smp_rmb()       dmb("ishld")
> +#else
> +#define smp_rmb()       dmb("ish") /* 32-bit has no ishld variant. */
> +#endif
> +
> +#define smp_wmb()       dmb("ishst")
>  
>  #define xchg(ptr,x) \
>          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> -- 
> 1.7.2.5
> 


* Re: (no subject)
  2013-06-28 16:10 ` [PATCH 08/10] xen: arm: add scope to dsb and dmb macros Ian Campbell
  2013-07-01 15:21   ` Stefano Stabellini
@ 2013-07-04 11:44   ` Tim Deegan
  1 sibling, 0 replies; 44+ messages in thread
From: Tim Deegan @ 2013-07-04 11:44 UTC (permalink / raw)
  To: Ian Campbell; +Cc: julien.grall, xen-devel, stefano.stabellini

Oops, a wandering newline seems to have got into the headers here. :)

Tim.

At 12:32 +0100 on 04 Jul (1372941166), Tim Deegan wrote:
> Bcc: Tim Deegan <tjd-xen@phlegethon.org>
> Subject: Re: [Xen-devel] [PATCH 08/10] xen: arm: add scope to dsb and dmb macros
> Reply-To: 
> In-Reply-To: <1372435856-14040-8-git-send-email-ian.campbell@citrix.com>
> 
> At 17:10 +0100 on 28 Jun (1372439454), Ian Campbell wrote:
> > Everywhere currently passes "sy"stem, so no actual change.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers.
  2013-07-04 11:42   ` Tim Deegan
@ 2013-07-04 11:46     ` Ian Campbell
  0 siblings, 0 replies; 44+ messages in thread
From: Ian Campbell @ 2013-07-04 11:46 UTC (permalink / raw)
  To: Tim Deegan; +Cc: julien.grall, stefano.stabellini, xen-devel

On Thu, 2013-07-04 at 12:42 +0100, Tim Deegan wrote:
> At 17:10 +0100 on 28 Jun (1372439456), Ian Campbell wrote:
> > Note that 32-bit does not provide a load variant of the inner shareable
> > barrier, so that remains a full any-any barrier.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> > ---
> >  xen/include/asm-arm/system.h |   13 +++++++++----
> >  1 files changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
> > index 7c3e42d..efaf645 100644
> > --- a/xen/include/asm-arm/system.h
> > +++ b/xen/include/asm-arm/system.h
> > @@ -17,12 +17,17 @@
> >  #define dmb(scope)      asm volatile("dmb " scope : : : "memory")
> >  
> >  #define mb()            dsb("sy")
> > -#define rmb()           dsb("sy")
> > -#define wmb()           dsb("sy")
> > +#define rmb()           dsb("ld")
> 
> This doesn't exist on arm32; it'll have to be dsb("sy") there, just like
> you've done for smp_rmb() below.

Right, that suggests rmb isn't actually used anywhere (since I'm sure I
compile tested), but let's at least make it correct...

> 
> With that change, Acked-by: Tim Deegan <tim@xen.org>

Thanks,
Ian.
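
For reference, a sketch of the rmb() definitions with that fix folded
in; the CONFIG_ARM_64 guard mirrors the smp_rmb() hunk quoted below,
and this shows the reviewer-requested shape rather than the committed
code:

    #define mb()            dsb("sy")
    #ifdef CONFIG_ARM_64
    #define rmb()           dsb("ld")
    #else
    #define rmb()           dsb("sy")  /* 32-bit has no ld variant. */
    #endif
    #define wmb()           dsb("st")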

> 
> > +#define wmb()           dsb("st")
> >  
> >  #define smp_mb()        dmb("ish")
> > -#define smp_rmb()       dmb("ish")
> > -#define smp_wmb()       dmb("ish")
> > +#ifdef CONFIG_ARM_64
> > +#define smp_rmb()       dmb("ishld")
> > +#else
> > +#define smp_rmb()       dmb("ish") /* 32-bit has no ishld variant. */
> > +#endif
> > +
> > +#define smp_wmb()       dmb("ishst")
> >  
> >  #define xchg(ptr,x) \
> >          ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
> > -- 
> > 1.7.2.5
> > 


end of thread (newest message: 2013-07-04 11:46 UTC)

Thread overview: 44+ messages
2013-06-28 16:10 [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers Ian Campbell
2013-06-28 16:10 ` [PATCH 01/10] xen: arm: map memory as inner shareable Ian Campbell
2013-07-01 15:39   ` Stefano Stabellini
2013-07-01 15:42     ` Ian Campbell
2013-07-02 14:09   ` Leif Lindholm
2013-07-02 14:26     ` Ian Campbell
2013-07-04 10:58   ` Tim Deegan
2013-07-04 11:03     ` Ian Campbell
2013-06-28 16:10 ` [PATCH 02/10] xen: arm: Only upgrade guest barriers to " Ian Campbell
2013-07-01 15:24   ` Stefano Stabellini
2013-07-04 10:58   ` Tim Deegan
2013-06-28 16:10 ` [PATCH 03/10] xen: arm: reduce instruction cache and tlb flushes to inner-shareable Ian Campbell
2013-07-01 15:25   ` Stefano Stabellini
2013-07-04 11:07   ` Tim Deegan
2013-07-04 11:19     ` Tim Deegan
2013-07-04 11:21       ` Tim Deegan
2013-06-28 16:10 ` [PATCH 04/10] xen: arm: consolidate barrier definitions Ian Campbell
2013-07-01 15:25   ` Stefano Stabellini
2013-07-04 11:07   ` Tim Deegan
2013-06-28 16:10 ` [PATCH 05/10] xen: use SMP barrier in common code dealing with shared memory protocols Ian Campbell
2013-06-28 16:15   ` Ian Campbell
2013-06-28 16:20   ` Keir Fraser
2013-07-04 11:26   ` Tim Deegan
2013-06-28 16:10 ` [PATCH 06/10] xen: arm: Use SMP barriers when that is all which is required Ian Campbell
2013-07-01 15:19   ` Stefano Stabellini
2013-07-01 15:24     ` Ian Campbell
2013-07-04 11:30       ` Tim Deegan
2013-06-28 16:10 ` [PATCH 07/10] xen: arm: Use dmb for smp barriers Ian Campbell
2013-07-01 15:20   ` Stefano Stabellini
2013-07-04 11:31   ` Tim Deegan
2013-06-28 16:10 ` [PATCH 08/10] xen: arm: add scope to dsb and dmb macros Ian Campbell
2013-07-01 15:21   ` Stefano Stabellini
2013-07-01 15:22     ` Stefano Stabellini
2013-07-04 11:44   ` (no subject) Tim Deegan
2013-06-28 16:10 ` [PATCH 09/10] xen: arm: weaken SMP barriers to inner shareable Ian Campbell
2013-07-01 15:21   ` Stefano Stabellini
2013-07-01 15:22     ` Stefano Stabellini
2013-07-04 11:35   ` Tim Deegan
2013-06-28 16:10 ` [PATCH 10/10] xen: arm: use more specific barriers for read and write barriers Ian Campbell
2013-07-01 15:22   ` Stefano Stabellini
2013-07-04 11:42   ` Tim Deegan
2013-07-04 11:46     ` Ian Campbell
2013-06-28 16:11 ` [PATCH 00/10] xen: arm: map normal memory as inner shareable, reduce scope of various barriers Ian Campbell
2013-07-04 11:32 (no subject) Tim Deegan
