linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors
@ 2018-10-08  0:36 Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 1/4] MIPS: Define MMIO ordering barriers Maciej W. Rozycki
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2018-10-08  0:36 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton; +Cc: linux-mips, linux-kernel

Hi,

 This patch series is a follow-up to my earlier consideration about MMIO 
access ordering recorded here: <https://lkml.org/lkml/2014/4/28/201>.

 As I have learnt in a recent Alpha/Linux discussion starting here: 
<https://marc.info/?i=alpine.LRH.2.02.1808161556450.13597%20()%20file01%20!%20intranet%20!%20prod%20!%20int%20!%20rdu2%20!%20redhat%20!%20com> 
related to MMIO accessor ordering barriers ports are actually required to 
follow the x86 strongly ordered semantics.  As the ordering is not 
specified in the MIPS architecture except for the SYNC instruction we do 
have to put explicit barriers in MMIO accessors as otherwise ordering may 
not be guaranteed.

 Fortunately on strongly ordered systems SYNC is expected to be as cheap 
as a NOP, and on weakly ordered ones it is needed anyway.  As from 
revision 2.60 of the MIPS architecture specification however we have a 
number of SYNC operations defined, and SYNC 0 has been upgraded from an 
ordering to a completion barrier.  We currently don't make use of these 
extra operations and always use SYNC 0 instead, which this means that we 
may be doing too much synchronisation with the barriers we have already 
defined.

 This patch series does not make an attempt to optimise for SYNC operation 
use, which belongs to a separate improvement.  Instead it focuses on 
fixing MMIO accesses so that drivers can rely on our own API definition.

 Following the original consideration specific MMIO barrier operations are 
added.  As they have turned out to be required to be implied by MMIO 
accessors there is no immediate need to make them form a generic 
cross-architecture internal Linux API.  Therefore I defined them for the 
MIPS architecture only, using the names originally coined by mostly taking 
them from the PowerPC port.

 Then I have used them to fix `mmiowb', and then `readX' and `writeX' 
accessors.  Finally I have updated the `_relaxed' accessors to avoid 
unnecessary synchronisation WRT DMA.

 See individual commit descriptions for further details.

 As a follow-up clean-up places across the architecture tree could be 
reviewed for barrier use that is actually related to MMIO rather than 
memory and updated to use the new names of the MMIO barrier operations.  I 
plan to do this for the DECstation and possibly the SiByte platform, 
however I am leaving it for someone else to do it elsewhere.

 Similarly I think the DMA barrier in `readX' and `inX' should be using 
`dma_rmb' rather than `rmb', but I'm leaving it for someone else to 
handle.

 These changes have been verified at run time with an R3000 (MIPS I) 
DECstation machine (32-bit kernel, little endianness), an R4400 (MIPS III) 
DECstation machine (64-bit kernel, little endianness) and an SB-1 (MIPS64) 
SWARM machine (64-bit kernel, big endianness), by booting them into the 
multiuser mode and running them for a couple of hours.

 Please apply.

  Maciej

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] MIPS: Define MMIO ordering barriers
  2018-10-08  0:36 [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors Maciej W. Rozycki
@ 2018-10-08  0:37 ` Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 2/4] MIPS: Correct `mmiowb' barrier for `wbflush' platforms Maciej W. Rozycki
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2018-10-08  0:37 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton; +Cc: linux-mips, linux-kernel

Define MMIO ordering barriers as separate operations so as to allow 
making places where such a barrier is required distinct from places 
where a memory or a DMA barrier is needed.

Architecturally MIPS does not specify ordering requirements for uncached 
bus accesses such as MMIO operations normally use and therefore explicit 
barriers have to be inserted between MMIO accesses where unspecified 
ordering of operations would cause unpredictable results.

MIPS MMIO ordering barriers are implemented using the same underlying 
mechanism that memory or a DMA barrier ordering barriers use, that is 
either a suitable SYNC instruction or a platform-specific `wbflush' 
call.  However platforms may implement different ordering rules for 
different kinds of bus activity, so having a separate API makes it 
possible to remove unnecessary barriers and avoid a performance hit they 
may cause due to unrelated bus activity by making their implementation 
expand to nil while keeping the necessary ones.

Also having distinct barriers for each kind of use makes it easier for 
the reader to understand what code has been intended to do.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
 arch/mips/include/asm/io.h |   12 ++++++++++++
 1 file changed, 12 insertions(+)

linux-mips-iobarrier.patch
Index: linux-20180930-3maxp/arch/mips/include/asm/io.h
===================================================================
--- linux-20180930-3maxp.orig/arch/mips/include/asm/io.h
+++ linux-20180930-3maxp/arch/mips/include/asm/io.h
@@ -20,6 +20,7 @@
 #include <linux/irqflags.h>
 
 #include <asm/addrspace.h>
+#include <asm/barrier.h>
 #include <asm/bug.h>
 #include <asm/byteorder.h>
 #include <asm/cpu.h>
@@ -80,6 +81,17 @@ static inline void set_io_port_base(unsi
 }
 
 /*
+ * Enforce in-order execution of data I/O.  In the MIPS architecture
+ * these are equivalent to corresponding platform-specific memory
+ * barriers defined in <asm/barrier.h>.  API pinched from PowerPC,
+ * with sync additionally defined.
+ */
+#define iobarrier_rw() mb()
+#define iobarrier_r() rmb()
+#define iobarrier_w() wmb()
+#define iobarrier_sync() iob()
+
+/*
  * Thanks to James van Artsdalen for a better timing-fix than
  * the two short jumps: using outb's to a nonexistent port seems
  * to guarantee better timings even on fast machines.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/4] MIPS: Correct `mmiowb' barrier for `wbflush' platforms
  2018-10-08  0:36 [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 1/4] MIPS: Define MMIO ordering barriers Maciej W. Rozycki
@ 2018-10-08  0:37 ` Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 3/4] MIPS: Enforce strong ordering for MMIO accessors Maciej W. Rozycki
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2018-10-08  0:37 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton; +Cc: linux-mips, linux-kernel

Redefine `mmiowb' in terms of `iobarrier_w' so that it works correctly 
for MIPS I platforms, which have no SYNC machine instruction and use a 
call to `wbflush' instead.

This doesn't change the semantics for CONFIG_CPU_CAVIUM_OCTEON, because 
`iobarrier_w' expands to `wmb', which is ultimately the same as the 
current arrangement.  For MIPS I platforms this not only makes any code 
that would happen to use `mmiowb' build and run, but it actually 
enforces the ordering required as well, as `iobarrier_w' has it already 
covered with the use of `wmb'.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
 arch/mips/include/asm/io.h |   11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

linux-mips-mmiowb.patch
Index: linux-20180930-3maxp/arch/mips/include/asm/io.h
===================================================================
--- linux-20180930-3maxp.orig/arch/mips/include/asm/io.h
+++ linux-20180930-3maxp/arch/mips/include/asm/io.h
@@ -91,6 +91,9 @@ static inline void set_io_port_base(unsi
 #define iobarrier_w() wmb()
 #define iobarrier_sync() iob()
 
+/* Some callers use this older API instead.  */
+#define mmiowb() iobarrier_w()
+
 /*
  * Thanks to James van Artsdalen for a better timing-fix than
  * the two short jumps: using outb's to a nonexistent port seems
@@ -573,14 +576,6 @@ BUILDSTRING(l, u32)
 BUILDSTRING(q, u64)
 #endif
 
-
-#ifdef CONFIG_CPU_CAVIUM_OCTEON
-#define mmiowb() wmb()
-#else
-/* Depends on MIPS II instruction set */
-#define mmiowb() asm volatile ("sync" ::: "memory")
-#endif
-
 static inline void memset_io(volatile void __iomem *addr, unsigned char val, int count)
 {
 	memset((void __force *) addr, val, count);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 3/4] MIPS: Enforce strong ordering for MMIO accessors
  2018-10-08  0:36 [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 1/4] MIPS: Define MMIO ordering barriers Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 2/4] MIPS: Correct `mmiowb' barrier for `wbflush' platforms Maciej W. Rozycki
@ 2018-10-08  0:37 ` Maciej W. Rozycki
  2018-10-08  0:37 ` [PATCH 4/4] MIPS: Provide actually relaxed " Maciej W. Rozycki
  2018-10-11 16:40 ` [PATCH 0/4] MIPS: Ordering enforcement fixes for " Paul Burton
  4 siblings, 0 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2018-10-08  0:37 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton; +Cc: linux-mips, linux-kernel

Architecturally the MIPS ISA does not specify ordering requirements for 
uncached bus accesses such as MMIO operations normally use and therefore 
explicit barriers have to be inserted between MMIO accesses where 
unspecified ordering of operations would cause unpredictable results.

For example the R2020 write buffer implements write gathering and 
combining[1] and as used with the DECstation models 2100 and 3100 for 
MMIO accesses it bypasses the read buffer entirely, because conflicts 
are resolved by the memory controller for DRAM accesses only[2] (NB the 
R2020 and R3020 buffers are the same except for the maximum clock rate).

Consequently if a device has say a 16-bit control register at offset 0, 
a 16-bit event mask register at offset 2 and a 16-bit reset register at 
offset 4, and the initial value of the control register is 0x1111, then 
in the absence of barriers a hypothetical code sequence like this:

u16 init_dev(u16 __iomem *dev);
	u16 x;

	write16(dev + 2, 0xffff);
	write16(dev + 0, 0x2222);
	x = read16(dev + 0);
	write16(dev + 1, 0x3333);
	write16(dev + 0, 0x4444);

	return x;
}

will return 0x1111 and issue a single 32-bit write of 0x33334444 (in the 
little-endian bus configuration) to offset 0 on the system bus.

This is because the read to set `x' from offset 0 bypasses the write of 
0x2222 that is still in the write buffer pending the completion of the 
write of 0xffff to the reset register.  Then the write of 0x3333 to the 
event mask register is merged with the preceding write to the control 
register as they share the same word address, making it a 32-bit write 
of 0x33332222 to offset 0.  Finally the write of 0x4444 to the control 
register is combined with the outstanding 32-bit write of 0x33332222 to 
offset 0, because, again, it shares the same address.

This is an example from a legacy system, given here because it is well 
documented and affects a machine we actually support.  But likewise 
modern MIPS systems may implement weak MMIO ordering, possibly even 
without having it clearly documented except for being compliant with the 
architecture specification with respect to the currently defined SYNC 
instruction variants[3].

Considering the above and that we are required to implement MMIO 
accessors such that individual accesses made with them are strongly 
ordered with respect to each other[4], add the necessary barriers to our 
`inX', `outX', `readX' and `writeX' handlers, as well the associated 
special use variants.  It's up to platforms then to possibly define the 
respective barriers so as to expand to nil if no ordering enforcement is 
actually needed for a given system; SYNC is supposed to be as cheap as 
a NOP on strongly ordered MIPS implementations though.

Retain the option to generate weakly-ordered accessors, so that the 
arrangement for `war_io_reorder_wmb' is not lost in case we need it for 
fully raw accessors in the future.  The reason for this is that it is 
unclear from commit 1e820da3c9af ("MIPS: Loongson-3: Introduce 
CONFIG_LOONGSON3_ENHANCEMENT") and especially commit 8faca49a6731 
("MIPS: Modify core io.h macros to account for the Octeon Errata 
Core-301.") why they are needed there under the previous assumption that 
these accessors can be weakly ordered.

References:

[1] "LR3020 Write Buffer", LSI Logic Corporation, September 1988, 
    Section "Byte Gathering", pp. 6-7

[2] "DECstation 3100 Desktop Workstation Functional Specification", 
    Digital Equipment Corporation, Revision 1.3, August 28, 1990, 
    Section 6.1 "Processor", p. 4

[3] "MIPS Architecture For Programmers, Volume II-A: The MIPS32 
    Instruction Set Manual", Imagination Technologies LTD, Document 
    Number: MD00086, Revision 6.06, December 15, 2016, Table 5.5 
    "Encodings of the Bits[10:6] of the SYNC instruction; the SType 
    Field", p. 409

[4] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, 
    Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
References: 8faca49a6731 ("MIPS: Modify core io.h macros to account for the Octeon Errata Core-301.")
References: 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT")
---
 arch/mips/include/asm/io.h |   28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

linux-mips-readx-writex-barrier.patch
Index: linux-20180930-3maxp/arch/mips/include/asm/io.h
===================================================================
--- linux-20180930-3maxp.orig/arch/mips/include/asm/io.h
+++ linux-20180930-3maxp/arch/mips/include/asm/io.h
@@ -337,7 +337,7 @@ static inline void iounmap(const volatil
 #define war_io_reorder_wmb()		barrier()
 #endif
 
-#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, irq)			\
+#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, irq)		\
 									\
 static inline void pfx##write##bwlq(type val,				\
 				    volatile void __iomem *mem)		\
@@ -345,7 +345,10 @@ static inline void pfx##write##bwlq(type
 	volatile type *__mem;						\
 	type __val;							\
 									\
-	war_io_reorder_wmb();					\
+	if (barrier)							\
+		iobarrier_rw();						\
+	else								\
+		war_io_reorder_wmb();					\
 									\
 	__mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));	\
 									\
@@ -382,6 +385,9 @@ static inline type pfx##read##bwlq(const
 									\
 	__mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));	\
 									\
+	if (barrier)							\
+		iobarrier_rw();						\
+									\
 	if (sizeof(type) != sizeof(u64) || sizeof(u64) == sizeof(long)) \
 		__val = *__mem;						\
 	else if (cpu_has_64bits) {					\
@@ -409,14 +415,17 @@ static inline type pfx##read##bwlq(const
 	return pfx##ioswab##bwlq(__mem, __val);				\
 }
 
-#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, p, slow)			\
+#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, barrier, p, slow)		\
 									\
 static inline void pfx##out##bwlq##p(type val, unsigned long port)	\
 {									\
 	volatile type *__addr;						\
 	type __val;							\
 									\
-	war_io_reorder_wmb();					\
+	if (barrier)							\
+		iobarrier_rw();						\
+	else								\
+		war_io_reorder_wmb();					\
 									\
 	__addr = (void *)__swizzle_addr_##bwlq(mips_io_port_base + port); \
 									\
@@ -438,6 +447,9 @@ static inline type pfx##in##bwlq##p(unsi
 									\
 	BUILD_BUG_ON(sizeof(type) > sizeof(unsigned long));		\
 									\
+	if (barrier)							\
+		iobarrier_rw();						\
+									\
 	__val = *__addr;						\
 	slow;								\
 									\
@@ -448,7 +460,7 @@ static inline type pfx##in##bwlq##p(unsi
 
 #define __BUILD_MEMORY_PFX(bus, bwlq, type)				\
 									\
-__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1)
+__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1, 1)
 
 #define BUILDIO_MEM(bwlq, type)						\
 									\
@@ -462,8 +474,8 @@ BUILDIO_MEM(l, u32)
 BUILDIO_MEM(q, u64)
 
 #define __BUILD_IOPORT_PFX(bus, bwlq, type)				\
-	__BUILD_IOPORT_SINGLE(bus, bwlq, type, ,)			\
-	__BUILD_IOPORT_SINGLE(bus, bwlq, type, _p, SLOW_DOWN_IO)
+	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, ,)			\
+	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, _p, SLOW_DOWN_IO)
 
 #define BUILDIO_IOPORT(bwlq, type)					\
 	__BUILD_IOPORT_PFX(, bwlq, type)				\
@@ -478,7 +490,7 @@ BUILDIO_IOPORT(q, u64)
 
 #define __BUILDIO(bwlq, type)						\
 									\
-__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 0)
+__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 1, 0)
 
 __BUILDIO(q, u64)
 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 4/4] MIPS: Provide actually relaxed MMIO accessors
  2018-10-08  0:36 [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors Maciej W. Rozycki
                   ` (2 preceding siblings ...)
  2018-10-08  0:37 ` [PATCH 3/4] MIPS: Enforce strong ordering for MMIO accessors Maciej W. Rozycki
@ 2018-10-08  0:37 ` Maciej W. Rozycki
  2018-10-11 16:40 ` [PATCH 0/4] MIPS: Ordering enforcement fixes for " Paul Burton
  4 siblings, 0 replies; 6+ messages in thread
From: Maciej W. Rozycki @ 2018-10-08  0:37 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton; +Cc: linux-mips, linux-kernel

Improve performance for the relevant systems and remove the DMA ordering 
barrier from `readX_relaxed' and `writeX_relaxed' MMIO accessors, where 
it is not needed according to our requirements[1].  For consistency make 
the same arrangement with low-level port I/O accessors, but do not 
actually provide any accessors making use of it.

References:

[1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
    Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
 arch/mips/include/asm/io.h |   48 ++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 20 deletions(-)

linux-mips-readx-writex-relaxed.patch
Index: linux-20180930-3maxp/arch/mips/include/asm/io.h
===================================================================
--- linux-20180930-3maxp.orig/arch/mips/include/asm/io.h
+++ linux-20180930-3maxp/arch/mips/include/asm/io.h
@@ -51,6 +51,11 @@
 # define __raw_ioswabq(a, x)	(x)
 # define ____raw_ioswabq(a, x)	(x)
 
+# define __relaxed_ioswabb ioswabb
+# define __relaxed_ioswabw ioswabw
+# define __relaxed_ioswabl ioswabl
+# define __relaxed_ioswabq ioswabq
+
 /* ioswab[bwlq], __mem_ioswab[bwlq] are defined in mangle-port.h */
 
 #define IO_SPACE_LIMIT 0xffff
@@ -337,7 +342,7 @@ static inline void iounmap(const volatil
 #define war_io_reorder_wmb()		barrier()
 #endif
 
-#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, irq)		\
+#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, relax, irq)	\
 									\
 static inline void pfx##write##bwlq(type val,				\
 				    volatile void __iomem *mem)		\
@@ -411,11 +416,12 @@ static inline type pfx##read##bwlq(const
 	}								\
 									\
 	/* prevent prefetching of coherent DMA data prematurely */	\
-	rmb();								\
+	if (!relax)							\
+		rmb();							\
 	return pfx##ioswab##bwlq(__mem, __val);				\
 }
 
-#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, barrier, p, slow)		\
+#define __BUILD_IOPORT_SINGLE(pfx, bwlq, type, barrier, relax, p, slow)	\
 									\
 static inline void pfx##out##bwlq##p(type val, unsigned long port)	\
 {									\
@@ -454,19 +460,21 @@ static inline type pfx##in##bwlq##p(unsi
 	slow;								\
 									\
 	/* prevent prefetching of coherent DMA data prematurely */	\
-	rmb();								\
+	if (!relax)							\
+		rmb();							\
 	return pfx##ioswab##bwlq(__addr, __val);			\
 }
 
-#define __BUILD_MEMORY_PFX(bus, bwlq, type)				\
+#define __BUILD_MEMORY_PFX(bus, bwlq, type, relax)			\
 									\
-__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1, 1)
+__BUILD_MEMORY_SINGLE(bus, bwlq, type, 1, relax, 1)
 
 #define BUILDIO_MEM(bwlq, type)						\
 									\
-__BUILD_MEMORY_PFX(__raw_, bwlq, type)					\
-__BUILD_MEMORY_PFX(, bwlq, type)					\
-__BUILD_MEMORY_PFX(__mem_, bwlq, type)					\
+__BUILD_MEMORY_PFX(__raw_, bwlq, type, 0)				\
+__BUILD_MEMORY_PFX(__relaxed_, bwlq, type, 1)				\
+__BUILD_MEMORY_PFX(__mem_, bwlq, type, 0)				\
+__BUILD_MEMORY_PFX(, bwlq, type, 0)
 
 BUILDIO_MEM(b, u8)
 BUILDIO_MEM(w, u16)
@@ -474,8 +482,8 @@ BUILDIO_MEM(l, u32)
 BUILDIO_MEM(q, u64)
 
 #define __BUILD_IOPORT_PFX(bus, bwlq, type)				\
-	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, ,)			\
-	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, _p, SLOW_DOWN_IO)
+	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, 0, ,)			\
+	__BUILD_IOPORT_SINGLE(bus, bwlq, type, 1, 0, _p, SLOW_DOWN_IO)
 
 #define BUILDIO_IOPORT(bwlq, type)					\
 	__BUILD_IOPORT_PFX(, bwlq, type)				\
@@ -490,19 +498,19 @@ BUILDIO_IOPORT(q, u64)
 
 #define __BUILDIO(bwlq, type)						\
 									\
-__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 1, 0)
+__BUILD_MEMORY_SINGLE(____raw_, bwlq, type, 1, 0, 0)
 
 __BUILDIO(q, u64)
 
-#define readb_relaxed			readb
-#define readw_relaxed			readw
-#define readl_relaxed			readl
-#define readq_relaxed			readq
+#define readb_relaxed			__relaxed_readb
+#define readw_relaxed			__relaxed_readw
+#define readl_relaxed			__relaxed_readl
+#define readq_relaxed			__relaxed_readq
 
-#define writeb_relaxed			writeb
-#define writew_relaxed			writew
-#define writel_relaxed			writel
-#define writeq_relaxed			writeq
+#define writeb_relaxed			__relaxed_writeb
+#define writew_relaxed			__relaxed_writew
+#define writel_relaxed			__relaxed_writel
+#define writeq_relaxed			__relaxed_writeq
 
 #define readb_be(addr)							\
 	__raw_readb((__force unsigned *)(addr))

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors
  2018-10-08  0:36 [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors Maciej W. Rozycki
                   ` (3 preceding siblings ...)
  2018-10-08  0:37 ` [PATCH 4/4] MIPS: Provide actually relaxed " Maciej W. Rozycki
@ 2018-10-11 16:40 ` Paul Burton
  4 siblings, 0 replies; 6+ messages in thread
From: Paul Burton @ 2018-10-11 16:40 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Ralf Baechle, linux-mips, linux-kernel

Hi Maciej,

On Mon, Oct 08, 2018 at 01:36:55AM +0100, Maciej W. Rozycki wrote:
>  This patch series is a follow-up to my earlier consideration about MMIO 
> access ordering recorded here: <https://lkml.org/lkml/2014/4/28/201>.
> 
>  As I have learnt in a recent Alpha/Linux discussion starting here: 
> <https://marc.info/?i=alpine.LRH.2.02.1808161556450.13597%20()%20file01%20!%20intranet%20!%20prod%20!%20int%20!%20rdu2%20!%20redhat%20!%20com> 
> related to MMIO accessor ordering barriers ports are actually required to 
> follow the x86 strongly ordered semantics.  As the ordering is not 
> specified in the MIPS architecture except for the SYNC instruction we do 
> have to put explicit barriers in MMIO accessors as otherwise ordering may 
> not be guaranteed.
> 
>  Fortunately on strongly ordered systems SYNC is expected to be as cheap 
> as a NOP, and on weakly ordered ones it is needed anyway.  As from 
> revision 2.60 of the MIPS architecture specification however we have a 
> number of SYNC operations defined, and SYNC 0 has been upgraded from an 
> ordering to a completion barrier.  We currently don't make use of these 
> extra operations and always use SYNC 0 instead, which this means that we 
> may be doing too much synchronisation with the barriers we have already 
> defined.
> 
>  This patch series does not make an attempt to optimise for SYNC operation 
> use, which belongs to a separate improvement.  Instead it focuses on 
> fixing MMIO accesses so that drivers can rely on our own API definition.

Agreed, using the lightweight sync types is a whole other can of worms.
I did speak with the architecture team about the description of SYNC
recently (in the context of nanoMIPS documentation if I recall) and hope
the tweaks that were made to the architectural description of it might
help with using them one day soon.

>  Following the original consideration specific MMIO barrier operations are 
> added.  As they have turned out to be required to be implied by MMIO 
> accessors there is no immediate need to make them form a generic 
> cross-architecture internal Linux API.  Therefore I defined them for the 
> MIPS architecture only, using the names originally coined by mostly taking 
> them from the PowerPC port.
> 
>  Then I have used them to fix `mmiowb', and then `readX' and `writeX' 
> accessors.  Finally I have updated the `_relaxed' accessors to avoid 
> unnecessary synchronisation WRT DMA.

Thanks - this definitely leaves us in a better place than we were :)

All 4 patches applied to mips-next for 4.20.

Paul

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-11 16:41 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-08  0:36 [PATCH 0/4] MIPS: Ordering enforcement fixes for MMIO accessors Maciej W. Rozycki
2018-10-08  0:37 ` [PATCH 1/4] MIPS: Define MMIO ordering barriers Maciej W. Rozycki
2018-10-08  0:37 ` [PATCH 2/4] MIPS: Correct `mmiowb' barrier for `wbflush' platforms Maciej W. Rozycki
2018-10-08  0:37 ` [PATCH 3/4] MIPS: Enforce strong ordering for MMIO accessors Maciej W. Rozycki
2018-10-08  0:37 ` [PATCH 4/4] MIPS: Provide actually relaxed " Maciej W. Rozycki
2018-10-11 16:40 ` [PATCH 0/4] MIPS: Ordering enforcement fixes for " Paul Burton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).